AI at Meta

Enterprise

company

Verified

https://ai.facebook.com/

facebookresearch

Recent Activity

valeriulacatusu updated a Space 2 days ago

facebook/omnisealbench

mduppes updated a Space 2 days ago

facebook/omnisealbench

Benjamin-eecs authored a paper 19 days ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Articles

Faster Text Generation with Self-Speculative Decoding

Nov 20, 2024 • 59

facebook's collections 30

V-JEPA 2

A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann

facebook/vjepa2-vitl-fpc64-256

Video Classification • 0.3B • Updated Jun 17 • 35.3k • 140
facebook/vjepa2-vith-fpc64-256

Video Classification • 0.7B • Updated Jun 17 • 1.97k • 12
facebook/vjepa2-vitg-fpc64-256

Video Classification • 1B • Updated Jun 17 • 6.29k • 15
facebook/vjepa2-vitg-fpc64-384

Video Classification • 1B • Updated Jun 17 • 7.5k • 27

blt

facebook/blt

Updated Apr 30 • 50 • 70
facebook/blt-1b

5B • Updated May 1 • 179 • 17
facebook/blt-7b

11B • Updated May 1 • 10 • 61
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 106

Perception Encoder

facebook/PE-Core-L14-336

Zero-Shot Image Classification • Updated Apr 30 • 29.9k • 40
facebook/PE-Core-G14-448

Zero-Shot Image Classification • Updated Apr 30 • 37.5k • 14
facebook/PE-Lang-L14-448

Image Feature Extraction • Updated Apr 30 • 1.45k • 7
facebook/PE-Lang-G14-448

Image Feature Extraction • Updated Apr 30 • 242 • 13

DRAMA

A collection of small (sub-1B) multilingual dense retrievers that generalize well across a number of tasks and languages.

facebook/drama-base

Sentence Similarity • 0.2B • Updated Mar 4 • 2.58k • 20
facebook/drama-large

Sentence Similarity • 0.4B • Updated Mar 4 • 73 • 7
facebook/drama-1b

Sentence Similarity • 1B • Updated Mar 4 • 77 • 11

MobileLLM

Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Paper • 2402.14905 • Published Feb 22, 2024 • 132
facebook/MobileLLM-125M

Text Generation • Updated May 5 • 1.1k • 120
facebook/MobileLLM-350M

Text Generation • Updated May 5 • 466 • 35
facebook/MobileLLM-600M

Text Generation • Updated May 5 • 606 • 29

LayerSkip

Models continually pretrained using LayerSkip - https://arxiv.org/abs/2404.16710

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

Paper • 2404.16710 • Published Apr 25, 2024 • 80
facebook/layerskip-llama2-7B

Text Generation • 7B • Updated Oct 19, 2024 • 837 • 14
facebook/layerskip-llama2-13B

Text Generation • 13B • Updated Oct 19, 2024 • 360 • 5
facebook/layerskip-llama2-70B

Text Generation • 69B • Updated Nov 3, 2024 • 190 • 4

Seamless Communication

A significant step towards removing language barriers through expressive, fast and high-quality AI translation.

Seamless: Multilingual Expressive and Streaming Speech Translation

Paper • 2312.05187 • Published Dec 8, 2023 • 14
facebook/seamless-m4t-v2-large

Automatic Speech Recognition • 2B • Updated Jan 4, 2024 • 43.4k • 862
Seamless M4T v2

Space • Runtime error • 516
facebook/seamless-expressive

Text-to-Speech • Updated Jan 4, 2024 • 185

Wav2Vec 2.0

A collection for the first release of Wav2Vec 2.0, a speech encoder that learns powerful representations from unlabelled audio data.

facebook/wav2vec2-large-960h-lv60-self

Automatic Speech Recognition • Updated May 23, 2022 • 62.4k • 149
facebook/wav2vec2-large-960h

Automatic Speech Recognition • Updated Apr 5, 2022 • 51.7k • 31
facebook/wav2vec2-base-960h

Automatic Speech Recognition • 0.1B • Updated Nov 14, 2022 • 1.03M • 358
facebook/wav2vec2-base-100h

Automatic Speech Recognition • Updated May 27, 2022 • 1.33k • 6

XLSR

A collection of multilingual Wav2Vec 2.0 checkpoints pre-trained on 53 languages and fine-tuned for CTC speech recognition.

facebook/wav2vec2-large-xlsr-53

Updated Mar 18, 2022 • 864k • 143
facebook/wav2vec2-xlsr-53-espeak-cv-ft

Automatic Speech Recognition • Updated Dec 10, 2021 • 203k • 35
facebook/wav2vec2-large-xlsr-53-dutch

Automatic Speech Recognition • Updated Jul 6, 2021 • 869 • 3
facebook/wav2vec2-large-xlsr-53-french

Automatic Speech Recognition • Updated Jul 6, 2021 • 17.1k • 13

Robust Wav2Vec 2.0

A collection of "robust" Wav2Vec 2.0 checkpoints pre-trained on datasets from multiple domains.

facebook/wav2vec2-large-robust

Updated Nov 5, 2021 • 76.5k • 35
facebook/wav2vec2-large-robust-ft-libri-960h

Automatic Speech Recognition • 0.3B • Updated Jun 23, 2023 • 235k • 15
facebook/wav2vec2-large-robust-ft-swbd-300h

Automatic Speech Recognition • Updated Apr 5, 2022 • 494 • 20
Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training

Paper • 2104.01027 • Published Apr 2, 2021 • 1

VoxPopuli v2

A collection of checkpoints from the second VoxPopuli release.

VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

Paper • 2101.00390 • Published Jan 2, 2021 • 1
facebook/wav2vec2-base-bg-voxpopuli-v2

Automatic Speech Recognition • Updated Feb 27, 2022 • 14
facebook/wav2vec2-base-cs-voxpopuli-v2

Automatic Speech Recognition • Updated Feb 27, 2022 • 9 • 1
facebook/wav2vec2-base-da-voxpopuli-v2

Automatic Speech Recognition • Updated Feb 27, 2022 • 10

Fairseq S^2 TTS

Text-to-speech models from fairseq S^2.

facebook/fastspeech2-en-ljspeech

Text-to-Speech • Updated Jan 28, 2022 • 140 • 273
facebook/fastspeech2-en-200_speaker-cv4

Text-to-Speech • Updated Jan 28, 2022 • 22 • 6
facebook/tts_transformer-ar-cv7

Text-to-Speech • Updated Jan 28, 2022 • 11 • 8
facebook/tts_transformer-vi-cv7

Text-to-Speech • Updated Jan 28, 2022 • 14 • 11

MusicGen Stereo

A collection of stereo music generation models as part of the v2 MusicGen release.

facebook/musicgen-stereo-small

Text-to-Audio • 0.6B • Updated Mar 6, 2024 • 2.07k • 30
facebook/musicgen-stereo-medium

Text-to-Audio • 2B • Updated Mar 6, 2024 • 139 • 31
facebook/musicgen-stereo-large

Text-to-Audio • 3B • Updated Mar 6, 2024 • 238 • 78
facebook/musicgen-stereo-melody-large

Text-to-Audio • 3B • Updated Apr 24, 2024 • 70 • 57

Chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

facebook/chameleon-7b

Image-Text-to-Text • 7B • Updated Jul 23, 2024 • 37k • 185
facebook/chameleon-30b

Image-Text-to-Text • 34B • Updated Jul 30, 2024 • 500 • 88

OPT

OPT (Open Pretrained Transformer) is a series of open-source large causal language models that perform similarly to GPT-3.

facebook/opt-125m

Text Generation • Updated Sep 15, 2023 • 4.52M • 208
facebook/opt-350m

Text Generation • Updated Sep 15, 2023 • 140k • 146
facebook/opt-1.3b

Text Generation • Updated Sep 15, 2023 • 118k • 175
facebook/opt-2.7b

Text Generation • Updated Sep 15, 2023 • 52.3k • 85

Web-SSL

facebook/webssl-dino300m-full2b-224

Image Feature Extraction • 0.3B • Updated Apr 24 • 683 • 9
facebook/webssl-dino1b-full2b-224

Image Feature Extraction • 1B • Updated Apr 24 • 1.87k • 1
facebook/webssl-dino2b-full2b-224

Image Feature Extraction • 2B • Updated Apr 24 • 92
facebook/webssl-dino3b-full2b-224

Image Feature Extraction • 3B • Updated Apr 24 • 401

Perception LM

facebook/Perception-LM-1B

Image-Text-to-Text • 2B • Updated 5 days ago • 429 • 30
facebook/Perception-LM-3B

Image-Text-to-Text • 4B • Updated 5 days ago • 438 • 17
facebook/Perception-LM-8B

Image-Text-to-Text • 10B • Updated 5 days ago • 272 • 51
facebook/PLM-VideoBench

Viewer • Updated May 21 • 44k • 648 • 10

FAIR Chemistry

facebook/OMAT24

Updated Apr 14 • 79
facebook/OMAT24

Updated Jan 8 • 71 • 56
facebook/OMol25

Updated 29 days ago • 135
facebook/UMA

Updated 17 days ago • 111

Meta Motivo

A first-of-its-kind behavioral foundation model to control a virtual physics-based humanoid agent for a wide range of whole-body tasks.

facebook/metamotivo-S-1

0.0B • Updated Dec 12, 2024 • 17.9k • 8
facebook/metamotivo-S-2

0.0B • Updated Dec 12, 2024 • 112 • 2
facebook/metamotivo-S-3

0.0B • Updated Dec 12, 2024 • 110 • 2
facebook/metamotivo-S-4

0.0B • Updated Dec 12, 2024 • 81 • 2

Sparsh

Models and datasets for Sparsh: Self-supervised touch representations for vision-based tactile sensing

facebook/sparsh-dino-base

Updated Oct 21, 2024 • 5
facebook/sparsh-dino-small

Updated Oct 21, 2024 • 1
facebook/sparsh-mae-base

Updated Oct 21, 2024 • 1
facebook/sparsh-mae-small

Updated Oct 21, 2024 • 1

MelodyFlow

MelodyFlow: High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching

High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching

Paper • 2407.03648 • Published Jul 4, 2024 • 18
facebook/melodyflow-t24-30secs

Updated Oct 23, 2024 • 25
MelodyFlow

Space • Running on Zero • 107 • Generate music from text and melody

MAGNeT

Masked Audio Generation using a Single Non-Autoregressive Transformer

Masked Audio Generation using a Single Non-Autoregressive Transformer

Paper • 2401.04577 • Published Jan 9, 2024 • 44
facebook/magnet-small-10secs

Text-to-Audio • Updated Jan 16, 2024 • 900 • 25
facebook/magnet-medium-10secs

Text-to-Audio • Updated Jan 16, 2024 • 241 • 9
facebook/magnet-small-30secs

Text-to-Audio • Updated Jan 16, 2024 • 230 • 8

SeamlessM4T

SeamlessM4T is designed to provide high-quality translation, allowing people from different linguistic communities to communicate effortlessly.

Seamless M4T

Space • Runtime error • 951
facebook/hf-seamless-m4t-large

Text-to-Speech • Updated Dec 8, 2023 • 2.01k • 58
facebook/hf-seamless-m4t-medium

Text-to-Speech • Updated Dec 8, 2023 • 6.81k • 29
facebook/seamless-m4t-large

Automatic Speech Recognition • Updated Dec 14, 2023 • 513

XLS-R

First release checkpoints for XLS-R, a large-scale model for cross-lingual speech representation learning based on wav2vec 2.0.

facebook/wav2vec2-xls-r-300m

Updated Aug 10, 2022 • 166k • 95
facebook/wav2vec2-xls-r-1b

Updated Aug 10, 2022 • 5.7k • 28
facebook/wav2vec2-xls-r-2b

Updated Aug 10, 2022 • 2.77k • 40
facebook/wav2vec2-xls-r-300m-en-to-15

Automatic Speech Recognition • Updated Jan 26, 2023 • 52 • 6

VoxPopuli

A collection of open-source artefacts (datasets + checkpoints) from the first VoxPopuli release.

facebook/voxpopuli

Updated Oct 14, 2022 • 6.95k • 125
VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

Paper • 2101.00390 • Published Jan 2, 2021 • 1
facebook/wav2vec2-base-100k-voxpopuli

Automatic Speech Recognition • Updated Nov 5, 2021 • 139 • 4
facebook/wav2vec2-base-10k-voxpopuli-ft-cs

Automatic Speech Recognition • Updated Jul 6, 2021 • 5

HuBERT

A collection of checkpoints from the HuBERT release, a speech encoder that learns powerful representations from unlabelled audio data.

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

Paper • 2106.07447 • Published Jun 14, 2021 • 3
facebook/hubert-base-ls960

Feature Extraction • Updated Nov 5, 2021 • 237k • 58
facebook/hubert-large-ll60k

Feature Extraction • Updated Nov 5, 2021 • 65.6k • 30
facebook/hubert-large-ls960-ft

Automatic Speech Recognition • Updated May 24, 2022 • 546k • 69

Dinov2

facebook/dinov2-small

Image Feature Extraction • 0.0B • Updated Sep 6, 2023 • 1.37M • 41
facebook/dinov2-base

Image Feature Extraction • 0.1B • Updated Jan 17, 2024 • 2.36M • 137
facebook/dinov2-large

Image Feature Extraction • 0.3B • Updated Sep 6, 2023 • 883k • 86
facebook/dinov2-giant

Image Feature Extraction • 1B • Updated Sep 6, 2023 • 96.8k • 46

LLM Compiler

Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning.

facebook/llm-compiler-7b

Text Generation • Updated Jun 27, 2024 • 528 • 134
facebook/llm-compiler-7b-ftd

Text Generation • Updated Jun 27, 2024 • 25 • 27
facebook/llm-compiler-13b

Text Generation • Updated Jun 27, 2024 • 16 • 87
facebook/llm-compiler-13b-ftd

Text Generation • Updated Jun 27, 2024 • 31 • 56

Sapiens

Foundation models for human tasks. Code: https://github.com/facebookresearch/sapiens

Sapiens: Foundation for Human Vision Models

Paper • 2408.12569 • Published Aug 22, 2024 • 93
facebook/sapiens

Updated Sep 20, 2024 • 28 • 241
Sapiens Pose

Space • Running on Zero • 58 • Detect and estimate poses in images
Sapiens Segmentation

Space • Running on Zero • 121 • Segment body parts in images

FAIR's LayerSkip Llama models

facebook/layerskip-llama2-7B

Text Generation • 7B • Updated Oct 19, 2024 • 837 • 14
facebook/layerskip-llama2-13B

Text Generation • 13B • Updated Oct 19, 2024 • 360 • 5
facebook/layerskip-codellama-7B

Text Generation • 7B • Updated Oct 19, 2024 • 6
facebook/layerskip-codellama-34B

Text Generation • 34B • Updated Oct 19, 2024 • 11 • 4


Team members 332
