Testing Meta’s new Audio Language Model - Spirit-LM
DigiDecode

Published on Oct 19, 2024

Meta has released a research-only Audio Language Model called Spirit-LM. It can take both text and audio as input and generate both audio and text as output.

Meta Spirit LM is trained on speech and text datasets with a word-level interleaving method, which enables cross-modality generation.
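As a rough illustration of what word-level interleaving means, here is a minimal Python sketch. It is not Meta's implementation: the `interleave` function, the `[TEXT]`/`[SPEECH]` marker strings, the `[Hu…]` unit names, and the `switch_prob` parameter are all assumptions made for this example, and it presumes that speech units aligned to each word are already available.

```python
import random

# Modality markers; these strings are illustrative assumptions.
TEXT_MARK = "[TEXT]"
SPEECH_MARK = "[SPEECH]"


def interleave(words, speech_units, switch_prob=0.3, seed=0):
    """Build one training sequence that mixes text and speech per word.

    words:        list of word strings, e.g. ["the", "cat"]
    speech_units: list of lists; speech_units[i] holds the (hypothetical)
                  speech tokens aligned to words[i]
    switch_prob:  chance of switching modality at each word boundary
    """
    if len(words) != len(speech_units):
        raise ValueError("words and speech_units must be aligned one-to-one")

    rng = random.Random(seed)
    modality = rng.choice(("text", "speech"))
    out = [TEXT_MARK if modality == "text" else SPEECH_MARK]

    for i, (word, units) in enumerate(zip(words, speech_units)):
        # Modality may only change at a word boundary, never inside a word.
        if i > 0 and rng.random() < switch_prob:
            modality = "speech" if modality == "text" else "text"
            out.append(TEXT_MARK if modality == "text" else SPEECH_MARK)
        if modality == "text":
            out.append(word)
        else:
            out.extend(units)
    return out


if __name__ == "__main__":
    words = ["the", "cat", "sat"]
    units = [["[Hu1]"], ["[Hu2]", "[Hu3]"], ["[Hu4]"]]
    print(interleave(words, units, switch_prob=0.5, seed=42))
```

Each word appears exactly once in the output, either as text or as its speech units, so a model trained on such sequences sees both modalities inside the same sentence.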

Spirit LM lets people generate more natural sounding speech, and it has the ability to learn new tasks across modalities such as automatic speech recognition, text-to-speech, and speech classification.

Spirit-LM blog post:

https://ai.meta.com/blog/fair-news-se...

Spirit-LM research paper:

https://arxiv.org/abs/2402.05755

Spirit LM code:

https://github.com/facebookresearch/s...

Spirit-LM developer’s video:

   • SpiRit-LM, an Interleaved Spoken and ...  

Follow on Twitter: https://x.com/digi_decode
