
The World’s Largest Multimodal Dataset | Episode 5
Building the next CLIP model with E-MM1
In this episode, Frederik, Head of Machine Learning at Encord, explains how the team built a baseline retrieval model with E-MM1, the world's largest multimodal AI dataset. He covers how the model compares to similar models like CLIP and how others can use it for generative AI and RAG systems across multiple modalities.
Speakers

Frederik Hvilshøj
ML Lead @ Encord
The World's Largest Multimodal AI Dataset
The open-source E-MM1 dataset has 100+ million groups of images, videos, text, audio, and 3D point clouds, giving AI teams more data to train their models.