
Today, we’re joined by Prince Canuma, an ML engineer and open-source developer focused on optimizing AI inference on Apple Silicon. Prince shares his journey to becoming one of the most prolific contributors to Apple’s MLX ecosystem, having published over 1,000 models and libraries that make open, multimodal AI accessible and performant on Apple devices. We explore his workflow for porting new models to MLX, the trade-offs between the GPU and the Neural Engine, and how optimization techniques like pruning and quantization improve on-device performance. We also cover his work on “Fusion,” a weight-space method for combining model behaviors without retraining, and his popular MLX-Audio, MLX-Embeddings, and MLX-VLM packages, which streamline the use of MLX across different modalities. Finally, Prince introduces Marvis, a real-time speech-to-speech voice agent, and shares his vision for the future of AI, including a shift toward “media models” that handle multiple modalities, and more.
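If you’d like to try the on-device inference and quantization workflow discussed in the episode, below is a minimal sketch using the mlx-lm package linked in the resources. It assumes mlx-lm’s load/generate Python API and its mlx_lm.convert CLI; the specific model repo, prompt, and parameters are illustrative assumptions, not recommendations from the episode.

```python
# Minimal sketch: run a pre-quantized LLM on Apple Silicon with mlx-lm.
# Assumes `pip install mlx-lm`; the model repo below is an illustrative
# 4-bit community checkpoint, not one mentioned in the episode.
from mlx_lm import load, generate

# Download (or load from the local cache) the model weights and tokenizer.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

# Generate text; MLX runs this on the Mac's GPU using unified memory.
text = generate(
    model,
    tokenizer,
    prompt="Explain 4-bit quantization in one sentence.",
    max_tokens=128,
    verbose=True,
)
print(text)

# To quantize a Hugging Face checkpoint yourself, mlx-lm also ships a
# converter CLI, e.g.:  mlx_lm.convert --hf-path <repo-id> -q
```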
🗒️ For the full list of resources for this episode, visit the show notes page: https://twimlai.com/go/744.
🔔 Subscribe to our channel for more great content just like this: https://youtube.com/twimlai?sub_confirmation=1
🗣️ CONNECT WITH US!
===============================
Subscribe to the TWIML AI Podcast: https://twimlai.com/podcast/twimlai/
Follow us on Twitter: https://twitter.com/twimlai
Follow us on LinkedIn: https://www.linkedin.com/company/twimlai/
Join our Slack Community: https://twimlai.com/community/
Subscribe to our newsletter: https://twimlai.com/newsletter/
Want to get in touch? Send us a message: https://twimlai.com/contact/
📖 CHAPTERS
===============================
00:00 – Introduction to MLX
11:04 – Timeline of MLX
13:57 – Is MLX an official Apple product?
14:59 – Is MLX used in Apple products?
18:16 – Neural Engine vs. GPU
22:52 – Fusion
30:12 – Model porting to MLX process
36:26 – Model quantization
41:23 – Pruning
44:24 – MLX-Audio, MLX-Embeddings, and MLX-VLM
51:25 – Marvis
55:24 – Voice agents
1:02:14 – Future directions
1:04:05 – What’s next in AI
🔗 LINKS & RESOURCES
===============================
FastMLX – https://blaizzy.github.io/fastmlx/
MLX-VLM – https://github.com/Blaizzy/mlx-vlm
MLX-Embeddings – https://github.com/Blaizzy/mlx-embeddings
MLX-Audio – https://github.com/Blaizzy/mlx-audio
MLX LM – https://github.com/ml-explore/mlx-lm
📸 Camera: https://amzn.to/3TQ3zsg
🎙️ Microphone: https://amzn.to/3t5zXeV
🚦Lights: https://amzn.to/3TQlX49
🎛️ Audio Interface: https://amzn.to/3TVFAIq
🎚️ Stream Deck: https://amzn.to/3zzm7F5