🎄ThursdAI - Dec19 - o1 vs gemini reasoning, VEO vs SORA, and holiday season full of AI surprises
- 2024/12/20
- Duration: 1 hr 36 min
- Podcast
Summary
Synopsis and commentary
For the full show notes and links, visit https://sub.thursdai.news

🔗 Subscribe to our show on Spotify: https://thursdai.news/spotify
🔗 Apple: https://thursdai.news/apple

Ho, ho, holy moly, folks! Alex here, coming to you live from a world where AI updates are dropping faster than Santa down a chimney! 🎅 It's been another absolutely BANANAS week in the AI world, and if you thought last week was wild and that we were due for a break, buckle up, because this one's a freakin' rollercoaster! 🎢

In this episode of ThursdAI, we dive deep into the recent innovations from OpenAI, including their 1-800 ChatGPT phone service and new advancements in voice mode and API functionality. We discuss the latest updates to o1 model capabilities, including Reasoning Effort settings, and highlight OpenAI's introduction of WebRTC support. Additionally, we explore the groundbreaking Veo 2 model from Google, the generative physics engine Genesis, and new developments in open source models like Cohere's Command R7B. We also provide practical insights on using tools like Weights & Biases for evaluating AI models and share tips on leveraging GitHub Gigi.
Tune in for a comprehensive overview of the latest in AI technology and innovation.

00:00 Introduction and OpenAI's 12 Days of Releases
00:48 Advanced Voice Mode and Public Reactions
01:57 Celebrating Tech Innovations
02:24 Exciting New Features in AVMs
03:08 TLDR - ThursdAI December 19
12:58 Voice and Audio Innovations
14:29 AI Art, Diffusion, and 3D
16:51 Breaking News: Google Gemini 2.0
23:10 Meta Apollo 7b Revisited
33:44 Google's Sora and Veo2
34:12 Introduction to Veo2 and Sora
34:59 First Impressions of Veo2
35:49 Comparing Veo2 and Sora
37:09 Sora's Unique Features
38:03 Google's MVP Approach
43:07 OpenAI's Latest Releases
44:48 Exploring OpenAI's 1-800 CHAT GPT
47:18 OpenAI's Fine-Tuning with DPO
48:15 OpenAI's Mini Dev Day Announcements
49:08 Evaluating OpenAI's O1 Model
54:39 Weights & Biases Evaluation Tool - Weave
01:03:52 ArcAGI and O1 Performance
01:06:47 Introduction and Technical Issues
01:06:51 Efforts on Desktop Apps
01:07:16 ChatGPT Desktop App Features
01:07:25 Working with Apps and Warp Integration
01:08:38 Programming with ChatGPT in IDEs
01:08:44 Discussion on Warp and Other Tools
01:10:37 GitHub GG Project
01:14:47 OpenAI Announcements and WebRTC
01:24:45 Modern BERT and Smaller Models
01:27:37 Genesis: Generative Physics Engine
01:33:12 Closing Remarks and Holiday Wishes

Here’s a talking podcast host speaking excitedly about his show

TL;DR - Show notes and Links

* Open Source LLMs
  * Meta Apollo 7B – LMM w/ SOTA video understanding (Page, HF)
  * Microsoft Phi-4 – 14B SLM (Blog, Paper)
  * Cohere Command R 7B – (Blog)
  * Falcon 3 – series of models (X, HF, web)
  * IBM updates Granite 3.1 + embedding models (HF, Embedding)
* Big CO LLMs + APIs
  * OpenAI releases new o1 + API access (X)
  * Microsoft makes Copilot Free! (X)
  * Google - Gemini Flash 2 Thinking experimental reasoning model (X, Studio)
* This Week's Buzz
  * W&B Weave Playground now has Trials (and o1 compatibility) (try it)
  * Alex's evaluation of o1 and Gemini Thinking experimental (X, Colab, Dashboard)
* Vision & Video
  * Google releases Veo 2 – SOTA text2video model - beating SORA by most vibes (X)
  * HunyuanVideo distilled with FastHunyuan down to 6 steps (HF)
  * Kling 1.6 (X)
* Voice & Audio
  * OpenAI realtime audio improvements (docs)
  * 11labs new Flash 2.5 model – 75ms generation (X)
  * Nexa OmniAudio – 2.6B – multimodal local LLM (Blog)
  * Moonshine Web – real time speech recognition in the browser (X)
  * Sony MMAudio - open source video 2 audio model (Blog, Demo)
* AI Art & Diffusion & 3D
  * Genesis – open source generative 3D physics engine (X, Site, Github)
* Tools
  * CerebrasCoder – extremely fast apps creation (Try It)
  * RepoPrompt to chat with o1 Pro – (download)

This is a public episode. If you’d like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe