-
📆 ThursdAI - Dec 5 - OpenAI o1 & o1 pro, Tencent HY-Video, FishSpeech 1.5, Google GENIE2, Weave in GA & more AI news
- 2024/12/06
- 再生時間: 1 時間 32 分
- ポッドキャスト
-
サマリー
あらすじ・解説
Well well well, December is finally here, we're about to close out this year (and have just flew by the second anniversary of chatGPT 🎂) and it seems that all of the AI labs want to give us X-mas presents to play with over the holidays! Look, I keep saying this, but weeks are getting crazier and crazier, this week we got the cheapest and the most expensive AI offerings all at once (the cheapest from Amazon and the most expensive from OpenAI), 2 new open weights models that beat commercial offerings, a diffusion model that predicts the weather and 2 world building models, oh and 2 decentralized fully open sourced LLMs were trained across the world LIVE and finished training. I said... crazy week! And for W&B, this week started with Weave launching finally in GA 🎉, which I personally was looking forward for (read more below)!TL;DR Highlights* OpenAI O1 & Pro Tier: O1 is out of preview, now smarter, faster, multimodal, and integrated into ChatGPT. For heavy usage, ChatGPT Pro ($200/month) offers unlimited calls and O1 Pro Mode for harder reasoning tasks.* Video & Audio Open Source Explosion: Tencent’s HYVideo outperforms Runway and Luma, bringing high-quality video generation to open source. Fishspeech 1.5 challenges top TTS providers, making near-human voice available for free research.* Open Source Decentralization: Nous Research’s DiStRo (15B) and Prime Intellect’s INTELLECT-1 (10B) prove you can train giant LLMs across decentralized nodes globally. Performance is on par with centralized setups.* Google’s Genie 2 & WorldLabs: Generating fully interactive 3D worlds from a single image, pushing boundaries in embodied AI and simulation. Google’s GenCast also sets a new standard in weather prediction, beating supercomputers in accuracy and speed.* Amazon’s Nova FMs: Cheap, scalable LLMs with huge context and global language coverage. Perfect for cost-conscious enterprise tasks, though not top on performance.* 🎉 Weave by W&B: Now in GA, it’s your dashboard and tool suite for building, monitoring, and scaling GenAI apps. Get Started with 1 line of codeOpenAI’s 12 Days of Shipping: O1 & ChatGPT ProThe biggest splash this week came from OpenAI. They’re kicking off “12 days of launches,” and Day 1 brought the long-awaited full version of o1. The main complaint about o1 for many people is how slow it was! Well, now it’s not only smarter but significantly faster (60% faster than preview!), and officially multimodal: it can see images and text together.Better yet, OpenAI introduced a new ChatGPT Pro tier at $200/month. It offers unlimited usage of o1, advanced voice mode, and something called o1 pro mode — where o1 thinks even harder and longer about your hardest math, coding, or science problems. For power users—maybe data scientists, engineers, or hardcore coders—this might be a no-brainer. For others, 200 bucks might be steep, but hey, someone’s gotta pay for those GPUs. Given that OpenAI recently confirmed that there are now 300 Million monthly active users on the platform, and many of my friends already upgraded, this is for sure going to boost the bottom line at OpenAI! Quoting Sam Altman from the stream, “This is for the power users who push the model to its limits every day.” For those who complained o1 took forever just to say “hi,” rejoice: trivial requests will now be answered quickly, while super-hard tasks get that legendary deep reasoning including a new progress bar and a notification when a task is complete. Friend of the pod Ray Fernando gave pro a prompt that took 7 minutes to think through! I've tested the new o1 myself, and while I've gotten dangerously close to my 50 messages per week quota, I've gotten some incredible results already, and very fast as well. This ice-cubes question failed o1-preview and o1-mini and it took both of them significantly longer, and it took just 4 seconds for o1. Open Source LLMs: Decentralization & Transparent ReasoningNous Research DiStRo & DeMo OptimizerWe’ve talked about decentralized training before, but the folks at Nous Research are making it a reality at scale. This week, Nous Research wrapped up the training of a new 15B-parameter LLM—codename “Psyche”—using a fully decentralized approach called “Nous DiStRo.” Picture a massive AI model trained not in a single data center, but across GPU nodes scattered around the globe. According to Alex Volkov (host of ThursdAI), “This is crazy: they’re literally training a 15B param model using GPUs from multiple companies and individuals, and it’s working as well as centralized runs.”The key to this success is “DeMo” (Decoupled Momentum Optimization), a paper co-authored by none other than Diederik Kingma (yes, the Kingma behind Adam optimizer and VAEs). DeMo drastically reduces communication overhead and still maintains stability and speed. The training loss curve they’ve shown looks just as good as a normal centralized run, proving that ...