In this first episode of Gradient Descent, hosts Vishnu Vettrivel (CTO of Wisecube AI) and Alex Thomas (Principal Data Scientist) discuss the rapid evolution of AI, recent breakthroughs in large language models (LLMs), and the role of natural language processing in shaping the future of artificial intelligence. They also share their experiences in AI development and explain how this podcast differs from other AI shows.
Chapters:
00:00 – Introduction
01:56 – DeepSeek Overview
02:55 – Scaling Laws and Model Performance
04:36 – Peak Data: Are we running out of quality training data?
08:10 – Industry reaction to DeepSeek
09:05 – Jevons' Paradox: Why cheaper AI can drive more demand
11:04 – Supervised Fine-Tuning (SFT) vs. Reinforcement Learning (RLHF)
14:49 – Why Reinforcement Learning helps AI models generalize
20:29 – Distillation and Training Efficiency
25:01 – AI safety concerns: Toxicity, bias, and censorship
30:25 – Future Trends in LLMs: Cheaper, more specialized AI models?
37:30 – Final thoughts and upcoming topics
Listen on:
• YouTube: https://youtube.com/@WisecubeAI/podcasts
• Apple Podcasts: https://apple.co/4kPMxZf
• Spotify: https://open.spotify.com/show/1nG58pwg2Dv6oAhCTzab55
• Amazon Music: https://bit.ly/4izpdO2
Follow us:
• Pythia Website: www.askpythia.ai
• Wisecube Website: www.wisecube.ai
• LinkedIn: www.linkedin.com/company/wisecube
• Facebook: www.facebook.com/wisecubeai
• Reddit: www.reddit.com/r/pythia/
Mentioned Materials:
• Jevons' Paradox: https://en.wikipedia.org/wiki/Jevons_paradox
• Scaling Laws for Neural Language Models: https://arxiv.org/abs/2001.08361
• Distilling the Knowledge in a Neural Network: https://arxiv.org/abs/1503.02531
• SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training: https://arxiv.org/abs/2501.17161
• DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning: https://arxiv.org/abs/2501.12948
• Reinforcement Learning: An Introduction (Sutton & Barto): https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf