Episodes

  • Agentic AI – Hype or the Next Step in AI Evolution?
    2025/04/12

    Let’s dive into Agentic AI, guided by the "Cognitive Architectures for Language Agents" (CoALA) paper. What defines an agentic system? How does it plan, leverage memory, and execute tasks? We explore semantic, episodic, and procedural memory, discuss decision-making loops, and examine how agents integrate with external APIs (think LangGraph). Learn how AI tackles complex automation, from code generation to playing Minecraft, and why designing robust action spaces is key to scaling systems. We also touch on challenges like memory updates and the ethics of agentic AI. Get actionable insights. 🔗 Links to the CoALA paper, LangGraph, and more in the description. 🔔 Subscribe to stay updated with Gradient Descent!
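    The CoALA framing discussed in the episode can be sketched as a small decision loop over three memory stores. The snippet below is a toy illustration, not the paper's implementation; every name (`Agent`, `propose_actions`, `select`, `step`) is a placeholder invented for this sketch.

```python
# Toy sketch of a CoALA-style agent loop: candidate actions are proposed
# from the agent's memories, a decision procedure selects one, and the
# chosen episode is written back to episodic memory. Illustrative only.

class Agent:
    def __init__(self):
        self.semantic = {"facts": ["water boils at 100C"]}   # world knowledge
        self.episodic = []                                   # past experiences
        self.procedural = {"rules": ["answer questions"]}    # skills / policies

    def propose_actions(self, observation):
        # In CoALA terms: retrieval + reasoning produce candidate actions,
        # either internal ("reason") or external ("act").
        return [("reason", observation), ("act", f"respond to {observation}")]

    def select(self, candidates):
        # Decision procedure: here, trivially prefer external actions.
        return max(candidates, key=lambda a: a[0] == "act")

    def step(self, observation):
        self.episodic.append(observation)   # learning: store the episode
        return self.select(self.propose_actions(observation))

agent = Agent()
kind, payload = agent.step("What is the boiling point of water?")
print(kind, "->", payload)
```

    A real agent would replace `propose_actions` with LLM calls and ground `select` in planning, which is where frameworks like LangGraph come in.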

    Listen on:

    • YouTube: https://youtube.com/@WisecubeAI/podcasts

    • Apple Podcast: https://apple.co/4kPMxZf

    • Spotify: https://open.spotify.com/show/1nG58pwg2Dv6oAhCTzab55

    • Amazon Music: https://bit.ly/4izpdO2

    Mentioned Materials:

    • Cognitive Architectures for Language Agents (CoALA): https://arxiv.org/abs/2309.02427

    • LangChain: https://python.langchain.com/docs/introduction/

    • LangGraph: https://langchain-ai.github.io/langgraph/

    Our solutions:

    • Pythia: https://askpythia.ai/ - LLM hallucination detection tool

    • Wisecube AI platform: https://www.wisecube.ai - analyzes millions of biomedical publications, clinical trials, and protein and chemical databases

    Follow us:

    • Pythia Website: https://askpythia.ai/

    • Wisecube Website: https://www.wisecube.ai

    • LinkedIn: https://www.linkedin.com/company/wisecube/

    • Facebook: https://www.facebook.com/wisecubeai

    • X: https://x.com/wisecubeai

    • Reddit: https://www.reddit.com/r/pythia/

    • GitHub: https://github.com/wisecubeai

    #AgenticAI #FutureOfAI #AIInnovation #ArtificialIntelligence #MachineLearning #DeepLearning #LLM

    41 min
  • LLM as a Judge: Can AI Evaluate Itself?
    2025/03/22
    In the second episode of Gradient Descent, Vishnu Vettrivel (CTO of Wisecube) and Alex Thomas (Principal Data Scientist) explore the innovative yet controversial idea of using LLMs to judge and evaluate other AI systems. They discuss the hidden human role in AI training, the limitations of traditional benchmarks, the strengths and weaknesses of automated evaluation, and best practices for building reliable AI judgment systems.

    Timestamps:

    00:00 – Introduction & Context

    01:00 – The Role of Humans in AI

    03:58 – Why Is Evaluating LLMs So Difficult?

    09:00 – Pros and Cons of LLM-as-a-Judge

    14:30 – How to Make LLM-as-a-Judge More Reliable?

    19:30 – Trust and Reliability Issues

    25:00 – The Future of LLM-as-a-Judge

    30:00 – Final Thoughts and Takeaways

    Listen on:

    • YouTube: https://youtube.com/@WisecubeAI/podcasts

    • Apple Podcast: https://apple.co/4kPMxZf

    • Spotify: https://open.spotify.com/show/1nG58pwg2Dv6oAhCTzab55

    • Amazon Music: https://bit.ly/4izpdO2

    Follow us:

    • Pythia Website: www.askpythia.ai

    • Wisecube Website: www.wisecube.ai

    • LinkedIn: www.linkedin.com/company/wisecube

    • Facebook: www.facebook.com/wisecubeai

    • Reddit: www.reddit.com/r/pythia/

    Mentioned Materials:

    - Best Practices for LLM-as-a-Judge: https://www.databricks.com/blog/LLM-auto-eval-best-practices-RAG

    - LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods: https://arxiv.org/pdf/2412.05579v2

    - Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena: https://arxiv.org/abs/2306.05685

    - Guide to LLM-as-a-Judge: https://www.evidentlyai.com/llm-guide/llm-as-a-judge

    - Preference Leakage: A Contamination Problem in LLM-as-a-Judge: https://arxiv.org/pdf/2502.01534

    - Large Language Models Are Not Fair Evaluators: https://arxiv.org/pdf/2305.17926

    - Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment: https://arxiv.org/pdf/2402.14016v2

    - Optimization-based Prompt Injection Attack to LLM-as-a-Judge: https://arxiv.org/pdf/2403.17710v4

    - AWS Bedrock: Model Evaluation: https://aws.amazon.com/blogs/machine-learning/llm-as-a-judge-on-amazon-bedrock-model-evaluation/

    - Hugging Face: LLM Judge Cookbook: https://huggingface.co/learn/cookbook/en/llm_judge
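    The pattern debated in the episode fits in a few lines: a "judge" model scores another model's answer against a rubric and a reference. The `judge_model` below is a deterministic stub standing in for a real LLM call, and the prompt shape is an illustrative assumption, not the API of any library mentioned above.

```python
# Minimal LLM-as-a-judge sketch. A production system would send `prompt`
# to an actual LLM; here judge_model is a stub so the flow is runnable.

def judge_model(prompt: str) -> str:
    # Stub for a real LLM call: fake a judge that returns "5" when the
    # answer text matches the reference, "1" otherwise.
    answer = prompt.split("Answer: ")[1].split("\n")[0]
    reference = prompt.split("Reference: ")[1].split("\n")[0]
    return "5" if answer.strip() == reference.strip() else "1"

def llm_as_judge(question: str, answer: str, reference: str) -> int:
    prompt = (
        "Rate the answer from 1 (poor) to 5 (excellent).\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        f"Reference: {reference}\n"
        "Respond with a single digit."
    )
    return int(judge_model(prompt))

score = llm_as_judge("Capital of France?", "Paris", "Paris")
print(score)
```

    The reliability issues discussed in the episode (position bias, preference leakage, prompt injection) all live inside the real `judge_model` call that this stub elides.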
    32 min
  • AI Scaling Laws, DeepSeek’s Cost Efficiency & The Future of AI Training
    2025/03/06

    In this first episode of Gradient Descent, hosts Vishnu Vettrivel (CTO of Wisecube AI) and Alex Thomas (Principal Data Scientist) discuss the rapid evolution of AI, the breakthroughs in LLMs, and the role of Natural Language Processing in shaping the future of artificial intelligence. They also share their experiences in AI development and explain why this podcast differs from other AI discussions.


    Chapters:

    00:00 – Introduction

    01:56 – DeepSeek Overview

    02:55 – Scaling Laws and Model Performance

    04:36 – Peak Data: Are We Running Out of Quality Training Data?

    08:10 – Industry Reaction to DeepSeek

    09:05 – Jevons' Paradox: Why Cheaper AI Can Drive More Demand

    11:04 – Supervised Fine-Tuning vs. Reinforcement Learning (RLHF)

    14:49 – Why Reinforcement Learning Helps AI Models Generalize

    20:29 – Distillation and Training Efficiency

    25:01 – AI Safety Concerns: Toxicity, Bias, and Censorship

    30:25 – Future Trends in LLMs: Cheaper, More Specialized AI Models?

    37:30 – Final Thoughts and Upcoming Topics
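    The scaling-laws chapter maps onto the power-law fit from "Scaling Laws for Neural Language Models" (linked under Mentioned Materials): held-out loss falls as a power of non-embedding parameter count N, L(N) = (N_c / N)^alpha_N, with the paper's fitted values alpha_N ≈ 0.076 and N_c ≈ 8.8e13. The snippet below just evaluates that published formula as an illustration; it makes no claim about any specific model.

```python
# Illustrative evaluation of the parameter-count scaling law from
# Kaplan et al. (2020): L(N) = (N_c / N) ** alpha_N, predicting test
# loss (in nats) from non-embedding parameter count N, assuming data
# and compute are not bottlenecks.

ALPHA_N = 0.076   # fitted exponent from the paper
N_C = 8.8e13      # fitted constant from the paper

def predicted_loss(n_params: float) -> float:
    return (N_C / n_params) ** ALPHA_N

for n in (1e8, 1e9, 1e10, 1e11):
    print(f"N={n:.0e}  predicted loss ~ {predicted_loss(n):.3f}")
```

    The small exponent is the whole story of the episode's cost discussion: each 10x in parameters buys only a modest loss reduction, which is why efficiency tricks like distillation matter.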


    Listen on:

    • YouTube: https://youtube.com/@WisecubeAI/podcasts

    • Apple Podcast: https://apple.co/4kPMxZf

    • Spotify: https://open.spotify.com/show/1nG58pwg2Dv6oAhCTzab55

    • Amazon Music: https://bit.ly/4izpdO2


    Follow us:

    • Pythia Website: www.askpythia.ai

    • Wisecube Website: www.wisecube.ai

    • LinkedIn: www.linkedin.com/company/wisecube

    • Facebook: www.facebook.com/wisecubeai

    • Reddit: www.reddit.com/r/pythia/

    Mentioned Materials:

    - Jevons’ Paradox: https://en.wikipedia.org/wiki/Jevons_paradox

    - Scaling Laws for Neural Language Models: https://arxiv.org/abs/2001.08361

    - Distilling the Knowledge in a Neural Network: https://arxiv.org/abs/1503.02531

    - SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training: https://arxiv.org/abs/2501.17161

    - DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning: https://arxiv.org/abs/2501.12948

    - Reinforcement Learning: An Introduction (Sutton & Barto): https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf

    40 min