Episodes

  • No Lag: Building the Future of High-Performance Cloud with Nathan Goulding
    2025/06/09
    Warren talks with Nathan Goulding, SVP of Engineering at Vultr, about what it actually takes to run a high-performance cloud platform. They cover everything from global game server latency and hybrid models to bare metal provisioning and the power/cooling constraints that come with modern GPU clusters.

    The discussion gets into real-world deployment challenges like scaling across 32 data centers, edge use cases that actually matter, and how to design systems for location-sensitive customers, whether that’s due to regulation or performance. They also talk about where the hyperscalers have overcomplicated pricing, and where a simpler, flatter pricing model and optimized defaults are better for everyone.

    There’s a section on nuclear energy (yes, really), including SMRs, power procurement, and what it means to keep scaling compute with limited resources. If you're wondering whether your app actually needs high-performance compute or just better visibility into your costs, this is the episode.

    Picks
    • The Adventures In DevOps: Survey
    • Warren: Jetlag: The Game
    • Nathan: Money Heist (La Casa de Papel)
    1 hr 1 min
  • Ground Truth & Guided Journeys: Rethinking Data for AI with Inna Tokarev Sela
    2025/06/04
    Inna Tokarev Sela, CEO and founder of Illumex, joins the crew to break down what it really means to make your data “AI-ready.” This isn’t just about clean tables; it’s about semantic fabric, business ontologies, and grounding agents in your company's context to prevent the dreaded LLM hallucination. We dive into how modern enterprises just cannot build a single source of truth, no matter how hard they try, even though one is required to build effective agents that use the available knowledge graphs.

    The conversation unpacks democratizing data access and avoiding analytics anarchy. Inna explains how automation and graph modeling are used to extract semantic meaning from disconnected data stores, and how to resolve conflicting definitions. And yes, Warren finally coughs up what's so wrong with most dashboards.

    Lastly, we quickly get to the core philosophical questions of agentic systems and AGI, including why intuition is the real differentiator between humans and machines. Plus: storage cost regrets, spiritual journeys disguised as inference pipelines, and a very healthy fear of subscription-based sleep wearables.

    Picks
    • The Adventures In DevOps: Survey
    • Warren: The Non-Computability of Intuition
    • Will: The Arc Browser
    • Inna: Healthy GenAI skepticism
    53 min
  • Incident Vibing: The Self-Healing System - DevOps 242
    2025/05/29
    Sylvain Kalache, Head of Developer Relations at Rootly, joins us to explore the new frontier of incident response powered by large language models. We dive into the evolution of DevRel and how we meet the new challenges impacting our systems.

    We explore Sylvain's origin story in self-healing systems, dating back to his SlideShare and LinkedIn days. From ingesting logs via Fluentd to building early ML-driven RCA tools, he shares a vision of self-healing infrastructure that targets root causes rather than just restarting boxes. Plus, we trace the historical arc of deterministic and non-deterministic tools.

    The conversation shifts toward real-world applications, where we're combining logs, metrics, transcripts, and postmortems to give SREs superpowers. We get tactical on integrating LLMs, why fine-tuning isn't always worth it, and how the Model Context Protocol (MCP) could be the USB of AI ops, even though it is still insecure. We wrap by facing the harsh reality of "incident vibing" in a world increasingly built by prompts, not people, and how to prepare for it.

    Picks
    • Warren: There is no AI Revolution
    • Sylvain: Incident Vibing and Rootly Labs SRE event on April 24th
    1 hr 10 min
  • Decentralized Chaos: Web3 Infra, NodeOps, and the Art of Blockchain Load Balancing - DevOps 241
    2025/05/22
    This week, Paul Marston from Ankr joins the crew to unpack the madness that is modern blockchain infrastructure. From his wild career transition out of financial services into 24/7 node ops for Web3, Paul shares the brutal truth about uptime expectations, decentralization challenges, and why hard forks are more like enterprise schema upgrades with a community twist. If you’ve ever wondered why managing a blockchain node is like owning a temperamental pet server, this one’s for you.

    The team goes deep on the nitty-gritty of load balancing across dozens of chains, explaining why routing traffic to the “wrong” archive node could ruin your day—and how Ankr’s custom load balancer is basically magic for JSON-RPC calls. Warren tosses out wild scenarios about encrypted data smuggling via blockchain, while Will confesses his angry typing habit (yes, it’s back). The discussion gets even more fun with debates on innovation vs. rigor, Web2's forgotten best practices, and why testing in prod might not be such a dirty word after all.

    But don’t think it’s all crypto and code. Paul shares battle-won wisdom from running over 100 chains across bare metal, giving us a peek at the operational sophistication and automation involved. From Terraform templates to Docker configs, he walks through the process of onboarding new chains and tuning for performance. The episode also touches on emerging risks like data exfiltration via public blockchains, and why AI (used wisely) might just be the sidekick DevOps always needed.

    And of course there are memes; we talk a bit about this one: Tree Swing Product Development

    Picks
    • Warren: Dvorak Keyboard Setup and Logitech K295
    • Will: Quirky Record Player from Miniot
    • Paul: Super Whisper - Voice Transcription Tool
    1 hr 16 min
  • Observability in the CI/CD Pipeline with Adriana Villela - DevOps 240
    2025/05/15
    In this episode, Will and Warren welcome Adriana Villela — CNCF ambassador, Dynatrace advocate, and host of the Geeking Out podcast — for a wide-ranging conversation on observability in CI/CD pipelines. Adriana shares her journey from “On Call Me Maybe” to her own podcast, her work with OpenTelemetry, and why observability isn’t just for SREs anymore.

    The crew digs into how telemetry should be integrated across the software development lifecycle — from development to QA to production — and what that really looks like in modern teams. Adriana drops knowledge on CI/CD failures, distributed traces, and even how to bring observability to other parts of the business like recruiting and onboarding. She also explains how she got involved in the OpenTelemetry end-user SIG and what’s next for the observability movement.

    Things get personal as we trade war stories about SVN, terrible version control systems, reusable grocery bags, and the ethics of AI log parsers. Adriana closes with a powerful take: observability is a team sport, and the better we play it, the more effective (and environmentally conscious) our systems can become.

    Picks
    • Warren: Adventures In DevOps survey - How can we make it better for you?
    • Adriana: Bouldering — she recommends it both as a physical activity and a therapeutic mental reset, especially when traveling
    • Jillian: Expeditionary Force
    • Will: Iron Neck and Purpose & Prophet
    1 hr 21 min
  • Building Engineering Excellence with Ganesh Datta of Cortex - DevOps 239
    2025/05/08
    In this episode, I (flying solo today!) sat down with Ganesh Datta, the CTO and co-founder of Cortex, to explore what it really means to drive engineering excellence at scale. And spoiler: it’s not just about better dashboards or fancy developer tools—it’s about treating software development like the competitive advantage it is.

    We went deep into the why behind internal developer portals (IDPs) and how they’re transforming platform engineering, developer experience, and organizational maturity. Ganesh shares how Cortex came to life—from being paged at 2am for a mystery Game of Thrones-named microservice (yep, we've all been there), to realizing that every other business function had a system of record—except engineering.

    Key Takeaways:
    • IDPs are like CRMs for Engineering: Just as sales teams wouldn’t function without a CRM, modern engineering orgs shouldn’t be flying blind without a structured, centralized developer portal.
    • Engineering Excellence = Business Outcomes: Whether it’s reliability, security, or platform efficiency, IDPs help codify best practices and align teams toward measurable goals.
    • Start Small to Win Big: You don’t need to overhaul everything on day one. Start with a pain point you already know—like production readiness—and improve that incrementally.
    • SREs and Platform Engineers Love IDPs: IDPs give them the data, ownership visibility, and real-time checks they need, without the honor-system chaos.
    • Developer Experience is Just the Beginning: Tools like Cortex aren’t just about dev productivity—they’re about creating resilient, aligned, scalable engineering orgs.
    We also geeked out about everything from naming services (“Brewer” for a feature extraction tool? Chef’s kiss.) to the surprising power of reading 15 minutes before bed to improve sleep quality—yep, we went there!

    If you’re part of an engineering team (or leading one) and want to know how to move faster and smarter, this is the episode for you.
    51 min
  • Modern DevOps Challenges: Automation, AI, and Scaling in 2025 - DevOps 238
    2025/05/02
    In this episode of DevOps 238, we sat down with Zach Lloyd to dive into what’s really happening in the world of modern DevOps—from automation and AI to scaling systems and maintaining team culture in fast-paced environments.

    We talked about the evolving role of DevOps engineers, the shift toward platform engineering, and why tool sprawl is becoming a bigger issue than ever. Zach shared some powerful real-world lessons on implementing CI/CD pipelines, avoiding burnout in high-pressure environments, and how teams can stay aligned without drowning in Slack notifications or endless dashboards.

    One of our favorite takeaways? The idea that simplicity and communication still beat out fancy tooling—every time. We also touched on emerging trends like AI-assisted deployments, observability, and what DevOps might look like in 2026 and beyond.

    If you’re navigating legacy systems, scaling rapidly, or just trying to keep your team sane and productive, this episode’s packed with insights you won’t want to miss.

    Picks
    • Dungeon Crawler Carl (Book)
    • Dabble of DevOps AI Data Discovery Tool
    • The Impact of Generative AI on Critical Thinking (Research Paper)
    • Granola AI Meeting Notes
    • A Travel Guide to the Middle Ages (Book)
    1 hr 19 min
  • Matt Lea Discusses Cloud War Games and Elevating Everyday DevOps - DevOps 237
    2025/04/02
    We dive into the world of cloud architecture and engineering with a fascinating discussion led by our hosts Warren Parad and Will Button, and joined by our special guest, Matt Lea. Matt, hailing from Wisconsin, is the driving force behind innovative projects like CloudWarGames.com, a platform designed to enhance DevOps training and hiring through engaging problem-solving scenarios. As we explore his journey, from coaching gymnastics to developing digital training ecosystems, you'll discover how Matt's experiences shape his unique perspectives on technical challenges, team dynamics, and the ever-evolving landscape of cloud solutions. Whether you're curious about the technical intricacies of infrastructure or seeking inspiration for your own career path, this episode offers a captivating look at the intersection of technology, creativity, and human connections. So, sit back, relax, and get ready to explore the world of DevOps in a whole new way.
    1 hr 14 min