Scaling AI at Inference: The Road to Agent-Driven ROI

Roman Chernin joins Patrick Moorhead and Daniel Newman to discuss how AI infrastructure is shifting from training to inference, why Nebius built Token Factory to optimize system-level performance, and how agent-driven ROI will define AI success in 2026 and beyond.

AI has moved beyond model training; inference is the new frontier.

This Six Five Webcast features Patrick Moorhead and Daniel Newman, joined by Roman Chernin, Co-founder & Chief Business Officer at Nebius, to explore how AI infrastructure is evolving from massive training clusters to production-grade inference systems built for agents, open-source models, and real ROI.

Nebius positions itself as an AI-specialized cloud, purpose-built to optimize inference workloads at scale. As AI shifts from research labs to product companies and enterprise agents, performance, cost efficiency, and system-level orchestration have become the defining battleground.

Key Takeaways:

🔹 The shift from training to inference: Why budgets, architectures, and customer priorities are changing.

🔹 The Nebius Token Factory: How full-stack optimization across hardware, software, and orchestration improves unit economics.

🔹 Open-source in the enterprise: Why flexibility, tunability, and cost control matter as much as frontier intelligence.

🔹 Agent-driven ROI: Why 2026 will demand measurable business outcomes, not just model benchmarks.

🔹 Performance beyond GPUs: How CPUs, workload orchestration, caching, quantization, and stack optimization combine to define success.

Nebius combines next-generation silicon access with a purpose-built cloud stack and white-glove technical support to help customers ship AI products that are fast, affordable, and compliant at scale.

The next phase of AI won’t be defined by a model; it will be defined by who can run inference most efficiently.

To learn more about how Nebius is scaling AI for real-world inference and agent-driven ROI, read about it here and explore the full solution here.

Watch the full webcast at sixfivemedia.com or subscribe to our YouTube channel so you never miss an episode.

Disclaimer: Six Five Media is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.
