Content provided by The New Stack Podcast and The New Stack. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by The New Stack Podcast and The New Stack or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, follow the process described at https://hu.player.fm/legal.
Why the CNCF's New Executive Director is Obsessed With Inference

25:09
 

Jonathan Bryce, the new CNCF executive director, argues that inference—not model training—will define the next decade of computing. Speaking at KubeCon North America 2025, he emphasized that while the industry obsesses over massive LLM training runs, the real opportunity lies in efficiently serving these models at scale. Cloud-native infrastructure, he says, is uniquely suited to this shift because inference requires real-time deployment, security, scaling, and observability—strengths of the CNCF ecosystem.

Bryce believes Kubernetes is already central to modern inference stacks, with projects like Ray, KServe, and emerging GPU-oriented tooling enabling teams to deploy and operationalize models. To bring consistency to this fast-moving space, the CNCF launched a Kubernetes AI Conformance Program, ensuring environments support GPU workloads and Dynamic Resource Allocation. With AI agents poised to multiply inference demand by executing parallel, multi-step tasks, efficiency becomes essential. Bryce predicts that smaller, task-specific models and cloud-native routing optimizations will drive major performance gains. Ultimately, he sees CNCF technologies forming the foundation for what he calls “the biggest workload mankind will ever have.”

Learn more from The New Stack about inference:

Confronting AI’s Next Big Challenge: Inference Compute

Deep Infra Is Building an AI Inference Cloud for Developers

Join our community of newsletter subscribers to stay on top of the news and at the top of your game.


Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

306 episodes
