Artwork

A tartalmat a Hugo Bowne-Anderson biztosítja. Az összes podcast-tartalmat, beleértve az epizódokat, grafikákat és podcast-leírásokat, közvetlenül a Hugo Bowne-Anderson vagy a podcast platform partnere tölti fel és biztosítja. Ha úgy gondolja, hogy valaki az Ön engedélye nélkül használja fel a szerzői joggal védett művét, kövesse az itt leírt folyamatot https://hu.player.fm/legal.
Player FM - Podcast alkalmazás
Lépjen offline állapotba az Player FM alkalmazással!

Episode 40: What Every LLM Developer Needs to Know About GPUs

1:43:34
 
Megosztás
 

Manage episode 457226388 series 3317544
A tartalmat a Hugo Bowne-Anderson biztosítja. Az összes podcast-tartalmat, beleértve az epizódokat, grafikákat és podcast-leírásokat, közvetlenül a Hugo Bowne-Anderson vagy a podcast platform partnere tölti fel és biztosítja. Ha úgy gondolja, hogy valaki az Ön engedélye nélkül használja fel a szerzői joggal védett művét, kövesse az itt leírt folyamatot https://hu.player.fm/legal.

Hugo speaks with Charles Frye, Developer Advocate at Modal and someone who really knows GPUs inside and out. If you’re a data scientist, machine learning engineer, AI researcher, or just someone trying to make sense of hardware for LLMs and AI workflows, this episode is for you.

Charles and Hugo dive into the practical side of GPUs—from running inference on large models, to fine-tuning and even training from scratch. They unpack the real pain points developers face, like figuring out:

  • How much VRAM you actually need.
  • Why memory—not compute—ends up being the bottleneck.
  • How to make quick, back-of-the-envelope calculations to size up hardware for your tasks.
  • And where things like fine-tuning, quantization, and retrieval-augmented generation (RAG) fit into the mix.

One thing Hugo really appreciate is that Charles and the Modal team recently put together the GPU Glossary—a resource that breaks down GPU internals in a way that’s actually useful for developers. We reference it a few times throughout the episode, so check it out in the show notes below.

🔧 Charles also does a demo during the episode—some of it is visual, but we talk through the key points so you’ll still get value from the audio. If you’d like to see the demo in action, check out the livestream linked below.

This is the "Building LLM Applications for Data Scientists and Software Engineers" course that Hugo is teaching with Stefan Krawczyk (ex-StitchFix) in January. Charles is giving a guest lecture at on hardware for LLMs, and Modal is giving all students $1K worth of compute credits (use the code VG25 for $200 off).

LINKS

  continue reading

61 epizódok

Artwork
iconMegosztás
 
Manage episode 457226388 series 3317544
A tartalmat a Hugo Bowne-Anderson biztosítja. Az összes podcast-tartalmat, beleértve az epizódokat, grafikákat és podcast-leírásokat, közvetlenül a Hugo Bowne-Anderson vagy a podcast platform partnere tölti fel és biztosítja. Ha úgy gondolja, hogy valaki az Ön engedélye nélkül használja fel a szerzői joggal védett művét, kövesse az itt leírt folyamatot https://hu.player.fm/legal.

Hugo speaks with Charles Frye, Developer Advocate at Modal and someone who really knows GPUs inside and out. If you’re a data scientist, machine learning engineer, AI researcher, or just someone trying to make sense of hardware for LLMs and AI workflows, this episode is for you.

Charles and Hugo dive into the practical side of GPUs—from running inference on large models, to fine-tuning and even training from scratch. They unpack the real pain points developers face, like figuring out:

  • How much VRAM you actually need.
  • Why memory—not compute—ends up being the bottleneck.
  • How to make quick, back-of-the-envelope calculations to size up hardware for your tasks.
  • And where things like fine-tuning, quantization, and retrieval-augmented generation (RAG) fit into the mix.

One thing Hugo really appreciate is that Charles and the Modal team recently put together the GPU Glossary—a resource that breaks down GPU internals in a way that’s actually useful for developers. We reference it a few times throughout the episode, so check it out in the show notes below.

🔧 Charles also does a demo during the episode—some of it is visual, but we talk through the key points so you’ll still get value from the audio. If you’d like to see the demo in action, check out the livestream linked below.

This is the "Building LLM Applications for Data Scientists and Software Engineers" course that Hugo is teaching with Stefan Krawczyk (ex-StitchFix) in January. Charles is giving a guest lecture at on hardware for LLMs, and Modal is giving all students $1K worth of compute credits (use the code VG25 for $200 off).

LINKS

  continue reading

61 epizódok

Minden epizód

×
 
Loading …

Üdvözlünk a Player FM-nél!

A Player FM lejátszó az internetet böngészi a kiváló minőségű podcastok után, hogy ön élvezhesse azokat. Ez a legjobb podcast-alkalmazás, Androidon, iPhone-on és a weben is működik. Jelentkezzen be az feliratkozások szinkronizálásához az eszközök között.

 

Gyors referencia kézikönyv

Hallgassa ezt a műsort, miközben felfedezi
Lejátszás