Harnessing the Power of Python Polars
What are the advantages of using Polars for your Python data projects? When should you use the lazy or eager APIs, and what are the benefits of each? This week on the show, we speak with Jeroen Janssens and Thijs Nieuwdorp about their new book, Python Polars: The Definitive Guide.
Jeroen and Thijs describe how they were introduced to Polars while working at Xomnia. They were converting a large data project to Python and saw surprising speed increases using the new library.
We discuss converting projects from pandas to Polars, getting away from indexes, consistent syntax, and using lazy vs eager APIs. Along the way, Jeroen and Thijs offer tips for getting the most out of Polars in your code.
We dig into the process of writing a definitive guide and the advantages of working collaboratively on a book project. They also share resources for practicing data wrangling and building visualizations with PydyTuesday.
Course Spotlight: Working With Python Polars
Welcome to the world of Polars, a powerful DataFrame library for Python. In this video course, you’ll get a hands-on introduction to Polars’ core features and see why this library is catching so much buzz.
Topics:
- 00:00:00 – Introduction
- 00:02:47 – Polars start at Xomnia
- 00:04:08 – Putting Polars into production
- 00:07:18 – Realizing the speed differences
- 00:08:49 – Converting the project from R to Python
- 00:14:34 – How did Polars improve the project?
- 00:16:34 – Making the code more ergonomic and readable
- 00:19:21 – Only grabbing the data that is needed
- 00:20:37 – Titling and deciding to write the book
- 00:24:40 – Advantages to collaboration
- 00:29:34 – What were you excited to include in the book?
- 00:31:55 – Working with different engines and NVIDIA’s CUDA
- 00:35:05 – Defining a Polars expression
- 00:36:11 – Transitioning from pandas to Polars
- 00:37:34 – Not needing an index
- 00:39:56 – What inspired the syntax?
- 00:45:01 – Defining lazy vs eager workflows
- 00:49:16 – Examples covered in first chapter preview
- 00:51:51 – Video Course Spotlight
- 00:53:14 – Data formats and Arrow
- 00:55:41 – Working with NaN, null, or None
- 00:58:11 – Measuring performance through a benchmark
- 00:59:12 – Advantages to working with the Discord community
- 01:02:32 – Code examples and applying the techniques
- 01:03:34 – PydyTuesday
- 01:05:47 – What are you excited about in the world of Python?
- 01:09:21 – What do you want to learn next?
- 01:13:26 – What’s the best way to follow your work online?
- 01:14:14 – Thanks and goodbye
Show Links:
- Python Polars: The Definitive Guide
- Janssens & Nieuwdorp - What we learned by converting a large codebase from Pandas to Polars - YouTube
- Polars — DataFrames for the new era
- polars · PyPI
- Xomnia - Home Page
- Episode #140: Speeding Up Your DataFrames With Polars
- Data Science at the Command Line - Jeroen Janssens
- Tidyverse
- PySpark Overview — PySpark 4.0.0 documentation
- Episode #193: Wes McKinney on Improving the Data Stack & Composable Systems
- Apache Arrow
- TPC-H Homepage
- Community – Python Polars: The Definitive Guide
- pydytuesday: A Python package to download TidyTuesday datasets
- PydyTuesday - Python How-to Videos - YouTube
- Astral: High-performance Python tooling
- Episode #238: Charlie Marsh: Accelerating Python Tooling With Ruff and uv
- uv: An extremely fast Python package and project manager, written in Rust.
- PEP 723 – Inline script metadata
- Inline script metadata - Python Packaging User Guide
- Package Your Python Code as a CLI - PyData London 25 - YouTube
- marimo - A next-generation Python notebook
- The Rust Programming Language Book
- Pimsleur - Learn New Languages Online
- Official Rosetta Stone - How Language Is Learned
- Thijs Nieuwdorp
- Jeroen Janssens
- Python Polars: The Definitive Guide