Pandas 2, Dask or Polars? Quickly Tackling Larger Data on a Single Machine

Abstract: 

Pandas 2 brings new Arrow data types, faster calculations and better scalability. Dask scales Pandas across cores. Polars is a new competitor to Pandas designed around Arrow with native multicore support. Which should you choose for modern research workflows? We'll solve a ""just about fits in ram"" data task using the 3 solutions, talking about the pros and cons so you can make the best choice for your research workflow. You'll leave with a clear idea of whether Pandas 2, Dask or Polars is the tool to invest in.

Do you still need 5x working RAM for Pandas operations (probably not!)? Can Pandas string operations actually be fast (sure)? Since Polars uses Arrow data structures, can we easily use tools like Scikit-learn and matplotlib (yes-maybe)? What limits do we still face?

Bio: 

Ian is a Chief Data Scientist, has helped co-organise the annual PyDataLondon conference raising $100k+ annually for the open source movement along with the associated 12,000+ member monthly meetup. Using data science he's helped clients find $2M in recoverable fraud, created the core IP which opened funding rounds for automated recruitment start-ups and diagnosed how major media companies can better supply recommendations to viewers. He gives conference talks internationally often as keynote speaker and is the author of the bestselling O'Reilly book High Performance Python (2nd edition). He has over 25 years of experience as a senior data science leader, trainer and team coach. For fun he's walked by his high-energy Springer Spaniel, surfs the Cornish coast and drinks fine coffee. Past talks and articles can be found at:

https://ianozsvald.com/
https://notanumber.email/
https://github.com/ianozsvald/
https://twitter.com/ianozsvald
https://fosstodon.org/@ianozsvald
https://www.linkedin.com/in/ianozsvald/

Open Data Science

 

 

 

Open Data Science
One Broadway
Cambridge, MA 02142
info@odsc.com

Privacy Settings
We use cookies to enhance your experience while using our website. If you are using our Services via a browser you can restrict, block or remove cookies through your web browser settings. We also use content and scripts from third parties that may use tracking technologies. You can selectively provide your consent below to allow such third party embeds. For complete information about the cookies we use, data we collect and how we process them, please check our Privacy Policy
Youtube
Consent to display content from Youtube
Vimeo
Consent to display content from Vimeo
Google Maps
Consent to display content from Google