Learning (or Improving at) pandas

Data is central to any analytics project. In Python, by far the most commonly used package for managing data is pandas. In this short post, I will offer a few suggestions for those of you who want to get up to speed with pandas or take your skills to the next level. There are a huge number of resources out there. Hopefully this will help you choose where to start.

The pandas package was originally created by Wes McKinney. He has also literally written the book on pandas. As you’ll see, there are lots of free resources on pandas, but you might find that the book makes sense if you like to learn in that format.

Free Introductory Resources

The easiest place to start is with the official pandas documentation, in particular the introductory 10 Minutes to pandas tutorial.

Another very good and important introductory resource, also from the official documentation, is this overview of the data structures used in pandas.
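
As a quick illustration of the two core structures that overview covers, here is a minimal sketch of a Series and a DataFrame. The player names and numbers are made up for illustration and do not come from any of the linked resources.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
points = pd.Series([25, 18, 31], index=["Ann", "Bo", "Cy"], name="points")

# A DataFrame is a two-dimensional table; each column is a Series
# and all columns share the same index.
stats = pd.DataFrame(
    {"points": [25, 18, 31], "assists": [7, 11, 4]},
    index=["Ann", "Bo", "Cy"],
)

# Selecting a single column returns a Series.
assists = stats["assists"]

# Label-based lookup of one value with .loc (row label, column label).
bo_points = stats.loc["Bo", "points"]
```

If these two objects make sense to you, most of the introductory material will be easier to follow.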

If you like video instruction, you might find these tutorials helpful. The tutorials are in the form of a Jupyter notebook, with links to YouTube videos. You can also find some updates to these materials here.

Allison Parrish of NYU has a great introduction to pandas in the form of a Jupyter notebook.

Free Intermediate Resources

If you are comfortable with the basics of pandas, there are a number of ways to get to the next level.

First, there is the cookbook from the official pandas documentation, which has lots of short examples for various common tasks.
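
To give a sense of the kind of short recipe you will find there, here is a minimal sketch of two common tasks, filtering rows and computing a grouped average. The team labels and scores are hypothetical, and this is my own example rather than one taken from the cookbook.

```python
import pandas as pd

# Hypothetical game results, invented for this example.
games = pd.DataFrame(
    {
        "team": ["A", "B", "A", "B"],
        "points": [100, 90, 110, 95],
    }
)

# Boolean indexing: keep only the rows where points exceed 95.
high_scoring = games[games["points"] > 95]

# Split-apply-combine: average points per team.
avg_points = games.groupby("team")["points"].mean()
```

Here `avg_points` is a Series indexed by team, so `avg_points["A"]` gives the average for team A.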

The blogger Chris Moffitt writes at Practical Business Python and has a number of very helpful posts on pandas. There are dozens of posts under the pandas tag, but some of the most useful ones are:

Data scientist Chris Albon has a lot of great posts about pandas. Just look at the links under “Data Wrangling” on his site.

More Advanced Material

Data scientist Greg Reda has several important pandas tutorials on his site. Here is the first one.

Data scientist Ted Petrou has these suggestions on how to go from beginner to pandas expert. He is also the author of this book, which I can recommend if you are inclined to master pandas.

This Jupyter notebook is a very nice walk-through of various tasks in pandas. It also includes links to documentation explaining the various topics.

In the category of more advanced but important material, I would include this series of posts by Tom Augspurger.

Wrapping Up

I hope this material helps you learn or improve at pandas. I also hope that the examples on this site will help you become more comfortable using this powerful package in your sports analytics projects.

If you can’t find the answer to a question you have about pandas in these resources, check out the Stack Overflow section on pandas. It’s a good bet that your question (or a similar one) has already been asked and answered there.
