The very first thing that comes to mind when the words statistics and programming are put together is definitely R. The R language is one of the most widely used languages in statistics and data analysis. But what if you had to embed the statistical part of your work in a bigger scripting project? What if you need to use the output of your statistical analysis as input for your script?
Python users can enjoy Pandas
There are actually many solutions. You can use pipelines to connect different scripts, or you can use “bridges”, libraries designed to connect R with other languages, such as R to Python or JRI for Java. Python users have another option: the very popular pandas library, which brings the R philosophy into a full Python library.
I doubt many readers out there are totally unaware of this library. Anyway, I remind you that you can have a look at pandas and download it from the official website.
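To give a feel for what "using the output of your statistical analysis as input for your script" can look like, here is a minimal sketch. The column names and numbers are invented for illustration only; it assumes nothing beyond a working pandas installation. It computes a per-group mean, roughly what you might do with aggregate() or tapply() in R, and then hands the result to ordinary Python code.

```python
import pandas as pd

# Hypothetical measurements, invented for this illustration.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b"],
    "value": [1.0, 3.0, 2.0, 4.0, 6.0],
})

# Per-group mean of the "value" column.
group_means = df.groupby("group")["value"].mean()

# The result is a regular Python object, so the rest of the script
# can use it directly: here we pick the group with the highest mean.
best_group = group_means.idxmax()

print(group_means.to_dict())
print(best_group)
```

The point is not the statistics, which are trivial here, but that the analysis and the code consuming it live in the same script, with no pipeline or bridge in between.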
A very simple 3-hour seminar!
For those who are just starting to use this library and want to take their very first steps, the video embedded at the top can be a good tool. Wes McKinney, the creator of pandas, gives a hands-on introduction to manipulating and analyzing large and small structured data sets in Python using the pandas library. So, if you have 3 hours to spend on this, you are very, very welcome (WTF Wes!?!?).
Something about Wes McKinney
I have been hearing about him for a while, and he really looks like a proven authority on Python data analysis. A San Francisco-based Python hacker and entrepreneur, he is also the author of the book Python for Data Analysis. I often keep an eye on his work through his blog and his Twitter profile.