Why Python?

It’s a great language

The best reason to learn Python is also the hardest to articulate to someone who is just starting to work with it: in terms of structure and syntax, it’s a beautifully designed, intuitive, yet exceedingly powerful general-purpose programming language.

Python was explicitly designed (a) so that code written in Python would be easy for humans to read, and (b) to minimize the amount of time required to write code. Indeed, its ease of use is the reason that, according to a recent study, eight of the top 10 CS programs in the country use Python in their intro to computer science classes.

Generalizable skills > non-generalizable skills

At the same time, however, it’s a real, general-purpose programming language. Major companies like Google and Dropbox use Python in their core applications.

This sets Python apart from “Domain Specific Languages” like R, which are highly tuned to serve a specific purpose – like statistics – and a specific audience. John Chambers designed S, the language on which R is based, with the goal of making a language that non-programmers could get started with quickly, but which could also be used by “power users”. To a large degree he succeeded, as is evidenced by R’s uptake. But making the language so accessible to non-programmers required many compromises. R only really serves one purpose – statistical analysis – and the language syntax has all sorts of oddities and warts that come from this original bargain.

Python does require a little more training to get started with (though not that much more), but in exchange there’s no ceiling to what you can do with it. If you learn Python, you’re learning a full programming language. This means that if you ever need to work in a different language like Java or C, understand code someone else has written, or otherwise deal with a programming problem, your background in a real programming language will give you a good conceptual foundation for whatever you come across. Indeed, this is another reason top CS programs teach in Python.

Of all the reasons to choose Python, I think this is by far the most compelling. Python sets you up to understand and operate in the broader programming world. And if you’re at all interested in doing computational social science, building a generalizable programming skill just makes you more flexible. R is great if you want to just run regressions or do things that perfectly fit the mold someone has created with an R function. But as social scientists keep finding new sources of data (like text) and new ways to analyze it, the more literate you are in general programming, the more prepared you will be to steal tools from other disciplines and to write new tools yourself.
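To make the point about text as data a little more concrete, here is a minimal sketch of the kind of small tool you might write yourself. It counts the most frequent words in a string using only Python's standard library. The function name `top_words` and the sample sentence are made up for illustration, and the tokenization rule (lowercase letters and apostrophes) is a simplifying assumption, not a recommendation for real text analysis.

```python
import re
from collections import Counter


def top_words(text, n=3):
    """Return the n most frequent words in text, ignoring case.

    Assumption: a "word" is any run of letters or apostrophes.
    Real projects usually need more careful tokenization.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


# Hypothetical sample text, invented for this example.
sample = "The senator spoke. The crowd cheered, and the senator smiled."
print(top_words(sample, 2))  # [('the', 3), ('senator', 2)]
```

Nothing here is specific to statistics: the same general-purpose building blocks (regular expressions, counters, functions) apply to almost any new kind of data you run into.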

Python only, or Python and ...

Personally, I find the idea of working in a single programming environment incredibly appealing. I first came to Python because I was doing my econometrics in Stata, my GIS work in ArcGIS, and my network analysis in R, and I just wanted to unify my workflow. For me, one of the best parts of Python is that I’m confident I can do anything I want in this one environment.

But not everyone feels that way, and many people use Python AND other tools like R, moving back and forth depending on the application at hand. Even if you plan to mix and match, Python’s generality still pays off: anecdotally, many people say that getting better at Python has made them much better programmers, not just in Python, but also in R or Stata.

Performance

Performance never comes into play for the vast majority of social science applications, so this is not one of the top reasons to choose Python. However, if you find yourself in a situation where it does, Python has some major performance advantages over most other high-level languages, including MATLAB and R, both in terms of computation speed and memory use (R is a notorious memory hog).

More importantly, though, there are newer tools (such as Numba and Cython) that make it possible to write code in Python that runs at nearly the speed of code written in C or Fortran – orders of magnitude faster than R or native Python. Again, this is a second-order consideration in most cases, but it is another example of how Python gives you options no matter what the future brings.
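The tools that compile Python code are third-party packages, so the sketch below does not use any of them. Instead, it uses only the standard library to hint at where the gap comes from: it times an explicit Python loop against the built-in `sum`, which in the standard CPython interpreter is implemented in C. The function name `loop_sum` and the input size are arbitrary choices for illustration, and the timings will differ from machine to machine.

```python
import timeit


def loop_sum(values):
    """Add up values one at a time in an interpreted Python loop."""
    total = 0
    for v in values:
        total += v
    return total


# Arbitrary test data: the integers 0 through 999,999.
data = list(range(1_000_000))

# Both approaches must give the same answer before timing means anything.
assert loop_sum(data) == sum(data)

# Time five runs of each approach.
loop_time = timeit.timeit(lambda: loop_sum(data), number=5)
builtin_time = timeit.timeit(lambda: sum(data), number=5)

print(f"explicit loop: {loop_time:.3f}s, built-in sum: {builtin_time:.3f}s")
```

On a typical CPython installation the built-in version should come out noticeably faster, though by how much depends on your machine. Compiling tools aim to bring that kind of speed to loops you write yourself.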

Why NOT Python?

There is one huge reason one might choose to use R over Python, in my view: colleagues. If you know lots of people who work with R, then choosing R means that (a) you can turn to the person next to you and ask for help, and (b) if you co-author, collaboration will be easier. Python has a great support community and mailing lists, but there is no substitute for personal help.