Welcome to Data Analysis in Python!
Looking for Python for Social Scientists?
That’s us! While the focus of this site is still social scientists, lots of people with similar needs in other disciplines (like astronomy) are finding the site useful, so I thought I’d formally broaden the frame to welcome those users. If you are from another discipline and have suggestions for how to improve the content here to meet your needs, please let me know!
Python is an increasingly popular tool for data analysis. In recent years, a number of libraries have reached maturity, allowing R and Stata users to take advantage of the beauty, flexibility, and performance of Python without sacrificing the functionality these older programs have accumulated over the years.
This site is designed to offer an introduction to Python specifically tailored for social scientists and people doing applied data analysis – users with little or no serious programming experience who just want to get things done, and who have experience with programs like R and Stata but are anxious for something better.
- Core Skill Sequence: A collection of four numbered tutorials that cover the core skills everyone needs to work in Python in social science. I recommend you visit these in sequence: a guide to setting up Python on your computer using the Anaconda distribution, an intro to Python for those not familiar with the language, an introduction to the pandas library for working with tabular data (analogous to data.frames in R, or everything you ever did in Stata), and a guide to installing libraries to expand Python.
- Specific Resources for Different Research Topics: “topic” pages, which you should feel free to jump between as appropriate for your purposes: statsmodels, quantecon, and stan for econometrics, machine learning with scikit-learn, seaborn and ggplot for graphing, network analysis using igraph, geo-spatial analysis, ways to accelerate Python, big data tools, and text analysis libraries. The topic pages also include two topics that are a little unusual, but that I think are potentially quite useful: a guide to getting effective help online, and resources on evidence-based research into how to teach programming, for anyone teaching this material.
- Resources for Other Software Tools: Resources on tools and programs you may come across while using Python, with a description of each tool, guidance on what you most need to know, and links to other tutorials. These include pages on the Command Line, iPython, and Git and Github.
Ready to get started? Head on over to Setup!
Questions or comments? Please send them my way! Feedback of all sorts is greatly appreciated, and if you have any experience with GitHub, suggested changes to this site can also be submitted as pull requests here.
Contents:
- Why Python?
- Note to R Users
- Note to Stata Users
- 1. Setting Up Python
- 2. Basic Python
- 3. Pandas
- 4. Installing Packages
- Machine learning
- GIS in Python
- Network Analysis
- Making Python faster
- Big Data / Parallelization
- Text Analysis
- Getting Help
- Teaching Programming
- R-to-Python Table
- ST: iPython
- ST: Command Line
- ST: Git and Github