Python or R for Data Science?

Dr. Aimee Schwab-McCoy

A question we often hear from instructors is: how do you choose the right language for your data science courses?

The two primary languages used in data science are Python and R. In fact, zyBooks publishes two programming versions of our foundational introduction to data science – one for each language. (And another one without coding.)

To give you some perspective, here’s a quick chart comparing Python to R:

| Python | R |
| --- | --- |
| General-purpose language widely used in industry and intro CS courses | Used by statisticians, data analysts, and research scientists |
| Very flexible for multiple applications | Specifically written for data and statistical analysis, displaying graphics, and statistical modeling |
| Open source and supports object-oriented programming | Open source and supports object-oriented programming |
| Commonly used data science packages are pandas, seaborn, and scikit-learn | Commonly used data science packages are tidyr, ggplot2, and dplyr |
| Cleaner syntax; students find Python easier to learn | More difficult syntax, but commonly used data science packages within the tidyverse ecosystem are designed to work together |
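
To make the comparison a little more concrete, here is a minimal sketch of a typical first data science task in Python: grouping rows and summarizing them with pandas. The data, the column names (`section`, `score`), and the variable names are all hypothetical and invented for illustration; they do not come from any zyBooks material.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
scores = pd.DataFrame({
    "section": ["A", "A", "B", "B"],
    "score": [80, 90, 70, 90],
})

# Group the rows by section, then compute the mean score within each group.
mean_by_section = scores.groupby("section")["score"].mean()

print(mean_by_section)
```

In R, a comparable summary would typically be written with dplyr's `group_by()` and `summarise()`, so students meet the same idea in either language with different syntax.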

In this short video, data science professor and zyBooks co-author Dr. Aimee Schwab-McCoy walks you through how to pick the right language for your students: 

Author Bio

Dr. Aimee Schwab-McCoy

Aimee Schwab-McCoy is the Senior Manager for Content Development in Data Science, Mathematics, and Statistics. She completed her PhD in Statistics at the University of Nebraska-Lincoln (2015). Before joining zyBooks in 2022, Dr. Schwab-McCoy was an Assistant Professor and Data Science Program Director at Creighton University, and a Lecturer at the Institute of Technology Sligo. Dr. Schwab-McCoy has published several articles in statistics and data science education, and has received awards for teaching statistics in the health sciences.