Chapter 3: Get out the vote

In the third case study, you will use data on attitudes and beliefs in the United States to predict voter turnout. You will apply your skills in dealing with imbalanced data and explore more resampling options.

1. Predicting voter turnout from survey data

2. Choosing an appropriate model

3. Exploring the VOTER data

4. Visualization for exploratory data analysis

5. Imbalanced data

6. Fit a simple model

7. VOTE 2016

8. Training and testing data

9. Upsampling for imbalanced data

10. Cross-validation

11. Understanding cross-validation

12. Training models with cross-validation

13. Comparing model performance

14. Confusion matrix for your training data

15. Confusion matrix for your testing data

16. Which model is best?

About this course

This is a free, open-source course on supervised machine learning in R using the caret package. In this course, you'll work through four case studies and practice skills from exploratory data analysis through model evaluation. Ines Montani designed the web framework that runs this course, and Florencia D'Andrea helped build the site.

Contributions and comments on how to improve this course are welcome! Please file an issue or submit a pull request if you find something that could be fixed or improved.

Creative Commons License

About me

My name is Julia Silge and I'm a data scientist and software engineer at RStudio, where I build modeling tools. I am both an international keynote speaker and a real-world practitioner focused on data analysis and machine learning practice. I love making beautiful charts and communicating about technical topics with diverse audiences.