
2. Linear Regression for Regression, Logistic Regression for Classification and Statistical Forecasting

In this chapter, our goal is to familiarize ourselves with the use of linear regression for regression problems and logistic regression for classification problems.

Our main learning objectives are:

  1. Become familiar with the terminology used in classification and regression problems

  2. Become familiar with some classifiers available in Scikit-Learn

  3. Understand how a machine learning algorithm can be implemented from scratch

  4. Apply classifiers and linear models to environmental science problems
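To make the distinction between the two problem types concrete, here is a minimal sketch using Scikit-Learn's `LinearRegression` and `LogisticRegression`. The synthetic data, the true slope and intercept (3 and 2), and the threshold of 5 used to create class labels are all assumptions chosen for illustration; they do not come from the course exercises.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Synthetic data (an assumption for illustration, not course data).
rng = np.random.default_rng(seed=0)
X = rng.uniform(0, 10, size=(200, 1))

# Regression: the target is a continuous value.
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 0.5, size=200)
reg = LinearRegression().fit(X, y)
slope, intercept = reg.coef_[0], reg.intercept_

# Classification: the target is a discrete label (0 or 1).
labels = (X[:, 0] > 5.0).astype(int)
clf = LogisticRegression().fit(X, labels)
accuracy = clf.score(X, labels)  # fraction of correctly labeled samples

print(f"slope={slope:.2f}, intercept={intercept:.2f}, accuracy={accuracy:.2f}")
```

Both estimators share the same `fit` interface; what differs is the kind of target they expect and the kind of prediction they return.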

Our exercises adapt the notebooks that accompany Géron’s Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, which are available on GitHub, as well as material from Wilks’ Statistical Methods in the Atmospheric Sciences.

We will be relying on Scikit-Learn, whose documentation is available online. The notebooks assume that you will run them on Google Colab, though everything can also be run locally: only a handful of lines use Google-specific libraries.

If you are struggling with some of the exercises, do not hesitate to:

  • Use a direct Internet search, or Stack Overflow

  • Ask your neighbor(s), the teacher, or the TA for help

  • Debug your program, e.g. by following this tutorial

  • Use assertions, e.g. by following this tutorial
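As a small illustration of the last hint, the sketch below uses `assert` statements to check assumptions about an input before computing with it. The helper `standardize` is a hypothetical function written for this example, not part of the course notebooks.

```python
import numpy as np

def standardize(x):
    """Rescale x to zero mean and unit standard deviation (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # Fail early, with a readable message, if the input is not what we expect.
    assert x.ndim == 1, f"expected a 1-D array, got shape {x.shape}"
    std = x.std()
    assert std > 0, "cannot standardize a constant array"
    return (x - x.mean()) / std

z = standardize([1.0, 2.0, 3.0, 4.0])
# Check the result as well as the input.
assert np.isclose(z.mean(), 0.0) and np.isclose(z.std(), 1.0)
```

A failed assertion raises `AssertionError` at the line where the assumption breaks, which is usually much closer to the actual bug than the error you would otherwise see later.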

If you’re done early, consider:

  • Trying out the notebook’s bonus exercises

  • Helping students around you if applicable

  • Giving feedback on how to improve this notebook (typos, hints, exercises that may be improved/removed/added, etc.) by messaging the teacher and TA(s) on Moodle

  • Working on your final project for this course