2. Linear Regression for Regression, Logistic Regression for Classification and Statistical Forecasting
In this chapter, our goal is to familiarize ourselves with the use of linear regression for regression problems and logistic regression for classification problems.
Our main learning objectives are:
Become familiar with the terminology used in classification and regression problems
Become familiar with some classifiers available in Scikit-Learn
Understand how a machine learning algorithm can be implemented from scratch
Apply classifiers and linear models to environmental science problems
Our exercises adapt the notebooks that accompany Géron’s Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, which are available on GitHub, as well as material from Wilks’ Statistical Methods in the Atmospheric Sciences.
We will rely on Scikit-Learn, whose documentation you can find here. The notebooks assume that you will run them on Google Colab, though everything can also be run locally: only a handful of lines use Google-specific libraries.
If you are struggling with some of the exercises, do not hesitate to:
Use a direct Internet search or Stack Overflow
Ask your neighbor(s), the teacher, or the TA for help
Debug your program, e.g. by following this tutorial
Use assertions, e.g. by following this tutorial
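As a small illustration of the last suggestion, here is a minimal sketch of how assertions can catch mistakes early. The function `mean_squared_error` below is a hypothetical helper written for this example; it is not taken from the course notebooks or from Scikit-Learn.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions.

    Hypothetical helper for illustration only.
    """
    # Fail early, with a readable message, if the inputs cannot be compared.
    assert len(y_true) == len(y_pred), (
        f"length mismatch: {len(y_true)} targets vs {len(y_pred)} predictions"
    )
    assert len(y_true) > 0, "need at least one sample"
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Sanity checks on cases whose answer is easy to work out by hand.
assert mean_squared_error([1.0, 2.0], [1.0, 2.0]) == 0.0
assert mean_squared_error([0.0, 0.0], [1.0, 3.0]) == 5.0
```

If one of these checks fails, Python raises an `AssertionError` at the offending line, which usually points to the bug faster than a wrong number appearing much later in the notebook.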
If you’re done early, consider:
Trying out the notebook’s bonus exercises
Helping students around you if applicable
Giving feedback on how to improve this notebook (typos, hints, exercises that may be improved/removed/added, etc.) by messaging the teacher and TA(s) on Moodle
Working on your final project for this course