Recurrent Neural Networks and Hydrological Modeling

7. Recurrent Neural Networks and Hydrological Modeling

In this chapter, the learning objectives are:

Define a recurrent neural network
Distinguish recurrent from convolutional neural networks
Discuss the forecasting of environmental time series
Define natural language processing
Know at least three algorithms for processing time series (vanilla RNN, LSTM, GRU, attention)
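
To make the first objective concrete, here is a minimal sketch of the recurrence at the core of a vanilla RNN: at each time step the new hidden state is h_t = tanh(w_x * x_t + w_h * h_(t-1) + b). It is a scalar, pure-Python illustration rather than the Keras or PyTorch layers used in the exercises, and the function name `rnn_forward` and the weight values are arbitrary assumptions chosen for the example.

```python
import math


def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0, h0=0.0):
    """Run a scalar vanilla RNN over a sequence and return every hidden state.

    The weights w_x, w_h and b are illustrative constants, not trained values.
    """
    h = h0
    states = []
    for x in inputs:
        # The same weights are reused at every time step;
        # h carries information from earlier inputs forward in time.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


# A single pulse at the first time step, followed by zeros.
states = rnn_forward([1.0, 0.0, 0.0])
print(states)
```

Even though the input is zero after the first step, the later hidden states are nonzero: the recurrent connection is what distinguishes this from a feed-forward layer. LSTM and GRU cells keep the same loop but replace the single tanh update with gated updates.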

The exercises will help you learn to:

Implement and train a recurrent neural network using Keras
Gain intuition on architecture design
Cross-validate a recurrent neural network
Use a neural network to generate new data
Implement an LSTM network for hydrological applications using PyTorch
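
Cross-validating a model on time series differs from the usual shuffled k-fold because validation data should come after the training data in time. The sketch below shows one common scheme, an expanding-window split, in pure Python. The helper name `expanding_window_splits` and its parameters are hypothetical and chosen for illustration; the exercises may use a different splitting strategy.

```python
def expanding_window_splits(n_samples, n_folds, val_size):
    """Return (train_indices, val_indices) pairs that respect time order.

    Each fold validates on a later block of samples and trains on
    everything that precedes that block.
    """
    splits = []
    for fold in range(n_folds):
        # The last fold validates on the final val_size samples;
        # earlier folds validate on earlier blocks.
        val_end = n_samples - (n_folds - 1 - fold) * val_size
        val_start = val_end - val_size
        if val_start <= 0:
            raise ValueError("not enough samples for the requested folds")
        train_idx = list(range(0, val_start))
        val_idx = list(range(val_start, val_end))
        splits.append((train_idx, val_idx))
    return splits


splits = expanding_window_splits(n_samples=10, n_folds=3, val_size=2)
for train_idx, val_idx in splits:
    print("train:", train_idx, "validate:", val_idx)
```

In every fold the largest training index is smaller than the smallest validation index, so the model is never evaluated on data that precedes what it was trained on.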

This week’s exercises:

The first adapts Géron’s Jupyter notebook exercises for Chapter 15 (License) of his book “Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition”.
The second adapts a watershed modeling exercise developed by Yikui Zhang (ETH Zurich) for Profs. Nadav Peleg & Peter Molnar, which focuses on modeling the Gsteig catchment.

If you are struggling with some of the exercises, do not hesitate to:

Use a direct Internet search or Stack Overflow
Ask your neighbor(s), the teacher, or the TA for help
Debug your program, e.g. by following this tutorial
Use assertions, e.g. by following this tutorial
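
As a small illustration of the last tip, the sketch below uses assertions to catch shape mistakes early when slicing a time series into input windows and next-step targets, a step that is easy to get wrong when preparing data for a recurrent network. The helper `make_windows` is a hypothetical example written for this note, not a function from the exercises.

```python
def make_windows(series, window):
    """Split a series into fixed-length input windows and next-step targets."""
    # Fail fast with a readable message instead of a confusing error later on.
    assert window > 0, "window must be positive"
    assert len(series) > window, "series must be longer than the window"

    n_windows = len(series) - window
    inputs = [series[i:i + window] for i in range(n_windows)]
    targets = [series[i + window] for i in range(n_windows)]

    # Sanity checks on the result: one target per window, all windows full length.
    assert len(inputs) == len(targets), "each input window needs one target"
    assert all(len(w) == window for w in inputs), "windows must have equal length"
    return inputs, targets


inputs, targets = make_windows([1, 2, 3, 4, 5], window=3)
print(inputs, targets)
```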

Way to go with the flow 😎🌧🌧🏄🌧🌧

If you’re done early, consider:

Giving feedback on how to improve this notebook (typos, hints, exercises that may be improved/removed/added, etc.) by messaging the teacher and TA(s) on Moodle
Working on your final project for this course.

Final Project

The final project’s goal is to answer a well-defined scientific question by applying one of the ML algorithms introduced in class to an environmental dataset of your choice (e.g., related to your Master’s thesis or your PhD research).

Now that you found a large environmental dataset linked to a scientific question you are passionate about, which machine learning algorithm can you use to address it? Is it a classification, a regression, or a data exploration project?
How could you format the dataset to facilitate its manipulation in Python?
If you’re still hunting for a dataset of interest, consider browsing the list of benchmark datasets maintained by Pangeo and Kaggle.