{ "cells": [ { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "view-in-github" }, "source": [] }, { "cell_type": "markdown", "metadata": { "id": "2ODTNI8L-2jb" }, "source": [ "# Exercise 1: Comparing Different Types of Recurrent and Convolutional Neural Networks to Compose Bach Chorales" ] }, { "cell_type": "markdown", "metadata": { "id": "qnIEa8B29vSO" }, "source": [ "![joey-huang-XBh4DOGqMfc-unsplash.jpg]()" ] }, { "cell_type": "markdown", "metadata": { "id": "5xU0vb9p99PZ" }, "source": [ "Can you compose a new [Bach chorale](https://en.wikipedia.org/wiki/List_of_chorale_harmonisations_by_Johann_Sebastian_Bach) using recurrent and/or convolutional neural networks? 🎼 🎶 🎹" ] }, { "cell_type": "markdown", "metadata": { "id": "YasolsLS9j5J" }, "source": [ "**Source**: Photo by Joey Huang on Unsplash\n", "\n", "This exercise adapts Géron's Jupyter notebook exercises for [chapter 15](https://github.com/ageron/handson-ml2/blob/master/15_processing_sequences_using_rnns_and_cnns.ipynb) \\([License](https://github.com/ageron/handson-ml2/blob/master/LICENSE)) of his book [\"Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition\"](https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/).\n", " " ] }, { "cell_type": "markdown", "metadata": { "id": "OiXkBRxxMRgA" }, "source": [ "## Part I: Setup" ] }, { "cell_type": "markdown", "metadata": { "id": "_9mM0nmxMRgA" }, "source": [ "First, let's import a few common modules, ensure Matplotlib plots figures inline, and prepare a function to save the figures. We also check that Python 3.5 or later is installed, as well as Scikit-Learn ≥0.20 and TensorFlow ≥2.0."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "RDH9bERJMRgA" }, "outputs": [], "source": [ "# Python ≥3.5 is required\n", "import sys\n", "assert sys.version_info >= (3, 5)\n", "\n", "# Is this notebook running on Colab or Kaggle?\n", "IS_COLAB = \"google.colab\" in sys.modules\n", "IS_KAGGLE = \"kaggle_secrets\" in sys.modules\n", "\n", "# Scikit-Learn ≥0.20 is required\n", "import sklearn\n", "assert sklearn.__version__ >= \"0.20\"\n", "\n", "# TensorFlow ≥2.0 is required\n", "import tensorflow as tf\n", "from tensorflow import keras\n", "assert tf.__version__ >= \"2.0\"\n", "\n", "if not tf.config.list_physical_devices('GPU'):\n", "    print(\"No GPU was detected. LSTMs and CNNs can be very slow without a GPU.\")\n", "    if IS_COLAB:\n", "        print(\"Go to Runtime > Change runtime and select a GPU hardware accelerator.\")\n", "    if IS_KAGGLE:\n", "        print(\"Go to Settings > Accelerator and select GPU.\")\n", "\n", "# Common imports\n", "import numpy as np\n", "import os\n", "from pathlib import Path\n", "\n", "# to make this notebook's output stable across runs\n", "np.random.seed(42)\n", "tf.random.set_seed(42)\n", "\n", "# To plot pretty figures\n", "%matplotlib inline\n", "import matplotlib as mpl\n", "import matplotlib.pyplot as plt\n", "mpl.rc('axes', labelsize=14)\n", "mpl.rc('xtick', labelsize=12)\n", "mpl.rc('ytick', labelsize=12)\n", "\n", "# Where to save the figures\n", "PROJECT_ROOT_DIR = \".\"\n", "CHAPTER_ID = \"rnn\"\n", "IMAGES_PATH = os.path.join(PROJECT_ROOT_DIR, \"images\", CHAPTER_ID)\n", "os.makedirs(IMAGES_PATH, exist_ok=True)\n", "\n", "def save_fig(fig_id, tight_layout=True, fig_extension=\"png\", resolution=300):\n", "    path = os.path.join(IMAGES_PATH, fig_id + \".\" + fig_extension)\n", "    print(\"Saving figure\", fig_id)\n", "    if tight_layout:\n", "        plt.tight_layout()\n", "    plt.savefig(path, format=fig_extension, dpi=resolution)\n", "\n", "# Loading Tensorboard\n", "%load_ext tensorboard" ] }, { "cell_type": "markdown", 
"metadata": { "id": "uQy9j2q-M9bn" }, "source": [ "Let's import two more libraries:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "bgyr-cIrM-_d" }, "outputs": [], "source": [ "import pooch # Import the pooch library to load data from URL\n", "import pandas as pd # Import the pandas library to handle arrays" ] }, { "cell_type": "markdown", "metadata": { "id": "Qgb5ri9iMAvl" }, "source": [ "Second, let's load the data:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ebyV8I_TCRgC" }, "outputs": [], "source": [ "url = \"https://unils-my.sharepoint.com/:u:/g/personal/tom_beucler_unil_ch/EdQBrnnm_kJLhEYL4ZEaTxAB0XlVYiQrXtDRUKMxHZVIlg?download=1\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "DnXIqfMMDDJA" }, "outputs": [], "source": [ "files = pooch.retrieve(url,processor=pooch.Unzip(),known_hash='57b0aa3c1716e862b9ab9a941d516b88841c350a08ee216027ecb69f59f442a0')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "sovRibEkAdVf" }, "outputs": [], "source": [ "# Finds the directory containing the files and ...\n", "jsb_chorales_dir = Path(files[0]).parent.parent\n", "\n", "# ... 
Sort these files into training/validation/test\n", "train_files = sorted(jsb_chorales_dir.glob(\"train/chorale_*.csv\"))\n", "valid_files = sorted(jsb_chorales_dir.glob(\"valid/chorale_*.csv\"))\n", "test_files = sorted(jsb_chorales_dir.glob(\"test/chorale_*.csv\"))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "MBBVvZR4AdVg" }, "outputs": [], "source": [ "# Load the chorales from the training, validation, and test set\n", "def load_chorales(filepaths):\n", " return [pd.read_csv(filepath).values.tolist() for filepath in filepaths]\n", "\n", "train_chorales = load_chorales(train_files)\n", "valid_chorales = load_chorales(valid_files)\n", "test_chorales = load_chorales(test_files)" ] }, { "cell_type": "markdown", "metadata": { "id": "U21DQCAJNoA8" }, "source": [ "Third, let's define the functions Géron implemented to listen to these chorales. According to Géron:\n", "\n", "\"*You don't need to understand the details here, and in fact there are certainly simpler ways to do this, for example using MIDI players, but I just wanted to have a bit of fun writing a synthesizer.*\"\n", "\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "4aSInW7RNl0O" }, "outputs": [], "source": [ "from IPython.display import Audio\n", "\n", "def notes_to_frequencies(notes):\n", " # Frequency doubles when you go up one octave; there are 12 semi-tones\n", " # per octave; Note A on octave 4 is 440 Hz, and it is note number 69.\n", " return 2 ** ((np.array(notes) - 69) / 12) * 440\n", "\n", "def frequencies_to_samples(frequencies, tempo, sample_rate):\n", " note_duration = 60 / tempo # the tempo is measured in beats per minutes\n", " # To reduce click sound at every beat, we round the frequencies to try to\n", " # get the samples close to zero at the end of each note.\n", " frequencies = np.round(note_duration * frequencies) / note_duration\n", " n_samples = int(note_duration * sample_rate)\n", " time = np.linspace(0, note_duration, 
n_samples)\n", " sine_waves = np.sin(2 * np.pi * frequencies.reshape(-1, 1) * time)\n", " # Removing all notes with frequencies ≤ 9 Hz (includes note 0 = silence)\n", " sine_waves *= (frequencies > 9.).reshape(-1, 1)\n", " return sine_waves.reshape(-1)\n", "\n", "def chords_to_samples(chords, tempo, sample_rate):\n", "    freqs = notes_to_frequencies(chords)\n", "    freqs = np.r_[freqs, freqs[-1:]] # make last note a bit longer\n", "    merged = np.mean([frequencies_to_samples(melody, tempo, sample_rate)\n", "                      for melody in freqs.T], axis=0)\n", "    n_fade_out_samples = sample_rate * 60 // tempo # fade out last note\n", "    fade_out = np.linspace(1., 0., n_fade_out_samples)**2\n", "    merged[-n_fade_out_samples:] *= fade_out\n", "    return merged\n", "\n", "def play_chords(chords, tempo=160, amplitude=0.1, sample_rate=44100, filepath=None):\n", "    '''\n", "    Reads chords (sets of 4 notes) in a chorale (list of chords)\n", "    and displays an audio player for the synthesized result.\n", "\n", "    Arguments:\n", "      chords: A list of chords, for instance a full chorale\n", "    Optional arguments:\n", "      tempo: The tempo of the music, in beats per minute\n", "      amplitude: The amplitude of the sine waves to be played\n", "      sample_rate: Number of audio samples per second (higher means better audio quality)\n", "      filepath: If given, the audio is also written to this WAV file\n", "    '''\n", "    samples = amplitude * chords_to_samples(chords, tempo, sample_rate)\n", "    if filepath:\n", "        from scipy.io import wavfile\n", "        samples = (2**15 * samples).astype(np.int16)\n", "        wavfile.write(filepath, sample_rate, samples)\n", "        return display(Audio(filepath))\n", "    else:\n", "        return display(Audio(samples, rate=sample_rate))" ] }, { "cell_type": "markdown", "metadata": { "id": "eLMqDcHfOuIN" }, "source": [ "## Part II: Preliminary Data Analysis and Preprocessing" ] }, { "cell_type": "markdown", "metadata": { "id": "P-zJjNinOxch" }, "source": [ "The dataset consists of 382 chorales composed by Johann Sebastian Bach. 
Each chorale is 100 to 640 time steps long, and each time step contains 4 integers, where each integer corresponds to a note's index on a piano (except for the value 0, which means that no note is played). \n", "\n", "Our goal is to train a model (recurrent, convolutional, or both) that can predict the next time step (four notes), given a sequence of time steps from a chorale. Once trained, we can use this model to generate Bach-like music, one time step at a time: we give the model the start of a chorale, ask it to predict the next time step, append that prediction to the input sequence, ask for the following time step, and so on. " ] }, { "cell_type": "markdown", "metadata": { "id": "krWYHPmx4WIM" }, "source": [ "### **Q1) Check that notes range from `min_note = 36` (which is C2, i.e. C/Do on octave 2, given that note 69 is A4) to `max_note = 81` (which is A5, i.e. A/La on octave 5), including `0` for silences. Calculate the total number of notes `n_notes`.**" ] }, { "cell_type": "markdown", "metadata": { "id": "Gmg9q2tv6SKM" }, "source": [ "First, explore the dataset:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "TgnGCmyt7Vv6" }, "outputs": [], "source": [ "# Explore the dataset: How is it structured? What does a sample look like? etc."
] }, { "cell_type": "markdown", "metadata": { "id": "mt_1hlL87drp" }, "source": [ "Second, let's group all of the chorales' notes in a set called `notes`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "2kZ-AMcpAdVg" }, "outputs": [], "source": [ "notes = set() # Initialize the notes with an empty set\n", "for chorales in (_____, _____, _____): # Loop through the training/validation/test sets\n", "    for chorale in _____: # Loop through all chorales in the current set\n", "        for chord in _____: # Loop through chords within a chorale\n", "            notes |= set(chord) # Add notes that are in chord but not yet in notes" ] }, { "cell_type": "markdown", "metadata": { "id": "N9_5Nu3I6xJi" }, "source": [ "Third, calculate `min_note` and `max_note`.\n", "\n", "Hint: Be careful to exclude 0 when calculating `min_note`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "pCVmvNAz6WVF" }, "outputs": [], "source": [ "min_note = __________\n", "max_note = __________" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Ly1Jh-ku5igN" }, "outputs": [], "source": [ "# This raises an AssertionError if your result is incorrect\n", "assert min_note == 36\n", "assert max_note == 81" ] }, { "cell_type": "markdown", "metadata": { "id": "vltWq4fdk69F" }, "source": [ "Finally, calculate the total number of notes `n_notes`."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "6g64vvrelA-4" }, "outputs": [], "source": [ "# Write your code below" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "TeLbN19zlDEc" }, "outputs": [], "source": [ "# This raises an AssertionError if your result is incorrect\n", "assert n_notes == 47" ] }, { "cell_type": "markdown", "metadata": { "id": "bAvF17_G_Rhb" }, "source": [ "### **Q2) What is the training/validation/test split?**\n", "\n", "Hint: You may use the built-in [`len`](https://docs.python.org/3/library/functions.html#len) function to get the length of a list." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "YhH1T8pe_jjZ" }, "outputs": [], "source": [ "# Calculate the number of chorales in each of the training/validation/test sets" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Y70tAM1r_FOq" }, "outputs": [], "source": [ "# What is the training/validation/test split (in %)?" ] }, { "cell_type": "markdown", "metadata": { "id": "lakqnDVyAdVh" }, "source": [ "### **Q3) Listen to a few chorales 🎵**\n", "\n", "Hint 1: Use the `play_chords` function defined above. Type `play_chords?` to display its documentation.\n", "\n", "Hint 2: You can start by selecting chorales from the training set `train_chorales`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "YuujH-JT8BO7" }, "outputs": [], "source": [ "# Explore `train_chorales` and `play_chords` to get some intuition" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "YODP8PyLAdVi" }, "outputs": [], "source": [ "# Use `play_chords` to produce chorales you can listen to" ] }, { "cell_type": "markdown", "metadata": { "id": "uwUt5MVWAdVi" }, "source": [ "Divine! 
🎶" ] }, { "cell_type": "markdown", "metadata": { "id": "I0c8YNDSAdVi" }, "source": [ "To generate new chorales, we want to train a model that can predict the next chord given all the previous chords. If we naively try to predict the next chord in one shot, predicting all 4 notes at once, we run the risk of getting notes that don't go very well together (take Géron's word for it: he tried). It's much better and simpler to predict one note at a time. \n", "\n", "So we will need to preprocess every chorale, turning each chord into an arpeggio (i.e., a sequence of notes rather than notes played simultaneously). Each chorale will then be a long sequence of notes (rather than chords), and we can train a model to predict the next note given all the previous notes. We will use a sequence-to-sequence approach, where we feed a window to the neural net, and it tries to predict that same window shifted one time step into the future.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "jJ1riBCLAdVi" }, "outputs": [], "source": [ "def create_target(batch):\n", "    X = batch[:, :-1]\n", "    Y = batch[:, 1:] # predict next note in each arpeggio, at each step\n", "    return X, Y" ] }, { "cell_type": "markdown", "metadata": { "id": "uuqfxD3lCrq_" }, "source": [ "We will also shift the values so that they range from 0 to 46, where 0 represents silence, and values 1 to 46 represent notes 36 (C2) to 81 (A5)."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "6Wx4mGrfCv85" }, "outputs": [], "source": [ "def preprocess(window):\n", "    window = tf.where(window == 0, window, window - min_note + 1) # shift values\n", "    return tf.reshape(window, [-1]) # convert to arpeggio" ] }, { "cell_type": "markdown", "metadata": { "id": "NH43cA-yCVIL" }, "source": [ "And we will train the model on windows of 128 notes (i.e., 32 chords).\n", "\n", "Since the dataset fits in memory, we could preprocess the chorales in RAM using any Python code we like, but we will demonstrate here how to do all the preprocessing using `tf.data` (see Géron's Chapter 16 for more details about creating windows using `tf.data`)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "IIClQTBNCacY" }, "outputs": [], "source": [ "def bach_dataset(chorales, batch_size=32, shuffle_buffer_size=None,\n", "                 window_size=32, window_shift=16, cache=True):\n", "    def batch_window(window):\n", "        return window.batch(window_size + 1)\n", "\n", "    def to_windows(chorale):\n", "        dataset = tf.data.Dataset.from_tensor_slices(chorale)\n", "        dataset = dataset.window(window_size + 1, window_shift, drop_remainder=True)\n", "        return dataset.flat_map(batch_window)\n", "\n", "    chorales = tf.ragged.constant(chorales, ragged_rank=1)\n", "    dataset = tf.data.Dataset.from_tensor_slices(chorales)\n", "    dataset = dataset.flat_map(to_windows).map(preprocess)\n", "    if cache:\n", "        dataset = dataset.cache()\n", "    if shuffle_buffer_size:\n", "        dataset = dataset.shuffle(shuffle_buffer_size)\n", "    dataset = dataset.batch(batch_size)\n", "    dataset = dataset.map(create_target)\n", "    return dataset.prefetch(1)" ] }, { "cell_type": "markdown", "metadata": { "id": "7KJd4y3moPW3" }, "source": [ "Note that this `bach_dataset` function is designed to output the sequence using the shape `(batch_size,number_of_chords)` because the sequence is then fed into an [`Embedding`](https://keras.io/api/layers/core_layers/embedding/) layer. 
It would need to output the sequence using the shape `(batch_size,number_of_chords,1)` to be directly fed into an [LSTM layer](https://keras.io/api/layers/recurrent_layers/lstm/)." ] }, { "cell_type": "markdown", "metadata": { "id": "coB_lFeXC9ak" }, "source": [ "### **Q4) Use the function `bach_dataset` above to create the training, validation, and test sets. Use a `shuffle_buffer_size` of `1000` for the training set.**" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "cBt_-SJBAdVk" }, "outputs": [], "source": [ "train_set = _________(_________, __________=_____)\n", "valid_set = _________(_________)\n", "test_set = _________(_________)" ] }, { "cell_type": "markdown", "metadata": { "id": "pCi9SOD3Dvzi" }, "source": [ "## Part III: Training a small [WaveNet](https://www.deepmind.com/blog/wavenet-a-generative-model-for-raw-audio) model and generating your first chorale" ] }, { "cell_type": "markdown", "metadata": { "id": "-TiGmzoWtFxg" }, "source": [ "### **Q5) Implement a small [WaveNet](https://www.deepmind.com/blog/wavenet-a-generative-model-for-raw-audio) model to process the sequence of chords**" ] }, { "cell_type": "markdown", "metadata": { "id": "A3QcHiG_t7QP" }, "source": [ "We could feed the note values directly to the model, as floats, but this would probably not give good results. Indeed, the relationships between notes are not that simple: for example, if you replace a C3 with a C4, the melody will still sound fine, even though these notes are 12 semi-tones apart (i.e., one octave). Conversely, if you replace a C3 with a C\\#3, it's very likely that the chord will sound horrible, despite these notes being just next to each other. So we will use an `Embedding` layer to convert each note to a small vector representation (see Géron Chapter 16 for more details on embeddings). We will use 5-dimensional embeddings, so the output of this first layer will have a shape of `[batch_size, window_size, n_embedding_dims=5]`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ZHGQqo7CtrXn" }, "outputs": [], "source": [ "# Choose the number of embedding dimensions here. Géron recommends 5.\n", "n_embedding_dims = ______" ] }, { "cell_type": "markdown", "metadata": { "id": "NqrAKXlGuisg" }, "source": [ "Now implement a small [WaveNet](https://www.deepmind.com/blog/wavenet-a-generative-model-for-raw-audio) (we recommend starting with no more than 5 layers).\n", "\n", "Hint 1: You need to start with an embedding layer to convert integer notes into a vector of length `n_embedding_dims`. For that purpose, the syntax is: `keras.layers.Embedding(input_dim=input_dim, output_dim=output_dim, input_shape=[None])`. `input_dim` is the number of possible integer categories of the notes we would like to convert, while `output_dim` is the number of embedding dimensions.\n", "\n", "Hint 2: A simplified [WaveNet](https://www.deepmind.com/blog/wavenet-a-generative-model-for-raw-audio) is a stack of [Conv1D](https://keras.io/api/layers/convolution_layers/convolution1d/) layers with increasing dilation rates. For instance, below is a WaveNet with 3 layers and a constant number of filters equal to 128 (increase the number of filters for more representational power). Note how the dilation rate doubles from one layer to the next. \n", "```\n", "keras.layers.Conv1D(filters=128, kernel_size=2, padding=\"causal\", activation=\"relu\", dilation_rate=2),\n", "keras.layers.Conv1D(filters=128, kernel_size=2, padding=\"causal\", activation=\"relu\", dilation_rate=4),\n", "keras.layers.Conv1D(filters=128, kernel_size=2, padding=\"causal\", activation=\"relu\", dilation_rate=8),\n", "```\n", "\n", "Hint 3: For the final layer, you need to output the probability of the note belonging to each one of the `n_notes`. 
Because the probabilities need to sum to 1, we have to use a [`softmax`](https://keras.io/api/layers/activations/) activation function, passed as the `activation` argument of the [`keras.layers.Dense`](https://keras.io/api/layers/core_layers/dense/) layer." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "qP87koqCuJt_" }, "outputs": [], "source": [ "# We use the sequential API below but you may use the functional API instead\n", "model = keras.models.Sequential([\n", "    keras.layers.Embedding(input_dim=______, output_dim=______,\n", "                           input_shape=[None]),\n", "    __________________________________________,\n", "    __________________________________________,\n", "    keras.layers.Dense(________, activation=__________)\n", "])" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "qAQcAQdnFbR9" }, "outputs": [], "source": [ "# Check that your model looks right\n", "model.summary()" ] }, { "cell_type": "markdown", "metadata": { "id": "5Of7la1hypWb" }, "source": [ "### **Q6) Compile your model using [\"sparse_categorical_crossentropy\"](https://www.tensorflow.org/api_docs/python/tf/keras/losses/SparseCategoricalCrossentropy) as the loss, since your targets are integer class indices rather than one-hot vectors, and \"accuracy\" as an additional metric to monitor during training.**\n", "\n", "Hint: Potential Keras optimizers are listed [at this link](https://keras.io/api/optimizers/)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "XBb0UeBL0uR-" }, "outputs": [], "source": [ "# Choose your optimizer\n", "optimizer = keras.optimizers._______________________" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "DD0-HeS200ln" }, "outputs": [], "source": [ "# Compile your model\n", "model.compile(loss=____________, optimizer=__________,\n", "              metrics=[_______ ])" ] }, { "cell_type": "markdown", "metadata": { "id": "LnS3Py23HFWx" }, "source": [ "### **Q7) Train your model on the training set and plot the learning curves. 
Is your model overfitting?**\n", "\n", "Hint 1: You may use 20 epochs for training and a patience of 20 epochs for your early stopping callback.\n", "\n", "Hint 2: To plot your learning curves with [Tensorboard](https://www.tensorflow.org/tensorboard), fill out the information in the cell below" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "leX_NPGZIEEx" }, "outputs": [], "source": [ "#Change this number and rerun this cell whenever you want to change runs\n", "run_index = ___ # it should be an integer, e.g. 1\n", "\n", "run_logdir = os.path.join(os.curdir, \"my_bach_logs\", \"run_{:03d}\".format(run_index))\n", "\n", "print(run_logdir)" ] }, { "cell_type": "markdown", "metadata": { "id": "a11RV36VIugF" }, "source": [ "Define your [callbacks](https://keras.io/api/callbacks/) below. For the checkpoint `checkpoint_cb`, we recommend monitoring the validation loss to avoid overfitting. Look for \"monitor\" in the `model_checkpoint`'s documentation [at this link](https://keras.io/api/callbacks/model_checkpoint/)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "RwgGwyOiILas" }, "outputs": [], "source": [ "early_stopping_cb = tf.keras.callbacks.EarlyStopping(patience=_____)\n", "checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(\"my_bach_model.h5\", \n", "                                                   save_best_only=True,\n", "                                                   monitor=_______)\n", "tensorboard_cb = tf.keras.callbacks.TensorBoard(run_logdir)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "HJojibqtISEU" }, "outputs": [], "source": [ "history = model.fit(____, # training data\n", "                    epochs=___, # number of epochs\n", "                    validation_data=____, # validation data\n", "                    callbacks=[checkpoint_cb, early_stopping_cb, tensorboard_cb])" ] }, { "cell_type": "markdown", "metadata": { "id": "soiApJl9JW4r" }, "source": [ "Visualize your learning curves using [Tensorboard](https://www.tensorflow.org/tensorboard):" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "dyVY8FYZJAxs" }, "outputs": [], "source": [ "%tensorboard --logdir=./my_bach_logs --port=8888 # Pick any free 4-digit port" ] }, { "cell_type": "markdown", "metadata": { "id": "9ZJYztSHKX-c" }, "source": [ "### **Q8) To double check whether your model is overfitting, evaluate it on the test set using the built-in [evaluate](https://www.tensorflow.org/api_docs/python/tf/keras/Model#evaluate) method.**" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "85OUuqmsKmp8" }, "outputs": [], "source": [ "model.______(_________) # Evaluate the model on the test set" ] }, { "cell_type": "markdown", "metadata": { "id": "J025prJpNCDh" }, "source": [ "Ideally, you should reach an accuracy of at least 40%. If you don't, you may:\n", "\n", "* Increase the number of trainable parameters in your model, e.g. by increasing the number of filters, \n", "* Train your model for more epochs, or \n", "* Adjust your checkpoint, e.g. to monitor your validation loss ('val_loss') to avoid overfitting."
] }, { "cell_type": "markdown", "metadata": { "id": "AMUltDfuLR8Q" }, "source": [ "Now let's write a function that will generate a new chorale. We will give it a few seed chords, it will convert them to arpeggios (the format expected by the model), and use the model to predict the next note, then the next, and so on. In the end, it will group the notes 4 by 4 to create chords again, and return the resulting chorale." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "P0rr0ILoLxUz" }, "outputs": [], "source": [ "def generate_chorale(model, seed_chords, length):\n", "    arpegio = preprocess(tf.constant(seed_chords, dtype=tf.int64))\n", "    arpegio = tf.reshape(arpegio, [1, -1])\n", "    for chord in range(length):\n", "        for note in range(4):\n", "            # model.predict_classes() is unavailable in recent TensorFlow versions, hence np.argmax\n", "            next_note = np.argmax(model.predict(arpegio), axis=-1)[:1, -1:]\n", "            arpegio = tf.concat([arpegio, next_note], axis=1)\n", "    arpegio = tf.where(arpegio == 0, arpegio, arpegio + min_note - 1)\n", "    return tf.reshape(arpegio, shape=[-1, 4])" ] }, { "cell_type": "markdown", "metadata": { "id": "0gZ7GKs5LUfo" }, "source": [ "### **Q9) Using seed chords from the test set, generate your first chorale! 🎼**" ] }, { "cell_type": "markdown", "metadata": { "id": "Eat6qKRuODZ3" }, "source": [ "Extract some `seed_chords` from the test set `test_chorales`.\n", "\n", "Hint: You can simply use the first 5-10 chords of one of the test chorales." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "fPJIaPVsOY9K" }, "outputs": [], "source": [ "seed_chords = ________[_____][____:______]" ] }, { "cell_type": "markdown", "metadata": { "id": "FEuAEnSbOkL1" }, "source": [ "and play them 😃\n", "\n", "Hint: You may use the function `play_chords` [defined above](#play_chords)."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "mx5X2dQwO3kZ" }, "outputs": [], "source": [ "# Play the seed chords" ] }, { "cell_type": "markdown", "metadata": { "id": "kRXQa-IsRIv7" }, "source": [ "Now we are ready to generate our first chorale! Let's ask the function to generate `n_generated` more chords:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "hGxg44CXRXLz" }, "outputs": [], "source": [ "n_generated = ____________ # Choose a number of chords to generate (e.g., 20-100)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "7JrWfkXcRlZZ" }, "outputs": [], "source": [ "n_generated = 50" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "6Hbqj0pbRh1Q" }, "outputs": [], "source": [ "new_chorale = generate_chorale(model, seed_chords, n_generated)\n", "play_chords(new_chorale)" ] }, { "cell_type": "markdown", "metadata": { "id": "cmsT-25PODs_" }, "source": [ "From Géron:\n", "\n", "\"*This approach has one major flaw: it is often too conservative. Indeed, the model will not take any risk, it will always choose the note with the highest score, and since repeating the previous note generally sounds good enough, it's the least risky option, so the algorithm will tend to make notes last longer and longer. Pretty boring. Plus, if you run the model multiple times, it will always generate the same melody.*\n", "\n", "*So let's spice things up a bit! Instead of always picking the note with the highest score, we will pick the next note randomly, according to the predicted probabilities. For example, if the model predicts a C3 with 75% probability, and a G3 with a 25% probability, then we will pick one of these two notes randomly, with these probabilities. We will also add a `temperature` parameter that will control how \"hot\" (i.e., daring) we want the system to feel. 
A high temperature will bring the predicted probabilities closer together, reducing the probability of the likely notes and increasing the probability of the unlikely ones.*\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "DgckBx4mODs_" }, "outputs": [], "source": [ "def generate_chorale_v2(model, seed_chords, length, temperature=1):\n", "    arpegio = preprocess(tf.constant(seed_chords, dtype=tf.int64))\n", "    arpegio = tf.reshape(arpegio, [1, -1])\n", "    for chord in range(length):\n", "        for note in range(4):\n", "            next_note_probas = model.predict(arpegio)[0, -1:]\n", "            rescaled_logits = tf.math.log(next_note_probas) / temperature\n", "            next_note = tf.random.categorical(rescaled_logits, num_samples=1)\n", "            arpegio = tf.concat([arpegio, next_note], axis=1)\n", "    arpegio = tf.where(arpegio == 0, arpegio, arpegio + min_note - 1)\n", "    return tf.reshape(arpegio, shape=[-1, 4])" ] }, { "cell_type": "markdown", "metadata": { "id": "fA35AL5yODs_" }, "source": [ "### **Q10) Using the function `generate_chorale_v2`, generate 3 chorales: one cold (`temperature<1`), one medium (`temperature=1`), and one hot (`temperature>1`).**\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "MG9RfJPsODs_", "scrolled": true }, "outputs": [], "source": [ "new_chorale_v2_cold = generate_chorale_v2(____,____,____,____)\n", "play_chords(new_chorale_v2_cold, filepath=\"bach_cold.wav\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "SSIS_EVUODs_" }, "outputs": [], "source": [ "new_chorale_v2_medium = generate_chorale_v2(____,____,____,____)\n", "play_chords(new_chorale_v2_medium, filepath=\"bach_medium.wav\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "-muViLabODs_" }, "outputs": [], "source": [ "new_chorale_v2_hot = generate_chorale_v2(____,____,____,____)\n", "play_chords(new_chorale_v2_hot, filepath=\"bach_hot.wav\")" ] }, { "cell_type": "markdown", "metadata": 
{ "id": "7oBoc8PnSiMG" }, "source": [ "## Part IV: Generating a Masterpiece Using Recurrent Neural Networks" ] }, { "cell_type": "markdown", "metadata": { "id": "oUP174DWSw6U" }, "source": [ "### **Q11) Improve your model's accuracy by adding batch normalization and at least one recurrent neural network layer at the end of your model.**\n", "\n", "Hint 1: Consider adding a [Long Short-Term Memory](https://keras.io/api/layers/recurrent_layers/lstm/), a [Gated Recurrent Unit](https://keras.io/api/layers/recurrent_layers/gru/), or an [Attention](https://keras.io/api/layers/attention_layers/attention/) layer.\n", "\n", "Hint 2: Batch normalization layers are documented [at this link](https://keras.io/api/layers/normalization_layers/batch_normalization/), and you may insert them between any layers to accelerate convergence during training.\n", "\n", "Hint 3: You may reuse some of the code you wrote for Q5-Q8.\n", "\n", "Hint 4: If you would like to be more systematic about your model architecture choices, you can optimize hyperparameters of your model, such as the number of filters, the layer parameters, the learning rate, the optimizer, etc. using hyperparameter optimization libraries such as the [KerasTuner](https://keras.io/keras_tuner/)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "hibDwZdBUBCz" }, "outputs": [], "source": [ "# Redesign your model's architecture" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "dAJAHSahUExB" }, "outputs": [], "source": [ "# Compile your model" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "KXa2-wbpUI9J" }, "outputs": [], "source": [ "# Define callbacks (they can really improve the accuracy if well-chosen!)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "gvnr9IPBURE2" }, "outputs": [], "source": [ "# Train your model" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "70C2H5-TUfF4" }, "outputs": [], "source": [ "# Evaluate its accuracy on the test set" ] }, { "cell_type": "markdown", "metadata": { "id": "dWirW51_UhQv" }, "source": [ "You should be able to reach accuracy values larger than 60% at this stage 😲" ] }, { "cell_type": "markdown", "metadata": { "id": "9Lq2f9-UUrrk" }, "source": [ "### **Q12) Compose a masterpiece.**\n", "\n", "Hint 1: You may reuse some of the code you wrote for Q9 and Q10.\n", "\n", "Hint 2: Experiment with other seeds, lengths and temperatures to compose your masterpiece.\n", "\n", "From Géron:\n", "**Please share your most beautiful generated chorale with me on Twitter @aureliengeron, I would really appreciate it! 
:))**" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "KR21u4LRV4gg" }, "outputs": [], "source": [ "# Choose and play the seed chords" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ZL_ETZeIV7k7" }, "outputs": [], "source": [ "# Generate a new chorale that's even more beautiful than the previous one" ] }, { "cell_type": "markdown", "metadata": { "id": "9r0feOv_ODs_" }, "source": [ "You can try a fun social experiment: send your friends a few of your favorite generated chorales, plus the real chorale, and ask them to guess which one is the real one!" ] }, { "cell_type": "markdown", "metadata": { "id": "R4gLDpKSODs_" }, "source": [ "Check out [Google's Coconet model](https://homl.info/coconet), which was used for a nice [Google doodle about Bach](https://www.google.com/doodles/celebrating-johann-sebastian-bach)." ] } ], "metadata": { "accelerator": "GPU", "colab": { "include_colab_link": true, "name": "S6_1_Composing_Music_With_RNNs_CNNs.ipynb", "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" }, "nav_menu": {}, "toc": { "navigate_menu": true, "number_sections": true, "sideBar": true, "threshold": 6, "toc_cell": false, "toc_section_display": "block", "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 1 }