Music generation using Deep Learning
“If I had my life to live over again, I would have made a rule to read some poetry and listen to some music at least once every week.”― Charles Darwin
Life exists on the sharp-edged wire of the guitar. Once you jump, its echoes can be heard with immense intangible pleasure. Let's explore this intangible pleasure…
Music is nothing but a sequence of notes (events). Here the input to the model is a sequence of notes.
Some examples of music generated using RNNs are shown below.
Music Representation:
- Sheet music
- ABC notation: it represents a tune as a sequence of characters, which makes it very simple to train a neural network on (a short example follows this list). https://en.wikipedia.org/wiki/ABC_notation
- MIDI: https://towardsdatascience.com/how-to-generate-music-using-a-lstm-neural-network-in-keras-68786834d4c5
- MP3: stores only the audio signal, not symbolic events.
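To make the ABC representation concrete, here is a small made-up tune in ABC notation (the tune itself is invented for illustration and is not from the Jigs dataset); this raw text is exactly what the char-RNN sees, one character at a time.

```python
# A minimal, hypothetical example of ABC notation held as the raw training
# text for the char-RNN. Header fields (X, T, M, L, K) are followed by the
# note sequence itself, all of it just plain characters.
abc_tune = """X:1
T:Example Jig
M:6/8
L:1/8
K:G
GFG BAB | gfg gab | GFG BAB | d2A AFD |"""
print(abc_tune)
```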
Char-RNN
Here I'm using a char-RNN structure (many-to-many RNN), where one output corresponds to each input (input C(i) -> output C(i+1)) at each time step (cell). It can have multiple hidden layers (multiple stacked LSTM layers).
Visualizing the predictions and the “neuron” firings in the RNN
Under every character, we visualize (in red) the top 5 guesses that the model assigns for the next character. The guesses are colored by their probability (so dark red = judged as very likely, white = not very likely). The input character sequence (blue/green) is colored based on the firing of a randomly chosen neuron in the hidden representation of the RNN. Think about it as green = very excited and blue = not very excited.
Process:
- Obtaining data
- Preprocessing (generating batches for mini-batch SGD) to feed into the char-RNN
Please follow the link below for more datasets. Here I used only the Jigs dataset (340 tunes) in ABC format.
The dataset is fed into the RNN for training with a batch size of 16.
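As a rough sketch of this preprocessing step, the snippet below maps every character to an integer index and cuts the corpus into 16 parallel streams of input/target pairs shifted by one character; the sequence length of 64 and the function and variable names are my own assumptions, not values from the post.

```python
import numpy as np

BATCH_SIZE = 16   # as used in the post
SEQ_LENGTH = 64   # assumed sequence length per training chunk

def make_batches(text):
    """Yield (x, y) index arrays where y is x shifted one character ahead."""
    chars = sorted(set(text))                       # unique characters (vocabulary)
    char_to_idx = {c: i for i, c in enumerate(chars)}
    data = np.array([char_to_idx[c] for c in text], dtype=np.int64)

    # Split the corpus into BATCH_SIZE parallel rows so stateful training
    # can carry each row's LSTM state from one chunk to the next.
    chars_per_row = len(data) // BATCH_SIZE
    data = data[:chars_per_row * BATCH_SIZE].reshape(BATCH_SIZE, chars_per_row)

    for start in range(0, chars_per_row - 1, SEQ_LENGTH):
        x = data[:, start:start + SEQ_LENGTH]           # inputs  C(i)
        y = data[:, start + 1:start + SEQ_LENGTH + 1]   # targets C(i+1)
        if x.shape == y.shape:                          # drop the ragged last chunk
            yield x, y
```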
Here two LSTM cells are shown for each input. The input X0 goes into all LSTM cells in the first layer. You get an output (h0), and the state information is passed on to the next time step. All outputs at time step one (LSTM_t1_1, LSTM_t1_2) are connected to a dense layer whose output is h0. The dense layer applied at each time step is called a time-distributed dense layer. The same happens at every following time step.
1. return_sequences=True in Keras is used when you want to generate an output at every time step of the input sequence. For every input, we need a sequence of outputs: the same input passes through every cell, and each cell in a layer produces an output. At every time step i we get an output vector (of size 256 in this problem).
2. Time-distributed dense layer (please follow the discussion above for a better understanding). At every time step it takes all the LSTM outputs and applies a dense layer of size 86, where 86 is the number of unique characters in the whole vocabulary.
3. stateful=True: the last state for each sample at index i in a batch is used as the initial state for the sample at index i in the following batch. It is used when you want to connect one batch to the next, so that each batch continues from the state left by the previous one. With stateful=False, each batch starts the first time step with a zero state. (A minimal Keras sketch combining these three settings is given below.)
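A minimal Keras sketch of the model described in these three points, assuming 86 vocabulary characters, 256 LSTM units per layer, a batch size of 16, three stacked LSTM layers, and an embedding layer in front (the embedding size and the number of layers are assumptions, not taken from the post):

```python
from keras.models import Sequential
from keras.layers import Embedding, LSTM, TimeDistributed, Dense, Activation

VOCAB_SIZE = 86   # unique characters in the vocabulary
BATCH_SIZE = 16

model = Sequential()
# Map each character index to a dense vector; batch_input_shape fixes the
# batch size, which is required when stateful=True.
model.add(Embedding(VOCAB_SIZE, 512, batch_input_shape=(BATCH_SIZE, None)))
for _ in range(3):
    # return_sequences=True -> an output at every time step;
    # stateful=True -> the final state of batch k seeds batch k+1.
    model.add(LSTM(256, return_sequences=True, stateful=True))
# TimeDistributed applies the same Dense(86) classifier at every time step.
model.add(TimeDistributed(Dense(VOCAB_SIZE)))
model.add(Activation('softmax'))
# Targets are the integer indices of the next character at each time step.
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
model.summary()
```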
Model Architecture and Training:
It is a multi-class classification problem: for a given input character, the model outputs one of the unique characters in the vocabulary.
For every input character, the model produces a probability distribution over the 86 characters; based on these probabilities it decides the final output character.
Next, we feed C(i+1) into the model and it generates the character C(i+2). This continues until all batches of characters from the whole dataset have been fed.
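A rough sketch of this generation loop (the function and argument names are hypothetical; it assumes the trained weights have been loaded into a copy of the model built with batch size 1, so a single character can be fed at each step):

```python
import numpy as np

def generate(model, idx_to_char, seed_idx, length=500):
    """Feed one seed character, sample the next, feed it back, repeat."""
    model.reset_states()                      # start from a clean LSTM state
    generated = [seed_idx]
    current = seed_idx
    for _ in range(length):
        x = np.array([[current]])             # shape (batch=1, timesteps=1)
        probs = model.predict(x)[0, -1].astype('float64')
        probs /= probs.sum()                  # renormalise against float error
        current = np.random.choice(len(probs), p=probs)   # sample C(i+1)
        generated.append(current)
    return ''.join(idx_to_char[i] for i in generated)
```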
Output:
Open the following link and paste your generated music into the given space in order to play it.
For Tabla music:
If you are able to encode each sequence as characters, then you can use the above char-RNN model, as illustrated by the sketch below. Please read the following blog for a detailed understanding.
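For instance, a hypothetical mapping from tabla bols to single characters might look like the following; the bols listed and the character codes are invented purely for illustration and are not from the original blog.

```python
# Hypothetical encoding of tabla bols as single characters so that the same
# char-RNN pipeline can be reused without modification.
bol_to_char = {'dha': 'a', 'dhin': 'b', 'na': 'c', 'tin': 'd', 'ta': 'e', 'ge': 'f'}
char_to_bol = {v: k for k, v in bol_to_char.items()}

def encode(bols):
    """Encode a sequence of bols as a character string for the char-RNN."""
    return ''.join(bol_to_char[b] for b in bols)

def decode(text):
    """Decode the model's character output back into bols."""
    return [char_to_bol[c] for c in text]

print(encode(['dha', 'dhin', 'dhin', 'dha']))  # -> 'abba'
```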
MIDI music generation:
Here we use the Music21 Python library to read a MIDI file and convert it into a sequence of events; a short sketch is shown below. Please read the following blog for a detailed understanding.
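A minimal sketch of that step with Music21 (the file name 'example.mid' is a placeholder):

```python
from music21 import converter, note, chord

# Parse a MIDI file and flatten it into a time-ordered sequence of
# note/chord events represented as strings.
midi = converter.parse('example.mid')
events = []
for element in midi.flat.notes:
    if isinstance(element, note.Note):
        events.append(str(element.pitch))                              # e.g. 'E4'
    elif isinstance(element, chord.Chord):
        events.append('.'.join(str(n) for n in element.normalOrder))   # e.g. '4.7.11'
print(events[:20])
```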
Models other than char-RNN (a very recent blog):
It is a survey blog covering neural-network-based models other than char-RNN. Please follow it if you want to explore further.
Google project on generating music:
A project by Google researchers, based on TensorFlow and LSTMs.
Reference:
Google Images (for the images); the remaining links are given in their respective sections.
========Thanks(Love to hear from your side)=========
Find the detailed code on my GitHub account…