Lesson 5.4: Deep Learning for Time Series (RNNs & LSTMs)
An introduction to using recurrent neural networks for sequence data.
The Problem with 'Memoryless' Models
Standard feed-forward networks (MLPs) and tree-based models like XGBoost have no inherent memory. We have to manually create features like "return from 5 days ago" to give them a sense of the past, and the model only ever sees the slices of history we explicitly encode for it.
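As a concrete illustration, here is a minimal sketch (not the course's own code) of that manual feature engineering: lagged returns and a rolling volatility are built as explicit columns before anything is handed to a tree model. The synthetic `close` series and the particular lags are assumptions made for the example.

```python
# Hand-built lag features for a memoryless model (e.g. XGBoost).
# The `close` series below is synthetic, purely for illustration.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 500))))

returns = close.pct_change()

# Every piece of "memory" must be spelled out as its own column.
features = pd.DataFrame({
    "ret_lag_1": returns.shift(1),
    "ret_lag_5": returns.shift(5),        # "return from 5 days ago"
    "roll_vol_20": returns.rolling(20).std(),
})
target = returns.shift(-1)                # next-day return to predict

data = pd.concat([features, target.rename("target")], axis=1).dropna()
# `data` can now be fed to XGBoost or an MLP; the model itself never
# sees the sequence, only these hand-built snapshots of it.
```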
Recurrent Neural Networks (RNNs) solve this by introducing a **feedback loop**. The output from one step is fed back as an input to the next step, creating a "hidden state" that acts as a memory of the sequence seen so far.
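To make the feedback loop concrete, here is a tiny NumPy sketch (an assumed toy setup with hidden size 4 and random weights, not the lesson's code) in which the hidden state `h` is fed back at every step and so accumulates a summary of the whole sequence.

```python
# A minimal RNN-style feedback loop in plain NumPy.
import numpy as np

rng = np.random.default_rng(0)
hidden_size = 4
W_xh = rng.normal(0, 0.5, hidden_size)                   # input -> hidden weights
W_hh = rng.normal(0, 0.5, (hidden_size, hidden_size))    # hidden -> hidden (the feedback loop)
b_h = np.zeros(hidden_size)

xs = rng.normal(size=10)        # a toy sequence of 10 scalar observations
h = np.zeros(hidden_size)       # the hidden state starts out empty

for x_t in xs:
    # The previous h is fed back in alongside the new input, so h
    # carries a compressed memory of everything seen so far.
    h = np.tanh(W_xh * x_t + W_hh @ h + b_h)

print(h)  # a 4-dimensional "memory" of the whole sequence
```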
RNNs and their Flaws
A simple RNN cell updates its hidden state $h_t$ from the current input $x_t$ and the previous hidden state $h_{t-1}$, typically as $h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)$.
The Problem: Vanishing Gradients. During training (backpropagation through time), the gradient signal from distant past steps can shrink exponentially, making it impossible for simple RNNs to learn long-term dependencies.
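A rough numerical illustration of the effect (an assumed linear-RNN simplification, not from the lesson): the gradient of the hidden state at step T with respect to a state many steps earlier contains a product of recurrent Jacobians, and when their norms sit below one that product collapses exponentially with the gap.

```python
# Why gradients vanish: in a linear RNN the gradient of h_T w.r.t. h_0
# is W_hh multiplied by itself T times; tanh derivatives (<= 1) would
# only shrink it further.
import numpy as np

rng = np.random.default_rng(1)
W_hh = rng.normal(0, 0.3, (4, 4))   # typical smallish recurrent weights

grad = np.eye(4)
for t in range(1, 101):
    grad = W_hh @ grad
    if t in (1, 10, 50, 100):
        print(f"steps back = {t:3d}, gradient norm ~ {np.linalg.norm(grad):.2e}")
# The norm collapses toward zero: the error signal from 100 steps ago
# barely moves the weights, so the long-term dependency is never learned.
```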
The Solution: Gated Architectures
LSTMs and GRUs
Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks solve the vanishing gradient problem with a more sophisticated cell structure built around "gates." An LSTM cell has three (a GRU uses a simplified pair, an update gate and a reset gate):
- Forget Gate: Decides what information to throw away from the cell's memory.
- Input Gate: Decides what new information to store.
- Output Gate: Decides what part of the memory to use for the current output.
These gates allow the network to selectively remember important information over very long sequences, making them ideal for complex time series and natural language processing.
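For reference, here is a minimal Keras sketch of an LSTM forecaster that maps a window of past observations to a one-step-ahead prediction. The window length, layer sizes, and toy data are assumptions for illustration, not the lesson's official model.

```python
# A small LSTM regressor: (samples, window, features) -> next value.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

window, n_features = 30, 1            # 30 past observations of 1 series

model = keras.Sequential([
    layers.Input(shape=(window, n_features)),
    layers.LSTM(32),                  # gated cell: learns what to keep and forget
    layers.Dense(1),                  # one-step-ahead prediction
])
model.compile(optimizer="adam", loss="mse")

# Toy data just to show the expected shapes.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, window, n_features)).astype("float32")
y = rng.normal(size=(256, 1)).astype("float32")
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```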
What's Next? Advanced Concepts
We've now seen how ML can be applied to time series, from feature engineering for tree models to end-to-end learning with LSTMs.
The final lesson of this module will touch on some advanced concepts from modern quantitative finance, such as **Meta-Labeling** and the role of **Feature Importance** in strategy development.