From 1133ec5394084c313006a2d8fa334563c51aa488 Mon Sep 17 00:00:00 2001
From: Jon Wheeler
Date: Thu, 31 Aug 2023 15:25:00 -0600
Subject: [PATCH] some explanatory content added to ep 5

---
 episodes/05-multi-step-forecasts.md | 60 ++++++++++++++++++++---------
 1 file changed, 41 insertions(+), 19 deletions(-)

diff --git a/episodes/05-multi-step-forecasts.md b/episodes/05-multi-step-forecasts.md
index 8b74cce..9ec1e17 100644
--- a/episodes/05-multi-step-forecasts.md
+++ b/episodes/05-multi-step-forecasts.md
@@ -437,6 +437,11 @@ Linear performance against test data: [0.339562326669693, 0.28846967220306396]
 
 ## Dense neural network
 
+Similar to the definition of the dense model for single step forecasts, the
+definition of the multi-step dense model adds a ```Dense``` layer to the
+preceding linear model pipeline. The activation function is again ```relu```,
+and the *units* argument is increased to 512. The *units* argument specifies
+the dimensionality of the output that is passed to the next layer.
 
 ```python
 multi_dense_model = tf.keras.Sequential([
@@ -454,6 +459,9 @@ multi_performance['Dense'] = multi_dense_model.evaluate(multi_window.test, verbo
 
 print("Dense performance against validation data:", multi_val_performance["Dense"])
 print("Dense performance against test data:", multi_performance["Dense"])
+
+# plot example windows with the dense model's predictions
+multi_window.plot(multi_dense_model)
 ```
 
 ```output
@@ -464,16 +472,16 @@ Dense performance against validation data: [0.2058122605085373, 0.18464779853820
 Dense performance against test data: [0.22725100815296173, 0.19131870567798615]
 ```
 
-And plot:
-
-```python
-multi_window.plot(multi_dense_model)
-```
-
 ![Plot of a multi step dense neural network.](./fig/ep5_fig4.png)
 
 ## Convolution neural network
 
+For the multi-step convolution neural network model, the first ```Dense```
+layer of the previous multi-step dense model is replaced with a one-dimensional
+convolution layer. The arguments to this layer include the number of filters,
+in this case 256, and the kernel size, which specifies the width of the
+convolution window.
+
 ```python
 CONV_WIDTH = 3
 multi_conv_model = tf.keras.Sequential([
@@ -491,6 +499,9 @@ multi_performance['Conv'] = multi_conv_model.evaluate(multi_window.test, verbose
 
 print("CNN performance against validation data:", multi_val_performance["Conv"])
 print("CNN performance against test data:", multi_performance["Conv"])
+
+# plot example windows with the convolution model's predictions
+multi_window.plot(multi_conv_model)
 ```
 
 ```output
@@ -501,17 +512,19 @@ CNN performance against validation data: [0.20042525231838226, 0.179121389985084
 CNN performance against test data: [0.2245914489030838, 0.18907274305820465]
 ```
 
-And plot:
-
-```python
-multi_window.plot(multi_conv_model)
-```
-
 ![Plot of a multi step convolution neural network.](./fig/ep5_fig5.png)
 
 ## Recurrent neural network (LSTM)
 
+Recall that the recurrent neural network model maintains an internal state
+based on consecutive inputs. For this reason, the ```Lambda``` layer used so
+far to subset the input timesteps is not necessary here. Instead, a single Long
+Short-Term Memory (```LSTM```) layer processes the input and passes its output
+to the ```Dense``` layer. The positional *units* argument of 32 specifies the
+dimensionality of the output. The *return_sequences* argument is set to false
+so that the LSTM only returns an output at the final timestep of the input window.
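+
+To see the effect of the *return_sequences* argument on its own, the short
+sketch below (not part of the lesson pipeline; the batch, timestep, and
+feature sizes are made up for illustration) compares the output shapes of an
+```LSTM``` layer with the argument set to false and true.
+
+```python
+import tensorflow as tf
+
+# a dummy batch of 16 windows, each with 24 timesteps of 19 features
+dummy_inputs = tf.random.normal((16, 24, 19))
+
+# return_sequences=False: one output vector per window -> (16, 32)
+print(tf.keras.layers.LSTM(32, return_sequences=False)(dummy_inputs).shape)
+
+# return_sequences=True: one output vector per timestep -> (16, 24, 32)
+print(tf.keras.layers.LSTM(32, return_sequences=True)(dummy_inputs).shape)
+```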
+
 ```python
 multi_lstm_model = tf.keras.Sequential([
     tf.keras.layers.LSTM(32, return_sequences=False),
@@ -528,6 +541,9 @@ multi_performance['LSTM'] = multi_lstm_model.evaluate(multi_window.test, verbose
 
 print("LSTM performance against validation data:", multi_val_performance["LSTM"])
 print("LSTM performance against test data:", multi_performance["LSTM"])
+
+# plot example windows with the LSTM model's predictions
+multi_window.plot(multi_lstm_model)
 ```
 
 ```ouput
@@ -538,16 +554,21 @@ LSTM performance against validation data: [0.17599913477897644, 0.17859137058258
 LSTM performance against test data: [0.19873034954071045, 0.18935032188892365]
 ```
 
-And plot:
-
-```python
-multi_window.plot(multi_lstm_model)
-```
-
 ![Plot of a multi step LSTM neural network.](./fig/ep5_fig6.png)
 
 ## Evaluate
 
+Monitoring the epoch-by-epoch output of the different models shows that, for
+this dataset, the multi-step models outperform the single step models. This
+will not always be the case: model performance depends on many factors,
+including the size of the dataset, feature engineering, and data
+normalization, so multi-step models will not necessarily outperform single
+step models on other datasets.
+
+As with the single step models, the convolution neural network performed best
+overall, though only slightly better than the LSTM model. All three of the
+neural network models outperformed the baseline and linear models.
+
 ```python
 for name, value in multi_performance.items():
     print(f'{name:8s}: {value[1]:0.4f}')
@@ -561,7 +582,8 @@ Conv : 0.1891
 LSTM : 0.1894
 ```
 
-Plot MAE on validation and test dataframes.
+Finally, the plot below compares each model's mean absolute error (MAE)
+against both the validation and test dataframes.
 
 ```python
 x = np.arange(len(multi_performance))