Time Series Forecasting with TensorFlow and Visualization Techniques to Perform Predictions Beyond the Validation Period
How to Extend Your Predictions Beyond the Validation Period

In this article, I’ll guide you through the process of building time series models using TensorFlow, a powerful framework for constructing and training neural networks. I’ll show you a variety of neural network architectures for time series forecasting, ranging from simple models like SimpleRNN to more complex ones such as LSTM. Additionally, I’ll present the techniques I’ve used to make and visualize predictions beyond the validation period.
Setting up the Environment
I’ve used the following libraries: TensorFlow with Keras for building neural networks, Matplotlib for visualization, NumPy for numerical operations, and Scikit-Learn for data preprocessing.
import numpy as np
import tensorflow as tf
from matplotlib import pyplot as plt
from sklearn.preprocessing import MinMaxScaler
Data Preparation
Data preparation is fundamental for the success of any machine learning model. In this section, I will perform several steps to prepare the data for training and validation.
Separating Data and Time Steps
The first step is to separate the time steps from the actual data.
For Short Time Series Data (data stored in an array): we can create an array of time steps using ‘np.arange()’:
#For short time series data, data stored in an array, I'll do the following:
dummy_data = np.array([1, 2, 3,...])
time_step = np.arange(len(dummy_data))
For Larger Datasets Stored in Files (e.g., CSV Files): we can read the data and corresponding time steps from the file:
# For larger datasets stored in files, such as CSV files
import csv
from datetime import datetime

time_step = []
data = []
with open("file.txt", "r", encoding="utf-8") as f:
    csv_reader = csv.reader(f, delimiter=",")
    # Skip the header
    next(csv_reader)
    # Skip lines containing NUL characters
    lines = (line for line in csv_reader if not any("\0" in field for field in line))
    # Iterate through the remaining lines
    for line in lines:
        # Assuming the first column is the date and the second column is the value
        time_step.append(datetime.strptime(line[0], "%Y-%m-%d"))
        data.append(float(line[1]))
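Since the rest of the pipeline slices and scales the data with NumPy, it’s convenient to convert the parsed lists to arrays afterwards. This is a small convenience step, reusing the variable names from the snippet above:
# Convert the parsed lists to NumPy arrays so they can be sliced and scaled
# in the same way as the short-series example
time_step = np.array(time_step)
data = np.array(data, dtype=np.float32)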
Splitting Data into Training and Validation Sets
After obtaining the time steps and data, we split them into training and validation sets to train and evaluate the model’s performance.
# Determine the split point between training and validation data
split_time =
# Split time steps into training and validation sets
time_train = time_step[:split_time]
time_valid = time_step[split_time:]
# Split data into training and validation sets
x_train = dummy_data[:split_time]
x_valid = dummy_data[split_time:]
# Use Min-Max scaling
# Initialize MinMaxScaler
scaler = MinMaxScaler()
# Reshape and scale the training data
x_train_scaled = scaler.fit_transform(np.array(x_train).reshape(-1, 1)).flatten()
# Scale the validation data using the same scaler
x_valid_scaled = scaler.transform(np.array(x_valid).reshape(-1, 1)).flatten()
Generating Windowed Datasets
After that, I’ve created a function to generate windowed datasets for both training and validation. Each window consists of a fixed number of data points. For instance, with a window size of 4, I use the last four data points in each window to predict the next one.
The windowed dataset produces two-dimensional batches (batch size and time steps) of windows over the data.
def windowed_dataset(series, window_size):
    # Create a TensorFlow dataset from the input series
    dataset = tf.data.Dataset.from_tensor_slices(series)
    # Window the dataset into fixed-size windows of window_size + 1 elements
    # (window_size inputs plus one target), shifting the window by 1 at each
    # step and dropping any remaining data that doesn't form a complete window
    dataset = dataset.window(window_size + 1, shift=1, drop_remainder=True)
    # Flatten the dataset of windows into individual windows and batch them
    dataset = dataset.flat_map(lambda window: window.batch(window_size + 1))
    # Map each window to a tuple: all but the last element as the input window,
    # and the last element as the target
    dataset = dataset.map(lambda window: (window[:-1], window[-1]))
    # Batch the dataset with a batch size of 1 and prefetch it for improved performance
    return dataset.batch(1).prefetch(1)
# Window size
window_size =
# Create windowed dataset for training
dataset_train = windowed_dataset(x_train_scaled, window_size)
# Create windowed dataset for validation
dataset_valid = windowed_dataset(x_valid_scaled, window_size)
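To confirm that the windows have the expected structure, it can help to inspect a single batch. This is a quick sanity check, assuming window_size has already been set (e.g. window_size = 4):
# Inspect one batch: each batch should contain one input window of window_size
# values and one target value
for window_batch, target_batch in dataset_train.take(1):
    print("Input window shape:", window_batch.shape)  # (1, window_size)
    print("Target shape:", target_batch.shape)        # (1,)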
After following these steps, the data is properly prepared for ingestion into the model.
Defining the Model
Keras provides many building blocks for time series forecasting models. I’ll briefly describe some of the architectures I’ve used, starting with simpler structures and gradually increasing in complexity. It’s important to note that these structures are just examples: the number of units and the number of layers need to be fine-tuned for the dataset being used.
Simple Recurrent Neural Network
RNNs are neural networks designed for processing sequences of data while retaining information from earlier time steps. However, they may struggle to retain information over long sequences.
At each time step, a different batch of input data is fed into the RNN cell. The output of the RNN cell at each time step depends not only on the current input batch but also on the previous state of the cell, which captures information from earlier time steps.

Here’s a simple example of an RNN model with two recurrent layers and a final output layer. When using RNNs, the input data needs to be reshaped, because RNNs expect input as a 3D tensor (batch size, time steps, features). [1]
model = tf.keras.models.Sequential([
    tf.keras.layers.Lambda(lambda x: tf.expand_dims(x, axis=-1),
                           input_shape=[window_size]),
    tf.keras.layers.SimpleRNN(units, return_sequences=True),
    tf.keras.layers.SimpleRNN(units),
    tf.keras.layers.Dense(1)
])
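As a quick check, assuming units and window_size have already been set (e.g. units = 32 and window_size = 4 for illustration), the model summary shows how the Lambda layer adds the feature dimension the recurrent layers expect:
# Print the layer output shapes (illustrative, assuming units = 32 and window_size = 4)
model.summary()
# Expected output shapes:
#   Lambda    -> (None, 4, 1)   # the feature dimension added by expand_dims
#   SimpleRNN -> (None, 4, 32)  # return_sequences=True keeps every time step
#   SimpleRNN -> (None, 32)     # the last recurrent layer returns only the final state
#   Dense     -> (None, 1)      # a single predicted value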
Long Short-Term Memory networks
LSTM networks are a type of recurrent neural network known for their ability to retain information over many time steps. LSTMs achieve this by incorporating a memory cell that passes information from one cell to another and from one time step to the next within the network.

Additionally, LSTMs can be bidirectional, allowing them to analyze input data not only in the forward direction but also in reverse.
Here’s an example of an LSTM model with two bidirectional LSTM layers. In my experience, using multiple layers in LSTMs tends to outperform single-layer models by capturing both low-level and high-level features. [2]
model = tf.keras.models.Sequential([
    tf.keras.layers.Lambda(lambda x: tf.expand_dims(x, axis=-1), input_shape=[None]),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(units, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(units)),
    tf.keras.layers.Dense(1)
])
Adjusting the Learning Rate
These architectures need to be adapted to each dataset. What works for me is to adjust the learning rate first and then experiment with different numbers of layers and units.
Have a look at the code for adjusting the learning rate and the number of layers in this article: Optimizing Neural Network Performance through Learning Rate Tuning and Hidden Layer Unit Selection.
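As a minimal sketch of one common approach (the article above covers this in detail; the starting rate of 1e-8, the momentum of 0.9, and the 100 epochs below are illustrative values, not recommendations), a learning-rate sweep can be run with tf.keras.callbacks.LearningRateScheduler and the loss then plotted against the learning rate:
# Learning-rate sweep: increase the learning rate slightly every epoch and record the loss
lr_schedule = tf.keras.callbacks.LearningRateScheduler(
    lambda epoch: 1e-8 * 10 ** (epoch / 20))
model.compile(loss="mse",
              optimizer=tf.keras.optimizers.SGD(learning_rate=1e-8, momentum=0.9))
history_lr = model.fit(dataset_train, epochs=100, callbacks=[lr_schedule])
# Plot loss against learning rate and pick a rate from the stable, descending region
lrs = 1e-8 * 10 ** (np.arange(100) / 20)
plt.semilogx(lrs, history_lr.history["loss"])
plt.xlabel("Learning rate")
plt.ylabel("Loss")
plt.show()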
Important Note: A very useful piece of advice I wish I had known a long time ago: apart from experimenting with different numbers of layers and units, I advise you to always control how the model’s weights are initialized when training. For more detailed information and code, check out my article titled Achieving Reproducibility in Neural Network Predictions.
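The article above goes into the details; as a minimal sketch of the general idea, one common way to make the initial weights (and therefore the predictions) reproducible is to fix the random seeds before building the model (the seed value 42 is arbitrary):
# Fix the random seeds before building the model so that weight initialization
# is reproducible across runs (the seed value is arbitrary)
seed = 42
np.random.seed(seed)
tf.random.set_seed(seed)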
Evaluate the model on the validation dataset
To be able to visualize the performance of the model and the predictions, let’s compile and train the model.
# Compile the model
model.compile(loss="mse", optimizer=tf.keras.optimizers.SGD(), metrics=["mae"])
# Train the model
history = model.fit(dataset_train, epochs=, validation_data=dataset_valid)
# Evaluate the model
evaluation_result = model.evaluate(dataset_valid)
print("Validation Loss and MAE:", evaluation_result)
In time series forecasting, it’s common to use the Mean Squared Error (MSE) or the Mean Absolute Error (MAE) to validate the performance of the models. Unlike MSE, MAE does not square the errors but instead uses their absolute values. This approach does not overly penalize large errors, making it suitable for scenarios where all errors should be treated equally.
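As a tiny numerical illustration of the difference (the error values below are made up for the example):
# Made-up errors to illustrate how MSE penalizes large errors more than MAE
errors = np.array([1.0, 1.0, 10.0])
print("MAE:", np.mean(np.abs(errors)))  # 4.0
print("MSE:", np.mean(errors ** 2))     # 34.0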
Visualizing training and validation loss
Let’s now create two plots showing the Loss and MAE curves over epochs for both training and validation data.
# Plot training and validation loss over epochs
plt.figure(figsize=(10, 6))
# Plot training loss
plt.subplot(1, 2, 1)
plt.plot(history.history['loss'], label='Training Loss', color='blue')
plt.plot(history.history['val_loss'], label='Validation Loss', color='red')
plt.title('Training and Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
# Plot training and validation MAE over epochs
plt.subplot(1, 2, 2)
plt.plot(history.history['mae'], label='Training MAE', color='blue')
plt.plot(history.history['val_mae'], label='Validation MAE', color='red')
plt.title('Training and Validation MAE')
plt.xlabel('Epoch')
plt.ylabel('MAE')
plt.legend()
plt.tight_layout()
plt.show()
This is an example of the plots which can be obtained:

Making Predictions on the Validation Dataset
Now that I’ve trained the model and validated its performance using the MSE metric, it’s time to apply the model to the validation dataset to make predictions.
Below is the code to make predictions on the validation dataset. Additionally, I’ve included how to print the predictions beyond the validation period alongside the actual data for each time step.
# Predict on the validation dataset and beyond it
num_predictions_beyond_validation = 2
validation_predictions = []
# Use the last window_size values from the training set as the starting window
current_window = x_train_scaled[-window_size:]
# Adjust time steps for validation predictions
validation_time_steps = np.arange(len(x_valid_scaled))
for time in range(len(validation_time_steps) + num_predictions_beyond_validation):
    # Reshape the current window to match the model's input shape
    current_window_reshaped = np.array(current_window[-window_size:]).reshape(1, -1)
    # Predict the next value using the model
    predicted_value_scaled = model.predict(current_window_reshaped)[0, 0]
    # Append the predicted value to the list of predictions
    validation_predictions.append(predicted_value_scaled)
    if time < len(x_valid_scaled):
        actual_value_scaled = x_valid_scaled[time]
        # Denormalize the actual value
        actual_value_denormalized = scaler.inverse_transform(np.array(actual_value_scaled).reshape(1, -1)).flatten()
        # Denormalize the predicted value
        predicted_value_denormalized = scaler.inverse_transform(np.array(predicted_value_scaled).reshape(1, -1)).flatten()
        # Print the actual and predicted values during validation along with the time step
        print(f'Time: {time_valid[time]}, Actual: {actual_value_denormalized}, Predicted: {predicted_value_denormalized}')
        # Update the current window for the next iteration using the true value from the validation set
        current_window = np.append(current_window, x_valid_scaled[time])[1:]
    else:
        # Denormalize the predicted value beyond validation
        predicted_value_denormalized = scaler.inverse_transform(np.array(predicted_value_scaled).reshape(1, -1)).flatten()
        # Print the predicted value beyond validation along with the time step
        print(f'Time: {time_valid[-1] + time - len(x_valid_scaled) + 1}, Predicted (Beyond Validation): {predicted_value_denormalized}')
        # Feed the prediction back in as the next input
        current_window = np.append(current_window, predicted_value_scaled)[1:]
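If the time steps are dates (as in the CSV example earlier) rather than integers, the time labels for the predictions beyond validation can be generated with a timedelta instead of integer arithmetic. This is a small sketch, assuming daily data:
# Build date labels for the predictions beyond validation (assuming daily data;
# adjust the step to your sampling interval)
from datetime import timedelta
future_times = [time_valid[-1] + timedelta(days=i + 1)
                for i in range(num_predictions_beyond_validation)]
print(future_times)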
You can also visualize the predictions beyond the validation period in a plot.
# Plot the original data, training data, validation data, and predictions
plt.figure(figsize=(10, 6))
# Plot original data in black
plt.plot(time_step, dummy_data, label='Original Data', marker='o', linestyle='-', color='black')
# Plot training data in blue
plt.plot(time_step[:split_time], dummy_data[:split_time], label='Training Data', color='blue')
# Plot validation data in red
plt.plot(time_valid, dummy_data[split_time:], label='Validation Data', color='red')
# Denormalize validation predictions
validation_predictions_denormalized = scaler.inverse_transform(np.array(validation_predictions).reshape(-1, 1)).flatten()
# Plot predictions (Validation)
plt.plot(time_valid, validation_predictions_denormalized[:len(time_valid)], label='Predictions (Validation)', color='orange')
# Highlight the last few predictions beyond the validation set in red
last_predictions_beyond_validation_denormalized = scaler.inverse_transform(np.array(validation_predictions[-num_predictions_beyond_validation:]).reshape(-1, 1)).flatten()
time_last_predictions_beyond_validation = np.arange(split_time + len(time_valid), split_time + len(time_valid) + num_predictions_beyond_validation)
plt.scatter(time_last_predictions_beyond_validation, last_predictions_beyond_validation_denormalized, color='red', marker='X', label='Last Predictions Beyond Validation')
plt.legend()
plt.title('Original Data, Training Data, Validation Data, and Predictions')
plt.xlabel('Time Step')
plt.ylabel('Values')
plt.show()
This is an example of an image from the code above, based on a dummy dataset.

Conclusion
Visualizing time series data and model predictions is essential for understanding the performance of your model. In this article, I addressed a problem I have faced several times: how to make and visualize time series predictions beyond the validation period, which can be useful for checking the model against actual data afterward.
Bibliography
[1] SimpleRNN: https://www.tensorflow.org/api_docs/python/tf/keras/layers/SimpleRNN#call_arguments
[2] LSTM: https://www.tensorflow.org/api_docs/python/tf/keras/layers/LSTM