Machine Learning for Time Series Data#
Time series data is an ordered sequence of data points collected or recorded at specific time intervals. Examples include stock prices, weather data, sensor readings, sales data, and website traffic. Unlike traditional datasets, time series data has temporal dependencies that must be captured to make accurate predictions.
In this section, we’ll cover:
Types of Time Series Problems
Key Challenges in Time Series Analysis
Popular Machine Learning Algorithms for Time Series
Deep Learning Approaches for Time Series
Project: Forecasting Stock Prices
Evaluation Metrics for Time Series
1. Types of Time Series Problems#
1.1. Time Series Forecasting:#
The goal is to predict future values based on historical data. For example, predicting the temperature for the next 7 days, forecasting stock prices, or estimating future product demand.
1.2. Time Series Classification:#
Given a sequence of data, the goal is to classify the entire series or detect patterns within it. For example, classifying an ECG signal as normal or abnormal, or detecting fraudulent transactions.
1.3. Anomaly Detection:#
Identifying outliers or abnormal patterns in time series data. This is important in applications like network security (detecting unusual traffic), sensor monitoring (identifying faulty equipment), or financial systems (detecting fraudulent activities).
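As a quick illustration (not a recipe from any particular library), one simple approach flags points that deviate strongly from a rolling baseline; the toy values and the 3-standard-deviation threshold below are illustrative assumptions:
import pandas as pd
# Toy series with one injected outlier (300) for illustration
s = pd.Series([112, 118, 132, 129, 121, 135, 148, 300, 136, 119])
# Compare each point to the mean/std of the *previous* window,
# so an outlier does not inflate its own baseline
baseline_mean = s.shift(1).rolling(window=5, min_periods=3).mean()
baseline_std = s.shift(1).rolling(window=5, min_periods=3).std()
z_scores = (s - baseline_mean) / baseline_std
# Flag points more than 3 standard deviations from the rolling baseline
print(s[z_scores.abs() > 3])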
1.4. Time Series Regression:#
Predicting a continuous output where the input includes one or more time series variables. This is similar to regular regression but takes into account the time dependencies.
2. Key Challenges in Time Series Analysis#
Temporal Dependency: Time series data points are not independent of each other. Each point may be influenced by previous time steps, and capturing this dependency is crucial.
Trend and Seasonality: Time series data often exhibit trends (upward or downward movements) and seasonal patterns (regular fluctuations that repeat over time).
Stationarity: Many models assume that the data is stationary (its statistical properties, such as mean and variance, do not change over time), which often does not hold for real-world series; a common diagnostic is shown in the sketch after this list.
Missing Data: Time series data often contains missing values, which need to be handled carefully, as they can affect the model’s predictions.
Multivariate Time Series: In some cases, multiple time series variables (e.g., temperature, humidity, wind speed) are correlated and need to be considered together.
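To make the stationarity point concrete, here is a minimal sketch of the Augmented Dickey-Fuller test from statsmodels, applied to the same toy series used in the examples below (maxlag is fixed to 1 here only because the series is so short):
from statsmodels.tsa.stattools import adfuller
# Toy univariate series (same values as in the examples below)
data = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]
# ADF test: the null hypothesis is that the series has a unit root (is non-stationary)
adf_stat, p_value, *_ = adfuller(data, maxlag=1)
print(f'ADF statistic: {adf_stat:.3f}, p-value: {p_value:.3f}')
# A small p-value (e.g., < 0.05) suggests stationarity; otherwise,
# differencing (the "I" in ARIMA) is a common remedy.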
3. Popular Machine Learning Algorithms for Time Series#
Several traditional machine learning algorithms can be adapted for time series forecasting, classification, and regression. These include:
3.1. ARIMA (AutoRegressive Integrated Moving Average)#
ARIMA is a widely used statistical method for time series forecasting. It combines three components: autoregression (AR), differencing to remove non-stationarity (I), and moving average (MA).
ARIMA is suited to univariate time series and is effective for short-term forecasting; the differencing step (the d parameter) is what lets it handle non-stationary data, by transforming the series into a stationary one before the AR and MA components are fit.
Example:
from statsmodels.tsa.arima.model import ARIMA
# Example univariate time series data (e.g., stock prices)
data = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]
# Fit an ARIMA model
model = ARIMA(data, order=(1, 1, 1)) # (p, d, q) parameters
model_fit = model.fit()
# Forecast future values
forecast = model_fit.forecast(steps=5)
print(forecast)
[112.3353047 112.0833227 112.07379565 112.07343544 112.07342182]
3.2. Exponential Smoothing (ETS)#
Exponential Smoothing methods (e.g., Holt-Winters) are used for forecasting time series data with trends and seasonality. The model captures the level, trend, and seasonality of the data.
It is particularly useful for making short-term forecasts.
Example:
from statsmodels.tsa.holtwinters import ExponentialSmoothing
# Example time series data with trend and seasonality
data = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]
# Fit an Exponential Smoothing model (Additive Trend and Seasonality)
model = ExponentialSmoothing(data, trend='add', seasonal='add', seasonal_periods=4)
model_fit = model.fit()
# Forecast future values
forecast = model_fit.forecast(steps=5)
print(forecast)
[152.75000479 151.24999381 140.00000884 141.00000786 161.25000724]
3.3. Random Forest for Time Series#
Random Forest can be adapted to time series forecasting by using lagged values of the series as features. For example, to predict $y_t$, you might use $y_{t-1}$, $y_{t-2}$, etc., as input features.
Random Forest can capture non-linear relationships between lagged variables, making it more flexible than linear models like ARIMA.
Example:
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
# Generate time-lagged features
data = pd.DataFrame({'value': [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]})
data['lag1'] = data['value'].shift(1)
data['lag2'] = data['value'].shift(2)
data.dropna(inplace=True)
# Train a Random Forest model
X = data[['lag1', 'lag2']]
y = data['value']
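# Note: random shuffling is used here only to keep the example short;
# a chronological train/test split is usually preferred for time series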
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestRegressor()
model.fit(X_train, y_train)
# Make predictions
y_pred = model.predict(X_test)
print(y_pred)
[138.82 133.41]
3.4. XGBoost for Time Series#
XGBoost is a gradient-boosting algorithm that can be adapted for time series forecasting. Similar to Random Forest, you create lagged features to capture the temporal dependencies in the data.
XGBoost is powerful for multivariate time series or datasets with complex relationships between variables.
Example:
import xgboost as xgb
from sklearn.metrics import root_mean_squared_error
from sklearn.model_selection import train_test_split
# Reuse the lagged features X and target y created in the Random Forest example above
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train an XGBoost model
model = xgb.XGBRegressor()
model.fit(X_train, y_train)
# Make predictions
y_pred = model.predict(X_test)
rmse = root_mean_squared_error(y_test, y_pred)
print(f'RMSE: {rmse}')
RMSE: 15.888949865206014
4. Deep Learning Approaches for Time Series#
4.1. Recurrent Neural Networks (RNNs)#
RNNs are designed to handle sequential data by passing the hidden state from one time step to the next. This makes them well-suited for time series forecasting and classification tasks where temporal dependencies are key.
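To show the mechanism rather than a library API, here is a minimal NumPy sketch of the recurrence inside a vanilla RNN cell; the weight names (W_x, W_h, b) and the toy, pre-scaled inputs are illustrative assumptions:
import numpy as np
rng = np.random.default_rng(0)
W_x = rng.normal(size=(1, 4))  # input-to-hidden weights
W_h = rng.normal(size=(4, 4))  # hidden-to-hidden weights
b = np.zeros(4)
h = np.zeros(4)  # initial hidden state
series = [0.1, 0.5, -0.3, 0.8]  # a short, already-scaled sequence
for x_t in series:
    # the hidden state carries information forward from all previous steps
    h = np.tanh(np.array([x_t]) @ W_x + h @ W_h + b)
print(h)  # final hidden state summarizing the whole sequence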
4.2. Long Short-Term Memory (LSTM)#
LSTMs are a type of RNN designed to handle longer sequences by mitigating the vanishing gradient problem. They are widely used for time series forecasting when long-term dependencies are crucial.
Example: LSTM for Time Series Forecasting
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras import Input
# Example univariate time series data (same toy values as above)
series = np.array([112, 118, 132, 129, 121, 135, 148, 148, 136, 119])
X = series[:-1].reshape((-1, 1, 1))  # inputs: (samples, time steps, features)
y = series[1:].reshape((-1, 1))      # targets: the next value in the series
# Build LSTM model
model = Sequential()
model.add(Input((1, 1)))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit(X, y, epochs=200, verbose=0)
# Predict the value following the last observation (119)
x_input = np.array([119]).reshape((1, 1, 1))
y_pred = model.predict(x_input)
print(y_pred)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 100ms/step
[[119.23905]]
4.3. Convolutional Neural Networks (CNNs) for Time Series#
CNNs can be applied to time series by treating the data as 1D signals. By applying convolutional filters, CNNs can detect short-term patterns and features in time series data.
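As a hedged sketch of this idea, the following builds a small 1D CNN in Keras on random placeholder data; the window length (30 steps), filter count, and kernel size are arbitrary choices for illustration:
import numpy as np
from keras.models import Sequential
from keras.layers import Conv1D, GlobalAveragePooling1D, Dense
from keras import Input
# Placeholder data: 100 windows of 30 time steps with a single feature
X = np.random.rand(100, 30, 1)
y = np.random.rand(100)
model = Sequential()
model.add(Input((30, 1)))
model.add(Conv1D(filters=32, kernel_size=3, activation='relu'))  # learns short-term local patterns
model.add(GlobalAveragePooling1D())
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit(X, y, epochs=5, verbose=0)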
4.4. Transformer Models for Time Series#
Transformers, originally developed for NLP tasks, can also be used for time series forecasting. They rely on attention mechanisms to capture dependencies between different time steps, making them effective for multivariate time series with complex temporal relationships.
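There is no single standard transformer layout for time series; the following is a minimal self-attention sketch in Keras, with an arbitrary window of 30 steps and 4 features chosen purely for illustration:
from keras import Input, Model
from keras.layers import MultiHeadAttention, LayerNormalization, GlobalAveragePooling1D, Dense
# Self-attention lets every time step attend to every other time step
inputs = Input((30, 4))
attn = MultiHeadAttention(num_heads=2, key_dim=16)(inputs, inputs)
x = LayerNormalization()(inputs + attn)  # residual connection + normalization
x = GlobalAveragePooling1D()(x)
outputs = Dense(1)(x)
model = Model(inputs, outputs)
model.compile(optimizer='adam', loss='mse')
model.summary()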
5. Project: Forecasting Stock Prices Using LSTM#
Goal:#
Predict future stock prices using LSTM based on historical prices.
Dataset:#
Use the Yahoo Finance API to fetch historical stock prices for a company (e.g., Apple). You can use the yfinance library to easily access stock data.
Steps:#
Data Preprocessing:
Fetch historical stock prices (e.g., “Close” prices) using yfinance.
Normalize the data using MinMax scaling.
Create sequences of historical prices to be used as input for the LSTM model.
LSTM Model:
Build an LSTM model using Keras or PyTorch.
Train the model to predict the next stock price based on the previous prices in the sequence.
Evaluation:
Evaluate the model using Root Mean Squared Error (RMSE) or Mean Absolute Error (MAE).
Code Example: Stock Price Forecasting Using LSTM#
import yfinance as yf
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras import Input
# Fetch historical stock prices (e.g., Apple)
df = yf.download('AAPL', start='2023-01-01', end='2024-01-01')
data = df['Close'].values.reshape(-1, 1)
# Normalize the data
scaler = MinMaxScaler(feature_range=(0, 1))
data_scaled = scaler.fit_transform(data)
# Prepare the data (create sequences of 60 days)
X, y = [], []
for i in range(60, len(data_scaled)):
    X.append(data_scaled[i-60:i, 0])
    y.append(data_scaled[i, 0])
X, y = np.array(X), np.array(y)
X = X.reshape(X.shape[0], X.shape[1], 1) # Reshape for LSTM
# Build the LSTM model
model = Sequential()
model.add(Input((X.shape[1], 1)))
model.add(LSTM(50, return_sequences=True))
model.add(LSTM(50))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# Train the model
model.fit(X, y, epochs=10, batch_size=32)
# Predict on the training sequences (in-sample) and convert back to the original price scale
y_pred = model.predict(X)
y_pred_scaled = scaler.inverse_transform(y_pred)
print(y_pred_scaled.reshape(-1)[:10])
Epoch 1/10
6/6 ━━━━━━━━━━━━━━━━━━━━ 2s 17ms/step - loss: 0.6054
...
Epoch 10/10
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step - loss: 0.0048
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 43ms/step
[163.93512 164.47 165.09428 165.79855 166.51762 167.16321 167.7607
168.21623 168.51826 168.67763]
6. Evaluation Metrics for Time Series#
Mean Absolute Error (MAE): Measures the average absolute difference between actual and predicted values.
Root Mean Squared Error (RMSE): Similar to MAE, but squares the errors before averaging, making it more sensitive to large errors.
Mean Absolute Percentage Error (MAPE): A percentage-based error metric useful for comparing across datasets with different scales.
R-squared: Measures the proportion of variance explained by the model.
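All four metrics are available in scikit-learn; here is a minimal sketch with made-up actual and predicted values:
import numpy as np
from sklearn.metrics import (mean_absolute_error, root_mean_squared_error,
                             mean_absolute_percentage_error, r2_score)
# Illustrative actual vs. predicted values
y_true = np.array([112, 118, 132, 129, 121])
y_pred = np.array([110, 120, 130, 131, 119])
print('MAE: ', mean_absolute_error(y_true, y_pred))
print('RMSE:', root_mean_squared_error(y_true, y_pred))
print('MAPE:', mean_absolute_percentage_error(y_true, y_pred))
print('R^2: ', r2_score(y_true, y_pred))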
Conclusion#
Time series analysis requires specialized models and methods due to its temporal structure. Traditional techniques like ARIMA and Exponential Smoothing are still effective for short-term univariate forecasting, while machine learning models like Random Forest and XGBoost are useful for more complex multivariate series. Deep learning models such as LSTMs, CNNs, and Transformers have become popular for capturing long-term dependencies and patterns in time series data.