Aquatic Center Sysid Benchmark

Ressources

This dataset is supplied with two toolkits to perform all the computations necessary to obtain an evaluation according to the authors’ criteria. This toolkit is all the more relevant as the functions needed to obtain the test scores are not available in the standard Matlab toolkit or in the Python community library.

Two brief presentations are given here to illustrate the user-friendly tools that come with the dataset.

Matlab tools box

The matlab toolbox is available for idss, idpoly and custom user class (according to the toolbox abstract class). For all three models, functions are available to visualise the prediction horizon, estimate the accuracy in relation to the prediction depth and calculate the scores. This toolkit also contains data directly available with the iddata class for a user-friendly API.

Here we show an example with the ssest estimation tools with a short section of the data set. Feel free to use the arx function or other custom tools.

For this example, we first include the dataset with the toolbox.

1
[data_train, data_test] = load_dataset();

Using the Matlab system identification toolkit, we can directly estimate a state space model.

1
2
nx=2;
sys = ssest(getexp(data_train, "Exp1"), nx);

After identifying the system, we can use the first function to obtain predictions according to a given observation size (here 22 steps) and a fixed prediction horizon (8 hours). The plot is available Figure 1.

1
2
3
4
5
6
7
8
9
% Generate prediction
[real_data, pred_data]  = predict_horizon_ss(sys, getexp(data_train, "Exp1"), 8*6, 20);

tiledlayout(2,1);
nexttile
plot([real_data(1:500,1) pred_data(1:500,1)]);

nexttile
plot([real_data(1:500,2) pred_data(1:500,2)]);
Figure 1: Matlab prediction plot

We can then generate the accuracy w.r.t the prediction depth. This function can take (like all the function) a iddata object. The function result is available Figure 2.

1
2
3
4
5
6
7
8
% Get accuracy curve
acc_vec = eval_horizon_ss(sys, data_test, 48, 22);

hold on;
plot(acc_vec);
xlabel('prediction horizon [step]');
ylabel('RMSE');
hold off;
Figure 2: Matlab accuracy curve

Finally, we can generate the model score based on the test set and the predefined criteria.

1
2
3

% Get benchmark score
score_list = score_ss(sys, data_test, 48, 22);

This short example can be done with other models with the same API.

Python tools box

The python toolkit can make available dataset and run accuracy scores based on an 8-hour horizon prediction (like the Matlab version). Here we will show an example of use. For this example, we will use a basic first-order ARX model.

First, we need to include all the necessary libraries.

1
2
3
4
import numpy as np
from sklearn.linear_model import Lasso
import matplotlib.pyplot as plt
import swp_sysid as swp 

We can then download the data directly and select a section for demonstration purposes.

1
2
3
4
5
6
# Load dataset
data_train, data_test, info = swp.data.load_dataset()

# selection only a small portion of the whole dataset
# (here last experiments and the first 2000 steps)
data = data_train[-1].iloc[0:2000]

Then we define the following ARX model : $$ y[k] = Ay[k-1] + Bu[k-1]$$

This model predicts the complete horizon in an iterative way. For training the model, we use sklearn’s Lasso model to add regularisation.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
# Get ARX regression units 
y_t_1, u_t, y_t = data.y[0][1:], data.u[0][:-1], data.y[0][:-1]

X_mat = np.concatenate([y_t, u_t], axis=1)
Y_mat = y_t_1


# Train the ARX model 
model = Lasso(alpha=0.004)
model.fit(X_mat, Y_mat)

The last part consists of redefining our model according to the API of the toolbox. For this we use the TimeSerieModel abstract class. It requires a prediction method that takes several experiments and runs all predictions.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
# Define the model class 
class ModelLin(swp.TimeSerieModel):
    def forecast(self, past_datas: swp.DataSet, futur_datas: swp.DataSet) -> np.ndarray:
        y_t = np.array([ y[-1,:]  for y in past_datas.y]) # We define the y[k-1]
        X_mat = np.concatenate([np.array(past_datas.u)[:,-1:,:],np.array(futur_datas.u)], axis=1) # We get all the u[k] vectors
        
        # Create a list to store all prediction
        prediction_list = []
        
        # We compute iteratively the prediction
        for i in range(futur_datas.get_size()[0]): 
            # y[k] = THETA  [y[k-1], u[k-1]]^T
            y_t = model.predict(np.concatenate([y_t, X_mat[:,i,:]], axis=1)) 
            prediction_list.append(y_t.copy())
        return np.transpose(np.array(prediction_list),(1,0,2))

We are now able to use all the tools in the benchmark. First, we can display several predictions (see Figure 3 and the code below).

1
2
3
4
5
# Get the model instance
final_model = ModelLin()

# Plot with the toolbox prediction horizon with recalibration
swp.evaluate.compare(data, final_model, K=6*8)
Figure 3: Compare graphique

We can also calculate the accuracy of the model per pool with another function (see code below and Figure 4).

1
2
3
4
5
# Get the accuracy with horizon depth
acc = swp.evaluate.acc_times_series(data, final_model, K=6*8)

plt.plot(acc, label=["Pool 1", "Pool2"])
plt.show()
Figure 4: Acc time series plot

We can finally calculate the reference score with the function get_benchmark_test_score.

1
2
# Get benchmark score
scores = swp.evaluate.get_benchmark_test_score(data_test, final_model)