Abstract
In the course of digitization, there is an increased interest in sensor data, includ-
ing data from old systems with a service life of several decades. Since the installation
of sensor technology can be quite expensive, soft sensors are often used to enhance
the monitoring capabilities. Soft sensors use easy-to-measure variables to predict
hard-to-measure variables, employing arbitrary models. This is particularly challeng-
ing if the observed system is complex and exhibits dynamic behavior, e.g., transient
responses after changes in the system. Data-driven models are, therefore, often
used. As recent studies suggest using Transformer-based models for regression
tasks, this paper investigates the use of Transformer-based soft sensors for modelling
the dynamic behavior of systems. To this end, the performance of Multilayer Perceptron
(MLP) and Long Short-Term Memory (LSTM) models is compared to that of Transformers
on two data sets featuring dynamic behavior in the form of time-delayed
variables. The results show that while the Transformer can map time delays, it is
outperformed by both the MLP and the LSTM. This deviation from previous Transformer
evaluations is noteworthy: it may stem from the dynamic characteristics of the input
data, and the Transformer's attention-based mechanism may not be optimized for
sequential data. Notably, previous studies in this area did not focus on time-delayed
dynamic variables.