Short- and long-term forecasting in the Arctic region is becoming increasingly important for global logistics and engineering. Here, we present our first findings on the application of machine learning (ML) methods to oceanographic data. We focus on the Barents Sea and analyze wind waves and surface currents. We applied several ML models to simulate observed time series and obtained short- and long-term forecasts. The Long Short-Term Memory (LSTM) model and the XGBoost algorithm fit the observed time series better than the other models tested.
The Russian Arctic has changed rapidly as a result of modern climate change, with a significant impact on global economics and logistics. The Northern Sea Route is the key transport route in this region. However, the region is still characterized by harsh hydrometeorological and ice conditions during the autumn–winter season, and it remains poorly covered by regular observations. Short- and long-term forecasting has therefore become more important for logistics, operations, and engineering. Numerical modeling and forecasting of atmospheric, oceanic, and ice conditions should help plan safe and cost-effective routes.
In response to this need, we are developing a complex forecasting system at the Marine Research Center using numerical modeling and Artificial Intelligence methods. Recently, the application of machine learning (ML) methods in oceanography and geosciences has become more popular. Machine learning can be defined as a field of inquiry devoted to understanding and building methods that ‘learn’, that is, methods that leverage data to improve performance on some set of tasks (Mitchell, 1997). Machine learning algorithms build a model based on sample data, known as training data, to make predictions or decisions without being explicitly programmed to do so. Adytia et al. (2018) used the autoregressive integrated moving average (ARIMA) model to predict the significant wave height obtained from the SWAN numerical model, with a correlation exceeding 0.94. Their results showed an accurate short-term prediction with ARIMA(2,2,2). Yang et al. (2019) described the seasonal autoregressive integrated moving average (SARIMA) model as a useful tool for long-term significant wave height prediction in the South China Sea. They also based their work on the results of numerical modeling, using WAVEWATCH III. However, Ali et al. (2021) concluded that ARIMA performed the worst in significant wave height prediction. In their study, the best results were obtained by a deep learning model based on gated recurrent units (GRU). They also tested the Long Short-Term Memory (LSTM) model. Several publications have proposed hybrid approaches to forecasting and network training in geosciences. Tsai et al. (2021) described a differentiable parameter learning framework to train models with complex nonlinear and nonstationary data. Tang et al. (2021) published a multicomponent hybrid model for wave height prediction. Wu et al. (2020) proposed a hybrid physics-based machine learning model and successfully trained it with wind-driven wave data.
Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) can also be used in geosciences. For example, Wang et al. (2020) applied a CNN to forecast cyclone tracks.
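To make the ARIMA(p, d, q) notation used in the studies cited above more concrete, the following minimal sketch may help. It is not the method of Adytia et al. (2018) or of any other cited work: it implements only the autoregressive and integrated parts, that is, an ARIMA(2, d, 0) model fitted by ordinary least squares, and it omits the moving-average terms so that it stays dependency-free. All function names are hypothetical, and the input is a synthetic series rather than observational or SWAN data.

```python
# Illustrative sketch only: AR-only ARIMA(2, d, 0) in pure Python.
# Function names are hypothetical; the data below are synthetic.


def difference(series, d):
    """Apply first-order differencing d times (the 'I' part of ARIMA)."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def fit_ar2(x):
    """Least-squares fit of x[t] = a1*x[t-1] + a2*x[t-2] (no intercept)."""
    if len(x) < 4:
        raise ValueError("series too short to fit AR(2)")
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(2, len(x)):
        u, v, y = x[t - 1], x[t - 2], x[t]
        s11 += u * u
        s12 += u * v
        s22 += v * v
        r1 += u * y
        r2 += v * y
    # Solve the 2x2 normal equations with Cramer's rule.
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("degenerate series: cannot fit AR(2)")
    a1 = (r1 * s22 - r2 * s12) / det
    a2 = (r2 * s11 - r1 * s12) / det
    return a1, a2


def forecast_ar2_integrated(series, d, steps):
    """Fit AR(2) to the d-times differenced series, forecast `steps`
    values, and undo the differencing to return to the original scale."""
    levels = [difference(series, k) for k in range(d + 1)]
    a1, a2 = fit_ar2(levels[-1])
    diffed = list(levels[-1])
    # Last known value at each differencing level 0 .. d-1.
    tails = [lvl[-1] for lvl in levels[:d]]
    preds = []
    for _ in range(steps):
        value = a1 * diffed[-1] + a2 * diffed[-2]
        diffed.append(value)
        # Integrate back from the deepest level to the original series.
        for k in range(d - 1, -1, -1):
            tails[k] += value
            value = tails[k]
        preds.append(value)
    return preds


def synthetic_series(n):
    """Synthetic series whose second difference follows
    w[t] = 0.5*w[t-1] - 0.3*w[t-2] (an assumption for this demo)."""
    w = [1.0, 0.5]
    while len(w) < n - 2:
        w.append(0.5 * w[-1] - 0.3 * w[-2])
    x = [2.0, 2.1]
    for wi in w[: n - 2]:
        x.append(2 * x[-1] - x[-2] + wi)
    return x


series = synthetic_series(30)
train, held_out = series[:27], series[27:]
predicted = forecast_ar2_integrated(train, d=2, steps=3)
print("predicted:", predicted)
print("held out: ", held_out)
```

Because the synthetic series is generated without noise, the forecast should closely reproduce the held-out values; with real wave or current records the fit would be approximate, and a full ARIMA(2,2,2) model would additionally estimate two moving-average coefficients.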