Machine Learning-Based Residual Bias Correction of SWAN Significant Wave Height Using Satellite Altimeter Observations as Ground Truth in the Black Sea

Anık, Emirhan; AMAROUCHE, KHALID; KANKAL, MURAT; AKPINAR, ADEM

doi:10.1016/j.oceaneng.2026.125991

Anık E. M., AMAROUCHE K., KANKAL M., AKPINAR A.

Ocean Engineering, vol. 359, no. P2, 2026 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 359 Issue: P2
Publication Date: 2026
DOI: 10.1016/j.oceaneng.2026.125991
Journal Name: Ocean Engineering
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, Environment Index, Geobase, ICONDA Bibliographic, INSPEC
Keywords: Machine learning, Residual bias correction, Satellite altimeter, Significant wave height, Tree-based methods, Wave hindcast enhancement
Affiliated with Bursa Uludağ University: Yes

Abstract

Reliable and accurate prediction of significant wave height (SWH) is essential for the sustainable design and economic development of coastal and offshore structures. In this study, eight machine learning (ML) methods from three classes (neural network, tree-based, and linear) are applied as a data-driven correction framework to minimize systematic biases in numerical wave hindcasts: multi-layer perceptron (MLP), Kolmogorov-Arnold network (KAN), gradient boosting (GB), extreme gradient boosting (XGB), adaptive boosting (AB), random forest (RF), M5 Prime (M5P), and linear regression (LR). Satellite altimeter observations (SAT SWH) were used as the target for the ML models, which took as inputs wave (CSWAN SWH, Wind-Sea, CSWAN Tm02, and SWH Swell), wind (WindSy and WindSx), and location (Lon, Lat, and CoastDist) features in the Black Sea. A feature significance analysis based on Cohen's d statistic was used to reduce model complexity by removing insignificant features. Finally, the relationship between the input features and SAT SWH was analyzed using LOWESS curves with SHAP values. The findings indicate that the ML methods significantly improved the agreement between numerical hindcasts and satellite observations, with tree-based methods performing best. In addition to CSWAN SWH, the most effective features for spatial correction were location (Lon and Lat), wind (WindSy and WindSx), and SWH Swell.