Ocean Engineering, vol. 359, no. P2, 2026 (SCI-Expanded, Scopus)
Reliable and accurate prediction of significant wave height (SWH) is essential for the sustainable design and economic development of coastal and offshore structures. In this study, eight machine learning (ML) methods from three classes (neural network, tree-based, and linear) were applied as a data-driven correction framework to minimize systematic biases in numerical wave hindcasts: multi-layer perceptron (MLP), Kolmogorov-Arnold network (KAN), gradient boosting (GB), extreme gradient boosting (XGB), adaptive boosting (AB), random forest (RF), M5 Prime (M5P), and linear regression (LR). For the Black Sea, satellite altimeter observations (SAT SWH) served as the target for the ML models, which used wave (CSWAN SWH, Wind-Sea, CSWAN Tm02, and SWH Swell), wind (WindSy and WindSx), and location (Lon, Lat, and CoastDist) features. A feature significance analysis based on Cohen's d statistic was used to reduce model complexity by removing insignificant features. Finally, the relationship between the input features and SAT SWH was analyzed using LOWESS curves with SHAP values. The results indicate that the ML methods significantly improved the agreement between numerical hindcasts and satellite observations, with tree-based methods performing best. In addition to CSWAN SWH, the most effective features for a spatial correction are location (Lon and Lat), wind (WindSy and WindSx), and SWH Swell.
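
The correction idea can be illustrated with the simplest of the eight methods, linear regression, reduced here to a single input feature. The sketch below is not the study's implementation: the collocated values are synthetic, the helper names (`fit_linear_correction`, `bias`, `rmse`) are hypothetical, and the actual framework uses several wave, wind, and location features rather than CSWAN SWH alone. It only shows how a model trained against satellite SWH can remove a systematic bias from hindcast SWH.

```python
# Illustrative sketch only: synthetic numbers, not data from the study.
from statistics import mean


def fit_linear_correction(hindcast, observed):
    """Least-squares fit of observed ~ slope * hindcast + intercept."""
    x_bar, y_bar = mean(hindcast), mean(observed)
    sxx = sum((x - x_bar) ** 2 for x in hindcast)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hindcast, observed))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def bias(predicted, observed):
    """Mean error (predicted minus observed)."""
    return mean(p - o for p, o in zip(predicted, observed))


def rmse(predicted, observed):
    """Root-mean-square error."""
    return mean((p - o) ** 2 for p, o in zip(predicted, observed)) ** 0.5


# Hypothetical collocated pairs in metres: hindcast SWH vs. satellite SWH.
cswan_swh = [0.5, 0.8, 1.2, 1.6, 2.1, 2.7, 3.4]
sat_swh = [0.7, 1.0, 1.5, 1.9, 2.5, 3.1, 3.9]

slope, intercept = fit_linear_correction(cswan_swh, sat_swh)
corrected = [slope * x + intercept for x in cswan_swh]

print(f"raw:       bias={bias(cswan_swh, sat_swh):+.3f} m, "
      f"rmse={rmse(cswan_swh, sat_swh):.3f} m")
print(f"corrected: bias={bias(corrected, sat_swh):+.3f} m, "
      f"rmse={rmse(corrected, sat_swh):.3f} m")
```

In this toy setting the fit is evaluated on the same pairs it was trained on, so the corrected bias is zero by construction. A real evaluation would hold out independent satellite passes for testing.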