Ain Shams Engineering Journal, vol. 17, no. 1, 2026 (SCI-Expanded, Scopus)
This study compares the performance of different neural network models in predicting dissolved oxygen (DO) concentrations in the Clackamas River, USA. It examines how site characteristics, water flow and quality parameters, and the distribution of the data affect these predictions. The Kolmogorov–Arnold networks (KANs) method, applied for the first time in this study, is compared with the multilayer perceptron (MLP), bidirectional long short-term memory (Bi-LSTM), and bidirectional gated recurrent unit (Bi-GRU) methods. Eight models were created using daily mean water temperature (T), discharge (Q), pH, specific conductance (SC), and DO data from two monitoring sites for the 2019–2021 period, and the models were evaluated using four performance metrics. Uncertainty (prediction interval) and significance (paired t-test) analyses were also applied to assess the prediction success of the methods from a perspective other than the performance metrics. Furthermore, the relationship between the input features and DO concentration was examined using LOWESS curves with SHAP values. The results revealed that the KANs and MLP methods were more accurate than Bi-LSTM and Bi-GRU. The KANs method combines high prediction success with interpretability, owing to its ability to generate symbolic equations. In addition, the distribution characteristics of the input variables affected the performance of MLP, Bi-LSTM, and Bi-GRU more than that of KANs. Logarithmic transformation improved model success for non-normally distributed data. This study fills an essential gap in the literature by applying the KANs method to water quality modeling for the first time.
The results show that the KANs method offers an explainable and reliable alternative with low data requirements, and can therefore be an effective tool for DO prediction and water quality management where data are scarce.
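Two of the steps named in the abstract, the logarithmic transformation of non-normally distributed inputs and the paired t-test between models, can be illustrated with a minimal Python sketch. This is not the authors' code: the discharge values, the error values, and the helper names `skewness` and `paired_t_statistic` are hypothetical, and applying the t-test to per-day absolute errors is an assumption about how such a comparison might be set up.

```python
import math
import statistics


def skewness(values):
    """Population skewness (third standardized moment) of a sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def paired_t_statistic(errors_a, errors_b):
    """Paired t statistic for two equal-length error series (df = n - 1)."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    return mean_diff / (sd_diff / math.sqrt(n))


# Hypothetical right-skewed daily discharge values (illustrative only).
discharge = [20.0, 22.0, 25.0, 30.0, 40.0, 60.0, 120.0, 400.0]
# Natural-log transform, which compresses the long right tail.
log_discharge = [math.log(q) for q in discharge]

# Hypothetical absolute DO prediction errors (mg/L) of two models
# evaluated on the same days (illustrative only).
errors_model_a = [0.10, 0.12, 0.08, 0.15, 0.11, 0.09]
errors_model_b = [0.20, 0.18, 0.15, 0.22, 0.19, 0.16]

t_stat = paired_t_statistic(errors_model_a, errors_model_b)

print("skewness before log transform:", skewness(discharge))
print("skewness after log transform: ", skewness(log_discharge))
print("paired t statistic (A - B):   ", t_stat)
```

A negative t statistic here means that model A's errors are smaller on average than model B's on the same days; whether the difference is significant depends on comparing the statistic with the t distribution for n - 1 degrees of freedom.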