Optimal identification of unknown groundwater contaminant sources in conjunction with designed monitoring networks

Hazrati Yadkoori, Shahrbanoo (2018) Optimal identification of unknown groundwater contaminant sources in conjunction with designed monitoring networks. PhD thesis, James Cook University.

View at Publisher Website: https://doi.org/10.25903/5d0ad27afe2ec


Human activities and improper management practices have caused widespread deterioration of groundwater quality worldwide, and in recent decades contamination has seriously threatened the beneficial use of groundwater. Remediation is therefore a necessary part of groundwater management, and identifying unknown contaminant sources is crucial to remediating contaminated aquifer sites: an effective remediation process needs accurate identification of source locations, magnitudes, and release times. The efficiency and reliability of source identification, in turn, depend on the availability, adequacy, and accuracy of hydrogeologic information and contaminant concentration measurements. In practice, however, only limited and sparse concentration measurements are usually available when contamination is detected. Detection often occurs years or even decades after the sources became active, or even after they have ceased. Consequently, information is usually insufficient regarding the number of contaminant sources, the duration of their activity, the contaminant magnitudes, and the hydrogeologic parameters of the contaminated aquifer. Simulations of groundwater flow and solute transport therefore carry intrinsic uncertainties arising from sparse or inadequate hydrogeologic information about the porous medium. Developing and applying an efficient procedure for identifying unknown contaminant sources is thus essential for groundwater management.

Moreover, observed contaminant concentrations usually contain measurement errors, which can make the solution unstable. Different combinations of source characteristics can also produce similar effects at observation locations, making the solution non-unique. Because of this instability and non-uniqueness (Datta, 2002), source identification is known as an "ill-posed problem" (Yeh, 1986), and the associated uncertainties make it a difficult and complex task. Previously suggested methodologies do not handle this task efficiently. In particular, they are highly sensitive to the accuracy and adequacy of contaminant concentration measurements and hydrogeologic data. As a result, many of them are not applicable to real-world cases, and applying the relevant ones to real-world contaminated aquifer sites is usually tedious and time-consuming. They also incur enormous computational time and cost because the numerical simulation models must be run repeatedly within the optimisation algorithms. Therefore, different surrogate models were developed in this study to identify the unknown characteristics of contaminant sources. Three algorithms were used to develop the surrogate models: Self-Organising Maps (SOM), Gaussian Process Regression (GPR), and Multivariate Adaptive Regression Splines (MARS). The developed procedures were assessed for potential applicability at two hypothetical sites, an experimental site, and a real-world contaminated aquifer site. At these sites, only limited contaminant concentration data were assumed to be available. In three cases, it was also assumed that the concentration data were collected a long time after the first potential contaminant source became active.
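The non-uniqueness described above can be illustrated with a minimal sketch. It assumes a hypothetical linear response, in which the concentration at each observation well is a weighted sum of the source fluxes. The coefficients, flux values, and the `simulate` helper are illustrative assumptions and are not taken from the thesis.

```python
# Hypothetical linear response: concentration at a well = sum(coefficient * flux).
# All numbers are illustrative assumptions, not values from the thesis.

def simulate(response, fluxes):
    """Return the concentration at each observation well for the given source fluxes."""
    return [sum(r * s for r, s in zip(row, fluxes)) for row in response]

# One observation well, two potential sources.
one_well = [[0.2, 0.1]]
candidate_a = [10.0, 0.0]  # only source 1 is active
candidate_b = [0.0, 20.0]  # only source 2 is active

# Both candidates produce the same reading at the single well,
# so this measurement alone cannot tell them apart.
obs_a = simulate(one_well, candidate_a)
obs_b = simulate(one_well, candidate_b)

# A second well with a different sensitivity separates the two candidates.
two_wells = [[0.2, 0.1], [0.05, 0.3]]
obs_a2 = simulate(two_wells, candidate_a)
obs_b2 = simulate(two_wells, candidate_b)

print(obs_a, obs_b)
print(obs_a2, obs_b2)
```

In this toy setting, an additional, well-placed observation is what removes the ambiguity, which is one reason sparse data make the inverse problem hard.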

The performance evaluations show that the developed surrogate models can accurately mimic the behaviour of numerical simulation models of groundwater flow and solute transport; their solutions showed acceptable errors relative to the more robust numerical model solutions. The surrogate models were also used to solve the inverse problem, that is, to identify unknown groundwater contaminant sources. The SOM algorithm was chosen to address the source identification problem directly because of its classification capabilities. In source identification problems the number of actual contaminant sources is uncertain, so a larger set of potential contaminant sources is usually assumed. Screening the active sources with SOM-based Surrogate Models (SOM-based SMs) may therefore simplify the problem. The developed SOM-based SMs were assessed for different scenarios. The results indicate that they can accurately screen the active sources among all potential contaminant sources, even with sparse contaminant concentration data and uncertain hydrogeologic information.

For comparison, the MARS and GPR algorithms, which are precise prediction tools, were also used to develop MARS-based and GPR-based Surrogate Models (SMs) for source identification. Source identification performance was evaluated in terms of the Normalized Absolute Error of Estimation (NAEE). At an illustrative hypothetical contaminated aquifer site, the NAEE for testing data was 16.3, 4.9 and 6.6% for the SOM, MARS and GPR-based SMs, respectively. At an experimental contaminated aquifer site, the corresponding values were 15.8, 14.1 and 16.2%, respectively. These results indicate that MARS-based SMs can be more accurate than SOM and GPR-based SMs in source identification problems. The most important advantage of the developed methodologies is that they can be applied directly for source identification in an inverse mode, without being linked to an optimisation model.
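The abstract does not define NAEE. The sketch below assumes a commonly used form: the sum of absolute differences between estimated and actual source fluxes, divided by the sum of the actual fluxes, expressed as a percentage. The function name `naee_percent` and the sample values are hypothetical and are not results from the thesis.

```python
def naee_percent(estimated, actual):
    """Normalized Absolute Error of Estimation in percent (assumed definition):
    100 * sum(|estimated - actual|) / sum(actual)."""
    if len(estimated) != len(actual):
        raise ValueError("estimated and actual must have the same length")
    total_actual = sum(actual)
    if total_actual == 0:
        raise ValueError("sum of actual values must be non-zero")
    abs_error = sum(abs(e - a) for e, a in zip(estimated, actual))
    return 100.0 * abs_error / total_actual


# Illustrative fluxes for four potential sources (one of them inactive).
actual_fluxes = [10.0, 0.0, 25.0, 15.0]
estimated_fluxes = [9.0, 1.0, 27.0, 14.0]

naee = naee_percent(estimated_fluxes, actual_fluxes)
print(f"NAEE = {naee:.1f}%")
```

Under this definition a lower NAEE means the estimated source fluxes are closer to the actual ones, so the values reported above can be read as relative estimation errors.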

Surrogate Model-Based Optimisation (SMO) was also developed and used for source identification, with MARS as the surrogate model and a Genetic Algorithm (GA) as the optimisation model. The MARS-based SMO was assessed at an illustrative hypothetical contaminated aquifer site and at a real-world contaminated aquifer site. For testing data at the hypothetical site, the Root Mean Square Error (RMSE) of the MARS-based SMO was 0.92. For testing data in the real contaminated study area, the RMSE of the developed MARS-based SM was 42.5. The performance evaluation results in the different hypothetical and real contaminated study areas demonstrate the capabilities of the constructed SOM, GPR, and MARS-based SMs and the MARS-based SMO for source identification. In addition, to increase the accuracy of source identification, a sequential sampling method can be applied adaptively to update the developed surrogate models, based on the preliminary solutions of the SOM-based SMs. Information from a hypothetical contaminated aquifer site was used to assess this procedure. The NAEE of the adaptively developed MARS and GPR-based SMs was 1.9 and 2.1%, respectively. The results show improvements of 3 and 4.5% in source identification by applying the adaptively developed MARS and GPR-based SMs, respectively.
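The linked surrogate and optimisation idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the thesis: the surrogate is a hypothetical linear response matrix standing in for MARS, the observations are synthetic and noise-free, and the GA is a small hand-written real-coded variant. All names, bounds, and parameter values are assumptions.

```python
import math
import random

# Stand-in surrogate: a hypothetical linear response matrix (rows = wells,
# columns = potential sources). The thesis uses MARS here; this is NOT MARS.
RESPONSE = [
    [0.50, 0.10, 0.05],
    [0.10, 0.40, 0.10],
    [0.05, 0.10, 0.45],
    [0.20, 0.20, 0.20],
]
FLUX_MIN, FLUX_MAX = 0.0, 50.0  # assumed bounds on each source flux


def surrogate(fluxes):
    """Predict concentrations at the observation wells for given source fluxes."""
    return [sum(r * s for r, s in zip(row, fluxes)) for row in RESPONSE]


def misfit(fluxes, observed):
    """Sum of squared differences between predicted and observed concentrations."""
    return sum((p - o) ** 2 for p, o in zip(surrogate(fluxes), observed))


def rmse(estimated, actual):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((e - a) ** 2 for e, a in zip(estimated, actual)) / len(actual))


def genetic_search(observed, n_sources=3, pop_size=60, generations=150, seed=0):
    """Minimise the misfit with a simple real-coded genetic algorithm."""
    rng = random.Random(seed)
    pop = [[rng.uniform(FLUX_MIN, FLUX_MAX) for _ in range(n_sources)]
           for _ in range(pop_size)]

    def pick():
        # Tournament selection: best of three randomly chosen individuals.
        return min(rng.sample(pop, 3), key=lambda ind: misfit(ind, observed))

    for _ in range(generations):
        pop.sort(key=lambda ind: misfit(ind, observed))
        next_pop = [ind[:] for ind in pop[:2]]  # elitism: keep the two best
        while len(next_pop) < pop_size:
            a, b = pick(), pick()
            w = rng.random()
            # Arithmetic crossover followed by clipped Gaussian mutation.
            child = [w * x + (1 - w) * y for x, y in zip(a, b)]
            child = [
                min(FLUX_MAX, max(FLUX_MIN, g + rng.gauss(0.0, 2.0)))
                if rng.random() < 0.2 else g
                for g in child
            ]
            next_pop.append(child)
        pop = next_pop
    return min(pop, key=lambda ind: misfit(ind, observed))


# Synthetic test case: the second potential source is inactive.
true_fluxes = [12.0, 0.0, 30.0]
observed = surrogate(true_fluxes)
best = genetic_search(observed)
error = rmse(best, true_fluxes)
print([round(g, 2) for g in best], round(error, 3))
```

Because each misfit evaluation calls the cheap surrogate rather than a numerical flow and transport model, the GA can afford thousands of evaluations, which is the computational motivation for this kind of linking.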

Another difficulty in source identification is the limited and sparse nature of observed contaminant concentration data. Previously suggested methodologies usually need long-term observation data at numerous locations, which can be costly. Developing an effective monitoring network design procedure was therefore one of the main goals of this study. Two main objectives were considered in designing the monitoring networks: 1. maximizing the accuracy of source identification results, and 2. limiting the number of monitoring locations. It was expected that using data from the designed monitoring networks to develop the surrogate models would significantly improve the source identification results. Different algorithms were used to identify potentially important and effective monitoring locations that are likely to improve source identification: Random Forests (RF), TreeNet (TN) and Classification and Regression Trees (CART). Their performance was evaluated in different scenarios. The results indicate that these algorithms are potentially applicable for recognising the most important components of prediction models, and hence for designing monitoring networks that improve the efficiency and accuracy of source identification. Concentration measurements from a designed monitoring network and from a set of arbitrary monitoring sites were used to develop MARS-based surrogate models for source identification, and the solutions for these two scenarios were compared for a hypothetical study area. With data from the designed monitoring network, the source identification error for testing data improved by 0.7 in terms of RMSE. Data from a designed monitoring network were also used to develop a MARS-based SM for source identification at a real contaminated aquifer site. For testing data at this site, the RMSE improved by 35.3 compared with a MARS-based SM developed using data from arbitrary monitoring locations. These performance evaluation results demonstrate the potential applicability of the developed monitoring network procedure for source identification.
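The idea of ranking candidate monitoring locations by their usefulness for prediction, and then keeping only a limited number of them, can be sketched as follows. This sketch deliberately swaps in a much simpler technique, squared Pearson correlation between each well's concentration and the source flux, in place of the RF, TreeNet and CART variable importance measures used in the thesis. The well names, synthetic data, and `rank_wells` helper are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def rank_wells(well_data, target, max_wells):
    """Score each candidate well by squared correlation with the target
    and return the top `max_wells` names together with all scores."""
    scores = {name: pearson_r(values, target) ** 2
              for name, values in well_data.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:max_wells], scores


# Synthetic training set: 200 simulated source fluxes and the concentrations
# they would produce at four candidate wells (assumed sensitivities and noise).
rng = random.Random(1)
flux = [rng.uniform(0.0, 50.0) for _ in range(200)]
wells = {
    "W1": [0.40 * s + rng.gauss(0.0, 0.5) for s in flux],  # strong, clean signal
    "W2": [0.05 * s + rng.gauss(0.0, 3.0) for s in flux],  # weak, noisy signal
    "W3": [rng.gauss(5.0, 3.0) for _ in flux],             # unrelated to the source
    "W4": [0.30 * s + rng.gauss(0.0, 2.0) for s in flux],  # moderate signal
}

selected, scores = rank_wells(wells, flux, max_wells=2)
print(selected)
```

The `max_wells` cap reflects the second design objective (limiting the number of monitoring locations), while the ranking is a crude proxy for the first (maximizing identification accuracy). Tree-based importance measures can also capture non-linear and interaction effects that a correlation score misses.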

Item ID: 58751
Item Type: Thesis (PhD)
Keywords: adaptive surrogate models, contaminated aquifers experimental site, groundwater contaminant source identification, groundwater contamination, hydrogeologic uncertainty, self-organizing maps, source characterization, source identification, surrogate models, unknown groundwater contamination sources
Related URLs:
Copyright Information: Copyright © 2018 Shahrbanoo Hazrati Yadkoori.
Additional Information:

Publications arising from this thesis are available from the Related URLs field. The publications are:

Chapter 3: Hazrati Yadkoori, Shahrbanoo, and Datta, Bithin (2017) Adaptive surrogate model based optimization (ASMBO) for unknown groundwater contaminant source characterizations using self-organizing maps. Journal of Water Resource and Protection, 9. pp. 193-214.

Chapter 4: Hazrati Y., Shahrbanoo, and Datta, Bithin (2017) Self-organizing map based surrogate models for contaminant source identification under parameter uncertainty. International Journal of Geomate, 13 (36). pp. 11-18.

Chapter 5: Hazrati Yadkoori, Shahrbanoo, and Datta, Bithin (2017) Evaluation of unknown groundwater contaminant sources characterization efficiency under hydrogeologic uncertainty in an experimental aquifer site by utilizing surrogate models. Journal of Water Resource and Protection, 9 (13). pp. 1612-1633.

Date Deposited: 03 Jul 2019 05:01
FoR Codes: 09 ENGINEERING > 0905 Civil Engineering > 090509 Water Resources Engineering @ 100%
SEO Codes: 96 ENVIRONMENT > 9609 Land and Water Management > 960999 Land and Water Management of Environments not elsewhere classified @ 100%
