Application of GA-BP Neural Network in Accurately Characterizing the Diffusion Range of Groundwater Pollutants in the Site
HIGHLIGHTS
(1) The distribution of groundwater Mn2+ concentration in the study area does not conform to the law of Mn2+ production and migration, and needs to be corrected.
(2) The GA-BP neural network was used to predict the concentration of Mn2+ in the area with less distribution of sampling points and correct the distribution of Mn2+.
(3) It is verified that the corrected Mn2+ concentration distribution conforms to the site Mn2+ production mechanism and groundwater dynamics conditions.
Abstract: This study addresses the issue of unevenly distributed sampling points, which leads to inaccurate characterization of pollutant diffusion ranges. Using ArcGIS spatial interpolation, the distribution of Mn2+ ions in a chemical park was analyzed, revealing discrepancies caused by uneven sampling. To overcome this, two neural network models, GA-BP and standard BP, were applied to predict Mn2+ concentrations at unsampled locations. The GA-BP neural network, optimized with a genetic algorithm, performed better, filling gaps in the data and allowing a more accurate concentration distribution map to be drawn. This revised map was used to delineate the Mn2+ diffusion range, which was then validated against the known production and migration mechanisms of Mn2+. The results demonstrate that the GA-BP model significantly improves the accuracy of pollutant diffusion mapping and offers a more reliable method for environmental pollution assessment, especially in areas with limited sampling data.
Keywords:
- groundwater
- chemical industrial park
- GA-BP neural network
- influence range
BRIEF REPORT
Significance: Groundwater pollution is a common environmental issue in industrial areas, where heavy metal pollution (e.g., Mn2+, chromium, lead, iron) poses a serious threat to ecosystems and human health. The Technical Guidelines for Identification and Assessment of Ecological Environmental Damage, Environmental Elements, Part 1: Soil and Groundwater (GB/T 39792.1-2020), released in 2021, set more stringent standards for the accuracy of characterizing groundwater pollutant diffusion. However, because site constraints limit the number of groundwater monitoring points and sampling locations are unevenly distributed, traditional spatial interpolation methods (such as Kriging, inverse distance weighting, and spline functions) introduce significant errors when predicting pollutant diffusion, making it difficult to capture migration patterns accurately.
Methods: Using Mn2+ pollution in a chemical industry park as a case study, this research applies ArcGIS spatial interpolation analysis and finds a large deviation between the interpolated distribution trend of Mn2+ concentration and its formation mechanism. Multiple interpolation methods were explored for correction, yet the results still did not meet the required accuracy. This paper therefore compares a back propagation neural network optimized by a genetic algorithm (GA-BP) with a standard back propagation (BP) neural network, in order to improve the prediction of pollutant concentration and the characterization accuracy of the diffusion range.
First, spatial interpolation of Mn2+ concentrations with GIS interpolation methods (Kriging, inverse distance weighting, spline functions, etc.) yields predictions that match neither the groundwater flow direction in the study area nor the mechanism of Mn2+ formation, which indicates that traditional interpolation methods have low applicability when monitoring points are unevenly distributed. This paper therefore uses the GA-BP and standard BP neural networks for concentration prediction and selects the better-fitting model to predict Mn2+ concentrations at unmonitored points. In combination with ArcGIS spatial interpolation, the diffusion range of Mn2+ is then delineated and verified against the mechanism of Mn2+ production.
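To make the limitation of distance-based interpolation concrete, the following is a minimal sketch of inverse distance weighting. It is not the ArcGIS implementation; the function name `idw_predict`, the power parameter of 2, and the sample coordinates and concentrations are all assumptions chosen for illustration.

```python
import math

def idw_predict(known_points, x, y, power=2.0):
    """Inverse distance weighting estimate at (x, y).

    known_points: list of (x_i, y_i, value_i) tuples.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, vi in known_points:
        dist = math.hypot(x - xi, y - yi)
        if dist == 0.0:
            return vi  # the target coincides with a monitoring point
        w = 1.0 / dist ** power
        weighted_sum += w * vi
        weight_total += w
    return weighted_sum / weight_total

# Hypothetical coordinates (m) and concentrations (mg/L), for illustration only.
samples = [(0.0, 0.0, 0.10), (100.0, 0.0, 0.50), (0.0, 100.0, 0.30)]
estimate = idw_predict(samples, 50.0, 0.0)
print(estimate)
```

The estimate depends only on the geometry of the sampled points and is always bounded by the sampled minimum and maximum, so a sparse or clustered network cannot express a plume that follows the flow direction.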
In this study, twelve groundwater monitoring sites were deployed in the chemical industry park, and the data were divided into a training set and a test set. The training set includes points upstream of the pollution source, within the contamination plume, and downstream of the source, so that the model can learn pollutant migration characteristics under different hydrodynamic conditions. The test set consists of three monitoring points, located in the contamination plume and downstream, which are used to test the predictive ability of the model.
Data and Results: This study first models the relationship between Mn2+ concentration and spatial position with a standard BP neural network, then applies a genetic algorithm (GA) to optimize its weights and thresholds and so improve prediction accuracy. After training, the optimized GA-BP neural network is used to predict Mn2+ concentrations in unmonitored areas and to refine the delineation of the pollutant diffusion range. Finally, model reliability is validated by comparison with measured data, taking Mn2+ migration mechanisms and groundwater dynamics into account. The GA-BP neural network performs best in predicting Mn2+ concentration, with a prediction error close to 0 and a higher fitting accuracy than the standard BP neural network. When the GA-BP neural network is used to supplement Mn2+ concentration data in areas lacking monitoring points and the Mn2+ concentration distribution is replotted, the Mn2+ diffusion range in the centre of the chemical park covers 1.16×10⁶ m², of which 2.13×10⁵ m² lies outside the park. A comparison of the diffusion range before and after optimization shows that the revised Mn2+ range aligns more closely with its generation mechanism and migration dynamics: the direction of Mn2+ migration is governed jointly by the degradation of organic matter and the flow of groundwater rather than by the distribution of monitoring points alone. According to the literature, microbial degradation uses electron acceptors in the order O2 > NO3- > Mn4+, so petroleum degradation releases Mn2+ only after the nitrate degradation reaction is substantially complete.
The data show that the Mn2+ concentration is highest where the nitrate concentration is lowest, and that all monitoring points within the Mn2+ diffusion range have concentrations below 1 mg/L, which is consistent with the theoretical mechanism. In particular, the highest Mn2+ concentration, at point M03, indicates that nitrate degradation in that area is largely complete, leading to a large release of Mn2+. This finding further confirms that Mn2+ migration is influenced by petroleum degradation and that nitrate removal is a key precondition for Mn2+ release.
In summary, this study uses field data to verify the mechanism of microbial degradation and Mn2+ release described in the literature. It shows that in petroleum-contaminated areas the diffusion of Mn2+ is influenced not only by the degree of petroleum degradation but also by the direction of groundwater flow. This finding helps to explain pollutant migration patterns in groundwater systems more accurately and provides a scientific basis for groundwater pollution control.
This study shows that the GA-BP neural network has clear advantages in characterizing the pollutant diffusion range: its prediction error is low, its fit is good, and it maintains high prediction accuracy even when sampling points are few and unevenly distributed. Compared with traditional GIS spatial interpolation methods (Kriging, inverse distance weighting, spline functions, etc.), the GA-BP neural network is more accurate in predicting pollutant concentration and delineating the diffusion range, and its results are more consistent with the migration mechanism and hydrodynamic conditions of the pollutants.
Furthermore, this study found that the migration of Mn2+ is mainly governed by the combined effect of petroleum pollutant degradation and groundwater flow. In areas where the nitrate concentration is near zero and the concentration of petroleum pollutants is moderate, the Mn2+ concentration is highest, and Mn2+ migrates along the groundwater flow, forming three Mn2+-rich areas: the centre of the industrial park, the north-west corner, and the south-east corner. The revised Mn2+ diffusion range is more consistent with known migration laws, which further supports the accuracy of the GA-BP neural network predictions.
Nevertheless, this study has limitations. The limited number and uneven distribution of sampling points may introduce errors into the model predictions. In future work, more field monitoring data can be incorporated to further optimize the parameters of the GA-BP neural network model and improve the accuracy and applicability of its predictions.
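The Technical Guidelines for Identification and Assessment of Ecological Environmental Damage, Environmental Elements, Part 1: Soil and Groundwater (GB/T 39792.1-2020) state that damage identification must determine both the current extent of groundwater damage and the possible extent of damage within the assessment period [8], and must calculate the area of groundwater that may be damaged [9], so that environmental damage can be assessed more reliably. In practice, however, site conditions often prevent sampling points from being evenly distributed, and the results of existing ArcGIS interpolation methods sometimes differ considerably from the actual situation. Wang Xing et al. [10] analyzed the applicability of different interpolation methods to seawater salinity and pointed out that every spatial interpolation method introduces considerable error, so moderately densifying the monitoring stations is necessary, especially in areas with large interpolation errors. Mohammed et al. [11] pointed out that appropriate use of machine learning methods improves interpolation accuracy to some extent. The BP neural network was adopted early for fitting and is a key branch of deep learning; it can fit nonlinear relationships between variables. Several researchers have used it for concentration prediction in hydrogeology, and the method is relatively mature [12].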
1. Overview of the study area
2. Research methods
2.1 Data collection
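To study how organic matter influences manganese migration in the groundwater system of the chemical industry park, this study combined the distribution characteristics of the site's hydrochemical indicators and sampled groundwater in the organically contaminated areas, collecting 12 groups of groundwater samples in total. The samples were taken mainly from monitoring wells. The sampling depth was determined from the groundwater hydrodynamic conditions of the park and the previously obtained distribution of groundwater pollution sources; in this study, samples were collected 0.5 m below the groundwater table. Sampling, sealing, preservation, and testing of the groundwater samples all followed the Technical Specifications for Environmental Monitoring of Groundwater (《地下水环境监测技术规范》). The groundwater test indicators were manganese, nitrate (as N), nitrite (as N), phenol, acetone, nitrobenzene, 4-nitroaniline, petroleum hydrocarbons (C10-C40), petroleum, toluene, ethylbenzene, m-xylene + p-xylene, and o-xylene. The indicators were chosen according to the "35+N" principle: "35" refers to the 35 routine indicators of the Standard for Groundwater Quality (GB/T 14848-2017) after microbiological and radioactive indicators are excluded, and "N" refers to the characteristic pollutants of the site. The test methods are listed in Table 1. At each parallel-sample sampling point, three parallel samples were collected: two groundwater test samples were sent to the Hebei Geological Experiment and Testing Center (河北省地质实验测试中心), and one groundwater quality-control sample was sent to Centre Testing International Group Co., Ltd. (华测检测认证集团股份有限公司). All resulting data passed accuracy checks.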
Table 1. Analysis methods for groundwater samples

| Analytical item | Analysis method | Method detection limit |
| --- | --- | --- |
| Nitrate | Ion chromatography | 0.004 mg/L |
| Manganese | Inductively coupled plasma mass spectrometry | 0.00012 mg/L |
| Petroleum | Infrared spectrophotometry | 0.06 mg/L |
| Petroleum hydrocarbons (C10-C40) | Gas chromatography | 0.004 mg/L |
| Acetone | Gas chromatography | 0.2 mg/L |
| 4-Nitroaniline | Gas chromatography-mass spectrometry | 4.6 μg/L |
| Phenol | Gas chromatography-mass spectrometry | 0.1 μg/L |
| Toluene | Gas chromatography-mass spectrometry | 0.3 μg/L |
| Ethylbenzene | Gas chromatography-mass spectrometry | 1.2 μg/L |
| m-Xylene + p-xylene | Gas chromatography-mass spectrometry | 1.2 μg/L |
| o-Xylene | Gas chromatography-mass spectrometry | 1.2 μg/L |

2.2 Construction of the BP neural network model
2.2.1 Model construction and main parameter settings
$$ j=\sqrt{i+k}+q $$ (1)

Hidden layer output:

$$ a_{j}=f\left(\sum_{i=1}^{l}w_{ij}x_{i}+b_{j}\right) $$ (2)

Output layer output:

$$ y_{k}=f\left(\sum_{j=1}^{l}w_{jk}a_{j}+c_{k}\right) $$ (3)

Error calculation:

$$ E=\dfrac{1}{2}\sum_{k=1}^{m}\left(Y_{k}-y_{k}\right)^{2} $$ (4)

2.2.2 Genetic algorithm optimization of the BP neural network
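As a rough illustration of the idea under this heading, the sketch below lets a genetic algorithm search the weights and thresholds of a small network evaluated with Equations (2) to (4), then refines the best chromosome with back propagation. Everything specific here is an assumption made for illustration and is not taken from the paper: the layer sizes, the sigmoid activation for f, truncation selection, single-point crossover, Gaussian mutation, the learning rate, and the five hypothetical training samples.

```python
import math
import random

N_IN, N_HID, N_OUT = 2, 4, 1  # illustrative layer sizes, not the paper's
N_GENES = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def unpack(genes):
    """Split a flat chromosome into w_ij, b_j, w_jk, c_k."""
    p = N_IN * N_HID
    w_ij = [genes[i * N_HID:(i + 1) * N_HID] for i in range(N_IN)]
    b_j = genes[p:p + N_HID]
    p += N_HID
    w_jk = [genes[p + j * N_OUT:p + (j + 1) * N_OUT] for j in range(N_HID)]
    c_k = genes[p + N_HID * N_OUT:]
    return w_ij, b_j, w_jk, c_k

def forward(genes, x):
    """Equations (2) and (3): hidden outputs a_j, then outputs y_k."""
    w_ij, b_j, w_jk, c_k = unpack(genes)
    a = [sigmoid(sum(w_ij[i][j] * x[i] for i in range(N_IN)) + b_j[j])
         for j in range(N_HID)]
    y = [sigmoid(sum(w_jk[j][k] * a[j] for j in range(N_HID)) + c_k[k])
         for k in range(N_OUT)]
    return a, y

def error(genes, data):
    """Equation (4), summed over all samples."""
    total = 0.0
    for x, target in data:
        _, y = forward(genes, x)
        total += 0.5 * sum((target[k] - y[k]) ** 2 for k in range(N_OUT))
    return total

def ga_search(data, rng, pop_size=30, generations=40, mut_rate=0.1):
    """GA stage: evolve chromosomes of weights and thresholds."""
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(N_GENES)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: error(g, data))
        parents = pop[:pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, N_GENES)  # single-point crossover
            child = [g + rng.gauss(0.0, 0.3) if rng.random() < mut_rate else g
                     for g in p1[:cut] + p2[cut:]]  # Gaussian mutation
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda g: error(g, data))

def bp_finetune(genes, data, lr=0.5, epochs=200):
    """BP stage: gradient descent from the GA result, keeping the best weights seen."""
    genes = list(genes)
    best, best_err = list(genes), error(genes, data)
    for _ in range(epochs):
        for x, target in data:
            _, _, w_jk, _ = unpack(genes)
            a, y = forward(genes, x)
            d_out = [(y[k] - target[k]) * y[k] * (1.0 - y[k]) for k in range(N_OUT)]
            d_hid = [a[j] * (1.0 - a[j]) * sum(w_jk[j][k] * d_out[k] for k in range(N_OUT))
                     for j in range(N_HID)]
            # Gradient laid out in the same order as the chromosome.
            grad = ([d_hid[j] * x[i] for i in range(N_IN) for j in range(N_HID)]
                    + d_hid
                    + [d_out[k] * a[j] for j in range(N_HID) for k in range(N_OUT)]
                    + d_out)
            genes = [g - lr * dg for g, dg in zip(genes, grad)]
        err = error(genes, data)
        if err < best_err:
            best, best_err = list(genes), err
    return best

# Hypothetical normalized coordinates -> concentration pairs, for illustration only.
data = [((0.0, 0.0), (0.10,)), ((1.0, 0.0), (0.30,)), ((0.0, 1.0), (0.20,)),
        ((1.0, 1.0), (0.70,)), ((0.5, 0.5), (0.35,))]
rng = random.Random(0)
ga_best = ga_search(data, rng)
tuned = bp_finetune(ga_best, data)
print(error(ga_best, data), error(tuned, data))
```

The design point is the division of labour: the GA explores the weight space globally so that gradient descent starts from a good region, which is the usual motivation for GA-BP over a randomly initialized BP network.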
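2.2.3 Prediction at selected points and interpolation
3. Results and discussion
3.1 Mn2+ contamination plume at measured points and flow field of the study area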
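Table 2. Monitoring data of the study area

| Point | Manganese |  |  |  |  | Petroleum hydrocarbons (C10-C40) (mg/L) |
| --- | --- | --- | --- | --- | --- | --- |
| M01 | 0.184 | 0.3 | — | — | — | 0.24 |
| M02 | 0.18 | 0.2 | — | — | — | 2.37 |
| M03 | 0.768 | 0.1 | — | — | — | 5.46 |
| M04 | 0.031 | 1.1 | — | 0.00864 | 39.7 | 34.8 |
| M05 | 0.0658 | — | — | 0.00552 | 6.3 | 0.62 |
| M06 | 0.0452 | — | — | — | — | 0.14 |
| M07 | 0.0752 | — | — | — | — | 2.78 |
| M08 | 0.182 | 0.4 | — | — | — | 23.5 |
| M09 | 0.0024 | — | — | — | — | 0.55 |
| M10 | 0.18 | 0.5 | — | — | — | 27.6 |
| M11 | 0.194 | — | 58 | — | — | 8.95 |
| M12 | 0.0018 | — | — | — | — | — |

| Point | Petroleum |  |  |  |  | Petroleum hydrocarbons (C10-C40) (mg/L) |
| --- | --- | --- | --- | --- | --- | --- |
| M01 | 0.45 | — | — | — | — | 0.24 |
| M02 | 0.29 | 2.4 | — | — | — | 2.37 |
| M03 | 0.98 | — | 0.4 | 1.3 | 0.6 | 5.46 |
| M04 | 1.64 | 18.2 | 0.4 | 1.2 | 0.4 | 34.8 |
| M05 | 0.11 | 2.3 | — | — | — | 0.62 |
| M06 | 0.11 | — | — | — | — | 0.14 |
| M07 | 0.17 | 1 | — | — | — | 2.78 |
| M08 | 2.52 | — | — | — | — | 23.5 |
| M09 | 0.17 | — | — | — | — | 0.55 |
| M10 | 2.95 | — | — | — | — | 27.6 |
| M11 | 8.98 | — | — | — | — | 8.95 |
| M12 | 0.03 | — | — | — | — | 0.02 |

Note: "—" indicates not detected.

3.2 Spatial distribution characteristics of organic pollutants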
3.3 Delineation of the Mn2+ diffusion range
3.3.1 Neural network training results
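Table 3. Determination process of hidden layer nodes

| Hidden layer nodes | Mean square error of training set | Hidden layer nodes | Mean square error of training set |
| --- | --- | --- | --- |
| 2 | 0.12601 | 8 | 0.67715 |
| 3 | 0.27314 | 9 | 0.32133 |
| 4 | 0.11288 | 10 | 0.2487 |
| 5 | 0.20273 | 11 | 0.06521 |
| 6 | 0.031883 | 12 | 2.0155 |
| 7 | 0.069991 |  |  |

The BP neural network optimized by the genetic algorithm was trained at the same time; its fitting performance is shown in Figure 5. For the training set, R is 0.9921 and R2 = 0.99747, which is greater than 0.9; for the test set, R is 0.9987 and R2 = 0.9974. This indicates that the training set is well fitted and that the model is highly credible. The prediction results and errors of the GA-BP and BP neural networks are listed in Table 4.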
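The node selection summarized in Table 3 can be sketched as a simple minimum search. This assumes the criterion is the lowest training-set mean square error; the dictionary name `mse_by_nodes` is hypothetical, and the values are copied from Table 3.

```python
# Training-set MSE for each candidate hidden-node count, copied from Table 3.
mse_by_nodes = {
    2: 0.12601, 3: 0.27314, 4: 0.11288, 5: 0.20273,
    6: 0.031883, 7: 0.069991, 8: 0.67715, 9: 0.32133,
    10: 0.2487, 11: 0.06521, 12: 2.0155,
}

# Pick the node count whose training MSE is smallest.
best_nodes = min(mse_by_nodes, key=mse_by_nodes.get)
print(best_nodes, mse_by_nodes[best_nodes])
```

Under that assumption the search selects 6 hidden nodes. The error does not decrease monotonically with network size, which is why each candidate is trained and compared rather than simply taking the largest network.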
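Table 4. Prediction results and errors of GA-BP and BP neural network

| Sample serial number | Mn2+ measured value (mg/L) | BP predicted value (mg/L) | GA-BP predicted value (mg/L) | BP error (mg/L) | GA-BP error (mg/L) |
| --- | --- | --- | --- | --- | --- |
| 1 | 0.1800 | −0.1366 | 0.1849 | −0.3166 | 0.0049 |
| 2 | 0.1940 | −0.1669 | 0.1876 | −0.3609 | −0.0064 |
| 3 | 0.0018 | −0.1581 | 0.0057 | −0.1599 | 0.0039 |

As Table 5 shows, the optimized BP neural network has a smaller MAE, MSE, RMSE, and MAPE than the unoptimized BP neural network and fits better. Both the training set and the test set are well fitted, the optimal number of hidden layer nodes was obtained from the root mean square error, and the types of sample points are evenly distributed between the training set and the test set. The design is therefore reasonable, and no overfitting occurred [21].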
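Table 5. Error of BP neural network

| Type of BP neural network | MAE | MSE | RMSE | MAPE |
| --- | --- | --- | --- | --- |
| Standard BP neural network model | 0.279 | 0.085 | 0.292 | 3082.097% |
| Genetic algorithm optimized BP neural network | 0.005 | 2.694×10⁻⁵ | 0.005 | 75.066% |

The measured and predicted values are compared in Figure 6. The GA-BP predictions agree with the measured values more closely than those of the standard BP neural network, the GA-BP errors are all close to 0, and the GA-BP neural network clearly outperforms the BP neural network. Because the study area has few data points, the data were divided into only two sets, a training set and a test set. The three test points lie in the contamination plume at the centre of the pollution source and downstream of it, while the training set contains all three types of points (upstream of the pollution source, within the contamination plume, and downstream of the source), and the trained model fits well. With the test set results verified, the model is considered reliable [22].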
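The error measures in Table 5 can be recomputed from the three test samples in Table 4. The sketch below assumes the conventional definitions of MAE, MSE, RMSE, and MAPE, with error taken as predicted minus measured; the function name `metrics` is hypothetical, and because Table 4 lists rounded values the results agree with Table 5 only approximately.

```python
import math

# Test-set values copied from Table 4 (mg/L).
measured = [0.1800, 0.1940, 0.0018]
bp_pred = [-0.1366, -0.1669, -0.1581]
gabp_pred = [0.1849, 0.1876, 0.0057]

def metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, and MAPE (in percent) for paired values."""
    errs = [p - t for t, p in zip(y_true, y_pred)]
    n = len(errs)
    mae = sum(abs(e) for e in errs) / n
    mse = sum(e * e for e in errs) / n
    rmse = math.sqrt(mse)
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errs, y_true)) / n
    return {"mae": mae, "mse": mse, "rmse": rmse, "mape": mape}

bp = metrics(measured, bp_pred)
gabp = metrics(measured, gabp_pred)
print(bp)
print(gabp)
```

MAPE is dominated by sample 3, whose measured value of 0.0018 mg/L is so small that even a small absolute error becomes a large relative one. That explains why the GA-BP MAPE stays near 75% although its absolute errors are close to 0.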
3.3.2 Verification of the phenomenon mechanism
4. Conclusions
