Geostatistical interpolation model selection based on ArcGIS and spatio-temporal variability analysis of groundwater level in piedmont plains, northwest China

SpringerPlus, Published 11 April 2016

By Yong Xiao, Xiaomin Gu, Shiyang Yin, Jingli Shao, Yali Cui, Qiulan Zhang, and Yong Niu

“Based on the geo-statistical theory and ArcGIS geo-statistical module, datas of 30 groundwater level observation wells were used to estimate the decline of groundwater level in Beijing piedmont. Seven different interpolation methods (inverse distance weighted interpolation, global polynomial interpolation, local polynomial interpolation, tension spline interpolation, ordinary Kriging interpolation, simple Kriging interpolation and universal Kriging interpolation) were used for interpolating groundwater level between 2001 and 2013. Cross-validation, absolute error and coefficient of determination (R2) was applied to evaluate the accuracy of different methods.

Groundwater level drawdown during 2001 and 2013.

“The result shows that simple Kriging method gave the best fit. The analysis of spatial and temporal variability suggest that the nugget effects from 2001 to 2013 were increasing, which means the spatial correlation weakened gradually under the influence of human activities. The spatial variability in the middle areas of the alluvial–proluvial fan is relatively higher than area in top and bottom. Since the changes of the land use, groundwater level also has a temporal variation, the average decline rate of groundwater level between 2007 and 2013 increases compared with 2001–2006. Urban development and population growth cause over-exploitation of residential and industrial areas. The decline rate of the groundwater level in residential, industrial and river areas is relatively high, while the decreasing of farmland area and development of water-saving irrigation reduce the quantity of water using by agriculture and decline rate of groundwater level in agricultural area is not significant.”
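The comparison described in this abstract (interpolate, cross-validate, then score each method by absolute error and R2) can be sketched in a few lines. The block below is only an illustration under stated assumptions: it uses inverse distance weighting, the simplest of the seven methods, on synthetic well data, and it takes R2 to be the squared correlation between observed and predicted levels. It is not the authors' ArcGIS workflow, and the function names are hypothetical.

```python
import numpy as np

def idw_predict(xy_known, z_known, xy_query, power=2.0):
    """Inverse distance weighted estimate at each query point."""
    # Pairwise distances, shape (n_query, n_known)
    d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)  # avoid division by zero at coincident points
    w = 1.0 / d ** power
    return (w @ z_known) / w.sum(axis=1)

def loo_cross_validate(xy, z, power=2.0):
    """Leave-one-out predictions: each well is predicted from all the others."""
    n = len(z)
    pred = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        pred[i] = idw_predict(xy[keep], z[keep], xy[i:i + 1], power)[0]
    return pred

def mean_absolute_error(z, pred):
    return float(np.mean(np.abs(z - pred)))

def r_squared(z, pred):
    """Squared Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(z, pred)[0, 1] ** 2)

# Synthetic stand-in for 30 observation wells (assumed data, not from the paper)
rng = np.random.default_rng(0)
xy = rng.uniform(0, 10, size=(30, 2))
z = 50.0 - 2.0 * xy[:, 0] + 0.5 * xy[:, 1] + rng.normal(0, 0.3, 30)

pred = loo_cross_validate(xy, z)
mae = mean_absolute_error(z, pred)
r2 = r_squared(z, pred)
print(f"MAE = {mae:.2f}, R2 = {r2:.2f}")
```

Running the same leave-one-out loop with a different predictor in place of `idw_predict` would give directly comparable scores for each method.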

Geographically Weighted Regression to Measure Spatial Variations in Correlations between Water Pollution versus Land Use in a Coastal Watershed

Ocean & Coastal Management, Volume 103, January 2015, Pages 14–24

By Jinliang Huang, Yaling Huang, Robert Gilmore Pontius Jr., and Zhenyu Zhang


  • GWR reveals spatial variation in water pollution-land use linkages.
  • Water pollution is associated more with built-up than with cropland or forest.
  • More built-up is associated with more pollution for less urbanized sub-watersheds.
  • Forest has a stronger negative association with pollution in urban sub-watersheds.
  • Cropland has a weak association with water pollution among 21 sub-watersheds.

“Land use can influence river pollution and such relationships might or might not vary spatially. Conventional global statistics assume one relationship for the entire study extent, and are not designed to consider whether a relationship varies across space. We used geographically weighted regression to consider whether relationships between land use and water pollution vary spatially across a subtropical coastal watershed of Southeast China. Surface water samples of baseflow for seven pollutants were collected twelve times during 2010–2013 from headwater sub-watersheds. We computed 21 univariate regressions, which consisted of three regressions for each of the seven pollutants. Each of the three regressions considered one of three independent variables, i.e. the percent of the sub-watershed that was cropland, built-up, or forest.

Local R2 values and local parameter estimates for GWR cropland models among three types of sub-watershed.

“Cropland had a local R2 less than 0.2 for most pollutants, while it had a positive association with water pollution in the agricultural sub-watersheds and a negative association with water pollution in the non-agricultural sub-watersheds. Built-up had a positive association with all pollutants consistently across space, while the increase in pollution per increase in built-up density was largest in the sub-watersheds with low built-up density. The local R2 values were stronger with built-up than with cropland and forest. The local R2 values for built-up varied spatially, and the pattern of the spatial variation was not consistent among the seven pollutants. Forest had a negative association with most pollutants across space. Forest had a stronger negative association with water pollution in the urban sub-watersheds than in the agricultural sub-watersheds. This research provides an insight into land-water linkages, which we discuss with respect to other watersheds in the literature.”
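For readers unfamiliar with geographically weighted regression, the core idea is to refit the regression at every location, weighting nearby samples more heavily than distant ones. The sketch below is a minimal illustration with one predictor, a Gaussian kernel, a fixed bandwidth, and synthetic data. All of these are assumptions made for the example; the abstract does not specify the kernel or bandwidth the authors used.

```python
import numpy as np

def gwr_univariate(coords, x, y, bandwidth):
    """Fit a weighted straight line y = a + b*x at every sample location.

    Returns an (n, 2) array of local [intercept, slope] estimates.
    """
    n = len(y)
    X = np.column_stack([np.ones(n), x])            # design matrix with intercept
    params = np.empty((n, 2))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)     # Gaussian kernel weights
        XtW = X.T * w                               # same as X.T @ diag(w)
        params[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return params

# Synthetic example: the effect of land use on pollution grows from west to east
rng = np.random.default_rng(1)
coords = rng.uniform(0, 10, size=(80, 2))
x = rng.uniform(0, 100, size=80)                    # e.g. percent built-up (assumed)
true_slope = 0.5 + 0.1 * coords[:, 0]
y = 2.0 + true_slope * x + rng.normal(0, 1.0, 80)

params = gwr_univariate(coords, x, y, bandwidth=2.0)
east = coords[:, 0] > 5
print("mean local slope, west:", params[~east, 1].mean())
print("mean local slope, east:", params[east, 1].mean())
```

With a very large bandwidth every weight approaches 1 and the local fits collapse to the single global regression, which is the "one relationship for the entire study extent" case the abstract contrasts against.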

Geo-Based Statistical Models for Vulnerability Prediction of Highway Network Segments

ISPRS International Journal of Geo-Information, 2014, 3(2), 619-637

By Keren Pollak, Ammatzia Peled, and Shalom Hakkert

“This study describes four statistical models—Poisson; Negative Binomial; Zero-Inflated Poisson; and Zero-Inflated Negative Binomial—which were devised in order to examine traffic accidents and estimate the best probability estimating model in terms of future risk assessment at interurban road sections. The study was conducted on four sets of fixed-length sections of the road network: 500, 750, 1000, and 1500 m. The contribution of transportation and spatial parameters as predictors of road accident rates was evaluated for all four data sets separately. In addition, the Empirical Bayes method was applied. This method uses historical accidents information, allowing regression to the mean phenomenon so as to improve model results.

Expected number of accidents comparing real number of accidents and predicted number after applying EB method (road section of 500 m)—observation 3000 until 3300.

“The study was performed using Geographic Information System (GIS) software. Other analyses, such as statistical analyses combined with spatial parameters, interactions, and examination of other geographical areas, were also performed. The results showed that the short road sections data sets of 500 and 750 m yielded the most stable models. This allows focused treatment on short sections of the road network as a way to save resources (enforcement; education and information; finance) and potentially gain maximum benefit at minimum investment. It was found that the significant parameters affecting accident rates are: curvature of the road section; the region and traffic volume. An interaction between the region and traffic volume was also found.”
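The Empirical Bayes step mentioned in this abstract blends the count a model predicts for a road section with the count actually observed there, which pulls extreme observations back toward the model (regression to the mean). The abstract does not give the formula, so the sketch below assumes the formulation commonly used with negative binomial accident models, where the weight depends on the overdispersion parameter. The function and parameter names are hypothetical.

```python
def eb_expected_accidents(predicted, observed, overdispersion):
    """Empirical Bayes estimate for one road section.

    predicted      -- accident count expected from the regression model
    observed       -- accident count actually recorded on the section
    overdispersion -- negative binomial k, where variance = mu + k * mu**2
    """
    # Weight given to the model; it shrinks as the model becomes less certain
    weight = 1.0 / (1.0 + overdispersion * predicted)
    return weight * predicted + (1.0 - weight) * observed

# A section where the model expects 2 accidents but 6 were recorded
print(eb_expected_accidents(predicted=2.0, observed=6, overdispersion=0.5))  # 4.0
```

With no overdispersion the estimate equals the model prediction, and as overdispersion grows the estimate moves toward the observed count.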

Identification of Optimum Scopes of Environmental Factors for Snails using Spatial Analysis Techniques in Dongting Lake Region, China

Parasites & Vectors 7:216, Published Online 09 May 2014

By Jin-Yi Wu, Yi-Biao Zhou, Lin-Han Li, Sheng-Bang Zheng, Song Liang, Ashley Coatsworth, Guang-Hui Ren, Xiu-Xia Song, Zhong He, Bin Cai, Jia-Bian You, and Qing-Wu Jiang

“Owing to the harmfulness and seriousness of Schistosomiasis japonica in China, the control and prevention of S. japonica transmission are imperative. As the unique intermediate host of this disease, Oncomelania hupensis plays an important role in the transmission. It has been reported that the snail population in Qiangliang Lake district, Dongting Lake Region has been naturally declining and is slowly becoming extinct. Considering the changes of environmental factors that may cause this phenomenon, we try to explore the relationship between circumstance elements and snails, and then search for the possible optimum scopes of environmental factors for snails.

“Moisture content of soil, pH, temperature of soil and elevation were collected by corresponding apparatus in the study sites. The LISA statistic and GWR model were used to analyze the association between factors and mean snail density, and the values in high-high clustered areas and low-low clustered areas were extracted to find out the possible optimum ranges of these elements for snails.


“A total of 8,589 snail specimens were collected from 397 sampling sites in the study field. Besides the mean snail density, three environmental factors including water content, pH and temperature had high spatial autocorrelation. The spatial clustering suggested that the possible optimum scopes of moisture content, pH, temperature of the soil and elevation were 58.70 to 68.93%, 6.80 to 7.80, 22.73 to 24.23°C and 23.50 to 25.97 m, respectively. Moreover, the GWR model showed that the possible optimum ranges of these four factors were 36.58 to 61.08%, 6.541 to 6.89, 24.30 to 25.70°C and 23.50 to 29.44 m, respectively.

“The results indicated the association between snails and environmental factors was not linear but U-shaped. Considering the results of two analysis methods, the possible optimum scopes of moisture content, pH, temperature of the soil and elevation were 58.70% to 68.93%, 6.6 to 7.0, 22.73°C to 24.23°C, and 23.5 m to 26.0 m, respectively. The findings in this research will help in making an effective strategy to control snails and provide a method to analyze other factors.”
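The LISA statistic used in this study is usually computed as local Moran's I, which compares each site's deviation from the mean with the average deviation of its neighbors. The sketch below is a minimal illustration on a toy chain of six sites with row-standardized neighbor weights. The data and the weights matrix are assumptions for the example, and the permutation-based significance testing that a full LISA analysis includes is omitted.

```python
import numpy as np

def local_morans_i(values, weights):
    """Local Moran's I for each site.

    weights -- row-standardized spatial weights with a zero diagonal
    Returns (I, z, lag): the statistic, the deviations from the mean,
    and the spatially lagged deviations.
    """
    z = values - values.mean()
    m2 = (z ** 2).mean()
    lag = weights @ z
    return z * lag / m2, z, lag

# Toy transect: three low-density sites followed by three high-density sites
values = np.array([1.0, 1.0, 1.0, 9.0, 9.0, 9.0])
n = len(values)
W = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i + 1):            # adjacent sites are neighbors
        if 0 <= j < n:
            W[i, j] = 1.0
W /= W.sum(axis=1, keepdims=True)       # row-standardize

I, z, lag = local_morans_i(values, W)
high_high = np.where((z > 0) & (lag > 0))[0]
low_low = np.where((z < 0) & (lag < 0))[0]
print(high_high.tolist())  # [4, 5]
print(low_low.tolist())    # [0, 1]
```

Sites 2 and 3 sit on the boundary between the two groups, so their neighbors cancel out and they fall in neither cluster.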

Spatial Distribution of Soil Organic Carbon and Total Nitrogen Based on GIS and Geostatistics in a Small Watershed in a Hilly Area of Northern China

PLOS One, Published Online 31 December 2013

By Gao Peng, Wang Bing, Geng Guangpo, and Zhang Guangcan

“The spatial variability of soil organic carbon (SOC) and total nitrogen (STN) levels is important in both global carbon-nitrogen cycle and climate change research. There has been little research on the spatial distribution of SOC and STN at the watershed scale based on geographic information systems (GIS) and geostatistics. Ninety-seven soil samples taken at depths of 0–20 cm were collected during October 2010 and 2011 from the Matiyu small watershed (4.2 km2) of a hilly area in Shandong Province, northern China. The impacts of different land use types, elevation, vegetation coverage and other factors on SOC and STN spatial distributions were examined using GIS and a geostatistical method, regression-kriging.

Distribution map of SOC and STN concentrations by regression-kriging (a, b) and ordinary kriging (c, d) in Matiyu small watershed.

“The results show that the concentration variations of SOC and STN in the Matiyu small watershed were moderate variation based on the mean, median, minimum and maximum, and the coefficients of variation (CV). Residual values of SOC and STN had moderate spatial autocorrelations, and the Nugget/Sill were 0.2% and 0.1%, respectively. Distribution maps of regression-kriging revealed that both SOC and STN concentrations in the Matiyu watershed decreased from southeast to northwest. This result was similar to the watershed DEM trend and significantly correlated with land use type, elevation and aspect. SOC and STN predictions with the regression-kriging method were more accurate than those obtained using ordinary kriging. This research indicates that geostatistical characteristics of SOC and STN concentrations in the watershed were closely related to both land-use type and spatial topographic structure and that regression-kriging is suitable for investigating the spatial distributions of SOC and STN in the complex topography of the watershed.”

Geostatistical Approach for Site Suitability Mapping of Degraded Mangrove Forest in the Mahakam Delta, Indonesia

Journal of Geographic Information System, Vol. 5, No. 5, October 2013

By Ali Suhardiman, Satoshi Tsuyuki, Muhammad Sumaryono, and Yohanes Budi Sulistioadi

“As part of operational guidance of mangrove forest rehabilitation in the Mahakam delta, Indonesia, site suitability mapping for 14 species of mangrove was modelled by combining 4 underlying factors—clay, sand, salinity and tidal inundation. Semivariogram analysis and a geographic information system (GIS) were used to apply a site-suitability model, while kriging interpolation generated surface layers, based on sample point data collection. The tidal inundation map was derived from a tide table and a digital elevation model from topographic maps. The final site-suitability maps were produced using spatial analysis technique, by overlaying all surface layers. We used a Gaussian model to adjust a semivariogram graph in order to help to understand the variation of sample data values, and create a natural surface layer of data distribution over the area of study.”

Site suitability map of our study sites generated using geostatistical analysis and GIS operations.

“By examining the statistical value and the visual inspection of surface layers, we saw that the models were consistent with the expected data behavior; therefore, we assumed that interpolation has been carried out appropriately. Our site-suitability map showed that Avicennia species was the most suitable species and matched with 50% of the study area, followed by Nypa fruticans, which occupied about 42%. These results were actually consistent with the mangrove zoning pattern in the region prior to deforestation and conversion.”
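The Gaussian model mentioned in this abstract is one of the standard curves fitted to an empirical semivariogram before kriging. A minimal sketch of the curve follows. The parameterization is an assumption: some software scales the exponent (for example by a factor of 3) so that the range parameter equals the practical range, and the abstract does not say which convention the authors used.

```python
import numpy as np

def gaussian_semivariogram(h, nugget, partial_sill, range_param):
    """Gaussian semivariogram model evaluated at lag distances h."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + partial_sill * (1.0 - np.exp(-(h / range_param) ** 2))
    # Semivariance is zero at zero lag; the nugget is the jump just above it
    return np.where(h == 0.0, 0.0, gamma)

lags = [0.0, 50.0, 100.0, 1e6]
print(gaussian_semivariogram(lags, nugget=0.1, partial_sill=0.9, range_param=100.0))
```

The curve rises slowly near the origin, which is why the Gaussian model tends to produce smooth interpolated surfaces, and it levels off at the sill (nugget plus partial sill) at large lags.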

Understanding Spatial Filtering for Analysis of Land Use-transport Data

Journal of Transport Geography, Volume 31, July 2013, Pages 123–131

By Yiyi Wang, Kara M. Kockelman, and Xiaokun (Cara) Wang


  • We explore use of spatial filtering (SF) for regression model estimation.
  • We compare SF models and SAR-type models, and a distance decay parameter.
  • Data sets contain appraised values for private properties across Texas’ Travis County.
  • SF methods allow focus on the marginal effects of policy variables and other covariates.

“This paper summarizes the literature on spatial filtering (SF) for analysis of spatial data. Given the scarcity of its application in transportation and its fledgling nature, preliminary case studies were conducted using continuous and discrete response data sets, for land values and land use, in comparison with results from spatial autoregressive (SAR) models with distance decay parameters estimated using Bayesian techniques. For both the continuous land value and binary land use cases, the SF approach demonstrates great potential as a worthy competitor to more conventional SAR-based models. In addition to offering high fit statistics, somewhat shorter computing times, and more straightforward computations, the SF approach makes explicit the patterns of spatial dependency in the land value and land use data. By controlling for these spatial relationships, the SF approach yields more reliable marginal effects of policy variables of interest. Model results confirm the important role of transportation access (as quantified using distances to a region’s central business district, and various roadway types).”