In Part I of this interview, Konstantin shared some background on the development of ArcGIS Geostatistical Analyst, and introduced us to his forthcoming book, Introduction to Spatial Statistical Data Analysis for GIS Users. In the conclusion of our interview, he discusses what he brought back from the recent GEOSTAT2008 conference in Santiago, Chile, and what we might expect to see in future releases of Geostatistical Analyst.
Last month you attended GEOSTAT2008, the major geostatistical conference, in Santiago, Chile.
Yes, during the conference I met with many leading scientists to discuss Geostatistical Analyst, attended sessions on geostatistical theory and applications to learn about current trends in the science, and introduced many of the attendees to the Geostatistical Analyst software. I spoke at length with several of the best modern geostatisticians about the current state of the art and what can be done for the large audience of GIS users in the near future.
Were attendees of the conference already familiar with ESRI’s work in this field?
All attendees use one or more geostatistical software packages in their work, but a large number of them were not aware of our geostatistical analysis software package. There is clearly a need to better promote Geostatistical Analyst to the scientific community.
What are some of the current trends in geostatistics?
Based on the conference, the main trends in modern geostatistics are non-Gaussian kriging models, a preference among most researchers for simulations over predictions, and rapidly growing interest in space-time and Bayesian geostatistics.
Modeling with Geostatistical Analyst.
Is the geostatistical team addressing these in future releases of Geostatistical Analyst?
In general, we are following modern trends in Geostatistical Analyst 9.4. In particular, we are working on several non-Gaussian kriging models, including areal interpolation for binomial data (epidemiological, crime, etc.) and gamma disjunctive kriging (for interpolation of data with positive values). We are also providing several enhancements to the recently released Gaussian geostatistical simulation geoprocessing tool. For example, users will be able to specify measurement error for each datum, which is often known or can be estimated. You can hardly find such an option in other geostatistical software.
I have developed some recommendations for future functionality of Geostatistical Analyst based on what I learned during the conference. They include Bayesian kriging, space-time series using functional kriging, and copula-based spatial regression.
Your book is called “Spatial Statistics…”, and the product is called “Geostatistical Analyst.” Does Geostatistical Analyst address all types of spatial statistics?
No. Spatial data are divisible into three main categories according to their location:
- Discrete point data: data that consist of locations of events. Applications of point pattern analysis include forestry, epidemiology, and criminology.
- Regional data (sometimes also known as aggregated, polygonal, or lattice data): data that are associated with areas and that typically include counts of an event within a polygon. Regional data occur in epidemiology, criminology, agriculture, census, and business-related applications.
- Geostatistical or continuous data: data that can be measured at any location in the study area but are known only at a limited number of sample points. Geostatistical data occur in meteorology, agriculture, mining, and environmental studies, for example.
And the Geostatistical Analyst product focuses on the models for the third data type, continuous data?
Primarily, yes. The Geostatistical Analyst team is small and our focus is limited to models for continuous data at this point in time. However, many models and tools in Geostatistical Analyst can be used for exploration of the other two types of spatial data; in other words, for the data summary. In practice, researchers are often interested in the data summary only, at least at the initial stage of the data analysis. Data modeling and prediction may or may not follow the spatial data exploration stage.
Are there other statistical software packages out there that integrate with GIS, and address discrete points and regional data modeling?
Yes, these include R, WinBUGS, and SAS. The use of these software packages in conjunction with GIS software is discussed in detail in my book. Integration between these packages and ArcGIS is possible through geoprocessing tools, but at the moment researchers are simply exchanging data between programs. As I mentioned at the beginning of our talk, creating a set of geoprocessing tools for running external statistical software packages is much easier than explaining clearly where and how statistical models should be used and when they may produce wrong results.
Can you explain more about the value of simulations?
With conditional geostatistical simulation, instead of using just one input surface in geoprocessing, you can use many surfaces with the same statistical features (say, 1,000) and then produce 1,000 outputs. The resulting distributions of possible values at specified locations or areas show how uncertain the result of your analysis is, and this is extremely important for good decision-making. Areas with relatively frequent extreme values may be the most interesting part of the data analysis. In applications such as geology, mining, and environmental science, there is a big advantage in having a distribution of possible values as opposed to just one (most probable) value. I believe that the number of GIS researchers who could benefit from using simulations will grow. Note that Bayesian statistical modeling is essentially based on simulation methods.
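For readers who want to see the idea in miniature, here is a rough Python sketch of the workflow described above: run the same analysis on many realizations and look at the spread of the outputs. It is only an illustration under stated assumptions, not how Geostatistical Analyst works. The function names, the 3x3 grid, the noise level, and the threshold are all hypothetical, and `simulate_surface` is a stand-in that adds independent Gaussian noise to each cell; a real conditional simulation would honor the measured data and the spatial correlation between cells.

```python
import random
import statistics


def simulate_surface(rng, mean_surface, sd):
    """Stand-in for one realization (hypothetical).

    Adds independent Gaussian noise to every cell. A real conditional
    geostatistical simulation would reproduce the data values and the
    spatial correlation structure; this placeholder does neither.
    """
    return [[rng.gauss(m, sd) for m in row] for row in mean_surface]


def analysis(surface, threshold):
    """Hypothetical geoprocessing step: fraction of cells above a threshold."""
    cells = [value for row in surface for value in row]
    return sum(value > threshold for value in cells) / len(cells)


def run(n_realizations=1000, seed=42):
    """Run the same analysis on many realizations and collect the outputs."""
    rng = random.Random(seed)
    # Made-up 3x3 grid of mean values, for illustration only.
    mean_surface = [
        [10.0, 12.0, 15.0],
        [11.0, 14.0, 18.0],
        [13.0, 17.0, 21.0],
    ]
    return [
        analysis(simulate_surface(rng, mean_surface, sd=3.0), threshold=16.0)
        for _ in range(n_realizations)
    ]


def summarize(outputs):
    """Summarize the distribution of outputs: mean and a rough 5%-95% range."""
    ordered = sorted(outputs)
    n = len(ordered)
    return {
        "mean": statistics.mean(ordered),
        "p05": ordered[int(0.05 * (n - 1))],
        "p95": ordered[int(0.95 * (n - 1))],
    }


if __name__ == "__main__":
    print(summarize(run()))
```

The point of the sketch is the last step: instead of a single number, the analysis yields a distribution, and the width of the 5%-95% range indicates how uncertain the result is.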
Thanks, Konstantin, for taking the time to share some of your experience with the readers of my blog.