By Yingjie Hu, Krzysztof Janowicz, Sathya Prasad, and Song Gao
Transactions in GIS, Volume 19, Issue 3, June 2015, Pages 398–416
“Geoportals provide integrated access to geospatial resources, and enable both authorities and the general public to contribute and share data and services. An essential goal of geoportals is to facilitate the discovery of the available resources. Such a process relies heavily on the quality of metadata. While multiple metadata standards have been established, data contributers may adopt different standards when sharing their data via the same geoportal. This is especially the case for user-generated content where various terms and topics can be introduced to describe similar datasets. While this heterogeneity provides a wealth of perspectives, it also complicates resource discovery. With the fast development of the Semantic Web technologies, there is a rise of Linked-Data-driven portals. Although these novel portals open up new ways to organize metadata and retrieve resources, they lack effective semantic search methods.
“This article addresses the two challenges discussed above, namely the topic heterogeneity brought by multiple metadata standards and the lack of established semantic search in Linked-Data-driven geoportals. To harmonize the metadata topics, we employ a natural language processing method, namely Labeled Latent Dirichlet Allocation (LLDA), and train it using standardized metadata from Data.gov. With respect to semantic search, we construct thematic and geographic matching features from the textual metadata descriptions, and train a regression model via a human participants experiment. We evaluate our methods by examining their performances in addressing the two issues. Finally, we implement a semantics-enabled and Linked-Data-driven prototypical geoportal using a sample dataset from Esri’s ArcGIS Online.”