Venice, City of Canals: Characterizing Regions through Content Classification

“Information retrieval tasks in the geographic domain rely on textual annotation of georeferenced information objects. These information objects can be annotated with references to spatial objects contained within the corresponding geographical footprint. Not all the spatial objects, however, describe the essential attributes characterizing the region. In this article, we present a method to calculate the descriptive prominence of categories of spatial objects in a given region and select a subset for the characteristic description of the region. The method is demonstrated on three datasets of points of interest and an artificial dataset is used as a benchmark. The method reduces the number of categories describing regions significantly (p<0.001). We further illustrate the results qualitatively for three regions characterized in text.”
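
The abstract does not state how descriptive prominence is computed, so the following is only an illustrative sketch and not the authors' method. It assumes a location-quotient style score (a category's share of the objects in a region divided by its share across all regions) and an arbitrary cut-off of 1.5 for selecting the characteristic subset. The function names, the threshold, and the toy counts are all hypothetical.

```python
from collections import Counter


def category_prominence(region_counts, global_counts):
    """Score each category by regional share divided by global share.

    Assumption: a location-quotient style measure stands in for the
    paper's (unspecified here) prominence calculation. Every category in
    region_counts must also appear in global_counts with a nonzero count.
    """
    region_total = sum(region_counts.values())
    global_total = sum(global_counts.values())
    scores = {}
    for category, count in region_counts.items():
        regional_share = count / region_total
        global_share = global_counts[category] / global_total
        scores[category] = regional_share / global_share
    return scores


def characteristic_categories(scores, threshold=1.5):
    """Keep categories scoring above the (hypothetical) threshold,
    ordered from most to least prominent."""
    kept = [category for category, score in scores.items() if score > threshold]
    return sorted(kept, key=lambda category: scores[category], reverse=True)


# Made-up counts of points of interest, for illustration only.
venice = Counter({"canal": 40, "restaurant": 50, "bridge": 10})
everywhere = Counter({"canal": 50, "restaurant": 900, "bridge": 50})

scores = category_prominence(venice, everywhere)
selected = characteristic_categories(scores)
print(selected)
```

With these toy counts, restaurants are the most numerous category in the region but are under-represented relative to the overall distribution, so they are dropped, while canals and bridges are kept. This mirrors the abstract's point that not every spatial object in a footprint characterizes the region.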