The Sensor Web: Unpredictable, Noisy and Loaded with Errors

WI/IAT 2010 – IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology, 1-3 September 2010, Toronto, Canada

Alan F. Smeaton

“Classical information retrieval is based around a user having an information need, formulated as a query, and a system which matches the query against ‘documents’, retrieving those most likely to be relevant. In some applications there are challenges because the ‘documents’ are not discrete objects but highly inter-connected, and IR research has for decades developed models of the processes, devised novel ranking algorithms, and developed very elaborate benchmarking techniques for performance. But what if the information we need or seek is not neatly divided into documents, either discrete or inter-connected, but needs to be taken from a constant stream of data values, namely data from sensors? These sensors include the physical sensors around us (environment, place, physical activities like traffic, weather, people movement, crowd gatherings like concerts and sports events) as well as the online sensors we have access to (blogs, tweets, etc.). Often termed the *sensor web*, this information source is characterised as being noisy, error-prone, unpredictable and dynamic, exactly like the real and the virtual worlds in which we live, work and play. In this presentation I introduce several diverse sensor web applications to show the breadth and pervasive nature of the sensor web and I then show some of the techniques which we use to manage the information which forms part of the sensor web.”