Data-rich document geotagging using geodesic grids

Title: Data-rich document geotagging using geodesic grids
Author: Wing, Benjamin Patai
Abstract: This thesis investigates automatic geolocation (i.e. identification of the location, expressed as latitude/longitude coordinates) of documents. Geolocation can be an effective means of summarizing large document collections and is an important component of geographic information retrieval. We describe several simple supervised methods for document geolocation using only the document’s raw text as evidence. All of our methods predict locations in the context of geodesic grids of varying degrees of resolution. We evaluate the methods on geotagged Wikipedia articles and Twitter feeds. For Wikipedia, our best method obtains a median prediction error of just 11.8 kilometers. Twitter geolocation is more challenging: we obtain a median error of 479 km, an improvement on previous results for the dataset.
Department: Linguistics
Subject: Geospatial data; Geographical positions; Geodatabases; Computational linguistics; Geolocation; Geographic information retrieval; Wikipedia; Twitter; KL divergence; Geotagging
URI: http://hdl.handle.net/2152/ETD-UT-2011-05-3632
Date: 2011-05
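
The abstract refers to predicting locations on a grid over the Earth's surface and to reporting median prediction error in kilometers. The following is a minimal sketch of those two ideas, not the thesis's implementation: it assumes a regular latitude/longitude grid whose cell size divides 180 degrees, predicts a cell's center as the location, and measures error as great-circle (haversine) distance on a spherical Earth. All function names and the Earth radius constant are illustrative assumptions.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def grid_cell(lat, lon, cell_deg=1.0):
    """Return (row, col) of the cell containing (lat, lon) in a regular degree grid."""
    n_rows = int(math.ceil(180.0 / cell_deg))
    # Clamp so that the north pole (lat == 90) falls in the top row.
    row = min(int((lat + 90.0) // cell_deg), n_rows - 1)
    # Wrap longitude so that 180 and -180 map to the same column.
    col = int(((lon + 180.0) % 360.0) // cell_deg)
    return row, col


def cell_center(row, col, cell_deg=1.0):
    """Return the (lat, lon) center of a grid cell, used here as the predicted location."""
    return (-90.0 + (row + 0.5) * cell_deg, -180.0 + (col + 0.5) * cell_deg)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points on a sphere."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def median_error_km(predicted, gold):
    """Median great-circle error over paired (lat, lon) predictions and gold locations."""
    return median(haversine_km(p[0], p[1], g[0], g[1]) for p, g in zip(predicted, gold))
```

In this sketch, a smaller `cell_deg` gives a finer grid: the best-case error (distance from a true location to its own cell's center) shrinks, but each cell covers fewer training documents.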

Files in this work

Download File: WING-THESIS.pdf
Size: 2.459 MB
Format: application/pdf
