Data-rich document geotagging using geodesic grids



dc.contributor.advisor Baldridge, Jason
dc.creator Wing, Benjamin Patai 2011-07-07T16:07:12Z 2011-07-07T16:07:12Z 2011-05 2011-07-07 May 2011
dc.description.abstract This thesis investigates automatic geolocation of documents, i.e., identifying a document's location expressed as latitude/longitude coordinates. Geolocation can be an effective means of summarizing large document collections and is an important component of geographic information retrieval. We describe several simple supervised methods for document geolocation that use only the document’s raw text as evidence. All of our methods predict locations in the context of geodesic grids of varying degrees of resolution. We evaluate the methods on geotagged Wikipedia articles and Twitter feeds. For Wikipedia, our best method obtains a median prediction error of just 11.8 km. Twitter geolocation is more challenging: we obtain a median error of 479 km, an improvement on previous results for the dataset.
dc.format.mimetype application/pdf
dc.language.iso eng
dc.subject Geospatial data
dc.subject Geographical positions
dc.subject Geodatabases
dc.subject Computational linguistics
dc.subject Geolocation
dc.subject Geographic information retrieval
dc.subject Wikipedia
dc.subject Twitter
dc.subject KL divergence
dc.subject Geotagging
dc.title Data-rich document geotagging using geodesic grids 2011-07-07T16:07:18Z
dc.identifier.slug 2152/ETD-UT-2011-05-3632
dc.contributor.committeeMember Erk, Katrin
dc.description.department Linguistics
dc.type.genre thesis
dc.type.material text Linguistics Linguistics University of Texas at Austin Masters Master of Arts
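
The abstract refers to predicting locations on grids of varying resolution and to reporting a median prediction error in kilometers. The following is a minimal sketch of those two ideas only, not the thesis's implementation: this record does not describe how the grid is built, so a regular latitude/longitude grid is assumed, and the function names, the `degrees_per_cell` parameter, and the Earth radius constant are all hypothetical choices made for illustration.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def grid_cell(lat, lon, degrees_per_cell=1.0):
    """Return (row, col) of the regular lat/lon cell containing the point."""
    row = math.floor((lat + 90.0) / degrees_per_cell)
    col = math.floor((lon + 180.0) / degrees_per_cell)
    return row, col


def cell_center(cell, degrees_per_cell=1.0):
    """Return the (lat, lon) at the center of a cell, usable as a predicted location."""
    row, col = cell
    lat = (row + 0.5) * degrees_per_cell - 90.0
    lon = (col + 0.5) * degrees_per_cell - 180.0
    return lat, lon


def great_circle_km(a, b):
    """Haversine great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = (math.radians(x) for x in a)
    lat2, lon2 = (math.radians(x) for x in b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp against floating-point overshoot before taking asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def median_error_km(predicted, gold):
    """Median great-circle distance between predicted and gold (lat, lon) pairs."""
    return median(great_circle_km(p, g) for p, g in zip(predicted, gold))
```

Under these assumptions, a classifier would choose a cell for each document, `cell_center` would turn that cell into a coordinate, and `median_error_km` would summarize the distances to the true coordinates. A smaller `degrees_per_cell` gives a finer grid at the cost of fewer training documents per cell.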

Files in this work

Download File: WING-THESIS.pdf
Size: 2.459Mb
Format: application/pdf
