dc.creator: Wing, Benjamin Patai
dc.date.accessioned: 2011-07-07T16:07:12Z
dc.date.available: 2011-07-07T16:07:12Z
dc.date.created: 2011-05
dc.date.issued: 2011-07-07
dc.date.submitted: May 2011
dc.identifier.uri: http://hdl.handle.net/2152/ETD-UT-2011-05-3632
dc.description: text
dc.description.abstract: This thesis investigates automatic geolocation (i.e. identification of the location, expressed as latitude/longitude coordinates) of documents. Geolocation can be an effective means of summarizing large document collections and is an important component of geographic information retrieval. We describe several simple supervised methods for document geolocation using only the document’s raw text as evidence. All of our methods predict locations in the context of geodesic grids of varying degrees of resolution. We evaluate the methods on geotagged Wikipedia articles and Twitter feeds. For Wikipedia, our best method obtains a median prediction error of just 11.8 kilometers. Twitter geolocation is more challenging: we obtain a median error of 479 km, an improvement on previous results for the dataset.
dc.format.mimetype: application/pdf
dc.language.iso: eng
dc.subject: Geospatial data
dc.subject: Geographical positions
dc.subject: Geodatabases
dc.subject: Computational linguistics
dc.subject: Geolocation
dc.subject: Geographic information retrieval
dc.subject: Wikipedia
dc.subject: Twitter
dc.subject: KL divergence
dc.subject: Geotagging
dc.title: Data-rich document geotagging using geodesic grids
dc.date.updated: 2011-07-07T16:07:18Z
dc.identifier.slug: 2152/ETD-UT-2011-05-3632
dc.description.department: Linguistics
dc.type.genre: thesis
thesis.degree.department: Linguistics
thesis.degree.discipline: Linguistics
thesis.degree.grantor: University of Texas at Austin
thesis.degree.level: Masters
thesis.degree.name: Master of Arts

