
dc.contributor.advisor: Baldridge, Jason [en]
dc.creator: Wing, Benjamin Patai [en]
dc.date.accessioned: 2011-07-07T16:07:12Z [en]
dc.date.available: 2011-07-07T16:07:12Z [en]
dc.date.created: 2011-05 [en]
dc.date.issued: 2011-07-07 [en]
dc.date.submitted: May 2011 [en]
dc.identifier.uri: http://hdl.handle.net/2152/ETD-UT-2011-05-3632 [en]
dc.description: text [en]
dc.description.abstract: This thesis investigates automatic geolocation (i.e. identification of the location, expressed as latitude/longitude coordinates) of documents. Geolocation can be an effective means of summarizing large document collections and is an important component of geographic information retrieval. We describe several simple supervised methods for document geolocation using only the document’s raw text as evidence. All of our methods predict locations in the context of geodesic grids of varying degrees of resolution. We evaluate the methods on geotagged Wikipedia articles and Twitter feeds. For Wikipedia, our best method obtains a median prediction error of just 11.8 kilometers. Twitter geolocation is more challenging: we obtain a median error of 479 km, an improvement on previous results for the dataset. [en]
dc.format.mimetype: application/pdf [en]
dc.language.iso: eng [en]
dc.subject: Geospatial data [en]
dc.subject: Geographical positions [en]
dc.subject: Geodatabases [en]
dc.subject: Computational linguistics [en]
dc.subject: Geolocation [en]
dc.subject: Geographic information retrieval [en]
dc.subject: Wikipedia [en]
dc.subject: Twitter [en]
dc.subject: KL divergence [en]
dc.subject: Geotagging [en]
dc.title: Data-rich document geotagging using geodesic grids [en]
dc.date.updated: 2011-07-07T16:07:18Z [en]
dc.identifier.slug: 2152/ETD-UT-2011-05-3632 [en]
dc.contributor.committeeMember: Erk, Katrin [en]
dc.description.department: Linguistics [en]
dc.type.genre: thesis [en]
thesis.degree.department: Linguistics [en]
thesis.degree.discipline: Linguistics [en]
thesis.degree.grantor: University of Texas at Austin [en]
thesis.degree.level: Masters [en]
thesis.degree.name: Master of Arts [en]

