Testing EAD encoding in the Texas Archival Resources Online (TARO) system with textual analysis techniques
Abstract
Electronic archival finding aids encoded in Encoded Archival Description (EAD) are transported across networks and rendered into HTML for display in the browser. Considering the time, effort, and money involved in marking up finding aids, has the markup been used for retrieval purposes? Has the multilevel hierarchical nature of finding aids been used for searching? A few online EAD tag-based retrieval systems process queries by looking for occurrences of the search term in the corresponding EAD tag, but they do not seem to address subject- or topic-based queries. This study explores the possibility of using the content of specific EAD tags for subject retrieval. We studied the consistencies, commonalities, and discrepancies in the usage of various critical tags across repositories participating in the Texas Archival Resources Online (TARO) project, and compared that usage with EAD tagging guidelines as well as TARO guidelines. We identified a small set of tags as good representatives of the finding aid based on standard archival descriptive practice and examined their content for a sample of repositories within TARO. The content of these tags was processed using text processing techniques to study it further and to arrive at possible similarity metrics for identifying similar finding aids. We believe this will help evaluate EAD as an information retrieval tool within TARO. If our experiments show that EAD can be effective as such a tool (or can be made effective by better descriptive practice), then creating a highly interconnected web of finding aids that exploits the hierarchical nature of EAD becomes a realistic prospect.
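As a rough illustration of the kind of tag-based similarity the abstract describes, the sketch below pulls the text out of a few chosen EAD elements and compares two finding aids with cosine similarity over raw term counts. It is not the study's method: the abstract does not say which metric was used, and the element names in `TAGS` (`bioghist`, `scopecontent`, `controlaccess`) are assumptions picked only because they are real EAD elements. The helper names are hypothetical as well.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed element names, for illustration only; the study's own choice
# of tags is not stated here.
TAGS = ("bioghist", "scopecontent", "controlaccess")


def local_name(tag):
    # Drop an XML namespace prefix of the form "{namespace-uri}name".
    return tag.rsplit("}", 1)[-1]


def tag_text(ead_xml, tags=TAGS):
    """Collect the text inside the chosen elements of one finding aid.

    Note: if a chosen element is nested inside another chosen element,
    its text is counted twice. Good enough for a sketch.
    """
    root = ET.fromstring(ead_xml)
    parts = []
    for elem in root.iter():
        if local_name(elem.tag) in tags:
            parts.append(" ".join(elem.itertext()))
    return " ".join(parts)


def term_counts(text):
    # Lowercase the text and count alphabetic tokens.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity(xml_a, xml_b, tags=TAGS):
    """Similarity of two finding aids based only on the chosen tags."""
    return cosine(
        term_counts(tag_text(xml_a, tags)),
        term_counts(tag_text(xml_b, tags)),
    )
```

Calling `similarity(xml_a, xml_b)` on two EAD documents given as strings returns a score between 0 and 1. A real experiment would likely add stop-word removal, stemming, or TF-IDF weighting, and would depend on repositories using the chosen tags consistently.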