Browsing by Subject "Metadata"
Now showing 1-8 of 8
Item
BIBFRAME beginnings at UT Austin (2016-05)
Brown, Amy; Cofield, Melanie C.; Davis, Jee; Quagliana, Alisha; Ringwood, Alan

Staff from UT Libraries, the Harry Ransom Center, and the Tarlton Law Library have been collaborating in discussion group activities during the last year to develop knowledge and skills in anticipation of life after MARC, investigating the brave new world of linked data in libraries with a focus on the Library of Congress Bibliographic Framework (BIBFRAME) initiative. Our group efforts to better understand BIBFRAME and linked data for libraries include in-depth discussions of current literature, webcasts, and presentations; strategic application of Zepheira’s Practical Practitioner training; and hands-on experimentation transforming local metadata in various formats, for various resource types, to BIBFRAME. Our analysis of the resulting transformations has helped us gain insight into mapping complexities, data loss, false transformations, potential new metadata displays, and the limitations of the tools involved. The experimentation process overall has afforded us the opportunity to ask targeted questions about what is needed to move toward linked data and to gain a better view of the frontier of Technical Services staff skillsets. In this panel presentation, we’ll share details about our approaches to maximizing the group learning experience and lessons learned from grappling with new concepts, data models, terminology, and tools. Representatives from our experimentation teams will report on the initial experience of transforming MARC and non-MARC data sets to BIBFRAME, and on what we see as emerging questions and next steps.

Item
Efficient fine-grained virtual memory (2018-05)
Zheng, Tianhao, Ph. D.; Erez, Mattan; Reddi, Vijay Janapa; Tiwari, Mohit; Lin, Calvin; Peter, Simon

Virtual memory in modern computer systems provides a single abstraction of the memory hierarchy. By hiding fragmentation and overlays of physical memory, virtual memory frees applications from managing physical memory and improves programmability. However, virtual memory often introduces noticeable overhead. State-of-the-art systems use paged virtual memory that maps virtual addresses to physical addresses at page granularity (typically 4 KiB). This mapping is stored as a page table. Before accessing physically addressed memory, the page table is consulted to translate virtual addresses to physical addresses. Research shows that the overhead of accessing the page table can even exceed the execution time of some important applications. In addition, this fine-grained mapping changes the access patterns between virtual and physical address spaces, complicating many architectural techniques such as caches and prefetchers. In this dissertation, I propose architectural mechanisms to reduce the overhead of accessing and managing fine-grained virtual memory without compromising its existing benefits. There are three main contributions in this dissertation. First, I investigate the impact of address translation on caches. I examine the restriction that fine-grained paging places on virtually indexed, physically tagged (VIPT) caches and conclude that this restriction may lead to sub-optimal cache designs. I introduce a novel cache strategy, speculatively indexed, physically tagged (SIPT), to enable flexible cache indexing under fine-grained page mapping. SIPT speculates on the value of a few additional index bits (1-3 in our experiments) to access the cache before translation completes, and then verifies that the physical tag matches after translation. Exploiting the fact that a simple relation generally exists between virtual and physical addresses, because memory allocators often exhibit contiguity, I also propose low-cost mechanisms to predict and correct potential mis-speculations. Next, I focus on reducing the overhead of address translation for fine-grained virtual memory. I propose a novel architectural mechanism, Embedded Page Translation Information (EMPTI), to provide general fine-grained page translation on top of coarse-grained virtual memory. EMPTI does so by speculating that a virtual address is mapped to a pre-determined physical location and then verifying the translation with a very low-cost access to metadata embedded with the data. Coarse-grained virtual memory mechanisms (e.g., segmentation) are used to suggest the pre-determined physical location for each virtual page. Overall, EMPTI achieves the benefits of low-overhead translation while keeping the flexibility and programmability of fine-grained paging. Finally, I improve the efficiency of metadata caching based on the fact that memory-mapping contiguity generally exists beyond a page boundary. In state-of-the-art architectures, caches treat PTEs (page table entries) as regular data. Although this is simple and straightforward, it fails to maximize the storage efficiency of metadata: each page in a contiguously mapped region costs a full 8-byte PTE, even though the delta between virtual and physical addresses remains the same and most metadata are identical. I propose a novel microarchitectural mechanism that expands the effective PTE storage in the last-level cache (LLC) and reduces the number of page-walk accesses that miss the LLC.
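As a rough illustration of the SIPT idea described in the abstract above, the sketch below (not from the dissertation; the page size, bit widths, and function names are invented for illustration) guesses that the index bits lying just beyond the page offset are the same in the virtual and physical address, indexes the cache with that guess before translation, and checks the guess once the TLB returns the physical address:

```python
# Illustrative sketch of SIPT-style speculative cache indexing.
# Assumptions (not from the dissertation): 4 KiB pages, so bits [11:0] are
# translation-invariant, and a cache index needing 2 extra bits [13:12].

PAGE_BITS = 12          # 4 KiB pages: the low 12 bits never change under translation
EXTRA_INDEX_BITS = 2    # index bits above the page offset must be speculated

def speculative_index(vaddr: int) -> int:
    """Index the cache before translation, guessing that the extra index bits
    of the physical address equal those of the virtual address (true whenever
    the allocator mapped this region contiguously)."""
    return vaddr & ((1 << (PAGE_BITS + EXTRA_INDEX_BITS)) - 1)

def speculation_correct(vaddr: int, paddr: int) -> bool:
    """After the TLB supplies the physical address, verify the guess: only the
    bits above the page offset could have differed."""
    mask = ((1 << EXTRA_INDEX_BITS) - 1) << PAGE_BITS
    return (vaddr & mask) == (paddr & mask)

# Contiguously mapped page: the guess holds and the early cache access stands.
assert speculation_correct(vaddr=0x0000_3ABC, paddr=0x0004_3ABC)
# A mapping that flips bit 12: the guess fails and the access must be replayed.
assert not speculation_correct(vaddr=0x0000_3ABC, paddr=0x0004_2ABC)
```

Because allocators often map pages contiguously, the guess succeeds most of the time, which is what makes speculating on only 1-3 bits attractive.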
Item
Geospatial metadata and an ontology for water observations data (2009-05)
Marney, Katherine Anne; Maidment, David R.; McKinney, Daene

The Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) has worked successfully to synthesize the nation’s hydrologic data. Through the building of a national Hydrologic Information System, the organization has demonstrated a structure that promotes data sharing. While data access has been improved by the work completed thus far, the resources available for discovering relevant datasets are still lacking. To improve data discovery among existing data services, a model for the storage and organization of metadata has been created. This includes the creation of an aggregated table of relevant metadata from any number of sources, called a Master SeriesCatalog. Using this table, data layers are easily organized by theme, thereby simplifying concept-based data discovery.

Item
Identifying, selecting, and organizing the attributes of Web resources (2004)
Pasch, Grete; Miksa, Francis L.

The basic human approach to referring to the real world is to represent observed objects by their attributes, be it in natural language or in formal data models. Library cataloging is no different in using attributes to represent information resources, but its approach to data modeling is implicit and does not provide methodologies for attribute analysis. This is a critical problem when representing web resources, since they differ significantly from the kinds of resources typically handled by library catalogs. The purpose of this dissertation is to systematically identify, select, and organize the attributes of web resources by means of an alternative to the traditional, library-based bibliographic model. An alternative methodology is explored that combines data modeling principles from information systems theory, concepts from bibliographic modeling, and Gerard Genette’s paratextual theory. The proposed methodology is applied to a working collection of 300 web resources listed by LANIC (the Latin American Network Information Center) of the Institute of Latin American Studies, University of Texas at Austin. A semi-automatic process is used to extract attributes from the HTML code’s HEAD and BODY sections, the information provided by the browser, the data about each locally stored file, and the LANIC directory pages. Attributes are also manually marked up and extracted from each pageview. As a result, a total of 290 attributes were identified and selected. The attributes were then organized according to two methods. First, a direct mapping into Dublin Core (DC) highlights the shortcomings of the traditional approach: two-thirds of the attributes found do not match any DC element, and questions about the structure and meaning of the DC elements are underscored. Second, matching each attribute to its parent entity resulted in a model with 35 entities grouped into four categories: agents, binders, components, and original documents. These entities highlight the origin of each attribute, help model the life cycle of the information entities, and offer an alternative source for attribute values. The 37 unmatched attributes (expressive, navigational, and directive attributes) hint at the possible application of a social relativist approach for modeling them further.
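To make the extraction-and-mapping step concrete, here is a minimal sketch in the spirit of the semi-automatic HEAD extraction described above. It is not the dissertation’s tool: the class, the sample HTML, and the toy Dublin Core mapping are all invented for illustration, and only the HEAD section is handled (the study also harvested BODY content, browser data, file data, and directory pages).

```python
# Minimal sketch: extract attributes from an HTML HEAD and attempt a Dublin
# Core mapping. Uses only the standard library.
from html.parser import HTMLParser

class HeadAttributeExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.attributes = {}      # extracted attribute name -> value
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and "name" in attrs:
            self.attributes[f"meta.{attrs['name'].lower()}"] = attrs.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.attributes["title"] = data.strip()

# A toy DC mapping; many extracted attributes will find no match, mirroring
# the finding that two-thirds of the attributes do not map to any DC element.
DC_MAP = {"title": "dc:title", "meta.author": "dc:creator",
          "meta.description": "dc:description", "meta.keywords": "dc:subject"}

parser = HeadAttributeExtractor()
parser.feed("<html><head><title>LANIC</title>"
            "<meta name='keywords' content='Latin America'></head></html>")
for name, value in parser.attributes.items():
    print(DC_MAP.get(name, "(no DC match)"), "|", name, "=", value)
```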
Item
Mediating conflicting values in a community archives setting (2019-05-15)
Griesse, Birch; Acker, Amelia

This report examines conflicts within the user community of the online fanwork repository Archive of Our Own (AO3). The site is a nonprofit venture with the ambitious goal of serving the large, heterogeneous community of fandom writ large. Tensions among subsets of the Archive’s user group have flared up at various points in its ten-year history, forcing its volunteer staff into the position of arbiter of community values. These conflicting values have influenced, sometimes asymmetrically, the functionality of the Archive and are now embedded in its design. Focusing on tensions in three broad categories delineated by the user groups in conflict, this report explores the effect these compromises have had on AO3’s goals of inclusivity, preservation, and access.

Item
Separating data from metadata for robustness and scalability (2014-12)
Wang, Yang, Ph. D.; Alvisi, Lorenzo; Dahlin, Michael

When building storage systems that aim to provide robustness, scalability, and efficiency simultaneously, one faces a fundamental tension: higher robustness typically incurs higher costs and thus hurts both efficiency and scalability. My research shows that an approach to storage system design based on a simple principle—separating data from metadata—can yield systems that address that tension elegantly and effectively in a variety of settings. One observation motivates our approach: much of the cost paid by many strong protection techniques is incurred to detect errors. This observation suggests an opportunity: if we can build a low-cost oracle to detect errors and identify correct data, it may be possible to reduce the cost of protection without weakening its guarantees. This dissertation shows that metadata, if carefully designed, can serve as such an oracle and help a storage system protect its data at minimal cost. It shows how to apply this idea effectively in three very different systems: Gnothi, a storage replication protocol that combines the high availability of asynchronous replication with the low cost of synchronous replication for small-scale block storage; Salus, a large-scale block store with unprecedented guarantees in terms of consistency, availability, and durability in the face of a wide range of server failures; and Exalt, a tool to emulate a large storage system with 100 times fewer machines.
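The "metadata as a low-cost oracle" principle can be illustrated with a few lines of code. The sketch below is an illustration of the principle only, not the Gnothi, Salus, or Exalt protocols: a small, separately protected checksum is kept per data block, so a read can cheaply tell whether the (less strongly protected) data path returned correct bytes.

```python
# Minimal sketch of metadata acting as an error-detection oracle for data.
import hashlib

metadata = {}   # block_id -> checksum (small; kept with strong guarantees)
data = {}       # block_id -> payload  (large; kept with cheaper guarantees)

def write(block_id: int, payload: bytes) -> None:
    """Store the payload and record its checksum in the metadata."""
    data[block_id] = payload
    metadata[block_id] = hashlib.sha256(payload).hexdigest()

def read(block_id: int) -> bytes:
    """Return the payload only if the metadata oracle vouches for it."""
    payload = data[block_id]
    if hashlib.sha256(payload).hexdigest() != metadata[block_id]:
        raise IOError(f"block {block_id}: corrupt data; recover from a replica")
    return payload

write(7, b"hello")
assert read(7) == b"hello"
data[7] = b"hellp"              # simulate silent corruption on the data path
try:
    read(7)
except IOError as e:
    print(e)                    # the metadata oracle detects the bad block
```

Because the checksum is tiny relative to the block, the expensive guarantees need only cover the metadata, which is the economy the dissertation's systems exploit.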
Item
The Obama Administration and digital content : a case study of Healthcare.gov (2016-05)
Gant, Alia; Wickett, Karen M.; Towery, Stephanie

The United States government has made enormous strides to adapt and evolve with the digital era in the 21st century. In the 1990s, the Clinton Administration first showed an acceptance of, and willingness to work with, changing technology. Subsequent administrations continued to support platforms built on digital programs such as the Internet. This Master’s Report will examine government websites under the Obama Administration, in particular Healthcare.gov, through the perspective of information professionals. The report will describe and analyze the information users need in order to access health insurance plans. It will discuss and apply frameworks from information studies, including metadata, digital libraries, and community informatics. Lastly, the report will provide critiques, suggestions, and directions for future research on this topic.

Item
Themes in videogame research : a content analysis of scholarly articles (2010-08)
Broussard, Ramona Lindley; Geisler, Gary; Feinberg, Melanie

To provide scholars with access to videogame materials, collecting organizations must develop standards for building and structuring collections, and in turn information professionals must assess the information needs of users. To begin that assessment, this paper presents a content analysis of scholarly videogame articles. The results of the analysis provide a basis for structuring videogame archives, libraries, and databases. Metadata schemas are important to access and to collecting. That metadata will aid patrons is widely accepted, but too often schemas and vocabularies are based only on experts’ opinions, without taking into account patrons’ ideas of what is important. To address this gap, the content analysis presented in this paper combines historical ideas about metadata standards from expert archivists with an analysis of which themes are important, common, and sought after in the literature of videogame scholars, who are the likely users of videogame collections.
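For readers unfamiliar with how a theme-based content analysis feeds a metadata vocabulary, here is a toy sketch. The article titles and theme labels are invented placeholders, not the paper's corpus or codes; it only shows how hand-coded themes can be tallied to surface candidate vocabulary terms.

```python
# Toy sketch: tally hand-coded themes across articles to rank candidate
# vocabulary terms for a videogame collection.
from collections import Counter

coded_articles = [
    {"title": "Article A", "themes": ["education", "narrative"]},
    {"title": "Article B", "themes": ["violence", "narrative"]},
    {"title": "Article C", "themes": ["narrative", "preservation"]},
]

theme_counts = Counter(theme for article in coded_articles
                       for theme in article["themes"])
for theme, count in theme_counts.most_common():
    print(f"{theme}: appears in {count} of {len(coded_articles)} articles")
```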