Browsing by Subject "Large language models"
Now showing 1 - 4 of 4
Item
Designing a multi-perspective search system using large language models and retrieval augmented generation (2024-05)
Mujumdar, Utkarsh; Lease, Matthew A.; Rajadesingan, Ashwin
In the context of information retrieval, multi-perspective search is a desired solution when the search query focuses on contentious topics that might not have clear factual grounds for an answer; "Should humans colonize space?" is an example of such a query. Although the explicit intent of this query might call for a definitive answer ("Yes"/"No"), an ideal search result should add the necessary context, or perspective, along with that answer. Moreover, there can be multiple such perspectives from which to answer the question, hence the need for multi-perspective search systems. However, surfacing diverse perspectives in information-seeking contexts is a challenging problem: traditional search engines, while effective at aggregating data, often fall short in providing a cohesive context, particularly when addressing complex, contentious topics. Motivated by these shortcomings, this thesis introduces a multi-perspective search system that leverages the capabilities of Large Language Models (LLMs) and Retrieval Augmented Generation (RAG). Given a search topic of interest, the proposed system employs LLMs to generate and embody diverse personas that represent different perspectives on the topic. Each persona then presents its perspective as part of a simulated debate format, resembling a hypothetical discussion between the different stakeholders. Retrieval Augmented Generation is employed to provide substantiating evidence for each argument presented in the debate. This approach allows users to explore a topic through a dialogue that synthesizes multiple perspectives, offering a richer and more nuanced understanding of the topic of interest. The system is designed with a user interface that supports this complex interaction, making it accessible and engaging for users. The development of this system not only advances the field of multi-perspective search but also opens new avenues for potential applications in conversational interfaces, decision-making support systems, and online discussions on digital platforms. This thesis discusses the motivation for multi-perspective search systems, the conceptualization of the proposed approach, the design of the system's interface and architecture, implementation challenges, and potential use-cases of the proposed system, setting a robust foundation for future enhancements and wider application.
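A minimal sketch of the persona-debate-plus-RAG pipeline this abstract describes. The functions `llm_complete` and `retrieve_evidence` are hypothetical placeholders standing in for whatever LLM API and retrieval index the thesis actually uses; only the loop structure is the point.

```python
# Sketch of a persona-based, RAG-grounded debate pipeline (illustrative only).
# `llm_complete` and `retrieve_evidence` are hypothetical stand-ins for the
# LLM and retrieval components described in the abstract.

def llm_complete(prompt: str) -> str:
    """Placeholder for an LLM completion call."""
    return f"[LLM output for: {prompt[:50]}...]"

def retrieve_evidence(query: str, k: int = 2) -> list[str]:
    """Placeholder retriever; a real system would search a document index."""
    return [f"[passage {i} supporting '{query[:40]}']" for i in range(1, k + 1)]

def generate_personas(topic: str, n: int = 3) -> list[str]:
    """Ask the LLM to propose n stakeholder personas with distinct views."""
    llm_complete(f"Propose {n} stakeholder personas with distinct views on: {topic}")
    # A real implementation would parse the LLM output; placeholders suffice here.
    return [f"Persona {i + 1}" for i in range(n)]

def simulate_debate(topic: str, rounds: int = 2) -> list[dict]:
    """Each persona argues its perspective each round, grounded in retrieved evidence."""
    personas = generate_personas(topic)
    transcript = []
    for r in range(rounds):
        for persona in personas:
            argument = llm_complete(
                f"As {persona}, give your round-{r + 1} argument on: {topic}"
            )
            evidence = retrieve_evidence(argument)  # RAG step: substantiate the claim
            transcript.append({"round": r + 1, "persona": persona,
                               "argument": argument, "evidence": evidence})
    return transcript

if __name__ == "__main__":
    for turn in simulate_debate("Should humans colonize space?"):
        print(turn["round"], turn["persona"], turn["argument"])
```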
Item
Exploring multiple perspectives to mitigate cognitive biases through an integrated interface to language models (2024-05)
Wong, Yian; Lease, Matthew A.; Li, Junyi Jessy
In recent years, large language models (LLMs) have demonstrated remarkable abilities in generating human-like text and supporting decision-making processes. However, their use is often limited by inherent biases and a lack of diversity in presented perspectives. This work introduces a novel system designed to mitigate these issues by leveraging the capabilities of LLMs to simulate a multi-perspective debate format, aimed at providing a balanced view on controversial topics. The proposed system employs an integrated interface that facilitates dynamic interactions between multiple AI-generated personas, each representing a distinct viewpoint. These personas engage in structured debates, allowing for a comprehensive exploration of a topic that counteracts the cognitive biases typically associated with single-perspective information retrieval systems. The system incorporates advanced prompt engineering techniques and retrieval-augmented generation to ensure the accuracy and relevance of the information presented. Additionally, the interface is designed with user engagement in mind, featuring interactive elements that allow users to manipulate the debate dynamics and contribute to the discussion. This thesis evaluates the system's effectiveness in enhancing users' understanding of complex issues and its potential to reduce bias in decision support systems. By simulating diverse viewpoints, the system potentially fosters more critical and informed engagement with topics, thus supporting better decision-making.

Item
Exploring protein biochemistry with deep learning (2023-12)
Kulikova, Anastasiya Vitalievna; Wilke, C. (Claus); Davies, Bryan W.; Klivans, Adam R.; Russell, Rick
Deep learning has become widely used in the biological sciences. More specifically, the development of protein deep learning models has leveraged the ever-growing collection of biological data to learn the patterns that govern protein biochemistry. Here, we focus on the assessment of different protein deep learning models to better understand their capabilities, benefits, and drawbacks. Our work aims to provide insights for future protein engineering efforts and for the discovery of protein homologs. In Chapter 2, we assessed the ability of a structure-based protein ML model to make biochemically meaningful predictions and tested whether the model can predict the specific allowed amino acids in a protein. We compared the performance of models trained on different input sizes and correlated model predictions with natural variation in order to better understand how these models learn protein structure and biochemistry. In Chapter 3, we compared the predictions of two structure models and two language models to determine whether different protein representations affect what information each model type learns and how well each performs. Finally, in Chapter 4, we apply a sequence-based protein model to search for antibacterial microcin peptides in bacterial genomes.
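A minimal sketch of the kind of analysis the Chapter 2 summary above describes: correlating per-site amino-acid predictions with natural variation. The arrays below are toy data, and the correlation and agreement measures are illustrative choices rather than the thesis's exact protocol.

```python
# Sketch: comparing a protein model's per-site amino-acid predictions with
# natural variation. Toy data only; real inputs would be model output
# probabilities and frequencies from a multiple sequence alignment.
import numpy as np
from scipy.stats import spearmanr

AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY")
rng = np.random.default_rng(0)

n_sites = 8
# Model-predicted probability of each amino acid at each site (rows sum to 1).
predicted = rng.dirichlet(np.ones(len(AMINO_ACIDS)), size=n_sites)
# Observed amino-acid frequencies at each site from a (toy) alignment.
observed = rng.dirichlet(np.ones(len(AMINO_ACIDS)), size=n_sites)

# Per-site rank correlation between predicted and observed distributions.
per_site_rho = []
for i in range(n_sites):
    rho, _ = spearmanr(predicted[i], observed[i])
    per_site_rho.append(rho)
print("mean per-site Spearman rho:", round(float(np.mean(per_site_rho)), 3))

# Top-1 agreement: how often the model's highest-probability amino acid matches
# the most frequent amino acid observed at that site.
top1 = (predicted.argmax(axis=1) == observed.argmax(axis=1)).mean()
print("top-1 agreement with natural variation:", round(float(top1), 3))
```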
Item
Introducing controlled reasoning into autoregressive large language models (2023-05)
Mersinias, Michail; Li, Junyi Jessy; Mahowald, Kyle
In this thesis, we explore two ways to enhance and optimize the text generation process of autoregressive large language models (LLMs), in particular those with a generative pre-trained transformer (GPT) architecture, which we categorize into GPT and InstructGPT model types. In both cases, our proposed methods attempt to replicate human cognitive behavior and introduce System 2 (controlled) reasoning into the text generation process. For GPT models, we explore incorporating natural language inference (NLI) into the text generation pipeline by using a pre-trained NLI model to assess whether a generated sentence entails, contradicts, or is neutral with respect to the prompt and preceding text. First, we show that the NLI task is predictive of generation errors made by GPT-3. We use these results to develop an NLI-informed generation procedure for GPT-J. We then evaluate these generations by obtaining human annotations on error types and overall quality. We demonstrate that an NLI strategy of maximizing the neutral class yields the highest-quality generated text, significantly better than the vanilla generations, regardless of the nucleus sampling parameter value. For InstructGPT models, we propose constant interaction between two separate instances of the same model: the generator and the critic. We train the critic using Scarecrow, a framework for machine text evaluation that defines ten generation error types. We explore different training procedures and demonstrate that a critic trained with two examples for each error type, as well as chain-of-thought prompting, is highly predictive of generation errors. The critic provides feedback on the location, the reason, and the type of each detected error within the generated text. We conclude that, using this feedback, the generator has the potential to correct its own errors and produce text of higher quality.
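A minimal sketch of an NLI-informed selection step along the lines the abstract describes, assuming the Hugging Face transformers library and the roberta-large-mnli checkpoint. The candidate continuations are hardcoded stand-ins for samples that GPT-J (or another generator) would produce, and the scoring rule is only one plausible reading of "maximizing the neutral class", not the thesis's exact procedure.

```python
# Sketch: score candidate continuations with a pre-trained NLI model and keep
# the one with the highest probability of the NEUTRAL class relative to the
# preceding text. Candidates are hardcoded stand-ins for sampled generations.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

# Look up the NEUTRAL index from the model config instead of hardcoding it.
neutral_id = model.config.label2id["NEUTRAL"]

context = "The museum reopened last week after two years of renovation."
candidates = [
    "Visitors can now see the restored east wing for the first time.",
    "The museum has been closed for a decade and will never reopen.",
    "The museum reopened last week.",
]

def neutral_probability(premise: str, hypothesis: str) -> float:
    """P(neutral) for the (premise, hypothesis) pair under the NLI model."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)[0]
    return probs[neutral_id].item()

scores = [(neutral_probability(context, c), c) for c in candidates]
best_score, best_candidate = max(scores)
for score, cand in scores:
    print(f"{score:.3f}  {cand}")
print("selected:", best_candidate)
```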