Introducing controlled reasoning into autoregressive large language models






In this thesis, we explore two ways to enhance and optimize the text generation process of autoregressive large language models (LLMs), in particular those with a generative pre-trained transformer (GPT) architecture, which we categorize into GPT and InstructGPT model types. In both cases, our proposed methods attempt to replicate human cognitive behavior by introducing System 2 (controlled) reasoning into the text generation process. For GPT models, we incorporate natural language inference (NLI) into the text generation pipeline by using a pre-trained NLI model to assess whether a generated sentence entails, contradicts, or is neutral to the prompt and preceding text. First, we show that the NLI task is predictive of generation errors made by GPT-3. We use these results to develop an NLI-informed generation procedure for GPT-J. We then evaluate these generations by obtaining human annotations of error types and overall quality. We demonstrate that an NLI strategy of maximizing the neutral class yields the highest-quality generated text, significantly better than the vanilla generations regardless of the nucleus sampling parameter value. For InstructGPT models, we propose constant interaction between two separate instances of the same model: a generator and a critic. We train the critic using Scarecrow, a framework for machine-text evaluation that defines ten generation error types. We explore different training procedures and demonstrate that a critic trained with two examples per error type, together with chain-of-thought prompting, is highly predictive of generation errors. The critic provides feedback on the location, reason, and type of each detected error within the generated text. We conclude that, using this feedback, the generator has the potential to correct its own errors and produce higher-quality text.
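The NLI-informed generation procedure described above can be sketched as a re-ranking step: sample several candidate continuations, score each against the preceding context with an NLI classifier, and keep the candidate whose "neutral" probability is highest. The sketch below is illustrative only, not the thesis's actual implementation; `nli_scores` is a hypothetical stand-in for a pre-trained NLI model (e.g., a RoBERTa model fine-tuned on MNLI) and uses a toy word-overlap heuristic so the example is self-contained.

```python
def nli_scores(premise: str, hypothesis: str) -> dict[str, float]:
    """Return pseudo-probabilities for entailment / neutral / contradiction.

    Stub for illustration: a real system would run a pre-trained NLI
    classifier on the (premise, hypothesis) pair. Here, less word overlap
    with the context is treated as "more neutral" (toy heuristic only).
    """
    overlap = len(set(premise.split()) & set(hypothesis.split()))
    neutral = 1.0 / (1.0 + overlap)
    other = (1.0 - neutral) / 2.0
    return {"entailment": other, "neutral": neutral, "contradiction": other}


def pick_most_neutral(context: str, candidates: list[str]) -> str:
    """Select the candidate continuation whose NLI 'neutral' probability
    with respect to the context is highest (the strategy the thesis
    reports as producing the highest-quality text)."""
    return max(candidates, key=lambda c: nli_scores(context, c)["neutral"])
```

In a full pipeline, `candidates` would come from nucleus sampling the generator several times at the same decoding step, so the re-ranking operates on top of, and independently of, the sampling parameter.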

