Braids: out-of-order performance with almost in-order complexity

Tseng, Francis, 1976-

dc.contributor.advisor	Patt, Yale N.	en
dc.creator	Tseng, Francis, 1976-	en
dc.date.accessioned	2008-08-29T00:05:15Z	en
dc.date.available	2008-08-29T00:05:15Z	en
dc.date.issued	2007-12	en
dc.description.abstract	There is still much performance to be gained by out-of-order processors with wider issue widths. However, traditional methods of increasing issue width do not scale; that is, they drastically increase design complexity and power requirements. This dissertation introduces the braid, a compile-time generated entity that enables the execution core to scale to wider widths by exploiting the small fanout and short lifetime of values produced by the program. A braid captures dataflow and register usage information of the program which are known to the compiler but are not traditionally conveyed to the microarchitecture through the instruction set architecture. Braid processing requires identification by the compiler, minor augmentations to the instruction set architecture, and support by the microarchitecture. The execution core of the braid microarchitecture consists of a number of braid execution units (BEUs). The BEU is tailored to efficiently carry out the execution of a braid in an in-order fashion. Each BEU consists of a FIFO scheduler, a busy-bit vector, two functional units, and a small internal register file. The braid microarchitecture provides a number of opportunities for the reduction of design complexity. It reduces the port requirements of the renaming mechanism, it simplifies the steering process, it reduces the area, size, and port requirements of the register file, and it reduces the paths and port requirements of the bypass network. The complexity savings result in a design characterized by a lower power requirement, a shorter pipeline, and a higher clock frequency. On an 8-wide design, the result from executing braids is performance within 9% of a very aggressive conventional out-of-order microarchitecture with the complexity of an in-order implementation. Three bottlenecks are identified in the braid microarchitecture and a solution is presented to address each. The limitation on braid size is addressed by dynamic merging. The underutilization of braid execution resources caused by long-latency instructions is addressed by context sharing. The poor utilization of braid execution resources caused by single-instruction braids is addressed by heterogeneous execution resources.
dc.description.department	Electrical and Computer Engineering	en
dc.format.medium	electronic	en
dc.identifier.oclc	212384221	en
dc.identifier.uri	http://hdl.handle.net/2152/3710	en
dc.language.iso	eng	en
dc.rights	Copyright © is held by the author. Presentation of this material on the Libraries' web site by University Libraries, The University of Texas at Austin was made possible under a limited license grant from the author who has retained all copyrights in the works.	en
dc.subject.lcsh	Computer architecture	en
dc.subject.lcsh	Compilers (Computer programs)	en
dc.title	Braids: out-of-order performance with almost in-order complexity	en
dc.title.alternative	Out-of-order performance with almost in-order complexity	en
dc.type.genre	Thesis	en
thesis.degree.department	Electrical and Computer Engineering	en
thesis.degree.discipline	Electrical and Computer Engineering	en
thesis.degree.grantor	The University of Texas at Austin	en
thesis.degree.level	Doctoral	en
thesis.degree.name	Doctor of Philosophy	en

Access full-text files

Original bundle

Name: tsengf71786.pdf
Size: 558.81 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.65 KB
Format: Plain Text

Collections

UT Electronic Theses and Dissertations