Towards real-time HW/SW co-simulation with operating system support
A trend in the consumer electronics market is the demand for new applications that closely resemble older ones but impose more challenging, special-purpose performance requirements. In the digital signal processing (DSP) industry, this reflects a transition from the design regime of general-purpose DSP to that of application-specific DSP. From the design perspective, the DSP core remains unchanged, but more and more hardware (HW) accelerators, DMAs, and bus architectures need to be integrated into the chip. A key to effecting this transition is the engineering capability to make sure that the design specification "matches" the application before detailed design starts. Therefore, application software (SW) needs to be developed in parallel with the HW to verify the design specification at the system level. Enabling development and simulation of SW before the actual HW is available also reduces the time to market, which is another important benefit.

HW/SW co-simulation for design specification refinement imposes many challenging requirements on the simulation platform. The simulation components (simcoms) that model the real HW (rhw) modules to be designed must be integrated with the application SW to carry out the simulation at the system level. Simulation results need to be accurate. Simulation speed should allow fast design space exploration and ease the debugging of complex application SW. HW and SW problems should be isolated cleanly, since HW and SW engineers often lack expertise in one another's domains. The simulator should be cost-effective. These requirements often conflict with one another. For example, achieving high simulation accuracy typically requires simulating at a low level, which makes the simulation slow. A simulator that allows integration of simcoms and application SW is typically very expensive, so only very few engineers can use it.
In many cases, simcoms and application SW are not written in the same programming language. Interfacing them is not trivial and often severely reduces the simulation speed. Using a single simulator requires the engineers to understand both HW and SW details, which violates the requirement of HW/SW problem isolation. The bottom line is that no single simulator can fulfill all these requirements at the same time.

This dissertation describes three simulation tools for different usages. The first one models and simulates the real-time operating system (RTOS) together with the application SW. It is motivated by the fact that, with the appearance of high-performance DSPs, more and more tasks will be implemented as SW on a single DSP managed by an RTOS. Selecting the "right" RTOS before the SW is developed is very important. The tool is built on SystemC and is configurable to support modeling and timed simulation of most popular embedded RTOSes. Timing fidelity is achieved by using delay annotation. The OS timing information is derived from published benchmark data. Application timing information can be profiled or estimated from similar legacy applications. An optimized conservative approach is taken to synchronize simcoms. Compared to other research work, an important contribution of this tool is an online algorithm for predicting the timestamp of the next event, based on the realistic assumption that multiple tasks execute concurrently on a processor, managed by a static or dynamic priority-driven scheduler. The simulation is more than 3 orders of magnitude faster than a commercial instruction set simulator (ISS), with comparable accuracy. The tool is used to assist in generating an initial design specification.

The second tool is a system data flow simulator (SDFS), which is used by the HW engineers to refine the HW specifications.
It models the application as a parameter-driven conditional data flow graph (CDFG) at the transaction level and the HW as a configurable HW graph at the cycle-accurate level. SDFS takes the application CDFG and the HW graph as input and carries out the simulation to capture detailed HW activities, e.g., bus arbitration. It only requires the HW engineers to understand the application at the CDFG level. To carry out system simulation at such a low level, many commercial simulators need to couple an ISS for the application SW with an RTL simulator for the simcoms, a combination that is typically 6 orders of magnitude slower than the rhw. The simulation error of SDFS is within 5% in most cases and within 13% in the worst case, which is comparable to the ISS+RTL approach, yet SDFS is only 4 orders of magnitude slower than the rhw. Compared to similar research work that also models the system at the CDFG level, SDFS can achieve higher simulation accuracy because of the following advantages: 1) it does not need a fixed application trace as input and thus is flexible enough to cover many simulation scenarios; 2) it does not assume a fixed cost for each functional block and thus is able to estimate the system performance under actual execution conditions; and 3) it is able to model the pipelined architecture common in modern DSPs. The proposed simulator is cost-effective since it is implemented in the SystemC language and can be executed on most PCs and workstations.

The third tool is a real-time simulation platform (RTSP) implemented on legacy DSPs. To the best of our knowledge, this is the first simulator that truly enables the application SW to be developed in parallel with the HW by offering the same SW development environment as if the rhw were available. To simulate the behavior of a rhw module, a corresponding simcom is constructed to run on a legacy DSP.
The success of this simulation strategy hinges on a novel way of applying the concept of Real-Time Virtual Machines to simulation. Each legacy DSP employs a two-level scheduler to ensure that each simcom carries out the simulation at a fixed speed proportional to that of the rhw, so that any job that would finish at time t on the rhw will finish no later than t + Δ, where Δ is a constant bound. Such a feature eliminates expensive synchronization between the simcoms. RTSP is proven to perform simulations faithfully and is shown experimentally to be effective for real industry applications. For a rhw whose timing behavior can be accurately modeled by the SW behavior model, the simulation error is shown to be below 5%. For very complicated rhw whose timing cannot be accurately captured by the behavior model, the simulation accuracy was shown to be excellent in the average case. The simulation speed is quite fast: for the selected audio and video applications, simulation is only 10X and 30X slower than rhw execution, respectively. The RTSP platform is practically zero-cost since legacy EVM boards can be reused for simulation.

RTSP and SDFS complement each other. RTSP carries out the simulation at a higher level than SDFS and usually cannot capture bus activities at every cycle. The information collected from SDFS determines the appropriate rate settings for simcoms to compensate for resource contention. RTSP allows SW engineers to optimize the algorithm and suggest improvements to the HW architecture. Suggested changes are fed to SDFS to refine the design specification.
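
As an illustration of the constant lateness bound stated above for RTSP (written Δ here), the following minimal Python sketch models only the top level of a two-level scheduler. It rests on assumptions that are not taken from the dissertation: the top-level scheduler is a simple periodic budget allocator, every simcom receives a fixed budget of DSP ticks in each period, and the function names `finish_time` and `scaled_lag` are hypothetical. Under these assumptions the lateness of any job, measured after scaling wall-clock time by the simcom's rate, stays below the simcom's per-period budget, which plays the role of Δ.

```python
from fractions import Fraction


def finish_time(work, budgets, period, index):
    """Wall-clock tick at which simcom `index` completes `work` ticks of DSP time.

    Assumed top-level policy (hypothetical, not from the source): in every
    period the simcoms run in a fixed order and simcom i gets budgets[i] ticks.
    """
    if work <= 0:
        raise ValueError("work must be a positive number of ticks")
    if sum(budgets) > period:
        raise ValueError("budgets exceed the period")
    budget = budgets[index]
    offset = sum(budgets[:index])          # start of this simcom's slot in each period
    full_slots, rest = divmod(work, budget)
    if rest == 0:                          # job ends exactly at the end of a slot
        full_slots, rest = full_slots - 1, budget
    return full_slots * period + offset + rest


def scaled_lag(work, budgets, period, index):
    """Lateness of the job in rhw time units after scaling by the simcom's rate."""
    rate = Fraction(budgets[index], period)  # fraction of the DSP given to this simcom
    wall = finish_time(work, budgets, period, index)
    return wall * rate - work


if __name__ == "__main__":
    budgets, period = [2, 3, 5], 10          # illustrative values only
    for i, budget in enumerate(budgets):
        worst = max(scaled_lag(w, budgets, period, i) for w in range(1, 201))
        print(f"simcom {i}: worst scaled lag {worst} (bound {budget})")
```

The sketch omits the second scheduling level (ordering jobs inside one simcom) and any modeling of bus contention. Its only purpose is to show why a fixed rate per simcom can yield a bound that does not grow with the length of the job.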