An Energy Efficient SMT Processor with Heterogeneous Instruction Set Architectures

K. Yoshimura, T. Nakada, Y. Nakashima, and T. Kitamura (Japan)


Architectures and Heterogeneous Computing


Recently, it has become popular to employ a heterogeneous multi-core processor that combines a conventional core with several VLIW/DSP cores to achieve high performance. This approach not only allows quick integration of embedded OS and multimedia programs but also maintains Quality of Service (QoS) for multimedia applications such as stereo matching. However, from the viewpoint of cost, multi-core processors that increase the chip area by simply incorporating discrete cores are not the best solution. As an energy-efficient processor for embedded systems, we propose an SMT processor named OROCHI. It contains two heterogeneous front-end pipelines, one for the ARM ISA (conventional OS programs) and one for the FR-V ISA (VLIW multimedia applications), and a common back-end pipeline based on a VLIW processor. In this paper, we propose an instruction scheduling and issue mechanism for SMT execution of ARM and FR-V instructions with a VLIW instruction queue. We also propose an asymmetric QoS mechanism to mitigate the performance drop caused by cache-miss stalls. Based on an ASIC implementation with a 0.25 µm cell library, we compared our design with a traditional multi-core processor that heterogeneously contains ARM and FR-V cores. Regarding QoS on the VLIW side, the evaluation results show that, under SMT execution with a high frequency of cache misses, the energy-delay product is 6.0% better with the mechanism than without it.
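
The energy-delay product cited above is conventionally the energy consumed multiplied by the execution time, with lower values being better. The following is a minimal sketch of how a relative improvement in that metric could be computed. The function names and the sample numbers are hypothetical, chosen only for illustration; they are not measurements from the paper, and the abstract does not say how the authors computed their figure.

```python
def energy_delay_product(energy_joules: float, delay_seconds: float) -> float:
    """Return the energy-delay product (EDP = energy * delay); lower is better."""
    if energy_joules < 0 or delay_seconds < 0:
        raise ValueError("energy and delay must be non-negative")
    return energy_joules * delay_seconds


def edp_improvement_percent(baseline_edp: float, candidate_edp: float) -> float:
    """Return how much lower the candidate EDP is than the baseline, in percent."""
    if baseline_edp <= 0:
        raise ValueError("baseline EDP must be positive")
    return (baseline_edp - candidate_edp) / baseline_edp * 100.0


if __name__ == "__main__":
    # Hypothetical values for illustration only (not taken from the paper).
    without_qos = energy_delay_product(energy_joules=2.0, delay_seconds=1.0)
    with_qos = energy_delay_product(energy_joules=1.9, delay_seconds=0.95)
    print(f"EDP improvement: {edp_improvement_percent(without_qos, with_qos):.2f}%")
```

Because the metric is a product, a mechanism can improve it by reducing energy, delay, or both, which is why it is often used when a design trades one for the other.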
