Thermal-/variation-aware task-level scheduling for multicore processors

 

Dr. Chen-Yong Cher

IBM T.J. Watson Research Center

New York

 

Abstract

 

This talk focuses on our ongoing research on the thermal characterization of a specially selected, defective POWER5 chip that exhibits thermal variability. The characterization uses infrared thermal imaging, on-chip thermal sensors, and performance counters. The talk also covers software-level techniques to mitigate on-chip hotspots.

 

On-chip hotspots increase cooling cost and, if not suitably addressed, reduce chip reliability. In our prior work, published at ISLPED 2007, we investigated the general trade-offs among temporal and spatial hotspot mitigation schemes, thermal time constants, workload variations, and microprocessor power distributions. By leveraging spatial and temporal heat slack, our schemes lower on-chip unit temperatures by changing the workload in a timely manner, using Operating System (OS) and existing hardware support.

 

With technologies below 65nm, high-performance microprocessor architectures are more likely to be affected by increased process variability. Our analysis indicates (i) an increase in the magnitude of variation and (ii) non-negligible variability at finer granularities, such as among cores, functional units, or even transistors on the same processor die. The increased variation causes both performance degradation and yield loss. We will describe an extension of our thermal-aware system software that alleviates variability problems among the cores of a multi-core architecture.

 

Speaker’s Biographical Sketch

 

Chen-Yong Cher received his B.S. degree in Computer Engineering in 1998 and his M.S. and Ph.D. degrees in Electrical and Computer Engineering in 2000 and 2004, respectively, from Purdue University at West Lafayette, Indiana. He is currently a research staff member at the IBM T.J. Watson Research Center in New York, in the Reliability- and Power-aware Micro-architecture Group. His research has covered a thermal-aware Linux scheduler, on-chip power management for multi-core processors, and, more recently, off-loading Java garbage collection using Cell SPUs. His current research efforts focus on future Blue Gene systems.

 

DATE:  April 9, 2008

TIME:  12:00-12:10 pm, refreshments; 12:10-1:00 pm, seminar

LOCATION:  426 Benedum Hall