Title page for ETD etd-05202014-193503


Type of Document Master's Thesis
Author Lyerly, Robert Frantz
Author's Email Address rlyerly@vt.edu
URN etd-05202014-193503
Title Automatic Scheduling of Compute Kernels Across Heterogeneous Architectures
Degree Master of Science
Department Electrical and Computer Engineering
Advisory Committee
Advisor Name Title
Binoy Ravindran Committee Chair
Cameron Patterson Committee Member
Paul Plassmann Committee Member
Keywords
  • High-Performance Computing
  • Runtime Systems
  • Heterogeneous Architectures
  • Compilers
  • Scheduling
Date of Defense 2014-05-07
Availability unrestricted
Abstract
High-performance computing has shifted from increasing single-core performance to extracting performance from heterogeneous multi- and many-core processors because of the power, memory, and instruction-level parallelism walls. All trends, from smartphones to servers, point towards increased processor heterogeneity as a means of increasing application performance. These architectures are designed for different types of applications: traditional “big” CPUs (like the Intel Xeon) are optimized for low latency, while other architectures (such as the NVIDIA Tesla K20x) are optimized for high throughput. These architectures have different tradeoffs and different performance profiles, which can mean large performance gains for the right types of applications. However, applications that are ill-suited to a given architecture may experience significant slowdown; therefore, it is imperative that applications are scheduled onto the correct processor.

To perform this scheduling, applications must be analyzed to determine their execution characteristics. Traditionally, this application-to-hardware mapping was determined statically by the programmer. However, this requires intimate knowledge of the application and the underlying architecture, and precludes load balancing by the system. We demonstrate and empirically evaluate a system for automatically scheduling compute kernels by extracting program characteristics and applying machine learning techniques. We develop a machine learning process that is system-agnostic and works in a variety of contexts (e.g., embedded, desktop/workstation, server). Finally, we perform scheduling in a workload-aware and workload-adaptive manner for these compute kernels.

Files
  Filename                Size       Approximate Download Time (Hours:Minutes:Seconds)
                                     28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  Lyerly_RF_T_2014_2.pdf  916.72 Kb  00:04:14     00:02:10    00:01:54       00:00:57        00:00:04
