Type of Document: Dissertation
Author: Li, Dong
Author's Email Address: firstname.lastname@example.org
URN: etd-02022011-182442
Title: Scalable and Energy Efficient Execution Methods for Multicore Systems
Degree: PhD
Department: Computer Science
Advisory Committee:
- Cameron, Kirk W. (Committee Chair)
- Nikolopoulos, Dimitrios S. (Committee Co-Chair)
- de Supinski, Bronis R. (Committee Member)
- Feng, Wu-Chun (Committee Member)
- Ma, Xiaosong (Committee Member)
Keywords:
- Performance Modeling and Analysis
- Multicore Processors
- Power-Aware Computing
- Concurrency Throttling
- High-Performance Computing
Date of Defense: 2011-01-26
Availability: unrestricted
Abstract:
Multicore architectures impose great pressure on resource management. The exploration spaces available for resource management increase explosively, especially for large-scale high-end computing systems. The availability of abundant parallelism causes scalability concerns at all levels. Multicore architectures also impose pressure on power management. Growth in the number of cores causes continuous growth in power.
In this dissertation, we introduce methods and techniques to enable scalable and energy-efficient execution of parallel applications on multicore architectures. We study strategies and methodologies that combine dynamic concurrency throttling (DCT) and dynamic voltage and frequency scaling (DVFS) for the hybrid MPI/OpenMP programming model. Our algorithms yield substantial energy savings (8.74% on average and up to 13.8%) with either negligible performance loss or a performance gain (up to 7.5%).
To save additional energy on high-end computing systems, we propose a power-aware MPI task aggregation framework. The framework predicts the effect of task aggregation on both the computation and communication phases of MPI programs, and the resulting impact on their execution time and energy. Our framework provides accurate predictions that lead to substantial energy savings through aggregation (64.87% on average and up to 70.03%) with tolerable performance loss (under 5%).
Aggregating multiple MPI tasks within the same node raises a scalability concern: memory registration for high-performance networking. We propose a new memory registration/deregistration strategy that uses helper threads to reduce registered memory on multicore architectures. We investigate the design policies and performance implications of the helper thread approach. Our method efficiently reduces registered memory (23.62% on average and up to 49.39%) and avoids memory registration/deregistration costs for reused communication memory. Our system enables the execution of application input sets that could not otherwise run to completion under the memory registration limitation.
Filename: Li_Dong_D_2011.pdf
Size: 8.05 Mb
Approximate Download Time (Hours:Minutes:Seconds):
- 28.8 Modem: 00:37:15
- 56K Modem: 00:19:09
- ISDN (64 Kb): 00:16:46
- ISDN (128 Kb): 00:08:23
- Higher-speed Access: 00:00:42