Title page for ETD etd-09172015-154213

Type of Document Master's Thesis
Author Moore, Sean Ryan
Author's Email Address sean8051@vt.edu
URN etd-09172015-154213
Title Mutex Locking versus Hardware Transactional Memory: An Experimental Evaluation
Degree Master of Science
Department Electrical and Computer Engineering
Advisory Committee
Advisor Name Title
Binoy Ravindran Committee Chair
Dongyoon Lee Committee Member
Roberto Palmieri Committee Member
  • Hardware Transactional Memory
  • Global Lock
  • GNU C Library
Date of Defense 2015-09-16
Availability unrestricted
It has historically been the case that CPUs have run programs ever faster without significant intervention on the behalf of the programmer. However, this "free lunch" has largely ended due to the end of exponentially increasing core frequency and the current slow increase in instruction-level parallelism but continues to a degree in cache size improvements. But since Moore's law still largely continues "lunch", i.e. program performance, can still be bought at the price of rewriting code for multiple cores, which is enabled by the trend Moore's law describes. Multicore architectures cannot aid performance for problems whose solutions are necessarily sequential in nature and writing efficient and correct concurrent programs is not easy in all cases when using synchronization methods like fine-grained mutex locks.

Transactional memory, and its implementation as hardware transactional memory, allow programmers to write concurrent applications without the attendant complexity of programming with mutex locks. This allows programmers to focus on optimizing the application for performance. Given that transactions can run two segments of code in parallel that a mutex lock would force to run sequentially and that transactions can abort, causing a program to do the same work more than once, whether transactions perform better or worse than mutex locks is dependent on the program's execution profile and the coarseness or fineness at which mutex locks are used.

In this thesis the GNU C Library's futex implementation of mutex locks and Intel's Restricted Transactional Memory have been compared and the behavior of those transactions has been analyzed. This analysis includes a pathological behavior permitted by the GNU C Library's hardware transactional memory implementation of mutex locks. The tradeoffs between fine-grained and global locking implementations have been discussed, compared, and used in the context of fallback locks for hardware transactions. This thesis provides evidence to the effect that fine-grained locking is not critical for program performance and that in many cases global locking and hardware transactions can provide nearly equivalent performance without the programming difficulties. This work has shown that across the 23 applications examined, with relation to their original locking implementation, a global locking scheme without elision has a 0.96x speedup, Intel's Restricted Transactional Memory (RTM) with the application's original locks as a fallback has a 1.01x speedup and with global lock fallback RTM has a speedup of 0.97x.

This work is supported in part by NAVSEA/NEEC under grant 3003279297. Any opinions, findings, and conclusions or recommendations expressed in this thesis are those of the author and do not necessarily reflect the views of NAVSEA.

  Filename       Size       Approximate Download Time (Hours:Minutes:Seconds) 
 28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access 
  Moore_SR_T_2015.pdf 790.86 Kb 00:03:39 00:01:52 00:01:38 00:00:49 00:00:04

Browse All Available ETDs by ( Author | Department )

dla home
etds imagebase journals news ereserve special collections
virgnia tech home contact dla university libraries

If you have questions or technical problems, please Contact DLA.