SCHEDULE: NOV 16-22, 2013
AUGEM: Automatically Generate High Performance Dense Linear Algebra Kernels on x86 CPUs
SESSION: Optimizing Numerical Code
EVENT TYPE: Papers
TIME: 3:30PM - 4:00PM
SESSION CHAIR: Naoya Maruyama
AUTHOR(S): Qian Wang, Xianyi Zhang, Yunquan Zhang, Qing Yi
ROOM: 401/402/403
ABSTRACT:
Basic Linear Algebra Subprograms (BLAS) is a fundamental library in scientific computing. In this paper, we present a template-based optimization framework, AUGEM, which can automatically generate fully optimized assembly code for several dense linear algebra (DLA) kernels, such as GEMM, GEMV, AXPY and DOT, on varying multi-core CPUs without requiring any manual intervention from developers. In particular, based on domain-specific knowledge about the algorithms of the DLA kernels, we use a collection of parameterized code templates to formulate a number of commonly occurring instruction sequences within the optimized low-level C code of these DLA kernels. Then, our framework uses a specialized low-level C optimizer to identify instruction sequences that match the predefined code templates and translate them into extremely efficient SSE/AVX instructions. The DLA kernels generated by our template-based approach surpass the implementations in the Intel MKL and AMD ACML BLAS libraries on both Intel Sandy Bridge and AMD Piledriver processors.
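For readers unfamiliar with the four kernels named in the abstract, the following is a minimal reference sketch of what each one computes. It is written in plain Python for readability and is only an illustration of the standard BLAS semantics; it is not AUGEM's generated code, its templates, or its low-level C input, and the function signatures here are simplified assumptions (real BLAS routines update their output operand in place and take stride and layout arguments).

```python
# Reference semantics only: unoptimized, list-based, and returning new
# lists instead of updating the output operand in place as BLAS does.

def axpy(alpha, x, y):
    """AXPY: alpha * x + y for vectors x and y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def dot(x, y):
    """DOT: inner product of vectors x and y."""
    return sum(xi * yi for xi, yi in zip(x, y))

def gemv(alpha, A, x, beta, y):
    """GEMV: alpha * A @ x + beta * y, with A given as a list of rows."""
    return [alpha * dot(row, x) + beta * yi for row, yi in zip(A, y)]

def gemm(alpha, A, B, beta, C):
    """GEMM: alpha * A @ B + beta * C, with matrices as lists of rows."""
    cols_B = list(zip(*B))  # columns of B, so each entry is a row-column dot
    return [
        [alpha * dot(row, col) + beta * cij for col, cij in zip(cols_B, c_row)]
        for row, c_row in zip(A, C)
    ]
```

The inner loops of these kernels all reduce to repeated multiply-and-add steps over contiguous data, which is the kind of commonly occurring instruction sequence the abstract describes matching against templates and mapping to SSE/AVX instructions.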
Chair/Author Details:
Naoya Maruyama (Chair) - RIKEN Advanced Institute for Computational Science
Qian Wang - Institute of Software, Chinese Academy of Sciences
Xianyi Zhang - Institute of Software, Chinese Academy of Sciences
Yunquan Zhang - Institute of Software, Chinese Academy of Sciences
Qing Yi - University of Colorado Colorado Springs
The full paper can be found in the ACM Digital Library
