SC13 Presentation

SCHEDULE: NOV 16-22, 2013

AUGEM: Automatically Generate High Performance Dense Linear Algebra Kernels on x86 CPUs

SESSION: Optimizing Numerical Code

EVENT TYPE: Papers

TIME: 3:30PM - 4:00PM

SESSION CHAIR: Naoya Maruyama

AUTHOR(S): Qian Wang, Xianyi Zhang, Yunquan Zhang, Qing Yi

ROOM: 401/402/403

ABSTRACT:
Basic Linear Algebra Subprograms (BLAS) is a fundamental library in scientific computing. In this paper, we present a template-based optimization framework, AUGEM, which can automatically generate fully optimized assembly code for several dense linear algebra (DLA) kernels, such as GEMM, GEMV, AXPY and DOT, on various multi-core CPUs without requiring any manual intervention from developers. In particular, based on domain-specific knowledge about the algorithms of the DLA kernels, we use a collection of parameterized code templates to formulate a number of commonly occurring instruction sequences within the optimized low-level C code of these DLA kernels. Then, our framework uses a specialized low-level C optimizer to identify instruction sequences that match the predefined code templates and thereby translates them into extremely efficient SSE/AVX instructions. The DLA kernels generated by our template-based approach surpass the implementations of Intel MKL and AMD ACML BLAS libraries, on both Intel Sandy Bridge and AMD Piledriver processors.
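
For readers unfamiliar with the four kernels named in the abstract, the following is a minimal, unoptimized sketch in plain C of what each one computes. It is not AUGEM's input or output code. The function names (`ref_axpy`, `ref_dot`, `ref_gemv`, `ref_gemm`) are hypothetical, and the row-major matrix layout is an assumption made for brevity (the reference BLAS uses column-major storage). The multiply-and-accumulate loop bodies are the kind of repeated pattern an optimizer could plausibly match and vectorize.

```c
#include <stddef.h>

/* Hypothetical reference kernels: illustrative only, not taken from AUGEM. */

/* AXPY: y = alpha * x + y, for vectors of length n. */
void ref_axpy(size_t n, double alpha, const double *x, double *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}

/* DOT: returns the sum over i of x[i] * y[i]. */
double ref_dot(size_t n, const double *x, const double *y)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}

/* GEMV: y = alpha * A * x + beta * y, with A an m x n matrix.
 * Assumption: A is stored row-major, so A(i, j) is a[i * n + j]. */
void ref_gemv(size_t m, size_t n, double alpha, const double *a,
              const double *x, double beta, double *y)
{
    for (size_t i = 0; i < m; ++i) {
        double acc = 0.0;
        for (size_t j = 0; j < n; ++j)
            acc += a[i * n + j] * x[j];
        y[i] = alpha * acc + beta * y[i];
    }
}

/* GEMM: C = alpha * A * B + beta * C, with A m x k, B k x n, C m x n.
 * Assumption: all three matrices are stored row-major. */
void ref_gemm(size_t m, size_t n, size_t k, double alpha, const double *a,
              const double *b, double beta, double *c)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}
```

Production BLAS libraries reach high performance by blocking these loops for the cache hierarchy and mapping the inner multiply-add sequences onto SIMD instructions such as SSE and AVX. The naive loops above only fix the semantics that any optimized version must preserve.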

Chair/Author Details:

Naoya Maruyama (Chair) - RIKEN Advanced Institute for Computational Science

Qian Wang - Institute of Software, Chinese Academy of Sciences

Xianyi Zhang - Institute of Software, Chinese Academy of Sciences

Yunquan Zhang - Institute of Software, Chinese Academy of Sciences

Qing Yi - University of Colorado Colorado Springs

The full paper can be found in the ACM Digital Library