PGCon2017 – 20180510
The PostgreSQL Conference
|Day||Talks – Day 2 – 2017-05-26|
|Language used for presentation||English|
Currently, PostgreSQL executes SQL queries using an interpreter, which is quite slow. However, a significant speedup can be achieved by compiling queries “on the fly” with the LLVM JIT: for a given SQL query it is possible to generate more efficient code that is specialized using information known at run time, and then to optimize it further with LLVM. This approach is especially important for complex queries whose performance is CPU-bound.
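To make the contrast between interpretation and query-specific code concrete, here is a minimal sketch in Python. It is only an analogy: Python's built-in `compile` stands in for LLVM, and the tuple-based expression tree and the names `interpret`, `to_source` and `compile_expr` are hypothetical, not part of PostgreSQL or of the work described in the talk.

```python
import operator

# Hypothetical expression tree: ("col", name), ("const", value) or (op, left, right).
OPS = {"+": operator.add, "*": operator.mul, "<": operator.lt}

def interpret(expr, row):
    """Tree-walking interpreter: inspects every node again for every row."""
    kind = expr[0]
    if kind == "col":
        return row[expr[1]]
    if kind == "const":
        return expr[1]
    return OPS[kind](interpret(expr[1], row), interpret(expr[2], row))

def to_source(expr):
    """Translate the tree into Python source text, once per query."""
    kind = expr[0]
    if kind == "col":
        return f"row[{expr[1]!r}]"
    if kind == "const":
        return repr(expr[1])
    return f"({to_source(expr[1])} {kind} {to_source(expr[2])})"

def compile_expr(expr):
    """Build a function specialized to this one expression (stand-in for JIT)."""
    code = compile(f"lambda row: {to_source(expr)}", "<query>", "eval")
    return eval(code)

# WHERE price * qty < 100
predicate = ("<", ("*", ("col", "price"), ("col", "qty")), ("const", 100))
rows = [{"price": 10, "qty": 5}, {"price": 30, "qty": 4}]

compiled = compile_expr(predicate)
# Both strategies must agree; only the per-row dispatch overhead differs.
assert [interpret(predicate, r) for r in rows] == [compiled(r) for r in rows]
```

The compiled function no longer branches on node kinds for each row, because the shape of the expression was fixed when the code was generated. That is the kind of run-time specialization the paragraph above refers to, here at toy scale.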
In the talk we’ll discuss how such dynamic compilation with the LLVM compiler infrastructure can be used to speed up various stages of SQL query execution:
- compiling expressions;
- compiling Scan, Aggregation, Sort and Join methods of Executor tree nodes;
- compiling indexing method;
- saving compiled native code for PREPARED statements to speed up OLTP queries.
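The last item in the list can be illustrated with a small sketch. It assumes that compilation happens once when a statement is prepared and that each execution reuses the saved result; the class `PreparedStatementCache` and the toy compiler `toy_compile` are hypothetical and do not reflect the actual PostgreSQL implementation.

```python
class PreparedStatementCache:
    """Keeps one compiled callable per prepared statement name (hypothetical)."""

    def __init__(self, compile_fn):
        self._compile_fn = compile_fn   # expensive step, e.g. a JIT compiler
        self._compiled = {}             # statement name -> compiled callable
        self.compilations = 0           # counts how often we paid for compiling

    def prepare(self, name, query_text):
        # Compile once, at PREPARE time.
        self._compiled[name] = self._compile_fn(query_text)
        self.compilations += 1

    def execute(self, name, *params):
        # EXECUTE only looks up and runs the saved code.
        return self._compiled[name](*params)

def toy_compile(text):
    """Stand-in compiler: turns an expression over x into a Python function."""
    return eval(compile(f"lambda x: {text}", "<prepared>", "eval"))

cache = PreparedStatementCache(toy_compile)
cache.prepare("double_plus_one", "x * 2 + 1")
results = [cache.execute("double_plus_one", v) for v in (1, 2, 3)]
```

For short OLTP-style queries the cost of compiling can exceed the cost of running, so paying it once per prepared statement rather than once per execution is what makes compilation worthwhile in that setting.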
We’ll also show the architectural changes that PostgreSQL needs for dynamic compilation to be effective. For example, we had to switch from PostgreSQL’s original “pull” iteration model to a “push” model, because the former didn’t allow the JIT compiler to optimize the code effectively (due to virtual calls and internal state being saved in Executor tree nodes).
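The difference between the two iteration models can be sketched as follows. This is a simplified Python illustration with hypothetical names (`Scan`, `Filter`, `push_scan`, `push_filter`), not the Executor code itself: in the pull model each node exposes a `next()` method and keeps its position between calls, while in the push model the producer owns the loop and hands each row to its consumer.

```python
# Pull model: the consumer repeatedly asks its child for the next row.
class Scan:
    def __init__(self, rows):
        self.rows, self.pos = rows, 0   # state saved in the node between calls

    def next(self):
        if self.pos >= len(self.rows):
            return None                 # None marks the end of the stream
        row = self.rows[self.pos]
        self.pos += 1
        return row

class Filter:
    def __init__(self, child, pred):
        self.child, self.pred = child, pred

    def next(self):
        # One indirect call into the child per row.
        while (row := self.child.next()) is not None:
            if self.pred(row):
                return row
        return None

def pull_sum(node):
    total = 0
    while (row := node.next()) is not None:
        total += row
    return total

# Push model: the producer drives one loop and passes rows downstream.
def push_scan(rows, consume):
    for row in rows:
        consume(row)

def push_filter(pred, consume):
    def on_row(row):
        if pred(row):
            consume(row)
    return on_row

def push_sum(rows, pred):
    total = 0
    def add(row):
        nonlocal total
        total += row
    push_scan(rows, push_filter(pred, add))
    return total

data = [1, 2, 3, 4, 5, 6]
is_even = lambda v: v % 2 == 0
assert pull_sum(Filter(Scan(data), is_even)) == push_sum(data, is_even)
```

In the push version the whole pipeline is a single loop with nested calls, which a compiler can in principle inline into one tight function. The pull version has to save and restore a position in every node on each call, which is the obstacle the paragraph above describes.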
We’ll also discuss methods for reusing the original PostgreSQL source code to build both the JIT compiler and the interpreter, so as to avoid manually rewriting the PostgreSQL backend and the Executor node implementations with the LLVM API.
As a result, we have achieved a significant speedup on the TPC-H benchmark: with JIT compilation of expressions the speedup is 20% (the source code is available at github.com/ispras/postgres). With all of the described optimization techniques applied at the different query execution stages, the speedup is up to 5x on TPC-H tests.