Benchmarking TPC-DS with Databend
April 11, 2023 · 1 min read
xudong
Benchmarking Databend using TPC-DS
The TPC-DS benchmark is widely used to measure the performance of decision support and analytical systems. Databend is a data warehouse that supports the TPC-DS queries. In this post, we walk through benchmarking Databend with TPC-DS, covering the key steps: generating the TPC-DS data, preparing the CREATE TABLE statements for Databend, and running the benchmark queries.
What's TPC-DS?
TPC-DS is a decision support benchmark that models several generally applicable aspects of a decision support system, including queries and data maintenance. The benchmark provides a representative evaluation of performance as a general purpose decision support system.
It includes 7 fact tables and 17 dimension tables, with an average of 18 columns per table, and 99 test queries.
You can find more information about TPC-DS at https://www.tpc.org/tpcds/.
Running TPC-DS Benchmark on Databend
This section describes the steps to run the TPC-DS benchmark on Databend and provides the related scripts. You can find more detailed information at https://github.com/datafuselabs/databend/tree/main/benchmark/tpcds.
Step 1: Generate TPC-DS test data
Use DuckDB's tpcds extension to generate the TPC-DS data (here at scale factor 1) and export it as pipe-delimited CSV files to TARGET_DIR:
INSTALL tpcds;
LOAD tpcds;
SELECT * FROM dsdgen(sf=1);
EXPORT DATABASE 'TARGET_DIR' (FORMAT CSV, DELIMITER '|');
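Before loading, it can help to confirm that the export looks complete. The sketch below is not part of the Databend scripts. It assumes that the export wrote one file per table named `<table>.csv` into the target directory, and the helper name `check_export` is hypothetical. The table names are the 7 fact and 17 dimension tables defined by TPC-DS. Line counts may include a header line, depending on the export options.

```python
from pathlib import Path

# The 7 TPC-DS fact tables.
FACT_TABLES = [
    "store_sales", "store_returns", "catalog_sales", "catalog_returns",
    "web_sales", "web_returns", "inventory",
]

# The 17 TPC-DS dimension tables.
DIMENSION_TABLES = [
    "store", "call_center", "catalog_page", "web_site", "web_page",
    "warehouse", "customer", "customer_address", "customer_demographics",
    "date_dim", "household_demographics", "item", "income_band",
    "promotion", "reason", "ship_mode", "time_dim",
]


def check_export(target_dir):
    """Return (missing_tables, line_counts) for an exported directory.

    Assumption: each table was exported to <target_dir>/<table>.csv.
    """
    root = Path(target_dir)
    missing = []
    line_counts = {}
    for table in FACT_TABLES + DIMENSION_TABLES:
        path = root / f"{table}.csv"
        if not path.is_file():
            missing.append(table)
            continue
        # Count physical lines; this may include a header line.
        with path.open("rb") as f:
            line_counts[table] = sum(1 for _ in f)
    return missing, line_counts
```

Calling `check_export("TARGET_DIR")` returns the tables with no CSV file and a line count for each file that was found. An empty first element suggests all 24 tables were exported.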
Step 2: Load TPC-DS data into Databend
./load_data.sh
Step 3: Run TPC-DS queries
databend-sqllogictests --handlers mysql --database tpcds --run_dir tpcds --bench
🎈Connect With Us
Databend is a cutting-edge, open-source cloud-native warehouse built with Rust, designed to handle massive-scale analytics.
Join the Databend Community to try, get help, and contribute!