# gpu-bdb
gpu-bdb is derived from TPCx-BB. Any results based on gpu-bdb are considered unofficial results, and per TPC policy, cannot be compared against official TPCx-BB results.
The GPU Big Data Benchmark (gpu-bdb) is a RAPIDS-based benchmark for enterprises that includes 30 queries representing real-world ETL & ML workflows at various "scale factors": SF1000 is 1 TB of data, SF10000 is 10 TB. Each "query" is in fact a model workflow that can include SQL, user-defined functions, careful sub-setting and aggregation, and machine learning.
We provide a conda environment definition specifying all RAPIDS dependencies needed to run our query implementations. To install and activate it:
```bash
CONDA_ENV="rapids-gpu-bdb"
conda env create --name $CONDA_ENV -f gpu-bdb/conda/rapids-gpu-bdb.yml
conda activate rapids-gpu-bdb
```
This repository includes a small local module containing utility functions for running the queries. You can install it with the following:
```bash
cd gpu-bdb/gpu_bdb
python -m pip install .
```
This will install a package named `bdb-tools` into your Conda environment. It should look like this:

```bash
conda list | grep bdb
bdb-tools                 0.2                      pypi_0    pypi
```
Note that this Conda environment needs to be replicated or installed manually on all nodes, so that a dask-cuda-worker can be started on each node.
Queries 10, 18, and 19 depend on two static files (negativeSentiment.txt, positiveSentiment.txt). As we cannot redistribute those files, you should download the tpcx-bb toolkit and extract them to your data directory on your shared filesystem:
```bash
jar xf bigbenchqueriesmr.jar
cp tpcx-bb1.3.1/distributions/Resources/io/bigdatabenchmark/v1/queries/q10/*.txt ${DATA_DIR}/sentiment_files/
```
For Query 27, we rely on spaCy. To download the necessary language model after activating the Conda environment:

```bash
python -m spacy download en_core_web_sm
```
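As a quick sanity check (a minimal sketch; the sample sentence is arbitrary), you can confirm the model is available from Python:

```python
# Minimal check that the spaCy model downloaded above is importable.
import spacy

nlp = spacy.load("en_core_web_sm")  # raises OSError if the download step was skipped
doc = nlp("The GPU Big Data Benchmark includes 30 queries.")
print([token.text for token in doc])
```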
We use the `dask-scheduler` and `dask-cuda-worker` command line interfaces to start a Dask cluster. We provide a `cluster_configuration` directory with a bash script to help you set up an NVLink-enabled cluster using UCX.
Before running the script, you'll need to make changes specific to your environment.
In `cluster_configuration/cluster-startup.sh`:

- Update `GPU_BDB_HOME=...` to the location of this repository on disk.
- Update `CONDA_ENV_PATH=...` to refer to your conda environment path.
- Update `CONDA_ENV_NAME=...` to refer to the name of the conda environment you created, perhaps using the `yml` files provided in this repository.
- Update `INTERFACE=...` to refer to the relevant network interface present on your cluster (see the sketch after this list for one way to enumerate interfaces).
- Update `CLUSTER_MODE="TCP"` to refer to your communication method, either "TCP" or "NVLINK". You can also configure this as an environment variable.
- You may also need to change the `LOCAL_DIRECTORY` and `WORKER_DIR` depending on your filesystem. Make sure that these point to a location to which you have write access and that `LOCAL_DIRECTORY` is accessible from all nodes.
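If you are unsure which value to use for `INTERFACE`, the sketch below is one way to list the interfaces visible on a node. It uses `psutil` (installed as a dependency of `distributed`); this step is optional and not part of the cluster scripts:

```python
# List network interfaces and their addresses; pick the one that carries
# your cluster's high-speed network as the value for INTERFACE.
import psutil

for name, addrs in psutil.net_if_addrs().items():
    print(name, [addr.address for addr in addrs])
```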
To start up the cluster on your scheduler node, please run the following from `gpu_bdb/cluster_configuration/`. This will spin up a scheduler and one Dask worker per GPU.

```bash
DASK_JIT_UNSPILL=True CLUSTER_MODE=NVLINK bash cluster-startup.sh SCHEDULER
```

Note: Don't use `DASK_JIT_UNSPILL` when running BlazingSQL queries.
Then run the following on every other node from `gpu_bdb/cluster_configuration/`:

```bash
bash cluster-startup.sh
```

This will spin up one Dask worker per GPU. If you are running on a single node, you only need to run `bash cluster-startup.sh SCHEDULER`.
If you are using a Slurm cluster, please adapt the example Slurm setup in `gpu_bdb/benchmark_runner/slurm/`, which uses `gpu_bdb/cluster_configuration/cluster-startup-slurm.sh`.
To run a query, starting from the repository root, go to the query-specific subdirectory. For example, to run q07:

```bash
cd gpu_bdb/queries/q07/
```
The queries assume that they can attach to a running Dask cluster. The cluster address and other benchmark configuration live in a YAML file (`gpu_bdb/benchmark_runner/benchmark_config.yaml`). You will need to fill this out as appropriate if you are not using the Slurm cluster configuration.
```bash
conda activate rapids-gpu-bdb
python gpu_bdb_query_07.py --config_file=../../benchmark_runner/benchmark_config.yaml
```
To profile a gpu-bdb query with NSYS, change `start_local_cluster` in `benchmark_config.yaml` to `True` and run:

```bash
nsys profile -t cuda,nvtx python gpu_bdb_query_07_dask_sql.py --config_file=../../benchmark_runner/benchmark_config.yaml
```
Note: There is no need to start workers with `cluster-startup.sh`, as a `LocalCUDACluster` is started by the `attach_to_cluster` API.
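Under the hood, this is roughly equivalent to the following sketch (a simplified illustration, not the repository's exact `attach_to_cluster` code):

```python
# Roughly what happens when start_local_cluster is True: a local cluster
# with one worker per visible GPU is created in-process.
from dask_cuda import LocalCUDACluster
from dask.distributed import Client

cluster = LocalCUDACluster()
client = Client(cluster)
```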
This repository includes optional performance-tracking automation using Google Sheets. To enable logging query runtimes, on the client node:

```bash
export GOOGLE_SHEETS_CREDENTIALS_PATH=<path to creds.json>
```
Then configure the `--sheet` and `--tab` arguments in `benchmark_config.yaml`.
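As a quick, hypothetical check that the credentials file is usable, you could open the sheet from Python. This sketch assumes the `gspread` library and a made-up sheet name; the repository's own sheet-logging code may use a different client:

```python
# Hypothetical sanity check of the Google Sheets service-account credentials.
import os
import gspread

gc = gspread.service_account(filename=os.environ["GOOGLE_SHEETS_CREDENTIALS_PATH"])
sheet = gc.open("gpu-bdb-runtimes")  # hypothetical name; match your --sheet value
print([ws.title for ws in sheet.worksheets()])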
The included `benchmark_runner.py` script will run all queries sequentially. Configuration for this type of end-to-end run is specified in `benchmark_runner/benchmark_config.yaml`.

To run all queries, cd to `gpu_bdb/` and run:

```bash
python benchmark_runner.py --config_file benchmark_runner/benchmark_config.yaml
```
By default, this will run each Dask query five times and, if BlazingSQL queries are enabled in `benchmark_config.yaml`, each BlazingSQL query five times. You can control the number of repeats by changing the `N_REPEATS` variable in the script.
BlazingSQL implementations of all queries are included. BlazingSQL currently supports communication via TCP. To run BlazingSQL queries, please follow the instructions above to create a cluster using `CLUSTER_MODE=TCP`.
The RAPIDS queries expect Apache Parquet-formatted data. We provide a script which can be used to convert bigBench dataGen's raw CSV files to optimally sized Parquet partitions.
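For illustration only (not the provided script), a conversion of this shape might look like the sketch below. The paths, the `|` delimiter, and the 1 GB partition target are assumptions:

```python
# Hypothetical sketch: convert one table's raw CSVs to sized Parquet partitions.
import dask_cudf

data_dir = "/path/to/data"  # placeholder for your shared data directory

ddf = dask_cudf.read_csv(f"{data_dir}/raw/store_sales/*.csv", sep="|")
ddf = ddf.repartition(partition_size="1GB")  # aim for uniform partition sizes
ddf.to_parquet(f"{data_dir}/parquet/store_sales/")
```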