This is a repository where the various loop scheduling policies available in OpenMP are investigated. The investigation is performed for two different workloads: one that is slightly unbalanced, called loop1, and one that is very unbalanced, with most of the work concentrated in the first few iterations, called loop2.
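For illustration only, the sketch below shows what two such workloads might look like; the actual loop bodies, array sizes and iteration counts are defined in src/loops/ and may well differ.

```c
#include <math.h>

#define N 729   /* hypothetical problem size */

/* Slightly unbalanced: the inner trip count shrinks as i grows. */
void loop1(double a[N][N], double b[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = N - 1; j > i; j--)
            a[i][j] += cos(b[i][j]);
}

/* Very unbalanced: only the first few iterations do expensive work. */
void loop2(double c[N], double b[N][N])
{
    for (int i = 0; i < N; i++) {
        int reps = (i < 30) ? 1000 : 1;
        for (int r = 0; r < reps; r++)
            for (int j = 0; j < N; j++)
                c[i] += exp(-fabs(b[i][j]));
    }
}
```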
The following schedulers provided by OpenMP are investigated:
STATIC,n
DYNAMIC,n
GUIDED,n
where n is the selected chunksize.
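As a quick illustration (not taken from src/, the function name is made up), this is roughly how the three clauses map onto an OpenMP loop for a chunksize of, say, 4:

```c
#include <omp.h>

void scale_all(double *x, int len)
{
    /* STATIC,n: chunks of n iterations are dealt out to threads round-robin, fixed in advance. */
    #pragma omp parallel for schedule(static, 4)
    for (int i = 0; i < len; i++) x[i] *= 2.0;

    /* DYNAMIC,n: each thread grabs the next chunk of n iterations whenever it becomes idle. */
    #pragma omp parallel for schedule(dynamic, 4)
    for (int i = 0; i < len; i++) x[i] *= 2.0;

    /* GUIDED,n: chunks start large and shrink as the loop progresses, but never below n. */
    #pragma omp parallel for schedule(guided, 4)
    for (int i = 0; i < len; i++) x[i] *= 2.0;
}
```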
Additionally, a scheduler called the affinity scheduler was designed by hand, aiming to combine the characteristics of the aforementioned schedulers, so that its performance can be compared against them.
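One common formulation of affinity scheduling, which is assumed in the sketch below (the actual algorithm and data structures live in src/affinity/ and may differ), gives each of the P threads a local set of roughly n/P iterations; a thread repeatedly executes 1/P of whatever remains in its own set and, once that set is empty, steals 1/P of the remaining iterations of the most loaded thread.

```c
#include <omp.h>

#define MAX_THREADS 64

/* Lower bound and remaining size of each thread's local iteration set. */
static int lo[MAX_THREADS], remaining[MAX_THREADS];

/* Hand thread 'me' its next chunk: 1/P of its remaining local set, or,
   if that set is empty, 1/P of the most loaded thread's set (stealing). */
static int next_chunk(int me, int P, int *start, int *size)
{
    int victim = me, got = 0;
    #pragma omp critical (affinity_bookkeeping)
    {
        if (remaining[me] == 0)                            /* local set exhausted */
            for (int t = 0; t < P; t++)
                if (remaining[t] > remaining[victim]) victim = t;
        if (remaining[victim] > 0) {
            int chunk = (remaining[victim] + P - 1) / P;   /* ceil(remaining / P) */
            *start = lo[victim];
            *size  = chunk;
            lo[victim]        += chunk;
            remaining[victim] -= chunk;
            got = 1;
        }
    }
    return got;
}

void affinity_for(int n, void (*body)(int))
{
    #pragma omp parallel
    {
        int P  = omp_get_num_threads();
        int me = omp_get_thread_num();
        int block = (n + P - 1) / P;                       /* ~n/P iterations per local set */
        int first = me * block;
        lo[me]        = first;
        remaining[me] = (first >= n) ? 0 : (n - first < block ? n - first : block);
        #pragma omp barrier

        int start, size;
        while (next_chunk(me, P, &start, &size))
            for (int i = start; i < start + size; i++)
                body(i);
    }
}
```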
What is included
includes/: Contains the header file called resources.h, necessary for the development of the code. Additionally, contains affinity_structs.h and macros.h, necessary for the development of the affinity scheduler.
src/main.c: The main source file used to execute each scheduling option for the two available workloads.
src/loops/: Contains all the functions relevant to the workloads, i.e. initialisation and validation as well as execution of the workloads.
src/omplib/: Contains wrapper functions around OpenMP commands, in an effort to hide the API's functions (a brief sketch follows this list).
src/affinity/: Contains all the functions used to develop the affinity scheduler.
scripts/performance/: Contains all the performance tests available to measure the performance of the code.
scripts/pbs/: Contains the performance tests intended to be submitted to the back-end of CIRRUS, used to measure the performance of the code.
scripts/plots/: Contains all the plot scripts available to plot the results of the performance tests.
res/: Directory containing the raw results and plots for each test.
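As an example of what such a wrapper layer might look like (the names below are hypothetical; the real wrappers are in src/omplib/):

```c
#include <omp.h>

/* Thin wrappers so that the rest of the code never touches the OpenMP API directly. */
int    get_num_threads(void) { return omp_get_max_threads(); }
int    get_thread_id(void)   { return omp_get_thread_num();  }
double get_wtime(void)       { return omp_get_wtime();       }
```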
Options
The designed affinity scheduler comes in two versions. The first one uses critical regions in order to synchronise the threads, while the second uses locks. One can choose between the two versions by compiling the code with a different DEFINE flag. Moreover, one can also choose which scheduler to use when measuring performance. In other words, one can use another DEFINE flag to choose between the best_scheduling option chosen for each workload, or choose to determine the scheduling option at runtime.
The following options are available:
-DRUNTIME: Choose to select the scheduling option at runtime.
-DBEST_SCHEDULE: Choose to use the best scheduling option determined for each workload.
-DBEST_SCHEDULE_LOOP2: Choose to use the best scheduling option determined for each workload after a further investigation of loop2.
-DAFFINITY: Choose to use the affinity scheduler.
-DLOCK: If set, the affinity scheduler with locks is used, otherwise the one with critical regions.
Note that one should only choose one of the four main options shown above. In case no option is selected, the serial version of the code is executed.
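The difference between the two affinity variants comes down to how the shared chunk bookkeeping is protected. As a hedged illustration (the identifiers below are made up, not the ones in src/affinity/), a -DLOCK build could guard the update with an omp_lock_t, while the default build uses a critical region:

```c
#include <omp.h>

static int remaining;          /* shared counter: iterations still to hand out */
#ifdef LOCK
static omp_lock_t chunk_lock;  /* initialise once with omp_init_lock(&chunk_lock) */
#endif

/* Claim up to 'want' iterations from the shared counter; returns how many were taken. */
int claim(int want)
{
    int got;
#ifdef LOCK
    /* Lock-based variant (-DLOCK): an explicit OpenMP lock serialises the update. */
    omp_set_lock(&chunk_lock);
    got = (remaining < want) ? remaining : want;
    remaining -= got;
    omp_unset_lock(&chunk_lock);
#else
    /* Default variant: a named critical region serialises the update instead. */
    #pragma omp critical (chunk_update)
    {
        got = (remaining < want) ? remaining : want;
        remaining -= got;
    }
#endif
    return got;
}
```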
To compile all the available versions of the code use:
$ make all
This will create all the necessary directories for the code to be executed. All the versions of the code are compiled using the different options shown above. This will result in the following executables:
bin/serial: Serial version of the code.
bin/runtime: Parallel version of the code where scheduling can be determined at runtime. Note that only the scheduling options provided by OpenMP can be selected.
bin/best_schedule: The best scheduling options provided by OpenMP are used for each workload.
bin/best_schedule_loop2: The best scheduling options provided by OpenMP are used for each workload, after the best scheduling option for loop2 was tuned based on its chunksize.
bin/affinity: The affinity scheduler with critical regions is used.
bin/affinity_lock: The affinity scheduler with locks is used.
Alternatively, one can compile each version separately. Create the required directories using:
This will execute the code on 4 threads using a dynamic scheduler with a chunksize of 2 for each workload.
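The runtime build presumably relies on OpenMP's schedule(runtime) clause, in which case the schedule and thread count are picked up from the standard OMP_SCHEDULE and OMP_NUM_THREADS environment variables. A minimal sketch of this assumption:

```c
#include <omp.h>

/* With -DRUNTIME the loops can defer the scheduling decision to run time.     */
/* A run like the one described above would then be driven by the environment, */
/* e.g. OMP_NUM_THREADS=4 OMP_SCHEDULE="dynamic,2" ./bin/runtime               */
/* (the exact invocation is an assumption, not taken from the repository).     */
void scale_runtime(double *x, int len)
{
    #pragma omp parallel for schedule(runtime)
    for (int i = 0; i < len; i++)
        x[i] *= 2.0;
}
```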
To execute the best_schedule version:
$ ./bin/best_schedule
This will execute the code with GUIDED,16 for loop1 and DYNAMIC,8 for loop2.
To execute the best_schedule_loop2 version:
$ ./bin/best_schedule_loop2
This will execute the code with GUIDED,16 for loop1 and DYNAMIC,4 for loop2.
To execute the affinity version with critical regions use:
$ ./bin/affinity
To execute the affinity version with locks use:
$ ./bin/affinity_lock
Tests
Determining the best scheduling option for each workload on a constant number of threads
This test executes the bin/runtime executable multiple times. Each time, the performance of each OpenMP scheduling option is measured for different chunksizes. The number of threads is kept constant in order to determine the best scheduling option and chunksize for each workload.
Executing the test
Running on the front-end:
$ make runtime_test
Submitting a job on the back-end of CIRRUS:
$ make runtime_test_back
Plotting the results
To plot the results once the test is finished run:
$ make plot_runtime_test
Evaluating the performance of the selected best option on a variable number of threads
Observing the results from the previous test, the best scheduling option is selected for each workload. This test runs the bin/best_schedule executable multiple times over a range of thread counts. The performance is then evaluated for each thread count and each workload.
Executing the test
Running on the front-end:
$ make best_schedule_test
Submitting a job on the back-end of CIRRUS:
$ make best_schedule_test_back
Plotting the results
To plot the results once the test is finished run:
$ make plot_runtime_test
Evaluating the performance of loop2 by tuning its chunksize
As loop2 saturates after the best_schedule test for the selected scheduling option and chunksize, a further investigation is performed. The bin/best_schedule_loop2 executable is executed multiple times over a range of thread counts and chunksizes for the best_scheduling option selected for loop2.
Executing the test
Running on the front-end:
$ make best_schedule_loop2_test
Submitting a job on the back-end of CIRRUS:
$ make best_schedule_loop2_test_back
Plotting the results
To plot the results once the test is finished run:
$ make plot_runtime_test
Evaluating the performance of affinity scheduling
The performance of the affinity scheduler is investigated for its two available versions, i.e. when critical regions are used and when locks are used instead.
Executing the test
Running on the front-end:
$ make affinity_schedule_test
Submitting a job on the back-end of CIRRUS:
$ make affinity_schedule_test_back
Plotting the results
To plot the results once the test is finished run:
$ make plot_runtime_test
Evaluating the performance of all the different versions
The performance of all the implemented versions for each loop is evaluated and compared together.
Executing the test
Running on the front-end:
$ make performance_comparison_test
Submitting a job on the back-end of CIRRUS:
$ make performance_comparison_test_back
Plotting the results
To plot the results once the test is finished run:
$ make plot_performance_comparison_test
Performing all the tests at once:
Instead of submitting the test scripts one by one, one can use the following to perform all the tests together:
$ make run_tests_front
If one wants to submit the tests on the back-end, one can run:
$ make run_tests_back
Plotting all the test results at once:
Once all the tests are finished, the results can be plotted using:
$ make plot_tests