This library attempts to wrap MPI communication procedures with concepts found in libunifex.
This library wraps the blocking communication procedure `amrex::FillBoundary` in an internal `MPI_Waitsome` loop that allows asynchronous dispatch with concepts found in libunifex. Consider a blocking call to `FillBoundary` as in:
```cpp
// Block the current thread until all pending requests are done
void EulerAmrCore::Advance(double dt) {
  // Blocking call to FillBoundary.
  // This waits for all MPI_Request to be done.
  // states is a MultiFab member variable.
  states.FillBoundary();
  // Parallel for loop over all FABs in the MultiFab
#ifdef AMREX_USE_OMP
#pragma omp parallel if (Gpu::notInLaunchRegion())
#endif
  for (MFIter mfi(states); mfi.isValid(); ++mfi) {
    const Box box = mfi.growntilebox();
    // Valid cells without the ghost region (left undefined in the original snippet).
    const Box inner_box = mfi.tilebox();
    auto advance_dir = [&](Direction dir) {
      const int dir_v = int(dir);
      auto csarray = states.const_array(mfi);
      auto farray = fluxes[dir_v].array(mfi);
      Box faces = shrink(convert(box, dim_vec(dir)), dir_v, 1);
      ComputeNumericFluxes(faces, farray, csarray, dir);
      auto cfarray = fluxes[dir_v].const_array(mfi);
      auto sarray = states.array(mfi);
      UpdateConservatively(inner_box, sarray, cfarray, dt_over_dx[dir_v], dir);
    };
    // first order accurate operator splitting
    advance_dir(Direction::x);
    advance_dir(Direction::y);
    advance_dir(Direction::z);
  }
  // implicit join of all OpenMP threads here
}
```
This library provides thin wrappers around this `FillBoundary` call and enables use of the sender/receiver model developed in libunifex.
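For readers who have not used libunifex before, the following self-contained sketch shows how the bulk algorithms used in the example below compose. The thread pool, item count and printed message are illustrative only and are unrelated to this library:

```cpp
#include <unifex/bulk_join.hpp>
#include <unifex/bulk_schedule.hpp>
#include <unifex/bulk_transform.hpp>
#include <unifex/execution_policy.hpp>
#include <unifex/static_thread_pool.hpp>
#include <unifex/sync_wait.hpp>

#include <cstdio>

int main() {
  unifex::static_thread_pool pool{4};
  auto sched = pool.get_scheduler();

  // bulk_schedule emits one work item per index 0..7 on the scheduler,
  // bulk_transform runs the lambda for every item (possibly in parallel),
  // bulk_join collapses the many-sender back into a single sender, and
  // sync_wait blocks the calling thread until that sender has completed.
  unifex::sync_wait(unifex::bulk_join(unifex::bulk_transform(
      unifex::bulk_schedule(sched, 8),
      [](auto i) { std::printf("work item %d\n", static_cast<int>(i)); },
      unifex::par_unseq)));
}
```

The `FillBoundary` sender provided by this library plays roughly the role of `bulk_schedule` plus the MPI communication: it produces one work item per box as soon as that box's ghost cells are filled.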
With these building blocks, the blocking example above could instead read:
```cpp
void EulerAmrCore::AsyncAdvance(double dt) {
  // dt_over_dx is assumed to be computed from dt and the mesh spacing beforehand
  // (omitted in the original snippet).
  auto advance = unifex::bulk_join(unifex::bulk_transform(
      // tbb_scheduler schedules the work items and comm_scheduler schedules the
      // MPI_WaitAny/All threads.
      // The feedback function (Box, int) will be called for every box that is ready,
      // i.e. all its ghost cells are filled.
      FillBoundary(tbb_scheduler, comm_scheduler, states),
      [this, dt_over_dx](const Box& box, int K) {
        auto advance_dir = [&](Direction dir) {
          const int dir_v = int(dir);
          auto csarray = states.const_array(K);
          auto farray = fluxes[dir_v].array(K);
          Box faces = shrink(convert(box, unit(dir)), 1, unit(dir));
          ComputeNumericFluxes(faces, farray, csarray, dir);
          auto cfarray = fluxes[dir_v].const_array(K);
          auto sarray = states.array(K);
          // inner_box denotes the valid region of box K, as in the blocking version.
          UpdateConservatively(inner_box, sarray, cfarray, dt_over_dx[dir_v], dir);
        };
        // first order accurate operator splitting
        advance_dir(Direction::x);
        advance_dir(Direction::y);
        advance_dir(Direction::z);
      },
      unifex::par_unseq));
  // Explicitly wait here until the above is done for all boxes
  unifex::sync_wait(std::move(advance));
}
```
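Under the hood, an asynchronous `FillBoundary` has to drive the posted `MPI_Request` handles to completion and notify each box as soon as its ghost cells have arrived. The following is a much simplified sketch of what such an `MPI_Waitsome` loop can look like; it is not the library's actual implementation, and the bookkeeping (the request-to-box mapping and the per-box counters) is purely hypothetical:

```cpp
#include <mpi.h>

#include <functional>
#include <vector>

// Hypothetical sketch of the completion loop an asynchronous FillBoundary needs to run.
// request_to_box maps every posted MPI_Request to the box it belongs to, and remaining
// holds, per box, how many of its requests are still outstanding.
void DriveRequests(std::vector<MPI_Request>& requests,
                   const std::vector<int>& request_to_box,
                   std::vector<int>& remaining,
                   const std::function<void(int /*box*/)>& on_box_ready) {
  std::vector<int> done(requests.size());
  int pending = static_cast<int>(requests.size());
  while (pending > 0) {
    int ndone = 0;
    // Block until at least one request finishes, then harvest all finished ones.
    MPI_Waitsome(static_cast<int>(requests.size()), requests.data(), &ndone,
                 done.data(), MPI_STATUSES_IGNORE);
    for (int i = 0; i < ndone; ++i) {
      const int box = request_to_box[done[i]];
      // Fire the per-box callback once the last message for that box has arrived.
      if (--remaining[box] == 0) {
        on_box_ready(box);
      }
    }
    pending -= ndone;
  }
}
```

Each invocation of `on_box_ready` corresponds to one invocation of the `(Box, int)` feedback function in the example above.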
The intent is to try out a structured parallel programming model in classical HPC applications, such as a finite volume flow solver on structured grids.