Movatterモバイル変換


[0]ホーム

URL:


Jump to content
WikibooksThe Free Textbook Project
Search

Parallel Spectral Numerical Methods/Gray Scott

From Wikibooks, open books for an open world
<Parallel Spectral Numerical Methods

Background

[edit |edit source]

[1]

ut=Du2uuv2+F(1u){\displaystyle {\frac {\partial u}{\partial t}}=D_{u}\nabla ^{2}u-uv^{2}+F(1-u)}
vt=Dv2u+uv2(F+k)v{\displaystyle {\frac {\partial v}{\partial t}}=D_{v}\nabla ^{2}u+uv^{2}-(F+k)v}
The above reaction-diffusion system describes the density of two chemicals ,(u{\displaystyle u} andv{\displaystyle v}), reacting with another. First term in the equations is for the diffusion, the second describes the reaction ofu{\displaystyle u} andv{\displaystyle v},(u+2v{\displaystyle u+2v}3v{\displaystyle 3v}) and the last term in the first equation is the input of newu{\displaystyle u}, while in the second the last term removesv{\displaystyle v} from the system.

Matlab Program

[edit |edit source]

The program solves the linear part with fourier transformation and the nonlinear part with the implicit midpoint rule.

 

 

 

 

( A)
A Matlab program to solve the 3-dimensional GrayScott equationCode download
% A program to solve the Gray-Scott equations using% splitting method. The nonlinear equations are solved using the implicit% midpoint ruleclearall;formatcompact,formatshort,set(0,'defaultaxesfontsize',17,'defaultaxeslinewidth',.7,...'defaultlinelinewidth',3.5,'defaultpatchlinewidth',3.5)Nx=64;% number of modesNy=64;Nz=64;% set up gridNt=4;tmax=0.04;Lx=1.5;% period  2*pi * LLy=1.5;Lz=1.5;dt=tmax/Nt;% number of time slicestol=0.1^12;A=0.04;B=0.1;Du=1.;Dv=1.;% initialise variablesx=(2*pi/Nx)*(-Nx/2:Nx/2-1)'*Lx;% x coordinatey=(2*pi/Ny)*(-Nx/2:Ny/2-1)'*Ly;z=(2*pi/Nz)*(-Nx/2:Nz/2-1)'*Lz;kx=i*[0:Nx/2-10-Nx/2+1:-1]'/Lx;% wave vectorky=i*[0:Ny/2-10-Ny/2+1:-1]'/Ly;kz=i*[0:Nz/2-10-Nz/2+1:-1]'/Lz;[xx,yy,zz]=meshgrid(x,y,z);[kxx,kyy,kzz]=meshgrid(kx,ky,kz);% initial conditionst=0;tdata(1)=t;u=0.2+exp(-2*(xx.^2+yy.^2+zz.^2));v=0.1+exp(-4*(xx.^2+yy.^2+zz.^2-0.01).^2);gamma=[1];Ahat=A*fftn(ones(size(xx)));t=0;figure(11);clf;subplot(2,1,1);H=vol3d('CData',real(u),'texture','3D','XData',x,'YData',y,'ZData',z);xlabel('x');ylabel('y');zlabel('z');colorbar;axisequal;axissquare;view(3);xlabel('x');ylabel('y');zlabel('z');axisequal;axissquare;view(3);drawnow;title(['Time ',num2str(t)]);subplot(2,1,2);H=vol3d('CData',real(v),'texture','3D','XData',x,'YData',y,'ZData',z);xlabel('x');ylabel('y');zlabel('z');colorbar;axisequal;axissquare;view(3);xlabel('x');ylabel('y');zlabel('z');axisequal;axissquare;view(3);drawnow;filename=['./pictures1/',num2str(10000000),'.jpg'];saveas(11,filename)forn=1:Ntchg=1;form=1:1% use fixed point iteration to solve nonlinear system in real spacechg=1;uold=u;vold=v;while(chg>tol)utemp=u;vtemp=v;umean=0.5*(u+uold);vmean=0.5*(v+vold);u=uold+0.5*gamma(m)*dt*(-umean.*vmean.^2);v=vold+0.5*gamma(m)*dt*umean.*vmean.^2;chg=max(abs(u-utemp))+max(abs(v-vtemp));enduhat=fftn(u);vhat=fftn(v);% solve linear part exactly in Fourier spaceuhat=exp(gamma(m)*dt*(-A+Du*(kxx.^2+kyy.^2+kzz.^2))).*...(uhat-Ahat./(A+Du*(kxx.^2+kyy.^2+kzz)))+Ahat./...(A+Du*(kxx.^2+kyy.^2+kzz.^2));vhat=exp(gamma(m)*dt*(-B+Dv*(kxx.^2+kyy.^2+kzz.^2))).*vhat;u=ifftn(uhat);v=ifftn(vhat);% use fixed point iteration to solve nonlinear system in real spacechg=1;uold=u;vold=v;while(chg>tol)utemp=u;vtemp=v;umean=0.5*(u+uold);vmean=0.5*(v+vold);u=uold+0.5*gamma(m)*dt*(-umean.*vmean.^2);v=vold+0.5*gamma(m)*dt*umean.*vmean.^2;chg=max(abs(u-utemp))+max(abs(v-vtemp));endendt=n*dt;figure(11);clf;subplot(2,1,1);H=vol3d('CData',real(u),'texture','3D','XData',x,'YData',y,'ZData',z);xlabel('x');ylabel('y');zlabel('z');colorbar;axisequal;axissquare;view(3);xlabel('x');ylabel('y');zlabel('z');axisequal;axissquare;view(3);drawnow;title(['Time ',num2str(t)]);subplot(2,1,2);H=vol3d('CData',real(v),'texture','3D','XData',x,'YData',y,'ZData',z);xlabel('x');ylabel('y');zlabel('z');colorbar;axisequal;axissquare;view(3);xlabel('x');ylabel('y');zlabel('z');axisequal;axissquare;view(3);drawnow;filename=['./pictures1/',num2str(1000000+n),'.jpg'];saveas(11,filename)end

To compute the visualisation download vol3d.m[2].

Fortran Program

[edit |edit source]

The program uses 2decomp_fft like the programs from previous chapters.

 

 

 

 

( B)
A Fortran program to solve the 3-dimensional GrayScott equationCode download
!--------------------------------------------------------------------!!! PURPOSE!! This program solves GrayScott equation in 3 dimensions! u_t=D_u*(u_{xx}+u_{yy}+u_{zz})^2-u*v^2-F*(1-u)! v_t=D_v*(v_{xx}+v_{yy}+v_{zz})^2+u*v^2-(F+k)*v!!! .. Parameters ..!  Nx= number of modes in x - power of 2 for FFT!  Ny= number of modes in y - power of 2 for FFT!  Nz= number of modes in z - power of 2 for FFT!  Nt= number of timesteps to take!  Tmax= maximum simulation time!  plotgap= number of timesteps between plots!  pi = 3.14159265358979323846264338327950288419716939937510d0!  Lx= width of box in x direction!  Ly= width of box in y direction!  Lz= width of box in z direction!  A=F!  B=F+k! .. Scalars ..!  Ahat = Fourier transform of A!  i= loop counter in x direction!  j= loop counter in y direction!  k= loop counter in z direction!  n= loop counter for timesteps direction!  allocatestatus= error indicator during allocation!  start= variable to record start time of program!  finish= variable to record end time of program!  count_rate= variable for clock count rate!  dt= timestep!  modescalereal= Number to scale after backward FFT!  myid= Process id!  ierr= error code!  p_row= number of rows for domain decomposition!  p_col= number of columns for domain decomposition!  filesize= total filesize!  disp= displacement to start writing data from!  ind= index in array to write!  plotnum= number of plot to save!  numberfile= number of the file to be saved to disk!  stat= error indicator when reading inputfile!  umean= temporary for fix point iteration!  vmean= temporary for fix point iteration! .. Arrays ..!  u = approximate solution u!  v= approximate solution v!  uold = previous timestep of u!  vold= previous timestep of v!  utemp = temporary for fix point iteration u!  vtemp= temporary for fix point iteration v!  uhat = Fourier transform of u!  vhat= Fourier transform of v! .. Vectors ..!  kx= fourier frequencies in x direction squared!  ky= fourier frequencies in y direction squared!  kz= fourier frequencies in z direction squared!  x= x locations!  y= y locations!  z= z locations!  time= times at which save data!  nameconfig= array to store filename for data to be saved!  InputFileName= name of the Input File!  dpcomm!  intcomm! .. Special Structures ..!  decomp= contains information on domain decomposition!  spsee http://www.2decomp.org/ for more information!! REFERENCES!! ACKNOWLEDGEMENTS!! ACCURACY!! ERROR INDICATORS AND WARNINGS!! FURTHER COMMENTS! Check that the initial iterate is consistent with the! boundary conditions for the domain specified!--------------------------------------------------------------------! External routines required!! External libraries required! 2DECOMP&FFT -- Domain decomposition and Fast Fourier Library!(http://www.2decomp.org/index.html)! MPI libraryPROGRAMmainUSEdecomp_2dUSEdecomp_2d_fftUSEdecomp_2d_ioUSEMPI! Declare variablesIMPLICIT NONEINTEGER(kind=4)::Nx=0INTEGER(kind=4)::Ny=0INTEGER(kind=4)::Nz=0INTEGER(kind=4)::Nt=0INTEGER(kind=4)::plotgap=0REAL(kind=8),PARAMETER::&pi=3.14159265358979323846264338327950288419716939937510d0REAL(kind=8)::Lx=0.0,Ly=0.0,Lz=0.0REAL(kind=8)::dt=0.0REAL(kind=8)::tol=0.1**12REAL(kind=8)::A=0.0REAL(kind=8)::Ahat=0REAL(kind=8)::B=0.0REAL(kind=8)::Du=0.0REAL(kind=8)::Dv=0.0REAL(kind=8)::chg=1.0REAL(kind=8),DIMENSION(:),ALLOCATABLE::kx,ky,kzREAL(kind=8),DIMENSION(:),ALLOCATABLE::x,y,zREAL(kind=8),DIMENSION(:,:,:),ALLOCATABLE::u,vREAL(kind=8),DIMENSION(:,:,:),ALLOCATABLE::uold,voldREAL(kind=8)::utemp,vtempCOMPLEX(kind=8),DIMENSION(:,:,:),ALLOCATABLE::uhat,vhatCOMPLEX(kind=8)::uhatmpREAL(kind=8)::umean,vmeanREAL(kind=8),DIMENSION(:),ALLOCATABLE::timeINTEGER(KIND=4),DIMENSION(1:5)::intcommREAL(KIND=8),DIMENSION(1:8)::dpcommREAL(kind=8)::modescalerealINTEGER(kind=4)::i,j,k,n,AllocateStatus,statINTEGER(kind=4)::myid,numprocs,ierrTYPE(DECOMP_INFO)::decomp,spINTEGER(kind=MPI_OFFSET_KIND)::filesize,dispINTEGER(kind=4)::p_row=0,p_col=0INTEGER(kind=4)::start,finish,count_rate,ind,plotnumCHARACTER*50::nameconfigCHARACTER*20::numberfile,InputFileName! initialisation of MPICALLMPI_INIT(ierr)CALLMPI_COMM_SIZE(MPI_COMM_WORLD,numprocs,ierr)CALLMPI_COMM_RANK(MPI_COMM_WORLD,myid,ierr)CALLMPI_BCAST(stat,1,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)IF(myid.eq.0)THENInputFileName='INPUTFILE'OPEN(unit=11,FILE=trim(InputFileName),status="OLD")REWIND(11)READ(11,*)intcomm(1),intcomm(2),intcomm(3),intcomm(4),intcomm(5),&dpcomm(1),dpcomm(2),dpcomm(3),dpcomm(4),dpcomm(5),&dpcomm(6),dpcomm(7),dpcomm(8)CLOSE(11)PRINT*,"NX ",intcomm(1)PRINT*,"NY ",intcomm(2)PRINT*,"NZ ",intcomm(3)PRINT*,"NT ",intcomm(4)PRINT*,"plotgap ",intcomm(5)PRINT*,"Lx",dpcomm(1)PRINT*,"Ly",dpcomm(2)PRINT*,"Lz",dpcomm(3)PRINT*,"dt",dpcomm(4)PRINT*,"Du",dpcomm(5)PRINT*,"Dv",dpcomm(6)PRINT*,"F",dpcomm(7)PRINT*,"k",dpcomm(8)PRINT*,"Read inputfile"END IFCALLMPI_BCAST(dpcomm,8,MPI_DOUBLE_PRECISION,0,MPI_COMM_WORLD,ierr)CALLMPI_BCAST(intcomm,5,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)Nx=intcomm(1)Ny=intcomm(2)Nz=intcomm(3)Nt=intcomm(4)plotgap=intcomm(5)Lx=dpcomm(1)Ly=dpcomm(2)Lz=dpcomm(3)dt=dpcomm(4)Du=dpcomm(5)Dv=dpcomm(6)A=dpcomm(7)B=dpcomm(8)+dpcomm(7)! initialisation of 2decomp! do automatic domain decompositionCALLdecomp_2d_init(Nx,Ny,Nz,p_row,p_col)! get information about domain decomposition choosenCALLget_decomp_info(decomp)! physical domainCALLdecomp_info_init(Nx/2+1,Ny,Nz,sp)! spectral domain! initialise FFT libraryCALLdecomp_2d_fft_initALLOCATE(u(decomp%xst(1):decomp%xen(1),&decomp%xst(2):decomp%xen(2),&decomp%xst(3):decomp%xen(3)),&v(decomp%xst(1):decomp%xen(1),&decomp%xst(2):decomp%xen(2),&decomp%xst(3):decomp%xen(3)),&uold(decomp%xst(1):decomp%xen(1),&decomp%xst(2):decomp%xen(2),&decomp%xst(3):decomp%xen(3)),&vold(decomp%xst(1):decomp%xen(1),&decomp%xst(2):decomp%xen(2),&decomp%xst(3):decomp%xen(3)),&uhat(sp%zst(1):sp%zen(1),&sp%zst(2):sp%zen(2),&sp%zst(3):sp%zen(3)),&vhat(sp%zst(1):sp%zen(1),&sp%zst(2):sp%zen(2),&sp%zst(3):sp%zen(3)),&kx(sp%zst(1):sp%zen(1)),&ky(sp%zst(2):sp%zen(2)),&kz(sp%zst(3):sp%zen(3)),&x(decomp%xst(1):decomp%xen(1)),&y(decomp%xst(2):decomp%xen(2)),&z(decomp%xst(3):decomp%xen(3)),&time(1:1+Nt/plotgap),&stat=AllocateStatus)IF(AllocateStatus.ne.0)STOPIF(myid.eq.0)THENPRINT*,'allocated space'END IFCALLMPI_BARRIER(MPI_COMM_WORLD,ierr)modescalereal=1.0d0/REAL(Nx,KIND(0d0))modescalereal=modescalereal/REAL(Ny,KIND(0d0))modescalereal=modescalereal/REAL(Nz,KIND(0d0))! setup fourier frequencies and grid pointsDOi=1,1+Nx/2IF((i.GE.sp%zst(1)).AND.(i.LE.sp%zen(1)))THENkx(i)=-(REAL(i-1,kind(0d0))/Lx)**2END IFEND DOIF((Nx/2+1.GE.sp%zst(1)).AND.(Nx/2+1.LE.sp%zen(1)))THENkx(Nx/2+1)=0.0d0END IFDOi=Nx/2+2,NxIF((i.GE.sp%zst(1)).AND.(i.LE.sp%zen(1)))THENKx(i)=-(REAL(1-i+Nx,KIND(0d0))/Lx)**2END IFEND DODOi=decomp%xst(1),decomp%xen(1)x(i)=(-1.0d0+2.0d0*REAL(i-1,kind(0d0))/REAL(Nx,kind(0d0)))*pi*LxEND DODOj=1,1+Ny/2IF((j.GE.sp%zst(2)).AND.(j.LE.sp%zen(2)))THENky(j)=-(REAL(j-1,kind(0d0))/Ly)**2END IFEND DOIF((Ny/2+1.GE.sp%zst(2)).AND.(Ny/2+1.LE.sp%zen(2)))THENky(Ny/2+1)=0.0d0ENDIFDOj=Ny/2+2,NyIF((j.GE.sp%zst(2)).AND.(j.LE.sp%zen(2)))THENky(j)=-(REAL(1-j+Ny,KIND(0d0))/Ly)**2END IFEND DODOj=decomp%xst(2),decomp%xen(2)y(j)=(-1.0d0+2.0d0*REAL(j-1,kind(0d0))/REAL(Ny,kind(0d0)))*pi*LyEND DODOk=1,1+Nz/2IF((k.GE.sp%zst(3)).AND.(k.LE.sp%zen(3)))THENkz(k)=-(REAL(k-1,kind(0d0))/Lz)**2END IFEND DOIF((Nz/2+1.GE.sp%zst(3)).AND.(Nz/2+1.LE.sp%zen(3)))THENkz(Nz/2+1)=0.0d0ENDIFDOk=Nz/2+2,NzIF((k.GE.sp%zst(3)).AND.(k.LE.sp%zen(3)))THENkz(k)=-(REAL(1-k+Nz,KIND(0d0))/Lz)**2END IFEND DODOk=decomp%xst(3),decomp%xen(3)z(k)=(-1.0d0+2.0d0*REAL(k-1,kind(0d0))/REAL(Nz,kind(0d0)))*pi*LzEND DOIF(myid.eq.0)THENPRINT*,'Setup grid and fourier frequencies'END IF!compute AhatAhat=A*Nx*Ny*Nz!initial conditionCALLMPI_BARRIER(MPI_COMM_WORLD,ierr)DOk=decomp%xst(3),decomp%xen(3)DOj=decomp%xst(2),decomp%xen(2)DOi=decomp%xst(1),decomp%xen(1)u(i,j,k)=1+(exp(-1*(x(i)**2+y(j)**2+z(k)**2)-1))v(i,j,k)=0+(exp(-1*(x(i)**2+y(j)**2+z(k)**2)-1))END DOEND DOEND DO! write out using 2DECOMP&FFT MPI-IO routinesnameconfig='./data/u'plotnum=0WRITE(numberfile,'(i0)')10000000+plotnumind=index(nameconfig,' ')-1nameconfig=nameconfig(1:ind)//numberfileind=index(nameconfig,' ')-1nameconfig=nameconfig(1:ind)//'.datbin'CALLdecomp_2d_write_one(1,u,nameconfig)!same for vnameconfig='./data/v'plotnum=0WRITE(numberfile,'(i0)')10000000+plotnumind=index(nameconfig,' ')-1nameconfig=nameconfig(1:ind)//numberfileind=index(nameconfig,' ')-1nameconfig=nameconfig(1:ind)//'.datbin'CALLdecomp_2d_write_one(1,v,nameconfig)IF(myid.eq.0)THENPRINT*,'Got initial data, starting timestepping'END IFCALLsystem_clock(start,count_rate)time(1)=0DOn=1,Nt!use fixed point iteration to solve nonlinear system in real spaceDOk=decomp%xst(3),decomp%xen(3)DOj=decomp%xst(2),decomp%xen(2)DOi=decomp%xst(1),decomp%xen(1)uold(i,j,k)=u(i,j,k)vold(i,j,k)=v(i,j,k)END DOEND DOEND DODOk=decomp%xst(3),decomp%xen(3)DOj=decomp%xst(2),decomp%xen(2)DOi=decomp%xst(1),decomp%xen(1)chg=1DO WHILE(chg>tol)utemp=u(i,j,k)vtemp=v(i,j,k)umean=0.5*(u(i,j,k)+uold(i,j,k))vmean=0.5*(v(i,j,k)+vold(i,j,k))u(i,j,k)=uold(i,j,k)+0.5*dt*(-umean*vmean**2)v(i,j,k)=vold(i,j,k)+0.5*dt*(umean*vmean**2)utemp=u(i,j,k)-utempvtemp=v(i,j,k)-vtempchg=abs(utemp)+abs(vtemp)END DOEND DOEND DOEND DO! solve linear part exactly in Fourier spaceCALLdecomp_2d_fft_3d(u,uhat)CALLdecomp_2d_fft_3d(v,vhat)IF(myid.eq.0)THENuhatmp=uhat(1,1,1)END IFDOk=sp%zst(3),sp%zen(3)DOj=sp%zst(2),sp%zen(2)DOi=sp%zst(1),sp%zen(1)uhat(i,j,k)=exp(dt*(-A+Du*(kx(i)+ky(j)+kz(k))))*&uhat(i,j,k)vhat(i,j,k)=exp(dt*(-B+Dv*(kx(i)+ky(j)+kz(k))))*&vhat(i,j,k)END DOEND DOEND DOIF(myid.eq.0)THENuhat(1,1,1)=exp(dt*(-A+Du*(kx(1)+ky(1)+kz(1))))*&(uhatmp-Ahat/(A+Du*(kx(1)+ky(1)+kz(1))))+&(Ahat/(A+Du*(kx(1)+ky(1)+kz(1))))END IF!dont forget to scaleCALLdecomp_2d_fft_3d(uhat,u)CALLdecomp_2d_fft_3d(vhat,v)DOk=decomp%xst(3),decomp%xen(3)DOj=decomp%xst(2),decomp%xen(2)DOi=decomp%xst(1),decomp%xen(1)u(i,j,k)=u(i,j,k)*modescalerealv(i,j,k)=v(i,j,k)*modescalerealEND DOEND DOEND DO!use fixed point iteration to solve nonlinear system in real spaceDOk=decomp%xst(3),decomp%xen(3)DOj=decomp%xst(2),decomp%xen(2)DOi=decomp%xst(1),decomp%xen(1)uold(i,j,k)=u(i,j,k)vold(i,j,k)=v(i,j,k)END DOEND DOEND DODOk=decomp%xst(3),decomp%xen(3)DOj=decomp%xst(2),decomp%xen(2)DOi=decomp%xst(1),decomp%xen(1)chg=1DO WHILE(chg>tol)utemp=u(i,j,k)vtemp=v(i,j,k)umean=0.5*(u(i,j,k)+uold(i,j,k))vmean=0.5*(v(i,j,k)+vold(i,j,k))u(i,j,k)=uold(i,j,k)+0.5*dt*(-umean*vmean**2)v(i,j,k)=vold(i,j,k)+0.5*dt*(umean*vmean**2)utemp=u(i,j,k)-utempvtemp=v(i,j,k)-vtempchg=abs(utemp)+abs(vtemp)END DOEND DOEND DOEND DO!write dataIF(mod(n,plotgap)==0)THENtime(1+n/plotgap)=n*dtIF(myid.eq.0)THENPRINT*,'time',n*dtEND IFnameconfig='./data/u'plotnum=plotnum+1WRITE(numberfile,'(i0)')10000000+plotnumind=index(nameconfig,' ')-1nameconfig=nameconfig(1:ind)//numberfileind=index(nameconfig,' ')-1nameconfig=nameconfig(1:ind)//'.datbin'!write out using 2DECOMP&FFT MPI-IO routinesCALLdecomp_2d_write_one(1,u,nameconfig)nameconfig='./data/v'WRITE(numberfile,'(i0)')10000000+plotnumind=index(nameconfig,' ')-1nameconfig=nameconfig(1:ind)//numberfileind=index(nameconfig,' ')-1nameconfig=nameconfig(1:ind)//'.datbin'!write out using 2DECOMP&FFT MPI-IO routinesCALLdecomp_2d_write_one(1,v,nameconfig)END IFEND DOIF(myid.eq.0)THENPRINT*,'Finished time stepping'END IFCALLsystem_clock(finish,count_rate)IF(myid.eq.0)THENPRINT*,'Program took ',REAL(finish-start)/REAL(count_rate),'for execution'END IFIF(myid.eq.0)THEN! Save times at which output was made in text formatnameconfig='./data/tdata.dat'OPEN(unit=11,FILE=nameconfig,status="UNKNOWN")REWIND(11)DOj=1,1+Nt/plotgapWRITE(11,*)time(j)END DOCLOSE(11)! Save x grid points in text formatnameconfig='./data/xcoord.dat'OPEN(unit=11,FILE=nameconfig,status="UNKNOWN")REWIND(11)DOi=1,NxWRITE(11,*)x(i)END DOCLOSE(11)! Save y grid points in text formatnameconfig='./data/ycoord.dat'OPEN(unit=11,FILE=nameconfig,status="UNKNOWN")REWIND(11)DOj=1,NyWRITE(11,*)y(j)END DOCLOSE(11)! Save z grid points in text formatnameconfig='./data/zcoord.dat'OPEN(unit=11,FILE=nameconfig,status="UNKNOWN")REWIND(11)DOk=1,NzWRITE(11,*)z(k)END DOCLOSE(11)PRINT*,'Saved data'END IF! clean upCALLdecomp_2d_fft_finalizeCALLdecomp_2d_finalizeDEALLOCATE(u,v,uold,vold,uhat,vhat,&kx,ky,kz,x,y,z,&time,stat=AllocateStatus)IF(AllocateStatus.ne.0)STOPIF(myid.eq.0)THEN   PRINT*,'Program execution complete'END IFCALLMPI_FINALIZE(ierr)END PROGRAMmain

OpenCL

[edit |edit source]

In OpenCL[3] you write a 'Kernel' to do the intense work in parallel. You can choose to run the kernel on your cpu, gpu or some other coprocessor. The kernel is compiled during runtime for the device you have choosen.

To run the following code you need an valid OpenCL 1.2 framework for example AMD APP SDK[4] which runs also on non AMD CPU's. You also need clfft[5] an implementation of the FFT for OpenCL.

 

 

 

 

( C)
The main program to solve the 2-dimensional GrayScott equationCode download
#include"clFFT.h"#include<stdio.h>#include<stdlib.h>#include<math.h>#include<sys/time.h>#include"gs_functions.c"#if defined(cl_amd_fp64) || defined(cl_khr_fp64)#if defined(cl_amd_fp64)#pragma OPENCL EXTENSION cl_amd_fp64 : enable#elif defined(cl_khr_fp64)#pragma OPENCL EXTENSION cl_khr_fp64 : enable#endif// function declarations/definitions using double precision floating-point arithmetic#endifintmain(void){//time meassuringstructtimevaltvs;//variablesintNx=1024;intNy=1024;intplotnum=0;intTmax=2;intplottime=0;intplotgap=1;doubleLx=1.0;doubleLy=1.0;doubledt=0.0;doubleA=0.0;doubleB=0.0;doubleDu=0.0;doubleDv=0.0;//splitting coefficientsdoublea=0.5;doubleb=0.5;doublec=1.0;//loop countersinti=0;intj=0;intn=0;double*umax=NULL;double*vmax=NULL;parainit(&Nx,&Ny,&Tmax,&plotgap,&Lx,&Ly,&dt,&Du,&Dv,&A,&B);plottime=plotgap;vmax=(double*)malloc((Tmax/plotgap+1)*sizeof(double));umax=(double*)malloc((Tmax/plotgap+1)*sizeof(double));//openCL variablescl_platform_id*platform_id=NULL;cl_kernelfrequencies=NULL,initialdata=NULL,linearpart=NULL;cl_kernelnonlinearpart_a=NULL,nonlinearpart_b=NULL;cl_intret;cl_uintnum_platforms;// Detect how many platforms there are.ret=clGetPlatformIDs(0,NULL,&num_platforms);// Allocate enough space for the number of platforms.platform_id=(cl_platform_id*)malloc(num_platforms*sizeof(cl_platform_id));// Store the platformsret=clGetPlatformIDs(num_platforms,platform_id,NULL);printf("Found %d platform(s)!\n",num_platforms);cl_uint*num_devices;num_devices=(cl_uint*)malloc(num_platforms*sizeof(cl_uint));cl_device_id**device_id=NULL;device_id=(cl_device_id**)malloc(num_platforms*sizeof(cl_device_id*));// Detect number of devices in the platformsfor(i=0;i<num_platforms;i++){charbuf[65536];size_tsize;ret=clGetPlatformInfo(platform_id[i],CL_PLATFORM_VERSION,sizeof(buf),buf,&size);printf("%s\n",buf);ret=clGetDeviceIDs(platform_id[i],CL_DEVICE_TYPE_ALL,0,NULL,num_devices);printf("Found %d device(s) on platform %d!\n",num_devices[i],i);ret=clGetPlatformInfo(platform_id[i],CL_PLATFORM_NAME,sizeof(buf),buf,&size);printf("%s ",buf);// Store numDevices from platformdevice_id[i]=(cl_device_id*)malloc(num_devices[i]*sizeof(device_id));ret=clGetDeviceIDs(platform_id[i],CL_DEVICE_TYPE_ALL,num_devices[i],device_id[i],NULL);for(j=0;j<num_devices[i];j++){ret=clGetDeviceInfo(device_id[i][j],CL_DEVICE_NAME,sizeof(buf),buf,&size);printf("%s (%d,%d)\n",buf,i,j);}}//create context and command_queuecl_contextcontext=NULL;cl_command_queuecommand_queue=NULL;//Which platform and device do i choose?intchooseplatform=0;intchoosedevice=0;printf("Choose platform %d and device %d!\n",chooseplatform,choosedevice);context=clCreateContext(NULL,num_devices[chooseplatform],device_id[chooseplatform],NULL,NULL,&ret);if(ret!=CL_SUCCESS){printf("createContext ret:%d\n",ret);exit(1);}command_queue=clCreateCommandQueue(context,device_id[chooseplatform][choosedevice],0,&ret);if(ret!=CL_SUCCESS){printf("createCommandQueue ret:%d\n",ret);exit(1);}//OpenCL arrayscl_memcl_u=NULL,cl_v=NULL;cl_memcl_uhat=NULL,cl_vhat=NULL;cl_memcl_kx=NULL,cl_ky=NULL;//FFTclfftPlanHandleplanHandle;cl_memtmpBuffer=NULL;fftinit(&planHandle,&context,&command_queue,&tmpBuffer,Nx,Ny);//allocate gpu memory/cl_u=clCreateBuffer(context,CL_MEM_READ_WRITE,2*Nx*Ny*sizeof(double),NULL,&ret);cl_v=clCreateBuffer(context,CL_MEM_READ_WRITE,2*Nx*Ny*sizeof(double),NULL,&ret);cl_uhat=clCreateBuffer(context,CL_MEM_READ_WRITE,2*Nx*Ny*sizeof(double),NULL,&ret);cl_vhat=clCreateBuffer(context,CL_MEM_READ_WRITE,2*Nx*Ny*sizeof(double),NULL,&ret);cl_kx=clCreateBuffer(context,CL_MEM_READ_WRITE,Nx*sizeof(double),NULL,&ret);cl_ky=clCreateBuffer(context,CL_MEM_READ_WRITE,Ny*sizeof(double),NULL,&ret);printf("allocated space\n");//load the kernelsloadKernel(&frequencies,&context,&device_id[chooseplatform][choosedevice],"frequencies");loadKernel(&initialdata,&context,&device_id[chooseplatform][choosedevice],"initialdata");loadKernel(&linearpart,&context,&device_id[chooseplatform][choosedevice],"linearpart");loadKernel(&nonlinearpart_a,&context,&device_id[chooseplatform][choosedevice],"nonlinearpart_a");loadKernel(&nonlinearpart_b,&context,&device_id[chooseplatform][choosedevice],"nonlinearpart_b");size_tglobal_work_size[1]={Nx*Ny};size_tglobal_work_size_X[1]={Nx};size_tglobal_work_size_Y[1]={Ny};//frequenciesret=clSetKernelArg(frequencies,0,sizeof(cl_mem),(void*)&cl_kx);ret=clSetKernelArg(frequencies,1,sizeof(double),(void*)&Lx);ret=clSetKernelArg(frequencies,2,sizeof(int),(void*)&Nx);ret=clEnqueueNDRangeKernel(command_queue,frequencies,1,NULL,global_work_size_X,NULL,0,NULL,NULL);ret=clFinish(command_queue);ret=clSetKernelArg(frequencies,0,sizeof(cl_mem),(void*)&cl_ky);ret=clSetKernelArg(frequencies,1,sizeof(double),(void*)&Ly);ret=clSetKernelArg(frequencies,2,sizeof(int),(void*)&Ny);ret=clEnqueueNDRangeKernel(command_queue,frequencies,1,NULL,global_work_size_Y,NULL,0,NULL,NULL);ret=clFinish(command_queue);//printCL(&cl_kx,&command_queue,Nx,1);//printCL(&cl_ky,&command_queue,1,Ny);//inintial dataret=clSetKernelArg(initialdata,0,sizeof(cl_mem),(void*)&cl_u);ret=clSetKernelArg(initialdata,1,sizeof(cl_mem),(void*)&cl_v);ret=clSetKernelArg(initialdata,2,sizeof(int),(void*)&Nx);ret=clSetKernelArg(initialdata,3,sizeof(int),(void*)&Ny);ret=clSetKernelArg(initialdata,4,sizeof(double),(void*)&Lx);ret=clSetKernelArg(initialdata,5,sizeof(double),(void*)&Ly);ret=clEnqueueNDRangeKernel(command_queue,initialdata,1,NULL,global_work_size,NULL,0,NULL,NULL);ret=clFinish(command_queue);//make outputwritedata_C(&cl_u,&command_queue,Nx,Ny,plotnum,"u");writedata_C(&cl_v,&command_queue,Nx,Ny,plotnum,"v");umax[plotnum]=writeimage(&cl_u,&command_queue,Nx,Ny,plotnum,"u");vmax[plotnum]=writeimage(&cl_v,&command_queue,Nx,Ny,plotnum,"v");printf("Got initial data, starting timestepping\n");mtime_s(&tvs);for(n=0;n<=Tmax;n++){//nonlinearpart_aret=clSetKernelArg(nonlinearpart_a,0,sizeof(cl_mem),(void*)&cl_u);ret=clSetKernelArg(nonlinearpart_a,1,sizeof(cl_mem),(void*)&cl_v);ret=clSetKernelArg(nonlinearpart_a,2,sizeof(double),(void*)&A);ret=clSetKernelArg(nonlinearpart_a,3,sizeof(double),(void*)&dt);ret=clSetKernelArg(nonlinearpart_a,4,sizeof(double),(void*)&a);ret=clEnqueueNDRangeKernel(command_queue,nonlinearpart_a,1,NULL,global_work_size,NULL,0,NULL,NULL);ret=clFinish(command_queue);//nonlinearpart_bret=clSetKernelArg(nonlinearpart_b,0,sizeof(cl_mem),(void*)&cl_u);ret=clSetKernelArg(nonlinearpart_b,1,sizeof(cl_mem),(void*)&cl_v);ret=clSetKernelArg(nonlinearpart_b,2,sizeof(double),(void*)&A);ret=clSetKernelArg(nonlinearpart_b,3,sizeof(double),(void*)&dt);ret=clSetKernelArg(nonlinearpart_b,4,sizeof(double),(void*)&b);ret=clEnqueueNDRangeKernel(command_queue,nonlinearpart_b,1,NULL,global_work_size,NULL,0,NULL,NULL);ret=clFinish(command_queue);//linearfft2dfor(&cl_u,&cl_uhat,&planHandle,&command_queue,&tmpBuffer);fft2dfor(&cl_v,&cl_vhat,&planHandle,&command_queue,&tmpBuffer);//printf("A%f,B%f\n",A,B);ret=clSetKernelArg(linearpart,0,sizeof(cl_mem),(void*)&cl_uhat);ret=clSetKernelArg(linearpart,1,sizeof(cl_mem),(void*)&cl_vhat);ret=clSetKernelArg(linearpart,2,sizeof(cl_mem),(void*)&cl_kx);ret=clSetKernelArg(linearpart,3,sizeof(cl_mem),(void*)&cl_ky);ret=clSetKernelArg(linearpart,4,sizeof(double),(void*)&Du);ret=clSetKernelArg(linearpart,5,sizeof(double),(void*)&Dv);ret=clSetKernelArg(linearpart,6,sizeof(double),(void*)&A);ret=clSetKernelArg(linearpart,7,sizeof(double),(void*)&B);ret=clSetKernelArg(linearpart,8,sizeof(double),(void*)&dt);ret=clSetKernelArg(linearpart,9,sizeof(double),(void*)&c);ret=clSetKernelArg(linearpart,10,sizeof(int),(void*)&Nx);ret=clSetKernelArg(linearpart,11,sizeof(int),(void*)&Ny);ret=clEnqueueNDRangeKernel(command_queue,linearpart,1,NULL,global_work_size,NULL,0,NULL,NULL);ret=clFinish(command_queue);fft2dback(&cl_u,&cl_uhat,&planHandle,&command_queue,&tmpBuffer);fft2dback(&cl_v,&cl_vhat,&planHandle,&command_queue,&tmpBuffer);//nonlinearpart_bret=clSetKernelArg(nonlinearpart_b,0,sizeof(cl_mem),(void*)&cl_u);ret=clSetKernelArg(nonlinearpart_b,1,sizeof(cl_mem),(void*)&cl_v);ret=clSetKernelArg(nonlinearpart_b,2,sizeof(double),(void*)&A);ret=clSetKernelArg(nonlinearpart_b,3,sizeof(double),(void*)&dt);ret=clSetKernelArg(nonlinearpart_b,4,sizeof(double),(void*)&b);ret=clEnqueueNDRangeKernel(command_queue,nonlinearpart_b,1,NULL,global_work_size,NULL,0,NULL,NULL);ret=clFinish(command_queue);//nonlinearpart_aret=clSetKernelArg(nonlinearpart_a,0,sizeof(cl_mem),(void*)&cl_u);ret=clSetKernelArg(nonlinearpart_a,1,sizeof(cl_mem),(void*)&cl_v);ret=clSetKernelArg(nonlinearpart_a,2,sizeof(double),(void*)&A);ret=clSetKernelArg(nonlinearpart_a,3,sizeof(double),(void*)&dt);ret=clSetKernelArg(nonlinearpart_a,4,sizeof(double),(void*)&a);ret=clEnqueueNDRangeKernel(command_queue,nonlinearpart_a,1,NULL,global_work_size,NULL,0,NULL,NULL);ret=clFinish(command_queue);// doneif(n==plottime){printf("time:%f, step:%d,%d,umax:%f,vmax:%f\n",n*dt,n,plotnum,umax[plotnum],vmax[plotnum]);plottime=plottime+plotgap;plotnum=plotnum+1;writedata_C(&cl_u,&command_queue,Nx,Ny,plotnum,"u");writedata_C(&cl_v,&command_queue,Nx,Ny,plotnum,"v");umax[plotnum]=writeimage(&cl_u,&command_queue,Nx,Ny,plotnum,"u");vmax[plotnum]=writeimage(&cl_v,&command_queue,Nx,Ny,plotnum,"v");}}//end timesteppingprintf("Finished time stepping\n");mtime_e(&tvs,"Programm took:");writearray(umax,(Tmax/plotgap)+1,"u");writearray(vmax,(Tmax/plotgap)+1,"v");free(umax);free(vmax);clReleaseMemObject(cl_u);clReleaseMemObject(cl_v);clReleaseMemObject(cl_uhat);clReleaseMemObject(cl_vhat);clReleaseMemObject(cl_kx);clReleaseMemObject(cl_ky);ret=clReleaseKernel(initialdata);ret=clReleaseKernel(frequencies);ret=clReleaseKernel(linearpart);ret=clReleaseKernel(nonlinearpart_a);ret=clReleaseKernel(nonlinearpart_b);fftdestroy(&planHandle,&tmpBuffer);ret=clReleaseCommandQueue(command_queue);ret=clReleaseContext(context);for(i=0;i<num_platforms;i++){free(device_id[i]);}free(device_id);free(platform_id);free(num_devices);printf("Program execution complete\n");return0;}

You can change "chooseplatform" and "choosedevice" to the device you want to compute on.

 

 

 

 

( C)
This file contains some functions to give the main code better read ability.Code download
////////This file contains only functions for main_gs.c////#include<CL/cl.h>#include<sys/time.h>#include<string.h>//read the INPUTFILEvoidparainit(int*Nx,int*Ny,int*Tmax,int*plotgap,double*Lx,double*Ly,double*dt,double*Du,double*Dv,double*A,double*B){intintcomm[4];doubledpcomm[7];charInputFileName[]="./INPUTFILE";FILE*fp;fp=fopen(InputFileName,"r");if(!fp){fprintf(stderr,"Failed to load IPUTFILE.\n");exit(1);}intierr=fscanf(fp,"%d %d %d %d %lf %lf %lf %lf %lf %lf %lf",&intcomm[0],&intcomm[1],&intcomm[2],&intcomm[3],&dpcomm[0],&dpcomm[1],&dpcomm[2],&dpcomm[3],&dpcomm[4],&dpcomm[5],&dpcomm[6]);if(ierr!=11){fprintf(stderr,"INPUTFILE corrupted:%d\n",ierr);exit(1);}fclose(fp);printf("NX %d\nNY %d\nTmax %d\nplotgap %d\n",intcomm[0],intcomm[1],intcomm[2],intcomm[3]);printf("Lx %lf\nLy %lf\ndt %lf\nDu %lf\nDv %lf\nF %lf\nk %lf\n",dpcomm[0],dpcomm[1],dpcomm[2],dpcomm[3],dpcomm[4],dpcomm[5],dpcomm[6]);*Nx=intcomm[0];*Ny=intcomm[1];*Tmax=intcomm[2];*plotgap=intcomm[3];*Lx=dpcomm[0];*Ly=dpcomm[1];*dt=dpcomm[2];*Du=dpcomm[3];*Dv=dpcomm[4];*A=dpcomm[5];*B=dpcomm[5]+dpcomm[6];printf("Read Inputfile\n");};//loads a kernel from a file#define MAX_SOURCE_SIZE 8192voidloadKernel(cl_kernel*kernel,cl_context*context,cl_device_id*device_id,char*name){cl_programp_kernel;cl_intret=0;size_tsource_size;char*source_str;charnameconfig[100];inti=0;source_str=(char*)malloc(MAX_SOURCE_SIZE*sizeof(char));for(i=0;i<MAX_SOURCE_SIZE;i++){source_str[i]='\0';}FILE*fp;strcpy(nameconfig,"./");strcat(nameconfig,name);strcat(nameconfig,".cl");fp=fopen(nameconfig,"r");if(!fp){fprintf(stderr,"Failed to load kernel.\n");exit(1);}source_size=fread(source_str,sizeof(char),MAX_SOURCE_SIZE,fp);fclose(fp);p_kernel=clCreateProgramWithSource(*context,1,(constchar**)&source_str,(constsize_t*)&source_size,&ret);if(ret!=CL_SUCCESS){printf("createProgram ret:%d\n",ret);exit(1);}ret=clBuildProgram(p_kernel,1,&*device_id,NULL,NULL,NULL);if(ret!=CL_SUCCESS){printf("buildProgram ret:%d\n",ret);exit(1);}*kernel=clCreateKernel(p_kernel,name,&ret);if(ret!=CL_SUCCESS){printf("createKernel ret:%d\n",ret);exit(1);}ret=clReleaseProgram(p_kernel);if(ret!=CL_SUCCESS){printf("releaseProgram ret:%d\n",ret);exit(1);}printf("got kernel %s\n",name);free(source_str);};//displays an array on gpu memory (debug)voidprintCL(cl_mem*cl_u,cl_command_queue*command_queue,intNx,intNy){double*u;inti=0;intj=0;cl_intret=0;u=(double*)malloc(Nx*Ny*sizeof(double));ret=clEnqueueReadBuffer(*command_queue,*cl_u,CL_TRUE,0,Nx*Ny*sizeof(double),u,0,NULL,NULL);ret=clFinish(*command_queue);if(ret!=CL_SUCCESS){printf("failed");}for(i=0;i<Nx;i++){for(j=0;j<Ny;j++){printf("%f ",u[i+Nx*j]);}printf("\n");}printf("\n");free(u);};//displays an real part of complex array on gpu memory (debug)voidprintCL_C(cl_mem*cl_u,cl_command_queue*command_queue,intNx,intNy){double*u;inti=0;intj=0;cl_intret=0;u=(double*)malloc(2*Nx*Ny*sizeof(double));ret=clEnqueueReadBuffer(*command_queue,*cl_u,CL_TRUE,0,2*Nx*Ny*sizeof(double),u,0,NULL,NULL);ret=clFinish(*command_queue);if(ret!=CL_SUCCESS){printf("failed");}for(i=0;i<Nx;i++){for(j=0;j<Ny;j++){printf("%f ",u[2*i+Nx*2*j]);}printf("\n");}printf("\n");free(u);};//make plans for FFTvoidfftinit(clfftPlanHandle*planHandle,cl_context*context,cl_command_queue*command_queue,cl_mem*tmpBuffer,intNx,intNy){clfftDimdim=CLFFT_2D;size_tclLength[2]={Nx,Ny};cl_intret=0;// Setup clFFT.clfftSetupDatafftSetup;ret=clfftInitSetupData(&fftSetup);if(ret!=CL_SUCCESS){printf("clFFT init ret:%d\n",ret);exit(1);}ret=clfftSetup(&fftSetup);if(ret!=CL_SUCCESS){printf("clFFT Setup ret:%d\n",ret);exit(1);}// Create a default plan for a complex FFT.ret=clfftCreateDefaultPlan(&*planHandle,*context,dim,clLength);if(ret!=CL_SUCCESS){printf("clFFT Plan ret:%d\n",ret);exit(1);}// Set plan parameters.ret=clfftSetPlanPrecision(*planHandle,CLFFT_DOUBLE);if(ret!=CL_SUCCESS){printf("clFFT Precision ret:%d\n",ret);exit(1);}//ret = clfftSetPlanBatchSize(*planHandle, (size_t) Ny );//if(ret!=CL_SUCCESS){printf("clFFT Batch ret:%d\n",ret);exit(1); }ret=clfftSetLayout(*planHandle,CLFFT_COMPLEX_INTERLEAVED,CLFFT_COMPLEX_INTERLEAVED);if(ret!=CL_SUCCESS){printf("clFFT Layout ret:%d\n",ret);exit(1);}ret=clfftSetResultLocation(*planHandle,CLFFT_OUTOFPLACE);if(ret!=CL_SUCCESS){printf("clFFT Place ret:%d\n",ret);exit(1);}// Bake the plan.ret=clfftBakePlan(*planHandle,1,&*command_queue,NULL,NULL);if(ret!=CL_SUCCESS){printf("clFFT Bake ret:%d\n",ret);exit(1);}// Create temporary buffer.// Size of temp buffer.size_ttmpBufferSize=0;ret=clfftGetTmpBufSize(*planHandle,&tmpBufferSize);if((ret==CL_SUCCESS)&&(tmpBufferSize>0)){*tmpBuffer=clCreateBuffer(*context,CL_MEM_READ_WRITE,tmpBufferSize,NULL,&ret);if(ret!=CL_SUCCESS){printf("Error with tmpBuffer clCreateBuffer\n");exit(1);}}};//destroy plansvoidfftdestroy(clfftPlanHandle*planHandle,cl_mem*tmpBuffer){cl_intret=0;clReleaseMemObject(*tmpBuffer);ret=clfftDestroyPlan(&*planHandle);if(ret!=0){printf("Error while destroying fft");exit(1);}clfftTeardown();};//fft2dfowardvoidfft2dfor(cl_mem*cl_u,cl_mem*cl_uhat,clfftPlanHandle*planHandle,cl_command_queue*command_queue,cl_mem*tmpBuffer){intret=0;ret=clfftEnqueueTransform(*planHandle,CLFFT_FORWARD,1,command_queue,0,NULL,NULL,&*cl_u,&*cl_uhat,*tmpBuffer);if(ret!=CL_SUCCESS){printf("FFT failedA%d",ret);}ret=clFinish(*command_queue);if(ret!=CL_SUCCESS){printf("FFT failedB%d",ret);}};//fft2dbackvoidfft2dback(cl_mem*cl_u,cl_mem*cl_uhat,clfftPlanHandle*planHandle,cl_command_queue*command_queue,cl_mem*tmpBuffer){intret=0;ret=clfftEnqueueTransform(*planHandle,CLFFT_BACKWARD,1,command_queue,0,NULL,NULL,&*cl_uhat,&*cl_u,*tmpBuffer);if(ret!=CL_SUCCESS){printf("FFT failedC%d",ret);}ret=clFinish(*command_queue);if(ret!=CL_SUCCESS){printf("FFT failedD%d",ret);}};//writes an image to disk and returns the maximum of cl_udoublewriteimage(cl_mem*cl_u,cl_command_queue*command_queue,intNx,intNy,intplotnum,char*prefix){inti=0;cl_intret=0;intheader=54;double*u;u=(double*)malloc(2*Nx*Ny*sizeof(double));ret=clEnqueueReadBuffer(*command_queue,*cl_u,CL_TRUE,0,2*Nx*Ny*sizeof(double),u,0,NULL,NULL);ret=clFinish(*command_queue);if(ret!=0){printf("Error hahah");}doublemax=0.0;for(i=0;i<Nx*Ny;i++){if(u[2*i]>max){max=u[2*i];}}unsignedchar*picture=(unsignedchar*)malloc((3*Nx*Ny+header)*sizeof(unsignedchar));for(i=0;i<Nx*Ny;i++){picture[3*i+header+0]=(unsignedchar)(255*u[2*i]/max);picture[3*i+header+1]=(unsignedchar)(255*u[2*i]/max);picture[3*i+header+2]=(unsignedchar)(255*u[2*i]/max);}//header for bmp fileintw=Ny;inth=Nx;intpadSize=(4-w%4)%4;intfilesize=header+3*h*w+h*padSize;unsignedcharbmppad[3]={0,0,0};//paddingpicture[0]='B';picture[1]='M';picture[2]=(unsignedchar)(filesize);picture[3]=(unsignedchar)(filesize>>8);picture[4]=(unsignedchar)(filesize>>16);picture[5]=(unsignedchar)(filesize>>24);picture[6]=0;picture[7]=0;picture[8]=0;picture[9]=0;picture[10]=54;picture[11]=0;picture[12]=0;picture[13]=0;picture[14]=40;picture[15]=0;picture[16]=0;picture[17]=0;//3picture[18]=(unsignedchar)(w);picture[19]=(unsignedchar)(w>>8);picture[20]=(unsignedchar)(w>>16);picture[21]=(unsignedchar)(w>>24);picture[22]=(unsignedchar)(h);picture[23]=(unsignedchar)(h>>8);picture[24]=(unsignedchar)(h>>16);picture[25]=(unsignedchar)(h>>24);picture[26]=1;picture[27]=0;picture[28]=24;picture[29]=0;for(i=30;i<54;i++){picture[i]=0;}FILE*fp;//file namechartmp_str[10];charnameconfig[100];strcpy(nameconfig,"./data/");strcat(nameconfig,prefix);sprintf(tmp_str,"%d",10000000+plotnum);strcat(nameconfig,tmp_str);strcat(nameconfig,".bmp");fp=fopen(nameconfig,"wb");if(!fp){fprintf(stderr,"Failed to write data.\n");exit(1);}for(i=0;i<header;i++){fwrite(&picture[i],sizeof(unsignedchar),1,fp);}for(i=0;i<h;i++){fwrite(picture+(w*(h-i-1)*3)+header,3*sizeof(unsignedchar),w,fp);fwrite(bmppad,sizeof(unsignedchar),(4-(w*3)%4)%4,fp);}fclose(fp);free(picture);free(u);returnmax;};//writes the array to disk (debug)voidwritedata(cl_mem*cl_u,cl_command_queue*command_queue,intNx,intNy,intplotnum,char*prefix){inti=0;cl_intret=0;double*u;u=(double*)malloc(Nx*Ny*sizeof(double));ret=clEnqueueReadBuffer(*command_queue,*cl_u,CL_TRUE,0,Nx*Ny*sizeof(double),u,0,NULL,NULL);ret=clFinish(*command_queue);if(ret!=0){printf("Error hahah");}FILE*fp;//file namechartmp_str[10];charnameconfig[100];strcpy(nameconfig,"./data/");strcat(nameconfig,prefix);sprintf(tmp_str,"%d",10000000+plotnum);strcat(nameconfig,tmp_str);strcat(nameconfig,".datbin");fp=fopen(nameconfig,"wb");if(!fp){fprintf(stderr,"Failed to write data.\n");exit(1);}for(i=0;i<Nx;i++){fwrite(u+i*Ny,sizeof(double),Ny,fp);}fclose(fp);free(u);};//writes the real part of complex array to diskvoidwritedata_C(cl_mem*cl_u,cl_command_queue*command_queue,intNx,intNy,intplotnum,char*prefix){inti=0;cl_intret=0;double*u;u=(double*)malloc(2*Nx*Ny*sizeof(double));ret=clEnqueueReadBuffer(*command_queue,*cl_u,CL_TRUE,0,2*Nx*Ny*sizeof(double),u,0,NULL,NULL);ret=clFinish(*command_queue);if(ret!=0){printf("Error hahah");}for(i=0;i<Nx*Ny;i++){u[i]=u[2*i];}FILE*fp;//file namechartmp_str[10];charnameconfig[100];strcpy(nameconfig,"./data/");strcat(nameconfig,prefix);sprintf(tmp_str,"%d",10000000+plotnum);strcat(nameconfig,tmp_str);strcat(nameconfig,".datbin");fp=fopen(nameconfig,"wb");if(!fp){fprintf(stderr,"Failed to write data.\n");exit(1);}for(i=0;i<Nx;i++){fwrite(u+i*Ny,sizeof(double),Ny,fp);}fclose(fp);free(u);};//loades the data from disk (debug)voidloaddata(cl_mem*cl_u,cl_command_queue*command_queue,intNx,intNy,char*name){inti=0;intnumread=0;cl_intret=0;double*u;u=(double*)malloc(Nx*Ny*sizeof(double));FILE*fp;fp=fopen(name,"rb");if(!fp){fprintf(stderr,"Failed to open file.\n");exit(1);}for(i=0;i<Nx;i++){numread=fread(u+i*Ny,sizeof(double),Ny,fp);if(numread!=Ny){fprintf(stderr,"Failed to read file.\n");exit(1);}}fclose(fp);ret=clEnqueueWriteBuffer(*command_queue,*cl_u,CL_TRUE,0,Nx*Ny*sizeof(double),u,0,NULL,NULL);ret=clFinish(*command_queue);if(ret!=0){printf("Error hahah");}free(u);};//writes an array to disc ASCIIvoidwritearray(double*u,intlength,char*prefix){inti=0;charnameconfig[100];strcpy(nameconfig,"./data/");strcat(nameconfig,prefix);FILE*fp;fp=fopen(nameconfig,"w");if(!fp){fprintf(stderr,"Failed to write data.\n");exit(1);}for(i=0;i<length;i++){fprintf(fp,"%.17g\n",u[i]);}};//start measuring timevoidmtime_s(structtimeval*tvs){gettimeofday(&*tvs,NULL);};//end measuring time+ printingvoidmtime_e(structtimeval*tvs,char*printme){structtimevaltve;doubleelapsedTime;gettimeofday(&tve,NULL);elapsedTime=(tve.tv_sec-(*tvs).tv_sec)*1000.0;// sec to mselapsedTime+=(tve.tv_usec-(*tvs).tv_usec)/1000.0;// us to msprintf("%s%lfms\n",printme,elapsedTime);};


Now all the OpenCL-Kernel's, which should be saved in the same folder.

 

 

 

 

( D)
initialdata.clCode download
#if defined(cl_amd_fp64) || defined(cl_khr_fp64)#if defined(cl_amd_fp64)#pragma OPENCL EXTENSION cl_amd_fp64 : enable#elif defined(cl_khr_fp64)#pragma OPENCL EXTENSION cl_khr_fp64 : enable#endif// function declarations/definitions using double precision doubleing-point arithmetic#endif__kernelvoidinitialdata(__globaldouble2*u,__globaldouble2*v,constintNx,constintNy,constdoubleLx,constdoubleLy){constintind=get_global_id(0);inti,j;j=floor((double)(ind)/(double)Nx);i=ind-j*Nx;doublex=(-1.0+(2.0*(double)i/(double)Nx))*M_PI*Lx;doubley=(-1.0+(2.0*(double)j/(double)Ny))*M_PI*Ly;u[ind].x=0.5+exp(-1.0*(x*x+y*y)-1.0);//u[ind].y=0.0;v[ind].x=0.1+exp(-1.0*(x*x+y*y)-1.0);v[ind].y=0.0;}


 

 

 

 

( E)
frequencies.clCode download
#if defined(cl_amd_fp64) || defined(cl_khr_fp64)#if defined(cl_amd_fp64)#pragma OPENCL EXTENSION cl_amd_fp64 : enable#elif defined(cl_khr_fp64)#pragma OPENCL EXTENSION cl_khr_fp64 : enable#endif// function declarations/definitions using double precision doubleing-point arithmetic#endif__kernelvoidfrequencies(__globaldouble*kx,constdoubleLx,constintNx){constintind=get_global_id(0);if(ind<Nx/2)kx[ind]=-1.0*((double)((ind))/Lx)*((double)(ind)/Lx);if(ind==Nx/2)kx[ind]=0.0;if(ind>Nx/2)kx[ind]=-1.0*(double)(Nx-ind)/Lx*(double)(Nx-ind)/Lx;}


 

 

 

 

( F)
linearpart.clCode download
#if defined(cl_amd_fp64) || defined(cl_khr_fp64)#if defined(cl_amd_fp64)#pragma OPENCL EXTENSION cl_amd_fp64 : enable#elif defined(cl_khr_fp64)#pragma OPENCL EXTENSION cl_khr_fp64 : enable#endif// function declarations/definitions using double precision doubleing-point arithmetic#endif__kernelvoidlinearpart(__globaldouble2*uhat,__globaldouble2*vhat,__globalconstdouble*Kx,__globalconstdouble*Ky,constdoubleDu,constdoubleDv,constdoubleA,constdoubleB,constdoubledt,constdoublec,constintNx,constintNy){constintind=get_global_id(0);inti,j,k;j=floor((double)(ind)/(double)Nx);i=ind-j*Nx;doubleuexp;doublevexp;uexp=dt*c*(-1.0*A+Du*(Kx[i]+Ky[j]));vexp=dt*c*(-1.0*B+Dv*(Kx[i]+Ky[j]));uhat[ind].x=exp(uexp)*uhat[ind].x;uhat[ind].y=exp(uexp)*uhat[ind].y;vhat[ind].x=exp(vexp)*vhat[ind].x;vhat[ind].y=exp(vexp)*vhat[ind].y;}


 

 

 

 

( G)
nonlinearpart_a.clCode download
#if defined(cl_amd_fp64) || defined(cl_khr_fp64)#if defined(cl_amd_fp64)#pragma OPENCL EXTENSION cl_amd_fp64 : enable#elif defined(cl_khr_fp64)#pragma OPENCL EXTENSION cl_khr_fp64 : enable#endif// function declarations/definitions using double precision doubleing-point arithmetic#endif__kernelvoidnonlinearpart_a(__globaldouble2*u,__globaldouble2*v,constdoubleA,constdoubledt,constdoublea){constintind=get_global_id(0);//u[ind].x=A/(v[ind].x*v[ind].x)+(u[ind].x-A/(v[ind].x*v[ind].x))*exp(dt*a*(-1.0)*v[ind].x*v[ind].x);u[ind].x=u[ind].x*exp(dt*a*(-1.0)*v[ind].x*v[ind].x);u[ind].y=0.0;}


 

 

 

 

( H)
nonlinearpart_b.clCode download
#if defined(cl_amd_fp64) || defined(cl_khr_fp64)#if defined(cl_amd_fp64)#pragma OPENCL EXTENSION cl_amd_fp64 : enable#elif defined(cl_khr_fp64)#pragma OPENCL EXTENSION cl_khr_fp64 : enable#endif// function declarations/definitions using double precision doubleing-point arithmetic#endif__kernelvoidnonlinearpart_b(__globaldouble2*u,__globaldouble2*v,constdoubleA,constdoubledt,constdoubleb){constintind=get_global_id(0);//v[ind].x=-v[ind].x/(u[ind].x*dt*b*v[ind].x-1.0);v[ind].x=1.0/(-0.5*A*(dt*dt*b*b)-u[ind].x*(dt*b)+1/v[ind].x);v[ind].y=0.0;u[ind].x=A*dt*b+u[ind].x;}

Here is your makefile! Change the clFFT to your path.

 

 

 

 

( I)
makefileCode download
####### Compiler, tools and optionsCC=gccCFLAGS=-m64-pipe-O3-Wno-unused-but-set-parameter-WallDEL_FILE=rm-fLIBS=-lOpenCL-I/home/quell/grayscott/clFFT/src/package/include/-L/home/quell/grayscott/clFFT/src/package/lib64-lclFFT-lm######## Build rulesCL:main_gs.c$(DEL_FILE)gs$(DEL_FILE)*.o$(CC)$(CFLAGS)-ogsmain_gs.c$(LIBS)exportLD_LIBRARY_PATH=/home/quell/grayscott/clFFT/src/package/lib64:$$LD_LIBRARY_PATHout:xdmfcreate.f90gfortran-ooutxdmfcreate.f90clean:$(DEL_FILE)gs$(DEL_FILE)out$(DEL_FILE)*.xmf$(DEL_FILE)*.o$(DEL_FILE)./data/u*$(DEL_FILE)./data/v*cleandata:$(DEL_FILE)./data/u*$(DEL_FILE)./data/v*

To run you need the INPUTFILE with the parameters.

 

 

 

 

( J)
INPUTFILECode download
102410241000001004.04.00.010.040.0050.0380.076NxNyTmaxPlotgapLxLyDTDuDvFk
To view the data, compile and run xdmfcreate.f90. Open udata.xdmf and vdata.xdmf in ParaView.

 

 

 

 

( K)
xdmfcreate.f90Code download
PROGRAMXdmfCreate!--------------------------------------------------------------------------------! .. Purpose ..! XdmfCreate is a postprocessing program which creates header files for! Paraview and Visit! It uses the INPUTFILE and assumes that the filenames in the program! are! consistent with those in the current file.!! .. PARAMETERS .. INITIALIZED IN INPUTFILE! Nx = power of two, number of modes in the x direction! Ny = power of two, number of modes in the y direction! Nt = the number of timesteps! plotgap = the number of timesteps to take before plotting! Lx = definition of the periodic domain of computation in x direction! Ly = definition of the periodic domain of computation in y direction! Dt = the time step!! REFERENCES!! www.xdmf.org! Paraview users mailing list! VisIt users mailing list!! ACCURACY!! ERROR INDICATORS AND WARNINGS!! FURTHER COMMENTS!------------------------------------------------------------------------------------! EXTERNAL ROUTINES REQUIREDIMPLICIT NONE! .. Scalar Arguments ..INTEGER(KIND=4)::Nx,Ny,Nt,plotgapREAL(KIND=8)::Lx,Ly,DT! .. Local scalars ..INTEGER(KIND=4)::stat,plotnum,ind,n,numplotsREAL(KIND=8),PARAMETER::pi=3.14159265358979323846264338327950288419716939937510d0! .. Local Arrays ..CHARACTER*50::InputFileName,OutputFileName,OutputFileName2CHARACTER*10::number_fileInputFileName='INPUTFILE'OPEN(unit=11,FILE=trim(InputFileName),status="OLD")REWIND(11)READ(11,*)Nx,Ny,Nt,plotgap,Lx,Ly,DTCLOSE(11)plotnum=0numplots=1+Nt/plotgapOutputFileName='udata.xmf'OPEN(unit=11,FILE=trim(OutputFileName),status="UNKNOWN")REWIND(11)WRITE(11,'(a)')"<?xml version=""1.0"" ?>"WRITE(11,'(a)')"<!DOCTYPE Xdmf SYSTEM ""Xdmf.dtd"" []>"WRITE(11,'(a)')"<Xdmf xmlns:xi=""http://www.w3.org/2001/XInclude"" Version=""2.0"">"WRITE(11,'(a)')"<Domain>"WRITE(11,'(a)')" <Topology name=""topo"" TopologyType=""3DCoRectMesh"""WRITE(11,*)"  Dimensions=""",Nx,Ny,""">"WRITE(11,'(a)')" </Topology>"WRITE(11,'(a)')" <Geometry name=""geo"" Type=""ORIGIN_DXDYDZ"">"WRITE(11,'(a)')"  <!-- Origin -->"WRITE(11,'(a)')"  <DataItem Format=""XML"" Dimensions=""2"">"WRITE(11,*)-pi*Lx,-pi*LyWRITE(11,'(a)')"  </DataItem>"WRITE(11,'(a)')"  <!-- DxDyDz -->"WRITE(11,'(a)')"  <DataItem Format=""XML"" Dimensions=""2"">"WRITE(11,*)REAL(2.0*pi*Lx/Nx,kind(0d0)),REAL(2.0*pi*Ly/Ny,kind(0d0))WRITE(11,'(a)')"   </DataItem>"WRITE(11,'(a)')"  </Geometry>"WRITE(11,'(a)')WRITE(11,'(a)')"       <Grid Name=""TimeSeries"" GridType=""Collection"" CollectionType=""Temporal"">"WRITE(11,'(a)')"               <Time TimeType=""HyperSlab"">"WRITE(11,'(a)')"                       <DataItem Format=""XML"" NumberType=""Float"" Dimensions=""2"">"WRITE(11,*)"                   0.0",dt," 2"WRITE(11,'(a)')"                       </DataItem>"WRITE(11,'(a)')"               </Time>"DOn=1,numplotsOutputFileName='data/u'ind=index(OutputFileName,' ')-1WRITE(number_file,'(i0)')10000000+plotnumOutputFileName=OutputFileName(1:ind)//number_fileind=index(OutputFileName,' ')-1OutputFileName=OutputFileName(1:ind)//'.datbin'OutputFileName2="<Grid Name=""T"WRITE(number_file,'(i0)')nOutputFileName2=OutputFileName2(1:13)//number_fileind=index(number_file,' ')-1OutputFileName2=OutputFileName2(1:13+ind)//'" GridType="Uniform">'plotnum=plotnum+1WRITE(11,'(a)')WRITE(11,'(3X,a)')"   ",OutputFileName2WRITE(11,'(a)')"       <Topology Reference=""/Xdmf/Domain/Topology[1]""/>"WRITE(11,'(a)')"       <Geometry Reference=""/Xdmf/Domain/Geometry[1]""/>"WRITE(11,'(a)')"        <Attribute Name=""Displacement"" Center=""Node"">"WRITE(11,'(a)')"               <DataItem Format=""Binary"" "WRITE(11,'(a)')"                DataType=""Float"" Precision=""8"" Endian=""Little"" "WRITE(11,*)"            Dimensions=""",Nx,Ny,""">"WRITE(11,'(16X,a)')"           ",OutputFileNameWRITE(11,'(a)')"               </DataItem>"WRITE(11,'(a)')"       </Attribute>"WRITE(11,'(3X,a)')"</Grid>"END DO        WRITE(11,'(a)')"        </Grid>"WRITE(11,'(a)')"</Domain>"WRITE(11,'(a)')"</Xdmf>"CLOSE(11)!same for vplotnum=0numplots=1+Nt/plotgapOutputFileName='vdata.xmf'OPEN(unit=11,FILE=trim(OutputFileName),status="UNKNOWN")REWIND(11)WRITE(11,'(a)')"<?xml version=""1.0"" ?>"WRITE(11,'(a)')"<!DOCTYPE Xdmf SYSTEM ""Xdmf.dtd"" []>"WRITE(11,'(a)')"<Xdmf xmlns:xi=""http://www.w3.org/2001/XInclude"" Version=""2.0"">"WRITE(11,'(a)')"<Domain>"WRITE(11,'(a)')" <Topology name=""topo"" TopologyType=""3DCoRectMesh"""WRITE(11,*)"  Dimensions=""",Nx,Ny,""">"WRITE(11,'(a)')" </Topology>"WRITE(11,'(a)')" <Geometry name=""geo"" Type=""ORIGIN_DXDYDZ"">"WRITE(11,'(a)')"  <!-- Origin -->"WRITE(11,'(a)')"  <DataItem Format=""XML"" Dimensions=""2"">"WRITE(11,*)-pi*Lx,-pi*LyWRITE(11,'(a)')"  </DataItem>"WRITE(11,'(a)')"  <!-- DxDyDz -->"WRITE(11,'(a)')"  <DataItem Format=""XML"" Dimensions=""2"">"WRITE(11,*)REAL(2.0*pi*Lx/Nx,kind(0d0)),REAL(2.0*pi*Ly/Ny,kind(0d0))WRITE(11,'(a)')"   </DataItem>"WRITE(11,'(a)')"  </Geometry>"WRITE(11,'(a)')WRITE(11,'(a)')"       <Grid Name=""TimeSeries"" GridType=""Collection"" CollectionType=""Temporal"">"WRITE(11,'(a)')"               <Time TimeType=""HyperSlab"">"WRITE(11,'(a)')"                       <DataItem Format=""XML"" NumberType=""Float"" Dimensions=""2"">"WRITE(11,*)"                   0.0",dt," 2"WRITE(11,'(a)')"                       </DataItem>"WRITE(11,'(a)')"               </Time>"DOn=1,numplotsOutputFileName='data/v'ind=index(OutputFileName,' ')-1WRITE(number_file,'(i0)')10000000+plotnumOutputFileName=OutputFileName(1:ind)//number_fileind=index(OutputFileName,' ')-1OutputFileName=OutputFileName(1:ind)//'.datbin'OutputFileName2="<Grid Name=""T"WRITE(number_file,'(i0)')nOutputFileName2=OutputFileName2(1:13)//number_fileind=index(number_file,' ')-1OutputFileName2=OutputFileName2(1:13+ind)//'" GridType="Uniform">'plotnum=plotnum+1WRITE(11,'(a)')WRITE(11,'(3X,a)')"   ",OutputFileName2WRITE(11,'(a)')"       <Topology Reference=""/Xdmf/Domain/Topology[1]""/>"WRITE(11,'(a)')"       <Geometry Reference=""/Xdmf/Domain/Geometry[1]""/>"WRITE(11,'(a)')"        <Attribute Name=""Displacement"" Center=""Node"">"WRITE(11,'(a)')"               <DataItem Format=""Binary"" "WRITE(11,'(a)')"                DataType=""Float"" Precision=""8"" Endian=""Little"" "WRITE(11,*)"            Dimensions=""",Nx,Ny,""">"WRITE(11,'(16X,a)')"           ",OutputFileNameWRITE(11,'(a)')"               </DataItem>"WRITE(11,'(a)')"       </Attribute>"WRITE(11,'(3X,a)')"</Grid>"END DO        WRITE(11,'(a)')"        </Grid>"WRITE(11,'(a)')"</Domain>"WRITE(11,'(a)')"</Xdmf>"CLOSE(11)END PROGRAMXdmfCreate
  1. Information about the equations can be found athttp://mrob.com/pub/comp/xmorphia/#formula.
  2. http://www.mathworks.com/matlabcentral/fileexchange/22940-vol3d-v2
  3. https://en.wikipedia.org/wiki/OpenCL
  4. http://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk/
  5. https://github.com/clMathLibraries/clFFT
Retrieved from "https://en.wikibooks.org/w/index.php?title=Parallel_Spectral_Numerical_Methods/Gray_Scott&oldid=4208808"
Category:

[8]ページ先頭

©2009-2025 Movatter.jp