For almost all unstructured grid computations, the distribution of portions of the grid across the processors' workloads and memory can have a very large impact on performance. In most PDE calculations the grid partitioning and distribution across the processors can (and should) be done in a ``pre-processing'' step before the numerical computations. However, this does not mean it need be done in a separate, sequential program; rather, it should be done before one sets up the parallel grid data structures in the actual program. PETSc provides an interface to ParMETIS (developed by George Karypis; see the docs/installation.html file for directions on installing PETSc to use ParMETIS) to allow the partitioning to be done in parallel. PETSc does not currently provide direct support for dynamic repartitioning, load balancing by migrating matrix entries between processors, and so on. For problems that require mesh refinement, PETSc uses the ``rebuild the data structures'' approach, as opposed to the ``maintain dynamic data structures that support the insertion/deletion of additional vector and matrix rows and columns'' approach.
Partitioning in PETSc is organized around the MatPartitioning object.
One first creates a parallel matrix that contains the connectivity information about the
grid (or other graph-type object) that is to be partitioned. This is done with the
command
   ierr = MatCreateMPIAdj(MPI_Comm comm,int mlocal,int n,int *ia,int *ja,Mat *Adj);
The argument mlocal indicates the number of rows of the graph being provided by the given processor, while n is the total number of columns, which equals the sum of all the mlocal values. The arguments ia and ja are the row pointers and column indices for the given rows; they are in the usual format for parallel compressed sparse row storage, with indices starting at 0, not 1.
This, of course, assumes that one has already distributed the grid (graph) information among the processors. The details of this initial distribution are not important; it could, for example, simply be determined by assigning the first n0 nodes from a file to the first processor, the next n1 nodes to the second processor, etc.
For example, we demonstrate the form of the ia and ja for a triangular grid where we
(1) partition by element (triangle)
and (2) partition by vertex.
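As a concrete sketch (the mesh, its numbering, the two-processor distribution, and all array values below are hypothetical, chosen only to illustrate the layout of ia and ja), consider a square split into four triangles by its two diagonals, with vertices 0 through 3 at the corners and vertex 4 at the center:
   /* Hypothetical mesh: a square cut into four triangles by its diagonals.
      Vertices 0,1,2,3 are the corners (counterclockwise); vertex 4 is the center.
      Triangles: T0=(0,1,4), T1=(1,2,4), T2=(2,3,4), T3=(3,0,4).                  */

   /* (1) Partition by element: the graph nodes are the triangles, and two
      triangles are adjacent when they share an edge.  With processor 0 owning
      T0,T1 and processor 1 owning T2,T3 (mlocal = 2 and n = 4 on each):          */
   int ia_elem_p0[] = {0,2,4};       /* row pointers for T0,T1                    */
   int ja_elem_p0[] = {1,3, 0,2};    /* T0 borders T1,T3; T1 borders T0,T2        */
   int ia_elem_p1[] = {0,2,4};       /* row pointers for T2,T3                    */
   int ja_elem_p1[] = {1,3, 0,2};    /* T2 borders T1,T3; T3 borders T0,T2        */

   /* (2) Partition by vertex: the graph nodes are the vertices, and two vertices
      are adjacent when they are joined by a mesh edge.  With processor 0 owning
      vertices 0,1,2 (mlocal = 3) and processor 1 owning vertices 3,4
      (mlocal = 2), and n = 5 on each:                                            */
   int ia_vert_p0[] = {0,3,6,9};              /* row pointers for vertices 0,1,2  */
   int ja_vert_p0[] = {1,3,4, 0,2,4, 1,3,4};
   int ia_vert_p1[] = {0,3,7};                /* row pointers for vertices 3,4    */
   int ja_vert_p1[] = {0,2,4, 0,1,2,3};
Each processor then passes its mlocal, n, ia, and ja to MatCreateMPIAdj() as shown above. Note that in many PETSc versions the MPIAdj matrix keeps the ia and ja arrays (and frees them when the matrix is destroyed), so in practice they are usually allocated on the heap rather than declared as local arrays; check the MatCreateMPIAdj() manual page of your PETSc version.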
Once the adjacency matrix has been created, the following sequence of commands generates the partitioning and a new numbering of the nodes:
   ierr = MatPartitioningCreate(MPI_Comm comm,MatPartitioning *part);
   ierr = MatPartitioningSetAdjacency(MatPartitioning part,Mat Adj);
   ierr = MatPartitioningSetFromOptions(MatPartitioning part);
   ierr = MatPartitioningApply(MatPartitioning part,IS *is);
   ierr = MatPartitioningDestroy(MatPartitioning part);
   ierr = MatDestroy(Mat Adj);
   ierr = ISPartitioningToNumbering(IS is,IS *isg);
The resulting is contains, for each local node, the number of the processor to which it has been assigned, while isg contains the new global number of each local node.
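Because MatPartitioningSetFromOptions() is called above, the partitioning package can be chosen at run time. As a brief sketch (the program name is hypothetical, and the option and viewer names should be checked against your PETSc version), one might select ParMETIS on the command line and inspect the resulting index sets as follows:
   /* Select the ParMETIS partitioner on the command line; this is picked up by
      MatPartitioningSetFromOptions():
          mpirun -np 4 ./myprogram -mat_partitioning_type parmetis               */

   /* Print the processor assignment (is) and the new global numbering (isg)
      for inspection.                                                            */
   ierr = ISView(is,PETSC_VIEWER_STDOUT_WORLD);
   ierr = ISView(isg,PETSC_VIEWER_STDOUT_WORLD);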
Now that a new numbering of the nodes has been determined, one must
renumber all the nodes and migrate the grid information to the correct processor.
The command
   ierr = AOCreateBasicIS(isg,PETSC_NULL,&ao);
generates (see Section Application Orderings) an AO object that can be used in conjunction with the is and isg to move the relevant grid information to the correct processor and renumber the nodes, etc.
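For instance (a minimal sketch; cell_vertices and nlocal are hypothetical names standing in for whatever connectivity data your application stores), since the new numbering plays the role of the ``application'' ordering in this AO (assuming the old global numbering coincides with the natural ordering of the initial distribution), the old global node numbers held in local grid data can be translated to the new numbering in place:
   /* cell_vertices[] holds the global vertex numbers of the locally stored
      elements, still in the old numbering; nlocal is the total number of such
      entries.  Because the AO above uses the new numbering as the application
      ordering, this call replaces each old global number with the corresponding
      new one, in place.                                                         */
   ierr = AOPetscToApplication(ao,nlocal,cell_vertices);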
PETSc does not currently provide tools that completely manage the migration and node renumbering, since this depends on the particular data structure you use to store the grid information and the type of grid information that your application needs. We do plan to include more support for this in the future, but designing the appropriate user interface and providing a scalable implementation that can be used for a wide variety of different grids requires a great deal of time. Thus we demonstrate how this may be managed for the model grid discussed above using (1) an element-based partitioning and (2) a vertex-based partitioning.