14.5. Incremental Debugging


When developing large codes, one often starts from a code that runs correctly (or is at least believed to) and then makes a change that alters the results for some unknown reason. Merely determining the precise point at which the old and new codes diverge can be difficult. In other cases, a code generates different results when run on different numbers of processors, although in exact arithmetic the same answer is expected. (Of course, this assumes that exactly the same solver and parameters are used in the two cases.)

PETSc provides some support for determining exactly where in the code the computations lead to different results. First, compile the two versions of the program into executables with different names. Next, start both executables as a single MPI job. How this is done depends on the particular MPI implementation being used. For example, when using MPICH on workstations, procgroup files can be used to specify the processors on which the job is to be run. Thus, to run two programs, old and new, each on two processors, one should create a procgroup file with the following contents:

   local 0 
   workstation1 1 /home/bsmith/old 
   workstation2 1 /home/bsmith/new 
   workstation3 1 /home/bsmith/new 
(Of course, workstation1, etc. can be the same machine.) Then, one can execute the command
   mpirun -p4pg <procgroup_filename> old -compare <tolerance> [your_program_options] 
Note that the same runtime options must be used for the two programs. The first time an inner product or norm detects an inconsistency larger than <tolerance>, PETSc will generate an error. The usual runtime options -start_in_debugger and -on_error_attach_debugger may be used to examine the programs at that point. The user can also place calls to the routines
   PetscCompareDouble() 
   PetscCompareScalar() 
   PetscCompareInt() 
in portions of the application code to check for consistency between the two versions.
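
The idea behind such a consistency check can be illustrated with a small standalone sketch in plain C. It does not use PETSc or MPI and is not PETSc's implementation: the names values_differ() and first_divergence() are hypothetical, the two versions are represented simply by arrays of recorded values (for example, residual norms), and the use of an absolute difference against the tolerance is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if a and b differ by more than tol.
   Assumes an absolute-difference test. A NaN in either value counts
   as a difference, because the comparison (d <= tol) is then false. */
int values_differ(double a, double b, double tol)
{
    double d = a - b;
    if (d < 0.0) d = -d;
    return !(d <= tol);
}

/* Hypothetical helper: returns the index of the first entry at which
   the old and new traces differ by more than tol, or -1 if all n
   entries agree to within tol. */
long first_divergence(const double *old_trace, const double *new_trace,
                      size_t n, double tol)
{
    size_t i;

    assert(tol >= 0.0); /* a negative tolerance is a caller error */
    for (i = 0; i < n; i++) {
        if (values_differ(old_trace[i], new_trace[i], tol)) {
            return (long)i;
        }
    }
    return -1;
}
```

With values recorded at the same points in both versions, the returned index identifies the first checkpoint at which the two runs disagree, which is the place to begin looking for the change responsible. Choosing the tolerance too small will flag harmless differences due to roundoff; choosing it too large can hide the true point of divergence.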

