By default, the profiling produces a single set of statistics for all
code between the PetscInitialize() and PetscFinalize()
calls within a program. One can independently monitor up to ten
stages of code by switching among the various stages with the commands

    PLogStagePush(int stage);
    PLogStagePop();

where stage is an integer (0-9); see the manual pages for details. The command

    PLogStageRegister(int stage,char *name)

allows one to associate a name with a stage; these names are printed whenever summaries are generated with -log_summary or PLogPrintSummary(). The following code fragment uses three profiling stages within a program.

    PetscInitialize(&argc,&args,0,0);
    /* [stage 0 of code here] */
    PLogStageRegister(0,"Stage 0 of Code");
    for (i=0; i<ntimes; i++) {
      PLogStagePush(1);
      PLogStageRegister(1,"Stage 1 of Code");
      /* [stage 1 of code here] */
      PLogStagePop();
      PLogStagePush(2);
      PLogStageRegister(2,"Stage 2 of Code");
      /* [stage 2 of code here] */
      PLogStagePop();
    }
    PetscFinalize();

Figures 19 and 20 show output generated by -log_summary for a program that employs several profiling stages. In particular, this program is subdivided into six stages: loading a matrix and right-hand-side vector from a binary file, setting up the preconditioner, and solving the linear system; this sequence is then repeated for a second linear system. For simplicity, Figure 20 contains output only for stages 4 and 5 (linear solve of the second system), which comprise the part of this computation of most interest to us in terms of performance monitoring. This code organization (solving a small linear system followed by a larger system) enables generation of more accurate profiling statistics for the second system by overcoming the often considerable overhead of paging, as discussed in Section Accurate Profiling: Overcoming the Overhead of Paging.
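
To make the push/pop behavior of stages concrete, the following is a minimal, self-contained sketch of how a stage-aware log might attribute events to the currently active stage. It is not PETSc code and does not reflect how PETSc implements profiling: the type StageLog and the functions stage_push, stage_pop, stage_register, stage_record_event, and demo_three_stages are hypothetical names invented for illustration. The sketch also assumes, following the code fragment above, that work done outside any pushed stage is charged to stage 0.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_STAGES 10 /* the text allows stage numbers 0-9 */
#define MAX_DEPTH 16  /* arbitrary nesting limit chosen for this sketch */

/* Hypothetical stage-aware event log; a stand-in, not PETSc code. */
typedef struct {
  const char *name[MAX_STAGES]; /* optional label for each stage */
  long events[MAX_STAGES];      /* events charged to each stage */
  int stack[MAX_DEPTH];         /* pushed stages, innermost last */
  int depth;                    /* number of stages currently pushed */
} StageLog;

static void stage_log_init(StageLog *log)
{
  int s;
  for (s = 0; s < MAX_STAGES; s++) {
    log->name[s] = NULL;
    log->events[s] = 0;
  }
  log->depth = 0;
}

/* Assumption: with nothing pushed, the active stage is 0. */
static int stage_current(const StageLog *log)
{
  assert(log->depth >= 0 && log->depth <= MAX_DEPTH);
  return log->depth > 0 ? log->stack[log->depth - 1] : 0;
}

/* Returns 0 on success, -1 for an invalid stage or a full stack. */
static int stage_push(StageLog *log, int stage)
{
  if (stage < 0 || stage >= MAX_STAGES || log->depth >= MAX_DEPTH) return -1;
  log->stack[log->depth++] = stage;
  return 0;
}

/* Returns 0 on success, -1 if no stage is currently pushed. */
static int stage_pop(StageLog *log)
{
  if (log->depth == 0) return -1;
  log->depth--;
  return 0;
}

/* Returns 0 on success, -1 for an invalid stage number. */
static int stage_register(StageLog *log, int stage, const char *name)
{
  if (stage < 0 || stage >= MAX_STAGES) return -1;
  log->name[stage] = name;
  return 0;
}

/* Charge one event to whichever stage is active right now. */
static void stage_record_event(StageLog *log)
{
  log->events[stage_current(log)]++;
}

/* Print one line for every stage that has a name or recorded events. */
static void stage_print_summary(const StageLog *log)
{
  int s;
  for (s = 0; s < MAX_STAGES; s++) {
    if (log->name[s] == NULL && log->events[s] == 0) continue;
    printf("%d: %-20s %ld\n", s,
           log->name[s] ? log->name[s] : "(unnamed)", log->events[s]);
  }
}

/* Same shape as the three-stage fragment in the text. */
static void demo_three_stages(StageLog *log, int ntimes)
{
  int i;
  stage_log_init(log);
  stage_register(log, 0, "Stage 0 of Code");
  stage_register(log, 1, "Stage 1 of Code");
  stage_register(log, 2, "Stage 2 of Code");
  stage_record_event(log); /* outside any push: charged to stage 0 */
  for (i = 0; i < ntimes; i++) {
    stage_push(log, 1);
    stage_record_event(log); /* charged to stage 1 */
    stage_pop(log);
    stage_push(log, 2);
    stage_record_event(log); /* two events charged to stage 2 */
    stage_record_event(log);
    stage_pop(log);
  }
  stage_print_summary(log);
}
```

Calling demo_three_stages(&log, 3) from a main routine charges one event to stage 0, three to stage 1, and six to stage 2. Keeping the active stages on a stack is one plausible design because it lets a pop restore whichever stage was active before the matching push, which matters if stages are ever nested.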