12.8. Accurate Profiling: Overcoming the Overhead of Paging


One factor that often distorts profiling results is paging by the operating system. Generally, when a program starts, only the few pages required to start it are loaded into memory rather than the entire executable. When execution proceeds to code segments that are not in memory, a page fault occurs, prompting the required pages to be loaded from disk (a very slow process). This activity distorts the results significantly. (The paging effects are noticeable in the log files generated by -log_mpe, which is described in Section Using -log_mpe with Upshot/Nupshot.)

To eliminate the effects of paging when profiling a program, we have found that an effective procedure is to run the exact same code on a small dummy problem before running it on the actual problem of interest. This ensures that all code required by the solver is loaded into memory during the solution of the small problem. When the code proceeds to the actual (larger) problem of interest, all required pages have already been loaded into main memory, so the performance numbers are not distorted.
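The warm-up procedure can be sketched in plain C as follows. This is only an illustration under stated assumptions, not code from PETSc: solve(), timed_solve(), and profile_with_warmup() are hypothetical names, solve() is a trivial stand-in for a real solver, and timing uses the C11 function timespec_get() rather than PETSc's own logging.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for a solver: fills a vector of length n with
   ones and returns the sum of squares (n on success, -1.0 on failure). */
double solve(size_t n)
{
    double *a;
    double sum = 0.0;
    size_t i;

    if (n == 0) return 0.0;
    a = malloc(n * sizeof *a);
    if (a == NULL) return -1.0;
    for (i = 0; i < n; i++) a[i] = 1.0;
    for (i = 0; i < n; i++) sum += a[i] * a[i];
    free(a);
    return sum;
}

/* Run solve(n), store its value in *result, and return the elapsed
   wall-clock time in seconds. */
double timed_solve(size_t n, double *result)
{
    struct timespec t0, t1;

    assert(result != NULL);
    timespec_get(&t0, TIME_UTC);
    *result = solve(n);
    timespec_get(&t1, TIME_UTC);
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
}

/* Pass 1 runs the same code path on a small problem so that its pages
   are loaded; that timing is discarded. Pass 2 runs the problem of
   interest, and only its timing is returned. */
double profile_with_warmup(size_t small_n, size_t large_n, double *result)
{
    double discard;

    (void)timed_solve(small_n, &discard);
    return timed_solve(large_n, result);
}

#ifdef DEMO_MAIN
#include <stdio.h>
int main(void)
{
    double value;
    double seconds = profile_with_warmup(100, 10000000, &value);

    printf("result %g in %g s\n", value, seconds);
    return 0;
}
#endif
```

Compiling with -DDEMO_MAIN gives a standalone program. The important design point is that both passes go through exactly the same functions, so the small run touches the same code pages that the timed run needs.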

When this procedure is used in conjunction with the user-defined stages of profiling described in Section Profiling Multiple Sections of Code, we can easily focus on the problem of interest. For example, we used this technique in the program ${PETSC_DIR}/src/sles/examples/tutorials/ex10.c to generate the timings in Figures 19 and 20. In this case, the profiled code of interest (solving the linear system for the larger problem) occurs within event stages 4 and 5. Section Interpreting -log_summary Output: Parallel Performance provides details about interpreting such profiling data.

