This page describes pbsacct, a very simple accounting statistics tool for the Portable Batch System (PBS version 2.2).
The latest version of this software may be downloaded from here.
Usage:
pbsacct files
where files are daily records (such as 20000705) located in $PBSHOME/server_priv/accounting/ (PBSHOME is usually /var/spool/pbs).
A sample output is:
# pbsacct 200006??

Portable Batch System accounting statistics
-------------------------------------------

A total of 30 accounting files will be processed.
First record is dated 06/01/2000, last record is dated 06/30/2000.

                                             Average  Average
Username  #jobs  CPU-days  Wall-days  Efcy.   #nodes   q-days
--------  -----  --------  ---------  -----  -------  -------
TOTAL       237   1415.31    1578.68  0.897     6.80     3.50
user0001     12    278.67     301.06  0.926    12.00     5.98
user0002     29    226.96     244.98  0.926     4.71     4.04
user0003     52    221.65     271.37  0.817    10.83     3.21
user0004     26    201.35     204.27  0.986     5.66     4.94
user0005     13    130.26     151.13  0.862     8.69     3.44
user0006     38    112.23     114.87  0.977     3.22     2.33
user0007     18    109.85     117.90  0.932     7.53     4.75
user0008     14     75.43      85.88  0.878     8.86     2.58
user0009      8     38.88      41.63  0.934     6.72     2.62
user0010      4     12.05      12.36  0.975     3.97     3.25
user0011      4      5.88      31.12  0.189     6.40     7.30
user0012      3      1.47       1.48  0.991     2.08     2.61
user0013      5      0.37       0.37  0.986     1.00     1.36
user0014     10      0.26       0.27  0.973     1.00     0.87
user0015      1      0.00       0.00  0.797     1.00     2.84
The usernames have been made anonymous. We prefer to count CPU- and wall-time in days rather than hours or seconds.
It should be noted that PBS records only the CPU-time spent on the Master-node of parallel jobs. The spawning of parallel processes by, e.g., MPI is outside the control of PBS, and no accounting of the Slave nodes is currently performed. The total CPU-time is estimated as the CPU-time on the Master times the number of nodes. The only reliable measure is actually the Wall-time times the number of nodes.
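As a rough illustration of the estimate described above, the following Python sketch multiplies the master-node CPU-time by the number of nodes and also computes the wall-time times nodes figure. The function names and the choice of seconds as input units are assumptions made for this example; they are not taken from the pbsacct source.

```python
SECONDS_PER_DAY = 86400.0

def estimated_cpu_days(master_cpu_seconds, nodes):
    """Estimate total CPU-days as master-node CPU-time times the node count."""
    return master_cpu_seconds * nodes / SECONDS_PER_DAY

def node_wall_days(wall_seconds, nodes):
    """Wall-time times the node count, in days (the more reliable measure)."""
    return wall_seconds * nodes / SECONDS_PER_DAY

if __name__ == "__main__":
    # Hypothetical 4-node job: 20 hours of CPU on the master, 24 hours of wall-time.
    print(estimated_cpu_days(20 * 3600, 4))
    print(node_wall_days(24 * 3600, 4))
```

The estimate is only as good as the assumption that the slave nodes are about as busy as the master.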
The column "Efcy." is the ratio of CPU-time to wall-time. Some jobs spend a long time in waiting states, likely because of I/O, or because of parallel processes waiting for network communication. This measure may indicate that some users' jobs need to be analyzed for possible improvements.
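The ratio can be checked against the sample output with a few lines of Python. This is only a sketch: the function name is hypothetical, and returning None for a job without recorded wall-time is a choice made for this example.

```python
def efficiency(cpu_days, wall_days):
    """Return CPU-time divided by wall-time, or None if no wall-time was recorded."""
    if wall_days <= 0:
        return None
    return cpu_days / wall_days

if __name__ == "__main__":
    # Values taken from the sample output shown earlier.
    print(round(efficiency(1415.31, 1578.68), 3))  # TOTAL row, prints 0.897
    print(round(efficiency(5.88, 31.12), 3))       # user0011 row, prints 0.189
```

A value well below 1, as for user0011, is the kind of result that suggests a closer look at the user's jobs.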
The column "Average #nodes" is a weighted average of the number of nodes used in parallel by the user's jobs.
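The text does not say which weights are used in this average. The sketch below assumes, purely for illustration, that each job is weighted by its wall-time; the function name and the input format are likewise hypothetical.

```python
def weighted_average_nodes(jobs):
    """Average node count over (nodes, wall_days) pairs.

    Assumption: each job is weighted by its wall-time in days.
    """
    jobs = list(jobs)  # allow any iterable, since we pass over it twice
    total_weight = sum(wall for _, wall in jobs)
    if total_weight == 0:
        return 0.0
    return sum(nodes * wall for nodes, wall in jobs) / total_weight

if __name__ == "__main__":
    # Hypothetical user: a 2-node job for 1 day and an 8-node job for 3 days.
    print(weighted_average_nodes([(2, 1.0), (8, 3.0)]))  # prints 6.5
```

With this weighting, long-running jobs dominate the average, so a user who runs many short single-node tests is not made to look like a small-scale user.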
The column "Average q-days" is the average number of days that the jobs spent in the queue while being eligible to run. This shows how difficult it is for jobs to get CPU-time on this system.
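A minimal Python sketch of this average follows. It assumes that each job provides two epoch timestamps in seconds, the time it became eligible to run and the time it started; how pbsacct actually obtains these values is not described here, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 86400.0

def average_queue_days(jobs):
    """Mean time in days between becoming eligible and starting.

    jobs: iterable of (eligible_epoch, start_epoch) pairs, in seconds (assumed format).
    """
    waits = [(start - eligible) / SECONDS_PER_DAY for eligible, start in jobs]
    return sum(waits) / len(waits) if waits else 0.0

if __name__ == "__main__":
    # Two hypothetical jobs that waited 1 day and 2 days respectively.
    print(average_queue_days([(0, 86400), (0, 172800)]))  # prints 1.5
```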
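A monthly report may be generated automatically by a crontab entry such as the following, which runs the pbsreportmonth script at 02:00 on the first day of each month:
0 2 1 * * (cd Report-directory; /usr/local/bin/pbsreportmonth)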
The accounting report may be mailed to the administrators by uncommenting some lines at the end of the script.
Author: Ole Holm Nielsen
Address: Department of Physics, Technical University of Denmark,
Building 307, DK-2800 Lyngby, Denmark.
E-mail: Ole.H.Nielsen@fysik.dtu.dk