Cover V12, i14
dec2003.tar

Introduction to Performance Data Values

Wk -- This is the assigned numeric workload number ID when collecting data.

Timestamps -- Timestamps are in Unix epoch time format, which is the number of seconds elapsed since 00:00:00 UTC on January 1, 1970.
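For reading a report by eye, an epoch value can be converted to a calendar date. The following is a minimal Python sketch; the function name epoch_to_utc is hypothetical and is not part of the collection tool.

```python
from datetime import datetime, timezone

def epoch_to_utc(seconds):
    """Render a Unix epoch timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()

# 1070236800 seconds after the epoch is midnight UTC on December 1, 2003.
print(epoch_to_utc(1070236800))  # 2003-12-01T00:00:00+00:00
```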

Command -- The workload search filter based on the command name.

Args -- The workload search filter based on the process's arguments.

User -- The workload search filter based on the user name the process is running as.

Cnt -- The number of concurrent instances of the running workload process.

usr% -- The total percentage of time the workload spent running on the CPU executing user (application) code.*+

sys% -- The total percentage of time the workload spent running on the CPU executing in the kernel servicing the system calls for your application.*+

cpwt% -- The total percentage of time the workload was ready to run, but spent on the run queue waiting to get back on the CPU to begin executing again.

chld% -- The total percentage of CPU time of the child processes that have exited. You will see high chld% values if your application is forking off multiple shell scripts. This typically happens with monitoring applications. (See Figure 9 for a great example, represented as reaped children).

sizeM -- The virtual address space size for the application. Most long-lived processes grow their address space quickly when they first start up, then stabilize. If a process's virtual memory size continues to increase, then it is quite likely that it has some kind of memory leak. This rule is useful for long-term analysis; it does not apply to recently created processes.
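The sizeM rule above can be sketched as a simple check over a series of samples. This is only an illustration under stated assumptions: the function name looks_like_leak, the one-hour warm-up period, and the 1-MB growth threshold are all hypothetical choices, and the first sample is assumed to be taken close to process start.

```python
def looks_like_leak(samples, warmup_s=3600, min_growth_mb=1.0):
    """Flag steady growth in sizeM after a warm-up period.

    samples: list of (epoch_seconds, size_mb) tuples in time order.
    warmup_s and min_growth_mb are assumed thresholds, not tool defaults.
    """
    if not samples:
        return False
    start = samples[0][0]
    # Ignore the start-up phase, when rapid growth is normal.
    settled = [size for t, size in samples if t - start >= warmup_s]
    if len(settled) < 3:
        return False  # too young or too few samples to judge
    never_shrinks = all(b >= a for a, b in zip(settled, settled[1:]))
    return never_shrinks and (settled[-1] - settled[0]) >= min_growth_mb

stable = [(0, 10), (1800, 50), (3600, 60), (7200, 60), (10800, 60)]
leaky = [(0, 10), (1800, 50), (3600, 60), (7200, 70), (10800, 80)]
print(looks_like_leak(stable), looks_like_leak(leaky))  # False True
```

A process that has only been sampled during its first hour returns False here, which mirrors the caveat that the rule does not apply to recently created processes.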

pf -- The number of page faults triggered. A page fault occurs when the process references a page that is no longer resident in memory, so the page has to be reclaimed from the memory freelist or read back in from disk. High numbers can indicate a memory shortage.^

pgwt% -- The total percentage of time spent waiting for page faults to complete. Time spent waiting for page faults on non-executable data pages indicates I/O activity. The pgwt% value can be used to indicate a memory shortage.

ulkwt% -- The total percentage of time spent in user-lock wait, that is, time that idle threads spent stopped on a semaphore. This is relevant on systems with databases.

Process with Blocked Threads Rule -- Multithreaded processes can consume more than 100% CPU time, and can also report more than 100% wait time. Every thread that is blocked on a semaphore reports that time via the "user lock wait time" microstate. This can be confusing, because a process may appear to accumulate hundreds or thousands of percent of wait time in a report. In fact, these are just idle threads, and no CPU resources are being consumed. The Java runtime environment is multithreaded, and it is quite common for this to occur with Java programs.
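The arithmetic behind the rule can be shown with a short Python sketch. It assumes that per-thread wait times are summed for the whole process and divided by the length of the sampling interval; the function name wait_percent and the thread counts are made up for illustration.

```python
def wait_percent(blocked_seconds_per_thread, interval_s):
    """Process-wide wait percentage when per-thread wait times are summed."""
    return 100.0 * sum(blocked_seconds_per_thread) / interval_s

# Hypothetical process: 18 threads blocked on a semaphore for an
# entire 60-second interval, consuming no CPU at all.
idle_threads = [60.0] * 18
print(wait_percent(idle_threads, 60.0))  # 1800.0
```

The 1800% figure describes 18 idle threads, not a heavily loaded process.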

ioK -- The total number of characters, in kilobytes (KB), read and written in read/write system calls.

sysc -- The total number of system calls executed for the workload.

vctx -- Voluntary context switches happen when the application leaves the CPU on its own, typically because the process blocks in a system call to wait for something else to complete. An example would be waiting for the system to service a disk I/O. If this value is consistently high, it may indicate that the application is spending more time waiting for its own I/O requests to be completed than processing on the CPU.

ictx -- Involuntary context switches happen when a process is interrupted or kicked off the CPU by a higher-priority process or a system interrupt. They often occur when the process has used up an entire time slice and had its priority reduced as a consequence. A high count can indicate that the system has many higher-priority processes keeping your process from getting CPU time. So, voluntary context switches should be expected, but involuntary context switches indicate that there is some contention for the CPU and that a bottleneck exists.

msps -- The milliseconds per context switch, which shows how long the process ran on average before it switched off the CPU. 1 second = 1000 milliseconds and 1 minute = 60000 milliseconds.
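As a rough illustration of msps, the Python sketch below divides on-CPU time by the number of context switches in the same interval. The exact formula used by the collection tool is not given here, so treating msps as CPU milliseconds divided by (vctx + ictx) is an assumption, and the function name is hypothetical.

```python
def ms_per_switch(cpu_ms, vctx, ictx):
    """Average on-CPU milliseconds per context switch (assumed formula)."""
    switches = vctx + ictx
    if switches == 0:
        return None  # no switches in the interval, so the average is undefined
    return cpu_ms / switches

# Hypothetical interval: 1.5 seconds on CPU, 250 voluntary and
# 50 involuntary context switches.
print(ms_per_switch(1500.0, 250, 50))  # 5.0
```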


*The microstate measures of user and system CPU time don't miss anything, so if they are zero the process has definitely not run in the interval.

+You may see the CPU percentage exceed 100% on a multiprocessor machine; to interpret it correctly, read 100% as one fully used CPU (100% = 1 CPU).

^Swapping happens to idle processes, so when memory is short you would expect to see one process reporting a memory shortage and different processes reporting that they have been swapped out to make space.
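To make the + note concrete, the following Python sketch converts usr% and sys% into a count of CPUs in use, reading 100% as one CPU. The function name cpus_in_use and the sample values are hypothetical.

```python
def cpus_in_use(usr_pct, sys_pct):
    """Convert usr% + sys% into equivalent fully busy CPUs (100% = 1 CPU)."""
    return (usr_pct + sys_pct) / 100.0

# Hypothetical workload on a multiprocessor machine.
print(cpus_in_use(150.0, 50.0))  # 2.0
```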