Cover V12, I10

The PBS Accounting Toolkit

Rodney Mach

The Portable Batch System (OpenPBS) is a workload management package used in many high-performance computing environments. OpenPBS reports a plethora of information in an accounting log file that can be used by systems administrators for capacity planning, resource reporting, and performance tuning. Unfortunately, extracting information from the accounting logs for those purposes has historically been a difficult task. The example shown in Figure 1 is a typical accounting log entry for an executed job generated by OpenPBS.

This example provides quite a bit of information. The typical solution to extracting information from the accounting log would be to first decipher the format, and then spend time writing Perl code to get the information you need. However, this solution means you spend more time retrieving your data than working with it. The PBS Accounting Toolkit solves this problem. It allows administrators to convert the PBS accounting logs to XML to leverage XML technologies for querying and parsing, and includes software to generate high-quality PDF usage reports from this XML accounting data.

Installing the Software

Before installation, make sure you have Java 1.4 or higher installed. You can type "java -version" at the Unix prompt to confirm that the reported version is 1.4 or later. You can install the latest Java 1.4 RPM from http://java.sun.com.

After confirming your Java installation, go to:

http://pbsaccounting.sourceforge.net
and obtain the latest version of the Accounting Toolkit. There are two RPMs to install -- one is the XML conversion software, the other is the Darkslide reporting package that I will discuss later.

XML Conversion

After installing the RPMs, you can convert the first accounting file to XML. For example, to convert accounting data found in the file /usr/spool/PBS/server_priv/accounting/20030523 to XML, type:

% cd /usr/spool/PBS/server_priv/accounting/
% mkdir xml
% pbstoxml 20030523 xml/20030523.xml
Figure 2 shows the resulting 20030523.xml file.
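
Because OpenPBS writes one accounting file per day, named by date, the same conversion can be looped over the whole directory. The following is only a sketch under stated assumptions: ACCT_DIR and convert_all are illustrative names that are not part of the toolkit, and the only toolkit call used is the pbstoxml invocation shown above. Today's file is skipped because the server may still be writing to it.

```shell
#!/bin/sh
# Sketch: convert every daily accounting file that has no XML copy yet.
# ACCT_DIR and convert_all are illustrative names, not part of the toolkit.
ACCT_DIR=${ACCT_DIR:-/usr/spool/PBS/server_priv/accounting}

convert_all() {
    today=$(date +%Y%m%d)
    mkdir -p "$ACCT_DIR/xml" || return 1
    # Accounting files are named YYYYMMDD, so match exactly eight digits.
    for src in "$ACCT_DIR"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]; do
        [ -f "$src" ] || continue            # glob matched nothing
        day=${src##*/}
        out="$ACCT_DIR/xml/$day.xml"
        [ "$day" = "$today" ] && continue    # file may still be growing
        [ -e "$out" ] && continue            # already converted
        # Same invocation as in the text: pbstoxml <input> <output>
        pbstoxml "$src" "$out" || echo "conversion failed: $src" >&2
    done
}
```

Calling convert_all a second time does nothing for days that already have an XML file, so it is safe to rerun.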

Converting the accounting data to XML opens up a powerful set of XML tools for querying, parsing, converting, and storing the data.

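One example of leveraging an existing tool is using XPath to find out how many nodes user rmach used. To do this, use the xpath command that ships with the Perl XML::XPath module to run an XPath query against the XML data. (You can think of XPath as something like SQL for XML; see the References section for a good tutorial on XPath.) Note that the query sums the nodes values of every execution record in the file, so it reports rmach's usage only when rmach ran all of the jobs logged that day: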

% xpath "sum(//pbs_jobfile/execution_record/resource_list/nodes)" < 20030523.xml
Query didn't return a nodeset. Value: 1
In this case, the XPath query reports that one node was used by user rmach. Writing a Perl script would have given the same answer, assuming it was coded correctly, but the XPath query is much quicker to write and less prone to error because it requires no custom code.
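
When a file holds jobs from several users, the query needs a predicate to restrict the sum to one of them. The sketch below builds such a query string. It assumes, hypothetically, that each execution_record has a child element named "user"; the real element name must be checked against your converted XML (Figure 2). The helper name nodes_query_for_user is also illustrative.

```shell
#!/bin/sh
# Sketch: build the node-sum query for a single user.
# ASSUMPTION: each execution_record has a child element named "user";
# check the real element name in your converted XML before relying on this.
nodes_query_for_user() {
    printf 'sum(//pbs_jobfile/execution_record[user="%s"]/resource_list/nodes)' "$1"
}
```

The result can be passed to the same xpath command used above in place of the literal query, for example as "$(nodes_query_for_user rmach)".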

Darkslide Report Generation

The XML by itself isn't useful if you don't do something with it, which is where the Darkslide Report Generator comes in. Darkslide produces high-quality usage graphs and tables with which you can easily visualize your accounting data (see Figure 3).

A high-level view of the Darkslide architecture is shown in Figure 4. XML accounting data is stored inside Xindice, a native XML database. The database is accessed via XML:DB. The reporting engine issues XPath commands against the database to gather data to produce the PDF report output.

Darkslide Installation

The basic steps to getting up and running with the Darkslide Report Generator are:

1. Install and configure Xindice.
2. Convert your OpenPBS accounting data to XML.
3. Load the converted XML accounting data into Xindice.

When this is complete, you will be able to generate the report.

Xindice Installation

Darkslide uses Xindice, a native XML database, to store the accounting data. Here is the procedure (also documented in the Xindice admin guide) for setting up the database in preparation for storing the accounting data.

Download the package xml-xindice-1.0.tar.gz from http://xml.apache.org/xindice and install it:

% gunzip xml-xindice-1.0.tar.gz
% tar xf xml-xindice-1.0.tar
% mv xml-xindice-1.0 /usr/local/
% nohup /usr/local/xml-xindice-1.0/start &
If everything went well, you should get a message saying "Server Running". Set important environment variables that Xindice requires:

% export PATH=$PATH:/usr/local/xml-xindice-1.0/bin
% export XINDICE_HOME=/usr/local/xml-xindice-1.0
Next, you can configure Xindice.

Configuring Xindice

At this point, you must create a "collection" in which to store your XML accounting data. A collection is a storage location where XML files will be stored. By convention, I just use the name of the cluster that the accounting data is for. Remember what you use here for the collection name, because later you will need to edit the Darkslide configuration file to reflect the name of the collection you chose. Here is an example of creating a collection for the cluster named "examplecluster"; replace "examplecluster" with the name for your cluster:

% xindiceadmin add_collection -c /db/ -n examplecluster
Created : /db//examplecluster

Configuring Darkslide

You will also need to edit the Darkslide configuration file QueryEngineConfig.xml in /usr/local/darkslide-1.0/etc/. The most important step is to make sure the <collection> tag holds the same collection name you created in the "Configuring Xindice" step. In this case, you would change it to "examplecluster". You may also want to edit the <title> tag, which sets the title of the report.

If you have any problems, try changing the level in the <debugging> tag from SEVERE to FINEST to get copious amounts of debugging information.

The configuration file after editing for our purposes is shown in Figure 5.

Loading XML Accounting Data

You can use the bulkloader command included with the Accounting Toolkit to load all the XML files into the Xindice collection you created. For example, if all your XML data was converted with pbstoxml into the directory /usr/spool/PBS/server_priv/accounting/xml/, you would type this command:

% bulkloader --directory=/usr/spool/PBS/server_priv/accounting/xml/
Document /usr/spool/PBS/server_priv/accounting/xml/20030501 inserted
Document /usr/spool/PBS/server_priv/accounting/xml/20030502 inserted
Document /usr/spool/PBS/server_priv/accounting/xml/20030503 inserted
Document /usr/spool/PBS/server_priv/accounting/xml/20030504 inserted
Document /usr/spool/PBS/server_priv/accounting/xml/20030505 inserted
To verify the documents were loaded properly, use the following Xindice command to query the files in the collection named "examplecluster":

% xindiceadmin ld -c /db/examplecluster
The filenames that were just loaded into the collection should be listed as output.

Generating the Report

It's finally time to produce some reports. To generate a report from 04-10-2003 through 04-29-2003 and save the report in a file called /tmp/test.pdf, type the following command:

% /usr/local/darkslide-1.0/bin/reportgenerator \
  --startdate=04-10-2003 --enddate=04-29-2003 --filename=/tmp/test.pdf
You can now use your favorite PDF viewer to view the report.
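
For recurring monthly reports, the two dates can be computed instead of typed by hand. The sketch below prints the first and last day of a given month in the MM-DD-YYYY form used by the command above. The helper month_range is an illustrative name and not part of Darkslide, and it assumes the month argument is a valid number from 1 to 12.

```shell
#!/bin/sh
# Sketch: print the first and last day of a month as MM-DD-YYYY,
# the date form used by the reportgenerator example in the text.
# month_range is an illustrative helper, not part of Darkslide.
month_range() {
    year=$1
    month=${2#0}        # strip a leading zero so 08 and 09 are not read as octal
    case $month in
        4|6|9|11) last=30 ;;
        2)  # Gregorian leap-year rule
            if [ $((year % 4)) -eq 0 ] &&
               { [ $((year % 100)) -ne 0 ] || [ $((year % 400)) -eq 0 ]; }; then
                last=29
            else
                last=28
            fi ;;
        *)  last=31 ;;
    esac
    printf '%02d-01-%04d %02d-%02d-%04d\n' \
        "$month" "$year" "$month" "$last" "$year"
}
```

After running set -- $(month_range 2003 04), the two dates are available as $1 and $2 and can be passed to reportgenerator as --startdate=$1 and --enddate=$2.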

Daily Updates

To load the daily XML accounting data you converted with pbstoxml into Xindice, simply create a daily cron job. The cron job should load the XML document into Xindice using the Xindice command here:

% xindice add_document -c /db/examplecluster -f \
  /usr/spool/PBS/server_priv/accounting/xml/filename -n filename
where filename is the name of the accounting file, such as 20030523.
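
The conversion and the load can be combined in one helper for cron to call. This is a minimal sketch, not a definitive implementation: ACCT_DIR, COLLECTION, and load_day are illustrative names, and the XML file is written with the .xml suffix used in the earlier pbstoxml example. Only the pbstoxml and xindice add_document invocations come from the text.

```shell
#!/bin/sh
# Sketch of a daily cron helper: convert one day's log and load it into Xindice.
# ACCT_DIR, COLLECTION and load_day are illustrative names.
ACCT_DIR=${ACCT_DIR:-/usr/spool/PBS/server_priv/accounting}
COLLECTION=${COLLECTION:-/db/examplecluster}

load_day() {
    day=$1                                   # e.g. 20030523
    src="$ACCT_DIR/$day"
    xml="$ACCT_DIR/xml/$day.xml"
    if [ ! -f "$src" ]; then
        echo "no accounting file for $day" >&2
        return 1
    fi
    mkdir -p "$ACCT_DIR/xml" || return 1
    # Convert first; load only when the conversion succeeded.
    pbstoxml "$src" "$xml" || return 1
    # Same xindice invocation as in the text, keyed by the day.
    xindice add_document -c "$COLLECTION" -f "$xml" -n "$day"
}
```

A cron script would call load_day with the previous day's date; with GNU date (an assumption about your system) that is load_day "$(date -d yesterday +%Y%m%d)". Cron does not read your login environment, so the script should also set PATH and XINDICE_HOME as shown in the Xindice installation step.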

Conclusion

Although there is certainly a bit of a learning curve to XML technologies in the beginning, the time invested in learning them pays dividends in the long run. Leveraging tools provided in the PBS Accounting Toolkit gives you the power to visualize accounting information quickly and easily, harnessing the power of XML tools and technologies to give you the decision-making information you need.

References

PBS Accounting Toolkit -- http://pbsaccounting.sourceforge.net

OpenPBS -- http://www.openpbs.org

Xindice -- http://xml.apache.org/xindice

XML tutorials -- http://www.w3schools.com

Rodney Mach is President of Fathom5 Consulting (http://www.fathom5consulting.com), a technology firm specializing in providing fast affordable custom software solutions. He can be reached at: rmach@fathom5consulting.com.