may2004.tar

Remotely Monitoring Files with PHP

Russell J.T. Dyer

Although we may enjoy working with computers, it's nice to be able to walk away from them. To keep from worrying about them, however, we need to be able to monitor our servers remotely. In this series on PHP, I will present scripts useful to systems administrators for monitoring and maintaining servers, and teach a little PHP in the process. For this first article, I will examine a PHP script that scans for files on an FTP server as an example of how to write a PHP script to monitor a server.

Scenario and Goal

Suppose we worked for a business that answers the telephone for mail-order catalog companies (i.e., a call center). The telephone operators record merchandise orders from callers. Our server automatically writes the orders to export files daily for clients to download from our FTP site. Because export programs can fail, as a quality control measure, we must check the client FTP directories every morning. This is tedious, but delegating this task could cause security problems. A better solution is to design a PHP script to scan the FTP directories and generate a report for the user like the one in Figure 1.

To accomplish this, the PHP script will need a list of client FTP directories, the names of client export files, and when they should be placed on the site. The script must generate a screen that will show the client's name and list only the day's export files that are found, along with their sizes and times. If files are missing, then a warning message will be displayed in red. In smaller print, the expected file name and run time is listed. Incidentally, it's important to password protect scripts like these or the directories they're kept in. This can be done through the Apache configuration.

Initial Variables

To begin, we will create a simple text file that will contain information about client exports. If we have many clients and are tracking quite a bit of data on them, we could keep this data in a database like MySQL. For our purposes here, we'll keep it simple and use a text file, called ftp-clients.txt, which looks like this:

# Details of FTP clients
# ID|Name|Method|Time|Qty|Pattern|Sub-Directory
#
1000|ABC, Inc.|FTP|0800|3|expyyyymmddx.txt|abc
1001|DEF, Inc.|FTP|0900|1|expyyyymmdd.txt|def
1002|GHI, Inc.|email|0830|1|expyyyymmdd.txt|shipping@ghi_client.com

The first lines describe the file and its fields. Regarding the data, there is one client record per line. The fields are delineated by pipes. The first field contains the client identification number, and the second contains the client's name. The third field shows the method of delivering files to the client: some by FTP, others by email. We're only concerned with the FTP clients here, so we'll have to filter out the email ones. The next field lists the time of day when the client's export files should be placed on the FTP site or emailed. The fifth field states the number of export files that should be generated. Following that is the file naming pattern and then the subdirectory of the FTP directory where the client's files will be placed (or their email address). Later, I'll show how to extract and parse this data with PHP. For now, let's set up the initial variables of our script:

<?php

$yyyy = strftime('%Y');
$mm = strftime('%m');
$dd = strftime('%d');
$now = strftime('%H%M');

$patterns = array('/yyyy/','/mm/','/dd/','/x\./');
$dates = array($yyyy, $mm, $dd, '[a-z].');

$dir = '/home/ftp/';
$list = 'ftp-clients.txt';

?>

PHP code must be contained within proper tags (e.g., <?php and ?>) so that Apache (the Web server) knows how to handle the text that it's reading. The next four lines of code establish variables containing date and time elements. The strftime() function will retrieve the server's time. The symbols inside the parentheses are formatting codes for the particular elements that are retrieved. The next pair of lines creates two arrays used to translate the file naming patterns for each export (e.g., expyyyymmddx.txt to exp20040515[a-z].txt). The [a-z] is for a range of one-letter possibilities. The last two lines of code set up variables containing the path to the FTP directory and the name of the text file. Knowing where this file is found, where FTP files are located, and determining the current date and time will be necessary when the script scans for the day's export files.

GUI and Patterns

After establishing the initial variables, the next task is to start a Web page to act as a user interface. For this article, we will keep the Web design simple:

<html>
<body>
<h3>FTP File Checking</h3>

Because the PHP script section was closed, HTML tags can be given without the aid of a print statement. The next preparatory task is for the script to read the client information text file, ftp-clients.txt, represented by the $list variable. It's located in the directory named in the $dir variable. The script must extract the data for each client from the file. The following bit of code placed right after the opening HTML shown above will work:

<?php 

$FILE = fopen("$dir$list", 'r')
        or die("Could not open $list");
$line = rtrim(fgets($FILE, 4096));

Note that there's another PHP opening code tag here. The closing tag will appear later. The first line of code uses the fopen() function to open ftp-clients.txt given in double-quotes. With double-quotes, PHP will interpolate or extract the value of $dir and $list. The 'r' listed inside the parentheses tells PHP to open the file in read-only mode. Wrapped onto the next line is an instruction for what PHP is to do if it cannot open the file -- it's to kill the script and display the message in quotes.

The next line uses the fgets() function to get the first line (i.e., 4096 bytes or 4 KB of data) from the client file. The fgets() is inside the rtrim() function. This trims off the right-most character of the line retrieved -- the line feed is dropped. The trimmed record is then stored in the variable $line. Incidentally, in PHP, variables don't need to be declared.

Once PHP knows where to find the client information file, it can open the file and parse the data. It'll loop through the client records with a while statement like so:

while(!feof($FILE))
{

   if(strpos($line, "\#") == false)
   {  
      list($id,$name,$method,$run,$exp,$pattern,$subdir) = split('\|', $line, 7);

      if($method == 'FTP')
      {
         $pattern = preg_replace($patterns, $dates, $pattern);

         $cdata["$id"] = array('name'=>"$name", 'method'=>"$method",
                               'run'=>"$run", 'exp'=>"$exp",
                               'pattern'=>"$pattern",
'subdir'=>"$subdir");
      }
   }
   $line = rtrim(fgets($FILE, 4096));
}
fclose($FILE);

asort($cdata);

The test condition of the while statement here is a negated feof() function. This function checks whether the script is at the end of the file. The negation, indicated by the exclamation point, means that the statement block contained in the curly braces is to be performed if it's not at the end of the file. The first line in the statement block is an if statement that uses strpos() to determine whether the first character of the line retrieved is a # (i.e., a comment line). If it's not, PHP will process the record.

The next line of code uses the list() function in conjunction with the split() function to parse the fields of the first line of data (saved in $line). The split() function separates each field based on the pipe (bar) separator. Note that the pipe must be escaped with a back-slash, just like the pound sign above. The last element of the split() indicates that there are seven fields to parse. The list() function saves the parsed data into the respective, named variables. Before moving on, an if statement checks that the client's method is FTP. The file-naming pattern is translated with preg_replace(). It uses the arrays set up earlier to replace the pattern found with the current date. For simplicity, we will overwrite the value of $pattern with the results.

Once the data for the first client has been parsed, it can be stored in a multidimensional array (i.e., an associative array of associative arrays) called $cdata. The keys of $cdata contain client identification numbers, and the values are references to memory addresses containing another associative array. This associative array contains the keys and related values for the client. Data easily can be retrieved from multidimensional arrays by using a double index. For instance, to retrieve from $cdata the name of a client for a particular client id, you would enter: {$cdata[$id]['name']}.

When the script finishes with the first record, the next line of code loads in another line of data from the client file. When the while statement is finished, the script closes the file handle with fclose(). For tidiness, it sorts $cdata with the asort() function.

Scanning Files

The next step is for the script to scan each client's directory. It will loop through the multidimensional array $cdata with a foreach statement. It will extract the client identification number and then loop through the client's respective associative array:

foreach($cdata as $id => $client_array)
{
   print "<b>{$cdata[$id]['name']}</b> <br/>";

   $path = "$dir{$cdata[$id]['subdir']}";

   $DIR = opendir("$path")
          or die("Could not open $path");

   while($filename = readdir($DIR))
   {
      if(is_file("$path/$filename"))
      {  
         if(preg_match("/{$cdata[$id]['pattern']}/", $filename))
         {
            $time = strftime('%T', filemtime("$path/$filename"));
            $size = filesize("$path/$filename");
            print("$filename <font color='gray' size='1'>
                   ($size bytes, $time)</font><br/>");
            $count++;
         }
      }
   }
   closedir($DIR);

The first line of the foreach statement block above prints the client's name for the particular client identification number by extracting it from $cdata. Then, the absolute path to the client's directory is spliced together and saved in $path. The opendir() function creates a directory handle ($DIR) for the client's FTP directory. If it fails, the script will die and an error message will be displayed. Next, the script loops through the list of files in the client directory with a while statement.

Within the condition section of the while statement, the readdir() function opens the client's FTP directory and extracts each file name, saving it temporarily in the variable $filename. Next, within the statement block, an if statement and an is_file() function are deployed to check each file to determine that it is actually a file and not a directory.

If the file found is a regular file, then another if statement is introduced with a regular expression using preg_match(). It will try to match the export file-naming pattern with the name of the file found. This will eliminate export files from previous days. If it finds a match, it will determine the date and time in which the file was last modified with filemtime(), coupled with strftime(), which will extract the time and format it. Next, the file size is determined with filesize(). Then, the file name, the file size, and the modification time are printed. A counter is then incremented to compare to the number of expected files later. Finally, the directory handle is closed.

Missing Files

Once the script has gone through all of the files in the client's FTP directory, it must compare the count of matching files found to the number of files expected. It will do this with an if statement:

   if($count != "{$cdata[$id]['exp']}"
             && $now > "{$cdata[$id]['$run']}")
   {
      $missing = "{$cdata[$id]['exp']}" - $count;
      print("<font color='red'>We are missing $missing
file(s).</font><br/>");
   }

The condition of this if statement extracts the number of files expected (i.e., the value of the key exp) from $cdata. It then checks to see whether the actual count does not equal (!=) the expected count. It also checks whether the current time is later than the time at which the export is expected to run. If both of these conditions are true (meaning files are missing and it's after the time that the export should have run), then it will calculate the number of missing files and print the difference within a warning message in red.

As an added touch, the following code provides the user with footnotes about what he should be seeing for each client. This can be useful when a file is missing:

   $run = substr("{$cdata[$id]['run']}",0,2)
          . ":" . substr("{$cdata[$id]['run']}",2,2);

  
print("<p><small><b>Expectations</b><br/>
          File Name: {$cdata[$id]['pattern']}<br/>
          Run Time: $run</small></p><br/>");

   unset($count);
}

?>

</body>
</html>

In the code above, the expected runtime is extracted and reformatted using a couple of substr() functions and some appends (i.e., the dots) to insert a colon between the hour and minutes. The first substr() grabs two characters starting at the beginning (hence, the 0,2). The second one gets two more starting at position 2. The runtime along with the file-naming pattern is printed in a small font. Next, the counter can be reset, or rather unset, so that it will be ready for PHP to go through the next client's FTP directory. Finally, this part of the PHP code section is closed and the Web page is ended, thus ending the script.

Conclusion

Although, the project presented in this article is fairly simple, it demonstrates many PHP functions. I hope it also gives you ideas for setting up PHP scripts to remotely monitor your systems. In future articles, I will cover PHP scripts that will monitor and handle other system maintenance tasks.

Russell Dyer is a Perl programmer, a MySQL developer, and a Web designer living and working on a consulting basis in New Orleans. He is also an adjunct instructor at a technical college where he teaches Linux and other open source software. He can be reached at: russell@dyerhouse.com.