Sharing Open Source Code Through the CPAN
Randal L. Schwartz
The Comprehensive Perl Archive Network (CPAN) is a wonderful place,
full of contributed items for you to use, such as scripts and modules.
Modules are the core of the CPAN: little building blocks for you
to include in your applications.
Modules end up in the CPAN by being wrapped in a distribution.
A distribution is typically a compressed tar archive, identified
with a particular version number, and containing one or more modules
separated out into packages. A distribution also contains installation
instructions and often contains tests to validate proper build and
installation.
Let's look at a sample distribution that I entered into the CPAN
this past summer to see the essential parts of any distribution.
As a joke, I created the Acme::Current module, which gives
the current year, month, and day... as constants! The module goes
out of date and is updated every 24 hours, using an automated script
that I'll describe shortly.
The module itself is in Current.pm, shown in Listing 1.
Line 1 defines the module name by switching to its package. Line
2 enables strict -- a good thing for larger program development.
Lines 4 and 6 declare the package variables $VERSION, $YEAR,
$MONTH, and $DAY. Although I could have done this
with our rather than use vars, that would have restricted
my module to installations with Perl 5.6 or later. When contributing
to the CPAN, it's important to think about portability.
Line 8 provides the version number of the module. In this joke
module, it happens to be the entire active code of the module as
well. The values for the three constants are established using hardwired
values, and then they are wrestled into a single integer by sprintf.
The name $VERSION is special to Perl's require operator,
and also to the indexing software used to track changes to CPAN
modules. Because the indexing software is simplistic, I had to keep
this line within the guidelines established in the ExtUtils::MakeMaker
manpage.
The resulting version number for these constants is 20030720.
Any version number greater than this supersedes this release, so
it's important that the version number monotonically increase with
time.
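As a rough illustration of the code just described (a minimal sketch reconstructed from the description, not a copy of Listing 1), the active part of such a module might look like this. The exact layout of the $VERSION line is an assumption.

```perl
package Acme::Current;
use strict;

# "use vars" rather than "our" keeps this loadable on Perls older than 5.6.
use vars qw($VERSION $YEAR $MONTH $DAY);

# Hardwired date constants, packed into a single integer by sprintf.
# Everything stays on one line so that simple version extractors
# can evaluate the assignment on its own.
$VERSION = sprintf "%04d%02d%02d", $YEAR = 2003, $MONTH = 7, $DAY = 20;

1;  # the true value that require expects
```

With these values, $Acme::Current::VERSION comes out as 20030720, and each later date yields a larger number.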
Line 10 provides the mandatory "true value" required by the require
operator, on which the use operator is built. Line 12 defines the
end of the executable portion of the file, as we are now beginning
the POD for the module. While this line is essentially optional,
it does save a bit of time in the parser, because the parser doesn't
need to scan beyond the __END__ marker to find non-POD code
after the POD.
The POD starts in line 14. Again, the format of POD for a CPAN
distribution is a bit restricted because the automated tools are
extracting key information. The NAME heading is very specific,
for example. The synopsis here shows a typical use of this module.
I left out the remainder of the POD for space... you can see it
in the CPAN if you wish.
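For readers who have not seen the convention, here is a sketch of how the end of such a file could be laid out. The NAME entry follows the usual "Module::Name - one-line description" form; the description text and the synopsis below are placeholders of my own, not the wording in the actual distribution.

```perl
__END__

=head1 NAME

Acme::Current - placeholder one-line summary of what the module does

=head1 SYNOPSIS

  use Acme::Current;
  printf "It's now %04d/%02d/%02d.\n",
    $Acme::Current::YEAR,
    $Acme::Current::MONTH,
    $Acme::Current::DAY;

=cut
```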
To create a distribution from this module, we need to create a
distribution directory, and put this file beneath that directory.
I called mine Acme-Current and placed this module into lib/Acme/Current.pm
below that directory.
At the top of the directory, I include the next essential component
-- the Makefile.PL, shown in Listing 2. This file is a Perl
program that creates a Makefile that can be used to build,
test, install, and create distribution archives for this particular
distribution. Line 1 pulls in the ExtUtils::MakeMaker module,
which defines the meaning of the rest of the file.
Lines 2 through 8 create a Makefile when executed. The
parameters declare the name of my distribution (Acme::Current),
which file I'm getting the distribution version number from (the
only module here), and also that to properly install this module,
the user must have any version of Test::More installed. It's
important to list every non-core module that the distribution uses,
because the automated installation tools (like
CPAN.pm) can then install any necessary dependencies before
continuing on to my module.
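A Makefile.PL along these lines might look like the following sketch. It uses the standard ExtUtils::MakeMaker parameters NAME, VERSION_FROM, and PREREQ_PM; the layout is mine and is shorter than the listing described above, so treat it as an approximation.

```perl
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME         => 'Acme::Current',
    VERSION_FROM => 'lib/Acme/Current.pm',  # read $VERSION from the module
    PREREQ_PM    => { 'Test::More' => 0 },  # 0 means "any version will do"
);
```

It has to be run from the top of the distribution directory, because the VERSION_FROM path is relative to that directory.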
But what is Test::More? I've used it in my one test file
for this distribution, which I've placed in t/01-core.t below
Acme-Current, the contents of which are shown in Listing
3. Once again, it's a Perl program, using the Test::More
testing harness. To be recognized as a test, the file should match
the fileglob of t/*.t. If there's more than one test file, the files are
run in alphabetical order.
Line 1 pulls in the Test::More module, declaring that I'm
not prepared to define how many tests I'm running. If this had been
a real module and not a joke module, I would change this to:
use Test::More tests => 4;
because at the moment, there are four tests. Of course, I'd have to
change this every time I added or deleted a test, so the no_plan
option was good while I was monkeying around.
The first test (in line 2) is a use_ok test, which verifies
the syntax of the Acme::Current module. At this point, we
should now have the three constants defined in the Acme::Current
package namespace.
To test the values, I compute the current time vector using gmtime.
Then, using a series of is tests, I compare the value as
given by the module with the value as it should be. If the numbers
are the same, the test passes. Otherwise, the test fails, and the
discrepancy is noted. A failed test normally prevents the module
from being installed as well, unless the user is fairly insistent
(and a bit crazy).
These tests succeed as long as "today" (from a GMT perspective)
is the same as the values that I set when making up the version
number. If someone tries to install a stale version, the tests will
fail. Obviously, for this to work, a new version must be uploaded
every day, and the process to do that will be described shortly.
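The comparison at the heart of that test can be sketched in a self-contained form. Two things here are assumptions made so the sketch runs anywhere: the three lexical variables stand in for the package variables of the installed module, and a fixed epoch value (1058659200, which is 2003-07-20 00:00:00 GMT) replaces the current time. The helper name gmt_ymd is hypothetical.

```perl
use strict;
use Test::More tests => 3;

# Stand-ins for $Acme::Current::YEAR and friends (hardwired, as in the module).
my ($YEAR, $MONTH, $DAY) = (2003, 7, 20);

# Decode an epoch time into (year, month, day) as seen from GMT.
sub gmt_ymd {
    my $epoch = shift;
    my @t = gmtime $epoch;
    return ($t[5] + 1900, $t[4] + 1, $t[3]);  # gmtime counts years from 1900, months from 0
}

# A fixed epoch keeps the sketch reproducible;
# a real test would use gmt_ymd(time) instead.
my ($y, $m, $d) = gmt_ymd(1058659200);

is($YEAR,  $y, 'year matches GMT');
is($MONTH, $m, 'month matches GMT');
is($DAY,   $d, 'day matches GMT');
```

Swapping in the current time is what makes a stale distribution fail its own tests.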
Besides the module itself, the Makefile.PL, and one or
more test files, I also need a MANIFEST file to denote the
contents of the distribution archive. My MANIFEST looks like
Listing 4. It's just a list of files. Note that a comment is permitted
following some delimiting whitespace. One additional file is listed
there -- the README file, which should be a simple information
file. The README file is extracted separately and archived
in the CPAN so that users can download and examine just the README
without having to unpack the entire distribution. My README
is given in Listing 5.
So, now I've got the essentials of a distribution. To make a distribution
archive, I issue the following commands:
perl Makefile.PL
make all test tardist
The first command creates a Makefile, which the second command
uses to build the module (copying it into a staging area), run the
tests on the distribution (everything matching t/*.t), and
then if that all works, create a file whose name looks like Acme-Current-20030720.tar.gz.
Whew!
Once we have the distribution archive, it's time to get that into
the CPAN. I had already applied for and received my CPAN PAUSE
ID, by visiting http://pause.cpan.org. Using that same Web
site, I can also insert items into my CPAN author directory, by
file upload, URL fetching, or FTP transfer.
I decided that since this joke would need to be updated nightly,
I would build a Perl program to update the constants in the .pm
file and then submit the file using a URL fetch. By experimenting
with WWW::Mechanize, I found the right combination of steps,
and put that into a cron-triggered script that I call Maintain,
kept in the same Acme-Current directory and presented in Listing
6.
Lines 1 through 3 start nearly every long Perl program that I
write, turning on the compiler restrictions. Lines 5 and 6 force
the current directory to be the same directory in which the script
is located.
Lines 8 and 9 pull in the WWW::Mechanize (found in the
CPAN) and File::Copy modules. Lines 11 through 14 define
the necessary paths and authentications for the CPAN upload. I use
a private directory on my Web server to provide the source for the
upload. My CPAN PAUSE ID is merlyn, but my CPAN PAUSE password
is not shown, because it is provided on STDIN from cron
for security.
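The two housekeeping steps described so far (running from the script's own directory, and taking the password from standard input) could be sketched as follows. Using FindBin to locate the script is my assumption, since the text doesn't say how the directory is found, and read_secret is a hypothetical helper name.

```perl
use strict;
use FindBin qw($Bin);

# Run from the script's own directory, so relative paths resolve
# the same way under cron as they do interactively.
chdir $Bin or die "Cannot chdir to $Bin: $!";

# Read one line from a filehandle and strip the newline.
# Passing the password this way keeps it out of the script itself.
sub read_secret {
    my $fh = shift;
    my $line = <$fh>;
    defined $line or die "No password supplied on input\n";
    chomp $line;
    return $line;
}

# In a real maintenance script: my $password = read_secret(\*STDIN);
```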
Lines 16 through 26 edit the .pm file to create the proper
version number and date constants. I use an in-place edit here,
looking for the $VERSION assignment line by recognizing the
specific pattern. Once that's complete, I can now go about the task
of testing and creating a new distribution.
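An in-place edit of this kind can be sketched with Perl's $^I variable. The function name update_version is hypothetical, and the pattern assumes a $VERSION line that begins with "$VERSION = sprintf"; the real script may well recognize the line differently.

```perl
use strict;

# Rewrite the $VERSION line of $file in place so that it carries
# the GMT date corresponding to $epoch.
sub update_version {
    my ($file, $epoch) = @_;
    my @t = gmtime $epoch;
    my ($y, $m, $d) = ($t[5] + 1900, $t[4] + 1, $t[3]);

    # Setting $^I turns on in-place editing for the <> loop below;
    # the empty string means "keep no backup copy".
    local ($^I, @ARGV) = ('', $file);
    while (<>) {
        if (/^\$VERSION\s*=\s*sprintf\b/) {
            # Regenerate the whole line with the new constants.
            $_ = sprintf qq{\$VERSION = sprintf "%%04d%%02d%%02d", }
                       . qq{\$YEAR = %d, \$MONTH = %d, \$DAY = %d;\n},
                       $y, $m, $d;
        }
        print;  # goes back into the file being edited
    }
}

# Typical call: update_version('lib/Acme/Current.pm', time);
```

All other lines pass through untouched, so running this many times on the same day leaves the file with identical content.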
Line 28 performs the steps shown earlier, throwing away any output
text. Because this is a joke module, I fail silently, erring on
the side of just not uploading a new module. Had this been a more
important module, I'd be parsing through the output text to verify
that it is as expected. At this point, we've either rebuilt the
same distribution archive as before, or we now have a new archive.
Line 30 cleans up some cruft that had been left over when the
same version number was used twice (which is true 23 times a day).
Line 31 does the actual business of attempting an upload for any
new distributions now made. Each name that fits the fileglob pattern
is submitted to the CPAN, using the subroutine beginning in line
33.
Line 34 saves the local file name parameter. Line 36 computes
this same name in the Web server directory, and if it already exists,
then we've already uploaded this file and return immediately (in
line 37).
Line 39 copies this new file to the Web server directory. Line
40 creates a WWW::Mechanize object to let us talk to the
PAUSE site and schedule a new upload.
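The "have I already uploaded this one?" check and the copy into the Web server directory can be sketched with core modules only. The function name stage_if_new and its return convention are hypothetical.

```perl
use strict;
use File::Basename qw(basename);
use File::Copy qw(copy);

# Copy $file into $webdir unless a file of that name is already there.
# Returns the new path after a fresh copy, or nothing if already staged.
sub stage_if_new {
    my ($file, $webdir) = @_;
    my $target = "$webdir/" . basename($file);
    return if -e $target;    # staged on an earlier run, nothing to do
    copy($file, $target) or die "Cannot copy $file to $target: $!";
    return $target;
}
```

A caller would go on to contact PAUSE only when this returns a path, which is what keeps the other 23 runs each day quiet.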
Line 42 establishes my PAUSE ID and password using HTTP "basicauth".
Line 44 gets the initial form that I see to log in. Line 45 follows
the login link, which will look at the "basicauth" credentials,
and present me with a menu of actions to take regarding my CPAN
author directory.
Line 46 follows the link to upload a file. The resulting form
has some very specific form elements, into which I stuff the URL
that maps to my new distribution (line 47). Line 49 submits that
form, and line 50 reports that a new distribution was submitted.
Because this is run from cron, I get one email every day
with this single text line in it.
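The PAUSE conversation might be sketched like this. The WWW::Mechanize methods used (new, credentials, get, follow_link, submit_form) are part of that module's documented interface, but the link texts, the form field name, and the button name are assumptions that would need checking against the live pages; submit_url_to_pause is a hypothetical name.

```perl
use strict;

# Ask PAUSE to fetch a distribution from $url.
sub submit_url_to_pause {
    my ($user, $password, $url) = @_;

    require WWW::Mechanize;                          # from the CPAN, not core
    my $mech = WWW::Mechanize->new(autocheck => 1);  # die on any failed request
    $mech->credentials($user, $password);            # HTTP basic authentication

    $mech->get('http://pause.cpan.org/');
    $mech->follow_link(text_regex => qr/login/i);    # assumed link text
    $mech->follow_link(text_regex => qr/upload/i);   # assumed link text

    $mech->submit_form(
        fields => { pause99_add_uri_uri => $url },   # assumed field name
        button => 'SUBMIT_pause99_add_uri_uri',      # assumed button name
    );
    print "submitted $url\n";                        # the one line cron mails out
}
```

Loading WWW::Mechanize inside the subroutine means the file still compiles on a machine where the module has not been installed yet.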
The next step is to set up this script to run every hour. Why
every hour? Because I don't want to figure out when GMT rolls over,
so I let the machine just do the work every hour, and upload only
when there's a new distribution. Simple, but effective. The cron
entry looks like this:
## Acme::Current joke module
11 * * * * /home/merlyn/Perl/Acme-Current/Maintain%MYPASSWORD
I do this at 11 minutes past the hour every hour of every day. Why
11? Because I have different things firing from my crontab, and I
try to stagger them so that they don't all start at the same time,
and "11 minutes past the hour" was the next slot available.
Thus, 24 times a day, the .pm file gets edited, and the
Makefile.PL is transformed into a Makefile, and the
distribution gets tested and an archive gets created. But only once
a day is this version number different from the previous number,
and then the PAUSE machinery is contacted to trigger an upload.
OK, so it's a lot of work for a joke that I eventually disabled
anyway, but hey, that's what Acme is for. I hope I've shown
how simple it is to create a distribution, though, and inspired
you to get your open source code pieces into the CPAN. Until next
time, enjoy!
Randal L. Schwartz is a two-decade veteran of the software
industry -- skilled in software design, system administration, security,
technical writing, and training. He has coauthored the "must-have"
standards: Programming Perl, Learning Perl, Learning
Perl for Win32 Systems, and Effective Perl Programming.
He's also a frequent contributor to the Perl newsgroups, and has
moderated comp.lang.perl.announce since its inception. Since 1985,
Randal has owned and operated Stonehenge Consulting Services, Inc.