Cover V13, i05

Constructing Objects

Randal L. Schwartz

To construct an object in Perl, you must select a valid package name for the object's class, populate that package with subroutines to define the methods, set the value of @ISA within that package to define the base (parent) classes for that class, and then create a blessed reference. For example, we can make widgets that know how to say their names and take on new names with the following code (I'll describe $self in a moment):

package Widget;

sub display {
  my $self = shift;
  print $self->{name}, "\n";
}

sub rename {
  my $self = shift;
  $self->{name} = shift;
  $self;
}
A constructed object compatible with this definition must be a hashref with at least a key of name holding the name of the object. We can construct such a hashref like so:

my $dog = { name => 'Spot' };
bless $dog, 'Widget';
The bless operation puts a little post-it note on the hash data structure (not the reference!) that says, "I belong to Widget". Now, we can invoke the methods like so:

$dog->display; # prints "Spot\n"
$dog->rename("Fido");
$dog->display; # prints "Fido\n"
How does this work? To execute the rename call, for example, Perl constructs an argument list consisting of the object variable ($dog) plus any arguments given to the method, resulting in:

($dog, "Fido")
Next, Perl looks for a subroutine in the package given by the post-it note (the object's class) named the same as the method. The subroutine Widget::rename gets invoked, and the first argument ends up in $self. The second argument is assigned as an element of the hash, and finally the subroutine returns $self (not a requirement, but handy for other operations).
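
The dispatch just described can be checked directly. The following is a minimal sketch that reuses the rename method from this column; the $alias variable and the use of the core Scalar::Util module are additions for illustration only:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

package Widget;

sub rename {
  my $self = shift;
  $self->{name} = shift;
  $self;
}

package main;

my $dog = bless { name => 'Spot' }, 'Widget';

# The method call and the fully qualified subroutine call hand the
# same kind of argument list to Widget::rename: the object first,
# then the new name.
$dog->rename("Fido");
Widget::rename($dog, "Rex");

# The "post-it note" is on the hash, so a second reference to the
# same hash reports the same class.
my $alias = $dog;
print blessed($alias), "\n";   # prints "Widget\n"
print $alias->{name}, "\n";    # prints "Rex\n"
```

The fully qualified call is shown only to make the argument list visible. Unlike the arrow form, it skips the method lookup, so it would not find an inherited method.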

Normally, we wouldn't hand-construct the object. The lines to create the hash and bless the object will be found in a constructor within the class. We'll invoke the constructor as:

my $dog = Widget->named("Spot");
To execute this class method invocation, Perl again constructs an argument list, but this time puts the name of the package as the first element:

("Widget", "Spot")
And upon finding the Widget::named subroutine, invokes it:

package Widget;

sub named {
  my $class = shift; # gets Widget (usually)
  my $self = { name => shift };
  bless $self, $class;
  $self;
}
Comparing this code to the code above, we see that we'll be returning $self, which is just like the $dog from before. (One common optimization is to know that bless also returns $self in this case, so we can leave that last line out with no change in result.)
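
As a sketch of that shortened form (same behavior, fewer lines), the constructor can let bless supply the return value:

```perl
use strict;
use warnings;

package Widget;

sub named {
  my $class = shift;
  # bless returns its first argument, so the blessed hashref is
  # also the return value of the subroutine.
  bless { name => shift }, $class;
}

package main;

my $dog = Widget->named("Spot");
print ref($dog), " ", $dog->{name}, "\n";   # prints "Widget Spot\n"
```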

For a more detailed explanation of this process, please see my most recent tutorial book, Learning Perl Objects, References, and Modules, from O'Reilly & Associates.

Now, why didn't we just hardcode the Widget value into the bless, and what's up with that "usually" in the comment? The complication arises when we get to inheritance. Suppose we have a subclass called ColoredWidget that inherits from Widget and adds two methods to manage the color of the widget:

package ColoredWidget;
use base qw(Widget); # sets @ISA

sub color {
  my $self = shift;
  $self->{color};
}

sub recolor {
  my $self = shift;
  $self->{color} = shift;
  $self;
}
Calling color or recolor on a ColoredWidget uses the subroutines found in the ColoredWidget package, but calling named on ColoredWidget uses the @ISA to find the named routine from the base class, Widget. In this case, the argument list will look like:

("ColoredWidget", "Spot")
Because the first argument to named is shifted off into $class, and then used in the bless, we get an object of class ColoredWidget instead of Widget.
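
To see the difference, here is a small sketch that puts the inherited constructor next to a hypothetical named_hardcoded constructor (not part of the original class) that ignores its first argument. The @ISA assignment stands in for use base so the example fits in one file:

```perl
use strict;
use warnings;

package Widget;

sub named {
  my $class = shift;
  bless { name => shift }, $class;
}

# Hypothetical counter-example: the class name is hardcoded.
sub named_hardcoded {
  my $class = shift;                 # ignored
  bless { name => shift }, 'Widget';
}

package ColoredWidget;
our @ISA = ('Widget');               # what "use base" sets up

package main;

my $good = ColoredWidget->named("Spot");
my $bad  = ColoredWidget->named_hardcoded("Spot");

print ref($good), "\n";              # prints "ColoredWidget\n"
print ref($bad), "\n";               # prints "Widget\n"
print $good->isa('Widget') ? "yes" : "no", "\n";   # prints "yes\n"
```

The hardcoded version quietly hands back a plain Widget, which would then lack the color and recolor methods.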

Our display method for ColoredWidget needs a bit more work, though, if we want the color as well. We can use overriding to handle that:

package ColoredWidget;

sub display {
  my $self = shift;
  print $self->{name}, ", colored ", $self->color, "\n";
}
Now, for ColoredWidget objects, this version of display is used in preference to the previous version. We can also extend rather than override by reusing the base class version of display:

package ColoredWidget;

sub display {
  my $self = shift;
  $self->SUPER::display;
  print "[color: ", $self->color, "]\n";
}
Now when we invoke display on a ColoredWidget, this version first invokes the display found by searching the base classes (as if there were no definition in this class). That invocation prints the name by itself. Then control returns to this method, and we add the color in brackets on the next line.
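
Putting the pieces together, the extended display can be exercised with a single-file sketch like this one. The class bodies are taken from the listings in this column; the @ISA assignment replaces use base only to keep everything in one file:

```perl
use strict;
use warnings;

package Widget;

sub named {
  my $class = shift;
  bless { name => shift }, $class;
}

sub display {
  my $self = shift;
  print $self->{name}, "\n";
}

package ColoredWidget;
our @ISA = ('Widget');

sub color {
  my $self = shift;
  $self->{color};
}

sub recolor {
  my $self = shift;
  $self->{color} = shift;
  $self;
}

sub display {
  my $self = shift;
  $self->SUPER::display;              # Widget::display prints the name
  print "[color: ", $self->color, "]\n";
}

package main;

my $dog = ColoredWidget->named("Spot")->recolor("brown");
$dog->display;
# prints:
# Spot
# [color: brown]
```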

The constructor here is named named because it reads like what it does: give me a Widget named Spot. But for tradition's sake, I could also call the constructor new. In fact, I might make a constructor new that returns an unnamed Widget (the name left as undef if referenced). This would look like:

package Widget;

sub new {
  my $class = shift;
  bless {}, $class;
}
Here, a simple reference to an empty hash is generated, blessed into the right class, and returned. To make Spot, I can now say:

my $dog = Widget->new;
$dog->rename("Spot");
That's a little more clumsy, but it gets the job done.
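
Because rename returns the object, the two steps can also be chained into one statement. A minimal sketch using the methods already shown:

```perl
use strict;
use warnings;

package Widget;

sub new {
  my $class = shift;
  bless {}, $class;
}

sub rename {
  my $self = shift;
  $self->{name} = shift;
  $self;                      # returning the object permits chaining
}

package main;

my $dog = Widget->new->rename("Spot");
print $dog->{name}, "\n";     # prints "Spot\n"
```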

Another advantage to always naming your constructor new is that you can easily create an object that is "like" another object. For example, if we have an unknown object $object, we can call ref $object to get its class, then create another object of the same class by calling new:

my $similar = (ref $object)->new;
But this works only if all of our possible classes of $object understand the same new method. Fortunately, for the times we're likely to do this, we've also made the classes work this way.
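
As an illustration, here is a sketch with two classes that share one new method. The make_similar helper is a hypothetical name introduced only for this example:

```perl
use strict;
use warnings;

package Widget;
sub new { my $class = shift; bless {}, $class }

package ColoredWidget;
our @ISA = ('Widget');        # inherits new from Widget

package main;

# Hypothetical helper: build a fresh, empty object of whatever
# class $object belongs to.
sub make_similar {
  my $object = shift;
  (ref $object)->new;
}

for my $object (Widget->new, ColoredWidget->new) {
  my $similar = make_similar($object);
  print ref($object), " -> ", ref($similar), "\n";
}
# prints:
# Widget -> Widget
# ColoredWidget -> ColoredWidget
```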

Another common operation is cloning: making an object that is a copy of the current object. It's not enough simply to copy the reference:

my $puppy = $dog;
This action copies the reference to the data but not the data itself. So if I rename the $puppy, the $dog changes its name as well! Cloning is best handled by copying all of the data. A naive clone could be executed as:

package Widget;

sub clone {
  my $self = shift;
  my $clone = { %$self }; # copy keys/values one level deep
  bless $clone, ref $self; # copy the object class, returning $clone
}
This will work properly for objects that do not have a deep structure, such as we've seen here so far. But what if one of the object attributes is a reference to yet another data structure? Again, we're copying the reference, and not the data, so the data will be shared amongst the clones. See my column from February 2000 (http://www.samag.com/documents/s=1168/sam0002d/0002d.htm) for more details on deep copying.
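
The three behaviors (copied reference, one-level clone, deep copy) can be compared side by side. In this sketch the tags attribute is hypothetical, and dclone from the core Storable module is just one possible way to deep-copy, not necessarily the approach taken in the column cited above:

```perl
use strict;
use warnings;
use Storable qw(dclone);   # core module; one of several ways to deep-copy

package Widget;

sub clone {
  my $self = shift;
  my $clone = { %$self };          # copy keys/values one level deep
  bless $clone, ref $self;
}

package main;

# "tags" is a hypothetical attribute holding a nested structure.
my $dog = bless { name => 'Spot', tags => ['loud'] }, 'Widget';

my $puppy   = $dog;            # same hash, second reference
my $shallow = $dog->clone;     # new hash, but the same tags array
my $deep    = dclone($dog);    # new hash and a new tags array

push @{ $dog->{tags} }, 'fast';
$dog->{name} = 'Fido';

print $puppy->{name}, "\n";                  # prints "Fido\n"
print $shallow->{name}, "\n";                # prints "Spot\n"
print scalar(@{ $shallow->{tags} }), "\n";   # prints "2\n"
print scalar(@{ $deep->{tags} }), "\n";      # prints "1\n"
print ref($deep), "\n";                      # prints "Widget\n"
```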

An alternative method of cloning is a more piecemeal approach. Teach each class in the hierarchy to clone the attributes added by that class:

package Widget;

sub clone {
  my $self = shift;
  my $clone = (ref $self)->new; # empty object of same class
  $clone->rename($self->{name}); # copy name attribute
  $clone;
}

package ColoredWidget;

sub clone {
  my $self = shift;
  my $clone = $self->SUPER::clone; # clone base class stuff
  $clone->recolor($self->color); # copy color attribute
  $clone;
}
Note the similarity of design. If a class knows that it adds a complex attribute (a reference to a deeper data structure), then it can add special copying instructions for that attribute to the new clone. This is good OO design, because the information contained within each base and derived class is maintained closely with the logic on which it depends.
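
Here is a sketch of what those special copying instructions might look like. TaggedWidget, tags, and retag are hypothetical names invented for this example; the subclass owns an array-valued attribute and is the only class that knows how to copy it:

```perl
use strict;
use warnings;

package Widget;

sub new { my $class = shift; bless {}, $class }

sub rename {
  my $self = shift;
  $self->{name} = shift;
  $self;
}

sub clone {
  my $self = shift;
  my $clone = (ref $self)->new;      # empty object of same class
  $clone->rename($self->{name});     # copy name attribute
  $clone;
}

# Hypothetical subclass that adds an array-valued attribute.
package TaggedWidget;
our @ISA = ('Widget');

sub tags {
  my $self = shift;
  @{ $self->{tags} || [] };          # return the tags as a list
}

sub retag {
  my $self = shift;
  $self->{tags} = [ @_ ];            # always store a fresh array
  $self;
}

sub clone {
  my $self = shift;
  my $clone = $self->SUPER::clone;   # base class copies the name
  $clone->retag($self->tags);        # fresh array, same elements
  $clone;
}

package main;

my $dog   = TaggedWidget->new->rename("Spot")->retag("loud");
my $puppy = $dog->clone;
$puppy->retag($puppy->tags, "small");

print join(",", $dog->tags), "\n";     # prints "loud\n"
print join(",", $puppy->tags), "\n";   # prints "loud,small\n"
print ref($puppy), "\n";               # prints "TaggedWidget\n"
```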

And now, before I run out of space, I'll touch on a hot-button for me. The perltoot manpage contains an archetypal new routine that looks like:

sub new {
  my $proto = shift;
  my $class = ref($proto) || $proto;
  my $self  = {};
  ...
}
The purpose of these few lines of extra code is to permit:

my $other = $dog->new;
to act like:

my $other = (ref $dog)->new;
But here's the problem. When I survey experienced object-oriented programmers and ask them what they expect new to mean when called on an instance (without looking at the implementation), the result usually divides rather equally into three camps: those who go "huh, why would you do that" and think it should throw an error, those who say that it would clone the object, and those who say it would copy the object's class but not the contents.

So, no matter what you intend, if you make your new do one of those three things, two thirds of the people who look at it will be wrong. It's not intuitive. So, don't write code like that, and especially don't just cargo-cult that from the manpage into your code. If you want an object like another object, use ref explicitly, as shown above. If you want a clone, put cloning code into your package, and call clone, as I showed earlier.
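
The two explicit spellings can be sketched as follows, using a new that expects only a class name and a one-level clone like the naive version shown earlier:

```perl
use strict;
use warnings;

package Widget;

sub new {
  my $class = shift;          # class name only; no ref($proto) || $proto
  bless {}, $class;
}

sub rename {
  my $self = shift;
  $self->{name} = shift;
  $self;
}

sub clone {
  my $self = shift;
  bless { %$self }, ref $self;
}

package main;

my $dog = Widget->new->rename("Spot");

my $other = (ref $dog)->new;   # same class, no contents
my $twin  = $dog->clone;       # same class, copied contents

print defined($other->{name}) ? "named" : "unnamed", "\n";   # prints "unnamed\n"
print $twin->{name}, "\n";                                   # prints "Spot\n"
```

With this simple new, calling $dog->new fails at run time, because Perl refuses to bless into a reference. Each intent has exactly one spelling, so the reader never has to guess.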

I hope you've learned at least one or two things about objects that you might not have considered before. Until next time, enjoy!

Randal L. Schwartz is a two-decade veteran of the software industry -- skilled in software design, system administration, security, technical writing, and training. He has coauthored the "must-have" standards: Programming Perl, Learning Perl, Learning Perl for Win32 Systems, and Effective Perl Programming. He's also a frequent contributor to the Perl newsgroups, and has moderated comp.lang.perl.announce since its inception. Since 1985, Randal has owned and operated Stonehenge Consulting Services, Inc.