/usr/portage

Antipattern: the verbose constructor 48

Constructors are often used to shortcut dependency injection and parameter passing on instantiation. This is a valid practice and often leads to shorter code. Consider the following example (a simple value object, often used to not mess around with floats and to keep currency and amount together):

class Money
{
    protected $_amount;
    protected $_currency;
    protected $_divisor;
    public function __construct(
        $amount = null, $currency = null, $divisor = null)
    {
        if ($amount !== null)
            $this->setAmount($amount);
        if ($current !== null)
            $this->setCurrency($currency);
        if ($divisor !== null)
            $this->setDivisor($divisor);
    }
    ... setter and getter ...
}

Now consider instantiating this object. Instead of creating a new instance of “Money” and calling three setter, everything can be done compactly in the constructor.

bc . $money = new Money(13200, ‘EUR’, 100);

So for the money object this works pretty well. The code is easy to read, but wait, the first argument can be grasped easily, the second too, but the third? It is not too obvious that it is a divisor is passed. An alternative would be changing the constructor to accept an array. This is a replacement for true named arguments, as e.g. Python supports. Solar uses that a lot, as well as the Zend Framework.

$money = new Money(
    array(
        'amount' => 13200,
        'currency' => 'EUR',
        'divisor' => 100
    )
);

Much better readable but does your IDE code completion works? And what happens if you pass “amoµnt”, because your fingers are as clumsy as mine? Exactly, the parameter will be silently ignored.
But look at this:

$money = new Money();
$money->setAmount(13200);
$money->setCurrency('EUR');
$money->setDivisor(100);

It is at least equally short, readable, your IDE works and if you have problems with the dimensions of your keys on your keyboard (they are too small, it has nothing to do with your fingers) you will be warned. But we could even have an even shorter example while maintaining the readability. With fluent interfaces we would get the following:

$money = new Money();
$money->setAmount(13200)->setCurrency('EUR')->setDivisor(100);

Wonderful! If you want, you can add a newline between each object operator and you would have the same amount of lines but less dense code (sad that we don’t have fluent constructors, isn’t it?). Sometimes setters are so elegant.

So until know one thing should be clear: it is not just about easily writing the code, but about the next guy understanding it too. Because you never write code for yourself. Never. But let’s investigate some real live example. I work with a framework that allows me to define really nifty business logic by just sticking together a bunch of fields and every field having a bunch of validators and filters attached.

class User extends Model
{
    protected function _define(Definition $definition)
    {
        $definition->addField(new StringField('username', true, null, true));
    }
    protected function _getStorageClass()
    {
        return 'UserStorage';
    }
}

All the time I write such a definition, I need to look into the code to check the order of the parameters. I can remember the first parameter, but the rest is too similar. To explain it: the second parameter specifies whether the field is required, the third expects a default parameter and the fourth indicates whether the value can be changed after it has been set once. I’ve talked about filters and validators, right?

class User extends Model
{
    protected function _define(Definition $definition)
    {
        $definition->addField(new StringField('username', true, null, true))
            ->addValidator(new UniqueUserValidator())
            ->addFilter(new LowercaseFilter())
            ->addValidator(new RegexValidator('/^[a-z]+$/'));
    }
}

Definition::addField() returns the passed field object to allow adding validators and filters. What works for validators and filters, should work for the rest too, shouldn’t it?

class User extends Model
{
    protected function _define(Definition $definition)
    {
        $definition->addField(new StringField('username'))
            ->setRequired(true)
            ->setReadonly(true);
    }
}

I admit, a bit more code to write, but a huge improvement in readability and therefore in maintainability. Other variants, where setter are not a good solution is to create an expressive factory. We e.g. have a Criteria object that creates and orders Criterion objects internally. Because we don’t have a fluent constructor, we have a static create-method for the Criteria object.

$criteria = Criteria::create('User')->field('id')->equal(1);

The alternative with just utilizing the constructor would be horribly to read and would have limitations regarding the parameter parsing capabilities (except if func_get_args() is used, which is totally the opposite of the paradigm of strict APIs). But back to the constructor only example:

$criteria = new Criteria('User', array('id' => 1));

And how would you express “id not equal 1” with it? So that’s where expressive factories are an alternative.

Constructors, as like any other method, should have as less parameters as possible but as much as needed. Obvious. The constructor should only allow setting vital information for the object (if the object has a name, there is a good chance, that the name is the parameter of the class’ constructor because it is considered vital). And the ease of use depends heavily whether the parameters passed can be intuitively distinguished by looking at there values. As well when the code is written first time as for maintaining it for the rest of your life.

(There are a bunch of other tricks to make parameters more readable, like using class constants as parameters, but this is out of scope of this article).

Filed on 31-07-2008, 01:01 under , , & 48 comments & no trackbacks

Trackbacks

Trackback specific URI for this entry

No Trackbacks

Comments

  1. Matthew Weier O'Phinney means:
    published on July 31st 2008, 03:22:55 am *

    Nice post.

    In ZF, I’ve been pushing to use an array of named arguments that can be passed to the constructor — but that these same arguments correspond to mutators in the object. This gives both the ability to configure the entire object up front, as well as to use the mutators expressively via fluent interfaces. The approach actually was born form unit testing, which exposed the most efficient methods for testing object state.

    P.S. There appears to be double escaping occurring in your code samples.

    Reply

  2. Lars Strojny replys:
    published on July 31st 2008, 08:13:16 am *

    I saw the raise of named arguments in ZF, yes. The reason me not liking arrays as a named arguments replacement is a) taste and b) hm, taste, c) the IDE arguments (which is kind of a straw man, as I’m a VIM user) and d) the thing with the strict APIs.
    When using named parameters, everything raises and sinks with the implementation. Foo::setSomethin() (typo intended) just errors out, while array(‘somethin’ => ..) might not. So the result for implementation would bea whitelist approach, to throw "invalid argument"-exceptions, if someone passes an unknown parameter. That becomes problematic with inheritance (Liskov substitution principle, the child should be capable of at least the same as the parent class and when it is capable of handling more params than the parent, the validation process must be well defined). The advantage of mutators is that they fail if they do not exist. The same behaviour must be reimplemented with arrays. The thing that bugs me in ZF is that most of the time both is provided, a setter for params (setParams()) and individual setters. Zend_Layout even solves that programatically. So a number of classes have a lot of boilerplate code just to handle their different mutations and parameter handling is definitely not the domain problem of the layout component ;)

    Reply

  3. Aaron reckons:
    published on July 31st 2008, 04:03:38 am *

    Yeah mate, looked very interesting, but I couldn’t handle reading it all because the of the escaping occuring. I’m picking that the two apostrophes in my post will appear as ’

    Reply

  4. Lars Strojny means:
    published on July 31st 2008, 07:55:10 am *

    Hi Aaron, hi Matthew,

    the double escaping, does it take place in the blog for you or in the feed?

    Reply

  5. Aaron Heimlich states:
    published on July 31st 2008, 08:23:23 am *

    The double escaping takes place in the blog post (and comments, too).

    Reply

  6. Lars Strojny replys:
    published on July 31st 2008, 10:33:47 am *

    Strange. I can’t reproduce neither of those. Can you provide me a screenshot?

    Reply

  7. Ivo opines:
    published on July 31st 2008, 09:00:47 am *

    Same here (FF2), code snippets are hardly readable.

    Reply

  8. Matthew Weier O'Phinney states:
    published on July 31st 2008, 01:01:08 pm *

    The double escaping was only present in the actual page for me. However, emphasis is on past tense here — escaping is fine now, and the examples are readable.

    Reply

  9. Joshua Trii opines:
    published on July 31st 2008, 03:29:45 pm *

    I was going to report a bug on this escaping at the s9y forums but didn’t want to register.

    s9y seems to check the referrer and if it is a search engine it double htmlspecialchars() certain text (php code snips). If you hit this page via the feed from google reader, etc you will see it but if you navigate to it from another link or type the URL you do not.

    Reply

  10. Thomas Koch replys:
    published on July 31st 2008, 08:40:28 am *

    There is another possibility: Instead of passing an array to the constructor, you could utilize structs.
    Actually PHP does not have structs, but eZComponents provides two abstract classes to simulate them:
    "ezcBaseOptions":http://ezcomponents.org/docs/api/trunk/Base/ezcBaseOptions.html
    and
    "ezcBaseStruct":http://ezcomponents.org/docs/api/trunk/Base/ezcBaseStruct.html

    They avoid typos by throwing an exception when you try to access unknown members.

    Best regards, Thomas Koch

    Reply

  11. Lars Strojny means:
    published on July 31st 2008, 10:41:13 am *

    That looks a lot like the "argument object":http://c2.com/cgi/wiki?ArgumentObject. Does that mean, that ezComponents do parameter validation in the argument object too?

    The issue with argument objects is, that it just outsources the problem to another object instead of solving it. It looks like with a class derived from ezBaseOption my example would look like this:

    [geshi lang=php]
    $options = new Options();
    $options->readonly = true;
    $options->name = ‘foo’;
    $definition->addField(new Field($options));
    [/geshi]

    Reply

  12. Richard responses:
    published on July 31st 2008, 11:12:34 am *

    "PHP does not have structs"

    StdClass is a pretty good lightweight alternative though.

    Reply

  13. Lars Strojny answers:
    published on July 31st 2008, 11:38:14 am *

    I don’t agree, honestly. Structs have pre defined members. The stdClass does not verify that "foo" is a valid member. I guess stdClass just exists for Drupal and Serendipity, doesn’t it? ;)

    Reply

  14. tim says:
    published on August 1st 2008, 02:32:13 am *

    Do you know if something like structs is planned for PHP? It would be really useful, though. I hope a similar feature will be added to PHP 5.3 or PHP 6.

    Reply

  15. Lars Strojny reckons:
    published on August 1st 2008, 08:44:20 am *

    As far as I can see, there are no such plans currently. And I’m not even sure if I would consider this a helpful addition. We have objects, we have mutators and interceptors, so write your own :)

    Reply

  16. tim answers:
    published on August 1st 2008, 09:29:59 am *

    Yes, but the down side with this method is its bad performance. I think those things should be implemented in PHP itself rather than trying to "emulate" a possbile behavior (which is not even possbile).

    Reply

  17. Matthew Weier O'Phinney responses:
    published on August 1st 2008, 12:56:37 pm *

    You may want to check out ArrayObject.

    Reply

  18. tim replys:
    published on August 1st 2008, 05:51:43 pm *

    Thanks but I am not a fan of the SPL libraries. There are some benchmarks which show how slow RecursiveDirectoryIterator and other SPL libraries actually are. The differences were horrifying. To keep to the point, this ArrayObject is not exactly what I am looking for. These structs should be built into PHP because it is very unlikely that an IDE will ever support auto completion for array elements.

    Reply

  19. Joshua Trii states:
    published on August 1st 2008, 06:24:41 pm *

    You can do anything a struct can do by using an object. By creating a base class that implements __get/__set/__isset to deny access to undefined members you can easily emulate structs in php.

    Back to the discussion at hand, not remembering parameter order or name, etc is an error on the developers part. Just because you can abstract away any knowledge of the code doesn’t mean you should. While you are wasting your time trying to solve a problem in code that should be solved with training or replacing a developer, I am writing new code that actually does something.

    Reply

  20. tim means:
    published on August 1st 2008, 06:58:12 pm *

    Yes, this is another possibility of solving it but what I am looking for is a solution that will harmonize with PHP IDEs. Magic functions will never support code completion, etc. They behave like arrays.

    I agree with the other part of your message in a certain way but I think the developer should try to make their classes be more intuitive and easier to use for other developers. As already stated in the blog article, it is hard to figure out what the last parameter of the Money-initialization stands for. Therefore I suggest only to use the most vital parameters in the constructor and to set all the others by using member functions. Like:

    $money = new Money(13200, ‘EUR’);
    $money->setDivisor(100);

    If the divisor is not set, it will use a default value. It will not be needed very often, though.

    Reply

  21. Lars Strojny returns:
    published on August 1st 2008, 08:12:19 pm *

    PHPDocumentor uses property to document interceptors. As far as I know, the ezComponents use this syntax a lot. Something like that could be easily introduced for methods. So that you say "method NAME DESCRIPTION" in the docblock and PHPDocumentor reads it and your favorite IDE interprets it correctly.

    Reply

  22. tim opines:
    published on August 1st 2008, 11:10:50 pm *

    That is interesting. Unfortunately my favorite IDE, Monkey Studio, does not parse the phpDocumentor-comments. I used VIM before but I found Monkey Studio more convenient because of some featues VIM is lacking of like a project manager, a class and function navigator (using ctags), Qt application designer (I have not tried yet but I have heard that these files can be used in PHP-Qt just like you can use Glade files for PHP-GTK projects) and some others. The problem with Monkey Studio is that there are only a few developers working on it and it is quite unknown. Therefore it lacks some basic things that would be useful for PHP development but actually I was not able to find a better IDE for PHP development for GNU/Linux using Qt.

    PS: Please write more of those articles. They cover most of my thoughts when designing the interface of classes. In my opinion the design of a class is the most difficult thing. It is not unusual that it takes literally hours finding the best design of a class. That’s because I look for similar classes on the internet and check how it is been realized there before I get started myself. If I’d be using UML and similar stuff, it would even take more time. I am also having problems finding the right design pattern. Could you give us some tips on this? This would be great. I have the impression that this article attracts wide interest (due to the quanitity of comments here). Thanks.

    Reply

  23. Lars Strojny reckons:
    published on August 1st 2008, 08:14:28 pm *

    Hm, I’m always skeptic of claims without a proof. So where are those benchmarks and what does one class say of the rest of a library?

    Reply

  24. tim answers:
    published on August 1st 2008, 10:41:45 pm *

    I think the biggest problem with the SPL classes is the overhead through its initialization. I did not find that benchmark I had in mind but found another one which shows the difference between RecursiveDirectoryIterator and parsing an XML file containing a directory listing: http://www.silomarket.com/blog/autoloading-with-xml-or-recursivedirectoryiterator. It is no surprise that parsing the XML file went faster because there is less I/O access necessary. If I have the time, I will benchmark RecursiveDirectoryIterator myself with a more meaningful conclusion.

    Reply

  25. Matthew Weier O'Phinney supposes:
    published on August 2nd 2008, 01:20:43 am *

    Sorry, but I’m with Lars here: this is an apples and oranges comparison; a better comparison would have been RecursiveDirectoryIterator vs. the hurdles you need to go through to do the same with PHP 4 using glob and other such functions. Honestly, even if SPL does impose a performance hit, my guess is it would pale in comparison to database or web service access.

    Reply

  26. Lars Strojny states:
    published on August 2nd 2008, 02:23:08 am *

    This looks like a really specific use case. See, once a class is loaded, the whole library directory including subdirectories is examined. This must be slow, something like O(n^n) probably? ;-)
    Anyway, when evaluating a component, you would gather at least the following data: * How does it simplify writing of my code (time to market, labor costs) * How does it simplify reading my code (maintenance, labor costs, development costs over time) * How does it affect performance of my application servers (cost of operation) * How does it affect the scalability of my application (cost of operation) * How does it affect safety and security of my application (risk management) * Do I like it

    Of course, the last point is most important but there are a lot of question to answer. Normally you don’t answer these questions straight away and often a compromise is the way to go. It might affect performance, but servers are actually cheaper than your developers, so you will go and hire servers instead of new developers.
    But, back to topic, what I tried to illustrate is, that sometimes the slower variant is much better. Sometimes this might be valid for SPL.

    Reply

  27. Dave reckons:
    published on July 31st 2008, 11:03:01 am *

    I actually like the first one the best, this is most similar I guess to how C++ works. Your only argument against it seems to be that its not obvious what the arguments are.

    But if your using something like javadoc your editor will tell you, att the very least autocompletion gives the name of the variable in the function declaration so pressing ctrl+space at least in c++ or java would give you "float divisor" as a prompt.

    I havent used a fully featured IDE with php since I generally only need to do small edits as part of my job however this seems to be a problem with the editor not with the coding style.

    The perl style array example is clearly the worst, for as you say at runtime the structure of the associative array passed isnt checked. Further in all languages Im familiar with constructing an array is expensive.

    Your final example is OK but to me it’s inviting NullPointerExceptions. What if someone forgets to run one of the initialisation functions? Further you’re racking up overhead by putting needless function calls on the stack. It may be pretty small but if it’s run in time critical sections of code, or in a function with hihg recursion then it soon adds up and could end up with performance penalty or blowing away the stack 3 times earlier than you otherwise would due to recursion.

    Reply

  28. Lars Strojny opines:
    published on July 31st 2008, 11:35:37 am *

    Yes, the only criticism I have regarding the first example is that it is hard to read. I would even agree and say for a value object it is easy enough. But the example with adding a field is really hard to remember, as the parameters are too similar.
    Editor/IDE: I think IDEs are useful and they can provide vital help for the programmer, but I would consider a practice a bad one if the result is only usable with an specific IDE or editor. At the end you are always debugging by hand.
    Performance concerns: this is a valid concern for performance critical parts. Except that you don’t do them in interpreted languages normally. And even it is becomes a bottleneck, than you need to decide if the performance is worth the trade off.
    Mutated objects: preventing NullPointerExceptions is in fact harder, as the object needs to verify that it has a correct state. That’s why for value objects it is often fine to just have the constructor + getters to receive the values. So there is either a constructed object and this object is valid, or there is just no object.

    Reply

  29. tim states:
    published on August 1st 2008, 02:25:28 am *

    I have a question according to your comment. What do you mean by those NullPointerExceptions?

    Reply

  30. Lars Strojny supposes:
    published on August 1st 2008, 08:55:39 am *

    In Java a NullPointerException is thrown when you call a method on a variable which is null. In PHP you get a fatal error "Call to a member function method() on a non-object".

    Reply

  31. Johannes supposes:
    published on July 31st 2008, 11:59:37 am *

    The nice thing about the first version is that it ensures, that all object members are properly initialized, with the later ones you have to introduce some flag or something to check everywhere "is this object ready to use", and modern IDEs should give you a mouse over thingy to tell you the names of the parameters.

    Reply

  32. Lars Strojny responses:
    published on August 1st 2008, 08:46:28 am *

    That’s more trouble, yes. But allowing the state to be changed after the object is initialized makes the object much easier to test. So while implementing it it is more work, but it will pay back later.

    Reply

  33. Dave reckons:
    published on August 1st 2008, 06:26:01 pm *

    Well that really depends on if your object model permits the object to be mutable. Having setters and using a fully defined constructor aren’t mutually exclusive.

    In my opinion it is a bug to have a constructor which allows the resultant object to have an invalid state.

    Similarly your setters should also enforce validity of the state.

    I cant stand the whole perl idea of of just throwing a bunch of stuff into the function and the function picking out what it wants. Most likely returning an invalid state as a magic failure string (although Exceptions are a topic for another day). Yuck!

    Reply

  34. Lars Strojny replys:
    published on August 2nd 2008, 02:27:12 am *

    I totally agree, invalid states should never happen. What might happen though are incomplete state. Say, execute() depends on property foo and bar but just bar is given, execute() has to verify its dependencies properly. As a method should now what it is supposed (and on what it depends) to do internally, it’s totally fine to throw an "hey user, set bar, otherwise this won’t work"-exception.

    Reply

  35. Lukas responses:
    published on July 31st 2008, 09:53:09 pm *

    Actually I think its easy to support both approaches. The constructors should handle a reasonable set of parameters. Array style single parameters is valid here, especially to support various optional default parameters.

    Remember arrays are dynamic, so the array could just as easily come out of some configuration file!

    That being said, if you know up front exactly which things you want to set in the constructor, I would also prefer the explicit method calls. Even if its more verbose, with an IDE you get code completion (so you will likely type much less than the array style and not much more than the long parameter list style) with the added advantage of syntax checking.

    Reply

  36. tinou states:
    published on August 1st 2008, 09:04:26 am *

    i respectfully disagree. Using setters allows you to create objects in invalid states. Constructors are there to create objects in valid states. In your money example, you can’t detect that a setter wasn’t called. You could have a Money without a currency. If there are too many arguments to the constructor that’s a sign of code smell and need to rethink your class design.

    Reply

  37. Derek reckons:
    published on August 1st 2008, 12:13:46 pm *

    I agree with Dave. Coming from a C++ Java background it is important that objects are valid when instantiated. And using PHPDocumentor style comments takes care of the IDE issue, at least in PHPEclipse and I’m pretty sure other IDEs provide "intellisense" for parameters if the constructor has be commented appropriately.

    Reply

  38. kira means:
    published on August 1st 2008, 04:00:58 pm *

    nice, but you have to change the colors of your blog ! it’s hard for my weak eyes to learn correctly lol

    Reply

  39. Lars Strojny states:
    published on August 1st 2008, 08:15:40 pm *

    I’ve sometimes got that comment, but until know I just like it too much. And my design skills are somewhat limited, so I guess this layout will last at least a bit. Sorry :)

    Reply

  40. MonkeyT says:
    published on August 2nd 2008, 02:37:24 am *

    I was working with open code for a login mechanism and saw an instantiation like this:

    $money = new Money( $amount = 13200 , $currency = ‘EUR’ , $divisor = 100 );

    Not sure if I like this style or not, but since assignation in PHP actually returns the value that is being assigned, it worked fine. Mechanically, it’s the same as the first option you showed, but it clearly documents what was happening, even though you wound up with some useless variables in memory.

    Reply

  41. Lars Strojny responses:
    published on August 2nd 2008, 11:41:15 am *

    That’s possible, yes. Except that you pollute your variable space which can be dangerous.

    Reply

  42. stef opines:
    published on August 2nd 2008, 05:44:10 pm *

    Very cool article, and we could try the "Builder pattern" too…

    Reply

  43. Lars Strojny reckons:
    published on August 2nd 2008, 06:08:46 pm *

    The example with the internal DSL has some similarities with the builder pattern. Except that it aggregates a number of objects internally instead of returning them. So it’s neither a real factory nor a builder. That the different I tried to describe with "expressive factory".

    Reply

  44. Koen supposes:
    published on August 3rd 2008, 01:01:04 am *

    Thanks for this article.

    In reference to the Zend Framework, one example I like to present is Zend_Session. You can pass an array to set all session options. There are lots of them. Do you think this class would benefit from adding these all as setters?

    Reply

  45. Lars Strojny opines:
    published on August 3rd 2008, 04:51:06 pm *

    Zend Sesseion is really a wrapper around ext/session. A convenient one but still a wrapper. So the component is there to configure the session. I think that an expressive factory might be the way to go. Regarding it being static: most of the the time when you have static classes one of the collaborators should have its members. I say could because I did mot analyze the problem any further.

    Reply

  46. troels means:
    published on August 8th 2008, 12:31:21 am *

    Interesting post. I don’t think I generally agree with it though. Taking aside the extreme cases, I generally don’t mind complex constructors. They are usually a sign of two things; A) the class has many dependencies and B) the code is properly decomposed, putting these dependencies in distinct objects. The latter is a good thing in my book, so if there is a problem with the constructor’s size, it must be because of A. Eg. the class has too many dependencies. This can be fixed by refactoring. As such, the crowded constructor is a sign of trouble, and in that sense an anti-pattern, but it should be solved at the root, rather than trying to cure the symptom.

    Reply

  47. Koen replys:
    published on August 8th 2008, 12:40:48 am *

    I don’t think the problem addressed here is too many constructor arguments. It’s that it’s not always clear what the arguments reference. Even with one argument this can be the case. The article, IMHO, tries to propose a solution to this problem.

    Reply

  48. Lars Strojny reckons:
    published on August 8th 2008, 01:23:42 am *

    Hm, most of the time the dependencies can be injected after the object is constructed. And the getter can return a default object if it makes sense, so having a lot of dependencies is not really an argument for verbose/unreadable/unrepeatable constructors. On the other hand, having a lot of dependencies might also be a sign, that – well – the object just has too many dependencies. Instead of having a number of dependencies it might just depend on a few objects that themselves depend on other objects and so on. The premise of composing small objects, which is a good one, has not been invented to have a small object and hundreds of other small objects.
    But back to the dependencies and the proposal to have the getter return a default: in service locator based systems you will have a pair of mutators for the dependency Foo (setFoo(FooInterface $foo) and getFoo()). getFoo() will return protected property foo if it has been set to an implementation of FooInterface or request the implementation of FooInterface from the service locator component. So, no forced need to pass the object as an constructor argument. Either use the setter to inject a dependency, or inject it from the configuration of the service locator.

    Reply

Add a Comment & let me know what you think

Submitted comments will be subject to moderation before being displayed.