The danger of privates, and composition vs. inheritance

Submitted by Larry on 11 December 2012 - 10:43pm

The private variables debate is going around the PHP world again. Brandon Savage posted a pair of articles pointing out the perils of private variables, boiling down mostly to them making extension infesible or impossible. Anthony Ferrara replied with his own article, arguing that the real problem is using inheritance in the first place rather than composition. I figured I'd weigh in on my own blog rather than in a comment. :-)

As an academic matter, I agree with Anthony. Composition and interfaces are more flexible than inheritance. I've been preaching the interface gospel within Drupal almost as long as I've been preaching Dependency Injection.

As a practical matter, however, rarely is the world so clear and academically pure. Sometimes, looking at a class I can see that it's API *and* implementation are 95% of what I want, but I need to tweak one piece. And that piece is an internal part of the implementation. It would be lovely if all classes were well-factored so that what I wanted to override in a wrapping adapter class was easy to do externally, but in practice that's simply not the case more often than not.

Even in a very well-designed system like Symfony2 I've repeatedly run into places where I had real, legitimate use cases for wanting to extend-and-tweak a class, but ran into something being private which made it require a lot of ugly contortions. And those contortions are contortions that I would not have been able to accomplish via pure composition at all.

Further, the PrinterInterface above is a nice and simple example. Nice and simple examples are great, but far too often the real world is not nice or simple. As a concrete example, take the SelectQueryInterface of Drupal 7/8's database library.

The interface has 26 public methods. Yes, 26. And I think if you examine the interface you'll see that it's entirely justified, and not a God Object (or God Interface). That problem space simply requires a lot of methods.

So to make a trivial wrapper around that object in order to do a little extra processing around just one method, you need *twenty five* stub methods that do nothing but forward the call. That's a huge burden to put on the developer, as now you're talking about, in well-formatted code, on the order of 100 lines of dumb boilerplate code just to "cleanly" write an adapter class even before you get to the meaningful part of the class. And, that does add runtime overhead for extra stack calls.

But it gets worse. Suppose that your wrapping objects have legitimate need to add methods. Their whole purpose is to take an object of one type (say, a select query building object) and add some additional functionality. That's what adapters do. Unfortunately, that means an adapter can't rely on the thing it's wrapping being only the interface it expects, as it may have other meaningful methods that need to be forwarded. PHP's only saving grace in this department is __call(), which is an ugly and slow hack but the only option here.

We ran into that exact problem in Drupal 7 with select query builders, and the only solution we had was a complex and fugly system of Extenders that I hate with a passion. Really, it's just a utility shell and, yes, base class for an object that has 100 lines of boilerplate call forwarding, plus a default __call() implementation. I hate it, but it's the only system we could come up with to work around the limitations of a "compose everything" mindset. (If someone has a better suggestion, I'd love to hear it.)

There was a proposal a while back for PHP to have some sort of "forward" keyword so that a class could declare "this interface I have? Forward all methods from it to this object I'm composing. kthx." That would neatly solve the 100 lines of boilerplate problem, but it didn't go anywhere unfortunately and Traits happened instead.

In an ideal world would private methods be fine, because you never need to extend classes? Sure. But rarely do I see code written in an ideal world. Composition and interfaces should be preferred over extension, absolutely, but it's not always possible or practical. In the real world, gaps happen and you need to extend things, and if the wrong thing is private, you can't.

That's why Drupal has a policy of not using private anything: We value extensibility and flexibility over purity, even while striving increasingly for architectural cleanliness in more recent efforts.

Hi Larry,

I think a very important argument for private variables is missing that is much more practical than academical: Evolution of software. If a framework like Symfony makes all properties protected, it opens the door for everyone to use everything, raising BC problems with every tiny change. If the properties are private though, people come back, give feedback about where the design is flawed so that the design can actually be fixed rather than hacked. The process is slower then, but the end result is better.

Bernhard

Hi Larry, I think you're mistaken in your believe that using private methods is somehow pure. It's not. It arises from the fact that PHP is too simple a language missing basic things like class invariants.

The best OO language in the world, Eiffel, does not have private inheritance. But it has the ability to make methods visible to the world, or to specific classes, or only to children. Every child can always override things. And children can override visibility too.

Private functions are just a poor man's only option.

Larry,

I get your point, and I don't disagree with it in practice. You know I'm a huge believer in Good Enough ( http://blog.ircmaxell.com/2011/03/difference-between-good-and-good-enou… ), and from that standpoint I agree.

However, I find it interesting that you bring up SelectQueryInterface as a counter example. The reason is that the pain that you are feeling there is not due to the real world nature of the class, but the bad design of it. If we think about it, it violates a few key points of SOLID coding:

SOLID Violations

Single Responsibility Principal

At first glance it may not seem to be violating this, but in the end it does. The reason is that it's really handling multiple things. You quote that there are 26 methods. But the way I see it, there are several other classes that these methods belong to that are not being used. For example:

  • addExpression
  • addField
  • fields
  • getExpressions
  • getFields

All are related to the responsibility of what fields the select should return. They are only tangentially related to the query itself, and as such belong on another object (perhaps SelectQueryInterface::getFieldSet())...

If we apply that out, we can reduce the SelectQueryInterface down to this:

interface SelectQueryInterface {
    // Expressions and Fields
    public function getFieldSet();
    // FROM and Joins
    public function getTables();
    // WHERE and HAVING
    public function getFilters();
    public function getGrouping();
    public function getOrder();
    public function getUnion();
}

So now we've cut that 26 down to 6. Each of those methods would return another object for manipulating that area (with an associated interface).

Additionally, you have clearer mixing of SRP violations with the prepare methods. The query builder should build a mid-representation of the query. It should have nothing to do with serialization of said query. So methods like preExecute() are a clear violation there...

Interface Segregation Principal

This is the big one. Right now, that interface represents one giant all-purpose interface. ISP says that we should be using small, client and task specific interfaces. The example that I gave above would be one way of handling that (and showing how we can reduce the size of the interface). Doing so would greatly reduce the pain that you feel.

It's to the point where I try to limit my interfaces to 5 methods at most. I don't always hit that mark, but if I don't, I weight the tradeoff between good and good enough, and decide from there. But I find 5 to be a good number, as if there is more, there's usually a case to be made for a SRP violation...

Wrapping Up

So I don't really disagree with your points, but more so with the example you used to try to justify it.

Furthermore, I fully agree about the boilerplate issue when creating decorators. I've been working on a proposal (won't get proposed by 5.5) to add a "decorates" syntax. So you'd have a decorator definition like:

class SelectObjectDecorator implements SelectObjectInterface {
    private $parent decorates SelectObjectInterface;
    public function __construct(SelectObjectInterface $obj) {
        $this->parent = $obj;
    }
    public function join() {
        // Your override here
        $this->parent->join();
    }
}

Finally, with respect to your last sentence: We value extensibility and flexibility over purity, even while striving increasingly for architectural cleanliness in more recent efforts., I think that it's a shallow argument for this point.

Protected vs Private does provide extensibility and flexibility, but at the expense of maintainability. Proper (I really don't want to use that term, but I can't think of another here) design will give you the same extensibility and flexibility (perhaps more, especially when it comes to composite flexibility) without sacrificing maintainability. It takes more time when writing the interface to distil it down to as simple as it can be, but there are definite long term benefits there...

Anthony

In many cases, yes, large interfaces are a code smell. But at the same time, forcing the developer using the object to go through even more layers of abstraction just to use it is a code smell, too.

Consider if we really did break SelectQuery out into a half dozen different objects. Now what does it look like when a developer wants to create a new query?

<?php
$query = new SelectQuery('t1');
$query->getTables()->join('inner', new Table('t2'), new JoinRule('t1', 't2', 't1field', 't2field'));
$query->getFieldSet()->addField(new Field('t1', 'somefield'));
$query->getFieldSet()->addField(new Field('t2', 'otherfield'));
$query->getOrdering()->addOrder(new Order('t1', 'datefield'));
$query->getFilters()->addFilter(new Filter('t2', 't2filterfield', '<', 'someval'));
$query->execute();
?>

That's way more verbose and difficult to wrap your head around. All developers I know, myself included, prefer to think of a select query as a single "thing". It's a command object they're setting up. The above model would mean exposing internal implementation details to developers (the fact that fields and filters are stored in separate sub-objects), which is not good.

We could no doubt quibble about specifics of the API (and preExecute() being public I don't like either; there was a specific use case that forced us to do so), but that's not the point here. My point is that guidelines are just that: guidelines. And going too far toward God Objects is bad, but also taking SRP to an extreme is just ad bad in different ways. As with most things, good API design and good software architecture is a balancing act, and a balancing act on which you have a finite amount of time.

On another note, I'd love to see a "decorates" keyword or similar in future PHP versions (give or take debating syntax details). There's a lot of other things I'd like to see, too, but that is one that would really help with this problem space and allow more composition for less work.

Larry,

We could of course go as deep as we wanted here, but I'll just make one more comment to that.

While I agree that directly developer workload may seem to suffer, it's not completely the case. For example, we could use the reduced objects (my suggestion) to actually implement the logic, and then use a different pattern for the fluent interface given to the user. For example, we could use a Bridge ( http://sourcemaking.com/design_patterns/bridge ) or a Facade ( http://sourcemaking.com/design_patterns/facade ). That way the "work" that you'd want to overload is still simple (because it has small interfaces, and classes with single responsibilities), but what's presented to the user is still simple as well, since there's a good API for it.

It may seem academic, but it's a pattern that I use all the time. For example, in PasswordLib, I use it so that the average user only needs to care about 1 or 2 classes (with simple interfaces), but the power user has the power to use either the deeper classes, or the ability to override functionality *behind* those two simple classes. Check out: https://github.com/ircmaxell/PHP-PasswordLib/blob/master/lib/PasswordLi…

Anthony

I imagine what you suggest as a Facade would boil down to a wrapper that would forward the calls to the respective pieces of the query. It could either be the query object itself that provides those "shortcut" wrapper methods, or it could be an object that actually wraps the select query.

Either way, you would have to write a good number of methods that essentially do nothing. And whenever you add a method to any of the query parts that you want to be public, you need to add a corresponding method on the wrapper. With the additional redirection penalty on the call stack.

I don't know if this is worth the trouble. I can perfectly understand why the select query in dbtng is written the way it is.

What we'd really need here is some kind of "compile time composition" that would avoid the runtime indirection, and auto-generate the wrapper methods ..

The patterns introduced with object-oriented design are nothing new, and exist for a reason. Everyone should pick up a good book, and read through it every night, AWAY FROM THE COMPUTER. Doesn't have to be PHP specific (in fact, I'd suggest against getting an OOP PHP book), but it does have to be design specific. Learn it, love it, live it.

Encapsulation is one of the base guiding principles which makes the architecture work, and we should abide by it.