As anyone who has followed my past work knows, software architecture is a particular interest of mine. I find the subject fascinating, but my interest is not entirely selfish.
Understanding architecture, and the trade-offs that different architectures imply, is an important part of any software project. Whether you're discussing a Content Management Platform like Drupal, a language like PHP, or a particular web site, having a solid understanding of the "big picture" is crucial not only for building the system right in the first place but for communicating that architecture to others.
To be able to speak and think about the design of your system properly, though, you need to understand the trade-offs that come with it. There is no such thing as a free lunch, and designing a system to be powerful in one way tends to harm it in another. It is important to know what your priorities are before you start building and, in a distributed collaborative environment like Drupal, for everyone to agree on what those priorities are, at least to a large extent.
Let us therefore examine those priorities and the trade-offs they require.
Software architecture is the process of structuring a large logical system. There are many ways to do so, and a number of common patterns and approaches for doing so. Architectural patterns are a more generalized case of design patterns; classics such as Model-View-Controller and Presentation-Abstraction-Control come to mind, as well as others less frequently seen on the web such as Pipes and Filters (the entire basis of the Unix command line) or the not at all pornographic Naked objects.
Different architectural patterns are not inherently good or bad. In fact, thinking of them as such is self-defeating. Different patterns are more or less appropriate given the nature of the system and its priorities. Without an explicit understanding of those priorities it is impossible to speak intelligently about what architectural pattern is appropriate.
The classic example here is not from software at all, but from cars. (Really, what discussion of computers is complete without a car analogy?) The 1960 Chevrolet Corvair featured a swing axle suspension system. That type of suspension is more commonly found on sports cars: it changes the handling in a way that makes sense for sports car drivers but that most sedan drivers aren't used to, because it results in less of the tire staying on the road during turns. That's fine if you're going for sporty handling but not very safe in typical city driving, especially if you're not used to it. The result was a much less safe car, not because the suspension was bad but because it was inappropriate for a sedan. (It also launched the career of Ralph Nader. True story.)
Even the "null architecture" is an architectural pattern: this is the "I don't think about architecture, I just add code until it works" approach, sometimes called "monolithic architecture". There are cases where it is appropriate, usually very small or short-lived projects.
Axes of Architecture
When considering the appropriateness of an architectural decision, there are a number of common factors to consider:
- Modifiability: How easy is it to change the way the program works later?
- Extensibility: How easy is it to tack on additional functionality, or take away existing functionality?
- Testability: Is the code structured in a way that makes it easy to separate out parts and unit test them?
- Verifiability: Can we prove, not just think, but mathematically prove that the code is correct in all cases?
- Performance: Does the program run quickly? How fast does it get through the task at hand?
- Scalability: How well does the system scale to lots of traffic? (Hint: Scalability is not the same as performance, although improved performance usually improves scalability.)
- Usability: How easy is it for the end-user to use and leverage the resulting system?
- Understandability: How easy is it for developers to understand and leverage the system? This is especially important for APIs. (Barry Jaspan referred to this as DX at one point.)
- Maintainability: All software requires updating and bug-fixing. How easy is it to do that?
- Expediency: How long does it take to actually, you know, write the damned thing?
What's more, these different axes are frequently at odds with each other. Extensibility and Modifiability, for instance, usually go hand in hand but make Verifiability and Testability extremely hard. Performance usually (but not always) helps Scalability, but Scalability can sometimes harm performance through over-abstraction. Maintainability and Expediency are often an either-or question, as writing cleanly extensible and maintainable systems is hard. The Perl language is frequently described as a "write-only language", because syntactically it strongly favors Expediency and Performance over Maintainability or Understandability.
Drupal has, implicitly, favored certain factors over others. That's not a bad thing, but it is important to understand, and agree on, what our priorities are and why we have them.
For instance, Drupal has always emphasized Extensibility. In fact, I'd argue that has traditionally been our most important architectural priority (except when it hasn't been) thanks to the hooks system. However, that extensibility has come at the cost of Testability and Verifiability. Even with the major push for automated testing in Drupal 7, which has been incredibly beneficial, Drupal is architecturally somewhere between difficult and impossible to properly unit test. Unit testing requires completely isolating a piece of code so that it can be analyzed in a vacuum. Hooks, by design, make isolating a piece of code nearly impossible. Code anywhere in the system could affect almost anything, and you can't control what a user decides to install.
Is that a good trade-off? If you're building a site-specific module, yes. If you're trying to debug an issue, no.
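To make the isolation problem concrete, here is a minimal sketch in Python of a hook-style registry. It is an analogy only: the names `implement`, `invoke_all`, `page_title`, and the `title_suffix` hook are hypothetical and are not Drupal's actual API. It shows the same function returning different results for the same input, depending on which "modules" happen to be loaded.

```python
# Hypothetical, simplified stand-in for a hook system; not Drupal's actual API.
from collections import defaultdict

_hooks = defaultdict(list)  # hook name -> list of registered implementations


def implement(hook_name):
    """Decorator that registers a function as an implementation of a hook."""
    def register(func):
        _hooks[hook_name].append(func)
        return func
    return register


def invoke_all(hook_name, *args):
    """Call every registered implementation and collect the results."""
    return [func(*args) for func in _hooks[hook_name]]


def page_title(node):
    """The 'unit' we would like to test in isolation."""
    title = node["title"]
    # Any installed module may contribute a suffix, so the result depends
    # on global state that this function does not control.
    for suffix in invoke_all("title_suffix", node):
        title += suffix
    return title


node = {"title": "Hello"}
before = page_title(node)  # no implementations registered yet


@implement("title_suffix")
def site_name_suffix(node):  # stands in for some unrelated, made-up module
    return " | Example Site"


after = page_title(node)  # same input, different output
print(before)
print(after)
```

A test of `page_title` written before `site_name_suffix` was registered would pass or fail depending on what else is loaded, which is the sense in which the unit cannot be analyzed in a vacuum.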
Drupal's hook system and standardization on Giant Undocumented Arrays of Doom(tm) is great for Modularity and Extensibility, because you can do pretty much anything anywhere by just adding an alter hook. However, if you're not already used to this entirely proprietary design pattern, it is incomprehensible and terrible for Understandability. (Even if you are used to it, it's still terrible for Understandability and Maintainability.) And if you've come from a background that uses more conventional techniques, or an academic background, you're likely to run screaming, and in fact many people do.
Is that a good trade-off? If your target developers are casual developers building site-specific code, yes. If your target developer is someone who already has extensive experience developing for any other system (PHP or otherwise) or has an academic background in CS, no.
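Both sides of that trade-off can be sketched in a few lines of Python. This is an illustration under assumptions, not Drupal code: `build_form`, `alter_callbacks`, and `my_module_form_alter` are made-up names, and the nested dictionary merely stands in for the kind of array structure described above.

```python
# Hypothetical illustration only: none of these names are Drupal's real API.
alter_callbacks = []  # functions that other "modules" have registered


def build_form():
    """Build a nested array describing a form, then let anyone alter it."""
    form = {
        "title": {"type": "textfield", "required": True},
        "submit": {"type": "submit", "value": "Save"},
    }
    for alter in alter_callbacks:
        alter(form)  # each callback mutates the array in place
    return form


def my_module_form_alter(form):
    """A made-up module changing someone else's form."""
    form["title"]["required"] = False         # intended change, works
    form["subtitle"] = {"type": "textfield"}  # add a new element, works
    form["submit"]["vlaue"] = "Publish"       # typo: adds a stray key, no error


alter_callbacks.append(my_module_form_alter)
form = build_form()
print(form["title"]["required"])
print(form["submit"]["value"])
```

Changing another module's form takes three lines, which is the Extensibility win. The cost is that the valid keys exist only by convention, so the misspelled `"vlaue"` is accepted without complaint and the button label quietly stays unchanged.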
For Drupal 7, there was an explicit decision by many developers to emphasize Scalability. That's not at all a bad decision, but in some cases that came at the cost of performance. The best example here is Field API storage. It is now pluggable, which is great and allows for non-SQL back ends to be dropped in. However, the extra abstraction that required makes the code harder to follow and makes combined storage harder. Similarly, the Expediency of getting a working SQL storage driver in place necessitated throwing out the dynamic table schema used by CCK in Drupal 6, which means a huge increase in JOINs and therefore a reduction in Performance.
Was that a good trade-off? If you're Examiner.com or Sony BMG or Acquia's Drupal Gardens, yes. If you're a small non-profit, church, or personal site on shared hosting, no.
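The JOIN cost can be seen in a small sketch using Python's built-in `sqlite3` module. The table and column names below are invented for illustration and do not match the real CCK or Field API schemas; the point is only the shape of the query when each field lives in its own table versus a single wide table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Layout A: one wide table holding several fields (hypothetical names).
    CREATE TABLE article (nid INTEGER PRIMARY KEY, subtitle TEXT, rating INTEGER);
    INSERT INTO article VALUES (1, 'A subtitle', 5);

    -- Layout B: one table per field (hypothetical names).
    CREATE TABLE node (nid INTEGER PRIMARY KEY);
    CREATE TABLE field_subtitle (nid INTEGER, value TEXT);
    CREATE TABLE field_rating (nid INTEGER, value INTEGER);
    INSERT INTO node VALUES (1);
    INSERT INTO field_subtitle VALUES (1, 'A subtitle');
    INSERT INTO field_rating VALUES (1, 5);
""")

# Layout A: a single-table lookup.
wide = conn.execute(
    "SELECT nid, subtitle, rating FROM article WHERE nid = 1"
).fetchone()

# Layout B: the same data needs one JOIN per field.
per_field = conn.execute("""
    SELECT n.nid, s.value, r.value
    FROM node n
    LEFT JOIN field_subtitle s ON s.nid = n.nid
    LEFT JOIN field_rating r ON r.nid = n.nid
    WHERE n.nid = 1
""").fetchone()

print(wide)
print(per_field)
```

Both queries return the same row, but the second grows by one JOIN for every field involved. That is cheap on a tuned database cluster and noticeably less so on cheap shared hosting.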
From some perspectives, Drupal 7 will be a huge, massive leap forward. From others, it's a huge, massive leap backward. That depends on what your needs and priorities are.
That's not to say that Drupal 7 is bad, or that the people building it (myself included) did something wrong.
Well, actually it does. Do we know what our priorities are? If forced to decide if it's worth sacrificing some extensibility for verifiability, or vice-versa, what would we decide?
Which is more important to make fast: The 95% of the market that runs on cheap shared hosting and has no PHP developers available to it, or the 5% of the market that runs its own server cluster and is more than happy to install MongoDB and Varnish and has four full time PHP developers, and therefore pays the salary of the people working on Drupal in the first place?
If there were a way to make Drupal faster for both of those groups, but at the expense of Modifiability and Extensibility, should we do it?
If there were a way to make Drupal easier for new developers to understand but at the expense of performance, should we?
If we can make Drupal easier to use for new site builders but at the expense of making it harder to develop for, should we?
These are the important questions that we need to be collectively asking ourselves. At the same time, we need to stop lying to ourselves and thinking that we can have our cake and eat it too.
What trade-offs are we willing to make?