Recently I've been talking up various ideas for pluggable subsystems in Drupal in IRC and the other usual haunts. Ideas have been percolating in my head, but so far I have been remiss in actually writing them down. Yesterday, however, I had an epiphany to solve the primary issue I was trying to work out, so I present a hopefully workable RFC (for real, not IETF version) for pluggable subsystems in Drupal.
I am posting this over to Planet PHP as well to invite commentary from those who aren't already embedded in the Drupal mindset. :-)
At DrupalCon Sunnyvale 2007, Rasmus Lerdorf chided Drupal on spending over half of its request time on just the bootstrap process. As a GHOP Task , Cornil did a performance analysis of Drupal and found its two largest performance drains were the bootstrap process and the theming layer. Quite simply, Drupal spends too much time including code.
Drupal 6 has the beginnings of a solution. Page handlers, the most unused code in Drupal, can now be split out into conditional include files and the menu system is able to conditionally load just the file it needs for a given page request. Based on earlier benchmarks, just that code shuffling netted Drupal 6 a 20% performance boost. The downside, however, is that it does require the module author to explicitly specify file to be included, and the syntax for it is just a little bit annoying what with the file name and file path being separate keys on the menu handler.
For those who haven't noticed yet, the latest in a expected long line of Drupal books for this year has been published: David Mercer's verbosely-named "Building Powerful and Robust Websites with Drupal 6". It is not a book for the experienced Drupaler; it's target market is people picking up Drupal, and the web for that matter, for the very first time.
Personally I think David has done a great job with it, but then I am biased; I was the tech reviewer for the book. :-) If you want an unbiased opinion, pick up a copy yourself and give it a read. Then you'll know how good it is. As an added bonus, 5% of all sales through Packt's web site are donated to the Drupal Association. Everybody wins!
By now you may have heard the news from Paris that a unit testing framework has landed in Drupal core. A huge shout-out goes to everyone involved. I particularly want to note the work that's been put in by former GHOP students and members of the GHOP team. It's amazing to see how far some people have come in a short time, despite still having homework to do. :-)
The next step, of course, is to make Drupal itself fully-tested. That poses a number of challenges, particularly for unit tests. Because I'm sure others will be singing the (well-deserved) praises of the testing team, I want to take a moment to focus on that next step and one important approach: Testable APIs.
I recently had a discussion with Peter Wolanin about pluggable subsystems. (You can tell this is going to be an exciting entry already, can't you?) Drupal has supported a few pluggable subsystems for a long time, namely the database and cache systems. In both cases, they work on a very simple principle: Conditionally include one of two (or more) files that defines the same set of functions but with different bodies.
That's all well and good and simple, but has some very serious limitations. Most notably, because the same function name is defined twice you can never load multiple versions at the same time. That becomes a problem if you want to, say, connect to a MySQL and PostgreSQL database in the same page request. In addition, Drupal 7 is on track to include a fully introspective code registry for conditional code loading, which, based on earlier benchmarks, should be a huge performance boost. The Registry, however, assumes that all code resources (functions, classes, and interfaces) are globally unique throughout Drupal. Having a given function name defined twice will confuse the poor thing.
That is not an insurmountable problem, or even, truth be told, a difficult one. It simply requires switching from a simple include to a more formal mechanism. There are, in fact, several ways that can be done, so to further the education of the world at large (and probably embarrass myself a bit in front of other architecture buffs) I decided to write a brief survey of simple pluggable mechanisms.
A recent thread on the Drupal documentation list has brought up once again the "why don't we have a wiki on Drupal.org?" question. It comes up regularly; you can set your watch by it.
What I've never understood is why anyone would want to take a step backwards from Drupal's handbooks to a simple wiki. How would removing features and capabilities help Drupal's huge pool of documentation?
What exactly makes something "a wiki"? Let's examine:
One of the major changes in Drupal 6 (where "major" is defined as "worthy of a mention in Dries' keynote") was a new feature of the menu and theme hooks. The newly introduced "file" and "file path" keys in those hooks' respective retun arrays. allow them to define files that get included conditionally, only when needed. In theory, that should be a big performance boost; page handlers are virtually never called except for on the page they handle, so loading all of that code on every other page is a waste of CPU cycles. Of course, there is also the added cost of the extra disk hit to load that one extra file we need. Modern operating systems should do a pretty good job of caching the file load, but that may vary with the configuration.
So just how much benefit did we get from two dozen fragile patches that were a glorified cut and paste? And is it worth doing more of it? Let's benchmark it and find out.
For those who haven't really been following it, several hundred contributors, 13 months, and tens of thousands of lines of code have gone into making the only version of Drupal ever that is better than Drupal 5. So, naturally, we've released it and called it Drupal 6.
Drupal 6 boasts a boat load of new functionality, ranging from Ajaxy yumminess throughout the system to native support for OpenID to vastly enhanced multi-lingual support. Several entire subsystems have been either overhauled or totally rewritten to proivde more power, flexibility, and speed. The official press release has the complete rundown, or for the more visually inclined there's a new features screencast. For me, though, the new theming system is feature numero uno.
Marco Tabini, of php|architect magazine fame among other things, has been openly disappointed at the death of PHP 4. Not because he likes PHP 4, but because of the "OMG you're discontinuing something that everyone's still using!" argument.
I like your magazine, Marco, but I have to disagree with you on this one. :-)
I've just posted another "Data API What If"-style article over on http://groups.drupal.org/. I figured that was more archiveable than my personal blog, but I wager more people read Drupal Planet than a specific g.d.o group so I'm posting the link here as well in a blatant bit of self-promotion. :-)
I am also posting this over to Planet PHP, in case any other exPHPerts have words of wisdom to offer. Hi guys!