Recently I've been talking up various ideas for pluggable subsystems in Drupal in IRC and the other usual haunts. Ideas have been percolating in my head, but so far I have been remiss in actually writing them down. Yesterday, however, I had an epiphany to solve the primary issue I was trying to work out, so I present a hopefully workable RFC (for real, not IETF version) for pluggable subsystems in Drupal.
I am posting this over to Planet PHP as well to invite commentary from those who aren't already embedded in the Drupal mindset. :-)
PHP sucks, but Drupal is worth it.
I have been running Mollom as my spam-fighter on this site for not quite two months now. It's been fairly effective overall. The nifty flash meter shows me just how bad the spam problem is (good grief, 593 blocked spam messages just on 15 May!), and I haven't gotten any spam in my comment list yet.
That is, until today, when a new form appeared.
At DrupalCon Sunnyvale 2007, Rasmus Lerdorf chided Drupal for spending over half of its request time on just the bootstrap process. As a GHOP Task, Cornil did a performance analysis of Drupal and found its two largest performance drains were the bootstrap process and the theming layer. Quite simply, Drupal spends too much time including code.
Drupal 6 has the beginnings of a solution. Page handlers, the least frequently used code in Drupal, can now be split out into conditional include files, and the menu system is able to load just the file it needs for a given page request. Based on earlier benchmarks, just that code shuffling netted Drupal 6 a 20% performance boost. The downside, however, is that it requires the module author to explicitly specify the file to be included, and the syntax for it is just a little bit annoying, what with the file name and file path being separate keys on the menu handler.
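For readers outside the Drupal world, the idea can be sketched roughly as follows. This is a minimal illustration in Python rather than PHP, and it is not Drupal's actual menu system: the `Router` class, `write_include()`, and the `example_pages.py` file are all hypothetical names invented for the example. The only thing borrowed from the description above is that each route carries the file name and file path as separate keys, and that the file is loaded only when its page is requested.

```python
import importlib.util
import os
import tempfile

# Hypothetical include file holding a page handler (stands in for a module's
# conditional include file; the name and contents are assumptions).
INCLUDE_SOURCE = "def about_page():\n    return 'About us'\n"


def write_include(directory, name, source):
    """Write a throwaway include file so the sketch is self-contained."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write(source)
    return path


class Router:
    """Toy router that loads a handler's file only on first dispatch."""

    def __init__(self):
        self.items = {}    # request path -> route definition
        self.loaded = {}   # full file name -> loaded module

    def register(self, path, callback, file, file_path):
        # File name and file path are kept as separate keys, as described above.
        self.items[path] = {
            "page callback": callback,
            "file": file,
            "file path": file_path,
        }

    def dispatch(self, path):
        item = self.items[path]
        full_name = os.path.join(item["file path"], item["file"])
        if full_name not in self.loaded:
            # Load the include file lazily, the first time it is needed.
            spec = importlib.util.spec_from_file_location(
                "lazy_include_%d" % len(self.loaded), full_name
            )
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)
            self.loaded[full_name] = module
        handler = getattr(self.loaded[full_name], item["page callback"])
        return handler()


include_dir = tempfile.mkdtemp()
write_include(include_dir, "example_pages.py", INCLUDE_SOURCE)

router = Router()
router.register("about", "about_page", "example_pages.py", include_dir)
first = router.dispatch("about")  # the include file is loaded here, not at registration
print(first)
```

Registering a route costs nothing beyond a dictionary entry; the cost of parsing the handler's file is paid only by requests that actually hit that page.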
Finding the release notes is a bit like finding the highway plans in the city building's 4th floor basement in a disused lavatory behind a locked door with a sign stating "beware of leopard."
For those who haven't noticed yet, the latest in an expected long line of Drupal books for this year has been published: David Mercer's verbosely-named "Building Powerful and Robust Websites with Drupal 6". It is not a book for the experienced Drupaler; its target market is people picking up Drupal, and the web for that matter, for the very first time.
Personally I think David has done a great job with it, but then I am biased; I was the tech reviewer for the book. :-) If you want an unbiased opinion, pick up a copy yourself and give it a read. Then you'll know how good it is. As an added bonus, 5% of all sales through Packt's web site are donated to the Drupal Association. Everybody wins!
By now you may have heard the news from Paris that a unit testing framework has landed in Drupal core. A huge shout-out goes to everyone involved. I particularly want to note the work that's been put in by former GHOP students and members of the GHOP team. It's amazing to see how far some people have come in a short time, despite still having homework to do. :-)
The next step, of course, is to make Drupal itself fully tested. That poses a number of challenges, particularly for unit tests. Because I'm sure others will be singing the (well-deserved) praises of the testing team, I want to take a moment to focus on that next step and one important approach: Testable APIs.
I recently had a discussion with Peter Wolanin about pluggable subsystems. (You can tell this is going to be an exciting entry already, can't you?) Drupal has supported a few pluggable subsystems for a long time, namely the database and cache systems. In both cases, they work on a very simple principle: Conditionally include one of two (or more) files that define the same set of functions but with different bodies.
That's all well and good and simple, but has some very serious limitations. Most notably, because the same function name is defined twice, you can never load multiple versions at the same time. That becomes a problem if you want to, say, connect to both a MySQL and a PostgreSQL database in the same page request. In addition, Drupal 7 is on track to include a fully introspective code registry for conditional code loading, which, based on earlier benchmarks, should be a huge performance boost. The Registry, however, assumes that all code resources (functions, classes, and interfaces) are globally unique throughout Drupal. Having a given function name defined twice will confuse the poor thing.
That is not an insurmountable problem, or even, truth be told, a difficult one. It simply requires switching from a simple include to a more formal mechanism. There are, in fact, several ways that can be done, so to further the education of the world at large (and probably embarrass myself a bit in front of other architecture buffs) I decided to write a brief survey of simple pluggable mechanisms.
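As a taste of what "a more formal mechanism" can mean, here is one possible shape, sketched in Python rather than PHP to keep it short. It is an assumption-laden illustration, not the design proposed for Drupal: `DatabaseDriver`, `MySQLDriver`, `PostgreSQLDriver`, `DRIVERS`, and `get_driver()` are all hypothetical names. Each backend gets its own uniquely named class behind a shared interface, so nothing is defined twice and two backends can coexist in one request.

```python
from abc import ABC, abstractmethod


class DatabaseDriver(ABC):
    """Shared interface that every pluggable backend implements."""

    @abstractmethod
    def quote_identifier(self, name):
        """Return the identifier quoted in this backend's dialect."""


class MySQLDriver(DatabaseDriver):
    def quote_identifier(self, name):
        # MySQL quotes identifiers with backticks (escaping is omitted here).
        return "`" + name + "`"


class PostgreSQLDriver(DatabaseDriver):
    def quote_identifier(self, name):
        # PostgreSQL quotes identifiers with double quotes (escaping omitted).
        return '"' + name + '"'


# Hypothetical lookup table: every class name is globally unique, so all
# backends can be loaded side by side without any name collision.
DRIVERS = {
    "mysql": MySQLDriver,
    "pgsql": PostgreSQLDriver,
}


def get_driver(kind):
    """Instantiate the backend registered under the given key."""
    try:
        return DRIVERS[kind]()
    except KeyError:
        raise ValueError("Unknown driver: %s" % kind)


# Both backends are live in the same "request".
mysql = get_driver("mysql")
pgsql = get_driver("pgsql")
print(mysql.quote_identifier("node"))
print(pgsql.quote_identifier("node"))
```

The calling code only ever talks to the interface, so which backend sits behind it is a matter of configuration rather than of which file happened to be included.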
It's the little things that really make or break a system. For instance, earlier tonight a song came up on my playlist in Amarok. I realized the name was misspelled. I corrected the ID3 tag. I then went to the directory where the file was and renamed it. The song was still playing. Amarok noticed and rescanned my collection, updating its records of the new file name, and kept on playing the song without any interruption.
That is how a computer is supposed to behave. :-)