Some time ago, I posted an RFC for pluggable "system handlers". It generated a fair bit of feedback, nearly all of it positive. That was followed up with a presentation in Szeged, which generated even more positive feedback.
So what's happened since then? Well, a fair bit. There's working code, but there are still some key gotchas to sort out. That gives us a couple of options for how to proceed, for which I would like feedback, particularly from core developers and maintainers. (Dries, webchick, this means you! :-) )
First, a brief conceptual overview of how Handlers have evolved since my original RFC.
- A slot (I still am not 100% on this name) is a system or subsystem which we want to be easily swappable. "Cache" is a slot, since we want a pluggable cache system. Other such systems include session handling, password generation, image manipulation, email sending, and potentially file storage, path caching, HTTP requests, user name generation, and various others.
- A Handler is a class that implements the interface for a given slot. That is, a Cache Handler is one particular implementation of the cache system and can be transparently swapped out for another Cache Handler. A Handler communicates with the outside world only through an Environment object.
- Environment object: Jacob Petsovits noted a key problem with the original RFC: setting properties on a handler was far too limiting, as different handlers might care about different variables and such. The solution that I developed is to pass into each handler an "environment object", that is, a front-end for accessing the rest of Drupal. All variable_get() calls become $this->env->variableGet(), for example. While that does add another layer of indirection, it gives us the optimal combination of power (handlers can still do anything) and testability. It's also an extremely common pattern in the OO world. See the Szeged presentation linked above for more details on the how and why. (Note: This is different from the "Context" system used by Panels.)
- Here's where it gets complicated. :-) One of the features that I really wanted to include in Handlers is multiple routing. That is, different handlers can be responsible for the same slot depending on the conditions of the action in question. For example, we could wire up page caching to the database but the smaller menu cache to memcache. Or for file storage we could map image files to the local file store and video files to a CDN, but only if they're larger than 1 MB. That gives us a great deal of flexibility in how we configure a Drupal site, and is based on the same logic as database targets in DBTNG. Kudos to perennial source of inspiration Jeff Eaton for a late-night chat at Ogilvie Transportation Center in Chicago that made me realize how this could and should work. Each slot defines its own targets, which can be multi-dimensional.
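To make the slot/handler/environment relationship above a little more concrete, here is a minimal sketch. It is not the handler module's actual API: every interface, class, and variable name in it (EnvironmentInterface, ArrayEnvironment, CacheHandlerInterface, MemoryCacheHandler, the 'cache_prefix' variable) is hypothetical and exists only for illustration.

```php
<?php
// Hypothetical names throughout; not the handler module's actual API.

// The environment is the handler's only door to the rest of the system.
interface EnvironmentInterface {
  public function variableGet($name, $default = NULL);
}

// A fake environment for tests: variables come from an array, not the database.
class ArrayEnvironment implements EnvironmentInterface {
  protected $variables;

  public function __construct(array $variables = array()) {
    $this->variables = $variables;
  }

  public function variableGet($name, $default = NULL) {
    return array_key_exists($name, $this->variables) ? $this->variables[$name] : $default;
  }
}

// The "cache" slot is defined by the interface every Cache Handler implements.
interface CacheHandlerInterface {
  public function get($cid);
  public function set($cid, $data);
}

// One interchangeable handler for the cache slot, backed by a plain PHP array.
class MemoryCacheHandler implements CacheHandlerInterface {
  protected $env;
  protected $data = array();

  public function __construct(EnvironmentInterface $env) {
    $this->env = $env;
  }

  public function get($cid) {
    $key = $this->prefix() . $cid;
    return isset($this->data[$key]) ? $this->data[$key] : FALSE;
  }

  public function set($cid, $data) {
    $this->data[$this->prefix() . $cid] = $data;
  }

  // Instead of calling variable_get() directly, ask the environment.
  protected function prefix() {
    return $this->env->variableGet('cache_prefix', '');
  }
}

$env = new ArrayEnvironment(array('cache_prefix' => 'test_'));
$cache = new MemoryCacheHandler($env);
$cache->set('menu', array('node/1'));
print_r($cache->get('menu'));
```

The point of the indirection shows up in the last few lines: the handler can be exercised with a fake environment and no running Drupal at all, which is what makes it testable.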
The code in the 2.x branch of the handler module in contrib contains all of the above code, mostly working. However, I've hit a snag with the targets system. Originally in Szeged I had a system that broke all target mappings if anyone changed the mapping parameters. After an all-night brainstorming session with Jeff and Kyle Cunningham at Drupal Camp Chicago (which was awesome), we devised an improved target mapping system that is inspired by the D6 menu system's materialized path logic. That's the code that can be found in the 2.x branch of the handler module now.
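As a rough illustration of the multi-routing idea itself (different handlers serving the same slot depending on the target), here is a deliberately simple, one-dimensional sketch. It is not the materialized-path mapping in the 2.x branch; SlotRouter, NamedCacheHandler, and the target names are all made up for the example.

```php
<?php
// Hypothetical names; a one-dimensional illustration only.

// A stand-in handler that just remembers which backend it represents.
class NamedCacheHandler {
  public $backend;

  public function __construct($backend) {
    $this->backend = $backend;
  }
}

// Maps targets within one slot to handlers, with a default fallback.
class SlotRouter {
  protected $default;
  protected $targets = array();

  public function __construct($default_handler) {
    $this->default = $default_handler;
  }

  public function attach($target, $handler) {
    $this->targets[$target] = $handler;
  }

  public function handlerFor($target) {
    return isset($this->targets[$target]) ? $this->targets[$target] : $this->default;
  }
}

$cache = new SlotRouter(new NamedCacheHandler('database'));
$cache->attach('cache_menu', new NamedCacheHandler('memcache'));

echo $cache->handlerFor('cache_page')->backend, "\n";  // database (falls back to the default)
echo $cache->handlerFor('cache_menu')->backend, "\n";  // memcache
```

Routing on a single key like this is the easy part. The trouble starts once each slot can define several dimensions of its own (file type and file size, say), because the lookup and especially its configuration grow much harder.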
How about core?
Sadly, while the multi-routing works, configuring the multi-routing does not. It's just too complex when you allow each slot to define arbitrary targets. Drat. At this point, however, I've decided that too many people who liked the idea are waiting on handlers to do cool stuff (myself included). Perusing the Drupal 7 core issue queue, it is becoming more and more evident that we need something like this in core as soon as possible. Just a few of the issues that seem to cry out for handlers include:
- Abstract SimpleTest browser into its own object: How about a curl-based implementation, a raw socket one, a simple string-based one... That way we can have a simple one in core that works everywhere and a curl-based one that simpletest can require.
- Abstract session handling to an object: We want swappable session backends, right?
- Option to disable IP logging: For sites that need heavy anonymity, we have to trade off the flood control. How do we balance that? Let the site admin decide by swapping out the flood engine.
- Pluggable password hashing framework: Peter Wolanin is already embracing this approach, in very-simplified form, for the new pluggable password hashing in D7.
- Adaptive path caching: It's really hard to figure out a single path alias caching system that will work for all sites, and even harder to properly test a new approach in the wild. Make that swappable, ship two with core (the one-at-a-time method and a load-them-all method), and let new systems be developed independently. Then we can move them into core once they've proven themselves in the wild.
- Swappable mail systems: One for production, one for development that doesn't actually send, one for silly Windows/PHP servers that do it differently, all configured via radio button.
That's not counting the obvious cases like the cache system, for which any function-based implementation breaks the registry. I'm sure I've forgotten others, too.
I see several possible ways forward at this point. I would like input from all and sundry on which we should take, but especially from the core maintainers who, you know, have final say on these things. :-)
- Continue in contrib. Do nothing in core for now, try to get Handlers 2 sorted out in contrib, then revisit the question again in a year for Drupal 8. Given the number of important threads above where we really need a system like this, I don't like this option.
- Fast-track it. Pour brain power into solving the remaining issues with Handlers 2, then move that into core. Nice as this would be, I'm not entirely sure that throwing brains at the problem will fix it in a timely manner. Plus, many of our big brains are otherwise occupied with important matters, such as Fields in Core.
- Drop multi-routing. Cut back the functionality to just have a single active handler per slot. No multi-routing, no multi-dimensional routing. The entire cache system uses a single handler, but we can then still easily swap out the cache system, or anything else using handlers. We can then revisit the multi-routing problem another time, possibly in contrib during the D8 dev cycle.
- No formal handlers, just a pattern. This is what the password system is looking to do right now, with the pending RTBC patch. Rather than a separate index for handlers and explicit hooks, it just uses the variable system to store which password engine is active and loads a class. There's an interface for password engines, but no common interface shared by all handlers. There's also no environment object.
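For comparison, the "just a pattern" approach in option #4 boils down to something like the sketch below. This is not the code from the pending password patch. ExampleVariables is a stand-in for Drupal's variable system so the snippet runs on its own, and the engine classes and variable name are hypothetical.

```php
<?php
// Stand-in for Drupal's variable system so the sketch runs on its own.
class ExampleVariables {
  public static $values = array();

  public static function get($name, $default) {
    return isset(self::$values[$name]) ? self::$values[$name] : $default;
  }
}

// The slot-specific interface. Nothing ties it to other pluggable systems.
interface ExamplePasswordEngine {
  public function hash($password);
}

// Toy engines for illustration only; not advice on real password storage.
class ExampleMd5Engine implements ExamplePasswordEngine {
  public function hash($password) {
    return md5($password);
  }
}

class ExampleSha256Engine implements ExamplePasswordEngine {
  public function hash($password) {
    return hash('sha256', $password);
  }
}

// The whole "pattern": read a class name from a variable and instantiate it.
function example_password_engine() {
  $class = ExampleVariables::get('example_password_engine', 'ExampleMd5Engine');
  return new $class();
}

ExampleVariables::$values['example_password_engine'] = 'ExampleSha256Engine';
$engine = example_password_engine();
echo get_class($engine), "\n";
```

It is short, which is its appeal. Note, though, that the engine gets no environment object and each subsystem would have to reinvent the lookup function for itself.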
Personally I favor option #3. Most subsystems don't actually need multi-routing, and fewer still need multi-dimensional routing. The code needed for #3 already exists in the handlers module, and can be moved into core fairly easily (or as easily as anything new gets into core). It also maintains the explicit definition of slots and handlers and, most importantly, the environment object. I can't overstate how important that extra layer of indirection is toward making it easier to develop and test new systems for Drupal. The more loosely-coupled Drupal's subsystems are, the better for everyone.
While option #4 is the path of least resistance, I don't think it's the best alternative. We lose any sort of standardization, self-documentation, and much of the encapsulation (from the environment object) by going that route. It also doesn't give us any natural upgrade path to "full handlers" once all the multi-routing weirdness is sorted out. By introducing the slot/handler/environment structure now, we can add targets back into the mix later once they've matured some more, possibly even via contrib.
We are also then dependent on the variable system for our configuration, typically, which means on the database. Ideally we want the cache to be able to initialize and run without hitting the database so that we can have an entirely memcache-served site. That, however, does require that we have a different way of getting to the handler configuration, as well as at least part of the registry available without the database. (settings.php is ideal for that.) Both are solvable if we have a separate system.
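Here is one way the database-free lookup could work with a single active handler per slot. It is a sketch under assumptions, not a proposal for the final API. Drupal's settings.php does already support a $conf array for overriding variables, but the 'handlers' key, the class names, and example_handler_for() are hypothetical.

```php
<?php
// In Drupal this array would be populated in settings.php.
// The 'handlers' key and the class names below are hypothetical.
$conf = array(
  'handlers' => array(
    'cache' => 'ExampleMemcacheHandler',
  ),
);

class ExampleDatabaseCacheHandler {
  public function backend() {
    return 'database';
  }
}

class ExampleMemcacheHandler {
  public function backend() {
    return 'memcache';
  }
}

// Resolve the single active handler for a slot from configuration alone,
// falling back to the slot's default. No database query is involved.
function example_handler_for($slot, array $conf, array $defaults) {
  $class = isset($conf['handlers'][$slot]) ? $conf['handlers'][$slot] : $defaults[$slot];
  return new $class();
}

$defaults = array('cache' => 'ExampleDatabaseCacheHandler');

echo example_handler_for('cache', $conf, $defaults)->backend(), "\n";    // memcache
echo example_handler_for('cache', array(), $defaults)->backend(), "\n";  // database
```

Because the mapping lives in a plain array, the cache handler can be chosen and instantiated before any database connection exists.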
Request for comments
So, there we are. How should we proceed? Who is willing to help with any of the above approaches if we go with them? Is this all just a pipe dream, or can we get some traction to allow new Drupal subsystems to be developed and improved faster, with better unit testing, and more admin configuration all in one package?
And what the heck do we call slots other than "slot"? :-)