RFC: Drupal pluggable system handlers

Recently I've been talking up various ideas for pluggable subsystems in Drupal in IRC and the other usual haunts. Ideas have been percolating in my head, but so far I have been remiss in actually writing them down. Yesterday, however, I had an epiphany that solves the primary issue I was trying to work out, so I present a hopefully workable RFC (for real, not the IETF version) for pluggable subsystems in Drupal.

I am posting this over to Planet PHP as well to invite commentary from those who aren't already embedded in the Drupal mindset. :-)

Background and definitions

The first important question is what exactly a "pluggable system" means. After all, Drupal already has a modular extensible system: Hooks. Why do we need another one?

The problem is that there are many ways to make an extensible system; some are better suited to certain types of extension than others. Drupal's Hook mechanism can be described, as webchick so eloquently puts it, as "Hey, I'm about to do X. Who wants to do something with/about it?" That is, hooks are a procedural implementation of the Observer pattern, with passive registration rather than active registration.
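To make "passive registration" concrete, here is a minimal sketch of how a hook dispatcher can work. It is not Drupal's actual module_invoke_all(); the names example_module_list() and example_invoke_all() and the two toy modules are assumptions made up for illustration. A module "registers" simply by defining a function with the right name.

```php
<?php
// Hypothetical stand-in for Drupal's list of enabled modules.
function example_module_list() {
  return array('alpha', 'beta', 'gamma');
}

// Two toy modules implement hook_greeting() purely by naming convention.
// The third module, gamma, does not implement it and is silently skipped.
function alpha_greeting($name) {
  return "alpha greets $name";
}

function beta_greeting($name) {
  return "beta greets $name";
}

// Simplified dispatcher: ask every module whether it has something to
// say about $hook, call it if so, and collect the results.
function example_invoke_all($hook) {
  $args = func_get_args();
  array_shift($args); // Drop $hook; the rest are passed to implementations.
  $results = array();
  foreach (example_module_list() as $module) {
    $function = $module . '_' . $hook;
    if (function_exists($function)) {
      $results[] = call_user_func_array($function, $args);
    }
  }
  return $results;
}

$greetings = example_invoke_all('greeting', 'webchick');
```

The caller never knows, or cares, how many modules responded. That is exactly the property that makes hooks a poor fit for "I need X done, who will do it?", where exactly one responder is expected.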

It's not perfect (nothing is), but Hooks, properly implemented, end up being an extremely powerful-yet-cheap extension mechanism. Because Drupal uses a lot of bare data structures, we are able to use Hooks not only for traditional Observer behavior but also for Inversion of Control, allowing the core system to simply act as a router and mapper, letting modules do all of the hard work. The recent growth in "registry-style" hooks (hook_menu() and hook_theme() in Drupal 6, for instance) is a great example of that.

However, there is another important type of extensibility that Drupal does not currently handle well at all. If I may borrow webchick's eloquent style, I would describe it as "Hey, I need X done. Who is going to take care of it for me?" While Drupal does include that sort of logic in a few places, most notably the menu handler system, it is not implemented in an extensible, consistent manner. Nevertheless, it is a pattern that Drupal uses often, or rather needs to use. Consider swappable caching systems, swappable session handling, swappable password hashing (coming in Drupal 7), swappable image libraries... And there are many more that we could have if we had a good way to implement them. Right now, though, we don't. Think output renderers, user management, user display name generation; the list goes on.

I debated what to call these pluggable systems, but have for now settled on "Handlers". Earl Miles pointed out to me that what is described above is very similar to the concept of Handlers in Views 2 and Panels 2, and the implementation I describe below is, while not the same as the "merlinofchaos suite", in some ways inspired by it.

Handlers

From a 10,000 foot perspective, a Handler is defined as:

  1. A self-contained piece of code that
  2. is called explicitly by some other piece of code to
  3. handle some particular set of related operations and
  4. can be "swapped out" for another implementation of the same interface with no changes to the calling code and ideally
  5. multiple implementations can exist in the same request in parallel.

That's great. We've just described the idea behind any function or subroutine. :-) If it's a multi-faceted subroutine, we've just defined an Object. The only tricky part is requirement #5, that multiple implementations can co-exist. Not all pluggable systems need such functionality, but many do.

That is the primary failure of Drupal's current de facto mechanism, conditional includes. Currently, Drupal handles multiple implementations of the cache, password, database, and session systems by specifying, via a hard-coded variable in settings.php, which of a number of files to include, each of which defines the same set of functions with alternate implementations. While that works, it is, quite simply, sloppy. It does not permit multiple simultaneous implementations. It requires hard-coding paths (albeit relative paths) in a config file; there is no cohesion between related pieces of functionality (cache_get() and cache_set() for instance) other than function name prefixes; it is difficult or impossible to configure via an admin interface; and duplicate names break the shiny new code registry.

Another key problem with the conditional include method is code duplication. The Drupal 6 database layer is an excellent example. The ext/mysql and ext/mysqli drivers share around 50% of their code verbatim. Duplicating that code between them is obviously a bad idea. The alternative, which we do, is to have both drivers include yet another file, database.mysql-common.inc, which contains the overlap. That doesn't work at all if we want to have handlers added by modules rather than hard-coded into the includes directory. We need something better.

While it would be possible to layer functions (as described in my earlier article), that requires each function implementation to duplicate the same pass-through code. If you need to call a lot of routines on a given subsystem, the extra function calls can get expensive. It also doesn't solve the shared code problem. It is, overall, a poor solution.

However, the requirements described above map almost exactly to an extremely common OOP pattern: the Factory pattern. In the simplest sense, a Factory is a routine you call to return an object that matches a given interface, but the exact implementation is determined by some encapsulated logic, however simple or complex it needs to be. The canonical example here is a shipping system; the system calls a factory object with a product to be shipped and its destination. The factory looks up (from somewhere) the closest warehouse to the user's location that has the product, determines which shipping partner (FedEx, UPS, USPS, etc.) would be cheapest given those two locations, creates a shipping object for that partner, and returns it. All the caller knows is that it has an object that conforms to the Shipper interface. New shipping partners can be added by just adding a new class to the system and poof.

In PHP, the factory doesn't even have to be an object. If the factory logic is simple enough, it can be a simple function that returns an object.
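As a rough sketch of the shipping example with a plain function as the factory: everything below (the Shipper methods, the two shipper classes, their made-up rates, and shipper_for()) is hypothetical and heavily simplified. A real factory would also look up warehouses and real carrier rates.

```php
<?php
// The only thing callers are allowed to know about.
interface Shipper {
  public function name();
  public function quote($distance_km);
}

// Hypothetical partner with a flat fee regardless of distance.
class FlatRateShipper implements Shipper {
  public function name() {
    return 'FlatRate';
  }
  public function quote($distance_km) {
    return 10.0;
  }
}

// Hypothetical partner that charges by distance.
class PerKmShipper implements Shipper {
  public function name() {
    return 'PerKm';
  }
  public function quote($distance_km) {
    return 0.05 * $distance_km;
  }
}

// The factory: a plain function that encapsulates the selection logic
// and returns whichever Shipper is cheapest for this distance.
function shipper_for($distance_km) {
  $candidates = array(new FlatRateShipper(), new PerKmShipper());
  $best = NULL;
  foreach ($candidates as $candidate) {
    if ($best === NULL || $candidate->quote($distance_km) < $best->quote($distance_km)) {
      $best = $candidate;
    }
  }
  return $best;
}

$near = shipper_for(50);    // Per-km pricing wins on a short trip.
$far = shipper_for(1000);   // The flat fee wins on a long one.
```

The calling code only ever sees a Shipper. Adding a third partner means adding one class and one entry in the candidate list; no caller changes.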

Self-contained: A sidebar

I want to highlight two key words in requirement #1 above: Self-contained. A self-contained system has a number of advantages; it is easier to debug, easier to develop, easier to test, and easier to unit test. The up-side is that it pushes all interaction with the outside world to a very narrow pinhole, its defined interface, and only interacts with the rest of the world passively as it gets configured by others calling it. The downside is that it pushes all interaction with the outside world to a very narrow pinhole, its defined interface, and only interacts with the rest of the world passively as it gets configured by others calling it.

Drupal currently is very much not self-contained. That is one of the things that makes it fast, because it can take shortcuts, but also one of the things that makes unit testing it extremely hard. In the interest of "embracing testing", therefore, I propose that we go ahead and standardize on a Handler object having no contact with the outside world beyond its specific domain, save through well-defined methods. That means no variable_get() calls, for instance. All of those should go in the factory, so that we can completely configure a Handler implementation in isolation and therefore unit test it to death more easily.
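To illustrate the division of labor, here is a minimal sketch. GreetingHandler, greeting(), and example_variable_get() (a stand-in for Drupal's variable_get() so the snippet runs on its own) are all hypothetical names; only the setProperty() idea comes from this proposal.

```php
<?php
interface HandlerInterface {
  public function setProperty($var, $val);
}

// Hypothetical handler. Note that it never reaches out for configuration;
// it only uses what has been pushed into it.
class GreetingHandler implements HandlerInterface {
  protected $properties = array();

  public function setProperty($var, $val) {
    $this->properties[$var] = $val;
  }

  public function greet($name) {
    return $this->properties['salutation'] . ', ' . $name . '!';
  }
}

// Stand-in for variable_get(), backed by a fixed array instead of the database.
function example_variable_get($name, $default) {
  static $variables = array('greeting_salutation' => 'Howdy');
  return isset($variables[$name]) ? $variables[$name] : $default;
}

// The factory is the only place that touches site configuration.
function greeting() {
  $handler = new GreetingHandler();
  $handler->setProperty('salutation', example_variable_get('greeting_salutation', 'Hello'));
  return $handler;
}

// A unit test can skip the factory and configure the handler by hand,
// with no database or variable table in sight.
$test_handler = new GreetingHandler();
$test_handler->setProperty('salutation', 'Hi');
$test_output = $test_handler->greet('Larry');
```

The test at the bottom needs nothing but the class itself, which is the whole point of keeping variable_get() out of the handler.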

Early access

The main challenge in implementing such a system in Drupal is pre-database initialization. A few pluggable systems, such as the cache system, sometimes need to initialize before the database does. The database is our primary storage mechanism, where we would store information on, say, which handler to use for a given system. Ideally we'd also want to rely on the registry to lazy-load just the Handler implementations we need, which, if they are classes, it can do automagically, but only if we have a working database. This is a problem.

However, let us consider why we need stuff to happen pre-database. Well, there is the database layer itself. That can't rely on this mechanism, period, but that's a special case. For the rest, particularly the cache system, the issue is that we need to be able to use non-database caching. We want a site to be able to, say, use Memcache for page caching, and be able to serve cached anonymous pages without ever hitting the database. Why avoid the database? Well, all things being relative, database access is very slow. Having to connect to a MySQL database on another server just to do a single lookup to see what file to parse to get the CacheMemcache class is rather wasteful.

Fortunately, not all databases are slow. In particular, PHP 5 includes an integrated copy of SQLite. SQLite is available through the PDO interface, which is what I am pushing with all my might to move Drupal to ASAP. It is also fast on read operations, because the "connection" cost is just a file stat call. Write operations are slow, though, for the same reason. The lookup tables for handler configuration should be very static. So if we move those to an SQLite database, we can still "use the database" for our configuration without "using the database". Neato.

The new database API includes support for exactly that sort of trickery. Master/slave replication is handled through "targets"; that is, a query can be specified to try to use one particular database connection (say, a slave server) but silently fall back to a default (the master server) if the selected target isn't found. The exact same setup can be used for selected system tables, such as the registry, system table, and handler lookups. Simply set those queries to run against a "system" target if possible, and then have an automatic way to replicate selected tables from one target to another when they change. If you don't have SQLite available, then everything runs through the main database and you don't get as much of a benefit from Memcache cache implementations. If you do, you can skip the main database more often.

To be fair, there is still a performance hit for the cost of simply loading and parsing the core database code. That is not a negligible number of nanoseconds. However, any other mechanism I have devised involves a lot of manual hacking about in settings.php. The elegance and simplicity we gain from being able to use "a database" is, I believe, worth the extra code load, especially if we can further optimize the database code to load faster (we can) and make use of class autoloading to reduce the overall code weight of Drupal in general.

The implementation

Enough talking, on with the code!

I am going to use the cache system as an example, mostly because it is a well-understood and fairly simple API. This is also very much in draft form, but I hope to have enough of the concept down that the API makes sense. Let's start off with a basic interface needed by all Handlers:

<?php
interface HandlerInterface {
  // Set one environment property. Called by the factory, not by the handler itself.
  public function setProperty($var, $val);
}
?>

All handlers must implement this interface. At a basic level all it does is define a way to set environment properties on a handler object. Those are set from within the factory with the result of variable_get() et al. instead of putting those calls inside the handler, which gives us better encapsulation.

We also need a way to define both things that can be handled and things that can handle them. For this, we use two registry hooks:

<?php
function hook_slot_info() {
  return array(
    'cache' => array(
      'targets' => array('default', 'block', 'filter', 'page'),
      'interface' => 'CacheInterface',
      'factory' => 'cache',
      'properties' => array('thingA', 'thingB'),
    ),
  );
}

function hook_handler_info() {
  return array(
    // The first key is the slot these handlers plug into.
    'cache' => array(
      'database' => array(
        // Translate on load, not define, like hook_menu().
        'name' => 'Database',
        'class' => 'CacheDatabase',
      ),
      'memcache' => array(
        'name' => 'Memcache',
        'class' => 'CacheMemcache',
      ),
      'mock' => array(
        'name' => 'Mock caching, does nothing',
        'class' => 'CacheMock',
      ),
    ),
  );
}
?>

There's actually a great deal going on here in these few simple lines. First, we define two concepts: a slot, which is a "thing into which a handler gets plugged", and a handler, which is "an object that gets used by a slot". The term "slot" is borrowed from Qt, and is likely misused here. I welcome suggestions on better names, provided it doesn't become a bike shed thread. :-) A slot has a unique ID (cache), an explicit Interface (CacheInterface) that all implementations must implement, a list of the properties that it is expected to have (I couldn't think of any for the cache system to use, so the above is just for illustration), a factory function that will be used for accessing the appropriate handler, and zero or more targets.

A target here behaves in a similar fashion to targets in the database system. It allows for multiple simultaneous implementations. If targets are specified, there must always be a target called "default" plus some number of additional targets. In this case, the page cache, block cache, filter cache, etc. can all be specified as separate targets. Each target can then have its own handler hooked up, so we can use, say, memcache for page caching but an SQLite database connection for filter caching. If a target doesn't have a handler defined for it, it uses the handler for the default target. If a given slot doesn't define any targets, then there is only ever one target, default.
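The fallback rule is simple enough to sketch. This is only an illustration: example_handler_map() stands in for whatever lookup table eventually holds the slot/target assignments, and example_class_for_target() is a hypothetical helper that takes the slot explicitly, which is an assumption on my part.

```php
<?php
// Hypothetical assignment data: which handler class serves which target of
// which slot. In the proposal this lives in a dedicated database table.
function example_handler_map() {
  return array(
    'cache' => array(
      'default' => 'CacheDatabase',
      'page' => 'CacheMemcache',
    ),
  );
}

// Resolve the handler class for a slot/target pair. A target with no handler
// of its own falls back to the slot's default target.
function example_class_for_target($slot, $target = 'default') {
  $map = example_handler_map();
  if (!isset($map[$slot])) {
    return NULL; // Unknown slot.
  }
  if (isset($map[$slot][$target])) {
    return $map[$slot][$target];
  }
  return isset($map[$slot]['default']) ? $map[$slot]['default'] : NULL;
}

$page_class = example_class_for_target('cache', 'page');     // Explicit assignment.
$filter_class = example_class_for_target('cache', 'filter'); // Falls back to default.
```

With this data, the page cache gets Memcache while the filter cache, which has nothing assigned, quietly uses the default database handler.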

We then define the handlers. Each handler is defined as bound to a specific slot (the first array key), and defines the class for that handler. That class must implement the interface defined in the slot_info hook, which in turn must extend HandlerInterface. We don't care where the handler class or the interface are defined in code; the registry will take care of that for us.

And of course, both hooks have a corresponding alter hook so that other modules can do whatever they need; usually that will mean adding targets (such as views caching).

<?php
function views_slot_info_alter(&$slots) {
  // Register an extra cache target that Views can have its own handler for.
  $slots['cache']['targets'][] = 'views';
}
?>

The results of both hooks are saved to the database in dedicated tables, not in the cache tables. There are two reasons for that. One, if we needed to use the cache to get to the slot/handler info then we couldn't use handlers for the cache. Two, memory. Having to load and de-serialize those arrays is not cheap, particularly on memory. The menu system is a much better model here.

We then have our cache interface and our cache implementations, like so:

<?php
interface CacheInterface extends HandlerInterface {
  public function get($cid);
  public function set($cid, $data, $expire = CACHE_PERMANENT, $headers = NULL);
  public function clear($cid = NULL, $table = NULL, $wildcard = FALSE);
}

class CacheDatabase implements CacheInterface {
  // Environment values pushed in by the factory via setProperty().
  protected $properties = array();

  public function setProperty($var, $val) { /* ... */ }
  public function get($cid) { /* ... */ }
  public function set($cid, $data, $expire = CACHE_PERMANENT, $headers = NULL) { /* ... */ }
  public function clear($cid = NULL, $table = NULL, $wildcard = FALSE) { /* ... */ }
}
?>

Again, these can live virtually anywhere, although presumably they will be provided by a module, because the registry will be able to find them and load them when needed. That also means fewer things in /includes and therefore more things that can be put into modules where they belong.

Finally, there is the factory function. All it does is create and return singletons for the appropriate target. In the interest of simplicity, we require all factories to have the same function signature. We also have a "factory factory" for indirect access.

A what? A factory factory is a factory that returns factories, of course! If that doesn't make any sense, think of it as module_invoke but for handlers. In fact, we'll even name it the same way.

<?php
function handler_invoke($slot, $target = 'default') {
  // Look up the slot's factory and lazy-load it through the registry.
  $function = get_factory_for($slot);
  if (drupal_function_exists($function)) {
    return $function($target);
  }

  return NULL;
}

function cache($target = 'default') {
  // One handler object per target, created on first use.
  static $targets = array();

  if (empty($targets[$target])) {
    $class = get_class_for_target($target);
    $driver = new $class();

    // If there were any properties, this would make more sense.
    $driver->setProperty('thingA', variable_get('thingA', 'stuff'));
    $driver->setProperty('thingB', variable_get('thingB', 'morestuff'));

    $targets[$target] = $driver;
  }
  return $targets[$target];
}
?>

A real implementation would include more error checking, of course, but you get the idea. The two magic functions listed above, get_factory_for() and get_class_for_target(), still have to be figured out. They could not use the cache or variable systems, only the database directly. That should be fine, however, and a reasonably expedient implementation could, I have no doubt, be written.

We could then call the cache system in one of two ways:

<?php
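// Indirect: lazy-loads the factory function if needed.
handler_invoke('cache', 'page')->get($cid);

// Direct: assumes the cache() factory is already loaded.
cache('page')->get($cid);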
?>

The latter is more self-documenting and easier to read, but the former will do a lazy-load for us. If the factory function is expected to already be loaded, go ahead and use the direct version. If not, use the indirect one to be sure. Because objects in PHP 5 are passed around as handles, and therefore always behave as if they were passed by reference, everything still works.

Explanation and discussion

Note that while here we're implementing a singleton for each target, that is not strictly required by the interface. If it made sense to, we could recreate the object each time. In that case we're using it as a more traditional factory, but that's fine. Hooks pull double duty quite well (hook_nodeapi vs. registry hooks), so handlers can, too.

It is also important to note that we are using only interfaces here; the slot itself requires no particular parent class. That allows us to use subclasses on the concrete implementations if it makes sense. Say all of the Cache implementations share some code from a parent CacheGeneral abstract class; or say we are defining an interface that's very SOAP-like, and want to extend the PHP-native SoapClient class and tack on the extra interface. Either way, we can.
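A sketch of the shared-parent case might look like the following. The CacheInterface here is deliberately cut down to two methods to keep the example short, and CacheArray and the prefixed() helper are hypothetical; only the CacheGeneral name comes from the paragraph above.

```php
<?php
interface HandlerInterface {
  public function setProperty($var, $val);
}

// Reduced version of the cache interface (an assumption, for brevity).
interface CacheInterface extends HandlerInterface {
  public function get($cid);
  public function set($cid, $data);
}

// Code common to every cache handler lives in an abstract parent. Being
// abstract, it may leave get() and set() for the children to fill in.
abstract class CacheGeneral implements CacheInterface {
  protected $properties = array();

  public function setProperty($var, $val) {
    $this->properties[$var] = $val;
  }

  // Shared helper: namespace a cache ID with an optional 'prefix' property.
  protected function prefixed($cid) {
    $prefix = isset($this->properties['prefix']) ? $this->properties['prefix'] : '';
    return $prefix . $cid;
  }
}

// Hypothetical concrete handler that keeps everything in a PHP array.
class CacheArray extends CacheGeneral {
  protected $data = array();

  public function get($cid) {
    $key = $this->prefixed($cid);
    return isset($this->data[$key]) ? $this->data[$key] : FALSE;
  }

  public function set($cid, $data) {
    $this->data[$this->prefixed($cid)] = $data;
  }
}

$cache = new CacheArray();
$cache->setProperty('prefix', 'site1:');
$cache->set('front', '<html>');
```

The slot only cares that CacheArray satisfies CacheInterface. Whether it gets there by extending CacheGeneral, SoapClient, or nothing at all is the implementation's own business.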

Also note that because PHP is weakly typed, it is true that there is no reason why we must use interfaces at all. They are simply syntactic sugar. I like sugar. :-) They act as a form of syntactic self-documentation. They also ease development because you know, syntactically, at the compiler level, what you need to implement. If you don't implement it, PHP itself will yell at you before you introduce bugs.

Because properties are assigned rather than pulled in internally, we have control over the environment of the handler. There will be only a single set of unit tests needed for each slot, and writing a new handler is as simple as implementing the interface and then banging on it until it passes all of the already-existing tests.
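Here is a rough sketch of what "one set of tests per slot" could look like. It does not use any real test framework; example_cache_contract_failures() is a hypothetical checklist, the interface is a reduced three-method version, and CacheArray and CacheSticky are made-up handlers (the second one is deliberately buggy).

```php
<?php
// Reduced cache interface (an assumption, to keep the sketch short).
interface CacheInterface {
  public function get($cid);
  public function set($cid, $data);
  public function clear($cid);
}

// Hypothetical in-memory handler that honors the contract.
class CacheArray implements CacheInterface {
  protected $data = array();

  public function get($cid) {
    return isset($this->data[$cid]) ? $this->data[$cid] : FALSE;
  }

  public function set($cid, $data) {
    $this->data[$cid] = $data;
  }

  public function clear($cid) {
    unset($this->data[$cid]);
  }
}

// Hypothetical buggy handler: clear() forgets to remove anything.
class CacheSticky extends CacheArray {
  public function clear($cid) {
  }
}

// The shared checklist for the slot. It only knows about the interface, so
// it can be pointed at any handler. Returns a list of failure messages.
function example_cache_contract_failures(CacheInterface $cache) {
  $failures = array();

  $cache->set('a', 'apple');
  if ($cache->get('a') !== 'apple') {
    $failures[] = 'get() should return what set() stored';
  }

  $cache->clear('a');
  if ($cache->get('a') !== FALSE) {
    $failures[] = 'get() should return FALSE after clear()';
  }

  return $failures;
}

$good = example_cache_contract_failures(new CacheArray());
$bad = example_cache_contract_failures(new CacheSticky());
```

A new handler author writes no tests at all; they just run the existing checklist against their class until the failure list comes back empty.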

There is also a performance benefit to using objects here. Consider an image handling system. You could very easily be calling 20-30 operations on a given implementation. While you could make a fresh call for each:

<?php
image('default')->drawLine();
image('default')->drawCircle();
image('default')->scaleBy(2);
// ...
?>

That's a lot of extra function calls. Instead, you can simply grab the object once and save yourself a lot of function calls and redundant target definitions:

<?php
$image = image('default');
$image->drawLine();
$image->drawCircle();
$image->scaleBy(2);
?>

In fact, when using a non-singleton handler that would be preferred, since each call to image() would give you a new object anyway.

Some systems may be small enough that they don't really need a full class; just a function will do. In those cases, the added overhead of a single-method class is tiny compared to the simplicity gained by not having to deal with both function and class options. I'm not even convinced that it would be noticeable at all.

Implementation

For implementation, I would recommend building the above structure and implementing it in just one subsystem, the password handling. That's a system that is only needed if the database is active anyway, and is small enough that conversion is easy as a proof of concept.

After the main system is in, we can convert other systems in parallel, as it seems logical to do so. Presumably by the time we get to things like the cache system, the new database layer will have landed and we'll have added an SQLite driver and a table replicator, so we can leverage a non-database database for those lookups transparently. And all will be right with the world.

Request for Comments

I now don my flame-retardant suit and throw the above architectural proposal out to the Drupal community for consideration. (And any PHP architectural gurus, too!)

Comments

Brilliant stuff!

I always appreciate your in-depth discussions of design patterns in Drupal. Where do you find the time to pump out so many?! The Larry Garfield pattern is an Overloaded Observer Factory (sorry).

But on to the topic: I had a brief discussion with chx on a train in Barca, where I was talking about wanting a module_invoke_all "kill switch". This is not really a good design pattern, but I sometimes run into an issue where I want a given module to let other modules run, but then to also be able to decide that I don't want any more modules to take this hook after I've got it. This of course requires doing the weight juggling dance in the system table, which I hate, but sometimes it makes sense. How would a case like this be handled?

In my case it was NAT and nodeapi, I believe. I had a complicated requirement where I wanted NAT on a vocab, but not to have it fire under some circumstances. What this meant was that I had to actually set a variable called $node->no_nat, and hack NAT to not run when it was present. It would have been nicer to just lower NAT's weight and then instruct the module_invoke_all handler to either just skip that one, or stop altogether and return.

Neato

This kind of stuff would make a lot of sense for Version Control API backends too... only that "don't use variable_get/set()" won't work well for those because there are considerable amounts of VCS-specific configuration. But I likely misunderstood this anyway, and maybe you meant something along the lines of "don't use variable_get/set() for data that is used in the same way by each handler".

Anyway, great stuff, I love reading your articles on design and abstraction :)

That's what properties are for

The goal of avoiding variable_get()/variable_set() (as well as globals, callouts to other handlers, and anything else that makes the system less testable) is to avoid any interaction with the outside world that is not happening through a narrow, easily-controllable, easily-testable pinhole.

Version Control API is also a great match for this setup, I agree. If it has a lot of configuration, then all of that configuration happens in the factory. You'd just have a lot more than 2 properties, set them up in the factory with setProperty(), and then internally just reference those values instead of re-calling variable_get(). That makes it easier to unit-test new Version Control API backends, because the object is more loosely coupled.

Ok, got it

The difference is not that I've got a lot more properties, but that those properties differ between the various backends. For example, the CVS backend needs a list of CVS modules in the repository whereas the SVN backend wants to know where the trunk, branches and tags directories are.

So my issue was that you were defining properties per slot, that is, "one configuration to rule them all". That's not quite feasible for Version Control API backends, and the factory can't set up stuff that it doesn't know about, as it only handles the generic parts.

If it's just for testability, I guess that issue could be resolved by splitting the configuration and "runtime" parts, where one could have a variable_*() based, admin-form configurable property object on the one hand and a hard-coded test property object on the other. Inversion of control would then insert either of those ($svn_backend->setProperty('svn_specific_settings', $hardcoded_svn_specific_settings)).

P.S.: One of these days, we should think of a policy for C style lower-case variable names vs. camel casing, because that slowly grows more annoying as the amount of object oriented code in Drupal increases.

Interesting

P.S.: One of these days, we should think of a policy for C style lower-case variable names vs. camel casing, because that slowly grows more annoying as the amount of object oriented code in Drupal increases.

For Drupal 7 and on, there is a standard: Follow PHP's lead. The PHP language standard, as implemented by the engine, is function_name(), ClassName::methodName(). Drupal should follow that. There will be camel case classes and methods used in Drupal if we make use of any of PHP's native classes such as SPL, so embrace the language and go with it.

As to your other point, hm, that is a tricky question. On the one hand, I am not suggesting we make it impossible for handlers to call variable_get(), just a convention that we don't do so, since it hurts testability and modularity. And sometimes we won't be able to fully decouple a subsystem; the CacheDatabase handler, for instance, will rather need access to the database. :-)

On the other, I'd hate to drop the idea of fully self-contained handlers so easily. You do point out a valid use case. I am not entirely sure I follow your proposal, however. The only semi-automated way I can see to make that work is for each handler definition to also define additional properties that only it needs, and then rely on the factory to populate those values out of the variable table. That could be factored out to a utility function, or worked into a method somehow. The problem is that we're then binding such properties to the variable system, which is post-database only. I don't know if we have any pre-database systems that would need custom configuration of that sort. Probably not, but it's still worth noting.
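To make that utility-function idea a bit more concrete, here is a rough sketch. None of this is settled API: the 'properties' key inside the handler definition, the slot_handler_property variable naming scheme, and the names VcsSvn, example_handler_info(), example_variable_get(), and example_populate() are all assumptions invented for illustration.

```php
<?php
interface HandlerInterface {
  public function setProperty($var, $val);
}

// Hypothetical SVN backend; properties are public only so they are easy to inspect here.
class VcsSvn implements HandlerInterface {
  public $properties = array();

  public function setProperty($var, $val) {
    $this->properties[$var] = $val;
  }
}

// Hypothetical handler definition that also declares the extra properties
// only this handler needs, as property name => default value.
function example_handler_info() {
  return array(
    'vcs' => array(
      'svn' => array(
        'class' => 'VcsSvn',
        'properties' => array('trunk' => 'trunk', 'branches' => 'branches'),
      ),
    ),
  );
}

// Stand-in for variable_get(), backed by a fixed array instead of the database.
function example_variable_get($name, $default) {
  static $variables = array('vcs_svn_trunk' => 'main');
  return isset($variables[$name]) ? $variables[$name] : $default;
}

// Utility a factory could call: push every declared property into the
// handler, reading each value from the (stand-in) variable system.
function example_populate(HandlerInterface $handler, $slot, $id) {
  $info = example_handler_info();
  foreach ($info[$slot][$id]['properties'] as $property => $default) {
    $name = $slot . '_' . $id . '_' . $property;
    $handler->setProperty($property, example_variable_get($name, $default));
  }
  return $handler;
}

$svn = example_populate(new VcsSvn(), 'vcs', 'svn');
```

The handler still never calls variable_get() itself, so a test can hand it whatever values it likes. The post-database caveat above applies unchanged.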

Can anyone think of a cleaner solution?

indeed

yes, i think this makes good sense. multiple implementations at once is a terrific goal. sqlite should be very helpful as you say.

@jacob - now that we have a code registry it should be easy to add weight for each hook implementation. we intend to follow the before/after model chx showed at http://cvs.drupal.org/viewvc.py/drupal/contributions/sandbox/chx/weights.... patches welcome.

Larry, this is an excellent

Larry, this is an excellent write-up and sounds very interesting. I agree with pretty much everything you said here: It gives a lot more flexibility to e.g. the cache system (e.g. different handlers for different targets), and, especially, standardization. Combining about 5 similar-yet-different APIs into one clear structure is definitely the Right Way (tm). The general layout you propose appears very clean to me.
I will try to find some free time to actually participate in this code-wise. I would definitely be interested.

A Thought-provoking Proposal

I like the proposal, and am enthusiastic about a few things.

First, I think the factory pattern is the right choice for this sort of subsystem. One of the things that most often drives me crazy with PHP applications is the hoops developers go through to *not* use the factory pattern. In part, I attribute this to the "un-codish" way that factories tend to look in PHP compared to, say, Java, Python, or Ruby.

Second, I am very happy to see the suggestion that interfaces be used. You mentioned traceability as one good reason for using interfaces. But another reason is the contract nature of an interface. Even in a weakly typed language, the presence of the interface goes a long way toward guaranteeing that the library does what it is supposed to.

Again, this "contract" thing underscores the very heart of your proposal. Modules implementing hooks are not "promising" to do anything. But a handler is. That is, as I understand it, one of the key differences (if not THE key difference).

I'm a little worried about the lazy load vs. non-lazy version you presented above. While having a cache() function might indeed be more self-documenting, it introduces an ambiguity into the code (is cache initialized?) while simultaneously introducing an inefficiency (requiring that the cache be initialized even if it's not needed).

Lazy loading has two advantages: (1) the developer doesn't have to ask whether the load has already happened (which means less boilerplate code), and (2) it is clearly more efficient, especially when using a pattern like Singleton (or some of the related wrapper-style patterns).

That said, I don't see why cache() could not just be a convenience function for wrapping the handler_invoke() method.

(I might just be misunderstanding the difference between the two. I'm having a hard time imagining how exactly the two mystery functions will work, especially get_class_for_target().)

Wow... this is the most exciting post I've seen in a long time. Since working with the Drupal 6 mail system I've been thinking about this. But I'd come nowhere near a solution like this.

Backward

I think you're getting the twin factory functions backward, based on this:

That said, I don't see why cache() could not just be a convenience function for wrapping the handler_invoke() method.

handler_invoke() is a convenience wrapper for cache(), not vice versa. The factory functions for different slots (I still need a better name there...) may be very different. We can't generalize that into one function to rule them all. So the primary mechanism for accessing a given slot's registered handler is via the direct factory function.

However!

It may be the case that the factory function is not already loaded. If so, it can trivially be loaded with drupal_function_exists().

However!

You may want to not go that route, and keep drupal_function_exists() out of your business logic code. If so, you can use handler_invoke(), which is a very simple wrapper for it. Use whichever makes sense in your case; both have the exact same effect.

Does that make more sense?

And yes, the mail system is another good target for this architecture. :-)

Handlers are Very Welcome

As you noted, hooks are very procedural and are inherently limited. It's good to implement handlers and delegates for a variety of reasons. They are OO and support interfaces and polymorphism. It's loads more flexible, as you can call them at will from different places in your code. I imagine this will make programming easier in situations where you need delegates and events.