The reverse proxy feature of Apache httpd is an area which I like to particularly hack on. It's always been, imo, one of the killer features of httpd, but it is especially useful in the Cloud. Being able to dynamically allocate, enable/disable and load balance between back-ends is a MAJOR capability lacking in almost all other "generic" (and even "special-purpose") web servers. And the functionality of the reverse proxy in Apache 2.4 is pretty much state-of-the-art.
But there's a problem, which is not unique to Apache itself. In general, load balancers such as Apache do their load balancing determination based on values it calculates on the back-ends. Sure, Apache has a pretty nice set of load-balancing algorithms, but it still implies that the front-end (the reverse proxy) is responsible for predicting the load on the back-end. This is certainly not optimal.
Red Hat's mod_cluster works around this by creating a "unique" communication channel between Apache and JBoss, which allows JBoss to tell Apache some basic and useful loading parameters on the JBoss server itself. This is great, but limited (and hardly universal). What we really need, imo, is a general, universally agreed-upon method of the back-end sending server load data to the front-end. Something that all back-ends can easily generate (and enable/disable, of course) and something all reverse-proxies can easily parse and use. In other words, we need some sort of unofficial standard (which could eventually become a de-facto standard).
Using the assumption that starting simple is best, what I've been looking at is adding a simple HTTP Response header to Apache (via mod_header) which returns some subset of load parameters, the idea being that when the back-end sends the actual response, part of the meta-data it returns is some measure of its health and/or load. The front-end can then tuck that data away and use it in its load-balancing determination (while, of course, ensuring that the header is not forwarded to the client). Currently, in the trunk version of httpd (but hoping to backport it to 2.4), I have traditional Unix-type load-average and the percentage of how "idle" and "busy" the web-server is. But is that enough info? Or is that too much? How much data should the front-end want or need? Maybe a single agreed-upon value (ala "load average") is best... maybe not. These are the kinds of questions to answer.
Knowing that Apache httpd powers between 57% and 89% of the web (based on which surveys you use and/or trust), it seems logical to discuss this, and work out the particulars, on the Apache httpd development mailing list. I think the time for something like this "universal web-server load" value is long overdue. If interested, join the fun on email@example.com.