So, I've been dragged more and more in to managing Java application containers like
Weblogic,
Websphere and
JBoss. These have a tendency to be hugely complex beasts, almost as complex as our favorite
database and performance optimization is sometimes quite difficult, or simply it's to much effort to actually upgrade or replace the server.
Azul offers a quite neat (but a tad pricey) solution to this. They off-load Java computation to a separate appliance, the smallest model has 96 processing cores and 48Gb of ram. The big daddy has a massive 768 cores and 768Gb ram. It's a by Azul in house engineered hardware with custom software (I would guess the OS is semi-based on one of our open-source friends (have a look at the ifconfig output)). The application server still a normal server (Linux/Solaris etc), the small JVM on the server pretty much acts as a proxy between external resources such as JDBC sources and the actual JVM on the appliance.
Their marketing crew calls it a "turn key solution", it's not really that easy but it's pretty straight forward to use.
The appliance itself takes about 15 minutes to install, setup the networking and the compute domain name and you are pretty much done with the appliance itself.
The application side is almost as easy. Azul provides sort of a "JDK wrapper", you unpack the wrapper and run a simple shell script to integrate it with an existing JDK, the script asks for the path to the JDK to "mimic". Works with IBM and Sun JDK's, both 1.4 and 1.5 (I haven't tried with Jrockit).
Change your appserver init script's to use the new JAVA_HOME, give it a couple of azul specific JVM options, give it 10Gigs or so of heap and off you go. One thing to remember is that most garbage collection arguments are now obsolete, azul uses it's own custom "pausless gc".
The first thing that hit me when starting the first app-server was how incredibly slow it was to deply, deploying EJB's etc took ages, but hm, yes, that's an almost single threaded operation. The application itself felt "ok" when using it, now the cool part, we really couldn't get the application to run slowly with our benchmarks, at least not if we count out the fact that it eventually trashed the database sever (just some random 8-core thing) behind it. Bottlenecks in the application tiers where all gone! It doesn't matter if 5 users are online or 250, the application performs exactly the same.
The simple conclusion now is "we need a bigger Oracle box!".
Azul provides a quite good web management console for the appliance, the gui provides functionality to manage compute pools if you need to throttle CPU and memory usage between applications or servers and also provides views to monitor applications and utilization.
I guess one could call it a bit of a custom "throw hardware at the problem" solution, the box itself is quite expensive but for a medium sized company with loads of j2ee apps it makes sense.