11 nov. 2015

Hints for Java JMX monitoring (for Tomcat, Alfresco, Liferay, etc.)

The problem

We want to monitor an Alfresco server which is not directly accessible from the outside world. It sits inside a (VMware) private virtual network behind a firewall.

Have you ever tried to access JMX in private virtual networks behind a firewall?

It's not easy at all, because of the way JMX connection establishment works: the client connects to a well-known RMI registry host:port. If no additional variables are set, the Java VM does the following:
1. Guess its own IP, based on the hostname and /etc/hosts.
2. Dynamically allocate a port to receive "server" connections.
3. Send this data (IP and port) to the client, so it can open the actual connection.
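
To see which address the JVM is likely to pick in step 1, you can repeat the lookup on the server yourself. This is only a minimal Python sketch: it assumes that resolving the local hostname gives the same answer the JVM would get by default, and the function names are mine, not part of any JMX tooling.

```python
import ipaddress
import socket


def guess_advertised_ip():
    """Resolve the local hostname, as in step 1; None if it does not resolve."""
    try:
        return socket.gethostbyname(socket.gethostname())
    except OSError:
        return None


def is_public_address(ip):
    """False for loopback or private addresses, which an outside client cannot use."""
    addr = ipaddress.ip_address(ip)
    return not (addr.is_loopback or addr.is_private)


if __name__ == "__main__":
    ip = guess_advertised_ip()
    if ip is None:
        print("hostname does not resolve locally")
    else:
        print(ip, "public:", is_public_address(ip))
```

If this prints a loopback or private address, a client outside the private network will be told to connect to an address it cannot reach.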

In our scenario (an Alfresco server in a private virtual network that is not directly reachable), this is a real nightmare.

Here are some solutions.

Solution 1: Create / use VPN

I don't have this solution at hand, so I'll jump to the next one.

Solution 2: Fix and expose JMX ports to the outer world (protected by firewall)

1. Download Apache Tomcat's extra catalina-jmx-remote.jar for your version of Tomcat and drop it into the tomcat/lib folder.

2. Add to tomcat/conf/server.xml something like this:

<Listener className="org.apache.catalina.mbeans.JmxRemoteLifecycleListener" rmiRegistryPortPlatform="8555" rmiServerPortPlatform="8556"/>

3. Add the following variables to tomcat/bin/setenv.sh (or tomcat/scripts/ctl.sh, in case of Alfresco):
CATALINA_OPTS="$CATALINA_OPTS -Dcom.sun.management.jmxremote "
CATALINA_OPTS="$CATALINA_OPTS -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"
CATALINA_OPTS="$CATALINA_OPTS -Djava.rmi.server.hostname=`hostname`"

4. Open your firewall for the given ports and source IPs.
  • The java.rmi.server.hostname value is sent verbatim to the client, so the same hostname can resolve to one IP on the server and to another on the client.
  • We disable SSL and authentication, on the assumption that access is protected by the firewall.
  • We assume that the firewall maps the ports seen from the outside world to the server in our private network.
  • Most articles on the Internet say that java.rmi.server.hostname should be a valid IP, which is one more reason why it is so difficult to get the configuration right. I inspected the network packets with ngrep and found that the value is sent verbatim.
  • On the server, the hostname value should resolve locally. When I used an outside IP, Tomcat didn't start up correctly (it took a very long time...).
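
Once the firewall is open, it is worth confirming from the monitoring host that both fixed ports accept TCP connections. The Python sketch below assumes the ports from the listener example (8555 and 8556); the host name alfresco.example.com and the helper names are placeholders of mine. The URL follows the format documented for Tomcat's JmxRemoteLifecycleListener, with the server port first and the registry port second.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, ports=(8555, 8556)):
    """Map each port to True/False depending on whether it is reachable."""
    return {port: port_open(host, port) for port in ports}


def jmx_service_url(host, registry_port=8555, server_port=8556):
    """Build the JMX service URL for the two fixed ports."""
    return "service:jmx:rmi://{h}:{s}/jndi/rmi://{h}:{r}/jmxrmi".format(
        h=host, s=server_port, r=registry_port
    )


if __name__ == "__main__":
    # Placeholder host name; replace it with the address your firewall exposes.
    print(jmx_service_url("alfresco.example.com"))
```

Calling check_ports("alfresco.example.com") from the monitoring host returns a dictionary such as {8555: True, 8556: False}, which tells you which of the two firewall rules is still missing.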

Solution 3: Use Jolokia and expose some special URLs to the outer world

Jolokia is an agent which translates JMX queries and operations to REST-HTTP/JSON. This makes it really easy to write a Nagios check script. I wrote one in Python, which took about an hour.

What I did:
1. Download the WAR from the Jolokia download page.
2. Unzip the WAR to edit web.xml.
3. Edit web.xml and uncomment the authentication section.
4. Zip the war again and drop it into the tomcat/webapps folder
5. Add a user with the "Jolokia" role to conf/tomcat-users.xml
6. Restart Tomcat
7. Test it with a browser at /jolokia/ (The browser should show an authentication dialog.)
8. Search for jolokia nagios plugins or write one.
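
As a starting point for step 8, here is a minimal Python sketch of reading the heap usage through Jolokia. It is not my plugin, just an illustration: it assumes HTTP basic authentication and uses Jolokia's read URL pattern (/read/<MBean>/<attribute>). The base URL, user name and password in the usage note are placeholders.

```python
import base64
import json
import urllib.request


def heap_read_url(base_url):
    """URL of Jolokia's read operation for the HeapMemoryUsage attribute."""
    return base_url.rstrip("/") + "/read/java.lang:type=Memory/HeapMemoryUsage"


def fetch_json(url, user, password, timeout=10):
    """GET the URL with HTTP basic authentication and decode the JSON body."""
    credentials = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def heap_in_mbytes(payload):
    """Convert the byte values of a Jolokia read response into MBytes."""
    if payload.get("status") != 200:
        raise ValueError("Jolokia error: {}".format(payload.get("error", payload)))
    return {
        key: round(value / (1024.0 * 1024.0), 2)
        for key, value in payload["value"].items()
    }
```

Usage would look like heap_in_mbytes(fetch_json(heap_read_url("http://myhost:8080/jolokia"), "monitor", "secret")), where the host, user and password are hypothetical.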

With a little more time, I modified my Nagios plugin (which I use from Shinken, not Nagios) to display all heap memory data in MBytes or as a percentage, so you can run something like this:

./check_jolokia_heap -U http://......  -w 80% -c 90% -u -p

and here is an example output (it should be on one line):
JMX OK HeapMemoryUsage.used=439.57{max=1185.5;init=1248.0;used=439.57;committed=1185.5}|HeapMemoryUsage.used=439.57;998.4;1123.2

Note that although we specify the -w and -c arguments as percentages, the values are translated into MBytes.

If the -P flag is given the values are translated into percentages:

JMX OK HeapMemoryUsage.used=34.96%{max=1185.5;init=1248.0;used=436.32;committed=1185.5}|HeapMemoryUsage.used=34.96%;80.0%;90.0%
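
The arithmetic behind these outputs can be sketched in a few lines of Python. This is not the plugin itself, only a hypothetical illustration: it assumes that percentages are taken relative to a reference value of 1248.0 MBytes, which is what the thresholds in the example output imply, and that 80% is the warning and 90% the critical threshold, as in the performance data.

```python
def threshold_in_mbytes(spec, reference_mb):
    """Translate '80%' into MBytes of the reference; plain numbers pass through."""
    if spec.endswith("%"):
        return round(reference_mb * float(spec[:-1]) / 100, 2)
    return float(spec)


def nagios_state(used_mb, warn_mb, crit_mb):
    """Return (exit_code, label) following the Nagios plugin conventions."""
    if used_mb >= crit_mb:
        return 2, "CRITICAL"
    if used_mb >= warn_mb:
        return 1, "WARNING"
    return 0, "OK"


def perfdata(used_mb, warn_mb, crit_mb):
    """Nagios performance data has the form label=value;warn;crit."""
    return "HeapMemoryUsage.used={};{};{}".format(used_mb, warn_mb, crit_mb)


if __name__ == "__main__":
    warn = threshold_in_mbytes("80%", 1248.0)
    crit = threshold_in_mbytes("90%", 1248.0)
    code, label = nagios_state(439.57, warn, crit)
    print("JMX {} |{}".format(label, perfdata(439.57, warn, crit)))
```

With the values from the first example, the thresholds come out as 998.4 and 1123.2 MBytes, matching the performance data shown there.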

If you're interested, leave a comment.

Solution 4: Invoke JMX monitoring through SSH

Before we begin, let's talk about the pros and cons:

  • Pros: You don't have to struggle with JMX configuration.
  • Cons:
    • The JMX monitoring command runs on the target machine, so make sure it has enough memory.
    • If there is any SSH issue, the command will fail, even though the JVM may be working correctly.
    • You need SSH, of course.

Basically, you don't have to bother with JMX ports, firewalls, etc. Just install the monitoring plugin on the target machine and invoke it through SSH.

Now, the question is how to invoke it automatically with no direct SSH connection. (Remember that the host is not directly accessible.)

Here you have two solutions:

a) Configure your firewall to forward an SSH port to the target machine

b) Use SSH ProxyCommand: Define in the ~/.ssh/config SSH configuration of the monitoring account something like this:

# Our proxied destination host
Host destination-host
  ProxyCommand ssh intermediate-host -W %h:%p

Make sure you can reach the intermediate host with key-based authentication, without typing a password:

ssh-keygen   # only if you don't already have a key pair
ssh-copy-id intermediate-host

Now, test your connection to the destination-host:

ssh destination-host

You should first be asked whether you trust the destination host's key fingerprint, and then be prompted for the password. If everything works as expected, just copy your public key to the destination host:

ssh-copy-id destination-host
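
Once key-based login works, the monitoring side can run a plugin on the destination host through SSH. The following Python sketch shows one way to wrap that call. It assumes the destination-host alias from the SSH configuration above; the remote plugin path is hypothetical. Because ssh itself exits with 255 when the connection fails, the wrapper reports any exit code outside the Nagios range 0-3 as UNKNOWN instead of blaming the JVM.

```python
import subprocess

UNKNOWN = 3  # Nagios exit code for "state could not be determined"


def build_ssh_command(host, remote_command, connect_timeout=10):
    """Non-interactive ssh call; BatchMode makes ssh fail instead of prompting."""
    return [
        "ssh",
        "-o", "BatchMode=yes",
        "-o", "ConnectTimeout={}".format(connect_timeout),
        host,
        remote_command,
    ]


def map_exit_code(returncode):
    """Pass Nagios codes 0-3 through; anything else (e.g. ssh's 255) is UNKNOWN."""
    return returncode if returncode in (0, 1, 2, 3) else UNKNOWN


def run_remote_check(host, remote_command, timeout=30):
    """Run the plugin remotely and return (nagios_exit_code, output)."""
    try:
        result = subprocess.run(
            build_ssh_command(host, remote_command),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return UNKNOWN, "UNKNOWN - ssh failed: {}".format(exc)
    return map_exit_code(result.returncode), result.stdout.strip()
```

A call such as run_remote_check("destination-host", "/usr/local/nagios/libexec/check_jmx ...") would return the plugin's exit code and its output line; the path here is only a placeholder.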

Finally, copy your monitoring plugin to the destination host and invoke it. In Nagios / Shinken, for example, your command definition could look something like this:

define command {
    command_name    check_tomcat_mem_heap
    command_line    $NAGIOSPLUGINSDIR$/check_jmx \
        -U service:jmx:rmi:///jndi/rmi://'$HOSTADDRESS$':'$ARG1$'/jmxrmi \
        -O java.lang:type=Memory -A HeapMemoryUsage \
        -K used -w '$ARG2$' -c '$ARG3$