
Performance Analysis of Java Middleware on Linux

I routinely have to look after some reasonably complex Java middleware deployments, running variously on containers provided by Tomcat or Oracle WebLogic. The hardest, and generally the most useful, thing to determine is which resource is being constrained (commonly not CPU or OS memory, but often something like the number of threads, for example those dedicated to a database connection pool).

This is a post that I intend to maintain as I document (and, hopefully, discover) more tips and tricks; because sometimes it's just not so great being the Go To Guy when it comes to engaging your head against a brick wall.

Tip: top(1) is more useful than you think

Getting the most out of top(1) requires a bit of knowledge about adding fields and changing options. I'm very thankful for Jordan Sissel's post (from 2010) on Debugging java threads with top and jstack, because it showed me something useful I didn't realise about top(1): that you can show threads. But, contrary to what Jordan indicated [back in 2010 and likely on a different OS release], top(1) can indeed show Parent PID.

Here's a quick run-down of the useful keystrokes you can try with top. You can get to this display from within top(1) using the ? key:

Help for Interactive Commands - procps version 3.2.8
Window 1:Def: Cumulative mode Off.  System: Delay 3.0 secs; Secure mode Off.

  Z,B       Global: 'Z' change color mappings; 'B' disable/enable bold
  l,t,m     Toggle Summaries: 'l' load avg; 't' task/cpu stats; 'm' mem info
  1,I       Toggle SMP view: '1' single/separate states; 'I' Irix/Solaris mode

  f,o     . Fields/Columns: 'f' add or remove; 'o' change display order
  F or O  . Select sort field
  <,>     . Move sort field: '<' next col left; '>' next col right
  R,H     . Toggle: 'R' normal/reverse sort; 'H' show threads
  c,i,S   . Toggle: 'c' cmd name/line; 'i' idle tasks; 'S' cumulative time
  x,y     . Toggle highlights: 'x' sort field; 'y' running tasks
  z,b     . Toggle: 'z' color/mono; 'b' bold/reverse (only if 'x' or 'y')
  u       . Show specific user only
  n or #  . Set maximum tasks displayed

  k,r       Manipulate tasks: 'k' kill; 'r' renice
  d or s    Set update interval
  W         Write configuration file
  q         Quit
          ( commands shown with '.' require a visible task display window ) 
Press 'h' or '?' for help with Windows,
any other key to continue 

Here's a useful key sequence to commit to memory (or save with W): xybH1 to highlight the sort field and the running tasks, as well as show threads and individual CPU cores; you might also add utomcat<CR> if you want to limit the display to just the tomcat user. fb will toggle the Parent PID (PPID) column, and you can use > or < repeatedly to select which field to sort on. c can be useful to show the entire command (or perhaps not-so-much with java's very long commands). Finally, i can be quite useful if you only care about tasks that are currently running.
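The same thread view can also be captured non-interactively, which is handy for pasting into a ticket. What follows is only a sketch under stated assumptions, not something taken from Jordan's post: the function names top_threads_snapshot and tid_to_nid are hypothetical helpers of my own. It relies on top's -H (threads), -b (batch), -n (iterations) and -p (PID) options, and on jstack reporting each thread's native ID as a hexadecimal nid= value, so the decimal thread ID shown by top needs converting before you search a thread dump for it.

```shell
# Hypothetical helpers; the names are illustrative, not part of any tool.

# Print one batch-mode snapshot of the threads of a single process.
# $1 = PID of the JVM (for example, the Tomcat java process).
top_threads_snapshot() {
    top -H -b -n 1 -p "$1"
}

# Convert a decimal thread ID (the PID column of top in thread mode)
# to the hexadecimal form that jstack prints as nid=0x...
tid_to_nid() {
    printf '0x%x\n' "$1"
}

# Example usage (not run here; 12345 is a made-up thread ID):
#   top_threads_snapshot "$(pgrep -u tomcat -o java)"
#   tid_to_nid 12345
```

With the hex value in hand, searching the jstack output for nid= followed by that value should lead to the stack of the busy thread.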

Tip: identify which installation of Java is being used

Before we get into using the tools that come with the JVM (such as jstack, etc.), we need to determine which JVM is in use, because the tools generally need to match it, and there may be multiple deployments and implementations of Java sitting around (eg. Oracle Java, OpenJDK, Oracle JRockit, etc.).

Assuming you've been thrown in the deep-end of the Java pool, it's useful to quickly determine which Java is in use; I tend to use ps -eo command to get the full command-names, and then isolate the particular Java instance I want (such as by looking for a WebLogic managed server name, or for the word 'tomcat', etc.). Then I take the /path/to/bin/java and I should be able to use /path/to/bin/jstack etc.

# ps -eo command | grep -o '^.*/java '
grep -o ^.*/java 
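The path juggling described above can be wrapped in a couple of small functions. This is a minimal sketch, and the names list_java_binaries and jdk_tool_path are hypothetical. It assumes the java executable was started with a path containing a slash and no spaces, and that the installation is a full JDK (a JRE-only install will not have jstack next to java).

```shell
# Hypothetical helpers; the names are illustrative.

# List the distinct java executables of currently running processes.
# Looks only at the first word of each command line, so the awk
# process itself is never matched.
list_java_binaries() {
    ps -eo args= | awk '$1 ~ /\/java$/ { print $1 }' | sort -u
}

# Given /path/to/bin/java and a tool name, print /path/to/bin/<tool>.
# $1 = path to the java executable, $2 = tool name (for example jstack).
jdk_tool_path() {
    printf '%s/%s\n' "$(dirname "$1")" "$2"
}

# Example usage (not run here; /opt/jdk/bin/java is a made-up path):
#   list_java_binaries
#   "$(jdk_tool_path /opt/jdk/bin/java jstack)" <pid>
```

If the process was started through a symlink such as /usr/bin/java, resolving /proc/<pid>/exe with readlink -f is another way to find the real installation directory.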

