
Making Cluster-SSH (and regular SSH) a lot more usable with regard to reconnecting

If you find yourself patching a lot of machines at once and rebooting them, your SSH windows will close, which is not very useful if you want to keep track of a number of machines you need to log back into, whether to check that all is okay or to start services that don't start automatically. It makes that time of the month -- patching -- rather more tedious and painful than it ought to be.

Enter a useful tool called Cluster SSH (command name 'cssh', package name 'clusterssh', version used is 3.28 from EPEL). It distributes my keystrokes to all of the windows that it starts. You can toggle, add and remove hosts to manage, and you can configure clusters of machines. While it does lack polish, it is very useful in reducing the amount of time it takes to patch a lot of machines; I estimate that it cuts the time required to about a third.

Here's an example of using it 'in anger' while patching 37 machines. I've deliberately made the image small enough so as to make any text on screen unreadable. I'll admit, my workstation is a little ... odd, but it used to have all three monitors side-by-side in portrait mode before today.

Cluster SSH does have a bit of an annoyance though: if you're rebooting then you lose your windows. It would be nicer if each window prompted you with something like 'Do you wish to reconnect (y|N)?' I thought so, so I configured it to use an existing SSH wrapper I wrote some time ago: sshr (the 'r' is for 'reconnect').

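#!/bin/bash
# sshr: run ssh and, whenever it exits, offer to reconnect with the same arguments.

# Handle Ctrl-C here so that interrupting ssh does not also kill this wrapper.
on_sigint()
{
    >&2 echo "^C"
}

trap on_sigint SIGINT

while true
do
    >&2 echo "Attempting to ssh $*"
    ssh "$@"

    ret=$?

    # -r stops read from treating backslashes in the reply as escapes.
    read -r -p "ssh returned $ret: run again? [y|N] "
    case "$REPLY" in
        y|Y|yes|Yes)
            continue
            ;;
        *)
            break
            ;;
    esac
done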

Now Cluster SSH just needs to be configured to use it. Assuming you put the script in /home/YOU/bin/sshr and made it executable (chmod +x), configure your ~/.csshrc as follows:

...
ssh=/home/YOU/bin/sshr
...

Here's another productivity tip that comes in useful here: if you need to log into machines according to data in a spreadsheet, use filters so that only the machines of note are listed, then copy the cells that contain the fully-qualified hostnames of the servers you need to log into. Assuming you have VISUAL=vim (or whichever editor you prefer), then in a bash terminal, type Ctrl-X Ctrl-E to edit the command line in your editor. Paste in the data from the spreadsheet.

foo.alpha.beta.com
bar.alpha.beta.com
bux.alpha.beta.com

Join all of the lines into one, separated by a space (hint: in vi hold down the 'J' key), and put 'sleep 3; cssh ' at the beginning. Save and exit, and the command will run.

sleep 3; cssh foo.alpha.beta.com bar.alpha.beta.com bux.alpha.beta.com

The 3 second delay allows me to switch over to a different workspace before it gets flooded with terminal windows.
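
If you'd rather not join the lines by hand in the editor, the same step can be sketched as a small shell function. This is only a sketch under my own assumptions: the name join_hosts and the file hosts.txt are made up for illustration and are not part of Cluster SSH. It also assumes one hostname per line, and it strips the carriage returns and blank lines that a spreadsheet paste sometimes brings along.

```shell
#!/bin/bash

# join_hosts (hypothetical helper): read hostnames from stdin, one per line,
# and print them on a single line separated by spaces.
join_hosts()
{
    # tr drops carriage returns left over from a spreadsheet paste.
    # awk skips blank lines (NF == 0), keeps the first field of each
    # remaining line, and only prints a newline if anything was output.
    tr -d '\r' | awk '
        NF { printf "%s%s", sep, $1; sep = " " }
        END { if (sep != "") print "" }
    '
}
```

With the hostnames saved in hosts.txt, something like 'sleep 3; cssh $(join_hosts < hosts.txt)' should then behave like the hand-edited command above. The command substitution is deliberately left unquoted so that each hostname becomes its own argument to cssh.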
