Answering 'Are we there yet?' in a Unix setting

Often -- commonly during an outage window -- you might get asked "How far through is that (insert-lengthy-process-here)?". Such processes are particularly common in unscheduled outages where filesystem-related work may be involved, but they crop up in plenty of other places too.

In a UNIX/Linux environment, a lot of processes are very silent about progress (certainly with regard to % completed), but a lot of the time we can deduce how far through an operation is. This post illustrates that with a few examples, and then slaps on a very simple and easy user interface.

But 'Are we there yet?' is rather similar in spirit to 'Where is it up to?' or 'What is it doing?', so I'll address those here too. In fact, I'll address those first, because they often lead up to the first question. And we won't just cover filesystem operations, but they will be first because that's what's on my mind as I write this.

Navel-gazing filesystem progress

Let's assume you're moving data around a filesystem. Perhaps you have an rsync or cp command in flight (and perhaps you omitted any sort of --progress flag because you didn't want to miss any errors that might get printed). Or perhaps you're trying to determine this for another process.

You can use lsof to find out what (regular) files are open at the time.

# lsof /disknew | awk '$5 == "REG" {print $9}'
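One caveat with printing only $9 is that a file name containing spaces gets cut off at the first space. Below is a minimal sketch of a filter that re-joins the remaining fields, assuming the default lsof column layout (TYPE in column 5, NAME from column 9 onwards). The function name open_regular_files is made up for illustration, and a run of several spaces inside a name will still be collapsed to one.

```shell
# open_regular_files: read lsof output on stdin and print the full NAME
# of every regular file (hypothetical helper, not part of lsof itself).
open_regular_files() {
    awk '$5 == "REG" {
        name = $9
        # Re-join any further fields so names containing spaces survive.
        for (i = 10; i <= NF; i++) name = name " " $i
        print name
    }'
}
```

It would be used as: lsof /disknew | open_regular_files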

A common technique is to keep tabs on this with the watch command. Here I'm also using the df command to show the source and destination as well as the current file. The effect is a crude, if still effective, dashboard:

# watch -n30 lsof /disknew \| awk "'\$5 == \"REG\" {print \$9}'" \; df -h /disk /disknew

Every 30.0s: lsof /disknew | awk '$5 == "REG" {print $9}' ; df -h /disk /disknew  Tue Apr 21 14:17:20 2015

Filesystem            Size  Used Avail Use% Mounted on
/dev/sdb1             493G  468G     0 100% /disk
                      788G  201G  548G  27% /disknew

When filesystem operations recurse into a directory, they don't (generally) open the directory, read the listing, sort it and then proceed in sorted order. ls certainly does, but cp etc. don't. Instead, cp etc. open the directory and start working through its contents (the list of things in that directory) in the order that the filesystem returns them. find generally reports entries in that same unsorted order.

We can get ls to return a directory in unsorted order using the -U option (use ls -U1 if output is going to your screen, otherwise it will wait to collate the output into columns). Note that this is also great if a directory is really large. With knowledge of where our migration process is up to (from lsof perhaps), and knowledge of the order that it should do things in (from ls -U), we can even determine how far through it is -- you could make this quite exotic if you wanted.

# ls -U1 | awk '/blahblah.mp4/ {up_to=NR} END {print int(up_to / NR * 100)}'

In the above example, I was copying a lot of multimedia files and wanted to know where it was up to. It was all in just one directory, so I didn't have to worry about recursion. I could have used lsof to find out where my rsync process was up to, but in this case I was using rsync -av, so it was printing out the filenames as it processed them. The trick here was to use awk to record the line number (NR -- number of records) that had been read when the input matched blahblah.mp4 -- what rsync reported it was up to at the time -- and then, when it finished reading the directory contents, print out its progress as an integral percentage, based on the number of records at the end.
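The awk pattern above treats the filename as a regular expression, so the dot matches any character and a name that is a substring of another name would match too. Here is a minimal sketch of the same calculation using an exact string comparison instead. It assumes a single directory with one entry per line on stdin; listing_progress is a name made up for illustration, and awk -v will interpret backslash escapes in the name.

```shell
# listing_progress NAME: read a directory listing (one entry per line) on
# stdin and print how far down the listing NAME appears, as a whole percentage.
listing_progress() {
    awk -v name="$1" '
        $0 == name { up_to = NR }    # exact match on the whole line, not a regex
        END { if (NR > 0) print int(up_to / NR * 100) }
    '
}
```

It would be used as: ls -U1 | listing_progress blahblah.mp4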

Gauging progress

What we need is simply some metric of completion. If we don't want the equivalent of a progress bar, we could just eyeball it. Heck, if we want a UI, we could even use whiptail.

How easy is this? The whiptail part is actually pretty simple: just pipe in something that outputs lines of integral percentages (remember: no fractions). Note that whiptail is a cousin of dialog, so if you're not on a Red Hat system, you'll probably find this easier using dialog. Here is a version of my rsync example from earlier, reformatted to be easier to read. I've also used the df -P flag to ensure that there is one line per record of output (plus a header).

$ while true; do
>   df -Pm /disk /disknew \
>     | awk '{ used[$6] = $3 }
>            END { print int(used["/disknew"] / used["/disk"] * 100) }';
>   sleep 5;
> done | whiptail --gauge "Initial sync" 10 70 0

Remember that in this example, whiptail is being given the stdout of the entire while loop contents.
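The awk step can also be pulled out into a small function, which makes it easier to check against canned df output before wiring it to a gauge. This is a minimal sketch rather than a definitive version: df_ratio is a hypothetical name, it assumes df -Pm style input with mount points that contain no spaces, and it clamps at 100 in case the destination ends up larger than the source.

```shell
# df_ratio SRC DST: read `df -Pm` output on stdin and print the space used
# on DST as a whole percentage of the space used on SRC (0..100).
df_ratio() {
    awk -v src="$1" -v dst="$2" '
        NR > 1 { used[$6] = $3 }                  # skip header; key on mount point
        END {
            if (used[src] <= 0) { print 0; exit } # avoid dividing by zero
            pct = int(used[dst] / used[src] * 100)
            if (pct > 100) pct = 100
            print pct
        }'
}
```

Inside the loop it would replace the inline awk: df -Pm /disk /disknew | df_ratio /disk /disknew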

Progress from other places

Progress could be formulated in any number of ways. Examples:

  • number of MBs used in one filesystem / directory versus another;
  • amount of time spent doing something that you've done in a test environment (see my post on How Long has that Command been Running);
  • a SQL query (such as a row-count).

But there is nothing about these techniques that requires it to be something that begins at 0 and ends at 100, or even really that you have a number. With the whiptail example, we were dealing with a percentage gauge, and a gauge can go up or down.
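When the metric is a plain "current versus expected" pair, such as elapsed seconds against a test-run duration or rows copied against a row-count, the conversion to a gauge-friendly whole number can be sketched as below. The name to_percent is made up for illustration, and the clamping to 0..100 is my assumption about what a percentage gauge wants to be fed.

```shell
# to_percent CURRENT TOTAL: print CURRENT/TOTAL as a whole percentage,
# clamped to the range 0..100 (hypothetical helper).
to_percent() {
    awk -v cur="$1" -v total="$2" 'BEGIN {
        if (total <= 0) { print 0; exit }   # nothing sensible to divide by
        pct = int(cur / total * 100)
        if (pct < 0) pct = 0
        if (pct > 100) pct = 100
        print pct
    }'
}
```

For example, if a test run took 180 minutes and the real one has been going for 45, to_percent 45 180 prints 25.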

With the watch examples earlier, we don't even need a number. If you were sufficiently bored, you could even hook it up to something like cowsay if you wanted some amooo-sing updates.

