The history meme

published Apr 15, 2008, last modified Jun 26, 2013

This is what my bash history file contains.

Since it's been around my corners of the Internets, I guess it's time for me to post this meme as well:

rudd-o@karen: ~ $
history | awk '{a[$2]++}END{for(i in a){print a[i] " " i}}' | sort -rn | head
76 cd
38 mv
36 svn
34 python
31 rm
30 ls
19 git
14 dpkg
14 aptitude
13 find

Please note: I have history set up so it doesn't record repeated commands, so each number counts the distinct command lines that start with that command.
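If awk isn't your thing, here is a rough Python sketch of what that pipeline computes. The function name `top_commands` and the sample lines are made up for illustration. It assumes the usual `history` output, where the first field is the entry number and the second is the command word. Unlike the awk version, it skips lines that have no second field.

```python
from collections import Counter

def top_commands(history_lines, n=10):
    """Count the command word (awk's $2) of each history line."""
    counts = Counter()
    for line in history_lines:
        fields = line.split()
        # fields[0] is the history entry number, fields[1] the command word.
        if len(fields) >= 2:
            counts[fields[1]] += 1
    # Highest counts first, like `sort -rn | head`.
    return counts.most_common(n)

# Hypothetical history lines, only for illustration.
sample = [
    "    1  cd /tmp",
    "    2  ls",
    "    3  cd ..",
    "    4  git status",
    "    5  cd ~",
]
for command, count in top_commands(sample, 3):
    print(count, command)
```

Feeding it the real thing would mean passing the lines of `history > x` instead of the sample list.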

Want the graphwatch version? Here it is:

      cd 70 #################################################
      mv 36 #########################
  python 34 #######################
     svn 32 ######################
      rm 29 ####################
      ls 29 ####################
     git 16 ###########
    dpkg 14 #########
aptitude 14 #########
    nano 13 #########

Here's what I did to generate that graph:

  1. history > x
  2. graphwatch 'cat x | awk '\''{a[$2]++}END{for(i in a){print a[i] " " i}}'\'' | sort -rn | head | awk '\'' { print $2 " " $1 } '\''' " "
    graphwatch is a nice Python script that takes two arguments: a properly quoted shell pipeline, which it runs through bash, and a field separator (in this case, a single space). My extra awk invocation there swaps the first and second fields so that the first column is the label and the second column is the value. graphwatch then plots a nice set of proportional horizontal bars.
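To make "proportional" concrete, here is a minimal sketch of the bar arithmetic, pulled out into a standalone function. The name `render_bars` is hypothetical and not part of graphwatch. It assumes the same 61-character line width the script uses, and that every row has a label.

```python
def render_bars(rows, column_width=61):
    """Return one text line per (label, value) row, scaled to the largest value."""
    if not rows:
        return []
    label_width = max(len(label) for label, _ in rows)
    number_width = max(len(str(int(value))) for _, value in rows)
    maximum = max(value for _, value in rows)
    # Two characters go to the spaces separating label, number and bar.
    plot_width = column_width - label_width - number_width - 2
    lines = []
    for label, value in rows:
        # The largest value fills the whole plot area; the rest scale down.
        hashes = int(value / maximum * plot_width) if maximum > 0 else 0
        lines.append("%*s %*.0f %s" % (label_width, label,
                                       number_width, value, "#" * hashes))
    return lines

# Three rows taken from the graph above.
for line in render_bars([("cd", 70), ("mv", 36), ("aptitude", 14)]):
    print(line)
```

With an 8-character label column and 2-digit numbers, 49 characters are left for the bars, so cd (70) gets 49 hashes, mv (36) gets int(36/70*49) = 25, and aptitude (14) gets 9, which matches the graph.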

But that's not it. The command is run once every second, so if the contents of the file x (or the output of the supplied command) change, the graph is re-plotted. Very nifty to watch tabular numeric output in graphical form, in real-time -- imagine using it with iostat and you'll get the idea.

Here's the code. I will release it as a proper package with enhancements (it's a quick hack right now) later on:

#!/usr/bin/env python3

from time import sleep
from subprocess import PIPE, Popen
from sys import argv, exit, stdout
import os


try:
    cmd = argv[1]
    sep = argv[2]
except IndexError:
    print("usage: graphwatch 'shell pipeline' 'separator'")
    exit(os.EX_USAGE)
print("Running this command in a shell:")
print(">", cmd)

def getstdout(cmdline):
    # Run the pipeline through bash and return its stdout as text.
    p = Popen(["bash", "-c", cmdline], stdout=PIPE, universal_newlines=True)
    output = p.communicate()[0]
    if p.returncode != 0:
        raise Exception("Command %s return code %s" % (cmdline, p.returncode))
    return output

# Largest value seen so far; keeps the scale stable across refreshes.
maximum_value = 0.0

try:
    while True:

        output = getstdout(cmd).strip()
        stdout.write(getstdout("clear"))

        labels = []
        values = []
        for l in output.splitlines():
            try:
                label, value = l.strip().split(sep, 1)
                label, value = label.strip(), value.strip()
            except ValueError:
                # No separator on this line: treat the whole line as the value.
                label, value = None, l.strip()
            labels.append(label)
            try:
                values.append(float(value))
            except ValueError:
                values.append(0.0)

        maximum_value = max(values + [maximum_value])

        label_lengths = [len(l) for l in labels if l is not None]
        if label_lengths:
            longest_label = max(label_lengths)
            label_fmt = "%" + str(longest_label) + "s"
        else:
            longest_label = -1  # no label column: compensates for the extra -2 below

        # The [1] keeps max() from failing when the command printed nothing.
        longest_number = max([len(str(int(i))) for i in values] + [1])
        number_fmt = "%" + str(longest_number) + ".0f"

        column_width = 61
        leftover_for_plot = column_width - longest_label - longest_number - 2

        for value, label in zip(values, labels):
            if label is not None:
                print(label_fmt % label, end=" ")
            print(number_fmt % value, end=" ")
            # Avoid dividing by zero when every value so far has been 0.
            bar = int(value / maximum_value * leftover_for_plot) if maximum_value > 0 else 0
            print("#" * bar)

        sleep(1)

except KeyboardInterrupt:
    exit(0)