Whodunnit
Published: July 2, 2015
When a team of people, both technical and non-technical, collectively operates a shared software installation, things are bound to go wrong at some point. As the technical folk, we are often called on to perform forensic analysis. This type of work frequently includes tasks such as grepping server access logs for certain request paths, dates, and IP addresses, or reviewing any other logs or information related to whatever incident may have occurred.
This post is about a specific incident that came up recently. It was not a major one, but there were some learnings for me along the way and I figured it would be interesting to document the process.
Some Background
As with most things these days, the story starts with an email. I had previously requested a client conduct UAT on a bugfix in a staging environment. Not long after, the client replied and told me that he was having trouble with the site: he couldn't proceed through the checkout flow and was getting a white screen.
"Hmm...that's strange", I thought. I had personally deployed and tested the bug fix on that same staging environment prior to notifying the client and did not observe any such issue. What could be wrong?
Diagnosis
To start out, I SSH-ed into the box and ran a git status in the web root. While I was expecting it to be on the head of the develop branch, it was not. But I had just run a git pull before notifying the client! Something was amiss...
Autopsy (part 1)
This is where things get interesting.
"Anyone doing anything on the staging server?" I popped into the Hipchat room for the site in question. We have Hipchat rooms for each site we manage at Something Digital, which is an awesome practice.
Radio silence.
"Hmm...what's going on?", I wondered.
Like most devs do, I turned to Google.
NOTE: I probably could've figured out the solution without Googling for it. However, I didn't even take a moment to think about how I might approach the challenge. I could easily go off on a tangent here about the tendency of devs to Google solutions and copy / paste answers from Stack Overflow without attempting to implement a solution on their own, which I, myself, am clearly guilty of, but I'll save that for a separate post.
"Search bash history for all users," I typed into Google.
The command that turned up was as follows
For me, the term to search for was git checkout.
Upon running the command I got an inordinate number of results. What I needed to know was the most recent execution of git checkout.
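As a rough sketch of what this kind of search can look like (not necessarily the exact command that turned up): the function name search_histories is made up for illustration, and it assumes you have permission to read the other users' history files, which usually means root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: search each given history file for a term.
# Assumption: the caller passes the history files to search, for example
#   search_histories 'git checkout' /home/*/.bash_history /root/.bash_history
search_histories() {
  local term=$1
  shift
  # -H always prints the file name, so you can tell whose history matched.
  # -F treats the term as a fixed string rather than a regular expression.
  # -s stays quiet about files that are missing or unreadable.
  grep -H -F -s -- "$term" "$@"
}
```

Each matching line comes back prefixed with the path of the history file it was found in, which is what tells you who ran the command.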
A Sidebar On .bash_history
By default, .bash_history does not include timestamps for each command. I can't say I agree with that behavior. If you're responsible for a server that is accessed by multiple users, it's probably a good idea to make sure bash commands are getting logged with timestamps. This article covers how to add timestamps to .bash_history globally. You could also default all users to zsh, which logs timestamps to .zsh_history when its EXTENDED_HISTORY option is set. Finally, it may be worth looking into tools such as snoopy that offer several improvements over both .bash_history and .zsh_history.
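For bash, a minimal sketch of turning timestamps on for everyone is to set HISTTIMEFORMAT in a system-wide profile script. The file name below is an assumption, as is the location: /etc/profile.d is read by login shells on many Linux distributions, but your system may use a different mechanism.

```shell
# /etc/profile.d/history-timestamps.sh  (hypothetical file name and location)
# When HISTTIMEFORMAT is set, bash writes a "#<epoch seconds>" line
# above each command in the history file, and the history builtin
# displays entries using this strftime-style format.
export HISTTIMEFORMAT='%F %T '
```

Commands that were recorded before the variable was set have no timestamp line in the file, so only new entries benefit.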
Autopsy (part 2)
Fortunately, timestamps were being recorded globally for .bash_history on this box. Running less ~/.bash_history looked like this...
The timestamps were recorded on the line above the command, so I needed to add the -B 1 flag to the grep command to also get the one line before each match.
Awesome. Now all I needed to do was combine every two lines in the output and, finally, sort.
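Put together, the steps can be sketched roughly like this for a single history file. This is an illustration under stated assumptions, not necessarily the exact command I ended up with: the function name is hypothetical, and it assumes every command in the file is a single line preceded by a "#<epoch seconds>" timestamp line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<timestamp><TAB><command>" for every match
# in one history file, oldest first.
# Assumption: each command is one line, preceded by a "#<epoch>" line.
history_matches_sorted() {
  local term=$1 file=$2
  # 1. grep -B 1 prints each match plus the timestamp line above it.
  # 2. grep -v drops the "--" separators grep prints between groups.
  # 3. paste - - joins every two lines into one, separated by a tab.
  # 4. sort orders the lines by timestamp; plain text sort is enough
  #    here because current epoch values all have the same digit count.
  grep -B 1 -F -- "$term" "$file" \
    | grep -v '^--$' \
    | paste - - \
    | LC_ALL=C sort
}
```

Piping the result through tail -n 1 leaves only the most recent match. To cover every user, the function could be called once per history file in a loop.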
Beautiful. This command was so sexy I put it into the main Something Digital Hipchat room, created an alias in my .zshrc, tweeted about it, and am now even writing a blog post about it.
The Result
After all this build-up, I have to say, the result of the story is pretty anticlimactic. Or, I should say, it remains an unsolved mystery...or maybe I just thought I QA-ed it. Or maybe clearing the application cache upon deploy didn't kick in immediately. Anyway, for whatever reason, no one had run a git checkout since my most recent git pull. Who knows what happened, but I certainly had fun trying to figure it out.