When you're debugging a tough problem, you sometimes need to analyze the HTTP traffic flowing between your machine and a web server or proxy. Sometimes you can use Firebug or the Chrome inspector for that. But here's a low-level alternative that I'm pretty excited about. Meet TShark.
Because it's low level, it runs nicely in a separate console and catches any request, whichever program sends it. That is useful when you want to find out what third-party apps are communicating. In my case it was a Flash app that we assumed didn't respect some redirect headers while downloading static files. Since it had its own HTTP implementation, Firebug was unable to shed any light on the matter.
I knew tcpdump but was never really happy with it. And then I found TShark.
Install
On Ubuntu I typed:
$ sudo aptitude install tshark
But I found implementations for other systems as well.
Sniff HTTP Requests
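TShark can analyze any kind of network traffic, but in my case I was particularly helped by a command I found on Stack Overflow. It is shown here with -f for the capture filter and -Y for the display filter, which is what newer TShark versions expect; older versions took the display filter with -R instead:
$ tshark -f 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)' -Y 'http.request.method == "GET" || http.request.method == "HEAD"'
Run that, browse to Google, and it will dump: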
190.302141 192.168.0.199 -> 74.125.77.104 HTTP GET / HTTP/1.1
190.331454 192.168.0.199 -> 74.125.77.104 HTTP GET /intl/en_com/images/srpr/logo1w.png HTTP/1.1
190.353211 192.168.0.199 -> 74.125.77.104 HTTP GET /images/srpr/nav_logo13.png HTTP/1.1
190.400350 192.168.0.199 -> 74.125.77.100 HTTP GET /generate_204 HTTP/1.1
Nice and clean.
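The capture filter in that command is dense, so it may help to spell out the arithmetic. As far as I can tell, it takes the IP total length, subtracts the IP header length and the TCP header length, and keeps only packets where something is left over, that is, packets that actually carry TCP payload. Here is a minimal sketch of the same calculation in shell arithmetic. The function name tcp_payload_len and the sample byte values are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper mirroring the capture filter's arithmetic.
#   $1 = ip[2:2]  IP total length in bytes
#   $2 = ip[0]    version/IHL byte (low nibble = header length in 32-bit words)
#   $3 = tcp[12]  data offset byte (high nibble = header length in 32-bit words)
tcp_payload_len() {
    ip_hdr=$(( ($2 & 0xf) << 2 ))     # IHL words * 4 = IP header bytes
    tcp_hdr=$(( ($3 & 0xf0) >> 2 ))   # offset words * 4 = TCP header bytes
    echo $(( $1 - ip_hdr - tcp_hdr ))
}

# A bare ACK: 40 bytes total, 20-byte IP header, 20-byte TCP header.
tcp_payload_len 40 0x45 0x50    # prints 0, so the filter drops it

# A 500-byte packet with the same headers carries 460 bytes of payload.
tcp_payload_len 500 0x45 0x50   # prints 460, so the filter keeps it
```

So empty ACKs, SYNs and FINs never reach the display filter, which then only has to pick the GET and HEAD requests out of the packets that carry data.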
Go Crazy
The above was all I needed, but I soon found examples that demonstrate some other capabilities.
Count GIF Images Based on Content Type
The command below counts the number of GIF images downloaded through HTTP in a previously saved capture file, /tmp/capture.tmp (from codealias):
$ tshark -Y 'http.response and http.content_type contains "image"' \
-z 'proto,colinfo,http.content_length,http.content_length' \
-z 'proto,colinfo,http.content_type,http.content_type' \
-r /tmp/capture.tmp | grep 'image/gif' | wc -l
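If you want more than a single GIF count, the same output can be tallied per image type. This is a rough sketch, assuming each matching line of output contains the MIME type as literal text such as image/gif. The function name count_image_types is hypothetical, and the sample lines are stand-ins I made up, not real tshark output:

```shell
#!/bin/sh
# Hypothetical helper: count how often each image/<subtype> appears on stdin.
count_image_types() {
    # Pull out every image/<subtype> token, then count the duplicates,
    # most frequent type first.
    grep -o 'image/[A-Za-z0-9.+-]*' | sort | uniq -c | sort -rn
}

# Made-up lines standing in for the tshark output.
printf '%s\n' \
    'HTTP/1.1 200 OK  http.content_type == "image/gif"' \
    'HTTP/1.1 200 OK  http.content_type == "image/png"' \
    'HTTP/1.1 200 OK  http.content_type == "image/gif"' \
    | count_image_types
```

To use it for real, pipe the tshark command into count_image_types instead of the final counting step.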
Log All POP Users
The command below captures all port 110 traffic on interface 2 (run tshark -D to list the interfaces and their numbers), filters for the 'user' command, and saves the matches to a text file (from Mark's notes):
$ tshark -i 2 -f 'port 110' -Y 'pop.request.parameter contains "user"' > /tmp/pop_users.txt
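Once the log has some entries, you may want just the list of distinct user names. The sketch below assumes each logged line shows the POP request as text in the form "USER name". Check that against your own file first, since I have not verified the exact column layout. The function name pop_usernames and the sample lines are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print each distinct word that follows "USER" on stdin.
pop_usernames() {
    awk '{
        # Stop at NF - 1 so that a following field always exists.
        for (i = 1; i < NF; i++)
            if (toupper($i) == "USER") { print $(i + 1); break }
    }' | sort -u
}

# Made-up lines standing in for /tmp/pop_users.txt.
printf '%s\n' \
    '1.0 10.0.0.5 -> 10.0.0.9 POP C: USER alice' \
    '2.0 10.0.0.6 -> 10.0.0.9 POP C: USER bob' \
    '3.0 10.0.0.5 -> 10.0.0.9 POP C: USER alice' \
    | pop_usernames
```

Against the real log that would be: pop_usernames < /tmp/pop_users.txt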
Log HTTP Request / Response Headers
One from Super User:
$ tshark -f 'tcp port 80 or tcp port 443' -V -Y 'http.request || http.response'
OK, that's it for now. If you have some juicy tshark commands yourself, just post a comment and I'll update the article.