How do I decode gzipped HTTP replies from packet captures?

published Jun 30, 2022

It's actually fairly straightforward with tshark, the command-line counterpart to Wireshark.

Let's suppose you have a capture.pcap file that was created by a command like this:

tcpdump 'port 80' -A -s65536 -w capture.pcap

And from that capture.pcap file you want to extract HTTP replies that were gzipped when sent by the server.

Worry not, here is some Python code to do exactly that:

#!/usr/bin/env python3

import shlex
import subprocess
import sys
import zlib

cmd = '''\
tshark \
-r %s \
-Y 'http.content_encoding == "gzip"' \
-T fields -e data''' % shlex.quote(sys.argv[1])
args = shlex.split(cmd)

p = subprocess.Popen(args, stdout=subprocess.PIPE)
output, err = p.communicate()

decoded = []
for ll in output.splitlines():
    for l in ll.split(b","):
        if len(l) > 0:
            decoded.append(bytes.fromhex(l.rstrip().decode('ascii')))
decoded = b"".join(decoded)

# 16 + zlib.MAX_WBITS tells zlib to expect a gzip header and trailer
data = zlib.decompress(decoded, 16 + zlib.MAX_WBITS)
sys.stdout.buffer.write(data)

Run it from your terminal as follows: python3 <name of file> capture.pcap. The script writes the decompressed HTTP reply data from the capture file to standard output.
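
One thing to watch for: the script joins everything tshark prints and decompresses it in a single call, which would likely return only the first reply if the capture holds several gzipped ones. Here is a minimal sketch of one way to handle that, assuming the joined bytes are complete gzip streams sitting back to back (I have not verified that this holds for every capture). The names parse_tshark_hex and gunzip_all are made up for this example, and the input at the bottom is synthetic data, not real tshark output.

```python
import gzip
import zlib


def parse_tshark_hex(output: bytes) -> bytes:
    """Join the comma-separated hex fields found on each line of output."""
    chunks = []
    for line in output.splitlines():
        for field in line.split(b","):
            field = field.strip()
            if field:
                chunks.append(bytes.fromhex(field.decode("ascii")))
    return b"".join(chunks)


def gunzip_all(raw: bytes) -> list:
    """Decompress every gzip stream stored back to back in raw."""
    bodies = []
    while raw:
        # 16 + MAX_WBITS makes zlib expect a gzip header and trailer.
        d = zlib.decompressobj(16 + zlib.MAX_WBITS)
        body = d.decompress(raw)
        if not d.eof:
            raise ValueError("truncated gzip stream")
        bodies.append(body)
        # Bytes left over after this stream are the start of the next one.
        raw = d.unused_data
    return bodies


if __name__ == "__main__":
    # Synthetic stand-in for tshark output: two gzipped bodies, hex-encoded,
    # one per line.
    first = gzip.compress(b"hello")
    second = gzip.compress(b"world")
    fake_output = first.hex().encode() + b"\n" + second.hex().encode() + b"\n"
    for body in gunzip_all(parse_tshark_hex(fake_output)):
        print(body)
```

To try it on a real capture, you could feed the output variable from the script above into parse_tshark_hex and write each returned body to sys.stdout.buffer.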