
Homework 7 - parsing packet traces - due 10/26

In this homework, we use libpcap to analyze packet traces captured with tcpdump. You can use "man pcap" to learn about the pcap API. libpcap gives us one packet at a time, in the order that they originally arrived. It is up to us to process the packets, and try to learn something from them.

Our interest in this homework is to reconstruct the data flowing between hosts, based on tcpdump traces. We will focus on TCP flows, and a key aspect of the homework is the reassembly of TCP segments into the original data stream. A correct submission will contain the following:

  • A Makefile that, given simply the command "make", produces an executable called 'hw7'.
  • The hw7 binary takes two command line arguments: an input file (produced by tcpdump -w), and a directory for output files.
  • Running hw7 prints a table with one row per (unidirectional) flow, identified by src ip/port and dst ip/port. Each row lists the number of segments and data payload bytes for that flow, so the two directions of a connection appear as two separate rows.
  • In the directory indicated, a file for each flow named as follows: SRCIP.SRCPORT-DSTIP.DSTPORT.log
  • Each of these files should contain all payload data (no IP/TCP headers) sent over that flow. Take care to handle packet duplicates and reordering!
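
The per-flow bookkeeping described above could be sketched roughly as follows. This is only one possible layout, not a required design: the names `struct flow`, `flow_lookup`, `flow_log_name` and the fixed `MAX_FLOWS` cap are all made up for illustration, and a linear search is used purely for brevity.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_FLOWS 1024  /* arbitrary cap chosen for this sketch */

/* One unidirectional flow (hypothetical layout).
 * Addresses are kept in network byte order, ports in host byte order. */
struct flow {
    struct in_addr src, dst;
    uint16_t sport, dport;
    unsigned long segments;       /* segments seen on this flow */
    unsigned long payload_bytes;  /* TCP payload bytes seen on this flow */
};

static struct flow flows[MAX_FLOWS];
static size_t nflows;

/* Return the entry for this 4-tuple, creating it on first sight.
 * Returns NULL if the table is full. */
struct flow *flow_lookup(struct in_addr src, uint16_t sport,
                         struct in_addr dst, uint16_t dport)
{
    for (size_t i = 0; i < nflows; i++) {
        struct flow *f = &flows[i];
        if (f->src.s_addr == src.s_addr && f->dst.s_addr == dst.s_addr &&
            f->sport == sport && f->dport == dport)
            return f;
    }
    if (nflows == MAX_FLOWS)
        return NULL;
    struct flow *f = &flows[nflows++];
    memset(f, 0, sizeof *f);
    f->src = src;
    f->dst = dst;
    f->sport = sport;
    f->dport = dport;
    return f;
}

/* Build "SRCIP.SRCPORT-DSTIP.DSTPORT.log" into buf.
 * Returns 0 on success, -1 on error or if buf is too small. */
int flow_log_name(const struct flow *f, char *buf, size_t buflen)
{
    char s[INET_ADDRSTRLEN], d[INET_ADDRSTRLEN];

    assert(f != NULL && buf != NULL);
    if (!inet_ntop(AF_INET, &f->src, s, sizeof s) ||
        !inet_ntop(AF_INET, &f->dst, d, sizeof d))
        return -1;
    int n = snprintf(buf, buflen, "%s.%u-%s.%u.log",
                     s, (unsigned)f->sport, d, (unsigned)f->dport);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

For each TCP segment, the caller would look up the flow, then do `f->segments++` and `f->payload_bytes += payload_len`. Whether retransmitted duplicates should be counted is not specified here, so that choice is left open in the sketch.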

For example, the output may look like this:

~> hw7 thetrace thedirectory
a.b.c.d/8484    e.f.g.h/80       12205    2115
e.f.g.h/80      a.b.c.d/8484      3555     223
a.b.c.d/22      e.f.g.i/19495     1205     211
e.f.g.i/19495   a.b.c.d/22         335      32

and thedirectory would contain the files

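~> ls thedirectory
a.b.c.d.22-e.f.g.i.19495.log
a.b.c.d.8484-e.f.g.h.80.log
e.f.g.h.80-a.b.c.d.8484.log
e.f.g.i.19495-a.b.c.d.22.log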


An example tcpdump trace file is included in the hw7 template directory. However, you should also record your own traces and try your solution on them. When grading, we will use this file as well as another dump containing some TCP flows.


Read the IP and TCP header structure definitions in /usr/include/netinet/ip.h and /usr/include/netinet/tcp.h.
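
As a rough illustration of how those structures can be used, here is a sketch that pulls the flow identifiers, sequence number and payload out of one IPv4 packet. The names `struct segment` and `parse_tcp_segment` are invented for this example. It assumes the caller has already skipped the link-layer header (14 bytes for plain Ethernet, which is an assumption about the trace), and it assumes a glibc-style system where the BSD field names (`th_sport`, `th_off`, ...) are available.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1  /* exposes the BSD-style th_* field names on glibc */
#endif
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/tcp.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Parsed view of one TCP segment (hypothetical helper type). */
struct segment {
    struct in_addr src, dst;       /* network byte order */
    uint16_t sport, dport;         /* host byte order */
    uint32_t seq;                  /* host byte order */
    uint8_t flags;                 /* TH_SYN, TH_ACK, ... */
    const unsigned char *payload;  /* points into the caller's buffer */
    size_t payload_len;
};

/* pkt points at the start of the IP header; caplen is the number of
 * captured bytes from there on. Returns 0 on success, -1 if the packet
 * is not IPv4/TCP, is malformed, or was truncated by the capture. */
int parse_tcp_segment(const unsigned char *pkt, size_t caplen,
                      struct segment *out)
{
    struct ip iph;
    struct tcphdr tcph;

    assert(pkt != NULL && out != NULL);
    if (caplen < sizeof iph)
        return -1;
    memcpy(&iph, pkt, sizeof iph);  /* copy to avoid unaligned access */
    if (iph.ip_v != 4 || iph.ip_p != IPPROTO_TCP)
        return -1;

    size_t ip_hlen = (size_t)iph.ip_hl * 4;  /* header length is in 32-bit words */
    size_t total = ntohs(iph.ip_len);        /* IP header + TCP header + payload */
    if (ip_hlen < sizeof iph || total > caplen ||
        total < ip_hlen + sizeof tcph)
        return -1;

    memcpy(&tcph, pkt + ip_hlen, sizeof tcph);
    size_t tcp_hlen = (size_t)tcph.th_off * 4;  /* data offset, also in words */
    if (tcp_hlen < sizeof tcph || ip_hlen + tcp_hlen > total)
        return -1;

    out->src = iph.ip_src;
    out->dst = iph.ip_dst;
    out->sport = ntohs(tcph.th_sport);
    out->dport = ntohs(tcph.th_dport);
    out->seq = ntohl(tcph.th_seq);
    out->flags = tcph.th_flags;
    out->payload = pkt + ip_hlen + tcp_hlen;
    out->payload_len = total - ip_hlen - tcp_hlen;
    return 0;
}
```

The payload length is derived from the IP total-length field rather than from the captured length, because short Ethernet frames may carry trailing padding that is not part of the TCP data.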

Use tcpdump or Wireshark to verify that your code is parsing the packets correctly.

Use lseek() to jump to an arbitrary offset in a file, even beyond its current size. fopen() with mode "w" truncates the file when opening it for writing, so open() without O_TRUNC may be a better idea.
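
A minimal sketch of that idea is shown below. The helper name `write_at` is made up for this example, and reopening the file for every write is done only to keep the sketch short; keeping one descriptor open per flow would be the more natural choice in a real solution.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes from buf at byte offset `offset` of the file at path,
 * creating the file if needed and leaving existing contents intact.
 * Returns 0 on success, -1 on error. */
int write_at(const char *path, off_t offset, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    /* O_CREAT without O_TRUNC: earlier writes to the file survive. */
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* Seeking past the current end is allowed; the gap reads back as zeros. */
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }

    const char *p = buf;
    while (len > 0) {  /* write() may write fewer bytes than requested */
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return close(fd);
}
```

Because each segment is written at the position determined by its offset in the stream, a duplicate simply rewrites the same bytes and an out-of-order segment lands in the right place whenever it arrives.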

inet_ntoa() is a handy function for printing IP addresses. However, beware: it returns a pointer to a static char array and allocates no memory for the return value, so each call overwrites the result of the previous one!
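
One way to stay clear of that pitfall is to copy each result into your own buffer before calling inet_ntoa() again. The sketch below formats both endpoints of a flow in the "ip/port" style used in the example table; the function name `format_flow` is hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>

/* Format "SRCIP/SRCPORT DSTIP/DSTPORT" into buf.
 * Ports are expected in host byte order.
 * Returns 0 on success, -1 if buf is too small. */
int format_flow(struct in_addr src, uint16_t sport,
                struct in_addr dst, uint16_t dport,
                char *buf, size_t buflen)
{
    char s[INET_ADDRSTRLEN], d[INET_ADDRSTRLEN];

    assert(buf != NULL);
    /* Copy each string out before the next inet_ntoa() call reuses
     * the same static buffer. */
    snprintf(s, sizeof s, "%s", inet_ntoa(src));
    snprintf(d, sizeof d, "%s", inet_ntoa(dst));

    int n = snprintf(buf, buflen, "%s/%u %s/%u",
                     s, (unsigned)sport, d, (unsigned)dport);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Passing `inet_ntoa(src)` and `inet_ntoa(dst)` directly as two arguments of a single printf() call is the classic mistake: both arguments point at the same static buffer, so the same address is printed twice.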

Make sure to store the initial sequence number of each flow when it gets established.

Topic revision: r2 - 2009-10-19 - 13:05:12 - Main.jakob