
Homework 6 - who are the big fish on the Internet?

In this homework, we again use PlanetLab, but this time we look in more detail at the routes taken between hosts.

Using the command 'traceroute', collect the routes taken between each pair of hosts (the same hosts as in homework 5). We want to find out some statistics about the various networks that make up the Internet. These networks are called Autonomous Systems (ASes), and each has its own AS number (ASN).

You can find out the ASN that controls a given IP address using the command 'whois'. I've found that 'whois -h' gives reasonably reliable results, certainly much better than just 'whois'. The ASN for the IP is listed under 'origin'.
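As a rough illustration of the lookup, here is a minimal Python sketch that runs 'whois -h' and pulls the ASN out of the 'origin' line. The function names and the `server` parameter are hypothetical; which whois server to pass is up to you, and the exact output layout may differ between servers, so treat the pattern as an assumption to check against real output.

```python
import re
import subprocess

# Assumed layout: a line such as "origin:   AS64500" somewhere in the reply.
ORIGIN_RE = re.compile(r"^origin:\s*(AS\d+)\s*$", re.IGNORECASE | re.MULTILINE)


def parse_origin(whois_text):
    """Return the first ASN listed under 'origin' (e.g. 'AS64500'), or None."""
    match = ORIGIN_RE.search(whois_text)
    return match.group(1).upper() if match else None


def lookup_asn(ip, server):
    """Run `whois -h <server> <ip>` and extract the origin ASN.

    `server` is a placeholder for whichever whois host you decide to query.
    """
    result = subprocess.run(
        ["whois", "-h", server, ip],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return parse_origin(result.stdout)
```

Since many hops repeat across routes, it is probably worth caching the result per IP address rather than querying the server for every hop.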

Submission requirements

We are interested in learning about the organizational structure of the Internet, in terms of autonomous systems. Submit a Makefile (and any necessary scripts) that collects traceroute data for each pair of hosts when 'make collect' is called. I would expect around 10000 pairs. The exact number is not important, but it needs to be several thousand pairs.

Once the data is collected, we need to process it to learn the following statistics:

  1. Which 20 ASes occur most frequently on the routes between our hosts? Take care not to double-count an ASN when it appears more than once in a route.
  2. Which 20 ASes have the largest 'degree', that is, the largest number of neighboring ASNs?
  3. What are the 20 largest ASes? You may base your ranking on considerations such as the number of really long links, or the total length of observed links. You may use RTT as a proxy for distance.
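One possible way to compute the three rankings is sketched below in Python. It assumes each route has already been turned into a list of ASNs (with None for hops whose ASN is unknown), or into a list of (ASN, RTT in ms) pairs for the third statistic. All function names are hypothetical. Two choices here are assumptions rather than requirements: unknown hops are skipped, so the ASNs on either side of one are treated as neighbors, and 'size' is taken to be the total RTT increase between consecutive hops inside the same AS.

```python
from collections import Counter, defaultdict


def collapse(as_path):
    """Drop unknown hops (None) and merge consecutive repeats of an ASN."""
    out = []
    for asn in as_path:
        if asn is not None and (not out or out[-1] != asn):
            out.append(asn)
    return out


def frequency(routes):
    """Count in how many routes each ASN appears, at most once per route."""
    counts = Counter()
    for path in routes:
        counts.update({asn for asn in path if asn is not None})
    return counts


def degrees(routes):
    """Number of distinct neighboring ASNs seen for each ASN."""
    neighbors = defaultdict(set)
    for path in routes:
        hops = collapse(path)
        for a, b in zip(hops, hops[1:]):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {asn: len(seen) for asn, seen in neighbors.items()}


def total_link_rtt(timed_routes):
    """Sum of RTT increases between consecutive hops within the same AS."""
    totals = defaultdict(float)
    for hops in timed_routes:
        for (asn_a, rtt_a), (asn_b, rtt_b) in zip(hops, hops[1:]):
            if asn_a is not None and asn_a == asn_b:
                # RTTs are noisy, so clamp negative differences to zero.
                totals[asn_a] += max(0.0, rtt_b - rtt_a)
    return dict(totals)


def top(scores, n=20):
    """The n highest-scoring ASNs, ties broken by ASN string."""
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
```

Using a set per route in `frequency` is what prevents double-counting an ASN that shows up on several hops of the same route.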

Submit your code together with your collected logs, and all files generated by the post-processing step. Your Makefile should support the following commands:

  • make clean - remove all logs and temporary files
  • make collect - collect new traceroute measurements
  • make postprocess - do all time-consuming post-processing here
  • make report - generate a final report on stdout

'make report' should generate as output the 3 lists above, with descriptive titles. Submit an example of your output from 'make report' in a file called REPORT.txt.
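For the report layout, something along these lines would be enough. This is only a sketch of one acceptable format, and `format_ranking` is a hypothetical helper name; it takes a title and a list of (ASN, score) pairs that are already sorted.

```python
def format_ranking(title, rows):
    """Render one titled, numbered list of (ASN, score) pairs as text."""
    lines = [title, "-" * len(title)]
    for rank, (asn, score) in enumerate(rows, start=1):
        lines.append(f"{rank:2d}. {asn:<10} {score}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder rows using documentation-only ASNs, not real measurements.
    print(format_ranking("Most frequently occurring ASes",
                         [("AS64500", 3), ("AS64501", 2)]))
```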


Sometimes traceroute does not finish due to nonresponsive hosts (asterisks). If you limit the max TTL to 25 hops, and the wait time to 2 seconds, it'll finish faster.
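As a sketch of how those limits might be passed, the Python snippet below builds the command with '-m' (max TTL) and '-w' (wait time in seconds), which are the options used by the common Linux and BSD traceroute implementations. The function names are hypothetical, and you should check the man page on your PlanetLab nodes in case their traceroute differs.

```python
import subprocess


def traceroute_cmd(host, max_ttl=25, wait_s=2):
    """Build a traceroute command limited to max_ttl hops and wait_s seconds per probe."""
    return ["traceroute", "-m", str(max_ttl), "-w", str(wait_s), host]


def run_traceroute(host):
    """Run traceroute against host and return its raw text output."""
    result = subprocess.run(traceroute_cmd(host), capture_output=True, text=True)
    return result.stdout
```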

Use the whois database. If the ASN of a given IP does not exist in this database (it happens), you can ignore it.

'head -20' gives you the top 20 lines of a file or stdin.

'traceroute -n' gives you the IP addresses instead of hostnames (and it runs faster).
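Numeric output is also easier to parse. The Python sketch below pulls the hop number, the first responding IP address, and the first RTT out of each line. It assumes the usual 'hop  ip  rtt ms ...' layout printed by common traceroute implementations and IPv4 addresses only; `parse_hops` is a hypothetical name, and hops that show only asterisks are skipped.

```python
import re

# Hop number, optional lost probes ("*"), then the first IPv4 address and RTT.
HOP_RE = re.compile(
    r"^\s*(\d+)\s+(?:\*\s+)*(\d{1,3}(?:\.\d{1,3}){3})\s+([\d.]+)\s+ms"
)


def parse_hops(output):
    """Return a list of (hop_number, ip, rtt_ms) for the responsive hops."""
    hops = []
    for line in output.splitlines():
        match = HOP_RE.match(line)
        if match:
            hops.append((int(match.group(1)), match.group(2), float(match.group(3))))
    return hops
```

Only the first address and RTT on each line are kept, which is a simplification: a hop can answer from more than one address, and you may prefer the minimum of the three RTTs.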

Topic revision: r1 - 2009-10-12 - 02:28:03 - Main.jakob