Homework 1

Homework 1: VirtualBox, git, and a web client

Setting up an Ubuntu 14.04 virtual machine

For this class, we will use Ubuntu 14.04 as the reference platform to run all programs.

You are welcome to develop and test your programs on any platform of your choice, but you NEED TO make sure that your final programs compile and run correctly on Ubuntu 14.04, because that’s what will be used for grading.

For this reason, we provide an Ubuntu 14.04 VM image as a reference, which you can use to test your programs. You should be able to use the image with any virtualization software, but we recommend and support VirtualBox. Here is how to set it up:

Once you have VirtualBox installed (https://www.virtualbox.org/wiki/Downloads), you should download the VM image we have provided from (https://drive.google.com/folderview?id=0ByajJYQ856dpVmJ6clYxeFQxcWs&usp=sharing)

  1. Open VirtualBox
  2. Go to File > Import Appliance
  3. Find the image (.ova) you downloaded.
  4. Press Import and you’re done!

Once the image is imported, you can adjust settings such as the amount of RAM and then start it up.

Login with:

		user: networking
		password: networking

NOTE that the GUI has been turned off by default to make the virtual OS lighter.

If you need the GUI, you can start the Ubuntu display manager with

$ sudo service lightdm start

and then press Ctrl+Alt+F7 to switch to the GUI terminal. You can always press Ctrl+Alt+F1 to switch back to tty1, and turn the GUI off with

$ sudo service lightdm stop

I recommend having VirtualBox running in the background and working from the comfort of your own host environment via an SSH connection to the running VM.

To SSH to the running VirtualBox VM from your host OS:

  1. In VirtualBox, go to Ubuntu-VM -> Settings -> Network
  2. Make sure Adapter 2 is set as a “Host-only Adapter”
  3. Start the VM and execute
    $ ifconfig -a
    

    The IP of your VM should be displayed in “eth0” under “inet addr”

  4. You can access the running VM from a terminal on your own machine with:
    $ ssh networking@IP
    

    (or use PuTTY on Windows)

Build essentials

Once you have an Ubuntu 14.04 installation, type the following to make it ready for Git and programming.

sudo apt-get install build-essential  
sudo apt-get install git

These have already been installed on the provided VM image. Note that the GCC 4.8 installed by build-essential is our reference C compiler for this course.

git, class repositories (repos)

The main objective of this homework is to get you familiarized with the versioning system we will be using for homework turn-in, called git. Git is a decentralized revision control system. You will also get a quick introduction to network programming in C, which will be our main language in the course.

Using your private key, check out the public course repository:

git clone cs450git@words.cs.uic.edu:cs450-f16/public.git

Once you’ve cloned the public repo, you will have a directory named public filled with useful files for all students. Some of those files will allow you to use IP version 6 on your virtual machine. You will first need to add some packages and prepare your VM for connecting to our IPv6 VPN. To prepare your new Ubuntu install for class, run these commands from the directory where you just cloned the public repo:

cd public
cd utils
tar xzf bitsvpn.tgz
cd bitsvpn
sudo ./install.sh

If all goes well, your virtual machine is now on the IPv6 Internet. You can verify this by trying the command:

$ ping6 ipv6.google.com

If that ping command works, you’re in business! Now you’re ready for the fun part.

Refer to the discussion site if you have further questions.

your personal repository

You can also check out your personal repository:

git clone cs450git@words.cs.uic.edu:cs450-f16/YOURUSERNAME.git

You will want to copy the skeleton code from the public directory to your new YOURUSERNAME directory. Here, my username is bvamanan-student. When you first clone the repository, it will be empty. Your first task is to copy the hw1 skeleton code to your repository. If you’ve done that correctly, the tree program will print a graphical (ASCII art) representation of the directories and files within the current directory:

ubuntu@ip-10-143-165-210:~$ cd bvamanan-student
ubuntu@ip-10-143-165-210:~/bvamanan-student$ tree
.
└── hw1
    ├── hw1.c
    ├── Makefile
    └── SUBMISSION_COMMENTS.txt

1 directory, 3 files

Now, just to test things out, let’s add these files to the repository, commit them, and send the changes to the central server.

ubuntu@ip-10-143-165-210:~/bvamanan-student$ git add hw1
ubuntu@ip-10-143-165-210:~/bvamanan-student$ git commit -a -m'added hw1 skeleton'
[master e66c87f] added hw1 skeleton
 3 files changed, 77 insertions(+)
 create mode 100644 hw1/Makefile
 create mode 100644 hw1/SUBMISSION_COMMENTS.txt
 create mode 100644 hw1/hw1.c
ubuntu@ip-10-143-165-210:~/bvamanan-student$ git push
Counting objects: 7, done.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (6/6), 1.13 KiB, done.
Total 6 (delta 0), reused 0 (delta 0)
To git@git.uicbits.net:cs450-f15/bvamanan-student.git
   4781aea..e66c87f  master -> master
ubuntu@ip-10-143-165-210:~/bvamanan-student$

Remember, if you don’t push, we don’t see it! We will be making copies of the repositories at each deadline (full credit, 10% off, etc), so do not leave this for the last minute. If you submit it one second too late, you’ll be in the next lateness bracket, no exceptions.

The Programming Part!

For this week’s programming exercise, we will create a barebones web client (think wget). Based on the example TCP client code in the hw1 directory of the public repository, and the HTTP example sessions shown in class, you will write a command-line program called hw1 that takes a URL as its only parameter, retrieves the indicated file, and stores it in the local directory with the appropriate filename. If the URL does not end in a filename, your program should automatically request the file ‘index.html’. Make sure it works for both text and images by opening the stored file in a web browser.
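One possible way to split the URL is sketched below. It is only an illustration, not the skeleton's code: parse_url and its parameters are hypothetical names, and it assumes the URL always starts with http:// as described in this assignment.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the skeleton: splits a URL of the form
 * http://host/path into the host, the path to request, and the local file
 * name to save to. Returns 0 on success, -1 if the URL is malformed or one
 * of the output buffers is too small. */
int parse_url(const char *url, char *host, size_t host_len,
              char *path, size_t path_len,
              char *file, size_t file_len)
{
    const char *prefix = "http://";
    size_t plen = strlen(prefix);
    const char *host_start, *slash, *p, *name;
    size_t hlen;
    int n;

    if (strncmp(url, prefix, plen) != 0)
        return -1;
    host_start = url + plen;

    /* The host ends at the first slash, or at the end of the string. */
    slash = strchr(host_start, '/');
    hlen = slash ? (size_t)(slash - host_start) : strlen(host_start);
    if (hlen == 0 || hlen >= host_len)
        return -1;
    memcpy(host, host_start, hlen);
    host[hlen] = '\0';

    /* An empty path is treated as "/". */
    p = slash ? slash : "/";
    /* The file name is whatever follows the last slash of the path. */
    name = strrchr(p, '/') + 1;

    if (*name == '\0') {
        /* No file name given: request index.html in that directory. */
        n = snprintf(path, path_len, "%sindex.html", p);
        name = "index.html";
    } else {
        n = snprintf(path, path_len, "%s", p);
    }
    if (n < 0 || (size_t)n >= path_len)
        return -1;

    n = snprintf(file, file_len, "%s", name);
    if (n < 0 || (size_t)n >= file_len)
        return -1;
    return 0;
}
```

With this sketch, http://www.google.com/ yields the host www.google.com, the path /index.html, and the file name index.html. Query strings and port numbers are deliberately not handled.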

You may assume that the URL is of the form http://host/path, where path may or may not be an empty string, may or may not contain multiple slashes (for subdirectories), and may or may not contain a file name. You may assume files to be no larger than one megabyte, and you are not expected to follow any HTTP redirect (3xx); your code can simply exit without saving any file.
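Because of the one-megabyte limit, the whole response fits in a fixed-size buffer. The sketch below shows one way to read it and to find where the headers end. The names read_all, find_body, and MAX_RESPONSE are hypothetical, the extra 8 KB of header room is an assumption, and the code relies on an HTTP/1.0 server closing the connection once the response has been sent.

```c
#include <string.h>
#include <unistd.h>

/* Assumed limit: 1 MB of file data plus some room for the response headers. */
#define MAX_RESPONSE (1024 * 1024 + 8192)

/* Reads from fd until end-of-stream or until the buffer is full.
 * Returns the number of bytes read, or -1 on a read error. */
long read_all(int fd, char *buf, size_t cap)
{
    size_t total = 0;

    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n < 0)
            return -1;
        if (n == 0)
            break;              /* the peer closed the connection */
        total += (size_t)n;
    }
    return (long)total;
}

/* Returns the offset of the first body byte (just past the blank line that
 * ends the headers), or -1 if no blank line was found. It compares raw
 * bytes, so a body that contains zero bytes (an image, say) is handled. */
long find_body(const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return (long)(i + 4);
    }
    return -1;
}
```

The body can then be written out with fwrite, using the byte count (total bytes minus the body offset) rather than strlen, since string functions stop at the first zero byte in binary data.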

The hostname may be a name like www.google.com, but the example code requires an IP address (like 64:ff9b::83c1:201d). To look up the IP address of a given host name, use getaddrinfo(). man 3 getaddrinfo on the command line will give you the details, or use this link, and see the getaddrinfo.c example.
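A minimal sketch of such a lookup follows. It is not the getaddrinfo.c example from the repository: lookup_ipv6 is a hypothetical helper, and restricting the query to IPv6 is an assumption based on the example address above.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolves host to its first IPv6 address and writes
 * the textual form into out. Returns 0 on success, -1 on failure. */
int lookup_ipv6(const char *host, char *out, size_t out_len)
{
    struct addrinfo hints, *res;
    struct sockaddr_in6 *sa;
    const char *ok;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET6;      /* assumption: the client speaks IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* TCP */

    /* "80" is the HTTP port; getaddrinfo returns 0 on success. */
    if (getaddrinfo(host, "80", &hints, &res) != 0)
        return -1;

    /* Use the first result and convert it to text with inet_ntop. */
    sa = (struct sockaddr_in6 *)res->ai_addr;
    ok = inet_ntop(AF_INET6, &sa->sin6_addr, out, (socklen_t)out_len);
    freeaddrinfo(res);
    return ok ? 0 : -1;
}
```

An output buffer of INET6_ADDRSTRLEN bytes is large enough for any IPv6 address. Alternatively, res->ai_addr and res->ai_addrlen can be passed straight to connect(), which avoids the round trip through a string.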

Template

For this homework, there is a prepared skeleton directory in the public git repository that you may use.

A few hints

  • Use HTTP version 1.0. Version 1.1 can get a lot more complicated.
  • Good functions to use for handling filenames and text include sprintf, sscanf, strstr, strchr. Read more about these using the man pages. For example, try man sprintf on the command line.
  • Sections 2.2.2-2.2.3 in the book should also be helpful. Your book talks about the “request line” and “header lines” for an HTTP request. You will only need to use the request line and the host line of the header.

  • A “newline” in HTTP consists of two ASCII characters: \r\n, not just \n.

  • Your program will be tested (at least) on these URLs:

http://www.google.com/
http://www.google.com/intl/en_ALL/images/logo.gif
http://www.google.com/thispagedoesnotexist
http://www.thissitedoesnotexist1776.com 
http://www.adorama.com/alc/0012691/article/Lenses-Product-Reviews-AdoramaTV
  • Make sure you handle all these cases gracefully. If you don’t send a host header, the first should produce a file index.html, the second the Google logo saved into the file logo.gif, and the rest should produce an error because they do not give you a 200 OK response; you can thus quit your program by calling exit(1).
  • Beej’s Guide to Network Programming is a great resource you may want to make use of.

  • If you’re curious, try firing up Wireshark, and then fetching the URL with wget or curl. You’ll find the request they sent (which may have a lot of additional parameters in it) in one of the packets with destination port 80.

  • Spend some time thinking about how to do the string manipulation. It does not have to be complicated. The complete program, including comments, error handling etc. can be written in about 100 leisurely lines.
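To make the hints about HTTP 1.0, the request line, the host line, and \r\n concrete, here is a small sketch. The names build_request and status_code are hypothetical, and status_code assumes the response buffer has been NUL-terminated before it is called.

```c
#include <stdio.h>

/* Writes an HTTP/1.0 request consisting of a request line and a Host
 * header into buf. Every line ends in \r\n, and an empty line ends the
 * request. Returns the request length, or -1 if buf is too small. */
int build_request(char *buf, size_t cap, const char *host, const char *path)
{
    int n = snprintf(buf, cap,
                     "GET %s HTTP/1.0\r\n"
                     "Host: %s\r\n"
                     "\r\n",
                     path, host);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}

/* Extracts the numeric status code from a status line such as
 * "HTTP/1.0 200 OK". Returns the code, or -1 if the line does not parse. */
int status_code(const char *response)
{
    int major, minor, code;

    if (sscanf(response, "HTTP/%d.%d %d", &major, &minor, &code) != 3)
        return -1;
    return code;
}
```

A client built this way would save the file only when status_code returns 200 and call exit(1) otherwise, which covers both the 404 and the redirect (3xx) cases.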

Turn-in instructions

For your turn-in, prepare a Makefile that compiles the hw1 target, and push your changes to your repository. To make sure your submission is complete, check that your program works as follows:

Try the following in a temporary directory, i.e. create and change into a temporary directory under the /tmp filesystem:

mkdir /tmp/hw1-temp
cd /tmp/hw1-temp
git clone cs450git@words.cs.uic.edu:cs450-f16/MYUSERNAME.git
cd MYUSERNAME/hw1
make
./hw1 http://www.google.com/index.html 

This process should produce a file called index.html in the current working directory, containing the source for the Google front page.

./hw1 http://www.google.com/intl/en_ALL/images/logo.gif 

Running this line should produce a file called logo.gif, containing the Google logo.

Grading

Grading will be done automatically using a script. You can check your grade even before final submission using the script. The grading script is published at:

public/grading_scripts

You can check your solution by running the script as follows:

./hw1_score homework_directory output_file score_file

The score file will look like the following:

[SUCCESS] code compiled [1.0]: 1.0
[SUCCESS] index.html created for http://www.google.com? [1.0]: 1.0
[SUCCESS] skype_logo.png created for http://www.skypeassets.com/i/images/logos/skype_logo.png? [1.0]: 1.0
[SUCCESS] images_logo_lg.gif created for http://images.google.com/intl/en_ALL/images/logos/images_logo_lg.gif? [0.5]: 0.5
[SUCCESS] biagioni-gis13.pdf created for http://biagioni.net/papers/biagioni-gis13.pdf? [0.5]: 0.5
[SUCCESS] exit with error for http://www.google.com/thispagedoesnotexist? [0.5]: 0.5
[SUCCESS] exit with error for http://www.thissitedoesnotexist1000.com? [0.5]: 0.5
[SUCCESS] exit with error for http://www.skype.com? [0.5]: 0.5
[SUCCESS] skype_logo.png size correct? [0.5]: 0.5
[SUCCESS] images_logo_lg.gif size correct? [0.5]: 0.5
[SUCCESS] biagioni-gis13.pdf size correct? [1.0]: 1.0
[SUCCESS] skype_logo.png content correct? [0.5]: 0.5
[SUCCESS] images_logo_lg.gif content correct? [0.5]: 0.5
[SUCCESS] biagioni-gis13.pdf content correct? [1.0]: 1.0
[SUCCESS] Lenses-Product-Reviews-AdoramaTV created for http://www.adorama.com/alc/0012691/article/Lenses-Product-Reviews-AdoramaTV? [0.5]: 0.5
Total score: 10

Due Date

This assignment is due Wednesday, September 7, at 3 PM. See the syllabus for the late turn-in policy. This assignment is worth just as much as every other homework, so getting as much credit on it as possible is important (don’t turn it in late!).