Homework 5: A Home-Brew Web Server
This homework is due on Monday, November 23rd, at 11:59pm Chicago time.
This homework has the following learning objectives:
- Learn how to follow a network protocol (in this case, HTTP)
- Learn network/socket programming
In this homework, we’ll look at the HTTP protocol from the perspective of a Web server. After accepting the assignment, you will find that the hw5 directory contains:
homework5.c: Skeleton code for the server side of a TCP application. This will be the primary file for this assignment, but feel free to modularize and create other files if you prefer to do so.
WWW/: A directory containing example files for your Web server to distribute.
Makefile: If you modularize your code into different files, make sure those changes are reflected in the Makefile (and don’t forget to git add those files to your personal git repo!).
Your server program will receive two arguments: 1) the port number it should listen on for incoming connections, and 2) the directory out of which it will serve files (often called the document root in production Web servers). For example:
$ ./homework5 5000 WWW
This command will tell your Web server to listen for connections on port 5000 and serve files out of the WWW directory. That is, the WWW directory is considered / when responding to requests. For example, if you’re asked for /index.html, you should respond with the file that resides in WWW/index.html. If you’re asked for
/dir1/dir2/file.ext, you should respond with the file
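As a rough sketch of the path mapping described above, the helper below joins the document root and the request path with snprintf(). The name build_fs_path and the rejection of any path containing ".." are assumptions made for illustration; the assignment does not specify either.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map a request path such as "/index.html" onto a
 * file under the document root.
 * Returns 0 on success, or -1 if the path does not start with '/',
 * contains "..", or the result does not fit in `out`. */
int build_fs_path(const char *docroot, const char *req_path,
                  char *out, size_t out_len)
{
    /* Assumption: refuse anything that could climb out of the root.
     * This is deliberately conservative and also rejects "a..b.html". */
    if (req_path[0] != '/' || strstr(req_path, "..") != NULL)
        return -1;

    int n = snprintf(out, out_len, "%s%s", docroot, req_path);
    if (n < 0 || (size_t)n >= out_len)
        return -1;              /* encoding error or truncated output */
    return 0;
}
```

With a document root of WWW, a request for /index.html would yield WWW/index.html in the output buffer.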
Your server should handle the following cases:
- Complete the code inside the main() function to create a socket, bind it to an address and port, listen for client connections, and accept waiting connections.
- Serve requested files out of the directory specified in the run-time arguments. For example, if the specified directory is WWW and the requested file is /example.jpg, you should respond with the file WWW/example.jpg if it exists. You need to return the correct HTTP Content-Type header for html files so they can be rendered in the browser.
- If the requested file does not exist, you should respond with a 404 error code and a readable error page containing some basic HTML.
- Common Gateway Interface (CGI) is a protocol that lets a web server serve dynamic content using command-line programs. As a part of the protocol, you should handle requests like cgi-bin/hello-world.py?first_name=<str1>&last_name=<str2> by executing the file
QUERY_STRING as the environment variable in a new child process and return the response produced by the file by duplicating the file descriptor.
- Handling HTTP Content-Type headers for additional file types provided in the
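For the 404 case in the list above, one possible shape of the response is sketched below. The function name format_404, the HTTP/1.0 status line, and the wording of the error page are illustrative assumptions, not part of the skeleton code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a complete 404 response (status line,
 * headers, blank line, HTML body) into `out`.
 * Returns the number of bytes to send, or -1 if `out` is too small. */
int format_404(char *out, size_t out_len)
{
    static const char body[] =
        "<html><body><h1>404 Not Found</h1>"
        "<p>The requested file does not exist.</p></body></html>\n";

    int n = snprintf(out, out_len,
                     "HTTP/1.0 404 Not Found\r\n"
                     "Content-Type: text/html\r\n"
                     "Content-Length: %zu\r\n"
                     "\r\n"
                     "%s",
                     strlen(body), body);
    if (n < 0 || (size_t)n >= out_len)
        return -1;              /* encoding error or truncated output */
    return n;
}
```

The returned length is what you would pass to your send or write loop, so that the whole response goes out even if a single call writes only part of it.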
Always, always, always check the return value of any system calls you make! This is especially important for
write calls that tell you how many bytes were read or written. Failing to do so can crash the autograder, giving you zero points.
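To illustrate why the return value matters, here is a minimal sketch of a loop that keeps calling write() until every byte has been sent. The name write_all is hypothetical, and retrying on EINTR is one reasonable policy rather than a requirement of the assignment.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write exactly `len` bytes from `buf` to `fd`.
 * write() may send fewer bytes than requested, so loop until done.
 * Returns 0 on success, or -1 on error (errno is left set by write). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;
        }
        p += n;                 /* skip past the bytes already written */
        len -= (size_t)n;
    }
    return 0;
}
```

A caller would check the result, for example `if (write_all(client_fd, data, size) < 0) { /* log and close */ }`, where client_fd is whatever descriptor accept() returned.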
When testing, you should be able to retrieve byte-for-byte copies of files from your server. Use curl to fetch files and diff to compare the fetched file with the original. We will grade using this method. For full credit, the files need to be exact replicas of the originals.
This assignment is worth 8 points in total and 2 additional bonus points:
- 2 points for serving exact copies of HTML and JPEG files in the browser and using command line clients like curl. The MD5 sums should match!
- 2 points for correctly returning a 404 error code and HTML message when a request asks for a file that does not exist.
- 4 points for the CGI request: returning the formatted string as plain text, given the request arguments.
- Bonus: 2 points for serving other files (that can be rendered correctly – set Content-Type) to a standard Web browser (e.g., Firefox).
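One way to pick the Content-Type is a small lookup table keyed on the file extension, sketched below. The function name content_type_for and the particular set of extensions are assumptions; check which file types actually appear in your WWW directory.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strcasecmp */

/* Hypothetical helper: choose a Content-Type value from the file
 * extension. Unknown or missing extensions fall back to a generic
 * binary type. */
const char *content_type_for(const char *path)
{
    static const struct { const char *ext; const char *type; } table[] = {
        { ".html", "text/html" },
        { ".htm",  "text/html" },
        { ".txt",  "text/plain" },
        { ".jpg",  "image/jpeg" },
        { ".jpeg", "image/jpeg" },
        { ".png",  "image/png" },
        { ".gif",  "image/gif" },
        { ".pdf",  "application/pdf" },
    };

    const char *dot = strrchr(path, '.');   /* last '.' in the path */
    if (dot != NULL) {
        for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
            if (strcasecmp(dot, table[i].ext) == 0)
                return table[i].type;
        }
    }
    return "application/octet-stream";
}
```

The comparison is case-insensitive, so example.JPG and example.jpg map to the same type.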
When submitting, you must provide a Makefile along with your program and ensure that when your program compiles, the executable’s name is homework5. If you do not add any files to your source tree (and only update homework5.c), you should not need to change anything.
Grading will be done automatically using a script. We will publish this script after grading has completed; you are responsible for writing your own test cases. If you wish, you can share test cases you have written with the class.
- Take compiler warnings seriously. Unless it’s an unused variable, you should address the warning as soon as you see it. Dealing with a pile of warnings just makes things more difficult later.
- Test your code in small increments. It’s much easier to localize a bug when you’ve only changed a few lines.
- If you need to copy a specific number of bytes from one buffer to another, and you’re not 100% sure that the data will be entirely text, use memcpy() rather than strncpy(). The latter stops copying early if it finds a null terminator (‘\0’).
- If you’re trying to do some sort of specific string or memory manipulation, feel free to ask if there’s a better/recommended way to do it rather than brute force. Often there may be a standard library function that will make things easier.
- Read chapter 11 (especially 11.5 and 11.6) from the textbook to learn more about network programming in C.
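To make the buffer-copying tip above concrete, the sketch below appends raw bytes to a buffer with memcpy(), which copies embedded zero bytes (as found in JPEG data) like any other byte. The helper name append_bytes and its return convention are assumptions for illustration.

```c
#include <string.h>

/* Hypothetical helper: append `len` raw bytes from `src` to `buf`,
 * which has capacity `cap` and already holds `used` bytes.
 * Returns the new fill level, or (size_t)-1 if the bytes do not fit. */
size_t append_bytes(char *buf, size_t cap, size_t used,
                    const void *src, size_t len)
{
    if (used > cap || len > cap - used)
        return (size_t)-1;          /* would overflow the buffer */
    memcpy(buf + used, src, len);   /* copies all len bytes, zeros included */
    return used + len;
}
```

Had strncpy() been used on data such as {'J', 'P', 0, 'E', 'G'}, the bytes after the zero would be replaced with zero padding instead of being copied.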
Roughly, your server should follow this sequence:
- Read the arguments, bind to the specified port, and find your document root (you might find the chdir() system call helpful).
- Accept a connection, and hand it off to a new thread for concurrent processing.
- Receive and parse a request from the client.
- Look for the path that was requested, starting from your document root (the second argument to your program). One of four things should happen:
- If the path exists and it’s a file, formulate a response and send it back to the client (bonus point for handling Content-Type header happens here).
- If the path exists and it’s a directory that contains an index.html file, respond with that file.
- If the path does not exist, respond with a 404 code with a basic error page.
- If the requested path contains cgi-bin, retrieve the URL parameters and create a new child process. Modify the file descriptor of the child process and return the formatted string as plain text (Hint: You can use posix_spawn to accomplish this).
- Close the connection, and continue serving other clients.
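For the cgi-bin case in the sequence above, a minimal sketch using posix_spawn() might look like the following. The function name run_cgi, the fixed-size environment buffer, and the choice to pass only QUERY_STRING in the child's environment are assumptions for illustration; a real script may also need variables such as PATH.

```c
#include <spawn.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program argv[0] as a child process whose
 * standard output is a duplicate of `out_fd` and whose environment holds
 * only QUERY_STRING=<query>.
 * Returns the child's exit status, or -1 on failure. */
int run_cgi(char *const argv[], const char *query, int out_fd)
{
    char env_var[1024];
    int n = snprintf(env_var, sizeof env_var, "QUERY_STRING=%s", query);
    if (n < 0 || (size_t)n >= sizeof env_var)
        return -1;                      /* query string too long */
    char *envp[] = { env_var, NULL };

    posix_spawn_file_actions_t fa;
    if (posix_spawn_file_actions_init(&fa) != 0)
        return -1;
    /* In the child, make stdout refer to the same file as out_fd. */
    if (posix_spawn_file_actions_adddup2(&fa, out_fd, STDOUT_FILENO) != 0) {
        posix_spawn_file_actions_destroy(&fa);
        return -1;
    }

    pid_t pid;
    int rc = posix_spawn(&pid, argv[0], &fa, NULL, argv, envp);
    posix_spawn_file_actions_destroy(&fa);
    if (rc != 0)
        return -1;                      /* rc holds an errno-style code */

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the server, out_fd would typically be the connected client socket, so whatever the script prints goes straight back to the client. Any HTTP status line and headers you need would have to be written to the socket before spawning the child.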
How to run your program
There are two ways to run and test your code:
- Use this spreadsheet to find your port number, which is unique to you in order to avoid collisions. Use this port number to run the webserver on
devcontainer.json has settings to forward a port from the container to your local machine. If you wish to change the forwarded port, you can change the port number in line
If you have any questions about the homework requirements or specification, please post on Piazza.