Homework 3 - Data structures in C

In this homework, we practice our C skills by implementing a String data structure, and testing our implementations against each other. An implementation of the String data structure consists of a .o file that provides the following set of functions:

// hw3_string.h
struct String;
struct String* string_new(char* str);
void string_free(struct String* s);
struct String* string_clone(struct String* s);

int string_length(struct String* s);
void string_append(struct String* a, char* b);
char string_charAtIndex(struct String* s, int index);
char* string_range(struct String* s, int start, int length);
char* string_chars(struct String* s);

The template for hw3 is available at svn://bits.cs.uic.edu/cs385/pub/homeworks/hw3. This contains the hw3_string.h above, an implementation of the String data structure called simple_string.c, and programs that use String data structures and tests their correctness and runtime performance.

A faster representation of Strings

Your task is to create an alternative implementation of the String data structure, in a file called fast_string.c. You may implement your data structure whichever way you please, as long as the following conditions hold:

  • it uses dmalloc() and dfree() instead of malloc() and free(). Also uses no other functions (like mmap() or remalloc()) for dynamic memory allocation.
  • it passes all tests run by make testunit without warnings
  • it outputs the correct characters and excerpts when running make testspeed
  • it conforms to the performance expectations described below
Do not change the .c and .h files in the template. Instead, copy simple_string.c to a new file fast_string.c, and edit the first line of the Makefile to use fast_string.c instead. Make and commit your changes to fast_string.c.

Design suggestions

While you are free to implement the String data structure in any way you please, here are some suggestions.

  • simple_string.c is slow because it keeps copying an ever-growing string to new places. You need to stop it from making these copies.
  • if you use a linked-list, you'll end up dereferencing a crazy number of pointers when indexing into a large file, making the index operation slow
  • a binary tree is a natural representation. Note, however, that it will only be fast if it is somewhat balanced.

Testing correctness and performance

To test the correctness of your code, run make testunit. This performs several unit tests on the String functions declared above. Try running make testunit with STRING_CODE in the Makefile set to simple_string.c to see the expected output.

To test the performance of your code, run make testspeed. This downloads a large book, concatenates the entire book into one large string, line by line, and applies the string functions on the resulting string. make testspeed also outputs a series of "Found characters" and "Excerpts". These outputs are identical for all correct solutions - if your output differs, it will be deemed incorrect.

Performance expectations

The template directory contains three alternative versions of "speed", the binary run by make testspeed. original_speed runs the code provided in the template. unbalanced_speed runs an early solution I wrote, and balanced_speed runs my best solution.

To try one, run (for example) ./balanced_speed pg2554.txt. Your solution is expected to either finish in at most 10 times the total time of balanced_speed, or finish the "reading book" part in at most 2 times that of unbalanced_speed, and the lookup in at most 1.5x that of unbalanced_speed.

Performance competition and bonus points

All turn-ins will be automatically entered into a performance competition, pitting your solution against those of your classmates, as well as against my solutions. As soon as you commit a revision (does not have to be your final submission), it will be automatically tested for correctness and speed. The results will be posted on the hw3 scoreboard.

As an added bonus, the author of any solution that consistently beats mine in total execution time will receive 5 bonus points toward the final grade. Also, author of the highest-performing student solution will receive 5 bonus points toward the final grade. These bonuses are not stackable, so the max bonus possible is 5 points, but there's always the glory.

Topic revision: r2 - 2012-01-27 - 14:25:49 - Main.jakob
Copyright 2016 The Board of Trustees
of the University of Illinois.webmaster@cs.uic.edu
Helping Women Faculty Advance
Funded by NSF