CS 101 - Introduction to Computing, Spring 2011

Project 4 - Hiding a Text Message in a Sound

Due Date: Wednesday April 27, 2011 at 11:59 pm

This project will use a technique called "steganography" to encode a text message. Steganography is basically hiding one set of data within another set of data. Some decent discussions of this technique can be found at http://www.garykessler.net/library/steganography.html or http://en.wikipedia.org/wiki/Steganography

For this project we want to take an ASCII text message and hide it in a sound file. We will also want to be able to extract the message. For this we will need to write two programs: Proj4a.java and Proj4b.java. Proj4a.java will place the message into a sound file, and Proj4b.java will extract the message from the sound file.

Recall that a text message is a collection (i.e. encoding) of multiple ASCII characters, each ASCII character can be represented as a numeric value, and (finally) each numeric value can be represented as a series of decimal digits (values from zero through nine). Thus the text message "Welcome!" is made up of the ASCII characters:

      W    e    l    c    o    m    e    !

These ASCII characters are represented with the following numeric values (in decimal):

      87   101  108  99   111  109  101  33

These values can be represented as three decimal digits:

      087  101  108  099  111  109  101  033

The ASCII values can be found at http://www.asciitable.com/. We can easily compute the decimal ASCII value of a character by simply doing a type cast to type int.
      char ch = 'W';      // any character from the message
      int asciiValue;

      asciiValue = (int) ch;   // for 'W' this is 87
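
As an illustration of the cast described above, here is a minimal sketch that prints each character of "Welcome!" with its decimal ASCII value and its three-digit form. The class name AsciiDigitsDemo and the helper toThreeDigits() are hypothetical names chosen for this example; they are not part of the project requirements.

```java
class AsciiDigitsDemo {
    // Hypothetical helper: returns the ASCII code of ch padded to three digits,
    // for example 'W' (87) becomes "087".
    static String toThreeDigits(char ch) {
        int asciiValue = (int) ch;
        return String.format("%03d", asciiValue);
    }

    public static void main(String[] args) {
        String message = "Welcome!";
        for (int i = 0; i < message.length(); i++) {
            char ch = message.charAt(i);
            int asciiValue = (int) ch;
            System.out.println(ch + " " + asciiValue + " " + toThreeDigits(ch));
        }
    }
}
```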

Hiding a Text Message in a Sound

The plan is to encode each of the decimal digits from the ASCII values of the text message in the samples of the sound object. If a sound sample's amplitude value is only changed by a small amount, most people will be unable to notice the change. The one's digit of each amplitude value will be changed to a decimal digit from the text message. An amplitude value ending in 5 (like 3815) would represent a decimal digit of 5.

Initially we will want all of the samples in the sound object to be "smoothed" out. This will result in every sample having the one's digit of its amplitude value set to zero. The way to do this is to determine the one's digit of the sample and subtract this value from the sample's amplitude value.

     digit = ampValue % 10;
     modAmpValue = ampValue - digit;

Note that a change this small in an amplitude value should have no audible impact on the sound quality. It is changes of around 10 that begin to be detected by the human ear. The average change from this smoothing will be about 5.
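
The smoothing step can be tried out on plain int values before it is applied to real sound samples. The following is a minimal sketch; the class name SmoothDemo and the method smooth() are hypothetical. It relies on the fact that in Java the result of % takes the sign of the left operand, so the same two lines also work for negative amplitude values.

```java
class SmoothDemo {
    // Hypothetical helper: sets the one's digit of an amplitude value to zero.
    static int smooth(int ampValue) {
        int digit = ampValue % 10;   // negative when ampValue is negative
        return ampValue - digit;     // moves the value toward zero
    }

    public static void main(String[] args) {
        int[] examples = {3815, 59, 10, 7, -1, -12};
        for (int ampValue : examples) {
            System.out.println(ampValue + " -> " + smooth(ampValue));
        }
    }
}
```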

Once a sound file has been "smoothed", we will encode one decimal digit from the text message into each sound sample. Since each ASCII character is considered to have 3 decimal digits, we will need 3 sample amplitude values for each ASCII character. Thus to encode a text message of 100 characters, we would need 300 sample amplitude values. At this rate, a one-second sound can encode 7350 characters (7350 characters x 3 samples = 22,050 samples).
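
The capacity arithmetic above can be written as two small helpers. This is only a sketch: the class and method names are hypothetical, and the figure of 22,050 samples for one second of sound is an assumption inferred from 7350 x 3, not something read from an actual file.

```java
class CapacityDemo {
    static final int SAMPLES_PER_CHARACTER = 3;   // one sample per decimal digit

    // Number of samples needed to hide a message of the given length.
    static int samplesNeeded(int messageLength) {
        return messageLength * SAMPLES_PER_CHARACTER;
    }

    // Number of whole characters that fit into the given number of samples.
    static int maxCharacters(int numSamples) {
        return numSamples / SAMPLES_PER_CHARACTER;
    }

    public static void main(String[] args) {
        System.out.println(samplesNeeded(100));     // samples for 100 characters
        System.out.println(maxCharacters(22050));   // assumed one second of sound
    }
}
```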

To encode a decimal digit into a sound sample, we simply add the decimal digit (a value from zero to nine) to the "smoothed" amplitude value of the sound sample when the original amplitude value is positive, and subtract the decimal digit from the "smoothed" amplitude value of the sound sample when the original amplitude value is negative. There are two special cases that we will need to consider. These cases are when the encoded amplitude values fall outside of the normal range of -32768 to +32767.

The following table is to help us with this. It shows the first 24 samples from the file gettysburg10.wav with the message "Welcome!" encoded in them. Note that the sample values are relatively small numbers, as this is only the first 1000th of the sound and nothing has been spoken yet.

Sample  Original   "Smoothed"  ASCII      Decimal  Decimal  Decimal Digit  Modification  Final Sample
Index   Amplitude  Amplitude   Character  Value    Digit    Position       Operation     Value
0       59         50          W          87       7        0              +             57
1       39         30                              8        1              +             38
2       16         10                              0        2              +             10
3       10         10          e          101      1        0              +             11
4       -1         0*                              0        1              -             0
5       7          0                               1        2              +             1
6       -12        -10         l          108      8        0              -             -18
7       -7         0*                              0        1              -             0
8       10         10                              1        2              +             11
9       25         20          c          99       9        0              +             29
10      61         60                              9        1              +             69
11      57         50                              0        2              +             50
12      46         40          o          111      1        0              +             41
13      41         40                              1        1              +             41
14      43         40                              1        2              +             41
15      73         70          m          109      9        0              +             79
16      76         70                              0        1              +             70
17      60         60                              1        2              +             61
18      46         40          e          101      1        0              +             41
19      39         30                              0        1              +             30
20      31         30                              1        2              +             31
21      26         20          !          33       3        0              +             23
22      35         30                              3        1              +             33
23      37         30                              0        2              +             30

To determine which decimal digit of which character is associated with which sample from the sound object, use the following, where sampleIndex is the index of the SoundSample from the sound object:

      Character Number       = sampleIndex / 3    (integer division)
      Decimal Digit Position = sampleIndex % 3

Once we know the Character Number, we can get the character from the string by using the charAt() method of the String class. This method will give us the character at that position. To get the ASCII character code number for the character, simply store the character into a variable of type int.
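
Here is a minimal sketch of this lookup, assuming three samples per character with Decimal Digit Position 0 stored first, as in the table above. The class and method names are hypothetical.

```java
class SampleIndexDemo {
    // Which character of the message a given sample belongs to.
    static int characterNumber(int sampleIndex) {
        return sampleIndex / 3;   // integer division drops the remainder
    }

    // Which of the character's three decimal digits the sample holds.
    static int decimalDigitPosition(int sampleIndex) {
        return sampleIndex % 3;
    }

    // ASCII code of the character associated with a given sample.
    static int asciiCodeFor(String message, int sampleIndex) {
        char ch = message.charAt(characterNumber(sampleIndex));
        int asciiCode = ch;       // storing a char in an int gives its code
        return asciiCode;
    }

    public static void main(String[] args) {
        String message = "Welcome!";
        for (int sampleIndex = 0; sampleIndex < 6; sampleIndex++) {
            System.out.println(sampleIndex + ": character "
                + characterNumber(sampleIndex) + ", position "
                + decimalDigitPosition(sampleIndex) + ", code "
                + asciiCodeFor(message, sampleIndex));
        }
    }
}
```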

To get the Decimal Digit value (zero through nine) for a particular Decimal Digit Position from the ASCII character code number, we should write a method with the following method description:

public static int getDecimalDigit (int asciiCode, int DecimalDigitPosition)
{
  // This method should divide the asciiCode by 10 as many times
  //   as specified by the DecimalDigitPosition, 
  // Then use the modulus operator % to find the remainder 
  //   of a division by 10.
  // This remainder value is the Decimal Digit that we want returned
  //   by the method.

}
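
One possible way to fill in the method body, following the comments in the description above, is sketched here. It is wrapped in a hypothetical class DecimalDigitDemo so that it can be compiled and tried on its own; your own solution may be organized differently.

```java
class DecimalDigitDemo {
    public static int getDecimalDigit(int asciiCode, int DecimalDigitPosition) {
        int value = asciiCode;
        // Divide by 10 once for each position we need to skip.
        for (int i = 0; i < DecimalDigitPosition; i++) {
            value = value / 10;
        }
        // The remainder of a division by 10 is the digit we want.
        return value % 10;
    }

    public static void main(String[] args) {
        int asciiCode = (int) 'W';   // 87
        for (int position = 0; position < 3; position++) {
            System.out.println(position + ": " + getDecimalDigit(asciiCode, position));
        }
    }
}
```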

Special Cases

There are 2 special cases that we need to take care of. These occur when the adjusted/encoded amplitude value ends up above the maximum value of +32,767, or below the minimum value of -32,768.

According to the mathematics, there are only two encoded amplitude values above +32,767 that could exist:

      +32,768 and +32,769

In both of these cases, simply subtracting 10 to get results of +32,758 and +32,759 is the best solution. The decimal digit gets encoded properly and we don't lose too much sound quality.

Similarly, there is only one possible encoded amplitude value below -32,768:

      -32,769

In this case, simply adding 10 to get the result of -32,759 is the best solution. Again, the decimal digit gets properly encoded and we don't lose too much sound quality.
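
Putting the smoothing, the add-or-subtract rule, and the two special cases together gives something like the following sketch. It works on plain int values rather than SoundSample objects, the class and method names are hypothetical, and treating an original amplitude value of exactly zero as positive is an assumption made for this example.

```java
class EncodeDemo {
    // Hypothetical helper: hides one decimal digit (0 through 9) in an amplitude value.
    static int encodeDigit(int ampValue, int digit) {
        int smoothed = ampValue - (ampValue % 10);
        int encoded;
        if (ampValue >= 0) {            // assumption: zero is handled like a positive value
            encoded = smoothed + digit;
        } else {
            encoded = smoothed - digit;
        }
        // Special cases: pull the value back into the range -32768 to +32767.
        if (encoded > 32767) {
            encoded = encoded - 10;
        } else if (encoded < -32768) {
            encoded = encoded + 10;
        }
        return encoded;
    }

    public static void main(String[] args) {
        System.out.println(encodeDigit(59, 7));       // sample 0 in the table
        System.out.println(encodeDigit(-12, 8));      // sample 6 in the table
        System.out.println(encodeDigit(32767, 9));    // above the maximum before the fix
        System.out.println(encodeDigit(-32768, 9));   // below the minimum before the fix
    }
}
```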

Extracting the Text Message

To extract the text message from a sound file, simply determine the one's digit of each sound amplitude value (ignoring the sign when the amplitude value is negative). This will correspond to one of the three decimal digits for each character of the text message.

For example, assume we needed to get the character y. The 3 Decimal Digits and the corresponding Decimal Digit Positions would be:

ASCII      Decimal  Decimal  Decimal Digit
Character  Value    Digit    Position
y          121      1        0
                    2        1
                    1        2

First, we initialize the asciiCode to the Decimal Digit at DecimalDigitPosition 0.
- This would set the asciiCode to 1.

We then multiply the Decimal Digit at DecimalDigitPosition 1 by 10 and add it to the asciiCode value.
- This would add the value of 20 (2 * 10) to the asciiCode of 1, which brings the asciiCode to 21.

We then multiply the Decimal Digit at DecimalDigitPosition 2 by 100 and add it to the asciiCode value.
- This would add the value of 100 (1 * 100) to the asciiCode of 21 to give a new value of 121.

The final value of asciiCode is 121, which is the proper ASCII code value for the character y.
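
The three steps can be sketched as a small method. The names CombineDigitsDemo, onesDigit(), and combineDigits() are hypothetical. Math.abs() is used when reading the one's digit because an encoded amplitude value may be negative.

```java
class CombineDigitsDemo {
    // One's digit of an amplitude value, ignoring its sign.
    static int onesDigit(int ampValue) {
        return Math.abs(ampValue % 10);
    }

    // Rebuilds an ASCII code from the digits at positions 0, 1 and 2.
    static int combineDigits(int digit0, int digit1, int digit2) {
        int asciiCode = digit0;
        asciiCode = asciiCode + digit1 * 10;
        asciiCode = asciiCode + digit2 * 100;
        return asciiCode;
    }

    public static void main(String[] args) {
        int asciiCode = combineDigits(1, 2, 1);
        char ch = (char) asciiCode;
        System.out.println(asciiCode + " " + ch);
    }
}
```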

We will assume that the number of SoundSamples greatly outnumbers the number of decimal digits that need to be hidden/extracted. Thus there will be a large number of SoundSamples whose amplitude values still end in zero after the smoothing. While doing the extraction, if we encounter a character that was created from 3 sound samples that all have a one's digit of zero, we have extracted ASCII character 0, the NULL character. When this occurs, we will stop the extraction process, display the string that has been created, and then exit() the program.
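
The extraction loop with this stopping rule might look like the following sketch. To keep it self-contained it reads amplitude values from an int array instead of a Sound object, and it returns the string instead of displaying it and calling exit(). The class and method names are hypothetical. The test values are the final sample values from the table above followed by three smoothed samples.

```java
class ExtractDemo {
    // Rebuilds the hidden message from encoded amplitude values.
    static String extractMessage(int[] ampValues) {
        StringBuilder message = new StringBuilder();
        // Each character uses three consecutive samples.
        for (int i = 0; i + 2 < ampValues.length; i += 3) {
            int asciiCode = Math.abs(ampValues[i] % 10)
                          + Math.abs(ampValues[i + 1] % 10) * 10
                          + Math.abs(ampValues[i + 2] % 10) * 100;
            if (asciiCode == 0) {
                break;   // NULL character: the end of the message
            }
            message.append((char) asciiCode);
        }
        return message.toString();
    }

    public static void main(String[] args) {
        int[] ampValues = {
            57, 38, 10,  11, 0, 1,  -18, 0, 11,  29, 69, 50,
            41, 41, 41,  79, 70, 61,  41, 30, 31,  23, 33, 30,
            40, -20, 0   // smoothed samples with no message digits
        };
        System.out.println(extractMessage(ampValues));
    }
}
```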

Project Input and Output

For the encoding program, Proj4a.java, you are to prompt the user for a sound file to use and also prompt the user for the string to hide in the sound file. After the string is hidden in the sound, you are to prompt the user for a file to save the sound. You may use the SimpleInput.getString() method to get the string to hide.

For the extracting program, Proj4b.java, you are to prompt the user for a sound file to use. You are to display the extracted string using one of the methods from the SimpleOutput class.

Program Writing Style

Your program must be written using good programming style which includes:

Project Collaboration

You are allowed to receive help on this project from other students who are also taking CS 101. Each student must still complete and submit his/her own project. You will be required to include a Collaboration Statement somewhere on your project if you receive help. This statement can simply be something like the following:
For this project, I received help from the following member of CS 101.
This statement should list each helping student's name in a comment in the "header comment" of your Java file that includes the main() method.

Submission of your Project

Submit your code via Blackboard.

CS 101 Home Page
Department of Computer Science
University of Illinois at Chicago