CS 111 - Program Design I

Project 4 - Hiding a Text Message in a Sound

Due Date: Thursday May 1, 2014 at 11:59 pm

This project will use a technique called "steganography" to encode a text message. Steganography is basically hiding one set of data within another set of data. Some decent discussions of this technique can be found at http://www.garykessler.net/library/steganography.html, http://windowsitpro.com/security/other-forms-steganography or http://en.wikipedia.org/wiki/Steganography

An example of steganography is the following message that might have been sent by cablegram from a spy during the First World War.

PRESIDENT'S EMBARGO RULING SHOULD HAVE IMMEDIATE NOTICE. GRAVE
SITUATION AFFECTING INTERNATIONAL LAW. STATEMENT FORESHADOWS RUIN
OF MANY NEUTRALS. YELLOW JOURNALS UNIFYING NATIONAL EXCITEMENT
IMMENSELY.

When only the first letter of every word is used, the following message is uncovered.

  • PERSHINGSAILSFROMNYJUNEI
Or with spaces inserted
  • PERSHING SAILS FROM NY JUNE I
For this project, we want to take an ascii text message and hide it in a sound file. We will also want to be able to extract the message. For this we will need to write one program that can either embed a text message into a sound or extract a message from a sound. Your program will first prompt the user to enter the value of either 1 or 2.
  • If the user enters a 1, your program is to perform the embedding
  • If the user enters a 2, your program is to perform the extracting
Recall that a text message is a collection (i.e. encoding) of multiple ascii characters, each ascii character can be represented as a numeric value, and (finally) each numeric value can be represented as a series of decimal digits (values from zero through nine). Thus the text message "Welcome!" is made up of the ascii characters:
  • W
  • e
  • l
  • c
  • o
  • m
  • e
  • !
These ascii characters are represented with the following numeric values (in decimal):
  • 87
  • 101
  • 108
  • 99
  • 111
  • 109
  • 101
  • 33
These values can be represented as three decimal digits:
  • 0 8 7
  • 1 0 1
  • 1 0 8
  • 0 9 9
  • 1 1 1
  • 1 0 9
  • 1 0 1
  • 0 3 3
The ascii values can be found at http://www.asciitable.com. We can easily compute the decimal ascii value of a character by simply doing a type cast to type int.
      char ch;
      int asciiValue;
      asciiValue = (int) ch; 
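
As a small illustration of the steps above, the following sketch prints each character of "Welcome!" with its decimal ascii value and its three-digit form. The class name AsciiDigitsDemo and the helper threeDigits are hypothetical names chosen for this example; they are not part of the project requirements.

```java
class AsciiDigitsDemo {
    // Hypothetical helper: returns the ascii value of ch as a
    // zero-padded, three-character string (for example 'W' gives "087").
    static String threeDigits(char ch) {
        int asciiValue = (int) ch;              // type cast from char to int
        return String.format("%03d", asciiValue);
    }

    public static void main(String[] args) {
        String message = "Welcome!";
        for (int i = 0; i < message.length(); i++) {
            char ch = message.charAt(i);
            // Print the character, its decimal value, and its three digits.
            System.out.println(ch + "  " + (int) ch + "  " + threeDigits(ch));
        }
    }
}
```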

Hiding a Text Message in a Sound

The plan is to encode each of the decimal digits from the ascii number of the text message in the samples of the sound object. If a sound sample's amplitude value is only changed by a small amount, most people will be unable to notice the change. The one's digit of each amplitude value will be changed to one of the decimal digits from the text message. An amplitude value ending in 5 (like 3815) would represent a decimal digit of 5.

Initially we will want all of the samples in the sound object to be "smoothed" out. This will result in every sample having the one's digit of its amplitude value set to zero. The way to do this is to determine the one's digit of the sample and subtract this value from the sample's amplitude value.

      digit = ampValue % 10;
      modAmpValue = ampValue - digit; 
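
The sketch below applies the two statements above to a few plain int values, including negative ones. It relies on the fact that in Java the % operator keeps the sign of its left operand, so a negative amplitude is smoothed toward zero. The class name SmoothDemo and the method name smooth are hypothetical, and plain ints stand in for the values read from the sound samples.

```java
class SmoothDemo {
    // Hypothetical helper: sets the one's digit of an amplitude value to zero.
    static int smooth(int ampValue) {
        int digit = ampValue % 10;   // in Java, -12 % 10 is -2 (sign follows ampValue)
        return ampValue - digit;     // 59 becomes 50, -12 becomes -10
    }

    public static void main(String[] args) {
        int[] examples = {59, 3815, -12, -7, 0};
        for (int ampValue : examples) {
            System.out.println(ampValue + " -> " + smooth(ampValue));
        }
    }
}
```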

Note that a change this small to an amplitude value should have no audible impact on the sound quality. It is changes of around 10 that begin to be detected by the human ear. The average change from this smoothing will be about 5.

Once a sound file has been "smoothed", we will encode one decimal digit from the text message into each sound sample. Since each ASCII character is considered to have 3 decimal digits, we will need 3 sample amplitude values for each ASCII character. Thus to encode a text message of 100 characters, we would need 300 sample amplitude values. At this rate, one second of sound can encode 7350 characters (this figure corresponds to 22,050 samples per second, since 22,050 / 3 = 7350).

To encode a decimal digit into a sound sample, we simply add the decimal digit (a value from zero to nine) to the "smoothed" amplitude value of the sound sample when the original amplitude value is positive, and subtract the decimal digit from the "smoothed" amplitude value of the sound sample when the original amplitude value is negative. There are two special cases that we will need to consider. These cases are when the encoded amplitude values fall outside of the normal range of -32768 to +32767.

The following table is to help us with this. It shows the first 24 samples from the file gettysburg10.wav. Note that the sample values are relatively small numbers, as this is only the first 1000th of the sound and nothing has been spoken yet.

  • The first column shows the sample index (note the first sample is "sample index zero").
  • The second column shows the original amplitude values from the sound.
  • The third column shows the amplitude values after they have been "smoothed". Note that one of the special cases occurs twice (the rows marked with *).
  • The fourth column shows the ascii characters from the string Welcome! Note that each character corresponds to 3 amplitude values.
  • The fifth column shows the decimal value for each ascii character. Note that each character corresponds to 3 amplitude values.
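  • The sixth column shows the decimal digits from the string: Welcome! Note that the decimal digits are listed from least significant (the one's digit) to most significant (the hundred's digit).
  • The seventh column shows the decimal digit position (0 for the one's digit, 1 for the ten's digit, 2 for the hundred's digit).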
  • The eighth column shows whether the Decimal Digit needs to be added to or subtracted from the Smoothed Amplitude Value.
  • The ninth column shows the amplitude values with the decimal digits from the string either added to it or subtracted from it.

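Sample | Original  | "Smoothed" | ASCII     | Decimal | Decimal | Decimal Digit | Modification | Final Sample
Index  | Amplitude | Amplitude  | Character | Value   | Digit   | Position      | Operation    | Value
       | Value     | Value      |           |         |         |               |              |
0      | 59        | 50         | W         | 87      | 7       | 0             | +            | 57
1      | 39        | 30         |           |         | 8       | 1             | +            | 38
2      | 16        | 10         |           |         | 0       | 2             | +            | 10
3      | 10        | 10         | e         | 101     | 1       | 0             | +            | 11
4      | -1        | 0*         |           |         | 0       | 1             | -            | 0
5      | 7         | 0          |           |         | 1       | 2             | +            | 1
6      | -12       | -10        | l         | 108     | 8       | 0             | -            | -18
7      | -7        | 0*         |           |         | 0       | 1             | -            | 0
8      | 10        | 10         |           |         | 1       | 2             | +            | 11
9      | 25        | 20         | c         | 99      | 9       | 0             | +            | 29
10     | 61        | 60         |           |         | 9       | 1             | +            | 69
11     | 57        | 50         |           |         | 0       | 2             | +            | 50
12     | 46        | 40         | o         | 111     | 1       | 0             | +            | 41
13     | 41        | 40         |           |         | 1       | 1             | +            | 41
14     | 43        | 40         |           |         | 1       | 2             | +            | 41
15     | 73        | 70         | m         | 109     | 9       | 0             | +            | 79
16     | 76        | 70         |           |         | 0       | 1             | +            | 70
17     | 60        | 60         |           |         | 1       | 2             | +            | 61
18     | 46        | 40         | e         | 101     | 1       | 0             | +            | 41
19     | 39        | 30         |           |         | 0       | 1             | +            | 30
20     | 31        | 30         |           |         | 1       | 2             | +            | 31
21     | 26        | 20         | !         | 33      | 3       | 0             | +            | 23
22     | 35        | 30         |           |         | 3       | 1             | +            | 33
23     | 37        | 30         |           |         | 0       | 2             | +            | 30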
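
The table can be reproduced with a short sketch such as the one below. It is only an illustration under stated assumptions: a plain int array stands in for the samples of the sound object, the names EmbedTableDemo, ORIGINAL and embed are hypothetical, an amplitude of exactly zero is treated like a positive value, and the two out-of-range special cases are left out.

```java
import java.util.Arrays;

class EmbedTableDemo {
    // The 24 original amplitude values listed in the table.
    static final int[] ORIGINAL = {
        59, 39, 16, 10, -1, 7, -12, -7, 10, 25, 61, 57,
        46, 41, 43, 73, 76, 60, 46, 39, 31, 26, 35, 37
    };

    // Hypothetical helper: returns a new array with the message's
    // decimal digits encoded in the one's digit of each value.
    static int[] embed(int[] original, String message) {
        int[] divisors = {1, 10, 100};          // one's, ten's, hundred's digit
        int[] result = new int[original.length];
        for (int sampleIndex = 0; sampleIndex < original.length; sampleIndex++) {
            int amp = original[sampleIndex];
            int smoothed = amp - (amp % 10);    // one's digit set to zero
            int characterNumber = sampleIndex / 3;
            int digitPosition = sampleIndex % 3;
            int digit = 0;                      // past the end of the message: leave zero
            if (characterNumber < message.length()) {
                int asciiCode = (int) message.charAt(characterNumber);
                digit = (asciiCode / divisors[digitPosition]) % 10;
            }
            // Add for non-negative originals, subtract for negative ones.
            result[sampleIndex] = (amp >= 0) ? smoothed + digit : smoothed - digit;
        }
        return result;
    }

    public static void main(String[] args) {
        // The printed values should match the last column of the table.
        System.out.println(Arrays.toString(embed(ORIGINAL, "Welcome!")));
    }
}
```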

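To determine which decimal digit of which character is associated with which sample from the sound object, use the following, where sampleIndex is the index of the SoundSample from the sound object (both divisions are integer operations, which matches the table above):

      characterNumber = sampleIndex / 3;
      decimalDigitPosition = sampleIndex % 3;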

Once we know the Character Number, we can get the character from the string by using the charAt() method of the String class. This method will give us the character at that position. To get the ASCII character code number for the character, simply store the character into a variable of type int.

To get the Decimal Digit value (zero through nine) for a particular Decimal Digit Position from the ASCII character code number, we should write a method with the following method description:

public static int getDecimalDigit (int asciiCode, int DecimalDigitPosition)
{
  // This method should divide the asciiCode by 10 as many times
  //   as specified by the DecimalDigitPosition,
  // then use the modulus operator % to find the remainder
  //   of a division by 10.
  // This remainder value is the Decimal Digit that we want returned
  //   by the method.
}
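
One possible way to follow that description is sketched below. It is not the required solution, only a minimal reading of the comments above; the wrapping class name DecimalDigitSketch is hypothetical.

```java
class DecimalDigitSketch {
    // Returns the decimal digit of asciiCode at the given position
    // (0 = one's digit, 1 = ten's digit, 2 = hundred's digit).
    public static int getDecimalDigit(int asciiCode, int DecimalDigitPosition) {
        int value = asciiCode;
        // Divide by 10 once for each position we need to skip.
        for (int i = 0; i < DecimalDigitPosition; i++) {
            value = value / 10;
        }
        // The remainder of a division by 10 is the digit we want.
        return value % 10;
    }

    public static void main(String[] args) {
        // The three digits of 'W' (ascii 87), from position 0 to position 2.
        for (int position = 0; position < 3; position++) {
            System.out.println(getDecimalDigit(87, position));
        }
    }
}
```

For 'W' (87) this yields the digits 7, 8 and 0 for positions 0, 1 and 2, which agrees with the first three rows of the table.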

Special Cases

There are 2 special cases that we need to take care of. These occur when the adjusted/encoded amplitude value ends up above the maximum value of +32,767, or below the minimum value of -32,768.

According to the mathematics, there are only two encoded amplitude values above +32,767 that could exist:

  • +32,768
  • +32,769
In both of these cases, simply subtracting 10 to get results of +32,758 and +32,759 is the best solution. The decimal value gets encoded properly and we don't lose too much sound quality.

Similarly, there is only one possible encoded amplitude value below -32,768:

  • -32,769
In this case, simply adding 10 to get the result of -32,759 is the best solution. Again, the decimal value gets properly encoded and we don't lose too much sound quality.
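
The two special cases can be folded into the encoding step as in the following sketch. The names RangeFixDemo and encodeSample are hypothetical, plain ints stand in for sample values, and an amplitude of exactly zero is assumed to be handled like a positive value.

```java
class RangeFixDemo {
    static final int MAX_AMPLITUDE = 32767;
    static final int MIN_AMPLITUDE = -32768;

    // Hypothetical helper: encodes one decimal digit (0 through 9)
    // into an amplitude value and keeps the result in range.
    static int encodeSample(int original, int digit) {
        int smoothed = original - (original % 10);
        int encoded = (original >= 0) ? smoothed + digit : smoothed - digit;
        if (encoded > MAX_AMPLITUDE) {
            encoded = encoded - 10;   // +32,768 or +32,769 becomes +32,758 or +32,759
        } else if (encoded < MIN_AMPLITUDE) {
            encoded = encoded + 10;   // -32,769 becomes -32,759
        }
        return encoded;
    }

    public static void main(String[] args) {
        System.out.println(encodeSample(32767, 8));
        System.out.println(encodeSample(32767, 9));
        System.out.println(encodeSample(-32768, 9));
    }
}
```

In each adjusted case the one's digit is unchanged, so the hidden digit can still be recovered.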

Extracting the Text Message

To extract the text message from a sound file, simply determine the one's digit of a sound amplitude value. This will correspond to one of the three decimal digits for each character of the text message.

For example, assume we needed to get the character y. The 3 Decimal Digits and the corresponding Decimal Digit Positions would be:

ASCII Character | Decimal Value | Decimal Digit | Decimal Digit Position
y               | 121           | 1             | 0
                |               | 2             | 1
                |               | 1             | 2

First, we initialize the asciiCode to the Decimal Digit at DecimalDigitPosition 0.
- This would set the asciiCode to 1.

We then multiply the Decimal Digit at DecimalDigitPosition 1 by 10 and add it to the asciiCode value.
- This would add the value of 20 (2 * 10) to the asciiCode of 1, which brings the asciiCode to 21.

We then multiply the Decimal Digit at DecimalDigitPosition 2 by 100 and add it to the asciiCode value.
- This would add the value of 100 (1 * 100) to the asciiCode of 21 to give a new value of 121.

The final value of asciiCode is 121, which is the proper ASCII code value for the character y.
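
The three steps above translate almost directly into code. In this sketch the class name RebuildCodeDemo and the method name rebuildAsciiCode are hypothetical.

```java
class RebuildCodeDemo {
    // Hypothetical helper: combines the digits at positions 0, 1 and 2
    // back into a single ASCII code value.
    static int rebuildAsciiCode(int digit0, int digit1, int digit2) {
        int asciiCode = digit0;               // position 0: one's digit
        asciiCode = asciiCode + digit1 * 10;  // position 1: ten's digit
        asciiCode = asciiCode + digit2 * 100; // position 2: hundred's digit
        return asciiCode;
    }

    public static void main(String[] args) {
        int asciiCode = rebuildAsciiCode(1, 2, 1);
        // Casting the int back to char gives the character itself.
        System.out.println(asciiCode + " " + (char) asciiCode);
    }
}
```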

We will assume that the number of SoundSamples greatly outnumbers the number of decimal digits that need to be hidden/extracted. Thus there will be a large number of SoundSamples whose one's digit is still zero from the smoothing. If, while doing the extraction, we encounter a character that was created from 3 sound samples that all have a one's digit of zero, we have extracted ASCII code 0, the NULL character. When this occurs, we will stop the extraction process, display the string that has been created, and then exit the program.
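
A minimal sketch of this extraction loop follows. It assumes a plain int array in place of the samples of the sound object, and the names ExtractDemo and extract are hypothetical. Math.abs is used because the one's digit of a negative amplitude comes back negative from the % operator in Java.

```java
class ExtractDemo {
    // Hypothetical helper: rebuilds the hidden string from amplitude values,
    // stopping at the first character whose three digits are all zero.
    static String extract(int[] samples) {
        StringBuilder message = new StringBuilder();
        for (int i = 0; i + 2 < samples.length; i += 3) {
            int asciiCode = Math.abs(samples[i] % 10)             // position 0
                          + Math.abs(samples[i + 1] % 10) * 10    // position 1
                          + Math.abs(samples[i + 2] % 10) * 100;  // position 2
            if (asciiCode == 0) {
                break;   // NULL character: end of the hidden message
            }
            message.append((char) asciiCode);
        }
        return message.toString();
    }

    public static void main(String[] args) {
        // Final sample values from the table, followed by smoothed samples.
        int[] samples = {
            57, 38, 10, 11, 0, 1, -18, 0, 11, 29, 69, 50,
            41, 41, 41, 79, 70, 61, 41, 30, 31, 23, 33, 30,
            120, -40, 0
        };
        System.out.println(extract(samples));
    }
}
```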

Project Input and Output

To determine which portion of the program the user wants you to perform, use the SimpleInput.getIntNumber() method.

For the encoding portion of the program, you are to prompt the user for a sound file to use and also prompt the user for the string to hide in the sound file. After the string is hidden in the sound, you are to prompt the user for a file to save the sound. You may use the SimpleInput.getString() method to get the string to hide.

For the extracting portion of the program, you are to prompt the user for a sound file to use. You are to display the extracted string using one of the methods from the SimpleOutput class.

Program Writing Style

Your program must be written using good programming style which includes:
  • Use of multiple methods
  • Good variable names
  • In-line commenting
  • Proper indentation of program statements
  • Use of blank lines to separate blocks of code.
  • Header block commenting for the program and each method written

    Your header block comment for the program must include the following:

    • Your Name,
    • Net-ID,
    • Course Name,
    • Assignment Name and
    • Day and time of your CS 111 lab section (e.g. Wednesday at 9:00)
    • A short description of the assignment.
Header block comments for each method must include the following:
    • A description of the purpose of the method
    • A listing of the name, type and purpose of every parameter
    • A description of the return value and its type

Project Collaboration

You are allowed to receive help on this project from other students who are also taking CS 111. Each student must still complete and submit his/her own project. You will be required to include a Collaboration Statement somewhere on your project if you receive help. This statement can simply be something like the following:
For this project, I received help from the following member of CS 111.
This statement should list each helping student's name in a comment in the "header comment" of your Java file that includes the main() method.

Submission of your Project

Submit your code via Blackboard.

Modified Sound Values

The following sound files show some of the modifications done by this project.

The first file is the original Gettysburg Address sound file that is 10 seconds long. The second shows the result of "smoothing" the smallest digit from each of the amplitude values. The third shows the final sound after a message has been encoded into the amplitude values.

-- Main.troy - 2014-04-15
