Introduction to C / C++ Programming
Basic Data Types

Note: Please review the language-independent Programming Concepts Data Types notes before this page.

This page of notes covers only the C/C++ language-specific issues regarding data types. Please follow the link above to the language-independent concepts, which provide important background information.

Data Types

C/C++ has 5 basic types of data, with specific sub-types as follows:

Integral types
- Integral types are used for whole numbers without fractions, such as counting the number of repetitions of a loop or the number of students in a class.
- "int" is the most commonly used integral type. Its size is chosen by the compiler: the language guarantees only that it has at least 16 bits, and it is 32 bits on most modern PCs, including 64-bit ones.
- "long int" and "short int", ( also known as just "long" and "short" ) are variations of int that may contain more or fewer bits than an ordinary int.
  - Longs may be able to handle larger values than regular ints, at the expense of more storage space required.
  - Shorts may save on storage space, at the expense of being able to handle only smaller integers.
- The "char" type ( see below ) is technically a small integer type, 8 bits wide on virtually all modern systems. Chars are normally used for storing character codes, but can hold any integer within their range.
- Integral types are normally signed numbers, but they may also be designated as "unsigned".
  - Unsigned integers cannot store any negative numbers, but their range of positive numbers is double that of the equivalent signed versions.
  - ( The total number of bits used is the same. Signed numbers use roughly half their possible bit combinations for representing negative numbers, and the other half for positive numbers, whereas unsigned numbers use all possible bit combinations for non-negative values. )
  - For example, signed 16-bit integers can range from -32,768 to +32,767, while the unsigned version can range from 0 to 65,535.
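
As a rough illustration of the points above, the following sketch prints the size and range that one particular compiler gives a few integral types. The function name printIntegralRanges is made up for this example, and the numbers printed will differ from system to system:

```cpp
#include <cassert>
#include <climits>    // CHAR_BIT: the number of bits in a char
#include <iostream>
#include <limits>     // std::numeric_limits

// Hypothetical helper: prints the size and range of a few integral types.
void printIntegralRanges() {
    std::cout << "short:          " << sizeof( short ) * CHAR_BIT << " bits, "
              << std::numeric_limits<short>::min() << " to "
              << std::numeric_limits<short>::max() << '\n';
    std::cout << "unsigned short: " << sizeof( unsigned short ) * CHAR_BIT << " bits, "
              << std::numeric_limits<unsigned short>::min() << " to "
              << std::numeric_limits<unsigned short>::max() << '\n';
    std::cout << "int:            " << sizeof( int ) * CHAR_BIT << " bits, "
              << std::numeric_limits<int>::min() << " to "
              << std::numeric_limits<int>::max() << '\n';
    std::cout << "long:           " << sizeof( long ) * CHAR_BIT << " bits, "
              << std::numeric_limits<long>::min() << " to "
              << std::numeric_limits<long>::max() << '\n';

    // These relationships hold on every conforming compiler.
    assert( sizeof( short ) <= sizeof( int ) );
    assert( sizeof( int ) <= sizeof( long ) );
}
```

Calling printIntegralRanges() from main shows whether "long" is actually larger than "int" on a given machine, which is not guaranteed.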
Floating-point types
- Floating-point data types represent numbers that may have fractional components, such as temperatures or grade averages.
  - "float" is the most basic floating point type, and generally follows the IEEE standard for floating point numbers.
  - Floating point numbers use some of their bits to store an exponent, which leaves fewer bits to store the actual digits of the number.
  - This gives them a much broader range of possible values than integer types ( for the same number of bits ), but less precision, since fewer bits are available to represent the digits of the numbers.
  - The data type "double" uses twice as many bits as the ordinary float, which gives it both a broader range and more precision.
    - double is the most commonly used floating point type for scientific and engineering computations.
  - "long double" uses at least as many bits as double, and is used when very large ranges or very precise values are needed. Its actual size depends on the compiler and hardware ( commonly 64, 80, or 128 bits ).
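
The trade-off between range and precision can be inspected directly. The sketch below is only an illustration, and the helper name printFloatingPointLimits is an assumption of this example. It prints how many decimal digits each type can reliably hold and its largest value, then stores the same constant in a float and a double so the difference in precision is visible:

```cpp
#include <cassert>
#include <iomanip>    // std::setprecision
#include <iostream>
#include <limits>     // std::numeric_limits

// Hypothetical helper: prints the precision and largest value of each type.
void printFloatingPointLimits() {
    std::cout << "float:       " << std::numeric_limits<float>::digits10
              << " reliable decimal digits, max "
              << std::numeric_limits<float>::max() << '\n';
    std::cout << "double:      " << std::numeric_limits<double>::digits10
              << " reliable decimal digits, max "
              << std::numeric_limits<double>::max() << '\n';
    std::cout << "long double: " << std::numeric_limits<long double>::digits10
              << " reliable decimal digits, max "
              << std::numeric_limits<long double>::max() << '\n';

    // The same constant stored with different amounts of precision.
    float  asFloat  = 0.1f;
    double asDouble = 0.1;
    std::cout << std::setprecision( 17 )
              << "0.1 as float:  " << asFloat << '\n'
              << "0.1 as double: " << asDouble << '\n';

    // A double is never less precise than a float.
    assert( std::numeric_limits<float>::digits10
            <= std::numeric_limits<double>::digits10 );
}
```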
Character types ( for single characters )
- The "char" data type is basically a small integer ( 8 bits on virtually all systems ), and can be used to store small numbers. ( signed or unsigned. )
- Most commonly the char type is used to store ASCII codes representing character data.
- For example, the ASCII code for the capital letter 'A' is 65.
- Note that the character '9' is not the same as the integer value 9.
  - In this case the ASCII code for the character '9' is 57.
  - The numerical value 9 in the ASCII code set happens to represent a horizontal tab character.
- The full ASCII code table can be found in the back of most programming textbooks, or online at http://www.asciitable.com/ and many other sites.
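
The difference between a character and its numeric code can be checked with a few lines of code. This is a minimal sketch that assumes the ASCII character set, as the notes above do. The function names codeOf, digitValue, and printCharCodes are invented for this example:

```cpp
#include <cassert>
#include <iostream>

// Returns the integer code stored in a char.
int codeOf( char c ) {
    return static_cast<int>( c );
}

// Converts a digit character such as '9' to the number it represents.
int digitValue( char c ) {
    return c - '0';    // digit codes are consecutive, so subtracting '0' works
}

// Hypothetical demo: prints two characters alongside their codes.
void printCharCodes() {
    char letter = 'A';
    char digit  = '9';
    std::cout << letter << " has code " << codeOf( letter ) << '\n';
    std::cout << digit << " has code " << codeOf( digit )
              << " and digit value " << digitValue( digit ) << '\n';
    assert( digit != 9 );    // the character '9' is not the integer 9
}
```

Subtracting '0' is the usual way to turn a single digit character into the number it represents.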
Character strings
- Both C and C++ can represent constant strings using double quotes, such as "Hello World".
- The C language ( as opposed to C++ ) uses arrays of single characters to store character strings as variables.
  - We'll skip that for now.
- C++ stores character strings in variables of type "string" ( std::string, from the <string> header ), which are actually C++ objects.
  - These will be easier to deal with for simple programs.
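
A short sketch of the C++ approach follows. The function names makeGreeting and printGreeting are assumptions of this example, not part of any library:

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Builds a greeting by joining a constant string and a string variable.
std::string makeGreeting( const std::string& name ) {
    std::string greeting = "Hello, ";    // initialized from a constant string
    greeting += name;                    // += appends to the end of a string
    return greeting;
}

// Hypothetical demo: prints the greeting and its length.
void printGreeting() {
    std::string message = makeGreeting( "World" );
    std::cout << message << " ( " << message.length() << " characters )\n";
    assert( !message.empty() );
}
```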
Boolean data
- Boolean values are used for logic and decision making.
- They may have the values of "true" or "false".
- Boolean data may also be represented and/or printed using the values of 0 for false and 1 for true.
- Boolean data will be discussed more fully in the section on logic and decision making.
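
The two ways of printing Boolean data can be sketched as follows. The standard manipulator std::boolalpha switches a stream from printing 1 and 0 to printing the words. The helper names asNumber, asWord, and printBoolDemo are made up for this example:

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Formats a bool the default way, as 1 or 0.
std::string asNumber( bool value ) {
    std::ostringstream out;
    out << value;
    return out.str();
}

// Formats a bool as the word "true" or "false".
std::string asWord( bool value ) {
    std::ostringstream out;
    out << std::boolalpha << value;
    return out.str();
}

// Hypothetical demo: stores the result of a comparison and prints it both ways.
void printBoolDemo() {
    bool passed = ( 90 > 60 );
    std::cout << asNumber( passed ) << " " << asWord( passed ) << '\n';
    assert( passed == true );
}
```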

Constants

Constants are specific numbers ( or characters or strings ) written directly in the program, whose values cannot be changed.
Integers may be represented in either decimal, octal, or hexadecimal formats.
- Integers may be followed by an "L" to indicate a long int or a "U" to indicate an unsigned int, or both.
- Allowable formats are as follows, where the [ square brackets ] denote optional characters:
```
Decimal:         [±]1-9[0-9...][Ll][Uu]
Octal:           [±]0[0-7...][Ll][Uu]
Hexadecimal:     [±]0x[0-9a-fA-F...][Ll][Uu]
        
```
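
As a small illustration of these formats, the sketch below writes the value 42 in each base and with each suffix. The constant names and the function printIntegerConstants are assumptions of this example:

```cpp
#include <cassert>
#include <iostream>

const int           DECIMAL_FORM  = 42;      // decimal
const int           OCTAL_FORM    = 052;     // leading 0 means octal: 5*8 + 2
const int           HEX_FORM      = 0x2A;    // leading 0x means hexadecimal: 2*16 + 10
const long          LONG_FORM     = 42L;     // L suffix: long int
const unsigned long UNSIGNED_LONG = 42UL;    // both suffixes: unsigned long int

// Hypothetical demo: all five constants hold the same value.
void printIntegerConstants() {
    std::cout << DECIMAL_FORM << ' ' << OCTAL_FORM << ' ' << HEX_FORM << ' '
              << LONG_FORM << ' ' << UNSIGNED_LONG << '\n';
    assert( 010 == 8 );    // a leading zero changes the value
}
```

A common mistake is padding a decimal number with a leading zero, which silently turns it into an octal constant.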
Floating-point constants are normally indicated by the presence of a decimal point, and are normally doubles.
- Floating point constants may be followed by either an "F" to indicate an ordinary float, or an "L" to indicate a long double.
- Floating point constants can also be expressed in scientific notation.
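- Allowable formats are as follows. Note that a constant needs a decimal point or an exponent to be floating-point, so an integer followed only by an "F", such as 37F, is not legal:
```
[±]1-9[0-9...].[0-9...][Ee[±]0-9...][FfLl]
[±][0].[0-9...][Ee[±]0-9...][FfLl]
[±]1-9[0-9...]Ee[±]0-9...[FfLl]
```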
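
The following sketch shows a few floating-point constants and checks the type each suffix produces. The constant names and the function printFloatConstants are invented for this example:

```cpp
#include <cassert>
#include <iostream>

const double      BODY_TEMP   = 98.6;      // no suffix: double
const float       BODY_TEMP_F = 98.6f;     // F suffix: float
const long double BODY_TEMP_L = 98.6L;     // L suffix: long double
const double      SCIENTIFIC  = 4.86e3;    // 4.86 x 10^3 = 4860
const double      TINY        = 2.5e-3;    // 2.5 x 10^-3 = 0.0025

// Hypothetical demo: prints the constants and checks their sizes.
void printFloatConstants() {
    std::cout << BODY_TEMP << ' ' << BODY_TEMP_F << ' ' << BODY_TEMP_L << ' '
              << SCIENTIFIC << ' ' << TINY << '\n';
    assert( sizeof( 98.6 )  == sizeof( double ) );
    assert( sizeof( 98.6f ) == sizeof( float ) );
}
```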
Single characters are enclosed with single quote marks, such as 'A'.
Character strings are enclosed with double quote marks, such as "Please enter the first number > ".
Special characters may be used either as single characters or within a string:
- \n - New line character.
- \t - Horizontal tab.
- \a - Bell ( alert )
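
As a brief illustration, the sketch below uses the new line and tab characters inside a string to lay out two lines of output. The names makeReport and printReport, and the sample data, are assumptions of this example:

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Builds two lines of text, each with a tab between the label and the value.
std::string makeReport() {
    return "Name:\tAda\nScore:\t95\n";
}

// Hypothetical demo: prints the report.
void printReport() {
    std::cout << makeReport();
    // Each special character counts as a single character in the string.
    assert( makeReport().length() == 20 );
}
```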

Constants Exercise

For each of the constants in the following table, indicate whether the constant is legal or illegal, what type of constant it is if legal, and why illegal otherwise:

Constant	Legal?	Explanation
486
98.6
98.6f
02.479
0.2479
"A"
'A'
'ABC'
'\n'
000042
0ffH
0xC2F9
+37.1
.0000
0xFLU
0x48.6
0x486e1
486e1
0486e1
0486

Variables

A variable is a named storage location, where data may be stored and later changed.
An identifier is a more general term for a named location, which may contain either data or code.
- Identifiers must begin with a letter or an underscore, preferably a letter for user programs.
- The remaining characters must be either alphanumeric or underscores.
- Identifiers may be of any length, but only a limited number of leading characters are guaranteed to be examined ( as few as 31 in some older C implementations ).
- Identifiers are case sensitive, so "NUMBER", "number", and "Number" are three different identifiers.
- By convention ordinary variables begin with a lower case letter, globals with a Single Capital, and constants in ALL CAPS.
  - Multi-word variables may use either underscores or "camel case", such as "new_value" or "newValue".
- A convention inherited from Fortran, followed by some programmers, is to give integer variables names beginning with the letters I, J, K, L, M, or N ( such as the loop counters i, j, and k ), and floating point variables names beginning with other letters.
- Identifiers may not be the same as reserved words. ( See text for a full list. )
All variables must be declared before they can be used.
- In older versions of C ( before C99 ), all variables must be declared at the beginning of a block, before the first executable statement in that block.
- C++ allows variables to be declared any time before they are used, but it is still normally good practice to declare all variables at the beginning of the program, unless there is a very good reason to do otherwise.
  - ( Exceptions: Loop counter variables are often declared as part of the loop structure. Occasionally it is beneficial to declare variables within a reduced scope, to be discussed later. )
Variables may be given an initial value at the time they are declared. This is called "initialization", or "initializing the variables".
- In C++, initialization may be performed using either parentheses or equals signs. ( C allows only the equals sign form. )
- Example: double x1( 0.0 ), x2 = 0.0;
- UNINITIALIZED VARIABLES ARE DANGEROUS, AND SHOULD BE CONSIDERED TO HOLD RANDOM VALUES.
Variables may be declared "const", meaning that their values cannot be changed.
- const variables MUST be initialized at the time they are declared.
- By convention, const variables are named using ALL CAPS.
- Examples:
  - const double PI = 3.14159;
  - const int MAXROWS = 100;
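
Declaration, initialization, and const can be seen together in the following sketch. The functions circleArea and printCircleDemo are made up for this example. PI and MAXROWS are the constants from the examples above:

```cpp
#include <cassert>
#include <iostream>

const double PI      = 3.14159;    // must be initialized when declared
const int    MAXROWS = 100;

// Computes the area of a circle of the given radius.
double circleArea( double radius ) {
    double area( 0.0 );                        // initialized with parentheses
    double radiusSquared = radius * radius;    // initialized with an equals sign
    area = PI * radiusSquared;                 // ordinary variables may be changed
    // PI = 3.0;                               // would not compile: PI is const
    return area;
}

// Hypothetical demo: prints the area of a circle of radius 2.
void printCircleDemo() {
    std::cout << "Area: " << circleArea( 2.0 ) << '\n';
    assert( MAXROWS == 100 );
}
```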

Type Conversions

There are certain cases in which data will get automatically converted from one type to another:
- When data is being stored in a variable, if the data being stored does not match the type of the variable.
  - The data being stored will be converted to match the type of the storage variable.
- When an operation is being performed on data of two different types.
  - The "smaller" data type will be converted to match the "larger" type.
    - For example, when an int is added to a double, the computer uses a double version of the int and the result is a double.
- When data is passed to or returned from functions.
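
The first two cases can be sketched as follows. The function names storeInInt, addMixed, and printConversions are assumptions of this example. Note that storing a double in an int discards the fractional part rather than rounding:

```cpp
#include <cassert>
#include <iostream>

// Storing a double in an int variable converts it to an int.
int storeInInt( double value ) {
    int whole = value;    // automatic conversion: the fraction is dropped
    return whole;
}

// Adding an int to a double: the int is converted and the result is a double.
double addMixed( int count, double amount ) {
    return count + amount;
}

// Hypothetical demo: prints the result of each conversion.
void printConversions() {
    std::cout << storeInInt( 3.99 ) << ' ' << addMixed( 2, 0.5 ) << '\n';
    assert( storeInInt( 3.99 ) == 3 );    // truncated, not rounded
}
```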
Data may also be expressly converted, using the typecast operator:
- The following example converts the value of total to a double precision value before performing the division:
```
      average = ( double ) total / nStudents;
```
- Note that total itself is unaffected by this conversion.

The following syntax is also legal in C++, but not in C:
```
      average = double( total ) / nStudents;
```
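
The reason for the cast is easier to see when the two versions are run side by side. In this sketch the function names are invented for the example. The third version uses static_cast, which is the form generally preferred in modern C++:

```cpp
#include <cassert>
#include <iostream>

// Without a cast, the division is done on integers and the fraction is lost.
double averageWithoutCast( int total, int nStudents ) {
    return total / nStudents;
}

// With the cast, total is converted first, so the division is done on doubles.
double averageWithCast( int total, int nStudents ) {
    return ( double ) total / nStudents;
}

// The same conversion written with the C++ static_cast operator.
double averageWithStaticCast( int total, int nStudents ) {
    return static_cast<double>( total ) / nStudents;
}

// Hypothetical demo: a total of 7 shared among 2 students.
void printAverages() {
    std::cout << averageWithoutCast( 7, 2 ) << ' '
              << averageWithCast( 7, 2 ) << '\n';
    assert( averageWithCast( 7, 2 ) == averageWithStaticCast( 7, 2 ) );
}
```

With a total of 7 and 2 students, the version without the cast yields 3 while the versions with the cast yield 3.5.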

Enumerated Types ( Advanced, Optional )

Enumerated ( enum ) data types are basically ints, except that they are restricted to a limited set of values, and those values are referred to by name not by number.
The use of enums where applicable helps make code more readable and also limits the possibilities for bad values, thereby reducing bugs and making the code more maintainable.
The enum keyword is used to define a new data type, having a new data type name and list of acceptable named values.
Once the new enum type has been declared, variables can be declared of the new type, and assigned the named values.

For example:

     enum SizeType { small, medium, large };  // Declares a new data type, "SizeType"
     
     SizeType item = medium;    // Declares and initializes a variable of type "SizeType"
   
     // ( Some code left out here. )
     
     if( num < 25 )
         item = small;  // Use as an int, using the named values instead of numbers

     cout << "\nThe item is ";
   
     switch( item ) {
   
         case small:              // Named values are valid integers
             cout << "tiny\n";
             break;

         // ( Remaining cases left out here. )
     }

Named values can be assigned specific numbers. Those not assigned will get successive values. So in the following example, minor2 will have the value 2 and major2 will have the value 101:

     enum errorType { none = 0, minor1 = 1, minor2, major1 = 100, major2, fatal1 = 1000 };
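
The numbering rule can be confirmed with a short check. This sketch reuses the errorType declaration above. The function name printErrorValues is an assumption of this example:

```cpp
#include <cassert>
#include <iostream>

enum errorType { none = 0, minor1 = 1, minor2, major1 = 100, major2, fatal1 = 1000 };

// Hypothetical demo: prints the values given to the unnumbered names.
void printErrorValues() {
    std::cout << "minor2 = " << minor2 << ", major2 = " << major2 << '\n';
    assert( minor2 == 2 );      // one more than minor1
    assert( major2 == 101 );    // one more than major1
}
```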

Enumerated type variables can also be initialized. For example:

     errorType errorCode = none;
     SizeType bookSize = large;

It is sometimes a good idea to include values such as "invalid", "undefined" or "none" among the list of enumerated values.
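C ( and some lenient C++ compilers ) allow ordinary integers to be used with enum variables, ( e.g. using numbers instead of names ), but standard C++ requires an explicit cast to assign an integer to an enum variable, and it is poor practice in any case.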
Printing enumerated variables prints the assigned integer value.
One should not attempt to do any math using enumerated variables.
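
Putting these points together, a complete version of the SizeType example might look like the following sketch. Only the "num < 25" test and the word "tiny" come from the example above. The second threshold, the other descriptions, and the function names classify, describe, and printItem are assumptions made for illustration:

```cpp
#include <cassert>
#include <iostream>
#include <string>

enum SizeType { small, medium, large };

// Chooses a size for a number. The threshold of 100 is hypothetical.
SizeType classify( int num ) {
    if( num < 25 )
        return small;
    if( num < 100 )
        return medium;
    return large;
}

// Returns a word describing each named value.
std::string describe( SizeType item ) {
    switch( item ) {
        case small:  return "tiny";
        case medium: return "average";
        case large:  return "huge";
    }
    return "unknown";    // reached only if item holds a bad value
}

// Hypothetical demo: prints the description and the underlying integer.
void printItem( int num ) {
    SizeType item = classify( num );
    std::cout << "The item is " << describe( item )
              << " ( stored as " << item << " )\n";
    assert( item >= small && item <= large );
}
```

Printing item directly shows the integer value, which is why a function like describe is needed to print the name.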
