Networking and Sockets

1. Networking basics

1.1 Hosts and ports

To use the services of a host on the network, a program needs to identify that computer.  Internet hosts are identified by an IP address, which in IP version 4 consists of four 8-bit unsigned numbers (called octets), usually given in decimal separated by periods.  (IP stands for Internet Protocol.)  Because numeric addresses are difficult for people to remember, the internet Domain Name System (DNS) associates host names (e.g., ernie.cs.uic.edu) with IP addresses (e.g., 131.193.40.40).

The class java.net.InetAddress encapsulates an internet host’s name and IP address.  InetAddress has no public constructors: you obtain an instance with the class methods getAllByName or getByName, which contact the internet domain name services to resolve a host name to its IP address or vice versa.  The argument is a string giving either an internet name or an IP address.  (getAllByName is used when a host has multiple IP addresses, e.g. because it is connected to multiple networks, and it returns InetAddress[].)  For example:
   InetAddress rtfm = InetAddress.getByName("rtfm.mit.edu");
   InetAddress tigger = InetAddress.getByName("128.248.155.51");

getAllByName and getByName signal an UnknownHostException if the host name cannot be resolved or a SecurityException if the security manager does not allow the program to resolve that host name.  The class method getLocalHost returns an InetAddress representing the machine on which the virtual machine is running.  The accessors getHostName and getHostAddress return the internet name and IP address, respectively.  The class InetAddress also defines equals, hashCode, and toString.
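The following is a minimal sketch of these calls, not a definitive usage pattern.  The class name InetAddressDemo and the helper resolve are our own names for illustration.  The sketch uses the loopback literal 127.0.0.1 and the name localhost so that it does not depend on a remote name server.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class InetAddressDemo {
    // Hypothetical helper: resolves a host name or dotted-decimal literal and
    // returns the textual IP address, or null if it cannot be resolved.
    static String resolve(String nameOrLiteral) {
        try {
            InetAddress address = InetAddress.getByName(nameOrLiteral);
            return address.getHostAddress();
        } catch (UnknownHostException ex) {
            return null;
        }
    }

    public static void main(String[] args) {
        // An IP literal is only checked for validity; no name lookup is needed.
        System.out.println(resolve("127.0.0.1"));
        try {
            // A host may have several addresses; getAllByName returns all of them.
            for (InetAddress address : InetAddress.getAllByName("localhost")) {
                System.out.println(address.getHostName() + " -> " + address.getHostAddress());
            }
            System.out.println("local host: " + InetAddress.getLocalHost());
        } catch (UnknownHostException ex) {
            System.out.println("Lookup failed: " + ex.getMessage());
        }
    }
}
```

Returning null for an unresolvable name is a choice made for this sketch; a real program might prefer to let the UnknownHostException propagate.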

Typically, a computer has a single physical connection to the network through which data destined for that machine arrives.  Since data arriving over that connection may be intended for various applications, the operating system needs a way to route data to the correct process.  A port is a virtual address on an internet host identified by a 16-bit integer.  (It is not the same as a hardware port.)  Ports allow a host to provide multiple services that are available remotely.  Well-known ports are port numbers that are assigned by convention for specific services.  Examples include port 21 for FTP servers, port 23 for telnet, and port 25 for SMTP (simple mail transfer protocol).  All well-known ports have numbers less than 1024.  Data transmitted over the Internet includes addressing information that identifies the host by its 32-bit IP address and the port by its number.
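To make the sizes concrete, the following sketch packs the four octets of an IPv4 address into a single 32-bit value and checks port numbers against the 16-bit range.  The class AddressBits and its method names are hypothetical names chosen for this illustration; only InetAddress.getByName and getAddress come from java.net.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class AddressBits {
    // Packs the four octets of a dotted-decimal IPv4 address into one 32-bit
    // value.  A long is used so that the result is never negative.
    static long toUnsigned32(String dottedDecimal) {
        byte[] octets;
        try {
            octets = InetAddress.getByName(dottedDecimal).getAddress();
        } catch (UnknownHostException ex) {
            throw new IllegalArgumentException("not a valid address: " + dottedDecimal, ex);
        }
        if (octets.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + dottedDecimal);
        }
        long value = 0;
        for (byte octet : octets) {
            // Java bytes are signed, so mask to recover the unsigned octet.
            value = (value << 8) | (octet & 0xFF);
        }
        return value;
    }

    // A port number must fit in 16 bits.
    static boolean isValidPort(int port) {
        return port >= 0 && port <= 0xFFFF;
    }

    // Well-known ports are those below 1024.
    static boolean isWellKnownPort(int port) {
        return port >= 0 && port < 1024;
    }

    public static void main(String[] args) {
        System.out.println(toUnsigned32("131.193.40.40"));
        System.out.println(isWellKnownPort(25));
        System.out.println(isValidPort(65536));
    }
}
```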

1.2 Protocols

TCP (Transmission Control Protocol) is a connection-oriented protocol built on top of IP which establishes a reliable point-to-point connection between the source and destination that remains in effect until closed.  It ensures that all the packets in a message are delivered and are received in the correct order.  TCP error-detection and error-correction features allow the source and destination to treat the connection as a byte stream.  For example, FTP, SMTP, NNTP (net news), and HTTP are built on top of TCP.  (DNS can also use TCP, although most DNS queries are sent over UDP.)

UDP (User Datagram Protocol) is a connectionless protocol built on top of IP that does not establish a link between the source and destination, and does not guarantee the order of the packets in a message or even whether individual packets are delivered.  If a message consists of several packets, the recipient must reconstruct the message.  The advantage of UDP is that it requires much less overhead than TCP because there is no hand-shaking, no retry if an acknowledgment isn't received, no buffering and numbering of packets, etc.  Connectionless protocols typically are used either for one-packet messages for which delivery is not crucial, e.g. responses to time requests, or to reduce the transmission overhead for time-critical data that can be lossy, such as streaming audio or continuous measurements.

The package java.net includes classes for both connection-oriented and connectionless communication.  We will not cover connectionless communication in detail.

1.3 Sockets

Sockets are an interprocess communication mechanism originally defined in BSD 4.x UNIX that support communication between processes on different computers.  A socket is one endpoint of a two-way communication link between two processes running on a network.  Sockets are used in pairs, one for each of the communicating processes.  A socket address consists of an IP address and a port number so that the TCP layer can identify which process should receive the data.

Client-server socket usage is not symmetric: the server is already running and the client requests a connection to it.  Typically, IPC systems define two types of sockets: a server socket for handling connection requests and a client socket for transmitting and receiving data.  A server socket provides a network-accessible service which “listens” for requests on a particular port number.  The client's operating system usually assigns the port number for a client socket, which the server receives when its operating system recognizes and establishes a connection from the client.  When the operating system on the server machine recognizes a message for its IP number, it resumes the server process associated with that port (if necessary), and begins a session (for a connection-oriented protocol), which creates a client socket for the server to use to send and receive messages to and from the client's socket.  Usually, the server process begins a new process (in Java, a new thread) to handle that connection so that it can continue to service other clients (which is why the operating system creates a new socket for the connection on the server side).  The session continues until the server or client closes the connection.

UNIX uses the same read and write system calls for sockets that it uses with files and pipes.  As with those streams, these calls are synchronous: read blocks if nothing has been received from the other socket and continues when bytes are received, and write blocks if the recipient's buffer is full and continues when it is not.

Sockets are far simpler to code in Java because the classes encapsulate many of the options that must be specified and the function calls and structures that are needed when coding in C.  Furthermore, the C system calls are not safe and are difficult to debug because they include several enum and void* arguments.

2.  Networking in Java

2.1 The package java.net

The basic classes for implementing networked applications are defined in the package java.net.  They include InetAddress, URL, Socket, ServerSocket, DatagramSocket, and DatagramPacket.
Java 1.4 adds several classes such as Inet6Address, URI, and InetSocketAddress.  We covered InetAddress above, and will discuss the socket classes in the next four sections.

A URL consists of a protocol, a host name, a port, a path (usually to a file), and a reference (sometimes called a "fragment").  For example, for the URL
http://www.cs.uic.edu:80/~i441/index.html#sockets, http is the protocol, www.cs.uic.edu is the host, 80 is the port number, /~i441/index.html is the path, and sockets is the reference.  The URL constructor is overloaded: it takes a string giving the whole URL; three strings for the protocol, host, and file; strings for the protocol and host, an int port, and a string for the file; or a URL and a string to be combined with it.  The class defines the accessors getProtocol, getHost, getFile, getPort, and getRef for the components of the URL.  The method getContent attempts to determine the type of the URL's data (e.g., from the filename extension or a "magic number") and returns the type Object, e.g. a String for text or an ImageProducer for an image.  The method openStream returns an input stream from which the contents of the URL can be read.  The class URL also defines equals, hashCode, and toString.
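As an illustration of the accessors, the following sketch breaks the example URL into its components without contacting the host.  The class UrlParts and the helper parse are hypothetical names.  Because the URL constructors are deprecated in recent Java releases, the sketch builds the URL from a java.net.URI; on older releases, new URL(text) would serve the same purpose.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

class UrlParts {
    // Hypothetical helper: builds a URL from text, reporting a bad URL as an
    // unchecked exception to keep the example short.
    static URL parse(String text) {
        try {
            return URI.create(text).toURL();
        } catch (MalformedURLException ex) {
            throw new IllegalArgumentException("bad URL: " + text, ex);
        }
    }

    public static void main(String[] args) {
        URL url = parse("http://www.cs.uic.edu:80/~i441/index.html#sockets");
        System.out.println("protocol:  " + url.getProtocol());
        System.out.println("host:      " + url.getHost());
        System.out.println("port:      " + url.getPort());
        // getFile includes the leading slash of the path
        System.out.println("file:      " + url.getFile());
        System.out.println("reference: " + url.getRef());
    }
}
```

Note that getPort returns -1 when the URL does not state a port explicitly; getDefaultPort then gives the protocol's default.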

2.2 The class Socket

The class java.net.Socket represents an endpoint of a connection-oriented protocol (i.e., a "client socket").  Its constructor takes an InetAddress or a host name giving the host and a port number, and can signal UnknownHostException, IOException, and SecurityException.   (For C programmers, the constructor encapsulates the socket, connect, and inet_addr function calls, and the gethostname and gethostbyname calls when the string argument is a DNS name.)  Client programs obtain an instance by calling the constructor, while server programs obtain an instance from ServerSocket.accept.   Socket defines accessors such as getPort and getInetAddress (the remote port and host with which the socket is communicating), getLocalAddress and getLocalPort, and several that access the socket configuration (e.g., get/setSoTimeout, get/setSoLinger, and get/setTcpNoDelay).  The method close closes the socket.   The methods getInputStream and getOutputStream return streams that can be used to send and receive data.   The server and client programs can filter the streams returned by the sockets to use readers and writers for text data or object streams for transmitting serialized objects, or to compress and uncompress the messages.  Examples are presented below.
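The stream filtering mentioned above can be sketched without a network connection.  In the following example, which is only an illustration, a byte array stands in for the bytes that would travel over the socket, and the class name FilteredStreams and method roundTrip are hypothetical.  With a real connection, the innermost streams would be the ones returned by getOutputStream and getInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class FilteredStreams {
    // Serializes and compresses a message, then reverses both steps.
    static Object roundTrip(Object message) {
        try {
            // Stand-in for socket.getOutputStream()
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            try (ObjectOutputStream out =
                     new ObjectOutputStream(new GZIPOutputStream(wire))) {
                out.writeObject(message);
            } // closing the outer stream also finishes the compressed data

            // Stand-in for socket.getInputStream() on the receiving side
            ByteArrayInputStream received = new ByteArrayInputStream(wire.toByteArray());
            try (ObjectInputStream in =
                     new ObjectInputStream(new GZIPInputStream(received))) {
                return in.readObject();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        } catch (ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello over a filtered stream"));
    }
}
```

The filters are stacked in opposite order on the two sides: the sender serializes and then compresses, so the receiver uncompresses and then deserializes.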

2.3 The class ServerSocket

The class java.net.ServerSocket represents sockets that listen for connection requests.  The constructor takes a port number and an optional queue length.  If the port number is 0, the operating system assigns a port number, in which case the program uses getLocalPort to obtain the port number.  (For C programmers, the constructor encapsulates the socket, bind, and listen function calls.)  The default queue length for the number of pending connections that the operating system will maintain is 50.  The constructor can also take an InetAddress to support hosts that have multiple network interfaces with different IP numbers.  The method accept waits for a client to connect and returns a socket representing the connection with the client.  The method close closes the socket, and the class provides accessors such as getInetAddress, getLocalPort, getSoTimeout, setSoTimeout, and isClosed.  The constructor, accept, and close can signal IOException (for example, if the server cannot be bound to the given port number), and the constructor signals SecurityException if the security manager does not allow the server to listen on that port.
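The relationship between the port numbers on the two sides of a connection can be checked with a short sketch.  It asks the operating system for a free port by passing 0, connects to it over the loopback address, and compares what each socket reports.  The class EphemeralPortDemo and the method portsAgree are hypothetical names, and the two-second accept timeout is an arbitrary safety margin.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class EphemeralPortDemo {
    // Returns true if both ends of a loopback connection report
    // consistent port numbers.
    static boolean portsAgree() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // port 0: let the operating system choose; 50 is the default queue length
        try (ServerSocket server = new ServerSocket(0, 50, loopback)) {
            server.setSoTimeout(2000);
            int serverPort = server.getLocalPort();
            // The connection request waits in the server's queue until accept is called.
            try (Socket client = new Socket(loopback, serverPort);
                 Socket accepted = server.accept()) {
                return client.getPort() == serverPort
                    && accepted.getLocalPort() == serverPort
                    && accepted.getPort() == client.getLocalPort();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println("ports agree: " + portsAgree());
    }
}
```

The socket returned by accept uses the same local port as the server socket; the connection is distinguished by the client's address and port.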

The platform-specific implementation of instances of ServerSocket and Socket is encapsulated in a subclass of the abstract class SocketImpl.  ServerSocket defines the class method setSocketFactory and Socket defines the class method setSocketImplFactory, each of which takes a SocketImplFactory.  For example, we could define a socket implementation that connects via a proxy or goes through a firewall.

2.4 Connection-oriented communication

The server creates a ServerSocket to accept connections and calls ServerSocket.accept to wait for a connection and obtain a socket connected to a client.  It obtains input and output streams from the socket, and uses them to write and read data.  The following is a skeleton for a server class:

import java.io.*;
import java.net.*;

public class Server extends Thread  {
   public static final int PORT_NUMBER = 12345;
   protected Socket socket;

   public static void main(String[] args) {
      ServerSocket server = null;
      try {
         server = new ServerSocket(PORT_NUMBER);
         while (true) {
            new Server(server.accept());
         }
      }
      catch(IOException ex) {
         System.out.println("Unable to start server or accept connections");
         System.exit(1);
      }
      finally {
         try {
            // server is still null if the constructor failed
            if (server != null) {
               server.close();
            }
         }
         catch(IOException ex) {
            // not much can be done: log the error
            // exits since this is the end of main
         }
      }
   }

   private Server(Socket socket) {
      this.socket = socket;
      start();
   }

   // the server services client requests in the run method
   public void run() {
      InputStream in = null;
      OutputStream out = null;
      try {
         in = socket.getInputStream();
         out = socket.getOutputStream();
         // ... do useful stuff ...
      }
      catch(IOException ex) {
         System.out.println("Unable to get streams from client");
      }
      finally {
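         try {
            // closing the socket also closes its input and output streams,
            // and avoids a NullPointerException if a stream was never obtained
            socket.close();
         }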
         catch(IOException ex) {
            // not much can be done: log the error
         }

      }
   }
}

Since the server is a process running on the host, it needs a main method which is executed from the command line.  (A server usually does not have a graphical user interface because it runs in the background.)  The class Server extends Thread so that it can start a new thread for each client that connects and then continue to accept connections.  It uses a server socket and client sockets, rather than extending these classes.  server.accept blocks waiting for a connection request and returns a client socket for communicating with the client, and is usually called in an endless loop as in this example.  In this case, the server program never exits so you must use the operating system “kill” command to stop it.  The client socket is passed to the constructor so that it is available in run.  The constructor sets the socket for communicating with the client and starts a thread for that connection.  Note that start returns without waiting for the thread's run method to finish, so the server re-enters the accept loop concurrently.  main must handle all exceptions.

The run method for a server class is usually a loop that gets a command from the client, computes or accesses the requested information, and returns it to the client.  Typically, there is a client command that breaks out of the loop, at which point the server closes the connection.  For example, a server that echoes the client text would do the following (at “… do useful stuff ... ” in the above example):

   BufferedReader inReader = new BufferedReader(new InputStreamReader(in));
   // the constructor argument "true" enables auto-flushing
   PrintWriter outWriter = new PrintWriter(out, true);
   outWriter.println("Echo server: enter 'bye' to exit.");
   while (true) {
      // readLine blocks until a line-terminated string is available
      String inLine = inReader.readLine();
      // readLine returns null at end of stream, i.e. when the client has
      // closed the connection (an empty line is returned as "")
      if (inLine == null || inLine.trim().equalsIgnoreCase("bye")) {
         break;
      }
      outWriter.println("Echo: " + inLine);
   }


The server sends a prompt to the client which it prints to standard output.  The client reads a line from standard input and sends it to the server, which checks whether it is the exit command and if not, sends it back to the client prepended with Echo: .  The client then prints this response.

The client is also a separate process and needs a main method.  The client creates a Socket by passing the server’s host name and port number to the constructor, and uses it to communicate with the server.  The constructor invocation attempts to establish a connection with the given host and port, and blocks until the connection is established.  If the server process cannot be contacted or refuses the connection, an exception is signaled.  Like the server's run method, the client program obtains streams from the socket.  The following skeletal client program takes the server name from the command line.  To simplify the example, we have omitted the code that checks the number of command line arguments.  (A client could also be defined as an applet that creates the socket in init.)

import java.io.*;
import java.net.*;

public class Client {
   public static void main(String args[]) {
      new Client(args[0]);
   }

   public Client(String host) {
      Socket socket;
      try {
        socket = new Socket(host, Server.PORT_NUMBER);
      }
      catch(UnknownHostException ex) {
         System.out.println(host + " is not a valid host name.");
         return;
      }
      catch(IOException ex) {
         System.out.println("Error communicating with " + host);
         return;
      }

      // … initialize model, GUI, etc. ...
      InputStream in = null;
      OutputStream out = null;
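      try {
         in = socket.getInputStream();
         out = socket.getOutputStream();
         // ... do useful stuff ...
      }
      catch(IOException ex) {
         // getInputStream and getOutputStream signal a checked exception
         System.out.println("Error communicating with " + host);
      }
      finally {
         try {
            // closing the socket also closes its input and output streams
            socket.close();
         }
         catch(IOException ex) {
            // not much can be done ...
         }
      }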
   }
}

Note that the port number passed to the Socket constructor is the server port number.  The client's port number is assigned by the operating system and is available via getLocalPort.  In this simple example, the client exits if it cannot connect with a server.  Since the exception raised by attempting to connect might be due to a temporary condition such as the size of the server's connection request queue, we usually code the invocation of the Socket constructor in a loop that sleeps a few seconds and tries to connect again, with a counter that limits the number of connection attempts.
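The retry loop described above might be sketched as follows.  The class RetryingConnector, the method connectWithRetry, and its parameters are hypothetical names, and returning null after the last failed attempt is one possible convention among several.

```java
import java.io.IOException;
import java.net.Socket;
import java.net.UnknownHostException;

class RetryingConnector {
    // Tries to connect up to maxAttempts times, sleeping delayMillis between
    // attempts.  Returns the connected socket, or null if every attempt failed.
    static Socket connectWithRetry(String host, int port, int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return new Socket(host, port);
            } catch (UnknownHostException ex) {
                // A bad host name is unlikely to be a temporary condition.
                System.out.println(host + " is not a valid host name.");
                return null;
            } catch (IOException ex) {
                System.out.println("Attempt " + attempt + " failed: " + ex.getMessage());
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException ex) {
                    // Restore the interrupt flag and give up.
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumes the host name is given on the command line and that the
        // server listens on port 12345 as in the Server skeleton above.
        Socket socket = connectWithRetry(args[0], 12345, 5, 3000);
        System.out.println(socket == null ? "Could not connect." : "Connected: " + socket);
    }
}
```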

To test a client-server program without using another machine, you can run both processes on the same computer and have the client use the special “local loopback address” 127.0.0.1, which denotes the local machine.  Even though it is a local connection, this method still requires that you have networking software running on your machine.
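As a rough illustration of loopback testing, the following sketch runs a one-client echo server and a client in the same virtual machine, connected through the loopback address.  The class LoopbackEcho and its methods are hypothetical names, the server omits the initial prompt of the echo server above to keep the exchange short, and the five-second read timeout is an arbitrary safety margin.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class LoopbackEcho {
    // Serves exactly one client: echoes each line until "bye" or end of stream.
    static void serveOne(ServerSocket server) {
        try (Socket socket = server.accept();
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(socket.getInputStream()));
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null
                    && !line.trim().equalsIgnoreCase("bye")) {
                out.println("Echo: " + line);
            }
        } catch (IOException ex) {
            System.out.println("Server error: " + ex.getMessage());
        }
    }

    // Starts the server in a thread, sends one line, and returns the reply.
    static String echoOnce(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 50, loopback)) {
            new Thread(() -> serveOne(server)).start();
            try (Socket socket = new Socket(loopback, server.getLocalPort());
                 BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()));
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
                socket.setSoTimeout(5000);
                out.println(message);
                String reply = in.readLine();
                out.println("bye");
                return reply;
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(echoOnce("hello"));
    }
}
```

Running the server in a thread rather than a separate process is only a convenience for the sketch; the same client code works against a server started from another command line on the same machine.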

2.5 Connectionless communication

java.net also provides classes for connectionless communication: both the server and client create a DatagramSocket.  The constructor takes an optional port number, and the methods send and receive send and receive the argument DatagramPacket.  Like Socket, DatagramSocket provides accessors such as getLocalAddress, getLocalPort, and getSoTimeout, and the method close closes the socket.  Both the server and client create DatagramPacket instances to contain the data transmitted.  The constructor for a received packet takes a byte array for the data and the packet length, while the constructor for a packet to send takes the byte array data, the packet length, and the destination address and port.  DatagramPacket provides the accessors get/setAddress, get/setPort, get/setLength, and get/setData.
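A minimal sketch of these classes follows.  It sends one datagram between two sockets on the local machine and returns the text that arrives.  The class DatagramDemo and the method sendAndReceive are hypothetical names, and the 1024-byte buffer and two-second timeout are arbitrary choices.  Since UDP does not guarantee delivery, the timeout keeps receive from blocking forever if the packet is lost.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

class DatagramDemo {
    // Sends text in one datagram over the loopback address and returns
    // the text read from the received packet.
    static String sendAndReceive(String text) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000);

            // A packet to send carries the data, its length, and the destination.
            byte[] payload = text.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(
                payload, payload.length, loopback, receiver.getLocalPort()));

            // A packet to receive only needs a buffer and its length.
            byte[] buffer = new byte[1024];
            DatagramPacket received = new DatagramPacket(buffer, buffer.length);
            receiver.receive(received);

            // getLength gives the number of bytes actually received.
            return new String(received.getData(), received.getOffset(),
                              received.getLength(), StandardCharsets.UTF_8);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndReceive("ping"));
    }
}
```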