CS 5523 Operating Systems
A Proxy Server to Monitor Web Traffic

Objectives:

Overview

In this assignment you will develop a proxy server that monitors the traffic generated by the browsers that use it. The web monitor is a server that listens on a port for TCP/IP socket requests from clients (web browsers). Upon receiving a client (browser) request, the monitor creates another TCP/IP connection to the actual destination web server and transmits the client's request. The monitor forwards the web server's response back to the client. The monitor should also support an option to dump all of the traffic to a file. In Part I, the web server name is passed as a command line argument to the monitor and all client requests are forwarded to the specified server. In Part II, the monitor acts as a proxy server and can handle client requests to arbitrary servers. You will be assigned 10 unique port numbers to use during testing.

Requirements:

Part I: Simple Pass Through

Write a web monitor that takes three command line arguments: a port number p, a destination web server name, and a verbose flag. The monitor listens on port p for a TCP/IP socket connection request. When a client connection has been made, the monitor makes a connection to port 80 of the destination web server. The monitor forwards all messages received from the client to the destination web server named on the command line, and forwards all messages from that server back to the client. If either the client or the web server closes its connection, the monitor closes its connections to both the client and the server. The monitor then writes status information to standard error indicating the name of the client, the destination URL, the number of bytes written, and the time elapsed from when the monitor received the first information from the client to when the connection was closed.
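The forwarding loop described above can be sketched as follows. This is a minimal illustration, not a required design: the names relay and status_line, the 4096-byte buffer, and the format of the status line are all assumptions, and the byte count here is the total forwarded in both directions.

```python
import select
import socket
import time


def relay(client, server, bufsize=4096):
    """Forward bytes in both directions until either side closes.

    Returns (bytes_forwarded, elapsed_seconds). The clock starts when the
    first data arrives from the client. Both sockets are closed on exit.
    """
    peer = {client: server, server: client}
    total = 0
    started = None
    try:
        while True:
            # Block until at least one side has data (or has closed).
            readable, _, _ = select.select([client, server], [], [])
            for sock in readable:
                data = sock.recv(bufsize)
                if not data:  # empty read: that side closed its connection
                    elapsed = 0.0 if started is None else time.monotonic() - started
                    return total, elapsed
                if sock is client and started is None:
                    started = time.monotonic()
                peer[sock].sendall(data)
                total += len(data)
    finally:
        client.close()
        server.close()


def status_line(client_name, destination, nbytes, elapsed):
    """Format the status report (hypothetical layout) for standard error."""
    return f"client={client_name} dest={destination} bytes={nbytes} time={elapsed:.3f}s"
```

A monitor built this way would call relay once per accepted connection and then print the status line with print(status_line(...), file=sys.stderr). Because the loop copies bytes without inspecting them, it does not need to understand HTTP at all.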

If the verbose flag is 1, the monitor should copy all header traffic between the client and web server to a file. Be sure to flush and close the file when the connections are closed, and reopen the file for appending when the monitor receives another connection.
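One way to handle the verbose log is sketched below. The names header_bytes and log_headers and the "--- label ---" separator are assumptions made for illustration. The sketch also assumes the whole header block is available in one buffer; if no blank line is found, everything received so far is treated as header data.

```python
def header_bytes(message: bytes) -> bytes:
    """Return the header block of one HTTP message, including the blank line.

    HTTP headers end at the first empty line (CRLF CRLF); anything after
    that is the body and is not logged.
    """
    head, sep, _body = message.partition(b"\r\n\r\n")
    return head + sep


def log_headers(path, label, message: bytes) -> None:
    """Append the headers of `message` to the log file at `path`.

    Opening in append mode keeps earlier connections' entries; leaving the
    with-block flushes and closes the file.
    """
    with open(path, "ab") as log:
        log.write(b"--- " + label.encode("ascii") + b" ---\r\n")
        log.write(header_bytes(message))
```

Opening and closing the file on every call is simple but slow. A monitor could instead open the file once per connection and close it when both sockets are closed, which matches the requirement just as well.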

Test your program by accessing the monitor you have written through your web browser. Suppose you want to test your program by retrieving http://vip.cs.utsa.edu/classes/cs5523s2001/home.html, and your port number was 10355. You would start your monitor program on machine X with:

    monitor 10355 vip.cs.utsa.edu 0

In your web browser you would access this URL with:

    http://X:10355/classes/cs5523s2001/home.html

Test your monitor with a variety of destination web servers and web pages. Why does the monitor not have to parse the client's request in order to forward it to the web server?

Part II: The Monitor as a Proxy

Modify the monitor developed in Part I so that it acts as a proxy and uses HTTP redirect to make the connection between the client and the destination server. You will no longer need to pass the destination web server as a command line argument to the monitor. Instead, you will need to parse the incoming HTTP request from the client. If the first token is not GET, close all of the connections and treat the connection as an error. You will need to extract the destination web server address from the URL and peel off proxy headers. Explain how the client request to a proxy is different from a client request to a web server. How does the monitor use the proxy protocol when it forwards the client's request?
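The parsing step can be illustrated with a short sketch. A browser configured to use a proxy sends the full URL in the request line (for example GET http://vip.cs.utsa.edu/classes/cs5523s2001/home.html HTTP/1.0), whereas a request sent directly to a web server carries only the path. The function name rewrite_request is hypothetical. Treating every header whose name starts with "Proxy-" as a proxy header is an assumption, and the sketch assumes the complete header block has already been read.

```python
from urllib.parse import urlsplit


def rewrite_request(raw: bytes):
    """Turn a proxy-style GET request into one suitable for the origin server.

    Returns (host, port, request_bytes). Raises ValueError if the request
    is not a GET with an absolute http:// URL.
    """
    head, _sep, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    parts = lines[0].split()
    if len(parts) != 3 or parts[0] != "GET":
        raise ValueError("only GET requests are handled")
    method, url, version = parts

    target = urlsplit(url)
    if target.scheme != "http" or not target.hostname:
        raise ValueError("expected an absolute http:// URL")

    # The origin server expects only the path (plus any query string).
    path = target.path or "/"
    if target.query:
        path += "?" + target.query

    # Assumption: headers named "Proxy-..." are meant for the proxy only.
    kept = [h for h in lines[1:] if not h.lower().startswith("proxy-")]
    new_head = "\r\n".join([f"{method} {path} {version}"] + kept)
    request = new_head.encode("iso-8859-1") + b"\r\n\r\n" + body
    return target.hostname, target.port or 80, request
```

The returned host and port tell the monitor where to connect, and the returned bytes are what it sends there. A ValueError corresponds to the error case in which the monitor closes all connections.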

Project Notes: You may develop this in any appropriate language that supports TCP sockets. Be sure to do error checking on all system calls. You may use Java sockets (see CDK Figures 4.5 or 4.6), your own socket library or someone else's socket library such as UICI.

If you use UICI, compile and run this code before going on. In any case, use sockets with TCP/IP for the communication. Be sure to set the number of queued connections for your monitor to be high enough so that browser requests are not denied. You will be assigned 10 port numbers in class so that your monitor won't conflict with someone else's. (See this link.)
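For those working with plain sockets, the queue length is the argument to listen. The sketch below shows one way to set up the listening socket; the name make_listener and the backlog value of 50 are assumptions, and the operating system may cap the backlog at a lower limit.

```python
import socket


def make_listener(port, backlog=50):
    """Create a TCP socket listening on `port` on all local interfaces.

    `backlog` is the number of pending connections the kernel will queue
    before the monitor calls accept().
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Allow quick restarts of the monitor on the same port.
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("", port))
        listener.listen(backlog)
    except OSError:
        # Error checking: release the socket before reporting the failure.
        listener.close()
        raise
    return listener
```

A browser often opens several connections at once to fetch a page and its images, so a backlog of only 1 or 2 can lead to refused connections while the monitor is busy with an earlier request.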

References for HTTP 1.0:


Last Revision: January 22, 2001 at 5:45 pm