Overview: In this laboratory you will develop a proxy server that monitors the traffic generated by the browsers that use it. The web monitor is a server that listens on a port for TCP/IP connection requests from clients (web browsers). Upon receiving a client (browser) request, the monitor creates another TCP/IP connection to the actual destination web server and transmits the client's request. The monitor forwards the web server's response back to the client. The monitor should also support an option to dump all of the traffic to a file. In part I, the web server name is passed as a command line argument to the monitor and all client requests are forwarded to the specified server. In part II, the monitor acts as a proxy server and can handle client requests to arbitrary servers. You will be assigned 10 unique port numbers to use during testing. (See this link.) This laboratory must be done in Java.
Requirements:
Part I: Simple Pass Through (Tunnel)
Write a tunnel that takes two command line arguments: a port number p and a destination web server name. The tunnel listens on port p for a TCP/IP socket connection request. When a client connection has been made, the tunnel makes a connection to port 80 of the destination web server. The tunnel forwards all messages received from the client to the destination web server and vice versa. If either the client or the web server closes its connection, the tunnel closes its connections to both the client and the server. The tunnel then writes status information to standard error giving the name of the client and the time elapsed from when the tunnel received the first information from the client to when the connection was closed.
Test your program by accessing the tunnel through your web browser. Suppose you want to test your program by retrieving http://vip.cs.utsa.edu/classes/cs5523s2002/home.html, and your port number is 10355. Start your tunnel program on machine X with:
java tunnel 10355 vip.cs.utsa.edu
Then access the page in your web browser with this URL:
http://X:10355/classes/cs5523s2002/home.html
Test your tunnel with a variety of destination web servers and web pages. Why does the tunnel not have to parse the client's request in order to forward it to the web server?
Notes: CDK Figures 4.5 and 4.6 have example programs for doing network communication in Java. You should start by getting these programs to run.
You will not receive full credit if your implementation assumes that the client will first send all of its information and that the server will then respond. A tunnel should make no assumptions about the ordering of incoming information. A typical way of handling this is to have two threads in the tunnel, one for each direction.
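The two-thread approach can be sketched as follows. This is only a minimal illustration, not a required design: the class name Pump and its methods copy, closeQuietly, and relay are hypothetical names chosen for this example. Each Pump copies bytes in one direction; when either side closes, it closes both sockets, which also unblocks the thread copying in the other direction.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

/** Hypothetical helper: copies bytes in one direction, then closes both sockets. */
class Pump implements Runnable {
    private final Socket from;
    private final Socket to;

    Pump(Socket from, Socket to) {
        this.from = from;
        this.to = to;
    }

    /** Copies in to out until end of stream, flushing each chunk; returns bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            out.flush(); // forward immediately; make no assumption about message boundaries
            total += n;
        }
        return total;
    }

    @Override
    public void run() {
        try {
            copy(from.getInputStream(), to.getOutputStream());
        } catch (IOException e) {
            // A closed or reset socket simply ends this direction.
        } finally {
            // Closing both sockets makes the read in the other thread fail, so it exits too.
            closeQuietly(from);
            closeQuietly(to);
        }
    }

    static void closeQuietly(Socket s) {
        try {
            s.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if close fails.
        }
    }

    /** Starts one thread per direction and waits until both have finished. */
    static void relay(Socket client, Socket server) throws InterruptedException {
        Thread clientToServer = new Thread(new Pump(client, server));
        Thread serverToClient = new Thread(new Pump(server, client));
        clientToServer.start();
        serverToClient.start();
        clientToServer.join();
        serverToClient.join();
    }
}
```

A tunnel built this way would call Pump.relay(client, server) once per accepted connection, after opening the socket to port 80 of the destination server, and could record the elapsed time around that call for the status line.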
Part II: A Proxy Monitor
Modify the tunnel program developed in Part I so that it acts as a proxy and uses
HTTP redirect to make the connection between the client and the destination
server. You will no longer need to pass the destination web server as a command
line argument to the monitor. Instead, you will need to parse the incoming
HTTP request from the client. If the first token is not GET, close all of the
connections and treat the connection as an error. You will need to extract
the destination web server address from the URL and peel off proxy headers.
Explain how the client request to a proxy is different from a client request
to a web server. How does the monitor use the proxy protocol when it forwards
the client's request?
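As a starting point for the parsing step, here is a minimal sketch. It assumes the browser sends an HTTP/1.0-style request line containing an absolute URL (for example GET http://vip.cs.utsa.edu/classes/cs5523s2002/home.html HTTP/1.0), whereas a request sent directly to a web server carries only the path. The class name ProxyRequestLine and its members are hypothetical, and treating every header whose name begins with "Proxy-" as a proxy header is an assumption of this example, not something the assignment specifies.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper: splits a proxy-style request line into destination and origin form. */
class ProxyRequestLine {
    final String host;       // destination web server taken from the URL
    final int port;          // port from the URL, or 80 if none is given
    final String originLine; // request line to forward to the destination server

    private ProxyRequestLine(String host, int port, String originLine) {
        this.host = host;
        this.port = port;
        this.originLine = originLine;
    }

    /** Parses "GET <absolute-URL> <version>"; throws IllegalArgumentException otherwise. */
    static ProxyRequestLine parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length != 3 || !tokens[0].equals("GET")) {
            throw new IllegalArgumentException("not a GET request line: " + line);
        }
        URI uri;
        try {
            uri = new URI(tokens[1]);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("bad URL: " + tokens[1], e);
        }
        if (uri.getHost() == null) {
            throw new IllegalArgumentException("URL has no host: " + tokens[1]);
        }
        int port = (uri.getPort() == -1) ? 80 : uri.getPort();
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/"; // "http://host" with no path means the root resource
        }
        if (uri.getRawQuery() != null) {
            path = path + "?" + uri.getRawQuery();
        }
        return new ProxyRequestLine(uri.getHost(), port, "GET " + path + " " + tokens[2]);
    }

    /** Assumption: headers named "Proxy-..." are meant for the proxy and are not forwarded. */
    static boolean isProxyHeader(String headerLine) {
        return headerLine.regionMatches(true, 0, "Proxy-", 0, 6);
    }
}
```

With this sketch, the monitor would read the first line of the request, call ProxyRequestLine.parse on it, connect to host and port, send originLine, and then forward the remaining header lines for which isProxyHeader returns false. A request whose first token is not GET raises IllegalArgumentException, which the caller can treat as the error case described above.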
Part III: A Proxy with Logging
Modify the proxy monitor developed in Part II so that it performs logging.
The program takes two command line arguments: the port number on which
it listens and the level of logging that it should perform. Logging levels
are specified by the strings none (no logging, as in Part II),
headers (all headers are dumped to a log file)
and all (headers are dumped to a log file and each returned
resource is saved as a file). Each header should be prefaced
by the client name and the server name. Develop a naming
convention for your resource files that includes the server name and
the path name on the server encoded in the name. Explain your naming
convention in your report.
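One possible naming convention, shown only as an illustration, is to join the server name and the path and percent-encode every byte that is not a letter, a digit, '.', '-', or '_'. The result is a single flat file name from which the server and path can still be recovered. The class ResourceNames and the method fileNameFor are hypothetical names for this sketch; you are free to design a different scheme.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: maps a server name and resource path to a flat log file name. */
class ResourceNames {
    /** Percent-encodes every byte outside [A-Za-z0-9._-], so '/' becomes "%2F". */
    static String fileNameFor(String server, String path) {
        String full = server + (path.isEmpty() ? "/" : path);
        StringBuilder name = new StringBuilder();
        for (byte b : full.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean safe = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || c == '.' || c == '-' || c == '_';
            if (safe) {
                name.append((char) c);
            } else {
                name.append(String.format("%%%02X", c)); // e.g. '/' is written as %2F
            }
        }
        return name.toString();
    }
}
```

For example, fileNameFor("vip.cs.utsa.edu", "/classes/cs5523s2002/home.html") yields vip.cs.utsa.edu%2Fclasses%2Fcs5523s2002%2Fhome.html. Because '%' itself is encoded, two different URLs cannot map to the same name under this scheme.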
References for HTTP 1.0:
http://www.jmarshall.com/easy/http/
http://www.w3.org/Protocols