Wide-area wireless resources
To encourage further measurement and evaluation studies within our research community, we are releasing some of our tools and proxy software, which implement the different optimization choices (transparent as well as dual-proxy) on Win2000/XP and Linux, along with some of the traces and the benchmarks/web-sites we used.
The following snapshots of popular web-sites (as ranked by http://www.100hot.com) are exact replicas of the sites at the time the snapshots were taken, and are now hosted on web servers within the Computer Laboratory. These web snapshots were also used for wide-area wireless (GPRS/CDMA 2000 3G-1X/dial-up) web performance benchmarks. To access a web-page, simply click on its URL/link below. If you need to replicate a copy of a web-site, you can use wget to retrieve the full contents of its web-pages and then change the internal links, manually or with a script, so that all cl.cam.ac.uk URLs point to your own web server hosting the web objects.
e.g. to retrieve our CNN web-page, you can use:
(wget -p -H http://www.cl.cam.ac.uk/users/rc277/websites/cnn.html)
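The link-rewriting step mentioned above could be scripted along the following lines. This is only a minimal sketch: the function name rewrite_links, the example mirror address and the example image path are hypothetical, and it assumes the mirrored objects keep the same path layout on your own server.

```python
import re

# Matches the scheme plus any host name ending in cl.cam.ac.uk.
_CL_HOST = re.compile(r"https?://(?:[\w-]+\.)*cl\.cam\.ac\.uk", re.IGNORECASE)

def rewrite_links(html: str, new_base: str) -> str:
    """Point every cl.cam.ac.uk URL in `html` at `new_base` instead."""
    return _CL_HOST.sub(new_base.rstrip("/"), html)

# Hypothetical page fragment and mirror address, for illustration only.
page = '<img src="http://www.cl.cam.ac.uk/users/rc277/websites/cnn_files/logo.gif">'
print(rewrite_links(page, "http://my.server.example/mirror/"))
```

Run over each file that wget saved, this leaves all other hosts untouched and only redirects references to the Computer Laboratory servers.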
Detailed description: Implementation of a web optimizing proxy system for GPRS wireless links. The implementation has two parts: one that sits at the mobile client (the client-side proxy) and one located at the other end of the wireless link, within the cellular operator's network (the server-side proxy). Thus, a mobile device with 2.5G (or 3G) connectivity will need a client-side software update ("the client proxy") to benefit from this web optimizing proxy system.
The implementation combines a number of key enhancements: aggressive content caching with content-based hash keying to improve hit rates for dynamic content; dynamic data compression; preemptive push of a web page's supporting resources to mobile clients (parse-n-push); delta-encoded data transfers; migration of DNS lookups; and a custom UDP-based transport protocol over the wireless 2.5G link (GPRS in our case), which uses link-estimated window flow control for rapid start-up and NACK-based recovery for fully optimized link performance. These enhancements give significant improvements in WWW performance over 2.5G (or 3G) wide-area wireless links.
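To illustrate the idea behind content-based hash keying, the sketch below indexes cached objects by a digest of their body rather than by URL, so a dynamic page whose URL changes but whose content does not can still be served from the client cache. It is not taken from the released C# source: the class and function names are hypothetical, and the choice of SHA-1 is an assumption.

```python
import hashlib

class ContentHashCache:
    """Client-side cache keyed by a digest of the object body (not the URL)."""

    def __init__(self):
        self._by_digest = {}

    @staticmethod
    def digest(body: bytes) -> str:
        # SHA-1 is an assumption; any collision-resistant hash would do.
        return hashlib.sha1(body).hexdigest()

    def store(self, body: bytes) -> str:
        key = self.digest(body)
        self._by_digest[key] = body
        return key

    def lookup(self, key: str):
        return self._by_digest.get(key)

def fetch_via_proxy(url, origin_fetch, client_cache):
    """Return (body, body_bytes_sent_over_link) for one request.

    The server-side proxy fetches the object and computes its digest; the
    body crosses the wireless link only if the client cache lacks it.
    """
    body = origin_fetch(url)
    key = ContentHashCache.digest(body)
    cached = client_cache.lookup(key)
    if cached is not None:
        return cached, 0          # only the short digest had to be sent
    client_cache.store(body)
    return body, len(body)
```

With a URL-keyed cache, two different URLs returning identical bytes would each cost a full transfer; here the second request costs only the digest exchange.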
Source code in Microsoft's C# for the WinXP/Win2000 platform
(Web-pages used for Benchmarking in ACM Mobisys 2003 paper: [STATIC-I][STATIC-II][lAMAZON][lBBC][lCNN])
GPRSWeb: Optimizing the Web for GPRS Links
R. Chakravorty, A. Clark and I. Pratt
Proceedings of ACM Mobisys 2003.
Description: The DNS look-up queries issued during a web download (especially for the more popular web-sites) can account for a significant fraction of the overall download time over high-latency links such as GPRS/3G. DNS-boosting/URL re-writing (commonly employed by Content Distribution Networks (CDNs), e.g. Akamai) is a technique that intelligently manipulates the responses to DNS queries originating from the mobile client so that DNS look-ups are eliminated. DNS-boosting permits optimizing proxies to be deployed transparently, so that the impact of DNS look-ups is avoided during web downloads. By responding to DNS queries with a fixed IP address (that of the performance enhancing proxy, PEP), DNS-boosting implicitly forces the mobile client to point to a single (proxy) server IP address, to which the client can then open an optimal number of TCP connections. The scheme benefits web downloads over wireless links in two ways: (1) by avoiding extra DNS look-ups, and (2) by using an optimal number of TCP connections between the web browser and the (optimizing proxy) server. A fuller description is available in our PAM publication.
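The core of answering every DNS query with one fixed address can be sketched as follows. This is an illustration of the technique under simplifying assumptions, not the released Linux proxy: the function names and the example addresses are hypothetical, the query is assumed to carry exactly one question, and every query type is answered with a single A record.

```python
import socket
import struct

def question_end(packet: bytes) -> int:
    """Offset just past the first question of a DNS message."""
    pos = 12                          # the DNS header is 12 bytes
    while packet[pos] != 0:           # labels: length byte + that many bytes
        pos += 1 + packet[pos]
    return pos + 1 + 4                # zero terminator + QTYPE + QCLASS

def boosted_reply(query: bytes, proxy_ip: str, ttl: int = 60) -> bytes:
    """Build a response that maps the queried name to `proxy_ip`."""
    txid = query[:2]
    question = query[12:question_end(query)]
    # Flags 0x8180: response, recursion desired and available, no error.
    header = txid + struct.pack("!HHHHH", 0x8180, 1, 1, 0, 0)
    # Answer: pointer to the name at offset 12, type A, class IN, TTL, 4-byte address.
    answer = struct.pack("!HHHIH", 0xC00C, 1, 1, ttl, 4) + socket.inet_aton(proxy_ip)
    return header + question + answer

def build_query(name: str, txid: int = 0x1234) -> bytes:
    """Helper for the demo: a standard A query for `name`."""
    labels = b"".join(bytes([len(p)]) + p.encode("ascii") for p in name.split("."))
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)

reply = boosted_reply(build_query("www.cnn.com"), "10.0.0.1")
print(socket.inet_ntoa(reply[-4:]))
```

A real deployment would wrap boosted_reply in a UDP loop bound to port 53 on the path of the mobile client's queries; that part is omitted here because it needs privileges and network setup.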
Source Code: User-level implementation of DNS-boosting proxy for Linux.
Measurement Approaches to Evaluate Performance Optimizations over Wireless Wide-Area Networks
(with S. Banerjee, P. Rodriguez, J. Chesterfield, I. Pratt)
In Passive and Active Measurements (PAM 2004) Workshop, 2004.
TCP Performance Enhancing Proxy
Description: A user-space implementation of a TCP performance enhancing proxy for (2.5G) GPRS networks. The proxy improves TCP flow start-up performance by incorporating TCP cwnd clamping, performs aggressive recovery during losses, and fairly schedules TCP connections. The implementation runs in user space on a Linux router, diverting packets from multiple incoming TCP flows and then scheduling them fairly.
Source Code: Written in C for Linux (tested on Linux 2.4.16). An updated and more stable version (for Linux 2.4.21+) is planned for release.
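As a rough picture of what fair scheduling across diverted flows means, the sketch below serves one packet per backlogged flow in turn. The description above does not say which scheduling discipline the proxy uses, so plain round-robin is an assumption here, and the class name is hypothetical.

```python
from collections import OrderedDict, deque

class RoundRobinScheduler:
    """Serve one packet from each backlogged flow in turn."""

    def __init__(self):
        self._queues = OrderedDict()   # flow id -> queue of pending packets

    def enqueue(self, flow_id, packet):
        self._queues.setdefault(flow_id, deque()).append(packet)

    def dequeue(self):
        """Return (flow_id, packet) for the next flow, or None if idle."""
        if not self._queues:
            return None
        flow_id, queue = next(iter(self._queues.items()))
        packet = queue.popleft()
        del self._queues[flow_id]
        if queue:                      # still backlogged: go to the back
            self._queues[flow_id] = queue
        return flow_id, packet

sched = RoundRobinScheduler()
for pkt in ("a1", "a2", "a3"):
    sched.enqueue("A", pkt)
sched.enqueue("B", "b1")
while (item := sched.dequeue()) is not None:
    print(item)
```

The point of the example is that a flow with a long backlog (A) cannot starve a flow that has only one packet waiting (B).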
Flow Aggregation for Enhanced TCP over Wide Area Wireless
(with S. Katti, I. Pratt and J. Crowcroft) Proceedings of the IEEE INFOCOM 2003.
Kernel-based TCP Implementation
Description: An advantage of this approach is that an application proxy (e.g. squid) can run directly on top of a modified TCP stack and achieve (almost) the same benefits as the previous approach. Beyond the transport-layer benefits, this approach also gains from traditional application-level optimizations, e.g. caching and compression. We have implemented these changes in the Linux 2.4.21 TCP stack, adding an option TCP_CONTROL_CWND for use with the setsockopt system call to set an application-defined, "non-changeable" congestion window (TCP cwnd clamping). Most applications can use this new option with setsockopt to benefit TCP connections over the GPRS downlink. A modified squid proxy that uses this setsockopt option to benefit all downlink TCP connections over GPRS is available.
Source Code: Source code for Linux 2.4.21 and a modified squid proxy that uses this modification.
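An application might request the clamp roughly as sketched below. Several things here are assumptions: the numeric value of TCP_CONTROL_CWND is defined only in the patched kernel headers, so it is passed in as a parameter rather than guessed; the argument is assumed to be an integer window size in segments; and the helper name clamp_cwnd is hypothetical. On an unpatched kernel the call is expected to fail, which the helper reports instead of raising.

```python
import socket
import struct

def clamp_cwnd(sock, tcp_control_cwnd: int, segments: int) -> bool:
    """Ask the (patched) TCP stack to hold cwnd at `segments`.

    `tcp_control_cwnd` must be the option number taken from the patched
    kernel's headers. Returns False if the kernel rejects the option.
    """
    try:
        # Assumption: the option takes a native int, like most TCP options.
        sock.setsockopt(socket.IPPROTO_TCP, tcp_control_cwnd,
                        struct.pack("i", segments))
        return True
    except OSError:
        return False
```

A proxy would call this once per downlink connection, right after accept() or connect(), before any data is written.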
linux-2.4.21-updated.tar.gz (README)
squid-2.4-updated.tar.gz (README)
Performance Issues with General Packet Radio Service
(with Ian Pratt)
Journal of Communications and Networks, December 2002 (ISSN 1229-2370)
timeline is a simple script written in awk that uses tcptrace and the Perl script xpl2gpl. With timeline, you can draw browser timeline plots (connections w.r.t. time), as shown in our IEEE MWCN 2002, ACM Mobisys 2004 and PAM 2004 papers.
Why browser timelines?
Browser timelines (web connections w.r.t. time) are useful to: (1) help analyse different browser connection modes (non-persistent, persistent, pipelined, etc.), (2) observe browser connection patterns (and compare browser as well as proxy/web server behaviour), and (3) record precise web download times in cases where it is not clear when (or if) browser connections were closed. An understanding of such web browser behaviour is crucial, as it can impact end-user experience over resource-constrained links like GPRS/3G/dial-up.
Sample Timeline for an example web download over wireless GPRS network, using Mozilla/HTTP/1.1. Each small rise in the lines indicates a separate GET request made using that specific TCP connection.
To use timeline, you need to collect a tcpdump trace of a web browser and then run timeline on the trace to see how the browser behaves (connections w.r.t. time). timeline also shows the connection behaviour of browsers while retrieving objects from a web/proxy server. For this, you will need to have tcptrace and the xpl2gpl perl script installed and available in your PATH. Use a clean directory, capture a tcpdump of the web connections (you can use tcpdump -w dump.file port http), and then run timeline on the dump file. Note that timeline creates lots of files (a consequence of using tcptrace), which should NOT be deleted: timeline uses these files to plot the browser connection timelines. For this reason it is recommended to run timeline in a clean directory. A gnuplot file "timeline.gpl" is also created, which can be used to plot or modify the output in your preferred format (eps/fig/obj etc.).
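Before capturing a trace, the prerequisites listed above can be checked with a few lines of Python. This is a convenience sketch only; the helper names are hypothetical, and it deliberately stops short of invoking timeline itself, whose exact command line is not given here.

```python
import os
import shutil

# Tools named in the text that must be reachable through PATH.
REQUIRED_TOOLS = ("tcpdump", "tcptrace", "xpl2gpl")

def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that are not found in PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

def is_clean_directory(path="."):
    """True if `path` contains no files, as recommended before a run."""
    return not os.listdir(path)

def capture_command(dump_file="dump.file"):
    """The capture command from the text, as an argument list."""
    return ["tcpdump", "-w", dump_file, "port", "http"]

print("missing tools:", missing_tools())
print("capture with:", " ".join(capture_command()))
```

The argument list returned by capture_command can be passed to subprocess.run; capturing usually needs root privileges, so it is not executed in the sketch.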
Web Throughput (Magic) Scripts
These are what we call Magic scripts, for plotting (1) average and (2) instantaneous HTTP throughput (of web downloads, i.e. parallel TCP connections) from browser tcpdumps. These aggregate HTTP throughput plots can be used together with the timeline plots to determine precisely how, when and why web downloads over GPRS/3G underperform. These scripts were very useful for visually analysing, over GPRS/3G, the trade-offs in web throughput achieved when varying the number of TCP connections used by browsers and the browser's connection mode (non-persistent, persistent, pipelining etc.), as well as under-performance during HTTP GETs, TCP slow-start, DNS look-ups, and the impact of the TCP 3-way handshake.
Usage: Usage is quite similar to timeline: use a clean directory, capture a tcpdump of the web connections (use tcpdump -w dump.file port http), and then run any of these scripts on the tcpdump dump file. For these scripts to run successfully, two things need to be ensured: (1) tcpdump is in the system $PATH, and (2) the name (or IP address) of the mobile host on which the client tcpdump was taken while running the browser is known in advance, for the MOBILE_NAME variable. To find the mobile client's name or IP, use tcpdump -r dump > dump.txt, look up the name or IP of your mobile host in dump.txt, and manually change the MOBILE_NAME variable in the scripts. You can now use these scripts to get time vs. HTTP throughput plots. Also note that you will need to change set yrange [0:XXXX], where XXXX is the maximum throughput you expect from all the web connections in aggregate. By default, this is set to 60 kbps (which typically suits GPRS, CDMA 3G-1X and dial-up links, where these plots are most useful).
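The two quantities being plotted can be defined as in the sketch below, which works on (timestamp in seconds, payload bytes) pairs merged from all TCP connections of a download and sorted by time. The exact definitions used inside the released scripts are not stated here, so the 1-second bins, the running-average formula and the function names are assumptions.

```python
def instantaneous_kbps(samples, bin_s=1.0):
    """Aggregate throughput per time bin: [(bin_start, kbps), ...]."""
    if not samples:
        return []
    t0 = samples[0][0]
    nbins = int((samples[-1][0] - t0) // bin_s) + 1
    bins = [0] * nbins
    for t, nbytes in samples:
        bins[int((t - t0) // bin_s)] += nbytes
    return [(t0 + i * bin_s, b * 8 / 1000 / bin_s) for i, b in enumerate(bins)]

def average_kbps(samples):
    """Running average: bits received so far / time since the first packet."""
    out = []
    total = 0
    t0 = samples[0][0] if samples else 0.0
    for t, nbytes in samples:
        total += nbytes
        elapsed = t - t0
        if elapsed > 0:                # skip the first instant (no interval yet)
            out.append((t, total * 8 / 1000 / elapsed))
    return out

# Made-up packet arrivals, for illustration only.
trace = [(0.0, 1000), (0.5, 1000), (1.5, 500), (2.0, 1500)]
print(instantaneous_kbps(trace))
print(average_kbps(trace))
```

Dips in the instantaneous curve that line up with connection set-up or GET requests on the timeline plot are the kind of under-performance these plots are meant to expose.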
[Download: http_inst_tput, http_avg_tput]
(The throughput plot scripts can be easily modified to suit different needs.)
Commercial handsets for cellular GPRS and 3G (CDMA 3G-1X) use PPP (the Point-to-Point Protocol) to dial up and establish a context with the cellular network. Unfortunately, once dialed, we cannot retrieve the Signal to Noise Ratio (SNR) and Bit Error Rate (BER) information from the device/handset, because control commands (AT commands, e.g. AT+CSQ for SNR) cannot be multiplexed with data transfer over the PPP link. The only way out, as in our case, is to use Sierra Wireless GPRS/3G cards under Windows XP/2000: these can be used as a network adapter, which lets us measure SNR/BER from the mobile device during data transfer. Once this information is available, the corresponding carrier-to-interference ratio (C/I) can be deduced.
To measure SNR/BER information during data transfer, we have a simple tool that captures it once the COM port corresponding to the Sierra Wireless device is known. The COM port setting of the device is available from the Phone and Modem settings in the Control Panel of your Windows terminal. Once you know it, you are all set to collect SNR/BER information along with the timestamp of each measurement: just download the tool and configure it with the COM port of the Sierra Wireless device. Our tool is inspired by the AT-code snippet previously available as freeware from http://www.activexperts.com. By default the tool samples SNR/BER every 200 ms (this is now changeable in the updated tool). Note that the tool generates an output log file (named out, on the C:\ drive) of SNR/BER samples with timestamps [1 | 2] (also included are the TCP throughput plots corresponding to these samples). We have also used this tool to collect SNR and BER information for the stationary and drive tests conducted for our MAR (ACM Mobisys 2004) paper. (Please mail me if you are looking for such a WWAN tool!)
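For readers who want to post-process raw modem replies themselves, the sketch below parses the standard reply to AT+CSQ, which has the form "+CSQ: <rssi>,<ber>". In the standard AT command set the first field is a signal strength index (0 to 31, with 99 meaning unknown) that maps to dBm as -113 + 2 * index, and the second is a bit error rate class (0 to 7, with 99 meaning unknown). The function name is hypothetical, and the format of the log file written by our tool is not assumed here.

```python
import re

_CSQ = re.compile(r"\+CSQ:\s*(\d+)\s*,\s*(\d+)")

def parse_csq(response: str):
    """Parse an AT+CSQ reply into (signal_dbm, ber_class).

    Either element is None when the modem reports 99 (not known).
    Returns None if the reply contains no +CSQ line.
    """
    match = _CSQ.search(response)
    if match is None:
        return None
    index, ber = int(match.group(1)), int(match.group(2))
    signal_dbm = None if index == 99 else -113 + 2 * index
    ber_class = None if ber == 99 else ber
    return signal_dbm, ber_class

print(parse_csq("+CSQ: 15,2\r\nOK"))
```

Pairing each parsed sample with a timestamp taken when the reply is read gives a series that can be plotted against TCP throughput.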
You can download TCP (tcpdump) traces taken over GPRS, UMTS 3G and CDMA 2000 networks.