Copyright (c) 2003-2008 ViSolve
Table of Contents
1. What is ViCompress?
2. Compiling and Installing ViCompress
3. Running ViCompress
4. Configuring ViCompress
5. Compression
6. Caching
7. Load Balancing and Failover
8. Log Statistics
9. Log Files
1. What is ViCompress?
ViCompress is a web accelerator: a free HTTP compression and caching proxy server.
It speeds up downloads by caching frequently requested pages, and by
compressing text pages so that less data is transferred. ViCompress can be
used in two different setups, one for Internet Service Providers (ISPs) and
one for individual websites:
Setup for ISPs
Setup for website
What features does ViCompress support?
- In-Memory compression
ViCompress supports in-memory compression of text pages, such
as HTML, JavaScript, and stylesheets. Image files are not compressed.
Because text pages are compressed, the download time will be faster
for clients, especially over slow connections.
- In-Memory caching
ViCompress supports in-memory caching of static data, such as
images, stylesheets, and HTML files. For ISPs, this results in a
faster response to clients. For websites, this reduces the
load on the backend web server.
- Load Balancing and Failover
For websites, ViCompress supports load-balancing over multiple backend
web servers. If a web server goes down, requests are automatically
redirected to another web server. ViCompress also supports sticky
sessions, where all traffic from the same client browser goes to the
same backend webserver.
- DNS lookup caching
When requesting a webpage, the DNS lookup time (looking up the IP address
of the website's hostname) can often be slow. ViCompress will cache the
DNS lookups, thereby improving response time.
- Log files and statistics
ViCompress logs every HTTP request to a log file. It supports the
Apache Combined Log Format, as well as the Squid Access Log format.
ViCompress also displays simple HTML bar charts showing the bandwidth
saved by compression and caching over a given period (hour, day, month).
- Handles thousands of simultaneous clients
ViCompress uses non-blocking network I/O with modern polling mechanisms
(epoll, kqueue, /dev/poll) rather than spawning a thread/process
per request. This allows ViCompress to handle thousands of client
connections simultaneously.
- Scales to multiple processors
ViCompress uses multiple threads for the CPU intensive gzip compression.
ViCompress will automatically determine how many processors your system has,
and will spawn one compression thread per processor.
2. Compiling and Installing ViCompress
Requirements
- A POSIX compatible Unix system. However, only Linux 2.4, Linux 2.6, and
Cygwin have been tested.
- The GNU gcc compiler. Other ANSI-C compilers may work, but will probably
require Makefile modifications to compile.
- The pthread and zlib libraries.
- GNU make.
Compiling from the source
Download the ViCompress source code from
http://www.visolve.com/vicompress/download.php
Extract the ViCompress source code, and change into the src directory.
# gunzip vicompress-1.0.x.tar.gz
# tar -xvf vicompress-1.0.x.tar
# cd vicompress-1.0.x/src/
Run the configure script, passing the directory to install ViCompress into.
The default directory is /usr/local/vicompress.
# ./configure /usr/local/vicompress
If you're using a C compiler other than gcc, you will need to edit the
compiler flags in the Makefile. The default flags are:
CC=cc
LIBS= -lpthread -lz
CFLAGS= -O1 -Wall
LDFLAGS=
Compile the source code with GNU make.
The make install command will
copy the ViCompress runtime files into the install directory
(/usr/local/vicompress).
The following files will be installed:
ViCompress Files
| /usr/local/vicompress/LICENSE | The License |
| /usr/local/vicompress/README.html | The HTML documentation |
| /usr/local/vicompress/bin/tune_kernel.sh | Script to tune Linux kernel parameters, for performance |
| /usr/local/vicompress/bin/update_log_stats | Program to generate/update log statistics every hour |
| /usr/local/vicompress/bin/update_log_stats.sh | Script to cleanly start/stop the update_log_stats program. |
| /usr/local/vicompress/bin/vicompress | Main vicompress server |
| /usr/local/vicompress/bin/vicompress.sh | Script to cleanly start/stop the vicompress server. |
| /usr/local/vicompress/etc/vicompress.conf | Configuration file |
| /usr/local/vicompress/etc/errorpage.html | HTML page to send back to users on errors |
| /usr/local/vicompress/log/ | Directory where log files are stored |
| /usr/local/vicompress/logstats/ | Directory where HTML log statistics are written to |
| /usr/local/vicompress/logstats/template.html | Template HTML file used when generating HTML statistics reports. |
| /usr/local/vicompress/logstats/verticalbarN.png | Image files used in HTML statistics reports. |
| /usr/local/vicompress/logstats/visolvelogo.png | Image file used in HTML statistics reports. |
Installing from an RPM
ViSolve also distributes ViCompress as a pre-compiled binary RPM file
(Red Hat Package Manager) for Linux x86 based systems.
Download the ViCompress RPM file from
http://www.visolve.com/vicompress/download.php
You can install ViCompress from the binary RPM by running the rpm install
command:
# rpm -i ViCompress-1.0.x-1.i386.rpm
To remove the ViCompress package, use the rpm erase command. All the
ViCompress files will be removed, except for log files.
3. Running ViCompress
The first option to configure is the network settings:
listen <IP address> <port>
For example:
listen 192.10.10.5 80
Next, ViCompress can be run in one of two setups:
For ISPs: a forward HTTP proxy that forwards requests to the server
given in the URL.
For websites: a reverse HTTP proxy that forwards requests to a set
of backend web servers.
If you are running ViCompress in the first setup, you can start the
server with the default settings. However, for the website setup,
you must add the IP address and port of every backend webserver
to the configuration file /usr/local/vicompress/etc/vicompress.conf:
webserver <IP address> <port>
For example:
webserver 192.168.10.2 80
webserver 192.168.10.3 80
See Configuring ViCompress for more details about the configuration file.
To start and stop the server, use the script:
/usr/local/vicompress/bin/vicompress.sh
It can take one of the following arguments (a further "debug" argument is
described under Error Log):
| start | start the vicompress server |
| stop | stop a running vicompress server process |
| status | print whether or not vicompress is running |
To start ViCompress, simply use the "start" argument:
# cd /usr/local/vicompress/bin
# ./vicompress.sh start
The server is automatically run in the background.
4. Configuring ViCompress
ViCompress uses the configuration file
/usr/local/vicompress/etc/vicompress.conf
One option is specified per line. Blank lines are ignored. Lines beginning
with a hash (#) are ignored.
Each option is explained in detail below as follows:
- Option name and parameters
- Example parameters
- Description
webserver <IP address> <port>
webserver 192.168.10.2 80
webserver fe80::213:21ff:fe7c:7471 80
|
This option should only be used if ViCompress is being used as a
reverse proxy for a set of webservers, as opposed to a forward
proxy in an ISP environment.
Specify the backend web server(s) to forward requests to. Multiple
webserver entries may be specified, each on a separate line.
Both IPv4 and IPv6 addresses are supported.
ViCompress will act as a reverse HTTP proxy, and will distribute the
requests among the backend webserver entries specified. See the
Load Balancing and Failover section for more
details.
|
listen <IP address> <port>
listen 192.168.0.1 80
listen fe80::213:21ff:fe7c:7471 81
|
Specify the IP address and port for ViCompress to listen on.
Multiple listen entries may be specified, each on a separate line.
Both IPv4 and IPv6 addresses are supported. The default value is
all IPv4 interfaces, 0.0.0.0, on port 80, the standard
HTTP port. Only servers started by root can bind to ports less
than 1024.
The older (1.0.7 and earlier) configuration options
listenip <IP address>
and
listenport <port>
have been deprecated.
|
outgoingip <IP address>
outgoingip 192.168.0.1
|
Specify the IP address for ViCompress to bind to when making outgoing connections.
Both IPv4 and IPv6 addresses are supported.
The default value is any interface, 0.0.0.0.
|
enable_sessions <yes|no>
enable_sessions yes
|
Enable or disable sticky sessions. The default value is yes. This option
is only used when two or more backend webservers are specified in the ViCompress
configuration. When enabled, ViCompress will use HTTP Cookies to ensure
that a client is sent to the same backend web server for the duration of
its session. An HTTP session lasts until the client browser is closed.
See the Load Balancing and Failover section for
more details.
|
enable_compression <yes|no>
enable_compression yes
|
Enable or disable gzip compression. The default value is yes. When enabled,
ViCompress will gzip HTML and text pages before sending the response to
the client.
|
enable_caching <yes|no>
enable_caching yes
|
Enable or disable caching of pages. The default value is yes. When enabled,
ViCompress will cache static pages and images in memory.
|
cache_memory <size in megabytes>
cache_memory 100
|
Specify the size of the in-memory cache, in megabytes. The default value
is 100. Note that under high load, ViCompress will also use around 12 MB
for compression and 10 MB for basic operation. The total memory
(cache_memory + 22 MB) should not exceed the amount of RAM available.
If cache_memory is set to 0, caching is disabled.
|
max_cacheditem_size <size in kilobytes>
max_cacheditem_size 512
|
Web pages larger than this size will not be cached. In order to have a high
hit rate, ViCompress should cache many small pages,
rather than a few large pages. To prevent large pages from being cached,
use this option. The default value is 512 kilobytes.
|
cache_expires <hours>
cache_expires 240
|
When a web page is cached, it remains in the cache based on its age.
The expiration time is set to half of the item's age. For example,
a page that is 4 days old will be cached for 2 days.
This option is used to place an upper limit on the expiration time.
The default value is 240 hours (10 days). After 10 days, the page is
removed from the cache.
|
enable_dns_caching <yes|no>
enable_dns_caching yes
|
Enable or disable caching of DNS lookups. The default value is yes.
When enabled, ViCompress will store the hostname-to-IP address
mappings in memory.
|
dns_expires <hours>
dns_expires 48
|
Specify the amount of time a cached DNS mapping is valid. The default
is 48 hours.
|
user <username>
user nobody
|
The user to run the server as. ViCompress will switch to this user after
binding to the listening port (usually port 80). ViCompress is generally
started as root, since only root can bind to ports less than 1024.
However, it is unsafe to run a server program as root. Therefore, ViCompress
will switch to a non-root user after binding to port 80. That user is
specified by the "user" option given above.
The default value is the user who started the server.
|
hostheader <hostname>
hostheader mydomain.com
|
This option only applies when accelerating a website. It does not apply to
ISPs. Specify the hostname to use in the HTTP Host header, when sending
the HTTP request to the backend webservers. By default, ViCompress will
just send the same HTTP Host header it receives from the client browser.
|
accesslog <path to logfile>
accesslog /usr/local/vicompress/log/accesslog
|
Specify the file path where the access log should be stored. The file must be
writable by the username given in the "user" option. If the accesslog is not
specified, no access log file will be created or written to.
|
errorlog <path to logfile>
errorlog /usr/local/vicompress/log/errorlog
|
Specify the file path where the error log should be stored. The file must be
writable by the username given in the "user" option. If the errorlog is not
specified, no error log file will be created or written to.
|
errorpage <path to errorpage.html>
errorpage /usr/local/vicompress/etc/errorpage.html
|
Specify the file path where the HTML error page is located. This
HTML page will be sent back to users when ViCompress is unable
to look up or connect to the origin web server in the HTTP request.
If no file is specified, then no HTML content is sent back on
HTTP error replies.
|
rotatesize <size in megabytes>
rotatesize 100
|
Rotate the log files when they reach the specified size, in megabytes.
The default value is 100. The maximum value is 2047, or about 2
gigabytes. When rotation occurs, the following commands
are executed:
mv <accesslog>.1 <accesslog>.2
mv <accesslog> <accesslog>.1
and a blank log file is created at
<accesslog>.
See Log Files for further details
about log file rotation.
|
logformat <apache|squid>
logformat squid
|
Specify the format of the accesslog file. The supported formats are the
Apache Combined Log Format, and the Squid Access Log Format.
The default value is the Squid format. See the
Log Files and Log Statistics
sections for further details.
|
5. Compression
ViCompress can compress text pages, HTML, JavaScript,
stylesheets, PDF documents, and Microsoft Word documents
before sending them back to the client.
This results in faster download times, especially over slow modem connections.
Both static and dynamic pages can be compressed, such as output from PHP or CGI
scripts. Images and other binary file types are not compressed.
Most modern browsers recognize gzip encoding and automatically decompress it.
These include Internet Explorer 4 and above, Firefox, Opera 4 and above,
and Netscape 6 and above. ViCompress will not send gzipped content to browsers
that do not support it. ViCompress checks for the HTTP header
Accept-Encoding: gzip in the
request to determine whether the browser supports gzip encoding or not.
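The Accept-Encoding check described above can be illustrated with a short Python sketch. This is not ViCompress code (ViCompress is written in C); the function names client_accepts_gzip and encode_body are hypothetical, the headers are assumed to arrive as a dictionary keyed by "Accept-Encoding", and quality values such as "gzip;q=0" are deliberately ignored to keep the example small.

```python
import gzip


def client_accepts_gzip(request_headers):
    """Return True if the Accept-Encoding request header lists gzip."""
    value = request_headers.get("Accept-Encoding", "")
    # Compare whole tokens so that e.g. "x-gzip-like" does not match.
    tokens = [item.split(";")[0].strip().lower() for item in value.split(",")]
    return "gzip" in tokens


def encode_body(body, request_headers):
    """Return (body, content_encoding), compressing only for gzip-capable clients."""
    if client_accepts_gzip(request_headers):
        return gzip.compress(body), "gzip"
    return body, None
```

A client that does not send the header receives the original bytes unchanged, which matches the behaviour described for browsers without gzip support.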
Compression related options:
enable_compression
6. Caching
ViCompress can cache data in memory, such as HTML pages and images.
When a browser requests an item found in the cache, ViCompress
will send the response directly, rather than contacting the destination
webserver. For ISPs, this results in a faster response time for clients.
For websites, this reduces load on the backend webserver.
ViCompress will not cache web pages that are generated dynamically,
such as through ASP, PHP, or CGI scripts. ViCompress uses the HTTP
headers
Last-Modified
Expires
Content-Length
to determine whether a response is dynamically generated or not. Only
responses which contain both a Content-Length
and one of Expires/Last-Modified will be
cached. In addition, ViCompress will not cache pages that are password protected (pages
that require the HTTP header Authorization).
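The cacheability rule above can be written down as a small predicate. The Python sketch below is only an illustration of the rule as stated in this section; the function name is_cacheable is hypothetical and the headers are assumed to be plain dictionaries.

```python
def is_cacheable(request_headers, reply_headers):
    """Apply the rule from the text: Content-Length plus Expires or Last-Modified."""
    # Password-protected pages (requests carrying Authorization) are never cached.
    if "Authorization" in request_headers:
        return False
    has_length = "Content-Length" in reply_headers
    has_date = "Expires" in reply_headers or "Last-Modified" in reply_headers
    return has_length and has_date
```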
If the Expires header is present, the
expiration time is set to that header. Else, if the
Last-Modified header is present,
the expiration time is set to half of the item's age. For example, a web
page that was last modified 8 days ago will remain in the ViCompress
cache for 4 days. In addition, the
cache_expires option sets an upper limit on the time an item can remain
in the cache.
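As a worked example of the expiration rule, the following Python sketch computes how long a reply would stay in the cache. It is an illustration under assumptions, not the ViCompress implementation: the function name cache_lifetime is hypothetical, header dates are assumed to be in the usual HTTP date format, and now must be a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def cache_lifetime(reply_headers, now, cache_expires_hours=240):
    """Return how long a reply may stay cached, as a timedelta."""
    limit = timedelta(hours=cache_expires_hours)  # upper limit from cache_expires
    if "Expires" in reply_headers:
        lifetime = parsedate_to_datetime(reply_headers["Expires"]) - now
    elif "Last-Modified" in reply_headers:
        age = now - parsedate_to_datetime(reply_headers["Last-Modified"])
        lifetime = age / 2  # half of the item's age
    else:
        return timedelta(0)
    # Never negative, never longer than the configured limit.
    return max(timedelta(0), min(lifetime, limit))
```

With the default limit of 240 hours, a page last modified 8 days ago is kept for 4 days, while a page last modified several months ago is kept for 10 days.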
When the in-memory cache becomes full, items that have not been
recently accessed are removed to make room for new items in the cache.
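Removing items that have not been recently accessed is commonly implemented as a least-recently-used (LRU) policy. The Python sketch below shows that general technique; whether ViCompress uses exactly this policy is an assumption, and the class name LRUCache and its byte-based capacity are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Byte-limited cache that evicts the least recently used item first."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.items = OrderedDict()  # url -> body, least recently used first

    def get(self, url):
        body = self.items.get(url)
        if body is not None:
            self.items.move_to_end(url)  # mark as most recently used
        return body

    def put(self, url, body):
        if len(body) > self.capacity:
            return  # larger than the whole cache, so never stored
        if url in self.items:
            self.used -= len(self.items.pop(url))
        # Evict from the least recently used end until the new item fits.
        while self.used + len(body) > self.capacity:
            _, evicted = self.items.popitem(last=False)
            self.used -= len(evicted)
        self.items[url] = body
        self.used += len(body)
```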
Users can view the list of URLs in the memory cache by logging into the
ViCompress machine and sending the following special URL to ViCompress:
http://<hostname>:<port>/_viewcache_
ViCompress will return a plain text list of the URLs in the cache, one
per line. You can retrieve this list only from an http client on the
same machine as ViCompress. Outside clients cannot access the cached URL
list.
For example, if you are using the wget or curl command line HTTP clients,
you would run one of:
# wget http://<hostname>:<port>/_viewcache_
# curl http://<hostname>:<port>/_viewcache_
Caching related options:
enable_caching
cache_memory
max_cacheditem_size
cache_expires
7. Load Balancing and Failover
Load Balancing
When one or more webserver entries are specified,
ViCompress will act as a reverse HTTP proxy, and will distribute requests
among the backend webservers. ViCompress uses a simple round-robin
algorithm for load distribution.
Failover
If ViCompress fails to connect to a backend web server, that web server
is marked as down, and will be skipped for future requests.
Clients that had previous sessions with that web server will be forwarded
to a new backend web server. ViCompress will try to re-connect to a down
web server every 3 minutes. If the connection succeeds, the web server is
marked as up again. If all backend web servers are down when a request
arrives, ViCompress will simply choose among the down web servers, in
round-robin fashion.
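The round-robin and failover behaviour described above can be sketched in a few lines of Python. This is a simplified model, not the ViCompress source: the class names Backend and RoundRobin, the method names, and the use of plain address strings are all hypothetical. Only the 3 minute retry interval and the fallback to round-robin over down servers come from the text.

```python
import time

RETRY_SECONDS = 3 * 60  # the text says a down server is retried every 3 minutes


class Backend:
    def __init__(self, address):
        self.address = address
        self.up = True
        self.last_check = 0.0


class RoundRobin:
    def __init__(self, addresses):
        self.backends = [Backend(a) for a in addresses]
        self.next_index = 0

    def mark_down(self, backend, now=None):
        backend.up = False
        backend.last_check = time.time() if now is None else now

    def mark_up(self, backend):
        backend.up = True

    def due_for_retry(self, backend, now=None):
        now = time.time() if now is None else now
        return not backend.up and now - backend.last_check >= RETRY_SECONDS

    def pick(self):
        """Return the next backend in rotation, skipping servers marked down."""
        n = len(self.backends)
        for offset in range(n):
            candidate = self.backends[(self.next_index + offset) % n]
            if candidate.up:
                self.next_index = (self.next_index + offset + 1) % n
                return candidate
        # Every backend is down: choose among them in plain round-robin order.
        candidate = self.backends[self.next_index]
        self.next_index = (self.next_index + 1) % n
        return candidate
```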
Sessions
Many web applications keep session information for each client, such as
shopping cart items. Session information may be stored on a central database,
or may be stored locally on individual web servers. If your website stores
session information on individual web servers, then a client's requests
cannot be distributed across multiple web servers. To force a client to
use the same backend web server throughout a session, ViCompress provides
the enable_sessions option. When enabled,
ViCompress will send the client a cookie to indicate which backend web server
to use:
Set-Cookie: vicompressid=1
For the duration of the session, the client browser will send the vicompressid
Cookie for every request:
Cookie: vicompressid=1
ViCompress will forward the requests to the backend webserver specified by
the cookie. When the client browser is closed, the browser discards the
vicompressid Cookie, and the session is ended. Note that if sessions are
enabled, client connections may not be evenly distributed across the
multiple backend web servers.
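A rough Python sketch of the cookie handling follows. The cookie name vicompressid is taken from the text; everything else is an assumption made for illustration, in particular the function names and the idea that the cookie value is a zero-based index into the list of webserver entries.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "vicompressid"


def backend_from_cookie(cookie_header, backend_count):
    """Return the backend index named by the Cookie header, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(COOKIE_NAME)
    if morsel is None:
        return None
    try:
        index = int(morsel.value)
    except ValueError:
        return None
    # Ignore ids that do not match a configured backend (assumed zero-based).
    return index if 0 <= index < backend_count else None


def session_cookie(index):
    """Build the reply header; no expiry, so the browser drops it on close."""
    return "Set-Cookie: %s=%d" % (COOKIE_NAME, index)
```

A request without a valid cookie would fall back to normal load balancing, after which the chosen backend is pinned with the Set-Cookie header.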
Load Balancing related options:
webserver
enable_sessions
8. Log Statistics
ViCompress includes a tool to generate statistics about bandwidth, caching,
and compression.
To generate the log statistics, use the script:
/usr/local/vicompress/bin/update_log_stats.sh
This script takes one of three possible arguments:
| start <dir> | Run a daemon to generate/update the log statistics every hour. Store the log statistics in the given directory <dir>. If the <dir> argument is not given, the default directory is /usr/local/vicompress/logstats. |
| stop | Stop the update_log_stats.sh program. |
| status | Print whether or not the update_log_stats.sh program is running. |
To generate the log statistics, run the following command:
# cd /usr/local/vicompress/bin
# ./update_log_stats.sh start /usr/local/vicompress/logstats
The update_log_stats program will run in the background. Every hour,
it will parse the accesslog and write an
HTML statistics report to
/usr/local/vicompress/logstats/YYYYMMstats.html
where YYYY is the year, and MM is the month. For example:
/usr/local/vicompress/logstats/200501stats.html (Jan 2005)
/usr/local/vicompress/logstats/200502stats.html (Feb 2005)
/usr/local/vicompress/logstats/200503stats.html (Mar 2005)
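If you need to locate a report from another script, the naming scheme can be reproduced as in the Python sketch below. The function name stats_report_path is hypothetical; only the YYYYMMstats.html pattern and the default directory come from this document.

```python
import posixpath
from datetime import date


def stats_report_path(day, logstats_dir="/usr/local/vicompress/logstats"):
    """Return the monthly report path that covers the given date."""
    return posixpath.join(logstats_dir, day.strftime("%Y%m") + "stats.html")
```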
This report will show the bandwidth saved with caching and compression for
- The entire month
- Each day of the month
- Each day of the week
- Each hour of the day
In addition, an HTML date index file will be created containing
hyperlinks to all the monthly statistics:
/usr/local/vicompress/logstats/statsindex.html
9. Log Files
ViCompress produces two log files: an access log and an error log.
Access Log
The accesslog stores information about each client request on a single line.
It is used for gathering website statistics. The path of the accesslog is
determined by the configuration option accesslog:
accesslog /usr/local/vicompress/log/accesslog
The log format is determined by the logformat option:
logformat <apache|squid>
The Apache Combined Log Format is described at
http://httpd.apache.org/docs/logs.html. A summary of the format is given
below. Note that ViCompress uses the "ident" field (2nd field) to store
cache and compression information.
| clientip | The IP Address of the client. |
| hit and compression | Either "hit" or "miss", followed by the content length before compression. |
| username | The username sent for authentication, or "-" if not given. |
| date | The date of the response [day/month/year:hour:min:sec +/-timezone] |
| firstline | The first line of the HTTP request (method url version). |
| replycode | The server HTTP reply status code. |
| contentlength | The length of the server reply body, in bytes. |
| referer | The URL which referred the user to this website. |
| useragent | The platform and version of the client browser. |
Here is a sample Apache log entry:
15.13.130.10 miss15923 - [21/Aug/2003:17:26:45 -0700] "GET /index.html HTTP/1.0" 200 1852 "http://www.google.com/" "Mozilla 4.0 (IE 6.0 compatible)"
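To show how the fields line up, the Python sketch below parses an entry in this format with a regular expression. The pattern and the function name parse_apache_entry are hypothetical and were written against the sample entry only. The bytes_saved value is a derived figure that assumes the saving equals the length before compression minus the reply length.

```python
import re

APACHE_LINE = re.compile(
    r'(?P<clientip>\S+) (?P<cache>hit|miss)(?P<original_length>\d+) '
    r'(?P<username>\S+) \[(?P<date>[^\]]+)\] "(?P<firstline>[^"]*)" '
    r'(?P<replycode>\d{3}) (?P<contentlength>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<useragent>[^"]*)"'
)


def parse_apache_entry(line):
    """Split one access log line into named fields, or return None."""
    match = APACHE_LINE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["original_length"] = int(entry["original_length"])
    length = entry["contentlength"]
    entry["contentlength"] = 0 if length == "-" else int(length)
    # Assumed definition: bytes saved = size before compression - size sent.
    entry["bytes_saved"] = max(0, entry["original_length"] - entry["contentlength"])
    return entry
```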
The Squid Access Log Format is described at
http://www.squid-cache.org/Doc/FAQ/FAQ-6.html.
The Squid format contains information similar to the Apache format.
It adds information about cache hits, but does not
store the Referer or User-Agent headers. Note that ViCompress
uses the "ident" field to store compression statistics. This is the 9th
field in the list below, and the 8th space-separated column of a log line,
because the hit status and replycode share one column.
| date | The date of the response, the number of seconds since 1970. |
| duration | The duration of the response, in milliseconds. |
| client | The IP address of the client. |
| hit status | TCP_HIT if the request is a cache hit, else TCP_MISS. |
| replycode | The server HTTP reply status code. |
| contentlength | The length of the server reply body, in bytes. |
| method | The HTTP request method (GET, POST, etc). |
| URL | The requested URL. |
| compression | The content length of the reply before compression. |
| peerstatus | NONE if the request is a cache hit, else DIRECT. |
| peerhost | The IP address of the backend web server, or "-" if a cache hit. |
| contenttype | The content type of the HTTP reply, or "-" if not given. |
Here is a sample Squid log entry:
1112387949.000 529 15.13.130.249 TCP_MISS/200 1031 GET http://www.amazon.com/somefile.jpg 8523 DIRECT/15.0.110.12 text/html
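A Squid-format entry can be split on whitespace. The Python sketch below is a hypothetical helper written against the sample entry; it assumes every line has exactly ten space-separated columns, with hit status/replycode and peerstatus/peerhost each joined by a slash as in the sample.

```python
def parse_squid_entry(line):
    """Split one Squid-format access log line into named fields, or return None."""
    fields = line.split()
    if len(fields) != 10:
        return None
    hit_status, _, replycode = fields[3].partition("/")
    peer_status, _, peer_host = fields[8].partition("/")
    return {
        "date": float(fields[0]),          # seconds since 1970
        "duration_ms": int(fields[1]),
        "client": fields[2],
        "hit": hit_status == "TCP_HIT",
        "replycode": int(replycode),
        "contentlength": int(fields[4]),
        "method": fields[5],
        "url": fields[6],
        "original_length": int(fields[7]),  # length before compression
        "peer_status": peer_status,
        "peer_host": peer_host,
        "contenttype": fields[9],
    }
```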
Log Rotation
The accesslog file can grow quickly under heavy load. Therefore, ViCompress
will automatically rotate log files once they reach a certain size.
This size is given by the configuration option:
rotatesize <size in megabytes>
The default size is 100.
The maximum value is 2047, or about 2 gigabytes.
ViCompress will save up to two previous copies of the accesslog.
When rotation occurs, ViCompress will execute the following:
mv accesslog.1 accesslog.2
mv accesslog accesslog.1
and will then create a new, empty accesslog file.
For errorlog rotation, the same commands occur, except that the errorlog
file is moved, instead of the accesslog file.
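The rotation steps can be mirrored in a short Python sketch, which may help when writing external tooling that expects the same file names. The function names rotate and rotate_if_needed are hypothetical, and treating one megabyte as 1024 * 1024 bytes is an assumption.

```python
import os


def rotate(path):
    """Keep two previous copies: path.1 (newer) and path.2 (older)."""
    if os.path.exists(path + ".1"):
        os.replace(path + ".1", path + ".2")  # mv accesslog.1 accesslog.2
    if os.path.exists(path):
        os.replace(path, path + ".1")         # mv accesslog accesslog.1
    open(path, "w").close()                   # create a new, empty log file


def rotate_if_needed(path, rotatesize_mb=100):
    """Rotate once the file reaches rotatesize megabytes; return True if rotated."""
    if os.path.getsize(path) >= rotatesize_mb * 1024 * 1024:
        rotate(path)
        return True
    return False
```

Because only two previous copies are kept, the contents of the old .2 file are overwritten on every rotation.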
Error Log
The error log stores error and debugging messages from ViCompress.
The path of the errorlog is determined by the configuration option errorlog:
errorlog /usr/local/vicompress/log/errorlog
By default, ViCompress logs just basic startup and shutdown messages,
as well as the status of backend web servers. The messages are shown
below. The maximum number of concurrent clients is half the
maximum number of file descriptors (sockets) that a process can open
(the other half is needed for connecting to the origin servers).
vicompress shutting down
vicompress started
Maximum concurrent clients is <number>
Cache size is <number> MB
Marking webserver <ip address>:<port> as down
Webserver <ip address>:<port> is now up
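The relation between the file descriptor limit and the reported maximum can be estimated with the Python sketch below. It is an approximation under assumptions: the function name max_concurrent_clients is hypothetical, and using the soft RLIMIT_NOFILE limit is a guess at which limit ViCompress consults. The resource module is available on Unix systems only.

```python
import resource


def max_concurrent_clients(fd_limit=None):
    """Half of the descriptors serve clients; the other half reach origin servers."""
    if fd_limit is None:
        # Soft limit on open file descriptors for the current process.
        fd_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return fd_limit // 2
```

For example, with a limit of 1024 descriptors the estimate is 512 concurrent clients.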
Users can enable additional debug messages during runtime to troubleshoot
any problems with ViCompress. Debugging is toggled (enabled/disabled)
by running the command below. Note that debugging is initially
disabled when ViCompress is started.
# cd /usr/local/vicompress/bin
# ./vicompress.sh debug
Each debug message in the errorlog contains the date, a message, and
the IP address:port of the client connection.
Here is a sample entry:
[Tue May 10 17:18:41 2005] [127.0.0.1:52689] New client accepted.
Below is the complete list of debugging messages:
New client accepted.
Read http request from client: status=<status message>, url=<urlpath> <HTTP request>
Lookup up IP address for hostname <hostname>
Connecting to server <IP address>:<port>
Read http reply from server: status=<status message> <HTTP reply>
Writing server reply
Writing and caching server reply
Writing and compressing server reply
Writing, caching, and compressing server reply
Writing error reply to client
Writing _viewcache_ page to client
Writing cached reply to client
Writing direct reply to client
Wrote server reply: status=<status message>
Wrote cached reply: status=<status message>
Wrote direct reply: status=<status message>
Wrote error reply: status=<status message>
Wrote _viewcache_ page: status=<status message>
Keeping client connection alive
Closing connection
Checking if webserver <ip address>:<port> is up
Webserver <ip address>:<port> is still down
For the <HTTP request> and <HTTP reply>, ViCompress
will print out the full HTTP request and reply headers.
For <status message>, ViCompress will print out one of
the messages below:
Success
Bad HTTP Request from client
Bad HTTP Reply from server
Client closed prematurely
Server closed prematurely
Connect failed
Error reading from client
Error writing to client
Error reading from server
Error writing to server
DNS lookup failed
Log related options:
accesslog
errorlog
rotatesize