Tutorial for Squid
Introduction
Squid is a high-performance proxy caching server for web clients, supporting FTP, gopher, and HTTP data objects. Unlike traditional caching software, Squid handles all requests in a single, non-blocking, I/O-driven process.
Squid keeps metadata and especially hot objects cached in RAM, caches DNS lookups, supports non-blocking DNS lookups, and implements negative caching of failed requests. It supports SSL, extensive access controls, and full request logging. By using the lightweight Internet Cache Protocol, Squid caches can be arranged in a hierarchy or mesh for additional bandwidth savings.
Squid consists of a main server program squid, a Domain Name System lookup program dnsserver, some optional programs for rewriting requests and performing authentication, and some management and client tools. When squid starts up, it spawns a configurable number of dnsserver processes, each of which can perform a single, blocking Domain Name System (DNS) lookup. This reduces the amount of time the cache waits for DNS lookups.
Squid works on a variety of platforms including Linux, FreeBSD, and Windows. Squid was created by Duane Wessels.
Operating Systems Supported by Squid
- Linux
- FreeBSD
- NetBSD
- OpenBSD
- BSDI
- Mac OS/X
- OSF/Digital Unix/Tru64
- IRIX
- SunOS/Solaris
- NeXTStep
- SCO Unix
- AIX
- HP-UX
- OS/2
- Cygwin
Installing Squid
Downloading Squid
Squid can be downloaded as a source archive in gzipped tarball form (e.g. squid-*-src.tar.gz), available at http://www.squid-cache.org/ or from ftp://www.squid-cache.org/pub
Squid can also be downloaded as a binary from http://www.squid-cache.org/binaries.html
Installing Squid from Source
1. Extract the source
tar xzf squid-*-src.tar.gz
2. Change the current directory to squid-*
cd squid-*
3. Compile and install Squid
./configure
make
make install
Note:
By default, Squid is installed in "/usr/local/squid".
To list the compile-time options available in Squid:
./configure --help
Creating Squid Swap Directories
The Squid swap directories can be created with the following command:
# /usr/local/squid/sbin/squid -z
Starting, Stopping & Restarting Squid
Start Squid
# /usr/local/squid/sbin/squid
Stop Squid
# /usr/local/squid/sbin/squid -k shutdown
Restart Squid
Stopping squid: # /usr/local/squid/sbin/squid -k shutdown
Starting squid: # /usr/local/squid/sbin/squid
Options Available
-k reconfigure|rotate|shutdown|interrupt|kill|debug|check|parse
Parse configuration file, then send signal to running copy (except -k parse) and exit.
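Running Squid as Daemon
Squid runs as a daemon (background process) by default when it is started as
# /usr/local/squid/sbin/squid
The -N option disables daemon mode and keeps Squid in the foreground:
# /usr/local/squid/sbin/squid -N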
Starting Squid in Debugging Mode
Squid can be started in debugging mode by running it as given below.
# /usr/local/squid/sbin/squid -NCd1
This keeps Squid in the foreground and prints debugging output. If the test succeeds, it prints "Ready to serve requests".
Checking Squid Status
To check whether Squid is running, the following command can be used.
# /usr/local/squid/sbin/squid -k check
Configuration
Basic Configuration
Squid Listening on a Particular Port
The option http_port specifies the port number on which Squid listens for HTTP client requests. If this option is set to port 80, the client has the illusion of being connected to the actual web server. By default, Squid listens on port 3128.
Different Modes of Squid Configuration
Squid can be configured in three different modes: direct proxy, reverse proxy, and transparent proxy.
Direct Proxy Cache
A direct proxy cache is used to cache static web pages (HTML and images) on a Squid machine. When a page is requested a second time, the browser receives the data from the proxy instead of the origin web server. The browser is explicitly configured to direct all HTTP requests to the proxy cache, rather than the target web server. The cache then either satisfies the request itself or passes the request on to the target server.
Configuring as Direct Proxy
By default, Squid is configured in proxy mode. To cache web traffic and use the Squid system as a proxy, you have to configure your browser, which needs at least two pieces of information:
- The proxy server's host name
- The port on which the proxy server accepts requests
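The same two pieces of information are what any HTTP client needs, not only a browser. As a minimal sketch, the following Python code points the standard library's urllib at a proxy. The host name proxy.example.com is a placeholder (an assumption, not a value from this tutorial), and 3128 is Squid's default port.

```python
import urllib.request

# Hypothetical proxy location: replace with your Squid host and http_port.
PROXY_HOST = "proxy.example.com"
PROXY_PORT = 3128  # Squid's default listening port


def proxy_settings(host: str, port: int) -> dict:
    """Map the http scheme to the proxy URL built from host and port."""
    return {"http": f"http://{host}:{port}"}


def build_proxy_opener(host: str, port: int) -> urllib.request.OpenerDirector:
    """Return an opener that sends HTTP requests through the given proxy."""
    handler = urllib.request.ProxyHandler(proxy_settings(host, port))
    return urllib.request.build_opener(handler)


opener = build_proxy_opener(PROXY_HOST, PROXY_PORT)
# opener.open("http://www.squid-cache.org/") would now be sent to the proxy
# instead of directly to the origin server.
```

No network traffic happens until opener.open() is called, so the sketch can be adapted and inspected without a running proxy.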
Transparent Cache
A transparent cache achieves the same goal as a standard proxy cache, but operates transparently to the browser. The browser does not need to be explicitly configured to access the cache. Instead, the transparent cache intercepts network traffic, filters HTTP traffic (on port 80) and handles the request if the object is in the cache. If the object is not in the cache, the packets are forwarded to the origin web server.
Configuring as Transparent Proxy
Using Squid transparently is a two-part process. First, Squid must be configured properly to accept non-proxy requests (performed in the squid module). Second, web traffic must be redirected to the Squid port, which can be achieved in three ways: policy-based routing, smart switching, or setting the Squid box as a gateway.
Getting transparent caching to work requires the following steps.
For some operating systems, you have to configure and build a version of Squid which can recognize the hijacked connections and discern the destination addresses. For Linux this seems to work automatically. For BSD-based systems, you probably have to configure Squid with the --enable-ipf-transparent option. You then have to configure Squid as
httpd_accel_host virtual
httpd_accel_port 80
httpd_accel_with_proxy on
httpd_accel_uses_host_header on
You have to configure your cache host to accept the redirected packets (any IP address, on port 80) and deliver them to your cache application. This is typically done with IP filtering/forwarding features built into the kernel. On Linux this is called iptables (kernel 2.4.x), ipchains (2.2.x) or ipfwadm (2.0.x). On FreeBSD and other BSD systems it is called ip filter or ipnat. On many systems, it may require rebuilding the kernel or adding a new loadable kernel module.
Reverse Proxy Cache
A reverse proxy cache differs from direct and transparent caches in that it reduces load on the origin web server, rather than reducing upstream network bandwidth on the client side. Reverse proxy caches offload client requests for static content from the web server, preventing unforeseen traffic surges from overloading the origin server. The proxy server sits between the Internet and the web site and handles all traffic before it can reach the web server. A reverse proxy server intercepts requests to the web server and instead responds to the request out of a store of cached pages. This method improves performance by reducing the number of pages actually created "fresh" by the web server.
Configuring as Reverse Proxy
To set Squid up to run as an accelerator, you probably want it to listen on port 80. You also have to define the machine you are accelerating for. This is done in the squid module:
http_port 80
httpd_accel_host visolve.com
httpd_accel_port 81
httpd_accel_single_host on
httpd_accel_with_proxy on
If you are using Squid as an accelerator for a virtual host system, use the word virtual instead of a host name:
http_port 80
httpd_accel_host virtual
httpd_accel_port 81
httpd_accel_with_proxy on
Different Methods of Intercepting HTTP Traffic
The methods are described in detail at the following link.
http://www.visolve.com/squid/whitepapers/trans_caching.php
WCCP Configuration
Does Squid support WCCP?
Yes, Squid supports WCCP. Routers that support WCCP can be configured to direct traffic to one or more web caches using an efficient load balancing mechanism. WCCP also provides for automatic bypassing of an unavailable cache in the event of a failure.
Configuring Squid for WCCP Support
Patches to be applied to the Linux kernel.
The Linux kernel on the Squid machine should be patched with ip_wccp, as ip_gre is somewhat broken. Recompile the kernel with ip_gre and ip_wccp enabled. Then install Squid from source and configure squid.conf to point to the WCCP router.
Squid Machine configuration.
The following iptables rule redirects all HTTP traffic to Squid port 3128.
iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 3128
Cache Inside the Router's Network
If the cache is inside the router's network, packets coming from the caches must be prevented from being redirected back to the caches. The following firewall rule therefore has to be prepended on the router machine.
iptables -t mangle -I PREROUTING 1 -p tcp --dport 80 -s <ip-squid> -j ACCEPT
SNMP Configuration
Enabling SNMP Support in Squid
To use SNMP with Squid, it must be enabled with the configure script and Squid must be rebuilt. Go to the Squid source directory and follow the steps given below:
./configure --enable-snmp [ ... other configure options ]
make all
make install
Then edit the following tags in the squid.conf file:
acl aclname snmp_community public
snmp_access allow aclname
Once Squid and the SNMP server are configured, start SNMP and Squid.
Why should I go for SNMP?
SNMP in Squid is useful for a longer-term overview of how the proxy is doing. It can also be used as a problem solver. For example: how is your file descriptor usage going? Or how much does your LRU vary over a day? This information cannot be monitored otherwise.
Monitoring Squid
A number of tools can monitor Squid via SNMP, of which MRTG is the most widely used. The Multi Router Traffic Grapher (MRTG) is a tool for monitoring Squid information. It generates a graphical, dynamic view of the status by sampling data every five minutes (the interval may vary according to your needs). MRTG shows activity for the last 24 hours and also in weekly, monthly and yearly graphs.
Parameters Monitored
Squid runtime information like CPU usage, memory usage, cache hits and misses can be monitored using SNMP.
Delay Pools Configuration
Limiting Bandwidth
Delay classes are generally used in places where bandwidth is expensive. They let you slow down access to specific sites (so that other downloads can happen at a reasonable rate), and they allow you to stop a small number of users from using all your bandwidth (at the expense of those just trying to use the Internet for work).
To ensure that some bandwidth is available for work-related downloads, you can use delay pools. By classifying downloads into segments, and then allocating these segments a certain amount of bandwidth (in kilobytes per second), your link can remain uncongested for useful traffic.
To use delay pools, Squid must have been compiled with the appropriate source code: the --enable-delay-pools option has to be given when running the configure program.
An acl-operator (delay_access) is used to split requests into pools. Since ACLs are used, you can split up requests by source address, destination URL or more.
Configuring Squid with Delay Pools
To enable the delay pools option, compile Squid with --enable-delay-pools.
Example
acl tech src 192.168.0.1-192.168.0.20/32
acl no_hotmail url_regex -i hotmail
acl all src 0.0.0.0/0.0.0.0
delay_pools 1 #Number of delay pools: 1
delay_class 1 1 #pool 1 is a delay_class 1
delay_parameters 1 100/100
delay_access 1 allow no_hotmail !tech
In the above example, hotmail users are limited to the speed specified in delay_parameters. IPs in the ACL tech are allowed the normal bandwidth. You can see the usage of bandwidth through cachemgr.cgi.
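The effect of "delay_parameters 1 100/100" (a restore rate of 100 bytes per second and a bucket maximum of 100 bytes) can be pictured as a single shared token bucket. The Python sketch below is only a simplified model of that idea, not Squid's implementation; the class and function names are invented for illustration.

```python
class AggregateBucket:
    """Single shared bucket, loosely modelled on a class 1 delay pool."""

    def __init__(self, restore_rate: int, maximum: int) -> None:
        self.restore_rate = restore_rate  # bytes added per second
        self.maximum = maximum            # bucket capacity in bytes
        self.level = float(maximum)       # start with a full bucket

    def tick(self, seconds: float = 1.0) -> None:
        """Refill the bucket for the elapsed time, capped at the maximum."""
        self.level = min(self.maximum, self.level + self.restore_rate * seconds)

    def take(self, wanted: int) -> int:
        """Hand out as many bytes as are available, up to `wanted`."""
        granted = min(wanted, int(self.level))
        self.level -= granted
        return granted


def seconds_to_transfer(total_bytes: int, bucket: AggregateBucket) -> int:
    """Count the whole seconds spent waiting for refills during a transfer."""
    seconds = 0
    remaining = total_bytes
    while remaining > 0:
        remaining -= bucket.take(remaining)
        if remaining > 0:
            bucket.tick()   # wait one second for the bucket to refill
            seconds += 1
    return seconds


# A 1000-byte reply through a 100/100 pool: the first 100 bytes are sent
# immediately, the remaining 900 bytes need nine one-second refills.
print(seconds_to_transfer(1000, AggregateBucket(100, 100)))
```

In this model a larger maximum allows a bigger initial burst, while the restore rate sets the sustained speed.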
Caching
Can Squid cache FTP content?
Squid is an HTTP proxy with FTP support, not a real FTP proxy. It can download from FTP servers and can also upload to some of them, but it cannot delete or rename files on remote FTP servers. When ports 20 and 21 are blocked, deleting or renaming files on remote FTP servers is not possible. Squid speaks FTP on the server side, but not on the client side.
Can Squid cache dynamic pages?
Squid cannot cache pages that are generated dynamically by scripts. It caches only static pages.
Deleting Objects from Cache
Objects can be deleted from the cache by using the "purging" method. Squid does not allow you to purge objects unless it is configured with access controls in squid.conf. First you must edit the following tags in squid.conf:
acl PURGE method PURGE
acl localhost src 127.0.0.1
http_access allow PURGE localhost
http_access deny PURGE
The above allows purge requests which come from the local host and denies all other purge requests. An object can then be purged with the client program:
/usr/local/squid/bin/client -m PURGE <URL>
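A purge is an ordinary HTTP request that uses the PURGE method in place of GET. The Python sketch below shows roughly what such a request looks like on the wire. It assumes Squid listens on 127.0.0.1 port 3128; the function names are invented for illustration and are not part of Squid.

```python
import socket
from urllib.parse import urlsplit


def build_purge_request(url: str) -> bytes:
    """Build a raw HTTP/1.0 PURGE request for the given absolute URL."""
    host = urlsplit(url).netloc
    lines = [f"PURGE {url} HTTP/1.0", f"Host: {host}", "", ""]
    return "\r\n".join(lines).encode("ascii")


def send_purge(url: str, proxy_host: str = "127.0.0.1", proxy_port: int = 3128) -> str:
    """Send the PURGE request to Squid and return the status line of the reply."""
    with socket.create_connection((proxy_host, proxy_port), timeout=10) as conn:
        conn.sendall(build_purge_request(url))
        reply = conn.recv(4096).decode("latin-1")
    return reply.split("\r\n", 1)[0]


# Show the request without contacting any server.
print(build_purge_request("http://www.example.com/index.html").decode("ascii"))
```

Because of the access controls above, send_purge() only succeeds when it is run on the Squid machine itself.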
Specifying Cache Size
The cache size is specified with the cache_dir directive in squid.conf:
cache_dir ufs /usr/local/squid/cache 100 16 256
Here ufs is the Squid storage filesystem, /usr/local/squid/cache is the default cache directory, 100 is the cache size in MB, and 16 and 256 are the numbers of first-level and second-level subdirectories in the cache directory.
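To make the meaning of the numbers concrete, the Python sketch below splits the cache_dir line into named fields and works out how many second-level directories it creates. The function name and field names are invented for illustration, and the parser only handles the six-field ufs form shown above.

```python
def parse_cache_dir(line: str) -> dict:
    """Split a 'cache_dir <scheme> <path> <MB> <L1> <L2>' line into named fields."""
    directive, scheme, path, size_mb, level1, level2 = line.split()
    if directive != "cache_dir":
        raise ValueError("not a cache_dir line")
    return {
        "scheme": scheme,
        "path": path,
        "size_mb": int(size_mb),
        "level1": int(level1),
        "level2": int(level2),
        # Each first-level directory holds level2 second-level directories.
        "subdirectories": int(level1) * int(level2),
    }


settings = parse_cache_dir("cache_dir ufs /usr/local/squid/cache 100 16 256")
print(settings["size_mb"], settings["subdirectories"])
```

With the values from the example, the 100 MB cache is spread over 16 x 256 = 4096 second-level directories.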
Squid Swap Formats
The Squid swap formats available are ufs, aufs, diskd and coss.
Authentication
Configuring Squid for Authenticating Users
Squid allows you to configure user authentication by using the auth_param directive. This is used to define parameters for the various authentication schemes supported by Squid.
Proxy Authentication in Transparent Mode
Authentication cannot be used in a transparently intercepting proxy, as the client then thinks it is talking to an origin server and not the proxy. This is a limitation of bending the TCP/IP protocol to transparently intercept port 80, not a limitation in Squid.
Authentication Schemes Available for Squid
The Squid source code comes with a few authentication processes for Basic authentication. These include:
- LDAP: Uses the Lightweight Directory Access Protocol.
- NCSA: Uses an NCSA-style username and password file.
- MSNT: Uses a Windows NT authentication domain.
- PAM: Uses the Linux Pluggable Authentication Modules scheme.
- SMB: Uses an SMB server like Windows NT or Samba.
- getpwnam: Uses the old-fashioned Unix password file.
- sasl: Uses SASL libraries.
- winbind: Uses Samba to authenticate in a Windows NT domain.
In addition, Squid also supports the NTLM and Digest authentication schemes, which both provide more secure authentication methods where the password is not exchanged in plain text.
Configuring Squid for LDAP Authentication
Compile Squid with LDAP support:
./configure --enable-basic-auth-helpers="LDAP"
Then edit squid.conf. For example:
auth_param basic program /usr/local/squid/libexec/squid_ldap_auth -b dc=visolve,dc=com -f uid=%s -h visolve.com
acl password proxy_auth REQUIRED
http_access allow password
http_access deny all
Checking that Squid Works with LDAP Auth
To check whether the Squid machine communicates with the LDAP server, run the command below on the command line.
Example:
# /usr/local/squid/libexec/squid_ldap_auth -b dc=visolve,dc=com -f uid=%s visolve.com
The program then waits for input. Enter the uid and the password separated by a space. If it was able to connect to the LDAP server and verify the credentials, it returns "OK".
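The exchange described above (one "uid password" line in, one result line out) is how Squid talks to Basic authentication helpers in general. The Python sketch below imitates that exchange with a made-up in-memory user table in place of an LDAP lookup. The USERS table and function names are assumptions for illustration, and the sketch ignores any escaping of special characters.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical in-memory credential store; a real helper would query LDAP.
USERS: Dict[str, str] = {"alice": "secret"}


def check_line(line: str, users: Dict[str, str]) -> str:
    """Answer one 'username password' request line with OK or ERR."""
    parts = line.rstrip("\r\n").split(" ", 1)
    if len(parts) != 2:
        return "ERR"  # malformed request: no password given
    user, password = parts
    return "OK" if users.get(user) == password else "ERR"


def serve(lines: Iterable[str], users: Dict[str, str]) -> Iterator[str]:
    """Answer each request line in turn, as a helper does on stdin/stdout."""
    for line in lines:
        yield check_line(line, users)


for answer in serve(["alice secret\n", "alice wrong\n"], USERS):
    print(answer)
```

A real helper would read the request lines from standard input and flush each answer to standard output immediately, since Squid waits for the reply before handling the request.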
LDAP Group Authentication
Compile Squid with LDAP support:
./configure --enable-basic-auth-helpers="LDAP" --enable-external-acl-helpers=ldap_group
In the configuration file (squid.conf):
external_acl_type group_auth %LOGIN /usr/local/squid/libexec/squid_ldap_group -b "dc=visolve,dc=com" -f "(&(objectclass=groupOfUniqueNames)(cn=%a)(uniqueMember=uid=%v,cn=accounts,dc=visolve,dc=com))" -h visolve.com
acl gsrc external group_auth accounts
http_access allow gsrc
Configuring Squid for NCSA
NCSA Authentication
This is the easiest scheme to implement and probably the preferred choice for many environments. This type of authentication uses an Apache-style htpasswd file, which is checked whenever anyone logs in. This is the best supported option, and a web-based password changing program is provided to make it easy for users to maintain their own passwords.
To turn on NCSA authentication, edit the following directive in squid.conf:
authenticate_program /usr/local/squid/bin/ncsa_auth /usr/local/squid/etc/passwd
This tells Squid where to find the authenticator. Next we have to create an ACL.
ACL configuration for ncsa_auth:
acl auth_users proxy_auth REQUIRED
http_access allow auth_users
http_access deny all
Configuring Squid for SMB
SMB Auth Module:
smb_auth is a proxy authentication module. With smb_auth we can authenticate proxy users against an SMB server like Windows NT or Samba.
Adding smb_auth in squid.conf
Squid Configuration:
To turn on SMB authentication, edit the following directive in squid.conf:
authenticate_program /usr/local/squid/bin/smb_auth -W domain -S /share/path/to/proxyauth
This tells Squid where to find the authenticator. Next we have to create an ACL.
ACL configuration for smb_auth:
acl domainusers proxy_auth REQUIRED
http_access allow domainusers
http_access deny all
Configuring Squid for MSNT
MSNT Auth Module:
MSNT is a Squid web proxy authentication module. It allows a Unix web proxy to authenticate users with their Windows NT domain credentials.
Adding msnt_auth in squid.conf
Squid Configuration:
To turn on MSNT authentication, edit the following directives in squid.conf:
auth_param basic program /usr/local/squid/libexec/msnt_auth
auth_param basic children 5
auth_param basic realm Squid proxy-caching web server
auth_param basic credentialsttl 2 hours
This tells Squid where to find the authenticator. Next we have to create an ACL.
ACL configuration for msnt_auth:
acl auth_users proxy_auth REQUIRED
http_access allow auth_users
http_access deny all
Configuring Squid for PAM
PAM Auth Module:
This program authenticates users against a PAM-configured authentication service "squid". This allows us to authenticate Squid users against any authentication source for which we have a PAM module.
Adding pam_auth in squid.conf
Squid Configuration:
To turn on PAM authentication, edit the following directive in squid.conf:
authenticate_program /usr/local/squid/bin/pam_auth
This tells Squid where to find the authenticator. Next we have to create an ACL.
ACL configuration for pam_auth:
acl auth_users proxy_auth REQUIRED
http_access allow auth_users
http_access deny all
Configuring Squid for NTLM
NTLM authentication is a challenge-response authentication type. NTLM is a bit different and does not obey the standard rules of HTTP connection management. The authentication is a three-step (five-way) handshake per TCP connection, not per request.
1a. The client sends an unauthenticated request to the proxy / server.
1b. The proxy / server responds with "Authentication required" of type NTLM.
2a. The client responds with a request for NTLM negotiation.
2b. The server responds with an NTLM challenge.
3a. The client responds with an NTLM response.
3b. If successful, the connection is authenticated for this request and onwards.
No further authentication exchange takes place on this TCP connection.
Adding ntlm_auth and passwd file in squid.conf
Squid Configuration:
To turn on NTLM authentication, edit the following directives in squid.conf:
auth_param ntlm program /usr/local/squid/libexec/ntlm_auth (domainname)/(pdc name)
auth_param ntlm children 5
auth_param ntlm max_challenge_reuses 0
auth_param ntlm max_challenge_lifetime 2 minutes
This tells Squid where to find the authenticator. Next we have to create an ACL.
ACL configuration for ntlm_auth:
acl auth_users proxy_auth REQUIRED
http_access allow auth_users
http_access deny all
Filtering
Filtering a Website
Websites can be filtered with ACLs (Access Control Lists). Here is an example of denying a group of IP addresses access to a specific domain.
acl block_ips src <ipaddr1-ipaddr2>
acl block_domain dstdomain <domainname>
http_access deny block_ips block_domain
http_access allow all
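The rules above are read from top to bottom: the first http_access line whose ACLs all match decides the request. The Python sketch below models that evaluation order. The address range 192.168.1.10-192.168.1.20 and the domain www.example.com are placeholders standing in for <ipaddr1-ipaddr2> and <domainname>, and the domain check is simplified to an exact match.

```python
from ipaddress import IPv4Address
from typing import Callable, Dict, List, Tuple

Request = Dict[str, str]
Acl = Callable[[Request], bool]


def src_range(first: str, last: str) -> Acl:
    """ACL that matches when the client address lies in [first, last]."""
    low, high = IPv4Address(first), IPv4Address(last)
    return lambda req: low <= IPv4Address(req["client"]) <= high


def dstdomain(name: str) -> Acl:
    """Simplified ACL that matches one exact destination host name."""
    return lambda req: req["host"].lower() == name.lower()


def check_access(rules: List[Tuple[str, List[Acl]]], req: Request) -> str:
    """Return the action of the first rule whose ACLs all match."""
    for action, acls in rules:
        if all(acl(req) for acl in acls):
            return action
    # No rule matched: Squid falls back to the opposite of the last rule.
    return "deny" if rules[-1][0] == "allow" else "allow"


block_ips = src_range("192.168.1.10", "192.168.1.20")  # placeholder range
block_domain = dstdomain("www.example.com")            # placeholder domain
RULES = [
    ("deny", [block_ips, block_domain]),  # http_access deny block_ips block_domain
    ("allow", [lambda req: True]),        # http_access allow all
]

print(check_access(RULES, {"client": "192.168.1.15", "host": "www.example.com"}))
```

Only a request that matches both ACLs on the deny line is refused; a blocked address visiting another site, or another address visiting the blocked domain, falls through to "allow all".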
Denying a User Access to a Particular Site
A user can be denied access to a particular site with ACLs, using the 'dstdomain' acl type.
For example:
acl sites dstdomain .gap.com .realplayer.com .yahoo.com
http_access deny sites
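The leading dot in each dstdomain entry makes the entry cover the domain itself and every host below it. The Python sketch below illustrates that matching rule. It is a simplified model rather than Squid's code, and the function names are invented for illustration.

```python
from typing import List


def dstdomain_matches(host: str, pattern: str) -> bool:
    """Match a host name against one dstdomain entry.

    A pattern with a leading dot matches the domain itself and all of its
    subdomains; a pattern without one matches only that exact host name.
    """
    host = host.lower().rstrip(".")
    pattern = pattern.lower()
    if pattern.startswith("."):
        return host == pattern[1:] or host.endswith(pattern)
    return host == pattern


SITES: List[str] = [".gap.com", ".realplayer.com", ".yahoo.com"]


def is_denied(host: str) -> bool:
    """True when the host matches any entry of the 'sites' ACL."""
    return any(dstdomain_matches(host, pattern) for pattern in SITES)


for name in ("mail.yahoo.com", "yahoo.com", "notyahoo.com"):
    print(name, is_denied(name))
```

Note that notyahoo.com is not covered: the dot forces the match to start at a label boundary.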
Filtering a Particular Port
A particular port can be filtered with an ACL as follows:
acl block_port port 3456
http_access deny block_port
http_access allow all
Denying or Allowing Users
Access to websites can be denied during particular times as follows. To restrict a client at a given source IP from accessing a particular domain during 9am-5pm on Monday:
acl names src <ipaddr>
acl site dstdomain <domainname>
acl acltime time M 9:00-17:00
http_access deny names site acltime
http_access allow all
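The time ACL combines day codes with a time range. Squid uses one letter per day: S (Sunday), M (Monday), T (Tuesday), W (Wednesday), H (Thursday), F (Friday) and A (Saturday). The Python sketch below shows how a specification such as "M 9:00-17:00" can be checked against a given moment. It assumes both ends of the range are inclusive, and the function names are invented for illustration.

```python
from datetime import datetime, time
from typing import Set, Tuple

# Squid day codes mapped to Python's weekday numbers (Monday is 0).
DAY_CODES = {"M": 0, "T": 1, "W": 2, "H": 3, "F": 4, "A": 5, "S": 6}


def to_time(text: str) -> time:
    """Convert 'h:mm' into a datetime.time value."""
    hours, minutes = text.split(":")
    return time(int(hours), int(minutes))


def parse_time_acl(spec: str) -> Tuple[Set[int], time, time]:
    """Parse a spec such as 'M 9:00-17:00' into (weekdays, start, end)."""
    days_part, range_part = spec.split()
    weekdays = {DAY_CODES[code] for code in days_part}
    start_text, end_text = range_part.split("-")
    return weekdays, to_time(start_text), to_time(end_text)


def time_acl_matches(spec: str, moment: datetime) -> bool:
    """True when the moment falls on a listed day within the time range."""
    weekdays, start, end = parse_time_acl(spec)
    return moment.weekday() in weekdays and start <= moment.time() <= end


# 1 January 2024 was a Monday.
print(time_acl_matches("M 9:00-17:00", datetime(2024, 1, 1, 10, 30)))
```

Several day codes can be combined, so "MTWHF 9:00-17:00" would cover the whole working week in this model.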
What can't Squid filter?
Squid cannot filter viruses, and it cannot filter web pages based on their content.
Filtering a Particular MAC Address
To use ARP (MAC) access controls, you first need to compile in the optional code. Do this with the --enable-arp-acl configure option.
Example:
acl M1 arp 01:02:03:04:05:06
acl M2 arp 11:12:13:14:15:16
http_access allow M1
http_access allow M2
http_access deny all
Performance
Monitoring Squid Performance
Squid performance is monitored by using the cache manager and SNMP.
Cache Manager:
This provides access to certain information needed by the cache administrator. A companion program, cachemgr.cgi, can be used to make this information available via a web browser. Cache manager requests to Squid are made with a special URL of the form
cache_object://hostname/operation
The cache manager provides essentially "read-only" access to information. It does not provide a method for configuring Squid while it is running.
SNMP:
SNMP can be used for monitoring Squid runtime information like CPU usage, memory usage, cache hits and misses. The Multi Router Traffic Grapher (MRTG) is a tool for monitoring this information: it generates a graphical, dynamic view of the status by sampling data every five minutes.
Improving Squid Performance
Squid performance can be improved by gathering performance data for the particular environment and tuning the hardware and kernel parameters for peak performance.
Does the cache directory filesystem impact performance?
The cache directory uses the ufs storage scheme by default. Performance can be improved by selecting aufs in the cache_dir directive:
cache_dir aufs
The aufs storage scheme improves Squid's disk I/O response time by using a number of threads for disk I/O operations. The aufs code requires a pthreads library. This is the standard threads interface defined by POSIX. To use aufs, Squid must be compiled with the --enable-storeio option.
Note:
If disk caching is not used, it should be disabled by setting cache_dir to 'null /tmp'. This eliminates the need for the memory space used by Squid for the meta-data cache index.
Log files
Log Files Produced by Squid
The log files produced by Squid are squid.out, cache.log, useragent.log, store.log, hierarchy.log and access.log.
Monitoring User Access
User access information is stored in the access.log file.
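In Squid's native log format, each access.log line holds ten whitespace-separated fields: timestamp, elapsed time in milliseconds, client address, result code and HTTP status, size in bytes, request method, URL, ident, hierarchy code, and content type. The Python sketch below parses such lines and computes a simple hit ratio. The two sample lines are made up for illustration, and the sketch assumes the native format is in use.

```python
from typing import Iterable, NamedTuple


class AccessEntry(NamedTuple):
    timestamp: float   # seconds since the epoch
    elapsed_ms: int
    client: str
    result_code: str   # for example TCP_HIT or TCP_MISS
    status: int        # HTTP status code
    size: int          # bytes delivered to the client
    method: str
    url: str
    ident: str
    hierarchy: str
    content_type: str


def parse_access_line(line: str) -> AccessEntry:
    """Parse one native-format access.log line into named fields."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}")
    result_code, status = fields[3].split("/", 1)
    return AccessEntry(float(fields[0]), int(fields[1]), fields[2],
                       result_code, int(status), int(fields[4]), *fields[5:])


def hit_ratio(lines: Iterable[str]) -> float:
    """Fraction of requests whose result code contains HIT."""
    entries = [parse_access_line(line) for line in lines if line.strip()]
    if not entries:
        return 0.0
    hits = sum(1 for entry in entries if "HIT" in entry.result_code)
    return hits / len(entries)


# Made-up sample lines in the native format.
SAMPLE = [
    "1066037222.011 126 192.168.0.5 TCP_MISS/200 1024 GET http://www.example.com/ - DIRECT/192.0.2.10 text/html",
    "1066037230.452 3 192.168.0.5 TCP_HIT/200 1024 GET http://www.example.com/ - NONE/- text/html",
]
print(hit_ratio(SAMPLE))
```

The log analysers described under Tools perform this kind of parsing on a much larger scale.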
Rotating Logs
Large log files can be handled by rotating them. This is done with the following command:
squid -k rotate
To specify the number of logfile rotations to keep when you run 'squid -k rotate', set the logfile_rotate directive in squid.conf. The procedure can be scheduled with a cron entry such as the following, which rotates the logs at midnight.
0 0 * * * /usr/local/squid/sbin/squid -k rotate
Can Squid support logs larger than 2 GB?
By default, Squid does not support log files larger than 2 GB. To make Squid support files larger than 2 GB, compile it with the --with-large-files option.
Disabling Squid Log Files
Log files can be disabled as follows.
To disable access.log:
cache_access_log none
To disable store.log:
cache_store_log none
To disable cache.log:
cache_log /dev/null
Tools
Cache Manager (cachemgr.cgi)
The cache manager (cachemgr.cgi) is a CGI utility for displaying statistics about the squid process as it runs. The cache manager is a convenient way to manage the cache and view statistics without logging into the server.
Tools for Configuring Squid
Many tools are available to configure Squid, such as Webmin. You can get these tools from
http://www.squid-cache.org/related-software.html
Log Analysers
Calamaris
Calamaris is a commonly used tool to analyze Squid's access.log. It supports many features, such as generating status reports of incoming UDP requests and incoming TCP requests, both in total and on a per-host basis. It also generates reports about requested second-level domains and top-level domains, and about requested content types, file extensions and protocols. It generates ASCII or HTML reports. For a full list of features, please visit the Calamaris home page.
WebLog
WebLog is a group of Python modules containing several class definitions that are useful for parsing and manipulating common Web and Web proxy logfile formats.
The Webalizer
The Webalizer is a fast, free web server log file analysis program. It is written in C to be extremely fast and highly portable. The results are presented in both columnar and graphical format. Yearly, monthly, daily and hourly usage statistics are presented, along with the ability to display usage by site, URL, referrer, user agent, search string, entry/exit page, username and country. Processed data may also be exported into most database and spreadsheet programs that support tab-delimited data formats. In addition, wu-ftpd xferlog formatted logs and Squid proxy logs are supported.
SARG
Sarg is a Squid Analysis Report Generator that allows you to view where your users are going on the Internet. Sarg generates reports in HTML with many fields, such as users, IP addresses, bytes, sites and times.
Tools to Generate User Web Access Reports
Webmin is a web-based tool for generating web access reports. Using any browser that supports tables and forms (and Java for the File Manager module), you can set up user accounts, Apache, DNS, file sharing and so on.
Webmin consists of a simple web server and a number of CGI programs which directly update system files like /etc/inetd.conf and /etc/passwd. The web server and all CGI programs are written in Perl version 5, and use no non-standard Perl modules.
Miscellaneous
Controlling Uploads
Uploads can be controlled by using ACLs (req_header).
acl upload_control req_header Content-Length [1-9][0-9][0-9][0-9][0-9]{3,}
http_access deny upload_control
http_access allow all
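The regular expression in the upload_control ACL requires a non-zero digit followed by at least six more digits, so it matches Content-Length values of seven digits or more (1,000,000 bytes and above). The Python sketch below checks a few values against the same pattern. It uses an unanchored search, on the assumption that the pattern is applied that way, and the function name is invented for illustration.

```python
import re

# Same pattern as in the upload_control ACL.
UPLOAD_LIMIT_PATTERN = re.compile(r"[1-9][0-9][0-9][0-9][0-9]{3,}")


def upload_denied(content_length: str) -> bool:
    """True when the Content-Length header value matches the ACL pattern."""
    return UPLOAD_LIMIT_PATTERN.search(content_length) is not None


for value in ("512", "999999", "1000000", "52428800"):
    print(value, upload_denied(value))
```

To move the limit, change the number of required digits: one digit fewer in the pattern lowers the threshold to 100,000 bytes.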
Controlling Downloads
Downloads can be controlled by using the following directive.
reply_body_max_size bytes allow|deny acl