Firewalling HTTP Traffic Using Reverse Squid Proxy
Rajeev Kumar
Ever wonder why lines such as "default.ida?XXXXX...."
or "cmd.exe" appear in your Web server logs? If your Web
server is running at the default port 80/tcp, it's likely that
your Web server is being attacked by one of many Internet worms.
Common worms, such as Code Red, send a text string like the following
in the form of a URL (read as a single line):
http://<yourwebserver>/default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN%u909
0%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u780
1%u9090%u9090%u8190%u00c3%u0003%u8b00%u531b%u53ff%u0078%u0000%u00=a
The presence of this string in the log file does not necessarily mean
your computer has been compromised; it only implies that a Code Red
worm has attempted to infect your computer. In some cases, the worm
might have infected your computer, crashed your Web server process,
or (at a minimum) just filled your log files.
A common technique for protecting network resources is to place
them behind a port-based firewall. Unfortunately, the practice of
denying access by port number does not work well for a Web server.
You need to keep port 80 open so that outside users can access the
server, but if you pass all messages addressed to port 80 to the
interior network, you defeat the whole purpose of a firewall. If
you are going to protect a Web server with a firewall, you need
an application/protocol-based firewall that allows more diverse
and selective access rules. One type of firewall that provides this
kind of protection is known as a proxy firewall. In this article,
I will describe how to set up Squid as a proxy firewall in front
of your Web server.
Squid (http://www.squid-cache.org) is a popular freeware
Web-content caching program. The role of Squid as a forward Web
server proxy/cache is well known. In its forward proxy configuration,
Squid accesses Internet data on behalf of a client on the local
network. The configuration I describe in this article is exactly
the opposite of the common forward-proxy scenario. This article describes
the case in which the Web server is on the local network and the
client is connecting from the Internet. In other words, Squid is
acting as a reverse proxy.
Squid as a Reverse Proxy
Reverse Squid proxy, also known as Web Server Acceleration, is
a method of reducing the load on your backend Web server using a
Squid cache between the Internet and the Web server. Another advantage
of reverse proxy is that you can improve security for your Web server
by applying Squid ACLs and exerting control over what Web requests
(URLs) are allowed to reach your Web server. The steps for setting
up the cache are the same for standard (forward) and reverse Squid
proxy. This document will focus on the additional task of applying
Squid ACL security (HTTP URL filtering/firewalling) for reverse
proxy configurations.
A reverse proxy is positioned between the Internet and your real
Web server (see Figure 1). When a client browser on the Internet
makes an HTTP request, the DNS infrastructure routes the request
to the reverse proxy server. The reverse proxy server then connects
to the real Web server to get the Web page content and send it back
to the client browser. (Optionally, you can take advantage of the
Squid cache feature. If the same page has been previously accessed
within the timeout period, Squid may be able to serve the request
from its cache.) If your Web server is serving mostly dynamic content,
such as cgi-bin, ASP, JSP, etc., using the cache feature is not
recommended.
Setting up Squid Reverse Proxy
The following discussion applies to Squid version 2.x. Squid version
3.x is in development at the time of this writing. The version 3.x
configuration looks a little different, although it follows a similar
concept. I suggest you use the latest STABLE version for a production
environment. I have tested this setup on Linux and Solaris.
To install Squid, download the latest STABLE version (2.5.STABLE4,
as of this writing) from:
http://www.squid-cache.org
Unzip/untar it in a temporary location, change to the Squid source
directory, and run ./configure with the following options
(see ./configure --help for more options):
./configure --prefix=/home/packages/squid --enable-ssl
--enable-useragent-log --enable-referer-log --enable-storeio=ufs,null
This command will configure Squid at location /home/packages/squid,
enable the SSL feature using OpenSSL (you need the OpenSSL package
installed for this option), enable a few extra logging features, and,
finally, enable storage I/O options for the caching mechanism. If the
--enable-storeio option is not specified, "ufs" is the default cache
storage module/format under which Squid stores cached objects;
--enable-storeio=ufs,null enables both the "ufs" and "null" storage
I/O modules. Note that these are only compile-time options that build
the code for the various I/O modules. The actual decision to run
Squid with or without a cache is made later, at run time, in the
Squid configuration file (using the "cache_dir" directive): select
the "ufs" module to run Squid with a cache, or the "null" module to
run without one. See the Configuring Squid section for more details
on running without a cache using the "null" module.
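At run time, that choice is expressed with one of the following
kinds of cache_dir lines in squid.conf (the path and sizes here
are illustrative):

# Cache on disk with the "ufs" module: a 100-MB cache with 16
# first-level and 256 second-level directories
cache_dir ufs /home/packages/squid/var/cache 100 16 256

# Or disable caching entirely with the "null" module (the directory
# argument is required but unused)
cache_dir null /tmp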
Then type the following commands:
make
make install
These commands install Squid in the directory /home/packages/squid
(which I'll call $SQUID_HOME from now on). Create Unix user "squid"
now, too. Change the ownership of the $SQUID_HOME/var directory as follows.
(This step is important because the squid process will write in this
area as userid "squid".)
chown -Rh squid $SQUID_HOME/var
Configuring Squid
Edit the default configuration file $SQUID_HOME/etc/squid.conf.
Read this file for more detail about the various options. The variables
that are most important for reverse Squid proxy are shown in Listing
1. See the sidebar for a discussion of Listing 1.
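In Squid 2.5, those reverse-proxy (accelerator) directives typically
look something like the following; the backend hostname here is
illustrative:

# Port where Squid accepts requests from the Internet
http_port 80
# Hostname (or IP address) of the real backend Web server
httpd_accel_host realserver.mywebserver.xxx
# Port the backend Web server listens on
httpd_accel_port 80
# All requests go to a single backend host
httpd_accel_single_host on
# Pass the client's Host: header through to the backend
httpd_accel_uses_host_header on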
For initial testing, make sure you are allowing any request to
your Web server behind Squid, and once that works, firewall your
HTTP server(s) as a final step. Initially, relax your Squid ACL
by adding the following line (before the last deny rule http_access
deny all; this setting may already be present in the default
squid.conf). Your squid.conf may look like:
# This allows all requests to the Web server behind Squid:
http_access allow all
http_access deny all
Squid follows a top-down approach in ACL implementation: each
http_access rule is tested against an incoming request in order. In
the above case, Squid will allow all connections, so the last deny
rule is not effective. If you are planning to use the cache feature,
initialize the cache with the following command, which will create
the cache infrastructure directories under the $SQUID_HOME/var/cache
area. If you are using the cache_dir null /tmp option in
squid.conf, it won't create any cache directories:
$SQUID_HOME/sbin/squid -z
Testing Your Reverse Proxy Configuration
It is time to test your Squid configuration for reverse proxy.
Start the Squid server in debug mode so that you can see detailed
logs. Watch for logs in the $SQUID_HOME/var/logs/ directory:
$SQUID_HOME/sbin/squid -d 100
Watch for any errors and correct your setup before you proceed. Open
your Web browser and access the Web server by typing a URL such as
http://www.mywebserver.xxx. Note that the external DNS infrastructure
actually maps the hostname www.mywebserver.xxx to the IP address of
the Squid proxy server; in other words, this URL points to the proxy
server for the outside world. The Squid server receives the Web
request first and then directs it to the real backend Web server.
Check the "access.log" file for
Squid and see whether you find logs for this access. If this is not
working, set up the ACL debug option in squid.conf (as described below)
and check "cache.log". (Comment out the existing "debug_options"
first):
debug_options ALL,1 33,2
Restart the Squid server and try your URL again. This should print
lots of ACL debug information. Correct your problems and then turn
off the debug ACL option by resuming the original "debug_options".
Firewalling Your HTTP Server
The main objective of this document is to set up an HTTP-level
firewall in front of the real Web server. So far, we have set up
an operational Squid server in front of the Web server. All requests
are passing through the Squid server and reaching the real Web
server behind Squid. It is time to set up a choke point at the Squid
server and allow only desired traffic to the real Web server. Look
for the "http_access" section in the squid.conf file.
Before you set up the "http_access" section in the squid.conf
file, first set up ACL variables that will be used in the "http_access"
options. The "acl" directives have the following syntax
(taken from the default squid.conf):
# acl aclname acltype string1 ...
# acl aclname acltype "file" ...
# when using "file", the file should contain one item per line
The squid.conf file uses descriptive names for ACL variables.
For example, the default squid.conf may contain the ACL lines shown
in Listing 2.
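For instance, definitions of the kind used later in this article
might look something like this (the network and domain values are
illustrative):

acl all src 0.0.0.0/0.0.0.0
acl mynetwork src 192.168.0.0/255.255.255.0
acl myports port 80 8080
acl myserver dstdomain www.mywebserver.xxx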
You can define more ACL variable names per your site requirements.
The various ACL names are just definitions, and so far this is not
affecting any HTTP traffic flow. We will now use these ACL names
in a real access control list by applying the "http_access"
directive in squid.conf (see example from squid.conf in Listing
3).
I should explain a few important points first. "http_access"
ACL filters are applied top-down: every HTTP request is matched
against the rules from the top of the list, and the first rule that
matches is applied, with the corresponding action taken
(allow/deny). A "LOGICAL AND" is performed across multiple
"aclname" strings listed on the same "http_access" line, and a
"LOGICAL OR" applies to "aclname" strings on different
"http_access" lines. Consider the following example.
Suppose a Web server with (real) IP address 10.0.0.1 is running
behind the Squid server, which is applying the ACLs shown in Listing
4. The Squid server IP address is 1.2.3.4. The Squid server will
redirect the request to a real Web server after applying filters.
This configuration will allow any valid URL, such as
http://www.mywebserver.xxx, http://1.2.3.4, etc., from
"mynetwork" AND (Logical AND) on ports 80 and 8080 (assuming Squid
is listening at these ports). OR (Logical OR), if a request comes
from somewhere other than "mynetwork" (due to "!mynetwork"), this
configuration will allow only the URL http://www.mywebserver.xxx
(but not http://1.2.3.4) AND only at ports 80 and 8080 (i.e., all
three "aclnames" in that http_access line must be satisfied). OR,
any other request is denied by the last rule.
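A sketch consistent with this description (not necessarily Listing
4's exact contents, reusing ACL names of the kind defined earlier):

http_access allow mynetwork myports
http_access allow !mynetwork myserver myports
http_access deny all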
As the Squid documentation notes, if a request matches none of the
"http_access" rules, the default action is the opposite of the last
rule in the squid.conf file. In the above case, this means Squid
will behave as if "http_access allow all" were the final rule, which
has no effect here because the preceding "deny all" rule already
denies everything. If the last rule is not a "deny all" rule,
however, you may be allowing some unwanted traffic. The following
sections describe some case studies.
Case 1: Deny Random Attacks on the Web Server
An attacker or worm may attempt to send a URL containing cmd.exe
(on IIS running Microsoft Windows) or /bin/sh on Unix Web servers.
Deny such attacks by adding the ACL list in Listing 5.
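A filter of the kind in Listing 5 might look something like this
(the regular expressions are illustrative, not exhaustive):

# Deny URL paths containing common worm/attack signatures
acl badurls urlpath_regex -i \.ida cmd\.exe root\.exe /bin/sh
http_access deny badurls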
Any request like http://www.mywebserver.xxx/default.ida?XXXXXXX
or http://www.mywebserver.xxx/blahblah/cmd.exe from anywhere
will be dropped at Squid before reaching the real Web server.
Just by defining the configuration shown in Listing 5, your Web
server will be greatly relieved and will avoid most automated attacks
or script kiddie attacks. A seasoned attacker, however, can go beyond
this and may use more sophisticated techniques (such as directory
traversal and Unicode-based attacks) to evade regular-expression
and string-based signatures like those above.
Case 2: Sophisticated Access Level for Your Web Server
Suppose you want to allow only a particular URL path, such as
http://www.mywebserver.xxx/open_to_all, to your Web server
from anywhere, and you want to allow URL http://www.mywebserver.xxx/secret
only from your inside network (e.g., 192.168.0.0/24). Control such
access by adding the ACL list shown in Listing 6.
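An ACL set implementing this policy might look something like the
following (the path names match the example URLs, and the inside
network is as stated above):

acl opendir urlpath_regex ^/open_to_all
acl secretdir urlpath_regex ^/secret
acl insiders src 192.168.0.0/255.255.255.0

http_access allow opendir
http_access allow secretdir insiders
http_access deny all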
Using a Redirector
So far, we have only discussed the option of defining an access
list in the squid.conf file and redirecting HTTP requests to a real
Web server. This approach is not very flexible if you have some
complicated requirements or are dealing with many Web servers behind
Squid. A redirector program lets you redirect your HTTP request
in a more useful and flexible manner. A redirector is an external
process (an executable program, running in an infinite loop, written
in shell script, Perl, C/C++, Python, or any other language) that
can rewrite a URL. Squid can be configured to pass every incoming
URL to the redirector process. The redirector replies with a
modified URL or with a blank line; a blank line specifies no change
to the current URL. A redirector program simply reads standard
input, consisting of four arguments (supplied by the Squid process
as part of the redirection), and, after any modification, writes
these arguments back to standard output, which Squid accepts as the
modified query. These four arguments are:
- URL
- ip-address/fqdn
- ident
- method
A redirector program is not a part of the Squid distribution.
You must write your own redirector, but this is very simple if you
know any programming language. For example, a simple redirector
written in Perl is shown in Listing 7. Add the following redirector
line to your squid.conf file and restart the Squid server (see
Figure 2):
redirect_program <Full_Path>/myredirector.pl
When Squid starts, it will spawn the redirector process. The redirector
process must run in an infinite loop (like the while() loop in
Listing 7). The redirector reads its standard input, modifies the
URL, and writes the modified URL back to standard output. In the
above example, Squid will accept a request for www.mywebserver.xxx
at standard HTTP port 80, and then redirect the request to a real
Web server running at port 8080.
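A minimal redirector along those lines might look like the following
Perl sketch (the rewrite rule and hostname are illustrative, not the
exact code of Listing 7):

#!/usr/bin/perl
# Minimal Squid redirector: read "URL ip/fqdn ident method" lines
# from stdin; write the (possibly rewritten) URL to stdout.
$| = 1;   # unbuffered output so Squid receives each reply at once
while (<STDIN>) {
    my ($url, $addr, $ident, $method) = split;
    # Send requests for the public name to the backend on port 8080
    $url =~ s|^http://www\.mywebserver\.xxx|http://www.mywebserver.xxx:8080|;
    print "$url\n";
}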
You can create a complicated redirector for your site needs. Many
of the things Squid cannot do for you out of the box can be implemented
using your redirector program. You may also use redirectors already
written by others, such as:
Squirm -- http://squirm.foote.com.au/
jesred -- http://ivs.cs.uni-magdeburg.de/~eelkner/webtools/jesred/
Squidguard -- http://www.squidguard.org/
Limitations
Squid has a few limitations that are worth mentioning. If your
real Web server uses any "HTTP REDIRECT" statement, the reverse
Squid proxy server should listen at the same port as the real
(backend) Web server; otherwise, unexpected results may occur. This
is a problem if you are running the Squid reverse proxy and the
real (backend) Web server on the same machine, because you cannot
bind both Squid and the real Web server to the same port (for
example, port 80). A possible workaround is to let Squid listen on
port 80 of the real IP address, bind the real Web server to port 80
of the loopback address (127.0.0.1), and direct traffic from Squid
to the loopback address.
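With the Squid 2.5 accelerator directives, this workaround might be
expressed something like this (1.2.3.4 stands in for the machine's
real address, as in the earlier example):

# Squid accepts outside connections on the real address
http_port 1.2.3.4:80
# The real Web server is bound to the loopback address
httpd_accel_host 127.0.0.1
httpd_accel_port 80
httpd_accel_single_host on
httpd_accel_uses_host_header on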
Another limitation is that Squid cannot effectively act as a reverse
proxy for an SSL-based HTTPS backend server. Squid allows HTTPS
connections in the forward direction through the CONNECT method, but
it does not offer an equivalent reverse-direction SSL capability.
Squid can act as an HTTPS termination point. In other words, reverse
Squid proxy can be configured as an HTTPS server in front of any
HTTP (non-SSL)-based server. Thus, traffic will be encrypted between
the client browser and the reverse Squid proxy, and unencrypted
traffic will pass between the reverse Squid proxy server and the
real (backend) server. This solution is often acceptable if the
traffic to the backend server passes across a protected internal
network segment. Squid version 3.x promises to provide additional
encryption options.
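If Squid was built with --enable-ssl, the HTTPS termination point is
configured with the https_port directive; a minimal sketch (the
certificate and key paths are illustrative):

https_port 443 cert=/home/packages/squid/etc/cert.pem key=/home/packages/squid/etc/key.pem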
Conclusions
Conventional port-based firewalls cannot detect sophisticated
traffic, such as a non-HTTP protocol running over port 80. Port
80 is a very lucrative avenue for application service providers
(ASPs), as well as for intruders. Proper monitoring of such ports
using techniques such as those described in this article can be
very helpful to organizations wanting to protect their Web servers.
Squid can protect any Web server running the HTTP protocol, including
Windows IIS server, Apache, or any other commercial Web server.
Squid's powerful redirector feature can automate the redirection
process. Squid can also work as an SSL-based HTTPS termination point,
which means you can provide an HTTPS connection to your existing
(HTTP) Web site without touching your real Web server.
References
Squid Web site -- http://www.squid-cache.org
Squid configuration guide example -- http://squid.visolve.com/
Monitoring Squid health using Calamaris -- http://calamaris.cord.de/
Detailed Squid log analysis using Sarg -- http://web.onda.com.br/orso/index.html
Rajeev Kumar is currently working as Senior Systems & Security
Administrator for Fluent Inc., USA. He received his B.E. from IIT
Roorkee and M.Tech. from IIT Kanpur, India, in Chemical Engineering.
He has more than six years of Unix/Linux and systems security experience.
He maintains the Web site http://www.rajeevnet.com,
where he publishes freeware code and systems/security documents.
Rajeev can be contacted at: rajeev@rajeevnet.com or rxk@fluent.com.