Apache web server as HTTP proxy

By | October 9, 2013

Apache provides a lot of modules out of the box, and many custom modules built by the community or by vendors like IBM are also available. One such out-of-the-box module is mod_proxy. It makes it possible to use your web server as a proxy server for handling web requests. A reverse proxy can improve security and reduce risk to your network: if you set up Apache as a reverse proxy, your application(s) can run inside the network behind the firewall, with the proxy server as the only gateway for all incoming web traffic.

Basic configuration

To configure Apache as a proxy, first enable the mod_proxy module by adding the following to your httpd.conf. You’ll also need to load proxy_http_module, since you want the proxy to handle HTTP requests. proxy_http_module will not work unless mod_proxy is also loaded.

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so

Then set up the proxy directives to do the job.

ProxyRequests Off
ProxyPass /webapp http://10.123.123.123:9080/webapp
ProxyPassReverse /webapp http://10.123.123.123:9080/webapp

Restart Apache and you are done. You have set up a web proxy.

 

What do the above proxy directives mean?

 

ProxyRequests Off tells Apache that this server must not be used as a generic forward proxy. It does not affect the specific mappings configured with ProxyPass and ProxyPassReverse, which keep working. Never set ProxyRequests On unless you have secured your Apache web server using a combination of SSL, authentication and authorization modules.
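
If you ever do need a forward proxy, the warning above can be made concrete with an access-controlled <Proxy> block. The following is only a minimal sketch, assuming Apache 2.4 (Require syntax, which needs mod_authz_host) and a hypothetical internal network 192.168.0.0/24; neither comes from the setup described in this post.

```apache
# Sketch only: a forward proxy restricted to one internal network.
# Assumes Apache 2.4; 192.168.0.0/24 is a hypothetical client range.
ProxyRequests On
ProxyVia On

<Proxy "*">
    # Only clients from the internal range may use the proxy
    Require ip 192.168.0.0/24
</Proxy>
```

For the reverse proxy setup in this post, none of this is needed and ProxyRequests should stay Off.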

ProxyPass will map any incoming request for http://proxy_server:80/webapp to http://10.123.123.123:9080/webapp. This directive is what makes Apache act as a reverse proxy (gateway) for the backend application.

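ProxyPassReverse handles responses coming back from 10.123.123.123. When the backend sends a redirect, Apache rewrites the URL in the Location, Content-Location and URI response headers from http://10.123.123.123:9080/webapp to http://proxy_server:80/webapp, so the requester is never sent directly to the internal server. It only touches these headers, not URLs inside the page body. With this directive in place, the reverse proxy configuration is complete.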
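
To illustrate what ProxyPassReverse changes, here is a sketch of a redirect passing through the proxy. The /webapp/login path is a made-up example, and the host name shown to the client depends on the name the client used to reach the proxy.

```apache
# Backend answers a request with a redirect (hypothetical path):
#   HTTP/1.1 302 Found
#   Location: http://10.123.123.123:9080/webapp/login
#
# With the directive below, the client instead receives:
#   Location: http://proxy_server/webapp/login
ProxyPassReverse /webapp http://10.123.123.123:9080/webapp
```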

 

Set up multiple proxies using a single Apache web server

 

You can do this in more than one way. The best method is the one that fits your requirements.

An effective way is to use multiple virtual hosts, one per listening port.

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so

Listen 80
Listen 81

# Requests arriving on port 80 go to the first backend
<VirtualHost *:80>
ProxyRequests Off
ProxyPass /webapp http://10.123.123.123:9080/webapp
ProxyPassReverse /webapp http://10.123.123.123:9080/webapp
</VirtualHost>

# Requests arriving on port 81 go to the second backend
<VirtualHost *:81>
ProxyRequests Off
ProxyPass /webapp http://10.124.124.124:9080/webapp
ProxyPassReverse /webapp http://10.124.124.124:9080/webapp
</VirtualHost>
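
If opening a second port is not convenient, a similar split can probably be achieved with name-based virtual hosts on a single port. This is a sketch under assumptions: app1.example.com and app2.example.com are hypothetical DNS names pointing at the proxy, and Apache 2.2 additionally needs a NameVirtualHost *:80 line.

```apache
# Sketch only: host names are hypothetical placeholders.
Listen 80

<VirtualHost *:80>
    ServerName app1.example.com
    ProxyRequests Off
    ProxyPass /webapp http://10.123.123.123:9080/webapp
    ProxyPassReverse /webapp http://10.123.123.123:9080/webapp
</VirtualHost>

<VirtualHost *:80>
    ServerName app2.example.com
    ProxyRequests Off
    ProxyPass /webapp http://10.124.124.124:9080/webapp
    ProxyPassReverse /webapp http://10.124.124.124:9080/webapp
</VirtualHost>
```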

Another simple, but not so elegant, way is to use different context roots.

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
 
Listen 80
Listen 81
 
ProxyRequests Off
 
ProxyPass /webapp1 http://10.123.123.123:9080/webapp
ProxyPassReverse /webapp1 http://10.123.123.123:9080/webapp
 
ProxyPass /webapp2 http://10.124.124.124:9080/webapp
ProxyPassReverse /webapp2 http://10.124.124.124:9080/webapp
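
The same mappings can also be written inside <Location> blocks, in which case the local path is taken from the block and omitted from the directives. A sketch of the first mapping in that form, using the same addresses as above:

```apache
# Equivalent to: ProxyPass /webapp1 http://10.123.123.123:9080/webapp
<Location "/webapp1">
    ProxyPass http://10.123.123.123:9080/webapp
    ProxyPassReverse http://10.123.123.123:9080/webapp
</Location>
```

This form can be easier to read once other per-path settings need to sit next to the proxy directives.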

Getting Apache proxy to rewrite included/embedded URLs using ProxyHTMLURLMap

The steps given above will serve your purpose if your web pages are self-contained, that is, all the functionality is embedded in the pages themselves. However, that’s not the way most e-commerce sites, portals or content management sites work. URLs are generated by applications on the fly. For example, WebSphere Commerce generates the URLs for all menu items when the storefront is first loaded (using a specific virtual host), and the URLs are based on the virtual host used for requesting the storefront. This is a problem for simple proxies like the ones we configured above. Using the steps above, your proxy will work for the main page / storefront, but it will not work for any of the submenus, as the URLs for those menus still point to the original server. To fix this, load mod_proxy_html and add the following directives to your httpd.conf.

SetOutputFilter INFLATE;proxy-html;DEFLATE
ProxyHTMLURLMap http://10.124.124.124:9080 http://proxy.server.com

These directives rewrite the embedded URLs in the page being proxied, replacing 10.124.124.124:9080 with proxy.server.com (port 80) in those URLs.
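
On Apache 2.4, where mod_proxy_html ships with the server, the same rewrite is usually enabled with ProxyHTMLEnable instead of SetOutputFilter. The following is a sketch under assumptions: the module file names and the /webapp location are taken to match a default build and the earlier examples, and the element/attribute list (ProxyHTMLLinks) is assumed to come from the sample proxy-html.conf that ships with the server, whose location varies by distribution.

```apache
# Sketch only, assuming Apache 2.4 with mod_proxy_html and mod_xml2enc available.
LoadModule proxy_html_module modules/mod_proxy_html.so
LoadModule xml2enc_module modules/mod_xml2enc.so

<Location "/webapp">
    # Turn on the proxy-html output filter for this path
    ProxyHTMLEnable On
    # Literal prefix match: rewrite links that start with the backend address
    ProxyHTMLURLMap http://10.124.124.124:9080 http://proxy.server.com
    # ProxyHTMLLinks definitions are also required for links to be found
</Location>
```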

 

Getting Apache proxy to rewrite included/embedded URLs using mod_substitute

Another way of modifying embedded URLs within the proxied page is to use mod_substitute. This is particularly useful if the page contains JavaScript (JS) and/or AJAX; the earlier ProxyHTMLURLMap method may not work properly in such cases. Use the following example for mod_substitute.

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule substitute_module modules/mod_substitute.so
 
ProxyRequests Off
ProxyPass /webapp http://10.123.123.123:9080/webapp
ProxyPassReverse /webapp http://10.123.123.123:9080/webapp
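# Run HTML responses through the SUBSTITUTE filter so the rule below takes effect
AddOutputFilterByType SUBSTITUTE text/html
Substitute "s|http://10.123.123.123:9080/|http://www.proxyserver.com/|ni"

The Substitute directive above replaces any embedded URLs in the response going from the internal server (10.123.123.123) to the requester/user, changing http://10.123.123.123:9080 to http://www.proxyserver.com (the name of the proxy server). The n flag treats the pattern as a fixed string, and i makes the match case-insensitive.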
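
One thing to watch for: mod_substitute works on the response text, so it cannot match anything if the backend sends a compressed body. A common workaround, sketched here on the assumption that mod_headers is available, is to strip the Accept-Encoding request header so the backend replies uncompressed.

```apache
# Sketch only: assumes mod_headers is installed at the usual module path.
LoadModule headers_module modules/mod_headers.so

<Location "/webapp">
    # Ask the backend for an uncompressed response so the text can be matched
    RequestHeader unset Accept-Encoding
    # Run HTML responses through the substitution filter
    AddOutputFilterByType SUBSTITUTE text/html
    Substitute "s|http://10.123.123.123:9080/|http://www.proxyserver.com/|ni"
</Location>
```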

Will these steps work for IBM HTTP Server?

These exact same steps work for IBM HTTP Server as well.

There are a lot of advanced configurations available, and you can read about them here