Monday, January 30, 2012


This articles covers the different purposes of .htaccess files.

I didnot  .htaccess could be used to do so many things until i read the juicy matters below. I though .htccess was just for re-directions and re-write.
Lets list em out and  first and describe later. First the usual,

1. Re-directions
2. Redirections

and then comes the cool stuffs that i didnot think ever about .htaccess could do

3. Serving Custom Error Pages
4. Restricting Access to Specific Resources
5. Block Access to Certain Entities
6. Force an IE Rendering Mode
7. Implement Caching
8. Enabling Compression

Redirect 301 ^old\.html$ http://localhost/new.html

This sets the HTTP status code to 301 (moved permanently) and redirects all requests to old.html transparently to new.html. We use a regular expression to match the URL to redirect, which gives us a fine degree of control to ensure only the correct URL is matched for redirection, but adds complexity to the configuration and administration of it. The full URL of the resource being redirected to is required.

RewriteEngine on
RewriteRule ^old\.html$ new.html

In this example, we just provide a simple file redirect from one file to another, which will also be performed transparently, without changing what is displayed in the address bar. The first directive, RewriteEngine on, simply ensures that the rewrite engine is enabled.
In order to update what is displayed in the address bar of the visitor's browser, we can use the R flag at the end of the RewriteRule e.g.

1 RewriteRule ^old\.html$ http://hostname/new.html [r=301]

The r flag causes an external redirection which is why the full URL (an example URL here) to the new page is given. We can also specify the status code when using the flag. This causes the address bar to be updated in the visitor's browser.
One of the possible uses for URL rewriting I gave at the start of this section was to make unsightly URLs (containing query-string data) friendlier to visitors and search engines. Let's see this in action now:

2 RewriteRule ^products/([^/]+)/([^/]+)/([^/]+) product.php?cat=$1&brand=$2&prod=$3

This rule will allow visitors to use a URL like products/turntables/technics/sl1210, and have it transformed into product.php?cat=turntables&brand=technics&prod=sl1210. The parentheses in between the forward slashes in the above regular expression are capturing groups – we can use each of these as $1, $2 and $3 respectively. The [^/]+ character class within the parentheses means match any character except a forward-slash 1 or more times.
In practice, URL rewriting can be (and usually is) much more complex and achieve far greater things than this. URL rewriting is better explained using entire tutorials so we won't look at them in any further detail here.
Serving Custom Error Pages

It's just not cool to show the default 404 page anymore. Many sites take the opportunity offered by a file not found error to inject a little humour into their site, but at the very least, people expect the 404 page of a site to at least match the style and theme of any other page of the site.
Very closely related to URL rewriting, serving a custom error page instead of the standard 404 page is easy with an .htaccess file:
1 ErrorDocument 404 "/404.html"
That's all we need; whenever a 404 error occurs, the specified page is displayed. We can configure pages to be displayed for many other server errors too.
Restricting Access to Specific Resources
Using .htaccess files, we can enable password protection of any file or directory, to all users, or based on things like domain or IP address. This is after all one of their core uses. To prevent access to an entire directory, we would simple create a new .htaccess file, containing the following code:

AuthName "Username and password required"
AuthUserFile /path/to/.htpasswd
Require valid-user
AuthType Basic
This file should then be saved into the directory we wish to protect. The AuthName directive specifies the message to display in the username/password dialog box, the AuthUserFile should be the path to the .htpasswd file. The Require directive specifies that only authenticated users may access the protected file while the AuthType is set to Basic.
To protect a specific file, we can wrap the above code in a directive, which specifies the protected file: 

AuthName "Username and password required"
AuthUserFile /path/to/.htpasswd
Require valid-user
AuthType Basic

We also require an .htpasswd file for these types of authentication, which contains a colon-separated list of usernames and encrypted passwords required to access the protected resource(s). This file should be saved in a directory that is not accessible to the web. There are a range of services that can be used to generate these files automatically as the password should be stored in encrypted form.
Block Access to Certain Entities
Another use of .htaccess files is to quickly and easily block all requests from an IP address or user-agent. To block a specific IP address, simply add the following directives to your .htaccess file:

order allow,deny
deny from
allow from all

The order directive tells Apache in which order to evaluate the allow/deny directives. In this case, allow is evaluated first, then deny. The allow from all directive is evaluated first (even though it appears after the deny directive) and all IPs are allowed, then if the client's IP matches the one specified in the deny directive, access is forbidden. This lets everyone in except the specified IP. Note that we can also deny access to entire IP blocks by supplying a shorter IP, e.g. 192.168.
To deny requests based on user-agent, we could do this:

RewriteCond %{HTTP_USER_AGENT} ^OrangeSpider

RewriteRule ^(.*)$ http://%{REMOTE_ADDR}/$ [r=301,l]
In this example, any client with a HTTP_USER_AGENT string starting with OrangeSpider (a bad bot) is redirected back to the address that it originated from. The regular expression matches any single character (.) zero or more times (*) and redirects to the %{REMOTE_ADDR} environment variable. The l flag we used here instructs Apache to treat this match as the last rule so will not process any others before performing the rewrite.
Force an IE Rendering Mode
Alongside controlling how the server responds to certain requests, we can also do things to the visitor's browser, such as forcing IE to render pages using a specific rendering engine. For example, we can use the mod_headers module, if it is present, to set the X-UA-Compatible header:
1 Header set X-UA-Compatible "IE=Edge"
Adding this line to an .htaccess file will instruct IE to use the highest rendering mode available. As demonstrated by HTML5 Boilerplate, we can also avoid setting this header on files that don't require it by using a

      Header unset X-UA-Compatible

Implement Caching
Caching is easy to set up and can make your site load faster.
Caching is easy to set up and can make your site load faster. 'Nuff said! By setting a far-future expires date on elements of sites that don't change very often, we can prevent the browser from requesting unchanged resources on every request.
If you're running your site through Google PageSpeed or Yahoo's YSlow and you get the message about setting far-future expiry headers, this is how you fix it:

ExpiresActive on
ExpiresActive on
ExpiresByType image/gif                 "access plus 1 month"
ExpiresByType image/png                 "access plus 1 month"
ExpiresByType image/jpg                 "access plus 1 month"
ExpiresByType image/jpeg                "access plus 1 month"
ExpiresByType video/ogg                 "access plus 1 month"
ExpiresByType audio/ogg                 "access plus 1 month"
ExpiresByType video/mp4                 "access plus 1 month"
ExpiresByType video/webm                "access plus 1 month"

You can add different ExpiresByType directives for any content that is listed in the performance tool you're using, or anything else that you want to control caching on. The first directive, ExpiresActive on, simply ensures the generation of Expires headers is switched on. These directives depend on Apache having the mod_expires module loaded.
Enabling Compression
Another warning we may get in a performance checker refers to enabling compression, and this is also something we can fix simply by updating our .htaccess file:

FilterDeclare   COMPRESS
FilterProvider  COMPRESS  DEFLATE resp=Content-Type $text/html
FilterProvider  COMPRESS  DEFLATE resp=Content-Type $text/css
FilterProvider  COMPRESS  DEFLATE resp=Content-Type $text/javascript
FilterChain     COMPRESS
FilterProtocol  COMPRESS  DEFLATE change=yes;byteranges=no

This compression scheme works on newer versions of Apache (2.1+) using the mod_filter module. It uses the DEFLATE compression algorithm to compress content based on its response content-type, in this case we specify text/html, text/css and text/javascript (which will likely be the types of files flagged in PageSpeed/Yslow anyhow).
In the above example we start out by declaring the filter we wish to use, in this case COMPRESS, using the FilterDeclare directive. We then list the content types we wish to use this filter. The FilterChain directive then instructs the server to build a filter chain based on the FilterProvider directives we have listed. The FilterProtocol directive allows us to specify options that are applied to the filter chain whenever it is run, the options we need to use are change=yes (the content may be changed by the filter (in this case, compressed)) and byteranges=no (the filter must only be applied to complete files).
On older versions of Apache, the mod_deflate module is used to configure DEFLATE compression. We have less control of how the content is filtered in this case, but the directives are simpler:

 SetOutputFilter DEFLATE
 AddOutputFilterByType DEFLATE text/html text/css text/javascript

In this case we just set the compression algorithm using the SetOutputFilter directive, and then specify the content-types we'd like to compress using the AddOutputFilterByType directive.
Usually your web server will use one of these modules depending on which version of Apache is in use. Generally, you will know this beforehand, but if you are creating a generic .htaccess file that you can use on a variety of sites, or which you may share with other people and therefore you don't know which modules may be in use, you may wish to use both of the above blocks of code wrapped in
directives so that the correct module is used and the server doesn't throw a 500 error if we try to configure a module that isn't included. You should be aware that it's also relatively common for hosts that run a large number of sites from a single box to disable compression as there is a small CPU performance hit for compressing on the server.

No comments: