Here's a quick tip on making pretty URLs for your website. Typically you'd apply this to nasty-looking URLs, the kind full of random character strings you'll never remember, e.g. /index.php?title=Page_title, and they can get a lot uglier than that!

What I'm doing here is a simpler approach: we'll remove the file extension from your URLs, so where you would normally have 'your_site.com/index.html' or 'your_site.com/contact.php' you can simply have 'your_site.com/index/' or 'your_site.com/contact/'.

There's a lot in favour of pretty URLs, the main benefits being:

  • They simply look nicer and are easier to remember for users.
  • They are search engine-friendly.
  • They reveal less about how the site works, as query strings and file extensions are hidden.

In this example there's no real security risk or SEO penalty to fix, as we're just removing the file extension from near-pretty URLs, but it's always nice to have prettier URLs and it makes for good practice with Apache's mod_rewrite module.

So what's this mod_rewrite thing? Well, in short, if we want pretty URLs we have to tell the Apache server to rewrite incoming URLs so that the pretty version maps to the real file. We give Apache these instructions in a configuration file called .htaccess, which you create and save in your site's root directory. This only works if mod_rewrite is enabled on the server and your host allows .htaccess files to override its settings.

Using the following code in your .htaccess file tells Apache to serve your .html and .php files at URLs without the file extension:

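# Switch on the rewrite engine
RewriteEngine On

# /name/ is served from name.html when that file exists
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^([^/]+)/$ $1.html [L]

# /name/ is served from name.php when that file exists
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^([^/]+)/$ $1.php [L]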

The RewriteEngine line switches on the rewrite engine. Each RewriteCond line sets a condition, and conditions apply only to the next RewriteRule that follows them. The !-d and !-f conditions save your site from losing all its images, styling and behaviours by restricting the rewrite to paths that don't actually exist: !-d skips existing directories and likewise !-f skips existing files. The condition ending in .html -f checks that a file with the requested name plus the .html extension really exists, and only then does the rule map the extensionless URL to that file. The remaining lines form an almost identical condition-and-rule set that does exactly the same thing for files with the .php extension. It's pretty straightforward when you think about it.
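
If it helps to see that decision process outside of Apache, here is a rough Python model of it. This is only a sketch of the logic described above, not how Apache works internally: the resolve function and the site_files and site_dirs sets are made up for illustration and stand in for the contents of your site's root directory.

```python
import re

# The pattern from the RewriteRule lines
RULE = re.compile(r"^([^/]+)/$")

def resolve(path, files, dirs):
    """Simplified model of the rule set: map a requested path to what is served.

    `files` and `dirs` are hypothetical sets standing in for the site root.
    """
    name = path.rstrip("/")
    # Like RewriteCond !-d and !-f: leave real directories and files alone
    if name in dirs or name in files:
        return path
    m = RULE.match(path)
    if m:
        # Like RewriteCond <name>.html -f, then <name>.php -f
        for ext in (".html", ".php"):
            candidate = m.group(1) + ext
            if candidate in files:
                return candidate
    # No rule applied; the request goes through unchanged
    return path

site_files = {"index.html", "contact.php", "style.css"}
site_dirs = {"images"}

print(resolve("contact/", site_files, site_dirs))   # contact.php
print(resolve("index/", site_files, site_dirs))     # index.html
print(resolve("images/", site_files, site_dirs))    # images/
print(resolve("style.css", site_files, site_dirs))  # style.css
print(resolve("about/", site_files, site_dirs))     # about/
```

Existing files and directories pass straight through, which is why your stylesheets and images keep working, and a pretty URL with no matching file is left untouched.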

So what does the rewrite rule actually tell the server? Each RewriteRule tries to match a regular expression pattern against the path of the incoming URL (in an .htaccess file, the path is matched without its leading slash). The pattern sits between a caret and a dollar sign (^pattern$), which anchor it to the start and end of the path.

Our regular expression pattern is ([^/]+)/ and it tells the server the following:

  • '(...)' groups the subexpression and captures whatever it matches, so we can reuse it later as $1.
  • '[...]' defines a character set.
  • '[^/]' matches any single character except '/'.
  • '+' matches one or more of the preceding item, as many as possible.

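Together, ([^/]+) matches a name that contains no '/', such as 'contact', and the '/' after it requires the URL to end in a trailing slash, for a bit of tidiness. The second part of the rule, '$1.html', is the substitution: $1 is replaced with the captured name and .html is added back on, or .php in the case of the following rule, so the server quietly serves the real file while the visitor only ever sees the pretty URL.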
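
You can try the pattern out without touching your server. The short Python sketch below uses the same regular expression; the capture and rewrite helpers are just names I've made up for the demo, and note that Python writes the captured group as \1 where Apache writes $1.

```python
import re

# Same pattern as in the RewriteRule, tested against paths with the
# leading slash already removed, e.g. "contact/"
PATTERN = re.compile(r"^([^/]+)/$")

def capture(path):
    """Return what the (...) group captures, or None if the pattern fails."""
    m = PATTERN.match(path)
    return m.group(1) if m else None

def rewrite(path, ext):
    """Apply the substitution: captured name plus extension (Apache's $1.html)."""
    return PATTERN.sub(r"\1" + ext, path)

print(capture("contact/"))    # contact
print(capture("contact"))     # None: no trailing slash
print(capture("blog/post/"))  # None: there is a '/' inside the name
print(capture("/"))           # None: '+' needs at least one character

print(rewrite("contact/", ".html"))  # contact.html
print(rewrite("contact/", ".php"))   # contact.php
```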

One important thing to note here: you'll need to update the links within your files yourself. You've told the server how to respond to the pretty URLs, but it does not rewrite any of the links inside your .html and .php files.
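
If you have a lot of links to change, a small script can do the donkey work. The Python sketch below is one possible approach rather than a finished tool: prettify_links is a made-up helper, and it assumes your links are written as plain href="name.html" or href="name.php" and that your site lives at the root of the domain. It writes the new links with a leading slash because, once a page's URL ends in '/', the browser treats it like a directory and would resolve a relative link such as contact/ underneath it.

```python
import re

# Matches href="name.html" or href="name.php" for same-directory links only.
# Links containing '/', ':', '#' or '?' (external or more complex URLs) are
# deliberately left alone. This is an assumption about how the links are written.
LINK = re.compile(r'href="([^"/:#?]+)\.(?:html|php)"')

def prettify_links(html):
    """Rewrite extension links to the pretty form, e.g. contact.php -> /contact/."""
    return LINK.sub(r'href="/\1/"', html)

sample = (
    '<a href="contact.php">Contact</a> '
    '<a href="https://example.com/a.html">Elsewhere</a>'
)
print(prettify_links(sample))
# <a href="/contact/">Contact</a> <a href="https://example.com/a.html">Elsewhere</a>
```

Run it over a copy of your files first and check the result, since any link that doesn't fit the assumed shape will need changing by hand.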