Thursday, June 25, 2009

URL Rewriting with ASP.NET

How URL rewriting accepts a URL and rewrites it

Introduction

One of the most popular extensions to the Apache webserver has been mod_rewrite - a filter which rewrites URLs. For example, instead of a URL such as

http://www.apache.org/BookDetails.pl?id=5

you could provide a filter which accepts URLs such as

http://www.apache.org/Book/5.html

and it will silently perform a server-side redirect to the first URL. In this way, the real URL can be hidden, providing an obfuscated facade to the web page. The benefits are URLs that are easier to remember and a website that is harder to hack.

Mod_rewrite became very popular and grew to encompass a couple of other features not related to URL Rewriting, such as caching. This article demonstrates URL Rewriting with ASP.NET, whereby the requested URL is matched based on a regular expression and the URL mappings are stored in the standard ASP.NET web.config configuration file. ASP.NET includes great caching facilities, so there's no need to duplicate mod_rewrite's caching functionality.

As more and more websites are rewritten with ASP.NET, links to the old sites that had been indexed by Google or placed on other sites are lost, inevitably culminating in the dreaded 404 error. I will show how legacy ASP sites can be upgraded to ASP.NET while keeping links from search engines working.

ASP.NET support for URL Rewriting

ASP.NET provides very limited support out of the box. In fact, its support is down to a single method:

void HttpContext.RewritePath(string path)

which should be called during the Application_BeginRequest() event in the Global.asax file. This is fine as long as the number of URLs to rewrite is small, finite, and manageable. However, most ASP sites are in some way dynamic, passing parameters in the query string, so we require a much more configurable approach.
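For a handful of fixed URLs, the direct approach could look roughly like the sketch below. The class name FixedRewrites, the method TryGetTarget, and the two mapped paths are made up for illustration, and the case-insensitive comparison is an assumption about how you want legacy paths matched. The lookup is kept separate from ASP.NET so it can be tried on its own; the call into HttpContext.RewritePath is shown only as a comment.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: a fixed table of legacy paths and their replacements.
public static class FixedRewrites
{
    // Assumption: legacy paths should match regardless of letter case.
    private static readonly Dictionary<string, string> Map =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "/urlrewriter/show.asp", "/urlrewriter/show.aspx" },
            { "/urlrewriter/wohs.asp", "/urlrewriter/show.aspx" }
        };

    // Returns true and sets target when the path has a fixed rewrite.
    public static bool TryGetTarget(string path, out string target)
    {
        return Map.TryGetValue(path, out target);
    }
}

// In Global.asax the helper would be used roughly like this:
//
// protected void Application_BeginRequest(Object sender, EventArgs e)
// {
//     string target;
//     if (FixedRewrites.TryGetTarget(Request.Path, out target))
//         Context.RewritePath(target);
// }
```

Every new legacy URL means another entry in the table and a recompile, which is why the rest of the article moves the rules into web.config.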

The storage location for all ASP.NET configuration information is the web.config file, so we'd really like to specify the rewrites in there. Additionally, .NET has a fast regular expression processor, giving free and fast search and replace of URLs. Let's define a section in the web.config file which specifies those rewrites:

<configuration>
  <system.web>
    <urlrewrites>
      <rule>
        <url>/urlrewriter/show\.asp</url>
        <rewrite>show.aspx</rewrite>
      </rule>
      <rule>
        <url>/urlrewriter/wohs\.asp</url>
        <rewrite>show.aspx</rewrite>
      </rule>
      <rule>
        <url>/urlrewriter/show(.*)\.asp</url>
        <rewrite>show.aspx?$1</rewrite>
      </rule>
      <rule>
        <url>/urlrewriter/(.*)show\.html</url>
        <!-- A literal '&' has to be written as '&amp;' inside XML -->
        <rewrite>show.aspx?id=$1&amp;cat=2</rewrite>
      </rule>
      <rule>
        <url>/urlrewriter/s/h/o/w/(.*)\.html</url>
        <rewrite>/urlrewriter/show.aspx?id=$1</rewrite>
      </rule>
    </urlrewrites>
  </system.web>
</configuration>

Notice how we have to escape the period in the <url> element, as in 'show\.asp'. This is a regular expression escape, and it's a small price to pay for the flexibility of regular expressions. The rules also show how we set up a capturing expression using (.*) in the <url> element and refer to that capture in the <rewrite> element with $1.
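To see the capture and the escape in isolation, here is a small sketch that uses only System.Text.RegularExpressions. The class name CaptureDemo and its two methods are illustrative and are not part of the article's download.

```csharp
using System;
using System.Text.RegularExpressions;

// Illustrative helper showing how one <url>/<rewrite> pair behaves.
public static class CaptureDemo
{
    // pattern is the text of a <url> element, replacement the text of <rewrite>.
    // Whatever (.*) captured is substituted wherever $1 appears.
    public static string Apply(string path, string pattern, string replacement)
    {
        return Regex.Replace(path, pattern, replacement);
    }

    // Reports whether the pattern matches anywhere in the path.
    public static bool Matches(string path, string pattern)
    {
        return Regex.IsMatch(path, pattern);
    }
}
```

Calling CaptureDemo.Apply("/urlrewriter/show42.asp", @"/urlrewriter/show(.*)\.asp", "show.aspx?$1") returns "show.aspx?42". Without the backslash the period matches any character, so CaptureDemo.Matches("/urlrewriter/showXasp", @"/urlrewriter/show.asp") is true, while the escaped pattern @"/urlrewriter/show\.asp" does not match that path.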

Configuration Section Handlers

.NET's configuration mechanism requires us to write code as a "handler" for this section. The handler is registered in web.config like this:

<configuration>
  <configSections>
    <sectionGroup name="system.web">
      <section name="urlrewrites" type="ThunderMain.URLRewriter.Rewriter,
               ThunderMain.URLRewriter, Version=1.0.783.30976,
               Culture=neutral, PublicKeyToken=7a95f6f4820c8dc3"/>
    </sectionGroup>
  </configSections>
</configuration>

This section handler specifies that for every section called "urlrewrites", there is a class called ThunderMain.URLRewriter.Rewriter which can be found in the ThunderMain.URLRewriter.dll assembly with the given public key token. The public key token is required because this assembly has to be placed into the GAC and therefore given a strong name.

A section handler is defined as a class which implements the IConfigurationSectionHandler interface. This has one method, Create(), which should be implemented, and in our code that is very simple. It merely stores the urlrewrites element for later use:

public object Create(object parent, object configContext, XmlNode section)
{
    // Keep the <urlrewrites> element so GetSubstitution() can read the rules later.
    _oRules = section;
    return this;
}

Initiating the rewrite process

Coming back to actually rewriting the URL, as I said earlier, we need to do something in the Application_BeginRequest() event in Global.asax - we just delegate this to another class:

protected void Application_BeginRequest(Object sender, EventArgs e)
{
    ThunderMain.URLRewriter.Rewriter.Process();
}

which calls the static method Process() on the Rewriter class. Process() first obtains a reference to the configuration section handler (which happens to be an instance of the current class) and then delegates most of the work to GetSubstitution() - an instance method of this class.

public static void Process()
{
    // The section handler's Create() returns the Rewriter instance itself.
    Rewriter oRewriter =
        (Rewriter)ConfigurationSettings.GetConfig("system.web/urlrewrites");

    string zSubst = oRewriter.GetSubstitution(HttpContext.Current.Request.Path);

    if (zSubst.Length > 0)
    {
        HttpContext.Current.RewritePath(zSubst);
    }
}

GetSubstitution() is just as simple - iterating through all possible URL Rewrites to see if one matches. If it does, it returns the new URL, otherwise it just returns the original URL:

public string GetSubstitution(string zPath)
{
    Regex oReg;

    // Rules are tried in document order; the first match wins.
    foreach (XmlNode oNode in _oRules.SelectNodes("rule"))
    {
        oReg = new Regex(oNode.SelectSingleNode("url/text()").Value);
        Match oMatch = oReg.Match(zPath);

        if (oMatch.Success)
        {
            return oReg.Replace(zPath,
                oNode.SelectSingleNode("rewrite/text()").Value);
        }
    }

    // No rule matched: hand back the path unchanged.
    return zPath;
}
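The matching loop can be tried outside ASP.NET. The sketch below repeats the same first-match-wins logic against a cut-down copy of the rules, loaded from a string instead of web.config. The class name RuleOrderSketch and the method Substitute are illustrative, and the rule set is only a subset of the one shown earlier.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml;

// Illustrative stand-in for the Rewriter class, with the rules embedded as a string.
public static class RuleOrderSketch
{
    private const string RulesXml = @"<urlrewrites>
  <rule>
    <url>/urlrewriter/show\.asp</url>
    <rewrite>show.aspx</rewrite>
  </rule>
  <rule>
    <url>/urlrewriter/show(.*)\.asp</url>
    <rewrite>show.aspx?$1</rewrite>
  </rule>
  <rule>
    <url>/urlrewriter/s/h/o/w/(.*)\.html</url>
    <rewrite>/urlrewriter/show.aspx?id=$1</rewrite>
  </rule>
</urlrewrites>";

    // Same idea as GetSubstitution(): try each rule in order, first match wins.
    public static string Substitute(string path)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(RulesXml);

        foreach (XmlNode rule in doc.DocumentElement.SelectNodes("rule"))
        {
            Regex pattern = new Regex(rule.SelectSingleNode("url").InnerText);
            if (pattern.IsMatch(path))
            {
                // Only the matched part of the path is replaced.
                return pattern.Replace(path, rule.SelectSingleNode("rewrite").InnerText);
            }
        }

        return path; // no rule matched
    }
}
```

With these rules, "/urlrewriter/show.asp" becomes "show.aspx", "/urlrewriter/show42.asp" becomes "show.aspx?42", and "/other/page.aspx" comes back unchanged. Note that the patterns are not anchored with ^ and $, so a rule also fires when its pattern appears in the middle of a longer path, and only the matched part is replaced: "/app/urlrewriter/show.asp" turns into "/appshow.aspx". If that is not what you want, anchoring the patterns is one option.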

Installing the sample code

Extract the code into a URLRewriter folder, then turn this into a virtual directory using the Internet Information Services MMC control panel applet. Compile the code into the bin sub-folder using the 'Make Rewriter.bat' batch script. Then add bin/ThunderMain.URLRewriter.dll to the Global Assembly Cache by copying and pasting the dll into %WINDIR%\assembly using Windows Explorer. Finally, navigate to http://localhost/URLRewriter/default.aspx and try the demo URLs listed.

None will actually work because there's one last thing we have to be aware of...

Finally

There's one major caveat with all this. If you want to process a request with a file extension other than .aspx such as .asp or .html, then you need to change IIS to pass all requests through to the ASP.NET ISAPI extension. Unfortunately, you will need physical access to the server to perform this, which prevents you from simply XCOPY deploying your code to an ISP.

Adding a mapping for all file types


We've added the HEAD, GET and POST verbs to all files with the .* file extension (i.e. all files) and mapped those to the ASP.NET ISAPI extension - aspnet_isapi.dll.

A mapping for all file types has been added


The complete range of mappings, including the new .* mapping.

For more details: http://www.codeproject.com/KB/aspnet/urlrewriter.aspx

Tuesday, June 23, 2009

Custom Error Pages In ASP.NET

This tutorial is part of series that covers error handling in ASP.NET. You can find additional information on:

* Errors And Exceptions In ASP.NET - covers different kinds of errors and try-catch blocks, and introduces the Exception object, throwing an exception, and the Page_Error procedure.
* Application Level Error Handling in ASP.NET - goes a level higher and explains how to handle unhandled errors with the Application_Error procedure in the Global.asax file, how to use custom HTTP modules, send a notification e-mail to the administrator, show different error pages based on roles, and log errors to text files, a database or the EventLog.

This tutorial deals with the user experience when an error occurs. When an error occurs in an ASP.NET web application, the user gets the default error page (which is not very nice looking, and is also known as the "Yellow Screen of Death"). This error page confuses the average visitor, who doesn't know the meaning of "Runtime Error". Although developers like to know many details about an error, it is better to show a friendlier error page to your users.

The Web.config file contains a customErrors section inside <system.web>. By default, this section looks like this:

<customErrors mode="RemoteOnly" />

As you can see, the customErrors tag has a mode parameter whose value is "RemoteOnly". This means that detailed messages are shown only if the site is accessed from the server itself, through http://localhost. Site visitors who access it from external computers will see another, more general message, like in the image below:

Default ASP.NET error message

The mode parameter can have three possible values:

- RemoteOnly - this is the default value. Detailed messages are shown if you access the site through localhost, and the more general (default) error message is shown to remote visitors.

- On - the general (default or custom) error message is shown to everyone, including requests from localhost.

- Off - the detailed error message is shown to everyone. This could be a security problem, since part of the source code where the error occurred is shown too.

The default ASP.NET error message hides details about the error, but it is still not user friendly. Instead of this page, ASP.NET allows you to create your own error page. After the custom error page is created, you need to reference it in the customErrors section by using the defaultRedirect parameter, as in the code snippet below:

<customErrors mode="RemoteOnly" defaultRedirect="~/DefaultErrorPage.htm" />

When an error occurs, the ASP.NET runtime will redirect the visitor to DefaultErrorPage.htm. On the custom error page you can gently inform your visitors what happened and what they can do about it.

DefaultErrorPage.htm will be displayed when any error occurs. It is also possible to show different custom error pages for different HTTP status codes. We can do that by using the <error> sub tag. In the code sample below, specialized pages are shown for error 404 (File Not Found) and error 403 (Forbidden).

<customErrors mode="On" defaultRedirect="~/DefaultErrorPage.htm" >
  <error statusCode="404" redirect="~/FileNotFound.htm"/>
  <error statusCode="403" redirect="~/Forbidden.htm"/>
</customErrors>
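The selection this configuration describes, a specific page when the status code has its own <error> entry and defaultRedirect otherwise, can be pictured with a small lookup. This is only a sketch of the rule and not how ASP.NET implements it internally; the class name ErrorPageLookup and the method Resolve are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Illustrative model of the customErrors section shown above.
public static class ErrorPageLookup
{
    // Mirrors the defaultRedirect attribute.
    private const string DefaultRedirect = "~/DefaultErrorPage.htm";

    // Mirrors the <error statusCode="..." redirect="..."/> entries.
    private static readonly Dictionary<int, string> ByStatusCode =
        new Dictionary<int, string>
        {
            { 404, "~/FileNotFound.htm" },
            { 403, "~/Forbidden.htm" }
        };

    // Returns the specific page for the status code, or the default page.
    public static string Resolve(int statusCode)
    {
        string page;
        return ByStatusCode.TryGetValue(statusCode, out page) ? page : DefaultRedirect;
    }
}
```

Here ErrorPageLookup.Resolve(404) gives "~/FileNotFound.htm", while a code without its own entry, such as 500, falls back to "~/DefaultErrorPage.htm".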

How to set an error page for a single ASP.NET page

Custom pages configured in Web.config affect the complete web site. It is also possible to set an error page for a single ASP.NET page. We can do this by using the errorPage attribute of the @Page directive. The code could look like this:

<%@ Page language="C#" Codebehind="SomePage.aspx.cs" errorPage="MyCustomErrorPage.htm" AutoEventWireup="false"%>

Http Error page codes

There are different HTTP status codes that your web application could return. Some errors are more frequent than others, and you probably don't need to cover all cases. It is OK to provide custom error pages for the most common error codes and a default error page for the rest.

400 Bad Request

The server cannot process the request because of errors in its syntax. The request should be corrected before it is sent again.

401 Unauthorized

This error happens when the request doesn't contain authentication credentials, or when authorization is refused because of bad credentials.

402 Payment Required

Not in use; it is reserved for the future.

403 Forbidden

The server refused to execute the request, although it is in the correct format. The server may or may not provide information about why the request is refused.

404 Not Found

The server cannot find the resource requested in the URI. This is a very common error, so you should add a custom error page for this code.

405 Method Not Allowed

There are different HTTP request methods, like POST, GET, HEAD, PUT, DELETE etc., that could be used by the client. The web server can be configured to allow or disallow a specific method. For example, if a web site has only static pages, the POST method could be disabled. There are many theoretical options, but in reality this error usually occurs if the POST method is not allowed.

406 Not Acceptable

A web client (browser or robot) states in the Accept headers of its request which kinds of content it can receive. If the server cannot produce a response that satisfies them, it returns this error. The error will not happen (or will happen very rarely) when a web browser requests a page.

407 Proxy Authentication Required

This error could occur if the web client accesses the web server through a proxy. If proxy authentication is required, you must first log in to the proxy server and then navigate to the wanted page. This error is therefore similar to error 401 Unauthorized, except that here the problem is with proxy authentication.

408 Request Timeout

If the client does not send a complete request within the time period defined on the web server, the server will drop the connection and send error 408 Request Timeout. The reason is usually a temporary problem with the Internet connection, or a timeout interval on the web server that is too short.

409 Conflict

This error rarely occurs on a web server. It means that the web request from the client is in conflict with some server or application rule on the web server.

410 Gone

This error means that the web server can't find the requested URL. But, as opposed to error 404 Not Found, which says "That page does not exist", 410 says something like "The page was here, but not anymore". Depending on the configuration of the web server, after some time period the server will change the error message to 404 Not Found.

411 Length Required

This error is rare when the web client is a browser. The web server expects a Content-Length header in the web request, and the request did not include one.

412 Precondition Failed

This is also a rare error, especially if the client is a web browser. The error occurs when a precondition that the client placed in the request headers (for example If-Match or If-Unmodified-Since) is not met on the web server.

413 Request Entity Too Large

This error occurs when the web request is too large. This is also a very rare error, especially when the request is sent by a web browser.

414 Request URI Too Long

Similar to error 413, this error occurs if the URL in the web request is too long. The limit depends on the server and is usually 2048 to 4096 characters. If the requested URL is longer than the server's limit, this error is returned. 2048 characters is quite a lot, so this error occurs rarely. If your web application produces this error, it is possible that something is wrong with your URLs, especially if you build them dynamically with ASP.NET server side code.

415 Unsupported Media Type

This error occurs rarely, especially if the request is sent by a web browser. There can be three different reasons for this error: the media type of the request data doesn't match the media type specified in the request, the server is not able to handle that kind of data for the resource, or the media type is not compatible with the requested HTTP method.

416 Requested Range Not Satisfiable

This is a very rare error. A client request can contain a Range header, which names the part of the resource, as byte positions, that the client wants to receive. For example, if the client asks for bytes 2000 to 3000 of an image that is only 1500 bytes long, the requested range lies completely outside the image and this error is returned. However, web page hyperlinks usually don't specify any Range value, so this error rarely occurs.

417 Expectation Failed

This is also a rare error, especially if the client is a web browser. A request can contain an Expect header; if the server cannot meet the expectation it states, this error is returned.

500 Internal Server Error

This is a very common error, and the client doesn't know what the problem is. The server only says that there is some problem on the server side. On the server itself, more information is usually available. If the server hosts an ASP.NET application, this often means that there is an error in the ASP.NET application. More details about the error could be logged to the EventLog, a database or plain text files. To see how to get error details, take a look at the Application Level Error Handling In ASP.NET tutorial.

501 Not Implemented

This is a rare error. It means that the web server doesn't support the HTTP method used in the request. Common HTTP methods are POST, GET, HEAD, TRACE etc. If some other method is used and the web server can't recognize it, this error will be returned.

502 Bad Gateway

This error occurs when the server is working as a gateway and needs to pass the request on to an upstream web server. If the upstream web server's response is not correct, the first server will return this error. The reason for this error is often a bad Internet connection, some problem with a firewall, or a problem in communication between servers.

503 Service Unavailable

This error means that the server is temporarily unable to handle requests, usually because of planned maintenance or overload. Of course, it is not completely down because it can send the 503 error :), but it is not working properly. The client should expect that the system administrator is working on the server and that the server will be up again after the problem is solved.

504 Gateway Timeout

Similar to error 502 Bad Gateway, there is a problem somewhere between the server and the upstream web server. In this case, the upstream web server takes too long to respond and the first server returns this error. This could happen, for example, if your Internet connection is slow, or if communication in your local network is slow or overloaded.

505 HTTP Version Not Supported

The web server doesn't support the HTTP version used by the web client. This should be a very rare error. It could happen if the web client tries to access some very old web server which doesn't support newer HTTP protocol versions (before v. 1.x).

Show different error pages based on roles

By using the RemoteOnly value for the customErrors mode parameter in Web.config, you get detailed messages when you access the site through localhost and custom messages when you access it remotely. This could be a problem, because sometimes you need to access the web application remotely and still see detailed messages. If you have shared hosting, then this is the only option. Of course, you still don't want to show error details to end users.

If you use some role based security architecture, you can show the detailed message to a single logged-in user (you) or to all users that belong to some role, for example "Developers". This way, developers logged in to the web application will see detailed error messages and your users will still see just a friendly custom notification.

To implement this idea we need to add some code to the Application_Error procedure in the Global.asax file. The code could look like this:

[ C# ]

void Application_Error(object sender, EventArgs e)
{
    // Context.User can be null if the error happened before authentication
    if (Context != null && Context.User != null)
    {
        // Of course, you don't need to use both conditions below
        // If you want, you can use only your user name or only role name
        if (Context.User.IsInRole("Developers") ||
            (Context.User.Identity.Name == "YourUserName"))
        {
            // Use Server.GetLastError to receive current exception object
            Exception CurrentException = Server.GetLastError();

            // We need this line to avoid real error page
            Server.ClearError();

            // Clear current output
            Response.Clear();

            // Show error message as a title
            Response.Write("<h1>Error message: " + CurrentException.Message + "</h1>");
            // Show error details
            Response.Write("<p>Error details:</p>");
            Response.Write(CurrentException.ToString());
        }
    }
}

[ VB.NET ]

Sub Application_Error(ByVal sender As Object, ByVal e As EventArgs)
    ' Context.User can be Nothing if the error happened before authentication
    If Context IsNot Nothing AndAlso Context.User IsNot Nothing Then
        ' Of course, you don't need to use both conditions below
        ' If you want, you can use only your user name or only role name
        If Context.User.IsInRole("Developers") OrElse _
            (Context.User.Identity.Name = "YourUserName") Then

            ' Use Server.GetLastError to receive current exception object
            Dim CurrentException As Exception = Server.GetLastError()

            ' We need this line to avoid real error page
            Server.ClearError()

            ' Clear current output
            Response.Clear()

            ' Show error message as a title
            Response.Write("<h1>Error message: " & _
                CurrentException.Message & "</h1>")
            ' Show error details
            Response.Write("<p>Error details:</p>")
            Response.Write(CurrentException.ToString())
        End If
    End If
End Sub
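The handlers above write the exception text straight into the page. Exception messages can contain characters such as < and &, which the browser would then treat as markup, so it may be worth HTML-encoding the text first. The sketch below is one possible way to build the output. The class name ErrorHtml and the method Build are hypothetical, and it uses System.Net.WebUtility.HtmlEncode so that it runs outside a web application; inside Global.asax, Server.HtmlEncode would do the same job.

```csharp
using System;
using System.Net;
using System.Text;

// Hypothetical helper that renders an exception as HTML-safe markup.
public static class ErrorHtml
{
    public static string Build(Exception ex)
    {
        StringBuilder html = new StringBuilder();

        // Error message as a title, encoded so it cannot inject markup
        html.Append("<h1>Error message: ");
        html.Append(WebUtility.HtmlEncode(ex.Message));
        html.Append("</h1>");

        // Full details; <pre> keeps the stack trace line breaks readable
        html.Append("<p>Error details:</p>");
        html.Append("<pre>");
        html.Append(WebUtility.HtmlEncode(ex.ToString()));
        html.Append("</pre>");

        return html.ToString();
    }
}
```

In Application_Error, the three Response.Write calls could then be replaced by a single Response.Write(ErrorHtml.Build(CurrentException)).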

ASP.NET Custom Error Pages Remarks

By using custom error pages you can give your ASP.NET web application a more professional look. Although your visitors will not be happy if something is not working, it is much better to tell them that something is wrong, that you are aware of it, and that you will correct it as soon as possible. That will bring you closer to your users. Errors are almost unavoidable, but your competence in dealing with them makes the difference.

I hope that you find this tutorial helpful. Happy programming!
