[Sunday, January 24, 2010]

Full Text Search using Solr / Lucene and C# / .NET. End to end tutorial on using these technologies with SolrNet.

Link to source code and other goodies discussed here.

I think the first thing to answer is, What is Solr? I'll let Solr explain itself:

Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.

Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allows it to be tailored to almost any type of application without Java coding, and it has an extensive plugin architecture when more advanced customization is required.


Simply put, Lucene is an open-source Java library that does full-text searching. If you want to provide searching beyond a simple SQL LIKE query, you'll need full-text search. Lucene does just that, but first, it's written in Java (there is a .NET port called Lucene.Net, but frankly it's way behind the original Java project, and I don't know how active that community is). Second, it's just a library, so you'd need to build your own search engine around it. That's where Solr comes in. Solr is a Java web application that functions as a search engine (using Lucene under the covers). The nice thing about it is that it has a simple XML-based web service API, which means it can be used from C# (or any other language, for that matter).

The point of this tutorial is to provide a simple walkthrough for getting Solr up and running to index and search some documents using C#. I work in Computer Forensics / E-Discovery, where being able to search and filter documents is vital. I learned a lot on the job, and felt that there wasn't really one good end-to-end tutorial aimed at C# developers, so I hope someone will find at least some of this information useful.

Here's how this will work. First, I'll walk you through getting Tomcat up and running. Then, I'll demonstrate how to install Solr. Next, I'll have you download a little utility I wrote which basically creates a database, and downloads a whole bunch of text files off the net. Finally, we'll write some C# code to index and search those text files.

Part 1: Installing Tomcat

From their website:

Apache Tomcat is an open source software implementation of the Java Servlet and JavaServer Pages technologies. The Java Servlet and JavaServer Pages specifications are developed under the Java Community Process.


Since Solr is a Java web app, we need a server to host the Java Servlet (that's basically a Java web app). First, you'll need to make sure Java is installed on your machine. You can go to this site which will tell you whether or not it's installed: http://www.java.com/en/download/help/testvm.xml If you see this:



then you have Java installed. Otherwise, head on over to this link, and grab the latest version of the JRE (Java Runtime Environment.)

The next step is to install Tomcat. You can grab it here: http://apache.cyberuse.com/tomcat/tomcat-6/v6.0.24/bin/apache-tomcat-6.0.24.exe That's the Windows installer version. Run the installer and click Next until you get to this page:



You can put any port number into the Port text box (the default is 8080). I put 8983, as that's the default you'll see in all the Solr examples on the net. You can choose any port you'd like; it doesn't really matter. Just remember which port you chose, as it'll always be part of the URL when connecting to Solr. In all my examples here, I'll be using 8983.

On the next screen, it'll try to find the location of the JRE on your system:



It should find it automatically, but if it doesn't, make sure to point it to the directory where Java was installed on your system. Click Next and Tomcat will install.

Once you finish the installation, Tomcat will start and you'll see this:



(If you don't see that, either look in the Windows taskbar in the bottom right for the Tomcat icon, or go to Start -> Programs -> Apache Tomcat 6.0 -> Configure Tomcat.)

Click on the Start button and watch Tomcat start up. Above the Start button you should now see "Service Status: Started". If you don't, here's what you need to do. Go to Program Files\Java\jre6\bin and locate the file called msvcr71.dll. Copy that file and paste it into C:\Windows\system. Now try starting Tomcat again, and it should work. Don't ask me why, but after breaking my head over it a while back on one of my test machines, I found the answer on Google.

Once Tomcat is started, open a web browser and navigate to: http://localhost:8983 (remember, if you used a different port during installation, use that one instead). You should be greeted with this beautiful page:



If you don't see that page, please refer to Tomcat's FAQ.

Part 2: Installing Solr

The next step in this process is to install the Solr web application. First, let's shut down Tomcat. (Click Start -> Programs -> Apache Tomcat 6.0 -> Configure Tomcat and click on the Stop button.) Next, we'll need to download Solr. As of the time of this blog post, the latest version of Solr is 1.4, so let's download and install that one. Head on over to this page and choose the mirror of your choice: http://www.apache.org/dyn/closer.cgi/lucene/solr/1.4.0 Once you click on one of the mirror links, you'll be taken to a page where you can choose between the different formats to download. Download this one:

apache-solr-1.4.0.zip

Once that's downloaded, unzip the file and locate the folder "$\apache-solr-1.4.0\dist". In there you'll see a file "apache-solr-1.4.0.war"; war files are basically zipped-up files that contain all of the necessary binaries for the servlet container (Tomcat) to run the webapp. Copy that file and paste it to: "$\Program Files\Apache Software Foundation\Tomcat 6.0\webapps". Rename the file to solr.war, because the file name becomes the part of the URL used to access Solr.

Next, we need to create the "Solr Home". The Solr home is basically a folder where Tomcat will look for all the relevant Solr configurations, as well as where the actual index files will be stored. Go back to the unzipped solr package, and locate the folder: "$\apache-solr-1.4.0\example\solr". Copy the contents of that directory (should be a bin folder, a conf folder and a readme.txt) and place it under your root directory, under a folder named Solr (e.g. C:\Solr). Note: You can place it anywhere on your hard drive, doesn't have to be under the root, but wherever you place it, make sure to remember the full path. For now, don't worry about what exactly you're doing, I'll explain more later.

The next step is to tell Tomcat where the Solr Home folder is. Open Tomcat's configuration window again, and go to the Java tab. Locate the Java Options text box, and enter this line at the end of all the other stuff that's already there:

-Dsolr.solr.home=c:\solr

(If you chose a different Home Folder, place that path instead.)



Now go back to the General tab, and click Start. Once Tomcat starts up, open your browser and navigate to:

http://localhost:8983/solr/

You should be greeted with this page:



Click on the Solr Admin link and you'll see this page:



In the Query String box, replace the word solr with this: *:* (this is the Lucene syntax for "select all") and click Search. You should now see a page of results in XML format. At this point we haven't indexed anything, so the result set will be empty, but congratulations, you've set up Solr!
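Since this tutorial is aimed at C# developers, here is a minimal sketch of how the URL for such a query could be built in code. It assumes Solr 1.4's standard /select handler with the q, start and rows parameters; the SolrUrls class and BuildSelectUrl method are hypothetical names made up for this illustration, not part of Solr or SolrNet.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Solr or SolrNet.
public static class SolrUrls
{
    // Builds a URL of the form {base}/select?q=...&start=...&rows=...
    public static string BuildSelectUrl(string solrBaseUrl, string query, int start, int rows)
    {
        return string.Format(
            "{0}/select?q={1}&start={2}&rows={3}",
            solrBaseUrl.TrimEnd('/'),      // tolerate a trailing slash on the base URL
            Uri.EscapeDataString(query),   // URL-encode the query text
            start,
            rows);
    }
}
```

Pasting the resulting URL into a browser should return the same kind of XML response as the admin page. Later on SolrNet does this work for us, so the helper is only here to show that nothing magical is going on.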

Part 2.1: Configuring Solr

In order to move on to Solr Configurations, it's important to take a minute and explore how Lucene works. I'm no expert, and if you want a deeper understanding, use Google, but I will mention a few key points.

The first step to getting documents searchable is to index them. I'm sure you've heard that term before, but what exactly does that mean? At the heart of indexing are two concepts known as tokenizing and filtering. The idea is that you use analyzers (each built from a tokenizer and, optionally, some filters) to analyze the data and store the result in a custom file structure known as an index. I think this page does a great job of explaining it:

The analyzer's job is to take apart a string of text and give you back a stream of tokens. The tokens are presumably usually words from the text content of the string, and that's what gets stored (along with the location and other details) in the index.

Each analyzer includes one or more tokenizers and may include filters. The tokenizers take care of the actual rules for where to break the text up into words (typically whitespace). The filters do any post-tokenizing work on the tokens (typically dropping out punctuation and commonly occurring words like "the", "an", "a", etc).


So you feed Lucene the data, it analyzes and tokenizes it, and at that point it is searchable. Then, when you submit a query, it runs the query through a matching set of tokenizers and filters and returns the results. Why is this relevant? Well, because different data needs to be tokenized differently. For most text you'll use a WhitespaceTokenizer, which splits on whitespace, followed by filters that handle things like case and special characters. For integers or dates, for example, you obviously don't want that kind of tokenizer. You want to be able to specify that this is an int field, so that it can be sorted as an int, for example.
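To make the tokenizer and filter idea a bit more concrete, here's a toy sketch in C#. It is NOT how Lucene implements analysis; it only mimics the shape of the pipeline (split on whitespace, lower-case, drop stop words). The ToyAnalyzer class and its three-word stop list are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Toy illustration of "tokenize, then filter"; not Lucene's actual implementation.
public static class ToyAnalyzer
{
    // Tiny made-up stop word list (Solr reads its real list from stopwords.txt).
    private static readonly HashSet<string> StopWords =
        new HashSet<string> { "a", "an", "the" };

    public static List<string> Analyze(string text)
    {
        var tokens = new List<string>();

        // Tokenizer step: passing a null separator array splits on whitespace.
        foreach (string raw in text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries))
        {
            // Filter step 1: normalize case.
            string token = raw.ToLowerInvariant();

            // Filter step 2: drop commonly occurring words.
            if (!StopWords.Contains(token))
            {
                tokens.Add(token);
            }
        }
        return tokens;
    }
}
```

Running the same function over both the document text and the query text is what makes a search for "Fox" match a document containing "fox".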

That's where the Solr config file comes in. You specify all the fields that you want to index, as well as which tokenizers and filters to use for those fields. If you've been following along with this tutorial, you'll find the config file(s) under $\solr\conf. There will be many files there, but we won't concern ourselves with all of them. For now we'll focus on schema.xml. The other one I'd like to mention is solrconfig.xml, which basically holds the settings for Solr itself (how often to commit the index, how much RAM to use while indexing, etc.). The great thing about the example files is that the XML itself contains very detailed documentation.

For this tutorial, I'll be using text files that can be downloaded for free from http://www.textfiles.com. There you'll find thousands of text files from the early days of the net. I'll provide a link further down to a simple application that I wrote that scrapes the site, downloads as many files as you specify, and creates a database for you with the relevant information. For now though, I'd like to show you the config file needed to index these documents. We'll be indexing these fields:


  • fileid

  • doctext

  • title

  • datecreated



Here's the schema.xml file needed:



<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example" version="1.2">
  <types>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true" />
    <fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0" />
    <fieldType name="date" class="solr.TrieDateField" omitNorms="true" precisionStep="0" positionIncrementGap="0" />
    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt" />
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" />
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt" />
      </analyzer>
    </fieldType>
  </types>
  <fields>
    <field name="fileid" type="int" indexed="true" stored="true" required="true" />
    <field name="doctext" type="text" indexed="true" stored="false" required="false" />
    <field name="title" type="text" indexed="true" stored="false" required="false" />
    <field name="datecreated" type="date" indexed="true" stored="false" />
  </fields>
  <uniqueKey>fileid</uniqueKey>
  <defaultSearchField>doctext</defaultSearchField>
  <solrQueryParser defaultOperator="OR" />
</schema>


As you can see, the XML is pretty straightforward. I'd like to touch on two of the attributes in the field tag:

The first one is "stored". A lot of people make this mistake the first time they mess with Lucene: they store everything in the index. I'd like to stress this point really, really strongly. An index is NOT a data storage mechanism. It's not intended for that, and it's not optimized for that. That's what databases are for. What I mean by that is, the text you send to get indexed gets tokenized and totally bastardized. There's no way to reconstruct the original document from it. So what Lucene allows you to do is store the text as-is in the index as well, for later retrieval. That's OK for small indexes or small fields. However, when you get to larger indexes or larger fields, your performance will suffer noticeably. What's most commonly done is that you index everything you need, but only STORE the primary key (e.g. fileid). When you retrieve the items from the index during a search, you get the primary key field, and then you query your database, which IS meant for data storage, to get the rest of the fields you need. That's the approach I'll be taking here.

The other attribute I'd like to point out is "type". This tells Lucene which kind of field this one will be, and based on the fieldTypes specified earlier, it uses the correct tokenizer.
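As a quick illustration of why the type matters, here's a small C# sketch (nothing Solr-specific, and SortDemo is a made-up name) showing how the same IDs sort differently when they're treated as strings rather than ints:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up demo class: string ordering vs. numeric ordering of the same values.
public static class SortDemo
{
    // Sorts the IDs the way a plain string field would: character by character.
    public static List<string> SortAsStrings(IEnumerable<int> ids)
    {
        return ids.Select(i => i.ToString())
                  .OrderBy(s => s, StringComparer.Ordinal)
                  .ToList();
    }

    // Sorts the IDs numerically, which is what you'd expect from an int field.
    public static List<int> SortAsInts(IEnumerable<int> ids)
    {
        return ids.OrderBy(i => i).ToList();
    }
}
```

For the IDs 9, 10 and 2, the string sort gives 10, 2, 9, while the int sort gives 2, 9, 10.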

There's also one change I think is worth making in solrconfig.xml. Not too far down in the file you'll see this:



<!-- Used to specify an alternate directory to hold all index data
other than the default ./data under the Solr home.
If replication is in use, this should match the replication configuration.-->
<dataDir>${solr.data.dir:./solr/data}</dataDir>


For some reason the example defaults to putting the index file in the same directory as the Tomcat server. Just comment out that line:



<!-- Used to specify an alternate directory to hold all index data
other than the default ./data under the Solr home.
If replication is in use, this should match the replication configuration.
<dataDir>${solr.data.dir:./solr/data}</dataDir>-->


We'll use the default location under the "Solr Home" folder ($\solr\data\index) to store the index files.

At this point, in your /solr/conf folder, you should have these files: (delete all the other ones)


  • schema.xml (copy and paste the xml from above)

  • elevate.xml

  • protwords.txt

  • solrconfig.xml

  • stopwords.txt

  • synonyms.txt



Part 3: Downloading Some Text Files

OK, now we have most of our environment set up, it's time to get some files! I created a simple application, that connects to http://www.textfiles.com and downloads some files. It then creates a database and adds a record for each file downloaded. You can download it here:

http://www.box.net/shared/du4zac9vaa

Here's how you use it:

TextFileHarvester.exe "ALEX\SQLEXPRESS" "TextFilesDatabase" "C:\MyTextFiles" 1000

First parameter is the Sql Server name. The second one is the Database name to create. The third one is the location where to store the text files on disk, and the last parameter is how many files to download.

Once that's done, you should have a database that looks something like this:



Just as a side point, the Dates are made up. They're just random dates between 1995 - 2005.

You should also have a directory (in the location you specified to the TextFileHarvester app) with many text files in them.

Part 4: Writing Some Code!!

Now comes the fun part. Earlier I mentioned that Solr is basically a web service that you interact with by sending XML over HTTP. You can do all of the work manually and create the XML by hand, but there's an awesome library out there that already does this. The library is open source and called SolrNet. Head on over there and download the DLLs. (There's actually another library called SolrSharp, but that's much older and not as up to date as SolrNet. Also, in my opinion, SolrNet is much easier to use.)

OK, now that we have the SolrNet DLLs, we can create a simple application to index the files. We'll keep it simple and do this: connect to the database, get all of the files in one shot, connect to Solr and send them all to be indexed.

The way you send stuff to Solr using SolrNet is by having a class that holds your data, and then adding an attribute to each property that maps it to a field in the index. Let me explain with code:



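using System;
using System.IO;
using SolrNet.Attributes;

public class TextFile
{
    #region Members

    // Cached file contents; filled in the first time DocumentText is read.
    private string documentText;

    #endregion

    // Maps to the uniqueKey field declared in schema.xml.
    [SolrUniqueKey("fileid")]
    public int FileID { get; internal set; }

    // Path of the text file on disk; not sent to Solr.
    public string FileLocation { get; internal set; }

    [SolrField("doctext")]
    public string DocumentText
    {
        get
        {
            // Lazy load: only read the file when the text is actually needed.
            if (this.documentText == null)
            {
                this.documentText = File.ReadAllText(FileLocation);
            }
            return this.documentText;
        }
    }

    [SolrField("title")]
    public string Title { get; internal set; }

    [SolrField("datecreated")]
    public DateTime? DateCreated { get; internal set; }
}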


You add the SolrField attribute (or SolrUniqueKey for the index's primary key field) and provide the name of that field in the index. I made the DocumentText property lazy loaded so we don't hit the file system for all the files' text in one shot (there should probably be more error handling there, but hey, this is just a demo...).

SolrNet will require a collection of these objects, and it will then send that off to the index. The next step therefore, is to write some code to populate these objects. We can use Linq to Sql, or whatever you want to connect to the database and populate these objects. I wrote some (very) basic ADO.NET code to do this:



using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;

internal class TextFileRepository
{
    #region Members

    private string connectionString;

    #endregion

    #region Constructors

    public TextFileRepository(string connectionString)
    {
        this.connectionString = connectionString;
    }

    #endregion

    #region Methods

    // Returns every file in the database (used for indexing).
    public IEnumerable<TextFile> GetTextFiles()
    {
        return ExecuteSql("SELECT * FROM FILES");
    }

    // Returns only the files with the given IDs (used for searching).
    public IEnumerable<TextFile> GetTextFiles(IEnumerable<int> fileIds)
    {
        if (!fileIds.Any()) { yield break; }
        string sql = String.Format("SELECT * FROM FILES WHERE FILEID IN({0})", fileIds.ToDelimetedString());
        foreach (var item in ExecuteSql(sql))
        {
            yield return item;
        }
    }

    private IEnumerable<TextFile> ExecuteSql(string sql)
    {
        using (SqlConnection connection = new SqlConnection(this.connectionString))
        using (SqlCommand command = connection.CreateCommand())
        {
            command.CommandText = sql;
            connection.Open();
            // Dispose the reader as well once we're done streaming rows.
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    yield return FromReader(reader);
                }
            }
        }
    }

    private TextFile FromReader(SqlDataReader reader)
    {
        var result = new TextFile();
        result.FileID = (int)reader["FileID"];
        result.Title = reader["Title"] as string;
        result.FileLocation = reader["FileLocation"] as string;
        var date = reader["DateCreated"];
        result.DateCreated = date == DBNull.Value ? (DateTime?)null : date as DateTime?;

        return result;
    }

    #endregion
}


Nothing fancy. If you look at the GetTextFiles method, it's about as straightforward as they come. Just do a SELECT * and convert each row to a TextFile object. (You'll notice a GetTextFiles overload with a parameter; that's used for searching, as I'll explain shortly.)
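One thing the repository code calls but this post doesn't list is the ToDelimetedString extension method. What follows is only a guess at what such a helper might look like, assuming it simply joins the IDs with commas; the class name EnumerableExtensions is made up for this sketch.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical implementation; the real helper isn't shown in this post.
public static class EnumerableExtensions
{
    // Turns { 1, 2, 3 } into "1,2,3" so it can be dropped into an IN (...) clause.
    public static string ToDelimetedString(this IEnumerable<int> values)
    {
        return string.Join(",",
            values.Select(v => v.ToString(CultureInfo.InvariantCulture)).ToArray());
    }
}
```

Because the values are ints, formatting them straight into the SQL string is harmless here. For anything string based you'd want parameters instead.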

Part 4.1: Indexing the Files

Now, we can write some code to Index these files:



using Microsoft.Practices.ServiceLocation;
using SolrNet;

public class BasicIndexer
{
    private string connectionString;
    private string solrUrl;

    public BasicIndexer(string connectionString, string solrUrl)
    {
        this.connectionString = connectionString;
        this.solrUrl = solrUrl;
    }

    public void IndexFiles()
    {
        // Register the SolrNet services for TextFile against our Solr URL.
        Startup.Init<TextFile>(this.solrUrl);
        var solrWorker = ServiceLocator.Current.GetInstance<ISolrOperations<TextFile>>();

        // Pull every file from the database and send it to Solr in one go.
        var files = new TextFileRepository(this.connectionString).GetTextFiles();
        solrWorker.Add(files).Commit();
    }
}


Pretty simple no? First, we need to call the Init method. Internally SolrNet uses IoC to handle the instantiation of the classes. Therefore, we can just use Microsoft.Practices.ServiceLocation from the Enterprise Library. (I'm not a huge fan of the Enterprise Library, but it's quick and easy for this example. Refer to the SolrNet documentation for better approaches using Castle Windsor.) Once we have the ISolrOperations, we can just call the Database to get all the files, and then call Add on the ISolrOperations. Once that's done, we just call Commit() and that's it!!

Let's give it a whirl. Create a quick ConsoleApplication and call the BasicIndexer class:



public class Program
{
    public static void Main(string[] args)
    {
        string connectionString = "";
        string solrUrl = "";

        BasicIndexer indexer = new BasicIndexer(connectionString, solrUrl);
        indexer.IndexFiles();
    }
}


Fill in the blanks for your specific connection string and your Solr URL (e.g. http://localhost:8983/solr). MAKE SURE YOU RESTART TOMCAT!! (Start -> Programs -> Apache Tomcat 6.0 -> Configure Tomcat -> Start)! Run the program and voilà, you just indexed all of your files! (In a real-world scenario, you'd obviously batch this operation, getting only x records at a time from the database, possibly even multithreading it to maximize performance, but again this is just a demo :) ). Now let's see if our indexing in fact worked. Open your browser to http://localhost:8983/solr (or whatever your Solr URL is) and click on Solr Admin. In the Query String box enter "*:*" (the Lucene equivalent of SELECT *) and click the Search button. You SHOULD see this:



(depending on how many files you decided to download using the Harvester application you'll see different totals). Congrats! You've successfully indexed the files!
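For what it's worth, the batching mentioned above doesn't need much code. Here's a minimal sketch of a generic helper that chops any sequence into fixed-size chunks. The Batching class and Batch method are names I made up for this example; they're not part of SolrNet.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of SolrNet.
public static class Batching
{
    // Lazily splits a sequence into lists of at most batchSize items.
    public static IEnumerable<List<T>> Batch<T>(IEnumerable<T> source, int batchSize)
    {
        if (batchSize < 1) throw new ArgumentOutOfRangeException("batchSize");

        var batch = new List<T>(batchSize);
        foreach (T item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize); // start a fresh list for the next chunk
            }
        }

        // Hand back whatever is left over (the last, possibly smaller, batch).
        if (batch.Count > 0)
        {
            yield return batch;
        }
    }
}
```

With something like this, the indexer could loop over Batching.Batch(files, 500), call Add for each batch, and call Commit once at the end. The batch size of 500 is just a placeholder, so measure what works for your documents.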

Part 4.2: Searching the Files

Now we want to search using C#. That too is VERY easy using SolrNet. The only thing to remember here, is like I mentioned earlier, we didn't STORE the fields in Solr. All we stored was the FileID. Therefore, we need to first retrieve those file id's, and then hit the database to get the rest of the information. It may seem like double work, but TRUST me!!! when dealing with larger data sets, it's MUCH faster. So, first let's create a class to hold the FileID's:



internal class FileIDResult
{
[SolrField("fileid")]
public int FileID { get; set; }
}


Same concept as indexing. We use the SolrField attribute on the properties that map to the index fields.

Now we can write some code to execute the search. There are a few parameters I'd like to discuss first. Aside from the search query itself, you need to specify the number of results per page and which result to start from. Solr has paging built right into it, so the way it works is, you specify how many items you want per page, and then how many items to skip over. So for example, if you have 100 results, with 10 items per page, and you want page 3, you'd start at item 20 (it's zero based, so page 1 starts at item 0). With that in mind, here's the searching code:



using System.Collections.Generic;
using System.Linq;
using SolrNet;
using SolrNet.Commands.Parameters;

public class BasicSearcher
{
    private string connectionString;
    private string solrUrl;

    public BasicSearcher(string connectionString, string solrUrl)
    {
        this.connectionString = connectionString;
        this.solrUrl = solrUrl;
    }

    public SearchResults Search(string query, int resultsPerPage, int pageNumber)
    {
        // Note: SolrOperationsCache is a helper class that isn't listed in this post.
        var solrWorker = SolrOperationsCache<FileIDResult>.GetSolrOperations(this.solrUrl);

        QueryOptions options = new QueryOptions
        {
            Rows = resultsPerPage,
            // Zero-based offset of the first result on the requested page.
            Start = (pageNumber - 1) * resultsPerPage,
        };

        // Solr only gives us the IDs; the rest comes from the database.
        ISolrQueryResults<FileIDResult> results = solrWorker.Query(query, options);
        var textFiles = new TextFileRepository(this.connectionString)
            .GetTextFiles(results.Select(r => r.FileID));

        var searchResults = new SearchResults
        {
            Results = textFiles,
            QueryTime = results.Header.QTime,
            TotalResults = results.NumFound
        };

        return searchResults;
    }
}

public class SearchResults
{
    public IEnumerable<TextFile> Results { get; set; }
    public int QueryTime { get; set; }
    public int TotalResults { get; set; }
}


Same idea as before. Get an instance of ISolrOperations and call the Query method. I used the overload that takes a QueryOptions object so that I can specify the page number and items per page. The result that comes back from that method call is an ISolrQueryResults, which, in addition to the search results themselves, has some metrics. I wrapped all that up in the SearchResults class. Once we have the file IDs, we can hit the database and get the rest of the data.
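The paging arithmetic is easy to get wrong by one page, so here it is pulled out on its own as a small sketch. The Paging class and its method names are made up for this example; the formula is the same (pageNumber - 1) * resultsPerPage used for Start in the searcher.

```csharp
using System;

// Made-up helper that isolates the paging math used by the searcher.
public static class Paging
{
    // Zero-based index of the first result on a one-based page number.
    public static int StartOffset(int pageNumber, int resultsPerPage)
    {
        if (pageNumber < 1) throw new ArgumentOutOfRangeException("pageNumber");
        if (resultsPerPage < 1) throw new ArgumentOutOfRangeException("resultsPerPage");
        return (pageNumber - 1) * resultsPerPage;
    }

    // Number of pages needed to show totalResults (rounds up).
    public static int PageCount(int totalResults, int resultsPerPage)
    {
        if (resultsPerPage < 1) throw new ArgumentOutOfRangeException("resultsPerPage");
        return (totalResults + resultsPerPage - 1) / resultsPerPage;
    }
}
```

So with 10 results per page, page 1 starts at offset 0 and page 3 starts at offset 20, and the TotalResults value that comes back from Solr tells you how many pages to render.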

Let's give it a whirl:



public class Program
{
    public static void Main(string[] args)
    {
        string connectionString = "";
        string solrUrl = "";

        BasicSearcher searcher = new BasicSearcher(connectionString, solrUrl);
        var results = searcher.Search("*:*", 10, 1);
        foreach (TextFile file in results.Results)
        {
            Console.WriteLine("FileID: {0}, Title: {1}", file.FileID, file.Title);
        }
    }
}


If you run this, you should see the file id's and titles for the first 10 documents! Congratulations, you've just executed a search!!

OK, this has gotten WAAAAAY longer than I ever imagined. There's A LOT going on here, I won't deny that. There are a few random closing points I'd like to make:


  • There are many different Tokenizers and Analyzers for various kinds of data. If you need to tweak one or create your own, you're SOL doing it in C#; you'll have to blow the dust off that old Java book and do it there.

  • The actual index is stored under $\solr\data\index. Sometimes it's useful to actually look into the index files and read them. For that you can use Luke (a standalone Java application.)

  • In the real world, you'd possibly have your Indexing Server somewhere other than where the data is. Keep that in mind, as the XML sent across can get quite big, so see if you can optimize that.



There's so much more to Lucene and Solr than what I covered here. Here are some links for more reading:

Lucene Home Page
Lucene tutorial. (While this is an actual Lucene tutorial of how to use the Lucene Java libraries, it has some great insight in to what's happening behind the scenes.)
Solr Home Page
Solr Wiki
The various Tokenizers and Analyzers available from Solr

That's all I can think of for now. If there's anything else, I'll add it in the future.

Lastly, here's a zip file containing everything I covered in this post. You'll find the TextFileHarvester app. You'll find the solr configuration files I've used for this demo. You'll also find a simple library I created to wrap the indexing and searching functionality of these text files, as well as a basic windows application to test both. Good luck, and Happy Coding!

P.S. This tutorial took a lot of time and effort. If you find it useful, just drop me a line in the comments.

[Thursday, January 7, 2010]

Comparing C# structs to null. Can you or not?

A while back I posted a blog about creating a Splash Screen in WinForms. While writing that blog, I was running into problems when calling BeginInvoke on the form, and I was getting this error:

Invoke or BeginInvoke cannot be called on a control until the window handle has been created.


Every Form has a Handle that gets assigned to it by Windows. Therefore, I figured I'd just check to see if the handle is null:



if (this.Handle == null)
{
    return; //don't do anything
}


When that didn't work, I started digging a little deeper and found out that Form.Handle is of type IntPtr, which is actually a struct. Structs, as everyone knows, are value types, which can't be null, so the null check always evaluated to false and the return statement never got hit. (The fix for that isn't related to this post, so I won't go into detail.)

The question that came to mind though right away was, why did that compile? If you try to compare a struct to null, the compiler complains right? So just to make sure I wasn't going nuts, I quickly whipped up this code sample:



class Program
{
    static void Main(string[] args)
    {
        MyStruct ms = new MyStruct();
        if (ms == null)
        {
            Console.WriteLine("Can't get here.");
        }
    }
}

public struct MyStruct { }


And sure enough, as soon as I tried compiling this, I got this error:

Operator '==' cannot be applied to operands of type 'StructDemo.MyStruct' and '<null>'


OK, so then how the heck can IntPtr be compared to null, but my struct can't? What's Microsoft know that I don't? So then I decided to look at the metadata for IntPtr:



[Serializable]
[ComVisible(true)]
public struct IntPtr : ISerializable
{
    public static readonly IntPtr Zero;

    [ReliabilityContract(Consistency.MayCorruptInstance, Cer.MayFail)]
    public IntPtr(int value);

    [ReliabilityContract(Consistency.MayCorruptInstance, Cer.MayFail)]
    public IntPtr(long value);

    [CLSCompliant(false)]
    [ReliabilityContract(Consistency.MayCorruptInstance, Cer.MayFail)]
    public IntPtr(void* value);

    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
    public static bool operator !=(IntPtr value1, IntPtr value2);

    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
    public static bool operator ==(IntPtr value1, IntPtr value2);

    [ReliabilityContract(Consistency.MayCorruptInstance, Cer.MayFail)]
    public static explicit operator IntPtr(int value);

    public static explicit operator int(IntPtr value);

    public static explicit operator long(IntPtr value);

    [CLSCompliant(false)]
    public static explicit operator void*(IntPtr value);

    [ReliabilityContract(Consistency.MayCorruptInstance, Cer.MayFail)]
    public static explicit operator IntPtr(long value);

    [CLSCompliant(false)]
    [ReliabilityContract(Consistency.MayCorruptInstance, Cer.MayFail)]
    public static explicit operator IntPtr(void* value);

    public static int Size { get; }

    public override bool Equals(object obj);

    public override int GetHashCode();

    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
    public int ToInt32();

    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
    public long ToInt64();

    [CLSCompliant(false)]
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
    public void* ToPointer();

    public override string ToString();

    public string ToString(string format);
}


OK, so I decided to try and copy bit by bit to see if I can make the error go away. First I tried making it Serializable:



public struct MyStruct : ISerializable
{
    #region ISerializable Members

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        throw new NotImplementedException();
    }

    #endregion
}


No luck. Still won't compile. Next, I decided to add the Equals and GetHashCode overrides:



public struct MyStruct : ISerializable
{
    #region ISerializable Members

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        throw new NotImplementedException();
    }

    #endregion

    public override int GetHashCode()
    {
        return base.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        return base.Equals(obj);
    }
}


Still no luck. Finally, this is what made the error go away:



public struct MyStruct : ISerializable
{
    #region ISerializable Members

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        throw new NotImplementedException();
    }

    #endregion

    public override int GetHashCode()
    {
        return base.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        return base.Equals(obj);
    }

    public static bool operator ==(MyStruct m1, MyStruct m2)
    {
        return true;
    }

    public static bool operator !=(MyStruct m1, MyStruct m2)
    {
        return false;
    }
}



Huh? Overloading the equality and inequality operators? Why? Both of those operators still only take MyStruct arguments, which can't be null! I still can't pass null to either one of them, so why does the compiler allow it all of a sudden?

After much digging, I finally found my answer, and I'll do my best to explain. Since .NET 2.0 introduced nullable types, there's an implicit conversion from any value type to its nullable type. Meaning, I can always do this safely:



int x = 10;
Nullable<int> y = x;


Therefore, whenever we overload the == operator, the C# compiler gives us another one for free (a so-called "lifted" operator) that compares the Nullable versions of that type. Meaning, when you do this:



public static bool operator ==(MyStruct m1, MyStruct m2)
{
return true;
}

public static bool operator !=(MyStruct m1, MyStruct m2)
{
return false;
}


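You're also getting the equivalent of this for free (shown in simplified form; the compiler-generated lifted operators first compare HasValue on both sides, and only call your operator when both sides actually have a value):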



public static bool operator ==(Nullable<MyStruct> m1, Nullable<MyStruct> m2)
{
return true;
}

public static bool operator !=(Nullable<MyStruct> m1, Nullable<MyStruct> m2)
{
return false;
}


So that's why, when you compare your struct instance to null, it casts your struct instance to a nullable type (implicitly), and calls the overload which takes two nullables. Therefore, your code compiles. Of course it can never be null and the if check will always fail, but the compiler now thinks it's OK. Heh, you learn something new every day I guess...
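To see the lifted operator in action, here's a minimal sketch. The Point struct and the LiftedOperatorDemo class are names I made up for illustration; they aren't part of the original problem. The only thing it relies on is the behavior described above: once == is overloaded for the struct, comparing against a nullable (or null) compiles and simply returns false when one side has no value.

```csharp
using System;

// Hypothetical struct used only for this illustration.
public struct Point
{
    public int X;

    public Point(int x) { X = x; }

    // Overloading == and != for Point is what makes the compiler
    // also accept comparisons between Point? values.
    public static bool operator ==(Point a, Point b) { return a.X == b.X; }
    public static bool operator !=(Point a, Point b) { return !(a == b); }

    public override bool Equals(object obj) { return obj is Point && this == (Point)obj; }
    public override int GetHashCode() { return X; }
}

public static class LiftedOperatorDemo
{
    // p is implicitly converted to Point?, so this binds to the lifted operator.
    public static bool EqualsNullable(Point p, Point? other)
    {
        return p == other;
    }

    public static void Main()
    {
        Point p = new Point(1);
        Console.WriteLine(EqualsNullable(p, null));         // False: one side has no value
        Console.WriteLine(EqualsNullable(p, new Point(1))); // True: both have values, our operator runs
        Console.WriteLine(EqualsNullable(p, new Point(2))); // False: our operator says they differ
    }
}
```

Note that the user-defined operator is never called with a null; the null case is decided before it ever gets there.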

[Thursday, December 24, 2009]

Powershell like command line arguments in a C# Console Application

1 comments

Source Code: http://www.box.net/shared/5l587aig4k

Here at work, we use Powershell to script some automated integration tests. Sometimes, simple scripts aren't enough, and we need to resort to writing Cmdlets. For the purpose of this blog post, all you really need to know about Cmdlets is that they're basically C# classes that are invoked from the Powershell command line. What I thought was really cool about Cmdlets is the way they deal with command-line arguments.

Basically, there's a ParameterAttribute that you put on your properties, and those properties then get set from the command line. Here's a simple example:



public class MyCmdlet : Cmdlet
{
[Parameter]
public int IntArg { get; set; }
}



Then from the Powershell command line you would do something like this:

PS>MyCmdlet -IntArg 100

And then within your Cmdlet class, your IntArg property will automatically be set to 100.

It then dawned on me that we have many Console apps at work that each have their own way of parsing the arguments. Now I know there are many command line parsers out there, but I thought, wouldn't it be cool if we could have the same kind of syntax for invoking Console applications that Powershell has?

I won't explain all of the code, you can download it at the bottom of this post, and dig in yourself, but I'll just highlight a few points. Firstly, this is how you can use this library:

Say you have an application that needs to take in different characteristics of a person, things like Name, Age, Height and Email Address. You'd want to be able to invoke the application like this:

MyConsoleApp.exe -Name "Alex Friedman" -Age 27 -Height 69.5 -EmailAddress "a.friedman07@gmail.com"

First you'd need to create a Person class to represent all the command line arguments:



internal class Person
{
[Parameter]
public string Name { get; set; }

[Parameter]
public int Age { get; set; }

[Parameter]
public double Height { get; set; }

[Parameter(Mandatory=true)]
public string EmailAddress { get; set; }
}


Then, in your main method, you'd do this:



ArgsSetter<Person> argsSetter = new ArgsSetter<Person>(args);
Person person = argsSetter.BuildParameterObject();


And that's it. Your person object will have all the properties set. The thing to note here is that this library can deal with all primitive types including strings. Remember, all arguments come in as a string, even the number 27 for example gets passed in as the string representation of "27", yet it gets converted for you automatically to an int.
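If you're curious how that kind of automatic conversion can work, here's a rough sketch of the general technique using reflection and Convert.ChangeType. To be clear, this is NOT the library's actual code: OptionAttribute, PersonArgs and SimpleBinder are hypothetical names I'm using just for this illustration, and the sketch skips all the error handling and Mandatory checks.

```csharp
using System;
using System.Globalization;
using System.Reflection;

// Hypothetical marker attribute, standing in for the library's [Parameter].
[AttributeUsage(AttributeTargets.Property)]
public sealed class OptionAttribute : Attribute { }

// Hypothetical argument class for the demo.
public class PersonArgs
{
    [Option] public string Name { get; set; }
    [Option] public int Age { get; set; }
    [Option] public double Height { get; set; }
}

public static class SimpleBinder
{
    // Walks the args as "-Name value" pairs and sets the matching property.
    public static T Bind<T>(string[] args) where T : new()
    {
        T target = new T();
        for (int i = 0; i + 1 < args.Length; i += 2)
        {
            string name = args[i].TrimStart('-');
            PropertyInfo property = typeof(T).GetProperty(name);
            if (property == null || !property.IsDefined(typeof(OptionAttribute), false))
            {
                continue; // unknown argument: silently ignored in this sketch
            }

            // Every argument arrives as a string; convert it to the property's type.
            object value = Convert.ChangeType(
                args[i + 1], property.PropertyType, CultureInfo.InvariantCulture);
            property.SetValue(target, value, null);
        }
        return target;
    }

    public static void Main()
    {
        var person = Bind<PersonArgs>(
            new[] { "-Name", "Alex Friedman", "-Age", "27", "-Height", "69.5" });
        Console.WriteLine("{0}, {1}, {2}", person.Name, person.Age, person.Height);
    }
}
```

Convert.ChangeType throws (for example a FormatException) when the text can't be converted, which is the kind of failure a real parser would want to catch and report per argument.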

A few other things to note about the error handling. I didn't implement extensive validation in this library. What I did implement was two things. First, there's an IValidator interface that checks for fatal errors: errors that would prevent the library from even attempting to parse. I provide a DefaultValidator that checks that there is an even number of elements in the args array. You can think of the command line arguments as Key-Value pairs; the key being the argument name, and the value being the argument value. Therefore, if there's an odd number of args, something isn't right. It also checks that argument names begin with a "-". If that's not enough, you can implement your own IValidator.
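The two fatal checks described above are simple enough to sketch. The post doesn't show the IValidator signature, so I'm not going to guess at it; the ArgsPrecheck class and its FindFatalError method below are hypothetical stand-ins that just demonstrate the logic.

```csharp
using System;

public static class ArgsPrecheck
{
    // Returns null when the args look parseable, otherwise a description
    // of the first fatal problem found.
    public static string FindFatalError(string[] args)
    {
        // Arguments are name/value pairs, so an odd count means something is missing.
        if (args.Length % 2 != 0)
        {
            return "Expected name/value pairs, but got an odd number of arguments.";
        }

        // Only the names (even positions) must start with "-";
        // values such as "-5" are left alone.
        for (int i = 0; i < args.Length; i += 2)
        {
            if (args[i].Length < 2 || !args[i].StartsWith("-", StringComparison.Ordinal))
            {
                return "Argument name '" + args[i] + "' must start with '-'.";
            }
        }

        return null;
    }

    public static void Main()
    {
        Console.WriteLine(FindFatalError(new[] { "-Age", "27" }) ?? "OK");
        Console.WriteLine(FindFatalError(new[] { "-Age" }) ?? "OK");
        Console.WriteLine(FindFatalError(new[] { "Age", "27" }) ?? "OK");
    }
}
```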

The second part of error handling, is to make sure that the values are correct, and that the Mandatory arguments have been provided. To get that information you do something like this:



foreach (var error in argsSetter.Errors)
{
Console.WriteLine("Paramname: {0}, Error: {1}", error.ArgumentName, error.ErrorType);
}


The ArgsSetter class has an Errors collection that gives you all the errors that occurred.

Here's a link to the source code. It contains the actual library, as well as some unit tests:
http://www.box.net/shared/5l587aig4k

That's pretty much all there is to it. Feel free to download the source code and mess with it. I'd like to stress that there's a lot more that can be done with this, and if you do use this and add something, feel free to drop me an email. Also, any constructive criticism is always welcome. Enjoy!

[Wednesday, August 19, 2009]

Rethrowing an Exception without resetting the Stack Trace.

0 comments

Source code: http://www.box.net/shared/kjkgq36itq

Exception handling in .NET is a complicated subject that always spawns all kinds of debates. I won't post my opinion or anything like that, but I want to point out a subtle yet important difference when re-throwing an exception. The two ways you can re-throw an exception are:



try
{
DoSomeExceptionThrowingMethod();
}
catch (Exception ex)
{

throw ex;
}


try
{
DoSomeExceptionThrowingMethod();
}
catch (Exception ex)
{

throw;
}


As you can see, the only difference is in the catch block where I'm calling throw. In the first case I'm calling throw ex, while in the second case I'm simply calling throw. What's the difference?

Well, when you simply call throw, you're effectively calling "rethrow", meaning "rethrow the exception you just caught" with its original stack trace intact. When you call "throw ex", you're throwing the exception object as if it were being thrown for the first time from that line, so its stack trace starts over at the catch block. That's easier to see in code, so let's whip up a code sample:

First, we'll have a class that has one method which throws an exception:



public class ExceptionThrower
{
public void InvalidMethod()
{
throw new InvalidOperationException("This method is invalid.");
}
}


Now, we'll add a few layers of method calling to this demo so we can make the actual stack trace a little bigger (I'll explain more soon):



public class Layer1
{
private ExceptionThrower thrower;

public Layer1()
{
this.thrower = new ExceptionThrower();
}

public void Layer1Method()
{
this.thrower.InvalidMethod();
}
}


Here's one layer of method calling, now we'll add one more:



public class Layer2
{
private Layer1 layer1;

public Layer2()
{
this.layer1 = new Layer1();
}

public void Layer2Method()
{
this.layer1.Layer1Method();
}
}


These classes aren't doing much more than just wrapping the method calls from the object they have internally.

Now, let's code up our main method:



class Program
{
static void Main(string[] args)
{
try
{
Console.WriteLine("Calling KeepStackTrace");
KeepStackTrace();
}
catch (InvalidOperationException ex)
{
Console.WriteLine(ex.StackTrace);
}

try
{
Console.WriteLine("Calling ResetStackTrace");
ResetStackTrace();
}
catch (InvalidOperationException ex)
{
Console.WriteLine(ex.StackTrace);
}

Console.ReadKey(true);
}

private static void KeepStackTrace()
{
Layer2 l2 = new Layer2();
try
{
l2.Layer2Method();
}
catch (InvalidOperationException ex)
{
throw;
}
}

private static void ResetStackTrace()
{
Layer2 l2 = new Layer2();
try
{
l2.Layer2Method();
}
catch (InvalidOperationException ex)
{
throw ex;
}
}
}


So basically we have two methods that call into our Layer2 class. One wraps the method in a try / catch but just calls throw, while the other calls throw ex. In the main method, we output the Stack Trace of the exception. Let's examine the output:

Calling KeepStackTrace

at ExceptionReThrow.ExceptionThrower.InvalidMethod() in C:\Users\Alex\Documents\Visual Studio 2008\Projects\ExceptionReThrow\ExceptionReThrow\ExceptionThrower.cs:line 12
at ExceptionReThrow.Layer1.Layer1Method() in C:\Users\Alex\Documents\Visual Studio 2008\Projects\ExceptionReThrow\ExceptionReThrow\Layer1.cs:line 19
at ExceptionReThrow.Layer2.Layer2Method() in C:\Users\Alex\Documents\Visual Studio 2008\Projects\ExceptionReThrow\ExceptionReThrow\Layer2.cs:line 19
at ExceptionReThrow.Program.KeepStackTrace() in C:\Users\Alex\Documents\Visual Studio 2008\Projects\ExceptionReThrow\ExceptionReThrow\Program.cs:line 48
at ExceptionReThrow.Program.Main(String[] args) in C:\Users\Alex\Documents\Visual Studio 2008\Projects\ExceptionReThrow\ExceptionReThrow\Program.cs:line 19

Calling ResetStackTrace

at ExceptionReThrow.Program.ResetStackTrace() in C:\Users\Alex\Documents\Visual Studio 2008\Projects\ExceptionReThrow\ExceptionReThrow\Program.cs:line 61
at ExceptionReThrow.Program.Main(String[] args) in C:\Users\Alex\Documents\Visual Studio 2008\Projects\ExceptionReThrow\ExceptionReThrow\Program.cs:line 29


As you can tell, when calling KeepStackTrace, which simply did "throw", the entire stack trace with all the layers is kept and we can see it all the way down to where the exception originated.

When calling ResetStackTrace though, the method where we do "throw ex", you'll notice that the trace only goes down as far as our ResetStackTrace method. We don't see anything past that, even though that's not really where the exception originated.

Bottom line, for the most part, this isn't all that relevant because you really shouldn't be catching exceptions if you don't plan on doing anything with them. However, if you do want to at least log an exception but let it bubble up, be sure to do "throw" so that when the exception does finally bubble to the top, you have the entire stack trace.
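Here's a compact sketch of that "log it, then let it bubble up" case. The class and method names are made up for this illustration, and writing to Console.Error is just a stand-in for whatever logging you actually use. The NoInlining attributes are only there so the frames reliably show up in the trace.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class LogAndRethrowDemo
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Fail()
    {
        throw new InvalidOperationException("Something went wrong.");
    }

    // Logs, then rethrows with "throw;" so the original frames survive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void LogAndRethrow()
    {
        try
        {
            Fail();
        }
        catch (InvalidOperationException ex)
        {
            Console.Error.WriteLine("Logged: " + ex.Message);
            throw;
        }
    }

    // Logs, then does "throw ex;", which restarts the trace at this method.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void LogAndThrowEx()
    {
        try
        {
            Fail();
        }
        catch (InvalidOperationException ex)
        {
            Console.Error.WriteLine("Logged: " + ex.Message);
            throw ex;
        }
    }

    // Runs the action and returns the stack trace of whatever it throws.
    public static string TraceOf(Action action)
    {
        try
        {
            action();
            return null;
        }
        catch (Exception ex)
        {
            return ex.StackTrace;
        }
    }

    public static void Main()
    {
        // The originating Fail() frame is present only when "throw;" was used.
        Console.WriteLine(TraceOf(LogAndRethrow).Contains("Fail()"));
        Console.WriteLine(TraceOf(LogAndThrowEx).Contains("Fail()"));
    }
}
```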

[Thursday, July 30, 2009]

Passing objects using the ref keyword...Wait, aren't objects *always* passed by reference???

4 comments

For the past month I've been on the job market and have been exposed to all kinds of technical questions. Most of them were run of the mill, but every once in a while I'd get asked a question that made me stop and think. Perfect for a guy with a blog, because now you have tons more stuff to blog about, right?

P.S. For those interested, I accepted an offer earlier this week at a company called BIA. They are a computer forensics firm and I'm real excited to start!

So one question that I really liked is the point of this blog post. The interviewer actually informed me that I was only one out of one hundred that got this question right. Not believing that these numbers were true, I asked everyone on my team at my old job, and only one of them sort of half knew. The others were completely stumped. I guess it is something that many developers just gloss over.

OK, so what is this question already?? Well it goes something like this: "What's the difference between passing an object to a method the standard way, and passing an object to a method using the ref keyword?"

Let's break it down. Let's first talk about what the ref keyword does when it comes to value types. Simple example:



static void Main(string[] args)
{
int number = 10;

Add5(number);
Console.WriteLine(number);

Add5(ref number);
Console.WriteLine(number);

Console.ReadKey(true);
}

public static void Add5(int x)
{
x += 5;
}

public static void Add5(ref int x)
{
x += 5;
}


Before trying to run this, see if you can guess what the output would be.....


Answer: The first Console.WriteLine outputs 10 and the second one outputs 15. Why? Well, a "value type" is passed to a method by value. Meaning, the variable itself isn't passed to the method; rather, a copy of the value is passed to the method. Therefore, when calling the first method, adding 5 to the variable that was passed in does NOT affect the variable in the main method, because we only added 5 to the COPY of the variable, not the actual one.

In the second method call though, we're passing the variable using the ref keyword. This changes the behavior and actually DOES pass the actual variable itself to the method. Therefore, in the second method, when adding 5 to the value being passed in, you're actually messing with the same exact variable that's in the main method. Therefore, it outputs 15 showing the changes DID take effect.

So now we have an understanding of how value types work, and how ref changes the behavior. Let's talk about objects now, which are passed by reference. I'll start with another example:

First a simple Person class:



public class Person
{
public Person(string name, int age)
{
this.Name = name;
this.Age = age;
}

public override string ToString()
{
return String.Format("{0} is {1} years old.", this.Name, this.Age);
}

public string Name { get; set; }
public int Age { get; set; }

}


Simple class with two properties. Name and Age. Also, the ToString is overridden to make it easier to demonstrate.

Now, let's say we had something like this in our Program.cs:



static void Main(string[] args)
{
var alex = new Person("Alex", 27);

ChangePerson(alex);
Console.WriteLine(alex);

Console.ReadKey(true);
}


public static void ChangePerson(Person p)
{
p.Age += 5;
}


Try guessing what the output would be for this program.

Answer: Alex is 32 years old.

If you're paying attention you'll notice that this is different than how it was with value types. The change in the method DID affect the one in the main method! That's because objects are passed by reference, meaning that a reference (pointer) to the SAME object is being passed to the method. So in the method itself, you still have a reference to the same object that exists in the main method. Therefore, when you make changes, it does show up back in the main.

So back to the original question: If all objects are passed by reference, what's the point of the ref keyword when passing an object to a method??

So here's the deal. Let's talk in terms of the stack and the heap. In the current version of the CLR, local variables of value types are typically stored on the stack, and reference type objects are stored on the heap. However, and this is key, for reference types, a *pointer* to that object is ALSO stored on the stack! OK, so why is this so important? Well, I sort of misspoke earlier when I said objects are passed by reference. I didn't give you the whole picture. What's actually happening is that a copy of the pointer is being passed to the method. So while you are referencing the same object in memory in the method and in the main, the POINTER variable on the stack is NOT the same. HOWEVER, when using the ref keyword, the pointer itself is passed to the method, not a copy of it! So when you're inside the method itself, you're dealing with the exact same pointer variable that's in the main.

The only way to explain is with an example:



static void Main(string[] args)
{
var alex = new Person("Alex", 27);

ChangePerson(alex);
Console.WriteLine(alex);

ChangePerson(ref alex);
Console.WriteLine(alex);

Console.ReadKey(true);
}


public static void ChangePerson(Person p)
{
p = new Person("Alex", 35);
}

public static void ChangePerson(ref Person p)
{
p = new Person("Alex", 45);
}


Here we have two methods that look the same. The only difference is the ref keyword. Each method assigns a NEW Person object to the parameter p (the pointer) that was passed in. However, if you run this program you'll see that the first Console.WriteLine still outputs 27 years old, even though we assigned p a Person object that's 35 years old! The reason is that the pointer itself was passed by VALUE, so when you assign a new Person object, you're assigning it to the method's own copy of the pointer, not to the one in the main.

In the second case however, since we're using the ref keyword, the pointer in the method is the SAME one that's in the main method. Therefore, the second Console.WriteLine outputs 45 years old, because the pointer in the main is now pointing to the object that was assigned to it in the method.
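One place where this difference shows up in practice is a swap method. The sketch below uses strings (which are reference types) and two made-up methods, SwapCopies and Swap, to show that swapping only sticks when the references themselves are passed with ref.

```csharp
using System;

public static class RefSwapDemo
{
    // Without ref: a and b are copies of the caller's references,
    // so swapping them here is invisible to the caller.
    public static void SwapCopies(string a, string b)
    {
        string temp = a;
        a = b;
        b = temp;
    }

    // With ref: a and b are the caller's variables themselves,
    // so the swap is visible after the call returns.
    public static void Swap(ref string a, ref string b)
    {
        string temp = a;
        a = b;
        b = temp;
    }

    public static void Main()
    {
        string first = "Alex";
        string second = "Bob";

        SwapCopies(first, second);
        Console.WriteLine(first + ", " + second); // Alex, Bob

        Swap(ref first, ref second);
        Console.WriteLine(first + ", " + second); // Bob, Alex
    }
}
```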

Personally I've never used this yet in production code, but if you understand this, then that means you understand the nitty gritty details of how parameters are passed around. Very impressive on interviews :-)

[Tuesday, July 14, 2009]

Using JQuery to post a Form with ASP.NET MVC with AJAX

0 comments

Source code: http://www.box.net/shared/2to3vfajqp In order for this sample to work on your machine, you need to have the Northwind database, and you need to configure the connection string in the HomeController.

When you install ASP.NET MVC, you'll notice that when you create a new project, the latest jQuery libraries get added for you as well. For those of you who don't know what jQuery is, think of it as a layer of abstraction for common tasks you would do with javascript. Instead of having to rewrite tons of javascript to let's say, do some animation, or post to a server with AJAX, jQuery makes it all extremely simple. I've found that there aren't that many tutorials online for getting started with jQuery and AJAX when using ASP.NET MVC, so I figured I'll share what I've learned so far in the hopes that maybe others can get some insight.

The premise here will be simple. We'll be using the Northwind database (specifically the Products table) to display a list of Products and some of their attributes. Then, there will be a textbox on top of the list. When the user enters some text into the textbox, it will post back to the server via AJAX and find any products that match what the user entered. Here's what it will look like:




So first I started with a simple ASP.NET MVC application. I'll be using the standard project for this tutorial. I then added a NorthwindDataContext to the Models folder with only the Products table from the Northwind database. Then, I added a repository that will help us retrieve the items from the database. Here's the code for the repository:



using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;

namespace MVCJqueryDemo.Models
{
public class NorthwindRepository : IDisposable
{
#region Members

private NorthwindDataContext dataContext;

#endregion

#region Constructors

public NorthwindRepository()
: this(null)
{
}

public NorthwindRepository(string connectionString)
{
dataContext = String.IsNullOrEmpty(connectionString) ? new NorthwindDataContext()
: new NorthwindDataContext(connectionString);
}

#endregion

#region Methods

public IEnumerable<Product> GetAllProducts()
{
return this.dataContext.Products.ToList();
}

public IEnumerable<Product> GetProductsByName(string name)
{
return this.dataContext.Products.Where(p => p.ProductName.Contains(name)).ToList();
}

#endregion

#region IDisposable

public void Dispose()
{
this.dataContext.Dispose();
}

#endregion
}
}


So this class will help us retrieve what we need from the db. Now, over in the Home controller, I've added two actions:



public class HomeController : Controller
{
#region Members

//CHANGE THIS CONNECTION STRING IF YOUR NORTHWIND IS IN A DIFFERENT LOCATION!!!!
private const string CONNECTIONSTRING = "Data Source=.;Initial Catalog=Northwind;Integrated Security=True";

#endregion

public ActionResult Products()
{
using (var repository = new NorthwindRepository(CONNECTIONSTRING))
{
var allProducts = repository.GetAllProducts();
return View(allProducts);
}
}

[AcceptVerbs(HttpVerbs.Post)]
public ActionResult Search(string name)
{
using (var repository = new NorthwindRepository(CONNECTIONSTRING))
{
var result = repository.GetProductsByName(name);
return View("ProductsPartial", result);
}
}
}


So there are two methods here. One will be ../Home/Products and the other will be a URL that we'll post to, ../Home/Search.

The first one is straightforward. It just hits the repository for all the products, and passes them on to the View. We can see from this that there's a Products View. Here's the code for the Products View:



<%@ Page Title="" Language="C#" MasterPageFile="~/Views/Shared/Site.Master" Inherits="System.Web.Mvc.ViewPage<IEnumerable<Product>>" %>

<%@ Import Namespace="MVCJqueryDemo.Models" %>

<asp:Content ID="Content1" ContentPlaceHolderID="TitleContent" runat="server">

Products
</asp:Content>
<asp:Content ID="Content2" ContentPlaceHolderID="MainContent" runat="server">
<h2>
Products</h2>
<form id="searchForm" action="javascript:void(0);">
<input type="text" name="name" id="searchBox" />
</form>
<div id="products" class="productsDiv">
<%Html.RenderPartial("ProductsPartial", this.Model); %>
</div>
</asp:Content>


It has a form with a textbox, and then it has a div where we call RenderPartial to render a partial view called: ProductsPartial. Here's the code for the ProductsPartial.ascx:



<%@ Control Language="C#" Inherits="System.Web.Mvc.ViewUserControl<IEnumerable<Product>>" %>
<%@ Import Namespace="MVCJqueryDemo.Models" %>


<table id="productsTable">
<tr>
<th>Product ID</th>
<th>Product Name</th>
<th>Units in Stock</th>
<th>Unit Price</th>
<th>Being Produced</th>
<th>Units on Order</th>
</tr>
<%foreach (var product in this.Model)%>
<%{%>
<tr>
<td><%=product.ProductID %></td>
<td><%=product.ProductName %></td>
<td><%=product.UnitsInStock %></td>
<td><%=product.UnitPrice.Value.ToString("$#0.00")%></td>
<td><img class="inStockImages" src="<%=product.Discontinued ? "../../Content/x.png" : "../../Content/check.png" %>" /></td>
<td><%=product.UnitsOnOrder %></td>
</tr>
<%}%>
</table>


So basically, what's happening is this. When you go to ../Home/Products, the Products Action gets called on the Home controller. Then, we get a list of Products from the database, and pass that on to the Products View. The Products View then passes that on to the ProductsPartial which actually renders the products in a nice HTML table.

At this point, we haven't done anything fancy yet. If we were to run it now, you'd see a list of all the products displayed. If you were to type anything in the textbox, nothing would happen. Here's where we want to start using some AJAX. The idea is that whenever the keyup event is triggered in the textbox, we'll fire off an AJAX call to the server and display the results. So first, let's look at the second Action in the Home Controller:



[AcceptVerbs(HttpVerbs.Post)]
public ActionResult Search(string name)
{
using (var repository = new NorthwindRepository(CONNECTIONSTRING))
{
var result = repository.GetProductsByName(name);
return View("ProductsPartial", result);
}
}


As you can see, this Action only accepts HTTP POST requests. Again, we call our repository, and get back a list of Products that match the search criteria. We then call return View to display the ProductsPartial, and we pass in the list of products.

That's all very nice, but how do we call this method? How do we hook up an event to our textbox to trigger this method to be called? This is where we'll use jQuery to make the AJAX call. First, in the head section of your master page, you need to add these lines:



<script src="../../Scripts/jquery-1.3.2.js" type="text/javascript"></script>
<script src="../../Scripts/jquery-1.3.2.min.js" type="text/javascript"></script>


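This will include the jQuery library in your page. (Strictly speaking, only one of the two files is needed: jquery-1.3.2.min.js is just the minified build of jquery-1.3.2.js.) Then, I added another file called ProductScripts.js to the Scripts folder, and added this line to the head of my page: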



<script src="../../Scripts/ProductScripts.js" type="text/javascript"></script>


Here's what the ProductScripts.js file looks like:



$(document).ready(function() {
$("#searchBox").keyup(function(item) {
var textValue = $("#searchBox")[0].value;
var form = $("#searchForm").serialize();
$.post("/Home/Search", form, function(returnHtml) {
$("#products").html(returnHtml);
});

});
});


Looks a little weird at first, but I'll try to explain. First we call $(document).ready(..). In here is where we hook up all of our jQuery events. This ready function gets called as soon as the DOM is loaded. Then, we get a reference to the searchBox using $("#searchBox"). This is roughly the equivalent of document.getElementById(..) in plain JavaScript. We then hook into the keyup event, and whenever that event is triggered we run this code:



var textValue = $("#searchBox")[0].value;
var form = $("#searchForm").serialize();
$.post("/Home/Search", form, function(returnHtml) {
$("#products").html(returnHtml);
});


First, we get the text that was typed into the textbox. Then, we get the entire form (which in this case is just the textbox itself). We then serialize the form and call the post method. This is where the actual AJAX call is happening. The parameters that are passed into the post method are as follows:

URL : in our case it's /Home/Search
Data: in our case it's the serialized form, which will then get sent as a parameter to our Search Action on the Home Controller
Callback: in our case it's a function that we use to set the inner HTML of the products div. Remember, the Search Action in our controller renders the ProductsPartial View. So basically it sends HTML back to the browser as a result of the AJAX request. We take that HTML, and stick it into the products div.

The full source code is available at the link posted at the top, download it and mess with it. It's actually real simple, and real powerful.

In this post I only demonstrated how to send back HTML. Another very common way of sending data is through JSON. I'll cover that in another post. ASP.NET MVC makes it EXTREMELY easy to send JSON across the wire.

[Monday, June 29, 2009]

Factory Pattern with Attributes. Get rid of the ugly switch / case.

2 comments

Source Code: http://www.box.net/shared/2yh2r6l91c

A very common design pattern used in Object Oriented Programming, is the Factory Pattern. The purpose of this post isn't to explain the factory pattern or why it's useful, rather I want to show a simple way to eliminate a giant switch / case found in many factory pattern implementations. For some good reading on the Factory Pattern, I suggest reading these two articles:

Wikipedia : http://en.wikipedia.org/wiki/Factory_method_pattern

MSDN : http://msdn.microsoft.com/en-us/library/ms954600.aspx

To demonstrate a simple example of the Factory Pattern, I've created a few classes. First, I created a base Vehicle class that looks something like this:



public abstract class Vehicle
{
public virtual int TopSpeed
{
get
{
return 150;
}
}

public abstract int Wheels
{
get;
}

public override string ToString()
{
return String.Format("A {0} has {1} wheels, and a top speed of {2} MPH."
, this.GetType().Name, this.Wheels, this.TopSpeed);
}
}


Just a base class that has one virtual property, one abstract property, and it overrides ToString. Then, I've created 4 subclasses:



public class Car : Vehicle
{
public override int Wheels
{
get { return 4; }
}
}

public class SuperCar : Car
{
public override int TopSpeed
{
get
{
return 200;
}
}
}

public class Truck : Vehicle
{
public override int Wheels
{
get { return 18; }
}
}
public class Motorcycle : Vehicle
{
public override int Wheels
{
get { return 2; }
}

public override int TopSpeed
{
get
{
return 190;
}
}
}


So we have Vehicle, Car, SuperCar, Truck and Motorcycle. Now, say we wanted to create a Factory that returns us the correct Vehicle class based on an enum that we'd supply. So let's create an enum:



public enum VehicleType
{
Car,
SuperCar,
Truck,
Motorcyle
}


We'd then have a method that looks something like this:



public static Vehicle GetVehicle(VehicleType vehicle)
{
switch (vehicle)
{
case VehicleType.Car:
return new Car();
case VehicleType.SuperCar:
return new SuperCar();
case VehicleType.Truck:
return new Truck();
case VehicleType.Motorcyle:
return new Motorcycle();
default:
return null;
}
}

Well, this is all nice, and works fine, but imagine a scenario where you may have many many subclasses. This switch statement would get huge, and unmaintainable quickly. Imagine then that new subclasses come along, you'd have to first update the enum, and then remember to update this switch / case. Well, I think there's a better way to do this.

The premise is simple; create an attribute that has one property called Type. This attribute will go on the enum, and will represent which type should be instantiated for each value of the enum. So, let's first create the Attribute class:



public class VehicleInfoAttribute : Attribute
{
private Type type;

public VehicleInfoAttribute(Type type)
{
this.type = type;
}

public Type Type
{
get
{
return this.type;
}
}
}


Nothing fancy, just a simple attribute that will house the Type to be created. Now, let's go back to our enum, and decorate the values with the correct attributes:



public enum VehicleType
{
[VehicleInfo(typeof(Car))]
Car,

[VehicleInfo(typeof(SuperCar))]
SuperCar,

[VehicleInfo(typeof(Truck))]
Truck,

[VehicleInfo(typeof(Motorcycle))]
Motorcyle
}


Now, each enum value has an attribute that tells us which type to instantiate for that value. Now, the fun part. The reflection bit in the factory method itself:

First, I've created an extension method for enums that helps with getting custom attributes off of enum values:



using System;
using System.Reflection;

public static class Extensions
{
public static T GetAttribute<T>(this Enum enumValue)
where T : Attribute
{
FieldInfo field = enumValue.GetType().GetField(enumValue.ToString());
object[] attribs = field.GetCustomAttributes(typeof(T), false);
T result = default(T);

if (attribs.Length > 0)
{
result = attribs[0] as T;
}

return result;
}
}


This allows you to do something like this:

MyCustomAttribute a = myEnumValue.GetAttribute<MyCustomAttribute>();

Now, we have all the pieces in place to write the Factory Method:



public static Vehicle GetVehicle(VehicleType vehicle)
{
var vehicleAttribute = vehicle.GetAttribute<VehicleInfoAttribute>();
if (vehicleAttribute == null)
{
return null;
}

var type = vehicleAttribute.Type;
Vehicle result = Activator.CreateInstance(type) as Vehicle;

return result;
}


First we call our extension method to get the attribute value for the enum passed in. Then, we use the handy Activator.CreateInstance() to create an object of that type.

To test it out, we can write a quick app:



static void Main()
{
Vehicle v = VehicleFactory.GetVehicle(VehicleType.Truck);
Console.WriteLine(v);
}


This will output:

A Truck has 18 wheels, and a top speed of 150 MPH.

Using this approach yields two benefits. First, you no longer have a giant ugly switch / case. Secondly, if you ever have a case where another subclass is added, you just add another enum value (which you'd have to do anyway if you were using the switch / case), slap on the attribute, and you're done. The Factory method doesn't need to change at all.
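One thing to watch out for with this approach: if you forget to slap on the attribute, the factory quietly returns null. A small check, run from a unit test for example, can catch that early. The sketch below is self-contained and uses trimmed-down copies of the types from this post; the Boat value and the FactoryChecks class are hypothetical additions, there only to show an enum value that was left undecorated.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

[AttributeUsage(AttributeTargets.Field)]
public class VehicleInfoAttribute : Attribute
{
    private readonly Type type;
    public VehicleInfoAttribute(Type type) { this.type = type; }
    public Type Type { get { return this.type; } }
}

public abstract class Vehicle { }
public class Car : Vehicle { }
public class Truck : Vehicle { }

public enum VehicleType
{
    [VehicleInfo(typeof(Car))]
    Car,

    [VehicleInfo(typeof(Truck))]
    Truck,

    Boat // hypothetical new value, deliberately left without an attribute
}

public static class FactoryChecks
{
    // Returns the names of enum values whose attribute is missing,
    // or whose attribute names a type that isn't a TBase.
    public static List<string> FindUnmapped<TBase>(Type enumType)
    {
        var problems = new List<string>();
        foreach (FieldInfo field in enumType.GetFields(BindingFlags.Public | BindingFlags.Static))
        {
            object[] attribs = field.GetCustomAttributes(typeof(VehicleInfoAttribute), false);
            if (attribs.Length == 0 ||
                !typeof(TBase).IsAssignableFrom(((VehicleInfoAttribute)attribs[0]).Type))
            {
                problems.Add(field.Name);
            }
        }
        return problems;
    }

    public static void Main()
    {
        foreach (string name in FindUnmapped<Vehicle>(typeof(VehicleType)))
        {
            Console.WriteLine("No usable VehicleInfo attribute on: " + name);
        }
    }
}
```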