Posted by: jsonmez | June 30, 2010

Parsing Columns Like A Ninja

How many times have you written this code?

public void CreateFromLine(string line)
{
    var columns = line.Split(',');

    this.Name = StripQuotes(columns[0]);
    this.Description = StripQuotes(columns[1]);
    this.Category = StripQuotes(columns[2]);
    this.Stuff = StripQuotes(columns[3]);
    this.MoreStuff = StripQuotes(columns[4]);
    // Optional column: only present on some lines.
    if (columns.Length > 5)
        this.EvenMoreStuff = StripQuotes(columns[5]);
}

Or some code like it.  It is pretty common to parse a line and then take each column and store it in your object as data.

One of the annoying problems is that, if you have optional columns, you have to check whether they are there before you can parse them.

You’re also repeating your code to strip the quotes off, or whatever other preprocessing you are doing, all over the place.

I know you can use a data-driven approach to specify column-to-property mappings, but I wanted a really low-tech, simple solution.

I finally came up with one using one of my favorite C# constructs.

Action<> is superb for solving these kinds of problems.

See if this code makes you feel any better:

var propertySetters = new List<Action<string>>
{
    value => this.Name = value,
    value => this.Description = value,
    value => this.Category = value,
    value => this.Stuff = value,
    value => this.MoreStuff = value,
    value => this.EvenMoreStuff = value
};

var columns = line.Split(',');

// Column N is handed to setter N; columns missing from the end of the line are never set.
for (int columnNumber = 0; columnNumber < columns.Length; columnNumber++)
{
    var propertySetter = propertySetters[columnNumber];
    propertySetter(StripQuotes(columns[columnNumber]));
}

It may not seem like much.  It is not really a reduction in code, but we have done a few important things here.

  • Removed the explicit handling of optional columns, since we are now only populating columns that exist.  (Adding a new optional column is as easy as adding one more line to the list.)
  • Removed the responsibility from the code of explicitly tracking the column numbers.  Column number mapping now is implicit by the order of the columns in the list.
  • Removed the hidden code duplication of having calls to StripQuotes repeated for each column.
  • Separated the mapping of properties to columns from the assignment of them.

That last point deserves a little more explanation.  Why do we care if we have separated the mapping of properties to columns from the assignment?

The answer is not obvious until you try to use this same code to handle a different set of columns, or columns in a different order.

By separating out the mapping, we can pass the assignment code a different set of mappings, and it will still work.

This allows us to reuse the logic we have in the assignment of the columns to properties instead of rewriting it for other column to property mappings or orderings.
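
As a minimal sketch of that reuse, the loop can be pulled out into a helper that accepts any list of setters. The names ColumnParser, Populate, Product, and CreateFromLegacyLine, and this simple StripQuotes, are assumptions for illustration and not part of the code above. This version also ignores extra columns that have no matching setter.

```csharp
using System;
using System.Collections.Generic;

public static class ColumnParser
{
    // Hypothetical helper: trims whitespace and surrounding double quotes.
    public static string StripQuotes(string value)
    {
        return value.Trim().Trim('"');
    }

    // Assignment logic: knows nothing about which property each column feeds.
    public static void Populate(string line, IList<Action<string>> propertySetters)
    {
        var columns = line.Split(',');

        // Stop at whichever runs out first, so missing optional columns
        // and unexpected extra columns are both tolerated.
        var count = Math.Min(columns.Length, propertySetters.Count);
        for (int columnNumber = 0; columnNumber < count; columnNumber++)
        {
            propertySetters[columnNumber](StripQuotes(columns[columnNumber]));
        }
    }
}

public class Product
{
    public string Name { get; set; }
    public string Description { get; set; }
    public string Category { get; set; }

    // Mapping for lines laid out as Name, Description, Category.
    public void CreateFromLine(string line)
    {
        ColumnParser.Populate(line, new List<Action<string>>
        {
            value => this.Name = value,
            value => this.Description = value,
            value => this.Category = value
        });
    }

    // Mapping for lines laid out as Category, Name; same assignment logic.
    public void CreateFromLegacyLine(string line)
    {
        ColumnParser.Populate(line, new List<Action<string>>
        {
            value => this.Category = value,
            value => this.Name = value
        });
    }
}
```

Only the list of setters differs between the two methods; the splitting, quote stripping, and bounds handling live in one place.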

Responses

  1. How about something like this? It works well for me in VB.net; I would think the theory would be the same in C#


    Private propertylist() As String = New String() {"myname", "mytext"}

    Private Sub Setme(ByVal params() As String)
        For i As Integer = 0 To UBound(params)
            myObject.GetType().GetProperty(propertylist(i)).SetValue(myObject, StripQuotes(params(i)), Nothing)
        Next
    End Sub

    • You and your VB.NET 😛

      Actually, although that would work, there are a couple of problems with the implementation.

      1. You will be using reflection, which isn’t necessarily bad, but it can slow things down quite a bit.
      2. You will lose the type safety, because you will have to give the property name as a string.

      If you mistype the property name, you could have a problem that you would only see at runtime. Also, if you refactor with a refactoring tool (say you change the property name), it is not likely to catch the property name in a string. You can do the same thing I did using VB.NET lambda syntax, though: http://msdn.microsoft.com/en-us/library/bb531253.aspx

      The place where I would use your solution would be if I were reading in the property names from some kind of configuration file on disk, or from a database, where I am dynamically creating the mapping.

      • I really don’t know why everyone is so hard on VB.net. It compiles into MSIL just like everything else. That aside, you’re right. That would be better for reading it from disk or if the properties were sent in the first row of the data.

  2. Nice. One thing that seems fragile here is the dependency of the order of the data columns matching the order of the setters in the list. If you change the data source to no longer return the Category column you have to find and change all the property setter lists that are calling the mapping method. Consider a variation that preserves the advantages of your solution while also conforming to the Open-Closed Principle.
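
One possible variation, sketched under the assumption that the first row of the data carries the column names: key the setters by header name instead of by position, so dropping or reordering a column in the data source needs no change to the mapping. HeaderMappedParser, Item, FromLines, and the header names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class HeaderMappedParser
{
    // Hypothetical helper: trims whitespace and surrounding double quotes.
    private static string StripQuotes(string value)
    {
        return value.Trim().Trim('"');
    }

    public static void Populate(string headerLine, string dataLine,
        IDictionary<string, Action<string>> settersByHeader)
    {
        var headers = headerLine.Split(',');
        var columns = dataLine.Split(',');

        var count = Math.Min(headers.Length, columns.Length);
        for (int i = 0; i < count; i++)
        {
            Action<string> setter;
            // Columns with no registered setter are skipped.
            if (settersByHeader.TryGetValue(StripQuotes(headers[i]), out setter))
            {
                setter(StripQuotes(columns[i]));
            }
        }
    }
}

public class Item
{
    public string Name { get; set; }
    public string Category { get; set; }

    public static Item FromLines(string headerLine, string dataLine)
    {
        var item = new Item();
        var setters = new Dictionary<string, Action<string>>
        {
            { "Name", value => item.Name = value },
            { "Category", value => item.Category = value }
        };
        HeaderMappedParser.Populate(headerLine, dataLine, setters);
        return item;
    }
}
```

The setters stay strongly typed lambdas, so a renamed property is still caught by the compiler; only the header strings are checked at runtime.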

  4. This method has very limited usage. All the properties (Name, Description, …) have to be of type string. If you have different types, you have to change to Action<object>, but then you can’t implicitly cast object to int, for example. A possible solution is to use Action<dynamic> if you are using C# 4.0, but you lose the compile-time checking.

    Just my 2 cents.
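
One way around the string-only limitation, as a rough sketch: keep Action<string> but do the conversion inside each lambda, so the typed property and its conversion are still checked at compile time. Order, Quantity, and Price are hypothetical names, and the inline quote stripping stands in for StripQuotes. Note that int.Parse and decimal.Parse throw on malformed input, so real code may prefer TryParse.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public class Order
{
    public string Name { get; set; }
    public int Quantity { get; set; }
    public decimal Price { get; set; }

    public void CreateFromLine(string line)
    {
        // Each setter receives the raw string and converts it for its own property.
        var propertySetters = new List<Action<string>>
        {
            value => this.Name = value,
            value => this.Quantity = int.Parse(value, CultureInfo.InvariantCulture),
            value => this.Price = decimal.Parse(value, CultureInfo.InvariantCulture)
        };

        var columns = line.Split(',');
        var count = Math.Min(columns.Length, propertySetters.Count);
        for (int i = 0; i < count; i++)
        {
            // Stand-in for StripQuotes: trim whitespace and surrounding double quotes.
            propertySetters[i](columns[i].Trim().Trim('"'));
        }
    }
}
```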
