Posted by: jsonmez | March 26, 2010

Refactor then Change Legacy Code

I was reminded yesterday of a very important step I had been forgetting when working with legacy code.  The first step.  Refactoring.

Working with legacy code can be challenging.  Especially legacy code that was written by someone who didn’t know what they were doing and then modified 10 times by someone who didn’t care what they were doing.  (This is perhaps 90% of legacy code.)

I had a tendency to jump right in and start implementing my clean feature in my own class, integrate that class into the legacy code, and then move the relevant legacy logic into it.

The good way

Let me see if I can summarize what my steps have looked like:

  1. Create failing unit test for new functionality I am implementing.
  2. Create new class which only has the logic for that functionality, but that I know overlaps some of the legacy code.
  3. Repeat steps 1 and 2 until my functionality is working.
  4. Integrate the use of my class into the legacy code.
  5. Start moving parts of the legacy code that share the responsibility of my class into my class.
  6. Refactor the remaining legacy code.

The better way

I don’t think these steps are that bad.  But there is one problem, which is easily solved by adding the step of “Refactor the legacy code” at the beginning.  The problem is that of clearly knowing the responsibilities of the legacy code.  Most of the time in this situation the legacy code has multiple responsibilities.  When you are done implementing the new functionality and cleaning up the code, you should end up with several classes, each with a single responsibility, as the Single Responsibility Principle (SRP) recommends.

The problem is that poorly written legacy code tends to hide all of the things it is doing.  Refactoring the code first helps you understand what responsibilities may be hiding in it.  It also lets you see the true structure of the logic, which makes it easier to identify the class you want to pull out to hold your new logic.

So, a better set of steps for adding functionality to legacy code is:

  1. Refactor legacy code to be as clean as possible.
  2. Create failing unit test for new functionality I am implementing.
  3. Create new class which only has the logic for that functionality, but that I know overlaps some of the legacy code.
  4. Repeat steps 2 and 3 until my functionality is working.
  5. Integrate the use of my class into the legacy code.
  6. Start moving parts of the legacy code that share the responsibility of my class into my class.
  7. Refactor the remaining legacy code.
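
The steps above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the order-processing scenario, the names `sumPricesInCents`, `formatReceipt`, `DiscountPolicy`, and `processOrder`, and the discount rule itself are all hypothetical and do not come from any real code base.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Step 1 (refactor first): assume the hypothetical legacy routine did
// everything inline. Extracting named helpers, without changing behavior,
// exposes two responsibilities: pricing and formatting.
long sumPricesInCents(const std::vector<long>& pricesInCents) {
    long total = 0;
    for (long price : pricesInCents) total += price;
    return total;
}

std::string formatReceipt(long totalInCents) {
    return "TOTAL: " + std::to_string(totalInCents) + " cents";
}

// Steps 3-4: a new class that owns only the new functionality.
// The 10000-cent threshold and the 10% rate are invented for illustration.
class DiscountPolicy {
public:
    long apply(long totalInCents) const {
        return totalInCents >= 10000 ? totalInCents - totalInCents / 10
                                     : totalInCents;
    }
};

// Step 2: the unit test. In practice it is written before
// DiscountPolicy::apply exists, so it fails first and then passes.
void testDiscountPolicy() {
    DiscountPolicy policy;
    assert(policy.apply(9999) == 9999);   // below the threshold: unchanged
    assert(policy.apply(10000) == 9000);  // at the threshold: 10% off
}

// Step 5: the legacy entry point now delegates to the new class.
std::string processOrder(const std::vector<long>& pricesInCents) {
    DiscountPolicy policy;
    return formatReceipt(policy.apply(sumPricesInCents(pricesInCents)));
}
```

Because the helpers were extracted first, it is clear that the discount belongs between summing and formatting, and the new class can be dropped in at that one point.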

Responses

  1. I agree. In fact I would hazard that the legacy footprint is probably 10-20% larger than what is actually in use.

    Migration exercises are often best done by analysing the legacy code inventory to spot unused, uncalled modules and reducing the footprint, leaving behind a more relevant codeset.

    See: http://eswarann.wordpress.com/2010/04/23/on-understanding-code/

  2. Hi John,
    I like the Agile approach, but I fear that your “better approach” is riskier than you can afford.
    You should not attempt refactoring legacy code without having tests that assert the correctness of the system.
    This can be difficult for a legacy system for which no one can define exactly what it does, and I am looking for articles and suggestions on how to do that.
    But my intuition tells me that this is definitely a required first step.
    Refactoring without tests, especially on a system you don’t understand, is a sure recipe for introducing errors at the start.
    Do you agree?
    I’d be happy to hear your opinion.
    David

    • Actually, in general I agree with you, but if you can use IDE refactoring tools to do the refactoring, you can be much more confident that you are not changing functionality.
      Of course if you can write tests that verify how the system currently functions before making any change, that is the best option. It is not always an option though.
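
One way to write tests that verify how the system currently functions is a characterization test: run the untouched code, record what it returns, and assert that the refactored code returns the same values. The sketch below is a minimal illustration; `legacyShippingCost`, its rules, and the refactored version are hypothetical and exist only to show the shape of such a test.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical legacy function whose rules nobody can fully explain.
int legacyShippingCost(int weightKg) {
    int c = 5;
    if (weightKg > 10) { c += (weightKg - 10) * 2; if (weightKg > 50) c -= 7; }
    return c;
}

// Candidate refactoring: same arithmetic, with named intermediate values.
int refactoredShippingCost(int weightKg) {
    const int base = 5;
    if (weightKg <= 10) return base;
    const int surcharge = (weightKg - 10) * 2;
    const int heavyRebate = weightKg > 50 ? 7 : 0;
    return base + surcharge - heavyRebate;
}

// (input, output) pairs recorded by running the unmodified legacy code.
// They are observations of current behavior, not requirements.
const std::vector<std::pair<int, int>> kRecorded = {
    {0, 5}, {10, 5}, {11, 7}, {50, 85}, {51, 80}};

// Returns true when `cost` reproduces every recorded observation.
bool matchesRecordedBehavior(int (*cost)(int)) {
    for (const auto& sample : kRecorded) {
        if (cost(sample.first) != sample.second) return false;
    }
    return true;
}
```

Calling `matchesRecordedBehavior(legacyShippingCost)` checks that the recording itself is right, and calling it with `refactoredShippingCost` checks that the refactoring preserved it. The samples here sit on both sides of each branch boundary; a real system would need many more.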

  3. Hi John,

    I want an automatic tool that could convert nested code like this:

    (C / C++)

    if (q1 && q2 || ….)
    {
        do_1();
        if (q3 …)
        {
            do_3();
            if (q4 …)
            {
                do_4();
            }
            else
            {
                do_e4();
            }
        }
        else
        {
            do_e3();
        }
    }

    Into linear code like this:

    if (q1 && q2 || ….)
        do_1();

    if ((q1 && q2 || ….) && q3)
        do_3();

    if ((q1 && q2 || ….) && q3 && q4)
        do_4();

    if ((q1 && q2 || ….) && q3 && !q4)
        do_e4();

    if ((q1 && q2 || ….) && !q3)
        do_e3();

    Do you know of such a tool?

    Thank you

    Uri.

    • Sorry, I do not.

      • I appreciate the effort 🙂
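
The flattening asked about in comment 3 can be done by hand, with one caveat: re-evaluating the full condition in every `if` is only safe when the conditions have no side effects and the `do_` calls cannot change them. A minimal sketch that avoids this by storing each condition in a flag is shown below. The boolean parameters stand in for the elided conditions, and `nested`, `flat`, and `sameForAllInputs` are hypothetical names used only for this comparison.

```cpp
#include <cassert>
#include <string>
#include <vector>

using Trace = std::vector<std::string>;  // records which do_ calls ran, in order

// Nested form, mirroring the shape in the comment.
Trace nested(bool q1, bool q3, bool q4) {
    Trace t;
    if (q1) {
        t.push_back("do_1");
        if (q3) {
            t.push_back("do_3");
            if (q4) t.push_back("do_4");
            else    t.push_back("do_e4");
        } else {
            t.push_back("do_e3");
        }
    }
    return t;
}

// Linear form. Each flag is computed once, at the point where the nested
// code would have tested it, so with real conditions the evaluation order
// relative to the do_ calls stays the same.
Trace flat(bool q1, bool q3, bool q4) {
    Trace t;
    const bool in1 = q1;
    if (in1) t.push_back("do_1");
    const bool in3 = in1 && q3;
    if (in3) t.push_back("do_3");
    const bool in4 = in3 && q4;
    if (in4) t.push_back("do_4");
    if (in3 && !in4) t.push_back("do_e4");
    if (in1 && !in3) t.push_back("do_e3");
    return t;
}

// Exhaustively compares both forms over all eight input combinations.
bool sameForAllInputs() {
    for (int bits = 0; bits < 8; ++bits) {
        const bool q1 = (bits & 1) != 0;
        const bool q3 = (bits & 2) != 0;
        const bool q4 = (bits & 4) != 0;
        if (nested(q1, q3, q4) != flat(q1, q3, q4)) return false;
    }
    return true;
}
```

With only three conditions the exhaustive check is cheap, and it doubles as a regression test for the hand-made transformation.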

