Monday, October 13, 2008

Refactoring Legacy ASP.NET Code-Behind with ReSharper

I've been on a refactoring kick lately.  And with the state of the legacy code at my office, I'll be refactoring for a very long time to come.  This is a write-up of a refactor I did at work, altered enough so that I don't violate my NDA.  But I think there's enough of the real story left to illustrate what I was talking about last time: you can indeed beautify even the ugliest code.  A couple of items to note: 1) we're still on .NET 2.0 (no we haven't migrated to 3.5 yet - don't ask!), and 2) I'm not going to touch on testing much in this post.  I really want to focus on the fact that you can positively affect some seriously messed up code with some pretty simple steps.

ParticipantDetails Refactor

We have a legacy ASP.NET web app that has some very tangled code-behind pages which need to be addressed.  There are lots of items on the legacy app development wish list.  Near the top, we have a strong desire to pull .NET 1.1 custom components out of our legacy app and replace them with our .NET 2.0 custom components.  It would be wonderful if we only had to switch references to affect this change, but there are too many barriers in our code to a solution that simple.  Another wish list item is testability, which means our code has to be restructured in such a way as to make it easily testable.  If we tried to do all of this at one time we would most assuredly fail.  Instead, we have to approach the problem with a series of smaller steps with an eye toward refactoring large pages over multiple iterations.

The ParticipantDetails.aspx.cs file is a good example of a code-behind file that has grown large and cumbersome: it has 3384 lines of code in it.  It has many long methods in it, the longest of which is GetParticipant() at nearly 900 lines.  This method essentially does three things:

  1) fetches participant info from the database,

  2) formats that data for display, and

  3) assigns formatted data to the appropriate controls on the page. 

To deal with this page, and particularly long methods like GetParticipant(), I approached it with three goals for this iteration:

  1) make the methods much smaller, each focused on a single task, and therefore more easily readable and maintainable,

  3) move the persistence coupling entirely out of this page, and instead couple the page to a set of Data Transfer Objects, and

  2) apply the MVP pattern to this page to move as much code as possible into more testable classes.

I used ReSharper extensively for this refactor, particularly the Extract Method feature.

Small Methods

At 900 lines, GetParticipant() is extremely difficult to read, and even more difficult to modify correctly.  It's easy to believe that the primary purpose of writing code is to allow the developer to give operational commands to a computer.  We have to change our thinking here: that's the secondary purpose.  The code we write is a living document that contains business information.  The primary purpose I have when writing code is to create a document that allows me and you and everyone of our developers to communicate about how our company does business.  A 900 line method conceals that information by adding noise.  We reduce that noise and strengthen the real message of our code by separating concerns into multiple methods and classes.  (By the way, this has the nice side effect of breaking dependencies and reducing coupling, which will make our code much easier to change in the future.)

Granted, we will end up with "class explosion".  But the alternative is what we have now: "method bloat".  A code base with many small classes containing small methods is much easier to maintain than a few huge classes with very long methods.  How long is too long?  I like to be able to see every line of a method on one screen.  I have a rule of thumb called the "head rule": if I can't fit the whole method in my head, it's too long.  How much can I fit in my head?  Well, if I'm looking at the method on my screen and I hold my head next to it, if the method is longer than my head then it's too long. J

Data Transfer Object (DTO)

A Data Transfer Object (DTO), as the pattern is described by Martin Fowler, is "an object that carries data between processes in order to reduce the number of method calls".  A DTO contains nothing but simple properties, no business logic inside the getters and setters for each property, and no methods at all.  I used ReSharper to help me create a DTO called ParticipantDetailsDTO. 

I started by declaring an ParticipantDetailsDTO variable at the top of the method.  ReSharper immediately notices that I don’t have a class called ParticipantDetailsDTO in my solution and turns the new class name red.  When put my cursor on the class name and click Alt+Enter, a menu appears that offers to create a new ParticipantDetailsDTO class for me.


I lucked out a bit when it was time to fill out the DTO.  Most of the controls on the page are hydrated from private variables rather than directly from the Business 1.1 objects.  My next step was to delete the declarations of those private variables.  When I did that, ReSharper turned the variable names red wherever they appeared in the method.




Then, I prefixed every bad variable with “participantDetailsDTO”.  Resharper asked me if I wanted to create a new property in ParticipantDetailsDTO called “strAsstEmail”.  I took Resharper’s advice, and I worked my way through the whole method this way, filling out the ParticipantDetailsDTO class with new properties as I needed them.  I also changed the property names to make them a bit more readable: e.g., strAsstEmail became AssistantEmail.  Since the participant concept contains collections of other things (like Groups, Jobs, Specialties, etc.), I ended up needing several other DTOs.  ParticipantDetailsDTO now contains List<T> properties for those collections.  At the end of this process, all the controls on the page were dependent on DTOs rather than individual private variables.  I had created a seam - I'm sure I'm not using that term as strictly as Michael Feathers intended, but I think the idea is similar: a specific place in my class meant to be moved or otherwise modified without affecting the rest of the class.

From there, still in the page code-behind, I started pulling similar code together: anything that was concerned with hydrating participantDetailsDTO got moved into a separate method.  Same with all the other DTOs: each got its own method.  This was a bit tricky.  Sometimes I ran into code that decided how certain property values should be set.  I had to be careful to preserve this code.  But at the end of it, I had a number of methods that were all about hydrating the participantDetailsDTO and its collections.  So I put them all into a class called ParticipantDetailsDTORepository and moved that class into the Vega.Common assembly.  I also created an interface for it called IParticipantDetailsDTORepository – more about that in a minute.  I ended up with something like this:


This is just a quick sketch, but I think you can see the bones of the design.  This implementation of the IParticipantDetailsDTORepository works only with the .NET 1.1 components.  IParticipantDetailsDTORepository contains a method called GetParticipantDetailsDTO(int participantId).  When the time comes to switch this page over to the .NET 2.0 Business Components, we’ll write a new implementation of this repository interface that pulls participant info from the BC components.  That should be a goal in our next iteration.

Model View Presenter (MVP)

So we’ve moved our participant fetch logic out of the code-behind.  And obviously the code that assigns values to page controls has to stay in the code-behind (or does it?).  But I see no reason for the data formatting code to remain here.  This is where the model view presenter (MVP) pattern comes in.  I declared an empty IParticipantDetailsView and made the page inherit from it.  I also created a ParticipantDetailsPresenter class in Vega.Common – this is where I put all the formatting logic, and pretty much any other logic that wasn’t directly concerned with hydrating page controls.  I made heavy use of Resharper’s Extract Method feature here.  So now we also have something like this:

It is actually possible, and even desirable, to move control-hydration code into the presenter, but I didn’t go that far.  I would love to go that direction and show it all under unit tests, but I didn’t have time in this post. 

We're not done yet.  There is still much more we can do to simplify this page.  But the result so far is that the persistence-related code has been moved out into testable classes, the ParticipantDetails code-behind is now about 1200 lines instead of more than 3300, the code is much easier to read and understand, and we have an upgrade path to more easily switch out our persistence mechanism when we're ready.  Next time, I'll talk about testing to detect change to ensure that the DTO built from the new persistence mechanism is the same as from the original mechanism.

Share this post :