Wednesday, October 14, 2015

TDD and Getting Lost in the Trees (Part 2)

"There is no such thing as complete when it comes to stories. Stories are infinite. They are as infinite as worlds." – Kelly Barnhill

This is the sixth of a series of posts about Test-Driven Development (TDD).  Here's a recap of what I've posted so far:


In the last post, I started describing the situation when, in doing TDD, it starts to feel a lot like you're getting lost in the woods. You get that sinking feeling of helplessness when you've lost your bearings and become confused. There's probably even some dismay, desperation, and fear mixed in that bag of doom and gloom. Invariably, people in this situation ask themselves "What in the world have I gotten myself into and how the heck do I get out of it?"

There are two aspects to addressing this problem: prevention and remediation.

Let's talk about prevention first.

Extending another olive branch


Before we get back to eavesdropping on the imaginary conversation between Phil, aka "Tandy", and Carol in their simple calculator TDD session, I'd like to extend another olive branch to the practical-minded folks who have stuck it out through the five philosophically-laden posts that came before. That's a long time to hold out hope that you'll see some sample code written through TDD.

I'm going to take a slight detour from the calculator program and demonstrate a few things from a very simple exercise in refactoring. It will illustrate three simple tools that I believe can help you vastly improve 80% of the code you write.

I was looking for an analogy for these and started out with the venerable Swiss Army Knife but after thinking about it, these techniques are really like your basic oral hygiene kit: a toothbrush, floss, and mouthwash. If you use these properly and consistently, not only can you go about your daily business with freshness and confidence, you can also avoid a host of bigger problems and pain in the long term.

The Code Hygiene Starter Kit


The code hygiene starter kit includes the following techniques:

  1. Rename
  2. Extract
  3. Compose Method

That's it.

You might be saying to yourself right now that it can't be that simple. Yes, in my experience it really is that simple. However, my observation is probably consistent with what dental hygienists see with their patients: there are a lot of people who do not use these tools diligently or consistently. Only about one out of ten programmers I talk to have even heard of the Compose Method refactoring or use it on a regular basis when they are refactoring code.

In my experience, diligently applying these techniques in every TDD cycle keeps you moving forward with clean, readable, and well-factored code 80% of the time. Oftentimes, these are all that's required for 100% of your refactoring needs.

Joshua Kerievsky's Compose Method example has become my go-to example for this trio of refactoring techniques. I like to use Joshua's example not only because there's a stark improvement in the code after just a few simple changes, but it also embodies the strategy that I use to avoid getting into the "lost in the woods" predicament we're talking about.

I won't rehash what Joshua discussed in his article so go ahead and read that before you continue here.

Notice that Joshua uses the three refactoring techniques in our basic code hygiene kit. Mixed in there are some bonus goodies: the Arrow Code antipattern and the Guard Clause, and the Single Level of Abstraction Principle or SLAP, which I mentioned in passing in the first post of this series. It also has the elements of the strategy I mentioned. More on that later.

It'd be great to be a fly on the wall


In the previous post, we heard Carol tell Phil that the way we get there is just as important as what we get in the end when we do TDD.  The problem with most examples on TDD is that, for the sake of brevity, they seldom let us in on the conversation that goes on while refactoring and TDD is happening. We only get to see the results. That's understandable but unfortunate.

I think that TDD remains a mystery to many because they miss out on certain nuances about TDD that can only be discerned when we have a better understanding of the thought processes involved in arriving at the refactored code. Conversations and experimentation are what weaves the threads of knowledge into the cloth of understanding.

So here's another chance at being a fly on the wall during a reconstruction of some of the conversations I have with participants in my TDD workshop when we go over the Compose Method refactoring.

Caveat: As you read through the dialog below, remember that these are still fragments of a much longer conversation. For the sake of relative brevity and coherence, I left out parts where we go from Red to Green to Refactor or where we discuss the merits of others choices we needed to make. This illustrates my point about the challenges in giving good, complete stories about TDD. You really have to be there to appreciate the whole thing.

Me: First up, let's address the Arrow Code problem. We know how to recognize it now, right? And to remediate the problem, we introduce a guard clause, right?

Participants: Right.

Me: Hmmm, since we're doing TDD, do we really want to do that right away or should we do something else first?

Participants: We should probably write a test first to make sure we don't break anything.

Me: Great idea! What test do we write first?

Invariably, somebody will want to address the null case first. I steer them away from that and any other test idea not related to introducing the guard clause. I'll skip that part of the conversation for now; there are other topics where that's relevant but we'll get into those later.

Me: So, say we write the test to check that the guard clause works. What would this test look like?

Participants: We should check the readOnly property.

Somebody will offer something like this:

public class MyListTest {
 
   @Test
   public void testReadOnly() {
       MyList list = new MyList();
       list.setReadOnly(true);

       assertTrue(list.isReadOnly());
   }
}


Me
: Ok, so we add a setter, that's fine. But how does this test help us make sure that the "introduce guard clause" refactoring won't mess anything up?

Participants: Well, if the list is read only, then the guard clause should work, right? The guard clause is checking the readOnly attribute and it will work if the readOnly attribute is true, right?

Me: But is that really what this test is telling us?  Here's what I'm getting from this: When you make a list readonly, then its readOnly property is true.

Participants: We don't get it. We're confused. What's your point?

Me: That's like saying that if you're a U.S. citizen, then your passport is blue.

Participants: Ok, that's true, a U.S. passport is blue. But we still don't get your point.

Me: If I'm a U.S. citizen, the fact that my passport is blue doesn't mean much. Being a U.S. citizen means I can vote. Otherwise, I can't. What does the readOnly attribute allow or prevent you from doing?

Participants: Ok... well, if readOnly is true, we can't add any new elements to the list.

Me: Right! So instead of checking the readOnly attribute, we need to check the elements, right?  If we try to add to a list that is read-only, then the code should ignore that request and the elements array shouldn't change, right? Would anybody care to modify the test?

Somebody will offer something like this:

public class MyListTest {
 
   @Test
   public void testReadOnly() {
       MyList list = new MyList("First");
       list.setReadOnly(true);
 
       list.add("new element");

       assertEquals(1, list.getElements().length);
   }
}


Me: Ok, so we add a constructor that takes an initial element. We can easily make that take a variable size argument list so it can be any number of elements. That makes sense. Then we get the elements and check its length. Is everybody ok with this?

There is usually a nod of agreement or an awkward pause here, sometimes both. Sometimes somebody will object but let's go with the usual case where they still feel a little lost. I won't even go down the thread of conversation that addresses the fact that the above code has a big bug in it. That can also be the motivation for getting away from this option.

Me: Ok, is there an object-oriented design concept that we're violating here? Are we breaking anything, like say by examining the length of the elements array that we get from the list?

Participants: Yeah, we're breaking encapsulation. But we do this all the time in our code...

Me: Doesn't mean it's right, does it? This is what talking about these things does: it puts your bad habits in a spotlight. It makes you rethink your approach.  What can we do so we don't break the encapsulation of the list?

After more prodding and discussion, someone will come up with this:

public class MyListTest {
 
   @Test
   public void testReadOnly() {
       MyList list = new MyList("First");
       list.setReadOnly(true);
 
       list.add("new element");

       assertEquals(1, list.size());
   }
}


Me: Ok, that looks better.  By adding the size() method, we pushed down the knowledge about the implementation of the elements array and its length, and replaced it with a higher level concept of size().  At this level of abstraction, it doesn't matter whether there's an array or whatever involved.

Me: That's a step forward. But there's some implied dependency in this test. Where does that 1 in the assertion come from?

Participants: There's only one initial element in the list.

Me: But what if somebody comes in later and decides they want more than one initial element? Are they going to remember to modify that assertion accordingly? What are the chances of them forgetting to do that? There's an implied connection between that first line where we initialize the list and the assertion.  Can we do something to make this more explicit and less brittle? Either bring it all out in the open or hide it behind a name that reveals our intent better?

After more prodding and discussion, someone might come up with this:

public class MyListTest {
 
   @Test
   public void testReadOnly() {
       MyList list = new MyList("First");
       int initialSize = 1;
       list.setReadOnly(true);
 
       list.add("new element");

       assertEquals(initialSize, list.size());
   }
}

Me: That's good. We definitely made the intent of the test a little clearer. But there's still a need to change that line that sets the value of initialSize if we add more initial list elements in the line above. We can make that better, right? How about if we did this?

public class MyListTest {
 
   @Test
   public void testReadOnly() {
       MyList list = new MyList("First");
       int initialSize = list.size();
       list.setReadOnly(true);
 
       list.add("new element");

       assertEquals(initialSize, list.size());
   }
}

Me: That's kind of obvious now, right? Why didn't we think of that before? Now we won't have to worry about having to keep these two lines synchronized. But there's still something not quite right with that name. There seems to be a disconnect between its intent and the test intent.

I want the test to read pretty much like this: When I try to add new elements to a read-only list, the list's size should not be changed after I call the add() method. Or something along those lines. How can we refactor this test so that it says that? Is there a better name we can use instead of initialSize?

After more discussion, we have these versions to choose from:

// Choice #1

@Test
public void testReadOnly() {
   MyList list = new MyList("First");
   int initialSize = list.size();
   list.setReadOnly(true);
 
   list.add("new element");

   int finalSize = list.size();

   assertEquals(initialSize, finalSize);
}

// Choice #2

@Test
public void testReadOnly() {
   MyList list = new MyList("First");
   int sizeBefore = list.size();
   list.setReadOnly(true);
 
   list.add("new element");

   assertEquals(sizeBefore, list.size());
}

// Choice #3
 
@Test
public void testReadOnly() {
   MyList list = new MyList("First");
   int sizeBefore = list.size();
   list.setReadOnly(true);
 
   list.add("new element");

   int sizeAfter = list.size();

   assertEquals(sizeBefore, sizeAfter);
}

// Choice #4

@Test
public void add_does_not_change_size_of_readOnly_list() {
   MyList list = new MyList("First");
   list.setReadOnly(true);
   int notChanged = list.size();
 
   list.add("new element");

   assertThat(list.size(), is(notChanged));
}

Style Note: The test name convention I use here is something I picked up from Neal Ford at a No Fluff Just Stuff Symposium many years ago. Neal's assertion was that test names are special and don't really need to follow the normal convention for regular method names. I find this convention more readable and it also helps me highlight regular program element names when I use them in the test name.

That last option above actually involves a lot more discussion than I let on here. There's discussion about the test name itself, the convention (see the Style note above), whether we really need that second explaining variable, whether we want to use a more fluent API for assertions, and even a discussion about whether to move the first explaining variable closer to the Act part of the test (see Arrange, Act, Assert).

After more discussion, we settle on this:

@Test
public void size_of_readOnly_list_does_not_change_after_add() {
   MyList list = new MyList("initial element");
   list.setReadOnly(true);
   int notChanged = list.size();
 
   list.add("new element");

   assertThat(list.size(), is(notChanged));
}

There are a lot of details in this particular arch of a story that I have left out. I will leave it to you as an exercise with your programming partners to explore and discover your own version of this story.

Again with the reflections


You might be getting tired of this by now but it's not going to end. These posts are going to be full of reflections.

Reflection #1: Refactoring is a lot harder in the field than it is on paper.

On paper, the Compose Method refactoring example seems simple and straightforward. You see the end result and the way to get there seems obvious. Reality is very different, however, especially when there are people who are not familiar with the thought processes involved.  The end result is not known and getting there requires experimentation, discussion, and sometimes, negotiation.

In my experience from giving interviews/auditions and from leading a TDD workshop, this exercise can easily take 30 minutes to go through when I do it with as few as three or four people. And if you're thinking otherwise, this seems to have very little to do with the intelligence or apparent lack thereof of the people involved.

Rather, I have found that the level of familiarity and understanding of the principles of design, the mechanics of the refactoring techniques, and the kind of questions we ask each other and the quality of the ensuing discussions determines how quickly we get to the desired end state of the code or at least a good enough approximate of it.

Reflection #2: Finding good names for program elements is challenging.

Programmers have a tendency to be very technical when choosing names in their programs. Names tend to leak implementation details. Names can mislead the reader. They can obfuscate the intent of the code. I'm not even talking about the names that are chosen out of lack of effort to find anything better than numElmnts.  Names like this are a pet peeve of mine. Read the relevant chapter in Uncle Bob's "Clean Code" book. Please.

There are a few things you can do to find good names. Keep the language of the conversation focused on high-level concepts. Constantly remind each other that you want to keep as much of the implementation details hidden and make the intent abundantly clear. Pay attention to grammar, spelling, and context. These will usually help you stumble upon a good name.

In the five TDD workshops that I have led in the last two years, there has been only one person who came up with the name atCapacity() right off the bat.  And no, this guy had never seen Joshua Kerievsky's example before. I was very excited when that happened and my reaction got a few strange looks. All the other times, we needed a few iterations to get something as succinct. Some usual suspects include: hasEnoughElements(), isMaximumSize(), isMaxSizeReached(), and other similar alternatives, each with their own problems and reasons for disqualification.

I will talk about names in more detail in another post.

Reflection #3: It is important to have these kinds of conversations and thought processes to spur action down a path that leads to better code.

The conversations usually revolve around revealing intent and improving clarity. Revealing intent includes the intent of tests as well as the intent of the program code. These two intents must be in agreement. The tests need to say what it expects of the program and the program must say what it does to meet those expectations.

Questions must be asked in a way that elicits ideas for saying what you want to say both with the tests and the code in as clear and succinct way as possible. Don't settle on the first idea that pops up either. There are usually more questions and ideas that can be brought to the surface after the first few rounds of refactoring.

Reflection #4: Prefer to verify behaviors and capabilities rather than attributes

Tests should focus on verifying behavior, not so much the value of an attribute. This meshes very nicely with the idea that object-oriented programming is mostly about properly assigning responsibilities. A good set of guidelines to use is GRASP, the general responsibility assignment software principles. There's also the Law of Demeter and code smells like feature envy and inappropriate intimacy.

In the example conversation, I used the analogy of being concerned with a person's ability to vote rather than the color of a person's passport.

Since tests also should tell what is expected of the program, you should make sure the test says what these expectations are in a clear and unambiguous way. Nothing implied, nothing assumed, and nothing that the reader must deduce from clues that are found anywhere other than the immediate test. Parts of the example conversation revolved around this.

I find that reading the code out loud really helps in discovering subtle problems related to unclear intent and mis-assigned responsibilities. Try to explain what the code is doing in non-technical terms to each other. I'll write more about making code more expressive and readable and how to "listen" to the code in a subsequent post.

Reflection #5: Diligently and consistently apply the Code Hygiene Starter Kit techniques in the refactoring step.

These three simple techniques will help you quickly fix problems you introduced in the Red and Green steps of TDD and avoids the buildup of plaque in your code that can cause bigger problems and pain in the long run.

Just as you should brush, floss, and gargle daily, you should also Rename, Extract, and Compose Method constantly throughout the TDD cycle. Refactor ruthlessly.

Reflection #6: The "strategy" that I alluded to earlier in this post is this: have an idea of the end in mind and use that to guide you along the way.

It's kind of like getting a bearing on the sun and having a general sense of north, south, east, and west. Or finding the North star if you're traveling at night in the northern hemisphere. This is your "big picture" view.

As you slowly make your way through the fog of uncertainty and the unknown, the brush and undergrowth of quickly written code, and the thick of trees that are your program requirements, don't lose sight of where that guiding light is and keep a bead on it so you know you're still heading in the right direction.

Don't sweat the small stuff or let yourself get distracted by them but don't forget to pay attention to the little details, too. Little details can often make a big difference in how your code ends up saying what it's doing.

You may not see any connection between what I discussed above and the last reflection but it's there. It's just very subtle and for me, it's one of those things that is elevated to the level of a general principle for TDD.

Again with the Martial Arts connections


You may also be getting tired of these martial arts analogies but these concepts have really helped me see things in my TDD practice in a different light.

Miyamoto Musashi wrote in the "Use of the Gaze in Strategy" section of his "Book of Five Rings":

Use the eyes in a broad manner. There are two aspects of sight—perception and seeing. Perception is strong and seeing is weak. It is vital to see things which are at a distance as if they were close, and things which are close as if they were far away. It is vital in strategy to know the opponent's sword, but not to look at it.

Aikido has a similar concept called Zanshin which also pertains to a broad focus and general awareness of your surroundings.  In western terms, the closest thing to zanshin is the concept of situational awareness.  It's an awareness of your own relative position and connection to everything else in any given setting.

In an often-told story about events that led to a spiritual awakening that inspired the development of Aikido, O'Sensei was said to have seen and perceived his connection to the entire universe, thus allowing him to easily and deftly evade an expert swordsman's strikes until the swordsman was so exhausted from exertion that he had to concede defeat to O'Sensei.

I like to explain it like this: In Aikido we are taught to not let our minds be captured by the swinging katana (sword). The blade moves faster than you can ever hope to follow. We are taught to not allow our mind be captured by the other person's eyes but rather to look at nothing in particular and perceive everything as a whole. Connect with uke's center this way and you can control him before he even moves. In Aikido, the conflict is over before it even begins.

If this makes no sense to you, that's fine. Hopefully you'll start getting a sense for what it means as you practice TDD more.  More on this later.

That's it for this "little" detour. Hopefully, it has helped set the tone for a better understanding and quicker recognition of ways around the challenges that Phil and Carol are going to encounter when we rejoin them in their TDD exercise next time.

Next: TDD and Getting Lost in the Trees (Part 3)

No comments: