66. Program with GUTs – 97 Things Every Java Programmer Should Know

Chapter 66. Program with GUTs

Kevlin Henney

So you’re writing unit tests? Great! Are they any good? To borrow a term from Alistair Cockburn, do you have GUTs? Good unit tests? Or have you landed someone (future you?) with interest-accumulating technical debt in their testbase?

What do I mean by good? Good question. Hard question. Worth an answer.

Let’s start with names. Reflect what is being tested in the name. Yup, you don’t want test1, test2, and test3 as your naming scheme. In fact, you don’t want test in your test names: @Test already does that. Tell the reader what you’re testing, not that you’re testing.

Ah, no, I don’t mean name it after the method under test: tell the reader what behavior, property, capability, etc. is under test. If you’ve got a method addItem, you don’t want a corresponding addItemIsOK test. That’s a common test smell. Identify the cases of behavior, and test only one case per test case. Oh, and no, that doesn’t mean addItemSuccess and addItemFailure.

Let me ask you, what’s the purpose of your test? To test that “it works”? That’s only half the story. The biggest challenge in code is not to determine whether “it works,” but to determine what “it works” means. You have the chance to capture that meaning, so try additionOfItemWithUniqueKeyIsRetained and additionOfItemWithExistingKeyFails.
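Here is a minimal sketch of how those two names might look in a JUnit 5 test class. The Inventory class, its methods, and the choice of IllegalArgumentException for "fails" are assumptions invented for illustration; only the two test names come from the text.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import java.util.HashMap;
import java.util.Map;
import org.junit.jupiter.api.Test;

// Hypothetical class under test, invented for this illustration.
class Inventory {
    private final Map<String, String> items = new HashMap<>();

    // Adds an item; a key that is already in use is rejected (an assumed rule).
    void addItem(String key, String item) {
        if (items.containsKey(key)) {
            throw new IllegalArgumentException("Key already in use: " + key);
        }
        items.put(key, item);
    }

    String itemFor(String key) {
        return items.get(key);
    }
}

class InventoryTest {
    // One case of behavior per test case, named as a proposition.
    @Test
    void additionOfItemWithUniqueKeyIsRetained() {
        Inventory inventory = new Inventory();
        inventory.addItem("A1", "widget");
        assertEquals("widget", inventory.itemFor("A1"));
    }

    @Test
    void additionOfItemWithExistingKeyFails() {
        Inventory inventory = new Inventory();
        inventory.addItem("A1", "widget");
        assertThrows(IllegalArgumentException.class,
                () -> inventory.addItem("A1", "gadget"));
    }
}
```

Neither name mentions the word test, and neither is just the method name with a verdict bolted on.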

Because these names are long, and also aren’t production code, consider using underscores to improve readability (camel case doesn’t scale), as in Addition_of_item_with_unique_key_is_retained. With JUnit 5 you can use DisplayNameGenerator.ReplaceUnderscores with @DisplayNameGeneration to pretty-print as “Addition of item with unique key is retained.” You can see that naming as a proposition has a nice property: if the test passes, you have some confidence the proposition might be true; if it fails, the proposition is false.
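As a sketch of the underscore style, assuming JUnit 5 is on the classpath: the Catalog class below is hypothetical and exists only to give the test something to exercise.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.HashMap;
import java.util.Map;
import org.junit.jupiter.api.DisplayNameGeneration;
import org.junit.jupiter.api.DisplayNameGenerator;
import org.junit.jupiter.api.Test;

// Hypothetical class under test, invented for this illustration.
class Catalog {
    private final Map<String, String> items = new HashMap<>();

    void addItem(String key, String item) {
        items.put(key, item);
    }

    String itemFor(String key) {
        return items.get(key);
    }
}

// The generator replaces underscores with spaces in reported test names.
@DisplayNameGeneration(DisplayNameGenerator.ReplaceUnderscores.class)
class Catalog_tests {
    @Test
    void Addition_of_item_with_unique_key_is_retained() {
        Catalog catalog = new Catalog();
        catalog.addItem("A1", "widget");
        assertEquals("widget", catalog.itemFor("A1"));
    }
}
```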

Which is a good point. Passing tests don’t guarantee that the code works. But, for a unit test to be good, the meaning of failure should be clear: it should mean the code doesn’t work. Like Dijkstra said, “Program testing can be used to show the presence of bugs, but never to show their absence!”1

In practice, this means a unit test shouldn’t depend on things that can’t be controlled within the test. Filesystem? Network? Database? Asynchronous ordering? You may have influence, but not control. The unit under test shouldn’t depend on things that could cause failure when the code is correct.
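One way to get that control, sketched under assumptions: have the unit accept a java.io.Reader rather than open a file itself, so the test can hand it an in-memory StringReader. KeyListParser and its one-key-per-line format are hypothetical, not something from a real codebase.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;
import org.junit.jupiter.api.Test;

// Hypothetical unit under test: it parses whatever Reader it is given,
// so it never touches the filesystem on its own.
class KeyListParser {
    static List<String> parse(Reader source) {
        return new BufferedReader(source).lines()
                .map(String::trim)
                .filter(key -> !key.isEmpty())
                .collect(Collectors.toList());
    }
}

class KeyListParserTest {
    @Test
    void Blank_lines_and_surrounding_spaces_are_ignored() {
        // The input lives entirely inside the test: no file, no network.
        Reader source = new StringReader("A1\n\n  B2  \n");
        assertEquals(List.of("A1", "B2"), KeyListParser.parse(source));
    }
}
```

If this test fails, the parser is wrong; a missing file or a full disk can't be the culprit.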

Also, watch out for overfitting tests. You know the ones: brittle assertions on implementation details rather than required features. You update something—spelling, a magic value, a quality outcome—and tests fail. They fail because the tests were at fault, not the production code.
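A small sketch of the difference, assuming the wording of an error message is not itself a requirement. KeyRegistry is a hypothetical class invented for the example.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import java.util.HashSet;
import java.util.Set;
import org.junit.jupiter.api.Test;

// Hypothetical class under test, invented for this illustration.
class KeyRegistry {
    private final Set<String> keys = new HashSet<>();

    void register(String key) {
        // Set.add returns false when the key is already present.
        if (!keys.add(key)) {
            throw new IllegalArgumentException("Duplicate key: " + key);
        }
    }
}

class KeyRegistryTest {
    // Overfitted: pins the exact message text, so rewording the message
    // breaks the test even though the behavior is unchanged.
    @Test
    void Registration_of_existing_key_fails_with_exact_wording() {
        KeyRegistry registry = new KeyRegistry();
        registry.register("A1");
        IllegalArgumentException failure = assertThrows(
                IllegalArgumentException.class, () -> registry.register("A1"));
        assertEquals("Duplicate key: A1", failure.getMessage());
    }

    // Better fitted: asserts only the assumed requirement, that the call is rejected.
    @Test
    void Registration_of_existing_key_fails() {
        KeyRegistry registry = new KeyRegistry();
        registry.register("A1");
        assertThrows(IllegalArgumentException.class, () -> registry.register("A1"));
    }
}
```

Both tests pass today. Only the first one fails when someone tidies up the message.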

Oh, and keep your eyes open for underfitting tests too. They’re vague, passing at the drop of a hat, even with code that’s wildly and obviously wrong. You successfully add your first item. Don’t just test the number of items is greater than zero. There’s only one right outcome: one item. Many integers are greater than zero; billions are wrong.
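To see the gap side by side, here is a sketch with a hypothetical Basket class; the names are assumptions made up for the example.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.util.ArrayList;
import java.util.List;
import org.junit.jupiter.api.Test;

// Hypothetical class under test, invented for this illustration.
class Basket {
    private final List<String> items = new ArrayList<>();

    void addItem(String item) {
        items.add(item);
    }

    int itemCount() {
        return items.size();
    }
}

class BasketTest {
    // Underfitted: would still pass if addItem wrongly stored the item twice.
    @Test
    void Addition_of_first_item_gives_some_items() {
        Basket basket = new Basket();
        basket.addItem("widget");
        assertTrue(basket.itemCount() > 0);
    }

    // Precise: exactly one outcome is accepted.
    @Test
    void Addition_of_first_item_gives_one_item() {
        Basket basket = new Basket();
        basket.addItem("widget");
        assertEquals(1, basket.itemCount());
    }
}
```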

Speaking of outcome, you may find many tests follow a simple three-act play: arrange–act–assert, aka given–when–then. Keeping this in mind helps you focus on the story that the test is trying to tell. Keeps it cohesive, suggests other tests, and helps with the name. Oh, and as we’re back on names, you may find names get repetitive. Factor out the repetition. Use it to group tests into inner classes with @Nested. So, you could nest with_unique_key_is_retained and with_existing_key_fails inside Addition_of_item.
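Sketched in JUnit 5, that nesting might look like the following. Stockroom and its methods are hypothetical, and the arrange, act, and assert comments are only there to mark the three acts.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import java.util.HashMap;
import java.util.Map;
import org.junit.jupiter.api.Nested;
import org.junit.jupiter.api.Test;

// Hypothetical class under test, invented for this illustration.
class Stockroom {
    private final Map<String, String> items = new HashMap<>();

    void addItem(String key, String item) {
        if (items.containsKey(key)) {
            throw new IllegalArgumentException("Key already in use: " + key);
        }
        items.put(key, item);
    }

    String itemFor(String key) {
        return items.get(key);
    }
}

class Stockroom_tests {
    // The shared prefix of the names becomes the name of the nested class.
    @Nested
    class Addition_of_item {
        @Test
        void with_unique_key_is_retained() {
            // Arrange
            Stockroom stockroom = new Stockroom();
            // Act
            stockroom.addItem("A1", "widget");
            // Assert
            assertEquals("widget", stockroom.itemFor("A1"));
        }

        @Test
        void with_existing_key_fails() {
            // Arrange
            Stockroom stockroom = new Stockroom();
            stockroom.addItem("A1", "widget");
            // Act and assert: the second addition is rejected.
            assertThrows(IllegalArgumentException.class,
                    () -> stockroom.addItem("A1", "gadget"));
        }
    }
}
```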

I hope that’s been useful. You’re off to revisit some tests? OK, catch you later.

1 Edsger W. Dijkstra, “Notes on Structured Programming.” In Structured Programming, O.-J. Dahl, E.W. Dijkstra, and C.A.R. Hoare, eds. (London and New York: Academic Press, 1972), 6.