… but you should still be writing them.

Given the following set of test cases:

```swift
import XCTest

class PrimeTests: XCTestCase {
    func testPrime() {
        XCTAssertFalse(isPrime(1))
        XCTAssertTrue(isPrime(2))
        XCTAssertTrue(isPrime(3))
        XCTAssertFalse(isPrime(4))
        XCTAssertTrue(isPrime(5))
        XCTAssertFalse(isPrime(6))
        XCTAssertTrue(isPrime(7))
        XCTAssertFalse(isPrime(8))
        XCTAssertFalse(isPrime(9))
    }
}
```

We can quickly verify whether or not our isPrime() function is working for the numbers one through nine. But we can’t guarantee that our isPrime() function has no bugs. Do we know whether or not it returns the right value for ten or eleven?

What if the implementation of isPrime() looks like this:

```swift
// The parameter is unlabeled so that the tests can call isPrime(2) directly.
func isPrime(_ value: Int) -> Bool {
    return value == 2 || value == 3 || value == 5 || value == 7 || value == 10
}
```

Our function has a bug. We know 10 isn’t prime, and we know 11 is prime. Plus there are a lot of primes that come after 11, and the function would return the wrong value for all of them. Despite this bug, this function will pass our test cases. Once we discover this bug, we can add a new test to at least check 10 and 11.

```swift
func testMore() {
    XCTAssertFalse(isPrime(10))
    XCTAssertTrue(isPrime(11))
}
```

If we run the tests again with the aforementioned function, both of these assertions will fail. Now we can fix our function to make them pass.

What these tests are doing is not preventing bugs. They are preventing regressions. Whether you fix the bug yourself or someone else fixes it, we now have two tests which together check all of the numbers one through eleven to make sure our function returns the results we expect. The function can be rewritten, refactored, optimized, or have anything at all done to it, and our tests will make sure that it has not regressed: it will not change the results it was giving for any of the cases we were already testing.

It’s pretty clear, though, that writing out a test for every single possible number would become tedious. We can, of course, write smarter tests. Consider this test, which checks every number from 1 to 256, roughly the range that fits in a single byte.

```swift
func testEightBitPrimes() {
    let primes = [
        2, 3, 5, 7, 11, 13,
        17, 19, 23, 29, 31, 37,
        41, 43, 47, 53, 59, 61,
        67, 71, 73, 79, 83, 89,
        97, 101, 103, 107, 109, 113,
        127, 131, 137, 139, 149, 151,
        157, 163, 167, 173, 179, 181,
        191, 193, 197, 199, 211, 223,
        227, 229, 233, 239, 241, 251
    ]
    for value in 1...256 {
        let prime = primes.contains(value)
        XCTAssertEqual(isPrime(value), prime, "isPrime incorrect for \(value), should be \(prime)")
    }
}
```

If we run this test with the original function, it reports 50 errors. We’re wrong in 50 of the cases we tested (which was every number from 1 to 256).

At this point, we know that our function simply does not work at all. As long as we’re comfortable with the logic in our test, we can get to work improving our function until it at least passes this test. But let’s be clear, this test tells us nothing about whether or not the function works for values over 256.
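One way to get more comfortable with the logic in the test is to generate the list of expected primes rather than typing it by hand. The following is a minimal sketch using the Sieve of Eratosthenes; the helper name `primes(upTo:)` is an assumption made for illustration and is not part of the original test.

```swift
// Hypothetical helper: returns every prime p with 2 <= p <= limit,
// computed with the Sieve of Eratosthenes.
func primes(upTo limit: Int) -> Set<Int> {
    guard limit >= 2 else { return [] }

    // isComposite[n] becomes true once n is known to have a smaller prime factor.
    var isComposite = [Bool](repeating: false, count: limit + 1)
    var result = Set<Int>()

    for candidate in 2...limit {
        if isComposite[candidate] { continue }
        result.insert(candidate)

        // Start at candidate squared: smaller multiples were already
        // marked while processing smaller primes.
        for multiple in stride(from: candidate * candidate, through: limit, by: candidate) {
            isComposite[multiple] = true
        }
    }
    return result
}
```

In the test above, `let primes = primes(upTo: 256)` could stand in for the hard-coded array. The trade-off is that the test now depends on a second prime-finding routine being correct, so it is only an improvement if the sieve is easier to trust than the typed list.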

Let’s take a quite naïve approach to the prime number problem.

```swift
func isPrime(_ value: Int) -> Bool {
    // Nothing below 2 is prime.
    if value < 2 {
        return false
    }

    // Try every possible divisor from 2 up to value - 1.
    for divisor in 2..<value {
        if value % divisor == 0 {
            return false
        }
    }

    return true
}
```

Do we think this will work? Well, we can run our test suite to find out if it at least works for the values 1 through 256. And it does appear to work.

Now, we can try to refactor and optimize our function. Can we make this faster?

The following chunk of code runs significantly faster in long-running tests checking large primes.

```swift
func isPrime(_ value: Int) -> Bool {
    if value < 2 { return false }
    // Multiples of 2, 3, and 5 are prime only if they are 2, 3, or 5.
    if value % 2 == 0 { return value == 2 }
    if value % 3 == 0 { return value == 3 }
    if value % 5 == 0 { return value == 5 }
    if value == 7 { return true }

    // Only test divisors that are not multiples of 2, 3, or 5,
    // stepping through them in blocks of 30.
    var divisor = 7
    while divisor * divisor <= value {
        if value % divisor == 0 { return false }
        if value % (divisor + 4) == 0 { return false }
        if value % (divisor + 6) == 0 { return false }
        if value % (divisor + 10) == 0 { return false }
        if value % (divisor + 12) == 0 { return false }
        if value % (divisor + 16) == 0 { return false }
        if value % (divisor + 22) == 0 { return false }
        if value % (divisor + 24) == 0 { return false }
        divisor += 30
    }

    return true
}
```

It can run through all the values from 1 to 50,000 and print the values it thinks are prime in approximately a tenth of a second. The previous iteration of this function takes almost twenty times as long to run.

Of course, our tests don’t tell us whether either is accurate at all. Our tests have only shown us that both are accurate up to the number 256. We can always add more tests.
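One cheap way to add more tests is to check the two implementations against each other over a larger range. The sketch below assumes both versions are kept side by side under the hypothetical names `isPrimeNaive` and `isPrimeFast`, and that the parameter is unlabeled; `firstDisagreement(in:)` is likewise an illustrative helper. It is written as plain Swift so it can run outside a test target.

```swift
// The naive version from above, renamed so both versions can coexist.
func isPrimeNaive(_ value: Int) -> Bool {
    if value < 2 { return false }
    for divisor in 2..<value {
        if value % divisor == 0 { return false }
    }
    return true
}

// The optimized version from above, renamed the same way.
func isPrimeFast(_ value: Int) -> Bool {
    if value < 2 { return false }
    if value % 2 == 0 { return value == 2 }
    if value % 3 == 0 { return value == 3 }
    if value % 5 == 0 { return value == 5 }
    if value == 7 { return true }

    var divisor = 7
    while divisor * divisor <= value {
        // Offsets that skip every multiple of 2, 3, and 5.
        for offset in [0, 4, 6, 10, 12, 16, 22, 24] {
            if value % (divisor + offset) == 0 { return false }
        }
        divisor += 30
    }
    return true
}

// Returns the first input on which the two versions disagree,
// or nil if they agree on every value in the range.
func firstDisagreement(in range: ClosedRange<Int>) -> Int? {
    return range.first(where: { isPrimeNaive($0) != isPrimeFast($0) })
}
```

Inside a test case this could be asserted with something like `XCTAssertNil(firstDisagreement(in: 1...50_000))`. Agreement only shows that the refactor did not change behavior over that range; if both versions shared the same bug, this check would not notice.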

But when you leverage tests to your advantage, they can mean a lot.

When a bug is discovered, a test case can generally be written for that bug. Running the test case and finding that it fails is proof that the test works. Modifying the code to make the test pass then becomes proof that you have fixed *that* bug. And keeping the test is a safeguard against regression on this particular issue.

Importantly, when we write code around this isPrime() function for example, we are assuming that the isPrime() function has a particular behavior. Documentation and good naming can go a long way to helping your fellow developers (including your future self) figure out what the code is supposed to do. Unit tests are a means of codifying and asserting these assumptions.
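As an illustration of codifying assumptions, the sketch below writes down a few things a caller might take for granted about inputs the earlier tests never touch, such as zero and negative numbers. The `assumptions` table and the `violatedAssumptions()` helper are hypothetical names, and the naive implementation is repeated (with an unlabeled parameter) only so the example runs on its own.

```swift
// The naive implementation from above, repeated to keep the example self-contained.
func isPrime(_ value: Int) -> Bool {
    if value < 2 { return false }
    for divisor in 2..<value {
        if value % divisor == 0 { return false }
    }
    return true
}

// Each entry records one assumption that calling code might rely on.
let assumptions: [(input: Int, expected: Bool, reason: String)] = [
    (input: -7, expected: false, reason: "negative numbers are not prime"),
    (input: 0, expected: false, reason: "zero is not prime"),
    (input: 1, expected: false, reason: "one is not prime"),
    (input: 2, expected: true, reason: "two is the only even prime"),
]

// Returns the reasons for every assumption the function currently violates.
func violatedAssumptions() -> [String] {
    return assumptions
        .filter { isPrime($0.input) != $0.expected }
        .map { $0.reason }
}
```

In an XCTest case, each entry would more naturally become its own `XCTAssertEqual` call with the reason as the failure message. Either way, anyone who later changes how `isPrime()` treats these inputs finds out immediately that other code was counting on the old behavior.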

If you have found a bug in code, you have hopefully found a shortcoming in that code’s test suite. And if that’s the case, you can solve the problem by adding a test case and then making that test case pass, and then rest comfortably knowing that this specific bug should never return.
