Alex Martsinovich

Software Engineer

Do Repeat Yourself

The conventional wisdom stemming from the famous DRY (Don't Repeat Yourself) principle is that if you notice yourself copying code around, it's a smell. And I agree. Mostly. More often than not. Listen, it's complicated. I think this is a good principle, but even a perfectly functioning analog clock can be wrong for half a year, okay?

One of the few strong opinions I have is that it's not only acceptable but highly desirable to copy code around when you're writing tests. And this has very little to do with the code itself; it's mostly a people thing.

Sorry, tests

Let me tell you a little about myself. Before I discovered Elixir and decided that this language was actually nice enough to make it my full-time job, I was a test engineer. I was testing things for a living. I still wrote code, but it was all test code. And I was writing a lot of it. And not just me: other test engineers were writing it too. And even regular developers. Sometimes.

When all you write is tests, all your engineering ambitions are funneled into making them good in the conventional sense: clean code, SOLID code, and of course, DRY code. But if you work long enough, at some point you face an inevitable question: if our tests are so good, why don't developers write them more often? They certainly have the skills; they deal with much more complex code every day. And this is when I had a simple revelation:

  1. People see tests as a fundamentally secondary part of the codebase.
  2. Dealing with test code is, therefore, a chore that people will actively try to avoid.

This is it. That's the harsh truth. The content of the lib/ folder is just not the same as what's inside the test/ folder.

"But TDD," I hear you saying. "It's very popular; people love tests and see value in them." Of course! A lot of people love writing tests. Writing. Write-time is the honeymoon phase for a test; it's when the test is cared for the most. Unfortunately, it's all downhill from there. It will never see the same treatment again, not even from its own creator.

        ╭──────────────────────────╮
        │ i ain't reading all that │
        ╰──────────────────────────╯
        ╭───────────────────────╮
        │ i'm happy for you tho │
        ╰───────────────────────╯
        ╭────────────────────────╮
        │ or sorry that happened │
        ╰────────────────────────╯
╭────────╮
│ (⌐■_■) │
╰────────╯
A common reaction to mix test output.

Copy copy copy

But what does this have to do with DRY? The thing is, we know that tests are immensely useful, so we need to reduce our own aversion to them as much as possible. And one way to make tests easier to deal with is to ditch DRY and indulge in self-repetition. Here's an incomplete list of well-intended things people often overuse:

  1. Extracting common parts into helper functions
  2. Sharing sophisticated setup between tests
  3. Using third-party testing libraries with a steep learning curve

Next time you are tempted to create a test helper, think twice. Maybe it makes total sense now. Sure, it's just one click away... No! Stop! People don't open test files because they just feel like doing it. They do it because a test fails, and by this time they're already annoyed. Not only do they not want to read more than one test at a time, they will actively avoid doing so! Every function that's defined somewhere else will make their eyes roll. Every setup they need to scroll to will make them angry. And if their rushed fix to a shared setup breaks another test? They may just delete that test and get away with it, because code reviewers have even less incentive to read it.
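To make the difference concrete, here is a minimal sketch of the same test written both ways. The Cart module and its new/0, add/3 and total/1 functions are made up for illustration and defined inline so the file runs on its own; nothing here comes from a real codebase.

```elixir
# Hypothetical module under test, defined inline so the sketch runs on its own.
defmodule Cart do
  def new, do: []
  def add(cart, name, price), do: [{name, price} | cart]
  def total(cart), do: cart |> Enum.map(fn {_name, price} -> price end) |> Enum.sum()
end

ExUnit.start()

# DRY version: the cart is built once, far away from the assertion.
defmodule SharedSetupCartTest do
  use ExUnit.Case

  setup do
    cart = Cart.new() |> Cart.add("tea", 3) |> Cart.add("mug", 7)
    %{cart: cart}
  end

  # ...imagine a few hundred lines of other tests here...

  test "total sums the prices of all items", %{cart: cart} do
    assert Cart.total(cart) == 10
  end
end

# Repetitive version: everything the reader needs is inside the test.
defmodule SelfContainedCartTest do
  use ExUnit.Case

  test "total sums the prices of all items" do
    cart = Cart.new() |> Cart.add("tea", 3) |> Cart.add("mug", 7)

    assert Cart.total(cart) == 10
  end

  test "total ignores item names" do
    cart = Cart.new() |> Cart.add("tea", 3) |> Cart.add("tea", 3)

    assert Cart.total(cart) == 6
  end
end
```

In the second module the cart-building line is copied into every test, and that's the point: when one of them fails, the whole story is on the screen without scrolling to a setup block.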

When writing a test, you have to remember that this is the most effort anybody will ever put into it. Everybody else, including future you, will see this test as a burden. A time capsule with annoying ancient puzzles. So try to fill it with love instead: make a test that's easy to follow, one that doesn't require any previous reading or knowledge of that fancy assert_nested_in_any_order custom function used in exactly two places throughout the codebase.
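As a rough sketch of the alternative, suppose a custom assertion like that boils down to an order-insensitive comparison. That is a guess for the sake of the example; the Blog module and its tags_for/1 function are hypothetical too. The same check can often be spelled out with plain Enum.sort/1 right inside the test:

```elixir
# Hypothetical module under test, defined inline so the sketch runs on its own.
defmodule Blog do
  def tags_for(_post_id), do: ["elixir", "testing", "dry"]
end

ExUnit.start()

defmodule BlogTagsTest do
  use ExUnit.Case

  test "returns all tags of a post, in any order" do
    tags = Blog.tags_for(1)

    # Plain sort-and-compare: nothing to look up, nothing to learn.
    assert Enum.sort(tags) == ["dry", "elixir", "testing"]
  end
end
```

It's one extra function call per assertion, but whoever opens this test in a bad mood can read it without leaving the file.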

HOWEVER

Of course, you have to exercise your best judgment. I would be crazy to advocate for not having any helpers at all. Take my factories from me, and I'm helpless. Make me create a conn struct without a Phoenix helper, and I don't know how!

In fact, I have a confession to make. Whenever I want to test a large number of cases that can be represented as a test table, I love employing what are usually called parameterized tests:

# Let's say we want to test a function that checks user permissions
# depending on user type and resource privacy

defmodule PermissionsTest do
  use ExUnit.Case

  test_table = [
    {:guest, :public, true},
    {:guest, :private, false},
    {:guest, :system, false},
    {:user, :public, true},
    {:user, :private, true},
    {:user, :system, false},
    {:admin, :public, true},
    {:admin, :private, true},
    {:admin, :system, true},
  ]

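  # The loop runs at compile time and defines one test per table row;
  # unquote/1 injects the row's values into each generated test body.
  for {user_type, privacy, exp_result} <- test_table do
    test "#{user_type} access #{privacy} resource" do
      user = Users.create_user(%{type: unquote(user_type)})
      resource = Resources.create_resource(%{privacy: unquote(privacy)})

      assert Permissions.has_access?(user, resource) == unquote(exp_result)
    end
  end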
end

As you can see, I'm committing the serious crime of metaprogramming in a test, but test tables are just too good. In my defense, this still reads like a single independent test; it just has a neat little truth table sitting right next to it, in full.

Conclusion

For better or worse, people tend to see test code as less important. And while it may be tempting to fight for higher standards and better tests, I'd argue it's wiser to accept the reality and cut both tests and people some slack. It doesn't matter if your test file is 2,000 lines long if you don't have to read it all at once.