“Don’t repeat yourself”, or “DRY” for short, is a somewhat controversial principle of software development. It aims to make code cleaner, which is to say less buggy and easier to work with. DRY purports to accomplish this by reducing repetition in your codebase and replacing that duplicate code with abstractions like functions, classes, and methods.
First, why should we care about cleaner code?
Clean code is like clean garbage - it doesn’t really exist. The only clean code is code that doesn’t exist at all.
I’m happy to admit that a perfectly clean (empty) codebase is useless. Without code, an application can’t provide value to users. With that in mind, our pursuit of “clean code” will necessarily consist of tradeoffs. We’ll trade usefulness for cleanliness, complexity for speed, ownership for ease of development, and abstractions for reusability.
DRY code is often held aloft as an ideal in the quest for clean code. Let’s explore why I think DRY can be a good heuristic for better code but is far from an “absolute good”. Let’s take a look at some examples in JavaScript.
export async function updateUserHandle(handle) {
  if (!isLoggedIn()) {
    // redirect to login screen
    return;
  }
  let token = localStorage.getItem(jwtKey);
  let decodedToken = decodeJWT(token);
  const hoursDelta = 24;
  if (decodedToken.exp < Date.now() / 1000 + hoursDelta * 60 * 60) {
    refreshToken();
  }
  return await fetch(`${domain}/v1/users/handle`, {
    method: "PUT",
    mode: "cors",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify({
      handle,
    }),
  });
}
export async function updateUserInterests(interestUUIDs) {
  if (!isLoggedIn()) {
    // redirect to login screen
    return;
  }
  let token = localStorage.getItem(jwtKey);
  let decodedToken = decodeJWT(token);
  const hoursDelta = 24;
  if (decodedToken.exp < Date.now() / 1000 + hoursDelta * 60 * 60) {
    refreshToken();
  }
  return await fetch(`${domain}/v1/users/interests`, {
    method: "PUT",
    mode: "cors",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify({
      interestUUIDs,
    }),
  });
}
You may have noticed that the beginnings of those two API calls are nearly identical - the first few lines check whether the user is properly authenticated, and each request then sends the same authentication data. This might not be a big problem with just 2 API calls, but what if we have 30? Or maybe 1000? Instead, we can DRY up this code by writing a simple fetchWithAuth() function that centralizes all of the client's authentication logic in a single place:
async function fetchWithAuth(url, params) {
  if (!isLoggedIn()) {
    // redirect to login screen
    return;
  }
  let token = localStorage.getItem(jwtKey);
  let decodedToken = decodeJWT(token);
  const hoursDelta = 24;
  if (decodedToken.exp < Date.now() / 1000 + hoursDelta * 60 * 60) {
    refreshToken();
  }
  if (!params.headers) {
    params.headers = {};
  }
  params.headers.Authorization = `Bearer ${token}`;
  return await fetch(url, params);
}
export async function updateUserHandle(handle) {
  return await fetchWithAuth(`${domain}/v1/users/handle`, {
    method: "PUT",
    mode: "cors",
    headers: {
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      handle,
    }),
  });
}
export async function updateUserInterests(interestUUIDs) {
  return await fetchWithAuth(`${domain}/v1/users/interests`, {
    method: "PUT",
    mode: "cors",
    headers: {
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      interestUUIDs,
    }),
  });
}
Why wouldn’t you DRY out your code?
It seems like a good idea to reduce code duplication, right? Well, yes, in general, it is. That said, let’s look at some of the drawbacks that come along with too much centralization.
1. Too many abstractions
Sometimes two pieces of code happen to be the same at a given point in time, but later on, they become distinct in some way. It’s really hard to guarantee that duplicate blocks of code will remain perfect copies of each other forever. A hypothetical example of this would be if the Facebook and Instagram APIs had the same way to create a social post. The fact that they’re coincidentally the same probably doesn’t mean that the logic should only be written once. One day, Instagram might introduce something dumb like “filters” or “stories” and all of a sudden the common abstraction needs crazy flags and parameters like:
- is_story
- is_instagram
- filter_enum
… and those flags make the logic do different things depending upon whether it’s a Facebook or Instagram post.
The solution is likely to remain disciplined about keeping code separate when, while it may look similar now, it is only coincidentally similar. We should try to only merge code that’s fundamentally the same. A great example would be a math function like log2. That function should work for every case where you need to calculate a base-2 logarithm - each calculation of a log is fundamentally the same.
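To make the flag problem concrete, here is a minimal sketch. Everything in it is hypothetical: buildPost, buildFacebookPost, buildInstagramPost, and the payload fields are invented for illustration and don't correspond to the real Facebook or Instagram APIs.

```javascript
// Hypothetical shared abstraction after Instagram-only features creep in.
// Every caller now has to understand flags that mostly don't apply to it.
function buildPost(text, { isInstagram = false, isStory = false, filterEnum = null } = {}) {
  const post = { text };
  if (isInstagram) {
    post.kind = isStory ? "story" : "feed";
    if (filterEnum !== null) {
      post.filter = filterEnum;
    }
  }
  return post;
}

// The alternative: two small functions that are free to diverge.
function buildFacebookPost(text) {
  return { text };
}

function buildInstagramPost(text, { isStory = false, filter = null } = {}) {
  const post = { text, kind: isStory ? "story" : "feed" };
  if (filter !== null) {
    post.filter = filter;
  }
  return post;
}

console.log(buildFacebookPost("hello"));
console.log(buildInstagramPost("hello", { isStory: true, filter: "sepia" }));
```

The separate functions repeat a line or two, but a change to how stories work can no longer break Facebook posts by accident.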
2. External dependency creation
If two different projects share the same logic, it can often make sense to centralize it in a library package. While this is often a great idea, it can add overhead and can end up being more trouble than it’s worth. For example, even if the abstraction makes sense, you’re adding at least the following complexity to the project:
- Managing the dependency’s versions and running updates regularly
- Making multi-project updates to get a new change to a specific dependent
- Relying on more remote infrastructure, like npm or PyPI
- Finding it harder to make “breaking” changes to the library’s core functions, which demands a higher standard of code quality and architecture
3. Localization complexity
When debugging or reading code, it would be easiest if the flow of logic started at the top of a file and flowed in a linear path. For example:
START PROGRAM
INSTRUCTION 0
INSTRUCTION 1
INSTRUCTION 2
INSTRUCTION 3
END PROGRAM
Unfortunately, in large programs, we need functions, classes, methods, type definitions, etc. to organize and reuse our logic. We’re forced to give up a bit of simplicity and read and write code in a non-linear way. I believe our goal should be to keep everything as linear as possible and sacrifice linearity and simplicity for reusability and separation of concerns only as necessary. Each time we extract a chunk of code from a larger function into a smaller, more encapsulated one, the code becomes just a little bit harder to follow.
In short, we should optimize for simplicity first.
With a highly compartmentalized project, when we see a function called getUser(), if we want to really know what’s going on we have to peek into that function and remember the external calling context because we’re now looking at an entirely different file. The cognitive burden becomes greater and greater the more definitions we need to jump through to grok a single logical pathway.
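As a rough illustration of that cost, here is the same small piece of logic written twice. The function names and the shape of the user object are made up for this sketch.

```javascript
// Linear version: everything needed to understand the result is in one place.
function formatGreetingLinear(user) {
  const name = user.name.trim();
  const title = user.isAdmin ? "Admin" : "Member";
  return `Hello, ${title} ${name}!`;
}

// Compartmentalized version: identical behavior, but the reader has to
// visit two helper definitions to learn what the greeting contains.
function cleanName(user) {
  return user.name.trim();
}

function titleFor(user) {
  return user.isAdmin ? "Admin" : "Member";
}

function formatGreeting(user) {
  return `Hello, ${titleFor(user)} ${cleanName(user)}!`;
}

const user = { name: "  Ada ", isAdmin: true };
console.log(formatGreetingLinear(user));
console.log(formatGreeting(user));
```

With helpers this small, the second version buys very little. The extraction starts to pay off only once cleanName() or titleFor() is genuinely needed somewhere else.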
Takeaways - Code smells and heuristics
Since no code is perfect, we need to make use of some heuristics (rules of thumb) to try to work towards a cleaner codebase.
1. WET code, or the rule of three
In my opinion, WET is a better rule of thumb than DRY.
WET stands for “write everything twice”, and forces you to think a bit harder about whether or not a piece of logic deserves an abstraction. The rule of three is an alternative that says you should wait until you’ve written something three times before breaking it out.
2. Is it testable?
Most functions should be predictable and testable. They should behave like math functions or pure functions where possible - given a set of inputs the function will always produce the same outputs, and the state of the program isn’t mutated. If the code you’re thinking about condensing into a function can be a pure function, then it’s likely a better idea than if it would create an impure function.
Pure functions are easy to write good unit tests for - if your abstraction is easily testable it’s more likely to be a good abstraction.
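Here is a minimal sketch of the difference. The names addToTotal, add, and runningTotal are invented for the example.

```javascript
// Impure: reads and mutates state outside the function, so the result
// depends on how many times it has already been called.
let runningTotal = 0;
function addToTotal(amount) {
  runningTotal += amount;
  return runningTotal;
}

// Pure: the output depends only on the inputs, and nothing else is touched.
function add(total, amount) {
  return total + amount;
}

// A check for the pure version needs no setup or teardown.
console.assert(add(2, 3) === 5, "add(2, 3) should be 5");
console.assert(add(2, 3) === 5, "calling it again gives the same answer");
```

Testing addToTotal() is still possible, but each test has to reset or account for runningTotal first, which is exactly the kind of friction that hints at a weaker abstraction.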
3. Are there special cases or arguments only used by a fraction of callers?
Take a look at the following example:
function getArea(height, width) {
  return height * width;
}
This is a great function! It’s very simple, and it obviously calculates the area of a rectangle. Here’s a bad example:
function getArea(height, width, isTriangle) {
  if (isTriangle) {
    return (height * width) / 2;
  }
  return height * width;
}
Special cases are bad news - here, the isTriangle flag means I’m trying to be too abstract. Instead, I should just create two separate functions:
function getTriangleArea(height, width) {
  return (height * width) / 2;
}
function getRectangleArea(height, width) {
  return height * width;
}
Have any more ideas for good rules of thumb? Let me know on Twitter.