Premature Optimization and the Rule of Three

Avoiding premature over-architecting using the Rule of Three

July 19, 2023

TL;DR: The Rule of Three says that the second time you encounter a problem, you copy-paste your first attempt and adapt it. You start generalizing only from the third time on.

The first time: implement it as if it will never be reused, so that you get initial results quickly.

Remember the very wise YAGNI principle, which stands for You Ain’t Gonna Need It? It goes hand in hand with the famous quote “The best code I ever wrote is the one I didn’t write”.

+-----------------------+
|      Project A        |
+-----------------------+
|                       |
| +-------------------+ |
| |     Module X      | |
| |   Function x_foo  | |
| +-------------------+ |
+-----------------------+

The second time: not all clones are bad. Review how it was done the first time, then copy-paste and adapt it as if it had never been done before. This keeps up a sustained pace without wasting time generalizing concepts that aren’t used that often. If you do happen to find a common bug, fix it in both places.

+-----------------------+
|      Project A        |
+-----------------------+
|                       |
| +-------------------+ |
| |     Module X      | |
| |   Function x_foo  | |
| +-------------------+ |
|                       |
| +-------------------+ |
| |     Module Y      | |
| |   Function y_foo  | | <-- an adapted copy of x_foo
| +-------------------+ |
+-----------------------+
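
As a rough sketch of what the second time can look like in code, here are the two functions from the diagram in Python. The task itself (summing amounts from separated text lines) and the input formats are assumptions made up for illustration; only the names x_foo and y_foo come from the diagram.

```python
# Module X (first use): written for one job, with no reuse in mind.
def x_foo(lines):
    """Sum the amounts in 'name,amount' lines (hypothetical format)."""
    total = 0.0
    for line in lines:
        _name, amount = line.split(",")
        total += float(amount)
    return total


# Module Y (second use): an adapted copy of x_foo.
# The only adaptation here is the separator; nothing is generalized yet.
def y_foo(lines):
    """Sum the amounts in 'name;amount' lines (hypothetical format)."""
    total = 0.0
    for line in lines:
        _name, amount = line.split(";")
        total += float(amount)
    return total
```

The duplication is deliberate. If a bug turns up in the shared logic, for example in how amounts are parsed, it has to be fixed in both x_foo and y_foo.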

The third time and subsequent times: review how it was done the previous times and refactor the common code. By now the problem is clearly recurring, so it may pay off to start generalizing it, maybe as part of a whole new project.

+-----------------------+
|      Project A        |
+-----------------------+
|                       |
| +-------------------+ |
| |     Module X      | |
| |   Function x_foo  | | <-- calls xyz_foo
| +-------------------+ |
|                       |
| +-------------------+ |
| |     Module Y      | |
| |   Function y_foo  | | <-- calls xyz_foo
| +-------------------+ |
|                       |
| +-------------------+ |
| |     Module Z      | |
| |   Function z_foo  | | <-- calls xyz_foo
| +-------------------+ |
|                       |
| +-------------------+ |
| |   Helper Module   | |
| |    xyz_foo        | | <-- extracted common function from x_foo and y_foo
| +-------------------+ |     and combined with new functionality from z_foo
+-----------------------+

Or it could look like this if a separate project enters the game:

+-----------------------+          +-----------------------+
|       Project B       |          |      Project A        |
+-----------------------+          +-----------------------+
|                       |          |                       |
| +-------------------+ |          | +-------------------+ |
| |     Module Z      | |          | |     Module X      | |
| |   Function z_foo  | |    /----<| |   Function x_foo  | |
| +---------v---------+ |    |     | +-------------------+ |
|           |           |    |     |                       |
| +---------v---------+ |    |     | +-------------------+ |
| |   Helper Module   | |    |     | |     Module Y      | |
| |    xyz_foo        | |<---+----<| |   Function y_foo  | |
| +-------------------+ |          | +-------------------+ |
+-----------------------+          +-----------------------+

The advantage of not generalizing already the second time is that the generalization can take into account factors that only show up with the third usage. When the same developer handles all three, their brain will already have started thinking about potential solutions anyway, which gives a useful head start.
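
A minimal Python sketch of that point, under assumed requirements: suppose the first two usages differed only in their separator, and the third one (z_foo) also needs to skip a header line. The summing task, the parameters sep and skip_header, and the input formats are hypothetical; only the function names follow the diagrams above.

```python
# Helper module: extracted only once the third usage was known.
def xyz_foo(lines, sep=",", skip_header=False):
    """Sum the amounts in '<name><sep><amount>' lines.

    skip_header exists only because the third caller needed it. A helper
    extracted after the second usage would likely have had just 'sep'.
    """
    rows = lines[1:] if skip_header else lines
    total = 0.0
    for line in rows:
        _name, amount = line.split(sep)
        total += float(amount)
    return total


# The three callers shrink to thin wrappers around the shared helper.
def x_foo(lines):
    return xyz_foo(lines)


def y_foo(lines):
    return xyz_foo(lines, sep=";")


def z_foo(lines):
    return xyz_foo(lines, sep="\t", skip_header=True)
```

Had the helper been extracted one step earlier, its signature would probably have needed reworking as soon as z_foo arrived.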

A variation is to apply the rule of three again to the “and subsequent” part: wait for another three changes before generalizing again. It becomes a kind of amortized-cost heuristic.

The only potential pitfall of the rule of three is that you need to keep track of how many times a code path has already been used. If the code base is small and the number of developers is low, that’s usually not a practical issue, but the larger the project gets, the harder it becomes to track. Code comments help. The senior devs on the project should also have a way of tracking this that’s compatible with the workflow.
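
One possible way to make the code comments countable is to agree on a marker comment and let a small script tally it. The sketch below is only an illustration: the marker format "# rule-of-three: <tag>", the tag names, and both function names are assumptions, not an established convention.

```python
import re
from collections import Counter

# Hypothetical convention: put "# rule-of-three: <tag>" above every clone.
MARKER = re.compile(r"#\s*rule-of-three:\s*([\w-]+)")


def count_clones(sources):
    """Count marker tags across a mapping of file name -> source text."""
    counts = Counter()
    for text in sources.values():
        counts.update(MARKER.findall(text))
    return counts


def due_for_generalizing(sources, threshold=3):
    """Return the tags that have reached the threshold, sorted by name."""
    counts = count_clones(sources)
    return sorted(tag for tag, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    # In-memory example; a real script would read the files from disk.
    example = {
        "x.py": "# rule-of-three: sum-amounts\ndef x_foo(lines): ...\n",
        "y.py": "# rule-of-three: sum-amounts\ndef y_foo(lines): ...\n",
        "z.py": "# rule-of-three: sum-amounts\ndef z_foo(lines): ...\n",
    }
    print(due_for_generalizing(example))
```

A check like this could run as part of code review or CI, so nobody has to keep the count in their head.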