In this video, Elliotte Rusty Harold, tech lead for Google Cloud Tools for Eclipse, provides an overview of effective unit testing.
While testing your code, you may come across a bug that makes your system collapse. There is no need to worry: unit testing lets you create less buggy, faster software at a lower cost.
Elliotte discusses the principles of unit testing, tests and thread safety, flakiness, debuggability, reproducibility, speed, independence, and other characteristics of effective unit tests.
His examples use Java and JUnit, but the general principles apply equally to other languages and test frameworks.
Why do we write unit tests?
At 2:20, Elliotte answers this question: we write unit tests to keep code working and to develop faster, with fewer regressions, catching mistakes immediately instead of discovering them at a point where they are very difficult to fix.
The Fundamental Principle of Unit Testing
At 4:29, Elliotte presents the fundamental principle of unit testing: a fixed input produces a known, fixed output. Always verify code against input whose output is known exactly, such as the square root of 9, which is exactly 3, rather than the square root of 11, which has no exact answer.
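The fixed-input, fixed-output idea can be sketched in plain Java. In JUnit the check would live in an `@Test` method; a `main()` keeps this sketch dependency-free:

```java
// Minimal sketch of "fixed input produces a known fixed output".
public class SquareRootTest {
    public static void main(String[] args) {
        // The fixed input 9.0 has an exactly representable output, 3.0,
        // so the comparison needs no tolerance. sqrt(11) would not allow this.
        double result = Math.sqrt(9.0);
        if (result != 3.0) {
            throw new AssertionError("expected 3.0 but was " + result);
        }
        System.out.println("ok");
    }
}
```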
At 5:57, he addresses the question: what if you don't know the correct output? One answer is to write a characterization test: run the code once, record what it produces, and pin that observed output down. Another is to find input data for which the exact result is obvious. If the problem is fuzzy or not perfectly defined, test a similar problem with a less fuzzy answer.
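A characterization test might look like the following sketch, where `legacyFormat` is a hypothetical stand-in for code whose exact behavior is undocumented. The expected string is captured from an initial run, not derived from a spec:

```java
import java.util.Locale;

// Sketch of a characterization test: pin down observed behavior so that
// any future change to it is caught.
public class CharacterizationTest {
    // Hypothetical legacy method whose exact output nobody specified.
    static String legacyFormat(double amount) {
        // Locale is pinned so the test does not depend on the environment.
        return String.format(Locale.US, "$%,.2f", amount);
    }

    public static void main(String[] args) {
        // "$1,234.50" was recorded from one initial run; it characterizes
        // current behavior rather than a documented requirement.
        String result = legacyFormat(1234.5);
        if (!"$1,234.50".equals(result)) {
            throw new AssertionError("behavior changed: " + result);
        }
        System.out.println("ok");
    }
}
```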
At 7:26, he explains how to eliminate everything that makes input or output unclear or contingent. Never generate random input; always use fixed values, since a known fixed input gives a known output. Don't use named constants from the model code: they may be wrong, or they may change. Prefer literal strings and numbers. Don't access the network, and preferably not the file system either; it's flakier than in-memory access.
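The point about preferring literals over model-code constants can be illustrated with a hypothetical `Buffer` class:

```java
// Sketch contrasting an assertion against a model constant with one
// against a literal. Buffer and MAX_SIZE are hypothetical names.
public class LiteralValueTest {
    static class Buffer {
        static final int MAX_SIZE = 100;  // constant defined in model code
        int capacity() { return MAX_SIZE; }
    }

    public static void main(String[] args) {
        Buffer b = new Buffer();
        // Risky: "b.capacity() == Buffer.MAX_SIZE" still passes even if
        // someone accidentally changes MAX_SIZE to 10.

        // Better: the literal pins the behavior independently of the model code.
        if (b.capacity() != 100) {
            throw new AssertionError("capacity changed: " + b.capacity());
        }
        System.out.println("ok");
    }
}
```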
At 8:53, he adds that correct output should not depend on anything in the surrounding environment, such as the time, the speed of light, or the gravitational constant of the universe. Don't let the environment control the input of your test.
Write Your Tests First!
At 9:42, he says that writing model code directly makes you slow. Test-first is not just about testing; it's about software development, and it creates better APIs. At 10:45, he addresses a common criticism of unit testing: if I want to change something in my model code, I need to change all my tests. That only happens if you couple tests tightly to your implementation; if you write tests before the implementation, they are coupled only to the API.
Test-first development creates a better API because you start with the user of the API, not the implementation. Test-first also hides the implementation and avoids exposing internal details, which prevents brittle, tightly coupled tests.
At 14:00, he answers a very important question: why use unit tests rather than other kinds of tests, such as integration or fuzz tests? Unit means one: each test depends on only one thing in your system, and each test method is a separate test.
Best Practices for Unit Testing
At 14:45, he describes a best practice for unit tests: one assertion per test method. Share setup in a fixture, not inside the test method itself. You can also have multiple test classes per model class; do not feel compelled to stuff all your tests for Foo into one FooTest.
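The fixture idea can be sketched without JUnit: the runner calls `setUp` before each test method, mimicking JUnit's `@Before`, and each test method makes exactly one assertion. The shopping-list names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Dependency-free sketch of the fixture pattern JUnit provides via @Before:
// shared setup runs before every test, and each test asserts one thing.
public class ShoppingListTest {
    private List<String> list;

    // Equivalent of a JUnit @Before fixture method.
    void setUp() {
        list = new ArrayList<>();
        list.add("milk");
    }

    void testSizeAfterSetup() {
        if (list.size() != 1) throw new AssertionError("size: " + list.size());
    }

    void testContainsMilk() {
        if (!list.contains("milk")) throw new AssertionError("missing milk");
    }

    public static void main(String[] args) {
        ShoppingListTest t = new ShoppingListTest();
        // The runner calls setUp before each test, so tests never share state.
        t.setUp(); t.testSizeAfterSetup();
        t.setUp(); t.testContainsMilk();
        System.out.println("ok");
    }
}
```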
At 16:26, he explains that unit also means independent: each test stands alone. Tests can run in any order, and the order should not matter. Tests can also run in parallel in multiple threads. The one thing to keep in mind is that tests must not interfere with each other.
Tests and Thread Safety
At 17:37, he covers the important parameters of tests and thread safety. First, never depend on synchronization, semaphores, or special data structures, which can create deadlock problems. Second, do not share data between tests, and do not use non-constant static fields in your tests. Also watch out for global state in the model code under test.
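A small demonstration of why non-constant static fields are dangerous: the same test passes or fails depending on whether it has already run, which is exactly the order-dependence and parallelism hazard described above. The names here are illustrative:

```java
// Sketch of how shared mutable static state makes tests order-dependent.
public class StaticStateDemo {
    static int counter = 0;  // non-constant static field: avoid in real tests

    static void testIncrementsOnce() {
        counter++;
        if (counter != 1) throw new AssertionError("counter: " + counter);
    }

    public static void main(String[] args) {
        testIncrementsOnce();      // passes the first time...
        try {
            testIncrementsOnce();  // ...but fails on a repeat or parallel run
            System.out.println("unexpectedly passed");
        } catch (AssertionError expected) {
            System.out.println("second run failed as predicted: "
                    + expected.getMessage());
        }
    }
}
```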
At 19:18, he shares the two least-known facts of unit testing:
Tests do not share instance data.
You can have many test classes per model class.
At 21:25, he discusses the factors that affect test speed. A single test should run in a second or less, and a complete suite in a minute or less; it should be as fast as it can be, though in practice some suites take 10 to 20 minutes to run. If a suite takes longer, break the larger, slower tests out into additional suites and run the slowest tests last. This keeps day-to-day development fast.
At 24:10, Elliotte adds a general principle for testing: a passing test should produce no useful output, while a failing test should produce plenty, such as debugging information and stack traces. You only read stack traces and other details from tests that fail. Also, rotate your test data rather than using the same data in every test; this makes it much easier to see immediately which test is failing, which is slow, and why.
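Rotating test data might look like the following sketch: each check uses a distinct literal, so the failure message alone identifies the broken case. The `upper` helper is a hypothetical stand-in for real model code:

```java
// Sketch of rotated test data: distinct literals per check mean a failure
// message immediately says which input broke.
public class RotatedDataTest {
    static String upper(String s) { return s.toUpperCase(); }

    public static void main(String[] args) {
        if (!"ALPHA".equals(upper("alpha"))) throw new AssertionError("alpha case");
        if (!"BETA".equals(upper("beta")))   throw new AssertionError("beta case");
        // Had both checks used "alpha", a failure would not say which one broke.
        System.out.println("ok");
    }
}
```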
At 27:22, he moves on to a very important topic: flakiness. A flaky test sometimes passes and sometimes fails: you run it once and it's fine, but the next run errors out. These kinds of failures are really hard to track down.
Sources of flakiness include anything time-dependent, network availability and changing external network resources, explicit randomness, and multithreading. Anything in your test that happens because of a race condition will be very problematic and hard to track down.
At 31:54, he talks about system skew, another kind of flakiness: the code works fine on your system, but when you pass it to a teammate, it fails. The reason may be that one of you is on Windows and the other on Linux. It may also come from undefined or platform-defined behavior, such as integer widths in C (2, 4, or 8 bytes), default character sets, and so on. At 34:22, he again advises avoiding conditional logic in tests: stick to known fixed input that yields known fixed output.
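One concrete source of system skew is the platform default charset; pinning an explicit charset, as in the sketch below, removes it:

```java
import java.nio.charset.StandardCharsets;

// Sketch of avoiding system skew from the platform default charset.
public class CharsetTest {
    public static void main(String[] args) {
        String s = "café";
        // s.getBytes() with no argument depends on the platform default
        // encoding; with UTF-8 pinned, é is always two bytes, so the
        // length is fixed at 5 on every machine.
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (utf8.length != 5) throw new AssertionError("length: " + utf8.length);
        System.out.println("ok");
    }
}
```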
Debugging and Refactoring
At 35:17, he points out that you should write a failing test before fixing a bug, so that you know where the bug comes from and it does not reoccur, or if it does, it will be easily caught. If the test passes, the bug isn't what you think it is. He also emphasizes having the code under test before you refactor it.
Some refactorings can be done by automatic tools, such as renaming a variable. But the case Elliotte demonstrates at 37:07, combining eight different copy-and-pasted child-folder methods with slightly different signatures, needs more thought. If necessary, write additional tests before doing an unsafe refactoring.
At 39:15: every time code goes into the repo, all the tests should run automatically, and if you break something, an email goes out. Better yet, run the tests before the code is merged into the repo, using continuous integration (e.g., Travis).
Use a submit queue: you never commit directly to the repo. Instead, you send a pull request, the continuous integration server merges it into a branch and runs all the tests, and only if every test passes is it committed to the repo. Never, ever allow a check-in with a failing test. If one does slip through, don't attempt an emergency fix; just roll back and fix it at your leisure until the build goes from red back to green.
Final Thoughts and Conclusion
At 41:55, Elliotte gives his final thoughts and conclusion: write your tests first, before you change the model code, and make all tests unambiguous and reproducible. There should be no flakiness, no randomness, and no dependence on external environmental conditions.