Last week, my development team and I ran into a problem with a library we had written several months earlier. The library parses spreadsheets given to us by one of our clients and inserts the data into the database. At the time we weren’t sure what the problem was. We decided to run the tests, and two frustratingly useless things happened. First, all of the tests passed. Second, the test suite took 3 hours to run.
Here’s what we had done in our “unit tests.” We had placed several of these spreadsheets in our fixtures folder as examples. Each test created the importer object, told it to import, and then checked the results. These tests took a long time because a single import can take 15 minutes, depending on the amount of data involved. So, how do you fix something like that? Here are some ideas.
Test each piece in isolation
This was probably the biggest part of our problem with the test suite. We were running the entire importer on huge Excel spreadsheets just to check one or two behaviors in the data that was returned. This is like running diagnostics on a tank by taking it to war and seeing if you can kill the enemy. You drive it all the way there, only to find out that something is wrong.
So, as we worked through the test suite, we began breaking each step in the process out into its own method and then creating a series of unit tests for each step. This worked extremely well. It allowed us to pinpoint exactly where problems were occurring. And it didn’t need to import the entire file to verify one small behavior.
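To make the idea concrete, here is a minimal sketch in Python. The step names (normalize_header, drop_blank_rows, parse_row) are hypothetical and not taken from our importer. They only illustrate how one large import routine might be split into small functions that can each be tested without opening a spreadsheet.

```python
def normalize_header(cell_value):
    """Turn a raw header cell into a column key (hypothetical step)."""
    return str(cell_value).strip().lower().replace(" ", "_")


def drop_blank_rows(rows):
    """Discard rows where every cell is empty (hypothetical step)."""
    return [row for row in rows if any(cell not in (None, "") for cell in row)]


def parse_row(headers, row):
    """Pair each header key with the cell in the same position (hypothetical step)."""
    return dict(zip(headers, row))


# Each step gets its own tiny test; none of them needs a real file.
def test_normalize_header():
    assert normalize_header("  First Name ") == "first_name"


def test_drop_blank_rows():
    assert drop_blank_rows([["a", 1], [None, ""], ["b", 2]]) == [["a", 1], ["b", 2]]


def test_parse_row():
    assert parse_row(["name", "qty"], ["widget", 3]) == {"name": "widget", "qty": 3}


test_normalize_header()
test_drop_blank_rows()
test_parse_row()
```

When one of these fails, the failing test name points at the broken step, which is exactly what the 3-hour suite could not do.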
Use small datasets
Once we started writing these shortened unit tests, we realized that a whole worksheet of data was overkill. We could pass a small dataset to each method and verify that each and every result returned was what we expected. In other words, we’d get a small dataset back that was quickly and easily verified.
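As a rough illustration, assume a hypothetical step called rows_to_records that treats the first row as headers. Neither the name nor the data comes from our real importer. With a dataset this small, the full expected result fits in the test itself.

```python
def rows_to_records(rows):
    """Hypothetical step: the first row holds headers, the rest hold data."""
    header, *body = rows
    keys = [str(h).strip().lower() for h in header]
    return [dict(zip(keys, row)) for row in body]


# Three rows are enough to exercise the behavior; no worksheet required.
SMALL_DATASET = [
    ["Name", "Quantity"],
    ["widget", 3],
    ["gadget", 0],
]

records = rows_to_records(SMALL_DATASET)

# Every returned record is checked, not just a sample of them.
assert records == [
    {"name": "widget", "quantity": 3},
    {"name": "gadget", "quantity": 0},
]
```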
If it looks like a duck and quacks like a duck, it must be a mock object
To be honest, mock objects are the real trick to isolating portions of your code. All you really need in a good mock object is for it to respond to any method calls passed to it and, where necessary, return something that makes sense. In our case, mocking workbooks, worksheets, rows, and cells saved a lot of time we would otherwise have spent trying to understand the spreadsheet library we were using. Instead, all I really had to do was create a mock object of each. In the case of worksheets, rows, and cells, most of the time the code was only iterating over them, so I could use an Array instead of creating a mock object.
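Here is one way this could look in Python, using the standard library's unittest.mock. The import_workbook function and the worksheets attribute are assumptions made for this sketch, not the API of our importer or of any particular spreadsheet library. The workbook and database are mocks, while worksheets and rows are plain lists, because the code only iterates over them.

```python
from unittest.mock import Mock, call


def import_workbook(workbook, database):
    """Hypothetical importer step: insert every non-header row."""
    inserted = 0
    for worksheet in workbook.worksheets:
        for row in worksheet[1:]:  # skip the header row
            database.insert(list(row))
            inserted += 1
    return inserted


# The workbook is a mock; its worksheets and rows are ordinary lists.
workbook = Mock()
workbook.worksheets = [
    [["name", "qty"], ["widget", 3]],
    [["name", "qty"], ["gadget", 0], ["gizmo", 7]],
]

# The database is a mock too, so nothing is actually written anywhere.
database = Mock()

count = import_workbook(workbook, database)

assert count == 3
database.insert.assert_has_calls(
    [call(["widget", 3]), call(["gadget", 0]), call(["gizmo", 7])]
)
```

The mock database records every call it receives, so the test can verify what would have been inserted without touching a real database.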
All in all, this worked out really well. Instead of doing 15 data imports, we slimmed things down to importing only the data needed to verify a specific aspect of the import. We no longer have enormous amounts of data to verify. And we can pinpoint exactly where the problems lie if the code gets modified in a way that creates more problems.