Contact me if you want your company to benefit from data-driven testing
I am an IT consultant trading through my limited company, www.inforhino.co.uk . I have identified a common problem with how data-driven technology projects fail within organisations, and I have implemented this solution on-site. The article below is fairly technical, but I will publish a simpler article on why data-driven testing can increase collaboration, reduce confusion, reduce application complexity, deepen knowledge of an application, and give better confidence in the quality of the product.
Contact me if you want to know more.
One does not simply walk into Mordor…
This isn't simply regression testing...
Big data and its promises
I attended a presentation many years ago whose main premise rings true if you are a Business Intelligence developer. The metaphor was to imagine data as fish: rather than catching individual fish, you cast a net. This would allow emails, spreadsheets and Word documents to feed into a single access point where information could be collated and made usable. In IT terms we discuss the idea of structured versus unstructured data; for non-technical people, we can take structured to mean something that is easier to analyse than unstructured data.
What really struck me at the time of this presentation was the idea that in any corporate environment so much gets lost. When we take this further: we have repeated meetings where the same item gets discussed and never closed off, we have emails about the same items between different groups of people, we sketch out models on paper, we write notes that never go anywhere. Could any of this information be useful to a company? Almost certainly. Imagine if doctors' notes in a hospital could be analysed to quickly identify a misdiagnosis or the accidental administering of an incorrect dosage, or if prospective-client notes could tell you who is most likely to complete a purchase. Imagine if meetings were productive and resulted in actionable outcomes. Whenever I work in places, I can't help but feel that most people are by nature ineffective, and that more effort goes into appearing effective than into being effective. Some of these opinions seem extreme, but when we ask honest questions of ourselves we cannot help but find this to be true.
The answer isn't some kind of police-state system recording everything and taking the enjoyment out of work (being useless is kind of fun); instead, it is about removing impractical nonsense so that people can be more effective.
How bad software projects get worse through misalignment
This idea of "loss", that unproductive work happens, is uncomfortable, and as a software developer I see it every day. The scenario goes something like this. We are developing a new system with a SQL database; there are testers, technical business analysts and semi-technical business users, all with varying levels of skill in SQL. During the Software Development Life Cycle (SDLC), each person writes queries to validate data. Many of these queries are useful, but releasing them into production carries overhead, so they get emailed around and saved on personal drives; datasets get exported but never really centralised. As system testing increases, different queries get written, sometimes taking us on wild goose chases. Another factor compounding this sense of loss is when the person who understands a data issue cannot convey it to non-technical people because they don't own the problem.
I have presented this scenario of dysfunction to team leaders, and they seem reluctant to acknowledge there is a problem, or don't quite understand the issue. This is perfectly understandable: who wants to look under the bonnet when the car keeps driving? My message to them is this: there is a problem, and not just in your company but in every company I have ever been engaged by. The approach I will present is not a magical solution, but it will help ensure you have better confidence not only in your software but also in the competence of your team. This sounds harsh, but honestly, the scenario described above can only be evidence of incompetence.
Unit testing
What is being proposed will seem like heresy to many. This post isn't intended to fully cover the merits or demerits of unit testing, but instead to explain why it isn't enough and why NBi, as a data testing framework, is the go-to tool.
Advocating unit testing in applications but only for calculations and extension methods
If I had an extension method that repeated a string, I would write a unit test on that method and be confident it worked throughout the application. If I were performing a calculation, I would probably write a unit test on that calculation too. I rarely bother creating unit tests on repositories, for reasons we will see later.
Unit testing as a guide on how to…
I see unit tests more as a guide on how to develop against a component or application; this is where unit testing really excels. Take the basic and familiar pattern of opening a database connection, setting up a command object and retrieving a dataset. We might write a unit test (here using FakeItEasy) along these lines:

    IDbConnection conn = A.Fake<IDbConnection>();
    IDbCommand comm = A.Fake<IDbCommand>();
    var dataset = new List<DataRow>();
    // ... add sample rows, call the code under test ...
    A.CallTo(() => conn.Open()).MustHaveHappened();
Unit testing is not what it is made out to be
This deserves its own post, as it is something I feel strongly about. Unit testing focuses on testing a single unit of work, which typically performs an action or returns a value or an object of data. A simple unit test might be Assert.AreEqual(12, 8 + 4). Application developers obsess over isolation from external dependencies. A simple example would be a method that deletes a file. Imagine the file didn't exist: what would happen? It might error, or perhaps some other test might also fail. One way of avoiding this is to fake the file operation and allow the test to run regardless. This is a hard thing for people to get their heads around, so a simple example will suffice.
    public class FileDeleter : IFileDeleter
    {
        public void Delete(string path) { /* ... */ }
    }

And in my consuming class, instead of calling FileDeleter.Delete() directly, I call Delete() on an injected IFileDeleter, so that a fake can be substituted in tests.
This is perfectly reasonable and good practice, but it can be a lot of faff. More importantly, if we are generating code dynamically or at build time, a lot of these tests become useless. I will give a really quick example. Imagine we are building a data warehouse and have written an application to manage data across twenty tables, all managed in the same way. As developers, we tend to write a pattern against one table and then repeat that code until the work is complete. Testing an application that interacts with those tables tends to entail faking or mocking the data to avoid external dependencies, but our understanding of that data moves on, and maintaining these unit tests becomes unmanageable. We end up with hard-coded data in the unit tests that no longer reflects the real business meaning of the data. In the end we have to rely upon integration tests and stop looking at these unit tests.
NBi as a small way to improve collaboration between teams in software development and to reduce unit testing scope
http://www.nbi.io/
For reasons that escape too many application developers, the most important thing in any application is its data, be it configuration, reference or dynamic data. We can write the most perfectly unit-tested application, with everything isolated and faked, and yet when the application runs, something completely unexpected happens. It can be as bad as building a ship in a yard and finding it sinks the minute it hits water. Even an application configuration file is data; it is XML, and it can and should be tested.
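To make this concrete, here is a minimal sketch of what an NBi test suite looks like: it is itself an XML file describing queries and assertions. The element names follow NBi's test-suite schema as I understand it (check the NBi documentation for the exact grammar), and the connection string and table are hypothetical:

```xml
<?xml version="1.0" encoding="utf-8"?>
<testSuite name="Reference data checks" xmlns="http://NBi/TestSuite">
  <!-- Assert that the currency reference table contains exactly the codes
       the application expects. -->
  <test name="Currency codes match the expected list">
    <system-under-test>
      <execution>
        <query connectionString="Data Source=MyServer;Initial Catalog=MyDb;Integrated Security=True">
          select Code from dbo.Currency order by Code;
        </query>
      </execution>
    </system-under-test>
    <assert>
      <equalTo>
        <resultSet>
          <row><cell>EUR</cell></row>
          <row><cell>GBP</cell></row>
          <row><cell>USD</cell></row>
        </resultSet>
      </equalTo>
    </assert>
  </test>
</testSuite>
```

The suite runs through NBi's NUnit-based runner, giving the familiar red/green traffic lights, but on the data rather than on application internals.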
As application developers, as our application gets promoted through the SDLC we encounter issues we never imagined could exist. A simple example is the test environment holding different datasets from the development environment. We also find a lack of shared understanding of how the data is structured: the tests and assumptions made by the testers differ from the knowledge held by the developers. At the same time, the real data we encounter can differ from our expectations.
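One way to surface those environment differences is a test that runs the same query against two environments and compares the result sets; NBi's equalTo assertion can take a second query as the expected side. A sketch, with hypothetical server and table names:

```xml
<!-- Compare the same dataset across the dev and test environments. -->
<test name="Customer data matches between dev and test">
  <system-under-test>
    <execution>
      <query connectionString="Data Source=TestServer;Initial Catalog=Sales;Integrated Security=True">
        select CustomerId, Status from dbo.Customer order by CustomerId;
      </query>
    </execution>
  </system-under-test>
  <assert>
    <equalTo>
      <query connectionString="Data Source=DevServer;Initial Catalog=Sales;Integrated Security=True">
        select CustomerId, Status from dbo.Customer order by CustomerId;
      </query>
    </equalTo>
  </assert>
</test>
```

A failure here tells everyone, testers and developers alike, that the two environments disagree before anyone starts chasing a phantom application defect.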
In another approach to application development, we tend to develop fairly openly and liberally, and as more understanding is acquired we make our applications more bulletproof. We prefer getting core functionality working in a shell before making it a fully functioning application, and as requirements arrive we focus more on those than on bulletproofing the application. In what is an allusion to "Promise Theory", it doesn't make sense to keep adding more and more validation to an application in case something happens. https://www.amazon.co.uk/Promise-Theory-Principles-Applications-1/dp/1495437779
To give a brief example, System A is responsible for taking customer orders, dispatching products and generating invoices. As developers, we immediately see multiple points in the chain: stock levels, delivery agents, confirmations from customers, returns from customers, accounting systems, email systems, and the list goes on. The defensive developer, burned by many previous implementations, seeks to make his application more robust, but takes on more and more responsibilities that perhaps belong in other applications and systems. Eventually the system becomes so complicated and confused, because it is trying to cover all possibilities, that it becomes the opposite of what it was intended to be and no longer works.
Instead, by "thinking in promises" (after the book by Mark Burgess) we can move beyond this towards a higher plane. We can identify state and gain environmental assurance: we can write down the expectations an application has of its environment, and flag up when that environment isn't adequate.
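Expressed as a promise, an environment check might simply assert that the data the application needs is present before the application is expected to run. A sketch (connection string, table and expected count are all hypothetical):

```xml
<!-- Promise: the application only runs sensibly once reference data is loaded. -->
<test name="Environment is fit: product reference data is loaded">
  <system-under-test>
    <execution>
      <query connectionString="Data Source=MyServer;Initial Catalog=MyDb;Integrated Security=True">
        select count(*) from dbo.Product;
      </query>
    </execution>
  </system-under-test>
  <assert>
    <equalTo>
      <resultSet>
        <row><cell>250</cell></row>
      </resultSet>
    </equalTo>
  </assert>
</test>
```

The application itself stays simple; the promise about its environment lives in the test suite, where anyone can read and run it.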
This may not seem that significant, but just as unit tests give us traffic lights telling us whether something passed or failed, a set of runnable tests on the validity of the data, rather than on the internals of an application, is far more beneficial for one simple reason: data that is fit enough for one application to run is also fit for a different application providing the same functionality. This knowledge is shared at a higher level. It is the same reason batch management is a necessary evil in corporate organisations: while a batch may not be the highest expression of an organisation's intelligence, it gives oversight of most of the organisation's operations through a single common interface that a non-technical person can see. In the same way, a GUI or output showing which data-fitness tests passed or failed means something to non-technical people. This is a massive benefit, increasing common understanding of an organisation's operations.
Many people can contribute to the test suite
Anybody can contribute to the test suite, or even create their own, and this knowledge is structured for others to share and use. This is a major win and reduces the information loss discussed earlier. I have seen fewer meetings rehashing the same point, simply because these tests, and the common understanding they create, exist.
We can demonstrate defects without having to change the application
We can write new tests, which may not need an official software release, to help prove or disprove defects. Again this is important, because it can be very hard to deploy fixes into environments where they might not work. It is much easier to write some SQL, bundle it into a test suite and run those tests to get a diagnostic on possible issues than to immediately write a fix that may or may not work.
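For example, a suspected defect such as orphaned order lines can be phrased as a diagnostic query wrapped in a test that asserts it finds nothing, without touching the application. A sketch with hypothetical table names:

```xml
<!-- Defect hypothesis: order lines exist without a parent order.
     A green result disproves the hypothesis; a red one quantifies it. -->
<test name="No orphaned order lines">
  <system-under-test>
    <execution>
      <query connectionString="Data Source=MyServer;Initial Catalog=Sales;Integrated Security=True">
        select count(*)
        from dbo.OrderLine ol
        left join dbo.[Order] o on o.OrderId = ol.OrderId
        where o.OrderId is null;
      </query>
    </execution>
  </system-under-test>
  <assert>
    <equalTo>
      <resultSet>
        <row><cell>0</cell></row>
      </resultSet>
    </equalTo>
  </assert>
</test>
```

The SQL that would otherwise be emailed around and lost becomes a named, repeatable check in the suite.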
Thanks for reading
As I said, this document is deliberately more of a technical discussion. Getting developers and technical teams to recognise the value of an approach is extremely hard, because their views are deeply rooted and because the traditional separation of roles divides responsibilities. Each team involved in developing a product uses wildly varying ways to assure that it works, but rarely spends time aligning understanding or stating what the application's data environment should contain for the application to work. It would be great to take your company through the process of using NBi to become more effective.