Test data preparation tools generate and manage test data for validating the functionality, performance, and security of applications under development.
Table of Contents
Test Data Preparation is a Must-Have
Test Data Preparation Tools are a Business Necessity
5 Features You Want Your Test Data Preparation Tools to Include
2 Ways to Use Your Tools
Business Entities: A New Approach to Test Data Preparation
Enterprises have complex IT landscapes with multiple applications deployed across disparate on-premise and cloud environments. With so many production source systems, and even more target testing environments, preparing test data is a complex and often lengthy process. To understand how test data generation can prepare production-grade test data for DevOps, here's a checklist of requirements for your test data preparation tools.
Enterprises face several challenges in test data management and preparation. They must integrate data across multiple production systems, comply with data privacy regulations, pay high data storage costs, deal with constant application changes, and provision test data on demand to DevOps pipelines and continuous testing environments.
This makes finding the right test data preparation tools a critical task for your enterprise. Such tools should support complex IT environments, with multiple production applications deployed across disparate on-premise and cloud environments and even more target testing environments. The test data should be continuously synced with production, always secure, and immediately available for provisioning.
Multiple source systems and environments
As we’ve mentioned, production data is often scattered across many systems and environments, which makes data collection and integration essential. When performed manually, collection and integration are extremely time-consuming, so test data preparation tools must be able to automate these tasks.
Data masking
Data masking keeps sensitive information private and secure, and is key to complying with privacy regulations and protecting data from internal and external threats. With data breaches on the rise, anonymizing sensitive data in non-production testing and development environments is both a logical and a financial imperative. Enterprises should embrace masking and security processes to ensure that customer privacy and the company’s reputation both remain intact. Test data preparation tools should provide a variety of data masking techniques, including randomization, substitution, scrambling, and more.
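To make the three techniques named above concrete, here is a minimal Python sketch. The field names, the pool of replacement names, and the helper functions are all hypothetical, chosen only for illustration; they do not describe how any particular product implements masking.

```python
import random
from typing import Sequence

# Hypothetical pool of replacement values used by the substitution technique.
FAKE_FIRST_NAMES = ["Steve", "Maria", "Chen", "Amara", "Lukas"]


def substitute(pool: Sequence[str], rng: random.Random) -> str:
    """Substitution: replace the real value with one drawn from a pool of fakes."""
    return rng.choice(pool)


def scramble(value: str, rng: random.Random) -> str:
    """Scrambling: shuffle the characters of the original value."""
    chars = list(value)
    rng.shuffle(chars)
    return "".join(chars)


def randomize_digits(value: str, rng: random.Random) -> str:
    """Randomization: replace every digit with a random digit, keeping the format."""
    return "".join(str(rng.randrange(10)) if ch.isdigit() else ch for ch in value)


def mask_customer(record: dict, rng: random.Random) -> dict:
    """Return a masked copy of a (hypothetical) customer record."""
    masked = dict(record)
    masked["first_name"] = substitute(FAKE_FIRST_NAMES, rng)
    masked["account_code"] = scramble(record["account_code"], rng)
    masked["phone"] = randomize_digits(record["phone"], rng)
    return masked


rng = random.Random(7)  # seeded only so that repeated runs behave the same
original = {"first_name": "John", "account_code": "AX93KQ", "phone": "555-0142"}
masked = mask_customer(original, rng)
print(masked)
```

Each technique trades realism against protection differently: substitution yields realistic values, randomization preserves the format, and scrambling keeps the original characters, so on its own it is the weakest of the three.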
Referential integrity and consistency
As discussed above, masking is a must. At the same time, when collecting customer data from multiple sources, it is critical to preserve referential integrity. For example, if John becomes Steve when the CRM data is masked, then John’s invoices must become Steve’s invoices when the billing data is masked. To achieve this, your test data preparation tools need identity resolution capabilities. They should also mask fields consistently (John always becomes Steve) to support defect tracking and resolution validation. This is known as persistent data masking.
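One common way to get this kind of consistency is to derive the masked value deterministically from the original with a keyed hash, so that the same input always maps to the same output in every system. The sketch below assumes this approach. The key, the name pool, and the record layouts are hypothetical, and a real tool may work quite differently.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would come from a secrets manager.
SECRET_KEY = b"demo-key-change-me"
# Hypothetical pool of replacement names.
FAKE_NAMES = ["Steve", "Maria", "Chen", "Amara", "Lukas", "Priya"]


def persistent_mask(value: str) -> str:
    """Map a value to a fake name; the same input always yields the same output."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(FAKE_NAMES)
    return FAKE_NAMES[index]


# Two hypothetical source systems that both hold the customer's name.
crm_rows = [{"customer_id": 1, "name": "John"}]
billing_rows = [{"invoice_id": 100, "customer_name": "John", "amount": 250}]

masked_crm = [{**row, "name": persistent_mask(row["name"])} for row in crm_rows]
masked_billing = [
    {**row, "customer_name": persistent_mask(row["customer_name"])}
    for row in billing_rows
]

# The masked name is identical in both systems, so invoices still match customers.
print(masked_crm[0]["name"], masked_billing[0]["customer_name"])
```

With a pool this small, different real names can collide on the same fake name. A production-grade approach would need a much larger value space or a collision-free mapping.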
Data subsets
There are functional and practical reasons for creating subsets of test data. For example, test data subsets require less storage and bandwidth, and can be focused on a particular population segment. Whatever your reason, creating test data subsets should be based on flexible parameters spanning one or more source systems. In addition, even the most complex selections should be ready for provisioning in minutes.
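As an illustration of parameter-based subsetting across more than one source system, the sketch below selects customers by a CRM attribute and a billing threshold, then pulls the matching rows from both systems. The table layouts, field names, and parameters are assumptions made for this example.

```python
# Hypothetical rows from two source systems.
crm_rows = [
    {"customer_id": 1, "segment": "premium"},
    {"customer_id": 2, "segment": "basic"},
    {"customer_id": 3, "segment": "premium"},
]
billing_rows = [
    {"invoice_id": 100, "customer_id": 1, "amount": 300},
    {"invoice_id": 101, "customer_id": 1, "amount": 250},
    {"invoice_id": 102, "customer_id": 2, "amount": 900},
    {"invoice_id": 103, "customer_id": 3, "amount": 40},
]


def select_customer_ids(crm, billing, segment, min_total_billed):
    """Pick customers in a segment whose invoices add up to at least a threshold."""
    totals = {}
    for invoice in billing:
        cid = invoice["customer_id"]
        totals[cid] = totals.get(cid, 0) + invoice["amount"]
    return {
        row["customer_id"]
        for row in crm
        if row["segment"] == segment
        and totals.get(row["customer_id"], 0) >= min_total_billed
    }


def build_subset(crm, billing, customer_ids):
    """Collect the rows for the selected customers from every source system."""
    return {
        "crm": [row for row in crm if row["customer_id"] in customer_ids],
        "billing": [row for row in billing if row["customer_id"] in customer_ids],
    }


ids = select_customer_ids(crm_rows, billing_rows, segment="premium", min_total_billed=500)
subset = build_subset(crm_rows, billing_rows, ids)
print(subset)
```

Selecting by customer first and then collecting rows per system keeps the subset internally consistent: every invoice in it belongs to a customer who is also in it.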
Time machine
Testing scenarios may involve multiple steps that sometimes change the test data. Since running an entire scenario can be time-consuming, testers often need to repeat just a part of the process. By then, however, the test data has already been changed. Instead of going back to square one, your test data preparation tools should enable you to go back in time, to a particular iteration, on demand.
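The idea can be sketched as labeled snapshots that a tester takes between steps and restores later. The class below is a hypothetical in-memory illustration that assumes the test data fits in a Python dictionary; a real tool would version data in a database or data store.

```python
import copy


class DataTimeMachine:
    """Keep labeled snapshots of a test data set and restore them on demand."""

    def __init__(self, data):
        self.data = data
        self._snapshots = {}

    def snapshot(self, label):
        # Deep copy, so later changes to the live data do not alter the snapshot.
        self._snapshots[label] = copy.deepcopy(self.data)

    def restore(self, label):
        # Deep copy again, so the stored snapshot can be restored repeatedly.
        self.data = copy.deepcopy(self._snapshots[label])


machine = DataTimeMachine({"orders": [{"order_id": 1, "status": "new"}]})

machine.snapshot("before_payment")                  # state before step 2
machine.data["orders"][0]["status"] = "paid"        # step 2 changes the data
machine.snapshot("after_payment")
machine.data["orders"][0]["status"] = "shipped"     # step 3 changes it again

machine.restore("after_payment")                    # rerun only step 3
print(machine.data["orders"][0]["status"])
```

Restoring "after_payment" returns the data to the state it had just before step 3, so the tester can repeat that step without rerunning the whole scenario.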
With the right tools, testers can create the test data they need, by themselves, in minutes.
The checklist above covers what test data preparation tools should be able to do. The following two points focus on how to use them:
Self-service portal: One cause for major delays in test data provisioning is the need to extract the data from the source systems. Instead of having to wait hours for this to happen, choose a system that enables testers to create the test data they need, in minutes. This feature not only accelerates software delivery, it also improves testing coverage.
API-based automation: API integration will give testing and software teams the ability to build test data pipelines that will serve the organization every step of the way. Choose an API-first solution that delivers continuous testing data to all your testing environments as part of your DevOps CI/CD strategy.
Preparing test data at enterprise scale and complexity requires a whole new approach: entity-based test data management. In this approach, a business entity might be a customer, device, or loan, and the data schema of each entity instance aggregates its data across all source systems. The test data from these entities, which is stored in a centralized test data warehouse and synced with all production sources, can be provisioned to any testing environment instantly.
Entity-based test data is always:
Fresh – with individual entities continuously updated, to reflect the latest changes to the production data
Secure – with test data masked on ingestion, to ensure compliance and protection
Trusted – with integrity built into every business entity schema
Relevant – with parameter-based subsets, enabling simple and fast test data provisioning
On demand – with test data always ready for provisioning, via a self-service portal or API
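To illustrate what it means for one entity instance to aggregate its data across all source systems, here is a minimal sketch that assembles a customer entity from three systems. The system names, field names, and function are hypothetical and are not taken from any specific product.

```python
# Hypothetical rows from three source systems, all keyed by customer_id.
sources = {
    "crm": [
        {"customer_id": 1, "name": "Steve"},
        {"customer_id": 2, "name": "Maria"},
    ],
    "billing": [
        {"customer_id": 1, "invoice_id": 100, "amount": 250},
        {"customer_id": 2, "invoice_id": 101, "amount": 90},
        {"customer_id": 1, "invoice_id": 102, "amount": 75},
    ],
    "support": [
        {"customer_id": 2, "ticket_id": 9001, "status": "open"},
    ],
}


def build_customer_entity(customer_id, source_systems):
    """Gather everything known about one customer into a single entity instance."""
    entity = {"customer_id": customer_id}
    for system_name, rows in source_systems.items():
        entity[system_name] = [
            row for row in rows if row["customer_id"] == customer_id
        ]
    return entity


customer_1 = build_customer_entity(1, sources)
print(customer_1)
```

Because each entity carries its own rows from every system, provisioning a test environment amounts to copying whole entities, which keeps related records together.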