K2view TDM solutions validated for reducing test data provisioning delays, eliminating manual data masking, and simplifying complex test environment setups.
Broadly referred to as K2tdm, K2view test data management tools include data subsetting, data masking, and synthetic data generation solutions.
Data subsetting is a test data management technique used to create smaller, more representative subsets of real-life data for testing environments. Subsetting lets organizations work with smaller, more precise datasets that maintain the same characteristics and patterns as the actual data, while also complying with data privacy regulations, lowering storage costs, and accelerating the testing process.
Data masking tools are software products that discover and obfuscate sensitive data, so that Personally Identifiable Information (PII) from the original source data can’t be detected. They replace PII with fake information, ensuring that the original data cannot be reverse-engineered while retaining the relational integrity and usability of the data. Traditional data masking tools replace sensitive information with “null” values, while modern masking tools replace PII with realistic-looking fake information that supports robust application testing.
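To make the idea of masking that preserves referential integrity concrete, here is a minimal Python sketch, not K2view’s implementation: a keyed hash deterministically maps each original value to a realistic-looking fake, so the same input always masks to the same output across tables, and the original cannot be recovered from the fake. The `FAKE_NAMES` pool and the sample tables are illustrative assumptions.

```python
import hashlib

# Hypothetical fake-value pool; a real masking tool would draw from
# much larger dictionaries of realistic values.
FAKE_NAMES = ["Alex Smith", "Sam Jones", "Pat Brown", "Chris Davis"]

def mask_value(value: str, secret: str = "masking-key") -> str:
    """Deterministically map a PII value to a realistic-looking fake.

    The keyed hash guarantees the same input always yields the same
    output, so foreign-key relationships stay intact, while the
    original value cannot be recovered from the fake."""
    digest = hashlib.sha256((secret + value).encode()).hexdigest()
    return FAKE_NAMES[int(digest, 16) % len(FAKE_NAMES)]

# Two hypothetical tables that both reference the same customer.
customers = [{"id": 1, "name": "Jane Doe"}, {"id": 2, "name": "John Roe"}]
orders = [{"order_id": 10, "customer_name": "Jane Doe"}]

masked_customers = [{**c, "name": mask_value(c["name"])} for c in customers]
masked_orders = [{**o, "customer_name": mask_value(o["customer_name"])} for o in orders]

# Referential integrity: the same original name masks to the same fake
# in both tables, so joins between them still work after masking.
```

In practice a secret key would come from a vault rather than a default argument, but the core property, consistent and irreversible substitution, is what keeps masked datasets usable for testing.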
Synthetic data generation tools create artificial data that mimics the features, structures, and statistical attributes of real-life data, thereby complying with data privacy laws. The K2view synthetic data generation solution focuses on structured, tabular data and the 4 methods used to synthesize it:
Generative AI synthetic data models, which create realistic synthetic data using Machine Learning (ML) algorithms
A rules engine, which generates synthetic data from user-defined parameters
Entity cloning, which extracts business entity data, masks it, and then replicates it, typically at very large scale for performance and load testing
Data masking, which replaces PII and sensitive data with new, fake data
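Of the four methods above, the rules engine is the easiest to picture in code. The following is a hedged sketch, assuming a configuration where each column maps to a user-defined, parameterized generator; the column names and value ranges are invented for illustration and do not reflect K2view’s actual rule format.

```python
import random
import string

# Hypothetical user-defined generation rules: each column maps to a
# parameterized generator, roughly how a rules engine might be configured.
RULES = {
    "customer_id": lambda rng: rng.randrange(100000, 999999),
    "country": lambda rng: rng.choice(["US", "DE", "JP", "BR"]),
    "balance": lambda rng: round(rng.uniform(0, 10000), 2),
    "email": lambda rng: "".join(rng.choices(string.ascii_lowercase, k=8))
    + "@example.com",
}

def generate_rows(rules, n, seed=42):
    """Generate n synthetic rows by applying each column's rule."""
    rng = random.Random(seed)  # seeded so test datasets are reproducible
    return [{col: gen(rng) for col, gen in rules.items()} for _ in range(n)]

rows = generate_rows(RULES, 1000)
```

Seeding the generator is a deliberate choice here: reproducible synthetic datasets make test failures easier to replay, which matters more in a TDM context than statistical novelty.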
According to Bloor Senior Analyst Daniel Howard, K2tdm organizes the test data management process into 3 key phases: extraction, organization, and operation. Test data access follows.
Notably, these phases can be repeated – either on demand or on a schedule – to selectively refresh your test data and keep it aligned with changing production environments.
In the extraction phase, K2tdm ingests data from any source, whether structured or unstructured. Business entities (such as customers) are automatically classified within the platform’s data catalog, aptly named K2catalog. This includes a sensitive data discovery process, where what qualifies as sensitive data is controlled by a series of customizable rules and parameters. These rules can match against column names and/or the values and format of the data itself.
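The rule-matching described above can be sketched in a few lines of Python. This is a toy illustration, not K2view’s discovery engine: the name and value patterns stand in for the customizable rules, and the sample table is invented.

```python
import re

# Hypothetical discovery rules: match on column names and/or value
# format, mirroring the two kinds of rules described above.
NAME_PATTERNS = {
    "email": re.compile(r"e[-_]?mail", re.I),
    "ssn": re.compile(r"ssn|social", re.I),
}
VALUE_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}

def discover_sensitive_columns(columns, sample_rows):
    """Flag columns as sensitive by column name or by sampled values."""
    findings = {}
    for col in columns:
        # Rule type 1: the column name itself looks sensitive.
        for kind, pat in NAME_PATTERNS.items():
            if pat.search(col):
                findings[col] = kind
        # Rule type 2: every sampled value fits a sensitive format.
        values = [str(row.get(col, "")) for row in sample_rows]
        for kind, pat in VALUE_PATTERNS.items():
            if values and all(pat.match(v) for v in values):
                findings.setdefault(col, kind)
    return findings

sample = [{"contact": "a@b.com", "tax_id": "123-45-6789", "city": "Oslo"}]
findings = discover_sensitive_columns(["contact", "tax_id", "city"], sample)
# "contact" is caught by value format, "tax_id" by the SSN format rule,
# and "city" passes clean, even though neither column name is suspicious.
```

The point of combining both rule types is visible in the example: neither `contact` nor `tax_id` has a suspicious name, so name-only matching would miss them.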
Any discovered sensitive data is automatically masked within the platform, including masking of highly unstructured data such as documents and images. Persistent masking is available at rest and in flight, as is dynamic masking with Role-Based Access Control (RBAC). Masking is always consistent and maintains referential integrity, and many prebuilt, customizable masking functions are provided. Facilities for data compression and versioning are also offered during this phase.
In the next phase, your imported data is organized into a structure that makes test datasets and subsets easy to create and then provision to the target environment.
Specifically, it is partitioned into discrete, customizable business entities (determined by your data model, but typically representing customers), with each entity then used to create a unique Micro-Database™ that contains all the data associated with that entity, often centralizing data from several different sources in the process.
Each Micro-Database can be likened to a miniature data warehouse, offering a 360º view of the specific entity it represents. Notably, the centralized nature of these Micro-Databases ensures that referential integrity is maintained throughout the rest of the TDM process, because the Micro-Database itself is always treated as the ultimate source of truth for the entity it represents.
Therefore, if you change the data in a business entity, those changes will automatically propagate elsewhere in your testing environment.
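As a rough mental model, assuming nothing about K2view’s internal storage, a Micro-Database behaves like a per-entity store that consolidates records from every source under a single key. The source systems and fields below are hypothetical.

```python
from collections import defaultdict

# Records for the same customer scattered across hypothetical sources.
crm = [{"customer_id": 1, "name": "Jane Doe"}]
billing = [{"customer_id": 1, "invoice": 501, "amount": 99.0}]
support = [{"customer_id": 1, "ticket": "T-77", "status": "open"}]

def build_entity_store(**sources):
    """Group every record by its entity key, yielding one consolidated
    record set per customer: a toy analogue of a Micro-Database."""
    store = defaultdict(dict)
    for source_name, records in sources.items():
        for rec in records:
            store[rec["customer_id"]].setdefault(source_name, []).append(rec)
    return dict(store)

entities = build_entity_store(crm=crm, billing=billing, support=support)
# entities[1] now holds the consolidated view of customer 1; because any
# dataset is provisioned *from* this store, a change made here reaches
# every downstream copy, which is the propagation behavior described above.
```

The single-source-of-truth property falls out of the structure: there is exactly one place where customer 1’s data lives, so there is nothing to drift out of sync.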
The third phase is operation. Now that your source data has been appropriately ingested, masked, compressed, and organized into entities, you can use it to create and provision your test datasets.
There are 2 primary ways to do this in K2tdm: data subsetting and synthetic data generation.
For data subsetting, you create a subset of your business entities by filtering them through various customizable business rules, which are created via dropdown menus (no SQL or other query language required) and are therefore simple to build.
The corresponding Micro-Databases are fetched and combined into your test dataset, then provisioned into the target environment. For example, a tester could rapidly select 1,000 customers based on location, purchase history, and loyalty program status.
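The tester’s selection can be sketched as rule predicates applied to a pool of entities. This is a minimal illustration with invented attributes and rules, not the dropdown-built rules themselves.

```python
# Hypothetical rule predicates standing in for the dropdown-built
# business rules; each one filters on an entity attribute.
rules = [
    lambda c: c["location"] == "Berlin",
    lambda c: c["purchases"] >= 5,
    lambda c: c["loyalty_tier"] in ("gold", "platinum"),
]

def subset(entities, rules, limit):
    """Fetch the entities satisfying every business rule, capped at limit."""
    matched = [e for e in entities if all(rule(e) for rule in rules)]
    return matched[:limit]

customers = [
    {"id": 1, "location": "Berlin", "purchases": 7, "loyalty_tier": "gold"},
    {"id": 2, "location": "Berlin", "purchases": 2, "loyalty_tier": "gold"},
    {"id": 3, "location": "Oslo", "purchases": 9, "loyalty_tier": "platinum"},
]

# Only customer 1 satisfies all three rules, so a "select 1,000
# customers" request returns just that one from this tiny pool.
selected = subset(customers, rules, limit=1000)
```

Because rules compose with a simple AND, adding one more dropdown criterion is just one more predicate in the list, which is what keeps this style of subsetting approachable for non-technical testers.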
Howard differentiates K2view from other synthetic data generation solutions by explaining the 4 methods it uses to synthesize data:
The first is rules-based data generation, in which a series of specific, manually created business rules are used to generate data sets.
The second uses Machine Learning (ML) and Generative AI (GenAI) to analyze your production data and create a “lookalike” dataset, such that the individual entities are completely fabricated but the overall makeup of the data is very similar to the original.
The third leverages data masking techniques to create “new” data.
The fourth, described as data cloning, involves duplicating business entities but changing their identifying features.
These techniques vary in sophistication, and we would generally consider the latter 2 to be relatively ancillary: in most cases, the choice will be between rules-based or machine learning-based data generation, depending on whether fine-grained control or automation is most appropriate for a particular use case.
Even so, the fact that they are all offered within a single product is a point in K2tdm’s favor and empowers you to choose the best technique for each use case. Datasets can also be created by blending different types of test data, such as masked production data and synthetic data. In any case, K2tdm’s mastery of the synthetic data generation lifecycle is quite apparent.
The Bloor analyst concludes his description of K2tdm by detailing the various other test data management functionality that’s available, including reserving data for individual testers, versioning, performing rollbacks, and so on.
The platform’s ETL roots enable various value-added capabilities, including loading and moving test data from any environment to any environment – useful for QA teams performing continuous testing – and a wealth of data transformations are possible, including sophisticated masking techniques, like data aging and data enrichment.
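Data aging, one of the transformations mentioned above, can be illustrated with a small sketch: shift every date field by a fixed offset while preserving the relative spacing between dates, so time-dependent application logic still behaves. The field names and offset are assumptions for illustration, not K2view’s actual transformation API.

```python
from datetime import date, timedelta

def age_dates(records, fields, days):
    """Shift the named date fields by `days`, preserving the relative
    spacing between dates within each record."""
    aged = []
    for rec in records:
        rec = dict(rec)  # copy so the source records stay untouched
        for f in fields:
            rec[f] = rec[f] + timedelta(days=days)
        aged.append(rec)
    return aged

orders = [{"id": 1, "ordered": date(2023, 1, 10), "shipped": date(2023, 1, 12)}]
aged = age_dates(orders, ["ordered", "shipped"], days=365)
# The 2-day gap between order and shipment survives the aging, which is
# what lets tests of shipping-delay logic run against aged data.
```

Aging every date by the same offset is the simplest useful policy; more elaborate schemes (for example, aging only a subset of fields) follow the same pattern.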
Test data is exposed for access via a self-service web portal designed to isolate its users from the complexities of test data provisioning. This allows your testers, and potentially other users such as developers, to access your test data without worrying about what is going on under the hood. APIs are also provided, allowing you to integrate K2tdm into your existing CI/CD pipelines.
Bloor concludes: “K2tdm is a modern TDM solution that’s made a big impact on the space in recent years. In particular, the fact that it is a single, unified platform that provides the lion’s share of the test data functionality you could ask for (and plenty of more general functionality, besides, such as the K2catalog), all built on its core framework, makes for a compelling offering.
In addition, the organization of data into business entities and the corresponding creation and usage of Micro-Databases is a standout feature: few others in the testing space offer anything similar. We also appreciate the emphasis on self-service test data access evident in the product’s web portal user interface design.
Between its unified approach and its offering of data masking, subsetting and synthetic data generation techniques driven by entity-driven data modelling, K2tdm is more than capable of addressing a diverse array of test data management use cases, up to and including those found within complex, enterprise-level data environments. In short, it is a powerful test data management solution that is more than worth your consideration.”
Get the Bloor InBrief on K2view Test Data Management.