Test data generation with RapidRep®

Generate your back-end test data with RapidRep: simple, sustainable, and highly flexible.

A central step in the testing process is testing with a large number of records. Not least for reasons of data protection, however, your test data should not be your production data. This is especially relevant in view of the EU-wide General Data Protection Regulation, which applies from May 2018, and the supplementary national “new Federal Data Protection Act” (“neue Bundesdatenschutzgesetz“) in Germany.1 Random, fictitious data is required. However, data generated with online generators or low-compliance software is often insufficient to cover your own use cases fully. Generating the necessary test data is therefore often time-consuming and costly, and complex data relations can complicate it further.

RapidRep offers an easy solution. Using RapidRep sets of rules, you create realistic, random back-end data that comply with your requirements and with the new data protection regulations, can be flexibly adapted, and can thus cover a large number of use cases. You are not limited to a small number of records, a single target system, or a single output format. Because RapidRep also includes a solution for data quality evaluation, you can check the quality of your test data at any time.

The procedure for creating test data with RapidRep essentially comprises two steps: the description of a data model for your test data and the creation of a RapidRep set of rules for test data generation. The rules must then be included in a RapidRep report definition to control the output of the data.2

Step 1: Identify the data model and target tables

The first question is what data you need for the test. The answer normally follows from your particular use case, and in the best case you already have a corresponding data model available. If not, identify a data model for your test data as the first step. Both the target tables and columns to be filled with test data and their relational dependencies can then be derived from it. (In the following example we look at output in spreadsheets, but RapidRep also supports other output formats, such as CSV.)

(Figure 1: Data model for the sample of test data generation)

In this example, the record "Customer" is the starting point on which the other records depend. Like an order, a customer should be uniquely identifiable: no customer may appear twice in the table "Customer", and exactly one customer must be assigned to each order, invoice and address. A customer can, on the other hand, have several addresses, e.g. a different delivery address for a particular order. The model shows that we want to create the tables "Address", "Customer", "Order", "Order_Position", "Invoice" and "Payment" for our data.
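
The relational constraints described above can be illustrated with a small sketch outside of RapidRep. It is not how RapidRep stores the model; the class and field names (`customer_id`, `address_id`, `order_id`) and the record counts per customer are assumptions made for illustration, and only three of the six tables are shown.

```python
import random
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: int  # assumed key column; must be unique


@dataclass
class Address:
    address_id: int
    customer_id: int  # exactly one customer per address


@dataclass
class Order:
    order_id: int
    customer_id: int  # exactly one customer per order


def generate(n_customers, rng):
    """Create customers first, then dependent records that reference them."""
    customers = [Customer(i) for i in range(1, n_customers + 1)]
    addresses, orders = [], []
    for customer in customers:
        # A customer may have several addresses (assumed here: 1 to 3).
        for _ in range(rng.randint(1, 3)):
            addresses.append(Address(len(addresses) + 1, customer.customer_id))
        # Assumed here: 0 to 4 orders per customer.
        for _ in range(rng.randint(0, 4)):
            orders.append(Order(len(orders) + 1, customer.customer_id))
    return customers, addresses, orders


def check_integrity(customers, addresses, orders):
    """True if customers are unique and every reference points to a customer."""
    ids = [c.customer_id for c in customers]
    known = set(ids)
    unique = len(ids) == len(known)
    refs_ok = all(rec.customer_id in known for rec in addresses + orders)
    return unique and refs_ok


customers, addresses, orders = generate(50, random.Random(1))
print(len(customers), check_integrity(customers, addresses, orders))
```

Generating the parent table first and deriving the dependent tables from it is one simple way to guarantee that every order and address refers to an existing customer.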

But how are the values that fill these target tables determined, and how can the resulting test data be made realistic (e.g. correspond to statistical frequencies)? To answer this, in the next step you transfer the data model into a RapidRep set of rules that contains all the rules needed to generate the corresponding test data.

Step 2: Create RapidRep set of rules for test data generation

For the sake of clarity and accessibility, RapidRep rules are created in Excel and have a flexible structure. They should be understandable to everyone involved in order to ensure the transparency and credibility of the results. More information about the properties of the rules and model-based testing with RapidRep can be found here: Model-based testing with the RapidRep Test Suite.

The set of rules for test data generation includes a worksheet that depicts the data model, a worksheet with raw data, and various worksheets with rules and with specifications for how they are used. The raw data are the lists of values that serve as the data source for the target tables and columns. Through the step-by-step evaluation of the rules, the initial (raw) data are transformed into the expected result, in this case the test data. The following graphic shows a selection of the rules used in our test data generation example.

(Figure 2: Detail Set of Rules; Rules for the sample for test data generation)

RapidRep rules have the following common properties:

  • Rules have the form: If (condition) -> Then (action).
  • Each rule is uniquely identifiable via the attribute RULE_ID (Figure 2, column 3).
  • With the attribute ORDER_ID the order of the evaluation of rules can be changed, if required (Figure 2, column 2; not relevant in our example).
  • Each rule performs a very specific task in a set of rules (functional aspect).
  • Each rule must have at least one attribute, which the rule engine can use to evaluate the If-condition.
  • Rules may have as many other attributes as needed.
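
As a rough illustration of these properties, the following sketch models a rule as a small data structure and evaluates a list of rules step by step. It is not RapidRep's rule engine: the `Rule` class, the `target` field, the status value "passed" and the assumption that a lower ORDER_ID is evaluated earlier are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class Rule:
    rule_id: int                         # unique identifier (RULE_ID)
    order_id: int                        # evaluation order (ORDER_ID)
    target: str                          # column the action writes to (assumed)
    condition: Callable[[Record], bool]  # the "If" part
    action: Callable[[Record], Any]      # the "Then" part


def apply_rules(rules: List[Rule], record: Record) -> Dict[int, str]:
    """Evaluate rules by ORDER_ID, then RULE_ID; return a status per rule."""
    status = {}
    for rule in sorted(rules, key=lambda r: (r.order_id, r.rule_id)):
        if rule.condition(record):
            record[rule.target] = rule.action(record)
            status[rule.rule_id] = "passed"
        else:
            status[rule.rule_id] = "failed"
    return status


rules = [
    Rule(1, 2, "Greeting", lambda r: r.get("Country") == "DE", lambda r: "Hallo"),
    Rule(2, 1, "Country", lambda r: True, lambda r: "DE"),
]
record: Record = {}
print(apply_rules(rules, record), record)
```

In this sketch the rule with RULE_ID 2 runs first because of its lower ORDER_ID, so the condition of rule 1 can already see the country value.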

In the stepwise evaluation of the rules in the example shown, the rule with RULE_ID 1 is used first. It refers to the table "Customer" (see Figure 2, column "Aspect") and fills the column "Country" (see Figure 2, column "Target"). The value that RapidRep writes to the column is determined by the function specified in the "Source" column. This function selects a value from the value list "Country" according to given probability weightings (see Figure 3 below). Where the values can be found and with which weighting they are selected is specified on other worksheets ("Import specifications", "Enumerations").

(Figure 3: Value list "Country" for the sample for test data generation)
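
Weighted selection from a value list can be sketched in a few lines of Python. The countries and weights below are invented for illustration and are not the values from Figure 3; the helper name `pick_weighted` is likewise an assumption and not a RapidRep function.

```python
import random
from collections import Counter

# Hypothetical value list: country code -> weight (relative frequency).
COUNTRY_WEIGHTS = {"DE": 50, "AT": 20, "CH": 20, "IT": 10}


def pick_weighted(weights, rng):
    """Pick one key at random, with probability proportional to its weight."""
    values = list(weights)
    return rng.choices(values, weights=[weights[v] for v in values], k=1)[0]


# A fixed seed makes the run repeatable; drop it for different data each time.
rng = random.Random(42)
sample = Counter(pick_weighted(COUNTRY_WEIGHTS, rng) for _ in range(10_000))
for country, count in sample.most_common():
    print(country, count)
```

With a large number of records, the observed frequencies approach the configured weights, which is what makes the generated data look statistically realistic.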

The subsequent rules in the example in Figure 2 refer to this selection and, based on it, determine the value of the "Language" column of the "Customer" table. If, for example, the value "IT" is selected for "Country", rules 2-5 are evaluated as "failed" because their pre-condition (see Figure 2, column 5, "Pre-Condition") is not fulfilled. Depending on the random percentage likelihood set in the condition (see Figure 2, column 6, "Condition"), the value "it", "de" or "fr" is then entered in the target column "Language". If the value for "Country" does not match any of the countries covered by the rules, rules 2-12 are counted as "failed" and the rule with RULE_ID 13 applies, which enters "en" as the value in the "Language" column.
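
The interplay of pre-condition, percentage condition and fallback rule can be sketched as follows. The rule table is abridged and hypothetical: the percentage bounds and the assignment of RULE_IDs to countries are assumptions, chosen only so that rules 2-5 fail for "IT" and RULE_ID 13 acts as the fallback, as described above.

```python
import random

# (RULE_ID, pre-condition country, upper bound of percentage range, language)
# All bounds are invented for illustration.
LANGUAGE_RULES = [
    (2, "DE", 100, "de"),
    (3, "CH", 65, "de"),
    (4, "CH", 90, "fr"),
    (5, "CH", 100, "it"),
    (6, "IT", 90, "it"),
    (7, "IT", 95, "de"),
    (8, "IT", 100, "fr"),
]
FALLBACK_RULE_ID, FALLBACK_LANGUAGE = 13, "en"


def language_for(country, rng):
    """Return (language, RULE_ID that applied, list of failed RULE_IDs)."""
    percent = rng.uniform(0, 100)  # one random draw per record
    failed = []
    for rule_id, pre_condition, upper, language in LANGUAGE_RULES:
        if pre_condition == country and percent <= upper:
            return language, rule_id, failed
        failed.append(rule_id)
    # No rule matched: the fallback rule supplies the default language.
    return FALLBACK_LANGUAGE, FALLBACK_RULE_ID, failed


rng = random.Random(7)
print(language_for("IT", rng))
print(language_for("XX", rng))
```

For an unknown country such as "XX", every listed rule fails and the fallback language "en" is returned.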

In this way, all columns of the target tables specified in our data model are filled with fictitious, random values, which are stored as lists in the worksheet "Raw data". The functions used, for example to select the country, are created in the RapidRep report definition, which interprets the rules; their use can be traced in the rules themselves. Once the set of rules and the report definition have been created, the test data can be randomly generated in RapidRep whenever needed, always according to fully user-defined requirements that can be adapted to any special case.

Conclusion: Test data generation with RapidRep

To make sure your processes and data processing work and comply with the new data protection regulations, you should use as many data records as possible in your tests. Of course, these test data must themselves comply with the provisions of the EU General Data Protection Regulation and the "new Federal Data Protection Act". Test data that is not only fictitious but also matches your use cases and correctly maps the relational dependencies is essential.
With RapidRep sets of rules you create test data that match your requirements exactly:

  • they are fictitious;
  • they are random;
  • they are comprehensible;
  • they can correspond to the frequency distributions of general and customer-specific statistics;
  • they can be adapted to individual circumstances (implementation of your own data model).

Further advantages:

  • The number of test data records is not limited to a few hundred;
  • the output format and the target system can be adjusted;
  • sets of rules can be easily copied and modified for other case scenarios;
  • the quality of test data can be directly tested with RapidRep’s integrated solutions.

You do not need any other test data generation software: you can generate your test data directly with RapidRep while taking advantage of all the other benefits of working with the RapidRep suites.

If you need support for creating your test data with RapidRep, please contact us and we will advise you!


1: See for example https://www.datenschutz-wiki.de/BDSG_2018 (in German)

2: These are typical work steps with RapidRep that go beyond the topic discussed here. Information is provided by other articles from this website, our forum and the RapidRep documentation.

Regulatory compliant IDP/EUC solutions with RapidRep®

With RapidRep, the software family by FINARIS, you can create and modify IDP/EUC solutions that meet the regulatory banking requirements for IT from the beginning.

Solutions for individual data processing (IDP) or end user computing (EUC) are firmly established in the IT landscape of many companies. Frequent disadvantages of IDP solutions, such as a lack of reproducibility or comprehensibility, are nevertheless a risk. This applies above all to the financial sector. For this reason, the current circular "Bankaufsichtliche Anforderungen an die IT" (BAIT; in English: "Banking Supervision Requirements for IT") contains specific requirements for IDP/EUC applications.

Existing IDP/EUC applications are often isolated applications that can only be used for specific workflows, have limited compatibility, and do not present results in a form that management can follow. They depend on individual persons, especially since documentation is often lacking as well, and they have, if any, only minimal security measures. The security aspect and the associated risk awareness in particular are the reason for the consultation 02/2017 on the banking supervisory requirements for the IT of banks. The requirements for individual data processing include, but are not limited to, comprehensibility, user authentication and appropriate versioning.
With RapidRep, the software manufactured by FINARIS, you create IDP/EUC solutions that meet these criteria. Applications that are based on RapidRep:
  • create comprehensible results by using sets of rules,
  • can reproduce results (also from earlier versions),
  • enable constructive dialog with management by integrating clear templates,
  • are person-independent and easy to transfer to line operations,
  • are developed with agile or conventional (waterfall) methods,
  • quickly return first results,
  • can be distributed to subsidiaries without installation if required (portable app),
  • separate development and operation through a smart release process,
  • meet very high demands on authentication and authorization,
  • have a simple and intuitive user interface,
  • integrate into the existing IT infrastructure (LDAP, Back Up, RDBMS),
  • can be flexibly adapted to new requirements,
  • offer version management and an optional four-eye principle.
Why is RapidRep so well adapted to the needs of financial IT? The combined IT and financial experience of FINARIS has proven itself for many years in projects at major banks and financial institutions, and the knowledge and best-practice approaches gained there flow continuously into the expansion of RapidRep. The software is used in many companies in the financial sector. By combining practical experience, IT knowledge and financial knowledge, RapidRep already covers the basic requirements for IDP/EUC and can be used in accordance with the regulatory guidelines. RapidRep-based IDP solutions are already used successfully in numerous regulatory projects (for example, EBA stress test, ECB stress test, EBA funding plan, ...) (see also: http://www.rapidrep.com/en/special-regulatory-requirements).
The BAIT (Banking Supervision Requirements for IT) are primarily aimed at the management of credit institutions and are intended to clarify the expectations of the supervisory authorities regarding IT security. The draft of the BAIT circular has been available since the spring of 2017 and can be downloaded from the BaFin website (see below). The circular was developed jointly by BaFin and the Bundesbank with the support of the expert committee IT, which, in addition to the supervisory authorities, includes representatives of various IT service providers, banking associations and academia. The BAIT requirements deal with eight main topics, including IT strategy, information security management and IT operations. The requirements for IDP/EUC applications are set out in particular in the chapter "IT projects, application development (incl. by end users in the specialist departments)".