Seed Value
Set the Seed value to maintain referential integrity.
To maintain referential integrity, set the Seed option. The data is still masked, but in a way that produces the same masked values each time the data is output. For example, suppose that you are masking the Customer_ID value and want each ID to be randomized on output. You can use any combination of numbers and characters to create an identifiable seed value, such as Region9_Cust. The seed value itself is not output; it only ensures that the output data is consistent each time the flowgraph is run.
As an illustration, suppose that you are running a Numeric Variance with a Fixed Number and have set the Variance option to 5.
| Input data | Valid output range |
|---|---|
| 2550 | 2545-2555 |
| 3000 | 2995-3005 |
| 5500 | 5495-5505 |
| Output data after initial processing |
|---|
| 2552 |
| 3001 |
| 5505 |
| Output after the second run with the seed value set | Output after the second run without the seed value |
|---|---|
| 2552 | 2554 |
| 3001 | 2998 |
| 5505 | 5497 |
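The behavior in the tables above can be sketched in Python. This is only an illustration of the idea, not the product's actual masking algorithm: the function name `mask_numeric` and the use of HMAC-SHA256 to derive the offset from the seed are assumptions made for this example.

```python
import hashlib
import hmac
import random


def mask_numeric(value, variance, seed=None):
    """Shift value by a whole-number offset in [-variance, +variance].

    Hypothetical helper for illustration only.
    """
    if seed is None:
        # No seed: a fresh random offset is drawn on every call,
        # so repeated runs can produce different outputs.
        offset = random.randint(-variance, variance)
    else:
        # Seeded: the offset is derived from the seed and the input value,
        # so the same input always maps to the same output.
        digest = hmac.new(seed.encode(), str(value).encode(), hashlib.sha256).digest()
        offset = int.from_bytes(digest[:8], "big") % (2 * variance + 1) - variance
    return value + offset


if __name__ == "__main__":
    for customer_value in (2550, 3000, 5500):
        first_run = mask_numeric(customer_value, 5, seed="Region9_Cust")
        second_run = mask_numeric(customer_value, 5, seed="Region9_Cust")
        unseeded = mask_numeric(customer_value, 5)
        print(customer_value, first_run, second_run, unseeded)
```

In this sketch, the two seeded calls always return the same value for a given input, while the unseeded call stays within the valid range but can change from run to run.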
Example
Retain referential integrity using a seed value to keep the altered values the same when you run a job multiple times.
Date variance seed example: If you randomize the input value "June 10, 2016" by 5 days, the output will be a date between "June 5, 2016" and "June 15, 2016". If the output for the first run is "June 9, 2016", using the seed value will output the value "June 9, 2016" on all subsequent runs, so that you can be certain the data is consistent. Not using the seed value might return a value of "June 11, 2016" on the next run, and "June 7, 2016" on the following run.
Numeric variance seed example: If you randomize the input value "500" with a fixed value of 5, the output will be a number between 495 and 505. If the output for the first run is "499", using the seed value will output the value "499" on all subsequent runs, so that you can be certain the data is consistent. Not using the seed value might return a value of "503" on the next run, and "498" on the following run.
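The date variance example can be sketched the same way. As before, this is a hedged illustration rather than the product's implementation: `mask_date` is a hypothetical helper, and deriving the day offset from an HMAC-SHA256 of the seed and the input date is an assumption chosen only to make repeated runs reproducible.

```python
import hashlib
import hmac
import random
from datetime import date, timedelta


def mask_date(value, variance_days, seed=None):
    """Shift a date by a whole number of days in [-variance_days, +variance_days].

    Hypothetical helper for illustration only.
    """
    if seed is None:
        # No seed: the offset can differ on every run.
        offset = random.randint(-variance_days, variance_days)
    else:
        # Seeded: the offset depends only on the seed and the input date.
        digest = hmac.new(seed.encode(), value.isoformat().encode(), hashlib.sha256).digest()
        offset = int.from_bytes(digest[:8], "big") % (2 * variance_days + 1) - variance_days
    return value + timedelta(days=offset)


if __name__ == "__main__":
    original = date(2016, 6, 10)
    # Seeded runs print the same date each time.
    print(mask_date(original, 5, seed="Region9_Cust"))
    print(mask_date(original, 5, seed="Region9_Cust"))
    # An unseeded run stays between June 5 and June 15, 2016, but may vary.
    print(mask_date(original, 5))
```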
