Data Mask
Protect personally identifiable or other sensitive information by masking all or part of the data.
Examples of personal and sensitive data include credit card numbers, birth dates, tax identification numbers, salary information, medical identification numbers, and bank account numbers. Use data masking to support security and privacy policies, and to protect your customer or employee data from possible theft or exploitation.
Place the Data Mask operator toward the end of your graph so that every column to be masked has already been processed by upstream operators. If you place Data Mask before other operators, those downstream operators receive the masked data rather than the actual data. In some cases they cannot process the columns at all, for example when Data Mask has replaced the input data with blanks or a masking character such as “#”.
Configuration Parameters
Parameter | Type | Description |
---|---|---|
Label | String | Mandatory. Enter the name of the Data Mask operator. |
Seed | String | Optional. A string of letters, digits, or both. Set this option to mask the data in a way that produces consistent output values each time the data is processed. One seed value maintains referential integrity for the following variance types set up in the Data Mask operator: Number Variance, Date Variance, and Pattern Variance. |
Date Format | String | Required. Specifies the order in which month, day, and year elements appear in the input string. This value is used only when the day, month, or year in the input string is ambiguous. |
Month Format | String | Required. Specifies the format in which the randomized month is output when the software cannot determine the output month format based on the input alone. |
Language | String | Required. Specifies the language that the software should use when determining the output of an ambiguous input month string. |
Century Threshold | String | Optional. Indicates whether a two-digit year is considered part of the 20th or 21st century. Enter a value from 0 to 99. For example, when set to 25, two-digit years from 00 to 25 result in the years 2000-2025, and two-digit years from 26 to 99 result in the years 1926-1999. |
Column Definitions | | You can define the mask operation on one or more columns. Each column has its own definition. Click the Open Editor icon, and then click +Add item and complete the following options: |
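The Century Threshold rule described in the table above can be illustrated with a short Python sketch. The function name `expand_two_digit_year` is hypothetical and is not part of the operator; the sketch only mirrors the documented example (threshold 25 maps 00-25 to 2000-2025 and 26-99 to 1926-1999).

```python
def expand_two_digit_year(two_digit_year: int, century_threshold: int = 25) -> int:
    """Expand a two-digit year using a century threshold between 0 and 99.

    Hypothetical helper: it reproduces the documented rule, not the
    operator's internal implementation.
    """
    if not 0 <= two_digit_year <= 99:
        raise ValueError("two_digit_year must be between 0 and 99")
    if not 0 <= century_threshold <= 99:
        raise ValueError("century_threshold must be between 0 and 99")
    # Years up to and including the threshold belong to the 2000s,
    # all remaining years belong to the 1900s.
    century = 2000 if two_digit_year <= century_threshold else 1900
    return century + two_digit_year


print(expand_two_digit_year(25))  # 2025
print(expand_two_digit_year(26))  # 1926
```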
Mask Options
Parameter | Type | Description |
---|---|---|
Starting Position | String | Required. Specifies whether masking should start at the beginning or end of the value. |
Unmasked Length | String | Required. Specifies the number of characters at the beginning or end of the value that should not be masked. |
Masking Character | String | Required. The character or number that replaces the characters in the input data, for example, "#" or "*". |
Maintain Formatting | String | Required. |
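As a rough illustration of how Starting Position, Unmasked Length, and Masking Character interact, consider the following Python sketch. The function `mask_value` is hypothetical. It assumes that masking begins at the chosen end of the value and that the unmasked characters remain at the opposite end; the operator may interpret these options differently. Maintain Formatting is not modeled.

```python
def mask_value(value: str, unmasked_length: int,
               masking_char: str = "#", start: str = "beginning") -> str:
    """Mask all but `unmasked_length` characters of `value`.

    Assumption: start="beginning" masks from the left and leaves the last
    characters readable; start="end" masks from the right and leaves the
    first characters readable.
    """
    if len(masking_char) != 1:
        raise ValueError("masking_char must be a single character")
    if unmasked_length < 0:
        raise ValueError("unmasked_length must not be negative")
    if start not in ("beginning", "end"):
        raise ValueError("start must be 'beginning' or 'end'")

    keep = min(unmasked_length, len(value))
    masked_count = len(value) - keep
    if start == "beginning":
        # Mask from the left; the trailing `keep` characters stay readable.
        return masking_char * masked_count + value[masked_count:]
    # Mask from the right; the leading `keep` characters stay readable.
    return value[:keep] + masking_char * masked_count


print(mask_value("4111111111111111", 4))            # ############1111
print(mask_value("4111111111111111", 4, "*", "end"))  # 4111************
```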
Pattern Variance Options
Parameter | Type | Description |
---|---|---|
Variance Type | String | Required. Choose one of the options. |
Starting Position | Integer | Required. A positive integer that indicates the character number where masking should start. Alpha, numeric, space, and other printable characters are included in the position count. |
Length | Integer | Required. A positive integer that indicates the number of positions (characters) to mask. |
Value | String | Required. When parsing the string for a value, it trims leading and trailing spaces from the value. You can maintain the spaces by surrounding the value in framing characters such as double quotes. For example, STRING, "John,Smith". If the value has double quotes, you can escape the double quotes using a backslash. For example, "\"Slim\"". If the value with framing characters has a backslash in it, the backslash can be escaped with an additional backslash. For example, "\\path". |
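The trimming and escaping rules for Value can be sketched in Python as follows. The parser `parse_values` is hypothetical and assumes that multiple values are separated by commas, as the `STRING, "John,Smith"` example suggests. It is an illustration of the rules described above, not the operator's actual parser.

```python
from typing import List


def parse_values(raw: str) -> List[str]:
    """Split a comma-separated value list (assumed format).

    Unframed values are trimmed of leading and trailing spaces. Values
    framed in double quotes keep their spaces and commas; inside them a
    backslash escapes the next character.
    """
    values: List[str] = []
    i, n = 0, len(raw)
    while i < n:
        # Skip spaces in front of the next value.
        while i < n and raw[i] == " ":
            i += 1
        if i >= n:
            break
        if raw[i] == '"':
            i += 1  # step over the opening quote
            buf = []
            while i < n and raw[i] != '"':
                if raw[i] == "\\" and i + 1 < n:
                    i += 1  # drop the backslash, keep the escaped character
                buf.append(raw[i])
                i += 1
            if i >= n:
                raise ValueError("unterminated framed value")
            i += 1  # step over the closing quote
            while i < n and raw[i] == " ":
                i += 1
            if i < n and raw[i] != ",":
                raise ValueError("unexpected text after framed value")
            values.append("".join(buf))
        else:
            start = i
            while i < n and raw[i] != ",":
                i += 1
            values.append(raw[start:i].strip(" "))
        i += 1  # step over the separating comma
    return values


print(parse_values('STRING, "John,Smith"'))  # ['STRING', 'John,Smith']
```

With this sketch, the framed input `"\"Slim\""` yields the value `"Slim"` including its double quotes, and `"\\path"` yields `\path`.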
Numeric Variance Options
Parameter | Type | Description |
---|---|---|
Numeric Variance Type | String | Required. Define how you want to vary a number. |
Numeric Variance | Number | Required. Determines the number by which to randomize the input. Enter a value greater than zero. |
Minimum Value | Number | Required. Enter the lowest value that can be output as a whole number or decimal. Negative decimal numbers are supported. For best results, set a realistic minimum value. |
Maximum Value | Number | Required. Enter the highest value that can be output as a whole number or decimal. Negative decimal numbers are supported. For best results, set a realistic maximum value. |
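To make the relationship between Numeric Variance, Minimum Value, Maximum Value, and Seed more concrete, here is a minimal Python sketch. It assumes one possible interpretation, in which the input is shifted by a random amount of at most the variance and then clamped to the allowed range. The function `vary_number` and the way the seed is combined with the input are assumptions, not the operator's algorithm.

```python
import random
from typing import Optional


def vary_number(value: float, variance: float, minimum: float,
                maximum: float, seed: Optional[str] = None) -> float:
    """Randomize `value` by at most +/- `variance`, clamped to [minimum, maximum]."""
    if variance <= 0:
        raise ValueError("variance must be greater than zero")
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    # Assumption: deriving the generator from seed and input gives the same
    # output for the same input on every run, which is what a seed is for.
    rng = random.Random(f"{seed}:{value!r}") if seed is not None else random.Random()
    candidate = value + rng.uniform(-variance, variance)
    # Keep the result inside the configured output range.
    return min(max(candidate, minimum), maximum)


masked_salary = vary_number(50000.0, 1000.0, 0.0, 1_000_000.0, seed="abc123")
print(masked_salary)
```

Calling `vary_number` twice with the same seed and input returns the same masked value, while omitting the seed gives a different value on each call.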
Numeric Generalization Options
Parameter | Type | Description |
---|---|---|
Minimum Value | Integer | Enter the lowest acceptable value in the range. |
Minimum Value Inclusive | String | Required. Select True when you want to include the minimum value. Select False when you do not want to include the minimum value in the results. For example, if you set the minimum value to 30, then 30 is included in the results when True is selected. |
Maximum Value | Integer | Enter the highest acceptable value in the range. |
Maximum Value Inclusive | String | Required. Select True when you want to include the maximum value. Select False when you do not want to include the maximum value in the results. For example, if you set the maximum value to 50, then numbers through 49 are included in the results when False is selected. |
Replacement Value | String | Optional. Enter a value to describe the group. |
Default Replacement Value | String | Optional. Value to output when the input value does not fall into any of the defined ranges. For example, you might want to label those records as Exceptions. |
Numeric Generalization Example
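The following Python sketch shows how the range options above could combine: each range has a minimum, a maximum, two inclusive flags, and a replacement value, and inputs outside every range receive the default replacement value. The names `NumericRange` and `generalize`, as well as the sample ranges and labels, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NumericRange:
    minimum: float
    maximum: float
    replacement: str
    min_inclusive: bool = True
    max_inclusive: bool = True

    def contains(self, value: float) -> bool:
        # The inclusive flags decide whether each boundary belongs to the range.
        above = value >= self.minimum if self.min_inclusive else value > self.minimum
        below = value <= self.maximum if self.max_inclusive else value < self.maximum
        return above and below


def generalize(value: float, ranges: List[NumericRange],
               default: str = "Exceptions") -> str:
    """Return the replacement value of the first matching range."""
    for numeric_range in ranges:
        if numeric_range.contains(value):
            return numeric_range.replacement
    return default


age_ranges = [
    NumericRange(30, 50, "30-49", min_inclusive=True, max_inclusive=False),
    NumericRange(50, 70, "50-69", min_inclusive=True, max_inclusive=False),
]

print(generalize(30, age_ranges))  # 30-49
print(generalize(50, age_ranges))  # 50-69
print(generalize(29, age_ranges))  # Exceptions
```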
Date Variance Options
Parameter | Type | Description |
---|---|---|
Date Variance Type | String | Specifies how you want to vary a date. |
Date Variance | Number | Required. Determines the number of days, months, or years by which to randomize the input. The value must be greater than zero. |
Minimum Date | String | Required for Range; optional for other types. Specify the minimum date allowed on output. |
Maximum Date | String | Required for Range; optional for other types. Specify the maximum date allowed on output. |
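A date variance measured in days can be sketched in Python as shown below. The function `vary_date` is hypothetical. It assumes that the input date is shifted by a random whole number of days within the variance and then limited to the optional minimum and maximum dates; month and year variances are not modeled.

```python
import random
from datetime import date, timedelta
from typing import Optional


def vary_date(value: date, variance_days: int, seed: str,
              minimum: Optional[date] = None,
              maximum: Optional[date] = None) -> date:
    """Shift `value` by up to +/- `variance_days`, respecting optional bounds."""
    if variance_days <= 0:
        raise ValueError("variance_days must be greater than zero")
    # Assumption: seed plus input date determines the shift, so repeated
    # runs mask the same date in the same way.
    rng = random.Random(f"{seed}:{value.isoformat()}")
    shifted = value + timedelta(days=rng.randint(-variance_days, variance_days))
    if minimum is not None and shifted < minimum:
        shifted = minimum
    if maximum is not None and shifted > maximum:
        shifted = maximum
    return shifted


print(vary_date(date(1990, 5, 17), 30, seed="abc123"))
```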
Date Generalization Options
Parameter | Type | Description |
---|---|---|
Auto Range Scale | String | Required. Defines the scale on which to base the auto range. |
Minimum Date | String | Enter the lowest acceptable date in the range. |
Minimum Date Inclusive | String | Required. Select True when you want to include the minimum date. Select False when you do not want to include the minimum date in the results. For example, if you set the minimum date to 12/31/2020, then 12/31/2020 is included in the results when True is selected. |
Maximum Date | String | Enter the highest acceptable date in the range. |
Maximum Date Inclusive | String | Required. Select True when you want to include the maximum date. Select False when you do not want to include the maximum date in the results. For example, if you set the maximum date to 06/30/2020, then dates through 06/29/2020 are included in the results when False is selected. |
Replacement Value | String | Required. Enter a value to describe the group. |
Default Replacement Value | String | Optional. Value to output when the input value does not fall into any of the defined ranges. |
Auto Range Duration | Integer | Required. Number of years or months to include in the range. |
Auto Range Start Date | String | Required. Starting date in auto range. |
Auto Range End Date | String | Required. Ending date in auto range. |
Auto Range Output Format | String | Required. Determines the format of the output Auto Range Replacement Value. |
Auto Range Year Format | String | Required. Specifies the number of digits to use for the year. Full Year outputs a four-digit number, for example, 2018. Short Year outputs a two-digit number, for example, 18. |
Auto Range Month Format | String | Required. Determines the month format to use in the Auto Range Replacement Value. Full Text outputs the month name, for example, January. Short Text outputs the abbreviated month name, for example, Jan. Numeric outputs the number of the month, for example, 1 for January. |
Auto Range Date Delimiter | String | Required. Determines the delimiter to use in the Auto Range Replacement Value. |
Auto Range Numeric Format | String | Optional. Determines the numeric format to use in the Auto Range Replacement Value. |
Auto Range Enable Zero Pad | String | Optional. Pad a one-digit number with zero when the format includes the month and day. For example, 1/5/2018 changes to 01/05/2018 when set to True. |
Auto Range Output Language | String | Optional. Determines the language to use in the Auto Range Replacement Value. This setting is applicable when the Month Format is set to Short Text or Full Text. |
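The effect of the year format, month format, delimiter, and zero-pad options on a single date can be sketched in Python. The function `format_range_date` is hypothetical. It assumes a month, day, year order (taken from the 1/5/2018 example) and English month names, and it does not reproduce how the operator assembles a complete Auto Range Replacement Value.

```python
from datetime import date

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]


def format_range_date(d: date, year_format: str = "Full Year",
                      month_format: str = "Numeric", delimiter: str = "/",
                      zero_pad: bool = False) -> str:
    """Format one date using the option names from the table above."""
    if year_format == "Full Year":
        year = f"{d.year:04d}"
    elif year_format == "Short Year":
        year = f"{d.year % 100:02d}"
    else:
        raise ValueError("year_format must be 'Full Year' or 'Short Year'")

    if month_format == "Full Text":
        month = MONTH_NAMES[d.month - 1]
    elif month_format == "Short Text":
        # Assumption: the abbreviation is the first three letters.
        month = MONTH_NAMES[d.month - 1][:3]
    elif month_format == "Numeric":
        month = f"{d.month:02d}" if zero_pad else str(d.month)
    else:
        raise ValueError("unknown month_format")

    day = f"{d.day:02d}" if zero_pad else str(d.day)
    return delimiter.join([month, day, year])


print(format_range_date(date(2018, 1, 5)))                 # 1/5/2018
print(format_range_date(date(2018, 1, 5), zero_pad=True))  # 01/05/2018
```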
Input
Input | Type | Description |
---|---|---|
Input | Message | The input is expected to be in JSON format. |
Output
Output | Type | Description |
---|---|---|
Output | BLOB | The output is in JSON format. |