GDPR Labels for Analytics Variables

Why Label Your Data?

Many Adobe customers have legal teams that have reviewed the GDPR and drawn their own conclusions about how data must be handled to comply with it. Because legal interpretations, data sets, and preferred data-handling settings differ from company to company, Adobe enables each customer, as the data controller, to customize how GDPR requests are processed for its own data. Each customer can therefore process GDPR requests in the way that makes the most sense for its brand and data set.

Adobe Analytics provides tools for labeling data according to its sensitivity and contractual restrictions. Labels help you (1) identify data subjects, (2) determine which data to return as part of an access request, and (3) identify data fields that must be deleted as part of a deletion request.

Before you can figure out which labels should be applied to which variables/fields, you need to understand the IDs that you are capturing in your Analytics data, and to decide which you will use for GDPR requests.

The Adobe Analytics GDPR implementation supports the following labels for identity data, sensitive data, and data governance.

DULE Labels

Note: The Data Usage Labeling & Enforcement (DULE) Framework is designed to provide a uniform way across all Adobe Solutions/Services/Platforms to capture, communicate, and use metadata about data across the Adobe Experience Cloud. The metadata helps data controllers indicate which data is personal information, which data is sensitive data, and what contract restrictions are associated with data. In this initial release, Analytics is exposing only the DULE labels that are relevant to GDPR. As other Adobe products implement support for DULE labels, future releases will introduce additional sensitive data labels, as well as contractual labels, which will help ensure that data shared between products is used only in legally permissible ways.

Identity Data Labels (DULE)

Identity data "I" labels are used to categorize data that can identify or contact a specific person.

Label Definition Other Requirements

I1

Directly identifiable: Data that can specifically identify or enable direct contact with an individual, such as a name or an email address.

  • Cannot be set on events
  • Cannot be set on Merchandising eVars

I2

Indirectly identifiable: Data that can be used in combination with any other data to identify or enable direct contact with an individual or device.

Does not allow identification of an individual by itself, but can be combined with other information (that may or may not be in your possession) to identify someone. Examples include a customer loyalty number, or an ID used by a company's CRM system that is unique for each of their customers.

  • Cannot be set on events
  • Cannot be set on Merchandising eVars

Sensitive Data Labels (DULE)

Sensitive data "S" labels are used to categorize sensitive data such as geographic data. Additional Sensitive Data labels will be introduced in the future to identify other types of sensitive information.

Label Definition

S1

Precise geo-location data related to latitude and longitude that can be used to determine the exact location of a device (within 100 meters or less).

S2

Geo-location data that can be used to determine a broadly defined geo-fence area.

Data Governance Labels (GDPR)

Data Governance labels let users classify data according to privacy-related considerations and contractual conditions, to help comply with regulations and corporate policies.

GDPR Access Labels

Label Definition Other Requirements

None

Select this option if this variable does not contain data that must be included in data returned to the data subject as part of a GDPR access request.

ACC-ALL

Values in this field should be included in all GDPR access requests.

If this hit came from a device shared by multiple individuals, by applying this label, you, as the data controller, are indicating that it is acceptable to share the data in this field with any individual who had access to the shared device.

Fields with this label will be returned for all GDPR requests.

ACC-PERSON

Values in this field should be included only for GDPR access requests when we are reasonably certain that the hit was from the data subject, as determined by a GDPR request ID matching an ID-PERSON field’s value.

You must also have an ID-PERSON label set on some variable within this report suite, and submit requests using that ID, or this label will never apply.

While few variables will receive any of the other labels, it is expected that access labels will be applied to many of your variables. However, it is up to you, in consultation with your Legal team, to decide which data you have collected should be shared with data subjects.
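
As a rough illustration of how the access labels interact, the following Python sketch decides which fields of a hit would be returned for an access request. The function name, the dictionary layout, and the matched_id_person flag are hypothetical and are not part of any Adobe API. The logic only mirrors the label definitions above.

```python
def fields_to_return(hit, access_labels, matched_id_person):
    """Return the fields of a hit to include in a GDPR access request.

    hit: dict of variable name -> value (hypothetical layout).
    access_labels: dict of variable name -> "ACC-ALL", "ACC-PERSON", or None.
    matched_id_person: True when the request ID matched an ID-PERSON field.
    """
    returned = {}
    for name, value in hit.items():
        label = access_labels.get(name)
        if label == "ACC-ALL":
            # ACC-ALL fields are returned for every access request.
            returned[name] = value
        elif label == "ACC-PERSON" and matched_id_person:
            # ACC-PERSON fields are returned only for person-matched hits.
            returned[name] = value
    return returned


hit = {"eVar1": "a@example.com", "prop2": "news", "eVar9": "internal"}
labels = {"eVar1": "ACC-PERSON", "prop2": "ACC-ALL", "eVar9": None}
print(fields_to_return(hit, labels, matched_id_person=False))
```

In this sketch, a request that matched only a device ID receives prop2 but not eVar1, and eVar9 (labeled None) is never returned.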

GDPR Delete Labels

Label Definition Other Requirements

Unlike the other labels, these Delete labels are not mutually exclusive. You can select either, both or none. A separate None label is not necessary, because None is indicated simply by not checking either of the Delete options.

A delete label is required only for fields that contain a value that would allow a hit to be associated with the data subject (i.e. that would allow identification of the data subject).

Other personal information (favorites, browsing/purchase history, health conditions, etc.) does not need to be deleted since the association with the data subject will be severed.

DEL-DEVICE

For GDPR delete requests, values in this field should be anonymized only for requests where a specified ID-DEVICE is present in the hit.

If the same value occurs on other hits, which are not being deleted, then those other instances will not be changed. This will result in the counts changing for reports which compute unique counts on this field. On shared devices, this may remove identifiers for other individuals, beyond just the data subject.

Counts do not change if this field also has an ID-DEVICE label and the value in this field was used as an ID for the GDPR request.

  • Also requires I1 or I2 or S1 label
  • Cannot be set on events
  • Cannot be set on Merchandising eVars
  • Cannot be set on Classifications
  • You must submit requests using an ID-DEVICE or set expandIDs to true, or this label will never apply.

DEL-PERSON

For GDPR delete requests, values in this field should be anonymized only for requests where a specified ID-PERSON is present in the hit.

If the same value occurs on other hits, which are not being deleted, then those other values will not be changed. This will result in the counts changing for reports which compute unique counts on this field. Counts will not change if this field also has an ID-PERSON label and the value in this field was used as an ID for the GDPR request.

  • Also requires I1 or I2 or S1 label
  • Cannot be set on events
  • Cannot be set on Merchandising eVars
  • Cannot be set on Classifications
  • You must also have an ID-PERSON label set on some variable within this report suite and submit requests using that ID, or this label will never apply.

GDPR Identity Labels

Label Definition Other Requirements

None

This variable does not contain an ID that will be used for GDPR requests.

You need to set one of these other labels only if this field contains an ID that you will use when submitting access or delete requests through the GDPR API or UI.

ID-DEVICE

This field contains an ID that can be used to identify a device for a GDPR request, but cannot distinguish between different users of a shared device.

You do not need to specify this label for all variables that contain IDs (that is what the I1/I2 labels are for). Use this label if you submit GDPR requests using IDs stored in this variable and want to search this variable for the specified ID.

  • Also requires I1 or I2 label
  • Cannot be set on events
  • Cannot be set on Merchandising eVars
  • Cannot be set on Classifications

ID-PERSON

This field contains an ID that can be used to identify an authenticated user (a specific person) for a GDPR request.

You do not need to specify this label for all variables that contain IDs (that is what the I1/I2 labels are for). Use this label if you will submit GDPR requests using IDs stored in this variable and want to search this variable for the specified ID.

  • Also requires I1 or I2 label
  • Cannot be set on events
  • Cannot be set on Merchandising eVars
  • Cannot be set on Classifications
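
The "Other Requirements" listed for the Delete and Identity labels can be checked mechanically before you save a labeling configuration. The following Python sketch is one way to do that, assuming a hypothetical representation of a report suite as a dictionary that maps variable names to sets of labels. It is not an Adobe tool, and it does not check rules that depend on how requests are submitted (such as using expandIDs).

```python
def check_labels(report_suite):
    """Return a list of problems found in a labeling configuration.

    report_suite: dict of variable name -> set of label strings
    (a hypothetical layout used only for this sketch).
    """
    problems = []
    has_id_person = any("ID-PERSON" in labels for labels in report_suite.values())
    for name, labels in sorted(report_suite.items()):
        # DEL-DEVICE and DEL-PERSON also require an I1, I2, or S1 label.
        if labels & {"DEL-DEVICE", "DEL-PERSON"} and not labels & {"I1", "I2", "S1"}:
            problems.append(f"{name}: DEL label requires I1, I2, or S1")
        # ID-DEVICE and ID-PERSON also require an I1 or I2 label.
        if labels & {"ID-DEVICE", "ID-PERSON"} and not labels & {"I1", "I2"}:
            problems.append(f"{name}: ID label requires I1 or I2")
        # ACC-PERSON and DEL-PERSON never apply without an ID-PERSON variable.
        if labels & {"ACC-PERSON", "DEL-PERSON"} and not has_id_person:
            problems.append(f"{name}: PERSON label needs an ID-PERSON variable in the report suite")
    return problems


suite = {
    "eVar1": {"I1", "ACC-PERSON", "DEL-PERSON"},
    "eVar2": {"I2", "ACC-PERSON", "DEL-PERSON", "ID-PERSON"},
}
print(check_labels(suite))
```

An empty list means that none of the three checked rules is violated.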

Provide a Namespace when Labeling a Variable as ID-DEVICE or ID-PERSON

When you label a variable as ID-DEVICE or ID-PERSON, you are prompted to provide a namespace. You can either use a previously defined namespace or define a new one.

Use a Previously Defined Namespace

If you have previously assigned an ID label to other variables in any of the report suites in your login company, you can select one of these existing namespaces. You should reuse the namespace if this variable contains the same type of IDs as other variables that are already labeled with this namespace and you want to search all of them when submitting a request.

  1. Click Select Namespace and select one of the existing namespaces.
  2. Click Apply.



Define a New Namespace

You can also define a new namespace. We recommend that namespace strings be limited to alphanumeric characters, plus the characters underscore, dash and space. They will be converted to all lower case.

  1. Click Select Namespace and type in the namespace title.
  2. Press Enter to add this namespace. Only now will the Apply button be activated.
  3. Click Apply.

The string you specify as the namespace is the same string you should use when submitting requests through the GDPR API as the value of the “namespace” parameter. The request will then cause Adobe Analytics to search all variables in all of your report suites that share this namespace for the ID you specified with the request.

You do not need to specify the ID-DEVICE or ID-PERSON labels on all variables that contain IDs (that is what the I1/I2 labels are for). Use this label if you will be submitting GDPR requests using IDs stored in this variable and want to search this variable for the specified ID. As an example, if eVar1 can contain an email address, and eVar2 can contain a login user name, but you will only ever submit requests using the user name, then you might label eVar1 as I1, ACC-PERSON, DEL-PERSON, but eVar2 as I2, ACC-PERSON, DEL-PERSON, ID-PERSON with namespace “user name”. You can then submit a request with a user section JSON block such as:

{
    "namespace": "user name",
    "type": "analytics",
    "value": "rocketman123"
}
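
If you build these blocks programmatically, a small helper can keep the namespace string consistent with the one defined in the labeling UI. The following Python sketch is illustrative only: the function name is hypothetical, rejecting characters outside the recommended set is a stricter choice than the recommendation above, "alphanumeric" is assumed to mean ASCII letters and digits, and lowercasing on the client side is an assumption that mirrors the conversion described above.

```python
import json
import re

# Recommended namespace characters: alphanumerics, underscore, dash, space.
_RECOMMENDED = re.compile(r"[A-Za-z0-9_\- ]+")


def build_user_id_block(namespace, value):
    """Build one ID block with the three keys shown in the example above."""
    if not _RECOMMENDED.fullmatch(namespace):
        raise ValueError(f"namespace {namespace!r} uses characters outside the recommended set")
    return {
        # Namespaces are stored in lower case, so send them that way too.
        "namespace": namespace.lower(),
        "type": "analytics",
        "value": value,
    }


print(json.dumps(build_user_id_block("User Name", "rocketman123"), indent=4))
```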

It is acceptable to use the same namespace for different variables within the same report suite. For example, some custom implementations store a CRM ID in both a prop and an eVar. If the CRM ID always occurs in one of them (such as the eVar), and occurs in the other (the prop) only when it is also in the eVar, then only the eVar requires an ID label and a namespace, because Adobe needs to search only that eVar for the ID. If, however, the CRM ID sometimes occurs in one variable and sometimes in the other, then both should have the same namespace, and Adobe will search both variables for the ID specified in a GDPR request with this namespace. In either case, you should still have DEL labels on all of these variables, so that the value is anonymized no matter where it occurs.

As another example, you might have a CRM ID that is sometimes sent in via eVar1 and sometimes sent in via prop7. You then have a processing rule that copies the value from eVar1, if it exists, into eVar3. Otherwise it copies the value from prop7 into eVar3. In this scenario, eVar3 will always contain the CRM ID if it is known, so only eVar3 requires an ID-PERSON label.
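
Processing rules are configured in Analytics rather than written as code, but the fallback logic of this example can be modeled in a few lines. The following Python sketch is only such a model. The function name and the representation of a hit as a dictionary are hypothetical.

```python
def consolidate_crm_id(hit):
    """Copy the CRM ID into eVar3: from eVar1 if it exists, otherwise from prop7."""
    result = dict(hit)  # leave the caller's hit untouched
    crm_id = hit.get("eVar1") or hit.get("prop7")
    if crm_id:
        result["eVar3"] = crm_id
    return result


print(consolidate_crm_id({"prop7": "crm-2"}))
```

Because eVar3 holds the CRM ID whenever either source variable does, it is the only variable that has to be searched for the ID.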

Variable Types and the GDPR/DULE Labels they support

GDPR/DULE labeling affects four broad classes of Analytics variables. Not all variables support all labels. This table shows which variables support or don't support which labels.

Custom Success Events, Merchandising eVars, Multi-valued variables (mvVars), Hierarchy variables

  • Supported labels: S1/S2; ACC-ALL, ACC-PERSON
  • Unsupported labels: I1/I2; ID-DEVICE, ID-PERSON; DEL-DEVICE, DEL-PERSON

Classifications

  • Supported labels: I1/I2, S1/S2; ACC-ALL, ACC-PERSON
  • Unsupported labels: ID-DEVICE, ID-PERSON; DEL-DEVICE, DEL-PERSON

Traffic variables (props), Commerce variables (non-merchandising eVars)

  • Supported labels: All labels
  • Unsupported labels: none

Most other variables (See table below for exceptions)

  • Supported labels: ACC-ALL, ACC-PERSON
  • Unsupported labels: I1/I2, S1/S2; ID-DEVICE, ID-PERSON; DEL-DEVICE, DEL-PERSON
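
The support matrix above can also be expressed as a lookup table, which is convenient if you review planned labels in a script before applying them. The following Python sketch assumes hypothetical variable-type names as dictionary keys and is not an Adobe API. "Most other variables" is left out because individual exceptions apply to that group.

```python
ID_AND_DELETE = {"ID-DEVICE", "ID-PERSON", "DEL-DEVICE", "DEL-PERSON"}

# Labels that each variable type does NOT support, per the table above.
UNSUPPORTED_LABELS = {
    "custom success event": {"I1", "I2"} | ID_AND_DELETE,
    "merchandising evar": {"I1", "I2"} | ID_AND_DELETE,
    "multi-valued variable": {"I1", "I2"} | ID_AND_DELETE,
    "hierarchy variable": {"I1", "I2"} | ID_AND_DELETE,
    "classification": set(ID_AND_DELETE),
    "traffic variable": set(),   # props support all labels
    "commerce variable": set(),  # non-merchandising eVars support all labels
}


def rejected_labels(variable_type, requested):
    """Return the requested labels that the variable type does not support."""
    return set(requested) & UNSUPPORTED_LABELS[variable_type]


print(rejected_labels("classification", {"I1", "DEL-PERSON"}))
```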

Variables to which Labels other than ACC-ALL/ACC-PERSON can be assigned/modified

Conversion Dimensions, Custom Traffic Dimensions

  • Variables: All, except classifications
    Modifiable labels: All

  • Variables: Classifications
    Modifiable labels: None / I1 / I2; None / S1 / S2

Conversion Events

  • Variables: All
    Modifiable labels: None / S1 / S2

Solution Dimensions and Events

  • Variables: Activity Map Link, Activity Map Page
    Modifiable labels: None / I1 / I2; None / DEL-DEVICE / DEL-PERSON
    Comment: Variables can contain URL parameters, which may include directly or indirectly identifiable data. If your implementation does not collect directly or indirectly identifiable data in these variables, then they don’t need Identity or deletion labels. Note that delete clears the URL parameters, but preserves the base URL.

Data Processing Dimensions

  • Variables: Custom Visitor ID
    Modifiable labels: ID-DEVICE/ID-PERSON; DEL-DEVICE/DEL-PERSON
    Comment: You cannot remove the ID or DEL labels (set to None), but you can change them to be either the DEVICE or PERSON variants, depending on your custom ID implementation. If you don’t use the custom visitor ID, then the setting does not matter.

Standard Dimensions, Data Processing Dimensions

  • Variables: IP Address, IP Address 2
    Modifiable labels: DEL-DEVICE/DEL-PERSON
    Comment: You cannot remove the DEL label, but you can change it to be either DEL-DEVICE or DEL-PERSON, or both.

  • Variables: ClickMap Action (Legacy), ClickMap Context (Legacy), Page, Page URL, Original Entry Page URL, Referrer, Visit Start Page URL
    Modifiable labels: None / I1 / I2; None / DEL-DEVICE / DEL-PERSON
    Comment: Variables can contain URL parameters, which may include directly or indirectly identifiable data. If your implementation does not collect directly or indirectly identifiable data in these variables, then they don’t need Identity or deletion labels. Note that delete clears the URL parameters, but preserves the base URL.

Deletion Handling

Adobe Analytics support for GDPR deletion requests is designed to minimize impacts to reporting. In most cases, the metrics displayed in reports should not change. A historical report that was run before GDPR deletion will match the same report run after deletion has been performed. This is accomplished by completely disassociating the deleted data from the data subject, while leaving non-identifiable data in place so that reported values remain consistent.

The following table describes how various variables are “deleted”. This is not a complete list.

Variables Deletion Method

  • Traffic Variables (props)
  • Commerce Variables (eVars)

Existing value is replaced with a new value of the form “GDPR-356396D55C4F9C7AB3FBB2F2FA223482”, where the 32-digit hexadecimal value after the “GDPR-” prefix is a cryptographically strong 128-bit pseudorandom number. Because the value is replaced by what is essentially a random string, there is no way to determine the original value from the new value, and no way to derive the new value from the original value.

For a given variable, if the identical value as that being replaced occurs within other hits that are also being deleted as part of the same GDPR request, all instances of that value will be replaced with the same new value.

If some instances of a value are replaced with one delete request, and a later request deletes other (new) instances of the original value, the new replacement value will be different from the original replacement value.
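
The replacement behavior described above (one random value per variable and original value within a request, and a fresh value for each later request) can be modeled as follows. This Python sketch is an illustration under stated assumptions, not Adobe's implementation: the class and method names are hypothetical, and Python's secrets module stands in for whatever cryptographically strong generator is actually used.

```python
import secrets


class DeleteRequest:
    """Tracks the replacement values issued for one GDPR delete request."""

    def __init__(self):
        # (variable, original value) -> replacement value
        self._replacements = {}

    def anonymize(self, variable, original):
        """Return the replacement for a value, reusing it within this request."""
        key = (variable, original)
        if key not in self._replacements:
            # 16 random bytes = 128 bits = 32 hexadecimal digits.
            self._replacements[key] = "GDPR-" + secrets.token_hex(16).upper()
        return self._replacements[key]


request = DeleteRequest()
first = request.anonymize("eVar2", "rocketman123")
second = request.anonymize("eVar2", "rocketman123")
print(first == second)  # same request, same variable, same value
```

Because nothing is derived from the original value, a later DeleteRequest instance produces an unrelated replacement for the same input.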

Purchase ID

Existing value is replaced by a new value of the form “G-7588FCD8642718EC50”, where the 18 hexadecimal digits after the “G-” prefix are the first 18 digits of a cryptographically strong 128-bit pseudorandom number. All comments that apply to deletion of traffic and commerce variables apply here as well.

The Purchase ID is a transaction ID whose main purpose is to make sure that a purchase is not credited twice, such as when someone refreshes their purchase confirmation page. The ID itself may tie the purchase to a row in your own DB where the purchase is recorded. In most cases it is not necessary to delete this ID, so it is not deleted by default. If you are still able to tie the purchase back to a user after the GDPR delete request of your own data, then you may need to delete this field, so that the Analytics data for this visitor cannot be tied back to the purchaser.

Visitor ID

Value is a 128-bit integer and is replaced with a cryptographically strong 128-bit pseudorandom value.

  • MCID
  • Custom Visitor ID
  • IP Address
  • IP Address 2

Value is cleared (set to either the empty string or 0 depending on the variable’s type).

  • ClickMap Action (Legacy)
  • ClickMap Context (Legacy)
  • Page
  • Page URL
  • Original Entry Page URL
  • Referrer
  • Visit Start Page URL

URL parameters are cleared/removed. If the value does not look like a URL, then the value is cleared (set to the empty string).
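
The effect of this deletion method can be approximated with Python's standard urllib.parse module. The sketch below is not Adobe's implementation. The function name is hypothetical, "looks like a URL" is assumed to mean "has a scheme and a host", and dropping the fragment along with the query string is also an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def clear_url_parameters(value):
    """Strip URL parameters but keep the base URL; clear non-URL values."""
    parts = urlsplit(value)
    if not (parts.scheme and parts.netloc):
        # Assumed test for "does not look like a URL": clear the whole value.
        return ""
    # Rebuild the URL without its query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


print(clear_url_parameters("https://example.com/cart?email=a%40b.com&id=7"))
print(repr(clear_url_parameters("Home Page")))
```

The first call keeps https://example.com/cart, while the second returns an empty string because the value has no scheme or host.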

  • Latitude
  • Longitude

Precision is reduced to no better than 1 km.

Variables that Don’t Support the Expected Delete Labels

This section clarifies how deletion applies to Analytics variables that don’t support deletion, or that don’t support the delete label you might expect. Sometimes, these variables are requested for deletion by non-Analytics users (such as the legal team) who do not understand the type of data contained in the variable and make incorrect assumptions based on the name of the variable. Here is a list of some of these variables and why they don’t require deletion, or why they don’t require a specific deletion label.

Variable Comments

New Visitor ID

New Visitor ID is a Boolean that is true the first time a given visitor ID is seen. There is no need to delete it once the visitor ID is anonymized. After anonymization, it will correspond to the first time the anonymized ID was seen.

Zip Code

Geo Zip Code

Zip codes are set only for hits originating in the USA. They are not set for hits coming from the EU. Even when set, they only provide a broad geographic area that makes re-identification of the data subject difficult.

Geo Latitude

Geo Longitude

These provide a rough location derived from the IP address. The accuracy is generally similar to that of a zip code, within a few dozen kilometers of the actual location.

User Agent

The User Agent identifies the version of the browser that was used.

User ID

Specifies the Analytics report suite (as a number) containing the data.

Report Suite ID

Specifies the name of the Analytics report suite containing the data.

Visitor ID

MCID / ECID

These have a DEL-DEVICE label, but the DEL-PERSON label cannot be added. If you specify ID Expansion with each request, then these IDs will automatically be deleted for all delete requests, even those using an ID-PERSON.

If you do not use ID Expansion, but want these cookie IDs anonymized on hits that contain a matching ID in a prop or eVar, you can work around this labeling limitation by labeling the prop or eVar with an ID-DEVICE label, even if it really identifies a person (all DEL-PERSON labels would also need to be changed to DEL-DEVICE labels). In this case, since only some instances of the visitor ID or ECID are being anonymized, unique visitor counts will change in historical reporting.

AMO ID

The Adobe Media Optimizer ID is a solution variable that has an unmodifiable DEL-DEVICE label. It is populated from a cookie just as the Visitor ID and MCID are. It should be deleted from hits whenever those other IDs are deleted. See the description for those variables for more details.

Date Fields for Access Requests

There are five standard variables that contain timestamps:

Time Stamp Definition

Hit Time UTC

The time that Adobe Analytics received the hit.

Custom Hit Time UTC

Time that the hit occurred, which for some mobile apps and other implementations may be earlier than the time it was received. For example, if a network connection was not available when it occurred, the app may hold the hit and send it in when a connection becomes available.

Date Time

Same value as Custom Hit Time UTC, but in the time zone of the report suite rather than UTC. (Due to a bug, this value is not currently visible in the labeling UI, and always has a value of ACC-ALL. This should be fixed in July 2018.)

First Hit Time GMT

The Custom Hit Time UTC value for the first hit received for the visitor ID value for this hit.

Visit Start Time UTC

The Custom Hit Time UTC value for the first hit received for the current visit for this visitor ID. (Due to a bug, this variable is displayed in the UI as First Visit Start Time UTC).

The code for generating the files returned for GDPR access requests requires that at least one of the first three timestamp variables be included in the access request (have an ACC label that applies to the type of request). If none of these are included, then Custom Hit Time UTC will be treated as if it has an ACC-ALL label.
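
The fallback rule in the previous paragraph can be sketched as follows. This Python example is a model of the described behavior only. The function name, the dictionary layout, and the idea of passing in the set of ACC labels that apply to the request are hypothetical.

```python
# The first three timestamp variables listed in the table above.
FIRST_THREE = ("Hit Time UTC", "Custom Hit Time UTC", "Date Time")


def effective_access_labels(access_labels, applicable):
    """Apply the timestamp fallback to a set of access labels.

    access_labels: dict of variable name -> "ACC-ALL", "ACC-PERSON", or None.
    applicable: set of ACC labels that apply to this type of request.
    """
    labels = dict(access_labels)
    if not any(labels.get(name) in applicable for name in FIRST_THREE):
        # None of the three would be returned, so Custom Hit Time UTC
        # is treated as if it had an ACC-ALL label.
        labels["Custom Hit Time UTC"] = "ACC-ALL"
    return labels


configured = {"Hit Time UTC": "ACC-PERSON", "Custom Hit Time UTC": None}
print(effective_access_labels(configured, {"ACC-ALL"}))
```

Here the only labeled timestamp is ACC-PERSON, which does not apply to a request that matches only ACC-ALL fields, so the fallback is triggered.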

The hit-level CSV file returned for GDPR access requests will convert the values in these fields from Unix timestamps to date/time fields of the format YYYY-MM-DD HH:MM:SS (for example, 2018-05-01 13:49:22). In the summary HTML file, these timestamp values will be truncated to include only the date, YYYY-MM-DD, to reduce the number of unique values that occur for these fields.
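
If you need to compare raw timestamp values with the returned files, the two formats can be reproduced with Python's standard datetime module. The following sketch uses hypothetical function names and assumes that values are rendered in UTC, which the text above does not state explicitly.

```python
from datetime import datetime, timezone


def format_hit_time(unix_seconds):
    """Format a Unix timestamp as YYYY-MM-DD HH:MM:SS (UTC assumed)."""
    moment = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def format_summary_date(unix_seconds):
    """Truncate the formatted value to the date, as in the summary file."""
    return format_hit_time(unix_seconds)[:10]


print(format_hit_time(1525182562))      # 2018-05-01 13:49:22
print(format_summary_date(1525182562))  # 2018-05-01
```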