Special Characters

Information about special characters used in the Clickstream data feed.

Special characters in the hit_data file

The following characters have a special meaning in the hit_data file:

| Character | Meaning | Description |
| --- | --- | --- |
| \t (tab character) | End of column | Marks the end of a data field. |
| \n (newline character) | End of row | Marks the end of a data row. |
| \ (backslash character) | Escape character | Escapes tab, newline, and backslash when the character was part of the value sent during data collection. |

When one of these special characters is preceded by a backslash, it represents a literal character that was part of the collected value rather than a delimiter.

| Character | Meaning | Description |
| --- | --- | --- |
| \\t | Tab | Literal tab character. This character was part of the value sent in during data collection. |
| \\n | Newline | Literal newline. This character was part of the value sent in during data collection. |
| \\ | Backslash | Literal backslash character. This character was part of the value sent in during data collection. |

Special characters in multi-valued variables (event_list, product_list, mvvars)

The following characters have a special meaning in multi-valued variables:

| Character | Meaning | Description |
| --- | --- | --- |
| , (comma character) | End of value | Separates product strings, event IDs, or other values in multi-valued variables. |
| ; (semicolon character) | End of sub-value within an individual product value | Separates values associated with an individual product in the product_list. |
| = (equals character) | Value assignment | Assigns a value to an event in the event_list. |

When one of these special characters is preceded by a caret, it represents a literal character. A literal caret is itself escaped with a second caret.

| Character | Meaning | Description |
| --- | --- | --- |
| ^, | Comma | Literal comma character. This character was part of the value sent in during data collection. |
| ^; | Semicolon | Literal semicolon character. This character was part of the value sent in during data collection. |
| ^= | Equals | Literal equals character. This character was part of the value sent in during data collection. |
| ^^ | Caret | Literal caret character. This character was part of the value sent in during data collection. |
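
The caret rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the data feed tooling: the function names (split_on, unescape, parse_product_list, parse_event_list) and the sample strings are hypothetical, and the sketch assumes the column value has already been extracted from the hit_data row.

```python
def split_on(text, delimiter, escape="^"):
    """Split text on unescaped delimiters; escape pairs are kept intact."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == escape and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the pair for later decoding
            i += 2
        elif ch == delimiter:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


def unescape(text, escape="^"):
    """Drop each escape character and keep the character that follows it."""
    out, i = [], 0
    while i < len(text):
        if text[i] == escape and i + 1 < len(text):
            out.append(text[i + 1])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def parse_product_list(raw):
    """Return one list of decoded sub-values per product."""
    return [[unescape(sub) for sub in split_on(product, ";")]
            for product in split_on(raw, ",")]


def parse_event_list(raw):
    """Return (event_id, value) pairs; value is None when no '=' is present."""
    events = []
    for item in split_on(raw, ","):
        pieces = split_on(item, "=")
        event_id = unescape(pieces[0])
        value = unescape("=".join(pieces[1:])) if len(pieces) > 1 else None
        events.append((event_id, value))
    return events


# Made-up sample values, for illustration only.
products = parse_product_list("Shoes;Sneaker^, red;1;49.99,Books;A^;B;2;20")
events = parse_event_list("1,200=5,201")
```

Splitting happens before unescaping on purpose: if the carets were removed first, an escaped comma or semicolon could no longer be told apart from a real delimiter.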

Sample workflow

If some columns in your data feed contain user-submitted data, check for special characters before you separate the data by column or row with split, readLine, or a similar function.

Consider the following data:

| Browser Width | Browser Height | eVar1 | prop1 |
| --- | --- | --- | --- |
| 1680 | 1050 | search\nstring | en |
| 800 | 600 | search\tstring | en |

During export, the newline and tab characters in the eVar1 values are escaped. The data feed for these rows appears as follows:

1680\t1050\tsearch\\nstring\ten\n
800\t600\tsearch\\tstring\ten\n
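
To make the escaping concrete, the following Python sketch rebuilds the two rows above from their raw values. The export itself is performed by the data feed; escape_field and build_row are hypothetical helpers that only mirror the rules described on this page.

```python
def escape_field(value):
    """Backslash-escape backslash, tab, and newline inside one field value."""
    out = []
    for ch in value:
        if ch in ("\\", "\t", "\n"):
            out.append("\\")  # the escape marker goes before the special character
        out.append(ch)
    return "".join(out)


def build_row(values):
    """Join escaped fields with tabs and terminate the row with a newline."""
    return "\t".join(escape_field(v) for v in values) + "\n"


# The eVar1 values contain a real newline and a real tab character.
rows = [
    ["1680", "1050", "search\nstring", "en"],
    ["800", "600", "search\tstring", "en"],
]
feed = "".join(build_row(r) for r in rows)
```

Note that the escaped newline in the output is still a real newline character preceded by a backslash, which is why a line-based reader cuts the row short.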

Calling readLine() on the first row returns the following partial string:

1680\t1050\tsearch\

Calling split("\t") on the second row returns the following string array:

800
600
search\
string
en

To avoid this, use a solution similar to the following:

  1. Starting at the beginning of the file, read until you locate a tab, newline, backslash, or caret character.
  2. Perform an action based on the special character encountered:
    • Tab - insert the string up to that point into a data store cell and continue.
    • Newline - complete the data store row.
    • Backslash - read the next character, insert the appropriate string literal, and continue.
    • Caret - read the next character, insert the appropriate string literal, and continue.
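
The steps above can be sketched as a small character-by-character reader in Python. This is one possible approach under stated assumptions, not a reference implementation: read_rows is a hypothetical name, the data store is a plain list of lists, and the sketch deliberately leaves caret sequences untouched so that multi-valued columns can still be split on their own delimiters in a later pass.

```python
import io
from typing import Iterator, List, TextIO


def read_rows(stream: TextIO) -> Iterator[List[str]]:
    """Yield decoded rows from a hit_data-style stream, one character at a time."""
    field: List[str] = []   # characters of the cell being built
    row: List[str] = []     # completed cells of the current row
    while True:
        ch = stream.read(1)
        if ch == "":                      # end of input
            if field or row:              # flush a final row with no trailing newline
                row.append("".join(field))
                yield row
            return
        if ch == "\\":
            nxt = stream.read(1)          # the escaped character is taken literally
            field.append(nxt if nxt else "\\")
        elif ch == "\t":
            row.append("".join(field))    # tab: store the cell and continue
            field = []
        elif ch == "\n":
            row.append("".join(field))    # newline: complete the row
            yield row
            field, row = [], []
        else:
            field.append(ch)              # ordinary character (carets included)


# The two sample rows as they appear in the feed (backslash before the
# literal newline and the literal tab).
feed = "1680\t1050\tsearch\\\nstring\ten\n800\t600\tsearch\\\tstring\ten\n"
rows = list(read_rows(io.StringIO(feed)))
```

With the sample input, both rows come back with four cells each and the eVar1 values contain their original newline and tab. When reading from a real file, opening it with newline="" keeps Python from translating line endings before the reader sees them.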