Content Analysis Functionality

Detailed information about how Social's machine-learning software classifies incoming text into sentiment-bearing data, foreign-language data, or spam categories.

Sentiment analysis is performed on all three categories, provided that the sentiment engine supports the text's language. The main difference between the categories is that, by default, Social hides content that it has determined to be spam. You can use the filter options in reports and moderation feeds to display or hide spam as needed.
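As a rough illustration of the default behavior described above, the following Python sketch hides spam unless a filter option is switched on. The `Post` class, the category labels, and the `show_spam` flag are hypothetical names chosen for this example; they are not part of Social's interface.

```python
from dataclasses import dataclass


@dataclass
class Post:
    text: str
    # Hypothetical labels: "sentiment", "foreign", or "spam".
    category: str


def visible_posts(posts, show_spam=False):
    """Return the posts to display; spam is hidden unless show_spam is True."""
    return [p for p in posts if show_spam or p.category != "spam"]


posts = [
    Post("I love Adobe", "sentiment"),
    Post("Me encanta Adobe", "foreign"),
    Post("Buy one, get two", "spam"),
]

default_view = visible_posts(posts)                   # spam hidden by default
with_spam_view = visible_posts(posts, show_spam=True)  # spam shown on request
```

In this sketch the default view contains the two non-spam posts, and turning on the filter option returns all three.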

The following table lists the content triggers that the machine-learning engine uses to classify text as sentiment-bearing data, foreign-language data, or spam:

Content Trigger | Details

Sentiment-bearing words

The text contains one or more words that the engine determines to convey sentiment.

For example, "I love Adobe" would be classified as sentiment-bearing data because the word "love" conveys sentiment.

HTTP links

The text contains one or more HTTP links.

www links

The text contains one or more www links.

Symbol-only words

The text contains non-ASCII text.

Telephone numbers

The text contains telephone numbers.

Numeric-only words

The text contains many numbers.

Alphanumeric words

The text contains many alphanumeric words.

Hashtags

The text contains many hashtags.

Lowercase-only words

The text contains only lowercase letters.

Title-case content

The text contains title-case content.

Uppercase-only words

The text contains only uppercase letters.

Stop words

The text does not contain stop words (common function words such as "the," "is," and "and"). Advertisements and marketing material tend to use incomplete sentences, which often omit these words.

Currency spam patterns

The text contains currency spam patterns.

Spam n-gram

The text contains content that matches spam patterns, such as "Buy one, get two."

Foreign language detected

The engine determines that the text is written primarily in a language other than English.

Spelling ratio

The content contains a high ratio of misspelled words to correctly spelled words.
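To make a few of these triggers concrete, the following Python sketch computes some of them as simple surface features of a text. It is only an illustration under stated assumptions: the function name `extract_triggers`, the regular expressions, and the tiny `STOP_WORDS` list are hypothetical, and the sketch does not represent how Social's engine detects or weighs triggers.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "to", "of", "in", "i", "it", "you"}


def extract_triggers(text: str) -> dict:
    """Compute a handful of the content triggers as simple surface features."""
    tokens = text.split()
    # Strip surrounding punctuation so "Adobe!" is treated as the word "Adobe".
    words = [t.strip(".,!?;:\"'()") for t in tokens]
    words = [w for w in words if w]
    return {
        # Links that start with http:// or https://
        "http_links": len(re.findall(r"https?://\S+", text)),
        # Bare www links that are not already part of an HTTP link
        "www_links": len(re.findall(r"(?<![/\w.])www\.\S+", text)),
        # Tokens made up of "#" followed by word characters
        "hashtags": sum(1 for t in tokens if re.fullmatch(r"#\w+", t.strip(".,!?;:"))),
        # Words made up of digits only
        "numeric_only_words": sum(1 for w in words if w.isdigit()),
        # Words made up of uppercase letters only
        "uppercase_only_words": sum(1 for w in words if w.isalpha() and w.isupper()),
        # True if at least one word is in the stop-word list
        "has_stop_words": any(w.lower() in STOP_WORDS for w in words),
        # True if the text contains any non-ASCII character
        "non_ascii": not text.isascii(),
    }


spam_like = extract_triggers("BUY NOW http://example.com www.deals.example #sale #cheap 100")
ordinary = extract_triggers("I love Adobe")
```

In this sketch, the spam-like sample produces one HTTP link, one www link, two hashtags, one numeric-only word, and no stop words, while "I love Adobe" contains a stop word and no links. A real classifier would combine many such signals rather than act on any single one.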