Prediction Through Personalization

Wouldn't it be nice to be able to predict what an online customer wants, exactly when s/he wants it? Artificially intelligent methods of personalization are said to do just that.

The strategy of offering appropriate products and services based on customer behavior and preference is definitely not a new idea. What is new is the ability to collect informative data about each customer and provide more statistically accurate means of catering to their needs.

Often, online adult companies have had to launch several sites, each directed towards a different product or service line (e.g. fetish, teens, interracial). The costs of maintaining several sites can be high. Personalization techniques may allow you to maintain a single Website offering highly targeted adult products and services that are tailored to individual customers with the highest inclination to buy them.

Some sites claim to offer "personalized porn," but few offer real personalization. Most confuse the term with customization, and are not necessarily utilizing the predictive capabilities of advanced customer data analysis. Customization requires total user control, where the user explicitly selects between certain options; personalization is computer driven, where the machine displays individualized pages and options to the user based on a model of that user's needs. But customers' needs and wants are ever changing.

An efficient way to have your consumers feel they're clicking through the ideal site is to provide them with what they want when - or even before - they know they want it. By analyzing the click-through patterns of a particular customer, we can hope to predict his or her next want. For example, if a given customer whose click-through pattern has been analyzed is currently looking at Latina teens, we hope to predict that he will want his next click to lead him to, say, Latina housewives. If our click-through analysis was correct, the Latina teen page would be the optimal point of cross-/up-sell. Latina housewife advertising and cross-marketing would theoretically have the highest impact at this juncture. The customer is not only at the highest inclination of purchase, but s/he is receiving a more personalized, and therefore more ideal, experience.
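
One simple way to sketch this kind of next-click prediction is to count which page most often follows the current one in past visits. The example below is only an illustration under that assumption. The page names (category_a, category_b, and so on) and the function names are hypothetical stand-ins for a site's product categories, and a production system would likely use a richer model.

```python
from collections import Counter, defaultdict


def build_transitions(sessions):
    """Count how often each page is followed directly by each other page."""
    transitions = defaultdict(Counter)
    for pages in sessions:
        for current, following in zip(pages, pages[1:]):
            transitions[current][following] += 1
    return transitions


def predict_next(transitions, current_page):
    """Return the most frequently observed next page, or None if the page is unseen."""
    followers = transitions.get(current_page)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical click-through history: one list of visited pages per session.
sessions = [
    ["home", "category_a", "category_b"],
    ["home", "category_a", "category_b", "checkout"],
    ["home", "category_a", "category_c"],
]

transitions = build_transitions(sessions)
print(predict_next(transitions, "category_a"))
```

With this sample history, category_b follows category_a in two of three sessions, so the category_a page would be the suggested point for category_b cross-marketing.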

The Basics

There are two types of personalization: implicit and explicit. Implicit personalization captures a customer's behavior in real time during actual Website navigation. Explicit personalization collects data directly from the customer, such as addresses, profiles, and registration information. Personalization in its most basic form is using the customer's name; as technology advances, however, many more variables are needed. A good personalization plan has to be able to learn and create market solutions for existing customers; this is a vastly different concept from the usual hustle to increase revenue by increasing the number of new customers. Several types of customer information are gathered. The most common are click-through patterns, buying patterns, contact information, contact history, credit history, life events (career changes, changes in marital status), birthdays, and religion/holidays. The personalization system should automatically provide enough information to extrapolate a trend or a next move from the data.

Aside from demand predictions, analysis of customer data can help facilitate customer service and marketing automation. Many software vendors, such as Avaya (www.avaya.com), KANA (www.kana.com), and Siebel Systems (www.siebel.com), are offering "Web service" or self-service solutions that enable automated online customer service. Web service in its most basic form requires a portal interface that the customer can customize, some tools for collaboration and interaction with others (message boards and chat rooms are usually good for this), and a knowledge base/database that is as comprehensive as possible. The knowledge base is where customers can make queries, find troubleshooting information and answers to frequently asked questions, or search the contents. Recent research on popular software solutions found that many, such as Avaya Interaction Center, were lacking in knowledge base abilities. PeopleSoft's (www.peoplesoft.com) and Siebel's products proved to be the most popular with analysts (data: Sonic Wave International).

Marketing automation vendors offer personalization tools not only to increase the inclination to buy, but to assess the effectiveness of various advertising campaigns. These solutions are meant to find more specific, and therefore more effective, target segments. Marketing campaigns are driven by implicit customer data and purchase history. Customer profile changes are also analyzed in order to discover any patterns between purchases and information changes. Some marketing automation solutions are E.piphany's E.6 (www.epiphany.com), SAS's EMA (www.sas.com), and Unica's Affinium (www.unicacorp.com). These vendors provide workflow design tools to facilitate implicit data collection and analysis.

According to the current premise behind personalization algorithms, we can predict future demand and successfully attempt to be at the right place at the right time for a point of sale. The goal of personalization is to create useful information from as much data as it's possible to collect. This information can be used to automatically drive marketing campaigns, manage customers, and, hopefully, to increase profits by leveraging existing customers and easing the hustle to bring in new ones.

Once all the necessary customer data is gathered, how does the personalization software make useful information out of it?

Rule Based Systems

Customer data analysis is primarily based on what's called a Rule Based System (RBS). Rule based systems work by using basic logic to establish the truth or falsity of a particular assertion. The core of an RBS is a database called a rule base, which is composed of several "IF THEN" rules. The RBS applies these rules to the data to arrive at certain conclusions in a workflow manner.

"IF THEN" statements take the form of:

IF x

AND y

THEN z

Typically, x is a description of some situation, y is some constraint on that situation, and z is either an action or a conclusion. Suppose we have collected customer data about employment stability and buying frequency, and we want to assess a customer's inclination to buy something at this given moment. Then we could form the rule:

IF employment stability is high

AND buying frequency is high

THEN inclination to buy is high

You could then form another rule that states:

IF inclination to buy is high

THEN present sale offer

We've defined employment stability and buying frequency by ratings of high or low and established a relationship between them that results in defining the customer's inclination to buy as either being high or low, which in turn implements the marketing decision to present a sale offer. It's important when creating these rules to define the categories so that it's possible for the right decision to be made. If the categories are too general, then you may miss important distinctions. If the categories are too specific, then you may have too many combinations and may never reach a useful conclusion.
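
As a rough illustration, the two rules above could be written directly as code. This is a minimal sketch, assuming the ratings arrive as the strings "high" and "low"; the function names are hypothetical and not taken from any particular personalization product.

```python
def inclination_to_buy(employment_stability, buying_frequency):
    """IF employment stability is high AND buying frequency is high
    THEN inclination to buy is high (otherwise low)."""
    if employment_stability == "high" and buying_frequency == "high":
        return "high"
    return "low"


def marketing_action(inclination):
    """IF inclination to buy is high THEN present sale offer."""
    if inclination == "high":
        return "present sale offer"
    return None  # no rule applies, so no action is taken


# Hypothetical customer ratings.
rating = inclination_to_buy("high", "high")
print(rating, "->", marketing_action(rating))
```

Note how coarse the two-value categories are: a customer with high buying frequency but low employment stability gets no offer at all, which is the kind of missed distinction the paragraph above warns about.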

A good checklist when establishing rules is to make sure that:

1) The variables, such as buying frequency, can be categorized easily (e.g. low, medium, high)

2) The rule covers a real life possibility. The logic must apply to reality.

3) The rules do not overlap, or overlap is minimal.

4) The rules cover as many situations and variables as possible.

Along with the rule base, a rule-based system has a working memory and a rule interpreter. The working memory stores the initial facts. The rule interpreter matches the patterns in the rules against the contents of the working memory. When data matches a condition in a rule, the rule is said to be "instantiated," meaning some data has been recognized to fit part of the pattern the rule expresses. The rule interpreter keeps track of all the instantiated rules and executes them one at a time. As rules execute, new facts are obtained and the working memory is updated. This is the basis of rule-based machine learning.
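
The interplay of rule base, working memory, and rule interpreter can be sketched in a few lines. The example below is a simplified illustration, not a description of any vendor's engine. It assumes facts are plain strings, treats a rule as instantiated only when all of its conditions are present, and uses hypothetical names (RULES, run_rules).

```python
# Rule base: each rule is (set of required facts, fact to add when they all hold).
RULES = [
    ({"employment stability is high", "buying frequency is high"},
     "inclination to buy is high"),
    ({"inclination to buy is high"}, "present sale offer"),
]


def run_rules(initial_facts, rules):
    """Fire rules one at a time until no new fact can be added."""
    working_memory = set(initial_facts)  # starts with the initial facts
    fired = []
    while True:
        # Instantiated rules: every condition matches the working memory
        # and the conclusion is not already known.
        instantiated = [
            (conditions, conclusion)
            for conditions, conclusion in rules
            if conditions <= working_memory and conclusion not in working_memory
        ]
        if not instantiated:
            return working_memory, fired
        _, conclusion = instantiated[0]  # execute a single rule per pass
        working_memory.add(conclusion)   # new fact updates the working memory
        fired.append(conclusion)


facts = {"employment stability is high", "buying frequency is high"}
memory, fired = run_rules(facts, RULES)
print(fired)
```

The second rule only becomes instantiated after the first one has added its conclusion to the working memory, which is how a chain of simple rules arrives at a marketing decision.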

The Downfall

Software vendors today have to be clever in selecting their algorithms because scalability is a major issue for a tree-like, workflow knowledge base. As a workflow knowledge base gains more levels, its complexity and the difficulty of updating it increase exponentially. More advanced algorithms organize information in a network or web, so the order of actions is less critical in reaching a certain conclusion. Rule-based systems are also space intensive. Quite a bit of drive space is needed for storing rules, and resources must be freed up for the working memory. Flexibility is also an issue with a rule-based system. Flexibility, the ability to make changes easily, is difficult to achieve when there are a large number of complex interactions between rules. Small changes could cause unwanted side effects, and may require costly pre-implementation analyses.

Fuzzy Logic

Rule-based systems for personalization are great for small to moderate amounts of data that have concrete answers to given "IF THEN" statements. Fuzzy logic comes into play when a variable cannot be defined by distinct categories, and we are forced to accept ambiguity. The introduction of fuzzy logic increases the flexibility and scalability of the personalization system.

Fuzzy logic is based on approximate reasoning. For example, you could have the following corresponding rules:

IF time spent on Latina teen page is LONG

THEN interest in Latina teens is HIGH

and

IF time spent on Latina teen page is NOT LONG

THEN interest in Latina teens is NOT HIGH

In order for these rules to be executed properly, you must first operationally define LONG. For this example, LONG will be defined as 45 minutes or more. Suppose one customer stays on the Latina teen page for 44 minutes, and another stays for 46 minutes. In reality, there is little effective difference between 44 and 46 minutes. However, a strict rule-based system would not rate interest in Latina teens as HIGH for the customer who fell just one minute short of LONG. Fuzzy logic adds the ability to set ranges for categories. LONG can be defined as 40-50 minutes, or as anything above 42 minutes. NOT LONG can then be set to anything 42 minutes or under.
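
In standard fuzzy logic, a value belongs to a category to a degree between 0 and 1 rather than simply being in or out of it. The sketch below illustrates that idea for LONG. It assumes, purely for illustration, that membership rises linearly across the 40-50 minute range mentioned above; the function names and the linear shape are hypothetical choices, not something prescribed by the rules in the text.

```python
def membership_long(minutes, low=40.0, high=50.0):
    """Degree (0.0 to 1.0) to which a visit counts as LONG.

    Assumed shape: 0 at or below `low`, 1 at or above `high`,
    and a straight-line ramp in between.
    """
    if minutes <= low:
        return 0.0
    if minutes >= high:
        return 1.0
    return (minutes - low) / (high - low)


def interest_high(minutes):
    """IF time spent is LONG THEN interest is HIGH, to the same degree."""
    return membership_long(minutes)


def interest_not_high(minutes):
    """Fuzzy NOT is conventionally one minus the degree of membership."""
    return 1.0 - membership_long(minutes)


for m in (30, 44, 46, 55):
    print(m, round(interest_high(m), 2), round(interest_not_high(m), 2))
```

Under these assumptions the 44-minute and 46-minute visitors score 0.4 and 0.6 for HIGH interest, so they are treated as nearly alike instead of landing on opposite sides of a hard 45-minute cutoff.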

Fuzzy logic becomes useful in data analysis as the variables become more complex, that is, as they begin to interact in non-linear ways (e.g. in web or net form instead of in a tree or workflow structure). Fuzzy logic does tend to lose accuracy, however, as "IF THEN" rules evolve into "IF THEN ELSE" rules.

IF something is true

THEN do set of actions 1

ELSE do set of actions 2

Such rules require more concrete categorization.

Case Based Reasoning

In conjunction with rule-based systems and the use of fuzzy logic, a data analysis system may use analogous and historic inferences to come to conclusions. This is known as Case Based Reasoning (CBR).

For example, if a particular customer purchases online strip-show time only on Saturday nights, and on some nights s/he responds to ads based on Vegas girls, you have historic data that may help you make the decision to increase Vegas girl advertising for this individual on Saturday nights.

A case in a CBR is made up of attributes that describe a situation along with its consequence. Manipulation of the attributes leads to different outcomes. In order for a CBR to be accurate, it must contain a large database of cases, known as a case base. The more cases there are in a case base, the more complete the coverage can be, and thus the more accurate the predictions.
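
A minimal sketch of case retrieval might look like the following. It assumes each case is a dictionary of attributes paired with an observed outcome, measures similarity as the fraction of matching attributes, and takes a majority vote among the closest cases. The attribute names, outcomes, and sample cases are hypothetical, and real CBR systems typically use weighted similarity measures and an adaptation step as well.

```python
from collections import Counter

# Hypothetical case base: (attributes describing the situation, consequence).
CASE_BASE = [
    ({"day": "saturday", "time": "night", "ad_theme": "vegas"}, "clicked"),
    ({"day": "saturday", "time": "night", "ad_theme": "beach"}, "ignored"),
    ({"day": "tuesday", "time": "night", "ad_theme": "vegas"}, "ignored"),
    ({"day": "saturday", "time": "night", "ad_theme": "vegas"}, "clicked"),
]


def similarity(query, case_attributes):
    """Fraction of the query's attributes whose values match the stored case.

    Assumes the query has at least one attribute.
    """
    matches = sum(1 for key in query if query[key] == case_attributes.get(key))
    return matches / len(query)


def predict_outcome(query, case_base, k=3):
    """Retrieve the k most similar cases and return their majority outcome."""
    ranked = sorted(
        case_base,
        key=lambda case: similarity(query, case[0]),
        reverse=True,
    )
    votes = Counter(outcome for _, outcome in ranked[:k])
    return votes.most_common(1)[0][0]


query = {"day": "saturday", "time": "night", "ad_theme": "vegas"}
print(predict_outcome(query, CASE_BASE))
```

With only four cases the prediction rests on very thin evidence, which is why coverage improves as the case base grows.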

CBR systems may rely on database experts when it comes to implementation, but the addition of CBR to rule-based systems and fuzzy logic significantly increases the accuracy, reliability, and validity of predicted outcomes. However, it takes time and patience to develop enough cases to use case-based reasoning effectively.

Anand Bhatt is CTO of SWI Labs, a recognized technical consulting and research group, and is an executive at Sonic Wave International Entertainment. His name is also recognizable from his mainstream music career. He can be reached at [email protected].