Predictive Analytics Summary and Review

by Eric Siegel

Has Predictive Analytics by Eric Siegel been sitting on your reading list? Pick up the key ideas in the book with this quick summary.

The amount of data swirling around online every day is truly mind-blowing. Every post you like on Facebook, every purchase you make online and every ad you click generates data – oceans of data. For companies and governments wishing to better understand, and influence, your behavior, this data is a goldmine.

Predictive analytics, or PA – the branch of analytics concerned with predicting future events – lets companies use the data you leave behind to predict your future behavior with uncanny precision.

As you’ll see in this book summary, however, this increasing ability to predict individual behavior also raises important moral and ethical questions. Do we want our future to be predicted?

In this summary of Predictive Analytics by Eric Siegel, you’ll learn

  • how PA works;
  • why IBM’s Watson is the greatest leap in artificial intelligence so far; and
  • that PA may be somewhat prejudiced.

Predictive Analytics Key Idea #1: Predictive analytics can help you lower your risks and make safer decisions.

Every time a company invests in an expensive marketing campaign, it's taking a risk; there's always a chance the campaign will fail and millions of dollars will disappear down the drain. When predictive analytics is used, however, a company can reduce that risk.

The purpose of predictive analytics, or PA, is to study human behavior and get a sense of how people will respond to certain situations, such as seeing an advertisement.

It does this by taking into consideration a wide variety of statistics and human characteristics, all of which are focused on understanding individual, as opposed to general, behaviors. So you wouldn’t use PA to determine which advertisement has the broadest appeal; you’d use it to determine the likeliest responses of specific people to specific advertisements.

More precisely: once you enter all your variables, you're given a predictive score. This score doesn't tell you the future so much as it tells you how probable a particular individual's reaction is.

For example, let’s say you want to know which online ad people in the United States will be most tempted to click on while searching for grants and scholarships. The more variables you supply, such as age, gender and email domain, the more precise the predictive score will be.
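To make the idea concrete, here's a minimal sketch of how such a score might be computed – a toy logistic regression in Python over invented variables like age, gender and email domain. The book doesn't prescribe any particular algorithm; this is just one common choice.

```python
# A minimal sketch of producing a predictive score (click probability)
# from a handful of variables. All data here is invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [age, is_female (0/1), has_edu_email_domain (0/1)]
X = np.array([
    [19, 1, 1], [23, 0, 1], [31, 1, 0], [45, 0, 0],
    [22, 1, 1], [38, 0, 0], [27, 1, 0], [20, 0, 1],
])
y = np.array([1, 1, 0, 0, 1, 0, 0, 1])  # 1 = clicked the ad

model = LogisticRegression().fit(X, y)

# The "predictive score": the probability that one specific person clicks.
new_person = np.array([[21, 1, 1]])
score = model.predict_proba(new_person)[0, 1]
print(f"Predicted click probability: {score:.2f}")
```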

These predictive scores are useful to organizations that want to know the best demographics to target for certain discount offers and advertisements, as well as for organizations that want to know which stocks to buy or people to audit.

The predictive model used in PA is more dynamic than other models because it's based on machine learning, meaning it can change, grow and adapt according to the data it's given. And it's more accurate than other predictive tools because it uses backtesting, which runs the model on historical data to see how accurate its predictions would have been.

So, if you're trying to predict whether the S&P 500 index will go up or down a year from now, backtesting lets you feed the model data from 1990 and check how accurately it "predicts" what the index actually did in 1991.
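As a rough sketch of that idea (with invented numbers, not real S&P data): train a model only on earlier "years," then score it against the known outcomes of later years it never saw.

```python
# A toy backtest: fit on older data, then check predictions against
# the known outcomes of the following period. Numbers are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical yearly features (e.g., rate change, earnings growth)
# and whether the index rose the following year (1) or fell (0).
features = np.array([[0.5, 2.1], [-0.2, 1.4], [0.1, 3.0], [-0.7, 0.2],
                     [0.3, 2.5], [-0.4, 0.9], [0.6, 2.8], [-0.1, 1.1]])
went_up = np.array([1, 0, 1, 0, 1, 0, 1, 0])

# Backtest: train only on the first six "years," as if it were 1990...
model = LogisticRegression().fit(features[:6], went_up[:6])

# ...then score the model on later years it has never seen,
# comparing predictions to what actually happened in "1991."
accuracy = model.score(features[6:], went_up[6:])
print(f"Backtest accuracy on held-out years: {accuracy:.0%}")
```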

Predictive Analytics Key Idea #2: Making predictions leads to questions of responsibility, morality and prejudice.

As our ability to use technology for predictive purposes gets better and more sophisticated, an important question arises: How many spoilers do you want to be given about your life? And, more importantly, how many lives are you willing to spoil?

But it’s more than just the question of knowing about your future; a more pressing concern with predictive analytics, and the data mining that goes hand-in-hand with it, is privacy.

When the press found out that the retail corporation Target was using PA to determine which customers were most likely to be pregnant, many felt the company had gone too far. While Target said it just wanted to advertise maternity goods to the right women, this kind of marketing risks leaking people's personal information to their friends, family and coworkers – information that those people may not be ready to share.

But PA also has clear potential to do good – in crime prevention, for example. One firm used backtesting on historical data from Santa Cruz, California, to show that 25 percent of burglaries could be accurately predicted. This sort of system can help police identify "hot spots" to patrol on a daily basis.

Large cities like Chicago, Memphis and Los Angeles all use PA in an effort to reduce crime rates. They use a diverse range of data, including past and present crimes and the conditions surrounding those crimes, such as what day of the week it was, if it was a holiday and what the weather was like.

Once again, though, some find that the data being used may be taking things too far, especially when it’s making assumptions about one individual’s behavior based on the actions of other people.

For example, some cities are using PA to determine the likelihood of a convict returning to his criminal ways. And many people believe this opens the door for prejudice to enter into PA models.

For instance, imagine there are two criminals who committed the same crime, and they’re both up for parole – but one of them comes from a zip code with a higher crime rate. Because of the rate of crime in his zip code, that criminal will be deemed more likely to return to a life of crime. This is clearly a prejudiced prediction, and since inner-city neighborhoods with disenfranchised minority populations tend to have higher crime rates, it’ll be the people from those neighborhoods who suffer most. In short, it’s just another iteration of racial profiling.

Predictive Analytics Key Idea #3: Data is always predictive, but accuracy requires balanced data.

These days data is a valuable and essential business commodity, and we’re producing more of it every single day. In predictive analytics, the philosophy is: the more data, the better – as long as it’s well balanced.

This means you have to be selective about the data you use and include comparable amounts of each type.

One type of data is the kind that relates to our routine tasks and behavior, which can be taken from such sources as phone records, bank transactions and online purchases. This kind of data is often used in PA models, as is a person’s social media and blogging history.

There are an estimated 864,000 blog posts published every day, essentially transforming people's private thoughts into public data. As of 2011, there were 100 million individual blogs on WordPress and Tumblr alone.

That's a lot of data. In fact, if you were to print out all the data stored on computers in 1986, double-sided, you'd have enough sheets to cover the Earth's entire landmass. Print out the data from 2011, and you could cover the whole planet in a layer two books thick!

While this abundance of data is what makes our analyses so sophisticated, it also leads to more potential mistakes when it’s unbalanced.

As data increases, so does the likelihood of a random event being mistaken for something meaningful. Most of the errors that occur in PA are the result of too many variables in one area leading to a false correlation, but this can be avoided by creating balanced data sets, which usually means adding more data.

For example, according to one study using PA, you have a lower chance of buying a “lemon” – that is, a faulty car – if the car in question is painted orange. This is obviously nonsense, yet the data backed it up. The problem was there weren’t enough car sales to balance out the data, and when more were added it became clear that the color of the paint had nothing to do with the chances of buying a clunker.
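A quick simulation shows how this happens. In the sketch below, car color truly has no effect on lemon rates, yet a small sample can make the rare orange cars look safer by chance; the numbers are entirely made up.

```python
# Sketch: a spurious pattern in a small sample that washes out as
# more (balanced) data arrives. All numbers are simulated.
import numpy as np

rng = np.random.default_rng(0)

def lemon_rate_gap(n_sales):
    """Difference in lemon rate between non-orange and orange cars,
    when color truly has no effect (true lemon rate = 10%)."""
    is_orange = rng.random(n_sales) < 0.02   # orange cars are rare
    is_lemon = rng.random(n_sales) < 0.10    # independent of color
    orange_rate = is_lemon[is_orange].mean() if is_orange.any() else 0.0
    other_rate = is_lemon[~is_orange].mean()
    return other_rate - orange_rate

# With few sales, the handful of orange cars can look "safer" by chance;
# with many sales, the apparent gap shrinks toward zero.
for n in (500, 5_000, 500_000):
    print(f"{n:>7} sales -> apparent orange advantage: {lemon_rate_gap(n):+.3f}")
```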

Predictive Analytics Key Idea #4: Machine learning can find risks that get overlooked, but there are risks to machine learning as well.

As we've seen, predictive analytics benefits from machine learning: over time, a model can become ever more precise in its predictions.

But there’s another very important benefit of machine learning, which is its ability to recognize disguised risks, or “microrisks.”

Disguised risks are a common danger in business. They tend to be tiny losses that are easy to either miss or ignore, until they build up into a huge problem.

For example, when Chase Bank started using PA to make long-term forecasts on mortgage payments, it realized just how much future interest it was missing out on by letting customers prepay or make early payments on loans. These payments looked like minor losses at first, but when added up across projected earnings, they became a painfully big issue.

With predictive analytics and machine learning, computers effectively program themselves, and no detail is too small to be considered. As a result, no microrisk goes unnoticed, since the model is always weighing long-term effects. This gives an organization like Chase the chance to act before it's too late – and it's why banks now use PA to catch the small risks associated with mortgages.
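For a back-of-the-envelope sense of how such microrisks accumulate, here's a trivial calculation with invented figures; none of these numbers come from Chase.

```python
# Sketch of how "microrisks" add up: each early mortgage payoff forfeits
# a small slice of future interest, but across a large loan book the
# projected loss is huge. All figures are invented for illustration.
num_loans = 1_000_000
prepay_probability = 0.03     # share of loans prepaid per year (assumed)
avg_lost_interest = 4_000     # future interest lost per prepaid loan (assumed)

projected_annual_loss = num_loans * prepay_probability * avg_lost_interest
print(f"Projected interest lost to prepayment: ${projected_annual_loss:,.0f}")
# -> $120,000,000 a year, from losses that look trivial loan by loan
```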

But in the world of analytics, there is such a thing as learning too much. Overlearning is a problem similar to having too much of one kind of data, since it, too, can lead to mistaken or faulty predictions.

A Berkeley professor once provided a great example of this by presenting data that supported a curious statement: that the stock market follows the same pattern as the rate of butter production in Bangladesh.

The solution to machine overlearning is a very human one: allow the machine to make mistakes – by testing it on data it hasn't seen before – so that it can learn from them and recognize a false connection the next time one appears.
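Here's a sketch of that principle: a decision tree trained on pure noise looks flawless on the data it memorized, and its mistakes only surface on held-out data. The holdout test is a standard technique; the specifics below are my illustration, not the book's.

```python
# Sketch of catching "overlearning": a model that memorizes noise scores
# perfectly on its training data, but its mistakes show up as soon as
# it is tested on data it has never seen. Data is random on purpose.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.random((400, 10))          # 10 meaningless features
y = rng.integers(0, 2, 400)        # labels unrelated to the features

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

tree = DecisionTreeClassifier(random_state=1).fit(X_train, y_train)
print(f"Accuracy on training data: {tree.score(X_train, y_train):.0%}")  # ~100%
print(f"Accuracy on unseen data:   {tree.score(X_test, y_test):.0%}")    # ~coin flip
```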

Predictive Analytics Key Idea #5: Bringing together multiple sources and models increases accuracy and performance.

Like aspiring artists and entrepreneurs, predictive analytics has benefited from crowdsourcing. By opening itself up to the collective mind of the public, PA has been able to reap the rewards of ensemble modeling.

An ensemble model combines multiple predictive models into one, and it was developed through the kind of competition and cooperation that crowdsourcing contests promote.

According to a McKinsey report, there is a significant deficit of analytical skills in the PA workforce. In fact, by 2018, there may be a shortage of between 140,000 and 190,000 workers with deep analytical skills in the United States.

In light of this shortage, companies have turned to crowdsourcing in order to meet their goals and discover new talent.

What's widely considered "the dawn of the ensemble model" began in 2006, when Netflix launched a crowdsourcing competition to improve the accuracy of its recommendation system by 10 percent. In the final stages of the competition, rival teams of more than 20 members merged, as did their powerful predictive models – and the resulting ensembles met Netflix's goal.

This was made possible due to the environment of friendly competition that was created during the challenge. The competition included online forums that promoted the sharing of new ideas and an ongoing, open dialogue.

Since then, we’ve seen ensemble models routinely outperform single models.

Research shows that moving from a single model to an ensemble model increases performance by between 5 and 30 percent. Furthermore, ensemble models have been shown to keep improving as more models are integrated into them. This general improvement has come to be known as the ensemble effect, and it's being used to tackle increasingly complex problems.
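A small experiment illustrates the ensemble effect: below, a single decision tree is compared with a random forest (an ensemble of many trees) on a standard scikit-learn dataset. The exact gain varies, but the ensemble typically comes out ahead.

```python
# Sketch of the ensemble effect: many models voted together usually
# beat a single model. Uses a standard scikit-learn dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

single = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

print(f"Single decision tree:  {single.score(X_test, y_test):.1%}")
print(f"Ensemble of 200 trees: {forest.score(X_test, y_test):.1%}")
```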

Organizations using ensemble models today include the IRS, to spot tax fraud; the Nature Conservancy, to predict donation amounts; Nokia Siemens Networks, to reduce dropped calls; and the US Department of Defense, to spot fraudulent government invoices.

Predictive Analytics Key Idea #6: Human language poses difficult challenges, but big advancements have already been made.

The increased power of ensemble models has been put to use for some fascinatingly complex projects, including the ability of machines to process natural language.

One of the biggest challenges in any project involving computational linguistics is picking up on all the subtle nuances in human speech.

When two people talk to one another, a variety of complex factors are at work, each helping to determine the true meaning of what's being said. For example, two people could both say "This is great," with one being sarcastic and therefore meaning the exact opposite of the other.

Still, textual data is thought to make up 80 percent of all data. So understanding text is both PA's biggest opportunity and its greatest challenge.

One of the biggest leaps over this hurdle came in 2011, when IBM unveiled Watson, a program designed to compete against humans on the quiz show Jeopardy!.

IBM created Watson by using a massive wealth of textual data that included decades of old Jeopardy! episodes. The real challenge was how to process the data, and the answer lay in ensemble models.

IBM essentially combined all of the existing and most up-to-date models for human language processing. While each was flawed in its own way, together they became a powerful whole with the ability to process natural language.

On February 14, 2011, Watson crushed the competition – in this case, two former Jeopardy! champions – in what was, perhaps, the greatest single advancement in artificial intelligence yet.

Unlike other PA models, Watson wasn't built to predict a future outcome but rather to eliminate possibilities and predict the likeliest answer to a question – something it did better than Google or any other internet search engine available at the time.

Watson's technology is now being adapted for financial analysis and medical diagnosis. And its influence can be seen in Apple's Siri program, which is designed to answer a smartphone user's simple questions. But let's face it: any iPhone user knows that Siri wouldn't do very well on Jeopardy!.

Predictive Analytics Key Idea #7: Predictive analytics can help identify the imperceptible by quantifying persuasion.

Do you hate getting spam from mobile-phone companies and creditors? The good news is that advancements in predictive modeling now let companies know which people are receptive to unprompted ads and which ones are best left alone.

No company wants to annoy its target audience into avoiding its product or service, so there’s a need to be subtly persuasive, which is where the art of predictive analytics is headed.

One of the best examples of this is the Norwegian telecommunications company Telenor, which found that when it reaches out to its target customers, it also reaches other people – with unintended consequences.

Specifically, Telenor found that when it contacts customers considered at risk of switching providers, it also reaches customers who weren't considered at risk – and the contact itself can turn them into customers at risk of switching.

This raises another problem facing the PA industry: Is it possible to simultaneously predict how a targeted and an untargeted person will respond to the same message?

This question has brought us the uplift model, which accounts for the imperceptible quality of persuasion. The uplift model is built on two data sets, so it can compare two groups of customers and ask: which ones will be most persuaded by being contacted?

Generally, one of the groups is a control set – for example, customers who aren't contacted at all. In this way, the uplift model resembles a medical trial, which gives one set of participants a placebo to establish baseline results against which everything else is measured.

In addition to measuring the amount of persuasion, the uplift model can also tell you who you don’t need to bother with, such as the “sure things,” people who need no extra persuasion, and the “do-not-disturbs,” those who’ll never be persuaded.
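Here's a minimal sketch of one common way to build an uplift model – the "two-model" approach, with simulated customers. The book describes the concept, not this particular recipe, and every number below is invented.

```python
# Two-model uplift sketch: fit one response model on contacted customers
# and one on the control group; the uplift score is the difference in
# their predicted purchase probabilities. All data is simulated.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 2_000
X = rng.random((n, 3))                        # customer features (invented)
treated = rng.integers(0, 2, n).astype(bool)  # True = customer was contacted

# Simulate purchases: feature 0 drives baseline buying; contact helps
# mainly when feature 1 is high (the "persuadables").
p_buy = 0.2 * X[:, 0] + 0.3 * X[:, 1] * treated
bought = rng.random(n) < p_buy

# One model per group: contacted vs. control.
m_treat = LogisticRegression().fit(X[treated], bought[treated])
m_ctrl = LogisticRegression().fit(X[~treated], bought[~treated])

# Uplift score = predicted lift from contacting this particular person.
# Near zero -> "sure thing" or lost cause; negative -> "do-not-disturb".
customers = X[:5]
uplift = m_treat.predict_proba(customers)[:, 1] - m_ctrl.predict_proba(customers)[:, 1]
print(np.round(uplift, 3))
```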

This model has worked wonders for such companies as US Bank, Fidelity and Telenor, and has increased marketing effectiveness by as much as 36 percent.

Along with the ensemble effect, the uplift model is a great example of how predictive analysis has grown and a testament to how seemingly insoluble problems can be overcome.

In Review: Predictive Analytics Book Summary

The key message in this book:

You may not be aware of the massive influence predictive analytics has on your daily life, but it’s just about everywhere. It not only influences the way technologies interact with you; it’s also a driving force behind many of our current technological advancements. If you want to know what innovations are happening in the world today, you should be familiar with predictive analytics.