How to Apply Artificial Intelligence and Machine Learning to Predict Consumer Behavior

 

In this article, we will study and analyze general consumer behavior. We'll also learn how AI can help uncover valuable insights that allow companies to make the right decisions to fulfill their vision of delivering better value and generating better revenue.

We'll also walk through a case where we use data science and analytics to uncover valuable insights that lead to better solutions.

Prerequisites

As a prerequisite, the reader must have some familiarity with Python and machine learning.

What is artificial intelligence?

Artificial intelligence refers to the ability of a machine to learn and reason in ways that resemble human intelligence, and in some tasks to match or even exceed human performance.

As the field of artificial intelligence advances, it leads to improvements in multiple industries such as automation, supply chain, e-commerce, manufacturing, and many more.

Beyond that, closely related fields such as data science and machine learning help businesses make better decisions. For example, to increase the revenue of an e-commerce store, we can analyze customer data and provide personalized recommendations based on each customer's preferences, most frequently purchased items, previous searches, and correlations between item purchases.

AI plays an important role in e-commerce: planning inventory and logistics, finding trends and patterns, predicting future results from historical data, and informing fact-based decisions.

Understanding Consumer Behavior

Consumer behavior, in its broadest sense, involves how consumers select, decide, use and dispose of goods and services. It covers individuals, groups or organizations in any vertical.

It offers insight into the emotions, attitudes, and preferences that influence purchasing behavior. This helps marketers understand what customers need, deliver value to them, and in turn generate revenue for the company.

Predict Consumer Behavior

Large companies understand that predicting customer behavior can fill gaps in the market, identify desired products, and generate greater revenue.

Prediction of consumer behavior can be done in the following ways.

  1. Segmentation: Divide customers into smaller groups based on purchasing behavior. This lets us treat each group separately, which in turn helps us identify distinct areas of the market.
  2. Predictive Analysis: We use statistical techniques to analyze previous historical data in order to predict the future behavior of our customers.
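
As a rough illustration of the two approaches above, the sketch below uses plain Python and a handful of made-up numbers (not taken from the dataset used later in this article). The names `segment_by_spend` and `fit_line`, and the spend thresholds, are hypothetical and chosen only for this example.

```python
# Hypothetical monthly spend per customer (made-up numbers for illustration only).
monthly_spend = {
    "c1": [20, 22, 25, 27],
    "c2": [200, 190, 210, 205],
    "c3": [80, 85, 90, 95],
}

def segment_by_spend(history, low=50, high=150):
    """Segmentation: assign a customer to a group by average spend.
    The thresholds are arbitrary assumptions."""
    avg = sum(history) / len(history)
    if avg < low:
        return "low"
    if avg < high:
        return "medium"
    return "high"

def fit_line(ys):
    """Predictive analysis: least-squares line through (0, y0), (1, y1), ..."""
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Group each customer by average historical spend.
segments = {cid: segment_by_spend(h) for cid, h in monthly_spend.items()}

# Extrapolate the fitted line one month ahead for each customer.
forecasts = {}
for cid, history in monthly_spend.items():
    slope, intercept = fit_line(history)
    forecasts[cid] = slope * len(history) + intercept

print(segments)
print(forecasts)
```

Real projects would normally use richer features and a proper model, but the split is the same: segmentation groups customers by what they already did, while predictive analysis extrapolates from that history.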

Step-by-Step Implementation

Now, let's see how this is done with a live example.

Understanding the Dataset

The dataset contains the following information about each customer:

  • CustomerID - customer's ID
  • Gender - customer's gender
  • Age - customer's age
  • Annual Income (k$) - customer's annual income, in thousands of dollars
  • Spending Score (1-100) - score from 1 to 100 assigned based on customer behavior and purchase data

Purpose

The purpose of this tutorial is to understand customer behavior based on their purchase data. This helps marketing teams understand and develop new strategies accordingly.

Import Libraries

For data exploration, the following Python libraries must be installed:

  • [NumPy]
  • [Pandas]
  • [Matplotlib]
  • [sklearn]
  • [Seaborn]

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.cluster import KMeans  # KMeans is used in the clustering steps

View the Dataset

Before we start, let's take a look at the dataset. To view it, we first load the CSV file into a DataFrame, as shown below.

df = pd.read_csv(r'../input/Mall_Customers.csv')
df.head()

 The first 5 rows of the dataset
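
Before plotting, it can help to sanity-check the loaded table. The sketch below runs on a few made-up rows that reuse the column headers from this article's code (`Annual Income (k$)`, `Spending Score (1-100)`); the values are illustrative and not taken from the real file. The same calls should work on `df` once it is loaded.

```python
import io
import pandas as pd

# Made-up rows for illustration; only the column headers mirror the article's code.
sample_csv = io.StringIO(
    "CustomerID,Gender,Age,Annual Income (k$),Spending Score (1-100)\n"
    "1,Male,25,30,40\n"
    "2,Female,31,45,75\n"
    "3,Female,52,60,35\n"
)
sample = pd.read_csv(sample_csv)

n_rows, n_cols = sample.shape            # number of rows and columns
missing = sample.isnull().sum()          # missing values per column
duplicates = sample.duplicated().sum()   # number of fully duplicated rows
numeric_cols = ["Age", "Annual Income (k$)", "Spending Score (1-100)"]
summary = sample[numeric_cols].describe()  # count, mean, std, min, quartiles, max

print(n_rows, n_cols)
print(missing)
print(duplicates)
print(summary)
```

Missing values or duplicated rows found at this stage are usually easier to handle before any clustering is attempted.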

Data Visualization

Distribution of age, annual income, and spending score

Analyzing spending patterns helps in building a better marketing strategy. Here, let's look at how customers' age, annual income, and spending score are distributed.

plt.figure(1, figsize=(15, 6))  # sets the dimensions of the image
n = 0
for x in ['Age', 'Annual Income (k$)', 'Spending Score (1-100)']:
    n += 1
    plt.subplot(1, 3, n)  # selects one of 3 side-by-side sub-plots
    plt.subplots_adjust(hspace=0.5, wspace=0.5)
    # histplot with kde=True replaces distplot, which is deprecated in recent seaborn releases
    sns.histplot(df[x], bins=20, kde=True, stat='density')
    plt.title('Distribution of {}'.format(x))  # sets the title for each plot
plt.show()  # displays all the plots

Output:

 Distribution plots of age, annual income, and spending score

 

Gender Analysis

The second most important input to a strategy is the breakdown of customers by gender. The count plot below shows that the dataset contains more female than male customers.

plt.figure(1 , figsize = (15 , 5))
sns.countplot(y = 'Gender' , data = df)
plt.show()

Output:

 Count plot showing the number of male and female customers.
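
A count plot only shows how many customers of each gender appear in the data. To compare spending itself, one option is to average the spending score per gender. Here is a minimal sketch on made-up rows (the values are illustrative, not results from the real dataset):

```python
import pandas as pd

# Made-up rows for illustration only.
sample = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male", "Female"],
    "Spending Score (1-100)": [40, 75, 35, 60, 70],
})

# Number of customers per gender (what the count plot displays).
counts = sample["Gender"].value_counts()

# Average spending score per gender (a direct comparison of spending).
mean_score = sample.groupby("Gender")["Spending Score (1-100)"].mean()

print(counts)
print(mean_score)
```

On the real data, the same `groupby` call on `df` would show whether the difference in counts also corresponds to a difference in spending.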

Customer Segmentation

Segmentation helps divide a large set of data into smaller groups of observations that are similar in specific aspects relevant to marketing.

Each group contains individuals that are similar to each other but different from individuals from other groups.

Segmentation is widely used as a marketing tool to create customer segments and tailor relevant strategies for each segment.

Here, we'll learn to segment this data based on several factors and see how it can help improve existing strategies.

Segment using age and spending scores

Let's try segmenting customers by age and spending score. This helps us understand which age groups spend the most, so that strategies can target them and increase the company's revenue.

We first have to decide on the number of clusters (segments) that gives the best results. To do this, we fit KMeans with every cluster count from 1 to 10 and record the inertia of each model.

X_age_spending = df[['Age' , 'Spending Score (1-100)']].iloc[: , :].values # extracts only age and spending score information from the dataframe
inertia = []
for n in range(1 , 11):
    model_1 = (KMeans(n_clusters = n ,init='k-means++', n_init = 10 , max_iter=300, 
                        tol=0.0001,  random_state= 111  , algorithm='elkan')) # use predefined Kmeans algorithm
    model_1.fit(X_age_spending) # fit the data into the model
    inertia.append(model_1.inertia_)

Let's illustrate this with a diagram.

plt.figure(1 , figsize = (15 ,6)) # set dimension of image
plt.plot(np.arange(1 , 11) , inertia , 'o') # Mark the points with a solid circle
plt.plot(np.arange(1 , 11) , inertia , '-' , alpha = 0.5) # connect remaining points with a line
plt.xlabel('Number of Clusters') , plt.ylabel('Inertia') # label the x and y axes
plt.show() # display

 Line plot of inertia against the number of clusters

 

As you may have noticed, the curve starts to flatten at 4 clusters: adding more clusters beyond that point reduces the inertia only slightly. Choosing the cluster count at this bend is called the "elbow method".
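
For reference, the inertia that KMeans reports is the sum of squared distances from each point to its nearest cluster center. The sketch below recomputes that quantity with NumPy on a tiny made-up array. The helper name `inertia_of` is hypothetical, and the points and centers are chosen by hand rather than fitted.

```python
import numpy as np

def inertia_of(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    # Pairwise differences, shape (n_points, n_centers, n_features).
    diffs = points[:, np.newaxis, :] - centers[np.newaxis, :, :]
    # Squared distances, shape (n_points, n_centers).
    sq_dists = (diffs ** 2).sum(axis=2)
    # Each point contributes the squared distance to its closest center.
    return sq_dists.min(axis=1).sum()

# Made-up (age, spending score) pairs forming two obvious groups.
points = np.array([[20.0, 80.0], [22.0, 82.0], [60.0, 30.0], [62.0, 32.0]])

one_center = points.mean(axis=0, keepdims=True)       # single center at the overall mean
two_centers = np.array([[21.0, 81.0], [61.0, 31.0]])  # one center per group

print(inertia_of(points, one_center))
print(inertia_of(points, two_centers))
```

Adding the second center makes the inertia drop sharply in this toy case. The elbow is the point after which extra clusters stop producing drops of that size.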

Now, let's further explore the case with 4 clusters.

model_2 = (KMeans(n_clusters = 4 ,init='k-means++', n_init = 10 ,max_iter=300, 
                        tol=0.0001,  random_state= 111  , algorithm='elkan') ) # set number of clusters as 4
model_2.fit(X_age_spending) # fit the model
labels1 = model_2.labels_
centroids1 = model_2.cluster_centers_

Now let's visualize them.

Before that, we need some setup for the plot: computing the minimum and maximum of each axis, building a grid of points with meshgrid(), and predicting a cluster for every grid point.

h = 0.02
x_min, x_max = X_age_spending[:, 0].min() - 1, X_age_spending[:, 0].max() + 1
y_min, y_max = X_age_spending[:, 1].min() - 1, X_age_spending[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
Z = model_2.predict(np.c_[xx.ravel(), yy.ravel()])  # returns flattened 1D array
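
To see what `np.meshgrid` and `np.c_` produce in the setup above, here is a small sketch with a coarse grid (step 1 instead of 0.02, over made-up ranges) so the arrays stay readable:

```python
import numpy as np

# Coarse grid over a small, made-up range.
xs = np.arange(0, 3, 1)    # [0, 1, 2]
ys = np.arange(10, 12, 1)  # [10, 11]

# xx and yy both have shape (len(ys), len(xs)).
xx, yy = np.meshgrid(xs, ys)

# Flatten both grids and pair them up: one (x, y) row per grid point.
grid_points = np.c_[xx.ravel(), yy.ravel()]

print(xx.shape)  # (2, 3)
print(grid_points)
```

Predicting a label for each of these rows and reshaping the result back to `xx.shape` gives one cluster label per grid cell, which is what gets colored as the background of the cluster plot.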

Now, let's draw the graph.

plt.figure(1 , figsize = (15 , 7) )
plt.clf()
Z = Z.reshape(xx.shape)
plt.imshow(Z , interpolation='nearest', 
           extent=(xx.min(), xx.max(), yy.min(), yy.max()),
           cmap = plt.cm.Pastel2, aspect = 'auto', origin='lower')

plt.scatter( x = 'Age' ,y = 'Spending Score (1-100)' , data = df , c = labels1 , 
            s = 200 )
plt.scatter(x = centroids1[: , 0] , y =  centroids1[: , 1] , s = 300 , c = 'red' , alpha = 0.5)
plt.ylabel('Spending Score (1-100)') , plt.xlabel('Age')
plt.show()

Output:

 KMeans with 4 clusters

 

From the graph above, we can infer a lot about consumption patterns.

  • Regardless of age, the average spending score is approximately 20.
  • In the top cluster, customers under age 40 have the highest spending scores. This cluster is less sparse than the others.
  • Over age 40, the spending score always stays within the range of 30 - 60.
Further insights can be extracted from this data through deeper analysis of all directly or indirectly related parameters.

Conclusion

As the simple case study above shows, AI can play a significant role in understanding customers, and more broadly in almost every industry. With the rise of data analytics, customer behavior is continuously monitored to improve strategies and make better decisions.

This article is meant only as a guide for beginners to get them started in this field.

Origin blog.csdn.net/weixin_73136678/article/details/128507058#comments_26295913