Fine-grained marketing and customer segmentation

I heard a talk yesterday that I thought was good, so I am tidying up my notes here.

(My computer crashed when these notes were almost finished, and I nearly cried.)

1. What is fine-grained marketing

Fine-grained marketing means segmenting customers according to their differences and applying a different marketing strategy to each segment. Well-known examples are Tmall's "a thousand faces for a thousand people" personalization and Amazon's book recommender system ...... Recommended reading: "The Era of Big Data", a very good popular-science book.

2. What is customer segmentation

Customer segmentation can be understood from three angles:

  1. Customer demand: demand determines the market.
  2. Customer value: customers can be divided into large and small customers, new and old customers, and so on, and their value to the enterprise differs.
  3. Enterprise resources and capabilities: which measures are feasible depends on the size of the firm.

An Internet e-commerce company, for example, can focus on the following data:

  1. Customer Demographics

  2. The channels through which the customer reaches the enterprise

  3. Customer purchase frequency

  4. Amount purchased by the customer

  5. Data specific to the scenario (different scenarios require different data)

  6. How long the customer has been a customer of a given brand

  7. Frequency of purchase of a given brand

  8. Average contribution per product purchase

  9. Probability that the customer buys a given brand

  10. R (Recency): how recently the customer last purchased

    F (Frequency): how often the customer purchases

    M (Monetary): how much the customer spends

Different problems call for different data to be collected and different solutions.
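As an illustration of the R, F and M values in item 10, here is a minimal Python sketch that derives them from raw purchase records. The records, the customer names and the function name `rfm_table` are all made up for this example; real data would come from the order system, and the reference date `today` is also an assumption.

```python
from datetime import date

# Hypothetical purchase records: (customer_id, purchase_date, amount).
purchases = [
    ("alice", date(2019, 5, 1), 120.0),
    ("alice", date(2019, 5, 20), 80.0),
    ("bob", date(2019, 3, 15), 300.0),
    ("carol", date(2019, 5, 25), 20.0),
    ("carol", date(2019, 5, 27), 35.0),
    ("carol", date(2019, 5, 28), 15.0),
]


def rfm_table(records, today):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    table = {}
    for customer, day, amount in records:
        recency, frequency, monetary = table.get(customer, (None, 0, 0.0))
        days_ago = (today - day).days
        # Recency keeps the smallest gap, i.e. the most recent purchase.
        if recency is None or days_ago < recency:
            recency = days_ago
        table[customer] = (recency, frequency + 1, monetary + amount)
    return table


rfm = rfm_table(purchases, today=date(2019, 5, 30))
for customer, (r, f, m) in sorted(rfm.items()):
    print(customer, r, f, m)
```

Each customer ends up with three numbers that can then be used as input features for segmentation.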

3. The data processing flow of fine-grained marketing

Business understanding -> data understanding -> data preprocessing -> model building -> model evaluation -> model deployment (an iterative process). Machine learning is a lot like the way people learn.

4. Machine learning algorithms

Divided into: supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning

1, Supervised learning

Keywords: classification, prediction models

The first time, the child points at a dog and says: "Mom, Mom, this is a cat." The mother says: "No, this is a dog."

The second time, the child points at a dog and says: "Mom, Mom, this is a pig." The mother says: "No, this is a dog."

The third time, the child points at a dog and says: "Mom, Mom, this is a dog." The mother says: "Yes, this is a dog."

…………

The mother knows the correct answer and can correct the child; by being corrected again and again, the child learns what a dog is. This is supervised learning.

Its defining feature: the conclusions are known, and the model is trained on samples whose results are already known.
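To make "training on samples whose results are known" concrete, here is a minimal Python sketch. It uses a 1-nearest-neighbor classifier, which is my own choice of a simple supervised method and was not named in the talk; the feature values (height in cm, weight in kg) and the function name `predict` are invented for illustration.

```python
import math

# Hypothetical labeled samples: (height_cm, weight_kg) -> animal name.
# The labels play the role of the mother's corrections.
training_samples = [
    ((25.0, 4.0), "cat"),
    ((23.0, 3.5), "cat"),
    ((60.0, 30.0), "dog"),
    ((55.0, 25.0), "dog"),
    ((80.0, 120.0), "pig"),
    ((75.0, 110.0), "pig"),
]


def predict(features, samples):
    """Return the label of the training sample closest to `features`."""
    nearest_label = None
    nearest_distance = math.inf
    for sample_features, label in samples:
        distance = math.dist(features, sample_features)
        if distance < nearest_distance:
            nearest_distance = distance
            nearest_label = label
    return nearest_label


first_guess = predict((58.0, 28.0), training_samples)
second_guess = predict((24.0, 3.8), training_samples)
print(first_guess, second_guess)
```

The model can only answer "dog" because somebody labeled dogs for it in advance, which is exactly the point of the story above.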

2, Unsupervised learning

Keywords: association models, cluster analysis

When we join a new class, everyone is a stranger at first. After a while, small groups form automatically, without anyone intervening. The members of each small group must have something in common: like attracts like. This is unsupervised learning. There is no training beforehand, and we cannot know in advance how many groups will form or who will end up in which group.

Clustering: a kind of unsupervised learning that partitions a set of objects into clusters so that the similarity between objects within a cluster is as large as possible and the similarity between clusters is as small as possible. Customer segmentation uses clustering.

3, A simple clustering algorithm: KMeans

The KMeans algorithm:

Step 1: Set the number of clusters K

Step 2: Set the positions of the initial cluster centroids

Step 3: Iterate: calculate the distances and find the new center of each cluster (distance measures such as Euclidean or Mahalanobis distance)

Step 4: Stop when the clustering has converged (convergence criterion, such as 0.001)

The third step involves mathematical formulas. Big data work is not the study of algorithms or of mathematics: it is enough to understand what the formulas mean and the idea behind the algorithm, and you do not have to look any closer.

Some people build cars and some people drive them; big data is driving. The car is naturally built by algorithm engineers and mathematicians.
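The four KMeans steps listed above can be sketched in plain Python as follows. This is a minimal illustration and not a production implementation: the function name `kmeans`, the parameter names and the sample customer points are my own assumptions, it uses Euclidean distance, and K is simply the number of initial centroids passed in.

```python
import math


def kmeans(points, initial_centroids, tolerance=0.001, max_iterations=100):
    """Cluster `points` around len(initial_centroids) centers.

    Returns (centroids, assignments), where assignments[i] is the
    cluster index given to points[i].
    """
    # Steps 1 and 2: K and the starting positions come from the caller.
    centroids = [tuple(c) for c in initial_centroids]
    assignments = []
    for _ in range(max_iterations):
        # Step 3a: assign every point to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Step 3b: move each centroid to the mean of its assigned points.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == k]
            if not members:
                new_centroids.append(old)  # keep an empty cluster in place
                continue
            new_centroids.append(
                tuple(sum(axis) / len(members) for axis in zip(*members))
            )
        # Step 4: stop once no centroid moved more than the tolerance.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tolerance:
            break
    return centroids, assignments


# Hypothetical customers: (purchases per month, average order amount).
customers = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, assignments = kmeans(customers, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(assignments)
```

With these points the low-activity and high-activity customers fall into two separate groups after two iterations.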

Advantages:

(1) It is a classic algorithm for solving clustering problems: simple and fast (each point joins whichever center it is closest to, so the logic is very simple);

(2) On large data sets, the algorithm remains scalable and efficient;

(3) It works better when the clusters are close to a Gaussian distribution.

Disadvantages:

(1) It can only be used when the mean of a cluster can be defined, so it may not be suitable for some applications;
(2) In the K-means algorithm, K is given in advance, and the right value of K is very difficult to estimate. In many cases it is not known in advance how many categories a given data set should be divided into;
(3) The K-means algorithm first needs an initial partition based on the initial cluster centers, and then optimizes that initial partition. The choice of initial cluster centers has a large impact on the clustering result: if the initial values are chosen poorly, the clustering result may not be useful;
(4) The algorithm has to keep adjusting the sample assignments and recalculating the adjusted cluster centers, so when the amount of data is very large, the time overhead is very large;
(5) If a cluster contains outliers, its mean will be seriously skewed (i.e. the algorithm is sensitive to noise and isolated data points);
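Disadvantage (5) is easy to see with a few numbers. The sketch below uses made-up order amounts and shows how far a single outlier drags the mean, which is what a KMeans centroid is.

```python
from statistics import mean

# Hypothetical order amounts for one cluster of customers.
typical_orders = [10.0, 12.0, 11.0, 9.0, 13.0]
with_outlier = typical_orders + [500.0]

center_before = mean(typical_orders)
center_after = mean(with_outlier)
print(center_before, center_after)
```

One extreme order moves the cluster center from 11.0 to 92.5, far away from every ordinary customer in the cluster.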

5. Big data learning

When learning, follow this order: "what it is, how to use it, how to use it better."

Technology ultimately has to serve the business; techniques are adopted in order to solve problems.

Combine technology with application, and improve from the broad picture down to specific points. Learn whatever you need to use. When doing development, do not tie yourself to one language or let the language limit you.

Programming languages and tools are not good or bad in themselves; it depends on the scenario. The right one is the best one.

Java is a must: you need to learn a statically typed language to go further in the IT industry.

Python is very popular at the moment, but Java's position in big data remains unshaken. Python is a simple and efficient scripting language, but it is too lightweight and still falls short in high concurrency and multithreading optimization for large-scale data processing; its more important role is as glue.

The currently popular Hadoop ecosystem is almost entirely written in Java; Spark is developed in Scala, but Scala runs on the JVM;

Flink also uses Java. Learning big data is much like learning Java frameworks.

Keep emptying your cup: only by learning continuously with a beginner's mindset can you avoid being left behind. Big data is developing rapidly, and there will be more and more frameworks.

Beginners should get a grasp of the whole business process as soon as possible, focusing first on the application layer. Once the business process is familiar, study the lower layers in depth and then consider optimization.

6. Other

Some things are not impossible to do, but there is no need to do them yourself:

Speech recognition is not especially difficult technically. But with current algorithms, even a company with ample resources needs thousands of hours to train a model (the talk gave me some sense of how long model training takes), and that is with the technology already mature. So there is no need to build it yourself; just call what other people have already built.

Then there is the Alibaba Cloud platform, which I had previously used only for servers. I discovered that it offers an algorithm platform, cloud databases, and solutions for all kinds of scenarios, and I was amazed ......

The era of big data, the era of big data ...... I seem to have a few more insights and thoughts about this phrase. Ecosystems, the intelligent era, solutions ...... a new door has opened.

I will follow this thread through Alibaba Cloud to expand my knowledge.


Origin www.cnblogs.com/for-ever-ly/p/10934832.html