(2) Search ad CTR estimation

https://www.cnblogs.com/futurehau/p/6184585.html

1. CTR estimation pipeline

Data -> preprocessing -> feature extraction -> model training -> post-processing

The features determine the upper bound on the result you can achieve; the model determines how close you get to that bound.

2. Data Preprocessing

Label matching: join the impression (show) logs with the click logs, so that every impression is labeled as clicked or not clicked.
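As a minimal sketch of this join, assume each impression carries a unique `impression_id` that the click log repeats. The field names and records below are hypothetical and only illustrate the idea: an impression found in the click log gets label 1, every other impression gets label 0.

```python
# Hypothetical log records; the field names are assumptions for illustration.
impressions = [
    {"impression_id": "a1", "query_id": 14092, "ad_id": 7},
    {"impression_id": "a2", "query_id": 14092, "ad_id": 9},
    {"impression_id": "a3", "query_id": 301, "ad_id": 7},
]
clicks = [{"impression_id": "a2"}]


def label_impressions(impressions, clicks):
    """Left-join impressions with clicks: clicked -> label 1, otherwise 0."""
    clicked_ids = {c["impression_id"] for c in clicks}
    return [
        {**imp, "label": int(imp["impression_id"] in clicked_ids)}
        for imp in impressions
    ]


labeled = label_impressions(impressions, clicks)
```

This is a left join: impressions without a click are kept as negative samples, not dropped.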

Sampling: because CTR is low, negative samples far outnumber positive ones, so a portion of the negative samples is dropped at random.
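Random negative downsampling could be sketched as follows. The `keep_rate` parameter and the `label` field are assumptions made for this example; the original post does not say what rate to use.

```python
import random


def downsample_negatives(samples, keep_rate, seed=0):
    """Keep every positive; keep each negative with probability keep_rate."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return [s for s in samples if s["label"] == 1 or rng.random() < keep_rate]


# Toy data: 5 clicks among 1000 impressions (made-up numbers).
samples = [{"label": 1}] * 5 + [{"label": 0}] * 995
sampled = downsample_negatives(samples, keep_rate=0.1)
```

All positive samples survive, while only roughly `keep_rate` of the negatives remain, which raises the click ratio in the training set.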

Combining information: some of the information you need lives in a separate file, so it has to be combined with the training data. For example, to see which query a given query_id stands for, you have to look the id up in the query text file: cat queryid_tokensid.txt | awk '$1 == 14092 {print $0}' | head

Doing this lookup by hand every time is tedious, so the information should be merged directly into the training data. This is the feature-combination step of data preprocessing: a join.
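A rough sketch of that join is shown below. It assumes each line of queryid_tokensid.txt holds a query id, a tab, and then token ids separated by `|`; that layout, the helper names, and the sample contents are assumptions for illustration, so adjust the parsing to the real file.

```python
import io


def load_query_tokens(lines):
    """Build a query_id -> token-id list lookup table."""
    table = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        query_id, tokens = line.split("\t")
        table[int(query_id)] = tokens.split("|")
    return table


def attach_query_tokens(rows, table):
    """Add each row's query tokens (empty list if the id is unknown)."""
    return [{**row, "query_tokens": table.get(row["query_id"], [])} for row in rows]


# Stand-in for open("queryid_tokensid.txt"); the contents are made up.
fake_file = io.StringIO("14092\t12|45|7\n301\t8\n")
table = load_query_tokens(fake_file)
rows = attach_query_tokens(
    [{"query_id": 14092, "label": 1}, {"query_id": 5, "label": 0}], table
)
```

Loading the lookup table once and attaching the tokens to every training row replaces the repeated awk lookup with a single pass over the data.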

 


Origin www.cnblogs.com/Lee-yl/p/10936370.html