论文《A Generative Entity-Mention Model for Linking Entities with Knowledge Base》

A Generative Entity-Mention Model for Linking Entities with Knowledge Base

 

主要方法:

In our model, each name mention to be linked is modeled as a sample generated through a three-step generative story, and the entity knowledge is encoded in the distribution of entities in document P(e), the distribution of possible names of a specific entity P(s|e), and the distribution of possible contexts of a specific entity P(c|e). To find the referent entity of a name mention, our method combines the evidences from all the three distributions P(e), P(s|e) and P(c|e).

The P(e), P(s|e) and P(c|e) are respectively called the entity popularity model, the entity name model and the entity context model

        

The Generative Entity-Mention Model for Entity Linking

 

  the probability of a name mention m (its context is c and its name is s) referring to a specific entity e can be expressed as the following formula (here we assume that s and c are independent given e):

                                

       Given a name mention m, to perform entity linking, we need to find the entity e which maximizes the probability P(e|m). Then we can resolve the entity linking task as follows:

                   

       Candidate Selection.:

    building a name-to-entity dictionary using the redirect links, disambiguation pages, anchor texts of Wikipedia, then the candidate entities of a name mention are selected by finding its name’s corresponding   entry in the dictionary

Model Estimation

  1. Entity Popularity Model

           

    where Count(e) is the count of the name mentions whose referent entity is e, and the |M| is the total name mention size.

  2.Entity Name Model

         

      where e is a normalization factor, f is the full name of entity e, lf is the length of f, ls is the length of the name s, si the i th word of s, fj is the j th word of f and t(si|fj) is the lexical translation probability which     indicates the probability of a word fj in the full name will be written as si in the output name.

       

  3.Entity Context Model

    given a context c containing n terms t1t2…tn, the entity context model estimates the probability P(c|e) as

                                              

                     

           where Pg(t) is a general language model which is estimated using the whole Wikipedia data, and the optimal value of λ is set to 0.2

                 

    where Counte(t) is the frequency of occurrences of a term t in the contexts of the name mentions whose referent entity is e

        4. The NIL Entity Problem

                   the referent entity may not be contained in the given knowledge base

        

                   first add a pseudo entity, the NIL entity, into the knowledge base

         if the probability of a name mention is generated by the NIL entity is higher than all other entities in Knowledge base, we link the name mention to the NIL entity.

                           

猜你喜欢

转载自www.cnblogs.com/dhName/p/11078630.html