- Maria Pershina, Yifan He, and Ralph Grishman - NAACL 2015 (The 2015 Annual Conference of the North American Chapter of the Association for Computational Linguistics)

## Introduction

- Recent NED systems (Cucerzan, 2007; Ratinov et al., 2011; Alhelbawy and Gaizauskas, 2014) usually exploit two types of KB information: local information, which measures the similarity between the text mention and a candidate KB node; and global information, which measures how well the candidate entities in a document are connected to each other, under the assumption that entities appearing in the same document should be coherent.
- Local features better encode the similarity between a candidate and a KB node but overlook the coherence between entities; global features can exploit interlinking information between entities but become noisy when used on their own, without considering information from the text and the KB.
- We propose to disambiguate NEs using a Personalized PageRank (PPR)-based random walk algorithm. Given a document and a list of entity mentions within it, we first construct a graph whose vertices are linking candidates and whose edges reflect links in Wikipedia. We run the PPR algorithm on this graph, with the constraint that only the highest-scored candidate for each entity may become the start point of a hop. Since all candidates but the correct one are erroneous and potentially misleading, limiting the random walk to start from the most promising candidates effectively filters out noise in the Personalized PageRank process.
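As background, Personalized PageRank can be sketched as a simple power iteration on a toy adjacency list (the graph, node names, and teleport value below are illustrative, not from the paper):

```python
def personalized_pagerank(graph, source, teleport=0.2, iters=50):
    """Personalized PageRank: every teleport jumps back to `source`.

    graph: dict mapping node -> list of out-neighbors.
    Returns dict of node -> stationary probability.
    """
    nodes = list(graph)
    p = {v: 0.0 for v in nodes}
    p[source] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        nxt[source] = teleport            # all teleports go to the source node
        for v in nodes:
            out = graph[v]
            if not out:                   # dangling node: restart at source
                nxt[source] += (1 - teleport) * p[v]
                continue
            share = (1 - teleport) * p[v] / len(out)
            for w in out:
                nxt[w] += share
        p = nxt
    return p

# Toy graph: probability mass concentrates near the source and its neighbors,
# while a node with no incoming edges ("d") keeps zero mass.
toy = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a"]}
ppr = personalized_pagerank(toy, source="a")
```

Unlike ordinary PageRank, where teleports land on a uniformly random node, all restart mass here returns to the source, so the scores measure proximity to that one node.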
- Because our system is based on a random walk algorithm, it does not require training model parameters.
- Our method is able to better utilize the local similarity between a candidate and a KB node (Section The Graph Model).
- We tailor the Personalized PageRank algorithm to focus on only one high-confidence entity at a time to reduce noise (Section Algorithm).

## Related Work

- Alhelbawy and Gaizauskas (2014) successfully apply the PageRank algorithm to the NED task. Their work is the closest in spirit to ours and performs well without supervision. We try to further improve their model by using a PPR model to better utilize local features, and by adding constraints to the random walk to reduce noise.
- *Ayman Alhelbawy and Robert Gaizauskas. 2014. Graph Ranking for Collective Named Entity Disambiguation. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 75–80.*

## The Graph Model

- Vertices are defined as pairs V = {(m_i, c_ij) | m_i ∈ M, c_ij ∈ C_i}, corresponding to the set of all possible KB candidates for the different mentions in M.
- No edge is allowed between candidates for the same named entity.
- Every vertex (m, c) is associated with an initial similarity score between entity mention m and candidate c.
- We experiment with a local measure (localSim), based on the local information about the entity in the text, and a global measure (popSim), based on the global importance of the entity.
- localSim: string similarity between the title of the Wikipedia entry and the entity mention (e.g. edit distance, whether the text mention starts or ends with the Wikipedia title), plus whether they have the same type (e.g. person, organization, location).
  - *TAC 2014 EDL training data (LDC2014E15)*
- popSim: the Freebase popularity, a function of the entity's incoming and outgoing link counts in Wikipedia and Freebase.
- We first perform coreference resolution on the whole document and expand m to the longest mention in its coreference chain.
- We insert an edge between two candidates if the Wikipedia entry corresponding to either of them contains a link to the other.

## Algorithm

- A successful entity disambiguation algorithm should benefit from both the initial similarity between candidate and entity and the coherence among entities in the same document.
- All candidates except the correct one for each entity are erroneous and introduce noise into the document graph.
- The typical random walk approach, which computes the coherence of one candidate to the whole graph, is therefore not suitable for this scenario.
- Personalized PageRank: the PageRank algorithm considers a random walk on a graph where, at each step, with probability α (the teleport probability) we jump to a randomly selected node, and with probability 1 − α we follow a random outgoing edge of the current node.
- Personalized PageRank is the same as PageRank, except that all teleports are made to the same source node, for which we are personalizing the PageRank.
- c1: ignore contributions from candidate nodes competing for the same entity mention m. That is, alternative candidates ē(m, c̄) generated for the same mention m should not contribute to the coherence of e(m, c), since only one candidate per entity can be correct.
- c2: take only one, highest contribution from candidate nodes competing for another entity m′ ≠ m. That is, for every mention other than the one being personalized, only the candidate with the highest contribution is counted.
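A minimal sketch of the constrained scoring described above, with illustrative helper names and toy inputs. The paper's exact score combination may differ; here each candidate's initial similarity is simply summed with the constrained PPR contributions:

```python
def ppr_vector(graph, source, teleport=0.2, iters=30):
    """Power-iteration Personalized PageRank, personalized at `source`."""
    p = {v: 0.0 for v in graph}
    p[source] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in graph}
        nxt[source] = teleport
        for v, out in graph.items():
            if not out:                   # dangling node: restart at source
                nxt[source] += (1 - teleport) * p[v]
                continue
            share = (1 - teleport) * p[v] / len(out)
            for w in out:
                nxt[w] += share
        p = nxt
    return p

def disambiguate(candidates, edges, init_sim):
    """candidates: mention -> list of KB candidates.
    edges: set of frozensets, undirected links between (mention, cand) vertices.
    init_sim: (mention, cand) -> initial similarity (localSim or popSim).
    Returns mention -> best candidate.
    """
    vertices = [(m, c) for m, cs in candidates.items() for c in cs]
    # No edge between candidates for the same mention.
    graph = {v: [u for u in vertices
                 if u[0] != v[0] and frozenset((u, v)) in edges]
             for v in vertices}
    # Contribution of u to v = PPR mass at v when personalized at u.
    contrib = {u: ppr_vector(graph, u) for u in vertices}
    scores = {}
    for v in vertices:
        m = v[0]
        total = init_sim[v]
        for m2 in candidates:
            if m2 == m:
                continue  # c1: competing candidates for m contribute nothing
            # c2: count only the highest contribution among m2's candidates
            total += max(contrib[(m2, c2)][v] for c2 in candidates[m2])
        scores[v] = total
    return {m: max(candidates[m], key=lambda c: scores[(m, c)])
            for m in candidates}
```

In a toy document where only one of a mention's candidates is linked to the other mentions' candidates, that linked candidate accumulates coherence mass and wins, while the unlinked competitor keeps only its initial similarity.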
## Experiments and Results

- Our set of candidates is publicly available for experiments: https://github.com/masha-p/PPRforNED
- For our experiments we use the AIDA dataset: https://www.mpi-inf.mpg.de/yago-naga/aida/
- We initialized 2,000 random walks for every source node, performed 5 steps of PPR, and computed PPR weights from all iterations, dropping walks from the first one. The teleport probability is set to 0.2.
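The walk-based setup above can be sketched as a Monte Carlo approximation of PPR; reading "dropping walks from the first one" as discarding first-step visits is an assumption:

```python
import random

def ppr_by_walks(graph, source, n_walks=2000, n_steps=5, teleport=0.2, seed=0):
    """Estimate Personalized PageRank at `source` by simulating random walks.

    Mirrors the reported settings: 2,000 walks of 5 steps each with teleport
    probability 0.2; visits from the first step are dropped (an assumption
    about how "dropping walks from the first one" is meant).
    """
    rng = random.Random(seed)
    visits = {v: 0 for v in graph}
    for _ in range(n_walks):
        node = source
        for step in range(n_steps):
            if rng.random() < teleport or not graph[node]:
                node = source             # teleport back to the source
            else:
                node = rng.choice(graph[node])
            if step > 0:                  # discard first-step visits
                visits[node] += 1
    total = sum(visits.values()) or 1
    return {v: k / total for v, k in visits.items()}

# Toy usage: a node unreachable from the source receives no mass.
toy = {"s": ["a"], "a": ["s"], "z": []}
est = ppr_by_walks(toy, "s")
```

Walk simulation trades the exactness of power iteration for speed on large candidate graphs, since each source node needs only n_walks × n_steps edge lookups.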