Common methods of text similarity calculation

Text similarity is useful in many tasks, such as text classification and similar-text retrieval. For example, you can first build a vocabulary or a table of sentences, and then search a database for similar texts, documents, articles, or comments.

Similarity calculation methods fall into several types: character level, keyword level, semantic level, and so on.

Character-level methods include the longest common subsequence, edit distance, etc.
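As a rough illustration of the two character-level measures, here is a minimal Python sketch using standard dynamic programming. The function names (lcs_length, edit_distance, edit_similarity) are made up for this example, and normalizing the edit distance by the longer string's length is just one possible way to turn a distance into a score between 0 and 1.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)  # results for the previous row (prefix of a)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,           # delete ch_a
                            curr[j - 1] + 1,       # insert ch_b
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    print(lcs_length("ABCBDAB", "BDCABA"))
    print(edit_distance("kitten", "sitting"))
    print(edit_similarity("kitten", "sitting"))
```

Both functions keep only two rows of the table, so memory grows with the length of one string rather than with the product of both lengths.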

Keyword-level methods commonly use TF-IDF weighting, cosine similarity, word2vec, etc.
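The keyword-level idea can be sketched as follows: weight each term with TF-IDF, then compare two documents by the cosine of the angle between their weight vectors. This is only a sketch under a few assumptions: tokens are split on whitespace, the IDF uses one common smoothed variant, and the names tfidf_vectors and cosine_similarity are invented for this example. The same cosine measure could also be applied to dense vectors such as averaged word2vec embeddings.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per whitespace-tokenized document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            # Relative term frequency times a smoothed IDF (an assumed variant).
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat lay on the rug",
        "stock prices fell sharply",
    ]
    vecs = tfidf_vectors(docs)
    print(cosine_similarity(vecs[0], vecs[1]))  # shares several terms
    print(cosine_similarity(vecs[0], vecs[2]))  # shares no terms
```

Storing each vector as a dict keeps it sparse, which matters once the vocabulary is much larger than any single document.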

Semantic-level methods include LDA (latent Dirichlet allocation), LSI (latent semantic indexing), etc.

To be continued
