1. Measuring the match ratio between string sequences

  • Similar to computing an edit distance:

```python
from difflib import SequenceMatcher


def get_match_raio(text1, text2, junk=None):
    """
    Compute a match-ratio score for two text sequences
    (similar in spirit to an edit distance).
    """
    # jy: remove all whitespace from the text;
    text1 = "".join(text1.split())
    text2 = "".join(text2.split())
    # jy: normalize case (convert any English letters to lowercase)
    text1 = text1.lower()
    text2 = text2.lower()
    # jy: `junk` is passed as SequenceMatcher's `isjunk` argument
    sequence_matcher = SequenceMatcher(junk, text1, text2)
    match_ratio = sequence_matcher.ratio()
    return match_ratio
```
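To make the score easier to interpret, here is a minimal usage sketch. The helper name `match_ratio` and the sample strings are illustrative assumptions, not part of the original note; the helper repeats the same normalization steps so that the block runs on its own. Unlike an edit distance, `SequenceMatcher.ratio()` is a similarity in the range [0, 1], computed as `2 * M / T`, where `M` is the number of matched characters and `T` is the total length of both sequences, so a higher value means a closer match.

```python
from difflib import SequenceMatcher


def match_ratio(text1, text2, junk=None):
    """Hypothetical helper: whitespace- and case-insensitive match ratio."""
    # Same normalization as above: drop whitespace, then lowercase.
    text1 = "".join(text1.split()).lower()
    text2 = "".join(text2.split()).lower()
    return SequenceMatcher(junk, text1, text2).ratio()


# Whitespace and case differences are ignored, so these count as identical.
identical = match_ratio("Hello World", "helloworld")

# "abcd" and "bcde" share the block "bcd": M = 3, T = 4 + 4 = 8, so 2 * 3 / 8.
partial = match_ratio("abcd", "bcde")

# No characters in common.
disjoint = match_ratio("abc", "xyz")

print(identical, partial, disjoint)  # 1.0 0.75 0.0
```

Because the result is a ratio rather than a count of edits, two pairs of strings with the same number of differing characters can score differently depending on their lengths.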