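After the previous two lessons, I think we have a solid understanding of how Elasticsearch (es) scores relevance: the underlying ideas (TF/IDF, the vector space model, the boolean model) and the parts of the practical scoring formula (query norm, query coordination, boost).
This lesson covers four common ways to tune and optimize relevance scores.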

1. query-time boost

GET /forum/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": {
              "query": "java spark",
              "boost": 2
            }
          }
        },
        {
          "match": {
            "content": "java spark"
          }
        }
      ]
    }
  }
}
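
To make the effect of the boost concrete, here is a minimal Python sketch of how a bool/should query might combine clause scores. It assumes a simplified model in which the final score is the sum of each matching clause's score multiplied by its boost. The raw per-field scores are made-up numbers, not Elasticsearch output, and `combined_score` is a hypothetical helper, not part of any Elasticsearch client.

```python
def combined_score(clause_scores, boosts):
    """Sum the scores of matching clauses, each scaled by its boost.

    clause_scores: field name -> raw score, or None if the clause did not match
    boosts: field name -> boost (1.0 is assumed when a field is absent)
    """
    total = 0.0
    for field, score in clause_scores.items():
        if score is None:
            continue  # a should clause that does not match contributes nothing
        total += score * boosts.get(field, 1.0)
    return total


# Hypothetical raw scores for two docs (illustrative numbers only).
doc_a = {"title": 0.5, "content": 0.25}  # matches in both title and content
doc_b = {"title": None, "content": 1.0}  # matches in content only

no_boost = {}
title_boost = {"title": 2.0}  # mirrors "boost": 2 on the title clause above

# Without a boost, doc_b ranks first; with the title boost, doc_a does.
print(combined_score(doc_a, no_boost), combined_score(doc_b, no_boost))
print(combined_score(doc_a, title_boost), combined_score(doc_b, title_boost))
```

In this toy model the boost only changes the relative weight of the title clause, so a doc that matches in the title can overtake a doc that matches only in the content.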

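2. Restructuring the query

Restructuring the query has less and less effect on scoring in newer versions of es. In general, unless you have a specific need for it, you can skip this technique.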

GET /forum/_search 
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "content": "java"
          }
        },
        {
          "match": {
            "content": "spark"
          }
        },
        {
          "bool": {
            "should": [
              {
                "match": {
                  "content": "solution"
                }
              },
              {
                "match": {
                  "content": "beginner"
                }
              }
            ]
          }
        }
      ]
    }
  }
}

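3. negative boost

Searching for docs that contain java and must not contain spark (the must_not request below) is very rigid.
What we often want is to search for docs that contain java and preferably do not contain spark: a doc that does contain spark is not excluded, its score is just lowered.
With the boosting query, a doc that contains the negative term has its score multiplied by negative_boost, which lowers it.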

GET /forum/_search 
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "content": "java"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "content": "spark"
          }
        }
      ]
    }
  }
}
GET /forum/_search 
{
  "query": {
    "boosting": {
      "positive": {
        "match": {
          "content": "java"
        }
      },
      "negative": {
        "match": {
          "content": "spark"
        }
      },
      "negative_boost": 0.2
    }
  }
}

Docs that match the negative query have their score multiplied by negative_boost, which lowers their score.
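
The difference between must_not and the boosting query can be sketched in a few lines of Python. This is a toy model under stated assumptions: the positive scores are invented numbers, and `boosting_score` is a hypothetical helper that only mimics the "multiply by negative_boost" rule described above.

```python
def boosting_score(positive_score, matches_negative, negative_boost):
    """Score of a doc that already matched the positive query.

    If the doc also matches the negative query, its score is scaled down
    by negative_boost instead of the doc being dropped.
    """
    if matches_negative:
        return positive_score * negative_boost
    return positive_score


# (name, hypothetical positive score, does the doc also contain "spark"?)
docs = [
    ("java only", 1.0, False),
    ("java and spark", 2.0, True),
]

ranked = sorted(
    ((name, boosting_score(score, neg, 0.2)) for name, score, neg in docs),
    key=lambda pair: pair[1],
    reverse=True,
)

# The doc containing spark is still returned, but now ranks below the other.
for name, score in ranked:
    print(name, score)
```

With must_not, the "java and spark" doc would not appear in the results at all. Here it stays in the list and is only demoted.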

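4. constant_score

If you do not need relevance scoring at all, use constant_score with a filter: every matching doc gets a score of 1 (the default boost), so relevance no longer plays a role.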

GET /forum/_search 
{
  "query": {
    "bool": {
      "should": [
        {
          "constant_score": {
            "query": {
              "match": {
                "title": "java"
              }
            }
          }
        },
        {
          "constant_score": {
            "query": {
              "match": {
                "title": "spark"
              }
            }
          }
        }
      ]
    }
  }
}

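The request above fails with the error below: in this version of es, constant_score does not support a query clause and must wrap a filter instead, as in the two requests that follow the error.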

{
  "error" : {
    "root_cause" : [
      {
        "type" : "parsing_exception",
        "reason" : "[constant_score] query does not support [query]",
        "line" : 7,
        "col" : 22
      }
    ],
    "type" : "x_content_parse_exception",
    "reason" : "[7:22] [bool] failed to parse field [should]",
    "caused_by" : {
      "type" : "parsing_exception",
      "reason" : "[constant_score] query does not support [query]",
      "line" : 7,
      "col" : 22
    }
  },
  "status" : 400
}

GET /forum/_search 
{
  "query": {
    "constant_score": {
      "filter": {
        "terms": {
          "title": [
            "java",
            "spark"
          ]
        }
      }
    }
  }
}
GET /forum/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "constant_score": {
            "filter": {
              "term": {
                "title": "java"
              }
            }
          }
        },
        {
          "constant_score": {
            "filter": {
              "term": {
                "title": "spark"
              }
            }
          }
        }
      ]
    }
  }
}
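
The two working requests do not score docs the same way. The Python sketch below is a toy model that assumes a bool/should query simply sums the scores of its matching clauses, with no coordination factor, which is how I understand recent versions to behave. `constant_score_should` is a hypothetical helper, and the docs are made-up sets of terms.

```python
def constant_score_should(doc_terms, wanted_terms, boost=1.0):
    """Toy score for a bool/should of constant_score clauses.

    Each wanted term found in the doc contributes exactly `boost`,
    regardless of how often or where the term appears.
    """
    return sum(boost for term in wanted_terms if term in doc_terms)


wanted = ["java", "spark"]

both = {"java", "spark", "tutorial"}
one = {"java", "beginner"}
none = {"hadoop"}

print(constant_score_should(both, wanted))  # two clauses match
print(constant_score_should(one, wanted))   # one clause matches
print(constant_score_should(none, wanted))  # no clause matches
```

Under this assumption, a single constant_score around a terms filter gives every matching doc the same score, while the bool/should version still ranks docs that match both terms above docs that match only one.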