There are two SOTA models on OntoNotes: one is AllenNLP's SpanBERT, the other is wl RoBERTa.

SpanBERT

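The relevant links are as follows.
Following them, install the allennlp library and spaCy's en_core_web_sm model (spaCy here is 3.0.8, so the matching en_core_web_sm version has to be downloaded manually from the GitHub releases).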
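
A small sketch of the manual download step. It assumes that spaCy 3.0.x pairs with model version 3.0.0 and that the release files follow the usual layout of the explosion/spacy-models repository; both should be checked against the releases page. The helper name spacy_model_url is made up for illustration.

```python
def spacy_model_url(model: str = "en_core_web_sm", version: str = "3.0.0") -> str:
    """Build a GitHub release URL for a spaCy model package.

    Assumption: release assets live under <model>-<version>/<model>-<version>.tar.gz.
    """
    base = "https://github.com/explosion/spacy-models/releases/download"
    return f"{base}/{model}-{version}/{model}-{version}.tar.gz"


# The printed line can be pasted into a shell to install the model.
print("pip install " + spacy_model_url())
```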

Then run the following code:

    from allennlp.predictors.predictor import Predictor
    import allennlp_models.coref  # registers the coreference model and predictor

    # Load the pretrained SpanBERT coreference archive from a local path.
    predictor = Predictor.from_path("/root/SheShuaijie/Data/PLM/coref-spanbert-large-2020.02.27.tar.gz")
    document = "Paul Allen was born on January 21, 1953, in Seattle, Washington, to Kenneth Sam Allen and Edna Faye Allen. Allen attended Lakeside School, a private school in Seattle, where he befriended Bill Gates, two years younger, with whom he shared an enthusiasm for computers."

    # predict() returns a dict that includes the predicted clusters.
    result = predictor.predict(document)
    print(result)
    # coref_resolved() returns the text with coreferent mentions replaced by their cluster's main mention.
    print(predictor.coref_resolved(document))
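
The raw result printed above is hard to read. A minimal sketch for turning it into mention strings, assuming the result dict has a "document" key (list of tokens) and a "clusters" key (lists of inclusive [start, end] token indices); it is worth confirming with result.keys() first. The toy dict below is hand-built, not real model output, and cluster_mentions is a made-up helper name.

```python
from typing import Dict, List


def cluster_mentions(result: Dict) -> List[List[str]]:
    """Turn token-index clusters into readable mention strings."""
    tokens = result["document"]
    return [
        # Span ends are assumed inclusive, hence end + 1.
        [" ".join(tokens[start : end + 1]) for start, end in cluster]
        for cluster in result["clusters"]
    ]


# Hand-built stand-in for a predictor result (assumed shape, not model output).
toy_result = {
    "document": ["Paul", "Allen", "met", "Bill", "Gates", ".", "He", "liked", "him", "."],
    "clusters": [[[0, 1], [6, 6]], [[3, 4], [8, 8]]],
}

for mentions in cluster_mentions(toy_result):
    print(mentions)
```

With the real predictor, the same function would be called as cluster_mentions(result).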

WL RoBERTa

Download the code and the model,
then start testing with the following script:

    CUDA_VISIBLE_DEVICES=2 python predict.py roberta ./data/english_train_head.jsonlines output.jsonlines --weight='/home/shesj/workspace/Data/PLM/roberta_(e20_2021.05.02_01.16)_release.pt'

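Because of how the author's code is written, it insists on reading the training set in order to set up the optimizer. I only want to run the test, so none of that is needed.
The get_optimizer call in __init__ can simply be commented out.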
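
An alternative to commenting the call out is to guard it with a flag. The sketch below only illustrates the idea: ToyCorefModel, its build_optimizer parameter, and the file name are hypothetical stand-ins, not the project's real class or arguments.

```python
from typing import Optional


class ToyCorefModel:
    """Hypothetical stand-in for the model class; not the project's real code."""

    def __init__(self, train_path: str, build_optimizer: bool = True):
        self.train_path = train_path
        self.optimizer: Optional[dict] = None
        if build_optimizer:
            self.optimizer = self.get_optimizer()

    def get_optimizer(self) -> dict:
        # Mimics an optimizer setup that needs the training set size,
        # for example to size a learning-rate schedule.
        with open(self.train_path, encoding="utf-8") as f:
            n_docs = sum(1 for _ in f)
        return {"total_steps": n_docs}


# Inference only: the training file is never opened, so it need not exist.
model = ToyCorefModel("no_such_train_file.jsonlines", build_optimizer=False)
print(model.optimizer)  # None
```

Keeping the default at True leaves training unchanged, and only the prediction path has to pass build_optimizer=False.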