There are two SOTA models on OntoNotes: one is AllenNLP's SpanBERT, the other is WL RoBERTa.
SpanBERT
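The relevant links are as follows
Following them, install the allennlp library and spaCy's en_core_web_sm model (spaCy here is 3.0.8, so the matching en_core_web_sm version has to be downloaded manually from the GitHub releases page)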
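To make "matching version" concrete, here is a minimal sketch. It assumes the usual spaCy v3 convention that a model is compatible when its major.minor version equals spaCy's, and it assumes the standard release layout of the explosion/spacy-models repository. The helper names and the model version 3.0.0 are illustrative, not taken from these notes.

```python
def model_matches_spacy(spacy_version: str, model_version: str) -> bool:
    """Return True when the model's major.minor equals spaCy's major.minor."""
    return spacy_version.split(".")[:2] == model_version.split(".")[:2]


def release_wheel_url(model: str, model_version: str) -> str:
    """Build a download URL, assuming the usual spacy-models release layout."""
    tag = f"{model}-{model_version}"
    return (
        "https://github.com/explosion/spacy-models/releases/download/"
        f"{tag}/{tag}-py3-none-any.whl"
    )


# 3.0.0 is an assumed model version used only for illustration.
if model_matches_spacy("3.0.8", "3.0.0"):
    print(release_wheel_url("en_core_web_sm", "3.0.0"))
```

The printed URL can then be passed to pip install, provided that release actually exists for your spaCy version.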
Then run the following code
from allennlp.predictors.predictor import Predictor
import allennlp_models.tagging
predictor = Predictor.from_path("/root/SheShuaijie/Data/PLM/coref-spanbert-large-2020.02.27.tar.gz")
document="Paul Allen was born on January 21, 1953, in Seattle, Washington, to Kenneth Sam Allen and Edna Faye Allen. Allen attended Lakeside School, a private school in Seattle, where he befriended Bill Gates, two years younger, with whom he shared an enthusiasm for computers."
result = predictor.predict(document)
print(result)
print(predictor.coref_resolved(document))
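The printed result is a dict that is hard to read directly. The sketch below assumes it contains a "document" key (the token list) and a "clusters" key (lists of [start, end] token spans with an inclusive end), which is how the AllenNLP coref predictor output is usually laid out. The helper name cluster_mentions and the sample dict are made up for illustration; the sample is hand-written, not real model output.

```python
from typing import Dict, List


def cluster_mentions(result: Dict) -> List[List[str]]:
    """Turn token-index spans into readable mention strings, one list per cluster."""
    tokens = result["document"]
    mentions = []
    for cluster in result["clusters"]:
        # Each span is [start, end]; the end index is inclusive.
        mentions.append([" ".join(tokens[start:end + 1]) for start, end in cluster])
    return mentions


# Hand-written stand-in for a predictor output (assumed structure).
sample = {
    "document": ["Paul", "Allen", "was", "born", "in", "Seattle", ".",
                 "He", "befriended", "Bill", "Gates", "."],
    "clusters": [[[0, 1], [7, 7]]],
}
print(cluster_mentions(sample))
```

With a real prediction, cluster_mentions(result) can be called on the dict returned by predictor.predict.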
WL RoBERTa
Download the code and the model
Then start testing with the following script
CUDA_VISIBLE_DEVICES=2 python predict.py roberta ./data/english_train_head.jsonlines output.jsonlines --weight='/home/shesj/workspace/Data/PLM/roberta_(e20_2021.05.02_01.16)_release.pt'
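Because of how the code is written, it insists on reading the training set in order to set up the optimizer. I only want to test, so none of that is needed.
You can simply comment out the get_optimizer call in __init__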
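A less invasive alternative to commenting the call out is to guard it with a flag. This is only a sketch of the idea on a toy class: TinyModel, build_optimizer and _load_train_docs are hypothetical names, and only get_optimizer comes from the notes above. The real repository's constructor will look different.

```python
class TinyModel:
    """Hypothetical stand-in for the repository's model class."""

    def __init__(self, build_optimizer: bool = True):
        self.optimizer = None
        if build_optimizer:
            # Training path only: this is the step that needs the train set.
            self.optimizer = self.get_optimizer()

    def get_optimizer(self):
        # The schedule length depends on the number of training documents.
        n_docs = len(self._load_train_docs())
        return {"total_steps": n_docs}

    def _load_train_docs(self):
        # Simulates a machine where the train set is not available.
        raise FileNotFoundError("training set not available")


# Prediction-only use: the train set is never touched.
model = TinyModel(build_optimizer=False)
print(model.optimizer)
```

Keeping the default at True leaves the training path unchanged, so only the prediction script needs to pass the flag.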