Using Transformers

Using Hugging Face's transformers library locally is fairly troublesome, and I will probably run into this again, so here is a record.

Downloading the model

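If the code calls something like from_pretrained(''), a pretrained model has to be loaded. Unfortunately, the official download channel is slow, so the files need to be downloaded offline. Taking BERT as an example, the steps are: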

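Follow the link and find where the models are listed.
Find the specific model you need and open its page.
Below the model, find "List all files in model".
Download 'config.json', 'pytorch_model.bin' and 'vocab.txt' from that list into the same folder.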
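
The manual steps above could also be scripted. The following is only a minimal sketch: the repository id "bert-base-uncased" and the "resolve/main" URL pattern on huggingface.co are assumptions that do not come from this note, and the helper names are made up for illustration. It fetches the same three files into one folder and skips any that are already present.

```python
from pathlib import Path
from typing import List
from urllib.request import urlretrieve

# Assumption: this repository id is illustrative, it is not named in the note.
REPO_ID = "bert-base-uncased"
FILES = ("config.json", "pytorch_model.bin", "vocab.txt")


def file_url(repo_id: str, filename: str) -> str:
    """Build a direct-download URL for one file (assumed URL pattern)."""
    return f"https://huggingface.co/{repo_id}/resolve/main/{filename}"


def download_model_files(repo_id: str, target_dir: str) -> List[Path]:
    """Download every file in FILES into target_dir and return the local paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for name in FILES:
        destination = target / name
        if not destination.exists():  # skip files that were already downloaded
            urlretrieve(file_url(repo_id, name), str(destination))
        saved.append(destination)
    return saved
```

Calling download_model_files(REPO_ID, "bert-uncased") would then produce the folder used in the next section. This does not make a slow connection faster, it only removes the clicking.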

Loading the model

Before loading, place the downloaded files together in one folder.

- bert-uncased
    config.json
    pytorch_model.bin
    vocab.txt
train.py
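
Before loading, it may be worth checking that the folder really contains the three files. The helper below is a small sketch with a hypothetical name (missing_model_files). It only checks that the files exist, not that their contents are valid.

```python
from pathlib import Path
from typing import List

# The three files this note downloads for BERT.
REQUIRED_FILES = ("config.json", "pytorch_model.bin", "vocab.txt")


def missing_model_files(model_dir: str) -> List[str]:
    """Return the names of required files that are not present in model_dir."""
    folder = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]
```

An empty list means all three files are in place. Any names returned still need to be downloaded.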

The code in train.py:

from transformers import BertTokenizer, BertModel

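# Absolute path to the folder holding config.json, pytorch_model.bin and vocab.txt
path = 'D:/LAB/LAB-last/lex-dis/cont-cont/bert-uncased'
# Load the vocabulary and the model weights from the local folder
tokenizer = BertTokenizer.from_pretrained(path)
model = BertModel.from_pretrained(path)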

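That completes model loading. For some reason a relative path did not work here, so the full absolute path is needed. One plausible cause is that a relative path is resolved against the current working directory rather than against the folder containing train.py.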
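
If the working directory is indeed the cause (an assumption, it was not verified here), one workaround is to build the absolute path from the location of the script instead of hard-coding it. The function name resolve_model_dir is hypothetical.

```python
import os


def resolve_model_dir(relative_dir: str, anchor_file: str) -> str:
    """Return an absolute path to relative_dir, resolved against the
    folder that contains anchor_file instead of the working directory."""
    base = os.path.dirname(os.path.abspath(anchor_file))
    return os.path.normpath(os.path.join(base, relative_dir))
```

In train.py this would be used as path = resolve_model_dir('bert-uncased', __file__), and the result passed to from_pretrained as before.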

Using the model

How to use the tokenizer and the model is set aside for now.