Having built a fair number of models by now, I have settled on a reasonably mature workflow. This post briefly records the common steps of model building, as a reference for later improvement.

Folder structure

- model
    dataset.py
- data
- images
- ori_data
- preprocess
- utils
train.py
config.py
README.md

config.py stores the hyperparameters used by train.py and by preprocessing, but it is not recommended to keep the model's own hyperparameters there (unless tuning is completely finished). utils holds helper functions that support preprocessing, data analysis, and similar tasks. data stores the raw data and the processed data, and sometimes intermediate results as well. images keeps the figures generated for reports.
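Since config.py only keeps training and preprocessing hyperparameters, a minimal sketch could look like the following. The names batch_sz, epochs, normal_path and model_path mirror what train.py uses later; lr, img_size and the concrete paths are purely illustrative assumptions.

# config.py -- hyperparameters for training and preprocessing only;
# model hyperparameters stay with the model code until tuning is finished.
batch_sz = 32                          # training batch size
epochs = 50                            # number of training epochs
lr = 1e-3                              # learning rate (illustrative)
img_size = 224                         # input resolution after preprocessing (illustrative)
normal_path = 'data/preprocessed.pkl'  # path to the preprocessed dataset (assumed)
model_path = 'data/model.pth'          # where train.py saves checkpoints (assumed)
transform_train = None                 # torchvision transforms, filled in per task
transform_valid = None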

The following code generates the structure above.

import os

def mkdir(paths):
    # create each directory if it does not already exist
    for path in paths:
        if not os.path.exists(path):
            os.makedirs(path)

mkdir(['model', 'data', 'preprocess', 'utils'])

# create empty placeholder files
with open('train.py', 'w') as f:
    pass
with open('config.py', 'w') as f:
    pass
with open('README.md', 'w') as f:
    pass

Import

Commonly used import statements.

from sklearn.metrics import classification_report
from sklearn.metrics import precision_score
from torch.utils import data
import torchvision.transforms as tfs
import torch.nn as nn
import pickle as pk
import numpy as np
import argparse
import torch
import os

Model

Custom model

A common model skeleton:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class NaiveModel(nn.Module):
    def __init__(self):
        super(NaiveModel, self).__init__()
        # define layers here
        pass

    def forward(self, x):
        # define the forward computation here
        return x
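For concreteness, here is one hypothetical way the empty skeleton could be filled in, as a small single-channel CNN classifier; the layer sizes and the 224x224 input assumption are mine, not from the original skeleton.

class SmallConvNet(nn.Module):
    def __init__(self, n_class=2):
        super(SmallConvNet, self).__init__()
        # two conv blocks followed by a linear classifier (illustrative sizes)
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # assumes a 1 x 224 x 224 input, so the feature map is 32 x 56 x 56
        self.classifier = nn.Linear(32 * 56 * 56, n_class)

    def forward(self, x):
        x = self.features(x)
        x = x.flatten(1)
        return self.classifier(x)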

Pretrained models

We often build on pretrained models. There are two common ways to modify them: replace some layers of the original model, or pull out parts of the model and combine them with other networks of your own.

Replacing layers

Take vgg16 as an example of layer replacement. The target layer can be accessed through model.features._modules[], where the key is the index shown in the output of print(model).

class VGG(nn.Module):
    def __init__(self):
        super(VGG, self).__init__()
        model = torchvision.models.vgg16(pretrained=False)
        # load locally downloaded pretrained weights
        init = torch.load('data/vgg16-397923af.pth')
        model.load_state_dict(init)
        # replace the first conv layer with a single-channel version
        conv2d = nn.Conv2d(1, 64, kernel_size=5, stride=(2, 2), padding=(3, 3), bias=False)
        model.features._modules['0'] = conv2d
        # replace the classifier head with a 20-class output
        model.classifier = nn.Sequential(
            nn.Linear(25088, 20),
            nn.LogSoftmax(dim=1)
        )
        self.model = model

    def forward(self, x):
        x = x.reshape(x.shape[0], 1, 128, 128).float()
        x = self.model(x)
        return x

The pretrained weights can also be fetched with pretrained=True, but the download is slow; instead you can copy the URL, download the file offline, and load it with the method shown above.
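To find which index string to replace, I usually just print the model: the keys of model.features._modules are exactly the indices shown by print(model). A quick inspection snippet (not part of the final model):

import torchvision

model = torchvision.models.vgg16(pretrained=False)
print(model)                                  # shows features (indices 0-30), avgpool, classifier
print(list(model.features._modules.keys()))   # ['0', '1', ..., '30']
print(model.features._modules['0'])           # Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))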

Recombining layers


Below is a UNet that uses resnet18 as the downsampling path: the ResNet layers are pulled out individually and reassembled into the final model, and the pretrained parameters are reused.

class Unet(nn.Module):
    def __init__(self, n_class):
        super().__init__()
        self.base_model = torchvision.models.resnet18(pretrained=True)
        self.base_layers = list(self.base_model.children())
        # encoder: reuse the ResNet stages, swapping the first conv
        # for a single-channel version
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False),
            self.base_layers[1],
            self.base_layers[2])
        self.layer2 = nn.Sequential(*self.base_layers[3:5])
        self.layer3 = self.base_layers[5]
        self.layer4 = self.base_layers[6]
        self.layer5 = self.base_layers[7]
        # decoder: Decoder is a user-defined upsampling block (definition not shown here)
        self.decode4 = Decoder(512, 256 + 256, 256)
        self.decode3 = Decoder(256, 256 + 128, 256)
        self.decode2 = Decoder(256, 128 + 64, 128)
        self.decode1 = Decoder(128, 64 + 64, 64)
        self.decode0 = nn.Sequential(
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True),
            nn.Conv2d(64, 32, kernel_size=3, padding=1, bias=False),
            nn.Conv2d(32, 64, kernel_size=3, padding=1, bias=False)
        )
        self.conv_last = nn.Conv2d(64, n_class, 1)
        self.fc = nn.Sequential(
            nn.Linear(n_class * 224 * 224, 1024),
            nn.ReLU(),
            nn.Dropout(0.4),
            nn.Linear(1024, 2),
        )

    def forward(self, x):
        # encoder/decoder wiring omitted here
        return x
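To see how the base_layers indices map onto the ResNet stages (why base_layers[3:5] is the maxpool plus the first residual stage and base_layers[7] is the last one), it helps to list resnet18's top-level children. This is only an inspection snippet:

import torchvision

resnet = torchvision.models.resnet18(pretrained=False)
# top-level children, in order:
# 0 conv1, 1 bn1, 2 relu, 3 maxpool, 4 layer1, 5 layer2, 6 layer3, 7 layer4, 8 avgpool, 9 fc
for i, child in enumerate(resnet.children()):
    print(i, child.__class__.__name__)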

Train

Below is a simplified training framework.

from sklearn.metrics import precision_score
from sklearn.metrics import classification_report
import pickle as pk
import numpy as np
import torchvision.transforms as tfs
import torch.nn as nn
import torch
import argparse
import os

from model.dataset import DataSet
from torch.utils import data
from model.naive_model import NaiveModel
# batch_sz, epochs, model_path, normal_path, transform_train and
# transform_valid are assumed to come from config.py
from config import *

def train(args):
    cuda = torch.cuda.is_available()
    if cuda:
        print("CUDA is prepared")

    # dataset
    trainset = DataSet('train', transform_train, normal_path)
    validset = DataSet('valid', transform_valid, normal_path)
    trainloader = data.DataLoader(trainset, batch_size=batch_sz, shuffle=True)
    validloader = data.DataLoader(validset, batch_size=batch_sz, shuffle=False)

    # model
    model = NaiveModel()
    if cuda:
        model = model.cuda()

    # loss function and optimizer
    lossfun = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters())

    accs = []
    for epoch in range(epochs):
        # validation
        model.eval()
        with torch.no_grad():
            preds, labels = [], []
            for idx, samples in enumerate(validloader):
                img, label = samples['img'], samples['label']
                if cuda:
                    img = img.cuda()
                pred = model(img)
                # keep the predicted class index for the report
                preds.append(pred.argmax(dim=1).cpu().numpy())
                labels.append(label.numpy())
        torch.save(model.state_dict(), model_path)
        labels = np.concatenate(labels, axis=0).astype(int)
        preds = np.concatenate(preds, axis=0)
        report = classification_report(labels, preds)
        print(report)
        acc = precision_score(labels, preds, average='micro')
        print("Precision: {}".format(acc))
        accs.append(acc)

        # train
        model.train()
        for idx, samples in enumerate(trainloader):
            optimizer.zero_grad()
            imgs, labels = samples['img'], samples['label']
            if cuda:
                labels = labels.cuda()
                imgs = imgs.cuda()

            pred = model(imgs)
            loss = lossfun(pred, labels.long())
            loss.backward()
            optimizer.step()
            print('loss: {}'.format(loss.item()), end='\r')

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--pretrain', type=str, default='yes',
                        help='if pre-train')
    parser.add_argument('--finetune', type=str, default='yes',
                        help='if finetune')
    args = parser.parse_args()

    train(args)
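The DataSet class imported from model/dataset.py is not shown in this post; the sketch below is only a guess at its shape, inferred from how the training loop consumes it (each sample is a dict with 'img' and 'label'). The pickle-based storage and the argument names are my assumptions.

# model/dataset.py (hypothetical sketch)
import pickle as pk
from torch.utils import data

class DataSet(data.Dataset):
    def __init__(self, split, transform, data_path):
        # split: 'train' or 'valid'; data_path: preprocessed data file (assumed pickle)
        with open(data_path, 'rb') as f:
            dataset = pk.load(f)
        self.imgs = dataset[split]['imgs']
        self.labels = dataset[split]['labels']
        self.transform = transform

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        img = self.imgs[idx]
        if self.transform is not None:
            img = self.transform(img)
        return {'img': img, 'label': self.labels[idx]}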

Saving and loading the model

Save


torch.save(model.state_dict(), PATH)

# example
torch.save(model.state_dict(),'model.pth')

Load

model.load_state_dict(torch.load(PATH))

# example
model.load_state_dict(torch.load('model.pth'))
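Beyond the bare state_dict, I sometimes save a fuller checkpoint with the optimizer state and the epoch counter, and pass map_location when loading on a machine without a GPU. A sketch, not part of the original workflow:

# save a fuller checkpoint
torch.save({
    'epoch': epoch,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
}, 'checkpoint.pth')

# load it back, mapping GPU tensors to CPU if needed
checkpoint = torch.load('checkpoint.pth', map_location='cpu')
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
start_epoch = checkpoint['epoch'] + 1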

That is the most basic skeleton for the whole modeling workflow. In practice, the genuinely hard parts of a task are the data preprocessing strategy and the final hyperparameter tuning, which I will write up elsewhere.