TF <--> Torch

Converting PyTorch convolution layer weights to TensorFlow

As mentioned above, a PyTorch convolution layer stores its kernel weights as [kernel_number, kernel_channel, kernel_height, kernel_width], while a TensorFlow convolution layer stores them as [kernel_height, kernel_width, kernel_channel, kernel_number]. If the convolution layer uses a bias, the bias weights need no processing: a convolution bias has only one dimension, so PyTorch and TensorFlow store it in the same format (the tests below also confirm this). The code below:

  • creates a convolution layer with PyTorch and with TensorFlow's Keras module
  • reads the kernel weight and bias weight from the PyTorch convolution layer
  • transposes the kernel weight with numpy
  • loads the converted weights into the TensorFlow convolution layer
  • feeds the previously created data through the PyTorch and TensorFlow convolution layers in a forward pass
  • transposes the PyTorch result with numpy (so its tensor layout matches the TensorFlow output)
  • checks whether the two outputs agree
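
First, a quick standalone sanity check of the layout difference (a minimal sketch with an arbitrary 3x3 kernel; the full test function follows):

import numpy as np
import torch

w_torch = torch.randn(32, 3, 3, 3)                  # [kernel_number, kernel_channel, kernel_height, kernel_width]
w_tf = np.transpose(w_torch.numpy(), (2, 3, 1, 0))  # -> [kernel_height, kernel_width, kernel_channel, kernel_number]
print(w_tf.shape)                                   # (3, 3, 3, 32)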
def conv_test(torch_image, tf_image):
    """
    Check that the PyTorch conv layer and the TensorFlow conv layer produce the same output after weight conversion
    :param torch_image:
    :param tf_image:
    :return:
    """
    # create the pytorch conv layer
    torch_conv = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, padding=1)
    # [kernel_number, kernel_channel, kernel_height, kernel_width]
    # conv layer weights
    torch_conv_weight = torch_conv.weight
    # conv layer bias
    torch_conv_bias = torch_conv.bias

    # create the tensorflow conv layer
    tf_conv = tf.keras.layers.Conv2D(filters=32, kernel_size=3, padding='same')
    tf_conv.build([1, 5, 5, 3])
    # convert the pytorch conv weights and load them into the tf conv layer
    # to [kernel_height, kernel_width, kernel_channel, kernel_number]
    value = np.transpose(torch_conv_weight.detach().numpy(), (2, 3, 1, 0)).astype(np.float32)
    tf_conv.set_weights([value, torch_conv_bias.detach().numpy()])

    # compute the pytorch conv output
    # [B, C, H, W]
    v1 = torch_conv(torch_image).detach().numpy()
    v1 = np.squeeze(v1, axis=0)
    # [H, W, C]
    v1 = np.transpose(v1, (1, 2, 0))

    # compute the tensorflow conv output
    # [B, H, W, C]
    v2 = tf_conv(tf_image).numpy()
    # [H, W, C]
    v2 = np.squeeze(v2, axis=0)

    # check that the pytorch and tensorflow outputs agree
    np.testing.assert_allclose(v1, v2, rtol=1e-03, atol=1e-05)
    print("convolution layer test is great!")

Converting PyTorch depthwise convolution layer weights to TensorFlow

In a PyTorch dw (depthwise) convolution layer, the dw kernel weights are stored as [kernel_number, kernel_channel, kernel_height, kernel_width], but in a TensorFlow dw convolution layer they are stored as [kernel_height, kernel_width, kernel_number, kernel_channel] (note that the last two dimensions differ from the regular convolution case). As before, if the dw convolution uses a bias, the dw bias weights need no processing.
The code below:

  • creates a dw convolution layer with PyTorch and with TensorFlow's Keras module
  • reads the dw kernel weight and dw bias weight from the PyTorch dw convolution layer
  • transposes the dw kernel weight with numpy
  • loads the converted weights into the TensorFlow dw convolution layer
  • feeds the previously created data through the PyTorch and TensorFlow dw convolution layers in a forward pass
  • transposes the PyTorch result with numpy (so its tensor layout matches the TensorFlow output)
  • checks whether the two outputs agree

def dw_conv_test(torch_image, tf_image):
    """
    Check that the PyTorch dw conv layer and the TensorFlow dw conv layer produce the same output after weight conversion
    :param torch_image:
    :param tf_image:
    :return:
    """
    # create the pytorch dw conv layer
    torch_conv = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=3, padding=1, groups=3)
    # [kernel_number, kernel_channel, kernel_height, kernel_width]
    # dw conv layer weights
    torch_conv_weight = torch_conv.weight
    # dw conv layer bias
    torch_conv_bias = torch_conv.bias

    # create the tensorflow dw conv layer
    tf_conv = tf.keras.layers.DepthwiseConv2D(kernel_size=3, padding='same')
    tf_conv.build([1, 5, 5, 3])
    # convert the pytorch dw conv weights and load them into the tf dw conv layer
    # to [kernel_height, kernel_width, kernel_number, kernel_channel]
    value = np.transpose(torch_conv_weight.detach().numpy(), (2, 3, 0, 1)).astype(np.float32)
    tf_conv.set_weights([value, torch_conv_bias.detach().numpy()])

    # compute the pytorch dw conv output
    # [B, C, H, W]
    v1 = torch_conv(torch_image).detach().numpy()
    v1 = np.squeeze(v1, axis=0)
    # [H, W, C]
    v1 = np.transpose(v1, (1, 2, 0))

    # compute the tensorflow dw conv output
    # [B, H, W, C]
    v2 = tf_conv(tf_image).numpy()
    # [H, W, C]
    v2 = np.squeeze(v2, axis=0)

    # check that the pytorch and tensorflow outputs agree
    np.testing.assert_allclose(v1, v2, rtol=1e-03, atol=1e-05)
    print("depthwise convolution layer test is great!")

Converting PyTorch BN layer weights to TensorFlow

BatchNorm involves four parameters: gamma, beta, mean, and var. Since all four are one-dimensional, it is enough to match up the corresponding weight names; no data transformation is needed.
In PyTorch the four parameters are named weight, bias, running_mean, and running_var.
In TensorFlow they are gamma, beta, moving_mean, and moving_variance.
The code below:

  • creates a bn layer with PyTorch and with TensorFlow's Keras module (note: epsilon must be the same in both)
  • randomly initializes the PyTorch bn layer's parameters (by default weight is initialized to all ones and bias to all zeros)
  • reads weight, bias, running_mean, and running_var from the randomly initialized PyTorch bn layer
  • loads the corresponding weights into the TensorFlow bn layer
  • feeds the previously created data through the PyTorch and TensorFlow bn layers in a forward pass
  • transposes the PyTorch result with numpy (so its tensor layout matches the TensorFlow output)
  • checks whether the two outputs agree
def bn_test(torch_image, tf_image):
    """
    Check that the PyTorch bn layer and the TensorFlow bn layer produce the same output after weight conversion
    :param torch_image:
    :param tf_image:
    :return:
    """
    # create the pytorch bn layer
    torch_bn = nn.BatchNorm2d(num_features=3, eps=1e-5)
    # randomly initialize the bn parameters
    nn.init.uniform_(torch_bn.weight, a=1, b=5)
    nn.init.uniform_(torch_bn.bias, a=0.05, b=0.1)
    nn.init.uniform_(torch_bn.running_mean, a=0.05, b=0.1)
    nn.init.uniform_(torch_bn.running_var, a=1, b=5)
    # bn weight (gamma)
    torch_bn_weight = torch_bn.weight
    # bn bias (beta)
    torch_bn_bias = torch_bn.bias
    # bn running_mean
    torch_bn_mean = torch_bn.running_mean
    # bn running_var
    torch_bn_var = torch_bn.running_var

    # create the tensorflow bn layer
    tf_bn = tf.keras.layers.BatchNormalization(epsilon=1e-5)
    tf_bn.build([1, 5, 5, 3])
    # load the pytorch bn weights into the tf bn layer
    tf_bn.set_weights([torch_bn_weight.detach().numpy(),
                       torch_bn_bias.detach().numpy(),
                       torch_bn_mean.detach().numpy(),
                       torch_bn_var.detach().numpy()])

    # compute the pytorch bn output
    # [B, C, H, W]
    torch_bn.eval()
    v1 = torch_bn(torch_image).detach().numpy()
    v1 = np.squeeze(v1, axis=0)
    # [H, W, C]
    v1 = np.transpose(v1, (1, 2, 0))

    # compute the tensorflow bn output
    # [B, H, W, C]
    v2 = tf_bn(tf_image, training=False).numpy()
    # [H, W, C]
    v2 = np.squeeze(v2, axis=0)

    # check that the pytorch and tensorflow outputs agree
    np.testing.assert_allclose(v1, v2, rtol=1e-03, atol=1e-04)
    print("bn layer test is great!")

Converting PyTorch fully connected layer weights to TensorFlow

A fully connected layer involves two parameters: the number of input units and the number of output units. Only the fc weight needs to be converted; the fc bias needs no processing. The code below:

  • applies global average pooling to the input feature map over the height and width dimensions
  • creates an fc layer with PyTorch and with TensorFlow's Keras module
  • reads the fc weight and fc bias from the PyTorch fc layer
  • transposes the fc weight with numpy
  • loads the converted weights into the TensorFlow fc layer
  • feeds the pooled data through the PyTorch and TensorFlow fc layers in a forward pass
  • checks whether the two outputs agree
def fc_test(torch_image, tf_image):
    """
    Check that the PyTorch fc layer and the TensorFlow fc layer produce the same output after weight conversion
    :param torch_image:
    :param tf_image:
    :return:
    """
    # global average pool over the height and width dims
    torch_image = torch.mean(torch_image, dim=[2, 3])
    tf_image = np.mean(tf_image, axis=(1, 2))

    # create the pytorch fc layer
    torch_fc = nn.Linear(in_features=3, out_features=5)
    # [output_units, input_units]
    # fc layer weights
    torch_fc_weight = torch_fc.weight
    # fc layer bias
    torch_fc_bias = torch_fc.bias

    # create the tensorflow fc layer
    tf_fc = tf.keras.layers.Dense(units=5)
    tf_fc.build([1, 3])
    # convert the pytorch fc weights and load them into the tf fc layer
    # to [input_units, output_units]
    value = np.transpose(torch_fc_weight.detach().numpy(), (1, 0)).astype(np.float32)
    tf_fc.set_weights([value, torch_fc_bias.detach().numpy()])

    # compute the pytorch fc output
    # [B, C]
    v1 = torch_fc(torch_image).detach().numpy()
    v1 = np.squeeze(v1, axis=0)

    # compute the tensorflow fc output
    # [B, C]
    v2 = tf_fc(tf_image).numpy()
    v2 = np.squeeze(v2, axis=0)

    # check that the pytorch and tensorflow outputs agree
    np.testing.assert_allclose(v1, v2, rtol=1e-03, atol=1e-05)
    print("fc layer test is great!")

Full code

import tensorflow as tf
import torch
from torch import nn
import numpy as np


def conv_test(torch_image, tf_image):
    """
    Check that the PyTorch conv layer and the TensorFlow conv layer produce the same output after weight conversion
    :param torch_image:
    :param tf_image:
    :return:
    """
    # create the pytorch conv layer
    torch_conv = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, padding=1)
    # [kernel_number, kernel_channel, kernel_height, kernel_width]
    # conv layer weights
    torch_conv_weight = torch_conv.weight
    # conv layer bias
    torch_conv_bias = torch_conv.bias

    # create the tensorflow conv layer
    tf_conv = tf.keras.layers.Conv2D(filters=32, kernel_size=3, padding='same')
    tf_conv.build([1, 5, 5, 3])
    # convert the pytorch conv weights and load them into the tf conv layer
    # to [kernel_height, kernel_width, kernel_channel, kernel_number]
    value = np.transpose(torch_conv_weight.detach().numpy(), (2, 3, 1, 0)).astype(np.float32)
    tf_conv.set_weights([value, torch_conv_bias.detach().numpy()])

    # compute the pytorch conv output
    # [B, C, H, W]
    v1 = torch_conv(torch_image).detach().numpy()
    v1 = np.squeeze(v1, axis=0)
    # [H, W, C]
    v1 = np.transpose(v1, (1, 2, 0))

    # compute the tensorflow conv output
    # [B, H, W, C]
    v2 = tf_conv(tf_image).numpy()
    # [H, W, C]
    v2 = np.squeeze(v2, axis=0)

    # check that the pytorch and tensorflow outputs agree
    np.testing.assert_allclose(v1, v2, rtol=1e-03, atol=1e-05)
    print("convolution layer test is great!")


def dw_conv_test(torch_image, tf_image):
    """
    Check that the PyTorch dw conv layer and the TensorFlow dw conv layer produce the same output after weight conversion
    :param torch_image:
    :param tf_image:
    :return:
    """
    # create the pytorch dw conv layer
    torch_conv = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=3, padding=1, groups=3)
    # [kernel_number, kernel_channel, kernel_height, kernel_width]
    # dw conv layer weights
    torch_conv_weight = torch_conv.weight
    # dw conv layer bias
    torch_conv_bias = torch_conv.bias

    # create the tensorflow dw conv layer
    tf_conv = tf.keras.layers.DepthwiseConv2D(kernel_size=3, padding='same')
    tf_conv.build([1, 5, 5, 3])
    # convert the pytorch dw conv weights and load them into the tf dw conv layer
    # to [kernel_height, kernel_width, kernel_number, kernel_channel]
    value = np.transpose(torch_conv_weight.detach().numpy(), (2, 3, 0, 1)).astype(np.float32)
    tf_conv.set_weights([value, torch_conv_bias.detach().numpy()])

    # compute the pytorch dw conv output
    # [B, C, H, W]
    v1 = torch_conv(torch_image).detach().numpy()
    v1 = np.squeeze(v1, axis=0)
    # [H, W, C]
    v1 = np.transpose(v1, (1, 2, 0))

    # compute the tensorflow dw conv output
    # [B, H, W, C]
    v2 = tf_conv(tf_image).numpy()
    # [H, W, C]
    v2 = np.squeeze(v2, axis=0)

    # check that the pytorch and tensorflow outputs agree
    np.testing.assert_allclose(v1, v2, rtol=1e-03, atol=1e-05)
    print("depthwise convolution layer test is great!")


def bn_test(torch_image, tf_image):
    """
    Check that the PyTorch bn layer and the TensorFlow bn layer produce the same output after weight conversion
    :param torch_image:
    :param tf_image:
    :return:
    """
    # create the pytorch bn layer
    torch_bn = nn.BatchNorm2d(num_features=3, eps=1e-5)
    # randomly initialize the bn parameters
    nn.init.uniform_(torch_bn.weight, a=1, b=5)
    nn.init.uniform_(torch_bn.bias, a=0.05, b=0.1)
    nn.init.uniform_(torch_bn.running_mean, a=0.05, b=0.1)
    nn.init.uniform_(torch_bn.running_var, a=1, b=5)
    # bn weight (gamma)
    torch_bn_weight = torch_bn.weight
    # bn bias (beta)
    torch_bn_bias = torch_bn.bias
    # bn running_mean
    torch_bn_mean = torch_bn.running_mean
    # bn running_var
    torch_bn_var = torch_bn.running_var

    # create the tensorflow bn layer
    tf_bn = tf.keras.layers.BatchNormalization(epsilon=1e-5)
    tf_bn.build([1, 5, 5, 3])
    # load the pytorch bn weights into the tf bn layer
    tf_bn.set_weights([torch_bn_weight.detach().numpy(),
                       torch_bn_bias.detach().numpy(),
                       torch_bn_mean.detach().numpy(),
                       torch_bn_var.detach().numpy()])

    # compute the pytorch bn output
    # [B, C, H, W]
    torch_bn.eval()
    v1 = torch_bn(torch_image).detach().numpy()
    v1 = np.squeeze(v1, axis=0)
    # [H, W, C]
    v1 = np.transpose(v1, (1, 2, 0))

    # compute the tensorflow bn output
    # [B, H, W, C]
    v2 = tf_bn(tf_image, training=False).numpy()
    # [H, W, C]
    v2 = np.squeeze(v2, axis=0)

    # check that the pytorch and tensorflow outputs agree
    np.testing.assert_allclose(v1, v2, rtol=1e-03, atol=1e-04)
    print("bn layer test is great!")


def fc_test(torch_image, tf_image):
    """
    Check that the PyTorch fc layer and the TensorFlow fc layer produce the same output after weight conversion
    :param torch_image:
    :param tf_image:
    :return:
    """
    # global average pool over the height and width dims
    torch_image = torch.mean(torch_image, dim=[2, 3])
    tf_image = np.mean(tf_image, axis=(1, 2))

    # create the pytorch fc layer
    torch_fc = nn.Linear(in_features=3, out_features=5)
    # [output_units, input_units]
    # fc layer weights
    torch_fc_weight = torch_fc.weight
    # fc layer bias
    torch_fc_bias = torch_fc.bias

    # create the tensorflow fc layer
    tf_fc = tf.keras.layers.Dense(units=5)
    tf_fc.build([1, 3])
    # convert the pytorch fc weights and load them into the tf fc layer
    # to [input_units, output_units]
    value = np.transpose(torch_fc_weight.detach().numpy(), (1, 0)).astype(np.float32)
    tf_fc.set_weights([value, torch_fc_bias.detach().numpy()])

    # compute the pytorch fc output
    # [B, C]
    v1 = torch_fc(torch_image).detach().numpy()
    v1 = np.squeeze(v1, axis=0)

    # compute the tensorflow fc output
    # [B, C]
    v2 = tf_fc(tf_image).numpy()
    v2 = np.squeeze(v2, axis=0)

    # check that the pytorch and tensorflow outputs agree
    np.testing.assert_allclose(v1, v2, rtol=1e-03, atol=1e-05)
    print("fc layer test is great!")


def main():
    image = np.random.rand(5, 5, 3)
    torch_image = np.transpose(image, (2, 0, 1)).astype(np.float32)
    # [B, C, H, W]
    torch_image = torch.unsqueeze(torch.as_tensor(torch_image), dim=0)
    # [B, H, W, C]
    tf_image = np.expand_dims(image, axis=0)

    conv_test(torch_image, tf_image)
    dw_conv_test(torch_image, tf_image)
    bn_test(torch_image, tf_image)
    fc_test(torch_image, tf_image)


if __name__ == '__main__':
    main()
