Simon Shi's Site

Artificial intelligence, machine learning, reinforcement learning, large models, autonomous driving

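OpenVINO is Intel's toolkit for accelerating vision inference on Intel CPUs and GPUs. OpenCV 3.4.1 added an Intel Inference Engine backend (the Inference Engine is a component of OpenVINO) to accelerate model inference on Intel platforms. A newer OpenCV release (4.3.0) added the nGraph OpenVINO API (2020.03).
In May 2018 Intel released the OpenVINO (Open Visual Inferencing and Neural Network Optimization) toolkit. It aims to provide high-performance acceleration for neural-network-based vision inference tasks on Intel computing platforms, and supports Intel CPUs, GPUs, FPGAs, and Movidius compute sticks.

Original link: https://blog.csdn.net/weixin_39956356/article/details/107103244

OpenVINO implementation of Ultralytics YOLOv5 object detection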

libtorch_URLs

libtorch 1.0.0

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.0.0.zip
# cuda
https://download.pytorch.org/libtorch/cu80/libtorch-shared-with-deps-1.0.0.zip
https://download.pytorch.org/libtorch/cu90/libtorch-shared-with-deps-1.0.0.zip


libtorch 1.0.1

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.0.1.zip
# cuda
https://download.pytorch.org/libtorch/cu80/libtorch-shared-with-deps-1.0.1.zip
https://download.pytorch.org/libtorch/cu90/libtorch-shared-with-deps-1.0.1.zip
https://download.pytorch.org/libtorch/cu100/libtorch-shared-with-deps-1.0.1.zip



libtorch 1.1.0

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.1.0.zip
# cuda
https://download.pytorch.org/libtorch/cu90/libtorch-shared-with-deps-1.1.0.zip
https://download.pytorch.org/libtorch/cu100/libtorch-shared-with-deps-1.1.0.zip


libtorch 1.2.0

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.2.0.zip
# cuda
https://download.pytorch.org/libtorch/cu92/libtorch-shared-with-deps-1.2.0.zip
https://download.pytorch.org/libtorch/cu100/libtorch-shared-with-deps-1.2.0.zip



libtorch 1.3.0

Starting with this version, the official builds come in two variants: Pre-cxx11 ABI and cxx11 ABI.
Pre-cxx11 ABI

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.3.0%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu92/libtorch-shared-with-deps-1.3.0%2Bcu92.zip
https://download.pytorch.org/libtorch/cu100/libtorch-shared-with-deps-1.3.0.zip
https://download.pytorch.org/libtorch/cu101/libtorch-shared-with-deps-1.3.0.zip


cxx11 ABI

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.3.0%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu92/libtorch-cxx11-abi-shared-with-deps-1.3.0%2Bcu92.zip
https://download.pytorch.org/libtorch/cu100/libtorch-cxx11-abi-shared-with-deps-1.3.0.zip
https://download.pytorch.org/libtorch/cu101/libtorch-cxx11-abi-shared-with-deps-1.3.0.zip


libtorch 1.3.1
Pre-cxx11 ABI

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.3.1%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu92/libtorch-shared-with-deps-1.3.1%2Bcu92.zip
https://download.pytorch.org/libtorch/cu100/libtorch-shared-with-deps-1.3.1%2Bcu100.zip
https://download.pytorch.org/libtorch/cu101/libtorch-shared-with-deps-1.3.1.zip

cxx11 ABI

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.3.1%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu92/libtorch-cxx11-abi-shared-with-deps-1.3.1%2Bcu92.zip
https://download.pytorch.org/libtorch/cu100/libtorch-cxx11-abi-shared-with-deps-1.3.1%2Bcu100.zip
https://download.pytorch.org/libtorch/cu101/libtorch-cxx11-abi-shared-with-deps-1.3.1.zip


libtorch 1.4.0
Pre-cxx11 ABI

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.4.0%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu92/libtorch-shared-with-deps-1.4.0%2Bcu92.zip
https://download.pytorch.org/libtorch/cu100/libtorch-shared-with-deps-1.4.0%2Bcu100.zip
https://download.pytorch.org/libtorch/cu101/libtorch-shared-with-deps-1.4.0.zip


cxx11 ABI

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.4.0%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu92/libtorch-cxx11-abi-shared-with-deps-1.4.0%2Bcu92.zip
https://download.pytorch.org/libtorch/cu100/libtorch-cxx11-abi-shared-with-deps-1.4.0%2Bcu100.zip
https://download.pytorch.org/libtorch/cu101/libtorch-cxx11-abi-shared-with-deps-1.4.0.zip

libtorch 1.5.0
Pre-cxx11 ABI

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.5.0%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu92/libtorch-shared-with-deps-1.5.0%2Bcu92.zip
https://download.pytorch.org/libtorch/cu101/libtorch-shared-with-deps-1.5.0.zip
https://download.pytorch.org/libtorch/cu102/libtorch-shared-with-deps-1.5.0.zip

cxx11 ABI

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.5.0%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu92/libtorch-cxx11-abi-shared-with-deps-1.5.0%2Bcu92.zip
https://download.pytorch.org/libtorch/cu101/libtorch-cxx11-abi-shared-with-deps-1.5.0.zip
https://download.pytorch.org/libtorch/cu102/libtorch-cxx11-abi-shared-with-deps-1.5.0.zip


libtorch 1.5.1
cxx11 ABI
https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.5.1%2Bcpu.zip
https://download.pytorch.org/libtorch/cu92/libtorch-cxx11-abi-shared-with-deps-1.5.1%2Bcu92.zip
https://download.pytorch.org/libtorch/cu101/libtorch-cxx11-abi-shared-with-deps-1.5.1%2Bcu101.zip
https://download.pytorch.org/libtorch/cu102/libtorch-cxx11-abi-shared-with-deps-1.5.1.zip

libtorch 1.6.0
Pre-cxx11 ABI

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.6.0%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu92/libtorch-shared-with-deps-1.6.0%2Bcu92.zip
https://download.pytorch.org/libtorch/cu101/libtorch-shared-with-deps-1.6.0%2Bcu101.zip
https://download.pytorch.org/libtorch/cu102/libtorch-shared-with-deps-1.6.0.zip


cxx11 ABI

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.6.0%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu92/libtorch-cxx11-abi-shared-with-deps-1.6.0%2Bcu92.zip
https://download.pytorch.org/libtorch/cu101/libtorch-cxx11-abi-shared-with-deps-1.6.0%2Bcu101.zip
https://download.pytorch.org/libtorch/cu102/libtorch-cxx11-abi-shared-with-deps-1.6.0.zip


libtorch 1.7.0
Pre-cxx11 ABI

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.7.0%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu92/libtorch-shared-with-deps-1.7.0%2Bcu92.zip
https://download.pytorch.org/libtorch/cu101/libtorch-shared-with-deps-1.7.0%2Bcu101.zip
https://download.pytorch.org/libtorch/cu102/libtorch-shared-with-deps-1.7.0.zip
https://download.pytorch.org/libtorch/cu110/libtorch-shared-with-deps-1.7.0%2Bcu110.zip


cxx11 ABI

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.7.0%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu92/libtorch-cxx11-abi-shared-with-deps-1.7.0%2Bcu92.zip
https://download.pytorch.org/libtorch/cu101/libtorch-cxx11-abi-shared-with-deps-1.7.0%2Bcu101.zip
https://download.pytorch.org/libtorch/cu102/libtorch-cxx11-abi-shared-with-deps-1.7.0.zip
https://download.pytorch.org/libtorch/cu110/libtorch-cxx11-abi-shared-with-deps-1.7.0%2Bcu110.zip


libtorch 1.7.1
Pre-cxx11 ABI

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.7.1%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu92/libtorch-shared-with-deps-1.7.1%2Bcu92.zip
https://download.pytorch.org/libtorch/cu101/libtorch-shared-with-deps-1.7.1%2Bcu101.zip
https://download.pytorch.org/libtorch/cu102/libtorch-shared-with-deps-1.7.1.zip
https://download.pytorch.org/libtorch/cu110/libtorch-shared-with-deps-1.7.1%2Bcu110.zip

cxx11 ABI

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.7.1%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu92/libtorch-cxx11-abi-shared-with-deps-1.7.1%2Bcu92.zip
https://download.pytorch.org/libtorch/cu101/libtorch-cxx11-abi-shared-with-deps-1.7.1%2Bcu101.zip
https://download.pytorch.org/libtorch/cu102/libtorch-cxx11-abi-shared-with-deps-1.7.1.zip
https://download.pytorch.org/libtorch/cu110/libtorch-cxx11-abi-shared-with-deps-1.7.1%2Bcu110.zip


libtorch 1.8.0
Pre-cxx11 ABI
# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.8.0%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu102/libtorch-shared-with-deps-1.8.0.zip
https://download.pytorch.org/libtorch/cu111/libtorch-shared-with-deps-1.8.0%2Bcu111.zip

cxx11 ABI
# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.8.0%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu102/libtorch-cxx11-abi-shared-with-deps-1.8.0.zip
https://download.pytorch.org/libtorch/cu111/libtorch-cxx11-abi-shared-with-deps-1.8.0%2Bcu111.zip


libtorch 1.8.1 (LTS)
Pre-cxx11 ABI
# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.8.1%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu102/libtorch-shared-with-deps-1.8.1%2Bcu102.zip
https://download.pytorch.org/libtorch/cu111/libtorch-shared-with-deps-1.8.1%2Bcu111.zip

cxx11 ABI

# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.8.1%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu102/libtorch-cxx11-abi-shared-with-deps-1.8.1%2Bcu102.zip
https://download.pytorch.org/libtorch/cu111/libtorch-cxx11-abi-shared-with-deps-1.8.1%2Bcu111.zip


libtorch 1.9.0
Pre-cxx11 ABI
# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.9.0%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu102/libtorch-shared-with-deps-1.9.0%2Bcu102.zip
https://download.pytorch.org/libtorch/cu111/libtorch-shared-with-deps-1.9.0%2Bcu111.zip
cxx11 ABI
# cpu
https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.9.0%2Bcpu.zip
# cuda
https://download.pytorch.org/libtorch/cu102/libtorch-cxx11-abi-shared-with-deps-1.9.0%2Bcu102.zip
https://download.pytorch.org/libtorch/cu111/libtorch-cxx11-abi-shared-with-deps-1.9.0%2Bcu111.zip
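
The URLs above follow a visible naming pattern, which can be sketched as a small helper. This is only an illustration inferred from the list: the function name `libtorch_url` and its `tagged` flag are hypothetical, and since some listed entries carry a `%2B<compute>` suffix while others do not, the caller has to say which form is wanted. A generated URL is not guaranteed to exist on the server.

```python
from urllib.parse import quote

BASE = "https://download.pytorch.org/libtorch"


def libtorch_url(version, compute="cpu", cxx11_abi=False, tagged=True):
    """Build a libtorch download URL that follows the pattern of the list above.

    tagged=False reproduces the entries whose file name has no "+<compute>" suffix.
    """
    abi = "cxx11-abi-" if cxx11_abi else ""
    # The "+" in the file name appears percent-encoded (%2B) in the listed URLs
    suffix = quote("+" + compute, safe="") if tagged else ""
    return f"{BASE}/{compute}/libtorch-{abi}shared-with-deps-{version}{suffix}.zip"


print(libtorch_url("1.9.0", "cu111", cxx11_abi=True))
print(libtorch_url("1.2.0", "cpu", tagged=False))
```

The two calls reproduce the 1.9.0 cxx11 ABI cu111 entry and the 1.2.0 cpu entry from the list.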

Source Build

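git clone --recursive https://github.com/pytorch/pytorch  # --recursive also fetches the submodules; some are hard to download, so you can first clone from a mirror: git clone https://github.com.cnpmjs.org/pytorch/pytorch
cd pytorch

The clone is at the latest version. Run git tag to list the release tags, then git checkout <tag_name> to switch to the one you want: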

git checkout v1.7.1-rc3

# if you are updating an existing checkout
git submodule sync
git submodule update --init --recursive

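Note: if the download is slow or fails, open the .gitmodules file in the pytorch directory. After switching to the tag, replace the URLs in it with those of a GitHub mirror, then run git submodule sync and the commands that follow it.

# Remove any global proxy settings
git config --global --unset http.proxy
git config --global --unset https.proxy

# Set a proxy for this repository only
git config --local http.proxy 127.0.0.1:11000
git config --local https.proxy 127.0.0.1:11000

# Remove the repository-local proxy again
git config --local --unset http.proxy
git config --local --unset https.proxy

# Clear the proxy environment variable in the current shell
unset https_proxy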

Official web

GitHub official

https://stackoverflow.com/questions/65379070/how-to-use-onnx-model-in-c-code-on-linux

pytorch to onnx to tensorRT

mx2onnx

MXNet model to the ONNX model format

onnx2mx

pytorch2onnx

pb2onnx

import tf2onnx
from tf2onnx import tf_loader


def convert_ckpt(checkpoint, inputs, outputs, out_path):
    # Load and freeze the graph from the checkpoint's .meta file
    graph_def, inputs, outputs = tf_loader.from_checkpoint(checkpoint, inputs, outputs)

    # Convert the frozen graph to ONNX and write it to out_path
    model_proto, external_tensor_storage = tf2onnx.convert.from_graph_def(
        graph_def,
        input_names=inputs,
        output_names=outputs,
        output_path=out_path,
    )

    print('---1----', model_proto)
    print('---2----', external_tensor_storage)


def demo1():
    inputs = ['X_in:0']
    # Tensor names must match the graph exactly (no stray spaces)
    outputs = ['softmax:0', 'out_argmax:0', 'out_put_k_indices:0']
    ck = r'model.ckpt-100000.meta'
    out_path = r'test.onnx'
    convert_ckpt(ck, inputs, outputs, out_path)

Calling models from C++

Common model conversion paths:

  • PyTorch -> ONNX -> TensorRT
  • PyTorch -> ONNX -> TVM
  • PyTorch -> conversion tool -> Caffe
  • PyTorch -> TorchScript (the C++ version of Torch) [the approach used here]
  • PyTorch -> JIT -> TensorRT

https://pytorch.org/cppdocs/api/library_root.html

https://pytorch.org/tutorials/advanced/cpp_frontend.html

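Our institute recently had a GUI project that needs to call a PyTorch model. I had done similar work before, but mostly through the Python interface. This time latency matters, so I built a C++ interface, and these notes record the configuration steps.

Preparation

Download and install the support library

First, download and install the LibTorch support library; LibPyTorch is recommended.

After downloading, simply unzip it: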
wget https://download.pytorch.org/libtorch/nightly/cpu/libtorch-shared-with-deps-latest.zip
unzip libtorch-shared-with-deps-latest.zip

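Starting from a trained Torch model

Tracing the original model

To load a PyTorch model in C++, first convert it to TorchScript with the JIT. The example here uses a plain resnet18, which can be created from the torchvision model zoo. We then build a dummy input and pass both to torch.jit.trace, which runs the model once and records the operations it executes.

import torch
import torchvision

# Instantiate the model
model = torchvision.models.resnet18()
# Dummy input
example = torch.rand(1, 3, 224, 224)
# Run the model once under the JIT to record its operations
traced_script_module = torch.jit.trace(model, example)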

Some models have control flow that depends on the input data, for example:

import torch


class MyModule(torch.nn.Module):
    def __init__(self, N, M):
        super(MyModule, self).__init__()
        self.weight = torch.nn.Parameter(torch.rand(N, M))

    def forward(self, input):
        if input.sum() > 0:
            output = self.weight.mv(input)
        else:
            output = self.weight + input
        return output

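Because the forward pass depends on the values of the input, call torch.jit.script to convert the module directly:

my_module = MyModule(10, 20)
traced_script_module2 = torch.jit.script(my_module)

The difference between the two is that tracing records only the operations executed for the example input, so a data-dependent branch is fixed to one path, while scripting compiles the module's code and keeps the control flow. Either way, the resulting module can be serialized by calling its save method:

# Save the TorchScript module produced by tracing
traced_script_module.save("traced_resnet_model.pt")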

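Loading the Torch model

With the saved .pt model in hand, calling it from C++ comes down to working with the LibTorch library. The official example is used here for illustration.

Create a new C++ project. The CMakeLists.txt can be configured along these lines:

cmake_minimum_required(VERSION 3.16)
project(torchcpp)
set(CMAKE_CXX_STANDARD 14) # C++ standard; set before the target is created so that it applies
set(Torch_DIR ./libtorch/share/cmake/Torch) # location of the Torch CMake package

find_package(Torch REQUIRED) # locate the LibTorch library
add_executable(torchcpp main.cpp) # main entry point of the project
target_link_libraries(torchcpp "${TORCH_LIBRARIES}") # link against the LibTorch shared libraries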

The matching C++ code for a simple load looks like this:

#include <torch/script.h> // One-stop header.
#include <iostream>
#include <memory>
#include <vector>

// Run a forward pass and return the output tensor
at::Tensor baseModel(std::vector<torch::jit::IValue> inputs, torch::jit::script::Module module) {
    at::Tensor output = module.forward(inputs).toTensor();
    return output;
}

int main(int argc, const char *argv[]) {
    if (argc != 2) {
        std::cerr << "usage: example-app <path-to-exported-script-module>\n";
        return -1;
    }
    torch::jit::script::Module module;
    try {
        // Deserialize the saved model with torch::jit::load().
        module = torch::jit::load(argv[1]);
    }
    catch (const c10::Error &e) {
        std::cerr << "error loading the model\n";
        return -1;
    }
    std::cout << "model loads ok\n";
    // Build a dummy input for testing
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(torch::ones({1, 3, 224, 224}));
    at::Tensor output = baseModel(inputs, module);
    // Print the first five output values
    std::cout << output.slice(1, 0, 5) << "\n";
    return 0;
}

We also create a build folder to hold the build files.

At this point the project layout is roughly:

├── build
├── CMakeLists.txt
└── main.cpp

Enter the build folder and run:

(base) ➜  cd build
(base) ➜ cmake ..
(base) ➜ cmake --build . --config Release

This produces output similar to:

(base) ➜  build cmake ..
-- The C compiler identification is GNU 9.3.0
-- The CXX compiler identification is GNU 9.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found version "10.2")
-- Caffe2: CUDA detected: 10.2
-- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda
-- Caffe2: Header version is: 10.2
-- Found CUDNN: /usr/local/cuda/lib64/libcudnn.so
-- Found cuDNN: v8.0.4 (include: /usr/local/cuda/include, library: /usr/local/cuda/lib64/libcudnn.so)
-- Autodetected CUDA architecture(s): 7.5
-- Added CUDA NVCC flags for: -gencode;arch=compute_75,code=sm_75
-- Found Torch: /media/hao/Data/Code/DL/torchcppsample/libtorch/lib/libtorch.so
-- Configuring done
-- Generating done
-- Build files have been written to: /media/hao/Data/Code/DL/torchcppsample/build
(base) ➜ build cmake --build . --config Release
Scanning dependencies of target torchcpp
[ 50%] Building CXX object CMakeFiles/torchcpp.dir/main.cpp.o
[100%] Linking CXX executable torchcpp
[100%] Built target torchcpp

Then go back to the parent folder and run the compiled program:

(base) ➜  cd ..
(base) ➜ torchcppsample build/torchcpp Python/traced_resnet_model.pt
model loads ok
0.1439 -0.8914 -0.0475 0.2474 0.3108
[ CPUFloatType{1,5} ]

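An IDE such as CLion makes building and managing the project easier, with no need to run the build by hand.

Notes

When running the loaded model, the model and its input tensors must be on the same device.

Training a model with the C++ frontend

The C++ frontend does provide an interface for training models, but it is not easy to use. Compared with training in Python and then calling the model through TorchScript, this route is more complicated. The official tutorial is "Using the PyTorch C++ Frontend". I will update this section later.

References:

Official doc: PyTorch cpp_export

Zhihu: How to call a PyTorch model from C++

2019-07 Cnblog: Calling and deploying a PyTorch model with C++

2020-07 CSDN: Calling a trained PyTorch model from C++ on Ubuntu with libtorch

⭐2019-05 Cnblog: Calling a PyTorch model with C++ (Linux)

⭐2020-10 Calling a PyTorch model with C++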

Dynamic environment detection in autonomous driving: 2D detection

2D detection

Idea:

  • Location: first find all ROIs
    • Sliding Window / Selective Search / … | CNN (RPN …)
  • Class: classify each ROI to extract its class
    • HOG/DPM/SIFT/LBP/… | CNN (conv pooling)
    • SVM / Adaboost / … | CNN (softmax …)
  • Location refinement: Bounding Box Regression
    • Linear Regression / … | CNN (regression …)

How to Generate ROI

How To Classify ROI

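4.1 Two-stage (image-based detection methods)

  • RCNN, SPPnet, Fast-RCNN, Faster-RCNN

Before CNN

  • Location: sliding window / region proposal (candidate boxes)

    • Hand-crafted features + classifier
    • Location refinement

RCNN

  • Location: Selective Search extracts candidate boxes
  • Class: CNN feature extraction + SVM classification
    • Every candidate region goes through the convolutions separately, so much computation is repeated
  • Location refinement: Linear Regression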

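SPPnet

  • Location: Selective Search extracts candidate boxes
  • Class: CNN feature extraction + SVM classification
    • Shared convolutions greatly reduce the computation
    • SPP layer: features of different sizes --> fixed-size features (followed by fully connected layers)
      • Map each box in the original image to a box on the feature map extracted by the CNN
      • Use pyramid pooling to turn boxes of different sizes into fixed-size features
      • Feed the result to the FC layers
  • Location refinement: Linear Regression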
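
The pyramid pooling step can be illustrated with a minimal pure-Python sketch. The function name `spp_max_pool` and the pyramid levels (1, 2, 4) are assumptions chosen for illustration, not values taken from the source; the point is only that boxes of different sizes produce vectors of the same length.

```python
def spp_max_pool(feature, levels=(1, 2, 4)):
    """Max-pool a 2D feature map (a list of rows) into a fixed-length vector.

    For each pyramid level n the map is split into an n x n grid of bins and the
    maximum of each bin is kept, so the output length is sum(n * n) whatever the
    height and width of the input.
    """
    h, w = len(feature), len(feature[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Bin rows: floor for the start, ceil for the end, so bins are never empty
            r0, r1 = (i * h) // n, -(-((i + 1) * h) // n)
            for j in range(n):
                c0, c1 = (j * w) // n, -(-((j + 1) * w) // n)
                pooled.append(max(feature[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


small = [[r * 5 + c for c in range(5)] for r in range(4)]   # a 4 x 5 box
large = [[r * 9 + c for c in range(9)] for r in range(7)]   # a 7 x 9 box
print(len(spp_max_pool(small)), len(spp_max_pool(large)))   # same length for both
```

With levels (1, 2, 4) every box yields 1 + 4 + 16 = 21 values per channel, which is what lets a fixed-size FC layer follow.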

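Fast-RCNN

  • Location: Selective Search extracts candidate boxes
  • Class: CNN feature extraction + CNN classification
    • Classification and regression are both done by the CNN, and both losses backpropagate so the network is tuned jointly (semi end-to-end)
    • The SPP layer is replaced by ROI pooling, which speeds up computation (possibly at some cost in accuracy)
  • Location refinement: CNN regression

Faster-RCNN

  • Location: CNN extracts candidate boxes
    • RPN: Region Proposal Net
      • At the receptive-field box in the original image for each feature point, generate anchor boxes with different ratios/scales
      • Binary classification + regression on each anchor box
        • 2k scores: object or not
        • 4k coordinates: regression values that correct the position ($\delta{A}$)
  • Class: CNN feature extraction + CNN classification
  • Location refinement: CNN regression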
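
As a rough sketch of the RPN bookkeeping above: k anchors per feature point give 2k scores and 4k regression values. The scales (128, 256, 512) and ratios (0.5, 1, 2) below are assumed defaults for illustration, and the helper names `make_anchors` and `apply_deltas` are hypothetical. The delta parameterization (shift by a fraction of the anchor size, scale by an exponential) is the common one and may differ from any particular implementation.

```python
import math


def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return k = len(scales) * len(ratios) anchors (cx, cy, w, h) for one feature point.

    Each anchor has area scale**2 and height/width equal to the ratio.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((cx, cy, w, h))
    return anchors


def apply_deltas(anchor, deltas):
    """Correct an anchor with its 4 regression outputs (dx, dy, dw, dh)."""
    cx, cy, w, h = anchor
    dx, dy, dw, dh = deltas
    return (cx + dx * w, cy + dy * h, w * math.exp(dw), h * math.exp(dh))


anchors = make_anchors(100, 100)
k = len(anchors)
print(k, 2 * k, 4 * k)   # anchors per point, score outputs, coordinate outputs
print(apply_deltas(anchors[0], (0.1, -0.1, 0.0, 0.2)))
```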

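4.2 One-stage

  • YOLO
  • SSD
  • YOLOv2

YOLO

  • Location:
    • Faster-RCNN
    • YOLO
      • The whole image is divided into a 7x7 grid, and each cell has 2 default boxes
      • No candidate boxes: classification over all classes and regression are applied directly to the default boxes (the box centre x, y is normalized to 0-1 relative to its grid cell; w, h are normalized to 0-1 by the image width and height)
      • FC1 --> FC2 {1470x1} --reshape--> {7x7x30}, i.e. {1x1x30} per cell
  • Class: CNN feature extraction + CNN classification
  • Advantage: real-time speed
  • Disadvantages:
    • Lower accuracy than Faster-RCNN; poor localization (the anchor boxes are not varied enough and are refined by regression only once)
    • Weak on small objects: not enough variety in anchors and scales.
    • Weak on irregular objects: not enough variety in anchor ratios.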

Meaning of 1x1x30:

The predictions for the two default boxes:

4 (x, y, w, h coordinate prediction), 1 (confidence), 4 (x, y, w, h coordinate prediction), 1 (confidence), 20 (predictions for the 20 classes)
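
The layout described above can be made concrete with a short sketch. It assumes the 1470 values are stored cell by cell in row-major order and that each cell's 30 values are ordered as box 1, box 2, then class scores; both are assumptions about memory layout, and the helper names are hypothetical.

```python
S, B, C = 7, 2, 20        # grid size, boxes per cell, classes (values from the text)
DEPTH = B * 5 + C         # 2 * (x, y, w, h, conf) + 20 class scores = 30


def reshape_output(flat):
    """Reshape the 1470-long FC2 output into an S x S grid of 30-long cell vectors."""
    assert len(flat) == S * S * DEPTH
    return [[flat[(r * S + c) * DEPTH:(r * S + c + 1) * DEPTH] for c in range(S)]
            for r in range(S)]


def decode_box(cell_vec, row, col, box_idx, img_w, img_h):
    """Turn one predicted box of one cell into pixel-space (cx, cy, w, h, conf)."""
    x, y, w, h, conf = cell_vec[box_idx * 5:box_idx * 5 + 5]
    cx = (col + x) / S * img_w     # x, y are offsets inside the cell, in [0, 1]
    cy = (row + y) / S * img_h
    return cx, cy, w * img_w, h * img_h, conf   # w, h are fractions of the image size


flat = [0.0] * (S * S * DEPTH)
grid = reshape_output(flat)
grid[3][3][0:5] = [0.5, 0.5, 0.2, 0.4, 0.9]     # a box centred in the middle cell
print(S * S * DEPTH)                            # length of the FC2 output
print(decode_box(grid[3][3], 3, 3, 0, 448, 448))
```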

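SSD

  • Location:
    • Borrows the anchor box mechanism of RPN: at the receptive-field box in the original image for each feature point, generate default boxes with different ratios/scales
    • No candidate boxes: classification over all classes and regression are applied directly to the default boxes
  • Class: CNN feature extraction + CNN classification
    • Outputs from feature layers with several receptive fields: earlier layers have small receptive fields and suit small objects; later layers have large receptive fields and suit large objects.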

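YOLOv2

  • Richer default boxes
    • Default boxes are derived from dataset statistics (k-means). As k grows, the IOU grows too (higher recall), but so does the complexity; k=5 was chosen in the end
  • More flexible class prediction
    • Class prediction is decoupled from the spatial position (cell): each default box predicts both class and coordinates, which handles overlapping objects effectively.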
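
The k-means step can be sketched as follows, using 1 - IoU between (w, h) pairs as the distance, with boxes treated as sharing a corner. The toy box sizes, the initial centroids, and all function names are made up for illustration; the sketch only shows why the mean best IoU tends to rise with k.

```python
def wh_iou(box, centroid):
    """IoU of two boxes given as (w, h), assuming they share the same top-left corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, centroids, iters=10):
    """Plain k-means on (w, h) pairs, assigning each box to the centroid with the highest IoU."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for b in boxes:
            best = max(range(len(centroids)), key=lambda i: wh_iou(b, centroids[i]))
            clusters[best].append(b)
        # Move each centroid to the mean (w, h) of its cluster; keep it if the cluster is empty
        centroids = [
            (sum(b[0] for b in cl) / len(cl), sum(b[1] for b in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids


def mean_best_iou(boxes, centroids):
    """Average, over all boxes, of the IoU with the best-matching centroid."""
    return sum(max(wh_iou(b, c) for c in centroids) for b in boxes) / len(boxes)


boxes = [(10, 12), (11, 13), (50, 20), (52, 22), (20, 60), (22, 58)]
for k in (1, 2, 3):
    cents = kmeans_anchors(boxes, boxes[::2][:k])
    print(k, round(mean_best_iou(boxes, cents), 3))
```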

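YOLOv3

  • Better backbone
    • darknet-19 is replaced by darknet-53
  • Multiple sizes
    • Multi-scale
    • Outputs from feature layers with several receptive fields
    • More default boxes: K=9, split evenly across the 3 outputs, so each output has 3*(5+80)=255 channels
    • 3 boxes, each with 5 (x, y, w, h, confidence) + 80 (COCO classes)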
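
The channel arithmetic above is easy to check in a few lines. The function name `yolo_head_channels` is hypothetical, and the 416-pixel input with strides 32, 16, 8 is an assumed typical configuration rather than something stated in the source.

```python
def yolo_head_channels(num_classes, boxes_per_scale=3):
    """Channels of one output layer: every box predicts x, y, w, h, confidence and class scores."""
    return boxes_per_scale * (5 + num_classes)


total_anchors, num_scales = 9, 3
print(total_anchors // num_scales)            # boxes handled by each output layer
print(yolo_head_channels(80))                 # channels per output for the 80 COCO classes
print([416 // s for s in (32, 16, 8)])        # grid sizes of the three outputs (assumed input size)
```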

Hands-on

https://github.com/andylei77/object-detector

Environment setup in VS Code

c_cpp_properties.json

{
    "configurations": [
        {
            "name": "windows-gcc-x64",
            "includePath": [
                "${workspaceFolder}/**",
                "C:/Users/Simon/.conda/envs/torch_gpu/include/",
                "C:/Users/Simon/.conda/envs/torch_gpu/Lib/site-packages/numpy/core/include"
            ],
            "compilerPath": "D:/Tools/Mingw/mingw64/bin/gcc.exe",
            "cStandard": "${default}",
            "cppStandard": "c++11",
            "intelliSenseMode": "windows-gcc-x64",
            "compilerArgs": []
        }
    ],
    "version": 4
}

tasks.json

{
    "version": "2.0.0",
    "tasks": [
        {
            "type": "shell",
            "label": "C/C++: g++ pre-build clean",
            "command": "rm",
            "args": [
                "${workspaceRoot}/*.exe"
            ]
        },
        {
            "label": "build c++Callpython",
            "type": "shell",
            "command": "g++",
            "args": [
                "-g", "c++_python.cpp",
                "-I", "C:/Users/Simon/.conda/envs/torch_gpu/include/",
                "-I", "C:/Users/Simon/.conda/envs/torch_gpu/Lib/site-packages/numpy/core/include",
                "-L", "C:/Users/Simon/.conda/envs/torch_gpu/libs/*",
                "-o", "c++_python.exe"
            ],
            "group": {
                "kind": "build",
                "isDefault": true
            },
            "problemMatcher": [],
            "dependsOn": [
                "C/C++: g++ pre-build clean"
            ]
        }
    ]
}

// g++ -g c++_python.cpp -L ./libs/* -o c++_python.exe
// -fdiagnostics-color=always

0. Test Run

#include "Python.h"              // include Python.h before any other header
#include "numpy/arrayobject.h"

#include <iostream>

using namespace std;

// ref: https://wenku.baidu.com/view/01fab1346f175f0e7cd184254b35eefdc8d315cd.html

int demo0() {
    // Point the embedded interpreter at the conda environment
    Py_SetPythonHome(L"C:/Users/Simon/.conda/envs/torch_gpu");
    Py_Initialize();
    PyRun_SimpleString("print('hello')");
    Py_Finalize();
    return 0;
}

1. Passing arguments from C++

Call a Python function, pass parameters, and read the return value.

There are two common ways to build the argument tuple:

// Create a tuple with PyTuple_New and fill it with PyTuple_SetItem
PyObject* args = PyTuple_New(3);
PyObject* arg1 = Py_BuildValue("i", 100); // integer argument
PyObject* arg2 = Py_BuildValue("f", 3.14); // floating-point argument
PyObject* arg3 = Py_BuildValue("s", "hello"); // string argument
PyTuple_SetItem(args, 0, arg1);
PyTuple_SetItem(args, 1, arg2);
PyTuple_SetItem(args, 2, arg3);

// Build the tuple directly with Py_BuildValue
PyObject* args = Py_BuildValue("(ifs)", 100, 3.14, "hello");
PyObject* args = Py_BuildValue("()"); // function with no arguments

Original link: https://blog.csdn.net/tobacco5648/article/details/50890106

int demo1()
{
    Py_SetPythonHome(L"C:/Users/Simon/.conda/envs/torch_gpu");
    Py_Initialize();

    PyObject *module = NULL;
    PyObject *pFunc = NULL;
    PyObject *pArg = NULL;
    PyObject *value = NULL;
    // Import the Python module
    module = PyImport_ImportModule("pthonnx_ru");

    pFunc = PyObject_GetAttrString(module, "demo1"); // look up the function
    pArg = Py_BuildValue("(s)", "my is c++ test");   // "s" takes a C string ("S" expects a PyObject*)
    value = PyObject_CallObject(pFunc, pArg);        // call the function (PyEval_CallObject is deprecated)
    float val;
    PyArg_Parse(value, "f", &val);
    // PyArg_ParseTuple(value, "f", &val);
    cout << "--val1--" << val << endl;

    pFunc = PyObject_GetAttrString(module, "Add");
    pArg = Py_BuildValue("(ii)", 21, 23);            // tuple of two ints
    value = PyObject_CallObject(pFunc, pArg);
    PyArg_Parse(value, "f", &val);
    cout << "--val2--" << val << endl;

    PyRun_SimpleString("print('hello')");
    Py_Finalize();

    return 0;
}

2. Passing a list argument

int demo2()
{

    cout << "Hello World" << endl;

    Py_SetPythonHome(L"C:/Users/Simon/.conda/envs/torch_gpu");
    Py_Initialize();

    PyObject *module = NULL;
    PyObject *pFunc = NULL;
    PyObject *pArg = NULL;
    // Import the Python module
    module = PyImport_ImportModule("pthonnx_ru");
    // Look up the function
    pFunc = PyObject_GetAttrString(module, "demo2");
    // Build a Python list and wrap it in a one-element argument tuple
    PyObject *pyParams = PyList_New(0);
    PyList_Append(pyParams, Py_BuildValue("i", 5)); // int element
    PyList_Append(pyParams, Py_BuildValue("i", 3));
    PyObject *args = PyTuple_New(1);
    PyTuple_SetItem(args, 0, pyParams);

    // Call the function
    float val;
    PyObject *value = PyObject_CallObject(pFunc, args);
    PyArg_Parse(value, "f", &val);
    // PyArg_ParseTuple(value, "f", &val);
    cout << "--val--" << val << endl;

    PyRun_SimpleString("print('hello')");
    Py_Finalize();

    return 0;
}

3. Python classes: attributes and member functions (TODO)

int demo3()
{
    cout << "Hello World" << endl;

    Py_SetPythonHome(L"C:/Users/Simon/.conda/envs/torch_gpu");
    Py_Initialize();

    PyObject *module = NULL;
    PyObject *pFunc = NULL;

    module = PyImport_ImportModule("pthonnx_ru");    // import the Python module
    pFunc = PyObject_GetAttrString(module, "demo2"); // look up the function

    // Create the argument tuple
    PyObject *pArgs = PyTuple_New(1);
    // Set the value of the function argument
    int InParm = 1;
    PyTuple_SetItem(pArgs, 0, PyLong_FromLong(InParm));

    // Call the function
    float val;
    PyObject *value = PyObject_CallObject(pFunc, pArgs);
    PyArg_Parse(value, "f", &val);
    // PyArg_ParseTuple(value, "f", &val);
    cout << "--val--" << val << endl;

    Py_Finalize();
    return 0;
}

4. Passing a C++ array to Python (as a NumPy array)

int demo4() {
    Py_SetPythonHome(L"C:/Users/Simon/.conda/envs/torch_gpu");
    Py_Initialize();
    // Requires #include "numpy/arrayobject.h"
    // API doc: https://numpy.org/doc/1.17/reference/c-api.array.html#importing-the-api
    import_array(); // initialize the NumPy C API before any PyArray_* call

    PyObject *module = NULL;
    PyObject *pFunc = NULL;
    PyObject *pArgs = NULL;

    module = PyImport_ImportModule("pthonnx_ru");
    pFunc = PyObject_GetAttrString(module, "demo4");

    float buf[2][3];
    buf[0][0] = 0;
    buf[0][1] = 1.1230;
    buf[0][2] = 2.340;
    buf[1][0] = 4.540;
    buf[1][1] = 6.900;
    buf[1][2] = 8.090;

    pArgs = PyTuple_New(1);
    npy_intp dims[2] = {2, 3}; // shape of the array
    int ND = 2;                // number of dimensions
    PyObject *pPyArray = PyArray_SimpleNewFromData(ND, dims, NPY_FLOAT, buf); // ndim, shape, dtype, data buffer
    PyTuple_SetItem(pArgs, 0, pPyArray); // place the array in the argument tuple
    PyObject_CallObject(pFunc, pArgs);

    Py_Finalize();
    return 0;
}
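
The Python module that the C++ demos import is not shown in the source. The following is a purely hypothetical `pthonnx_ru.py` that would satisfy the calls made above: only the function names (`demo1`, `Add`, `demo2`, `demo4`) and the argument shapes come from the C++ code, while the function bodies are invented for illustration.

```python
# pthonnx_ru.py -- hypothetical module; the C++ demos reveal only the function names


def demo1(text):
    """Called with one string argument; returns a float that C++ parses with "f"."""
    print("demo1 received:", text)
    return float(len(text))


def Add(a, b):
    """Called with two ints; returns their sum as a float."""
    return float(a + b)


def demo2(values):
    """Called by the C++ demo2() with a list and by demo3() with a single int."""
    if isinstance(values, (list, tuple)):
        return float(sum(values))
    return float(values)


def demo4(array):
    """Called with the 2 x 3 float32 array built from the C++ buffer."""
    print("demo4 received shape:", getattr(array, "shape", None))
    return None
```

The module has to be importable by the embedded interpreter, for example by placing it in a directory that is on sys.path.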

References:

https://docs.python.org/2/extending/embedding.html

https://wenku.baidu.com/view/01fab1346f175f0e7cd184254b35eefdc8d315cd.html

https://numpy.org/doc/1.17/reference/c-api.array.html

https://numpy.org/doc/1.17/reference/c-api.array.html#importing-the-api

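Autonomous driving perception basics: lane detection

3.1 Overview of the autonomous driving perception system

Perception

Content

  • Hands-on: lane detection with traditional methods
  • Survey of image segmentation algorithms
  • Hands-on: overview of deep-learning-based image segmentation algorithms

3.2 Hands-on segmentation: lane detection with traditional methods

Static environment perception and segmentation algorithms

https://github.com/andylei77/lane-detector

Algorithm for comparison:

Zhihu: Algorithm Collection (8) | Autonomous Driving | Practical Lane Detection Algorithms

# This algorithm uses the OpenCV library and material from the Udacity self-driving car dataset.
Camera calibration, to remove the effect of lens distortion
Image preprocessing, to identify the lane lines
Perspective transform of the road
Lane line detection
Vehicle localization and lane radius calculation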

3.2.2 Canny edge detection

import cv2 as cv


def do_canny(frame):
    # Converts frame to grayscale because we only need the luminance channel for detecting edges - less computationally expensive
    gray = cv.cvtColor(frame, cv.COLOR_RGB2GRAY)
    # Applies a 5x5 Gaussian blur with deviation of 0 to the frame to suppress noise before edge detection
    blur = cv.GaussianBlur(gray, (5, 5), 0)
    # Applies Canny edge detector with minVal of 50 and maxVal of 150
    canny = cv.Canny(blur, 50, 150)
    return canny

3.2.3 Manually segmenting the road region

OpenCV coordinate system

  • polygons = [] # manually specify the three vertices of the triangle
  • mask = zeros_like() creates the mask
  • fillPoly(mask, polygons, 255): fills the inside of the polygon with 255; everything outside keeps its original value (0)

import cv2 as cv
import numpy as np


def do_segment(frame):
    # An image is a multi-dimensional array of pixel intensities, so frame.shape returns a tuple of its dimensions: (number of rows, number of columns, number of channels)
    # frame.shape[0] is the number of pixel rows. Since y starts at 0 at the top, the y-coordinate of the bottom of the frame is its height
    height = frame.shape[0]
    # Creates a triangular polygon for the mask defined by three (x, y) coordinates
    polygons = np.array([
        [(0, height), (800, height), (380, 290)]
    ])
    # Creates an image filled with zero intensities with the same dimensions as the frame
    mask = np.zeros_like(frame)
    # Fills the triangle in the mask with 255 and leaves all other areas at 0
    cv.fillPoly(mask, polygons, 255)
    # A bitwise AND between the mask and frame keeps only the triangular area of the frame
    segment = cv.bitwise_and(frame, mask)
    return segment

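3.2.4 Finding lane lines with the Hough transform

  • Hough transform

    • Parameters and variables swap roles
  • A line in the image becomes a point in Hough space

  • A point in the image (through which many lines pass) corresponds to a line in Hough space

Cartesian: the Cartesian coordinate system

  • A set of candidate points in the Cartesian coordinate system that lie on one line maps to lines in Hough space that all pass through a single point

Another Hough space (polar coordinates)

  • Lines are represented in polar form

HoughLinesP adds a P for Probabilistic to the end of HoughLines: it uses the Progressive Probabilistic Hough Transform (PPHT) to find line segments in a binary image.
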
hough = cv.HoughLinesP(segment, 2, np.pi / 180, 100, np.array([]), minLineLength = 100, maxLineGap = 50)
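
To make the polar-form voting concrete, here is a minimal pure-Python sketch of the standard (non-probabilistic) Hough accumulator, using the line equation rho = x*cos(theta) + y*sin(theta). It is a simplified illustration with made-up points and a hypothetical function name, not what cv.HoughLinesP does internally.

```python
import math
from collections import Counter


def hough_votes(points, rho_step=1.0, theta_steps=180):
    """Vote in (rho, theta) space: each point adds one vote per sampled angle."""
    votes = Counter()
    for x, y in points:
        for t in range(theta_steps):
            theta = t * math.pi / theta_steps
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(rho / rho_step), t)] += 1
    return votes


# Five points on the line y = x + 2, plus one outlier
points = [(0, 2), (10, 12), (20, 22), (30, 32), (40, 42), (35, 5)]
(rho_bin, theta_bin), count = hough_votes(points).most_common(1)[0]
print(rho_bin, theta_bin, count)
```

The strongest bin collects the five collinear points at theta = 135 degrees and rho close to sqrt(2), which is the polar form of y = x + 2; the outlier votes elsewhere.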

3.2.5 Extracting the lane lines and overlaying them on the original image

  • Combine all detected lines and compute the average slope and intercept of the two lane lines

import numpy as np


def calculate_lines(frame, lines):
    # Body not shown in the source; it returns the averaged left and right lines
    return [left_line, right_line]


def calculate_coordinates(frame, parameters):
    slope, intercept = parameters
    # Sets initial y-coordinate as height from top down (bottom of the frame)
    y1 = frame.shape[0]
    # Sets final y-coordinate as 150 above the bottom of the frame
    y2 = int(y1 - 150)
    # Sets initial x-coordinate as (y1 - b) / m since y1 = mx1 + b
    x1 = int((y1 - intercept) / slope)
    # Sets final x-coordinate as (y2 - b) / m since y2 = mx2 + b
    x2 = int((y2 - intercept) / slope)
    return np.array([x1, y1, x2, y2])
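
Since the body of calculate_lines is not shown, the following is one possible way to do the averaging described in the bullet above, written as a separate pure-Python sketch. It assumes segments are given as flat (x1, y1, x2, y2) tuples (the output of cv.HoughLinesP would need reshaping first) and that the sign of the slope separates left from right; the function names are hypothetical.

```python
def fit_line(x1, y1, x2, y2):
    """Slope and intercept of the line through two points (requires x1 != x2)."""
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def average_lanes(segments):
    """Average line segments into one (slope, intercept) pair per side.

    In image coordinates y grows downward, so the left lane line has a negative
    slope and the right lane line a positive one.
    """
    left, right = [], []
    for x1, y1, x2, y2 in segments:
        if x1 == x2:
            continue  # skip vertical segments, whose slope is undefined
        slope, intercept = fit_line(x1, y1, x2, y2)
        (left if slope < 0 else right).append((slope, intercept))

    def mean(params):
        if not params:
            return None
        n = len(params)
        return (sum(p[0] for p in params) / n, sum(p[1] for p in params) / n)

    return mean(left), mean(right)


segments = [(100, 500, 200, 400), (120, 520, 220, 420), (700, 500, 600, 400)]
print(average_lanes(segments))
```

Each returned (slope, intercept) pair has the form that calculate_coordinates expects as its parameters argument.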

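3.3 Hands-on: overview of deep-learning-based image segmentation algorithms

3.3.1 Representative algorithms

Fully connected

Downsampling and upsampling, pooling

Fusion: upsampled features are fused with the parallel downsampled features

Semantic map: for example, with a 3-channel RGB input and 6 output classes, the output is 6 maps

3.3.2 Lane detection based on image segmentation

  • Semantic segmentation: binary classification yields the lane pixels
  • Separating lanes by clustering: pixels of the same lane lie very close together in the n-dim embedding space

Source code

GitHub: lanenet-lane-detection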

# https://github.com/andylei77/lanenet-lane-detection/blob/master/tools/test_lanenet.py

def test_lanenet(image_path, weights_path, use_gpu):
    """
    :param image_path:
    :param weights_path:
    :param use_gpu:
    :return:
    """
    assert ops.exists(image_path), '{:s} not exist'.format(image_path)

    log.info('Start reading and preprocessing the image data')
    t_start = time.time()
    image = cv2.imread(image_path, cv2.IMREAD_COLOR)
    image_vis = image
    image = cv2.resize(image, (512, 256), interpolation=cv2.INTER_LINEAR)
    image = image - VGG_MEAN
    log.info('Image read, time cost: {:.5f}s'.format(time.time() - t_start))

    input_tensor = tf.placeholder(dtype=tf.float32, shape=[1, 256, 512, 3], name='input_tensor')
    phase_tensor = tf.constant('test', tf.string)

    net = lanenet_merge_model.LaneNet(phase=phase_tensor, net_flag='vgg')
    binary_seg_ret, instance_seg_ret = net.inference(input_tensor=input_tensor, name='lanenet_model')

    cluster = lanenet_cluster.LaneNetCluster()
    postprocessor = lanenet_postprocess.LaneNetPoseProcessor()

    saver = tf.train.Saver()

    # Set sess configuration
    if use_gpu:
        sess_config = tf.ConfigProto(device_count={'GPU': 1})
    else:
        sess_config = tf.ConfigProto(device_count={'CPU': 0})
    sess_config.gpu_options.per_process_gpu_memory_fraction = CFG.TEST.GPU_MEMORY_FRACTION
    sess_config.gpu_options.allow_growth = CFG.TRAIN.TF_ALLOW_GROWTH
    sess_config.gpu_options.allocator_type = 'BFC'

    sess = tf.Session(config=sess_config)

    with sess.as_default():

        saver.restore(sess=sess, save_path=weights_path)

        t_start = time.time()
        binary_seg_image, instance_seg_image = sess.run([binary_seg_ret, instance_seg_ret],
                                                        feed_dict={input_tensor: [image]})
        t_cost = time.time() - t_start
        log.info('Lane prediction time for a single image: {:.5f}s'.format(t_cost))

        binary_seg_image[0] = postprocessor.postprocess(binary_seg_image[0])
        mask_image = cluster.get_lane_mask(binary_seg_ret=binary_seg_image[0],
                                           instance_seg_ret=instance_seg_image[0])

        for i in range(4):
            instance_seg_image[0][:, :, i] = minmax_scale(instance_seg_image[0][:, :, i])
        embedding_image = np.array(instance_seg_image[0], np.uint8)

        plt.figure('mask_image')
        plt.imshow(mask_image[:, :, (2, 1, 0)])
        plt.figure('src_image')
        plt.imshow(image_vis[:, :, (2, 1, 0)])
        plt.figure('instance_image')
        plt.imshow(embedding_image[:, :, (2, 1, 0)])
        plt.figure('binary_image')
        plt.imshow(binary_seg_image[0] * 255, cmap='gray')
        plt.show()

    sess.close()

    return

Related algorithm roundups

Zhihu: A roundup of lane detection algorithms in autonomous driving

1.1 Industry overview

1.2 Technical roadmap

L2 autonomous driving

L3 autonomous driving

L4 autonomous driving

V2X

1.3 Technology overview

Hardware overview

Software overview

Operating system (OS)

HD MAP

Software overview

  • Localization
  • Perception
  • Decision-making
  • Control

Localization

Perception

Decision-making

Control

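Reposted from: A Survey of Object Tracking Algorithms

Part 1: A quick tour of object tracking

Let's first get acquainted with a few SOTA trackers to get a rough idea of what the object tracking field covers. It all starts with the benchmark from 2013. If you ask people which tracking algorithms have been impressive in recent years, most will point you to Wu Yi's papers, OTB50 and OTB100 (here OTB50 means OTB-2013 and OTB100 means OTB-2015; 50 and 100 are the numbers of videos, which makes them easy to remember):

Wu Y, Lim J, Yang M H. Online object tracking: A benchmark [C]// CVPR, 2013.

Wu Y, Lim J, Yang M H. Object tracking benchmark [J]. TPAMI, 2015.

Going from a top conference to a top journal is top-tier treatment. Add citation counts of more than 1480 and 320, and the influence speaks for itself: these are now benchmarks that any tracking work has to run. The test code and sequences can both be downloaded from Visual Tracker Benchmark. OTB50 contains 50 sequences, all manually annotated.

Object tracking

  • Vision-based and LiDAR-based approaches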

SORT

https://github.com/abewley/sort

DeepSORT

ADAS