Introduction
Developers often publish the pre-trained models they build while learning a deep learning framework, and we can reuse those models in our own projects. When the model and the project use the same framework, the model can be used directly. When they use different frameworks, the model must be converted first. This tutorial shows how to convert a Caffe model to a PaddlePaddle Fluid model.
Environment Preparation
- Install the latest PaddlePaddle release with pip:
pip install paddlepaddle
- Alternatively, choose the version that matches your environment from the installation guide below and install it with pip:
http://www.paddlepaddle.org/documentation/docs/zh/0.14.0/new_docs/beginners_guide/install/install_doc.html#id26
- Clone the PaddlePaddle models source code:
git clone https://github.com/PaddlePaddle/models.git
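Before moving on, it can help to confirm that the install succeeded. The following is a minimal sketch, not part of the original tutorial: the helper name is_installed is hypothetical, and it only checks that Python can locate the paddle package (the import name used by the prediction program later in this tutorial), not that the build matches your hardware.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate a top-level module with this name.

    Hypothetical helper for illustration; it is not part of PaddlePaddle.
    """
    # find_spec returns None when the module cannot be found on sys.path.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "paddle" is the import name of the paddlepaddle pip package.
    print("paddle importable:", is_installed("paddle"))
```

If this prints False, rerun the pip command above in the same Python environment you plan to use for the conversion.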
Model Conversion
- Navigate to the caffe2fluid directory under the cloned models repository:
cd models/fluid/image_classification/caffe2fluid/
- Download the required Python dependency file:
cd proto/ && wget https://raw.githubusercontent.com/ethereon/caffe-tensorflow/master/kaffe/caffe/caffepb.py
- Rename the downloaded file:
mv caffepb.py caffe_pb2.py
- Obtain the Caffe model to convert (using the following open-source model as an example):
https://gist.github.com/ksimonyan/211839e770f7b538e2d8
First, download the network configuration file:
cd ../ && wget https://gist.githubusercontent.com/ksimonyan/211839e770f7b538e2d8/raw/ded9363bd93ec0c770134f4e387d8aaaaa2407ce/VGG_ILSVRC_16_layers_deploy.prototxt
Second, download the weight file:
wget http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel
- Convert the Caffe model to Fluid’s network structure and weight files.
VGG16.py is the Python file defining the PaddlePaddle network structure, and VGG16.npy is the weight file:
python convert.py VGG_ILSVRC_16_layers_deploy.prototxt \
--caffemodel VGG_ILSVRC_16_layers.caffemodel \
--data-output-path VGG16.npy \
--code-output-path VGG16.py
During execution, the following information will be printed:
register layer[Axpy]
register layer[Flatten]
register layer[ArgMax]
register layer[Reshape]
register layer[ROIPooling]
register layer[PriorBox]
register layer[Permute]
register layer[DetectionOutput]
register layer[Normalize]
register layer[Select]
register layer[Crop]
register layer[Reduction]
------------------------------------------------------------
WARNING: PyCaffe not found!
Falling back to a pure protocol buffer implementation.
* Conversions will be drastically slower.
------------------------------------------------------------
Type Name Param Output
----------------------------------------------------------------------------------------------
Data data -- (10, 3, 224, 224)
Convolution conv1_1 (64, 3, 3, 3) (10, 64, 224, 224)
Convolution conv1_1 (64,) (10, 64, 224, 224)
Convolution conv1_2 (64, 64, 3, 3) (10, 64, 224, 224)
Convolution conv1_2 (64,) (10, 64, 224, 224)
Pooling pool1 -- (10, 64, 112, 112)
Convolution conv2_1 (128, 64, 3, 3) (10, 128, 112, 112)
Convolution conv2_1 (128,) (10, 128, 112, 112)
Convolution conv2_2 (128, 128, 3, 3) (10, 128, 112, 112)
Convolution conv2_2 (128,) (10, 128, 112, 112)
Pooling pool2 -- (10, 128, 56, 56)
Convolution conv3_1 (256, 128, 3, 3) (10, 256, 56, 56)
Convolution conv3_1 (256,) (10, 256, 56, 56)
Convolution conv3_2 (256, 256, 3, 3) (10, 256, 56, 56)
Convolution conv3_2 (256,) (10, 256, 56, 56)
Convolution conv3_3 (256, 256, 3, 3) (10, 256, 56, 56)
Convolution conv3_3 (256,) (10, 256, 56, 56)
Pooling pool3 -- (10, 256, 28, 28)
Convolution conv4_1 (512, 256, 3, 3) (10, 512, 28, 28)
Convolution conv4_1 (512,) (10, 512, 28, 28)
Convolution conv4_2 (512, 512, 3, 3) (10, 512, 28, 28)
Convolution conv4_2 (512,) (10, 512, 28, 28)
Convolution conv4_3 (512, 512, 3, 3) (10, 512, 28, 28)
Convolution conv4_3 (512,) (10, 512, 28, 28)
Pooling pool4 -- (10, 512, 14, 14)
Convolution conv5_1 (512, 512, 3, 3) (10, 512, 14, 14)
Convolution conv5_1 (512,) (10, 512, 14, 14)
Convolution conv5_2 (512, 512, 3, 3) (10, 512, 14, 14)
Convolution conv5_2 (512,) (10, 512, 14, 14)
Convolution conv5_3 (512, 512, 3, 3) (10, 512, 14, 14)
Convolution conv5_3 (512,) (10, 512, 14, 14)
Pooling pool5 -- (10, 512, 7, 7)
InnerProduct fc6 (4096, 25088) (10, 4096)
InnerProduct fc6 (4096,) (10, 4096)
Dropout drop6 -- (10, 4096)
InnerProduct fc7 (4096, 4096) (10, 4096)
InnerProduct fc7 (4096,) (10, 4096)
Dropout drop7 -- (10, 4096)
InnerProduct fc8 (1000, 4096) (10, 1000)
InnerProduct fc8 (1000,) (10, 1000)
Softmax prob -- (10, 1000)
Converting data...
Saving data...
Saving source...
set env variable before using converted model if used custom_layers:
- Generate the prediction model file using PaddlePaddle’s network structure and weight files:
python VGG16.py VGG16.npy ./fluid_models
- After execution, the prediction model is generated and stored in the fluid_models directory, which contains two files: model and params. This is compatible with the paddle.fluid.io.save_inference_model interface (see the documentation). We will use this model for image prediction in the next step.
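The layer table that convert.py prints during conversion can be checked by hand, which is a quick way to confirm that the network was parsed as expected. The sketch below is an illustration rather than part of caffe2fluid: it assumes VGG16's usual layout (3x3 convolutions with padding 1 and stride 1, 2x2 pooling with stride 2), and the names out_size, STAGES, and trace_vgg16 are hypothetical.

```python
def out_size(n, kernel, stride, pad):
    """Spatial output size of a convolution or pooling window (floor form)."""
    return (n + 2 * pad - kernel) // stride + 1


# Number of 3x3 convolutions and their output channels in each VGG16 stage,
# read off the table printed by convert.py (assumed layout, see lead-in).
STAGES = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def trace_vgg16(size=224):
    """Return the (channels, height/width) pair after each pooling layer."""
    shapes = []
    for num_convs, channels in STAGES:
        for _ in range(num_convs):
            size = out_size(size, kernel=3, stride=1, pad=1)  # keeps the size
        size = out_size(size, kernel=2, stride=2, pad=0)      # halves the size
        shapes.append((channels, size))
    return shapes


shapes = trace_vgg16()
channels, size = shapes[-1]
fc6_inputs = channels * size * size

print(shapes)      # [(64, 112), (128, 56), (256, 28), (512, 14), (512, 7)]
print(fc6_inputs)  # 25088
```

The last entry matches the (10, 512, 7, 7) output listed for pool5, and 512 * 7 * 7 = 25088 matches the (4096, 25088) weight shape listed for fc6.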
Testing the Prediction Model
To predict images using the converted model, first write a PaddlePaddle prediction program:
# coding=utf-8
from PIL import Image
import numpy as np
import paddle.fluid as fluid


def load_image(file):
    # Force three channels so grayscale or RGBA inputs do not break the transpose
    im = Image.open(file).convert('RGB')
    # Image.LANCZOS is the same filter as the older Image.ANTIALIAS alias
    im = im.resize((224, 224), Image.LANCZOS)
    im = np.array(im).astype(np.float32)
    # PIL opens images in H(Height), W(Width), C(Channels) order
    # PaddlePaddle requires CHW order, so we transpose dimensions
    im = im.transpose((2, 0, 1))
    # The Caffe VGG model expects BGR order, while PIL opens RGB, so swap channels
    im = im[(2, 1, 0), :, :]  # Convert to BGR
    # Subtract the per-channel mean, listed in BGR order to match the swapped channels
    mean = [103.94, 116.78, 123.68]
    mean = np.array(mean, dtype=np.float32)
    mean = mean[:, np.newaxis, np.newaxis]
    im -= mean
    return im


def infer_one(image_path, use_cuda, model_path, model_filename, params_filename):
    # Set execution device
    place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
    exe = fluid.Executor(place)
    inference_scope = fluid.core.Scope()
    with fluid.scope_guard(inference_scope):
        # Load the inference model
        [inference_program, feed_target_names, fetch_targets] = fluid.io.load_inference_model(
            model_path, exe, model_filename=model_filename, params_filename=params_filename)
        # Prepare input data as a batch of one image with shape (1, 3, 224, 224)
        test_datas = [load_image(image_path)]
        test_data = np.array(test_datas)
        # Execute prediction
        results = exe.run(
            inference_program,
            feed={feed_target_names[0]: test_data},
            fetch_list=fetch_targets)
        # Process results (sort class indices by descending probability)
        results = np.argsort(-results[0])
        result = results[0][0]
        print("Predicted label: %d" % result)


if __name__ == '__main__':
    image_path = "0b77aba2-9557-11e8-a47a-c8ff285a4317.jpg"
    use_cuda = False
    model_path = "fluid_models/"
    model_filename = "model"
    params_filename = "params"
    infer_one(image_path, use_cuda, model_path, model_filename, params_filename)
This program handles both image preprocessing and model inference. Note that the preprocessing must match the preprocessing used when the model was trained; otherwise the predictions will be unreliable.
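Because the preprocessing has to agree with what the model saw during training, it can be worth checking the layout and channel handling on a tiny synthetic image before running real inference. The sketch below is illustrative only: the helper to_caffe_layout and its bgr_mean parameter are hypothetical names, and it uses only NumPy to mirror the transpose, channel swap, and mean subtraction steps of load_image above.

```python
import numpy as np


def to_caffe_layout(rgb_hwc, bgr_mean):
    """Convert an HWC RGB array to a mean-subtracted CHW BGR float32 array.

    Hypothetical helper for illustration; bgr_mean must be given in BGR order.
    """
    chw = rgb_hwc.astype(np.float32).transpose((2, 0, 1))  # HWC -> CHW
    bgr = chw[::-1, :, :]                                  # RGB -> BGR
    mean = np.asarray(bgr_mean, dtype=np.float32).reshape(3, 1, 1)
    return bgr - mean


# A 2x2 synthetic "image" whose pixels are all pure red (R=255, G=0, B=0).
red = np.zeros((2, 2, 3), dtype=np.uint8)
red[:, :, 0] = 255

out = to_caffe_layout(red, bgr_mean=[0.0, 0.0, 0.0])
print(out.shape)               # (3, 2, 2)
print(out[:, 0, 0].tolist())   # [0.0, 0.0, 255.0]
```

After the swap, the red value sits in the last channel, which confirms that channel 0 is blue. Any mean passed in must therefore list the blue mean first.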
References
- https://github.com/PaddlePaddle/models/tree/develop/fluid/image_classification/caffe2fluid
- https://gist.github.com/ksimonyan/211839e770f7b538e2d8