Original blog: Doi Technology Team
Link: https://blog.doiduoyi.com/authors/1584446358138
Original intention: Record the learning experience of the excellent Doi Technology Team
This article is based on PaddlePaddle 0.11.0 and Python 2.7.
Introduction to the Dataset
As the title suggests, the model is trained on handwritten digits from the MNIST database, which provides a training set of 60,000 examples and a test set of 10,000 examples. Each image is a 28x28 pixel matrix that has been size-normalized and centered, and each label is one of the 10 digits from 0 to 9. The images are single-channel grayscale. An example image is shown below:

This dataset is very small, which makes it well suited to beginners in image recognition. It consists of 4 files: the training images and their labels, and the test images and their labels. The sizes listed are those of the gzip-compressed downloads:
| File Name | Size | Description |
| :---: | :---: | :---: |
| train-images-idx3-ubyte | 9.9M | Training images, 60,000 examples |
| train-labels-idx1-ubyte | 28.9K | Training labels, 60,000 examples |
| t10k-images-idx3-ubyte | 1.6M | Test images, 10,000 examples |
| t10k-labels-idx1-ubyte | 4.5K | Test labels, 10,000 examples |
Compared with the CIFAR dataset, which is over 170 MB, this dataset is much smaller. Training is therefore very fast, which gives developers quick feedback and helps keep them interested.
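To make the file layout concrete, here is a minimal sketch of reading the header of a decompressed IDX file. It relies on the format described on the MNIST page (big-endian 32-bit integers, magic number 2051 for images and 2049 for labels). The function name `parse_idx_header` is made up for this illustration, and the header bytes are synthetic rather than read from a real download.

```python
import struct

# Magic numbers from the IDX format description on the MNIST page
IMAGE_MAGIC = 2051  # idx3-ubyte: images
LABEL_MAGIC = 2049  # idx1-ubyte: labels


def parse_idx_header(data):
    """Return (kind, dims) for the first bytes of a decompressed IDX file."""
    magic, count = struct.unpack(">II", data[:8])
    if magic == IMAGE_MAGIC:
        rows, cols = struct.unpack(">II", data[8:16])
        return "images", (count, rows, cols)
    if magic == LABEL_MAGIC:
        return "labels", (count,)
    raise ValueError("not an MNIST IDX file: magic=%d" % magic)


# Synthetic header standing in for train-images-idx3-ubyte
header = struct.pack(">IIII", IMAGE_MAGIC, 60000, 28, 28)
kind, dims = parse_idx_header(header)

# One byte per pixel follows the 16-byte image header
payload_bytes = dims[0] * dims[1] * dims[2]
```

With 60,000 images of 28x28 pixels, the decompressed training image file holds about 47 million pixel bytes, which is still small by image dataset standards.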
Developers do not need to download the dataset separately for training, because PaddlePaddle wraps it. The first call to paddle.dataset.mnist automatically downloads the data to the cache directory /home/username/.cache/paddle/dataset/mnist, and later calls read from the cache without downloading again.
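The download-once behavior can be pictured with a small sketch. This is not PaddlePaddle's code: `cached_path` and `fake_fetch` are hypothetical helpers written for this illustration, the cache lives in a temporary directory, and the real implementation may do more (for example, verifying a checksum).

```python
import os
import tempfile


def cached_path(cache_dir, filename, fetch):
    """Return the local path of filename, calling fetch(path) only on a miss."""
    if not os.path.exists(cache_dir):
        os.makedirs(cache_dir)
    path = os.path.join(cache_dir, filename)
    if not os.path.exists(path):
        fetch(path)  # hypothetical downloader; writes the file to path
    return path


calls = []


def fake_fetch(path):
    # Stand-in for a real download so the sketch runs offline
    calls.append(path)
    with open(path, "wb") as f:
        f.write(b"placeholder")


cache_dir = os.path.join(tempfile.mkdtemp(), "paddle", "dataset", "mnist")
first = cached_path(cache_dir, "train-images-idx3-ubyte.gz", fake_fetch)
second = cached_path(cache_dir, "train-images-idx3-ubyte.gz", fake_fetch)
```

The second call finds the file already on disk, so `fake_fetch` runs only once.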
Define the Neural Network
We use the convolutional neural network LeNet-5. PaddlePaddle officially provides 3 classifiers for this task: Softmax regression, a multi-layer perceptron, and the convolutional neural network LeNet-5. Convolutional neural networks are commonly used for image recognition, so we create a Python file, cnn.py, that defines the LeNet-5 network with the following code:
# coding=utf-8
import paddle.v2 as paddle
# Convolutional neural network LeNet-5; returns the classifier
def convolutional_neural_network():
    # Input layer: a 28*28 image flattened to 784 values
    img = paddle.layer.data(
        name="pixel",
        type=paddle.data_type.dense_vector(784))
    # First convolution-pooling layer
    conv_pool_1 = paddle.networks.simple_img_conv_pool(
        input=img,
        filter_size=5,
        num_filters=20,
        num_channel=1,
        pool_size=2,
        pool_stride=2,
        act=paddle.activation.Relu())
    # Second convolution-pooling layer
    conv_pool_2 = paddle.networks.simple_img_conv_pool(
        input=conv_pool_1,
        filter_size=5,
        num_filters=50,
        num_channel=20,
        pool_size=2,
        pool_stride=2,
        act=paddle.activation.Relu())
    # Fully connected output layer with Softmax activation; size must be 10 (number of digits)
    predict = paddle.layer.fc(
        input=conv_pool_2,
        size=10,
        act=paddle.activation.Softmax())
    return predict
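To see how the image shrinks as it passes through the two convolution-pooling layers, the arithmetic can be sketched as follows. This assumes the convolutions use stride 1 and no padding, which is not stated in the code above, and the helper names `conv_out` and `pool_out` are made up for this illustration.

```python
def conv_out(size, filter_size, stride=1, padding=0):
    """Output width of a square convolution."""
    return (size + 2 * padding - filter_size) // stride + 1


def pool_out(size, pool_size, stride):
    """Output width of a square pooling window without padding."""
    return (size - pool_size) // stride + 1


size = 28
shapes = []
for num_filters in (20, 50):
    size = conv_out(size, filter_size=5)          # 5x5 filters
    size = pool_out(size, pool_size=2, stride=2)  # 2x2 pooling, stride 2
    shapes.append((num_filters, size, size))

# Number of values fed into the fully connected layer
fc_inputs = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
```

Under these assumptions the first layer produces 20 feature maps of 12x12 and the second produces 50 feature maps of 4x4, so the output layer maps 800 values to the 10 digit classes.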
Start Training the Model
We create a train.py Python file for model training.
Import Dependencies
First, import necessary packages, including the critical PaddlePaddle V2 package:
# encoding:utf-8
import os
import sys
import paddle.v2 as paddle
from cnn import convolutional_neural_network
Initialize Paddle
Create a class and initialize PaddlePaddle in its constructor, specifying whether to use GPU and the number of threads:
class TestMNIST:
    def __init__(self):
        # The model runs on CPU with 2 threads
        paddle.init(use_gpu=False, trainer_count=2)
Get the Trainer
Build the loss function from the classifier and the labels, then create the trainable parameters and the optimizer. Finally, create the trainer from these components:
    # *****************Get Trainer********************************
    def get_trainer(self):
        # Get the classifier
        out = convolutional_neural_network()
        # Define the label layer
        label = paddle.layer.data(
            name="label",
            type=paddle.data_type.integer_value(10))
        # Get the loss function
        cost = paddle.layer.classification_cost(input=out, label=label)
        # Create the parameters
        parameters = paddle.parameters.create(layers=cost)
        # Define the optimization method
        # learning_rate: step size of each parameter update
        # momentum: fraction of the previous update carried into the next one
        # regularization: L2 penalty to reduce overfitting
        optimizer = paddle.optimizer.Momentum(
            learning_rate=0.1 / 128.0,
            momentum=0.9,
            regularization=paddle.optimizer.L2Regularization(rate=0.0005 * 128))
        # Create the trainer
        # cost: loss function
        # parameters: training parameters (newly created, or loaded from previous training)
        # update_equation: optimization method
        trainer = paddle.trainer.SGD(
            cost=cost,
            parameters=parameters,
            update_equation=optimizer)
        return trainer
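As a rough illustration of what the momentum and L2 regularization settings do, here is the textbook momentum update applied to a single scalar weight. This is a sketch only: PaddlePaddle's Momentum optimizer may apply the learning rate and weight decay differently, and `momentum_step` and the toy objective are made up for this example.

```python
def momentum_step(w, v, grad, lr, momentum, l2_rate):
    """One textbook momentum update with L2 weight decay on a scalar weight."""
    g = grad + l2_rate * w       # the L2 term pulls the weight toward zero
    v = momentum * v - lr * g    # the velocity accumulates past gradients
    return w + v, v


batch_size = 128
lr = 0.1 / batch_size            # same expressions as in get_trainer
l2_rate = 0.0005 * batch_size

# Toy objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
w, v = 0.0, 0.0
for _ in range(2000):
    w, v = momentum_step(w, v, 2 * (w - 3), lr, 0.9, l2_rate)
```

In this toy problem the weight settles near 6 / 2.064 (about 2.907) rather than exactly 3, because the L2 penalty shifts the minimum slightly toward zero.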
Start Training
Train the model by passing the trainer the training data, the number of passes, and an event handler that prints logs and saves the parameters:
    # *****************Start Training********************************
    def start_trainer(self):
        # Get the trainer
        trainer = self.get_trainer()
        # Define the training event handler
        def event_handler(event):
            if isinstance(event, paddle.event.EndIteration):
                if event.batch_id % 100 == 0:
                    print "\nPass %d, Batch %d, Cost %f, %s" % (
                        event.pass_id, event.batch_id, event.cost, event.metrics)
                else:
                    sys.stdout.write('.')
                    sys.stdout.flush()
            if isinstance(event, paddle.event.EndPass):
                # Save the trained parameters
                model_path = '../model'
                if not os.path.exists(model_path):
                    os.makedirs(model_path)
                with open(model_path + "/model.tar", 'w') as f:
                    trainer.save_parameter_to_tar(f=f)
                # Evaluate on the test set
                result = trainer.test(reader=paddle.batch(paddle.dataset.mnist.test(), batch_size=128))
                print "\nTest with Pass %d, Cost %f, %s\n" % (event.pass_id, result.cost, result.metrics)
        # Get the training data
        reader = paddle.batch(
            paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=20000),
            batch_size=128)
        # Start training
        # reader: training data
        # num_passes: number of training passes
        # event_handler: training events (e.g., logging, saving parameters)
        trainer.train(
            reader=reader,
            num_passes=100,
            event_handler=event_handler)
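The training data passes through two wrappers, a buffered shuffle and a batcher. The sketch below imitates that pattern in plain Python to show the idea. It is not PaddlePaddle's implementation: `shuffle`, `batch`, and `toy_train` are stand-ins written for this example, and it assumes a reader is a zero-argument callable that returns an iterator over samples.

```python
import random


def shuffle(reader, buf_size, seed=None):
    """Wrap a reader so samples are shuffled within a bounded buffer."""
    rng = random.Random(seed)

    def shuffled_reader():
        buf = []
        for sample in reader():
            buf.append(sample)
            if len(buf) >= buf_size:
                rng.shuffle(buf)
                for item in buf:
                    yield item
                buf = []
        rng.shuffle(buf)  # flush the final, partially filled buffer
        for item in buf:
            yield item

    return shuffled_reader


def batch(reader, batch_size):
    """Wrap a reader so it yields lists of up to batch_size samples."""
    def batch_reader():
        current = []
        for sample in reader():
            current.append(sample)
            if len(current) == batch_size:
                yield current
                current = []
        if current:  # keep the last, smaller batch
            yield current

    return batch_reader


def toy_train():
    # Stand-in for the MNIST training reader: 60,000 dummy samples
    for i in range(60000):
        yield (i,)


reader = batch(shuffle(toy_train, buf_size=20000, seed=0), batch_size=128)
batches = list(reader())
```

With 60,000 samples and a batch size of 128, this sketch yields 469 batches per pass, the last one holding the remaining 96 samples.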
Call the training function in the main entry point:
if __name__ == "__main__":
    testMNIST = TestMNIST()
    # Start training
    testMNIST.start_trainer()
Sample training logs:
Pass 0, Batch 0, Cost 2.991905, {'classification_error_evaluator': 0.859375}
...................................................................................................
Pass 0, Batch 100, Cost 0.891881, {'classification_error_evaluator': 0.3046875}
...................................................................................................
Pass 0, Batch 200, Cost 0.309183, {'classification_error_evaluator': 0.0859375}
...................................................................................................
Pass 0, Batch 300, Cost 0.289464, {'classification_error_evaluator': 0.078125}
...................................................................................................
Pass 0, Batch 400, Cost 0.131645, {'classification_error_evaluator': 0.03125}
....................................................................
Test with Pass 0, Cost 0.117626, {'classification_error_evaluator': 0.03790000081062317}
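The classification_error_evaluator values in these logs look like the fraction of misclassified samples in a batch (or in the whole test set for the final line). Assuming that interpretation, a minimal sketch of the metric follows. The function name and the batch contents are hypothetical.

```python
def classification_error(predicted, labels):
    """Fraction of samples whose predicted class differs from the label."""
    wrong = sum(1 for p, y in zip(predicted, labels) if p != y)
    return wrong / float(len(labels))


# Hypothetical batch of 128 predictions, the first 110 of them wrong
labels = [i % 10 for i in range(128)]
predicted = [(y + 1) % 10 if i < 110 else y for i, y in enumerate(labels)]
error = classification_error(predicted, labels)
```

Under this reading, the first logged value of 0.859375 corresponds to 110 mistakes out of 128 samples, and the test error of about 0.0379 corresponds to 379 mistakes out of 10,000 test images.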
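Prediction Using Trained Parameters
We create an infer.py Python file for prediction.
Initialize PaddlePaddle
As in training, initialize PaddlePaddle to run on the CPU. The prediction script also imports NumPy and PIL, which the later steps use for image handling:
import numpy as np
import paddle.v2 as paddle
from PIL import Image
from cnn import convolutional_neural_network
class TestMNIST:
    def __init__(self):
        # The model runs on CPU with 2 threads
        paddle.init(use_gpu=False, trainer_count=2)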
Load Trained Parameters
Load the model parameters that were saved during training:
    # *****************Load Parameters********************************
    def get_parameters(self):
        with open("../model/model.tar", 'r') as f:
            parameters = paddle.parameters.Parameters.from_tar(f)
        return parameters
Read the Image
Process the input image so that it matches the training data format: 28x28 grayscale, flattened to a float array, and scaled the same way as the training data:
    # *****************Get Test Data********************************
    def get_TestData(self, path):
        def load_images(file):
            # Convert to grayscale
            im = Image.open(file).convert('L')
            # Resize to 28x28
            im = im.resize((28, 28), Image.ANTIALIAS)
            im = np.array(im).astype(np.float32).flatten()
            # paddle.dataset.mnist scales training pixels to [-1, 1], so do the same here
            im = im / 255.0 * 2.0 - 1.0
            return im
        test_data = []
        test_data.append((load_images(path),))
        return test_data
Start Prediction
Use the trained parameters and the classifier to predict the image:
    # *****************Prediction with Trained Parameters********************************
    def to_prediction(self, out, parameters, test_data):
        # Perform prediction
        probs = paddle.infer(
            output_layer=out,
            parameters=parameters,
            input=test_data)
        # Sort classes by descending probability and print the best one
        lab = np.argsort(-probs)
        print "Prediction result: %d" % lab[0][0]
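The np.argsort(-probs) call ranks the classes from most to least probable, so the first index is the predicted digit. The same idea in plain Python, as a small sketch with a made-up probability vector (`rank_classes` is a name invented for this example):

```python
def rank_classes(probs):
    """Return class indices ordered from most to least probable."""
    return sorted(range(len(probs)), key=lambda i: -probs[i])


# Hypothetical softmax output for one image (values sum to 1)
probs = [0.01, 0.02, 0.05, 0.80, 0.01, 0.04, 0.01, 0.03, 0.02, 0.01]
ranking = rank_classes(probs)
best = ranking[0]
```

Keeping the full ranking rather than only the top index makes it easy to inspect the runner-up classes when a prediction looks doubtful.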
Call the prediction function in the main entry point:
if __name__ == "__main__":
    testMNIST = TestMNIST()
    out = convolutional_neural_network()
    parameters = testMNIST.get_parameters()
    test_data = testMNIST.get_TestData('../images/infer_3.png')
    testMNIST.to_prediction(out=out, parameters=parameters, test_data=test_data)
Sample prediction output:
Prediction result: 3
All Code Files
cnn.py code:
# coding=utf-8
import paddle.v2 as paddle
# Convolutional neural network LeNet-5; returns the classifier
def convolutional_neural_network():
    img = paddle.layer.data(
        name="pixel",
        type=paddle.data_type.dense_vector(784))
    conv_pool_1 = paddle.networks.simple_img_conv_pool(
        input=img,
        filter_size=5,
        num_filters=20,
        num_channel=1,
        pool_size=2,
        pool_stride=2,
        act=paddle.activation.Relu())
    conv_pool_2 = paddle.networks.simple_img_conv_pool(
        input=conv_pool_1,
        filter_size=5,
        num_filters=50,
        num_channel=20,
        pool_size=2,
        pool_stride=2,
        act=paddle.activation.Relu())
    predict = paddle.layer.fc(
        input=conv_pool_2,
        size=10,
        act=paddle.activation.Softmax())
    return predict
train.py code:
# encoding:utf-8
import os
import sys
import paddle.v2 as paddle
from cnn import convolutional_neural_network
class TestMNIST:
    def __init__(self):
        paddle.init(use_gpu=False, trainer_count=2)
    def get_trainer(self):
        out = convolutional_neural_network()
        label = paddle.layer.data(
            name="label",
            type=paddle.data_type.integer_value(10))
        cost = paddle.layer.classification_cost(input=out, label=label)
        parameters = paddle.parameters.create(layers=cost)
        optimizer = paddle.optimizer.Momentum(
            learning_rate=0.1 / 128.0,
            momentum=0.9,
            regularization=paddle.optimizer.L2Regularization(rate=0.0005 * 128))
        trainer = paddle.trainer.SGD(
            cost=cost,
            parameters=parameters,
            update_equation=optimizer)
        return trainer
    def start_trainer(self):
        trainer = self.get_trainer()
        def event_handler(event):
            if isinstance(event, paddle.event.EndIteration):
                if event.batch_id % 100 == 0:
                    print "\nPass %d, Batch %d, Cost %f, %s" % (
                        event.pass_id, event.batch_id, event.cost, event.metrics)
                else:
                    sys.stdout.write('.')
                    sys.stdout.flush()
            if isinstance(event, paddle.event.EndPass):
                model_path = '../model'
                if not os.path.exists(model_path):
                    os.makedirs(model_path)
                with open(model_path + "/model.tar", 'w') as f:
                    trainer.save_parameter_to_tar(f=f)
                result = trainer.test(reader=paddle.batch(paddle.dataset.mnist.test(), batch_size=128))
                print "\nTest with Pass %d, Cost %f, %s\n" % (event.pass_id, result.cost, result.metrics)
        reader = paddle.batch(
            paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=20000),
            batch_size=128)
        trainer.train(
            reader=reader,
            num_passes=100,
            event_handler=event_handler)
if __name__ == "__main__":
    testMNIST = TestMNIST()
    testMNIST.start_trainer()
infer.py code:
# encoding:utf-8
import numpy as np
import paddle.v2 as paddle
from PIL import Image
from cnn import convolutional_neural_network
class TestMNIST:
    def __init__(self):
        paddle.init(use_gpu=False, trainer_count=2)
    def get_parameters(self):
        with open("../model/model.tar", 'r') as f:
            parameters = paddle.parameters.Parameters.from_tar(f)
        return parameters
    def get_TestData(self, path):
        def load_images(file):
            im = Image.open(file).convert('L')
            im = im.resize((28, 28), Image.ANTIALIAS)
            im = np.array(im).astype(np.float32).flatten()
            # paddle.dataset.mnist scales training pixels to [-1, 1], so do the same here
            im = im / 255.0 * 2.0 - 1.0
            return im
        test_data = []
        test_data.append((load_images(path),))
        return test_data
    def to_prediction(self, out, parameters, test_data):
        probs = paddle.infer(
            output_layer=out,
            parameters=parameters,
            input=test_data)
        lab = np.argsort(-probs)
        print "Prediction result: %d" % lab[0][0]
if __name__ == "__main__":
    testMNIST = TestMNIST()
    out = convolutional_neural_network()
    parameters = testMNIST.get_parameters()
    test_data = testMNIST.get_TestData('../images/infer_3.png')
    testMNIST.to_prediction(out=out, parameters=parameters, test_data=test_data)
Previous: Notes on “My PaddlePaddle Learning Journey” - Part 1: PaddlePaddle Installation
Next: Notes on “My PaddlePaddle Learning Journey” - Part 3: CIFAR Color Image Recognition
Project Code
GitHub Repository: https://github.com/yeyupiaoling/LearnPaddle
References
- http://paddlepaddle.org/
- http://yann.lecun.com/exdb/mnist/