Original blog: Doi Technology Team
Link: https://blog.doiduoyi.com/authors/1584446358138
Original intention: Record the learning experience of the excellent Doi Technology Team

*This article is based on PaddlePaddle 0.11.0 and Python 2.7.

Introduction to the Dataset


As the title suggests, training uses handwritten digits from the MNIST database, which contains a training set of 60,000 examples and a test set of 10,000 examples. Each image is a 28x28 pixel matrix, and each label is one of the 10 digits from 0 to 9. The digits have been size-normalized and centered, and all images are single-channel grayscale. An example image is shown below:

This dataset is very small, which makes it well suited to image recognition beginners. It consists of 4 files: the training images and their labels, and the test images and their labels, as listed in the table:

|File Name |Size |Description |
| :---: | :---: | :---: |
|train-images-idx3-ubyte |9.9M |Training images, 60,000 examples |
|train-labels-idx1-ubyte |28.9K |Training labels, 60,000 examples |
|t10k-images-idx3-ubyte |1.6M |Test images, 10,000 examples |
|t10k-labels-idx1-ubyte |4.5K |Test labels, 10,000 examples |

This dataset is much smaller than the CIFAR dataset (170+ MB), so training is fast and developers see results quickly, which helps keep them interested.
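
For readers curious about what is inside these files: they use the IDX binary layout described on the MNIST page, a big-endian header (a magic number, then one 32-bit size per dimension) followed by raw unsigned bytes. Below is a minimal sketch of a parser for that layout. `parse_idx` and the tiny in-memory sample are made up for illustration; PaddlePaddle does the real parsing for you. The last lines compute the uncompressed size of the training image file, which suggests that the sizes in the table are those of the gzip-compressed downloads.

```python
import struct


def parse_idx(buf):
    """Parse an IDX buffer of unsigned bytes; return (dims, flat_values)."""
    # Header start: two zero bytes, a type code (0x08 = unsigned byte),
    # and the number of dimensions.
    zero, dtype, ndim = struct.unpack_from(">HBB", buf, 0)
    if zero != 0 or dtype != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")
    # One big-endian 32-bit size per dimension.
    dims = struct.unpack_from(">" + "I" * ndim, buf, 4)
    offset = 4 + 4 * ndim
    count = 1
    for d in dims:
        count *= d
    values = list(bytearray(buf[offset:offset + count]))
    if len(values) != count:
        raise ValueError("truncated IDX buffer")
    return dims, values


# A made-up "image file" with 2 images of 2x2 pixels, built in memory.
sample = struct.pack(">HBBIII", 0, 0x08, 3, 2, 2, 2) + bytes(bytearray(range(8)))
dims, values = parse_idx(sample)
print(dims)    # (2, 2, 2)
print(values)  # [0, 1, 2, 3, 4, 5, 6, 7]

# Uncompressed training images: 16-byte header plus one byte per pixel.
train_images_bytes = 16 + 60000 * 28 * 28
print(train_images_bytes)  # 47040016
```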

During training, developers do not need to download the dataset separately, because PaddlePaddle wraps it. The first time a paddle.dataset.mnist reader is used, the files are downloaded automatically to the cache directory /home/username/.cache/paddle/dataset/mnist. Later runs read from the cache without downloading again.

Define the Neural Network


The official PaddlePaddle example provides 3 classifiers: Softmax regression, a multi-layer perceptron, and the convolutional neural network LeNet-5. We use LeNet-5 here, since convolutional neural networks are commonly used in image recognition tasks. We create a cnn.py Python file to define the LeNet-5 neural network, with the following code:

# coding=utf-8
import paddle.v2 as paddle

# Convolutional Neural Network LeNet-5, get the classifier
def convolutional_neural_network():
    # Input layer: a 28*28 image flattened to a 784-dimensional vector
    img = paddle.layer.data(name="pixel",
                            type=paddle.data_type.dense_vector(784))
    # First convolutional-pooling layer
    conv_pool_1 = paddle.networks.simple_img_conv_pool(input=img,
                                                       filter_size=5,
                                                       num_filters=20,
                                                       num_channel=1,
                                                       pool_size=2,
                                                       pool_stride=2,
                                                       act=paddle.activation.Relu())
    # Second convolutional-pooling layer
    conv_pool_2 = paddle.networks.simple_img_conv_pool(input=conv_pool_1,
                                                       filter_size=5,
                                                       num_filters=50,
                                                       num_channel=20,
                                                       pool_size=2,
                                                       pool_stride=2,
                                                       act=paddle.activation.Relu())
    # Fully connected output layer with Softmax activation, size must be 10 (number of digits)
    predict = paddle.layer.fc(input=conv_pool_2,
                              size=10,
                              act=paddle.activation.Softmax())
    return predict
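
To see what the fully connected layer actually receives, we can trace the feature-map sizes by hand. This is a sketch under assumptions: it takes the convolutions to use stride 1 with no padding and the pooling windows to use no padding, which I believe are the defaults of simple_img_conv_pool but have not verified for every version. `conv_out` and `pool_out` are helper names made up for this illustration. Biases are left out of the weight count because their layout is not shown in the article.

```python
def conv_out(size, filter_size, stride=1, padding=0):
    """Output width of a convolution (assumed: stride 1, no padding)."""
    return (size + 2 * padding - filter_size) // stride + 1


def pool_out(size, pool_size, stride):
    """Output width of a pooling window that tiles the input exactly."""
    return (size - pool_size) // stride + 1


size = 28
size = conv_out(size, 5)      # first convolution:  28 -> 24
size = pool_out(size, 2, 2)   # first pooling:      24 -> 12
size = conv_out(size, 5)      # second convolution: 12 -> 8
size = pool_out(size, 2, 2)   # second pooling:      8 -> 4

# 50 feature maps of size x size feed the 10-way output layer.
fc_inputs = 50 * size * size

# Weight counts per layer, biases excluded.
conv1_weights = 5 * 5 * 1 * 20
conv2_weights = 5 * 5 * 20 * 50
fc_weights = fc_inputs * 10

print("final map: %dx%d, fc inputs: %d" % (size, size, fc_inputs))
print("weights: %d" % (conv1_weights + conv2_weights + fc_weights))
```

Under these assumptions the output layer sees 800 values per image, and the whole network has only a few tens of thousands of weights, which is one reason it trains quickly on a CPU.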

Start Training the Model


We create a train.py Python file for model training.

Import Dependencies

First, import necessary packages, including the critical PaddlePaddle V2 package:

# encoding:utf-8
import os
import sys
import paddle.v2 as paddle
from cnn import convolutional_neural_network

Initialize Paddle

Create a class and initialize PaddlePaddle in its constructor, specifying whether to use GPU and the number of threads:

class TestMNIST:
    def __init__(self):
        # The model runs on CPU with 2 threads
        paddle.init(use_gpu=False, trainer_count=2)

Get the Trainer

The methods in this and the following sections belong to the TestMNIST class. First, build the loss function from the classifier and the labels, then create the training parameters and the optimizer. Finally, create the trainer from these components:

# *****************Get Trainer********************************
def get_trainer(self):

    # Get the classifier
    out = convolutional_neural_network()

    # Define labels
    label = paddle.layer.data(name="label",
                              type=paddle.data_type.integer_value(10))

    # Get the loss function
    cost = paddle.layer.classification_cost(input=out, label=label)

    # Get parameters
    parameters = paddle.parameters.create(layers=cost)

    """
    Define the optimization method
    learning_rate: Learning rate (step size of each parameter update)
    momentum: Momentum coefficient
    regularization: L2 regularization to reduce overfitting
    """
    optimizer = paddle.optimizer.Momentum(learning_rate=0.1 / 128.0,
                                          momentum=0.9,
                                          regularization=paddle.optimizer.L2Regularization(rate=0.0005 * 128))
    '''
    Create the trainer
    cost: Loss function
    parameters: Training parameters (can be created or loaded from previous training)
    update_equation: Optimization method
    '''
    trainer = paddle.trainer.SGD(cost=cost,
                                 parameters=parameters,
                                 update_equation=optimizer)
    return trainer
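
The optimizer settings are easier to read with the update rule written out. The sketch below is the textbook momentum update with L2 weight decay on a single scalar parameter. It is not PaddlePaddle's internal code, and `momentum_step` is a made-up name. One plausible reading of the `/ 128.0` and `* 128` factors is that they compensate for the batch size of 128 used later, so that the product of learning rate and regularization rate stays at 0.1 * 0.0005; treat that as an interpretation, not documented behavior.

```python
def momentum_step(w, velocity, grad, lr, momentum, l2_rate):
    """One textbook momentum update with L2 weight decay."""
    grad = grad + l2_rate * w                   # L2 term pulls w toward zero
    velocity = momentum * velocity - lr * grad  # keep part of the old direction
    return w + velocity, velocity


# The values from the article: their product does not depend on the 128.
lr = 0.1 / 128.0
l2_rate = 0.0005 * 128
effective_decay = lr * l2_rate  # approximately 0.1 * 0.0005

# Toy problem (illustrative settings): minimize f(w) = (w - 3)^2,
# whose gradient is 2 * (w - 3).
w, v = 0.0, 0.0
for _ in range(300):
    w, v = momentum_step(w, v, 2.0 * (w - 3.0),
                         lr=0.05, momentum=0.9, l2_rate=0.0)
print("w after 300 steps: %.4f" % w)
```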

Start Training

Train the model using the training data, number of passes, and event handlers to print logs and save parameters:

# *****************Start Training********************************
def start_trainer(self):
    # Get the trainer
    trainer = self.get_trainer()

    # Define training event handlers
    def event_handler(event):
        if isinstance(event, paddle.event.EndIteration):
            if event.batch_id % 100 == 0:
                print "\nPass %d, Batch %d, Cost %f, %s" % (
                    event.pass_id, event.batch_id, event.cost, event.metrics)
            else:
                sys.stdout.write('.')
                sys.stdout.flush()
        if isinstance(event, paddle.event.EndPass):
            # Save trained parameters
            model_path = '../model'
            if not os.path.exists(model_path):
                os.makedirs(model_path)
            with open(model_path + "/model.tar", 'wb') as f:
                trainer.save_parameter_to_tar(f=f)

            # Test on the test set
            result = trainer.test(reader=paddle.batch(paddle.dataset.mnist.test(), batch_size=128))
            print "\nTest with Pass %d, Cost %f, %s\n" % (event.pass_id, result.cost, result.metrics)

    # Get training data
    reader = paddle.batch(paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=20000),
                          batch_size=128)
    '''
    Start training
    reader: Training data
    num_passes: Number of training passes
    event_handler: Training events (e.g., logging, saving parameters)
    '''
    trainer.train(reader=reader,
                  num_passes=100,
                  event_handler=event_handler)
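
The training data pipeline is built from small composable functions: a reader creator returns a generator of samples, shuffle wraps it with a shuffle buffer, and batch groups samples into lists. The pure-Python stand-ins below mimic that pattern so it can be tried without PaddlePaddle installed. They are simplified re-implementations written for this note, not the library source, and `toy_reader` is a made-up data source.

```python
import random


def toy_reader():
    """Reader creator stand-in: yields (features, label) tuples."""
    for i in range(10):
        yield ([float(i)], i % 2)


def shuffle(reader, buf_size):
    """Fill a buffer of up to buf_size samples, shuffle it, emit, repeat."""
    def shuffled():
        buf = []
        for sample in reader():
            buf.append(sample)
            if len(buf) >= buf_size:
                random.shuffle(buf)
                for s in buf:
                    yield s
                buf = []
        # Emit whatever is left in the last, partly filled buffer.
        random.shuffle(buf)
        for s in buf:
            yield s
    return shuffled


def batch(reader, batch_size):
    """Group consecutive samples into lists; the last list may be shorter."""
    def batched():
        current = []
        for sample in reader():
            current.append(sample)
            if len(current) == batch_size:
                yield current
                current = []
        if current:
            yield current
    return batched


batches = list(batch(shuffle(toy_reader, buf_size=4), batch_size=4)())
print([len(b) for b in batches])  # [4, 4, 2]
```

Note that in a buffer-based shuffle like this one, a buffer smaller than the dataset only mixes samples within each buffer-sized chunk, so a larger buf_size gives a more thorough shuffle at the cost of memory.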

Call the training function in the main entry point:

if __name__ == "__main__":
    testMNIST = TestMNIST()
    # Start training
    testMNIST.start_trainer()

Sample training logs:

Pass 0, Batch 0, Cost 2.991905, {'classification_error_evaluator': 0.859375}
...................................................................................................
Pass 0, Batch 100, Cost 0.891881, {'classification_error_evaluator': 0.3046875}
...................................................................................................
Pass 0, Batch 200, Cost 0.309183, {'classification_error_evaluator': 0.0859375}
...................................................................................................
Pass 0, Batch 300, Cost 0.289464, {'classification_error_evaluator': 0.078125}
...................................................................................................
Pass 0, Batch 400, Cost 0.131645, {'classification_error_evaluator': 0.03125}
....................................................................
Test with Pass 0, Cost 0.117626, {'classification_error_evaluator': 0.03790000081062317}
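
The numbers in this log can be checked by hand. The sketch below assumes that the final, partly filled batch of each pass is kept rather than dropped, and that classification_error_evaluator is the fraction of misclassified samples; all variable names are local to this example.

```python
import math

train_size, batch_size = 60000, 128

# Batches per pass, counting the last, smaller batch.
batches_per_pass = int(math.ceil(train_size / float(batch_size)))
last_batch_size = train_size - (batches_per_pass - 1) * batch_size
print("%d batches per pass, last batch has %d samples"
      % (batches_per_pass, last_batch_size))  # 469 and 96

# Batch ids run from 0 to 468, and a dot is printed for every id that is
# not a multiple of 100, so "Batch 400" is followed by ids 401..468.
dots_after_400 = (batches_per_pass - 1) - 400
print(dots_after_400)  # 68

# An error of 0.859375 on a batch of 128 means 110 wrong predictions.
wrong_in_first_batch = 0.859375 * batch_size
print("%d" % wrong_in_first_batch)  # 110

# The test error after pass 0 corresponds to about 96.21% accuracy.
test_accuracy = 1.0 - 0.0379
print("%.2f%%" % (100 * test_accuracy))  # 96.21%

# For reference, a uniform guess over 10 classes has cross-entropy ln(10).
uniform_cost = math.log(10)
print("%.4f" % uniform_cost)  # 2.3026
```

This explains why the last row of dots is shorter than the others, and shows that one pass over the data is already enough to bring the test error below 4%.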

Prediction Using Trained Parameters


We create an infer.py Python file for prediction.

Initialize PaddlePaddle

As in training, initialize PaddlePaddle to run on the CPU:

class TestMNIST:
    def __init__(self):
        # The model runs on CPU with 2 threads
        paddle.init(use_gpu=False, trainer_count=2)

Load Trained Parameters

Load the saved model parameters after training:

# *****************Load Parameters********************************
def get_parameters(self):
    with open("../model/model.tar", 'rb') as f:
        parameters = paddle.parameters.Parameters.from_tar(f)
    return parameters
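
The saved model.tar is a tar archive, which is binary data, so binary file modes ('wb' when saving, 'rb' when loading) are the portable choice; text mode happens to behave the same on Linux under Python 2, but not on every platform. The round trip below uses only the standard tarfile module to illustrate this. The member name `fc_weights` and its payload are invented for the example and say nothing about how PaddlePaddle lays out its archive.

```python
import io
import tarfile


def write_tar(fileobj, name, payload):
    """Write one named byte payload into a tar archive on fileobj."""
    with tarfile.open(fileobj=fileobj, mode="w") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))


def read_tar(fileobj):
    """Return {member name: bytes} for every regular file in the archive."""
    with tarfile.open(fileobj=fileobj, mode="r") as tar:
        return dict((m.name, tar.extractfile(m).read())
                    for m in tar.getmembers() if m.isfile())


buf = io.BytesIO()  # stands in for open(path, 'wb')
write_tar(buf, "fc_weights", b"\x00\x01\x02\xff")
buf.seek(0)         # stands in for reopening with open(path, 'rb')
restored = read_tar(buf)
print(sorted(restored))  # ['fc_weights']
```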

Read the Image

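Convert the input image to the format of the training data: 28x28 grayscale, flattened to a float array and scaled by 1/255. The code uses numpy (as np) and PIL's Image, which are imported at the top of infer.py (see the full listing at the end of this article):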

# *****************Get Test Data********************************
def get_TestData(self, path):
    def load_images(file):
        # Convert to grayscale
        im = Image.open(file).convert('L')
        # Resize to 28x28
        im = im.resize((28, 28), Image.ANTIALIAS)
        im = np.array(im).astype(np.float32).flatten()
        im = im / 255.0
        return im

    test_data = []
    test_data.append((load_images(path),))
    return test_data
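
Whatever scaling the training reader applied to the pixels has to be reproduced at inference time, otherwise the network sees inputs in a range it was never trained on. As far as I know, paddle.dataset.mnist maps pixels to [-1, 1] rather than [0, 1], so it is worth checking the range against your installed version. The sketch below shows both mappings on a made-up 2x2 image using plain lists; `scale_pixels` and `flatten` are hypothetical helpers, not part of the article's code.

```python
def scale_pixels(pixels, low=0.0, high=1.0):
    """Map 8-bit values (0..255) linearly onto the range [low, high]."""
    span = high - low
    return [low + span * (p / 255.0) for p in pixels]


def flatten(rows):
    """Flatten a 2-D list of rows into one list, row by row."""
    return [p for row in rows for p in row]


# A made-up 2x2 grayscale image instead of a real 28x28 one.
tiny = [[0, 255], [51, 204]]

unit_range = scale_pixels(flatten(tiny))               # values in [0, 1]
signed_range = scale_pixels(flatten(tiny), -1.0, 1.0)  # values in [-1, 1]

print(unit_range)
print(signed_range)
```

A quick way to check which range applies is to print the minimum and maximum of one sample from the training reader and compare them with the output of load_images.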

Start Prediction

Use the trained parameters and the classifier to predict the image:

# *****************Prediction with Trained Parameters********************************
def to_prediction(self, out, parameters, test_data):

    # Perform prediction
    probs = paddle.infer(output_layer=out,
                         parameters=parameters,
                         input=test_data)
    # Process and print results
    lab = np.argsort(-probs)
    print "Prediction result: %d" % lab[0][0]
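
The result handling deserves a short explanation: probs holds one row of 10 probabilities per input image, np.argsort(-probs) sorts the class indices of each row by descending probability, and lab[0][0] is therefore the most likely digit for the first image. The same idea in plain Python, with a made-up probability vector and a hypothetical helper `rank_classes`:

```python
def rank_classes(probs):
    """Return class indices sorted by descending probability."""
    return sorted(range(len(probs)), key=lambda i: -probs[i])


# Made-up softmax output for one image (10 classes, sums to 1).
probs = [0.01, 0.02, 0.05, 0.80, 0.02, 0.03, 0.02, 0.02, 0.02, 0.01]

ranking = rank_classes(probs)
print("Prediction result: %d" % ranking[0])  # Prediction result: 3
print("Top 3: %s" % ranking[:3])             # Top 3: [3, 2, 5]
```

Keeping the full ranking instead of only the first index makes it easy to report the runner-up classes when a prediction looks doubtful.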

Call the prediction function in the main entry point:

if __name__ == "__main__":
    testMNIST = TestMNIST()
    out = convolutional_neural_network()
    parameters = testMNIST.get_parameters()
    test_data = testMNIST.get_TestData('../images/infer_3.png')
    testMNIST.to_prediction(out=out, parameters=parameters, test_data=test_data)

Sample prediction output:

Prediction result: 3

All Code Files


cnn.py code:

# coding=utf-8
import paddle.v2 as paddle

# Convolutional Neural Network LeNet-5, get the classifier
def convolutional_neural_network():
    img = paddle.layer.data(name="pixel",
                            type=paddle.data_type.dense_vector(784))
    conv_pool_1 = paddle.networks.simple_img_conv_pool(input=img,
                                                       filter_size=5,
                                                       num_filters=20,
                                                       num_channel=1,
                                                       pool_size=2,
                                                       pool_stride=2,
                                                       act=paddle.activation.Relu())
    conv_pool_2 = paddle.networks.simple_img_conv_pool(input=conv_pool_1,
                                                       filter_size=5,
                                                       num_filters=50,
                                                       num_channel=20,
                                                       pool_size=2,
                                                       pool_stride=2,
                                                       act=paddle.activation.Relu())
    predict = paddle.layer.fc(input=conv_pool_2,
                              size=10,
                              act=paddle.activation.Softmax())
    return predict

train.py code:

# encoding:utf-8
import os
import sys
import paddle.v2 as paddle
from cnn import convolutional_neural_network


class TestMNIST:
    def __init__(self):
        paddle.init(use_gpu=False, trainer_count=2)

    def get_trainer(self):
        out = convolutional_neural_network()
        label = paddle.layer.data(name="label",
                                  type=paddle.data_type.integer_value(10))
        cost = paddle.layer.classification_cost(input=out, label=label)
        parameters = paddle.parameters.create(layers=cost)
        optimizer = paddle.optimizer.Momentum(learning_rate=0.1 / 128.0,
                                              momentum=0.9,
                                              regularization=paddle.optimizer.L2Regularization(rate=0.0005 * 128))
        trainer = paddle.trainer.SGD(cost=cost,
                                     parameters=parameters,
                                     update_equation=optimizer)
        return trainer

    def start_trainer(self):
        trainer = self.get_trainer()

        def event_handler(event):
            if isinstance(event, paddle.event.EndIteration):
                if event.batch_id % 100 == 0:
                    print "\nPass %d, Batch %d, Cost %f, %s" % (
                        event.pass_id, event.batch_id, event.cost, event.metrics)
                else:
                    sys.stdout.write('.')
                    sys.stdout.flush()
            if isinstance(event, paddle.event.EndPass):
                model_path = '../model'
                if not os.path.exists(model_path):
                    os.makedirs(model_path)
                with open(model_path + "/model.tar", 'wb') as f:
                    trainer.save_parameter_to_tar(f=f)
                result = trainer.test(reader=paddle.batch(paddle.dataset.mnist.test(), batch_size=128))
                print "\nTest with Pass %d, Cost %f, %s\n" % (event.pass_id, result.cost, result.metrics)

        reader = paddle.batch(paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=20000),
                              batch_size=128)
        trainer.train(reader=reader,
                      num_passes=100,
                      event_handler=event_handler)


if __name__ == "__main__":
    testMNIST = TestMNIST()
    testMNIST.start_trainer()

infer.py code:

# encoding:utf-8
import numpy as np
import paddle.v2 as paddle
from PIL import Image
from cnn import convolutional_neural_network


class TestMNIST:
    def __init__(self):
        paddle.init(use_gpu=False, trainer_count=2)

    def get_parameters(self):
        with open("../model/model.tar", 'rb') as f:
            parameters = paddle.parameters.Parameters.from_tar(f)
        return parameters

    def get_TestData(self, path):
        def load_images(file):
            im = Image.open(file).convert('L')
            im = im.resize((28, 28), Image.ANTIALIAS)
            im = np.array(im).astype(np.float32).flatten()
            im = im / 255.0
            return im

        test_data = []
        test_data.append((load_images(path),))
        return test_data

    def to_prediction(self, out, parameters, test_data):
        probs = paddle.infer(output_layer=out,
                             parameters=parameters,
                             input=test_data)
        lab = np.argsort(-probs)
        print "Prediction result: %d" % lab[0][0]


if __name__ == "__main__":
    testMNIST = TestMNIST()
    out = convolutional_neural_network()
    parameters = testMNIST.get_parameters()
    test_data = testMNIST.get_TestData('../images/infer_3.png')
    testMNIST.to_prediction(out=out, parameters=parameters, test_data=test_data)




Project Code


GitHub Repository: https://github.com/yeyupiaoling/LearnPaddle

References


  1. http://paddlepaddle.org/
  2. http://yann.lecun.com/exdb/mnist/
Xiaoye