Foreword

This project combines the ArcFace loss function with the PP-OCRv2 model structure. The goal is a face recognition model that is small, accurate, and fast at inference. Training uses the emore dataset, which contains 5,822,653 images of 85,742 people. The lfw-align-128 dataset is used for testing.
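
For readers unfamiliar with ArcFace, the core idea is to add an angular margin to the target class before the softmax. The following is a minimal sketch for a single sample, not this project's implementation: the function name arcface_logits is hypothetical, and the margin (0.5) and scale (64) are the defaults from the ArcFace paper, which may differ from the values used in train.py. It also omits the handling that real implementations add for angles where theta + margin exceeds pi.

```python
import math


def arcface_logits(cosines, target, margin=0.5, scale=64.0):
    """Apply an additive angular margin to one sample's class cosines.

    cosines: cos(theta_j) between the normalized feature and each class weight.
    target:  index of the ground-truth class.
    margin, scale: assumed values (ArcFace paper defaults), not read from this project.
    """
    logits = []
    for idx, cos_theta in enumerate(cosines):
        # Clamp to the valid acos domain to guard against rounding error.
        cos_theta = max(-1.0, min(1.0, cos_theta))
        if idx == target:
            # Penalize the target class: cos(theta) becomes cos(theta + margin).
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits


if __name__ == "__main__":
    # The target logit shrinks, so the model must separate classes by a wider angle.
    print(arcface_logits([0.8, 0.1], target=0))
```

The resulting logits would then go through an ordinary softmax cross-entropy loss.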

Source code: https://github.com/yeyupiaoling/PaddlePaddle-MobileFaceNets

Dataset Preparation

The project ships its annotation files in the dataset directory; decompress them before use. Also download the following two datasets and decompress them into the dataset directory:
- emore dataset: Baidu Netdisk
- lfw-align-128: Baidu Netdisk (extraction code: b2ec)

Then run the following command. It extracts the face images to dataset/images and packages the entire dataset into a single binary file, which significantly speeds up data reading during training.

python create_dataset.py
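
To illustrate why a single binary file helps, here is a minimal sketch of the general technique: image bytes are appended to one file, and an index of offsets lets a reader seek straight to any record instead of opening millions of small files. This is not the format that create_dataset.py actually writes; the function names pack_records and read_record and the index layout are assumptions made for illustration.

```python
def pack_records(records, data_path):
    """Append (label, image_bytes) records to one file.

    Returns an index of (offset, size, label) tuples, one per record.
    """
    index = []
    with open(data_path, "wb") as f:
        for label, payload in records:
            index.append((f.tell(), len(payload), label))
            f.write(payload)
    return index


def read_record(data_path, entry):
    """Read one record back using its index entry."""
    offset, size, label = entry
    with open(data_path, "rb") as f:
        f.seek(offset)  # jump directly to the record, no directory scan
        return label, f.read(size)
```

A training loader would normally keep the file handle open rather than reopening it for every record; the sketch reopens it only to stay short.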

Training

Run train.py to start training. See the code for the other training parameters.

python train.py

Evaluation

Run eval.py to evaluate the model. See the code for the other evaluation parameters.

python eval.py

Prediction

The project already includes prediction code, and the model file can be used for prediction directly. Before predicting, place face images in the face_db directory to build the face database: each image should contain exactly one face and be named after the person. At recognition time, each detected face is compared against these images to find the matching person. Face detection uses the MTCNN model, which is fast and small. Its source code is at: PaddlePaddle-MTCNN
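
The face database lookup described above can be sketched as follows. This is an illustration under assumptions rather than the code in infer.py: it assumes each face is represented by a feature vector and that comparison uses cosine similarity, and the helper names (name_from_path, cosine_similarity, rank_matches) are hypothetical.

```python
import math
from pathlib import Path


def name_from_path(image_path):
    """Derive the person's name from the image file name, e.g. 'Yang Mi.jpg'."""
    return Path(image_path).stem


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (assumed comparison metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_matches(query, database):
    """Score a query feature against every database entry, best match first.

    database: dict mapping person name -> feature vector.
    """
    scores = [(name, cosine_similarity(query, feat)) for name, feat in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

In a real pipeline the feature vectors would come from the recognition model; here they are plain lists of floats so the sketch stays self-contained.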

To predict from an image path, run the following command:

python infer.py --image_path=temp/test.jpg

The log output is as follows:

Face detection time: 45ms
Face recognition time: 6ms
Face comparison results: [('Yang Mi', 0.61594474), ('Dilraba Dilmurat', 0.37707973)]
Face comparison results: [('Dilraba Dilmurat', 0.7290128), ('Yang Mi', 0.3993025)]
Predicted face positions: [[156, 80, 214, 135, 1], [269, 67, 327, 121, 1]]
Recognized face names: ['Yang Mi', 'Dilraba Dilmurat']
Total recognition time: 53ms
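
One way to read the log above: each detected face gets a list of (name, score) pairs, and the highest-scoring name is reported. The sketch below reproduces that selection using the scores from the log. The function pick_names, the 0.5 threshold, and the "unknown" fallback are assumptions for illustration; the threshold the project actually uses is not stated here.

```python
def pick_names(comparisons, threshold=0.5, unknown="unknown"):
    """Choose the best-scoring name for each face.

    threshold and unknown are hypothetical: a face whose best score falls
    below the threshold is reported as unknown instead of a database name.
    """
    names = []
    for candidates in comparisons:
        if not candidates:
            names.append(unknown)
            continue
        name, score = max(candidates, key=lambda item: item[1])
        names.append(name if score >= threshold else unknown)
    return names


if __name__ == "__main__":
    # Scores copied from the log output above.
    comparisons = [
        [("Yang Mi", 0.61594474), ("Dilraba Dilmurat", 0.37707973)],
        [("Dilraba Dilmurat", 0.7290128), ("Yang Mi", 0.3993025)],
    ]
    print(pick_names(comparisons))  # ['Yang Mi', 'Dilraba Dilmurat']
```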

To predict from a camera, run the following command:

python infer_camera.py --camera_id=0
Xiaoye