Foreword
This project draws on the ArcFace loss function and the PP-OCRv2 model structure. The goal is a face recognition model that is small, accurate, and fast at inference. Training uses the emore dataset, which contains 5,822,653 images of 85,742 people. The lfw-align-128 dataset is used for testing.
Source code address: https://github.com/yeyupiaoling/PaddlePaddle-MobileFaceNets
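To illustrate the ArcFace loss mentioned above, here is a minimal sketch of the additive angular margin for a single sample. It is not the project's implementation. The function names are hypothetical, the margin 0.5 and scale 64 are the commonly cited ArcFace defaults rather than values confirmed by this project, and the handling real implementations add for the case where the angle plus margin exceeds pi is omitted.

```python
import math


def arcface_logits(cosines, label, margin=0.5, scale=64.0):
    """Add an angular margin to the target class, then scale all logits.

    cosines: cosine similarity between the embedding and each class weight.
    margin and scale are assumed defaults, not this project's settings.
    """
    logits = []
    for j, cos_theta in enumerate(cosines):
        # Clamp so acos stays defined under floating-point error.
        cos_theta = max(-1.0, min(1.0, cos_theta))
        if j == label:
            theta = math.acos(cos_theta)
            cos_theta = math.cos(theta + margin)
        logits.append(scale * cos_theta)
    return logits


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one sample."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]


# Toy cosines for three classes; class 0 is the true label.
cosines = [0.5, 0.4, 0.1]
plain_loss = cross_entropy([64.0 * c for c in cosines], label=0)
margin_loss = cross_entropy(arcface_logits(cosines, label=0), label=0)
print(plain_loss, margin_loss)
```

The margin makes the target class harder to satisfy, so the same embedding yields a larger loss. This pushes the network to separate identities by a wider angle.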
Dataset Preparation
The annotation files are provided in the dataset directory and only need to be decompressed. In addition, download the following two datasets and decompress them into the dataset directory:
- emore dataset: Baidu Netdisk
- lfw-align-128: Baidu Netdisk (extraction code: b2ec)
Then run the following command. It extracts the face images to dataset/images and packs the entire dataset into a single binary file, which significantly speeds up data loading during training.
python create_dataset.py
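The idea of packing many small image files into one binary file can be sketched as follows. This only illustrates the general technique and is not the format used by create_dataset.py. The functions `pack_files` and `read_packed`, the JSON index, and the file names are all hypothetical.

```python
import json
import os
import tempfile


def pack_files(paths, data_path, index_path):
    """Concatenate files into one blob and record (offset, size) per file name."""
    index = {}
    with open(data_path, "wb") as out:
        for path in paths:
            with open(path, "rb") as f:
                payload = f.read()
            index[os.path.basename(path)] = [out.tell(), len(payload)]
            out.write(payload)
    with open(index_path, "w", encoding="utf-8") as f:
        json.dump(index, f)


def read_packed(data_path, index_path, key):
    """Read one record back with a single seek instead of opening a small file."""
    with open(index_path, encoding="utf-8") as f:
        offset, size = json.load(f)[key]
    with open(data_path, "rb") as f:
        f.seek(offset)
        return f.read(size)


# Demo with placeholder bytes standing in for real image files.
with tempfile.TemporaryDirectory() as tmp:
    sources = []
    for name, content in [("a.jpg", b"fake-jpeg-bytes-1"), ("b.jpg", b"fake-jpeg-bytes-2")]:
        path = os.path.join(tmp, name)
        with open(path, "wb") as f:
            f.write(content)
        sources.append(path)

    data_file = os.path.join(tmp, "train.data")
    index_file = os.path.join(tmp, "train.index")
    pack_files(sources, data_file, index_file)
    restored = read_packed(data_file, index_file, "b.jpg")

print(restored)
```

Reading from one large file avoids the per-file open and metadata cost of millions of small images, which is typically where the speedup comes from.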
Training
Run train.py to start training. See the code for the available training parameters.
python train.py
Evaluation
Run eval.py to evaluate the model. See the code for the available evaluation parameters.
python eval.py
Prediction
Prediction works out of the box: the provided model file can be used directly. Before running prediction, place face images in the face_db directory. Each image should contain exactly one face and be named after the person, and together these images form the face database. At recognition time, detected faces are compared against this database to find the matching person. Face detection uses the MTCNN model, which is fast and small. Its source code is available at: PaddlePaddle-MTCNN
To run prediction on an image file, execute the following command:
python infer.py --image_path=temp/test.jpg
The log output is as follows:
Face detection time: 45ms
Face recognition time: 6ms
Face comparison results: [('Yang Mi', 0.61594474), ('Dilraba Dilmurat', 0.37707973)]
Face comparison results: [('Dilraba Dilmurat', 0.7290128), ('Yang Mi', 0.3993025)]
Predicted face positions: [[156, 80, 214, 135, 1], [269, 67, 327, 121, 1]]
Recognized face names: ['Yang Mi', 'Dilraba Dilmurat']
Total recognition time: 53ms
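The comparison results in the log above pair each database name with a score. A minimal sketch of how such a ranking could be produced is shown below, assuming the scores are cosine similarities between face embeddings. The function names, the toy three-dimensional embeddings, and the 0.6 threshold are hypothetical and are not taken from infer.py.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_matches(query, face_db):
    """Score the query against every database entry, best match first."""
    scores = [(name, cosine_similarity(query, emb)) for name, emb in face_db.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


def identify(query, face_db, threshold=0.6):
    """Return the best-matching name, or 'unknown' below the assumed threshold."""
    ranked = rank_matches(query, face_db)
    if ranked and ranked[0][1] >= threshold:
        return ranked[0][0]
    return "unknown"


# Toy database: real embeddings would come from the recognition model.
face_db = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
known = identify([0.9, 0.1, 0.0], face_db)
stranger = identify([0.0, 0.0, 1.0], face_db)
print(rank_matches([0.9, 0.1, 0.0], face_db), known, stranger)
```

A threshold keeps faces that are not in the database from being assigned to the nearest name. A suitable value depends on the model and would normally be chosen on an evaluation set.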

To run prediction from a camera, execute the following command:
python infer_camera.py --camera_id=0