Foreword

This project combines the ArcFace loss function with a MobileNet-based network to build a face recognition model that is small, accurate, and fast at inference. The model is trained on the emore dataset, which contains 5,822,653 images of 85,742 people, and tested on the lfw-align-128 dataset.

Source code address: https://github.com/yeyupiaoling/Pytorch-MobileFaceNet
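To make the ArcFace idea concrete, here is a minimal pure-Python sketch of the additive angular margin applied to a single sample. The scale `s=64` and margin `m=0.5` are the defaults from the ArcFace paper, not values confirmed for this project, and the helper names are hypothetical. Real implementations work on batched tensors and handle the case where the angle plus margin exceeds pi.

```python
import math


def arcface_logits(cosines, target, s=64.0, m=0.5):
    """Add an angular margin m to the target class, then scale by s.

    cosines: cosine similarity between one embedding and each class centre.
    target:  index of the ground-truth class.
    s, m:    assumed values (ArcFace paper defaults), not this project's config.
    """
    logits = []
    for i, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))       # keep acos in its valid domain
        if i == target:
            theta = math.acos(c)
            c = math.cos(theta + m)      # the true class is made harder to match
        logits.append(s * c)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[target]


cosines = [0.5, 0.3, 0.1]                # toy values, class 0 is the true class
plain_loss = cross_entropy([64.0 * c for c in cosines], target=0)
margin_loss = cross_entropy(arcface_logits(cosines, target=0), target=0)
print(f"plain softmax loss: {plain_loss:.6f}")
print(f"ArcFace loss:       {margin_loss:.6f}")
```

The margin raises the loss for the same embedding, so the network has to push each face closer to its own class centre than plain softmax would require.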

Dataset Preparation

This project provides annotation files in the dataset directory; decompress them before use. You also need to download the following two datasets and extract them into the dataset directory:
- emore dataset: Baidu Netdisk
- lfw-align-128 dataset: Baidu Netdisk (extraction code: b2ec)

Then run the following command. It extracts the face images to dataset/images and packs the entire dataset into a single binary file, which significantly speeds up data reading during training:

python create_dataset.py
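The speed-up from a packed file comes from replacing millions of small file opens with seeks inside one large file. The sketch below illustrates that idea only; the file layout, file names, and function names are hypothetical and are not taken from create_dataset.py.

```python
import os
import struct
import tempfile


def pack_records(records, data_path, index_path):
    """Write every (label, payload) into one data file plus a text index."""
    with open(data_path, "wb") as data, open(index_path, "w") as index:
        for label, payload in records:
            index.write(f"{data.tell()}\t{label}\n")      # where this record starts
            data.write(struct.pack("<I", len(payload)))   # 4-byte length prefix
            data.write(payload)


def load_index(index_path):
    """Return a list of (offset, label) pairs."""
    entries = []
    with open(index_path) as index:
        for line in index:
            offset, label = line.split("\t")
            entries.append((int(offset), int(label)))
    return entries


def read_record(data_path, offset):
    """Jump straight to one record without scanning the rest of the file."""
    with open(data_path, "rb") as data:
        data.seek(offset)
        (size,) = struct.unpack("<I", data.read(4))
        return data.read(size)


# Stand-in payloads; in practice these would be encoded image bytes.
records = [(0, b"fake-jpeg-bytes-a"), (1, b"fake-jpeg-bytes-bb")]
with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, "train.data")
    index_path = os.path.join(tmp, "train.index")
    pack_records(records, data_path, index_path)
    entries = load_index(index_path)
    restored = [(label, read_record(data_path, offset)) for offset, label in entries]

print(restored == records)
```

A training loader built this way would keep the data file open and shuffle the index entries instead of shuffling file paths.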

Training

Run train.py to start training. For the full list of training parameters, see the code:

python train.py

Sample training output:

[2021-11-03 15:18:28.813591] Train epoch 9, batch: 6100/90979, loss: 1.215695, accuracy: 0.859375, lr: 0.000107, eta: 5 days, 5:28:26
[2021-11-03 15:18:37.044353] Train epoch 9, batch: 6200/90979, loss: 0.908210, accuracy: 0.859375, lr: 0.000107, eta: 5 days, 6:35:02
[2021-11-03 15:18:45.229030] Train epoch 9, batch: 6300/90979, loss: 0.964092, accuracy: 0.875000, lr: 0.000107, eta: 5 days, 9:17:21
[2021-11-03 15:18:53.449567] Train epoch 9, batch: 6400/90979, loss: 1.208947, accuracy: 0.828125, lr: 0.000107, eta: 5 days, 12:41:06
[2021-11-03 15:19:01.682437] Train epoch 9, batch: 6500/90979, loss: 1.081449, accuracy: 0.875000, lr: 0.000107, eta: 5 days, 10:29:44
[2021-11-03 15:19:09.895995] Train epoch 9, batch: 6600/90979, loss: 1.277803, accuracy: 0.828125, lr: 0.000107, eta: 5 days, 12:29:05
[2021-11-03 15:19:18.086872] Train epoch 9, batch: 6700/90979, loss: 1.308692, accuracy: 0.828125, lr: 0.000107, eta: 5 days, 7:23:03
[2021-11-03 15:19:26.306897] Train epoch 9, batch: 6800/90979, loss: 1.474561, accuracy: 0.781250, lr: 0.000107, eta: 5 days, 8:20:23
[2021-11-03 15:19:34.528685] Train epoch 9, batch: 6900/90979, loss: 1.295028, accuracy: 0.812500, lr: 0.000107, eta: 5 days, 5:54:56
[2021-11-03 15:19:42.736712] Train epoch 9, batch: 7000/90979, loss: 1.474828, accuracy: 0.812500, lr: 0.000107, eta: 5 days, 8:32:33
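Log lines in this format are easy to parse if you want to plot loss or accuracy over time. The sketch below is one way to do it; the regular expression is written against the sample lines above and is an assumption about the format, not part of the project. The batch size of 64 is also an inference: every logged accuracy is a multiple of 1/64, and 5,822,653 images divided into batches of 64 gives the 90,979 batches per epoch shown in the log.

```python
import math
import re

# Pattern written against the sample output; adjust it if the format differs.
LOG_PATTERN = re.compile(
    r"Train epoch (?P<epoch>\d+), batch: (?P<batch>\d+)/(?P<total>\d+), "
    r"loss: (?P<loss>[\d.]+), accuracy: (?P<accuracy>[\d.]+), lr: (?P<lr>[\d.]+)"
)


def parse_line(line):
    """Return the numeric fields of one training log line, or None."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "epoch": int(fields["epoch"]),
        "batch": int(fields["batch"]),
        "total_batches": int(fields["total"]),
        "loss": float(fields["loss"]),
        "accuracy": float(fields["accuracy"]),
        "lr": float(fields["lr"]),
    }


line = (
    "[2021-11-03 15:18:28.813591] Train epoch 9, batch: 6100/90979, "
    "loss: 1.215695, accuracy: 0.859375, lr: 0.000107, eta: 5 days, 5:28:26"
)
record = parse_line(line)

batch_size = 64                      # inferred from the log, not read from train.py
num_images = 5_822_653               # emore image count stated in the foreword
batches_per_epoch = math.ceil(num_images / batch_size)

print(record)
print(batches_per_epoch == record["total_batches"])
```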

Evaluation

Run eval.py to evaluate the model. For the full list of evaluation parameters, see the code:

python eval.py
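A common way to evaluate on LFW-style data is to score pairs of faces by similarity and report the accuracy at the best decision threshold. Whether eval.py does exactly this is not stated here, so treat the following as a sketch of the general technique with made-up toy values and a hypothetical function name.

```python
def best_threshold(similarities, same_person, steps=100):
    """Scan thresholds in [0, 1] and keep the first one with the best accuracy.

    similarities: one similarity score per face pair.
    same_person:  True if the pair shows the same person, else False.
    """
    best_acc, best_thr = 0.0, 0.0
    for i in range(steps + 1):
        thr = i / steps
        correct = sum(
            (sim >= thr) == same
            for sim, same in zip(similarities, same_person)
        )
        acc = correct / len(similarities)
        if acc > best_acc:
            best_acc, best_thr = acc, thr
    return best_acc, best_thr


# Toy pairs for illustration only; these are not results from this project.
sims = [0.71, 0.64, 0.36, 0.31, 0.55, 0.20]
labels = [True, True, False, False, True, False]
accuracy, threshold = best_threshold(sims, labels)
print(f"accuracy {accuracy:.2f} at threshold {threshold:.2f}")
```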

Prediction

The trained model file can be used directly for prediction. Before predicting, build a face database by placing face images in the face_db directory: each image should contain exactly one face and be named after the person it shows. Every recognition compares the detected faces against these images to find a match. Face detection uses the MTCNN model, which is fast and small. Source code: Pytorch-MTCNN
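Matching against the face database can be pictured as ranking database entries by similarity to the query embedding. The sketch below assumes cosine similarity and a threshold of 0.6; both are assumptions for illustration, as are the function names and the toy 3-dimensional embeddings. It returns a ranked list of (name, score) pairs, the same shape as the comparison results in the sample prediction output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_matches(query, face_db):
    """Score the query against every database entry, best match first."""
    scores = [(name, cosine_similarity(query, emb)) for name, emb in face_db.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


def identify(query, face_db, threshold=0.6):
    """Return the best name if it clears the (assumed) threshold, else 'unknown'."""
    ranked = rank_matches(query, face_db)
    if not ranked or ranked[0][1] < threshold:
        return "unknown", ranked
    return ranked[0][0], ranked


# Toy embeddings; a real model produces much longer vectors.
face_db = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
name_a, ranked_a = identify([0.9, 0.1, 0.0], face_db)
name_b, ranked_b = identify([0.5, 0.5, 0.7], face_db)
print(name_a, ranked_a)
print(name_b, ranked_b)
```

Returning the full ranked list alongside the chosen name makes it easier to tune the threshold against your own face database.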

To predict from an image file, run the following command:

python infer.py --image_path=temp/test.jpg

Sample prediction output:

Face detection time: 38ms
Face recognition time: 11ms
Face comparison results: [('Dilraba Dilmurat', 0.7030987), ('Yang Mi', 0.36442137)]
Face comparison results: [('Yang Mi', 0.63616204), ('Dilraba Dilmurat', 0.3101096)]
Predicted face positions: [[272, 67, 328, 118, 1], [156, 80, 215, 134, 1]]
Recognized face names: ['Dilraba Dilmurat', 'Yang Mi']
Total recognition time: 82ms

To predict from a camera, run the following command:

python infer_camera.py --camera_id=0
