💎An easy-to-use PyTorch library for face landmarks detection: training, evaluation, inference, and 100+ data augmentations.🎉
English | Data Augmentations API Docs | ZhiHu Page | Pypi Downloads
torchlm aims to provide a high-level pipeline for face landmarks detection. It supports training, evaluating, exporting and inference (Python/C++), offers 100+ data augmentations, and can be easily installed via pip.

- High-level pipeline for training and inference.
- Provides 30+ native landmarks data augmentations.
- Can bind 80+ transforms from torchvision and albumentations with one line of code.
- Supports PIPNet, YOLOX, ResNet, MobileNet and ShuffleNet for face landmarks detection.
- [2022/03/08]: Add PIPNet: Towards Efficient Facial Landmark Detection in the Wild, CVPR2021.
- [2022/02/13]: Add 30+ transforms and bind 80+ transforms from torchvision and albumentations.
| Model | Backbone | Head | 300W | COFW | AFLW | WFLW | Download |
|---|---|---|---|---|---|---|---|
| PIPNet | MobileNetV2 | Heatmap+Regression+NRM | 3.40 | 3.43 | 1.52 | 4.79 | link |
| PIPNet | ResNet18 | Heatmap+Regression+NRM | 3.36 | 3.31 | 1.48 | 4.47 | link |
| PIPNet | ResNet50 | Heatmap+Regression+NRM | 3.34 | 3.18 | 1.44 | 4.48 | link |
| PIPNet | ResNet101 | Heatmap+Regression+NRM | 3.19 | 3.08 | 1.42 | 4.31 | link |
You can install torchlm directly from PyPI.
pip install torchlm>=0.1.6.10                             # or install the latest pypi version: `pip install torchlm`
pip install torchlm>=0.1.6.10 -i https://pypi.org/simple/ # or install from a specific pypi mirror with '-i'
Or install from source if you want the latest torchlm, installing it in editable mode with -e.
git clone --depth=1 https://github.com/xlite-dev/torchlm.git
cd torchlm && pip install -e .
torchlm provides 30+ native data augmentations for landmarks and can bind 80+ transforms from torchvision and albumentations. The layout format of landmarks is xy with shape (N, 2).
Use the 30+ native transforms from torchlm directly:
import torchlm

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomScale(prob=0.5),
    torchlm.LandmarksRandomMask(prob=0.5),
    torchlm.LandmarksRandomBlur(kernel_range=(5, 25), prob=0.5),
    torchlm.LandmarksRandomBrightness(prob=0.),
    torchlm.LandmarksRandomRotate(40, prob=0.5, bins=8),
    torchlm.LandmarksRandomCenterCrop((0.5, 1.0), (0.5, 1.0), prob=0.5)
])
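The composed pipeline is then called on an image and its landmarks. The snippet below is a small usage sketch only: the image path and the randomly generated landmarks are placeholders, and it assumes an HWC uint8 image array together with an (N, 2) float landmarks array in the xy layout described above.

```python
import cv2
import numpy as np

# usage sketch: apply the composed `transform` from above to an image and its
# landmarks; "your_face.jpg" and the random landmarks are placeholders only.
img = cv2.imread("your_face.jpg")                  # HWC, BGR, uint8
landmarks = np.random.uniform(
    low=0., high=min(img.shape[:2]), size=(98, 2)
).astype(np.float32)                               # (N, 2) xy landmarks

trans_img, trans_landmarks = transform(img, landmarks)
print(trans_img.shape, trans_landmarks.shape)
```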
Also, a user-friendly API, build_default_transform, is available to build a default transform pipeline.
transform = torchlm.build_default_transform(
    input_size=(input_size, input_size),
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225],
    force_norm_before_mean_std=True,  # img /= 255. first
    rotate=30,
    keep_aspect=False,
    to_tensor=True  # array -> Tensor & HWC -> CHW
)
See transforms.md for the supported transform sets; more examples can be found at test/transforms.py.
💡 More details about transforms in torchlm
torchlm provides 30+ native data augmentations for landmarks and can bind 80+ transforms from torchvision and albumentations through the torchlm.bind method. The layout format of landmarks is xy with shape (N, 2), where N denotes the number of input landmarks. Further, torchlm.bind provides a prob param at bind level to turn any transform or callable into a random-style augmentation. The data augmentations in torchlm are safe and simple: any transform operation that would leave landmarks outside the image at runtime is automatically dropped (skipped), so the number of landmarks stays unchanged. And yes, it is OK to pass a Tensor to an np.ndarray-like transform; torchlm automatically handles the different data types and wraps the result back to the original type through an autodtype wrapper.
Bind 80+ transforms from torchvision and albumentations
NOTE: Please install albumentations first if you want to bind albumentations' transforms. If you run into a conflict between different installed versions of opencv (opencv-python and opencv-python-headless; albumentations needs opencv-python-headless), please uninstall opencv-python and opencv-python-headless first, then reinstall albumentations. See albumentations#1140 for more details.
# first uninstall the conflicting opencv packages
pip uninstall opencv-python
pip uninstall opencv-python-headless
pip uninstall albumentations  # if you have installed albumentations
pip install albumentations    # then reinstall albumentations; deps such as opencv will also be installed
Then, check if albumentations is available.
torchlm.albumentations_is_available()  # True or False
transform = torchlm.LandmarksCompose([
    torchlm.bind(torchvision.transforms.GaussianBlur(kernel_size=(5, 25)), prob=0.5),
    torchlm.bind(albumentations.ColorJitter(p=0.5))
])
Bind custom callable array or Tensor transform functions
# First, define your custom functions
def callable_array_noop(img: np.ndarray, landmarks: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    # do some transform here ...
    return img.astype(np.uint32), landmarks.astype(np.float32)

def callable_tensor_noop(img: Tensor, landmarks: Tensor) -> Tuple[Tensor, Tensor]:
    # do some transform here ...
    return img, landmarks
# Then, bind your functions and put them into the transforms pipeline.
transform = torchlm.LandmarksCompose([
    torchlm.bind(callable_array_noop, bind_type=torchlm.BindEnum.Callable_Array),
    torchlm.bind(callable_tensor_noop, bind_type=torchlm.BindEnum.Callable_Tensor, prob=0.5)
])
Some global debug settings for torchlm's transforms:

- Setting the logging mode to True globally might help you figure out the runtime details.
# some global settings
torchlm.set_transforms_debug(True)
torchlm.set_transforms_logging(True)
torchlm.set_autodtype_logging(True)
Some detailed information will be shown at runtime; the logs might look like:
LandmarksRandomScale() AutoDtype Info: AutoDtypeEnum.Array_InOut
LandmarksRandomScale() Execution Flag: False
BindTorchVisionTransform(GaussianBlur())() AutoDtype Info: AutoDtypeEnum.Tensor_InOut
BindTorchVisionTransform(GaussianBlur())() Execution Flag: True
BindAlbumentationsTransform(ColorJitter())() AutoDtype Info: AutoDtypeEnum.Array_InOut
BindAlbumentationsTransform(ColorJitter())() Execution Flag: True
BindTensorCallable(callable_tensor_noop())() AutoDtype Info: AutoDtypeEnum.Tensor_InOut
BindTensorCallable(callable_tensor_noop())() Execution Flag: False
Error at LandmarksRandomTranslate() Skip, Flag: False Error Info: LandmarksRandomTranslate() have 98 input landmarks, but got 96 output landmarks!
LandmarksRandomTranslate() Execution Flag: False
Execution Flag: True means the current transform was executed successfully; False means it was not executed, either because of the random probability or because of a runtime exception (torchlm will show the error info if debug mode is True).
AutoDtype Info:
- Array_InOut means the current transform needs an np.ndarray as input and then outputs an np.ndarray.
- Tensor_InOut means the current transform needs a torch Tensor as input and then outputs a torch Tensor.
- Array_In means the current transform needs an np.ndarray input and then outputs a torch Tensor.
- Tensor_In means the current transform needs a torch Tensor input and then outputs an np.ndarray.
Yes, it is OK to pass a Tensor to an np.ndarray-like transform; torchlm will automatically handle the different data types and wrap the result back to the original type through the autodtype wrapper.
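As a concrete, hedged illustration of that autodtype behavior, the sketch below feeds Tensor inputs into a native array-style transform; the HWC uint8 layout and the random data are assumptions made only for this example.

```python
import torch
import torchlm

# LandmarksRandomScale is an Array_InOut transform (see the logs above), so the
# Tensor inputs here should be converted to np.ndarray internally and wrapped
# back to Tensors by the autodtype wrapper.
transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomScale(prob=1.)
])

img_tensor = torch.randint(0, 255, (256, 256, 3), dtype=torch.uint8)  # HWC image Tensor (assumed layout)
lms_tensor = torch.rand(98, 2) * 256.                                 # (N, 2) landmarks Tensor

new_img, new_lms = transform(img_tensor, lms_tensor)
print(type(new_img), type(new_lms))  # expected: torch.Tensor for both
```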
In torchlm, each model has two high-level, user-friendly APIs for training, named apply_training and apply_freezing. apply_training handles the training process, and apply_freezing decides whether to freeze the backbone for fine-tuning.

Here is an example with PIPNet. You can freeze the backbone before fine-tuning through apply_freezing.
from torchlm.models import pipnet

# will auto download pretrained weights from latest release if pretrained=True
model = pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98,
               net_stride=32, input_size=256, meanface_type="wflw",
               backbone_pretrained=True)
model.apply_freezing(backbone=True)
model.apply_training(
    annotation_path="../data/WFLW/converted/train.txt",  # or fine-tuning your custom data
    num_epochs=10,
    learning_rate=0.0001,
    save_dir="./save/pipnet",
    save_prefix="pipnet-wflw-resnet18",
    save_interval=1,
    logging_interval=1,
    device="cuda",
    coordinates_already_normalized=True,
    batch_size=16,
    num_workers=4,
    shuffle=True
)
Please jump to the entry point of the function for detailed documentation of the apply_training API of each model defined in torchlm, e.g. pipnet/_impls.py#L166. While training is running, you will see logs like:
Parameters for DataLoader: {'batch_size': 16, 'num_workers': 4, 'shuffle': True}
Built _PIPTrainDataset: train count is 7500!
Epoch 0/9
----------
[Epoch 0/9, Batch 1/468] <Total loss: 0.372885> <cls loss: 0.063186> <x loss: 0.078508> <y loss: 0.071679> <nbx loss: 0.086480> <nby loss: 0.073031>
[Epoch 0/9, Batch 2/468] <Total loss: 0.354169> <cls loss: 0.051672> <x loss: 0.075350> <y loss: 0.071229> <nbx loss: 0.083785> <nby loss: 0.072132>
[Epoch 0/9, Batch 3/468] <Total loss: 0.367538> <cls loss: 0.056038> <x loss: 0.078029> <y loss: 0.076432> <nbx loss: 0.083546> <nby loss: 0.073492>
[Epoch 0/9, Batch 4/468] <Total loss: 0.339656> <cls loss: 0.053631> <x loss: 0.073036> <y loss: 0.066723> <nbx loss: 0.080007> <nby loss: 0.066258>
[Epoch 0/9, Batch 5/468] <Total loss: 0.364556> <cls loss: 0.051094> <x loss: 0.077378> <y loss: 0.071951> <nbx loss: 0.086363> <nby loss: 0.077770>
[Epoch 0/9, Batch 6/468] <Total loss: 0.371356> <cls loss: 0.049117> <x loss: 0.079237> <y loss: 0.075729> <nbx loss: 0.086213> <nby loss: 0.081060>
...
[Epoch 0/9, Batch 33/468] <Total loss: 0.298983> <cls loss: 0.041368> <x loss: 0.069912> <y loss: 0.057667> <nbx loss: 0.072996> <nby loss: 0.057040>
The annotation_path parameter denotes the path to a custom annotation file; the format must be:
"img0_path x0 y0 x1 y1 ... xn-1,yn-1""img1_path x0 y0 x1 y1 ... xn-1,yn-1""img2_path x0 y0 x1 y1 ... xn-1,yn-1""img3_path x0 y0 x1 y1 ... xn-1,yn-1"...
If the labels in annotation_path are already normalized by the image size, please set coordinates_already_normalized to True in the apply_training API.
"img0_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h""img1_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h""img2_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h""img3_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h"...
Here is an example of WFLW to show you how to prepare the dataset; also see test/data.py.
Some models in torchlm support additional custom settings beyond the num_lms of your custom dataset. For example, PIPNet also needs a custom meanface generated from your custom dataset. Please jump to the source code of each model defined in torchlm for details about these additional settings, which give you more flexibility in the training or fine-tuning process. Here is an example of how to train PIPNet on your own dataset with a custom meanface setting.
Set up your custom meanface and nearest-neighbor landmarks through the pipnet.set_custom_meanface method. This method calculates the Euclidean distance between the landmarks in the meanface and automatically sets up the nearest neighbors for each landmark (an illustrative numpy sketch of this idea follows the signature below). NOTE: PIPNet will reshape the detection heads if the number of landmarks in the custom dataset is not equal to the num_lms you initialized.
def set_custom_meanface(custom_meanface_file_or_string: str) -> bool:
    """
    :param custom_meanface_file_or_string: a long string or a file containing normalized or
        un-normalized meanface coords, the format is "x0,y0,x1,y1,x2,y2,...,xn-1,yn-1".
    :return: status, True if successful.
    """
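To make the nearest-neighbor step concrete, here is an illustrative numpy sketch of the idea (not torchlm's actual implementation): for each meanface landmark, rank the others by Euclidean distance and keep the num_nb closest.

```python
import numpy as np

def nearest_neighbors_from_meanface(meanface: np.ndarray, num_nb: int = 10) -> np.ndarray:
    # illustrative sketch only: meanface is an (N, 2) array of landmark coords.
    # For each landmark, rank all others by Euclidean distance and keep the
    # num_nb closest ones (excluding the landmark itself).
    diff = meanface[:, None, :] - meanface[None, :, :]   # (N, N, 2) pairwise offsets
    dist = np.linalg.norm(diff, axis=-1)                 # (N, N) Euclidean distances
    order = np.argsort(dist, axis=1)                     # ascending; self (distance 0) comes first
    return order[:, 1:num_nb + 1]                        # (N, num_nb) neighbor indices
```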
Also, a generate_meanface API is available in torchlm to help you get a meanface for your custom dataset.
# generate your custom meanface
custom_meanface, custom_meanface_string = torchlm.data.annotools.generate_meanface(
    annotation_path="../data/WFLW/converted/train.txt",
    coordinates_already_normalized=True
)
# check your generated meanface
rendered_meanface = torchlm.data.annotools.draw_meanface(
    meanface=custom_meanface,
    coordinates_already_normalized=True
)
cv2.imwrite("./logs/wflw_meanface.jpg", rendered_meanface)
# setting up your custom meanface
model.set_custom_meanface(custom_meanface_file_or_string=custom_meanface_string)
In torchlm, pre-defined dataset converters for commonly used benchmark datasets are available, such as 300W, COFW, WFLW and AFLW. These converters help you convert a dataset to the standard annotation format that torchlm needs. Here is an example for WFLW.
from torchlm.data import LandmarksWFLWConverter

# setup your path to the original downloaded dataset from official
converter = LandmarksWFLWConverter(
    data_dir="../data/WFLW",
    save_dir="../data/WFLW/converted",
    extend=0.2,
    rebuild=True,
    target_size=256,
    keep_aspect=False,
    force_normalize=True,
    force_absolute_path=True
)
converter.convert()
converter.show(count=30)  # show you some converted images with landmarks for debugging
Then, the output layout in ../data/WFLW/converted would look like:
├── image
│   ├── test
│   └── train
├── show
│   ├── 16--Award_Ceremony_16_Award_Ceremony_Awards_Ceremony_16_589x456y91.jpg
│   ├── 20--Family_Group_20_Family_Group_Family_Group_20_118x458y58.jpg
...
├── test.txt
└── train.txt

The ONNXRuntime (CPU/GPU), MNN, NCNN and TNN C++ inference of torchlm will be released in lite.ai.toolkit. Here is an example of 1000 facial landmarks detection using FaceLandmarks1000. Download the model from Model-Zoo2.
#include"lite/lite.h"staticvoidtest_default(){ std::string onnx_path ="../../../hub/onnx/cv/FaceLandmark1000.onnx"; std::string test_img_path ="../../../examples/lite/resources/test_lite_face_landmarks_0.png"; std::string save_img_path ="../../../logs/test_lite_face_landmarks_1000.jpg";auto *face_landmarks_1000 =newlite::cv::face::align::FaceLandmark1000(onnx_path); lite::types::Landmarks landmarks; cv::Mat img_bgr =cv::imread(test_img_path); face_landmarks_1000->detect(img_bgr, landmarks);lite::utils::draw_landmarks_inplace(img_bgr, landmarks);cv::imwrite(save_img_path, img_bgr);delete face_landmarks_1000;}
The output is:
More classes for face alignment (68 points, 98 points, 106 points, 1000 points)
auto *align = new lite::cv::face::align::PFLD(onnx_path);              // 106 landmarks, 1.0Mb only!
auto *align = new lite::cv::face::align::PFLD98(onnx_path);            // 98 landmarks, 4.8Mb only!
auto *align = new lite::cv::face::align::PFLD68(onnx_path);            // 68 landmarks, 2.8Mb only!
auto *align = new lite::cv::face::align::MobileNetV268(onnx_path);     // 68 landmarks, 9.4Mb only!
auto *align = new lite::cv::face::align::MobileNetV2SE68(onnx_path);   // 68 landmarks, 11Mb only!
auto *align = new lite::cv::face::align::FaceLandmark1000(onnx_path);  // 1000 landmarks, 2.0Mb only!
auto *align = new lite::cv::face::align::PIPNet98(onnx_path);          // 98 landmarks, CVPR2021!
auto *align = new lite::cv::face::align::PIPNet68(onnx_path);          // 68 landmarks, CVPR2021!
auto *align = new lite::cv::face::align::PIPNet29(onnx_path);          // 29 landmarks, CVPR2021!
auto *align = new lite::cv::face::align::PIPNet19(onnx_path);          // 19 landmarks, CVPR2021!
For more details of the C++ APIs, please check lite.ai.toolkit.
In torchlm, we provide pipelines for deploying models with PyTorch and ONNXRuntime. A high-level API named runtime.bind binds face detection and landmarks models together; you can then run the runtime.forward API to get the output landmarks and bboxes. Here is an example with PIPNet. Pretrained weights of PIPNet: Download.
import torchlm
from torchlm.tools import faceboxesv2
from torchlm.models import pipnet

torchlm.runtime.bind(faceboxesv2(device="cpu"))  # set device="cuda" if you want to run with CUDA
# set map_location="cuda" if you want to run with CUDA
torchlm.runtime.bind(
    pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98,
           net_stride=32, input_size=256, meanface_type="wflw",
           map_location="cpu", checkpoint=None)
)  # will auto download pretrained weights from latest release if pretrained=True
landmarks, bboxes = torchlm.runtime.forward(image)
image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)
For inference with ONNXRuntime:

import torchlm
from torchlm.runtime import faceboxesv2_ort, pipnet_ort

torchlm.runtime.bind(faceboxesv2_ort())
torchlm.runtime.bind(
    pipnet_ort(onnx_path="pipnet_resnet18.onnx", num_nb=10, num_lms=98,
               net_stride=32, input_size=256, meanface_type="wflw")
)
landmarks, bboxes = torchlm.runtime.forward(image)
image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)
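In both snippets, image is assumed to be an image array (e.g. HWC BGR); the README does not show how it is loaded, so the following is just a minimal sketch that uses OpenCV for the I/O around an already-bound runtime:

```python
import cv2
import torchlm

# minimal I/O sketch: "your_face_image.jpg" is a placeholder path, and the
# torchlm.runtime.bind(...) calls from either snippet above are assumed to
# have been executed already.
image = cv2.imread("your_face_image.jpg")                  # HWC, BGR, uint8
landmarks, bboxes = torchlm.runtime.forward(image)         # faces first, then landmarks
image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)
cv2.imwrite("./landmarks_result.jpg", image)               # save the visualization
```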
In torchlm, each model has a high-level, user-friendly API named apply_evaluating for evaluation. This method calculates the NME, FR and AUC on the eval dataset. Here is an example with PIPNet.
from torchlm.models import pipnet

# will auto download pretrained weights from latest release if pretrained=True
model = pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98,
               net_stride=32, input_size=256, meanface_type="wflw",
               backbone_pretrained=True)
NME, FR, AUC = model.apply_evaluating(
    annotation_path="../data/WFLW/converted/test.txt",
    norm_indices=[60, 72],  # the indexes of the two eyeballs
    coordinates_already_normalized=True,
    eval_normalized_coordinates=False
)
print(f"NME: {NME}, FR: {FR}, AUC: {AUC}")
Then, you will get the performance results (NME, FR, AUC).
Built _PIPEvalDataset: eval count is 2500!
Evaluating PIPNet: 100%|██████████| 2500/2500 [02:53<00:00, 14.45it/s]
NME: 0.04453323229181989, FR: 0.04200000000000004, AUC: 0.5732673333333334
In torchlm, each model has a high-level, user-friendly API named apply_exporting for ONNX export. Here is an example with PIPNet.
from torchlm.models import pipnet

# will auto download pretrained weights from latest release if pretrained=True
model = pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98,
               net_stride=32, input_size=256, meanface_type="wflw",
               backbone_pretrained=True)
model.apply_exporting(
    onnx_path="./save/pipnet/pipnet_resnet18.onnx",
    opset=12,
    simplify=True,
    output_names=None  # use default output names
)
Then, you will get a static ONNX model file once the exporting process is done.
...
%195 = Add(%259, %189)
%196 = Relu(%195)
%outputs_cls = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %cls_layer.weight, %cls_layer.bias)
%outputs_x = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %x_layer.weight, %x_layer.bias)
%outputs_y = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %y_layer.weight, %y_layer.bias)
%outputs_nb_x = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %nb_x_layer.weight, %nb_x_layer.bias)
%outputs_nb_y = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %nb_y_layer.weight, %nb_y_layer.bias)
return %outputs_cls, %outputs_x, %outputs_y, %outputs_nb_x, %outputs_nb_y
}
Checking 0/3...
Checking 1/3...
Checking 2/3...
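To sanity-check the exported file, you could run it with onnxruntime. The sketch below is an assumption-laden example: the NCHW 1x3x256x256 dummy input follows the input_size=256 used above, and the input name is queried from the session rather than hard-coded.

```python
import numpy as np
import onnxruntime as ort

# load the exported model and query its I/O metadata
session = ort.InferenceSession("./save/pipnet/pipnet_resnet18.onnx",
                               providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name                   # actual input name of the graph
dummy = np.random.rand(1, 3, 256, 256).astype(np.float32)   # assumed NCHW, input_size=256

outputs = session.run(None, {input_name: dummy})
# per the graph dump above, five heads are expected: cls, x, y, nb_x, nb_y
for out, meta in zip(outputs, session.get_outputs()):
    print(meta.name, out.shape)
```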
The code of torchlm is released under the MIT License.
Please consider ⭐ this repo if you like it, as it is the simplest way to support me.
- The implementation of torchlm's transforms borrows code from Paperspace.
- PIPNet: Towards Efficient Facial Landmark Detection in the Wild, CVPR2021.