💎An easy-to-use PyTorch library for face landmarks detection: training, evaluation, inference, and 100+ data augmentations.🎉
English | Data Augmentations API Docs | ZhiHu Page | Pypi Downloads
torchlm aims to provide a high-level pipeline for face landmarks detection. It supports training, evaluating, exporting and inference (Python/C++), offers 100+ data augmentations, and can be easily installed via pip.

- High-level pipeline for training and inference.
- Provides 30+ native landmarks data augmentations.
- Can bind 80+ transforms from torchvision and albumentations with one line of code.
- Supports PIPNet, YOLOX, ResNet, MobileNet and ShuffleNet for face landmarks detection.
- [2022/03/08]: Add PIPNet: Towards Efficient Facial Landmark Detection in the Wild, CVPR2021.
- [2022/02/13]: Add 30+ transforms and bind 80+ transforms from torchvision and albumentations.
| Model | Backbone | Head | 300W | COFW | AFLW | WFLW | Download |
|---|---|---|---|---|---|---|---|
| PIPNet | MobileNetV2 | Heatmap+Regression+NRM | 3.40 | 3.43 | 1.52 | 4.79 | link |
| PIPNet | ResNet18 | Heatmap+Regression+NRM | 3.36 | 3.31 | 1.48 | 4.47 | link |
| PIPNet | ResNet50 | Heatmap+Regression+NRM | 3.34 | 3.18 | 1.44 | 4.48 | link |
| PIPNet | ResNet101 | Heatmap+Regression+NRM | 3.19 | 3.08 | 1.42 | 4.31 | link |
You can install torchlm directly from PyPI.
pip install torchlm>=0.1.6.10                             # or install the latest pypi version: `pip install torchlm`
pip install torchlm>=0.1.6.10 -i https://pypi.org/simple/ # or install from a specific pypi mirror with '-i'
Or install from source if you want the latest torchlm, installing it in editable mode with -e.
git clone --depth=1 https://github.com/xlite-dev/torchlm.git
cd torchlm && pip install -e .
torchlm provides 30+ native data augmentations for landmarks and can bind 80+ transforms from torchvision and albumentations. The layout format of landmarks is xy with shape (N, 2).
Use the 30+ native transforms from torchlm directly:
import torchlm

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomScale(prob=0.5),
    torchlm.LandmarksRandomMask(prob=0.5),
    torchlm.LandmarksRandomBlur(kernel_range=(5, 25), prob=0.5),
    torchlm.LandmarksRandomBrightness(prob=0.),
    torchlm.LandmarksRandomRotate(40, prob=0.5, bins=8),
    torchlm.LandmarksRandomCenterCrop((0.5, 1.0), (0.5, 1.0), prob=0.5)
])
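The composed pipeline is then called on an image and its landmarks. The snippet below is a small usage sketch only: the image path and the randomly generated landmarks are placeholders, and it assumes an HWC uint8 image array together with an (N, 2) float landmarks array in the xy layout described above.

```python
import cv2
import numpy as np

# usage sketch: apply the composed `transform` from above to an image and its
# landmarks; "your_face.jpg" and the random landmarks are placeholders only.
img = cv2.imread("your_face.jpg")                  # HWC, BGR, uint8
landmarks = np.random.uniform(
    low=0., high=min(img.shape[:2]), size=(98, 2)
).astype(np.float32)                               # (N, 2) xy landmarks

trans_img, trans_landmarks = transform(img, landmarks)
print(trans_img.shape, trans_landmarks.shape)
```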
Also, a user-friendly API, build_default_transform, is available to build a default transform pipeline.
transform = torchlm.build_default_transform(
    input_size=(input_size, input_size),
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225],
    force_norm_before_mean_std=True,  # img /= 255. first
    rotate=30,
    keep_aspect=False,
    to_tensor=True  # array -> Tensor & HWC -> CHW
)
See transforms.md for the supported transform sets; more examples can be found at test/transforms.py.
💡 More details about transforms in torchlm
torchlm provides 30+ native data augmentations for landmarks and can bind 80+ transforms from torchvision and albumentations through the torchlm.bind method. The layout format of landmarks is xy with shape (N, 2), where N denotes the number of input landmarks. Further, torchlm.bind provides a prob param at bind level to turn any transform or callable into a random-style augmentation. The data augmentations in torchlm are safe and simple: any transform operation that would leave landmarks outside the image at runtime is automatically dropped (skipped), so the number of landmarks stays unchanged. And yes, it is OK to pass a Tensor to an np.ndarray-like transform; torchlm automatically handles the different data types and wraps the result back to the original type through an autodtype wrapper.
Bind 80+ transforms from torchvision and albumentations
NOTE: Please install albumentations first if you want to bind albumentations' transforms. If you run into a conflict between different installed versions of opencv (opencv-python and opencv-python-headless; albumentations needs opencv-python-headless), please uninstall opencv-python and opencv-python-headless first, then reinstall albumentations. See albumentations#1140 for more details.
# first uninstall the conflicting opencv packages
pip uninstall opencv-python
pip uninstall opencv-python-headless
pip uninstall albumentations  # if you have installed albumentations
pip install albumentations    # then reinstall albumentations; deps such as opencv will also be installed
Then, check if albumentations is available.
torchlm.albumentations_is_available()  # True or False
transform = torchlm.LandmarksCompose([
    torchlm.bind(torchvision.transforms.GaussianBlur(kernel_size=(5, 25)), prob=0.5),
    torchlm.bind(albumentations.ColorJitter(p=0.5))
])
Bind custom callable array or Tensor transform functions
# First, define your custom functions
def callable_array_noop(img: np.ndarray, landmarks: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    # do some transform here ...
    return img.astype(np.uint32), landmarks.astype(np.float32)

def callable_tensor_noop(img: Tensor, landmarks: Tensor) -> Tuple[Tensor, Tensor]:
    # do some transform here ...
    return img, landmarks
# Then, bind your functions and put them into the transforms pipeline.
transform = torchlm.LandmarksCompose([
    torchlm.bind(callable_array_noop, bind_type=torchlm.BindEnum.Callable_Array),
    torchlm.bind(callable_tensor_noop, bind_type=torchlm.BindEnum.Callable_Tensor, prob=0.5)
])
Some global debug settings for torchlm's transforms:

- Setting the logging mode to True globally might help you figure out the runtime details.
# some global settings
torchlm.set_transforms_debug(True)
torchlm.set_transforms_logging(True)
torchlm.set_autodtype_logging(True)
Some detailed information will be shown at runtime; the logs might look like:
LandmarksRandomScale() AutoDtype Info: AutoDtypeEnum.Array_InOut
LandmarksRandomScale() Execution Flag: False
BindTorchVisionTransform(GaussianBlur())() AutoDtype Info: AutoDtypeEnum.Tensor_InOut
BindTorchVisionTransform(GaussianBlur())() Execution Flag: True
BindAlbumentationsTransform(ColorJitter())() AutoDtype Info: AutoDtypeEnum.Array_InOut
BindAlbumentationsTransform(ColorJitter())() Execution Flag: True
BindTensorCallable(callable_tensor_noop())() AutoDtype Info: AutoDtypeEnum.Tensor_InOut
BindTensorCallable(callable_tensor_noop())() Execution Flag: False
Error at LandmarksRandomTranslate() Skip, Flag: False Error Info: LandmarksRandomTranslate() have 98 input landmarks, but got 96 output landmarks!
LandmarksRandomTranslate() Execution Flag: False
Execution Flag: True means the current transform was executed successfully; False means it was not executed, either because of the random probability or because of a runtime exception (torchlm will show the error info if debug mode is True).
AutoDtype Info:
- Array_InOut means the current transform needs an np.ndarray as input and then outputs an np.ndarray.
- Tensor_InOut means the current transform needs a torch Tensor as input and then outputs a torch Tensor.
- Array_In means the current transform needs an np.ndarray input and then outputs a torch Tensor.
- Tensor_In means the current transform needs a torch Tensor input and then outputs an np.ndarray.
Yes, it is OK to pass a Tensor to an np.ndarray-like transform; torchlm will automatically handle the different data types and wrap the result back to the original type through the autodtype wrapper.
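As a concrete, hedged illustration of that autodtype behavior, the sketch below feeds Tensor inputs into a native array-style transform; the HWC uint8 layout and the random data are assumptions made only for this example.

```python
import torch
import torchlm

# LandmarksRandomScale is an Array_InOut transform (see the logs above), so the
# Tensor inputs here should be converted to np.ndarray internally and wrapped
# back to Tensors by the autodtype wrapper.
transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomScale(prob=1.)
])

img_tensor = torch.randint(0, 255, (256, 256, 3), dtype=torch.uint8)  # HWC image Tensor (assumed layout)
lms_tensor = torch.rand(98, 2) * 256.                                 # (N, 2) landmarks Tensor

new_img, new_lms = transform(img_tensor, lms_tensor)
print(type(new_img), type(new_lms))  # expected: torch.Tensor for both
```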
In torchlm, each model has two high-level, user-friendly APIs for training, named apply_training and apply_freezing. apply_training handles the training process, and apply_freezing decides whether to freeze the backbone for fine-tuning.

Here is an example with PIPNet. You can freeze the backbone before fine-tuning through apply_freezing.
from torchlm.models import pipnet

# will auto download pretrained weights from latest release if pretrained=True
model = pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98,
               net_stride=32, input_size=256, meanface_type="wflw",
               backbone_pretrained=True)
model.apply_freezing(backbone=True)
model.apply_training(
    annotation_path="../data/WFLW/converted/train.txt",  # or fine-tuning your custom data
    num_epochs=10,
    learning_rate=0.0001,
    save_dir="./save/pipnet",
    save_prefix="pipnet-wflw-resnet18",
    save_interval=1,
    logging_interval=1,
    device="cuda",
    coordinates_already_normalized=True,
    batch_size=16,
    num_workers=4,
    shuffle=True
)
Please jump to the entry point of the function for detailed documentation of the apply_training API of each model defined in torchlm, e.g. pipnet/_impls.py#L166. While training is running, you will see logs like:
Parameters for DataLoader: {'batch_size': 16, 'num_workers': 4, 'shuffle': True}
Built _PIPTrainDataset: train count is 7500!
Epoch 0/9
----------
[Epoch 0/9, Batch 1/468] <Total loss: 0.372885> <cls loss: 0.063186> <x loss: 0.078508> <y loss: 0.071679> <nbx loss: 0.086480> <nby loss: 0.073031>
[Epoch 0/9, Batch 2/468] <Total loss: 0.354169> <cls loss: 0.051672> <x loss: 0.075350> <y loss: 0.071229> <nbx loss: 0.083785> <nby loss: 0.072132>
[Epoch 0/9, Batch 3/468] <Total loss: 0.367538> <cls loss: 0.056038> <x loss: 0.078029> <y loss: 0.076432> <nbx loss: 0.083546> <nby loss: 0.073492>
[Epoch 0/9, Batch 4/468] <Total loss: 0.339656> <cls loss: 0.053631> <x loss: 0.073036> <y loss: 0.066723> <nbx loss: 0.080007> <nby loss: 0.066258>
[Epoch 0/9, Batch 5/468] <Total loss: 0.364556> <cls loss: 0.051094> <x loss: 0.077378> <y loss: 0.071951> <nbx loss: 0.086363> <nby loss: 0.077770>
[Epoch 0/9, Batch 6/468] <Total loss: 0.371356> <cls loss: 0.049117> <x loss: 0.079237> <y loss: 0.075729> <nbx loss: 0.086213> <nby loss: 0.081060>
...
[Epoch 0/9, Batch 33/468] <Total loss: 0.298983> <cls loss: 0.041368> <x loss: 0.069912> <y loss: 0.057667> <nbx loss: 0.072996> <nby loss: 0.057040>
The annotation_path parameter denotes the path to a custom annotation file; the format must be:
"img0_path x0 y0 x1 y1 ... xn-1,yn-1""img1_path x0 y0 x1 y1 ... xn-1,yn-1""img2_path x0 y0 x1 y1 ... xn-1,yn-1""img3_path x0 y0 x1 y1 ... xn-1,yn-1"...
If the labels in annotation_path are already normalized by the image size, please set coordinates_already_normalized to True in the apply_training API.
"img0_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h""img1_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h""img2_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h""img3_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h"...
Here is an example of WFLW to show you how to prepare the dataset; also see test/data.py.
Some models in torchlm support additional custom settings beyond the num_lms of your custom dataset. For example, PIPNet also needs a custom meanface generated from your custom dataset. Please jump to the source code of each model defined in torchlm for details about these additional settings, which give you more flexibility in the training or fine-tuning process. Here is an example of how to train PIPNet on your own dataset with a custom meanface setting.
Set up your custom meanface and nearest-neighbor landmarks through the pipnet.set_custom_meanface method. This method calculates the Euclidean distance between the landmarks in the meanface and automatically sets up the nearest neighbors for each landmark (an illustrative numpy sketch of this idea follows the signature below). NOTE: PIPNet will reshape the detection heads if the number of landmarks in the custom dataset is not equal to the num_lms you initialized.
def set_custom_meanface(custom_meanface_file_or_string: str) -> bool:
    """
    :param custom_meanface_file_or_string: a long string or a file containing normalized or
        un-normalized meanface coords, the format is "x0,y0,x1,y1,x2,y2,...,xn-1,yn-1".
    :return: status, True if successful.
    """
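To make the nearest-neighbor step concrete, here is an illustrative numpy sketch of the idea (not torchlm's actual implementation): for each meanface landmark, rank the others by Euclidean distance and keep the num_nb closest.

```python
import numpy as np

def nearest_neighbors_from_meanface(meanface: np.ndarray, num_nb: int = 10) -> np.ndarray:
    # illustrative sketch only: meanface is an (N, 2) array of landmark coords.
    # For each landmark, rank all others by Euclidean distance and keep the
    # num_nb closest ones (excluding the landmark itself).
    diff = meanface[:, None, :] - meanface[None, :, :]   # (N, N, 2) pairwise offsets
    dist = np.linalg.norm(diff, axis=-1)                 # (N, N) Euclidean distances
    order = np.argsort(dist, axis=1)                     # ascending; self (distance 0) comes first
    return order[:, 1:num_nb + 1]                        # (N, num_nb) neighbor indices
```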
Also, a generate_meanface API is available in torchlm to help you get a meanface for your custom dataset.
# generate your custom meanface
custom_meanface, custom_meanface_string = torchlm.data.annotools.generate_meanface(
    annotation_path="../data/WFLW/converted/train.txt",
    coordinates_already_normalized=True
)
# check your generated meanface
rendered_meanface = torchlm.data.annotools.draw_meanface(
    meanface=custom_meanface,
    coordinates_already_normalized=True
)
cv2.imwrite("./logs/wflw_meanface.jpg", rendered_meanface)
# setting up your custom meanface
model.set_custom_meanface(custom_meanface_file_or_string=custom_meanface_string)
In torchlm, pre-defined dataset converters for commonly used benchmark datasets are available, such as 300W, COFW, WFLW and AFLW. These converters help you convert a dataset to the standard annotation format that torchlm needs. Here is an example for WFLW.
from torchlm.data import LandmarksWFLWConverter

# setup your path to the original downloaded dataset from official
converter = LandmarksWFLWConverter(
    data_dir="../data/WFLW",
    save_dir="../data/WFLW/converted",
    extend=0.2,
    rebuild=True,
    target_size=256,
    keep_aspect=False,
    force_normalize=True,
    force_absolute_path=True
)
converter.convert()
converter.show(count=30)  # show you some converted images with landmarks for debugging
Then, the output layout in ../data/WFLW/converted would look like:
├── image
│   ├── test
│   └── train
├── show
│   ├── 16--Award_Ceremony_16_Award_Ceremony_Awards_Ceremony_16_589x456y91.jpg
│   ├── 20--Family_Group_20_Family_Group_Family_Group_20_118x458y58.jpg
...
├── test.txt
└── train.txt

The ONNXRuntime (CPU/GPU), MNN, NCNN and TNN C++ inference of torchlm will be released in lite.ai.toolkit. Here is an example of 1000 facial landmarks detection using FaceLandmarks1000. Download the model from Model-Zoo2.
#include"lite/lite.h"staticvoidtest_default(){ std::string onnx_path ="../../../hub/onnx/cv/FaceLandmark1000.onnx"; std::string test_img_path ="../../../examples/lite/resources/test_lite_face_landmarks_0.png"; std::string save_img_path ="../../../logs/test_lite_face_landmarks_1000.jpg";auto *face_landmarks_1000 =newlite::cv::face::align::FaceLandmark1000(onnx_path); lite::types::Landmarks landmarks; cv::Mat img_bgr =cv::imread(test_img_path); face_landmarks_1000->detect(img_bgr, landmarks);lite::utils::draw_landmarks_inplace(img_bgr, landmarks);cv::imwrite(save_img_path, img_bgr);delete face_landmarks_1000;}
The output is:
More classes for face alignment (68 points, 98 points, 106 points, 1000 points)
auto *align = new lite::cv::face::align::PFLD(onnx_path);              // 106 landmarks, 1.0Mb only!
auto *align = new lite::cv::face::align::PFLD98(onnx_path);            // 98 landmarks, 4.8Mb only!
auto *align = new lite::cv::face::align::PFLD68(onnx_path);            // 68 landmarks, 2.8Mb only!
auto *align = new lite::cv::face::align::MobileNetV268(onnx_path);     // 68 landmarks, 9.4Mb only!
auto *align = new lite::cv::face::align::MobileNetV2SE68(onnx_path);   // 68 landmarks, 11Mb only!
auto *align = new lite::cv::face::align::FaceLandmark1000(onnx_path);  // 1000 landmarks, 2.0Mb only!
auto *align = new lite::cv::face::align::PIPNet98(onnx_path);          // 98 landmarks, CVPR2021!
auto *align = new lite::cv::face::align::PIPNet68(onnx_path);          // 68 landmarks, CVPR2021!
auto *align = new lite::cv::face::align::PIPNet29(onnx_path);          // 29 landmarks, CVPR2021!
auto *align = new lite::cv::face::align::PIPNet19(onnx_path);          // 19 landmarks, CVPR2021!
For more details of the C++ APIs, please check lite.ai.toolkit.
In torchlm, we provide pipelines for deploying models with PyTorch and ONNXRuntime. A high-level API named runtime.bind binds face detection and landmarks models together; you can then run the runtime.forward API to get the output landmarks and bboxes. Here is an example with PIPNet. Pretrained weights of PIPNet: Download.
import torchlm
from torchlm.tools import faceboxesv2
from torchlm.models import pipnet

torchlm.runtime.bind(faceboxesv2(device="cpu"))  # set device="cuda" if you want to run with CUDA
# set map_location="cuda" if you want to run with CUDA
torchlm.runtime.bind(
    pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98,
           net_stride=32, input_size=256, meanface_type="wflw",
           map_location="cpu", checkpoint=None)
)  # will auto download pretrained weights from latest release if pretrained=True
landmarks, bboxes = torchlm.runtime.forward(image)
image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)
For inference with ONNXRuntime:

import torchlm
from torchlm.runtime import faceboxesv2_ort, pipnet_ort

torchlm.runtime.bind(faceboxesv2_ort())
torchlm.runtime.bind(
    pipnet_ort(onnx_path="pipnet_resnet18.onnx", num_nb=10, num_lms=98,
               net_stride=32, input_size=256, meanface_type="wflw")
)
landmarks, bboxes = torchlm.runtime.forward(image)
image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)
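In both snippets, image is assumed to be an image array (e.g. HWC BGR); the README does not show how it is loaded, so the following is just a minimal sketch that uses OpenCV for the I/O around an already-bound runtime:

```python
import cv2
import torchlm

# minimal I/O sketch: "your_face_image.jpg" is a placeholder path, and the
# torchlm.runtime.bind(...) calls from either snippet above are assumed to
# have been executed already.
image = cv2.imread("your_face_image.jpg")                  # HWC, BGR, uint8
landmarks, bboxes = torchlm.runtime.forward(image)         # faces first, then landmarks
image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)
cv2.imwrite("./landmarks_result.jpg", image)               # save the visualization
```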
In torchlm, each model has a high-level, user-friendly API named apply_evaluating for evaluation. This method calculates the NME, FR and AUC on the eval dataset. Here is an example with PIPNet.
from torchlm.models import pipnet

# will auto download pretrained weights from latest release if pretrained=True
model = pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98,
               net_stride=32, input_size=256, meanface_type="wflw",
               backbone_pretrained=True)
NME, FR, AUC = model.apply_evaluating(
    annotation_path="../data/WFLW/converted/test.txt",
    norm_indices=[60, 72],  # the indexes of the two eyeballs
    coordinates_already_normalized=True,
    eval_normalized_coordinates=False
)
print(f"NME: {NME}, FR: {FR}, AUC: {AUC}")
Then, you will get the performance results (NME, FR, AUC).
Built _PIPEvalDataset: eval count is 2500!
Evaluating PIPNet: 100%|██████████| 2500/2500 [02:53<00:00, 14.45it/s]
NME: 0.04453323229181989, FR: 0.04200000000000004, AUC: 0.5732673333333334
In torchlm, each model has a high-level, user-friendly API named apply_exporting for ONNX export. Here is an example with PIPNet.
from torchlm.models import pipnet

# will auto download pretrained weights from latest release if pretrained=True
model = pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98,
               net_stride=32, input_size=256, meanface_type="wflw",
               backbone_pretrained=True)
model.apply_exporting(
    onnx_path="./save/pipnet/pipnet_resnet18.onnx",
    opset=12,
    simplify=True,
    output_names=None  # use default output names
)
Then, you will get a static ONNX model file once the exporting process is done.
...
%195 = Add(%259, %189)
%196 = Relu(%195)
%outputs_cls = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %cls_layer.weight, %cls_layer.bias)
%outputs_x = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %x_layer.weight, %x_layer.bias)
%outputs_y = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %y_layer.weight, %y_layer.bias)
%outputs_nb_x = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %nb_x_layer.weight, %nb_x_layer.bias)
%outputs_nb_y = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %nb_y_layer.weight, %nb_y_layer.bias)
return %outputs_cls, %outputs_x, %outputs_y, %outputs_nb_x, %outputs_nb_y
}
Checking 0/3...
Checking 1/3...
Checking 2/3...
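To sanity-check the exported file, you could run it with onnxruntime. The sketch below is an assumption-laden example: the NCHW 1x3x256x256 dummy input follows the input_size=256 used above, and the input name is queried from the session rather than hard-coded.

```python
import numpy as np
import onnxruntime as ort

# load the exported model and query its I/O metadata
session = ort.InferenceSession("./save/pipnet/pipnet_resnet18.onnx",
                               providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name                   # actual input name of the graph
dummy = np.random.rand(1, 3, 256, 256).astype(np.float32)   # assumed NCHW, input_size=256

outputs = session.run(None, {input_name: dummy})
# per the graph dump above, five heads are expected: cls, x, y, nb_x, nb_y
for out, meta in zip(outputs, session.get_outputs()):
    print(meta.name, out.shape)
```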
The code of torchlm is released under the MIT License.
Please consider ⭐ this repo if you like it, as it is the simplest way to support me.
- The implementation of torchlm's transforms borrows code from Paperspace.
- PIPNet: Towards Efficient Facial Landmark Detection in the Wild, CVPR2021.