Add PaddleDetection-based Layout Model #54


Merged

Conversation

@an1018 (Contributor)

Hi, I have reorganized the file structure of Paddle.
Please review. Thanks!

setup.py Outdated
@@ -27,12 +27,22 @@
"torch",
"torchvision",
"iopath",
"tqdm",


Please remove this one; if paddleocr is installed, this module will be installed automatically.


if not enforce_cpu:
# initial GPU memory(M), device ID
config.enable_use_gpu(200, 0)


Please change it to 2000.

@lolipopshock changed the title from "reorganize paddle structure" to "Add PaddleDetection-based Layout Model" on Aug 6, 2021
@lolipopshock (Member)

Thanks for this PR. Here are some general questions/thoughts:

  • Please remove the OCR part from this PR.
  • Please use PEP 8 for the variable naming conventions -- you could use `black` and `pylint` for checking.
  • For the `preprocess.py` and `layoutmodel.py` files:
    1. I think it contains a lot of unnecessarily complicated design -- can we just simplify the functions to minimal complexity?
    2. And more fundamentally, why should the `preprocess` function live within `layoutparser` rather than `paddledetection`? I think it's `paddledetection`'s obligation to do the preprocessing for the models.
  • Also, could you add tests for the paddle modules? You could follow the examples in `test_model`.

@an1018 (Contributor, Author)

Thanks for this PR. Here are some general questions/thoughts:

  • Please remove the OCR part from this PR.
  • Please use PEP 8 for the variable naming conventions -- you could use `black` and `pylint` for checking.
  • For the `preprocess.py` and `layoutmodel.py` files:
    1. I think it contains a lot of unnecessarily complicated design -- can we just simplify the functions to minimal complexity?
    2. And more fundamentally, why should the `preprocess` function live within `layoutparser` rather than `paddledetection`? I think it's `paddledetection`'s obligation to do the preprocessing for the models.
  • Also, could you add tests for the paddle modules? You could follow the examples in [`test_model`](https://github.com/Layout-Parser/layout-parser/blob/master/tests/test_model.py).

Thanks for your reply, I'll modify it later!

@littletomatodonkey

Thanks for this PR. Here are some general questions/thoughts:

  • Please remove the OCR part from this PR.

  • Please use PEP 8 for the variable naming conventions -- you could use `black` and `pylint` for checking.

  • For the `preprocess.py` and `layoutmodel.py` files:

    1. I think it contains a lot of unnecessarily complicated design -- can we just simplify the functions to minimal complexity?
    2. And more fundamentally, why should the `preprocess` function live within `layoutparser` rather than `paddledetection`? I think it's `paddledetection`'s obligation to do the preprocessing for the models.
  • Also, could you add tests for the paddle modules? You could follow the examples in `test_model`.

Hi, for the `preprocess.py` and `layoutmodel.py` files:

  1. We decouple the preprocessing and the inference engine because it makes the code clearer.
  2. We extract the preprocessing code to reduce the dependency on paddledetection; otherwise users might have to download the whole repo, which is too heavy.

@an1018 (Contributor, Author)

Hi, I've modified the code, including:

  • removed the OCR part

  • used `pylint` for checking
    (For the `__init__.py` and `catalog.py` files, the changes are minor, so they were not checked.)

  • added tests for the paddle modules in `test_model`


setup.py Outdated
@@ -33,6 +33,15 @@
'google-cloud-vision==1',
'pytesseract'
],
"paddleocr": [
Member

please don't change anything here - I'll modify it later myself.

Contributor (Author)

Done

@@ -11,11 +11,13 @@

from .ocr import (
GCVFeatureType, GCVAgent,
TesseractFeatureType, TesseractAgent
TesseractFeatureType, TesseractAgent,
PaddleocrAgent
Member

Please don't add OCR Agent in this PR -- right now we focus on LayoutModel.

Contributor (Author)

Done

@@ -86,7 +86,6 @@
0: "Table"
},
}
# fmt: on
Member

Why was this removed?

Contributor (Author)

Sorry, I've changed it back.

}

# fmt: off
LABEL_MAP_CATALOG = {
Member

Please remove the label mappings for unused datasets.

Contributor (Author)

Done

# fmt: on


class LayoutParserDetectron2ModelHandler(PathHandler):
Member

Change the class name to `LayoutParserPaddleModelHandler`.

Contributor (Author)

Done

trt_min_shape=1,
trt_max_shape=1280,
trt_opt_shape=640,
min_subgraph_size=3):
Member

And is there any documentation, in English, on how to set and adjust these parameters? If not, I would suggest removing them.

Contributor (Author)

Done

enable_mkldnn=True,
thread_num=10,
min_subgraph_size=3,
use_dynamic_shape=False,
Member

And apparently some of the hyperparameters/configurations are not used. Please remove them.

Contributor (Author)

Done

inputs = self.create_inputs(im, im_info)
return inputs

def postprocess(self, np_boxes, np_masks, inputs):
@lolipopshock (Member), Aug 6, 2021 (edited)

I think we need to merge `postprocess` with `gather_output`. It's not reasonable to break them into two functions.

Contributor (Author)

I've merged them, thanks!


if not enforce_cpu:
# initial GPU memory(M), device ID
config.enable_use_gpu(2000, 0)
Member

Why is it 2000 rather than some other value? Please leave a note in the comment. Thanks.
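For illustration only, such a note could look like the following minimal sketch (this is not the PR's actual comment; `model_file`, `params_file`, and `enforce_cpu` are placeholder names, and the wording of the explanation is just a suggestion):

```python
from paddle.inference import Config

# Placeholder paths for illustration; point these at an exported inference model.
model_file = "inference/model.pdmodel"
params_file = "inference/model.pdiparams"
enforce_cpu = False

config = Config(model_file, params_file)

if not enforce_cpu:
    # Reserve an initial GPU memory pool of 2000 MB on device 0.
    # 2000 MB is an empirical starting size that comfortably holds the layout
    # detection model and its intermediate tensors; lower it for smaller GPUs.
    config.enable_use_gpu(2000, 0)
```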

Contributor (Author)

Done

setup.py Outdated
@@ -33,6 +33,12 @@
'google-cloud-vision==1',
'pytesseract'
],
"paddlepaddle": [
Member

Sorry for not being clear earlier. Please remove all the new `extras_require` entries, thanks.

Contributor (Author)

Got it.

@@ -10,7 +10,7 @@ class DropboxHandler(HTTPURLHandler):
"""

def _get_supported_prefixes(self):
return ["https://www.dropbox.com"]
return ["https://www.dropbox.com","https://paddle-model-ecology.bj.bcebos.com"]
@lolipopshock (Member), Aug 11, 2021 (edited)

I don't think this is a good idea -- perhaps it's better to create something like `class PaddleModelURLHandler(HTTPURLHandler):` in the paddle model `catalog.py` file.

Member

Please see the comments below for more detailed instructions.

layout = self.gather_output(np_boxes, np_masks)
return layout

def untar_files(self, model_tar, model_dir):
Member

It feels counter-intuitive to put this function inside the modeling folder, and I think it's better to include this part inside the PathManager by rewriting the `_get_local_path` function. So basically it will be something like this:

```python
class PaddleModelURLHandler(HTTPURLHandler):
    """
    Supports download and file check for dropbox links
    """

    def _get_supported_prefixes(self):
        return ["https://paddle-model-ecology.bj.bcebos.com"]

    def _isfile(self, path):
        return path in self.cache_map

    def _get_local_path(
        self,
        path: str,
        force: bool = False,
        cache_dir: Optional[str] = None,
        **kwargs: Any,
    ) -> str:
        """
        This implementation downloads the remote resource and caches it locally.
        The resource will only be downloaded if not previously requested.
        """
        self._check_kwargs(kwargs)
        if (
            force
            or path not in self.cache_map
            or not os.path.exists(self.cache_map[path])
        ):
            logger = logging.getLogger(__name__)
            parsed_url = urlparse(path)
            dirname = os.path.join(
                get_cache_dir(cache_dir),
                os.path.dirname(parsed_url.path.lstrip("/")),
            )
            filename = path.split("/")[-1]
            if len(filename) > self.MAX_FILENAME_LEN:
                filename = filename[:100] + "_" + uuid.uuid4().hex
            cached = os.path.join(dirname, filename)
            with file_lock(cached):
                if not os.path.isfile(cached):
                    logger.info("Downloading {} ...".format(path))
                    cached = download(path, dirname, filename=filename)
                    #### ----> Add from here
                    if path.endswith(".tar"):
                        untar_function()
            logger.info("URL {} cached in {}".format(path, cached))
            self.cache_map[path] = cached
        return self.cache_map[path]
```

You might also need to check whether the untar'ed version exists beforehand to avoid duplicated downloading.
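For reference, a minimal sketch of how such a handler is typically registered and used with an iopath-style `PathManager` (the instance name and the model URL path are illustrative and not taken from this PR; it assumes the `PaddleModelURLHandler` sketched above is defined):

```python
from iopath.common.file_io import PathManager

# Illustrative setup: register the custom handler on a PathManager instance.
path_manager = PathManager()
path_manager.register_handler(PaddleModelURLHandler())

# Remote Paddle model URLs now resolve to a locally cached path
# (and, with the change above, an untarred one). The URL is hypothetical.
local_path = path_manager.get_local_path(
    "https://paddle-model-ecology.bj.bcebos.com/some/model.tar"
)
```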

image, im_info = permute(image, im_info)

inputs = {}
inputs['image'] = np.array((image, )).astype('float32')
@lolipopshock (Member), Aug 11, 2021 (edited)

This feels weird -- probably it's better to have `np.array(image)[np.newaxis, :].astype('float32')`.
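Both spellings produce the same batched array; a small self-contained sketch showing the equivalence (the image shape is illustrative):

```python
import numpy as np

image = np.zeros((3, 640, 640))  # illustrative CHW image

# Original: wrap the image in a 1-element tuple, then convert.
batched_a = np.array((image,)).astype("float32")

# Suggested: make the added batch axis explicit with np.newaxis.
batched_b = np.array(image)[np.newaxis, :].astype("float32")

assert batched_a.shape == batched_b.shape == (1, 3, 640, 640)
```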

config of model, defined by `Config(model_dir)`
model_path (str):
The path to the saved weights of the model.
threshold (float):
@lolipopshock (Member), Aug 11, 2021 (edited)

Don't forget to clean up the docstrings -- and you can specify what the extra configs are under the `extra_configs` block.
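As an illustration only (the class name, wording, and layout are hypothetical; the parameter names and defaults mirror ones that appear elsewhere in this PR), the extra configs could be documented along these lines:

```python
class PaddleDetectionLayoutModel:  # illustrative class name, not necessarily the PR's
    def __init__(self, config_path, model_path=None, label_map=None, extra_config=None):
        """Hypothetical docstring sketch -- not the PR's actual text.

        Args:
            config_path (str):
                The path (or catalog identifier) of the model config.
            model_path (str, optional):
                The path to the saved weights of the model.
            label_map (dict, optional):
                Mapping from the model's class ids to label names.
            extra_config (dict, optional):
                Extra settings forwarded to the Paddle predictor, for example:
                    threshold (float): score threshold for keeping detections (default 0.5).
                    enable_mkldnn (bool): whether to enable MKL-DNN acceleration on CPU.
                    thread_num (int): number of CPU math threads (default 10).
        """
```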

config_path = self._reconstruct_path_with_detector_name(config_path)
model_tar = PathManager.get_local_path(config_path)

pre_dir = os.path.dirname(model_tar)
Member

please remove this -- see the details below.

thread_num=extra_config.get('thread_num',10))

self.threshold = extra_config.get('threshold',0.5)
self.input_shape = extra_config.get('input_shape',[3,640,640])
Member

As `input_shape` is in `self.im_info`, you might want to remove this variable.

self.threshold = extra_config.get('threshold',0.5)
self.input_shape = extra_config.get('input_shape',[3,640,640])
self.label_map = label_map
self.im_info = {
@lolipopshock (Member), Aug 11, 2021 (edited)

So basically this is something like `default_image_info`, right? It's better to use a more descriptive name rather than `im_info`, which is also the output of the preprocessors.

im_info (dict): info of processed image
"""
image = image.astype(np.float32, copy=False)
mean = np.array(mean)[np.newaxis, np.newaxis, :]
@lolipopshock (Member), Aug 11, 2021 (edited)

Why initialize the arrays this way every time? We can just initialize `mean` and `std` to be the appropriate ndarrays.
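In other words, something like the following minimal sketch (the ImageNet mean/std values are common defaults used purely for illustration, not necessarily the values in this PR):

```python
import numpy as np

# Precompute once, e.g. in __init__, with the broadcasting shape already applied.
# The ImageNet statistics below are illustrative defaults.
PIXEL_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32).reshape(1, 1, 3)
PIXEL_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32).reshape(1, 1, 3)


def normalize(image: np.ndarray) -> np.ndarray:
    """Normalize an HWC float image with the precomputed statistics."""
    image = image.astype(np.float32, copy=False)
    return (image - PIXEL_MEAN) / PIXEL_STD
```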

inputs (dict): input of model
"""
# read rgb image
image, im_info = decode_image(image, self.im_info)
Member

I don't like the design here:

  1. In the current setting, `self.im_info` will be changed for every different image input, which doesn't make sense.
  2. `im_info` is only changed in the `resize` function, and `decode_image` doesn't change `image` at all -- can we further simplify the APIs?

For example, an alternative could be:

```python
image = image.copy()
input_shape = np.array(image.shape[:2], dtype=np.float32)
image, scale_factor = resize(image)  # change the return value
image = (image - self.extra_config['pixel_mean']) \
    / self.extra_config['pixel_std']  # may need to change how the values are loaded
image = image.transpose((2, 0, 1)).copy()  # The model requires channel-first input
model_input_image_shape = np.array(image.shape[1:], dtype=np.float32)  # (H, W) after the transpose
image_info = {
    'scale_factor': scale_factor,
    'im_shape': model_input_image_shape,
    'input_shape': input_shape,
}
```

@lolipopshock (Member) commented on Aug 11, 2021 (edited)

Thanks for the updates! The current version is better than the previous one, but I think the code could be further simplified and more structured -- please check the comments for possible updates.

@lolipopshock (Member)

I've also carefully checked the code, and added some changes. Some key updates:

  1. improve the downloading logic -- it will remove the tar file when downloaded, and check the existence of the untar'ed files (a rough sketch of this flow is shown right after this list).
  2. further simplify `layoutmodel.py`, removing unnecessary variables and definitions.
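A rough, self-contained sketch of that downloading flow, for illustration only (this is not the code added in the PR; the function name and cache layout are hypothetical):

```python
import os
import tarfile
import urllib.request


def fetch_and_untar(url: str, cache_dir: str) -> str:
    """Illustrative sketch: download a model tar, extract it, and clean up.

    The download is skipped entirely if the untarred directory already exists.
    """
    model_dir = os.path.join(cache_dir, os.path.basename(url).rsplit(".tar", 1)[0])
    if os.path.isdir(model_dir):                # untarred files already cached
        return model_dir

    os.makedirs(cache_dir, exist_ok=True)
    tar_path = os.path.join(cache_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, tar_path)   # download the tar archive
    with tarfile.open(tar_path) as tar:
        tar.extractall(path=model_dir)          # extract into the cache
    os.remove(tar_path)                         # remove the tar once extracted
    return model_dir
```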

Please note: you mentioned in the previous version that the paddle model may support specifying `batch_size` during inference; however, that variable was never used.

And in the future we might need to move the paddle model to dropbox/aws, as the downloading speed is ~2 MB/s in the US, taking around 2-3 minutes to download the whole model, which might be suboptimal.

Otherwise I think this PR is ready to be merged - please take a quick look and let me know if the new changes are OK. Thanks for the great work!

@littletomatodonkey

I've also carefully checked the code, and added some changes. Some key updates:

  1. improve the downloading logic -- it will remove the tar file when downloaded, and check the existence of the untar'ed files.
  2. further simplify `layoutmodel.py`, removing unnecessary variables and definitions.

Please note: you mentioned in the previous version that the paddle model may support specifying `batch_size` during inference; however, that variable was never used.

And in the future we might need to move the paddle model to dropbox/aws, as the downloading speed is ~2 MB/s in the US, taking around 2-3 minutes to download the whole model, which might be suboptimal.

Otherwise I think this PR is ready to be merged - please take a quick look and let me know if the new changes are OK. Thanks for the great work!

Hi, thanks for your new changes, which further simplify the code.

  • For the model download source, it is OK to support dropbox/aws, but we hope the Baidu source stays the preferred choice, since there are also many developers in China.
  • Batch size is not used in the code; it was just added for future expansion. Thanks for checking.
  • The changes are OK for me, thanks again for your careful review!

@an1018 (Contributor, Author) commented on Aug 17, 2021 (edited)

The changes are OK for me, too.

Through this repo and the whole modification process, I also learned a lot. Thank you very much!

@lolipopshock (Member)

I've also carefully checked the code, and added some changes. Some key updates:

  1. improve the downloading logic -- it will remove the tar file when downloaded, and check the existence of the untar'ed files.
  2. further simplify `layoutmodel.py`, removing unnecessary variables and definitions.

Please note: you mentioned in the previous version that the paddle model may support specifying `batch_size` during inference; however, that variable was never used.
And in the future we might need to move the paddle model to dropbox/aws, as the downloading speed is ~2 MB/s in the US, taking around 2-3 minutes to download the whole model, which might be suboptimal.
Otherwise I think this PR is ready to be merged - please take a quick look and let me know if the new changes are OK. Thanks for the great work!

Hi, thanks for your new changes, which further simplify the code.

  • For the model download source, it is OK to support dropbox/aws, but we hope the Baidu source stays the preferred choice, since there are also many developers in China.
  • Batch size is not used in the code; it was just added for future expansion. Thanks for checking.
  • The changes are OK for me, thanks again for your careful review!

I think ultimately we'll provide better downloading support, e.g., mirrored download sites or something like what is mentioned in #43. At that time, people from different regions can choose the best download method accordingly.

@lolipopshock merged commit 035f66a into Layout-Parser:master on Aug 17, 2021
Reviewers:

  • @littletomatodonkey left review comments
  • @lolipopshock approved these changes

