Add PaddleDetection-based Layout Model #54


Merged

Conversation

@an1018 (Contributor)

Hi, I have reorganized the file structure of Paddle.
Please review. Thanks!

setup.py Outdated
@@ -27,12 +27,22 @@
"torch",
"torchvision",
"iopath",
"tqdm",


Please remove this one; if paddleocr is installed, this module will be installed automatically.


if not enforce_cpu:
# initial GPU memory(M), device ID
config.enable_use_gpu(200, 0)


Please change it to 2000.

@lolipopshock changed the title from "reorganize paddle structure" to "Add PaddleDetection-based Layout Model" on Aug 6, 2021
@lolipopshock (Member)

Thanks for this PR. Here are some general questions/thoughts:

  • Please remove the OCR part from this PR.
  • Please use PEP 8 for the variable naming conventions -- you could use `black` and `pylint` for checking.
  • For the `preprocess.py` and `layoutmodel.py` files:
    1. I think it contains a lot of unnecessarily complicated design -- can we just simplify the functions to minimal complexity?
    2. And more fundamentally, why should the `preprocess` function live within `layoutparser` rather than `paddledetection`? I think it's `paddledetection`'s obligation to do the preprocessing for the models.
  • Also, could you add tests for the paddle modules? You could follow the examples in `test_model`.

@an1018 (Contributor, Author)

Thanks for this PR. Here are some general questions/thoughts:

  • Please remove the OCR part from this PR.
  • Please use PEP 8 for the variable naming conventions -- you could use `black` and `pylint` for checking.
  • For the `preprocess.py` and `layoutmodel.py` files:
    1. I think it contains a lot of unnecessarily complicated design -- can we just simplify the functions to minimal complexity?
    2. And more fundamentally, why should the `preprocess` function live within `layoutparser` rather than `paddledetection`? I think it's `paddledetection`'s obligation to do the preprocessing for the models.
  • Also, could you add tests for the paddle modules? You could follow the examples in [`test_model`](https://github.com/Layout-Parser/layout-parser/blob/master/tests/test_model.py).

Thanks for your reply, I'll modify it later!

@littletomatodonkey

Thanks for this PR. Here are some general questions/thoughts:

  • Please remove the OCR part from this PR.

  • Please use PEP 8 for the variable naming conventions -- you could use `black` and `pylint` for checking.

  • For the `preprocess.py` and `layoutmodel.py` files:

    1. I think it contains a lot of unnecessarily complicated design -- can we just simplify the functions to minimal complexity?
    2. And more fundamentally, why should the `preprocess` function live within `layoutparser` rather than `paddledetection`? I think it's `paddledetection`'s obligation to do the preprocessing for the models.
  • Also, could you add tests for the paddle modules? You could follow the examples in `test_model`.

Hi, for the `preprocess.py` and `layoutmodel.py` files:

  1. We decouple the preprocessing and the inference engine because it makes the code clearer.
  2. We extract the preprocessing code to reduce the dependency on paddledetection; otherwise users might have to download the whole repo, which is too heavy.

@an1018 (Contributor, Author)

Hi, I've modified the code, including:

  • removed the OCR part

  • used `pylint` for checking
    (For the `__init__.py` and `catalog.py` files, the changes are minor, so they were not checked.)

  • added tests for the paddle modules in `test_model`


setup.py Outdated
@@ -33,6 +33,15 @@
'google-cloud-vision==1',
'pytesseract'
],
"paddleocr": [
Member

please don't change anything here - I'll modify it later myself.

Contributor (Author)

Done

@@ -11,11 +11,13 @@

from .ocr import (
GCVFeatureType, GCVAgent,
TesseractFeatureType, TesseractAgent
TesseractFeatureType, TesseractAgent,
PaddleocrAgent
Member

Please don't add OCR Agent in this PR -- right now we focus on LayoutModel.

Contributor (Author)

Done

@@ -86,7 +86,6 @@
0: "Table"
},
}
# fmt: on
Member

Why was this removed?

Contributor (Author)

Sorry, I've changed it back.

}

# fmt: off
LABEL_MAP_CATALOG = {
Member

Please remove the label mappings for unused datasets.

Contributor (Author)

Done

# fmt: on


class LayoutParserDetectron2ModelHandler(PathHandler):
Member

Change the class name to `LayoutParserPaddleModelHandler`.

Contributor (Author)

Done

trt_min_shape=1,
trt_max_shape=1280,
trt_opt_shape=640,
min_subgraph_size=3):
Member

And is there any documentation, in English, on how to set and adjust these parameters? If not, I would suggest removing them.

Contributor (Author)

Done

enable_mkldnn=True,
thread_num=10,
min_subgraph_size=3,
use_dynamic_shape=False,
Member

And apparently some of the hyperparameters/configurations are not used. Please remove them.

Contributor (Author)

Done

inputs = self.create_inputs(im, im_info)
return inputs

def postprocess(self, np_boxes, np_masks, inputs):
@lolipopshock (Member), Aug 6, 2021 (edited)

I think we need to merge `postprocess` with `gather_output`. It's not reasonable to break them into two functions.

Contributor (Author)

I've merged them, thanks!


if not enforce_cpu:
# initial GPU memory(M), device ID
config.enable_use_gpu(2000, 0)
Member

Why is it 2000 rather than some other value? Please leave a note in the comment. Thanks.
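For illustration only, such a note could look like the following minimal sketch (this is not the PR's actual comment; `model_file`, `params_file`, and `enforce_cpu` are placeholder names, and the wording of the explanation is just a suggestion):

```python
from paddle.inference import Config

# Placeholder paths for illustration; point these at an exported inference model.
model_file = "inference/model.pdmodel"
params_file = "inference/model.pdiparams"
enforce_cpu = False

config = Config(model_file, params_file)

if not enforce_cpu:
    # Reserve an initial GPU memory pool of 2000 MB on device 0.
    # 2000 MB is an empirical starting size that comfortably holds the layout
    # detection model and its intermediate tensors; lower it for smaller GPUs.
    config.enable_use_gpu(2000, 0)
```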

Contributor (Author)

Done

setup.py Outdated
@@ -33,6 +33,12 @@
'google-cloud-vision==1',
'pytesseract'
],
"paddlepaddle": [
Member

Sorry for not being clear earlier. Please remove all the new `extras_require` entries, thanks.

Contributor (Author)

Got it.

@@ -10,7 +10,7 @@ class DropboxHandler(HTTPURLHandler):
"""

def _get_supported_prefixes(self):
return ["https://www.dropbox.com"]
return ["https://www.dropbox.com","https://paddle-model-ecology.bj.bcebos.com"]
@lolipopshock (Member), Aug 11, 2021 (edited)

I don't think this is a good idea -- perhaps it's better to create something like `class PaddleModelURLHandler(HTTPURLHandler):` in the paddle model `catalog.py` file.

Member

Please see the comments below for more detailed instructions.

layout = self.gather_output(np_boxes, np_masks)
return layout

def untar_files(self, model_tar, model_dir):
Member

It feels counter-intuitive to put this function inside the modeling folder, and I think it's better to include this part inside the PathManager by rewriting the `_get_local_path` function. So basically it will be something like this:

```python
class PaddleModelURLHandler(HTTPURLHandler):
    """
    Supports download and file check for dropbox links
    """

    def _get_supported_prefixes(self):
        return ["https://paddle-model-ecology.bj.bcebos.com"]

    def _isfile(self, path):
        return path in self.cache_map

    def _get_local_path(
        self,
        path: str,
        force: bool = False,
        cache_dir: Optional[str] = None,
        **kwargs: Any,
    ) -> str:
        """
        This implementation downloads the remote resource and caches it locally.
        The resource will only be downloaded if not previously requested.
        """
        self._check_kwargs(kwargs)
        if (
            force
            or path not in self.cache_map
            or not os.path.exists(self.cache_map[path])
        ):
            logger = logging.getLogger(__name__)
            parsed_url = urlparse(path)
            dirname = os.path.join(
                get_cache_dir(cache_dir),
                os.path.dirname(parsed_url.path.lstrip("/")),
            )
            filename = path.split("/")[-1]
            if len(filename) > self.MAX_FILENAME_LEN:
                filename = filename[:100] + "_" + uuid.uuid4().hex
            cached = os.path.join(dirname, filename)
            with file_lock(cached):
                if not os.path.isfile(cached):
                    logger.info("Downloading {} ...".format(path))
                    cached = download(path, dirname, filename=filename)
                    #### ----> Add from here
                    if path.endswith(".tar"):
                        untar_function()
            logger.info("URL {} cached in {}".format(path, cached))
            self.cache_map[path] = cached
        return self.cache_map[path]
```

You might also need to check whether the untar'ed version exists beforehand to avoid duplicated downloading.
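For reference, a minimal sketch of how such a handler is typically registered and used with an iopath-style `PathManager` (the instance name and the model URL path are illustrative and not taken from this PR; it assumes the `PaddleModelURLHandler` sketched above is defined):

```python
from iopath.common.file_io import PathManager

# Illustrative setup: register the custom handler on a PathManager instance.
path_manager = PathManager()
path_manager.register_handler(PaddleModelURLHandler())

# Remote Paddle model URLs now resolve to a locally cached path
# (and, with the change above, an untarred one). The URL is hypothetical.
local_path = path_manager.get_local_path(
    "https://paddle-model-ecology.bj.bcebos.com/some/model.tar"
)
```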

image, im_info = permute(image, im_info)

inputs = {}
inputs['image'] = np.array((image, )).astype('float32')
@lolipopshock (Member), Aug 11, 2021 (edited)

This feels weird -- probably it's better to have `np.array(image)[np.newaxis, :].astype('float32')`.
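Both spellings produce the same batched array; a small self-contained sketch showing the equivalence (the image shape is illustrative):

```python
import numpy as np

image = np.zeros((3, 640, 640))  # illustrative CHW image

# Original: wrap the image in a 1-element tuple, then convert.
batched_a = np.array((image,)).astype("float32")

# Suggested: make the added batch axis explicit with np.newaxis.
batched_b = np.array(image)[np.newaxis, :].astype("float32")

assert batched_a.shape == batched_b.shape == (1, 3, 640, 640)
```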

config of model, defined by `Config(model_dir)`
model_path (str):
The path to the saved weights of the model.
threshold (float):
@lolipopshock (Member), Aug 11, 2021 (edited)

Don't forget to clean up the docstrings -- and you can specify what the extra configs are under the `extra_configs` block.
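As an illustration only (the class name, wording, and layout are hypothetical; the parameter names and defaults mirror ones that appear elsewhere in this PR), the extra configs could be documented along these lines:

```python
class PaddleDetectionLayoutModel:  # illustrative class name, not necessarily the PR's
    def __init__(self, config_path, model_path=None, label_map=None, extra_config=None):
        """Hypothetical docstring sketch -- not the PR's actual text.

        Args:
            config_path (str):
                The path (or catalog identifier) of the model config.
            model_path (str, optional):
                The path to the saved weights of the model.
            label_map (dict, optional):
                Mapping from the model's class ids to label names.
            extra_config (dict, optional):
                Extra settings forwarded to the Paddle predictor, for example:
                    threshold (float): score threshold for keeping detections (default 0.5).
                    enable_mkldnn (bool): whether to enable MKL-DNN acceleration on CPU.
                    thread_num (int): number of CPU math threads (default 10).
        """
```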

config_path = self._reconstruct_path_with_detector_name(config_path)
model_tar = PathManager.get_local_path(config_path)

pre_dir = os.path.dirname(model_tar)
Member

please remove this -- see the details below.

thread_num=extra_config.get('thread_num',10))

self.threshold = extra_config.get('threshold',0.5)
self.input_shape = extra_config.get('input_shape',[3,640,640])
Member

As `input_shape` is in `self.im_info`, you might want to remove this variable.

self.threshold = extra_config.get('threshold',0.5)
self.input_shape = extra_config.get('input_shape',[3,640,640])
self.label_map = label_map
self.im_info = {
@lolipopshock (Member), Aug 11, 2021 (edited)

So basically this is something like `default_image_info`, right? It's better to use a more descriptive name rather than `im_info`, which is also the output of the preprocessors.

im_info (dict): info of processed image
"""
image = image.astype(np.float32, copy=False)
mean = np.array(mean)[np.newaxis, np.newaxis, :]
@lolipopshock (Member), Aug 11, 2021 (edited)

Why initialize the arrays this way every time? We can just initialize `mean` and `std` to be the appropriate ndarrays.
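In other words, something like the following minimal sketch (the ImageNet mean/std values are common defaults used purely for illustration, not necessarily the values in this PR):

```python
import numpy as np

# Precompute once, e.g. in __init__, with the broadcasting shape already applied.
# The ImageNet statistics below are illustrative defaults.
PIXEL_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32).reshape(1, 1, 3)
PIXEL_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32).reshape(1, 1, 3)


def normalize(image: np.ndarray) -> np.ndarray:
    """Normalize an HWC float image with the precomputed statistics."""
    image = image.astype(np.float32, copy=False)
    return (image - PIXEL_MEAN) / PIXEL_STD
```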

inputs (dict): input of model
"""
# read rgb image
image, im_info = decode_image(image, self.im_info)
Member

I don't like the design here:

  1. In the current setting, `self.im_info` will be changed for every different image input, which doesn't make sense.
  2. `im_info` is only changed in the `resize` function, and `decode_image` doesn't change `image` at all -- can we further simplify the APIs?

For example, an alternative could be:

```python
image = image.copy()
input_shape = np.array(image.shape[:2], dtype=np.float32)
image, scale_factor = resize(image)  # change the return value
image = (image - self.extra_config['pixel_mean']) \
    / self.extra_config['pixel_std']  # may need to change how the values are loaded
image = image.transpose((2, 0, 1)).copy()  # The model requires channel-first input
model_input_image_shape = np.array(image.shape[1:], dtype=np.float32)  # (H, W) after the transpose
image_info = {
    'scale_factor': scale_factor,
    'im_shape': model_input_image_shape,
    'input_shape': input_shape,
}
```

@lolipopshock (Member) commented on Aug 11, 2021 (edited)

Thanks for the updates! The current version is better than the previous one, but I think the code could be further simplified and more structured -- please check the comments for possible updates.

@lolipopshock (Member)

I've also carefully checked the code, and added some changes. Some key updates:

  1. improve the downloading logic -- it will remove the tar file when downloaded, and check the existence of the untar'ed files (a rough sketch of this flow is shown right after this list).
  2. further simplify `layoutmodel.py`, removing unnecessary variables and definitions.
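A rough, self-contained sketch of that downloading flow, for illustration only (this is not the code added in the PR; the function name and cache layout are hypothetical):

```python
import os
import tarfile
import urllib.request


def fetch_and_untar(url: str, cache_dir: str) -> str:
    """Illustrative sketch: download a model tar, extract it, and clean up.

    The download is skipped entirely if the untarred directory already exists.
    """
    model_dir = os.path.join(cache_dir, os.path.basename(url).rsplit(".tar", 1)[0])
    if os.path.isdir(model_dir):                # untarred files already cached
        return model_dir

    os.makedirs(cache_dir, exist_ok=True)
    tar_path = os.path.join(cache_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, tar_path)   # download the tar archive
    with tarfile.open(tar_path) as tar:
        tar.extractall(path=model_dir)          # extract into the cache
    os.remove(tar_path)                         # remove the tar once extracted
    return model_dir
```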

Please note: you mentioned in the previous version that the paddle model may support specifying `batch_size` during inference; however, that variable was never used.

And in the future we might need to move the paddle model to dropbox/aws, as the downloading speed is ~2 MB/s in the US, taking around 2-3 minutes to download the whole model, which might be suboptimal.

Otherwise I think this PR is ready to be merged - please take a quick look and let me know if the new changes are OK. Thanks for the great work!

@littletomatodonkey

I've also carefully checked the code, and added some changes. Some key updates:

  1. improve the downloading logic -- it will remove the tar file when downloaded, and check the existence of the untar'ed files.
  2. further simplify `layoutmodel.py`, removing unnecessary variables and definitions.

Please note: you mentioned in the previous version that the paddle model may support specifying `batch_size` during inference; however, that variable was never used.

And in the future we might need to move the paddle model to dropbox/aws, as the downloading speed is ~2 MB/s in the US, taking around 2-3 minutes to download the whole model, which might be suboptimal.

Otherwise I think this PR is ready to be merged - please take a quick look and let me know if the new changes are OK. Thanks for the great work!

Hi, thanks for your new changes, which further simplify the code.

  • For the model download source, it is OK to support dropbox/aws, but we hope the Baidu source stays the preferred choice, since there are also many developers in China.
  • Batch size is not used in the code; it was just added for future expansion. Thanks for checking.
  • The changes are OK for me, thanks again for your careful review!

@an1018 (Contributor, Author) commented on Aug 17, 2021 (edited)

The changes are OK for me, too.

Through this repo and the whole modification process, I also learned a lot. Thank you very much!

@lolipopshock (Member)

I've also carefully checked the code, and added some changes. Some key updates:

  1. improve the downloading logic -- it will remove the tar file when downloaded, and check the existence of the untar'ed files.
  2. further simplify `layoutmodel.py`, removing unnecessary variables and definitions.

Please note: you mentioned in the previous version that the paddle model may support specifying `batch_size` during inference; however, that variable was never used.
And in the future we might need to move the paddle model to dropbox/aws, as the downloading speed is ~2 MB/s in the US, taking around 2-3 minutes to download the whole model, which might be suboptimal.
Otherwise I think this PR is ready to be merged - please take a quick look and let me know if the new changes are OK. Thanks for the great work!

Hi, thanks for your new changes, which further simplify the code.

  • For the model download source, it is OK to support dropbox/aws, but we hope the Baidu source stays the preferred choice, since there are also many developers in China.
  • Batch size is not used in the code; it was just added for future expansion. Thanks for checking.
  • The changes are OK for me, thanks again for your careful review!

I think ultimately we'll provide better downloading support, e.g., mirrored download sites or something like what is mentioned in #43. At that time, people from different regions can choose the best download method accordingly.

@lolipopshock merged commit 035f66a into Layout-Parser:master on Aug 17, 2021
Reviewers:

  • @littletomatodonkey left review comments
  • @lolipopshock approved these changes

