- Notifications
You must be signed in to change notification settings - Fork506
Add PaddleDetection-based Layout Model#54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
setup.py Outdated
@@ -27,12 +27,22 @@ | |||
"torch", | |||
"torchvision", | |||
"iopath", | |||
"tqdm", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
这个去掉吧,安装paddleocr的话,会自动安装这个模块的
if not enforce_cpu: | ||
# initial GPU memory(M), device ID | ||
config.enable_use_gpu(200, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
改为2000吧
Thanks for this PR. Here are some general questions/thoughts:
|
Thanks for your replay, I'll modify it later! |
littletomatodonkey commentedAug 7, 2021
Hi, For the
|
Hi,I've modied the code, including:
|
setup.py Outdated
@@ -33,6 +33,15 @@ | |||
'google-cloud-vision==1', | |||
'pytesseract' | |||
], | |||
"paddleocr": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
please don't change anything here - I'll modify it later myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Done
src/layoutparser/__init__.py Outdated
@@ -11,11 +11,13 @@ | |||
from .ocr import ( | |||
GCVFeatureType, GCVAgent, | |||
TesseractFeatureType, TesseractAgent | |||
TesseractFeatureType, TesseractAgent, | |||
PaddleocrAgent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Please don't add OCR Agent in this PR -- right now we focus on LayoutModel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Done
@@ -86,7 +86,6 @@ | |||
0: "Table" | |||
}, | |||
} | |||
# fmt: on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
why removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Sorry, I've modified it back
} | ||
# fmt: off | ||
LABEL_MAP_CATALOG = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Please remove the labeling mappings for unused datasets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Done
# fmt: on | ||
class LayoutParserDetectron2ModelHandler(PathHandler): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Change class name toLayoutParserPaddleModelHandler
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Done
trt_min_shape=1, | ||
trt_max_shape=1280, | ||
trt_opt_shape=640, | ||
min_subgraph_size=3): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
And are there any documentations on how to set and adjust these parameters, in English? If not, I would suggest removing them..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Done
enable_mkldnn=True, | ||
thread_num=10, | ||
min_subgraph_size=3, | ||
use_dynamic_shape=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
And apparently some of the hyperparameters/configurations are not used. Please remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Done
inputs = self.create_inputs(im, im_info) | ||
return inputs | ||
def postprocess(self, np_boxes, np_masks, inputs): |
lolipopshockAug 6, 2021 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I think we need merge thepostprocess
withgather_output
. It's not reasonable to break them into two functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I've merged ,thks
if not enforce_cpu: | ||
# initial GPU memory(M), device ID | ||
config.enable_use_gpu(2000, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Why it is 2000 rather than other values? Please leave a note in the comment. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Done
setup.py Outdated
@@ -33,6 +33,12 @@ | |||
'google-cloud-vision==1', | |||
'pytesseract' | |||
], | |||
"paddlepaddle": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Sorry for not being clear earlier. Please remove all the newextras_require
, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Got it.
@@ -10,7 +10,7 @@ class DropboxHandler(HTTPURLHandler): | |||
""" | |||
def _get_supported_prefixes(self): | |||
return ["https://www.dropbox.com"] | |||
return ["https://www.dropbox.com","https://paddle-model-ecology.bj.bcebos.com"] |
lolipopshockAug 11, 2021 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I don't think this is a good idea -- perhaps it's better to create something likeclass PaddleModelURLHandler(HTTPURLHandler):
in the paddle modelcatalog.py
file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Please see the comments below for more detailed instructions.
layout = self.gather_output(np_boxes, np_masks) | ||
return layout | ||
def untar_files(self, model_tar, model_dir): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
It feels counter-intuitive to put this function inside the modeling folder, and I think it's better to include this part inside the PathManager, by rewriting this function_get_local_path
. So basically it will be something like this:
classPaddleModelURLHandler(HTTPURLHandler):""" Supports download and file check for dropbox links """def_get_supported_prefixes(self):return ["https://paddle-model-ecology.bj.bcebos.com"]def_isfile(self,path):returnpathinself.cache_mapdef_get_local_path(self,path:str,force:bool=False,cache_dir:Optional[str]=None,**kwargs:Any, )->str:""" This implementation downloads the remote resource and caches it locally. The resource will only be downloaded if not previously requested. """self._check_kwargs(kwargs)if (forceorpathnotinself.cache_mapornotos.path.exists(self.cache_map[path]) ):logger=logging.getLogger(__name__)parsed_url=urlparse(path)dirname=os.path.join(get_cache_dir(cache_dir),os.path.dirname(parsed_url.path.lstrip("/")) )filename=path.split("/")[-1]iflen(filename)>self.MAX_FILENAME_LEN:filename=filename[:100]+"_"+uuid.uuid4().hexcached=os.path.join(dirname,filename)withfile_lock(cached):ifnotos.path.isfile(cached):logger.info("Downloading {} ...".format(path))cached=download(path,dirname,filename=filename)#### ----> Add from hereifpath.endswith(".tar"):untar_function()logger.info("URL {} cached in {}".format(path,cached))self.cache_map[path]=cachedreturnself.cache_map[path]
You might also need to check whether the untar'ed version exists beforehand to avoid duplicated downloading.
image, im_info = permute(image, im_info) | ||
inputs = {} | ||
inputs['image'] = np.array((image, )).astype('float32') |
lolipopshockAug 11, 2021 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This feels weird -- probably it's better to havenp.array(image)[np.newaxis, :].astype('float32')
.
config of model, defined by `Config(model_dir)` | ||
model_path (str): | ||
The path to the saved weights of the model. | ||
threshold (float): |
lolipopshockAug 11, 2021 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Don't forget to clean up the docstrings -- and you can specify what the extra configs are under theextra_configs
block.
Uh oh!
There was an error while loading.Please reload this page.
config_path = self._reconstruct_path_with_detector_name(config_path) | ||
model_tar = PathManager.get_local_path(config_path) | ||
pre_dir = os.path.dirname(model_tar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
please remove this -- see the details below.
thread_num=extra_config.get('thread_num',10)) | ||
self.threshold = extra_config.get('threshold',0.5) | ||
self.input_shape = extra_config.get('input_shape',[3,640,640]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Asinput_shape
is inself.im_info
, you might want to remove this variable.
self.threshold = extra_config.get('threshold',0.5) | ||
self.input_shape = extra_config.get('input_shape',[3,640,640]) | ||
self.label_map = label_map | ||
self.im_info = { |
lolipopshockAug 11, 2021 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
So basically this is something likedefault_image_info
, right? It's better to use a more descriptive name rather thanim_info
, which is also the outputs of the preprocessors.
im_info (dict): info of processed image | ||
""" | ||
image = image.astype(np.float32, copy=False) | ||
mean = np.array(mean)[np.newaxis, np.newaxis, :] |
lolipopshockAug 11, 2021 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Why initializing the arrays this every time? We can just initialize themean
andstd
to be the appropriatendarray
s.
inputs (dict): input of model | ||
""" | ||
# read rgb image | ||
image, im_info = decode_image(image, self.im_info) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I don't like the design here:
- in the current setting, the
self.im_info
will be changed for all different image inputs, which doesn't make sense. - im_info is only changed in the
resize
function, and thedecode_image
doesn't changeimage
at all - can we further simplify the APIs?
For example, an alternative example could be:
image = image.copy()input_shape = np.array(image.shape[:2], dtype=np.float32)image, scale_factor = resize(image) # change the return value image = (image - self.extra_config['pixel_mean']) \/ image - self.extra_config['pixel_std'] # may need change how to load the values image = image.transpose((2, 0, 1)).copy() # The model requires channel-first input model_input_image_shape = np.array(image.shape[:2], dtype=np.float32)image_info = { 'scale_factor': scale_factor, 'im_shape': model_input_image_shape, 'input_shape': input_shape,}
lolipopshock commentedAug 11, 2021 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Thanks for the updates! The current version is better than the previous one, but I think the code could be further simplified and more structured -- please check the comments for possible updates. |
I've also carefully checked the code, and added some changes. Some key updates:
Please note, you mentioned in the previous version that paddle mode may support specifying And in the future we might need to move the paddle model to dropbox/aws as the downloading speed is ~2MB/s in US, taking around 2~3 minutes to download the whole model, which might be suboptimal. Otherwise I think this PR is ready to be merged - please take a quick look and let me know if the new changes are OK. Thanks for the great work! |
littletomatodonkey commentedAug 17, 2021
Hi, thanks for your new changes, which further simplifies the code.
|
an1018 commentedAug 17, 2021 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
The changes are ok for me,too. Through this repo and the whole modification process, I also learned a lot. Thank you very much! |
I think ultimately we'll provide better downloading support, e.g., mirrored downloading sites and or something mentioned in#43 . At that time, ppl from different regions can choose the best downloading methods accordingly. |
Hi, I have reorganized the file structure of Paddle.
Please review.Thx