Hezar Documentation

hezar.data.datasets.ocr_dataset module

class hezar.data.datasets.ocr_dataset.OCRDataset(config: OCRDatasetConfig, split=None, preprocessor=None, **kwargs)

Bases: Dataset

General OCR dataset class.

The OCR dataset supports two types of image-to-text data. One is for tokenizer-based models, in which the labels are tokens. The other is for character-level models, in which the text is split into characters that are then converted to IDs. This behavior is controlled by text_split_type in the config, which can be either tokenize or char_split.
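The difference between the two label types can be illustrated with a minimal sketch. It does not use Hezar itself: the vocabularies, `char_split_encode`, and `toy_tokenize_encode` are hypothetical stand-ins, and a real tokenizer-based model would use its own tokenizer rather than a whitespace split.

```python
# Hypothetical character vocabulary; a real id2label would come from the dataset config.
id2label = {0: "<blank>", 1: "a", 2: "b", 3: "c"}
label2id = {ch: i for i, ch in id2label.items()}


def char_split_encode(text, label2id):
    """char_split style: split the text into characters and map each one to its ID."""
    return [label2id[ch] for ch in text]


def toy_tokenize_encode(text, vocab):
    """tokenize style (stand-in): whitespace split followed by a vocabulary lookup."""
    return [vocab[token] for token in text.split()]


# Character-level labels: one ID per character.
char_ids = char_split_encode("cab", label2id)

# Token-level labels: one ID per token, using a hypothetical token vocabulary.
token_vocab = {"hello": 0, "world": 1}
token_ids = toy_tokenize_encode("hello world", token_vocab)

print(char_ids)   # [3, 1, 2]
print(token_ids)  # [0, 1]
```

With char_split, the label vocabulary is the id2label mapping from the config; with tokenize, the vocabulary belongs to the tokenizer.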

required_backends: List[str | Backends] = [Backends.SCIKIT]

class hezar.data.datasets.ocr_dataset.OCRDatasetConfig(path: str | None = None, task: TaskType = TaskType.IMAGE2TEXT, max_size: int | float | None = None, hf_load_kwargs: dict | None = None, text_split_type: str | TextSplitType = TextSplitType.CHAR_SPLIT, id2label: Dict[int, str] = <factory>, text_column: str = 'label', images_paths_column: str = 'image_path', max_length: int | None = None, invalid_characters: list | None = None, reverse_text: bool | None = None, reverse_digits: bool | None = None)

Bases: DatasetConfig

Configuration class for OCR datasets.

Parameters:
  • path (str) – Path to the dataset.

  • text_split_type (TextSplitType) – Type of text splitting (CHAR_SPLIT or TOKENIZE).

  • id2label (Dict[int,str]) – Mapping of label IDs to characters.

  • text_column (str) – Column name for text in the dataset.

  • images_paths_column (str) – Column name for image paths in the dataset.

  • max_length (int) – Maximum length of text.

  • invalid_characters (list) – List of invalid characters.

  • reverse_digits (bool) – Whether to reverse the digits in text.
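As a rough illustration of how the text-related options might interact, the sketch below applies them to a label string. It is not Hezar's implementation: `ToyOCRConfig` and `prepare_text` are hypothetical, and the behaviors shown (dropping invalid characters, reversing each run of digits, truncating to max_length) as well as their order are assumptions made for the example.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyOCRConfig:
    """Stand-in mirroring a few documented fields; this is not the Hezar class."""
    id2label: Dict[int, str] = field(default_factory=dict)
    max_length: Optional[int] = None
    invalid_characters: Optional[List[str]] = None
    reverse_text: Optional[bool] = None
    reverse_digits: Optional[bool] = None


def prepare_text(text: str, config: ToyOCRConfig) -> str:
    """Apply the options in an assumed order; the library's own order may differ."""
    if config.invalid_characters:
        # Assumption: invalid characters are dropped from the text.
        text = "".join(ch for ch in text if ch not in config.invalid_characters)
    if config.reverse_digits:
        # Assumption: each run of consecutive digits is reversed in place.
        text = re.sub(r"\d+", lambda match: match.group(0)[::-1], text)
    if config.reverse_text:
        text = text[::-1]
    if config.max_length is not None:
        # Assumption: longer texts are truncated rather than rejected.
        text = text[: config.max_length]
    return text


config = ToyOCRConfig(invalid_characters=["#"], reverse_digits=True, max_length=8)
prepared = prepare_text("ab#123cd456", config)
print(prepared)  # ab321cd6
```

The `field(default_factory=dict)` default corresponds to the `<factory>` shown for id2label in the class signature, which is how Sphinx renders a dataclass default factory.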

id2label: Dict[int, str]
images_paths_column: str = 'image_path'
invalid_characters: list = None
max_length: int = None
name: str = 'ocr'
path: str = None
reverse_digits: bool = None
reverse_text: bool = None
task: TaskType = 'image2text'
text_column: str = 'label'
text_split_type: str | TextSplitType = 'char_split'

class hezar.data.datasets.ocr_dataset.TextSplitType(value)

Bases: str, Enum

An enumeration.

CHAR_SPLIT = 'char_split'
TOKENIZE = 'tokenize'
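Because the enumeration derives from both str and Enum, its members can be looked up from, and compared with, plain strings. This is why text_split_type accepts either a string or a TextSplitType. The sketch below redeclares the enum locally from the documented bases and members rather than importing it from Hezar.

```python
from enum import Enum


class TextSplitType(str, Enum):
    """Local reconstruction based on the documented bases and members."""
    CHAR_SPLIT = "char_split"
    TOKENIZE = "tokenize"


# Looking up by value returns the existing member, not a new object.
from_string = TextSplitType("tokenize")
is_same_member = from_string is TextSplitType.TOKENIZE

# The str mixin makes members compare equal to their plain string values.
equals_plain_string = TextSplitType.CHAR_SPLIT == "char_split"

# Unknown values are rejected with a ValueError.
try:
    TextSplitType("word_split")
    rejected_unknown = False
except ValueError:
    rejected_unknown = True

print(is_same_member, equals_plain_string, rejected_unknown)  # True True True
```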