
Strict Dataclasses

The huggingface_hub package provides a utility to create strict dataclasses. These are enhanced versions of Python’s standard dataclass with additional validation features. Strict dataclasses ensure that fields are validated both during initialization and assignment, making them ideal for scenarios where data integrity is critical.

Overview

Strict dataclasses are created using the @strict decorator. They extend the functionality of regular dataclasses by:

  • Validating field types based on type hints
  • Supporting custom validators for additional checks
  • Optionally allowing arbitrary keyword arguments in the constructor
  • Validating fields both at initialization and during assignment
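
As a rough illustration of how validation can cover both initialization and assignment, the standard-library sketch below hooks `__setattr__`. The `simple_strict` decorator and the `Point` class are hypothetical names that do not exist in huggingface_hub; the sketch handles only plain classes such as `int` and `str` and makes no claim about how @strict is implemented.

```python
from dataclasses import dataclass
from typing import get_type_hints


def simple_strict(cls):
    """Hypothetical stand-in for a strict decorator: checks plain types on every assignment."""
    hints = get_type_hints(cls)
    original_setattr = cls.__setattr__

    def checked_setattr(self, name, value):
        expected = hints.get(name)
        # Only plain classes (int, str, ...) are handled in this sketch.
        if isinstance(expected, type) and not isinstance(value, expected):
            raise TypeError(
                f"Field '{name}' expected {expected.__name__}, got {type(value).__name__}"
            )
        original_setattr(self, name, value)

    cls.__setattr__ = checked_setattr
    return cls


@simple_strict
@dataclass
class Point:
    x: int
    label: str = "origin"


point = Point(x=1)  # the generated __init__ assigns each field, so the check runs here
point.x = 5         # and it runs again on any later assignment
```

Because the dataclass-generated `__init__` assigns fields through normal attribute assignment, a single hook is enough to cover both cases in this sketch.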

Benefits

  • Data Integrity: Ensures fields always contain valid data
  • Ease of Use: Integrates seamlessly with Python’s dataclass module
  • Flexibility: Supports custom validators for complex validation logic
  • Lightweight: Requires no additional dependencies such as Pydantic, attrs, or similar libraries

Usage

Basic Example

    from dataclasses import dataclass
    from huggingface_hub.dataclasses import strict, as_validated_field

    # Custom validator to ensure a value is positive
    @as_validated_field
    def positive_int(value: int):
        if not value > 0:
            raise ValueError(f"Value must be positive, got {value}")

    @strict
    @dataclass
    class Config:
        model_type: str
        hidden_size: int = positive_int(default=16)
        vocab_size: int = 32  # Default value

        # Methods named `validate_xxx` are treated as class-wise validators
        def validate_big_enough_vocab(self):
            if self.vocab_size < self.hidden_size:
                raise ValueError(f"vocab_size ({self.vocab_size}) must be greater than hidden_size ({self.hidden_size})")

Fields are validated during initialization:

    config = Config(model_type="bert", hidden_size=24)  # Valid
    config = Config(model_type="bert", hidden_size=-1)  # Raises StrictDataclassFieldValidationError

Consistency between fields is also validated during initialization (class-wise validation):

    # `vocab_size` too small compared to `hidden_size`
    config = Config(model_type="bert", hidden_size=32, vocab_size=16)  # Raises StrictDataclassClassValidationError

Fields are also validated during assignment:

    config.hidden_size = 512  # Valid
    config.hidden_size = -1  # Raises StrictDataclassFieldValidationError

To re-run class-wide validation after assignment, you must call .validate() explicitly:

    config.validate()  # Runs all class validators

Custom Validators

You can attach multiple custom validators to fields using validated_field. A validator is a callable that takes a single argument and raises an exception if the value is invalid.

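    from dataclasses import dataclass
    from huggingface_hub.dataclasses import strict, validated_field

    # Plain validator functions; positive_int is defined here so the snippet is self-contained
    def positive_int(value: int):
        if not value > 0:
            raise ValueError(f"Value must be positive, got {value}")

    def multiple_of_64(value: int):
        if value % 64 != 0:
            raise ValueError(f"Value must be a multiple of 64, got {value}")

    @strict
    @dataclass
    class Config:
        hidden_size: int = validated_field(validator=[positive_int, multiple_of_64])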

In this example, both validators are applied to the hidden_size field.

Additional Keyword Arguments

By default, strict dataclasses only accept fields defined in the class. You can allow additional keyword arguments by setting accept_kwargs=True in the @strict decorator.

    from dataclasses import dataclass
    from huggingface_hub.dataclasses import strict

    @strict(accept_kwargs=True)
    @dataclass
    class ConfigWithKwargs:
        model_type: str
        vocab_size: int = 16

    config = ConfigWithKwargs(model_type="bert", vocab_size=30000, extra_field="extra_value")
    print(config)  # ConfigWithKwargs(model_type='bert', vocab_size=30000, *extra_field='extra_value')

Additional keyword arguments appear in the string representation of the dataclass but are prefixed with * to highlight that they are not validated.
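
To make the idea of accepting undeclared keywords concrete, here is a minimal standard-library sketch. The `allow_extra_kwargs` decorator and the `Settings` class are hypothetical and are not part of huggingface_hub; the sketch only shows one way a constructor could separate declared fields from extras and store the extras without validation.

```python
import dataclasses
from dataclasses import dataclass


def allow_extra_kwargs(cls):
    """Hypothetical decorator: let a dataclass constructor accept undeclared keywords."""
    field_names = {f.name for f in dataclasses.fields(cls)}
    original_init = cls.__init__

    def __init__(self, *args, **kwargs):
        known = {k: v for k, v in kwargs.items() if k in field_names}
        extra = {k: v for k, v in kwargs.items() if k not in field_names}
        original_init(self, *args, **known)
        for name, value in extra.items():
            setattr(self, name, value)  # stored as-is, with no validation

    cls.__init__ = __init__
    return cls


@allow_extra_kwargs
@dataclass
class Settings:
    model_type: str
    vocab_size: int = 16


settings = Settings(model_type="bert", extra_field="extra_value")
```

Unlike the behavior described above, this sketch does not alter the string representation, so the extra attribute is not shown by `repr(settings)`.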

Integration with Type Hints

Strict dataclasses respect type hints and validate them automatically. For example:

    from typing import List
    from dataclasses import dataclass
    from huggingface_hub.dataclasses import strict

    @strict
    @dataclass
    class Config:
        layers: List[int]

    config = Config(layers=[64, 128])  # Valid
    config = Config(layers="not_a_list")  # Raises StrictDataclassFieldValidationError

Supported types include:

  • Any
  • Union
  • Optional
  • Literal
  • List
  • Dict
  • Tuple
  • Set

Any combination of these types is also supported. If you need more complex type validation, you can implement it with a custom validator.
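
Checking nested combinations of such types generally means walking the type hint recursively. The sketch below shows one way to do that with `typing.get_origin` and `typing.get_args` from the standard library. The `matches` function is a hypothetical helper that covers only a subset of the types listed above (`Any`, `Union`, `Optional`, `Literal`, `List`, `Dict`); it is not the library's type checker.

```python
from typing import Any, Dict, List, Literal, Optional, Union, get_args, get_origin


def matches(value, hint) -> bool:
    """Return True if `value` conforms to `hint` (illustrative subset of typing)."""
    if hint is Any:
        return True
    origin, args = get_origin(hint), get_args(hint)
    if origin is Union:  # also covers Optional[X], which is Union[X, None]
        return any(matches(value, arg) for arg in args)
    if origin is Literal:
        return value in args
    if origin is list:
        return isinstance(value, list) and (
            not args or all(matches(item, args[0]) for item in value)
        )
    if origin is dict:
        return isinstance(value, dict) and (
            not args
            or all(matches(k, args[0]) and matches(v, args[1]) for k, v in value.items())
        )
    # Anything else is treated as a plain class such as int, str, or NoneType.
    return isinstance(value, hint)


ok = matches({"layers": [64, 128]}, Dict[str, List[int]])
bad = matches("not_a_list", List[int])
```

Here `ok` is `True` and `bad` is `False`; unsupported generic hints (for example `Tuple` or `Set`) are outside the scope of this sketch.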

Class validators

Methods named validate_xxx are treated as class validators. These methods must take only self as an argument. Class validators are run once during initialization, right after __post_init__. You can define as many of them as needed; they are executed sequentially in the order they appear.

Note that class validators are not automatically re-run when a field is updated after initialization. To manually re-validate the object, you need to call obj.validate().

    from dataclasses import dataclass
    from huggingface_hub.dataclasses import strict

    @strict
    @dataclass
    class Config:
        foo: str
        foo_length: int
        upper_case: bool = False

        def validate_foo_length(self):
            if len(self.foo) != self.foo_length:
                raise ValueError(f"foo must be {self.foo_length} characters long, got {len(self.foo)}")

        def validate_foo_casing(self):
            if self.upper_case and self.foo.upper() != self.foo:
                raise ValueError(f"foo must be uppercase, got {self.foo}")

    config = Config(foo="bar", foo_length=3)  # ok

    config.upper_case = True
    config.validate()  # Raises StrictDataclassClassValidationError

    Config(foo="abcd", foo_length=3)  # Raises StrictDataclassClassValidationError
    Config(foo="Bar", foo_length=3, upper_case=True)  # Raises StrictDataclassClassValidationError

Method .validate() is a reserved name on strict dataclasses. To prevent unexpected behaviors, a StrictDataclassDefinitionError error will be raised if your class already defines one.
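
The naming convention for class validators can be pictured with a small standard-library sketch. The `run_class_validators` helper and the `Range` class are hypothetical; the helper simply calls every method whose name starts with `validate_`, in definition order. For brevity it is invoked from `__post_init__`, which is a simplification and not a description of how @strict wires things up.

```python
from dataclasses import dataclass


def run_class_validators(obj) -> None:
    """Call every method named `validate_*` in definition order (illustrative helper)."""
    # A class __dict__ preserves the order in which attributes were defined.
    for name, attr in vars(type(obj)).items():
        if name.startswith("validate_") and callable(attr):
            attr(obj)


@dataclass
class Range:
    low: int
    high: int

    def __post_init__(self):
        run_class_validators(self)  # run once at the end of initialization

    def validate_ordering(self):
        if self.low > self.high:
            raise ValueError(f"low ({self.low}) must not exceed high ({self.high})")


r = Range(low=1, high=5)
r.low = 10  # no check happens here; the object is now inconsistent
# Calling run_class_validators(r) at this point raises ValueError.
```

This mirrors the behavior described above: cross-field checks run at construction time, and re-checking after a mutation is an explicit step.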

API Reference

@strict

The @strict decorator enhances a dataclass with strict validation.

huggingface_hub.dataclasses.strict

(accept_kwargs: bool = False)

Parameters

  • cls: The class to convert to a strict dataclass.
  • accept_kwargs (bool, optional): If True, allows arbitrary keyword arguments in __init__. Defaults to False.

Decorator to add strict validation to a dataclass.

This decorator must be used on top of @dataclass to ensure IDEs and static typing tools recognize the class as a dataclass.

Can be used with or without arguments:

  • @strict
  • @strict(accept_kwargs=True)

Example:

    >>> from dataclasses import dataclass
    >>> from huggingface_hub.dataclasses import as_validated_field, strict, validated_field

    >>> @as_validated_field
    ... def positive_int(value: int):
    ...     if not value >= 0:
    ...         raise ValueError(f"Value must be positive, got {value}")

    >>> @strict(accept_kwargs=True)
    ... @dataclass
    ... class User:
    ...     name: str
    ...     age: int = positive_int(default=10)

    # Initialize
    >>> User(name="John")
    User(name='John', age=10)

    # Extra kwargs are accepted
    >>> User(name="John", age=30, lastname="Doe")
    User(name='John', age=30, *lastname='Doe')

    # Invalid type => raises
    >>> User(name="John", age="30")
    huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'age':
        TypeError: Field 'age' expected int, got str (value: '30')

    # Invalid value => raises
    >>> User(name="John", age=-1)
    huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'age':
        ValueError: Value must be positive, got -1

validate_typed_dict

Method to validate that a dictionary conforms to the types defined in a TypedDict class.

This is the equivalent of dataclass validation, but for TypedDicts. Since typed dicts are never instantiated (they are only used by static type checkers), the validation step must be called manually.

huggingface_hub.dataclasses.validate_typed_dict

(schema: type, data: dict)

Parameters

  • schema (type[TypedDictType]): The TypedDict class defining the expected structure and types.
  • data (dict): The dictionary to validate.

Raises

  • StrictDataclassFieldValidationError: If any field in the dictionary does not conform to the expected type.

Validate that a dictionary conforms to the types defined in a TypedDict class.

Under the hood, the typed dict is converted to a strict dataclass and validated using the @strict decorator.

Example:

    >>> from typing import Annotated, TypedDict
    >>> from huggingface_hub.dataclasses import validate_typed_dict

    >>> def positive_int(value: int):
    ...     if not value >= 0:
    ...         raise ValueError(f"Value must be positive, got {value}")

    >>> class User(TypedDict):
    ...     name: str
    ...     age: Annotated[int, positive_int]

    >>> # Valid data
    >>> validate_typed_dict(User, {"name": "John", "age": 30})

    >>> # Invalid type for age
    >>> validate_typed_dict(User, {"name": "John", "age": "30"})
    huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'age':
        TypeError: Field 'age' expected int, got str (value: '30')

    >>> # Invalid value for age
    >>> validate_typed_dict(User, {"name": "John", "age": -1})
    huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'age':
        ValueError: Value must be positive, got -1
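
To show how validators attached through `Annotated` can be discovered at runtime, the following standard-library sketch reads a TypedDict's hints with `typing.get_type_hints(..., include_extras=True)`. The `check_typed_dict` helper, the `non_negative` validator, and the `Item` class are hypothetical. The sketch checks the dictionary directly rather than converting the schema to a dataclass, treats every key as required, and supports plain classes only, so it is a simplified stand-in and not the library's implementation.

```python
from typing import Annotated, TypedDict, get_args, get_origin, get_type_hints


def check_typed_dict(schema, data: dict) -> None:
    """Hypothetical helper: check plain types and Annotated validators on a dict."""
    for key, hint in get_type_hints(schema, include_extras=True).items():
        validators = []
        if get_origin(hint) is Annotated:
            # Annotated[int, f, g] unpacks to the base type followed by its metadata.
            hint, *validators = get_args(hint)
        if key not in data:
            raise TypeError(f"Missing field '{key}'")
        value = data[key]
        if not isinstance(value, hint):
            raise TypeError(
                f"Field '{key}' expected {hint.__name__}, got {type(value).__name__}"
            )
        for validator in validators:
            validator(value)


def non_negative(value: int):
    if value < 0:
        raise ValueError(f"Value must be non-negative, got {value}")


class Item(TypedDict):
    name: str
    count: Annotated[int, non_negative]


check_typed_dict(Item, {"name": "widget", "count": 3})  # passes silently
```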

as_validated_field

Decorator to create a validated_field. Recommended for fields with a single validator to avoid boilerplate code.

huggingface_hub.dataclasses.as_validated_field

(validator: typing.Callable[[typing.Any], NoneType])

Parameters

  • validator (Callable): A method that takes a value as input and raises ValueError/TypeError if the value is invalid.

Decorates a validator function as a validated_field (i.e. a dataclass field with a custom validator).

validated_field

Creates a dataclass field with custom validation.

huggingface_hub.dataclasses.validated_field

(validator: typing.Union[list[typing.Callable[[typing.Any], NoneType]], typing.Callable[[typing.Any], NoneType]], default: typing.Union[typing.Any, dataclasses._MISSING_TYPE] = dataclasses.MISSING, default_factory: typing.Union[typing.Callable[[], typing.Any], dataclasses._MISSING_TYPE] = dataclasses.MISSING, init: bool = True, repr: bool = True, hash: typing.Optional[bool] = None, compare: bool = True, metadata: typing.Optional[dict] = None, **kwargs: typing.Any)

Parameters

  • validator (Callable or list[Callable]): A method that takes a value as input and raises ValueError/TypeError if the value is invalid. Can be a list of validators to apply multiple checks.
  • **kwargs: Additional arguments to pass to dataclasses.field().

Create a dataclass field with a custom validator.

Useful to apply several checks to a field. If only applying one rule, check out the as_validated_field decorator.
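
The relationship between a field factory and a decorator that wraps a single validator can be sketched with the standard library alone. In the sketch below, `checked_field`, `as_checked_field`, `even`, and `Grid` are hypothetical names: the validator is stored in the field's `metadata`, and `__post_init__` looks it up and runs it. This is an assumption-laden illustration of the pattern, not the code behind validated_field or as_validated_field.

```python
from dataclasses import dataclass, field, fields


def checked_field(validator, **kwargs):
    """Hypothetical helper: a dataclass field carrying its validator in metadata."""
    return field(metadata={"validator": validator}, **kwargs)


def as_checked_field(validator):
    """Hypothetical decorator: turn a validator function into a field factory."""
    def factory(**kwargs):
        return checked_field(validator, **kwargs)
    return factory


@as_checked_field
def even(value: int):
    if value % 2:
        raise ValueError(f"Value must be even, got {value}")


@dataclass
class Grid:
    width: int = even(default=4)

    def __post_init__(self):
        # Run the validator stored on each field, if any.
        for f in fields(self):
            check = f.metadata.get("validator")
            if check is not None:
                check(getattr(self, f.name))


grid = Grid()  # uses the default width of 4, which passes the check
```

Note that after decoration `even` is a field factory rather than a plain validator, which is why it is called with `default=4` in the class body.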

Errors

class huggingface_hub.errors.StrictDataclassError

()

Base exception for strict dataclasses.

class huggingface_hub.errors.StrictDataclassDefinitionError

()

Exception thrown when a strict dataclass is defined incorrectly.

class huggingface_hub.errors.StrictDataclassFieldValidationError

(field: str, cause: Exception)

Exception thrown when a strict dataclass fails validation for a given field.

Why Not Use Pydantic? (or attrs? or marshmallow_dataclass?)

  • See the discussion in https://github.com/huggingface/transformers/issues/36329 regarding adding Pydantic as a dependency. It would be a heavy addition and require careful logic to support both v1 and v2.
  • We don’t need most of Pydantic’s features, especially those related to automatic casting, jsonschema, serialization, aliases, etc.
  • We don’t need the ability to instantiate a class from a dictionary.
  • We don’t want to mutate data. In @strict, “validation” means “checking if a value is valid.” In Pydantic, “validation” means “casting a value, possibly mutating it, and then checking if it’s valid.”
  • We don’t need blazing-fast validation. @strict isn’t designed for heavy loads where performance is critical. Common use cases involve validating a model configuration (performed once and negligible compared to running a model). This allows us to keep the code minimal.
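
The distinction between checking and casting drawn in the list above can be shown in a few lines of plain Python. Both functions are hypothetical illustrations and do not reproduce the behavior of @strict or of Pydantic; they only contrast the two meanings of "validation".

```python
def check_int(value):
    """Validation in the checking sense: the value is inspected, never changed."""
    if not isinstance(value, int):
        raise TypeError(f"expected int, got {type(value).__name__}")
    return value


def coerce_int(value):
    """Validation in the casting sense: the value may be converted first."""
    return int(value)


converted = coerce_int("30")  # the string is turned into the integer 30
# check_int("30") raises TypeError instead of converting the string.
```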