Pandas likes to throw cryptic errors when you feed its functions empty DataFrames, and the messages say nothing that would help you identify the root cause. To avoid this, I used to write conditions like this one all over the place:
    def normalize_null(data: pd.DataFrame) -> pd.DataFrame:
        """Replaces pd.NaT garbage with None."""
        if data.empty:
            return pd.DataFrame()
        return data.replace({pd.NaT: None})

I don't even remember if this particular operation fails on an empty DataFrame, but there are many that do, so I just add it to every function just in case.
Then I thought: why make it so ugly? Make it a decorator! So now I do this:
    @skip_if_empty(default=pd.DataFrame())
    def normalize_null(data: pd.DataFrame) -> pd.DataFrame:
        """Replaces pd.NaT garbage with None."""
        return data.replace({pd.NaT: None})

with this decorator:
    def skip_if_empty(default: Any | None):
        def decorate(decoratee):
            def decorator(*decoratee_args, **decoratee_kwargs):
                if len(decoratee_args) < 1:
                    raise ValueError("The decorated function must have at least one argument.")
                if type(decoratee_args[0]) is not pd.DataFrame:
                    raise ValueError("The first argument must be a DataFrame.")
                return default if decoratee_args[0].empty else decoratee(*decoratee_args, **decoratee_kwargs)
            decorator.__signature__ = inspect.signature(decoratee)
            return decorator
        return decorate

Now when I use several such methods that modify a DataFrame in a chain, I don't have to worry about anything being empty:
    data = normalize_null(data)
    data = do_something_else(data)
    data = ...

Would you say it's OK to use decorators this way, or maybe there is something else about it that could be improved?
1 Answer
functools.wraps
The triple-nested function confused me for a minute, looking at it as a stranger, particularly as the decorator is not documented. You may want to look into functools.wraps, a convenience function that does what you want by preserving the signature, docstring, name, etc., while being a standardised method.
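As a rough sketch of what that could look like (not the only way to write it), here is the decorator from the question with functools.wraps replacing the manual `__signature__` assignment. The names `func` and `wrapper` are my own choices, and the two checks use the more idiomatic forms and the more specific error message discussed further down in this answer.

```python
import functools

import pandas as pd


def skip_if_empty(default=None):
    """Return `default` instead of calling the function when its first
    positional argument is an empty DataFrame."""

    def decorate(func):
        # functools.wraps copies __name__, __doc__, etc. onto the wrapper and
        # sets __wrapped__, which inspect.signature follows by default.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not args:
                raise ValueError(f"{func.__name__} must have at least one positional argument.")
            if not isinstance(args[0], pd.DataFrame):
                raise ValueError(f"The first argument of {func.__name__} must be a DataFrame.")
            return default if args[0].empty else func(*args, **kwargs)

        return wrapper

    return decorate


@skip_if_empty(default=pd.DataFrame())
def normalize_null(data: pd.DataFrame) -> pd.DataFrame:
    """Replaces pd.NaT garbage with None."""
    return data.replace({pd.NaT: None})
```

With this, `normalize_null.__name__`, `normalize_null.__doc__` and `inspect.signature(normalize_null)` all report the decorated function's own metadata rather than the wrapper's.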
Names
To me, default doesn't immediately say what it does; default_return would be more obvious. Also, skip_if_empty implies that this function would work for any type, but it is explicitly limited by:
    if type(decoratee_args[0]) is not pd.DataFrame

to pd.DataFrames. Perhaps skip_if_empty_df might be clearer. Either that, or you could make this more generically useful by allowing a second argument of permitted types:
    def skip_if_empty(default: Any | None, allowed_types: tuple = (pd.DataFrame,)):
        ...
        if type(decoratee_args[0]) not in allowed_types

Pythonic
    if len(decoratee_args) < 1

Unless len(tuple) is liable to return a negative number, what you're actually asking is:

    if len(decoratee_args) == 0

or, more pythonically:

    if not decoratee_args

Likewise,

    if type(decoratee_args[0]) is not pd.DataFrame

might want to be:

    if not isinstance(decoratee_args[0], pd.DataFrame)

which would allow subclasses of pd.DataFrame to work as well.
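To make the difference concrete, here is a small sketch. `TaggedFrame` is a hypothetical subclass invented purely for illustration, and including `pd.Series` in `allowed_types` is just an assumption about what you might want to permit.

```python
import pandas as pd


class TaggedFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass, used only for this illustration."""


frame = TaggedFrame({"a": [1, 2]})

# An exact type comparison rejects the subclass...
exact_match = type(frame) is pd.DataFrame

# ...while isinstance accepts it.
subclass_match = isinstance(frame, pd.DataFrame)

# isinstance also takes a tuple of types, which pairs naturally with an
# allowed_types parameter.
allowed_types = (pd.DataFrame, pd.Series)
tuple_match = isinstance(frame, allowed_types)

# An empty tuple of positional arguments is falsy, so `not args` covers
# the len(args) == 0 case.
no_args = ()
has_no_args = not no_args

print(exact_match, subclass_match, tuple_match, has_no_args)
```

Note that combining both suggestions means writing `isinstance(decoratee_args[0], allowed_types)` rather than a membership test on `type(...)`, since the membership test would reject subclasses again.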
Error messages
As a user, I would (and should) be unaware that any of the functions are in any way decorated, so seeing:
    raise ValueError("The decorated function must have at least one argument.")

is unhelpful. You might instead want:
    raise ValueError(f"{decoratee.__name__} must have at least one argument.")

- I like that and I have a question about this allowed_types: tuple = (pd.DataFrame,). Would this also work with a list, or are tuples preferable in cases like this? I guess probably because they are immutable, right? – t3chb0t, Mar 29, 2023 at 10:55
- It would work with a list, and type annotations are just hints (by default). I would argue a tuple makes most sense here due to immutability and the fact that you will be defining it as a programmer rather than a user, but it could be any collection that supports in semantics. isinstance also recommends tuples. – DeathIncarnate, Mar 29, 2023 at 10:59
