KEMBAR78
Enum validator improvements by samuelcolvin · Pull Request #9045 · pydantic/pydantic · GitHub
Skip to content

Conversation

@samuelcolvin
Copy link
Member

@samuelcolvin samuelcolvin commented Mar 19, 2024

Change Summary

While looking through how we do enum validation in preparation for moving some of it to Rust, I saw a few things that could be improved. Hopefully this provides a "cleaner" schema, and should improve performance a bit.

Performance gain is small, but not insignificant:

before

In [1]: import enum
   ...: from pydantic import TypeAdapter
   ...: 
   ...: class Foo(enum.IntEnum):
   ...:     x = 1
   ...:     y = 2
   ...: 
   ...: 
   ...: ta = TypeAdapter(Foo)
   ...: 
   ...: %timeit ta.validate_python(2)
   ...: %timeit ta.validate_json('2')
   ...: 
550 ns ± 2.37 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
560 ns ± 2.29 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

After

In [1]: import enum
   ...: from pydantic import TypeAdapter
   ...: 
   ...: class Foo(enum.IntEnum):
   ...:     x = 1
   ...:     y = 2
   ...: 
   ...: 
   ...: ta = TypeAdapter(Foo)
   ...: 
   ...: %timeit ta.validate_python(2)
   ...: %timeit ta.validate_json('2')
   ...: 
508 ns ± 0.84 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
517 ns ± 1.95 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Checklist

  • The pull request title is a good summary of the changes - it will be used in the changelog
  • Unit tests for the changes exist
  • Tests pass on CI
  • Documentation reflects the changes where applicable
  • My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

@cloudflare-workers-and-pages
Copy link

cloudflare-workers-and-pages bot commented Mar 19, 2024

Deploying pydantic-docs with  Cloudflare Pages  Cloudflare Pages

Latest commit: 8bd46c3
Status: ✅  Deploy successful!
Preview URL: https://11ebbd3f.pydantic-docs2.pages.dev
Branch Preview URL: https://enum-validator-improvements.pydantic-docs2.pages.dev

View logs

@codspeed-hq
Copy link

codspeed-hq bot commented Mar 19, 2024

CodSpeed Performance Report

Merging #9045 will not alter performance

Comparing enum-validator-improvements (8bd46c3) with main (09f4456)

Summary

✅ 10 untouched benchmarks

except ValueError:
# The type: ignore on the next line is to ignore the requirement of LiteralString
raise PydanticCustomError('enum', f'Input should be {expected}', {'expected': expected}) # type: ignore
raise PydanticCustomError('enum', 'Input should be {expected}', {'expected': expected})
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

output is the same, but it's more correct to let pydantic-core take care of replacing the variables in the message template.

python_schema=strict_python_schema,
)
js_updates['type'] = 'integer'
lax_schema = core_schema.no_info_after_validator_function(to_enum, core_schema.int_schema())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've replaced lots of use of core_schema.chain_schema with core_schema.no_info_after_validator_function which should be more efficient, and is more concise.

[enum_schema, core_schema.no_info_plain_validator_function(lambda x: x.value)]
)
if config.get('use_enum_values', False):
enum_schema = core_schema.no_info_after_validator_function(attrgetter('value'), enum_schema)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

attrgetter is nearly twice as fast as a lambda:

In [1]: import enum

In [2]: class Foo(enum.Enum):
   ...:     x = 'a'
   ...:     y = 'b'
   ...: 

In [3]: with_lambda = lambda thing: thing.x

In [4]: %timeit with_lambda(Foo)
39.1 ns ± 0.161 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [5]: from operator import attrgetter

In [6]: using_attrgetter = attrgetter('x')

In [7]: %timeit using_attrgetter(Foo)
24.5 ns ± 0.473 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be using attrgetter more broadly?

@samuelcolvin samuelcolvin added the relnotes-performance Used for performance improvements. label Mar 19, 2024
Copy link
Contributor

@sydney-runkle sydney-runkle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! This is much more simple than the existing behavior!

[enum_schema, core_schema.no_info_plain_validator_function(lambda x: x.value)]
)
if config.get('use_enum_values', False):
enum_schema = core_schema.no_info_after_validator_function(attrgetter('value'), enum_schema)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be using attrgetter more broadly?

@sydney-runkle sydney-runkle merged commit b0dfc54 into main Mar 19, 2024
@sydney-runkle sydney-runkle deleted the enum-validator-improvements branch March 19, 2024 14:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

relnotes-performance Used for performance improvements.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants