TL;DR - When creating NamedTuples dynamically, there should be a single interface that accepts all 3: field name, field annotation, and field default. Currently, collections.namedtuple() accepts defaults but NOT annotations, and typing.NamedTuple() accepts annotations but NOT defaults.
E.g. by allowing annotations to be passed to collections.namedtuple(), or by allowing defaults to be passed to typing.NamedTuple().
Context
I’m building a frontend library for Django - django-components (e.g. think Vue in Python). There, I want to make the life of our users as simple as possible.
To allow for input validation and typing, we’ve come up with this construct (simplified):
from typing import NamedTuple
from django_components import Component

class MyTable(Component):
    class Kwargs(NamedTuple):
        title: str
        title_height_px: int = 40

    def get_template_data(self, kwargs: Kwargs):
        return {
            # Strip surrounding whitespace from the title
            "title": kwargs.title.strip(),
            "title_height_px": kwargs.title_height_px,
        }

MyTable.render(
    kwargs=MyTable.Kwargs(
        title="MyPage",
        title_height_px=50,
    )
)
What’s going on in the example is that:
- User defines a UI component by subclassing Component.
- They can define a nested class Component.Kwargs to define component inputs (similar to Vue’s props).
- To get type hints, user can annotate the kwargs argument in get_template_data() with the Component.Kwargs class they just defined. Internally, we transform kwargs to an instance of Component.Kwargs. This way they get type checking inside the component.
- To get type checking when calling the component from outside (MyTable.render()), users can reuse the Component.Kwargs class that they defined.
In the example above, the Kwargs class subclasses NamedTuple. Our users can choose any other class - e.g. if they use Pydantic’s BaseModel, they will get not only type checking, but also runtime input validation.
As I said, I want to make the API as simple as possible. Needing to remember that Component.Kwargs must subclass NamedTuple (or another class) is not great. And so in a recent PR, I made it possible to omit the parent class. So now our users can simply do:
class MyTable(Component):
    class Kwargs:  # <----- No longer explicitly subclasses NamedTuple
        title: str
        title_height_px: int = 40
To implement this simplification, I needed to take a plain class like the Component.Kwargs above, and behind the scenes convert it into a NamedTuple. That way, even if the user simply defines class Kwargs, it will still eventually be a NamedTuple, and the behaviour above would not break.
So I had a challenge - How can I take a plain class and convert it into a NamedTuple?
Why NamedTuple? It might be outdated info, but I read that they should be faster than dataclasses. On some of my larger web pages at work, this class may be instantiated ~3-4k times when rendering a page, so I try to be performance-conscious.
Problem
Turns out that dynamically creating a NamedTuple class with BOTH annotations AND defaults is a minefield full of nuance.
That's why I think this should be implemented in Python itself, so that we, Python users, don’t have to think about the details, but can simply call collections.namedtuple() or similar, and get a NamedTuple that has both annotations AND defaults.
See final implementation here
For normal classes, you could simply make a new class that subclasses from both.
class X(MyClass, NamedTuple):
    pass
But NamedTuples don’t support that.
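And you can’t extend a NamedTuple by subclassing it further. The subclass is created without error, but the new annotation is not added as a field:
class Another(NamedTuple):
    x: int = 1
class X(Another):
    y: str  # Not a field: X still has only `x`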
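Both limitations can be reproduced in a few lines. This is only a minimal sketch, assuming Python 3.9 or newer (where mixing NamedTuple with another base class raises a TypeError); `MyClass` is a placeholder for any plain class.

```python
from typing import NamedTuple


class MyClass:
    """Placeholder for any plain class."""


# 1) Mixing NamedTuple with another base class is rejected at class creation.
try:
    class Mixed(MyClass, NamedTuple):
        pass
except TypeError as exc:
    mixed_error = str(exc)
else:
    mixed_error = None


# 2) Subclassing a NamedTuple class works, but new annotations do not become fields.
class Another(NamedTuple):
    x: int = 1


class X(Another):
    y: str  # Only an annotation on the subclass, NOT a tuple field


print(mixed_error)
print(X._fields)  # ('x',)
```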
When using typing.NamedTuple as a function, you can’t pass in defaults:
my_class = typing.NamedTuple("MyClass", [("x", int), ("y", str)])
I tried setting the defaults (_field_defaults) manually, but Python wasn’t picking that up.
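A likely reason, as far as I can tell from CPython's behaviour: _field_defaults is only a metadata dict, while the constructor takes its defaults from `__new__.__defaults__`. The sketch below relies on that implementation detail, so treat it as an illustration rather than a supported API.

```python
from typing import NamedTuple

MyClass = NamedTuple("MyClass", [("x", int), ("y", str)])

# Overwriting the metadata dict alone does not change the constructor.
MyClass._field_defaults = {"y": "hello"}
try:
    MyClass(1)
    metadata_only_works = True
except TypeError:
    metadata_only_works = False

# The generated __new__ is a regular function, so its defaults are
# right-aligned positional defaults, like for any other function.
MyClass.__new__.__defaults__ = ("hello",)
instance = MyClass(1)

print(metadata_only_works)  # False
print(instance.y)           # hello
```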
One option was to define the NamedTuple with the class syntax as a string, and then execute that string. But that had 2 problems - 1) security risk, and 2) we’d need to import all the types used in annotations:
my_cls_str = """
from typing import NamedTuple
from path.to.custom import CustomClass
class MyClass(NamedTuple):
    x: int
    y: str
    z: CustomClass
"""
# Run the class definition with exec() (eval() only handles expressions)
# and read the class back from the namespace
namespace = {}
exec(my_cls_str, namespace)
my_cls = namespace["MyClass"]
Lastly, I managed to get it working using collections.namedtuple. This function doesn’t define the field annotations, but it is able to handle defaults.
So if I have a NamedTuple with 3 fields - x, y, and z - and I set defaults to ["hello", 123]:
my_cls = namedtuple("MyClass", ["x", "y", "z"], defaults=["hello", 123])
then this is the same as writing:
class MyClass(NamedTuple):
    x: int
    y: str = "hello"
    z: int = 123
To get the annotations back in, I had to set MyClass.__annotations__ at the end.
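Put together, the approach could look roughly like the sketch below. It is not the actual django-components implementation; `to_namedtuple` and `KwargsTuple` are hypothetical names, and the helper assumes the plain class declares its fields with real (non-string) annotations.

```python
from collections import namedtuple


def to_namedtuple(plain_cls):
    """Hypothetical helper: turn a plain annotated class into a namedtuple class."""
    annotations = dict(getattr(plain_cls, "__annotations__", {}))
    fields = list(annotations)

    # namedtuple() right-aligns defaults, so fields with defaults must come last.
    defaults = []
    for name in fields:
        if name in vars(plain_cls):
            defaults.append(vars(plain_cls)[name])
        elif defaults:
            raise TypeError(f"Non-default field {name!r} follows a field with a default")

    nt_cls = namedtuple(plain_cls.__name__, fields, defaults=defaults)
    # namedtuple() knows nothing about types, so attach the annotations afterwards.
    nt_cls.__annotations__ = annotations
    return nt_cls


class Kwargs:
    title: str
    title_height_px: int = 40


KwargsTuple = to_namedtuple(Kwargs)
print(KwargsTuple("MyPage"))  # Kwargs(title='MyPage', title_height_px=40)
```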
NOTE: I can’t confirm if using __annotations__ is the right way to set these, since I don’t know what tools or functions I could use to test this out.
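One way to check this at runtime might be typing.get_type_hints(), comparing the dynamically built class against the same class written with the class syntax. The names `Dynamic` and `Static` below are made up for this sketch. Keep in mind that static type checkers such as mypy or Pyright analyze the source code, so they will not see annotations that are assigned at runtime.

```python
import typing
from collections import namedtuple

# Built dynamically, annotations attached afterwards.
Dynamic = namedtuple("Dynamic", ["x", "y", "z"], defaults=["hello", 123])
Dynamic.__annotations__ = {"x": int, "y": str, "z": int}


# The equivalent class written with the class syntax.
class Static(typing.NamedTuple):
    x: int
    y: str = "hello"
    z: int = 123


same_hints = typing.get_type_hints(Dynamic) == typing.get_type_hints(Static)
same_fields = Dynamic._fields == Static._fields
same_defaults = Dynamic._field_defaults == Static._field_defaults
print(same_hints, same_fields, same_defaults)
```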
NOTE 2: One annoying thing about collections.namedtuple was how it sets defaults the same way as Python functions do - it takes defaults as a list, and it assigns the defaults to the last N entries of the class signature. It’s annoying because it’s easy to make a mistake when constructing the defaults list.
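To illustrate the right-alignment, and one possible way to avoid building the list by hand, here is a small sketch. `trailing_defaults` is a hypothetical helper, not something from the standard library.

```python
from collections import namedtuple

fields = ["x", "y", "z"]

# Defaults are right-aligned: two defaults apply to the LAST two fields.
MyClass = namedtuple("MyClass", fields, defaults=["hello", 123])
print(MyClass._field_defaults)  # {'y': 'hello', 'z': 123}


def trailing_defaults(field_names, defaults_by_name):
    """Hypothetical helper: build the right-aligned list from a name -> default mapping."""
    defaults = []
    for name in field_names:
        if name in defaults_by_name:
            defaults.append(defaults_by_name[name])
        elif defaults:
            # A gap would silently shift defaults onto the wrong fields.
            raise TypeError(f"Field {name!r} has no default but follows one that does")
    return defaults


Safer = namedtuple("Safer", fields, defaults=trailing_defaults(fields, {"y": "hello", "z": 123}))
```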
NOTE 3: One thing I didn’t know - when defining NamedTuple with the class syntax, the fields it recognizes are only those that are annotated. If a field is NOT annotated, it’s treated as a CLASS VARIABLE:
class MyClass(NamedTuple):
    x: int  # Required field
    y: str = "hello"  # Optional field (with default)
    z = 123  # Class var!
So at the end I also assigned the class variables to the new namedtuple class.
But this already feels outside of the scope of the namedtuple constructor, because in other cases one may not want to copy the class vars.
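Copying the class variables over could be sketched like this. It is only an illustration with made-up names (`Kwargs`, `KwargsTuple`, `describe`), and it assumes that none of the copied names collide with attributes that namedtuple defines itself (such as count, index or _replace).

```python
from collections import namedtuple


class Kwargs:
    x: int
    y: str = "hello"
    z = 123  # Not annotated, so it should stay a class variable

    def describe(self):
        return f"{self.x}-{self.y}"


fields = list(Kwargs.__annotations__)  # ['x', 'y']
KwargsTuple = namedtuple("Kwargs", fields, defaults=["hello"])

# Copy everything that is neither a field nor a dunder attribute.
for name, value in vars(Kwargs).items():
    if name in fields or (name.startswith("__") and name.endswith("__")):
        continue
    setattr(KwargsTuple, name, value)

print(KwargsTuple._fields)        # ('x', 'y')
print(KwargsTuple.z)              # 123
print(KwargsTuple(1).describe())  # 1-hello
```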
NOTE 4: Surprisingly LLMs were really bad at solving this, so I’m hoping this comment gets into the next training set.
Proposal
When creating NamedTuples dynamically, there should be a single interface that accepts all 3: field name, field annotation, and field default. Currently, collections.namedtuple() accepts defaults but NOT annotations, and typing.NamedTuple() accepts annotations but NOT defaults.
E.g. by allowing annotations to be passed to collections.namedtuple(), or by allowing defaults to be passed to typing.NamedTuple().
The API of collections.namedtuple is (in Py 3.11):
def namedtuple(
    typename: str,
    field_names: str | Iterable[str],
    *,
    rename: bool = False,
    module: str | None = None,
    defaults: Iterable[Any] | None = None
) -> type[tuple[Any, ...]]
So it could receive one more kwarg, annotations, that would be an iterable like defaults:
def namedtuple(
    typename: str,
    field_names: str | Iterable[str],
    *,
    rename: bool = False,
    module: str | None = None,
    defaults: Iterable[Any] | None = None,
    annotations: Iterable[Any] | None = None,  # <---- NEW
) -> type[tuple[Any, ...]]
I don’t know how Python handles the defaults iterable, but I reckon there should be an error when defaults and annotations are both non-null and their lengths don’t match.
The API of typing.NamedTuple as a function call is:
Employee = NamedTuple('Employee', [('name', str), ('id', int)])
Currently it’s a 2-tuple of (name, type). So NamedTuple could be made to optionally accept a 3-tuple, in which case it would be (name, type, default):
Employee = NamedTuple('Employee', [
    ('name', str),
    ('id', int),
    ('active', bool, True),
])
This should work the same way as when defining a NamedTuple with the class syntax - fields with defaults MUST be AFTER fields without defaults.
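Until something like this exists, the proposed behaviour can be approximated in user code. The sketch below is not part of typing; `named_tuple` is a hypothetical wrapper around collections.namedtuple that accepts 2-tuples and 3-tuples and enforces the same ordering rule.

```python
from collections import namedtuple


def named_tuple(typename, fields):
    """Hypothetical wrapper: accepts (name, type) or (name, type, default) tuples."""
    names, annotations, defaults = [], {}, []
    for field in fields:
        if len(field) == 3:
            name, annotation, default = field
            defaults.append(default)
        elif len(field) == 2:
            name, annotation = field
            if defaults:
                raise TypeError(f"Non-default field {name!r} follows a field with a default")
        else:
            raise TypeError("Each field must be (name, type) or (name, type, default)")
        names.append(name)
        annotations[name] = annotation

    cls = namedtuple(typename, names, defaults=defaults)
    cls.__annotations__ = annotations  # namedtuple() itself does not record types
    return cls


Employee = named_tuple("Employee", [
    ("name", str),
    ("id", int),
    ("active", bool, True),
])
print(Employee("Ann", 1))  # Employee(name='Ann', id=1, active=True)
```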