Python Type Hints#
Python is a dynamically typed language, meaning variable types don't require explicit declaration. However, as projects grow in complexity, type annotations become increasingly valuable for code maintainability and clarity.
Type hints (PEP 484) have been a major focus of recent Python releases, and I was particularly intrigued when I heard about Guido van Rossum's work on MyPy at Dropbox, where the team needed robust tooling to migrate their codebase from Python 2 to Python 3.
Today, type hints are essential for modern Python development. They significantly enhance IDE capabilities and AI-powered development tools by providing better code completion, static analysis, and error detection. This mirrors the evolution we've seen with TypeScript's adoption over traditional JavaScript—explicit typing leads to more reliable and maintainable code.
Typed Python vs data science projects
Type hints are less common in data science projects, for reasons we won't discuss here.
typing module vs collections module#
Since Python 3.9, most of the generic aliases in the typing module are deprecated; the built-in types and the classes in collections and collections.abc are recommended instead.
Some names, such as typing.Any, typing.Generic, and typing.TypeVar, are not deprecated.
Thanks to subscription support in many collections since Python 3.9
The built-in types and the collections module are now the preferred source for many types (not all yet), as they support subscription at runtime. Subscription refers to using square brackets [] to indicate the type of elements in a collection. Subscription at runtime means we can use list[int], dict[str, int], etc. directly without importing typing.List, typing.Dict, etc.
In [1]: list[int]
Out[1]: list[int]
In [2]: type(list[int])
Out[2]: types.GenericAlias
"""
https://docs.python.org/3/reference/datamodel.html#classgetitem-versus-getitem
# Usually, the subscription of an object using square brackets will call the __getitem__() instance method
defined on the object's class. However, if the object being subscribed is itself a class,
the class method __class_getitem__() may be called instead. __class_getitem__()
should return a GenericAlias object if it is properly defined.
"""
In [3]: list.__class_getitem__(int)
Out[3]: list[int]
In [4]: type(list.__class_getitem__(int))
Out[4]: types.GenericAlias
In [5]: list.__getitem__(int)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[5], line 1
----> 1 list.__getitem__(int)
TypeError: descriptor '__getitem__' for 'list' objects doesn't apply to a 'type' object
Aliases to Built-in Types#
Deprecated Alias | Replacement |
---|---|
typing.Dict | dict |
typing.List | list |
typing.Set | set |
typing.FrozenSet | frozenset |
typing.Tuple | tuple |
typing.Type | type |
Aliases to Types in collections#
Deprecated Alias | Replacement |
---|---|
typing.DefaultDict | collections.defaultdict |
typing.OrderedDict | collections.OrderedDict |
typing.ChainMap | collections.ChainMap |
typing.Counter | collections.Counter |
typing.Deque | collections.deque |
Aliases to Other Concrete Types#
Deprecated Alias | Replacement |
---|---|
typing.Pattern | re.Pattern |
typing.Match | re.Match |
typing.Text | str |
typing.ByteString | collections.abc.Buffer or a union like bytes \| bytearray \| memoryview |
Aliases to Container ABCs in collections.abc#
Deprecated Alias | Replacement |
---|---|
typing.AbstractSet | collections.abc.Set |
typing.Collection | collections.abc.Collection |
typing.Container | collections.abc.Container |
typing.ItemsView | collections.abc.ItemsView |
typing.KeysView | collections.abc.KeysView |
typing.Mapping | collections.abc.Mapping |
typing.MappingView | collections.abc.MappingView |
typing.MutableMapping | collections.abc.MutableMapping |
typing.MutableSequence | collections.abc.MutableSequence |
typing.MutableSet | collections.abc.MutableSet |
typing.Sequence | collections.abc.Sequence |
typing.ValuesView | collections.abc.ValuesView |
Aliases to Asynchronous ABCs in collections.abc#
Deprecated Alias | Replacement |
---|---|
typing.Coroutine | collections.abc.Coroutine |
typing.AsyncGenerator | collections.abc.AsyncGenerator |
typing.AsyncIterable | collections.abc.AsyncIterable |
typing.AsyncIterator | collections.abc.AsyncIterator |
typing.Awaitable | collections.abc.Awaitable |
Aliases to Other ABCs in collections.abc#
Deprecated Alias | Replacement |
---|---|
typing.Iterable | collections.abc.Iterable |
typing.Iterator | collections.abc.Iterator |
typing.Callable | collections.abc.Callable |
typing.Generator | collections.abc.Generator |
typing.Hashable | collections.abc.Hashable |
typing.Reversible | collections.abc.Reversible |
typing.Sized | collections.abc.Sized |
Aliases to contextlib ABCs#
Deprecated Alias | Replacement |
---|---|
typing.ContextManager | contextlib.AbstractContextManager |
typing.AsyncContextManager | contextlib.AbstractAsyncContextManager |
Notes#
- Deprecated aliases are guaranteed to remain until at least Python 3.14.
- Type checkers may flag deprecated aliases for projects targeting Python 3.9+.
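As a small illustration of the replacements listed in the tables above, the sketch below annotates a function with built-in generics and collections.abc types instead of typing.Dict, typing.List, and typing.Iterable. The function name count_words and its sample data are made up for this example, and it assumes Python 3.9 or later.

```python
from collections import Counter
from collections.abc import Iterable


def count_words(lines: Iterable[str]) -> dict[str, int]:
    """Count whitespace-separated words across all lines."""
    counts: Counter[str] = Counter()
    for line in lines:
        counts.update(line.split())
    # Convert to a plain dict so the return value matches the annotation.
    return dict(counts)


# Built-in list and tuple are subscripted directly, no typing.List/Tuple needed.
top: list[tuple[str, int]] = sorted(count_words(["a b", "b c b"]).items())
print(top)
```

Accepting Iterable[str] rather than list[str] keeps the function usable with generators, files, and tuples.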
Sequence & Collection#
- collections.abc.Collection is a sized, iterable container with no guaranteed order. A Collection supports only __len__, __iter__, and __contains__, and does not support indexing or slicing.
- collections.abc.Sequence is a subclass of Collection and represents an ordered, indexable collection. A Sequence supports __getitem__() and __reversed__() in addition to the methods of Collection, so sequences can be indexed and sliced.
See Collections Abstract Base Classes to check all the methods available for each type.
Type | Sequence | Collection |
---|---|---|
str | ✅ | ✅ |
tuple | ✅ | ✅ |
list | ✅ | ✅ |
range | ✅ | ✅ |
set | ❌ | ✅ |
dict | ❌ | ✅ |
order | ✅ | ❌ |
indexing (having __getitem__() )(e.g., seq[0] ) | ✅ | ❌ |
Membership Checks (x in data ) | ✅ | ✅ |
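The rows of the table can be checked at runtime with isinstance() against the two abstract base classes. This is only a sketch: the helper functions first_item and has_item are hypothetical names used to show which annotation fits which operation.

```python
from collections.abc import Collection, Sequence

samples = {
    "str": "abc",
    "tuple": (1, 2),
    "list": [1, 2],
    "range": range(3),
    "set": {1, 2},
    "dict": {"a": 1},
}

for name, value in samples.items():
    print(f"{name:5} Sequence={isinstance(value, Sequence)!s:5} "
          f"Collection={isinstance(value, Collection)}")


def first_item(items: Sequence[int]) -> int:
    # Indexing is part of the Sequence interface, so this is type-safe.
    return items[0]


def has_item(items: Collection[int], wanted: int) -> bool:
    # Only len(), iteration, and membership are guaranteed for a Collection.
    return wanted in items
```

Annotating a parameter as Collection when only membership checks are needed lets callers pass a set or dict as well as a list.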
Type aliases#
From Mypy: Python 3.12 introduced the type statement for defining explicit type aliases. Explicit type aliases are unambiguous and can also improve readability by making the intent clear. The definition may contain forward references without string literal escaping because it is evaluated lazily, which also improves loading performance.
type AliasType = list[dict[tuple[int, str], set[int]]] | tuple[str, list[str]]
# Now we can use AliasType in place of the full name:
def f() -> AliasType:
    ...
A type alias cannot be used with isinstance()
Type variable#
From MyPy: Python 3.12 introduced new syntax for using type[C] together with a type variable that has an upper bound (see Type variables with upper bounds).
In the example below, we define a type variable U that is bound to the User parent class. This allows us to create a function that can return an instance of any subclass of User, while still providing type safety. See the fastapi-demo for a concrete example.
Here is the example using the legacy syntax (Python 3.11 and earlier):
from typing import TypeVar
U = TypeVar('U', bound=User)  # User is the parent class described above
def new_user(user_class: type[U]) -> U:
    # Same implementation as before
    ...
Now mypy will infer the correct type of the result when we call new_user() with a specific subclass of User:
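A self-contained sketch of that call pattern follows. The classes User, BasicUser, and ProUser and the upgrade_perks method are assumptions made up for this illustration; it uses the legacy TypeVar syntax so it also runs on Python versions before 3.12.

```python
from typing import TypeVar


class User:
    """Hypothetical parent class for this sketch."""


class BasicUser(User):
    pass


class ProUser(User):
    def upgrade_perks(self) -> str:
        return "priority support"


U = TypeVar("U", bound=User)


def new_user(user_class: type[U]) -> U:
    # Instantiate whichever User subclass was passed in.
    return user_class()


beginner = new_user(BasicUser)  # a type checker infers BasicUser
pro = new_user(ProUser)         # a type checker infers ProUser
print(pro.upgrade_perks())      # accepted, because pro is known to be a ProUser
```

Without the bound type variable, the return type would have to be User, and the call to upgrade_perks() would be rejected.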
Annotating __init__ methods#
From MyPy: It is allowed to omit the return type declaration on __init__ methods if at least one argument is annotated.
class C1:
    # No argument of __init__ is annotated,
    # so we should add the return type declaration
    def __init__(self) -> None:
        self.var = 42
class C2:
    # At least one argument of __init__ is annotated,
    # so the return type declaration may be omitted.
    # In most cases, we don't need to add it.
    def __init__(self, arg: int):
        self.var = arg
Postponed Evaluation of Annotations#
PEP 563 (Postponed Evaluation of Annotations), also known as the future annotations import, lets you use from __future__ import annotations to defer evaluation of type annotations until they're actually needed. Generally speaking, it turns every annotation into a string. This helps with:
- Forward references
- Circular imports
- Performance improvements
from __future__ import annotations must appear at the top of the file. Only the module docstring, comments, blank lines, and other future imports may come before it.
from __future__ import annotations
from pydantic import BaseModel
class User(BaseModel):
    name: str
    age: int
    friends: list[User] = []  # Forward reference works
# This works in Pydantic v2
user = User(name="Alice", age=30, friends=[])
from __future__ import annotations is not fully compatible with Pydantic
See this warning, this GitHub issue, and this issue for the compatibility problems between Pydantic and postponed evaluation of annotations. Because the future import turns annotations into strings, any library that evaluates them at runtime has to resolve those strings again. That can cause problems with the Python 3.10 union syntax (e.g., int | str), with the type[C] form used for type variables with upper bounds, and with other dynamically evaluated annotations. So it's preferable to avoid from __future__ import annotations where possible and to use string literal annotations for forward references and circular imports instead.
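As a sketch of the string literal alternative, the example below quotes only the annotations that refer to a class that is not fully defined yet. TreeNode and its root method are hypothetical names for this illustration; typing.get_type_hints() is the standard library call that resolves the strings on demand.

```python
from typing import Optional, get_type_hints


class TreeNode:
    def __init__(self, value: int, parent: "Optional[TreeNode]" = None) -> None:
        self.value = value
        self.parent = parent

    def root(self) -> "TreeNode":
        # Walk up the parent links until the top of the tree is reached.
        node = self
        while node.parent is not None:
            node = node.parent
        return node


raw = TreeNode.root.__annotations__
resolved = get_type_hints(TreeNode.root)
print(raw)       # {'return': 'TreeNode'}
print(resolved)  # the string has been resolved to the TreeNode class
```

Only the quoted annotations are stored as strings; value: int is still evaluated normally, which keeps the behavior predictable for libraries that inspect annotations at runtime.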
Import cycles#
From MyPy: If the cyclic import is only needed for type annotations:
# foo.py
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    import bar
def listify(arg: 'bar.BarClass') -> 'list[bar.BarClass]':
    return [arg]
# bar.py
from foo import listify
class BarClass:
    def listifyme(self) -> 'list[BarClass]':
        return listify(self)
SQLAlchemy also uses string literals for lazy evaluation and typing.TYPE_CHECKING for typing:
- from __future__ import annotations (PEP 563) turns every annotation into a string. It should be used with care.
- The TYPE_CHECKING import enables static type checking tools (MyPy, IDEs) to analyze types without affecting runtime behavior. For more details, see the SQLModel documentation.
- While from __future__ import annotations (PEP 563) allows direct usage of children: Mapped[List[Child]], the preferred approach is children: Mapped[List["Child"]]. The latter avoids potential compatibility issues with libraries like Pydantic while maintaining clear type hints.
- By using if TYPE_CHECKING:, we ensure the type checker recognizes children as a list of Child objects (even though it's written as the string "Child") while preventing circular imports at runtime.
- SQLAlchemy uses string literals (e.g., "Child") to reference models, allowing for lazy loading and avoiding circular dependencies.
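The TYPE_CHECKING pattern described in the list above can be tried without any third-party library. In this sketch the standard library decimal module stands in for a model module that would otherwise cause a circular import, and to_cents is a made-up function name.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by type checkers only; never executed at runtime.
    from decimal import Decimal


def to_cents(amount: "Decimal") -> int:
    # The string annotation is never evaluated, so Decimal need not be imported here.
    return int(amount * 100)


print(TYPE_CHECKING)           # False at runtime
print("Decimal" in globals())  # False: the guarded import did not run
```

The annotation has to stay a string (or be deferred some other way), because the name Decimal does not exist in this module at runtime.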
Callable and Protocol#
From MyPy: We can use Protocols to define callable types with a special __call__ member.
Callback protocols and Callable types can be used mostly interchangeably, but protocols are more flexible and can be used to define more complex callable types.
from collections.abc import Iterable
from typing import Protocol
class Combiner(Protocol):
    def __call__(self, *vals: bytes, maxlen: int | None = None) -> list[bytes]: ...
def batch_proc(data: Iterable[bytes], cb_results: Combiner) -> bytes:
    for item in data:
        ...
def good_cb(*vals: bytes, maxlen: int | None = None) -> list[bytes]:
    ...
def bad_cb(*vals: bytes, maxitems: int | None) -> list[bytes]:
    ...
batch_proc([], good_cb)  # OK
batch_proc([], bad_cb)   # Error! Argument 2 has incompatible type because of
                         # different name and kind in the callback
Protocol doesn't like isinstance()
Although the @runtime_checkable decorator allows using isinstance() to check if an object conforms to a Protocol, this approach has limitations and performance issues. Therefore, it's recommended to use Protocol exclusively for static type checking and avoid runtime isinstance() checks, at least until Python 3.13.
from typing import Protocol, runtime_checkable
@runtime_checkable
class Drawable(Protocol):
    def draw(self) -> None: ...
class Circle:
    def draw(self) -> None:
        print("Drawing a circle")
# This works but is not recommended
circle = Circle()
if isinstance(circle, Drawable):  # Avoid this pattern
    circle.draw()
# Preferred approach: rely on duck typing
def render(obj: Drawable) -> None:
    obj.draw()  # Type checker ensures obj has draw() method
render(circle)  # Type-safe without runtime checks
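One of the limitations mentioned above is that a runtime isinstance() check against a protocol only looks for the presence of the required members, not their signatures. The sketch below shows this with a made-up class named Misleading.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Drawable(Protocol):
    def draw(self) -> None: ...


class Misleading:
    # Same method name, but an incompatible signature.
    def draw(self, canvas: str, color: str) -> int:
        return 0


shape = Misleading()
# isinstance() only checks that a "draw" attribute exists,
# so this is True even though a static checker would reject Misleading.
print(isinstance(shape, Drawable))
```

Calling shape.draw() with no arguments would still fail with a TypeError, so the runtime check gives a false sense of safety here.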
Type narrowing for parameters in multi-type#
We know how to define parameters with union types such as a: int | str, but how can we help static type checkers understand which specific type a parameter has within if-else control flow?
Previously, we could simply use the isinstance() function. Python 3.13 introduced typing.TypeIs (PEP 742) for this purpose (use typing_extensions.TypeIs for Python versions prior to 3.13).
# https://mypy.readthedocs.io/en/stable/type_narrowing.html#type-narrowing-expressions
from typing import reveal_type
def function(arg: object):
    if isinstance(arg, int):
        # Type is narrowed within the ``if`` branch only
        reveal_type(arg)  # Revealed type: "builtins.int"
    elif isinstance(arg, str) or isinstance(arg, bool):
        # Type is narrowed differently within this ``elif`` branch:
        reveal_type(arg)  # Revealed type: "builtins.str | builtins.bool"
        # Subsequent narrowing operations will narrow the type further
        if isinstance(arg, bool):
            reveal_type(arg)  # Revealed type: "builtins.bool"
    # Back outside of the ``if`` statement, the type isn't narrowed:
    reveal_type(arg)  # Revealed type: "builtins.object"
# https://mypy.readthedocs.io/en/stable/type_narrowing.html#type-narrowing-expressions
from typing import TypeIs, reveal_type
def is_str(x: object) -> TypeIs[str]:
    return isinstance(x, str)
def process(x: int | str) -> None:
    if is_str(x):
        reveal_type(x)  # Revealed type is 'str'
        print(x.upper())  # Valid: x is str
    else:
        reveal_type(x)  # Revealed type is 'int'
        print(x + 1)  # Valid: x is int
In [6]: process(2)
Runtime type is 'int'
3
In [7]: process("2")
Runtime type is 'str'
2
Don't use TypeGuard: it narrows only in the if branch, not in the else branch. TypeIs narrows in both branches.
When to use TypeIs over isinstance()#
PEP 742 says: Python code often uses functions like isinstance() to distinguish between different possible types of a value. Type checkers understand isinstance() and various other checks and use them to narrow the type of a variable. However, sometimes you want to reuse a more complicated check in multiple places, or you use a check that the type checker doesn't understand. In these cases, you can define a TypeIs function to perform the check and allow type checkers to use it to narrow the type of a variable.
A TypeIs function takes a single argument and is annotated as returning TypeIs[T], where T is the type that you want to narrow to. The function must return True if the argument is of type T, and False otherwise. The function can then be used in if checks, just like you would use isinstance(). For example:
# https://peps.python.org/pep-0742/#when-to-use-typeis
from typing import TypeIs, Literal
type Direction = Literal["N", "E", "S", "W"]
def is_direction(x: str) -> TypeIs[Direction]:
    return x in {"N", "E", "S", "W"}
def maybe_direction(x: str) -> None:
    if is_direction(x):
        print(f"{x} is a cardinal direction")
    else:
        print(f"{x} is not a cardinal direction")
Stub files#
The Python standard library ships its type hints as stub files with the .pyi extension in the typeshed repo.
For third-party libraries, you can save stub files alongside your code in the same directory, or put them in a dedicated directory such as myproject/stubs and point mypy to it with the environment variable export MYPYPATH=~/work/myproject/stubs.
If a directory contains both a .py and a .pyi file for the same module, the .pyi file takes precedence. This way you can easily add annotations for a module even if you don't want to modify the source code. This can help you manually add type hints to third-party libraries that don't have them.
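To make the idea concrete, here is roughly what a hand-written stub could look like. The module name greeting and everything in it are hypothetical; the sketch assumes an untyped greeting.py exists next to the stub and exposes these names.

```python
# greeting.pyi (hypothetical stub for an untyped greeting.py)
# Stubs contain signatures only; bodies and default values are written as "...".

DEFAULT_NAME: str

def greet(name: str = ..., excited: bool = ...) -> str: ...

class Greeter:
    prefix: str
    def __init__(self, prefix: str) -> None: ...
    def greet(self, name: str) -> str: ...
```

A type checker reads these signatures instead of the untyped source, while the interpreter keeps importing the real greeting.py at runtime.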
Generating stub files#
Mypy also ships with two tools for making it easier to create and maintain stubs: Automatic stub generation (stubgen) and Automatic stub testing (stubtest).
# default output dir is: out, use -o to change it
stubgen my_pkg_dir -o stubs
# default output dir is: typings
pyright --createstub my_pkg_dir
A common problem with stub files is that they tend to diverge from the actual implementation. Mypy includes the stubtest tool that can automatically check for discrepancies between the stubs and the implementation at runtime.
Generics#
list, set, dict, and the other built-in collection classes are all generic types: they accept one or more type arguments within [...], which can be arbitrary types. For example, the type dict[int, str] has the type arguments int and str, and list[int] has the type argument int.
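User-defined classes can be generic in the same way. The following is a minimal sketch using the legacy Generic and TypeVar spelling, so it also runs before Python 3.12; the Stack class is a made-up example and not part of any library.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Stack(Generic[T]):
    """A minimal generic stack; T is the element type."""

    def __init__(self) -> None:
        self._items: list[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()


numbers = Stack[int]()   # type argument int, just like list[int]
numbers.push(1)
numbers.push(2)
print(numbers.pop())     # 2
# numbers.push("three")  # a type checker would report an error here
```

The type argument only matters to the type checker; at runtime Stack[int]() creates an ordinary Stack instance.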
Type variables with value restriction#
# Python 3.12+ syntax
def concat[S: (str, bytes)](x: S, y: S) -> S:
    return x + y
concat('a', 'b')    # Okay
concat(b'a', b'b')  # Okay
concat(1, 2)        # Error!
# Legacy syntax (Python 3.11 and earlier)
from typing import TypeVar
S = TypeVar('S', str, bytes)
def concat(x: S, y: S) -> S:
    return x + y
concat('a', 'b')    # Okay
concat(b'a', b'b')  # Okay
concat(1, 2)        # Error!
Annotating decorators#
https://mypy.readthedocs.io/en/stable/generics.html#declaring-decorators
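As a rough sketch of one approach to typing a decorator, the example below uses a type variable bound to a callable plus cast(), so the decorated function keeps its original signature in the eyes of the type checker. The names logged and add are made up for this illustration; ParamSpec is another option on Python 3.10 and later.

```python
import functools
from collections.abc import Callable
from typing import Any, TypeVar, cast

F = TypeVar("F", bound=Callable[..., Any])


def logged(func: F) -> F:
    """Print each call, then delegate to the wrapped function."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)

    # cast() tells the type checker that wrapper has the same signature as func.
    return cast(F, wrapper)


@logged
def add(x: int, y: int) -> int:
    return x + y


print(add(2, 3))  # a type checker still sees add as (int, int) -> int
```

The trade-off is that the body of wrapper itself is not checked against the signature of func, since it accepts arbitrary arguments.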
Overloading#
Use @overload to let type checkers know that a function can accept different types of arguments and return different types based on those arguments.
If there are multiple equally good matching variants (overloaded functions), mypy will select the variant that was defined first.
Always put the most specific overload first: https://mypy.readthedocs.io/en/stable/more_types.html#type-checking-the-variants
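A minimal sketch of @overload follows, with a made-up function named double. The two variants here do not overlap, so their order does not matter; with overlapping variants, the more specific one would be listed first.

```python
from typing import Union, overload


@overload
def double(value: int) -> int: ...
@overload
def double(value: str) -> str: ...
def double(value: Union[int, str]) -> Union[int, str]:
    # Single runtime implementation shared by both overloads.
    return value * 2


print(double(21))    # a type checker infers int
print(double("ab"))  # a type checker infers str
```

Only the final, undecorated definition exists at runtime; the @overload variants are there for the type checker alone.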
Typing tools#
MyPy#
Ref. MyPy in this post.
While MyPy may not be the most performant type checker, particularly when integrated into pre-commit hooks, it remains an invaluable learning resource. The MyPy documentation provides comprehensive guidance on writing effective type hints. Understanding its development history and current maintainership adds valuable context to its role in the Python ecosystem. This post is mainly based on the MyPy documentation.
Pyright && Pylance#
Ref. Pyright in this post.
Pylance is the Microsoft backed Pyright extension for VSCode.
RightTyper#
During an internal tech demo at work, I heard about RightTyper, a Python tool that generates type annotations for function arguments and return values. It's important to note that RightTyper doesn't statically parse your Python files to add types; instead, it needs to run your code to detect types on the fly. So, one of the best ways to use RightTyper is with python -m pytest, assuming you have good test coverage.
ty#
ty represents the next generation of Python type checking tools. Developed by the team behind the popular ruff linter, ty is implemented in Rust for exceptional performance. It functions both as a type checker and language server, offering seamless integration through its dedicated VSCode extension ty-vscode.
While Ruff excels at various aspects of Python linting, type checking remains outside its scope. ty aims to fill this gap, though it's currently in preview and still evolving toward production readiness. The combination of Ruff and ty promises to provide a comprehensive Python code quality toolkit.
pyrefly#
pyrefly emerges as another promising entrant in the Python type checking landscape. Developed by Meta and also written in Rust, pyrefly offers both type checking capabilities and language server functionality. While still in preview, it demonstrates the growing trend of high-performance Python tooling implemented in Rust.
The tool integrates smoothly with modern development environments through its VSCode extension refly-vscode, making it accessible to a wide range of developers. Its backing by Meta suggests potential for robust development and long-term support.
ty vs pyrefly#
pyrefly has a very similar output format to ty, but after a quick test, it seems to generate more alerts than ty for the same codebase. That doesn't mean pyrefly is more powerful or stricter. Sometimes it just generates more false positives, as it's still in preview.
In a test on a single generic type, both ty and pyrefly report the same type errors, but ty points to the exact position of the error, while pyrefly points to a less precise location.
Test code:
mypy_demo.py | |
---|---|
Test results: