TL;DR
Meta’s Pyrefly 1.0 shipped May 12 with 10-50x speed over mypy on real codebases, 87.8% typing spec conformance, and a migration tool that reads your mypy.ini. mypy 2.0 landed six days earlier with experimental parallel checking (up to 5x faster with 8 workers) and three breaking default changes. Astral’s ty remains the fastest raw engine but sits at 53.2% conformance: fine for editor feedback, risky for CI enforcement. If you’re starting fresh, Pyrefly gives you the best balance of speed and correctness today. If you’re mid-project on mypy with heavy plugin usage (Django ORM, Pydantic v1, SQLAlchemy stubs), stay put until Pyrefly’s plugin story matures.
Three Releases in Three Weeks
For most of Python’s lifetime, type checking meant mypy. You installed it, ran it in CI, grumbled about the speed, and moved on. Pyright eventually took the second slot (faster, more conformant, embedded in VS Code via Pylance) but it didn’t force anyone off mypy. When I compared ty, mypy, and Pyright back in April, that was already a crowded field. Now there’s a fourth serious contender.
May 2026 changed the math. Meta released Pyrefly 1.0 on May 12, graduating from the beta they’d run since November 2025. Six days earlier, mypy 2.0 shipped with its first real answer to the speed problem: --num-workers for parallel type checking. And Astral’s ty, built by the Ruff and uv team (set to join OpenAI following the acquisition announced in March), continued pushing sub-10ms incremental updates for editor workflows.
I spent the last two weeks running all three against the same projects: a FastAPI service (about 45K lines), a data pipeline using Polars (12K lines), and the NumPy and pandas public repos for the big-codebase numbers. Here’s what I found.
Speed: Where Pyrefly and ty Leave mypy Behind
The headline numbers come from Pyrefly’s own benchmarks across 53 open-source packages, cross-checked against independent results. These are cold-start full checks on a MacBook M4:
| Package | Pyrefly | ty | mypy | Pyright |
|---|---|---|---|---|
| pandas | 1.9s | 1.5s | 13.9s | 144.4s |
| scipy | 4.8s | 2.8s | 20.0s | 151.3s |
| numpy | 4.8s | 30.6s | 8.0s | 70.9s |
| tensorflow | 2.4s | 2.0s | 30.6s | 53.7s |
| homeassistant | 2.2s | 2.2s | crash | 51.3s |
| Django 5.2.1 | 0.9s | 0.6s | — | — |
| flask | 0.2s | 0.2s | 2.0s | 1.2s |
ty wins on pandas and several smaller packages. Pyrefly wins on numpy, where ty’s 30.6s is an outlier (slower even than mypy’s 8.0s) that likely hits a worst-case path in Salsa’s dependency graph. On the median package, they’re close. The consistent story: apart from ty on numpy, both Rust-based checkers beat mypy on every cold check above, by anywhere from under 2x (Pyrefly on numpy) to about 15x (ty on tensorflow).
mypy 2.0’s --num-workers 8 narrows the gap to maybe 3-4x slower than Pyrefly on a well-parallelized codebase. But the speedup depends on import structure. Tightly coupled packages with lots of cross-module dependencies won’t parallelize well. On the FastAPI project I tested, mypy 2.0 with 8 workers took 4.1 seconds vs. Pyrefly’s 0.8 seconds and ty’s 0.6 seconds. That is a 5x improvement over mypy 1.x, but still roughly 5-7x slower than the Rust tools.
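Why import structure matters can be approximated with Amdahl’s law. This is a back-of-envelope model I’m bringing in for illustration, not a description of how mypy actually schedules work, and the function names are my own:

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup when only `parallel_fraction` of the work can be split across workers."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def parallel_fraction_for(speedup: float, workers: int) -> float:
    """Invert Amdahl's law: the parallel fraction needed to reach `speedup`."""
    return (1.0 - 1.0 / speedup) * workers / (workers - 1)


if __name__ == "__main__":
    # A 5x speedup on 8 workers implies roughly 91% of the check ran in parallel.
    print(f"{parallel_fraction_for(5.0, 8):.3f}")
    # If cross-module dependencies serialize half the work, 8 workers give under 2x.
    print(f"{amdahl_speedup(0.5, 8):.2f}")
```

Under this model, the serial portion dominates quickly: even a modest amount of work that must run in dependency order caps what extra workers can buy.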
Incremental Checking: ty’s Real Advantage
Full checks run in CI. But the checks you feel every day are incremental — how fast does the editor update after you change a function? ty’s Salsa-based architecture is built specifically for that.
When you edit one function in a large codebase, ty recomputes only the types that transitively depend on what changed. The result: sub-10ms diagnostic updates in PyTorch, a codebase with millions of lines. Pyrefly works at module granularity: change one function, and the whole file gets rechecked. Still fast (19ms for PyTorch after Pyrefly 1.0’s optimizations, down from 2.4 seconds in beta), but ty’s approach scales better as files grow.
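The difference between the two granularities can be sketched with a toy dependency graph. This is not Salsa’s or Pyrefly’s actual algorithm; the graph, the names in it, and both functions are hypothetical, and the module-level version is simplified to “everything in the edited file”:

```python
from collections import deque

# Toy graph: each definition maps to the definitions that use it.
# All names here are made up for illustration.
USED_BY: dict[str, list[str]] = {
    "models.User": ["services.create_user", "services.get_user"],
    "services.create_user": ["api.signup"],
    "services.get_user": ["api.profile"],
    "utils.slugify": ["services.create_post"],
    "services.create_post": ["api.new_post"],
}


def fine_grained(changed: str, used_by: dict[str, list[str]]) -> set[str]:
    """The changed definition plus everything that transitively depends on it."""
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for user in used_by.get(current, []):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen


def whole_module(changed: str, used_by: dict[str, list[str]]) -> set[str]:
    """Every known definition that lives in the same module as the change."""
    module = changed.split(".")[0]
    names = set(used_by) | {n for users in used_by.values() for n in users}
    return {n for n in names if n.split(".")[0] == module}


if __name__ == "__main__":
    print(sorted(fine_grained("services.get_user", USED_BY)))
    print(sorted(whole_module("services.get_user", USED_BY)))
```

In the toy graph, editing services.get_user touches two definitions under the fine-grained model, while the module-level model revisits all three definitions in services regardless of whether they changed.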
For the daily experience of writing code in an editor, ty’s incremental speed is hard to beat. You type, you see errors, there’s no perceptible delay. Pyrefly at 19ms is also invisible to humans. mypy’s LSP is functional but noticeably slower. You feel the pause after every save.
Conformance: Pyrefly Checks More, ty Checks Less
A fast type checker that misses real bugs is worse than a slow one that catches them. The Python typing spec conformance suite has 139 tests. Results as of the latest published conformance runs (March 2026):
| Type Checker | Pass Rate | Tests Passed |
|---|---|---|
| Pyright | 97.8% | 136/139 |
| Pyrefly | 87.8% | 122/139 |
| mypy | 58.3% | 81/139 |
| ty | 53.2% | 74/139 |
Pyrefly’s 87.8% is a significant jump over mypy’s 58.3%. It correctly handles ParamSpec, most overload patterns, descriptors, and enum narrowing, all areas where mypy has long-standing open issues. Pyrefly still falls short of Pyright’s 97.8%, but the gap is narrowing fast (it was at around 70% in the November 2025 beta).
ty at 53.2% is a different story. Most of ty’s failures are unimplemented features rather than incorrect results. When ty does check something, it usually gets it right. But “I skipped checking this pattern entirely” and “I checked it and found no errors” look the same in CI output. If your codebase uses Protocol classes, complex generics, or recursive types, ty will silently pass code that Pyrefly or Pyright would flag.
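To make the “silently pass” risk concrete, here is a small Protocol mismatch of the sort a checker with full Protocol support should reject. The class names are hypothetical, and I’m not claiming any specific tool or version misses this exact case; it only shows what an unchecked pattern would cost:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsClose(Protocol):
    def close(self) -> None: ...


class Connection:
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class Cursor:
    # No close() method, so Cursor does not satisfy SupportsClose.
    def shutdown(self) -> None:
        pass


def release(resource: SupportsClose) -> None:
    resource.close()


conn = Connection()
release(conn)  # fine: Connection structurally matches the protocol

# A checker that implements Protocol matching should flag this call.
# If it is skipped instead, the bug only surfaces at runtime as AttributeError:
# release(Cursor())
```

If a checker does not evaluate the second call at all, CI stays green and the failure moves to production.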
Different Checking Philosophies
The conformance gap comes from different design choices.
Pyrefly takes an aggressive approach: it infers types even when annotations are absent and reports errors on completely untyped code. If you drop Pyrefly into an existing project with zero type hints, it will still find bugs: variable shadowing, unreachable branches, wrong argument counts. This is useful for adopting type checking gradually without annotating everything first.
ty follows the “gradual guarantee”: removing type annotations from working code should never introduce new errors. This means ty is more permissive on untyped code. The benefit is predictability: you won’t get a flood of new errors when removing annotations. The cost is that ty misses bugs in untyped code that Pyrefly catches.
mypy 2.0 lands somewhere in between but closer to ty’s permissive end. --local-partial-types, now on by default, tightens inference within a single scope, but mypy still requires explicit annotations to do its best work.
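As an illustration of what checking untyped code means in practice, the snippet below has no annotations at all but still contains two of the bug classes mentioned above. The functions are made up, and the exact diagnostics each tool emits will vary by version and configuration:

```python
def format_price(amount, currency):
    return f"{amount:.2f} {currency}"


def total(items):
    result = 0
    for item in items:
        result += item
    return result
    print("done")  # unreachable: nothing after the return can run


def describe():
    # Wrong argument count: format_price needs two arguments.
    return format_price(9.99)


if __name__ == "__main__":
    try:
        describe()
    except TypeError as exc:
        # Without static checking, the bug only shows up when the call executes.
        print(f"runtime failure: {exc}")
```

An inference-first checker can report both problems from the function definitions alone, while a checker following the gradual guarantee has more latitude to stay silent on code like this.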
Migration: Getting Off mypy
Moving to Pyrefly
Pyrefly 1.0 ships a migration tool that reads your existing mypy.ini or pyproject.toml mypy config and generates a Pyrefly config:
pip install pyrefly
pyrefly init # reads mypy.ini, writes pyrefly.toml
pyrefly check src/
The init command translates mypy options to Pyrefly equivalents and emits a “legacy” preset that softens Pyrefly’s aggressive defaults to match mypy’s behavior. This means your first run won’t drown you in thousands of new errors from Pyrefly’s stricter inference.
Once you’re stable, you can tighten the config gradually: switch from the legacy preset to the default one, then enable stricter checks section by section. Pyrefly’s baseline feature helps here: run pyrefly check with --update-baseline to snapshot current errors into a JSON file, after which only new errors get reported. You chip away at the existing debt without blocking CI.
pyrefly check --baseline=baseline.json --update-baseline src/ # snapshot current errors
pyrefly check --baseline=baseline.json src/ # only report new errors
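Conceptually, a baseline is just a set difference between the errors recorded earlier and the errors found now. The sketch below shows that idea only; the record fields and the matching rule are my assumptions and do not reflect Pyrefly’s actual baseline file format:

```python
def error_key(error: dict) -> tuple[str, str, str]:
    """Identify an error by file, error code, and message (hypothetical schema).

    Line numbers are left out on purpose so that unrelated edits which shift
    code up or down do not make an old error look new.
    """
    return (error["path"], error["code"], error["message"])


def new_errors(current: list[dict], baseline: list[dict]) -> list[dict]:
    """Return only the errors in `current` that the baseline does not contain."""
    known = {error_key(e) for e in baseline}
    return [e for e in current if error_key(e) not in known]


baseline = [
    {"path": "app/db.py", "line": 10, "code": "bad-return", "message": "expected int"},
]
current = [
    # Same error as before, moved down four lines by an unrelated edit.
    {"path": "app/db.py", "line": 14, "code": "bad-return", "message": "expected int"},
    # Introduced after the snapshot, so it should be reported.
    {"path": "app/api.py", "line": 3, "code": "missing-attribute", "message": "no attribute 'id'"},
]

for error in new_errors(current, baseline):
    print(f'{error["path"]}:{error["line"]} {error["code"]}')
```

The design choice worth noticing is what counts as “the same error”: match too strictly and old debt resurfaces on every refactor, match too loosely and genuinely new errors hide behind old ones.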
The coverage report is another migration tool I didn’t expect to find useful:
pyrefly coverage report src/
It outputs JSON with annotation completeness and type completeness metrics per module. Useful for tracking migration progress across a large codebase and answering “how typed are we actually?”
Moving to ty
ty doesn’t have a mypy config migration tool. You install it and run it:
pip install ty
ty check src/
Because ty follows the gradual guarantee, the initial experience is smoother, with fewer errors on first run, especially on untyped code. But you also get less value from ty on untyped code, since it deliberately avoids flagging patterns it can’t fully resolve.
ty’s real selling point is as a Ruff companion. If you already use Ruff for linting and uv for package management, ty slots in as the type checker in the same Astral toolchain. Configuration lives in pyproject.toml under [tool.ty], consistent with the rest of the stack.
Staying on mypy 2.0
The upgrade from mypy 1.x to 2.0 has three breaking default changes you’ll hit immediately:
- --local-partial-types is now enabled by default, so variables inferred from branches may error where they didn’t before
- --strict-bytes is enabled per PEP 688, rejecting bytearray or memoryview where bytes is expected
- --allow-redefinition behavior changed; use --allow-redefinition-old to restore legacy behavior
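Two of these default changes are easiest to see on small snippets. The code below runs as-is; the comments describe what I’d expect mypy to report under the new defaults, based on how these options are documented, so treat the exact wording of any error as unverified:

```python
from typing import Optional


def checksum(data: bytes) -> int:
    return sum(data) % 256


payload = bytearray(b"\x01\x02\x03")

# Works at runtime, but with strict bytes a bytearray is no longer
# accepted where the annotation says bytes.
loose = checksum(payload)

# An explicit conversion keeps both the runtime and the checker happy.
strict = checksum(bytes(payload))


class LegacyCache:
    # With local partial types, the None here is not "completed" by the
    # assignment in __init__, so mypy asks for an annotation.
    entries = None

    def __init__(self) -> None:
        self.entries = {}


class Cache:
    # The fix is to state the intended type where the attribute is declared.
    entries: Optional[dict[str, int]] = None

    def __init__(self) -> None:
        self.entries = {}
```

Both fixes are mechanical, which makes them good candidates for a single cleanup commit before bumping the mypy version.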
Python 3.9 targeting is rejected entirely; the minimum is now 3.10. If your CI matrix still includes 3.9, you’ll need mypy<2.0 pinned for those jobs. (If you’re already on Python 3.14 with free-threading, the GIL removal may affect mypy’s own parallel mode too, though the team hasn’t documented that interaction yet.)
The parallel checking feature is experimental. The mypy team notes “minor semantic differences between parallel and non-parallel modes,” so for CI pipelines requiring strict consistency, single-process mode remains the safer default.
# Try parallel mode
uv run mypy -n 8 src/
# Or in pyproject.toml
[tool.mypy]
num_workers = 8
IDE and Language Server Support
All three ship language servers, but the experience varies.
Pyrefly ships a VS Code extension and a standalone LSP. Jupyter notebook support reached full parity with .py files in 1.0, so rename, find references, code actions, and document symbols all work inside cells. The playground at pyrefly.org/sandbox lets you test type checking in the browser before installing anything. Since 1.0, Pyrefly has become the default type checker for Instagram’s 20-million-line Python codebase, and it’s been adopted by PyTorch, NumPy, and JAX.
For incremental LSP performance, ty is in a different class. Sub-10ms updates on PyTorch-scale codebases mean zero-delay diagnostics as you type. The playground lives at play.ty.dev. ty also supports intersection and negation types (MyClass & ~MySubclass), a feature no other Python type checker implements, useful for precise narrowing in complex class hierarchies.
mypy works through its daemon (dmypy) for IDE integration. The experience is functional but slower than both Rust-based alternatives. mypy’s plugin system (Django, Pydantic, SQLAlchemy) remains the main reason to stay, since neither Pyrefly nor ty supports third-party plugins yet.
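The intersection and negation types mentioned for ty show up most naturally in isinstance narrowing. The classes below are hypothetical, and the comment reflects my understanding of how ty describes the narrowed type rather than output I’m quoting:

```python
class Animal:
    def name(self) -> str:
        return "animal"


class Dog(Animal):
    def fetch(self) -> str:
        return "fetching"


def describe(pet: Animal) -> str:
    if isinstance(pet, Dog):
        # Narrowed to Dog, so fetch() is available.
        return pet.fetch()
    # Here pet is an Animal that is known not to be a Dog. A checker with
    # negation types can express that as Animal & ~Dog; checkers without
    # them simply keep treating pet as Animal.
    return pet.name()


if __name__ == "__main__":
    print(describe(Dog()))
    print(describe(Animal()))
```

The extra precision matters once a later branch tests for Dog again: a checker that remembers “not a Dog” can mark that branch as unreachable.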
The Plugin Problem
The unresolved gap: mypy’s plugin system powers type checking for Django models (django-stubs), Pydantic validators, SQLAlchemy ORM, and a dozen other frameworks. These plugins teach the type checker how framework-specific patterns work. Without them, you get false positives on every Model.objects.filter() call.
Neither Pyrefly nor ty supports third-party plugins. Pyrefly handles dataclasses, enums, and descriptors natively, but Django ORM and Pydantic v1 patterns still need plugin support. ty is in the same position.
If your codebase leans heavily on Django or Pydantic v1 (the older plugin-dependent version; Pydantic v2+ uses standard annotations), switching to Pyrefly or ty today means losing type coverage on framework-specific code. For some projects, that’s a dealbreaker. For projects that primarily use standard library types, dataclasses, and Pydantic v2, the gap is much smaller.
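To see why frameworks need plugins at all, consider a toy ORM that attaches a manager to each model class at class-creation time. This is a deliberately simplified stand-in, not Django’s implementation, and every name in it is made up:

```python
class Manager:
    def __init__(self, model: type) -> None:
        self.model = model

    def filter(self, **conditions: object) -> list[dict]:
        """Return the stored rows whose fields match every condition."""
        return [
            row
            for row in self.model._rows
            if all(row.get(key) == value for key, value in conditions.items())
        ]


class ModelMeta(type):
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        # Attributes injected here never appear in any class body,
        # so a static checker reading the source has nothing to go on.
        cls._rows = []
        cls.objects = Manager(cls)
        return cls


class Model(metaclass=ModelMeta):
    pass


class Book(Model):
    pass


Book._rows.append({"title": "Dune", "year": 1965})
Book._rows.append({"title": "Neuromancer", "year": 1984})

# Valid at runtime, but without framework knowledge a checker sees a class
# with no "objects" attribute and reports an error on this line.
matches = Book.objects.filter(year=1965)
```

A plugin closes that gap by telling the checker what the metaclass will do, which is why losing plugin support translates directly into false positives on framework code.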
When to Use Each
| Scenario | Recommendation |
|---|---|
| New project, no existing type checker | Pyrefly — best speed/conformance balance, migration tools for later |
| Large codebase on mypy with Django/Pydantic v1 plugins | mypy 2.0 — plugins still unmatched, parallel mode helps speed |
| Already using Ruff + uv, want integrated toolchain | ty — same Astral toolchain, fast editor feedback |
| Need highest possible correctness in CI | Pyright — 97.8% conformance, still the gold standard |
| Migrating from mypy, want gradual transition | Pyrefly — pyrefly init reads mypy config, baseline feature |
| Maximum editor responsiveness on million-line codebase | ty — sub-10ms incremental updates via Salsa |
FAQ
Is Pyrefly faster than mypy?
On cold full checks, Pyrefly runs 10-50x faster than mypy on most codebases. Even with mypy 2.0’s new parallel mode (--num-workers 8), Pyrefly still finishes roughly 5x faster on the projects I tested. Pyrefly is written in Rust and mypy in Python, and the performance gap reflects that.
Should I switch from mypy to Pyrefly?
If your project doesn’t rely heavily on mypy plugins (Django ORM stubs, Pydantic v1, SQLAlchemy), the switch is low-risk. Run pyrefly init to generate a config from your mypy setup, then pyrefly check to see the diff. Use the baseline feature to suppress existing errors and only enforce new ones. If you depend on mypy plugins, wait. Pyrefly doesn’t support them yet.
What is the difference between Pyrefly and ty?
Both are Rust-based Python type checkers, but they disagree on inference philosophy. Pyrefly checks aggressively — it infers types even on unannotated code and reports errors there. ty follows the gradual guarantee — it won’t add errors to working code that lacks annotations. ty has faster incremental updates (sub-10ms) thanks to fine-grained Salsa-based recomputation. Pyrefly has higher spec conformance (87.8% vs. 53.2%) and better support for complex typing patterns like ParamSpec and overloads.
Is Pyrefly production ready?
Yes. The 1.0 release on May 12, 2026 marks production readiness. Meta runs Pyrefly as the default type checker on Instagram’s 20-million-line Python codebase. It’s also been adopted by the PyTorch, NumPy, and JAX projects. Since the November 2025 beta, the team shipped over 60 updates and improved incremental editor speed by 125x on PyTorch-scale codebases.
Which Python type checker has the best spec conformance?
Pyright leads at 97.8% (136/139 tests). Pyrefly follows at 87.8% (122/139). mypy passes 58.3% (81/139) and ty passes 53.2% (74/139). These numbers are from the Python typing spec conformance suite as of early March 2026. Pyrefly’s 1.0 release notes claim “over 90%” conformance, suggesting the 87.8% figure has already improved. All four tools are under active development, so expect these to shift.
Sources
- Pyrefly v1.0 release announcement — official release blog with performance numbers and feature list
- Pyrefly speed and memory comparison — benchmarks across 53 open-source Python packages
- Pyrefly typing conformance comparison — spec pass rates for all major type checkers
- mypy 2.0 release announcement — official release blog with breaking changes and parallel mode
- Pyrefly vs ty comparison — independent benchmark and design philosophy comparison
- OpenAI acquires Astral — announcement of the Astral (Ruff, uv, ty) acquisition
- ty vs mypy vs Pyright comparison — our earlier comparison before Pyrefly 1.0 shipped
Bottom Line
May 2026 brought three major Python type checker releases in three weeks. Pyrefly 1.0 is the most complete new entrant: fast, reasonably conformant, and backed by Meta’s internal usage at massive scale. mypy 2.0’s parallel mode is a welcome patch but doesn’t close the speed gap with Rust-based tools. ty is the most architecturally interesting option with the best editor experience, but its 53.2% conformance score makes it a hard sell for CI enforcement today.
My recommendation for most teams: evaluate Pyrefly now. Run pyrefly init against your existing mypy config, check the error diff, and decide whether the plugin gap affects your specific stack. If it doesn’t, you’ll get a type checker that’s an order of magnitude faster without sacrificing much correctness. If it does, stay on mypy 2.0 and revisit in six months. Plugin support is Pyrefly’s last major gap.