- Notifications
You must be signed in to change notification settings - Fork10
The codemod Airtable used when migrating from Flow to TypeScript
License
Airtable/typescript-migration-codemod
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
The codemod written to migrate the Airtable codebase from Flow to TypeScript.
This codemod was open sourced as a part of publishing the blog post “The continual evolution of Airtable’s codebase: Migrating a million lines of code to TypeScript.” If you’re interested in learning more about how Airtable migrated from Flow type TypeScript, we recommend reading that blog post.
⚠️ This codemod was designed for one-time use against Airtable’s codebase! If you want to run it against your codebase, you’ll need to clone the git repo and do some manual tuning.
There are two migration scripts:
src/modules/run.ts
: For converting CommonJS modules (require()
,module.exports
) to ES Modules (import
,export
).src/typescript/run.ts
: For converting Flow to TypeScript.
Both codemods use a custom setup where multiple Node.js workers process files in parallel. Practically speaking,jscodeshift is a great tool which abstracts away parallel codemod workers for you. We chose to write a thin wrapper for more control.
Whenever the codemod detects a case it can’t automatically transform, it reports the file, line, and column to the console. We were able to use this reporting to manually fix many odd patterns in our codebase to help the codemod.
The TypeScript migration (src/typescript/run.ts
) usesflow type-at-pos
(docs) to discover what type Flow thinks a variable is when a developer did not add an annotation.
What follows is an edited version of our internal documentation which shares, at a high level, the different classes of changes made to the codebase.
This document does not cover the CommonJS to ES Module migration.
(Note: You’ll see references to anh
andu
namespace in a couple places throughout this document. These names are placeholders for actual namespaces at Airtable.)
As a part of the TypeScript migration, a lot of code will change. This document outlines the large classes of changes both automated and manual. The idea is we as an engineering team can review this document instead of reviewing the entire migration diff which would be impossible. There are a couple common goals with all the changes:
- Don’t break the product. Always choose to preserve code semantics over anything else.
- Don’t reduce type safety. On net, the migration shouldincrease type safety. Any large class of automated or manual change will always preserve or increase type safety.
- Keep it simple. Migrating requires a lot of changes so keep the changes simple. You should be able to reason about a change in the context of a single file. We can always follow up with smaller migrations for more complex transforms.
Even if some changes eventually become Airtable anti-patterns, they were motivated by these goals so we could safely and timely ship a TypeScript migration.
There are two categories of changes.Automated changes which were done by a codemod andManual changes which were done by hand when a computer couldn’t automatically fix them.
You can find the codemod which makes the automated changes in this repo. The function which does all the transformation issrc/typescript/migrate_to_typescript.ts
. The codemod spins up a bunch of processes which transform files in parallel using Babel to parse,Recast to print, and Babel utilities to traverse the AST. This is a similar architecture tojscodeshift but hand-rolled to provide custom file discovery logic and to be more thorough with AST transformations.
All files with an// @flow
annotation will be converted to TypeScript. The file extension will be changed to.ts
or.tsx
if the file has JSX in it. It’s worth considering if, as part of the Airtable style guide, whether we should name all files with a.tsx
extension for consistency.
mixed
→unknown
empty
→never
Object
→FlowAnyObject
TheObject
type is just an alias forany
in Flow. There’s a little bit of unimportant history to this type, what’s important is that it’sany
now. The codemod will transform it toFlowAnyObject
instead ofany
directly since a type alias preserves the author’s intent.Function
→FlowAnyFunction
Same story asObject
, it is an alias forany
in Flow.$ReadOnlyArray<T>
→ReadonlyArray<T>
$Keys<T>
→keyof T
$Values<T>
→h.ObjectValues<T>
We use a utility type defined in theh
namespace which is implemented asT[keyof T]
. We don’t transform to that directly because it would require writingT
twice which is annoying ifT
is a really long name.$Shape<T>
→Partial<T>
$PropertyType<T, K>
→T[K]
$ElementType<T, K>
→T[K]
The difference between$PropertyType
and$ElementType
in Flow is that$PropertyType
requires a single static string key, so$PropertyType<O, 'foo' | 'bar'>
is not allowed. They use slightly different code paths in Flow’s codebase. The difference is mostly for historical reasons and kinda annoying.*
→FlowAnyExistential
Flowexistential types are deprecated. They basically behave asany
. There’s more context and history here for those interested.$Subtype<T>
→any
Also adeprecated type. Should probably add an alias likeFlowAnyObject
.React.Node
→React.ReactNode
React.Element<P>
→React.ReactElement<React.ComponentProps<P>>
TypeScript’s React types accept a component type inReact.ReactElement
instead of the props type like Flow.
This change is pretty important (since it happens a lot) and a bit unfortunate. Flow supports any type as the indexer of an object. For example{[key: UserId]: string}
. However, TypeScript only supportsstring
ornumber
index keys and does not allow types that are literally aliases forstring
ornumber
. So even though in our codebase ID types are written astype UserId = string
we can’t use them as an indexer type in TypeScript.
So the codemod adds a utility typeh.ObjectMap<K, V>
which allows you to writeh.ObjectMap<UserId, string>
. That will preserve the developer’s intent with the Flow type{[UserId]: string}
.h.ObjectMap
has a simple implementation using a TypeScript feature calledmapped types:type ObjectMap<K, V> = {[_K in K]: V}
. You can also useh.ObjectMap
if you have a string union (like an enum type) soh.ObjectMap<AccessPolicy, T>
. This is a bit different than Flow since TypeScript will make all of the properties required. If you’d like the properties to be optional you can usePartial<h.ObjectMap<AccessPolicy, T>>
which desugars to{[P in AccessPolicy]?: T}
.
TypeScript actually includes a utility type with the same definition ash.ObjectMap
calledRecord
. Given our domain model at Airtable, I figured using the nameRecord
would be too confusing for developers.
A lot of places usingh.ObjectMap
in Hyperbase could be using an ES6map. An ES6 map is better for type checking and better for performance. If we end up making our ID types nominal (soRowId
can’t be passed somewhere expecting aColumnId
) thenh.ObjectMap
might not work but an ES6 map would work.
We as an engineering team have to make a couple decisions:
- Should we discourage use of
h.ObjectMap
and recommend directly writing a mapped type ({[_K in K]: V
) or directly writing an indexer type ({[key: string]: V}
)? - Should we encourage replacing usages of
h.ObjectMap
with an ES6 map? - Should we name
h.ObjectMap
something longer to discourage people from using it?
Object types with a spread like{a: A, ...T, b: B}
will be turned into intersection types like{a: A} & T & {b: B}
since TypeScript doesn’t have object type spread. The reason why intersection types weren’t enough for Flow is that there were small very edge-case type rules Flow didn’t want to compromise on. Intersections are perfectly acceptable for this in TypeScript.
Most of the cases where we were using object spread types, though, we should probably useinterface
andextends
going forward. The type checking ofinterface
s is easier to understand andinterface
s have better IDE tooling support.
Object properties which have a+
like{+p: T}
are readonly so we convert them to readonly TypeScript properties like{readonly p: T}
. Technically, you can also have “writeonly” properties in Flow with-
like{-p: T}
but that never shows up in our codebase.
Flow permits arguments in function types to be unnamed. For example,(T, U) => V
. TypeScript requires function types to have argument names, though. So the codemod will transform unnamed function type arguments to anargN
form like(arg1: T, arg2: U) => V
. (Starts at one instead of zero because humans start counting from one.)
Unnamed rest arguments like(...Array<T>) => U
will be namedrest
as in(...rest: Array<T>) => U
.
Unnamed indexer properties like{[string]: T}
will be namedkey
as in{[key: string]: T}
.
The Flow maybe type?T
will be transformed intoT | null | undefined
. If the maybe type is part of a function parameter likefunction f(x: ?T)
then it will be transformed intofunction f(x?: T | null)
.
If a function parameter included a union with undefined likefunction f(x: T | void)
it will be transformed into an optional parameterfunction f(x?: T)
. Flow treats types which can acceptundefined
as optional, but TypeScript requires a parameter to be marked as optional.
If an optional parameter comes before a required parameter then we make the optional parameter required. For examplefunction f(x?: T, y: U)
is allowed in Flow but not in TypeScript so we will change it tofunction f(x: T | undefined, y: U)
.
If an optional parameter also has a default value then we remove the optional mark since TypeScript complains. For examplefunction f(x?: T = y)
will be transformed intofunction f(x: T = y)
.
Flow has syntax for specifically importing types:import type {UserId} from '...'
. This will be converted to a normal ES import:import {UserId} from '...'
. The only difference is removing thetype
keyword.
The Babel plugin for TypeScript will completely remove imports that only import types, so we don’t have to worry about regressing bundle size or introducing cycles.
In TypeScript, function parameters whose type is not immediately known require a type annotation. For example:function f(x)
requires a type annotation forx
buta.map(b => { ... })
does not require a type annotation forb
since the type is immediately known (a
is an array type).
For unannotated function parameters that require a type annotation wetry to get Flow’s inferred type by runningflow type-at-pos
. We do not use the type if it is longer than 100 characters (since a human probably would not have written that type).
If we getany
from Flowwe rename it toFlowAnyInferred
so that the programmer knows “we asked Flow what type it thought this was and Flow saidany
.” Flow ends up inferringany
a lot for unannotated function parameters. That’s because in order to infer a type for unannotated function parameters, Flow needs to see a call for that functionin the same file that the function was defined. If Flow does not see a call in the same file it treats the parameter asany
.
Specifically, we useflow type-at-pos
to infer function parameter types for:
- Function declarations (
function f(x)
). Notably not function expressions, we will manually annotate those. - Class methods (
class C {m(x) {...}
). - Object methods butonly in
createReactClass()
(createReactClass({m(x) {...}})
). We will manually annotate unannotated methods in crud managers.
Ifflow type-at-pos
doesn’t give us a type then we will manually annotate.
Flow has an expression type cast syntax which looks like(x: T)
. TypeScript has a similar type cast syntax which looks like(x as T)
. They don’t have the same behavior! TypeScript as-expressions allow fordowncasting whereas Flow type cast expressions do not. We introduce au.cast<T>(x)
utility which doesn’t allow downcasting as part of the codemod, but we will transform many of the common cases to as-expressions. Here are the rules:
((x: any): T)
is pretty frequently used in Hyperbase to downcast a type. We transform this into(x as T)
. Except if we are in acreateReactClass()
object property, then we will transform it into((x as any) as T)
since React code uses this syntax to declare the type of variables which will be initialized incomponentWillMount
.(x: any)
is transformed into(x as any)
since that’s no different than usingu.cast()
.('foo': 'foo')
is transformed into('foo' as const)
. This is pretty common with ourObject.freeze()
enum syntax. In TypeScript,as const
will tell the type checker: “infer this type knowing I’m never going to mutate it.” Notably, this means you don’t have to type string literals like'foo'
twice anymore!(x: T)
whenx
is a literal likenull
or[]
we transform it to(x as T)
. That’s because(null: T | null)
or([]: Array<T>)
is used a lot to widen the type for an empty value in something likegetInitialState()
.
If none of these cases apply then we transform(x: T)
tou.cast<T>(x)
which preserves type safety.
There are a couple common patterns in the codebase where TypeScript requires a type annotation that Flow doesn’t, notably for this transform:let x
,let x = {}
, andlet x = []
. Flow is supposedly able to infer a type for these variables although it often gets it wrong and ends up usingany
. In all of our non-test files the correct type was manually added. In test files, however, the codemod will add an explicitany
. For examplelet x: any
.
Also, if we can’t get Flow’s inferred type for an unannotated function parameter (using the process described inUnannotated Function Parameters) we will annotate the parameter asany
in test files only.
The principle here is that it’s ok for test files to be less type safe. Test file type safety is practically guaranteed given that they run in CI all the time. It would be very tedious to manually pick the right type for every case of this for little type safety value. Introducing this transform in the codemod fixed ~10k errors. For context, ~15k errors will be manually fixed.
As a part of this change, the eslint rule that warns when it sees anany
is disabled in test files.
// flow-disable-next-line
will be transformed into// @ts-ignore
.
// eslint-disable-line flowtype/no-weak-types
will be transformed into// eslint-disable-line @typescript-eslint/no-explicit-any
.
Opaque type aliases will be converted into normal type aliases. It’s a bit unfortunate that TypeScript doesn’t support opaque type aliases (yet), but you can get the same behavior through a variety of means. We only have five opaque type aliases and they are all written by the same person.
All the manual changes were done by one engineer so in this part of the document I’ll be using first person. When deciding how to manually fix ~15k errors I leaned pretty heavily on the migration goals listed at the beginning of the document:don’t break the product,don’t reduce type safety,keep it simple.
In this section I’ll list major categories of manual fixes. I may miss or skip some smaller categories. You can find and review all my manual fixes in a single commit (~1.6k files out of ~3.3k TypeScript files have manual changes). It’s worth noting that I’m a bit scared of bugs introduced by manual fixes. All the bugs I found when fixing the Airbuild test suite were from manual fixes. A diff of the compiled output should give us visibility into where manual fixes changed code semantics.
Perhaps the larges class of manual fixes was adding extra type annotations. TypeScript errors when a type is under annotated. Most of the time this would occur with an empty object, empty array, or empty variable (let x = null
) that is later mutated. Common patterns I’d see:
const x = {};const x = [];let x = null;new Set(); // infers Set<unknown>new Promise(); // infers Promise<unknown>shardingHelper.mapReduceAsync(); // infers unknown for map/reduce result types
If I couldn’t find a type or I believed Flow inferredany
I’d use theFlowAnyInferred
type alias which was introduced forUnannotated Function Parameters.
For objects, it’s worth noting how I picked between{[key: string]: T}
,h.ObjectMap<K, T>
, and{[_K in K]: T}
(seeObject Index Types Other Than String Or Number for some more context on the differences). Maybe we should encode this in the style guide? The decision tree is unforuntately a bit complicated.
- If the key is exactly
string
and not a type alias use{[key: string]: T}
. - If the key should really be an object ID like
UserId
orRecordId
useh.ObjectMap<UserId, T>
. - If the key is a string union or a generic string type (like
IdT
) and the properties need to be optional use{[_K in K]?: T}
. Although, now I regret that decision and wish I usedPartial<h.ObjectMap<K, T>>
which is more consistent.
Suppression comments (// @ts-ignore
in TypeScript) are scary because the silence all errors on a line of code. However, sometimes it made sense to use a suppression comment anyway. This section documents my policy on when to use a suppression comment vs.any
or some other tactic.
First off, all suppression comments manually added in the TypeScript migration begin with#typescript-migration
. This way you can search through the files you own and cleanup suppression comments. I manually added ~2.5k suppressions, for context: I manually edited ~1.6k files and there are ~3.3k TypeScript files. Some suppression comments address a specific issue, they are:
#typescript-migration-implicit-any
:Implicit Anys Where The Type Could Be Immediately Known#typescript-migration-react-null-prop
:React Null Props
I would add a generic// @ts-ignore #typescript-migration
suppression comment when:
- Fixing the error would require semantic changes which couldn’t be implemented or verified in a space of about five lines. I’d make really small, simple, and locally verifiable changes that didn’t require running the app but that’s it.
- Fixing the error with an
any
would make the code less type safe. - It’s an unfortunate incompatibly between TypeScript and Flow without a clear fix.
I would first try to change the types (type changes are verifiable by running TypeScript), if I couldn’t do that then I’d use a suppression comment.
InUnannotated Function Parameters I mentioned that function expressions (like arrow functions) with unannotated function parameters I left alone since usually the type is immediately knowable by TypeScript. For example:
x.map(y => y + 1)
Ifx
isArray<number>
then TypeScript immediately knowsy
must benumber
. Roughly how this rule works: if TypeScript knows the type of an expression at the point where the expression is defined it can use that type to infer some types in that expression. This is a pretty common trick in programming language design, especially programming languages with lambda expressions (Rust example,TypeScript example).
However, in our example above ifx
isany
then we don’t immediately have a type fory
, so TypeScript complains thaty
needs a type annotation. This happens pretty commonly withh.async
since it’s typed asany
. There are two possible fixes:
- Annotate
y
withany
. - Add a
@ts-ignore
comment to suppress the error.
The problem with option 1 is that it locks in the type ofy
forever. Ifx
is ever changed fromany
toArray<number>
theny
will still forever beany
instead ofnumber
. So instead I went with option 2 in most cases and added a// @ts-ignore #typescript-migration-implicit-any
comment for unannotated arrow expression parameters thatcould be immediately known if we had more type safety. That way ifx
is one day typedy
will benumber
and the@ts-ignore
comment can be safely removed.
This is especially important for uses ofh.async
. Code usingh.async
ends up looking like:
h.async.series([ // @ts-ignore #typescript-migration-implicit-any next => {...}, // @ts-ignore #typescript-migration-implicit-any (x, next) => {...}, // @ts-ignore #typescript-migration-implicit-any (y, next) => {...}, // @ts-ignore #typescript-migration-implicit-any (z, next) => {...},]);
That’s only becauseh.async
isany
, though. One day if we add proper types toh.async
we can remove all these suppression comments. They’re also really easy to search for since they use a specific name.
I wrote custom types forcreateReactClass()
that balance our needs to expressEventListenerMixin
. It increases type coverage, compared to Flow, sincethis.props
andthis.state
are properly typed. Prop types are inferred frompropTypes
and the state type is inferred fromgetInitialState()
.
In cases where prop types are too general I used something likePropTypes.object.isRequired as PropTypes.Validator<CustomObject>
which gives the prop type theCustomObject
type instead of a generic object.
A common pattern in our codebase is:
function doThing(opts?: {someOption?: number}) { opts = u.setAndEnforceDefaultOpts({ someOption: 0, }, opts);}
u.setAndEnforceDefaultOpts()
returnsany
so Flow givesopts
a type ofany
. But this control flow analysis is difficult to do in general, so TypeScript keeps the type thatopts
was annotated with which is{someOption?: number} | undefined
. That means when you try to accessopts.someOption
TypeScript will error complaining thatopts
isundefined
. The fix I went with was to change the code to:
function doThing(_opts?: {someOption?: number}) { const opts = u.setAndEnforceDefaultOpts({ someOption: 0, }, _opts);}
Rename theopts
parameter to_opts
and set the result ofu.setAndEnforceDefaultOpts()
to a new variable. Technically, I’d classify this as a semantics change, but it’s small and locally verifiable so I’m ok with it.
I’m calling outu.setAndEnforceDefaultOpts()
because this issue happened a lot with that utility, but I saw this class of issue elsewhere too. Particularly in column type providers where a value would start asunknown
and would be reassigned to more a specific type likenumber
.
TypeScript requires all class fields to be initialized in the constructor. In Hyperbase we commonly see a pattern like this:
class MyClass { someField: number; constructor() { this.reset(); } reset() { this.someField = 42; }}
This is an error in TypeScript sincesomeField
is not initialized in the constructor. When I saw this I’d either:
- Change the property to an optional property like
someField?: number
if it was truly optional. - Suppress the field with an
// @ts-ignore #typescript-migration
comment since it’s an unfortunate incompatibility between Flow and TypeScript.
Some functions likeu.head()
andu.last()
returned justT
in Flow. That’s incorrect since it doesn’t account for the empty array case. In TypeScript they returnT | undefined
. Almost everywhere we useu.head()
andu.last()
we first check that the list is non-empty so adding a non-null assertion (u.last()!
) is fine.
This also happens for a couple other utilities likeu.maxBy()
, but most notably for every usage ofu.head()
andu.last()
.
An error I saw a lot was when we use astring
toindex a type with known properties.
type O = { foo: number, bar: number, qux: number,};declare const o: O;declare const p: string;o[p]; // Error!
The solution here really depends on the context. The basic strategy was to get the type ofp
to bekeyof O
. Sometimes you can do by changing annotations. Sometimes I’d usep as keyof O
because you could locally verify the property exists since there’s au.has()
check.
if (u.has(o, p)) { o[p]; // Still an error, but actually safe.}
Flow wouldimplicitly returnany
in these cases.
Unfortunately, the TypeScript types for React (@types/react
) don’t supportnull
everywhere they could. For example, the following are all errors:
<div role={enabled ? 'button' : null} onClick={enabled ? handleClick : null} style={{ backgroundColor: enabled ? 'tomato' : null, }}> ...</div>
The fix here is to useundefined
instead, but since that would be a semantics change that is not locally verifiable (what if some else uses this prop type?) and changing the types would require forking@types/react
I chose to suppress these cases instead with// @ts-ignore #typescript-migration-react-null-prop
.
We can follow up after the migration by picking a path forward (either forking@types/react
or usingundefined
) and remove all the instances of#typescript-migration-react-null-prop
. At the moment I’m not ready to add the complexity of doing either (forking a popular library vs. semantics change) to the migration.
Sometimes, with Flow we use imports from untyped modules in a type position. For example when thequill-delta
module is untyped:
import QuillDelta from 'quill-delta';function doThing(delta: QuillDelta) {...}
This is an error in TypeScript, so I change this pattern to:
import QuillDelta from 'quill-delta';type QuillDelta = FlowAnyUntypedModuleImport;function doThing(delta: QuillDelta) {...}
Which is an alias forany
and easy to search for. When you add types forquill-delta
you can search fortype QuillDelta = FlowAnyUntypedModuleImport
and simply delete it.
I developed this strategy a bit later in the manual migration and tried to retroactively move to this style but might have missed some cases. (Which likely use a suppression comment instead.)
For example when you have the following component whereDateCellTimeInput
is created bycreateReactClass()
:
const DateCellEditor = createReactClass({ _timeInput: (null as DateCellTimeInput | null), // ...});
You can’t actually useDateCellTimeInput
as a type here since it’s a value (created bycreateReactClass()
). Flow had special support for this. The fix here is to introduce a utility typeh.ReactRefType
which gets the type of acreateReactClass()
instance.
const DateCellEditor = createReactClass({ _timeInput: (null as h.ReactRefType<typeof DateCellTimeInput> | null), // ...});
A common pattern for exhaustive pattern match checking in Hyperbase is (in Flow syntax):
switch (field.type) { // handle all the cases... default: throw u.spawnUnknownSwitchCaseError('type', (field.type: empty));}
In TypeScript syntax after applying the rules inType Cast Expressions:
switch (field.type) { // handle all the cases... default: throw u.spawnUnknownSwitchCaseError('type', u.cast<never>(field.type));}
The logic here is that TypeScript or Flow will typefield
as the bottom type (never
orempty
respectively) if all other cases have been handled. However, TypeScript does not allow properties to be accessed on anever
type sofield.type
errors. The fix here is to writeu.cast<never>(field)['type']
since TypeScript does allow index accesses onnever
.
Recommendation: to make exhaustive switches really easy to write we can add a utility like:
function spawnExhaustiveSwitchError(value: never) { if (typeof value === 'object' && value !== null) { // Infer sentinel field by looking for common property names: // type, kind, method, etc. } else { // Return error... }}
That means all you need in an exhaustive switch like the one above is:
default: throw u.spawnExhaustiveSwitchError(field);
In a couple places we’d have a file which (after the ES Module migration) looks like this…
class MyClass {...}export default { MyClass, someConstant: 42,};
…and is then used like this…
import _myClass from 'client_server_shared/my_class';const {MyClass, someConstant} = _myClass;function f(x: MyClass) {...}
This breaks TypeScript’s ability to useMyClass
as a type since instead of being introduced into scope in import-land it is introduced into scope in value-land. The fix is to use a named export instead of a default export like this:
export class MyClass {...}export const someConstant = 42;
So when TypeScript required this change, I manually made it.
Destructed default imports are a consequence of migrating to ES Modules without changing the semantics of our code. We’ll probably want to recommend a different style of imports/exports eventually. In the meantime, however, I needed to manually migrate some modules to named exports.
I’d convert code likeglobal[sdkVariableName]
to(global as any)[sdkVariableName]
since in TypeScript,global
is typed.
In a classthis.constructor
is typed asFunction
. The TypeScript team isinterested in changing this but for now some of our abstractions are affected.
I ended up adding a suppression comment in this case since there it’s a TypeScript incompatibility without a clear fix.
About
The codemod Airtable used when migrating from Flow to TypeScript
Resources
License
Uh oh!
There was an error while loading.Please reload this page.
Stars
Watchers
Forks
Releases
Packages0
Uh oh!
There was an error while loading.Please reload this page.