PR that converts the TypeScript repo from namespaces to modules

After this change, the TypeScript compiler will now be compiled with esbuild. I feel like that's probably the best endorsement esbuild could get, hah.

Surprising that they call out the 2-space indent that esbuild is hardcoded[1] to use as a benefit. Why not save even more bytes and re-format the output to single-tab indentation? I wrote a simple script to replace the indentation with tabs. 2-space indent size: 29.2MB, tabbed size: 27.3MB. Almost 2MB more of indentation saved! Not significant after compression, but parsing time over billions of starts? Definitely worth it.
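The script itself isn't shown in the thread, so here is a minimal sketch of what such a conversion could look like. `spacesToTabs` is a hypothetical helper, and it assumes every leading run of spaces is indentation. That assumption is unsafe for multi-line template literals, where rewriting leading spaces would change the program.

```typescript
// Hypothetical helper (not the commenter's actual script): turn leading
// runs of spaces into tabs, one tab per `indentSize` spaces.
// Leftover spaces that don't fill a whole indent level are kept as-is.
function spacesToTabs(source: string, indentSize = 2): string {
  return source
    .split("\n")
    .map((line) => {
      // /^ */ always matches (possibly the empty string), so `!` is safe.
      const leading = line.match(/^ */)![0].length;
      const tabs = Math.floor(leading / indentSize);
      const rest = leading % indentSize;
      return "\t".repeat(tabs) + " ".repeat(rest) + line.slice(leading);
    })
    .join("\n");
}

const sample = [
  "function f() {",
  "  if (x) {",
  "    return 1;",
  "  }",
  "}",
].join("\n");

const tabbed = spacesToTabs(sample);

// For ASCII source, string length equals byte count.
console.log(`before: ${sample.length} chars, after: ${tabbed.length} chars`);
```

Each indent level shrinks from two characters to one, so the saving grows with nesting depth rather than with line count.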

[1] https://github.com/evanw/esbuild/issues/1126

robbick · 3 years ago

I can see this is probably a calm point that will definitely not escalate, programmers don't really care about tabs and spaces that much, right???

Accacin · 3 years ago

Heh, I always wondered what the big deal was. I'll just use whatever the company I'm working for uses and not even think about it.

To be honest, that's why I love auto formatting cause I never need to think about that stuff, I can just write code.

spartanatreyu · 3 years ago

"Tabs vs spaces" is often misunderstood (and falsely reported) as a problem of preference.

The real problem is that using spaces for indentation is an accessibility issue.

The solution is to use tabs for indentation, and spaces for alignment.

eyegor · 3 years ago

In soviet python, runtime cares

(if you accidentally use a tab in a file that otherwise uses spaces, you get a runtime exception, or vice versa)

bnt · 3 years ago

Tabs

CodeWriter23 · 3 years ago

> Why not save even more bytes and re-format the output to single tab indentation?

For your answer search "programmers who use spaces make more money"

jdrek1 · 3 years ago

Correlation != causation

Spaces are simply inferior to tabs since the latter conveys the meaning of "one level of indentation" while the former does not. It's also better for accessibility and file size. There is not one single logical reason to ever use spaces for indentation, not one.

For some very fucking stupid historical reason someone in the 80s made the idiotic decision of spaces being the default in editors and people just went with it. The people earning more are doing so because those are the seniors who have given up on common sense and just go with the flow of the masses who are unable to grasp "tabs for indentation, spaces for alignment" yet insist on keeping alignment so the (terrible) compromise is just using spaces. And I strongly question whether "alignment" is worth anything, in almost all cases it's just useless and in the rest you're drawing ASCII diagrams in the comments which doesn't affect your code at all.

Also see the top answer at https://www.reddit.com/r/programming/comments/8tyg4l/why_do_...

throwup · 3 years ago

I wish they had broken down that survey question further to find out _how many_ spaces the highest paid developers use. Then I could finally have a data-driven answer to put in my prettier config!

z3c0 · 3 years ago

TLDR "Programmers who use spaces are more likely to respond to StackOverflow surveys soliciting information about their income."

TOGoS · 3 years ago

This was actually a significant issue in a large PHP codebase I used to work on. Client hired a new guy who insisted that we convert everything to spaces, and suddenly it took about twice as long to check the thing out from Subversion.

noduerme · 3 years ago

Someone who comes onto a project and actually wants to charge money to sit there and convert tabs to spaces or vice versa. Incredible.

ElijahLynn · 3 years ago

That doesn't make sense that spaces vs tabs would result in a 2x longer checkout. Something else is at play if that is the case.

Krizzu · 3 years ago

Why not suggest this in the PR? Seems like it could be used in the pipeline to decrease the size even further.

breck · 3 years ago

Why a tab when 1 space will do?

marssaxman · 3 years ago

Tab is ASCII 9 while space is 32. Tabs, having the lower number, are therefore obviously cheaper.

quickthrower2 · 3 years ago

Reminds me of Silicon Valley (HBO) where Richard uses “we are a compression company” to justify using tabs over spaces. Ironically once gzip compressed I doubt it would make any difference.

andy_ppp · 3 years ago

Why have any white space at all in that case?

Even on an absolutely gigantic codebase using tabs or spaces will make almost no difference to build or type-checking times. Building an AST is much more overhead than white space considerations and once it’s an AST tabs or spaces are not included in the running of the code.

idontwantthis · 3 years ago

Does that mean they are not using type checking? That’s the really really slow part of writing TS and esbuild doesn’t include it, which is why I’ve never seen the point of using esbuild as a compiler.

jakebailey · 3 years ago

We are still type checking, it's just not needed as a dependency for our JS outputs. Type checking still happens in tests, and I have CI tasks and VS Code watch tasks which will make sure we are still type checking.

> Finally, as a result of both of the previous performance improvements (faster code and less of it), tsc.js is 30% faster to start and typescript.js (our public API) is 10% faster to import. As we improve performance and code size, these numbers are likely to improve. We now include these metrics in our benchmarks to track over time and in relevant PRs.

> [...]

> The TypeScript package now targets ES2018. Prior to 5.0, our package targeted ES5 syntax and the ES2015 library, however, esbuild has a hard minimum syntax target of ES2015 (aka ES6). ES2018 was chosen as a balance between compatibility with older environments and access to more modern syntax and library features

I'd be curious as to what percentage of the improvement comes from modules vs comes from a different target.

berkes · 3 years ago

I'm curious about that too.

From my superficial knowledge of compilers, "modularization" itself should not make code faster, if anything slower. There'll always be some overhead of loading modules and communicating between them, not?

I presume, from my own experience when building software (not compilers), that modules allow for code that is much easier to reason about and much better isolated (cohesion, loose coupling), and therefore much easier to improve inside each module. I would presume that, here too, modules allowed them to improve the inner workings much better, allowing for the performance increase. Or am I completely misunderstanding this feature?

jakebailey · 3 years ago

There are some key things here that maybe weren't clearly stated in my writeup.

Firstly, the old codebase is TS namespaces, which compile down to IIFEs that push properties onto objects. Each file that declares that namespace is its own IIFE, and so every access to other files incurs the overhead of a property access.

With modules, tooling like esbuild, rollup, can now actually see those dependencies (now they are standard ES module imports) and optimize access to them. In this PR's case, the main boost comes from scope hoisting.

For example, in one file, we may declare the helper `isIdentifier`. In namespaces, we would write `isIdentifier` in another file, but this would at emit time turn into `ts.isIdentifier`, which is slower. Now, we import that helper, and then esbuild (or rollup) can see that exact symbol. All of the helpers get pulled to the top of the output bundle, and calls to those helpers are direct.

That's why modules gives us a boost. There's also more (modules means we can use tooling to tree shake the output, and smaller bundles are faster to load), but the hoisting is the big thing.
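To make the difference concrete, here is a hand-written, simplified sketch of the two output shapes. It is not actual `tsc` or esbuild output; `SyntaxNode`, `Api`, `tsNamespace` and `countIdentifiers` are made-up names for illustration.

```typescript
interface SyntaxNode {
  kind: string;
}

type Api = {
  isIdentifier(node: SyntaxNode): boolean;
  countIdentifiers(nodes: SyntaxNode[]): number;
};

// --- Namespace-style shape: every file is an IIFE that hangs its
// --- exports off one shared object.
const tsNamespace = {} as Api;

// "file A" declares the helper.
(function (ts: Api) {
  ts.isIdentifier = (node) => node.kind === "Identifier";
})(tsNamespace);

// "file B" uses it: the call is a property lookup on `ts` every time.
(function (ts: Api) {
  ts.countIdentifiers = (nodes) =>
    nodes.filter((n) => ts.isIdentifier(n)).length;
})(tsNamespace);

// --- Scope-hoisted shape: helpers are plain top-level functions and
// --- callers reference them directly.
function isIdentifier(node: SyntaxNode): boolean {
  return node.kind === "Identifier";
}

function countIdentifiers(nodes: SyntaxNode[]): number {
  return nodes.filter((n) => isIdentifier(n)).length;
}

const nodes: SyntaxNode[] = [
  { kind: "Identifier" },
  { kind: "StringLiteral" },
  { kind: "Identifier" },
];

// Same answer both ways; only the way the helper is reached differs.
console.log(tsNamespace.countIdentifiers(nodes), countIdentifiers(nodes));
```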

constexpr · 3 years ago

There are two performance implications of "modularization": initialization-time and run-time.

You are correct that initializing many modules is usually slower than initializing one module [1]. However, bundling puts all modules into one file, so this PR doesn't actually change anything here. Both before and after this PR, the TypeScript compiler will be published as a single file.

At run-time, switching to ES modules from another JavaScript module system can be a significant performance improvement because it removes the overhead of communicating between them. Other module systems (e.g. TypeScript namespaces, CommonJS modules) use dynamic property accesses to reference identifiers in other modules while ES modules use static binding to reference the identifiers in other modules directly. Dynamic property access can be a big performance penalty in a large code base. Here's an example of the performance improvement that switching to ES modules alone can bring: https://github.com/microsoft/TypeScript/issues/39247.

[1] This is almost always true. A random exception to this is that some buggy compilers have O(n^2) behavior with respect to the number of certain kinds of symbols in a scope, so having too many of those symbols in a single scope can get really slow (and thus splitting your code into separate modules may actually improve initialization time). This issue is most severe in old versions of JavaScriptCore: https://github.com/evanw/esbuild/issues/478. When bundling, esbuild deliberately modifies the code to avoid the JavaScript features that cause this behavior.

klodolph · 3 years ago

> From my superficial knowledge of compilers, "modularization" itself should not make code faster, if anything slower. There'll always be some overhead of loading modules and communicating between them, not?

I think this is a misunderstanding of what actually happened.

TypeScript has a thing called “namespaces” and a thing called “modules”. Both provide modularization. The TS repo is not being modularized, instead, the namespaces are getting converted to modules.

Namespaces are an old-school approach to writing a module in JavaScript. You pack all of your exports into a JS object, and then access the object from somewhere else. This works, but JS is dynamic, and the runtime has no way to guarantee that you won’t mess with this object (replace functions or whatnot).

Modules don’t have this object. You just call the function, instantiate the class, or do whatever else with the names you imported. They are resolved statically, so certain optimizations become more “obvious”, like inlining.
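A small sketch of the "mess with this object" case, using a plain object as a stand-in for a namespace (`mathNs` and `quadruple` are made-up names):

```typescript
// Namespace-style object: its members are ordinary mutable properties.
const mathNs = {
  double(x: number): number {
    return x * 2;
  },
};

function quadruple(x: number): number {
  // `double` is looked up on the object at call time, on every call.
  return mathNs.double(mathNs.double(x));
}

const before = quadruple(3); // uses the original `double`

// Any code holding a reference to the object can swap the function out.
mathNs.double = (x) => x + 1;

const after = quadruple(3); // now uses the replacement

console.log(before, after);
```

With an ES module import, the importing file cannot reassign the imported binding at all (`double = ...` is rejected), so the call target is known from the source text alone.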

akiselev · 3 years ago

For ES6 modules, the exports object is frozen (made read-only) so the JIT can make some extra assumptions and optimizations. With bundles, unless the bundler inserts `Object.freeze` around `module.exports`, they have to be treated as dynamic objects.
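For illustration, this is what `Object.freeze` does to a plain object standing in for an exports object. It is only a sketch of the language feature, not what any particular bundler emits, and `exportsObject` is a made-up name.

```typescript
const exportsObject = Object.freeze({
  double(x: number): number {
    return x * 2;
  },
});

// Cast away `readonly` to simulate untyped JavaScript trying to patch
// the export at runtime.
const untyped = exportsObject as { double: (x: number) => number };

try {
  untyped.double = (x) => x + 1;
} catch {
  // In strict mode the assignment throws a TypeError;
  // in sloppy mode it is silently ignored. Either way nothing changes.
}

console.log(Object.isFrozen(exportsObject), exportsObject.double(3));
```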

[1] https://nodejs.org/api/esm.html#differences-between-es-modules-and-commonjs

[2] https://www.typescriptlang.org/docs/handbook/release-notes/typescript-4-7.html

[3] https://github.com/facebook/jest/issues/9430

iainmerrick · 3 years ago

A little while ago I asked (https://news.ycombinator.com/item?id=33051021):

I’m curious, how many people are using TSC only for type-checking, and a different system (eg esbuild or ts-node) to actually compile/bundle/execute their code?

Looks like my suspicion was correct; not even tsc uses tsc!

presentation · 3 years ago

Probably the majority since popular frameworks like NextJS do it with SWC now.

WorldMaker · 3 years ago

The default configurations for Create-React-App and others use babel for type stripping today.

This seems to me like a great "win" for Typescript that so many tools just natively handle TS type stripping and that so much Typescript today only needs type stripping and doesn't need other parts of TS emit processes (or tslib).

MitchellCash · 3 years ago

> a change in the indentation used in our bundle files (4 spaces -> 2 spaces)

I find it interesting that one of the reasons given for the reduction in package size is due to such a simple indentation change from 4 spaces to 2 spaces.

Not interesting that 2 bytes are less than 4 bytes, rather, TypeScript is a large project and it would be interesting to know how much size was saved from this one specific change? Seems like a trivial change, so why not do it sooner? And assuming readability isn't required in the bundle output why not bundle with no indentation at all and put everything on a single line, would this not be even smaller again?

RyanCavanaugh · 3 years ago

> Seems like a trivial change, so why not do it sooner?

Re: indentation: Literally, no one thought of it, as far as anyone can tell. Linus's law appears to have its limits.

robertlagrant · 3 years ago

Most minifiers already put things on one line, though.

IshKebab · 3 years ago

Kind of funny because one of the benefits of tabs vs spaces that people laugh off is that it saves space.

I think it's probably correct to laugh this off though. Why would you care about the non-minified/gzipped size this much?

yrgulation · 3 years ago

TS devs still debating this in 2022? And I thought PHP devs were ridiculous for still debating setters and getters.

Bravo. This must have been painful. Super excited to use a faster tsc. That will make a huge difference in our products. Thank you.

redox99 · 3 years ago

I absolutely hate how with Typescript and ES Modules, if you have a file

utils/foo.ts

you have to import it as

import Foo from "utils/foo.js"

Even though there is no .js file on disk, and you might be running ts-node or whatever that doesn't build a .js file.

Importing a file that "doesn't exist" is so counterintuitive.

In addition all code breaks because you have to change all your imports, and /index.ts or /index.js won't work either.

bigyikes · 3 years ago

Every TypeScript project I have worked on either:

1) enforces no extension, e.g. “utils/foo”, or

2) allows TS extensions, e.g. “utils/foo.ts”

I have never imported a TS file using a JS extension. Maybe your woes could be fixed with a configuration change?

Chyzwar · 3 years ago

No, you were using non-standard ESM modules (compiled to CommonJS as defined by Babel). TypeScript recently added support for ESM compatible with Node.js; see "module": "node16"[1][2].

The whole ESM saga is a clusterfuck, not much better than the Python 2 -> 3 migration. Large Node.js codebases have no viable path to migrate, and most tools still cannot support ESM properly[3]. Stuff is already breaking because prolific library authors are switching to ESM.

As someone who maintains a large part of the TS/JS tooling in my day job, I absolutely despise the decisions made by the Node.js modules team. My side projects are now in Elixir and Zig because these communities care about DX.

cypress66 · 3 years ago

And that's how it should work imo. But if you enable esm (which you might need in the future because of packages being esm only) you can't use those, only .js.

That's because typescript developers are dead set that they don't want to transpile the imports, they just want to copy paste them into the resulting file when running tsc.

sebdufbeau · 3 years ago

From my understanding, this is a fairly new predicament, for projects that target ESM (type module in package.json) instead of the default CJS

dickfickling · 3 years ago

have you worked on Typescript projects using ES modules? What you’ve described is the status quo for CommonJS modules, but doesn’t work when you switch to ESM (afaik, at least)

tobyhinloopen · 3 years ago

No, it’s new, to comply with new stuff from nodejs.

You can likely change it with a config, but do note that importing using .js will be the new standard way of doing things and by changing it through configuration you’re choosing to not follow the new standard.

It’s a complete mess IMO. Every project uses a different way to handle modules and there’s a lot of rough edges.

lewisl9029 · 3 years ago

Agree completely. It also makes interop between Node and Deno more painful. But there is hope on the horizon. :)

See https://github.com/microsoft/TypeScript/issues/37582 which is referenced in the 4.9 Iteration Plan as "Support .ts as a Module Specifier for Bundler/Loader Scenarios": https://github.com/microsoft/TypeScript/issues/50457

But that doesn't seem like it would fix the typical flow, there still would be no transpiling of imports. So yes you could use .ts in ts-node, but you would have to use .js in tsc. Which is pretty awful (you want your code to work with both).

codefined · 3 years ago

Actually looks to be tabled for TypeScript 5.0, release date of 'March 14th' next year.

kuramitropolis · 3 years ago

A thousand times this. It's not only the dumbest thing I've seen a programming language do, it's also the dumbest thing I've seen in the JS ecosystem. Ended up having to implement an AST-based post-processor to fix packages before publishing them.
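The post-processor isn't shown, but the idea can be sketched. The version below swaps the AST for a regex, so it only handles static `from "./..."` specifiers. `addJsExtensions` is a hypothetical helper; it ignores dynamic `import()`, does not resolve directory imports to `index.js`, and can misfire on matching text inside strings or comments.

```typescript
// Hypothetical regex-based stand-in for an AST post-processor:
// make relative import/export specifiers end in ".js".
function addJsExtensions(source: string): string {
  return source.replace(
    // group 1: `from "`   group 2: relative path   group 3: optional ".ts"
    // group 4: closing quote
    /(\bfrom\s+["'])(\.{1,2}\/[^"']+?)(\.ts)?(["'])/g,
    (
      _match,
      prefix: string,
      path: string,
      _ext: string | undefined,
      quote: string,
    ) =>
      path.endsWith(".js")
        ? `${prefix}${path}${quote}` // already has the extension
        : `${prefix}${path}.js${quote}`,
  );
}

const emitted = [
  'import Foo from "./utils/foo";',
  'import { bar } from "../bar.ts";',
  'import { baz } from "./baz.js";',
  'import React from "react";',
].join("\n");

console.log(addJsExtensions(emitted));
```

Bare specifiers such as `"react"` are left alone, since only relative paths need the extension spelled out.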

cush · 3 years ago

Where are you seeing this? The PR that this thread is about doesn't have this quirk. Maybe your setup has some issue...?

matt_kantor · 3 years ago

Their complaints are unrelated to this specific PR.

See https://www.typescriptlang.org/docs/handbook/esm-node.html for details about how import paths work in CommonJS vs ESM. In both cases the import path you write in your source code is the same import path that is used in the emitted JavaScript. What's different is that NodeJS's ESM implementation doesn't allow extensionless import paths (but its CommonJS implementation does).

Deleted Comment

Does anyone have any insight about how to coordinate this kind of change to a large project? This kind of change touches literally every file, so every branch will have merge conflicts. The best idea I can think of is to announce the date ahead of time and make every contributor rebase their branches on the day of the merge. But there has to be a better way.


Surprisingly, at least for this PR, solving merge conflicts turns out to not be too hard. By not squash merging it, we can have a single commit that unindents the codebase all in one go (and the commit is in the tree), which means that every line has a clear path back to the current state of the main branch. (And crucially, we can make git blame not point every line to me...)

Potentially, an approach like this might be applicable to other changes; I have a commit in my stack which moves the old build system config to the new build system config's path (even though it's wrong), as git does a much better job understanding where the code is going if you help it.

siftrics · 3 years ago

Jake Bailey was one of the best TAs I ever had in college. It's great to see his name behind this.

I miss the Piazza days...

Thank you for the kind words!