Example: `ranger` is written in Python and it's freaking slow. In comparison, `yazi` (Rust) has been a breeze.
Edit: Sorry, I meant the GIL, not single-threaded.
You probably mean the GIL, as Python has supported multithreading for like 20 years.
Idk if ranger is slow because it's written in Python; it's probably the specific implementation.
What AI have you been using for 5 years of coding?
Models like Stable Diffusion do something similar using CLIP embeddings. It works, and it's an easy way to benefit from CLIP's pre-training. But for a language model it would seemingly make more sense to just add the extra layer.
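To make that concrete, here's a toy sketch of the SD-style route (PyTorch; `ClipPrefix` and all the dims are made up for illustration, not SD's actual code): project frozen CLIP embeddings into the LM's hidden space and prepend them as soft tokens. The "extra layer" alternative would just mean training that projection (or a new embedding path) inside the LM itself.

```python
import torch
import torch.nn as nn

# Toy sketch, not SD's real code: project frozen CLIP embeddings into the
# model's hidden space and prepend them as soft tokens. Dims are made up.
class ClipPrefix(nn.Module):
    def __init__(self, d_clip=768, d_model=4096, n_prefix=4):
        super().__init__()
        self.proj = nn.Linear(d_clip, n_prefix * d_model)
        self.n_prefix, self.d_model = n_prefix, d_model

    def forward(self, clip_emb, token_emb):
        # clip_emb: (batch, d_clip), token_emb: (batch, seq, d_model)
        prefix = self.proj(clip_emb).view(-1, self.n_prefix, self.d_model)
        # The LM then attends over the prefix like any other positions.
        return torch.cat([prefix, token_emb], dim=1)
```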
I'm just focusing on different parts.
It can only be fully resolved with either infinite context length, or by doing it the way humans do: adding some LSP "color" to the code tokens.
You can get a feel for what LLMs deal with by opening 3000 lines of code in a plain text editor and trying to do something. That may work for simple fixes, but not for whole-codebase refactors. Only ultra-skilled humans can be productive that way (by my subjective definition of "productive").
Experiments I want to build on top of it:
1. Adding LSP context to the embeddings, so the model could _see_ the syntax better, closer to how we use IDEs, and would not need to read/grep 25k lines just to find where something is used (rough sketch below).
2. Experimenting with different "compression" ratios: each embedding could encode a different number of bytes, so we would not rely on a huge static token dictionary.
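A rough sketch of both ideas (PyTorch; the tag set, dims, and class names are all hypothetical, just to make it concrete):

```python
import torch
import torch.nn as nn

# Idea 1 (hypothetical): mix LSP-derived tags into the token embeddings,
# so syntax/usage info rides along with the text itself.
LSP_TAGS = ["none", "definition", "reference", "type", "import"]

class LspColoredEmbedding(nn.Module):
    def __init__(self, vocab_size, d_model=512):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.tag = nn.Embedding(len(LSP_TAGS), d_model)

    def forward(self, token_ids, tag_ids):
        # token_ids/tag_ids: (batch, seq); tags would come from an LSP pass
        return self.tok(token_ids) + self.tag(tag_ids)

# Idea 2 (hypothetical): pool a variable-length byte patch into one embedding
# instead of looking tokens up in a huge static dictionary.
class BytePatchEncoder(nn.Module):
    def __init__(self, d_model=512):
        super().__init__()
        self.byte_emb = nn.Embedding(256, d_model)
        self.gru = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, byte_ids):
        # byte_ids: (batch, patch_len); patch_len can differ per patch,
        # giving a variable bytes-per-embedding "compression" ratio
        _, h = self.gru(self.byte_emb(byte_ids))
        return h[-1]  # (batch, d_model): one embedding per patch
```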
I'm aware that papers exploring these ideas exist, but so far no popular/good open-source models employ them. Unless someone can prove me wrong.
I don't like it either, but it is what it is.
If I gave free water refills when you used my brand XYZ water bottle, you shouldn't cry that you don't get free refills for your ABC-branded bottle.
It may be scummy, but it does make sense.