Readit News
ashirviskas commented on Fedora Asahi Remix is now working on Apple M3   bsky.app/profile/did:plc:... · Posted by u/todsacerdoti
jaredcwhite · 16 days ago
I've been running Asahi Fedora GNOME on a Mac mini M1 for a while now (using it right now, in fact) with almost zero complaints. A really solid and usable setup. I could see myself buying a used MacBook Air M3 down the road once this work is all finished up, which is very exciting. The prices are already pretty reasonable, even for a 16GB RAM model!
ashirviskas · 16 days ago
Apple made M3 models with less than 16GB of RAM? Man, can't wait till the cheapest model has at least 128GB.
ashirviskas commented on I was banned from Claude for scaffolding a Claude.md file?   hugodaniel.com/posts/clau... · Posted by u/hugodan
layer8 · 20 days ago
It was perfectly understandable to me. Maybe cultural differences? You seem to be American; the OP is Portuguese, and I'm European as well.
ashirviskas · 20 days ago
Another European chiming in: I enjoyed OP's article.
ashirviskas commented on Claude Chill: Fix Claude Code's flickering in terminal   github.com/davidbeesley/c... · Posted by u/behnamoh
Der_Einzige · 22 days ago
Did this get written mostly by human hands, or did AI also write this? I would hope something like this was primarily made by humans...
ashirviskas · 22 days ago
Do you also write your bytecode by human hands? At which abstraction layer do we draw the line?
ashirviskas commented on Claude Chill: Fix Claude Code's flickering in terminal   github.com/davidbeesley/c... · Posted by u/behnamoh
behnamoh · 22 days ago
it has, but python being single threaded (until recently) didn't make it an attractive choice for CLI tools.

example: `ranger` is written in python and it's freaking slow. in comparison, `yazi` (Rust) has been a breeze.

Edit: Sorry, I meant GIL, not single thread.

ashirviskas · 22 days ago
> it has, but python being single threaded (until recently) didn't make it an attractive choice for CLI tools.

You probably mean the GIL, as Python has supported multithreading for some 20 years.

Idk if ranger is slow because it's written in Python; it's probably the specific implementation.

ashirviskas commented on I was a top 0.01% Cursor user, then switched to Claude Code 2.0   blog.silennai.com/claude-... · Posted by u/SilenN
ronsor · 23 days ago
Probably the original GitHub Copilot
ashirviskas · 23 days ago
It is only 4 years old
ashirviskas commented on I was a top 0.01% Cursor user, then switched to Claude Code 2.0   blog.silennai.com/claude-... · Posted by u/SilenN
nebezb · 23 days ago
> my experience from 5 years of coding with AI

What AI have you been using for 5 years of coding?

ashirviskas · 23 days ago
Keyboard autocomplete?
ashirviskas commented on Starting from scratch: Training a 30M Topological Transformer   tuned.org.uk/posts/013_th... · Posted by u/tuned
nl · 24 days ago
Isn't this just an awkward way of adding an extra layer to the NN, except without end-to-end training?

Models like Stable Diffusion sort of do a similar thing using Clip embeddings. It works, and it's an easy way to benefit from the pre-training Clip has. But for a language model it would seemingly make more sense to just add the extra layer.

ashirviskas · 24 days ago
I mean, that's exactly what it is: just a wrapper to replace the tokenizer. That's also exactly how LLMs can read images.

I'm just focusing on different parts of the problem.

ashirviskas commented on Starting from scratch: Training a 30M Topological Transformer   tuned.org.uk/posts/013_th... · Posted by u/tuned
appplication · 24 days ago
Not an expert in the space, but I’m not sure you need to modify tokens to get the model to see syntax, you basically get that exact association from attention.
ashirviskas · 24 days ago
You only get the association that's relevant to your project if you can cram the whole codebase into context. Otherwise the model is making rough estimates, and some of the time that seems to be where the models fail.

It can only be fully resolved either with infinite context length, or by doing it the way humans do: adding some LSP "color" to the code tokens.

You can get a feel for what LLMs deal with by opening 3,000 lines of code in a plain text editor and trying to get something done. It may work for simple fixes, but not for whole-codebase refactors. Only ultra-skilled humans can be productive that way (using my subjective definition of "productive").

ashirviskas commented on Starting from scratch: Training a 30M Topological Transformer   tuned.org.uk/posts/013_th... · Posted by u/tuned
ashirviskas · 25 days ago
I wonder what would happen if we just crammed more into the "tokens". I'm running an experiment replacing discrete tokens with embeddings plus a small byte encoder/decoder. That way you can use the embedding space much more efficiently and have it carry much more nuance.

Experiments I want to build on top of it:

1. Adding LSP context to the embeddings, so the model could _see_ the syntax better, closer to how we use IDEs, and wouldn't need to read/grep 25k lines just to find where something is used.

2. Experiments with different "compression" ratios. Each embedding could encode a different number of bytes, so we wouldn't rely on a huge static token dictionary.

I'm aware there are papers exploring these ideas, but so far no popular/good open source models employ this. Unless someone can prove me wrong.

ashirviskas commented on Anthropic Explicitly Blocking OpenCode   gist.github.com/R44VC0RP/... · Posted by u/ryanvogel
ashirviskas · a month ago
Well, using the Claude Pro/Max Claude Code API without Claude Code, instead of the actual API they monetize, goes against their ToS.

I don't like it either, but it is what it is.

If I gave free water refills to anyone using my brand-XYZ water bottle, you shouldn't cry that you don't get free refills for your ABC-branded bottle.

It may be scummy, but it does make sense.

u/ashirviskas

Karma: 395 · Cake day: November 9, 2020