Personally, I think that this is a price worth paying!
Would you be willing to share details about the fine-tuning procedure, such as the initialization, learning rate schedule, batch size, etc.? I'd love to learn more.
Background: I've been playing around with generating image sequences from sliding windows of audio. The idea roughly works, but the model training gets stuck due to the difficulty of the task.
(Full disclosure: I have only read the abstract so far.)
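For concreteness, the sliding-window framing from the background above could look roughly like this. It is only a sketch under my own assumptions: the function name `sliding_windows` and the parameters `window_size` and `hop_size` are hypothetical, and the audio is treated as a plain list of samples rather than whatever format the actual project uses.

```python
def sliding_windows(samples, window_size, hop_size):
    """Yield overlapping windows of `window_size` samples, advancing by `hop_size`.

    Hypothetical helper: the names and the list-of-samples input are
    assumptions for illustration, not taken from any real pipeline.
    """
    if window_size <= 0 or hop_size <= 0:
        raise ValueError("window_size and hop_size must be positive")
    # A trailing partial window is dropped rather than padded.
    for start in range(0, len(samples) - window_size + 1, hop_size):
        yield samples[start:start + window_size]


if __name__ == "__main__":
    audio = list(range(10))  # stand-in for real audio samples
    for window in sliding_windows(audio, window_size=4, hop_size=2):
        print(window)
```

Each window would then be paired with one target image, so the hop size sets how many frames per second of audio the model has to produce.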
This is pure gold though:
> How does one make a web app using a standard framework? I've never used it, but it sounds like someone has been able to put together something like a Web app with only one app.
Edit: This is even better.
> Rewriting a Linux kernel in Rust, by hand, is definitely the right thing to do as a beginner/intermediate programmer.
[1] https://archive.org/details/14566367HackerNewsCommentsAndSto...
Sometimes watching the news, it seems like 90% of what they say when they are 'vamping' is just self-attention.
Has anyone posted any GPT / Hacker News generated text yet? Wisdom of the crowds, indeed. It'd be interesting to post using it with light editing, especially something that uses upvotes for training.
One of the things I was thinking about was training on your favorite novel, so you could have a sort of conversation with it and ask it questions. A kind of interactive CliffsNotes. However, as I looked into it I realized it was still too much of a Markov-chain-like thing to be functionally useful. Fun idea though.
The real win in all of this, of course, is autocompletion in different mediums. Code completion demos are pretty wild (https://tabnine.com/blog/deep/). Come to think of it, you could probably use it for writing academic papers as well, assuming you know the content well.
Self-attention and human/computer interaction is a very brave new world. I don't think people really know yet how much potential there is for a seismic shift here.
As far as HNSW implementations go, this one appears to be almost entirely unfinished. Node insertion logic is missing (https://github.com/swapneel/hnsw-rust/blob/b8ef946bd76112250...) and so is the base layer beam search.
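For anyone wondering what the missing base layer search involves, here is a minimal sketch of a single-layer beam search in the style of the SEARCH-LAYER routine from the HNSW paper. It is not the linked repo's code, and it is in Python rather than Rust for brevity. The names `search_layer`, `neighbors`, and `distance` are my own, and the graph is assumed to be a plain dict from node id to a list of neighbor ids.

```python
import heapq


def search_layer(query, entry_point, ef, neighbors, distance):
    """Beam search over one graph layer; returns up to `ef` node ids, closest first.

    Assumptions (illustrative only): `neighbors` maps node id -> list of node ids,
    and `distance(query, node_id)` returns a non-negative number.
    """
    visited = {entry_point}
    d0 = distance(query, entry_point)
    candidates = [(d0, entry_point)]   # min-heap: closest unexpanded node first
    results = [(-d0, entry_point)]     # max-heap via negated distance: farthest kept node first

    while candidates:
        d, node = heapq.heappop(candidates)
        # Stop once the best remaining candidate is worse than the worst kept result.
        if d > -results[0][0]:
            break
        for nb in neighbors[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = distance(query, nb)
            if len(results) < ef or dn < -results[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(results, (-dn, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # evict the farthest result

    return [n for _, n in sorted((-neg_d, n) for neg_d, n in results)]
```

On the base layer you would call this with the entry point found by the greedy descent through the upper layers, then keep the closest k of the returned ids. Insertion reuses the same routine to find candidate neighbors for the new node on each layer, which is why having both pieces missing leaves little of the algorithm in place.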