Readit News
binocarlos commented on Supabase raises $200M Series D at $2B valuation   finance.yahoo.com/news/ex... · Posted by u/baristaGeek
michelpp · 4 months ago
I turn 50 tomorrow and I love vibe coding. In the hands of an expert with decades of experience in all the internal corners of C, Python and Postgres, I find AI tools to be miracles of technology. I know how to ask them for exactly what I want and I know how to separate the goodness from the bullshit. If Supabase is bringing AI closer to the developer at the database level, then that is a great thing.
binocarlos · 4 months ago
I agree with this - I hear a lot of hate towards vibe coding, but my experience of using voice dictation, drawing on 20 years of experience in the trenches, and being very specific about what I tell the model to do has been, well, refreshing to say the least.

I used to pride myself on knowing all the little ins and outs of the tech stack, especially when it comes to ops-type stuff. That is still required; the difference is you don't need to spend 4 hours writing code - you can use the experience to get to the same result in 4 minutes.

I can see how "ask it for what you want and hope for the best" might not end well, but personally I am very much enjoying the process of distilling what I know we need to do next into a voice-dictated prompt and then watching the AI just solve it based on what I said.

binocarlos commented on TSMC execs allegedly dismissed OpenAI CEO Sam Altman as 'podcasting bro'   tomshardware.com/tech-ind... · Posted by u/WithinReason
jsheard · a year ago
There's no accounting for taste, but keep in mind that all of these services are currently losing money, so how much would you actually be willing to pay for the service you're currently getting in order to let it break even? There was a report that Microsoft is losing $20 for every $10 spent on Copilot subscriptions, with heavy users costing them as much as $80 per month. Assuming you're one of those heavy users, would you pay >$80 a month for it?

Then there's chain-of-thought being positioned as the next big step forward, which works by throwing more inference at the problem, so that cost can't be amortized over time the way training costs can...

binocarlos · a year ago
I would pay hundreds of dollars per month for the combination of Cursor and Claude - I could not get my head around it when my beginner-level colleague said "I just coded this whole thing using Cursor".

It was an entire web app, with search filters, tree-based drag-and-drop GUIs, the backend API server, database migrations, auth and everything else.

Not once did he need to ask me a question. When I asked him "how long did this take?" I expected him to say "a few weeks" (it would have taken me, a far more experienced engineer, two months minimum).

His answer was "a few days".

I'm not saying "AGI is close", but I've seen tangible evidence (only in the last 2 months) that my 20-year software engineering career is about to change, and massively for the upside. The way I see it, everyone is going to be so much more productive using these tools.

binocarlos commented on Show HN: Smart website search powered by open models   tryhelix.ai/searchbot... · Posted by u/lewq
binocarlos · a year ago
I helped work on the RAG part of this :-)

We used https://github.com/pgvector/pgvector under the hood and found it extremely easy to integrate with our database schema - being able to just specify the structure of a table and have metadata fields alongside the embeddings made the code very easy to reason about.
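
For a flavour of what that looks like in practice (the table and column names here are made up for illustration, and I'm using a tiny 3-dimensional vector rather than a real embedding size):

    import psycopg2

    conn = psycopg2.connect("dbname=app")
    cur = conn.cursor()

    # pgvector lets metadata columns live right next to the embedding,
    # so one table holds everything we need to reason about
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS documents (
            id        bigserial PRIMARY KEY,
            title     text,
            url       text,
            embedding vector(3)  -- tiny dimension just for the example
        )
    """)
    conn.commit()

    # nearest-neighbour search by cosine distance, returning the
    # metadata fields in the same query as the similarity search
    cur.execute(
        "SELECT title, url FROM documents ORDER BY embedding <=> %s LIMIT 5",
        ("[0.1, 0.2, 0.3]",),
    )
    print(cur.fetchall())

The nice part is that the metadata comes back in the same query as the similarity search, so there's no second lookup to join results back to their source documents.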

binocarlos commented on How we got fine-tuning Mistral-7B to not suck   helixml.substack.com/p/ho... · Posted by u/lewq
bugglebeetle · 2 years ago
Unsloth’s colab notebooks for fine-tuning Mistral-7B are super easy to use and run fine in just about any colab instance:

https://github.com/unslothai/unsloth

It’s my default now for experimenting and basic training. If I want to get into the weeds, I use axolotl, but 9/10, it’s not really necessary.
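
For anyone who hasn't opened them, the core of those notebooks is roughly this (from memory, so treat the exact model name and arguments as approximate):

    from unsloth import FastLanguageModel

    # load Mistral-7B pre-quantised to 4-bit so it fits in a free
    # Colab GPU, then attach LoRA adapters for training
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/mistral-7b-bnb-4bit",
        max_seq_length=2048,
        load_in_4bit=True,
    )
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,
        lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    )
    # from here the notebooks hand off to trl's SFTTrainer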

binocarlos · 2 years ago
Excellent link, thank you - I will make sure we check this out, because as the README says:

> Finetune Mistral, Llama 2-5x faster with 70% less memory!

Could be very useful for us!

Disclaimer: I work on Helix

binocarlos commented on How we got fine-tuning Mistral-7B to not suck   helixml.substack.com/p/ho... · Posted by u/lewq
nicolezhu · 2 years ago
What are some OS / hardware-specific challenges you guys faced?
binocarlos · 2 years ago
Great question! Scheduling workloads onto GPUs in a way that utilises VRAM efficiently was quite the challenge.

What we found was that the IO latency of loading model weights into VRAM will kill responsiveness if you don't "re-use" sessions (i.e. keep the model weights loaded and run multiple inference sessions over the same loaded weights).

Obviously projects like https://github.com/vllm-project/vllm exist, but we needed to build out a scheduler that can run a fleet of GPUs across a matrix of text/image and inference/fine-tune sessions.
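
To sketch the idea (this is an illustrative toy, not our actual scheduler - the names and numbers are made up):

    import time

    class LoadedModel:
        """Weights resident in VRAM on a GPU (illustrative only)."""
        def __init__(self, name, vram_gb):
            self.name = name
            self.vram_gb = vram_gb
            self.last_used = time.monotonic()

    class Scheduler:
        def __init__(self, total_vram_gb):
            self.total_vram_gb = total_vram_gb
            self.resident = []  # models currently loaded into VRAM

        def acquire(self, name, vram_gb):
            # re-use weights that are already loaded - this is the
            # "session re-use" that avoids the disk -> VRAM latency hit
            for m in self.resident:
                if m.name == name:
                    m.last_used = time.monotonic()
                    return m
            # otherwise evict least-recently-used models until it fits
            while self.resident and sum(
                m.vram_gb for m in self.resident
            ) + vram_gb > self.total_vram_gb:
                self.resident.sort(key=lambda m: m.last_used)
                self.resident.pop(0)
            model = LoadedModel(name, vram_gb)  # stands in for the real load
            self.resident.append(model)
            return model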

disclaimer: I work on Helix

binocarlos commented on How we got fine-tuning Mistral-7B to not suck   helixml.substack.com/p/ho... · Posted by u/lewq
_pdp_ · 2 years ago
Interesting article but, IMHO, completely impractical. Teaching the model about specific content is exactly what you should not do. What you should do is teach the model how to retrieve the information effectively, even if it is unsuccessful on the first try.
binocarlos · 2 years ago
We are finding that fine-tuning is very good at setting the style and tone of responses. A potential use case we are thinking about: what if your star salesperson leaves the company? Could you fine-tune an LLM on their conversations with customers and then run inference where it writes text in the style of your star salesperson?

We are also adding function calling, so the model will know to reach out to an external API to fetch some data before generating a response.
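
Roughly the shape of that flow (the tool name and JSON format here are hypothetical, not the Helix API):

    import json

    # hypothetical tool registry - fetch_orders stands in for
    # whatever external API the model decides it needs
    TOOLS = {
        "fetch_orders": lambda customer_id: {"orders": ["#1001", "#1002"]},
    }

    def handle(model_output):
        # if the model emitted a tool call instead of prose, run the
        # tool and hand the result back for a second generation pass
        msg = json.loads(model_output)
        if "tool" in msg:
            result = TOOLS[msg["tool"]](**msg["args"])
            return {"role": "tool", "content": json.dumps(result)}
        return {"role": "assistant", "content": msg["text"]}

    print(handle('{"tool": "fetch_orders", "args": {"customer_id": 42}}'))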

disclaimer: I work on Helix

binocarlos commented on How we got fine-tuning Mistral-7B to not suck   helixml.substack.com/p/ho... · Posted by u/lewq
gdiamos · 2 years ago
Glad to see that more people outside the big AI labs are figuring out how to do fine-tuning. Some open source LLM authors also seem to have figured it out.

I think many users get put off it because just pushing a button doesn’t work and the whole thing seems like a black box that you don’t know how to fix when it breaks.

It turns out that fine-tuning can be debugged, but the methods aren’t well documented (yet) - e.g. generating Q/A pairs, oversampling them, etc.
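
To make that concrete, the oversampling step can be as simple as this (illustrative sketch, not from any particular codebase):

    import random

    # hypothetical Q/A pairs generated from the source documents,
    # e.g. by prompting a model to write question/answer pairs
    qa_pairs = [
        {"question": "What does the product do?", "answer": "..."},
        {"question": "Who is it for?", "answer": "..."},
    ]

    # oversample so each generated pair is seen many times per epoch,
    # which makes it easy to check the model is actually learning them
    train_set = qa_pairs * 10
    random.shuffle(train_set)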

When you get it to work it’s powerful - new abilities emerge beyond memorization.

Just like how llama2/claude2/gpt4 learned reasoning by memorizing sentences from Reddit posts :P

Also, I don’t get the comparison of RAG vs fine-tuning in articles like this - why not do both? RAG is easy to set up - it’s push-button. Just do it on all models (including fine-tuned models).

binocarlos · 2 years ago
Thanks for the feedback - I agree that fine tuning a) has potential and b) is not easy :-)

> Also, I don’t get the comparison of rag vs finetuning in articles like this - why not do both

It's interesting you say this because we are very close to adding RAG support to Helix sessions, and it will be "both at the same time", not an "either or" setup. You will still be able to choose one on its own, but we are interested in seeing whether doing both at the same time yields better results than either alone - watch this space!
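
The rough shape of the combined setup (class and function names here are illustrative, not the Helix API):

    # illustrative sketch: retrieve context first, then run the
    # prompt through the *fine-tuned* model rather than the base one
    class SimpleStore:
        def __init__(self, docs):
            self.docs = docs

        def search(self, query, top_k=3):
            # stand-in for a real embedding search (e.g. pgvector)
            words = set(query.lower().split())
            scored = sorted(
                self.docs,
                key=lambda d: -len(words & set(d.lower().split())),
            )
            return scored[:top_k]

    def answer(question, store, generate):
        context = store.search(question)
        prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
        return generate(prompt)  # generate() wraps the fine-tuned model

    docs = ["Helix runs fine-tuning on your own GPUs", "Unrelated text"]
    print(answer("what does helix run?", SimpleStore(docs), lambda p: p))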

disclaimer: I work on Helix

binocarlos commented on How we got fine-tuning Mistral-7B to not suck   helixml.substack.com/p/ho... · Posted by u/lewq
joshka · 2 years ago
For Helix, I notice that GitHub is listed as a data source, but there's nothing in the docs about this. I'd really love to see what a model trained on my commonly used git repos (which are generally newer than The Stack etc.), and in particular their commit history, would look like. Ideally these would make it easier for code completion to have the historical context as well as the current code to play with in determining what to write next.

I often wonder how you'd go about organizing training data for a full historic GitHub repo in a way that makes sense for training (or RAG). The vast majority of the data is previous changes to the repo, which I think would generally outweigh the current information and cause problems (i.e. old method names from before refactoring, etc.).

Also, perhaps being able to expand that out to doing the same thing for a bunch of consumers of the library that I'm maintaining would be neat.

Sprinkle in the PR and issue history, docs website, API docs, and Discord history and I think you'd have a helluva model.

binocarlos · 2 years ago
This is spot on - the thing we've not yet done is make it easy to import a repo's code and its associated metadata into a fine-tuning session.

> I often wonder how you'd go about organizing training data for a full historic github repo in a way that makes sense for training (or RAG)?

This is the hard part :-) But you are right - it would be intriguing to see what the output of a fine-tuned & RAG model would look like for this use case. We are currently experimenting with adding RAG alongside the fine-tuned model (so it's both, not either/or) to see if it produces better results.

I will make sure we take a look at the GitHub repo use case, because it feels like that would be an interesting experiment to do!
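
For concreteness, a first pass at turning commit history into training examples might look something like this (purely a sketch - the function and example format are made up):

    import subprocess

    def commit_examples(repo_path, limit=500):
        # take only the most recent commits, so stale method names
        # from before refactors don't dominate the dataset
        log = subprocess.run(
            ["git", "-C", repo_path, "log", f"-{limit}", "--pretty=%H|%s"],
            capture_output=True, text=True, check=True,
        ).stdout.splitlines()
        examples = []
        for line in log:
            sha, message = line.split("|", 1)
            # the patch for this commit, with the log header suppressed
            diff = subprocess.run(
                ["git", "-C", repo_path, "show", "--pretty=format:", sha],
                capture_output=True, text=True, check=True,
            ).stdout
            examples.append({"prompt": diff, "completion": message})
        return examples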

disclaimer: I work on Helix

binocarlos commented on How we got fine-tuning Mistral-7B to not suck   helixml.substack.com/p/ho... · Posted by u/lewq
AznHisoka · 2 years ago
Does fine tuning it on a set of docs in your “knowledge base” help it generalize so it can answer questions pertaining to new documents that come in (with a “similar” style/structure but different content/facts)?
binocarlos · 2 years ago
Fine-tuning on your documents really helps with answering questions in the style and tone of those documents, so in that way, yes, it helps.

It would be possible to include some parts of the new documents in the prompt so you can answer questions about new facts in the style and tone of your old documents, which we feel is useful. We are also experimenting with adding Retrieval Augmented Generation alongside fine-tuning to see if the results are better than either alone.

disclaimer: I work on Helix

binocarlos commented on HN website is/was down. I'm curious why?   hn.hund.io/... · Posted by u/karmakaze
sgammon · 2 years ago
It’s been up so long I thought it was my internet.
binocarlos · 2 years ago
I use Hacker News as my "is my Internet working" test - it's fast and always up (kudos).

So yes, I also thought my Internet was broken :-)

u/binocarlos

Karma: 323 · Cake day: February 18, 2013

About: asyncrocity