Readit News
afiodorov · 2 years ago
I don't trust the code quality evaluation. The other day at work I wanted to split a string by ; but only where it's not within single quotes (think about splitting many SQL statements). I explicitly asked for a stdlib Python solution, preferably one that avoids counting quotes, since that's a bit verbose.

GPT4 gave me a regex found on https://stackoverflow.com/a/2787979 (without "), explained it to me and then it successfully added all the necessary unit tests and they passed - I committed all of that to the repo and moved on.
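
For readers who want to see the idea, here is a minimal sketch of a lookahead-based split using only the stdlib `re` module. The pattern and the names `_SPLIT_RE` and `split_statements` are illustrative assumptions, not necessarily what GPT-4 or the linked answer produced.

```python
import re

# Split on ';' only when the rest of the string consists of unquoted
# characters and complete '...' literals, i.e. the semicolon is outside quotes.
# Illustrative pattern; an assumption, not the exact regex from the thread.
_SPLIT_RE = re.compile(r";(?=(?:[^']|'[^']*')*\Z)")


def split_statements(sql: str) -> list[str]:
    """Split SQL text on semicolons that are not inside single quotes."""
    parts = _SPLIT_RE.split(sql)
    # Drop empty fragments such as the one after a trailing ';'.
    return [p.strip() for p in parts if p.strip()]


print(split_statements("SELECT 'a;b'; SELECT 2;"))
```

Two caveats for this sketch: it does not understand backslash-escaped quotes, and the lookahead rescans the remainder of the string for every semicolon, so the work can grow quadratically on long inputs with many semicolons.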

I couldn't get 70B to answer this question even with multiple nudges.

Every time I try something non GPT-4 I always go back - it feels like a waste of time otherwise. A bit sad that LLMs follow the typical winner-takes-it-all tech curve. However if you could ask the smartest guy in the room your question every time, why wouldn't you?

---

Edit: USE CODE MODE and it'll actually solve it.

rushingcreek · 2 years ago
Thanks for the feedback, could you please post the cached Phind link so we can take a look?

It might also be helpful to try Phind Chat mode in cases like this.

EDIT: It seems like Phind-70B is capable of getting the right regex nearly every time when Chat mode is used or search results are disabled. It seems that the search results are polluting the answer for this example, we'll look into how to fix it.

Perseids · 2 years ago
I've tried it with a question which requires deeper expertise – "What is a good technique for device authentication in the context of IoT?" – and the Search mode is also worse than the Chat mode:

- Search: https://www.phind.com/search?cache=s4e576jlnp1mpw73n9iy4sqc

- Chat: https://www.phind.com/agent?cache=clsyev95o0006le08b5pjrs14

The search was heavily diluted by authentication methods that don't make any sense for machine-to-machine authentication, like multi-factor or biometric authentication, as well as the advice to combine several methods. It also falls into the, admittedly common, trap of assuming that certificate based authentication is more difficult to implement than symmetric key (i.e. pre-shared key) authentication.

The chat answer is not perfect, but the signal-to-noise ratio is much better. The multi-factor authentication advice is again present, but it's the only major error, and it also adds relevant side-topics that point in the right direction (secure credential storage, secure boot, logging of auth attempts). The Python example is cute, but completely useless, though (Python for embedded devices is rare and in any case you wouldn't want a raw TLS socket, but use it in a MQTTS / HTTPS / CoAP+DTLS stack, and last but not least, it provides a server instead of client, even though IoT devices mostly communicate outbound).

planb · 2 years ago
I didn't take a look at the code, but to me it sounds quite dangerous to take an implementation AND the unit tests straight from an LLM, commit and move on.

Is this the new normal now?

fileyfood500 · 2 years ago
It's very powerful, I can enter implementations for any algorithm by typing 5 words and clicking tab. If I want the AI to use a hashmap to solve my problem in O(n), I just say that. If I need to rewrite a bunch of poorly written code to get rid of dead code, add constants, etc I do that. If I need to convert files between languages or formats, I do that. I have to do a lot more code review than before, and a lot less writing. It saves a huge amount of time, it's pretty easy to measure. Personally, the order of consultation is Github Copilot -> GPT4 -> Grimoire -> Me. If it's going to me, there is a high probability that I'm trying to do too many things at once in an over-complicated function. That or I'm using a relatively niche library and the AI doesn't know the methods.
swman · 2 years ago
It’s the new boot camp dev. It is still the same as copy pasting SO solutions lol
Xenoamorphous · 2 years ago
I guess most people would review the code as if it had been written by a colleague?
RamblingCTO · 2 years ago
Hopefully not, I feel it's a waste of time. The time spent on stupid minor mistakes by github copilot I didn't catch probably doesn't really compare to the time I would've spent typing on my own. (I only use that stuff for fancy code completion, nothing more. Every LLM is absolutely moronic. Yesterday I asked chatgpt to convert gohtml to templ, to no avail ...)
ugh123 · 2 years ago
Presumably people look at things before committing the code. And code reviews and pull requests are still normal.

Blindly copying code from any source and running it or committing it to your main branch without even the slightest critical glance is foolish.

ogrisel · 2 years ago
Arguably the tests should be easier to review than the implementation.

But if there is non-trivial logic in the code of the tests, I agree this is probably a risky approach.

romeros · 2 years ago
It really feels like GPT-4 is Google and everybody else is Yahoo/Bing, i.e. cute but not really.
unshavedyak · 2 years ago
Agreed, though i'm _really_ interested in trying 1M token Gemini. The idea of uploading my full codebase for code assist stuff sounds really interesting. If i can ever get access to the damn thing...
devjab · 2 years ago
Gemini is much better than the free version of GPT 3.5 though. At least in my experience.

Microsoft’s enterprise co-pilot is also fairly decent. It’s really good at helping with Microsoft-related issues or helping you find the right parts of their ridiculously massive documentation site. Which probably isn’t too weird considering.

HKH2 · 2 years ago
In my experience, Bing's image search is way better than Google's. Also, I'm not going to use a search engine that I have to log in or do a captcha for.
meindnoch · 2 years ago
Doesn't handle escaped quotes, and the time complexity of that regex is very bad.
eru · 2 years ago
The time complexity of matching a string against any fixed regular expression is O(length of string).

If you want to talk about constant factors, we need to leave our comfortable armchairs and actually benchmark.

[Just to be clear, I am talking about real regular expressions, not Franken-xpressions with back-references etc here. But what the original commenter described is well within the realm of what you can do with regular expressions.]

You are right about escaped quotes etc. That's part of why parsing with regular expressions is hard.
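
As a rough illustration of both points, a hand-written single-pass scanner is linear in the input length by construction and can be taught about escapes. The sketch below assumes backslash escapes inside single-quoted literals (a dialect-specific assumption) and uses the hypothetical name `split_statements_scan`; SQL-style doubled quotes ('') also come out right because they toggle the quote state twice.

```python
def split_statements_scan(sql: str) -> list[str]:
    """Split on ';' outside single quotes in one left-to-right pass."""
    parts: list[str] = []
    buf: list[str] = []
    in_quote = False
    i = 0
    while i < len(sql):
        ch = sql[i]
        # Assumption: inside a literal, a backslash escapes the next character.
        if in_quote and ch == "\\" and i + 1 < len(sql):
            buf.append(sql[i:i + 2])
            i += 2
            continue
        if ch == "'":
            in_quote = not in_quote
        if ch == ";" and not in_quote:
            # Statement boundary: flush the buffer and start a new one.
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
        i += 1
    parts.append("".join(buf))
    return [p.strip() for p in parts if p.strip()]


print(split_statements_scan(r"SELECT 'it\'s; fine'; SELECT 2"))
```

Each character is examined once, so the running time is proportional to the length of the string regardless of how many semicolons it contains.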

sebstefan · 2 years ago
Can you try this?

"Can you give me an approach for a pathfinding algorithm on a 2D grid that will try to get me from point A to point B while staying under a maximum COST argument, and avoid going into tiles that are on fire, except if no other path is available under the maximum cost?"

I've never found an AI that could solve this, because there's a lot of literature online about A* and tiles with cost, and solving this requires a different approach
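
One simple reading of the problem can be sketched as a two-phase search: first look for the cheapest path that avoids fire entirely, and only if that exceeds the budget fall back to the cheapest path that may cross fire. Everything here is an assumption made for illustration: the grid stores a non-negative cost for entering each tile, movement is 4-directional, and the names `cheapest_path` and `find_path` are hypothetical.

```python
import heapq


def cheapest_path(costs, start, goal, blocked=frozenset()):
    """Dijkstra on a 4-connected grid; returns (total_cost, path) or None."""
    rows, cols = len(costs), len(costs[0])
    best = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        dist, cell = heapq.heappop(heap)
        if cell == goal:
            # Walk the predecessor links back to the start.
            path = [cell]
            while cell in prev:
                cell = prev[cell]
                path.append(cell)
            return dist, path[::-1]
        if dist > best[cell]:
            continue  # stale heap entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or nxt in blocked:
                continue
            nd = dist + costs[nr][nc]  # cost of entering the next tile
            if nd < best.get(nxt, float("inf")):
                best[nxt] = nd
                prev[nxt] = cell
                heapq.heappush(heap, (nd, nxt))
    return None


def find_path(costs, fire, start, goal, max_cost):
    """Prefer a fire-free path within budget; otherwise allow fire tiles."""
    safe = cheapest_path(costs, start, goal, blocked=fire)
    if safe and safe[0] <= max_cost:
        return safe[1]
    risky = cheapest_path(costs, start, goal)
    if risky and risky[0] <= max_cost:
        return risky[1]
    return None


grid = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(find_path(grid, {(0, 1)}, (0, 0), (0, 2), max_cost=4))
print(find_path(grid, {(0, 1)}, (0, 0), (0, 2), max_cost=3))
```

This sketch does not minimise the number of burning tiles in the fallback case; doing that under a cost budget is a constrained shortest-path problem and needs a richer search state, for example tracking (fire tiles crossed, cost) per cell.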

jeffbee · 2 years ago
> I wanted to split my ... SQL statements ... avoid counting quotes ... GPT4 gave me a regex ... I committed all of that to the repo

I see that the future is brighter than ever for the information security industry.

xyzzy_plugh · 2 years ago
Sure is! We've got a bright and oh so plentiful road ahead, pending we can avoid blowing up the planet.
ldjkfkdsjnv · 2 years ago
Yup, LLMs broke well known benchmarks
kunalgupta · 2 years ago
same exp
tarruda · 2 years ago
I don't care much for benchmarks, many models seem to be contaminated just to approach proprietary models in coding benchmarks.

I had never tried Phind before, but gave Phind-70B a spin today and so far found it to be really good at writing and understanding code, maybe even GPT-4 level. Hard to tell for sure since I only tested it on a single problem: writing some web3 code in TypeScript. This is what I did:

- Gave it some specifications of a react hook that subscribes to a smart contract event and fetches historical events starting from a block number. It completed successfully.

- Took this code and gave it to GPT-4 to explain what it did, as well as finding potential issues. GPT gave a list of potential issues and how to address.

- Then I went back to the Phind and asked it to find potential issues in the code it had just written, and it found more or less the same issues GPT-4 had found.

- Went back to GPT-4 and asked to write a different version of the hook.

- Took the GPT-4 written code and asked Phind to explain it, which it did successfully (though I think its explanation was less detailed than GPT-4's explanation of the code written by Phind).

I will be testing this more over the next days. If this proves to be in the GPT-4 ballpark and the 70b weights are released, I will definitely replace my ChatGPT plus subscription with Phind Pro.

WuxiFingerHold · 2 years ago
Not an expert at all. But just wanted to let the creators know: I've been using Phind almost daily for some months now and it's been awesome. Whenever I accidentally do a web search I recognize what a game changer this is. (ChatGPT probably as well, but never used it.) Last week I was under pressure at work and I used it for stuff like: "How can i capture output from a command and print it line by line to the console with Rust", and must say that kind of time and energy savings are very significant.
sekai · 2 years ago
Don't even remember when I opened Stack Overflow, won't miss that condescending place.
the_duke · 2 years ago
Just wait for people to stop using SO, at which point the LLMs won't have a high quality training set for new questions, so you won't get good answers from the LLMs anymore...
dcow · 2 years ago
SO: the community that optimized for moderator satisfaction over enduser utility.
rushingcreek · 2 years ago
Thank you :)
dalmo3 · 2 years ago
My work banned any AI tool, and... After using Phind for months, going back to Google/SO is just crippling.
throwup238 · 2 years ago
Get kagi and use the !code bang

Then you're not using AI, you're using your search engine. wink wink

rushingcreek · 2 years ago
Phind founder here. You can try the model for free, without a login, by selecting Phind-70B from the homepage: https://phind.com.
Fervicus · 2 years ago
I don't use LLMs a lot, maybe once a week or so. But I always pick Phind as my first choice because it's not behind a login and I can use it without giving my phone number. Hopefully you'll keep it that way!
HKH2 · 2 years ago
https://labs.perplexity.ai is the same and it loads much faster than Phind.
worldsayshi · 2 years ago
I don't see how they could. They need to finance it at some point?
bee_rider · 2 years ago
Important and hard-hitting question from me: have you ever considered calling yourself the Phinder or the Phiounder?
bbor · 2 years ago
Phindational models, phintech, Phinterest, phinder… it might be the best startup name of all time. Hell, startup a password manager and call it Phinders’ Keeper.
fragmede · 2 years ago
Find Phounder
ComputerGuru · 2 years ago
And here I was wondering why this service was called pee-hind!
Zacharias030 · 2 years ago
or the PhiTO / PhiEO
carbocation · 2 years ago
It seems unexpected that other people can edit a link to a Phind chat just by getting the URL. It means that if you share a URL with someone, they can change your results: https://www.phind.com/search?cache=k56i132ekpg43zdc7j5z1h1x
goldemerald · 2 years ago
Very nice. I've been working with GPT4 since it released, and I tried some of my coding tasks from today with Phind-70B. The speed, conciseness, and accuracy are very impressive. Subjectively, the answers it gives just feel better than GPT4, I'm definitely gonna give pro a try this month.
visarga · 2 years ago
I prefer Phind's web search with LLM to both Google search and GPT-4. I have switched my default search engine, only using Google for finding sites, not for finding information anymore.

GPT-4 might be a better LLM but its search capability is worse, sometimes sends really stupid search keywords that are clearly not good enough.

declaredapple · 2 years ago
Any chances of an API?

And are there plans to release any more weights? Perhaps one or two revisions behind your latest ones?

parineum · 2 years ago
Ask phind to make you one that screen scrapes
zestyping · 2 years ago
I tried asking "What is the size of Phind-70B's context window?" and it couldn't answer the question. Strangely, it immediately found the page with the answer (https://www.phind.com/blog/introducing-phind-70b) but refused to acknowledge that the answer was there. I tried asking several ways. It even quoted the exact answer in the displayed snippet, but still said there was no answer!

Here are a couple screenshots:

https://imgur.com/a/u7iKOyw

https://imgur.com/a/aHAto5H

And here's the link to the whole conversation:

https://www.phind.com/search?cache=zlaksmzkm0h5cpx8l95n62tl

Why is this happening? Does it generally have difficulty with reading web pages, or is there something strange about this particular question?

airgapstopgap · 2 years ago
Since you're here: have you considered moving to other, better generalist base models in the future? Particularly Deepseek or Mixtrals. Natural language foundation is important for reasoning. Codellama is very much a compromise, it has lost some NLP abilities from continued pretraining on code.
shrubble · 2 years ago
I tried a question about Snobol4 and was impressed with what it said (it couldn't provide an exact example due to paucity of examples). When testing more mainstream languages I have found it very helpful.
bobbyi · 2 years ago
I'm selecting 70B and it is coming back with "Answer | Phind-34B Model".

I'm not sure if it's really using the 34B model or if the UI is wrong about which one it used

anter · 2 years ago
You have to click on the "Chat" option at the top left corner, then it'll use the 70B model. I got stuck on that too til I figured that out.
rushingcreek · 2 years ago
Please try logging in in that case, you will still get your 10 free uses.
brainless · 2 years ago
Hello Michael, lovely to see this, congrats. Do you already have an API? I could not see it on the site. If not, then do you know around when we can expect it? I am building a desktop BI app with hosted and local LLMs (need schema inference and text to SQL). Would be nice to have Phind as an option for users. Thanks
petesergeant · 2 years ago
This is good stuff, congrats. Took a little detour, but GPT-4 does too (https://www.phind.com/agent?cache=clsxw1mru0033l908mojpvb3b)
robbomacrae · 2 years ago
Why do none of the graphs show the speed difference? That seems to be your biggest advantage and the subject line...
browningstreet · 2 years ago
Hmm, when I try I see this in the dropdown:

0 Phind-70B uses left

And I've never made any selection there.

rushingcreek · 2 years ago
I'd suggest logging in in that case -- you will still get your free uses. The Phind-70B counter for non-logged in users has carried over from when we offered GPT-4 uses without a login. If you've already consumed those uses, you'll need to log in to use Phind-70B.
justaj · 2 years ago
Are you considering adding more non-US payment methods for Phind Pro?
forevernoob · 2 years ago
For sure this. I've recently found out that you can only pay using credit card, US bank account or Cash App.
coder1001 · 2 years ago
API on the horizon?
acdanger · 2 years ago
Hi, when I try to use the 70B model from the homepage, the response indicates that it's using the 34B model.
rushingcreek · 2 years ago
Please try logging in in that case. You will get 10 free daily 70B uses.

shafiemukhre · 2 years ago
Awesome update!

I have been using Phind almost daily for the past 3-4 weeks and the code it produces is pretty good and it is runnable on the first try more often compared to ChatGPT. Most of the time the answer is somewhat accurate and points me in the right direction.

ChatGPT (with GPT 4) has been slow af for me for the past 2+ months but I like studying a topic using ChatGPT, it is more verbose and explanatory when explaining things to you.

Maybe a purpose-built dedicated AI model is the right path. A model that does well in fixing bugs, writing feature code, and producing accurate code will not be a good tool for conversational studying. And vice versa.

Also, I don't like that Phind is not handling the follow-up question that well when there are multiple kinds of questions within the same thread. ChatGPT is good at this.

rushingcreek · 2 years ago
Thanks for the feedback! Have you tried setting a custom answer profile at https://phind.com/profile?

You can tell it to be more explanatory for certain topics.

shafiemukhre · 2 years ago
I haven't actually because Phind is working for me so far whenever I have code-related questions or when I need to refactor my code. TIL that I can customize the answer style preference, will give it a try!
jamesponddotco · 2 years ago
I'm impressed with the speed, really impressed, but not so much with the quality of the responses. This is a prompt I usually try with new LLMs:

> Acting as an expert Go developer, write a RoundTripper that retries failed HTTP requests, both GET and POST ones.

GPT-4 takes a few tries but usually takes the POST part into account, saving the body for new retries and whatnot. Phind on the other hand, in the two or three times I tried, ignores the POST part and focuses on GET only.

Maybe that problem is just too hard for LLMs? Or the prompt sucks? I'll see how it handles other things since I still have a few tries left.

shapenamer · 2 years ago
I'm a human and I don't have the slightest idea what you're asking for.
Powdering7082 · 2 years ago
Do you use Go? It makes sense to me
rushingcreek · 2 years ago
Thanks, can you send the cached link please? I'd also suggest trying Chat mode for questions like this, which are unlikely to benefit from an internet search.

Just tried your query now and it seemed to work well -- what are your thoughts?

https://www.phind.com/search?cache=tvyrul1spovzcpwtd8phgegj

jamesponddotco · 2 years ago
Here you go:

https://www.phind.com/search?cache=k56i132ekpg43zdc7j5z1h1x

I'll give chat mode a try. Didn't see that it existed until now.

EDIT

Chat mode didn't do much better:

https://www.phind.com/agent?cache=clsxpl4t80002l008v3vjqw5j

For the record, this is the interface I asked it to implement:

https://pkg.go.dev/net/http#RoundTripper

coder543 · 2 years ago
“RoadTripper”? Or “RoundTripper”?
jamesponddotco · 2 years ago
Oops, haha. Interesting that GPT-4 still got it right though.

Phind still forgot about POST, but at least now it got the interface right.

https://www.phind.com/search?cache=ipu8z1tb3bnn7nfgfibcix38

zettabomb · 2 years ago
A fun little challenge I like to give LLMs is to ask some basic logic puzzles, i.e. how can I measure 2 liters using a 3 liter and a 5 liter container? Usually if it's possible, they seem to do ok. When it's not possible, they produce a variety of wacky results. Phind-34B is rather amusing, and seems to get stuck in a loop: https://www.phind.com/agent?cache=clsxpravk0001la081cc9dl45
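
For reference, puzzles of this kind can be checked mechanically with a breadth-first search over the pair of fill levels. The sketch below is an illustration under the usual rules (fill a container, empty it, or pour one into the other until the source is empty or the destination is full); the name `measure` is hypothetical. It returns the shortest sequence of states, or None when the target is unreachable.

```python
from collections import deque


def measure(cap_a, cap_b, target):
    """Shortest list of (a, b) fill states that reaches target, or None."""
    start = (0, 0)
    prev = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        state = queue.popleft()
        a, b = state
        if target in state:
            # Rebuild the sequence of states from the predecessor links.
            path = []
            while state is not None:
                path.append(state)
                state = prev[state]
            return path[::-1]
        pour_ab = min(a, cap_b - b)  # amount that fits when pouring A into B
        pour_ba = min(b, cap_a - a)  # amount that fits when pouring B into A
        for nxt in (
            (cap_a, b), (a, cap_b),      # fill A, fill B
            (0, b), (a, 0),              # empty A, empty B
            (a - pour_ab, b + pour_ab),  # pour A into B
            (a + pour_ba, b - pour_ba),  # pour B into A
        ):
            if nxt not in prev:
                prev[nxt] = state
                queue.append(nxt)
    return None


print(measure(3, 5, 2))  # 3-litre and 5-litre containers, target 2
print(measure(4, 6, 5))  # every reachable amount is even, so 5 is impossible
```

The unsolvable cases follow a known rule: a target is reachable only if it is no larger than the bigger container and is a multiple of the greatest common divisor of the two capacities.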
hobabaObama · 2 years ago
I tested this prompt in various LLMs

1. phind was by far the best - gave me solution in just 2 steps

2. Grok was second best - it did arrive at the solution but with an additional nonsense step. But the solution was correct.

3. To my surprise GPT-4 could not solve the prompt and in fact gave a wrong answer in 4 steps - "Now you should have exactly 4 liters in the 5-liter container." which is not what I asked

4. As expected Gemini pro was the worst. It asks me to pour completely filled up 3L container into 5L and then you will be left with 2L in 3L container.. LOL that does not even make sense.

thelittleone · 2 years ago
These are interesting tests. I wonder how far we are away from AIs solving these (the ones that have no solution) without any special programming to teach them how.
