Claude 3.5 Sonnet - Readit News

Opus remained better than GPT for me, even after the release of GPT-4o. VERY happy to see an even further improvement beyond that, Claude is a terrific product and given the news that GPT-5 only began its training several weeks ago I don't see any situation where Anthropic is dethroned in the near term. There are only two parts of Anthropic's offering I'm not a fan of:

- Lack of conversation sharing: I had a conversation with Claude where I asked it to reverse engineer some assembly code and it did it perfectly on the first try. I was stunned, GPT had failed for days. I wanted to share the conversation with others but there's no way provided like GPT, and no way to even print the conversation because it cuts off on the browser (tested on Firefox).

- No Android app. They're working on this but for now, there's only an iOS app. No expected ETA shared, I've been on the waitlist.

I feel like both of these are relatively basic feature requests for a company of Anthropic's size, yet it has been months with no solution in sight. I love the models, please give me a better way of accessing them.

sk11001 · 2 years ago

Both GPT-4 and 4o have been completely useless for coding in the past couple of weeks for me - constant errors, and not just your typical LLM inaccuracies but incapable of producing a few lines of self-consistent code e.g. defines variables foo on one line and refers to it as bar on the next, or it misspells it as foox.

labrador · 2 years ago

Waht language? Because I'm guessing they work well for languages with a large amount of training data like Python (in my experience), less well for less used languages like Zig or Clojure (haven't tried them but that's my theory)

esafak · 2 years ago

For me it has been very repetitious despite my instruction to the contrary.

Zetaphor · 2 years ago

I've been experiencing bizarre typos and misspellings that I've come to describe as the model being drunk. Things like it writing peremeter instead of parameter

kake25 · 2 years ago

The level of misspelling is insane at the moment. It does it almost 50%+ of the times. I just started using claude 3.5 and the difference is night and day.

ipsum2 · 2 years ago

It's the same model though. Maybe your perception has changed.

Alifatisk · 2 years ago

> I had a conversation with Claude where I asked it to reverse engineer some assembly code and it did it perfectly on the first try. I was stunned

I share the same experience with you but with Claude 3 Sonnet. I can’t count how many times I’ve shared some code with Claude with barely any hope because other GPTs failed aswell, yet, Claude surprised me and performed the task with success.

I’ve actually reached to the point that I expressed my gratitude to Claude because of how well it performs on coding tasks and other tasks in general. I don’t know what Anthropic did, but something did they right.

Being able to handle large amounts of tokens, “understand” and perform tasks on it & spit out large amounts of data back with barely any cut-offs (unlike Gemini) has made me feel like Claude is at the moment the best option.

SubiculumCode · 2 years ago

I do wonder if GPT quality fluctuates seasonally, or with electricity costs, in an engineering effort to balance costs with performance.

I agree on all your points, but would like to emphasize that I really do enjoy the voice input voice output thing that chatgpt's app has. Its not how I use it when working, but when commuting, a lot of times, I'll turn on the the chatgpt app and have a conversation with it exploring ideas related to work or side projects. Its better than NPR, and I can't listen to the '3d6 Down the Line' podcast everyday, just once a week.

I've been subscribed to PHind, which is a decent service allowing access to their models, chatgpt 4 turbo and o, and claudes. Its been incredibly useful, especially with their search integration. Unfortunately, while chatgpt can be used 500 times a day, Claude is only 10, although I guess it goes into an API like payment mode after that on top of subscription.

I sure wish I'd buckle down and calculate my usage to really get an idea of whether subscription is cheaper or more expensive for me compared to API.

lxgr · 2 years ago

Short of switching between models (which at least OpenAI definitely does for free customers, but I believe they always indicate it), how would that work? Different quantizations?

henry_viii · 2 years ago

> Lack of conversation sharing... [there is] no way to even print the conversation because it cuts off on the browser (tested on Firefox).

Until they make conversations shareable, in the meantime you can print the whole page in Chrome by:

- going to Developer Tools (Ctrl + Shift + I)

- opening the Command Palette (Ctrl + Shift + P)

- searching for 'screenshot'

- selecting Capture full size screenshot

coreylane · 2 years ago

I recently released Slackrock [https://github.com/coreylane/slackrock] that you may find helpful, it's a Slack chat app that can access several FMs (including Claude 3.5) via AWS Bedrock. Responses can be easily shared with others by inviting them to your channels, and Slack has an Android app. It doesn't support attachments (yet) but I'm working on it!

natsucks · 2 years ago

cool!

wonderfuly · 2 years ago

> Lack of conversation sharing

You can use my product https://ChatHub.gg which supports dozens of chatbots including Claude and can share conversations from any of them.

trungdq88 · 2 years ago

If you have an API key, using Opus with a 3rd party UI like typingmind.com solves all of the problems you mentioned (disclaimer: I'm the app developer)

lannisterstark · 2 years ago

I use LibreChat for this as self hosted UI. Works awesome.

mac-attack · 2 years ago

I'm sticking w/ Claude for the foreseeable future as they seem less slimy than OpenAI/Microsoft/Google so far and care about safety.

I'm in the same boat waiting for an Android app btw. One other feature that I'm hoping they catch up to others on is a permanent context window so that I can get Claude to stop speaking so formally all the time

joshstrange · 2 years ago

To each their own, but I still prefer ChatGPT. The UI for Claude is terrible in my opinion.

I had subscriptions for both and I would fire off questions to both of them and see which one I liked more and I consistently liked the ChatGPT ones more. I canceled my subscription last week for Claude. I am super happy that Anthropic continues to push the envelope on this and I hope to re-subscribe to them in the future.

spidersouris · 2 years ago

If it's really only the UI that's bothering you, why not use a web UI such as Open WebUI?

Powdering7082 · 2 years ago

> GPT-5 only began its training several weeks ago

Source?

netsec_burn · 2 years ago

https://openai.com/index/openai-board-forms-safety-and-secur... (May 28th)

> OpenAI has recently begun training its next frontier model and we anticipate the resulting systems to bring us to the next level of capabilities on our path to AGI.

stuckinhell · 2 years ago

I've had way better success with GPT-4o than claude. I wonder why

netsec_burn · 2 years ago

Have you tried 3 Opus or 3.5 Sonnet? Are you using it for programming, or something else?

simonw · 2 years ago

Personal prompting style, I imagine,

viraptor · 2 years ago

On the plus side, at least ChatBoost supports both openai and claude API. But for this specific model it seems to be broken... I hope that gets noticed and fixed soon.

gotrythis · 2 years ago

What I understand is that it's GPT 6 that just went into training, and that GPT 5 is complete and being delayed until after the U.S. election.

PaulWaldman · 2 years ago

And after GPT-5's release, what would be the plan for subsequent elections? This seems to be a temporary play in delaying AI regulation if public sentiment further becomes that AI can have a strong influence in the elections.

viraptor · 2 years ago

It there any online confirmation of this, that's more than speculation?

r2_pilot · 2 years ago

(assuming you are correct) It says something about how a company feels about the safety of their products when they feel like they should time the releases based on political events.

Deleted Comment

modeless · 2 years ago

This is pure speculation, right?

sva_ · 2 years ago

Source: trust me bro

ilaksh · 2 years ago

I also believe that gpt-4o was originally called gpt-5. If you look at the image generation on their website from gpt-4o which has not been released, I believe that along with the voice caused Ilya to declare mission accomplished (AGI) and that is why there was a coup. The coup failed because no one wanted to wrap up the company or change the way it operated because they would lose a lot of money.

The reason the name was changed was because there was a big public scare about gpt-5 taking over and so Altman had to promise not to release gpt-5 soon. So they changed the name to gpt-4o (omni). Which is A) obviously dramatically a different architecture, B) a huge step up in capabilities (most still unreleased) C) very general purpose. Because of A) and B), this should obviously be a new major version (5).

Yes, this is speculation, but it's very obvious speculation to me. It's weird for me that most people not only don't share this view but seem to absolutely hate when I say it.

This is amazing - I far prefer the personality of Claude to GPT-4 series models. Also, with coding tasks, Claude-3-Opus and been far better for me vs gpt-4-turbo and gpt-4o both. Looking forward to giving it a spin.

Seems like it's doing better than GPT-4o in most benchmarks though I'd like to see if its speed is comparable or not. Also, eagerly awaiting the LMSYS blind comparison results!

3l3c7r1c · 2 years ago

For coding Claude Opus-3 provides far more mature code and good at finding bugs (when present with the error code) compared to GPT-4-Turbo and GPT-4o. Last few days I've been using both for some python+pyspark project. Not sure how come in their comparison GPT-4o is showing that good!

prasoonds · 2 years ago

100% agree here. Claude is especially good at larger context sizes and retains coherence way longer than GPT-4 series of models

orbital-decay · 2 years ago

>I far prefer the personality of Claude to GPT-4 series models.

This new Sonnet seems way less human-like than even old Sonnet, let alone Opus. It's practically devoid of character. It's smart, though.

eigenvalue · 2 years ago

I find that it varies between language and task whether GPT-4o or Claude3 Opus will be better. I usually try both now.

icelancer · 2 years ago

I agree. There are some corner cases that GPT-4o reliably fails that Claude does well in, and vice versa. GPT-4 and GPT-4o consistently generates very poor cv2 Python code for human face/boundary box work - it's a strange reproducible failure in my experience.

snthpy · 2 years ago

I'm surprised there isn't a single mention of Gemini 1.5 Pro. I've been using it for about a month because it came for free with my Google setup and I've been pretty happy. Not for coding but mostly for business tasks like writing minutes from transcripts, summarizing long legal documents,... and the long context length has been awesome. It also conveniently integrates with the rest of my google setup like Drive.

IIRC it also ranked only behind gpt4o on benchmarks.

tkgally · 2 years ago

I've also had good results with Gemini 1.5 Pro for some tasks. Just yesterday, it produced very good analysis and comments based on a 200-page document. ChatGPT 4o was much weaker, and the document was too large for Claude 3 Opus. (This was a few hours before 3.5 was released.)

sunaookami · 2 years ago

Gemini in general is terrible. Way too many mistakes. If you use it via the API it repeats itself constantly. At least it's the model that is the easiest to jailbreak and will happiliy give you a tutorial on how to make a bomb if you ask politely ;) Very ironic considering how Google emphasizes "safety".

nsingh2 · 2 years ago

GPT4(o) is quite good at advanced math, it's been helpful when I was learning differential geometry. Not sure how Claude compares though, this 3.5 release has tempted me to try it out. Also, it's finally available in Canada!

lanstin · 2 years ago

Claude 3 was much better than GPT4 for functional analysis and abstract algebra (first year classes).

stuckinhell · 2 years ago

I'm honestly shocked people are saying this. I use both and GPT-4 is usually better.

What kind of coding tasks is Claude 3 opus doing for people ?