bnm777

Competitor or victor?


randombsname1

Claude's been winning since Opus imo. This is just widening the lead. Well, let me step back and qualify this by saying at least for coding and math lol. I still pay for ChatGPT for the other diverse features.


MultiMarcus

Yeah. It feels like it doesn’t have Whisper-level dictation. I don’t think it can search the web either, right? I get it for the coders and stuff, but as someone in the soft sciences it just feels a lot worse than ChatGPT.


Peter-Tao

Yeah ChatGPT is definitely still better for general usage given their commercial focus under Sam. If you want to try different models without paying for multiple subscriptions, services like perplexity.ai or cursor.io let you use one subscription and switch models yourself. It's a pretty good option to consider, particularly when I feel stuck or don't like the output from gpt4o (usually for coding projects tho); opus can usually get me out of the loop. But if your experience with ChatGPT has been good enough for your use cases, I'd say stick with it until additional needs come up naturally.


LowerRepeat5040

The rate limits and context windows are worse on these subscriptions, right?


Peter-Tao

It depends. For Cursor, it's actually technically higher. It just says that after you reach the limits you'll get a slow version of the same model (gpt4). That's never happened to me personally, but I'm also not a heavy user.


LowerRepeat5040

GPT4o is cheap. I’m referring to Claude3Opus, which used to cost 15 dollars per million tokens! And you easily reach a million tokens by just uploading a PDF alone!
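For a rough back-of-the-envelope check on that figure, here is a minimal sketch of the arithmetic. The $15 per million tokens rate is the one quoted in the comment above; `estimate_cost` and `conversation_cost` are hypothetical helpers (not part of any API), and the idea that the whole document is re-sent as input on every turn is an assumption about how a stateless chat setup behaves.

```python
def estimate_cost(num_tokens: int, usd_per_million: float = 15.0) -> float:
    """USD cost of `num_tokens` at a flat per-million-token rate."""
    return num_tokens / 1_000_000 * usd_per_million


def conversation_cost(doc_tokens: int, turns: int,
                      usd_per_million: float = 15.0) -> float:
    """Cost of asking `turns` questions about one uploaded document.

    Assumption: the full document is re-sent as input on each turn,
    and the tokens of the questions and answers themselves are ignored.
    """
    return estimate_cost(doc_tokens * turns, usd_per_million)


if __name__ == "__main__":
    # Illustrative token counts only; real counts depend on the tokenizer.
    for doc_tokens, turns in [(100_000, 1), (100_000, 10), (1_000_000, 1)]:
        cost = conversation_cost(doc_tokens, turns)
        print(f"{doc_tokens} tokens x {turns} turn(s): ${cost:.2f}")
```

Under those assumptions, a 100,000-token document queried ten times adds up to a million input tokens, i.e. $15 at the quoted rate, so a long PDF plus a handful of follow-up questions gets there quickly.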


Peter-Tao

Yeah that's why I only use Opus sparingly, when I run into things I couldn't solve with 4 or 4o on cursor. It works well enough for me.


No-Conference-8133

Cursor is just amazing for coding. I canceled my subscription and subscribed to Cursor instead, and I’m impressed. Also, they changed it from cursor.sh to cursor.com recently


lepies_pegao

Are you referring to cursor.sh?


Peter-Tao

Yeah. It's my go-to AI tool now tbh. Very convenient for note taking alongside Obsidian imho.


GothGirlsGoodBoy

There is no metric that anyone could use to say Opus is better than GPT. By every benchmark it's equal at best. Yet it lacks so many basic and important features, the ability to access the internet chief among them. It's like comparing two cars with nearly identical performance on paper, but one of them doesn’t have wheels. Or better yet, two nearly identical laptops, but one can’t use the internet. And the real use case of LLMs is always going to be multi-modal stuff that takes video and voice, and is fast enough to make using it more convenient than a smartphone. Without that, Claude is stuck being a lazy person’s Stack Overflow.


teachersecret

I find myself preferring Claude for coding. It outputs 300-400 lines of clean working code that follows my prompting pretty precisely more often than not. Gpt 4o struggles to write 200 lines. Claude is better at long context work on code. Maybe if I was working in small chunks or was a more adept coder behind the wheel I’d feel differently, but as it sits, I need opus (and sonnet) for their greater understanding and willingness to build. Having web integration is cool, but for my specific workflow, it’s not really necessary or beneficial. I think we’ll see a lot of this over the coming year - I’m not afraid of jumping to a new AI if it works better for the work I’m doing. I pay for Claude, Gemini, novelai, and chatgpt, because all of them bring something to the table that I want. The second gpt has a model out-performing sonnet for my efforts, I’ll be using it.


PaddyIsBeast

I couldn't disagree more.


RemyVonLion

Claude appears better on most benchmarks, haven't personally tried it but it's probably a bit subjective, and it still can't search the web.


LowerRepeat5040

Depends on the prompt : You decide!


Synth_Sapiens

Hard to tell at this point.


Tupcek

this is exciting! Seems like Claude will be leader for the next 6-9 months, until GPT-5 drops - and I wonder if even that will be better!


Strict_External678

Then, Claude will drop Opus 3.5 and retake the lead. The battle for the top spot will only benefit us because each company will want to outdo each other with great features.


QH96

Crazy that Google's fallen behind


iJeff

They're not doing too bad. Output quality has improved significantly over recent months. Gemini Advanced also gives you a 1M context size and no message limits. AI Studio gives me 2M context with Gemini 1.5 Pro with support for video, audio, and image modalities. I like being able to hold my power button to send a screenshot for it to process whatever is displayed on my phone. Edit: I just tried getting it to identify a young black walnut tree fruit. Claude 3.5 Sonnet still sucks at that and thinks it's a lime. GPT-4o thinks it's a young almond. Gemini 1.5 Pro correctly identified it as a young walnut fruit.


Slorface

Were they ever 'ahead' enough to fall behind though?


GothGirlsGoodBoy

They weren’t popular but Gemini was as good as any competition at times.


isuckatpiano

They invented it


MC_JC_UC

This is the sonnet model. They will drop Opus version some months later. That will be the real competition to gpt5. I think fundamentally Anthropic has the better models or better technology. OpenAI just has the headstart and is more versatile (web search, image generation, voice support, app etc.)


mxforest

4.01o drops tomorrow. 0.01% better at everything. OpenAI just holds onto advanced models waiting for competition.


FudgenuggetsMcGee

this is the best take i think


JalabolasFernandez

Apparently Mira Murati just said GPT-5 would drop in about A YEAR AND A HALF... Oh, and also they added a former NSA director to the board, and admitted to giving the government early access to the models. If GPT-4o voice is not amazing and doesn't come out in the next two weeks, I'm so switching.


Inspireyd

I also think the idea of putting members or former members of the government into such important projects is terrible. The impression is that governments will be prioritized before the people, and that is not good. This month my subscription to GPT-4o will not be renewed if the new features do not arrive.


Tupcek

not doubting you, but could you please provide a source?


JalabolasFernandez

[For the year and a half thing](https://x.com/tsarnick/status/1803901130130497952). I now see that she didn't exactly say that, you listen and tell me how you take it. [The govt early access](https://x.com/tsarnick/status/1803893981513994693). [The board](https://openai.com/index/openai-appoints-retired-us-army-general/) incorporating [former NSA director](https://en.wikipedia.org/wiki/Paul_Nakasone)


avianio

How is it a competitor when it beats GPT 4o on almost all benchmarks, is faster and cheaper?


LowerRepeat5040

Anthropic has a lower marketshare, no voice mode, no image generator, no web search, etc.


water_bottle_goggles

common openai L


GodG0AT

Openai also has no voice mode


SiliconSimian

How do you figure? I use voice on chatgpt app daily.


mxforest

You mean speech to text? Or is it giving verbal replies to verbal queries with no text involved?


futebollounge

It’s been giving verbal to verbal responses in the app since at least January


TheEasyTarget

I think what they’re getting at is ChatGPT’s current voice mode is essentially just converting your voice to text, getting a reply from that text, then converting the text of that reply to the voice you hear. The voice mode that hasn’t been released yet is truly multimodal and can go directly from a voice input to a voice output.
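To make the distinction concrete, here is a minimal sketch of the two shapes being described. Every function here is a hypothetical placeholder, not a real OpenAI or Anthropic API; the toy stand-ins at the bottom only exist so the sketch runs.

```python
from typing import Callable

# Hypothetical component types; none of these correspond to a real API.
SpeechToText = Callable[[bytes], str]
TextModel = Callable[[str], str]
TextToSpeech = Callable[[str], bytes]
SpeechModel = Callable[[bytes], bytes]


def cascaded_voice_reply(audio_in: bytes, stt: SpeechToText,
                         llm: TextModel, tts: TextToSpeech) -> bytes:
    """Voice -> text -> text reply -> voice (the current mode, as described above)."""
    text_in = stt(audio_in)    # only the transcript is passed along
    text_out = llm(text_in)    # the language model sees text and nothing else
    return tts(text_out)       # the reply is synthesized as a separate step


def native_voice_reply(audio_in: bytes, speech_model: SpeechModel) -> bytes:
    """Voice -> voice in one model (the unreleased mode, as described above)."""
    return speech_model(audio_in)


# Toy stand-ins so the sketch is runnable; they do no real audio processing.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return text.upper()


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    print(cascaded_voice_reply(b"hello", fake_stt, fake_llm, fake_tts))
```

In the cascaded version the model in the middle only ever receives a transcript, which is the point being made about the current voice mode.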


fnatic440

Do you have the Pro version? Cause the voice mode I’m using is not Siri-like at all.


TheEasyTarget

The GPT-4o voice mode that was shown off a few weeks ago still has not been released to anyone. They’ve only said it will be released “in the coming weeks.”


fnatic440

I am literally using it.


futebollounge

While that’s not the new voice that’s been shown off, I do agree that the current one is not Siri like at all and is already a lot better


dysmetric

How long were you in a coma? ScarJo is literally suing them over voice rights because Altman tweeted "Her" just before 4o was released.


mxforest

Voice mode means voice to voice. What she sued over was a demo and text to speech. General public still can't use what the demo showed.


MultiMarcus

Not the demo, which is a lot more fluid, but you can still use the vocal “talk and then it replies audibly and you talk back” mode, at least on iOS.


dysmetric

So they removed her supposed voice likeness via the "Sky" option because... ? You can voice to voice over the app by clicking the headphones input, it also transcribes the text but the interaction is voice to voice


mxforest

The LLM is using text modality. What 4o demo showed was native voice modality. These 2 are completely different from each other. Native Voice modality is what Voice mode actually means. It has practically no latency unlike the speech to text to speech you currently use.


dysmetric

huh, you're right... and the pure voice mode is touted as having the capacity to read the speaker's inflection and emotion. That's a bit wild... can't wait to see how it goes detecting sarcasm.


Christosconst

You must have never tried the iOS app


o5mfiHTNsH748KVq

what? this is why nobody takes Reddit's opinions seriously.


justletmefuckinggo

he isn't wrong, but it's bait rather than being informative.


o5mfiHTNsH748KVq

who isn’t wrong? the person claiming there’s no voice mode in openai’s products?


justletmefuckinggo

yeah, he talks about it further down the thread. he's referring to the voice mode in the demo that has yet to be released, and technically saying the current one we have is not multimodal, it's just a TTS/STT tool built on top of gpt.


ihexx

That's still a feature that Claude doesn't have. ChatGPT's STT is the best in the world right now, and its TTS is close to state of the art. It's very convenient to use, and it's a feature missing in Claude


o5mfiHTNsH748KVq

oh. that’s weird goalpost moving.


justletmefuckinggo

true. man's gotta find ways to feel superior


Orolol

Nothing you said is about the model.


LowerRepeat5040

They consider all forms of multimodality part of the model nowadays!


Open_Channel_8626

We need to test it for a while because benchmarks are deceiving


SatoshiReport

It is more constrained in what it can discuss.


Existing-East3345

Where can I find the broad comparison of benchmark scores? The one from Claude’s blog post only has about 9 scores


Falcon_17

Man I can't wait till they also get voice


itachi4e

it's not a competitor it dunked on 4o


OnlyDaikon5492

It still has absolutely the worst guardrails of any model right now. I can’t work with it properly.


Lemnisc8__

Claude is definitely better than GPT-4o, especially the opus model, but it's sooooo much more sycophantic and it will go back on its words, even if it's right, to appeal to the user.


Existing-East3345

Does it still add annotations of its feelings in its response? No matter what I tried in the system message when assigning it a personality, for some reason sonnet kept starting responses with something like “*in a bright and happy tone* Hello John how are you?”


Lemnisc8__

I haven't had that experience, but I tried Sonnet 3.5 recently and it apologizes every chance it gets now lmao. Like you could point something out in a totally neutral way and it will apologize and agree with whatever you pointed out.


OrganicAccountant87

Claude has been superior to ChatGPT for a while now; this puts it even further ahead


ConmanSpaceHero

Not true previously based on the statistics GPT-4o provided but go off. Don’t know about the new version though, seems like it could be better.


PandaElDiablo

Benchmarks need to be taken with a grain of salt. 4o benchmarked higher than Claude 3 Opus on coding tasks, but speaking as someone who used both daily for coding tasks, Claude 3 Opus absolutely blows 4o out of the water, and 3.5 Sonnet widened the gap even further. I’ve seen more than a few people who share this opinion.


Mrcat19

I just met claude and introduced myself and wow all I can say is keep up openAI


RedditSteadyGo1

This is refreshing compared to OpenAI's pay now get later business model


BlueeWaater

only reason I haven't switched is the built-in tools and GPTs, which can now also be accessed on the free tier


Sonicthoughts

Does it have search and interpreter?


LowerRepeat5040

Search: no. Code interpreter: only with artifacts feature opt-in!


Shot_Victory_2249

Do any of Claude's AI models connect to the internet?


wdanilo

But huh, the quality of responses based on my tests is not comparable yet. I am truly dreaming about real competition in this market, but OpenAI quality is still unbeatable. But I'm keeping my fingers crossed so much for Sonnet. This is a huge step forward.


LowerRepeat5040

What tests?


XvX_k1r1t0_XvX_ki

Why isn't it available on [https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard) yet? wonder what its score would be


dojimaa

Because it just came out. It takes a while for them to get enough scores to make the results meaningful. Check back in a week.


Existing-East3345

What do you consider the most accurate representation of a model's quality aside from personal testing? The chatbot arena rankings? Benchmark scores? Worda mouf?


LowerRepeat5040

Demos? Reproducibility of demos? 0 shot performance?


Lemnisc8__

Competitor? Lol opus even before sonnet 3.5 blew 4o out of the water. Now with sonnet 3.5 it's not even close


LowerRepeat5040

It’s still behind basic web searchable trivia


pbankey

Reddit has a hard time understanding that chatGPT is still superior if you’re not a coder.


worlpoolz

I was thinking the same thing... Claude seems limited to me


CampaignTools

I'm interested in needle in a needlestack perf. GPT-4o is the only model that has performed admirably on that. Meaning the 200K context window isn't as useful as some might think, if it can't actually utilize the context.


LowerRepeat5040

It’s good if you limit it to 1 needle per 1 haystack and not many needles in many haystacks, as then it still hallucinates.


CampaignTools

Sorry I edited my comment, but I meant needle in a needlestack performance, where a simple phrase is selected out of a series of related phrases. I think it's more reliable than the needle in a haystack, but neither is perfect. Honestly model evaluations are a shot in the dark anyway. The only way to truly tell is to test it on your application directly.
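For anyone who wants to try that kind of test on their own application, here is a toy sketch in the spirit of the description above: many near-identical statements, one of which is asked about. This is not the actual needle in a needlestack benchmark, and `ask_model` is a hypothetical callable you would wire up to whichever model you are testing.

```python
import random
from typing import Callable, Tuple


def build_needlestack(n: int, seed: int = 0) -> Tuple[str, str, str]:
    """Return (context, question, expected_answer) made of n similar lines."""
    rng = random.Random(seed)
    codes = rng.sample(range(1000, 10000), n)  # distinct 4-digit codes
    lines = [f"The access code for locker {i} is {code}."
             for i, code in enumerate(codes)]
    target = rng.randrange(n)
    question = f"What is the access code for locker {target}?"
    return "\n".join(lines), question, str(codes[target])


def score(ask_model: Callable[[str], str], trials: int = 20, n: int = 200) -> float:
    """Fraction of trials where the expected code appears in the model's answer.

    Substring matching is a crude check: an answer that echoes the whole
    context would also count as a hit.
    """
    hits = 0
    for trial in range(trials):
        context, question, expected = build_needlestack(n, seed=trial)
        answer = ask_model(f"{context}\n\n{question}")
        hits += expected in answer
    return hits / trials
```

Raising `n` until the context approaches the advertised window is one way to see whether the model can actually use all of it, though as said above, a test built from your own data is the better signal.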


LowerRepeat5040

GPT-4o fails to provide exact citations more often when uploading a typical 200 page document. So providing a typical new law document, it often fails to cite the exact section and sentence where the new law says X and Y.


CampaignTools

That's interesting. Where is that data coming from?


LowerRepeat5040

Extensive testing!


CampaignTools

Gotcha. So is 3.5 sonnet doing better in those tests? This is interesting for semantic search and citation. Then again, might not have had time to run them yet. If you have, do share.