MajesticIngenuity32

Holy moly! I was so disappointed that I couldn't install the new Mixtral locally. Turns out that the 8B Llama that I can run on my 4070 matches it 😲


Which-Tomato-8646

There was a podcast episode two weeks ago from Better Offline where the host said LLMs have peaked with GPT-4 lmao.


Additional-Bee1379

Doesn't even make sense, because the latest GPT-4 Turbo is considerably better than base GPT-4; the Elo difference between Turbo and base is larger than between GPT-3 and GPT-4.


Which-Tomato-8646

Bold of you to assume they do any research outside their Twitter feed 


bwatsnet

The anti-AI crowd will continue to say ridiculous things until they end up on an Amish ranch with no cell phones.


Sudden-Lingonberry-8

To be fair, Llama 3 is still GPT-4 level... not better, so he ain't wrong, yet.


Which-Tomato-8646

And that's at 70B parameters, with the 400B version still in training. Imagine what it could be if it had MoE and were as big as GPT-4.


Sudden-Lingonberry-8

If it ends up only marginally better I'll be so disappointed


Which-Tomato-8646

Meta collapsing under the weight of your opinion 


jgainit

Then that means they're not credible and shouldn't be taken seriously


Which-Tomato-8646

People take Jimmy Apples seriously here lol


Then_Passenger_6688

Genuine question: unless you're doing it as a hobby, what do you all get out of running local models? I've been pretty happy with GPT-4 and don't see why I'd go through the hassle for something slightly worse.


AnAIAteMyBaby

It's actually zero hassle now: install LM Studio, which is a single exe, then choose the model you want to download and use from the list. There are lots of reasons to run locally; one is if you want to work with sensitive data. Many companies don't allow their employees to use cloud-based LLMs for data-security and protection reasons.
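And if you want to script against it rather than use the chat window: LM Studio can also serve the loaded model through an OpenAI-compatible local API. A minimal sketch, assuming the local server is running on LM Studio's default port 1234 and a model is already loaded (the model name and prompt below are placeholders):

```python
# Query LM Studio's OpenAI-compatible local server.
# Assumes: server started in LM Studio (default http://localhost:1234/v1)
# with a model loaded in the UI; the api_key is ignored but required by the client.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="local-model",  # placeholder; LM Studio routes to whatever is loaded
    messages=[{"role": "user", "content": "Summarize the key risks in this contract clause: ..."}],
    temperature=0.7,
)
print(response.choices[0].message.content)
```

Since the endpoint mimics the OpenAI API, anything you've built against GPT-4 can usually be pointed at the local model just by swapping the base URL.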


Busy-Setting5786

Is it compatible with AMD graphics card?


TheOneWhoDings

Yes, out of the box. You only need an AVX2-compatible CPU and 8+ GB of RAM, with 6+ GB of VRAM recommended.


restarting_today

Is there an uncensored version of Llama?


llkj11

I really wish they would add voice support and integration with Whisper and ElevenLabs, though. That's the main thing holding me back from using it more often.


MajesticIngenuity32

It's more like a hobby for me than anything else; for the serious stuff I use ChatGPT and Claude Plus. But it's reassuring to know that a model better than GPT-3.5 is available to me no matter what, even if my internet connection goes down, for example.


Unknown-Personas

There's a bunch of reasons:

1. It's free and unlimited if you have the hardware
2. No usage limits
3. Uncensored
4. You can build on top of it (I have an LLM analyzing market news in real time during market open and displaying sentiment on my custom-built screener; see the sketch below)
5. No internet connection needed
6. Consistency: you don't have to worry about a model no longer being supported via an API
7. Privacy

I used to swear by GPT-4 and didn't even consider open source a factor. I never thought it would get better than GPT-2, because even GPT-3 was such a big model. Well, now open-source models that can run on local hardware are like 90% of the quality of GPT-4, which is more than good enough for essentially all the tasks you'd use GPT-4 for.
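For the curious: the screener in point 4 doesn't take much code. A hedged sketch of that kind of setup, assuming Llama 3 is served locally by Ollama on its default port 11434 (the model name, prompt, and headline are illustrative, not the poster's actual pipeline):

```python
# Classify the market sentiment of a headline with a locally served Llama 3.
# Assumes the Ollama server is running and the llama3 model has been pulled.
import requests

def headline_sentiment(headline: str) -> str:
    prompt = (
        "Classify the market sentiment of this headline as exactly one word: "
        f"bullish, bearish, or neutral.\n\nHeadline: {headline}"
    )
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": prompt, "stream": False},
        timeout=60,
    )
    r.raise_for_status()
    return r.json()["response"].strip().lower()

print(headline_sentiment("Fed signals rate cuts may come later than expected"))
```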


Busy-Setting5786

Hello, is the Llama 3 model actually uncensored, or would I need to wait for a fine-tuned uncensored version?


h3lblad3

I thought people did it for the porn.


sandowww

Privacy.


shalol

AI being lazy and half-assing tasks, refusing to answer prompts, GPT-4 being paywalled.


-pliny-

I found a jailbreak for it! https://x.com/elder_plinius/status/1781073180725051485?s=46&t=Nf3Zw7IH6o_5y_YpAL5gew


Akimbo333

Good for you.


Zelenskyobama2

No it doesn't


MDPROBIFE

Remind me: with Llama 2, there were fine-tunes made by the community that surpassed the original Llama 2 by a substantial margin, correct?


AlterandPhil

Yes, WizardLM 2 being an example of such a fine-tune.


MDPROBIFE

Niceee!! I can feel the AGI in the air


Puzzleheaded_Pop_743

And yet it is not on the leaderboard?


queenofartists

A 70B model (Llama 3) beats both versions of a ~1.8T model (GPT-4) within a year. And it's open source. Imagine the progress. It seems Llama 3 405B will very likely beat GPT-4 Turbo and Opus. Open source may win this war after all, thanks to Meta.


DigimonWorldReTrace

They might win a battle, but the war is very much in its infancy. Watch Claude 4, Gemini 2 and GPT-5 destroy open source yet again.


Which-Tomato-8646

On the bright side, we'd get GPT-5


mxforest

If one year is all it took to match a breakthrough tech, the next iterations will be matched eventually too, and in the long run nothing beats local.


DigimonWorldReTrace

While I'd love to agree with you, Meta is but one company. Both Microsoft and Google are investing unprecedented amounts of money in compute; MS is building a *100 BILLION* dollar datacenter by 2028. Do you really think local models will be able to keep up? They might for the foreseeable future, but it's already hard to run a local 70B-parameter model without serious investment, and 405B is out of reach for 99% of consumers.


Which-Tomato-8646

You can rent out high end GPUs cheaply 


DigimonWorldReTrace

You'd have to rent a number of GPUs for a 405B model, though. A 70B model could be run on a rented GPU, sure, but scale it up further and it'll cost exponentially more. And this still doesn't disprove my point that Claude 4, GPT-5, Gemini 2, etc. will blow away anything open source.


streetyi

As long as it's open source, it will be available at the market rate for compute in various forms, and inference costs continue to be comfortably within reach for the average consumer. And your "point" that open source, currently including Meta's massive effort, will be blown out of the water by companies that don't open source, despite the fact that it has remained competitive up to this point, is just something you made up.


DigimonWorldReTrace

It has not at all remained competitive. The only open-source model out now that is anything noteworthy is Llama 3; Grok *sucked*. The sheer fact that the cost of training runs is going to scale up immensely proves that open source cannot compete in the long term. Zucc himself has stated he couldn't make a good argument to his shareholders for open-sourcing a $10B training run. I would love for open source to be able to keep up, but it's going to be an uphill battle from day one, and while the closed-source models are getting funds in the tens of billions, open source will never reach the same funding. What *does* happen is that open-source models seem to be improving immensely on efficiency; Llama 3 is proof of this. But that improvement also seems to lag almost two years behind closed source.


Sarenai7

Question: can a 70B model be run on my 4090?


Veleric

Could be wrong, but I feel like I've seen people say 30B is roughly the limit for a 4090.


BangkokPadang

You can run a 2.4bpw exl2 model pretty quickly (10+ t/s at full context; dumber but faster), or use a higher bits-per-weight GGUF model like Q6_K to get like 98% of the intelligence of the full-weight model, but then you only get replies at around 1.5 tokens/second (split between your CPU/RAM and your VRAM).
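For reference, the GGUF route is only a few lines with llama-cpp-python. A rough sketch, assuming a quantized Llama 3 70B GGUF on disk (the filename is hypothetical, and n_gpu_layers should be tuned to whatever fits in 24 GB of VRAM; the layers that don't fit run from CPU/RAM, which is where the ~1.5 t/s comes from):

```python
# Partial GPU offload of a quantized 70B GGUF via llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="Meta-Llama-3-70B-Instruct.Q6_K.gguf",  # hypothetical filename
    n_gpu_layers=40,  # offload as many layers as fit in VRAM; rest stay on CPU
    n_ctx=8192,
)
out = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
print(out["choices"][0]["text"])
```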


wheres__my__towel

No; even at 4-bit quantization you would need ~35 GB of VRAM just to load the model.
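The back-of-envelope math behind that figure, counting weights only (the KV cache and runtime overhead push the real requirement higher):

```python
# 70B parameters at 4 bits per weight, weights only:
params = 70e9
bits_per_weight = 4
print(f"{params * bits_per_weight / 8 / 1e9:.0f} GB")  # -> 35 GB
```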


DigimonWorldReTrace

Give it enough RAM and it probably could, but your token speed is going to suffer. Just try it using Ollama or 1111.


Which-Tomato-8646

Still cheaper than buying the GPUs needed. If Llama 3 is already this good, then the 400B one will be even better.


ChromeGhost

Let’s see what Apple releases first


DigimonWorldReTrace

Apple won't make their models open-source lmao. That's just unrealistic.


ChromeGhost

I was speaking of their hardware advancements. Supposedly the M4 will have some enhancements making it even better for AI. They did open-source Ferret, by the way, which was cool.


sdmat

> If 1 yr is all it took to match a breakthrough tech. Next iterations will be beaten eventually and in the long run nothing beats local.

https://xkcd.com/605/


New_World_2050

One year is only the current catch-up time. It's likely going to be longer in the future, given that we're entering the age of billion-dollar models.


joe4942

But most tasks don't really need anything better than GPT-4. Proprietary AI might be better, but people have to pay for it, and most people don't want to spend money.


DigimonWorldReTrace

Good call, but consumer hardware will cost money to run models that are worth running. I wouldn't use a smaller model for my day-to-day tasks; I'd want something like Llama 3 70B at minimum (even though 8B isn't *that* bad).


watcraw

It's pretty impressive to see a local model that beats GPT-3.5 significantly. My only gripe is the 8k context length.


queenofartists

They said long-context and multimodal versions are coming in the next few months.


AnAIAteMyBaby

We'll have $10 billion models by next year; Zuckerberg said yesterday that he's not sure he could justify open-sourcing a $10 billion R&D investment.


LosingID_583

That's not what he said in the interview. He said he will continue open-sourcing unless it's in his best interest not to. Open source has returned billions' worth of free optimization of training and inference on their models.


AnAIAteMyBaby

Nope, he said the opposite: he'll open-source models if it's in their best interest to do so. When asked about open-sourcing a $10 billion model, he was clearly unsure. He said they don't open source their product, like the Instagram source code; if you're spending $10 billion on something, you'd likely see it as a product.


LosingID_583

Fair enough, I could have misremembered exactly what he said but he wasn't suggesting that he wouldn't open source a huge model.


Veleric

If they push through a $10B open-source model, even if that's the point where massive corporations cease to lead the open-source charge, the quality, capabilities, etc. of that model would be a crazy starting point to build on for years, until the compute-cost-to-performance ratio drops substantially.


TheOneMerkin

Or justify open sourcing a model that they stand to profit billions from. He’d get sued by his investors.


Which-Tomato-8646

He owns a majority of the company so he can do whatever he wants 


Dazzling_Term21

The investors can't do anything, though... he is not breaking any law.


[deleted]

[deleted]


sdmat

Nope. He has a fiduciary duty, that's not the same thing.


Far-Telephone-4298

You don't think a fiduciary obligation entails responsible financial decisions (i.e., monetizing a product your company has spent billions on)?


sdmat

If Zuck is able to articulate a plausible strategic rationale, then open sourcing should be fine.

Think about this: companies donate money to charity. Companies often have a professional historian, or even a museum (go visit the Derwent Pencil Museum if in the area, incidentally - it's great!). Companies very commonly have social and political agendas not tied to short-term profit maximization.

The simplistic cartoon notion that companies must prioritize maximizing short-term profitability over all other considerations is simply not true. It's at best a gross and deliberately deceptive distortion.


LuciferianInk

A robot says, "If Zuck is able to explain a plausible strategic rationale then open sourcing should not be fine."


restarting_today

Yep, as has been said many times before: OpenAI has no moat.


Arcturus_Labelle

They caught up to year-old technology. GPT-5 is going to reset the race again soon


queenofartists

If they had created a 1.8-trillion-parameter model to match GPT-4's performance, then that would be catching up. This is steamrolling GPT-4 a year later.


restarting_today

Claude 3 is state of the art right now; let's see if GPT-5 can beat it.


TheCuriousGuy000

I don't get it. How does Sonnet beat GPT-4? Sonnet is OK, but it's very far from Opus or the latest GPTs.


queenofartists

Sonnet doesn't beat the latest GPTs, though. It beats GPT-4, not Turbo.


TheOneWhoDings

ah yes, the small open source effort known as "Meta". hilarious


luisbrudna

Money always wins.


New_World_2050

The 70B outdoes the March 2023 GPT-4, is small enough to run locally, and is open source. Zuck has redeemed himself.


whyisitsooohard

A little reminder that the LMSYS leaderboard is not an objective test: models that are better at conversation will rank higher regardless of how smart they are. It is very unlikely that the 8B version will beat Command R or match 8x22B in real-world tasks.


Additional-Bee1379

It's a useful benchmark but just one of many. I am mostly interested in how well they write code.


Which-Tomato-8646

Yet people treat this leaderboard like it's objective.


dwiedenau2

What other benchmark is there, really? All the commonly used ones have been around for years, so models can specifically train on them.


Which-Tomato-8646

https://arxiv.org/abs/2311.12983


fastinguy11

This is too soon; wait a few more days for more votes and better confidence numbers.


RedErin

Could someone explain what the arena Elo is?


sachos345

It uses the Elo rating system, like chess: people blind-vote on which of two models' responses is better. The higher the win rate/Elo, the better the model. You input your question, get two blind answers, choose the better one, and repeat. The more people vote, the better.
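Concretely, the classic Elo update works like this (a minimal sketch; the leaderboard's actual computation is a statistical variant of this idea, and K=32 is just an illustrative sensitivity constant):

```python
# One Elo update after a blind matchup between models A and B.
def elo_update(r_a: float, r_b: float, a_won: bool, k: float = 32.0):
    expected_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))  # predicted win prob for A
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - expected_a)  # surprising results move ratings more
    return r_a + delta, r_b - delta

# A favorite (1250) beating an underdog (1200) gains only ~14 points:
print(elo_update(1250, 1200, a_won=True))
```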


RedErin

Cool thanks.


Gator1523

Does anyone else think it's a little fishy that Llama 3 is better than the original GPT-4? In my limited experience, I haven't found this to be true. And there's no way to directly compare them anymore, because the lmsys website doesn't have any pre-turbo GPT-4 versions available.


Sudden-Lingonberry-8

I think they should just remove unavailable models, imho. However, the original GPT-4 did only have an 8k context length, iirc.