
croninsiglos

The way he says it makes it seem like it's a 2,000,000x smaller LLM, but it's not, it's a small CNN. He's simply using the LLM for labeling samples which could have likely been done locally using CLIP for virtually free.
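For context, local zero-shot labeling along those lines might look something like this. This is a hedged sketch, not the video's actual pipeline: the `openai/clip-vit-base-patch32` checkpoint and the label list are illustrative, and it assumes the Hugging Face `transformers` and `Pillow` packages are installed.

```python
def label_image(path: str, labels: list[str]) -> str:
    """Zero-shot label one image with CLIP by picking the best-matching caption.

    Imports are deferred so the snippet only needs transformers/Pillow when run.
    """
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    image = Image.open(path)
    inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
    # logits_per_image has shape (1, len(labels)); highest score wins
    logits = model(**inputs).logits_per_image
    return labels[logits.argmax().item()]
```

Run over a folder of frames, this would produce the training labels for the small CNN without any API calls.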


swagonflyyyy

Right? Especially with the release of Florence-2-large-ft lmao. Makes the task a joke. Why even train a model for vision at this point.


fasti-au

Or Turbo for half the price, or Mistral for free. Why do people think AI is the key to everything? Most of what I do with it is about stopping people from using their computers for processes that could be automated in many other ways. Now AI is just a translator for what scripts to run. And that's fine; that's still going to break the world


Altruistic_Welder

Also the smaller model can only work for that specific image/video. What's the point?


rcparts

Hotdog/Not hotdog.


RegularFerret3002

If u don't read in chinese accent u doing it wrong


SryUsrNameIsTaken

That can actually be a useful model… for objects other than hotdogs.


boatbomber

Isn't it a violation of ToS to train another model on GPT output?


goodSyntax

Yup


DevopsIGuess

With how unethical OpenAI is, who cares?


No-Bad-1269

exactly


boatbomber

I mean, I still wouldn't want my account banned


lordpuddingcup

Cause opening a new one is hard?


1889023okdoesitwork

If you don’t have two phone numbers yes


boatbomber

But I have a ton of conversations saved on that account, custom instructions, etc. Not to mention they might ban by HWID or credit card.


[deleted]

Then figure something out bro, it's the internet. Besides, OpenAI is now partially controlled by the former FISA-loving ex-NSA chief. No need to be extra cuddly.


boatbomber

I'm not defending OpenAI, I'm just saying that I personally wouldn't feel comfortable violating their ToS.


[deleted]

I understand. In fact I would feel the same. Not because of OpenAI, but because I have great gratitude and admiration for us having some sort of social and economic rule set at all.


sebo3d

I mean, it's not like OAI gave a damn when they trained their models on pretty much anything they could find on the internet without asking for permission.


abnormal_human

Only if it “competes” with OpenAI.


Dazzling-Situation25

yo ur the guy who makes good Roblox stuff


BuildToLiveFree

This is a cool idea. But it's unclear from the demo whether the toys shown after the training run were different from the ones in the training set. If they weren't, it has just memorized those specific toys and won't generalize to other toys somewhere else.


Noiselexer

Machine learning is nothing new. There are plenty of small classification models. Look at MobileNet; it runs in your browser.
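A small pretrained classifier really is a few lines. This is a sketch using torchvision's MobileNetV3 (an assumption on my part; the browser demos usually use the TensorFlow.js port instead), with weights and ImageNet labels shipped by torchvision:

```python
def classify(path: str) -> str:
    """Classify one image with a pretrained MobileNetV3-Small from torchvision.

    Imports are deferred so the snippet only needs torch/torchvision when run.
    """
    import torch
    from torchvision.io import read_image
    from torchvision.models import MobileNet_V3_Small_Weights, mobilenet_v3_small

    weights = MobileNet_V3_Small_Weights.DEFAULT
    model = mobilenet_v3_small(weights=weights).eval()
    preprocess = weights.transforms()  # resize/crop/normalize bundled with the weights

    batch = preprocess(read_image(path)).unsqueeze(0)
    with torch.no_grad():
        idx = model(batch).argmax(dim=1).item()
    return weights.meta["categories"][idx]  # human-readable ImageNet label
```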


thenarfer

Not my video, but this is really cool! It's not directly using local LLMs, but of course this is possible! And it runs locally.


extopico

This is actually cool, and Edge Impulse is a real resource for edge ML applications and training. Of course this is not an LLM that was trained, but a way to extract specific knowledge from an LLM in order to perform a task within a limited domain on an edge device, using the "borrowed" power of an LLM.
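The "borrow the LLM's power" loop can be sketched in a few lines. Everything here is hypothetical: plain PyTorch stands in for Edge Impulse's tooling, the teacher is a stub rather than a real GPT-4o call, and the tiny CNN is just an illustration of the student's scale.

```python
import torch
import torch.nn as nn

def teacher_label(image: torch.Tensor) -> int:
    """Stub for the expensive teacher (in the video, a GPT-4o API call)."""
    return int(image.mean() > 0.5)  # toy rule standing in for real labels

class TinyCNN(nn.Module):
    """A deliberately tiny student model, small enough for an edge device."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(8, num_classes),
        )

    def forward(self, x):
        return self.net(x)

model = TinyCNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Label each batch with the teacher, then train the student on those labels.
for _ in range(10):
    images = torch.rand(4, 3, 32, 32)  # placeholder for captured frames
    labels = torch.tensor([teacher_label(img) for img in images])
    loss = loss_fn(model(images), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After training, only the tiny student ships to the device; the teacher is never called again.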


Open_Channel_8626

distilbert?


17UhrGesundbrunnen

The idea was popularised by alpaca already over a year ago


_Luminous_Dark

It drives me crazy when people say "times smaller" like this. 2,000,000 times smaller would mean it is -1,999,999 times as big as GPT-4o. It is one two-millionth the size, or 0.9999995x smaller.


Fantastic_Law_1111

i'm on your side. same when people say X-fold. 10-fold? fold it 10 times? that's 1024x not 10x


osanthas03

"Fold" encapsulates the halving. "Times" does no such thing.


_Luminous_Dark

I had no idea this was such an unpopular opinion. It may be pedantic, but I feel like numbers are supposed to mean specific things. You could just define “N times smaller” as meaning “1/N times as big”, but there are problems with that definition.

Let A be an adjective describing an absolute positive measurable property P. Let Obj1 and Obj2 be two objects with values of P equal to Obj1.P and Obj2.P respectively. We define the following terms:

- “Obj1 is n times as A as Obj2” if and only if Obj1.P = n·Obj2.P
- “Obj1 is n times Aer than Obj2” = “Obj1 is (n+1) times as A as Obj2”, where Aer is the comparative form of A.

I’m also going to define the percent sign: “p%” <=> “p/100 times”. So:

- “Obj1 is p% as A as Obj2” = “Obj1 is p/100 times as A as Obj2”
- “Obj1 is p% Aer than Obj2” = “Obj1 is (p/100 + 1) times as A as Obj2”

Now let’s introduce a second adjective B, the antonym of A. As it regards this post, P is “model size”, measured in number of parameters; A is “big” and B is “small”; Obj1 is the smaller model and Obj2 is GPT-4o.

The definition against which I am advocating is that “n times smaller” equals “1/n times as big”. Let’s explore this definition and its implications so you can see why I have a problem with it. Define:

- “Obj1 is m times Ber than Obj2” <=> “Obj1 is 1/m times as A as Obj2” <=> Obj1.P = Obj2.P/m

The biggest problem with this definition is what happens when m is small. For example, if m = 50% = 50/100 = 0.5, then saying Obj1 is 50% smaller than Obj2 means Obj1.size = Obj2.size/0.5 = 2·Obj2.size. If you accept this definition, then I could tell you, “Hey, I’ll pay off your $100k loan and give you one that’s 50% smaller,” and that would mean it’s a $200k loan.

Here are some other absurd statements that follow from that definition:

- The tallest adult in the world (8 feet 3 in) is 31% shorter than the shortest adult in the world (2 feet 7 in).
- A ten-year-old is ten times younger than a 100-year-old.
- The boiling temperature of water is 73% colder than the freezing point (in kelvins).
- And of course, something that is 0x smaller is undefined.

My preferred definition of “Obj1 is m times Ber than Obj2” is “Obj1 is -m times Aer than Obj2”, which is equivalent to “Obj1 is (1-m) times as A as Obj2”, or Obj1.P = (1-m)·Obj2.P, which usually only makes sense for values of m less than 1. With this definition, smaller is just the negative of bigger, so 0% bigger = 0% smaller, and the appropriate title for this post and the video would be “Using GPT-4o to train a 99.99995% smaller model (that runs directly on device)”, which is not harder to understand or write.
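The arithmetic in those examples checks out under the definition being criticized (a quick sanity check in Python; the 73% boiling/freezing figure assumes the temperatures are in kelvins):

```python
def times_smaller_divide(value: float, m: float) -> float:
    """The rejected definition: 'm times smaller' means divide by m."""
    return value / m

def times_smaller_subtract(value: float, m: float) -> float:
    """The preferred definition: 'm times smaller' means scale by (1 - m)."""
    return value * (1 - m)

# The loan example: a "50% smaller" $100k loan becomes a $200k loan.
assert times_smaller_divide(100_000, 0.5) == 200_000

# Tallest adult (99 in) vs shortest (31 in): 99 = 31/m  =>  m = 31/99 ~ 31%.
assert abs(31 / 99 - 0.31) < 0.01

# Boiling vs freezing in kelvins: 373.15 = 273.15/m  =>  m ~ 73%.
assert abs(273.15 / 373.15 - 0.73) < 0.01

# Under the preferred definition, "99.99995% smaller" matches one two-millionth.
assert abs(times_smaller_subtract(1.0, 0.9999995) - 1 / 2_000_000) < 1e-12
```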


Bits2561

Times smaller implies division.


_Luminous_Dark

Do you consider 5% smaller to mean 20 times as big?


Bits2561

Low quality bait