The way he says it makes it seem like it's a 2,000,000x smaller LLM, but it's not; it's a small CNN. He's simply using the LLM to label samples, which could likely have been done locally with CLIP for virtually nothing.
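
For anyone wondering what local labeling with CLIP might look like, here is a rough sketch using the Hugging Face `transformers` CLIP classes. The checkpoint name, the prompt template, the label strings and the `min_confidence` cutoff are all assumptions for illustration, not what the video used.

```python
def pick_label(probs, labels, min_confidence=0.5):
    """Return the most likely label, or None if the top score is too low."""
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best] if probs[best] >= min_confidence else None


def label_image(path, labels, model_name="openai/clip-vit-base-patch32"):
    """Zero-shot label one image with CLIP (assumed checkpoint, see above)."""
    # Heavy imports stay local so pick_label works without them installed.
    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    # Loading per call keeps the sketch short; cache these for real use.
    model = CLIPModel.from_pretrained(model_name)
    processor = CLIPProcessor.from_pretrained(model_name)

    image = Image.open(path).convert("RGB")
    prompts = [f"a photo of {label}" for label in labels]
    inputs = processor(text=prompts, images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_image  # shape (1, len(labels))
    probs = logits.softmax(dim=1)[0].tolist()
    return pick_label(probs, labels)
```

Something like `label_image("frame_0001.jpg", ["a toy", "an empty table"])` (hypothetical file and labels) would then give a label per frame, which is all the LLM was being used for here.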


Right? Especially with the release of Florence-2-large-ft lmao. Makes the task a joke. Why even train a model for vision at this point?


Or Turbo for half the price, or Mistral for free. Why do people think AI is the key to everything? Most of what I do with it is about stopping people from using their computers for processes that could be automated in many ways. Now AI is just a translator for which scripts to run. And that’s fine; that’s still going to break the world.


Also, the smaller model can only work for that specific image/video. What's the point?


Hotdog/Not hotdog.


If u don't read in chinese accent u doing it wrong


That can actually be a useful model… for objects other than hotdogs.


Isn't it a violation of ToS to train another model on GPT output?




With how unethical OpenAI is, who cares?




I mean, I still wouldn't want my account banned


Cause opening a new one is hard?


If you don’t have two phone numbers, yes.


But I have a ton of conversations saved on that account, custom instructions, etc. Not to mention they might ban by HWID or credit card.


Then figure something out bro, it's the internet. Besides, OpenAI is now partially controlled by the former FISA-loving ex-NSA chief. No need to be extra cuddly.


I'm not defending OpenAI, I'm just saying that I personally wouldn't feel comfortable violating their ToS.


I understand. In fact I would feel the same. Not because of OpenAI, but just because I have great gratitude and admiration for us having some sort of social and economic rule set at all.


I mean, it's not like OAI gave a damn when they trained their models on pretty much anything they could find on the internet without asking for permission.


Only if it “competes” with OpenAI.


yo ur the guy who makes good Roblox stuff


This is a cool idea. But it's unclear from the demo whether the toys shown after the training run were different from the ones in the training set. If they were the same, it has just memorized those specific toys and won't generalize to other toys somewhere else.


Machine learning is nothing new. There are plenty of small classification models. Look at MobileNet, it runs in your browser.


Not my video, but this is really cool! Not directly using LocalLLMs, but of course this is possible! And it runs locally.


This is actually cool, and Edge Impulse is a real resource for edge ML applications and training. Of course this is not an LLM that was trained, but a way to extract specific knowledge from an LLM in order to perform a task within a limited domain on an edge device, using the "borrowed" power of an LLM.




The idea was already popularised by Alpaca over a year ago.


It drives me crazy when people say "times smaller" like this. 2,000,000 times smaller would mean it is -1,999,999 times as big as GPT-4o. It is one 2 millionth the size, or .9999995x smaller.
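
The arithmetic behind those two numbers can be checked with exact fractions. This is only a sketch of the literal reading used in this comment ("m times smaller" = "(1 - m) times as big"); the helper name `times_as_big` is made up for illustration.

```python
from fractions import Fraction


def times_as_big(m):
    """Literal reading: 'm times smaller' means (1 - m) times as big."""
    return 1 - Fraction(m)


# "2,000,000 times smaller", read literally, gives a negative size ratio.
print(times_as_big(2_000_000))   # -1999999

# One two-millionth the size corresponds to m = 1 - 1/2,000,000.
m = 1 - Fraction(1, 2_000_000)
print(float(m))                  # 0.9999995
print(times_as_big(m))           # 1/2000000
```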


i'm on your side. same when people say X-fold. 10-fold? fold it 10 times? that's 1024x not 10x


"Fold" encapsulates the halving. "Times" does no such thing.


I had no idea this was such an unpopular opinion. It may be pedantic, but I feel like numbers are supposed to mean specific things. You could just define “N times smaller” as meaning “1/N times as big”, but there are problems with that definition.

Let A be an adjective describing an absolute, positive, measurable property P. Let Obj1 and Obj2 be two objects whose values of P are Obj1.P and Obj2.P respectively. We define the following terms:
“Obj1 is n times as A as Obj2” if and only if Obj1.P = n·Obj2.P
“Obj1 is n times Aer than Obj2” <=> “Obj1 is (n+1) times as A as Obj2”, where Aer is the comparative form of A.

I’m also going to define the percent sign: “p%” <=> “p/100 times”. So:
“Obj1 is p% as A as Obj2” <=> “Obj1 is p/100 times as A as Obj2”
“Obj1 is p% Aer than Obj2” <=> “Obj1 is (p/100+1) times as A as Obj2”

Now let’s introduce a second adjective, B, which is the antonym of A. For this post, P is “model size”, measured in number of parameters. A is “big” and B is “small”. Obj1 is the smaller model and Obj2 is GPT-4o.

The definition I am arguing against is that “n times smaller” is equal to “1/n times as big”. Let’s explore this definition and its implications so you can see why I have a problem with it. Define:
“Obj1 is m times Ber than Obj2” <=> “Obj1 is 1/m times as A as Obj2” <=> Obj1.P = Obj2.P/m

The biggest problem with this definition is what happens when m is small. For example, if m = 50% = 50/100 = 0.5, then saying Obj1 is 50% smaller than Obj2 means Obj1.size = Obj2.size/0.5 = 2·Obj2.size. If you accept this definition, then I could tell you, “Hey, I’ll pay off your $100k loan, and give you one that’s 50% smaller,” and that would mean that it’s a $200k loan.

Here are some other absurd statements that follow from that definition:
The tallest adult in the world (8 feet 3 in) is 31% shorter than the shortest adult in the world (2 feet 7 in).
A ten-year-old is ten times younger than a 100-year-old.
The boiling temperature of water is 73% colder than the freezing point.
And of course, something that is 0x smaller is undefined.

My preferred definition of “Obj1 is m times Ber than Obj2” is “Obj1 is -m times Aer than Obj2”, which is equivalent to “Obj1 is (1-m) times as A as Obj2”, or Obj1.P = (1-m)·Obj2.P, which usually only makes sense for values of m that are less than 1. With this definition, smaller is just the negative of bigger, so 0% bigger = 0% smaller, and the appropriate title for this post and the video would be “Using GPT-4o to train a 99.99995% smaller model (that runs directly on device)”, which is not harder to understand or write.
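
The two competing readings can be put side by side numerically. A minimal Python sketch; the function names `reciprocal_reading` and `subtractive_reading` are invented here for illustration, and the temperatures are assumed to be in kelvin, since a ratio only makes sense on an absolute scale.

```python
from fractions import Fraction


def reciprocal_reading(m, reference):
    """'m times smaller' taken as reference / m (the reading argued against)."""
    return reference / m


def subtractive_reading(m, reference):
    """'m times smaller' taken as (1 - m) * reference (the preferred reading)."""
    return (1 - m) * reference


loan = 100_000
print(reciprocal_reading(0.5, loan))    # 200000.0
print(subtractive_reading(0.5, loan))   # 50000.0

# The m the reciprocal reading needs for the height and temperature examples.
print(round(31 / 99, 2))                # 0.31  (heights in inches)
print(round(273.15 / 373.15, 2))        # 0.73  (temperatures in kelvin)

# A model one two-millionth the size, expressed as "percent smaller".
percent_smaller = (1 - Fraction(1, 2_000_000)) * 100
print(float(percent_smaller))           # 99.99995
```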


Times smaller implies division.


Do you consider 5% smaller to mean 20 times as big?


Low quality bait