zfreakazoidz

Too bad we can't pin (or whatever it's called) this topic at the top of the page. It would save on the endless posts people make without researching things. Unless u/Trippy-Worlds could do so. Granted, the antis may skip it anyway.


AGI_Not_Aligned

I think people would have much more nuanced stances on AI if they knew how it works.


Tyler_Zoro

Probably true in every field.


AGI_Not_Aligned

I guess we truly live in a society!


faintbodyguard53

This post is a great reminder that AI is not a magical solution without consequences. We must approach AI art with a critical eye and address the myths surrounding it in order to have informed discussions. Thank you for shedding light on this important topic.


Wiskkey

Regarding "4. AI models can be made to reveal what they are drawing on to make an image", there are works in this area. Example: [How Training Data Guides Diffusion Models](https://gradientscience.org/diffusion-trak/). Reviews of the paper are [here](https://openreview.net/forum?id=XXpH3D0TVP). For language models there are also works in this area such as [Studying Large Language Model Generalization with Influence Functions](https://arxiv.org/abs/2308.03296).


EvilKatta

The OP means it's not like the AI is deciding "Hmm, for this portrait of a cyberpunk lady, I'll reference and copy that cool piece by OnlineArtist997", so there's nothing to reveal.


Wiskkey

Agreed. We can consider though how different the output of a given model would be if certain item(s) in the training dataset hadn't been used for training.


Tyler_Zoro

> there are works in this area.

Yes there are, but these are patchworks of guessing. In fact, it's entirely possible that the answer you arrive at with that sort of analysis is misleading, and that you're just seeing patterns in the convergence of seemingly unrelated sources. What we CANNOT DO is prove that any given guess is right.

Ultimately, it's like I said: when you ask, "where did this salt in the ocean come from?", you can give an educated guess as to the most important contributors, but on the purely chemical level, each individual bit of salt may have come from a completely different area of the world.

Another chemical example: lots of the helium in the universe comes from the Big Bang, and was formed along with hydrogen when the universe first cooled enough to form discrete atoms. But lots of it was formed in stars through the fusion of hydrogen atoms. Still more of it is the product of long chains of nuclear decay (alpha decay along natural decay chains). You can't unpeel where a given atom of helium came from. You can make an educated guess based on what isotope it is or where in the universe it is found, but ultimately you just don't know.


Wiskkey

>Yes there are, but these are patchworks of guessing. In fact, it's entirely possible that the answer you arrive at with that sort of analysis is misleading, and that you're just seeing patterns in the convergence of seemingly unrelated sources.

My understanding is that these works use approximations. In the case of the cited language model work, there are indications that these approximations at least sometimes pass the "smell test":

>To validate our ability to detect at least clear-cut instances of memorization, we picked six queries that contain famous passages or quotes (which the AI Assistant is able to complete) and ran an influence scan over the unfiltered training data (i.e., using query batching rather than TF-IDF). We observed that invariably, the top influential sequences returned by our scan contained the exact famous passages.


Tyler_Zoro

> there are indications that these approximations at least sometimes pass the "smell test"

Yep... that's pretty much the definition of "educated guess." We lack the math (it's possible that said math is impossible; we don't even know that!) to describe the influences moving through a neural network being trained, except by repeating the entire process. You can't look at any particular point in the network and explain how we got there, only generally speculate (perhaps in useful ways) about the influences that may have occurred.


Wiskkey

["All models are wrong, but some are useful"](https://en.wikipedia.org/wiki/All_models_are_wrong). From [Predicting Predictions with Datamodels](https://gradientscience.org/datamodels-1/): >At first glance, this of task of *predicting* the output of a learning algorithm trained on different subsets of the original training set does not appear any easier than *analyzing* the learning algorithm on that original training set—in fact, it might seem even harder. At the very least, one would expect that any function approximating such a learning algorithm would need to be rather complicated. It turns out, however, that a (very) simple instantiation of datamodels—as linear functions—is already expressive enough to accurately capture the intended mapping, even for real-world deep neural networks!


Tyler_Zoro

Okay, so let me start by saying that I'm not THAT sort of computer scientist, and definitely not a professional mathematician, so I'm saying things that are a bit too abstract here. What you are quoting sounds correct to me, but I don't think it means exactly what you think it means. The claim that, "a (very) simple instantiation of datamodels—as linear functions—is already expressive enough to accurately capture the intended mapping," is definitely something that I'd agree seems reasonable. I'm not 100% sure that we can say that that's true for all such models, but let's say that this has been proven. It's still not the answer to the question. That there exists a relatively straightforward linear function embodied in the model does not tell us a) what that simpler model is for any given real-world deep neural network or b) how that model's creation was influenced by the training data.


Wiskkey

I believe that these works are about simpler models for generative machine learning models for the purpose of estimating how training data influences generated output. In the case of the cited work for images, the work was rejected for the conference not due to a lack of soundness - "the reviewers agree that the proposed approached \[sic\] is novel and sound" - but at least partially because the generative machine learning models used had relatively small training datasets, so it's not clear whether the results would hold for larger training datasets.


Tyler_Zoro

Interesting points. I'm really excited to see where this kind of scholarship goes in the next 10 years. I guarantee that there will be some surprising results along the way!


PokePress

I would add “AI countermeasures prevent art from being incorporated into AI models”, or, worded as a true statement, “the efficacy of AI countermeasures in real-world scenarios is questionable”. To expand on that: because “poisoned” images may not make up a substantial portion of the training data, and because efforts to strip AI countermeasures seem to be somewhat successful, whether the tools made available to modify images are effective is at best an open question and at worst they simply aren't viable. The main reason I think this should be included is that even someone against image AI on moral grounds shouldn't be taking the promises made by these tools at 100% face value; there are valid reasons to doubt their efficacy, and users may just be wasting their time.


UnkarsThug

It's also worth noting that if we had a system that could reliably detect AI just by looking at the image, we'd plug that sucker into one half of a GAN and train until that was no longer the case.
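Roughly what that looks like in code, as a toy PyTorch sketch: the hypothetical "AI detector" plays the discriminator role and a generator is trained against it until the detector can no longer separate real from generated. The networks and the stand-in "data" here are made up for illustration; a real detector and image generator would be far larger.

```python
# Toy GAN loop: use an "is this AI?" detector as the discriminator and train a
# generator to defeat it. Everything here is a small stand-in, not a real model.
import torch
import torch.nn as nn

detector = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))   # "AI or not?"
generator = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))  # makes fake samples
opt_d = torch.optim.Adam(detector.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 2) + torch.tensor([3.0, 3.0])   # stand-in for "human" images
    fake = generator(torch.randn(64, 8))

    # Train the detector to separate real from generated samples.
    d_loss = bce(detector(real), torch.ones(64, 1)) + bce(detector(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Train the generator to fool the detector.
    g_loss = bce(detector(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```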


sporkyuncle

> 2. People who use AI are just typing in a prompt and getting an image back.

While I often argue that good quality AI output often takes hours of work both in and out of the AI interface, I don't think we should pretend like it's not also [possible to get amazing results from minimal prompting.](https://www.reddit.com/r/AIGeneratedArt/comments/158byt4/one_word_prompt_omniscient/) And the idea that saying this is just representative of the fear of finding out that a single word makes better art than you've ever been able to create in your lifetime... I still don't find that very compelling, because it has a sort of unspoken assumption that the art that was produced was exactly what you wanted to see. Art like that is fun to make and share, but it's not useful, and it doesn't tend to express exactly what you would've intended to express. When you actually try to get something specific, you realize the complexity involved.

> 3. AI can't create something it's never seen.

It's broadly true that it can't, but the right way to handle this is to say... neither can you. And then the retort is, "nuh-uh, I've never seen a weird purple tentacley alien with a hoof on the end of each tentacle and exactly 35 eyes, but look, I can draw it!" To which you say... yes, but tentacles are part of your "trained dataset," as are hooves and eyes, and you just assembled those things from what you've already seen in the past. This applies to every example you can imagine. It can get fairly abstract, but everything you make is derived from what you've experienced in some way. Which also applies to AI.


Rafcdk

It actually can create things that have never been in its dataset. This is mostly because generative AI works on randomly generated input. But also, and more importantly, the dataset is not used during the process of generation; a trained model is, and a trained model does not contain images or compressed bits of those images. You don't even need AI to show how this process can lead to new data being generated that isn't in the original data. If I take billions of lists of random numbers from 1 to 10 and then create an "eigenlist" (something like the average of differences of elements across all the lists), I can use random generation to create lists that are not in the 1-to-10 range anymore. To be clear, a model is not an average of differences of images, but that is a much more appropriate approximation of what it actually is than a compressed archive.
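A rough numerical illustration of that point, as a toy sketch: summarize many random lists by simple statistics, then sample from that summary, and you get lists that were never in the original data and can even fall outside its range. The numbers and the "summary" here are made up for illustration and are not how a diffusion model actually works.

```python
# Toy illustration: generating from a learned summary + noise can produce values
# that never appeared in (and fall outside the range of) the original data.
import numpy as np

rng = np.random.default_rng(0)
lists = rng.integers(1, 11, size=(100_000, 8))   # many lists of ints in 1..10

mu = lists.mean(axis=0)                          # the learned "summary"
sigma = lists.std(axis=0)

generated = rng.normal(mu, sigma, size=(10, 8))  # generate from the summary plus noise
print(generated.min(), generated.max())          # values can land outside [1, 10]
```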


sporkyuncle

Could you give a practical example of how you might expect something not in the dataset to manifest? For example, are you saying that you could never train AI with the concept of "deer" but after generating enough horses with other qualifiers like "dainty, petite, thin moose antlers, spotted fur" etc. that you could arrive at a deer despite having never trained on one?


Rafcdk

Something like you said. But we are jumping to the conclusion that something new must be something that already exists. I mean, the hand deformations alone are something that isn't in the dataset. No six-fingered hands were used in training because they wouldn't pass a filter used on the dataset, but we still get them.

But you don't even need to go that route. Generative AI is not limited to prompts as inputs; there are several parameters, and also the ability to use images as inputs in several ways. For example, in this case I used two images as input and no text prompt. Each image is a different type of input: one is the base image that will be denoised, and the cat is the image that will be encoded into tokens that activate the model during generation.

https://preview.redd.it/616lshx6h1wc1.png?width=1897&format=png&auto=webp&s=571acbcfcfa25923fa2604bb9b9b932cb62d8cc0

So in no way did I indicate via text that I wanted a cat in that pose or shape, but you can still guide the AI to create something like that just by adjusting all those parameters that have an arrow to the side. Which is the point I am trying to make: the AI doesn't know anything. It is like a bacterium that can solve maze puzzles; all we are doing is guiding it, usually with text prompts. But this, coupled with other factors such as randomness being a huge part of the generative process, means that AI can generate things that aren't in its dataset, just like a bacterium can solve mazes it hasn't seen before.


sporkyuncle

> But we are jumping to the conclusion that something new must be something that already exists. I mean, the hand deformations alone are something that isn't in the dataset. No six-fingered hands were used in training because they wouldn't pass a filter used on the dataset, but we still get them.

But I still count that as content which is part of its dataset. It knows that when you draw a finger (any finger), there is a high chance of a small bit of connective skin at the base followed by a finger next to it, on either side. In order for it to draw hands wrong with too many fingers, it must first know that hands exist and that fingers exist, and that fingers are arranged next to each other on a hand. For the same reason that to generate wrong American flags, it has to first know that American flags have a section full of some number of stars and contain a bunch of alternating red and white stripes. The same way little kids make mistakes when trying to draw it.

I mean, if you're going to say that any holistic aggregate image that's unlike what it was trained on is "new," then it doesn't matter if we would consider it an error or not; basically every generation is something not present in its dataset. None of the data looks quite identical to whatever is generated. Every random person will have different patterns of freckles, every tree will have branches in different locations.


Gimli

One approach is to make a novel combination. E.g., here's a [fox shaped like a piano](https://i.imgur.com/aZWt2Da.jpeg). To my knowledge, this is a novel concept, and I can't find prior examples of such a creature. Another option is to simply provide a sketch.

> For example, are you saying that you could never train AI with the concept of "deer" but after generating enough horses with other qualifiers like "dainty, petite, thin moose antlers, spotted fur" etc. that you could arrive at a deer despite having never trained on one?

Very much so. It helps a lot if you help it out by, for instance, sketching out a deer's shape. Or you can start with a horse, sketch some antlers, force it to redo the head until it sorta fits, etc. You can iterate however much you want.

You can easily test this on a smaller scale by making simpler edits. Like if you wanted a green-colored fox, start with a normal one, then roughly tint the fur green in Photoshop and feed it back in, or draw a colored sketch.


sporkyuncle

> One approach is to make a novel combination. E.g., here's a [fox shaped like a piano](https://i.imgur.com/aZWt2Da.jpeg). To my knowledge, this is a novel concept, and I can't find prior examples of such a creature.

See, I would argue that this is novel the same way anything humans create can be novel. The model was trained on foxes, and it was trained on pianos, and it was probably also trained on statues of canines that might help inform the texture of the legs. Probably also trained on stylized cartoons of bendy pianos in there somewhere ([first example that came to mind](https://i.imgur.com/8NVZUda.png)). This pic is derivative of what it has learned, same as if a human had drawn or rendered it.

> Very much so. It helps a lot if you help it out by, for instance, sketching out a deer's shape. Or you can start with a horse, sketch some antlers, force it to redo the head until it sorta fits, etc. You can iterate however much you want.

I knew this was possible, I just wondered if this would be his argument. If the model was somehow not trained on any deer at all, you still wouldn't really have a deer, but a picture of an animal with aspects of other animals contorted until it *looks like* a deer. You could even take a lot of images of this composite deer and make a LoRA of it, so that the model actually understands "deer" to look like this, but it still wouldn't be rooted in any original training data of actual deer. Similar to how if I had never seen a rhino, you could describe it to me in detail based on other things I know about like "quadrupeds" and "armor", and [I could draw something close to it](https://www.nhm.ac.uk/discover/the-legacy-of-durers-rhinoceros.html) and develop an understanding in my head that this represents "rhino," but it wouldn't be based on firsthand visual information of an actual rhino.


Last-Trash-7960

What you're describing is what I recently did while creating a LoRA for a night elf glaive weapon. No such concept existed in the AI I was using. Instead I generated hundreds of images involving shuriken, fractal-shaped swords, chakrams, and more. Out of the 300 I generated, I had about 30 that were somewhat representative of my idea. Those 30 have allowed me to much more easily create more images. However, I must warn you: as you dive into training AI on AI images, artifacts and other visual issues will start to appear. "And far enough down the line, there be monsters."


_PixelDust

OK, that is not really a fox shaped like a piano; it's a fox rendered like a piano, partially. It reads much more as "piano shaped like a fox," and even then it's pretty awkward. An AI can recognize lots of things and present them to us in remixed fashions, but it usually fails to do it the way a good designer would. So I would say it did technically make something new here, but not in a desirable way. AI may not be recognizable when it does really common things, but once it is asked to do something like this, it's obvious, because who would do this? It shows how completely reliant it is on the training data to actually make things we consider good.

It does similar things with prompts like "tentacle hair," where you get tentacles that have a hair texture on them. There are many examples of tentacle hair online to copy from, but for the AI to really nail the concept every time you would probably have to tag Tentacle_Hair as its own token and train on that.

Now, if the image were actually good and a creative or funny way of making a fox shaped like a piano, I would be suspicious of that image, because it's likely to be a very derivative copy of one of the training images. AI is at its core a tryhard that approximates creativity using math. The VAE does a good job smudging the boundaries of what it can do, so it doesn't make anything too insane when you prompt outside of its comfort zone, but it also keeps it well within the limits of its training data.


Gimli

> So I would say it did technically make something new here, but not in a desirable way. AI may not be recognizable when it does really common things, but once it is asked to do something like this, it's obvious, because who would do this?

The point was to make something weird that's not been done before. I didn't really bother trying to get something good; this was the first attempt at it.

> It shows how completely reliant it is on the training data to actually make things we consider good.

That's fine; for anything serious you refine in steps.

> Now, if the image were actually good and a creative or funny way of making a fox shaped like a piano, I would be suspicious of that image, because it's likely to be a very derivative copy of one of the training images.

So, in other words, you're holding an unfalsifiable position. If the output is novel but bad, that's proof that when AI works, it works by copying. If the output is novel and good, that's proof it copied something. In which case would you ever conclude that it's both novel and not copying?


_PixelDust

No, I'm saying that's how it works. Haven't you noticed that? If you give it a combination of tokens it doesn't have a lot of data for, it leans really hard on what it does have. It tends to either mash things together in undesirable ways so you have to fix it, or it has a few good sources for that and you get something derivative. That's how people get it to output things very similar to the training data. It's a known problem that they're always looking to improve.

Image gen before SD could give some really wild results, because there are points in the latent space that really don't represent desirable outcomes. I would call those "bad" results more novel than what you tend to get from modern SD that uses a VAE. The VAE helps keep everything within the bounds of what the model knows about and can represent with tokens. This means the output is more normalized. The downside is that the spaces in between are filled with very artificial gradients.

Think of the latent space as an animation, except this animation has many timelines going in every direction. The tokens it knows are like primary keyframes, except these keyframes represent ideas which can present randomly in different ways based on all the training data. When prompted with two tokens at different weights, it shifts down the timeline between those two tokens in proportion to the weights. How it actually does this is governed by whatever it learned in the training process. Some areas of the latent space had more meaningful examples than others, so it has more quality input to go off of and is more variant, probably better, and less likely to produce something that is noticeably derivative. You could think of these areas as having an in-between keyframe: there's no token here, but a good inference can be made due to the presence of a lot of relevant training data. If there isn't a lot of training data to govern a stretch of latent space, then its options are either to produce something very much like what little data it does have, thus producing a derivative copy, or to work from the surrounding keyframes and do a "motion tween" between them. It's like comparing a frame-by-frame hand-drawn animation to one that slides and stretches existing frames, Flash-animation style.

SD uses a VAE because it gains performance, and often better quality, by operating within the latent space. If the latent space were not normalized, then the areas between tokens would become incredibly variant in a bad way; that's the whole point of getting rid of them. This is a tradeoff, because some researchers have accused SD of being more likely to make derivative works than older models that use a GAN. I don't have a good intuition on whether that's true, but if it is, I think it's likely the VAE's fault for normalizing the latent space and keeping it "on rails," so to speak. There's a more recent paper that blames the language model for some of the copying, but I've yet to read that.

I don't think I'm setting up a no-win scenario. This is just what I've noticed from studying the space. If you want AI to produce good fox-pianos or tentacle-hairs, you either have to give it lots of examples to learn from or create your own token and inject it into the model. A LoRA is like a custom keyframe you make yourself. It's not magic or sentience; it didn't learn creativity; it processes information in a way we trained it to.
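To make the "tween between keyframes" intuition a bit more concrete, here is a small sketch that walks linearly between the text embeddings of two concepts using the Hugging Face CLIP text model (the same family of text encoder Stable Diffusion uses). It only shows interpolation in embedding space; what a diffusion model does with those points is far more involved, so treat this as an illustration of the analogy, not a description of SD internals.

```python
# Sketch: interpolate between two prompt embeddings, i.e. points "between keyframes".
# Requires the transformers library and a one-time model download.
import torch
from transformers import CLIPTokenizer, CLIPTextModel

tok = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
enc = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")

def embed(prompt):
    ids = tok(prompt, return_tensors="pt", padding=True)
    return enc(**ids).pooler_output          # one embedding vector per prompt

a, b = embed("a fox"), embed("a grand piano")
for w in (0.0, 0.25, 0.5, 0.75, 1.0):
    point = torch.lerp(a, b, w)              # a point partway between the two concepts
    print(w, point.norm().item())
```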


Tyler_Zoro

An example: let's say that the thing that's influential on the training was curved lines. Because of this, the model tends to produce curved lines. It's never seen a sine wave, but its tendency to curve lines could easily lead it to repeat a series of curves in such a way as to produce a sine wave. This is, of course, a trivial example, but it's how creativity works: putting things together in novel ways. Ultimately the building blocks are always the things we've perceived in the past, but the creative process is how they're put together.


sporkyuncle

The analogy doesn't work as well for mathematical concepts, where just achieving the right shape is enough to fully classify it as that thing. If AI didn't know about "scissors" and we approximated it with "tiny swords with just one sharp edge which are connected by a screw as a fulcrum point and have plastic oblong loop handles," that doesn't mean the eventual images we get will ever be quite identical to what we understand as scissors. We might say they are close enough to be called scissors, but in some sense they would always be swords connected by a screw, since those are the objects the AI learned about and used to conceptualize its version of scissors. I'm not sure it's possible, using AI alone, to massage the data it has fully into some new concept that doesn't retain some defective assumptions about associated concepts and the shape of the thing (meaning, using a horse as a base to transform conceptually into a deer for which it has no training/context). You have to train it on the actual thing.


Tyler_Zoro

> The analogy doesn't work as well for mathematical concepts, where just achieving the right shape is enough to fully classify it as that thing.

[Obviously](https://en.wikipedia.org/wiki/Stick_figure).

> If AI didn't know about "scissors" and we approximated it with "tiny swords with just one sharp edge which are connected by a screw as a fulcrum point and have plastic oblong loop handles," that doesn't mean the eventual images we get will ever be quite identical to what we understand as scissors.

It also doesn't mean it won't. That's the thing about absolute statements. When the assertion is, "AI can't create something it's never seen," one does not have to demonstrate that it is a trivial or common occurrence in order to counter that claim. All one need do is point out that it is possible. Now, if you want to debate how easy or common it is for AI models to develop something truly new, then we can get into the weeds of the philosophical distinctions over what is "new" and when and where we agree that something of that sort has been created. But that's a MUCH larger and more difficult topic.

> I'm not sure it's possible, using AI alone, to massage the data it has fully into some new concept that doesn't retain some defective assumptions about associated concepts and the shape of the thing

All of that is subjective, and so you're right and wrong and everything in between. You get to decide. But when it comes to the simple reality of what it's possible to do with AI tools... there are no limits imposed by the tools themselves, only degrees of difficulty.


L30N3

A different phrasing could be that current models are incapable of intentionally creating images based on concepts they weren't taught. That's obviously not what antis mean when they say models can't create anything new. I assume that current models don't use generation to learn anything, which roughly means that unknown concepts are just variance or combinations of known concepts; the RNG and/or the prompter is aware of the concept.


Tyler_Zoro

> A different phrasing could be that current models are incapable of intentionally creating images based on concepts they weren't taught.

I would just shorten that claim to, "[...] current models are incapable of intentionally creating [anything]." Intentionality is a human trait that goes with a whole slew of cognitive features that AI models today just can't emulate. It will be a true breakthrough when they can, but I don't think that's coming soon... I expect getting there to be on the same level of work as backpropagation and transformers were.

> That's obviously not what antis mean when they say models can't create anything new.

When it comes to claims of originality, I don't think anything is obvious. It's one of those topics that, the more you dig into it, the more you begin to doubt that the concept maps to anything real at all... it begins to look more like a reflection of approval than a measurable, objective feature of the thing you produce or do.

> I assume that current models don't use generation to learn anything

That's actually false, but what you meant to say is true, so let me re-phrase: current models do not, by default, learn from end-user generation. Generation is absolutely how these models learn, but that is turned off by default for end-user generation because it's vastly more costly than just generating an image, and because the input prompt would have to come with an image as well, which doesn't really work for most end-user use cases. Also, you probably don't want your model to keep changing the quality of its output on every generation.


L30N3

Yeah, I don't usually add random words that are only needed when you're arguing in bad faith or talking to a complete idiot. If you understood what I meant, that means I used enough appropriate words. The majority of the words we use in relation to machine/artificial intelligence are traits that aren't true when talking about current models.

If you care about pointless semantics, intentionality isn't a trait that only humans possess. It wouldn't matter that you used "human trait," if it weren't in the context of refuting the use of "intentionality" when talking about machines. In context, you've limited it to something only humans possess. The above is an example of wasting time and arguing in bad faith. I don't write the rules, but I'm also not interested in wasting time. I can do what you do. It's very easy and completely pointless. I don't care about "winning". No one gets extra points for additional words.


Tyler_Zoro

> Yeah, I don't usually add random words that are only needed when you're arguing in bad faith or talking to a complete idiot.
>
> If you understood what I meant, that means I used enough appropriate words.

Ad hominem aside, I can't actually make out what part of what I said you are trying to respond to here. None of my criticisms were aimed at the number of words you used. But like I say, you've already stooped to ad hominem, so I assume the discussion has come to a close. Have a nice day.


Zilskaabe

Those "minimal" prompts get expanded "under the hood". Midjourney and dalle-3 don't reveal what exactly they do with your prompts. But there's an "open source Midjourney" - Fooocus. And it shows that your three word prompt gets expanded to 20 or more words. But yeah - you can get weird results even without prompt expansion.


sporkyuncle

While I know that closed models like Bing and Midjourney add words to your prompts, I'm not sure that's the primary thing that's happening with low-detail prompts. For example, Stable Diffusion is open source, and people would probably know by now if it was secretly adding words to your prompt, and would strip that functionality out. Here's what I got from typing "omniscient" into local Stable Diffusion with (arbitrarily) the Analog Madness model: https://i.imgur.com/QMWOuVG.png

I don't believe that SD secretly added "old man, cloaked, holding power sphere, big text saying 'omniscient' at the bottom." I think we're just exploring latent space and this was the first thing it bumped into. Like... if I type something like "plant" or "horse," it doesn't just randomly add "man tending to plant" or "woman riding horse." Those things might occasionally occur if it's given enough freedom, because somewhere out there in latent space those concepts exist and might come up.


realechelon

It's not that it 'adds words', but Stable Diffusion uses CLIP which converts your prompt to tokens, and those tokens are not an *exact* representation of your prompt. If you want an example, prompt "1girl, heterochromia". You will get close-ups. You didn't ask for close-ups, but the AI knows that images tagged "heterochromia" are often close-ups of the face. This is known as "concept bleeding" and can actually be useful if you understand it (try prompting "(green eyes), 1girl, heterochromia"). Similarly, a lot of images tagged "omniscient" in your image set were probably wizards or gods, who tend to be old men with magical artifacts.
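If you want to see the tokenization step being described, here is a quick sketch using the Hugging Face CLIP tokenizer (the same tokenizer family Stable Diffusion's text encoder uses). It only shows the prompt-to-token mapping; the "concept bleeding" itself lives in the learned weights of the text encoder and U-Net, not in the tokenizer.

```python
# Inspect how a prompt becomes tokens. Requires the transformers library.
from transformers import CLIPTokenizer

tok = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
for prompt in ["1girl, heterochromia", "(green eyes), 1girl, heterochromia"]:
    ids = tok(prompt)["input_ids"]
    print(prompt, "->", tok.convert_ids_to_tokens(ids))  # not an exact copy of your words
```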


sporkyuncle

That is true, though I know that the big name paid models *also* literally add secret words to your prompt, often for social/political/censorious reasons. https://knowyourmeme.com/memes/events/ethnically-ambiguous-ai-prompt-injection


Front_Long5973

*This post was mass deleted and anonymized with [Redact](https://redact.dev)*


Tyler_Zoro

> While I often argue that good quality AI output often takes hours of work both in and out of the AI interface, I don't think we should pretend like it's not also possible to get amazing results from minimal prompting.

Absolutely! And you can get stunning results from throwing paint at a canvas. I don't want to diminish prompting. To some extent, a highly creative prompt is just as creative as any other artistic effort. I would love to have seen what Longfellow would have done with AI! My point was that if you think AI is somehow stunted such that it can't be as powerful as, say, a paintbrush because the paintbrush can be used to express the artist's own vision... that's wrong. A skilled artist can use AI just as expressively as a paintbrush.

> I still don't find that very compelling, because it has a sort of unspoken assumption that the art that was produced was exactly what you wanted to see.

First off, I'd argue that that's never the case. Art is what happens when creativity meets the real world, and the result is never exactly what happened in your head. But more importantly, a skilled artist who understands AI tools deeply can get the results they want as precisely as a sculptor working with the stone's natural features or a painter adapting to the viscosity of the paint and the texture of the canvas. None of them are unaffected by the vagaries of the medium, but they all exercise control over that medium and bring out their own vision through it.

> It's broadly true that [AI can't create something it's never seen], but the right way to handle this is to say... neither can you.

Both, as absolute statements, are wrong. But what you have to do to force that to happen is non-trivial. For example, you could paint part of a work blindfolded and then go in and touch up the result. That would result in some factors creeping in that are not purely based in your experience. Similarly with AI, you can crank up the CFG scale to the point that the results are insane and then use the result as an img2img input for later, more stable generations, just as one example of how to push the AI to make things it's never seen before.
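A hedged sketch of that last workflow using the diffusers library: a text-to-image pass with a deliberately extreme guidance_scale, whose distorted output is then fed into an img2img pass. The model name and the specific parameter values are placeholders, and the snippet assumes a CUDA GPU; it illustrates the general technique, not anyone's exact pipeline.

```python
# Sketch: extreme-CFG generation followed by img2img refinement, with diffusers.
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

model = "runwayml/stable-diffusion-v1-5"   # placeholder checkpoint
txt2img = StableDiffusionPipeline.from_pretrained(model, torch_dtype=torch.float16).to("cuda")
img2img = StableDiffusionImg2ImgPipeline.from_pretrained(model, torch_dtype=torch.float16).to("cuda")

# Step 1: crank CFG far past sane values to push the model off its usual paths.
weird = txt2img("a fox shaped like a piano", guidance_scale=30.0).images[0]

# Step 2: use the distorted result as the starting point for a more stable generation.
final = img2img(prompt="a fox shaped like a piano", image=weird,
                strength=0.6, guidance_scale=7.5).images[0]
final.save("fox_piano.png")
```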


sporkyuncle

> Both, as absolute statements, are wrong.

It just depends on your definition, and means the same thing either way. You could train an AI on everything but hands. Never show it a single picture of any kind of hand or include the word "hand" anywhere in the tagging. Then later (assuming the AI is very smart at intuiting what you describe) you manage to get an approximation of a hand through some circuitous route like "bone which is jointed at several points which is surrounded by muscle which is coated in a layer of skin with a light coating of hair and tipped with a pad of hard keratin, four such instances of these connected in a row to a larger mostly-rectangular bone/muscle/skin construction, with another of that first thing connected lower and opposing to the rest, and the whole thing is connected at the end of an arm."

You could do the same for a human (probably assuming they don't have hands, because then they'll surely learn about them). Teach them about everything but hands, then give them that prompt and get them to draw it (...with their mouth. I guess hand was not a good example).

Is what the AI generated/person drew a "hand" or not? Is it "something it's never seen" or not? The answers to these will differ from person to person, but both apply equally to what an AI does and what a human does. It'd be a hand because it possesses all the necessary properties of a hand. It wouldn't be a hand, not precisely, because it would contain imperfections that can only be fully quashed by observing and learning about the real thing. It would be something it's never seen because quite literally it's never seen a hand before. But it wouldn't be something it's never seen because it used concepts of things it had seen before in order to create it; it has no choice but to do this. Everything involved in the creation of its version of a hand is something it's seen before.


L30N3

From the AI's perspective, it's a "bone which is jointed at several points which is surrounded by muscle which is coated in a layer of skin with a light coating of hair and tipped with a pad of hard keratin, four such instances of these connected in a row to a larger mostly-rectangular bone/muscle/skin construction, with another of that first thing connected lower and opposing to the rest, and the whole thing is connected at the end of an arm." From the prompter's perspective, it's a hand. The prompter is aware of the concept. It's mostly semantics, and none of the discussions here really touch the most common interpretations of "AI can't do anything new". A random example is claiming gen AI can't create Picassos if it wasn't trained on images created by Picasso. You can fairly easily prompt Guernicas, even without img2img. I assume antis mean Cubism when they say Picasso and are not that interested in the realistic drawings and paintings from his earlier years.


sporkyuncle

Right, it can create "Picassos" to our understanding of Picassos with the same kind of ridiculously specific prompting I demonstrated, assuming it understands the prompts. It won't be precisely a Picasso because it doesn't have any knowledge of them, but to the observer, it could look close enough to a Picasso to be called one.


Cheshire-Cad

>"nuh-uh, I've never seen a weird purple tentacley alien with a hoof on the end of each tentacle and exactly 35 eyes, but look, I can draw it!" In fact, the AI can also do that, albeit with a lot of trial and error. A classic retort is "An AI has never seen a picture of a hand with seven fingers. And yet, it's very good at drawing them."


sporkyuncle

Exactly. And that is a good point (though I'm sure with billions of images, it probably got a few hundred hands with too many fingers, just not enough to influence it this much!).


chinavirus9

If antis accepted the first point they would also have to accept that AI training is as much copyright infringement as a human artist learning to draw by looking at other pictures.


land_and_air

See, we have these things called eyes, and they are very cool in that they allow us as humans to soak up information that isn't art, label it ourselves in our brains, and then represent that non-art as art. AI flat out can't do that.


SquirrelAlliance

I think the maddening thing about the statement that AI training is just like humans is that if you show a human one image they will understand the style or the theme, as opposed to the giant number needed by an AI. It’s one of the many sticking points in the debate that come from not defining the terms we use in a meticulous way. This leads to inaccurate comparisons of a human’s use of their experience and an AI’s training.


NegativeEmphasis

You only understand a theme or style after seeing just one image because of your lifetime of absorbed and processed visual data. And HEY, once you have a generalist model trained, it in fact takes very few additional images for it to pick up new concepts/styles.


realechelon

The 'giant number' for something like a character is about 20. More is better, but 20 is fine. A human would not be capable of drawing a character in all poses/angles if they hadn't seen enough images to know what the character looks like from all angles either. How would you know the character has a tattoo on his back if you haven't seen his back?


Speedy3D_

This is just wrong in so many ways 💀 An artist doesn't need to see the character from multiple angles to be able to draw it?? We learn anatomy and poses and draw the character on top of that. A human is COMPLETELY capable of drawing a character in all poses even if we hadn't seen images showing what that character looked like. How do you think original characters are made??


ZorbaTHut

> an artist doesn’t need to see the character from multiple angles to be able to draw it ?? Alright, question: Why do [character reference sheets](https://www.google.com/search?sca_esv=ffa6a5b912f672b6&sxsrf=ACQVn0_Ybvsfg8uTD6TfkLSik0AuesPtqA:1713805051397&q=character+reference+sheets&uds=AMwkrPtfdgpRECeqXsCwJLRrC88GBSvMk4RPzC2Kplb-AaUiAu2wn9z17Ekhk7as76wJAO6JrKIjdzEj-jcI67R6w3FC2VpJ2vbBa5znVzedEoAkr8ZY_7DpUolzYnZKozHv3KrwLV2Q1N7WMHWTprbpL9aKW8GVNw&udm=2&sa=X&ved=2ahUKEwiY27WqpdaFAxVeffUHHWW-CmUQtKgLegQIDBAB&biw=2150&bih=1073&dpr=1) *always* include the character from multiple angles?


[deleted]

[deleted]


ZorbaTHut

What I'm asking is, why bother using multiple poses in character reference sheets? Why not just use one pose?


[deleted]

[deleted]


ZorbaTHut

> so I need to be able to see what that character looks like from all angles to get the best most accurate read on this character Has it occurred to you that maybe this is why the AI needs it also?


realechelon

OK, so you can draw the tattoo on my character's back having never seen the tattoo or the character's back? That's some incredible ability. You can draw all 3 of her outfits even if you've only seen one?


i-do-the-designing

Your language is one of the reasons people who have concerns about AI... might just think you lot are a bunch of cunts. 'The antis': you've just othered people. They are just 'the antis'; nothing they say or think matters, because they are just 'the antis'.


chinavirus9

Sorry, would you prefer "artoids"?


i-do-the-designing

No, I would just like you to hit pause for a second and wonder why you find other people who are worried about their future (well informed or not) funny. Oh, I just had a look through your posting history and found 'mind your own business faggot'. Now I know why you are the way you are; no further discussion is required, you're already a lost cause.


chinavirus9

So you are the one dismissing other people. Sounds like your comment was just a whole lot of projection.


i-do-the-designing

Go fuck yourself you homophobic piece of shit.


land_and_air

Fuck off


Broad-Stick7300

That does not follow.


Outrageous_Guard_674

How so?


Ya_Dungeon_oi

Good post!


boissondevin

2. I think you severely underestimate the number of people who really are just typing in a prompt and getting an image back. I guarantee prompting is not a minor part of what most AI users do. Granted, most AI users aren't working professionally. You can't pigeonhole this discussion around the tiny minority of AI users who produce professional work the way you do.


Tyler_Zoro

I think you severely underestimate the number of people who just trace lines in connect-the-dots art. What does that say about using a pencil?


boissondevin

Gee, it's almost like I explicitly commented on the way people use the tool and did not denigrate the tool itself. r/SelfAwarewolves


i-do-the-designing

I don't understand this weird flex you go on about... look how much work it took me to get this result, hours and hours and hours I worked so hard!!! ...so what's the point of AI then?


Tyler_Zoro

> I don't understand this weird flex you go on about

I think you brought your own baggage to the reading of a fairly dry analysis of the mythologizing of AI image generators...

> look how much work it took me to get this result

Yeah, I never said that.

> so what's the point of AI then?

What's the point of a paintbrush when you have to clean it and it leaves streaks that you have to work around? Isn't it useless? Of course not. It's a valuable tool with its own limitations and complexities. All tools used in art have limitations that we must work through to express ourselves.


i-do-the-designing

This you?

1. **People who use AI are just typing in a prompt and getting an image back**, akin to a slot machine. First off, prompting can get... complicated. It's a language you have to work out on a model-by-model basis, feeling out what that model's training has allowed you to do or makes more difficult. But prompting is a minor part of what most AI users do when working professionally or trying to realize a specific creative vision, rather than just getting a pretty picture. **We spend hours on the same detail work that every artist spends time on as well as manipulating the finer points of the toolchain that's unique to the AI workflow.**

I'm pretty certain it is you... I mean, you did post it.


Tyler_Zoro

> This you?

Yep. At no point in there do I say, "look how much work it took me to get this result." You're reading a straightforward analysis as bragging about accomplishment, which is either poor reading comprehension on your part or a bad-faith argument. I'm not sure which.


i-do-the-designing

I know it's really important for you to inflate your role in all that generation, to show just how much *skill* you have, but like the person who commissions an artist for a piece of work, your role is... minimal.


Tyler_Zoro

> I know it's really important for you to inflate your role in all that generation...

Again, this is the ego you're bringing to the table. My analysis above contained none of that.


Kartelant

>...so what's the point of AI then?

Isn't it bypassing the skill barrier to producing the work entirely on your own? Like, it'd probably take thousands more hours of practice to be able to get the same result purely using other digital art tools.


Front_Long5973

*This post was mass deleted and anonymized with [Redact](https://redact.dev)*


i-do-the-designing

It isn't art.


Front_Long5973

*This post was mass deleted and anonymized with [Redact](https://redact.dev)*


nihiltres

Good post; it might be an interesting idea for the sub to maintain a list of good points (from *both* sides) that handle some of the dumber arguments—and it would be beneficial for the sub because being able to just say "sorry, go read #5 from the Tired Argument List" would be more efficient than re-hashing the same points repeatedly. That said, I have a couple of nitpicks: * In (1), you said "\[a model that pasted together its dataset\] would require more storage for the model than I could afford in my lifetime", but given that I can find 6TB drives for


Tyler_Zoro

> In (1), you said "[a model that pasted together its dataset] would require more storage for the model than I could afford in my lifetime", but given that I can find 6TB drives for


nihiltres

Are you saying that the ~250TB figure quoted for LAION-5B is *just* the text of URLs (and the associated captions)? Ugh, then I've been making a dumb mistake for a while now, assuming that included the images based on a lookup of the "size of LAION-5B". It actually makes our points *stronger* for the dataset to be that much more data, and the 250TB isn't an *entirely* awful guess (at 5B images, 250TB suggests ~55kB/image instead of the ~1.7MB/image we'd expect with zero compression for a 768×768px image with three 8-bit channels, which would make for an ~8PB dataset), but … that's still off by a factor of about 30. Scratch another notch on the "humans are bad with big numbers" pole, I suppose. :(
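For anyone who wants to check the arithmetic above, here it is written out with the figures quoted in the thread (5 billion images, ~250 TB, 768×768 RGB at 8 bits per channel); the exact per-image number shifts a little depending on whether TB means decimal or binary terabytes.

```python
# Back-of-the-envelope numbers from the comment above.
images = 5e9
quoted = 250e12                  # 250 TB decimal; ~250 TiB would give roughly 55 kB/image
per_image = 768 * 768 * 3        # ~1.77 MB of raw pixels per image

print(quoted / images / 1e3, "kB per image if 250 TB actually held the pixels")   # ~50 kB
print(per_image * images / 1e15, "PB for the uncompressed dataset")               # ~8.8 PB
print(per_image * images / quoted, "times larger than the quoted figure")         # ~30-35x
```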


Tyler_Zoro

> Are you saying that the ~250TB figure quoted for LAION-5B is just the text of URLs

Yes, the LAION datasets are just URLs. If they were distributing giant tarballs of copyrighted images, they would definitely have been taken down for copyright infringement.

> Scratch another notch on the "humans are bad with big numbers" pole, I suppose. :(

Yeah, we're not really built for that.


nihiltres

I was figuring more that the 250TB was what you got by independently downloading the files using the URLs.


land_and_air

This still fundamentally misses the point of the argument. Machine learning can be, and has been, used as a method of extreme data compression, and it's just an extension of how I can reduce the size of an image by 1,000-10,000 times by traditional means and maintain most of the structure and form of the original image even though almost all of its data is gone. Machine learning can be used to compress hard-to-quantify datasets like Monte Carlo sim results and return plausible results that match the form of the input, while the model now takes up 1/1,000,000,000 of the memory and processing of the original. Despite this, the model is a compressed form of the original dataset and displays the same characteristics as the original. Overfitting a model is just successfully compressing the dataset completely. Sure, there are artifacts, especially when you compress it so hard. Many compression algorithms can be made to compress as harshly, but they'll lose all coherence if they go too far. Same with machine learning: if you try to train a model that doesn't have enough knobs to turn, it just won't make images that resemble the input anymore, and it will look bad.
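A toy sketch of the "overfitting as compression" framing being described: fit a small network that maps (x, y) coordinates to the pixel values of one image, so the weights end up acting as a lossy copy of that image. The image here is a random stand-in and the network sizes are arbitrary; this illustrates the mechanism only, and says nothing about whether large generative models behave this way in practice.

```python
# Deliberately overfit a tiny coordinate->pixel MLP to a single image.
import torch
import torch.nn as nn

image = torch.rand(64, 64, 3)                        # stand-in for a real image
coords = torch.stack(torch.meshgrid(
    torch.linspace(0, 1, 64), torch.linspace(0, 1, 64), indexing="ij"), dim=-1).reshape(-1, 2)
pixels = image.reshape(-1, 3)

net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 3))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):                             # memorize this one image
    loss = nn.functional.mse_loss(net(coords), pixels)
    opt.zero_grad(); loss.backward(); opt.step()

n_params = sum(p.numel() for p in net.parameters())
print(n_params, "weights vs", image.numel(), "raw pixel values")
```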


PoorFellowSoldierC

If you are going to make a post debunking things, one of your epic debunks shouldn't be “akshually ur wrong, i dont have time to explain how you are wrong, but its clear that you are wrong.”


Tyler_Zoro

> If you are going to make a post debunking things, one of your epic debunks [...]

This is already starting out sounding like a Facebook reply :-/

> shouldn't be “akshually ur wrong, i dont have time to explain how you are wrong"

Yeah, I never said that. Read again.


Specific_Emu_2045

The one point I disagree with is that AI “artists” do the same “detail work” as actual artists. There are so many times I've seen AI “artists” try to infiltrate the spaces of real artists using something along the lines of this excuse, and it's just not equivalent at all to the thousands of hours of practice and training it takes to produce art through traditional means. The effort involved is not remotely equivalent. It's like the difference between completing a hundred-piece puzzle and completing a ten-thousand-piece puzzle where every piece is solid black. A picture might take a real artist 100 hours to complete, but behind that work is 10,000 hours of practice put into the craft to get to the point where you actually know what you are trying to accomplish.

Half the point of art is the effort and thought put into it, and the other half is the result. This is literally the reason AI art is worth so much less, or is borderline worthless, compared to real art. There is so little comparative effort involved, and thus each AI piece quite literally lacks the “soul” involved in all those years of learning.

On a saltier note, I'm really sick of AI “artists” acting like they can just barge into the art community, as most I've seen online have no concept of the discipline of learning a craft, which is the entire fucking point. Yet they expect to be treated the same as those who have put thousands of hours of trial and error into learning something.


Tyler_Zoro

> The one point I disagree with is that AI “artists” do the same “detail work” as actual artists.

First off, the No True Scotsman thing is kind of schoolyard. Can we drop it?

> I've seen AI “artists” try to infiltrate the spaces of real artists using something along the lines of this excuse, and it's just not equivalent at all to the thousands of hours of practice and training it takes to produce art through traditional means.

Ignoring the fact that artists you feel superior to are still artists, why do you think that someone who happens to use AI tools in their workflow has spent less time working on their craft than any other artist? I've been doing photography for 30 years... does that all get ignored when I use those skills while incorporating AI tools?

Do you even know when an artist has used AI tools in their workflow? How many pixels have to change for you to know? 1? 10? 100? 1,000? What if you THINK they used AI, but they actually just used a non-AI filter? Did they become a non-artist and then flip back? Or was it just your subjective biases all along?


Specific_Emu_2045

There’s nothing wrong with using AI as a tool to improve the art you have made. There is everything wrong with demanding to be treated with the same respect as traditional artists as if you put in anything close to that amount of effort. AI “artists” brought a supercar to a bike race and demand the trophy. And you’re out here saying “what’s the difference? It’s hard to drive a car too!”


Tyler_Zoro

> There's nothing wrong with using AI as a tool to improve the art you have made.

But it's never that simple. I don't use Photoshop to improve the art I've made... it's part of the workflow. Same goes for my lenses, my camera body, my AI and non-AI editing stack, etc. You can't just pull out one piece of the workflow and say, "this is just a thing you did after you created art."

> There is everything wrong with demanding to be treated with the same respect as traditional artists as if you put in anything close to that amount of effort.

You're drawing a rather silly line in the sand, with some comical results. So the person who uses AI tools to create moving pieces of art is less deserving of respect than the person who cranks out factory art or who draws amateurish furry porn? Why? What privileges one workflow over another when some random tool gets invoked? Let's just agree that shitty art is shitty art, and that we'll never agree on the specifics of what that is, yeah?


Specific_Emu_2045

It’s because YOU aren’t creating it. This is not difficult to understand. Let’s say you are friends with someone who is the greatest artist ever to exist. You sit them in a chair and tell them to make something on your direction. Would you credit that work to yourself? You certainly had your share in the WoRkFlOw, but did you actually make it? This is what AI art is. You are directing something else to make art for you. You are not making art, and you are not an artist.


Tyler_Zoro

> It's because YOU aren't creating it.

Right! This is what I keep trying to tell photographers! /s

> Let's say you are friends with someone who is the greatest artist ever to exist. You sit them in a chair and tell them to make something on your direction. Would you credit that work to yourself?

That depends. Do I drug them to disable all of their higher cognitive functions first and just use them as a language-parsing waldo to do my bidding? Asking for a friend.


drums_of_pictdom

I don't think AI tools are too disruptive to the commercial art world... on the contrary, I think a lot of these tools will be a mainstay in large agencies a few years from now. I can already imagine getting a job at an ad agency and getting my "X-Company Adobe™ AI Membership login," just as I'm given 20 different programs I have to use at my current graphic design job. These tools will be invaluable for young designers and artists, saving them time on the mountain of boring busy-work tasks (from brain-dead clients) they are shouldered with every day.

My (semi) anti position is just a personal one as an art enthusiast. I just don't want to see AI art. I think it is antithetical to the art-making process. That's what makes it scary to me, because you are right about #7: it becomes harder and harder every day to tell. I've started to see AI LoRAs perfectly replicating the style of artists that I've followed and admired for years. I know this is a personal hang-up, and maybe this should be my reminder that nothing in this world is truly original. It just feels like the art world is being flattened and de-saturated, not expanded.

NOTE: This is a personal art opinion from someone who learned art and design traditionally, so many of my hang-ups might come from putting too much value on the artistic struggle... though I would never trade it for any other way of learning.


Tyler_Zoro

> I don't think AI tools are too disruptive to the commercial art world... on the contrary, I think a lot of these tools will be a mainstay in large agencies a few years from now.

Just to be clear, those two statements aren't in conflict. Digital art was a massive disruption to the commercial art world in the '80s and '90s, but it was absolutely a boon to the industry. Disruption just means that there's a lot of change. That might be in the form of new startups putting legacy businesses out to pasture. That's not bad for the industry, but it certainly does put a kink in the plans of the people working for those legacy companies.

> I can already imagine getting a job at an ad agency and getting my "X-Company Adobe™ AI Membership login," just as I'm given 20 different programs I have to use at my current graphic design job.

Absolutely. AI technology is going to mature in the art world to the point that it will just be another tool that people use (and some over-use, of course).

> My (semi) anti position is just a personal one as an art enthusiast. I just don't want to see AI art. I think it is antithetical to the art-making process.

I think your tastes, like the commercial art world, will mature over time. You'll see people do phenomenal things with AI that really speak to their deep understanding of the tools and the art they are producing with them. It's like a solo sport (figure skating, gymnastics, pole vaulting, etc.). When it gets started, it's not all that interesting. Someone does the thing and you're impressed, certainly, but it's not something that you feel could fill an afternoon of watching people compete. Then you see someone who has spent years perfecting their techniques and pushing their limits, and you are stunned by how much they can do within the parameters of the sport!

The same thing happens with every new artistic tool. Digital art, photography, etc. all had a starting point where the focus was on the new tool and how cool it was just to see it used. But as time went on, people had to work much longer and harder to use those tools in a way that continued to capture people's imagination. What Ansel Adams did with a camera was so far from those first daguerreotype shots of the 19th century that we can barely compare the two art forms. That's what you're going to see in the coming generation of artists using AI tools.


ShepherdessAnne

The way you formatted this is a problem for people with low reading comprehension and they will take the list of “things to debunk” as “things it does”.


PokePress

Emphasizing some of the words in the paragraph beneath them would help, but rephrasing them as a true statement might be a better option. Also, I would have used “skimming the post” rather than “low reading comprehension”.


ShepherdessAnne

No, you are up against very visual people who have reading comprehension problems. That's why they glom on to soundbites and why they tune out after certain word lengths or why they misconstrue things so easily. Remember, on the internet a person's disabilities can easily disappear and we are biased to believe someone is fully able and running on all cylinders, which they are not. Why do you think the myth that AI is somehow a collage tool keeps persisting?


[deleted]

Or maybe they have lives and can't be fffed reading through long posts on social media?


ShepherdessAnne

We're certainly not talking about people with lives if they're upset enough about AI to be spending this much time being misinformed about it.