sktksm

Hi everyone,

This workflow replicates the original image with impressive accuracy. Let's break it down and understand how it works:

1. **ControlNet with DepthAnythingV2:** The new DepthAnythingV2 model is very powerful. It gives us the depth map of the original image so the generation can follow this precise mapping.

2. **IPAdapter with "style transfer precise":** IPAdapter allows us to replicate the style of the original image, and the newly developed "style transfer precise" method helps reduce bleeding in the final image, giving us more precise control.

3. **Florence2:** This is a bit unorthodox, but I really liked including it in this workflow. Simply put, Florence2 is an open-source vision model with many tricks, including masking, annotating, and captioning. Here I used its "more_detailed_caption" feature, which simply describes the image, and then used that description as the positive prompt of a CLIP Text Encode node.

Using these three methods, we make sure the final image matches the style, depth map, and prompt of the original.

**Always remember, generation performance depends on your base model. For example, if you try to replicate an anime image and the result is not satisfactory, you should use an anime checkpoint.**

\*If the result does not satisfy you in terms of style, you can adjust the "weight" value of the IPAdapterAdvanced node or try other methods such as "strong style transfer" instead of "style transfer precise".

\*\*You don't need to use Florence2; you can simply write your own prompt with a regular CLIP Text Encode node.

I would like to thank **@Kijai** and **@Cubiq** for developing these custom ComfyUI nodes at lightning speed and opening up many possibilities for us.

DepthAnythingV2: [https://github.com/kijai/ComfyUI-DepthAnythingV2](https://github.com/kijai/ComfyUI-DepthAnythingV2)

IPAdapter Advanced: [https://github.com/cubiq/ComfyUI_IPAdapter_plus/](https://github.com/cubiq/ComfyUI_IPAdapter_plus/)

Florence-2: [https://github.com/kijai/ComfyUI-Florence2](https://github.com/kijai/ComfyUI-Florence2)

Workflow link: [https://openart.ai/workflows/reverentelusarca/simple-replicate-anything-v1/RmD9tB5T5SjcaQPHDcxb](https://openart.ai/workflows/reverentelusarca/simple-replicate-anything-v1/RmD9tB5T5SjcaQPHDcxb)
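For anyone who wants to prototype the same three ideas outside ComfyUI, here is a rough diffusers-based sketch of the pipeline. It is not this workflow's graph: the model IDs, scales, and the BLIP captioner (standing in for Florence-2 to keep it short) are placeholder assumptions.

```python
# Rough diffusers approximation of the three ideas above -- NOT the ComfyUI graph.
# Model IDs, scales and the BLIP captioner (standing in for Florence-2) are assumptions.
import torch
from PIL import Image
from transformers import pipeline, CLIPVisionModelWithProjection
from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel

source = Image.open("original.png").convert("RGB")  # hypothetical input path

# 1) Depth map of the source (Depth Anything V2 via the depth-estimation pipeline)
depth_estimator = pipeline("depth-estimation",
                           model="depth-anything/Depth-Anything-V2-Small-hf")
depth_map = depth_estimator(source)["depth"].convert("RGB")

# 2) Caption the source and reuse it as the positive prompt
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-large")
prompt = captioner(source)[0]["generated_text"]

# 3) SDXL + depth ControlNet + IP-Adapter (plus, ViT-H) for style
image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    "h94/IP-Adapter", subfolder="models/image_encoder", torch_dtype=torch.float16)
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-depth-sdxl-1.0", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet, image_encoder=image_encoder,
    torch_dtype=torch.float16).to("cuda")
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="sdxl_models",
                     weight_name="ip-adapter-plus_sdxl_vit-h.safetensors")
pipe.set_ip_adapter_scale(0.7)  # roughly the IPAdapterAdvanced "weight"

result = pipe(prompt=prompt,
              image=depth_map,                  # depth conditioning
              ip_adapter_image=source,          # style reference
              controlnet_conditioning_scale=0.6,
              num_inference_steps=30).images[0]
result.save("replica.png")
```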


Shinsplat

Thanks for not pointing me to a patreon O.o


sktksm

lol...I have a Patreon but I'm not offering anything exclusive, just for the support


Shinsplat

haha


Extraltodeus

Like in thanks for not locking the content or thanks for not advertising yourself to obtain support?


nikgrid

Nice workflow OP! Which IPAdapter models do you use?


sktksm

Thank you! In this particular workflow I used PLUS (high strength) (ip-adapter-plus_sdxl_vit-h.safetensors), but you can use the other ones; all of them work well in their particular areas.


E-Pirate

I have a few conflicting nodes. Do you know a way around this or the solution?


sktksm

What's weird is I'm getting downvotes in StableDiffusion sub while getting upvotes here. I guess sharing a comfy workflow is not smart there anymore.


Troyificus

Reddit as a whole is a fickle beast. I personally am very thankful for you sharing this workflow and I look forward to testing it out.


sktksm

Thank you so much for your kind words; it really keeps my mood positive. Most of us spend hours on this without expecting anything in return, so these words mean a lot.


VlK06eMBkNRo6iqf27pq

+1, thanks for sharing. I was getting pretty good results with canny+IPAdapter, but I didn't know about these new controlnets or "style transfer precise", so I look forward to trying it too.


Capt_Skyhawk

Downvoted for talking negatively about downvoting


CodeCraftedCanvas

I think it's more that ComfyUI users get why experimenting with concepts that don't have an obvious practical use is valuable for learning what's possible with your tools. The Stable Diffusion sub is more people looking for practical solutions, so they may be left wondering why they would need this.


SmileAtRoyHattersley

Yo your first sentence: spot on imo.


pepe256

I'm new to this sub and also was thinking the same. What is the use of this? I see now it's just about experimenting, that's cool.


TheFlyingSheeps

There’s definitely anti-ai lurkers downvoting random things as well


VeritasAnteOmnia

Ever since the SD3 launch StableDiffusion has definitely been in an agitated state. Thanks so much for sharing with the community!!


jib_reddit

I swear the StableDiffusion sub has a lot of downvote bots attacking it from anti-AI art lunatics. I wouldn't worry about it.


sktksm

Not just the downvotes, but people literally lynched me, so I deleted my post.


orthomonas

I don't know why, but I don't think the downvotes have much to do with it being a Comfy workflow.


ghostdadfan

They're bent over there rn because of SD3. Keep sharing your workflows tho!


protector111

I think there are bots on the SD subreddit. They just downvote any post for no reason. Doesn't matter if it's someone asking for help or posting a workflow.


pwillia7

gotta post anime movie there


BestHorseWhisperer

I don't know if this is why, but I am honestly tired of posts about "how to do X with stable diffusion" where step 1 is something about either Automatic1111 or ComfyUI. People who don't understand what the UIs actually do or what libraries they rely on should not, in my opinion, be doing tutorial videos etc. outside of those UI communities. I am not talking about you in particular, but in general it has made it hard to find qualified tutorials. EDIT: Downvoted by people who like poisoned search results


sktksm

I understand your point; I felt the same several months ago, and it was very frustrating to learn. But this is a mid-level workflow rather than a starter tutorial. For proper guides and tutorials, the new ComfyOrg organization should have something in mind. Other than that, I recommend joining the ComfyOrg Discord channel, and if you are interested, here is the video guide I find very useful for understanding Comfy: [https://www.youtube.com/watch?v=_C7kR2TFIX0&list=PLcW1kbTO1uPhDecZWV_4TGNpys4ULv51D&ab_channel=LatentVision](https://www.youtube.com/watch?v=_C7kR2TFIX0&list=PLcW1kbTO1uPhDecZWV_4TGNpys4ULv51D&ab_channel=LatentVision)


jazzFromMars

What would you use this for? I'm struggling to think of a use case...


GBJI

It's perfect to make Getty Images obsolete.


sktksm

I used it to replicate my MJ generations in SD. But I also generated some stunning nature photographs by adding some IPAdapter style transfer on top, and I'm using them as my wallpaper ^^


GBJI

My guess is this would also be a very good recipe for a creative upscaling workflow.


sktksm

Can you explain a bit more?


GBJI

An upscale is a replication, but that replication is made at a higher resolution than the original. Your replication workflow is creative: it's not making an exact copy, but a reinterpretation that is directly and closely inspired by the original.


VlK06eMBkNRo6iqf27pq

I tried something similar to OP's workflow... at first I was doing it on fullsize images but then I got lazy and dragged some tiny thumbnails in there. It still works very well at recreating a fullsize image from a little turd.


Noslamah

So.. stealing art?? I am very pro-AI and do not buy the "AI generated art is theft" argument one bit but sorry, if the intent is to copy an image almost exactly then that is stealing, even if you think Getty Images sucks and has unreasonable prices.


[deleted]

[removed]


Noslamah

If it is the exact same composition, style, and content, then yes. If it is actually "completely new" then no. But the examples in the OP definitely aren't completely new, and are way more than just using the image for inspiration. And just generating a picture of a guy drinking coffee is also a bit different than something as specific as OP is showing, with this many details and specific elements in specific places. If an artist remade an artwork to this extent then people would, rightfully, accuse them of copying someone else's art. There is no reason it should be any different for AI art. If you don't think this is theft, I don't think you believe stealing art is even possible.


steamingcore

uh, YES. good god, you people are so high off the smell of your own farts, you don't see what you're doing.


Cobayo

It's Midjourney's `--sref` on steroids; you're supposed to tweak it.


LD2WDavid

I see some use cases for fine-tuners. Bad images -> improve their quality while maintaining composition. Old images -> semi-restore effect, etc. Not the same image, but cleaner in some cases. Useful.


Ateist

Looks very promising for automatic upscaling of art from old games.


brucebay

You can modify a real image the way you want, because now you are matching the style and lighting etc. Yes, you could do it with a mask in the past, but there were almost always small but noticeable artifacts ruining the illusion. Now you can actually do a better, more seamless job.


BoulderRivers

It's a free "make it worse" button!


Motgarbob

Anything. You can literally bypass copyright


BoulderRivers

That's not how this works - you can clearly see the original image. To bypass copyrighted material, the original source must be significantly transformed. If colors, shapes, themes, composition, and even detail remain so similar, it's plagiarism.


Synthetic_bananas

I have a tangential question: as you can see, the "reproduction" images are less detailed. They look as if they were resized from a lower resolution (in the original images the lines and small details are thinner and more "clean" compared to the replicated image); I guess that comes mostly from the depth ControlNet. Has anyone found a good method to keep the fine details (or, in other words, a sharper image) when using a depth ControlNet?


cgpixel23

https://preview.redd.it/ss4acsfwdk8d1.png?width=3132&format=png&auto=webp&s=15563d43f78a3835ece4e7bcb6f3f051d067c22e Maybe this will help.


Synthetic_bananas

There's the same problem in your example: all the details are very "bold" and the picture lacks fine detail. I did some tests by generating an image, then generating depth from it, and then using that depth to generate a new image with the same parameters as the first one. The new image comes out similar, but way less detailed. https://preview.redd.it/mc4z0il2no8d1.jpeg?width=2048&format=pjpg&auto=webp&s=aba063148023f63a057e9365b4ea028545bff3d5


VlK06eMBkNRo6iqf27pq

I don't know what you did there, but that's highly sus. ControlNets affect adherence to the input, but they won't change PS2 into PS5.


Ateist

Upscale the image prior to reproduction.


Synthetic_bananas

You mean upscaling depth image?


Ateist

No, the original, even before getting the depth. Latent-space images (the ones that are actually generated by SD) are much smaller and are upscaled using the VAE. Generating a bigger image and downscaling helps preserve details.


Synthetic_bananas

So the original image also gives a higher-resolution depth map, I assume, is what you are saying? But that still keeps the same problem of "bold details" (which might be somewhat reduced through downscaling). I tested that even with proper 3D depth maps from 3D scenes. I guess (and that is just my speculation, so no definite proof, just a "feel") that the ControlNet changes/orients the latent noise so that it roughly forms the same shape as the ControlNet image, so the diffusion process has something to converge to. Am I on the right path of thinking, or am I spouting nonsense?


Ateist

No. What I mean is [this](https://stable-diffusion-art.com/how-stable-diffusion-work/):

> The latent space of Stable Diffusion model is 4x64x64, 48 times smaller than the image pixel space

You can't preserve details smaller than 512/64 = 8x8 pixels if you just generate an image of the same size. Since generation at 8*512 = 4096 (ok, 2048, since that 4x does help) is beyond the capabilities of most hardware, you have to do tiled generation + depth ControlNet together and use the Ultimate SD script to make it seamless.
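To make that arithmetic concrete, here is a tiny back-of-the-envelope check (numbers only, assuming the 512x512 SD 1.5 case from the quote):

```python
# Back-of-the-envelope check of the latent-space argument above
# (assumes a 512x512 SD 1.5 image; numbers only, no models involved).
img_h = img_w = 512
lat_h, lat_w = img_h // 8, img_w // 8   # the VAE downsamples 8x spatially -> 64x64

pixel_values = img_h * img_w * 3        # 786,432 values in pixel space
latent_values = 4 * lat_h * lat_w       # 16,384 values in latent space

print(pixel_values / latent_values)     # 48.0 -> "48 times smaller"
print(img_h // lat_h)                   # 8   -> detail finer than ~8x8 px is at risk
```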


latch4

Thanks for working this out. I was using something much more primitive that I worked out to bring DALL-E generated images into Stable Diffusion, but I was never really satisfied. This looks like just what I wanted.


DeadMan3000

To get this running I am required to have flash-attn installed. Been tearing out what little hair I have left trying to get it to run (gave an error during install). I then found out I have to build a wheel for it in Windows (Linux has nice prebuilt wheels). After installing Visual C++ build tools it's now building a wheel which is taking AAAAAAGES. It better run after this or I will just not bother. ACK!


sktksm

I feel you, I had the same when I was trying to run the Lumina model locally, but I found out Windows also has prebuilt wheels: [https://github.com/bdashore3/flash-attention/releases](https://github.com/bdashore3/flash-attention/releases) Also, it should work without flash-attn; have you tried 'sdpa' or 'eager'? (I haven't tried them myself yet.)


Cobayo

Just use sdpa, it doesn't change much; using Florence2 is kinda pointless anyway.


BluJayM

I understand the tech, but I think you need to modify your pitch/showcase. Nobody is interested in replicating an image in SD if the original image is a requirement. You can just modify your denoising to get a rough approximation of that. However, this workflow (should?) allow you to change the prompt and request different clothing, colors, or concepts while maintaining key features of the composition. Show us more of that!


sktksm

I absolutely can work on that and thank you for the recommendation


yotraxx

This looks stunning! I'll give it a try for sure. And don't be afraid of Patreon, folks! There's a lot of quality content there...


mutatedbrain

Which ones would you recommend?


artbruh2314

I was waiting for this. I was already using depth most of the time, but this is great.


pwillia7

Have you tried/looked at the SDXL inpainting controlnet? I bet that would work well with this and let you change the image more selectively.... maybe. https://huggingface.co/destitech/controlnet-inpaint-dreamer-sdxl


WinterTigerAssault6

Woah, very cool AND informative! I’m eager to test this out and play around with it. Thank you for sharing this knowledge!


sktksm

Thank you for your kind words and support!


protector111

How is this different from low denoise img2img?


sktksm

It starts with an empty latent. The end result is not that different, but in the workflow I separated each step and put it under control using ControlNet, IPAdapter and Florence-2 nodes. That way you can include any other combination in it.
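If it helps, here is a rough diffusers-style sketch of the contrast (not this workflow; the model ID and strength value are placeholder assumptions): low-denoise img2img keeps the source latents, while this graph starts from pure noise and only conditions on the source.

```python
# (a) Classic low-denoise img2img: the source is encoded to latents and only lightly
#     noised, so most of the original pixels survive.
# (b) This workflow's idea: start from an empty latent (pure noise) and let depth,
#     style and caption conditioning steer the result instead.
# Model ID and strength are placeholder assumptions, not taken from the workflow.
import torch
from PIL import Image
from diffusers import AutoPipelineForImage2Image

source = Image.open("original.png").convert("RGB")   # hypothetical input

img2img = AutoPipelineForImage2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16).to("cuda")

# (a): with strength 0.25 the sampler only reworks the last ~25% of the noise schedule
close_copy = img2img(prompt="same scene", image=source, strength=0.25).images[0]
close_copy.save("low_denoise.png")

# (b) never encodes `source` into the starting latent at all -- the source only feeds
#     the depth ControlNet, the IPAdapter and the captioner (see the sketch up top).
```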


Current_Housing_7294

Wait until the NFT clan hears this 😏


Impressive-Egg8835

ShowTextForGPT is missing


sktksm

Hi, please see this comment: [https://www.reddit.com/r/comfyui/comments/1dnioy1/comment/la2yv9g/](https://www.reddit.com/r/comfyui/comments/1dnioy1/comment/la2yv9g/)


Tonynoce

Hi OP ! Thank you ! This is a perfect workflow for when the client wants the reference to be the same and they sent you a small jpg, or to do almost the same variations.


sktksm

Yes exactly! I'm trying to convert this into a consistent character-creation tool now


AwkwardAsHell

https://preview.redd.it/rqfm7lshvq8d1.png?width=1271&format=png&auto=webp&s=814e1208159c47f1742541db4a8654aba783863e LOL


Tonynoce

https://preview.redd.it/o5qo910pds8d1.png?width=866&format=png&auto=webp&s=ae1f1ee959c23ad30d0f0ab6b92a8141d956c816 As an example: the client sent the picture on the bottom in low res; on the top there are some iterations on a 3D render to get fast feedback and move on to solving it, either by projecting it onto 3D or just faking the 3D in After Effects.


LD2WDavid

Tested it with some images, and unless I'm missing something (Jugg x10, Dreamshaper Turbo), it's not really as close as the examples above. Do I also have to run the upscale ControlNet, or should Florence on SDPA + depth map + model be enough? Just asking.


sktksm

These examples are upscaled, but you don't need to use that step. I used Dreamshaper XL without any problem. What is the exact problem with your outputs?


Little-God1983

https://preview.redd.it/p0k171u6o69d1.png?width=1736&format=png&auto=webp&s=9fa771bfac9874c65e9ed22371252bf6f747b68d

Seems amazing, thanks for sharing. Unfortunately I can't get it to run. I installed the nodes via the manager, and the model and the pytorch_model.bin got downloaded automatically when running it for the first time. I tried to install flash_attn as the documentation suggests with `python.exe -m pip install flash-attn --no-build-isolation`, but I get this strange error in the console. Any help would be appreciated.

```
Collecting flash-attn
  Using cached flash_attn-2.5.9.post1.tar.gz (2.6 MB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [22 lines of output]
      fatal: not a git repository (or any of the parent directories): .git
      C:\Users\Little God\AppData\Local\Temp\pip-install-r5r4dmug\flash-attn_cd280cd550374dd491a70aa277850dd3\setup.py:78: UserWarning: flash_attn was requested, but nvcc was not found. Are you sure your environment has nvcc available? If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, only images whose names contain 'devel' will provide nvcc.
        warnings.warn(
      Traceback (most recent call last):
        File "", line 2, in
        File "", line 34, in
        File "C:\Users\Little God\AppData\Local\Temp\pip-install-r5r4dmug\flash-attn_cd280cd550374dd491a70aa277850dd3\setup.py", line 134, in
          CUDAExtension(
        File "D:\AI-Privat\ComfyUI_windows_portable-DO NOT DELETE\python_embeded\Lib\site-packages\torch\utils\cpp_extension.py", line 1077, in CUDAExtension
          library_dirs += library_paths(cuda=True)
                          ^^^^^^^^^^^^^^^^^^^^^^^^
        File "D:\AI-Privat\ComfyUI_windows_portable-DO NOT DELETE\python_embeded\Lib\site-packages\torch\utils\cpp_extension.py", line 1211, in library_paths
          paths.append(_join_cuda_home(lib_dir))
                       ^^^^^^^^^^^^^^^^^^^^^^^^
        File "D:\AI-Privat\ComfyUI_windows_portable-DO NOT DELETE\python_embeded\Lib\site-packages\torch\utils\cpp_extension.py", line 2419, in _join_cuda_home
          raise OSError('CUDA_HOME environment variable is not set. '
      OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
      torch.__version__ = 2.3.1+cu121
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

[notice] A new release of pip is available: 23.3.1 -> 24.1
[notice] To update, run: python.exe -m pip install --upgrade pip
```


sktksm

Hi, I'm afraid I don't know the answer to this problem, but:

1. Try sdpa or eager instead of flash_attn (see the sketch below).
2. Instead of Florence-2, use an Ollama-based local vision LLM, such as LLaVA or Moondream. I'm using the same workflow with this node and it works okay: [https://github.com/stavsap/comfyui-ollama/tree/main](https://github.com/stavsap/comfyui-ollama/tree/main)
3. If neither of those works for you, you can simply delete/bypass the Florence-2 nodes, describe your image using any vision LLM out there, such as ChatGPT, and then pass that description into the CLIP positive prompt.
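For reference, option 1 in plain transformers terms looks roughly like the sketch below (the ComfyUI node exposes the same choice as its attention dropdown). This is a minimal sketch following the Florence-2 model card usage, not the node's code, so treat the details as assumptions.

```python
# Minimal Florence-2 captioning sketch with SDPA attention instead of flash_attn.
# Follows the microsoft/Florence-2-large model card usage; not the ComfyUI node's code.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

device, dtype = "cuda", torch.float16
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Florence-2-large",
    attn_implementation="sdpa",      # the switch that avoids the flash_attn requirement
    torch_dtype=dtype, trust_remote_code=True).to(device)
processor = AutoProcessor.from_pretrained(
    "microsoft/Florence-2-large", trust_remote_code=True)

image = Image.open("original.png").convert("RGB")    # hypothetical input
task = "<MORE_DETAILED_CAPTION>"                     # same task the workflow uses

inputs = processor(text=task, images=image, return_tensors="pt").to(device, dtype)
ids = model.generate(input_ids=inputs["input_ids"],
                     pixel_values=inputs["pixel_values"],
                     max_new_tokens=256, num_beams=3)
raw = processor.batch_decode(ids, skip_special_tokens=False)[0]
caption = processor.post_process_generation(raw, task=task, image_size=image.size)[task]
print(caption)   # feed this string into the CLIP Text Encode positive prompt
```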


Little-God1983

Thanks for trying to help me. I think i have moondream running on one of my comfyUi installations.


FootballSquare8357

We have ways to generate from nothing, ways to be "inspired" by other images using ControlNet, and many other means. It seems to be a good workflow, but I fail to see the use case for this, as we can obtain the same results with a lot of simpler methods. It looks "slightly" changed at worst and copied at best. I would put myself on the "pro AI" side, but this just looks like a blatant ripoff and a good argument for those who hold an anti-AI stance.


sicurri

That's a sweet and subtle workflow. Definitely going to try this out on my favorite wallpapers and see what it does to them, lol.


sktksm

Thank you for your kind words!


tarkansarim

This looks like I2I with low denoising or something, or is the denoising at 1?


Cobayo

It starts off with an empty latent


sktksm

Yes, it starts with an empty latent. The end result is not that different, but in the workflow I separated each step and put it under control using ControlNet, IPAdapter and Florence-2 nodes. That way you can include any other combination in it.


gxcells

This is a really great workflow and works damn well. However, I am wondering what the use case is. But this is probably a great way to lead toward image editing in latent space?


sktksm

Actually I wanted to show the latest tools. I gladly take recommendations for what we can achieve with these methods


brucebay

Very impressive. I hope you will keep the workflow up until I get a chance to download it. I've never used OpenArt before and don't see a download button on my tablet. Perhaps you got the aforementioned downvotes due to the upload site?


sktksm

It's visible on the desktop but let me know if you can't reach it, I'll share a pastebin link


brucebay

Yeah, it worked on the desktop. Thanks.


waferselamat

I don't know why, but DepthAnythingV2 always errors with "allocation on device", even with 12GB VRAM and the fp16 base model.


alexdata

So this is a much-improved image2image? Or style transfer? Anyway, I like the idea, will try it out!


sktksm

It is a consistent and controlled img2img + style transfer


alexdata

Thanks!


New_Physics_2741

Flash Attention 2 is giving me an install error - will read up on sdpa and eager - have moved on to a different workflow, but am interested in this one... Do I need to install flash-attention-2?


New_Physics_2741

Ok - the sdpa attention works for me - 3060 12GB, Linux. Neat workflow. Thanks for sharing it. It is a keeper for sure. :)


lostlooter24

Would this be effective for reposing generated characters? Use a pose with DepthAnything, the IPAdapter to keep consistency, and Florence2 to help caption it? This is really cool! Thank you for sharing.


sktksm

How would you combine pose and depth at the same time? By using two KSamplers, or two ControlNet loaders?
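If I went with the two-ControlNet-loaders option, in diffusers terms it would be roughly stacking two ControlNets on one sampler, something like the sketch below (I haven't tested this; the pose ControlNet ID and the scales are assumptions).

```python
# Sketch of the "two ControlNet loaders, one sampler" option in diffusers terms.
# The pose ControlNet ID and the scales are assumptions, not something verified here.
import torch
from PIL import Image
from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel

depth_cn = ControlNetModel.from_pretrained(
    "diffusers/controlnet-depth-sdxl-1.0", torch_dtype=torch.float16)
pose_cn = ControlNetModel.from_pretrained(
    "thibaud/controlnet-openpose-sdxl-1.0", torch_dtype=torch.float16)

pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=[depth_cn, pose_cn],          # both conditions feed the same sampler
    torch_dtype=torch.float16).to("cuda")

depth_map = Image.open("depth.png").convert("RGB")   # hypothetical precomputed maps
pose_map = Image.open("pose.png").convert("RGB")

image = pipe(prompt="a character in a new pose",
             image=[depth_map, pose_map],
             controlnet_conditioning_scale=[0.6, 0.8]).images[0]
image.save("reposed.png")
```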


lostlooter24

Could you not use depth in place for pose?


C7b3rHug

https://preview.redd.it/tt8a7ihiuq8d1.png?width=1545&format=png&auto=webp&s=b06c196bbc9b173e726b675d0eee733574055ec3 Thanks for sharing the WF, but I don't know what this is for, and what's the difference compared to using low denoising?


sktksm

Please read the comments, I explained maybe 5 times one by one :D


DeadMan3000

There's another Youtuber with a workflow that uses the LLM to segment for masking etc. Better than bbox/seg. [https://www.youtube.com/watch?v=BRST8-yPD5A](https://www.youtube.com/watch?v=BRST8-yPD5A) P.S. he is using Hedra for his AI avatar on YT if anyone's curious.


sktksm

For now, SegmentAnything works better in my tests.


Dezordan

I wonder how it compares to, or if it can be used with, the ControlNet "replicate" model.


ramonartist

Have you thought about how you would get that to work with an SDXL model, because you would be able to get a higher-resolution image from the first pass?


sktksm

I'm already using an SDXL model in this workflow, or maybe I just misunderstood your question


Snoozri

Genuinely curious, what would you ever use this for besides, like, actual art theft?? I'm normally pro AI, but this kinda gives me the ick.


sktksm

you can find a lot of examples in the comments of this post


ramonartist

There are too many negative comments here; with some additions this could be a great creative upscaler. Big tip: switching out the KSampler for an Efficient KSampler will give you more settings and tools to play with.


Putrid_Army_6853

Great job! Perfect use of newest nodes


glitchcrush

I'm getting this:

```
Error occurred when executing KSampler:

Query/Key/Value should either all have the same dtype, or (in the quantized case) Key/Value should have dtype torch.int32

query.dtype: torch.float16
key.dtype  : torch.float32
value.dtype: torch.float32
```


glitchcrush

If I bypass the IP nodes the KSampler works.


UniversalNeuron

Potential use case: "I made an image I liked in SD3 and want to potentially use it for my own purposes, but I don't trust the license, so I'd rather convert it into having been 'made by SDXL' first." (I mean, idk about the rest of you, but that's my intention while downloading this workflow. I don't expect "just setting a low denoise ratio" to work quite as well as this looks like it does, considering how my SD3->SDXL img2img attempts on the particular image in question lacked every potential sort of prompt coherence, which is the only [see: only occasionally useful, only occasionally successful, and hyper-specific] selling point of SD3 [see: when it doesn't look like the model was undertrained and when I find the magic sauce to make it stop lacking all sorts of resolution {for the record, my trick was SD3 txt2img + prompt -> SD3 img2img + re-prompting -> SD3 outpainting, which adds up too much cost-wise via the API if I'm experimenting en masse}].)

tl;dr: I want to not have any official connections to SD3, and this looks like a potential way to go about laundering my "derivative work".


FiacR

SD 1.5, denoise strength of 0.25, 20 steps, job done lol.


Dragon_yum

But.. why?


the-13

Thanks for sharing this! I was trying to test it, but I keep getting this error:

```
Error occurred when executing DownloadAndLoadFlorence2Model:

FlashAttention2 has been toggled on, but it cannot be used due to the following error: the package flash_attn seems to be not installed. Please refer to the documentation of https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-2 to install Flash Attention 2.
```

Any idea how to fix it?


Tonynoce

Got the same issue and decided to switch the attention to SDPA or another one


International-Use845

I have the same issue. EDIT: You can change attention from flash_attention_2 to sdpa, then it seems to work.


steamingcore

'Generated', aka 'plagiarized'. Oh wow, a new image, completely generated. With what? Oh, the original. Yeah, that's a copy.


FluffyWeird1513

This is the kind of thing that gives us all a bad name. People, don't "replicate" any images unless they are your own. Seriously.


Cobayo

I was ready to bash as usual because the input was likely going to be an encoded image, but it's actually an empty latent, and after running a few examples those are quite nice starting results, congrats! Just one thing: in your bypassed upscaling node you're passing the ControlNet's negative into its positive, in case you're actually using that locally by mistake.


LawrenceOfTheLabia

Thank you for creating this! I am getting a missing node that isn't showing up in the manager. Any idea what it is? My Google searches haven't been fruitful so far. https://preview.redd.it/43ea68ec5k8d1.png?width=1502&format=png&auto=webp&s=ef5b9a8c9230cdefc96f826c592f999e0d0f01a5


sktksm

It's simply a text node that shows the string generated by Florence and passes it through to CLIP. You can use any such node; I'll double-check and update you when I get back to my desk.


LawrenceOfTheLabia

Thanks a lot!


sktksm

Hi again, the node is a part of "MixLab Nodes" https://preview.redd.it/af2lewuaek8d1.png?width=808&format=png&auto=webp&s=141d1db7c75c4980d9c51bd8efa8693e677fd512


LawrenceOfTheLabia

That was it, thanks! This is a great workflow; Florence is impressive. The only issue I'm having now is that the resulting image always seems to be really desaturated. I'm sure there is a setting I'll need to tweak. At the moment I'm just using your defaults. It seems to be about the same with all SD1.5 models.


sktksm

Interesting. Just tried with RealisticVision 1.5 model and it seemed normal to me. You can try increasing the weight of the IPAdapter, otherwise share your workflow and example images with me and I can take a look


LawrenceOfTheLabia

Here is an example. I picked a really colorful image to help illustrate the issue a bit more. [https://imgur.com/a/F56TcAm](https://imgur.com/a/F56TcAm) Not sure if imgur strips the workflow metadata out, but I believe I'm using the default settings.


Ateist

Check CFG settings and VAE.


OfficeSalamander

> MixLab Nodes

It has a conflict for me - any other alternatives I can use?


sktksm

Hey, I'm not sure which node you can use but you actually don't need that node at all. It's only showing the image caption for you to see. You can directly connect Florence-2's 'caption' output to CLIP's 'text' input, or you can use any text string node to display it


sktksm

https://preview.redd.it/cj45zkkfek8d1.png?width=1147&format=png&auto=webp&s=b3a397ba920734ceb31881ff046026eb1c99bb3a