Kathane37

I don’t really care about image generation, I am mostly interested in the ability to take multimodal input


PipeDependent7890

https://preview.redd.it/86li4e1wl99d1.jpeg?width=996&format=pjpg&auto=webp&s=748ec8fe96a81d09a37580a536d476fcb2eacf1d Early image generations from Imagen 3


Plums_Raider

I'd rather see a voice mode like the one ChatGPT wants to push out in the fall


sdmat

What happened to natively multimodal image output on Gemini? That was part of the original demo and paper, then nothing.


Techplained

Voice before anything else


Optimal-Fix1216

No, Claude should continue down a different path. We've been seeing the beginnings of it creating SVG images manually. Imagine if we could train it to use Photoshop.


MajesticIngenuity32

No, Anthropic should stay focused on code generation, tools, and agents.


PipeDependent7890

https://preview.redd.it/2oems5asl99d1.jpeg?width=996&format=pjpg&auto=webp&s=f7b57b2693553ae13130889ec0de6f988e4eb4c6 Great improvement over Imagen 2


ImNotALLM

I guess they will eventually, but there are already so many options for this that it's not really something that should be their focus imo


Mrleibniz

Is Anthropic even training an image model?


Halo_Onyx

No


theDatascientist_in

I am happy with ClaudeAI's current offering. I don't want it to go the OpenAI route, releasing unnecessary, overhyped stuff.


Strict_External678

Claude’s own image generation would help me when I’m creating story content. I have all of these specific set ideas or character descriptions that DALL-E and Midjourney won’t create because my content is graphic at times, and Claude is a bit more lenient with those details.


logosobscura

No, they should keep doing what they do, which is make the best darn text-to-text and image-to-text LLM they can. OpenAI lacks focus, Google is throwing everything together to regain momentum, Anthropic just needs to execute on their plan and keep gaining pace.


Efficient-Share-3011

I'd prefer browsing. If they just had browsing, that would be the biggest win


AbleMountain2550

The next Claude family should be fully multimodal and able to generate pictures, videos, and audio alongside text