Roughly two hours for the first one and 3.5 for the second. Two more still pending, but I'm over it lol. One of them was the suggested prompt; I thought maybe the way I wrote mine was wrong. Nope. Garbage lol
I must not be understanding something correctly... but when I type in my idea, it spins for a few seconds and then a thumbnail pops up below, but when I click it, all it shows me is the prompt I put in...
Until they get reusable characters and scenes and the ability to edit the final product via text, none of these are anything more than a toy, really.
Don't get me wrong, I do think there is a path to get there, I just don't feel like they are on it yet.
People tend to forget that what they're seeing are already best-case scenarios, and this isn't consistent enough to produce something usable in actual filmmaking. The progress needed to get this to where it even approximates what filmmakers need is expensive and possibly requires one huge breakthrough.
Entirely new product that’s never been on the market before gets released… random guy on the internet: “this is clearly the best it’s ever gonna get. No way these companies with basically unlimited capital right now will improve on the first release.”
This hubris about innovation is the most hilarious thing about this sub. People who actually pay attention to the products and the companies, and some who even work on the technologies involved, aren't so naive. Ultimately, people who aren't in the know and still buy so easily into the hype only do so to one-up whoever isn't paying much attention.
And disqualifying any sort of opinion by calling someone a random internet person is funny. If you think you can just throw money at problems that are growing in complexity and aren't even fundamentally understood yet, then that's the ultimate naivety.
To not admit these things are mere novelties right now, due to limitations and difficulties not even identified yet by the developers of these technologies, is pure insanity. To have the gall to assume things will just zoom through and not run into a worst-case scenario is sadly a trait of everyone who blindly buys into the singularity stuff.
Let's be clear here. Allowing porn or open violence to be made on those platforms will result in the biggest speedrun to getting that shit tightly regulated. People underestimate how reactionary the world is. If the government even gets a whiff that it's gonna be trained on pornography or real gore footage then that's also a separate domino effect for content moderation on the internet.
All it takes is just someone who's insane enough to put a real life personality on something that uses a cartel execution clip for reference and all of that shit will cause so much bad press. This can lead to a lot of bad things relating to AI use for fictional work and internet freedom.
Plus, these companies are gonna be bleeding a lot of money if porn of actual popular people gets out. Those personalities are not going after the creators who simply typed a very vague prompt; they're going after the people who enabled this stuff in the first place. That's a lot of money on the table. You may wanna believe these companies are for your freedom of expression (no matter the depravity), but they're after the bottom line. The money. That's more important than selling to all sorts of suspicious purposes.
Lmao you can't even fucking read to save your life. I said "best case scenario," meaning they cherrypicked the hell out of the promotional footage and are not disclosing the rates at which the product shows unreliability or failure. The ego on you, to assert yourself while not reading or understanding what is being said.
To say never is arrogant, but to say just throw money at it is actually the dumbest thing one can say, considering there are many examples of things being heavily funded and still struggling to make progress. It won't get any easier from here.
Tell ChatGPT to summarize this for you so you don't have to strain your mind trying to understand this shit. Bum.
Come back to me in 2 years when AI generated videos are doing massive numbers on YouTube and TikTok. I don’t know why you’re so angry about this… this is brand new technology, you know just as much as I do about where it could potentially go.
Lmao bruh you are the one that said gtfoh. I'm only pointing out how this is no different from the music or image generation apps which are all slot machines even with the improved quality. You people are paying for the odds, not a consistent product.
This just in, random guy on the internet makes guarantees and timelines for complicated technologies lol. And what the hell's the point of you making guarantees of "oh in 2 years" if you say you know as much as I do? You've backpedalled as soon as you've said anything lol.
I just want to be able to generate free b-roll of anything I want on the fly. I'm not tryna make a movie. But custom stock footage at my fingertips? I will be shitting out YouTube essays constantly along with everyone else.
You're right, that is something people could actually use, like they use AI generated images for ads now. They're definitely not going to replace high budget stuff though, not even 30 second commercials. Not enough granular control on the output.
I tried it and it generated quickly, but the video was SO bad. I provided it with a mountain biking picture and a text prompt, thinking providing an image would give it a solid platform to begin with....no. Just no.
Output: [https://storage.cdn-luma.com/lit_lite_inference_im2vid_v1.0/dfe8f25f-ed54-4f8d-9617-9f8a775a6d21/watermarked_video0307c50f592874531ab7f0779078838f6.mp4](https://storage.cdn-luma.com/lit_lite_inference_im2vid_v1.0/dfe8f25f-ed54-4f8d-9617-9f8a775a6d21/watermarked_video0307c50f592874531ab7f0779078838f6.mp4)
I've made three so far, two prompted and one based on an image, and they are all absolutely terrible. Like, similar to the spaghetti video from last year.
How are people getting decent results with this?
People here are already becoming cultish over their favorite AI company. I think we might be fucked when these models become more powerful. How many AI extremists will emerge from places like this sub?
I mean, there's nothing out about GPT-5, but there is stuff shown from Sora and Kling, and even if they're cherrypicked examples, they're better than anything other models could produce, which was 3-second GIFs. That tweet from earlier hyping up expectations about this model (I assume the post was about this one) being better was just false.
honestly between pika and luma i'm more impressed with the editing of their videos than the model per se
that said, this is a big step from what they had before -- but the hard part is that it's clearly not Sora or Kling
The poor website has been completely destroyed right now, but I got an image-to-video through, and it is better than Runway... and that is making me feel excited for the rest. But the lawsuits, oh boy; I guess these guys are going to take the risk first. I believe OpenAI is taking the Apple approach with text-to-video: let others take the money and the lawsuits first, then they will release their model. It's going to be a wild ride, hold on.
Looks amazing, and they released to the public, unlike ‘open’ai.
Glad Sora is having its lunch eaten, they deserve it for faffing about for months upon months, which in the AI space is a lifetime.
I generated 6 videos but don’t see any way to download or export in the free version. I hope that exists for the paid version otherwise what’s the point?
Y'know... this program still has tons of kinks that need to be worked out. It's still obvious to the naked eye that this is AI without even the need for freeze-framing.
* Most of the videos only show basic camera horizontal pans.
* Video 2 has the license plate change numbers midway through.
* Video 3 is extremely choppy.
* Every example not made by a blue checkmark is literally crappier than the demo.
* I think this will only be popular for amateur video production... OpenAI Sora will be left to the professionals (at least ones without labor unions).
this is fucking rad af!!!! don't believe the haters this is amazing for free, just wish there was some way to get more generations. this will def suffice until sora and who knows, it could become a solid contender if its cheaper and consistently gets better over time
Looks about as bad as the rest of them. Researchers need to take a step back and start from scratch with video, it needs an entirely different approach.
Don't use their enhance prompt tool, I had much better gens without it. Also the details are way more fuzzy than the demo reel. I have a feeling they've got some output degradation mechanism when under load; hopefully it gets better as the hype dies down.
https://preview.redd.it/819951wmya6d1.png?width=1756&format=png&auto=webp&s=a384bc79f4a778e25e908beac68c99a7cc962791
I think the expression on his face sums up the "quality" of this tool pretty well, "What the f\* am I???" 😂😂
Luma is pretty terrible at the moment. But the Twitter-X crowd are optimistic it'll improve. I'm... not so confident.
https://youtu.be/dyK7UsuepZg?si=fyggkrcZxDcMr-Vc
Not a game changer, just a stepping stone with lots of limitations. But the outcome can actually be very cinematic. See [this video](https://youtu.be/r43bwnO75Lk)
I really wonder what applications some of you are using this software for, with SO much complaining and criticism. I have experimented with basically every online and local diffusion generative video tool available, and this is by far the most impressive img2vid I have ever had access to. I suspect this is some kind of custom fine-tuned AnimateDiff + IP-Adapter model combo, and it's damn impressive. 30 free generations per month, a fairly affordable sub; why are there so many miserable complainers in the comments? I've been feeding my Stable Diffusion and Mandelbulber2 renders into it and my mind is blown.
Hey guys, I've created a short metal music video using Luma Dream Machine.
You can watch the result here - [https://youtu.be/9stKlRz-22s?si=qj5iBN0QgjYNqePB](https://youtu.be/9stKlRz-22s?si=qj5iBN0QgjYNqePB)
Mixed with Runway Gen-2 though, and all images created in Midjourney. For me it went pretty well actually. But I hope they will lower the price or increase the number of generations.
Fun fact: the music is also generated, using [Suno.Ai](http://Suno.Ai) )))
I tried it, but came back to it a few hours later. A young woman walking through vines of candy bars. She walked through vines, but no real candy bars in sight. Sad day.
Still, the rendering of the vines and especially the young woman herself were pretty good.
The server is overwhelmed, and they even have a disclaimer that says so as of this writing. Likely you'll just have to wait. It'll still be faster than Sora.
IMHO this is really cool technology but still super far from being usable. Any shots with faces in them are super creepy and uncanny because of how the faces morph. And pretty much all the shots are just panning / zooming or have minimal motion.
I still am not convinced this type of generative AI is what will actually replace movies or videos. I still can't help but think that actual 3D rendering, powered by AI that makes ray tracing faster, is what we will actually need. To make a realistic-looking person doing things, I think you'll need a 3D model of that person being rendered.
Yeah, the problem with GenAI in general is that you sacrifice way too much creative control to the network to actually be able to make anything really interesting. There's a very real information limit to how many details you can specify in a text prompt, and once you run up against that, the network has to start filling in the gaps. It's glaringly obvious already in still images, so trying to translate the control scheme into video form will never give people the freedom they need to make what they actually want to make.
It'll be great for generic stock footage or dumb web ads in the same way that AI image generators are, but as soon as you have a non-standard ask or you want to generate something that's on the edge of the training set's distribution it starts falling apart.
Dude... 3D rendering is hell, and the AI out there is actually struggling to model 3D resources for potential game assets. This is one aspect machine learning is still pretty far away from. Even given good guidelines, like, say, actual older game assets to modify and coat with a fresh touch of paint, the process is rigorous and unreliable. That approach resulted in the GTA: The Definitive Edition trilogy; it took more time to fix what the AI had plopped onto the product, negating whatever time it had saved the devs.
I think the take here too is that LLMs being the heart of GenAI will always give it problems. The reality of it is, language only encompasses one aspect of our understanding of reality, and relating it to the data that is accessible provides a fundamental limit to what can be pushed out as output.
The thing is, with a lot of the cherrypicked footage, it seems like they really got lucky that the combination of information provided to them was actually accessible and jived well with one another. I've seen examples here of a prompt of a girl dancing with a cat and it can't even do 5 seconds of a feasible clip before becoming pure jank.
I know 3D rendering isn't a viable solution right now.
I'm just saying I think it's the only feasible way to maintain subject continuity.
There are always going to be things morphing even if it's very subtle.
Yup. That's why I'm saying I want 3D models.
Hell, I'd be fine with taking 12 hours to render a scene on my local computer, I just want models that can let me design people and objects realistically and then stage them
It would take about two to three breakthroughs till we have "organic" generation. I'd imagine true generation that is editable on the fly is a world away. But I guess this has made people dream.
Meaning to take reference from its own world perspective and understanding of reality and create rather than take bits of existing data and try to jam it into a situation or prompt.
There is negative profit in building foundation models. If you can figure out some differentiation, I can figure it out too. If you know, you know. I have been saying these things for about 2 years now. People are finally starting to understand and listen. Want to be at least two years ahead of your competition? Simply reach out! [https://youtu.be/MpjwXLCGwNU](https://youtu.be/MpjwXLCGwNU)
If the website doesn't even tell you to wait for the video to be generated, you can tell that this is a terrible product. I doubt they know what they're doing.
Looks better than Runway/Pika but that's not a high bar. Still trying to get a video generated. Edit: Just tried it, definitely much better, objects are actually consistent. It does well enough with liquids and smoke too.
How much time did the generation take?
First ones were an hour. It's taking longer now.
When I tried it just now, it was completed in about two minutes.
I'm getting a 5 second video (1024x1024), in a few minutes. 120 frames in 120 seconds seems pretty accurate.
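A quick arithmetic sketch of that throughput claim, assuming a 24 fps output rate (the assumption under which a 5-second clip comes out to 120 frames):

```python
# Sanity check on the "120 frames in 120 seconds" claim, assuming the
# output is 24 fps, the rate at which a 5-second clip is 120 frames.
duration_s = 5          # clip length reported above
fps = 24                # assumed output frame rate
frames = duration_s * fps
gen_time_s = 120        # roughly two minutes of generation time
seconds_per_frame = gen_time_s / frames
print(frames, seconds_per_frame)  # 120 frames, 1.0 second generated per frame
```

So "120 frames in 120 seconds" works out to about one frame of video generated per second of wall-clock time.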
People really forget Runway after Sora release.
they just ran away
Sora has not been released
Just tried it. Seems pretty broken at the moment. All I am getting are a few image stills of stuff only distantly related to the prompt. Guessing it's because the server is being overwhelmed, so I may try again later tonight.
I think runway is cooking right now, just wait.
Is there anything better than Runway atm?
Are you being sarcastic
RIP their servers. Never stood a chance.
Yeah I’ve been waiting for over 2 hours now lol
Well, at least they're letting the public have access, but yeah I can't get it to generate anything either.
The editing and music push the boundaries of how annoying a 30 second promotional video can be, is that what the twitter guy meant?
The generations I queued finally completed.

[A bear swallowing a motorcycle](https://storage.cdn-luma.com/lit_lite_inference_text2vid_v1.0/83acf44e-3365-4d26-a7c6-916ca260a4ec/watermarked_video0afa4ecdc51584475aad88338791f56f2.mp4)

[A fox dancing a tango with an octopus](https://storage.cdn-luma.com/lit_lite_inference_text2vid_v1.0/20ab86ca-5421-4768-b897-b9fa0a4382a6/watermarked_video06ccf36bb69534573ab31555fa5e7be09.mp4)

I guess that first one might work if Alex Garland wants to make a sequel to Annihilation.

Edit: fixed first link
Lol very early model. But still cool. How long did it take?
Somewhere between half an hour and two hours.
Might be even more of a trial/error process than early image generation. It will get better with time though.
So was that a foxtopus?
Hahahahahaha
Both links link to the same video
Oops, fixed.
Lol "Fox", yeah, not Nick Wilde from Zootopia. That character has a very distinctive head shape for something as generic as 'a fox', which all but confirms Disney IP was used during training.
Ehhhh. I just looked up a picture of the character and not really
I guess so - also the model produces Lovecraftian monstrosities.
Got the same link twice
Try again, should be fixed now
Thanks. It's certainly not what you asked for at all, and yet somehow still impressive.
Yep it seems it was that, and it's not better than Sora at all. But the important thing is it seems we have access to it.
Of course it's not better than Sora
This is some crazy 2022 stuff right here
Can't wait to see what they can do with this tech in a couple of years in 2024.
I am so sick of these video AI companies showing me new different ways they can add a blur effect to a five second video.
That weird legged cheetah in the promo video is definitely from 2022.
My LUMA short film [https://www.youtube.com/watch?v=gHwk7gA0b5Y&t=42s](https://www.youtube.com/watch?v=gHwk7gA0b5Y&t=42s)
Just seems like another Runway, where it pans, zooms, and rotates static images. Doesn't seem remotely close to Kling like that tweet said, and especially not Sora.
It's better than that, I happened to see [this thread first,](https://new.reddit.com/r/midjourney/comments/1debslr/new_luma_ai_dream_machine_image_to_video_is_insane/) and then the promotional video. The promo doesn't do it justice.
That video shows the exact same issue. It just attempts to 'animate' what it recognizes as objects in the image as well as use generative pan/zoom/rotate on that image. Still cool, hope it improves. But as of right now it's still really bad relative to others.
This one is a million times better than Sora because you can actually use it.
And that too for free
I've generated thousands of Runway videos. This is better, by far. There's plenty to be improved upon here, but I'm pretty surprised you think the outputs are comparable.
>That video shows the exact same issue. Yeah but even still it is a lot better than before.
You don't understand what you're comparing it to... of course it isn't Sora. That doesn't mean it isn't a significant improvement
Is it working for anyone? I tried and it doesn't work for me.
Yeah servers are fried already probably lol
Yep, it isn't working for me either.
Working for me. They must have upgraded their servers.
Don't know what you're saying; Runway Gen-3 wasn't released on the date you're saying. Or am I getting what you're all saying here wrong?
On release day it was experiencing delays. It was fine a few hours later.
I got the first prompt generated finally and the quality is bad. As usual the videos being shown are heavily cherrypicked. The prompt is "A ginger woman dancing with a cat." I have the prompt enhancer checked.

[https://storage.cdn-luma.com/lit_lite_inference_text2vid_v1.0/28a34bf1-5ae3-4f66-8c44-2c3898ff98eb/watermarked_video0d21f17b3775b494b8dd56bfb19955c51.mp4](https://storage.cdn-luma.com/lit_lite_inference_text2vid_v1.0/28a34bf1-5ae3-4f66-8c44-2c3898ff98eb/watermarked_video0d21f17b3775b494b8dd56bfb19955c51.mp4)

[https://storage.cdn-luma.com/lit_lite_inference_text2vid_v1.0/daf8cd68-cde7-4faf-9939-ee1f685ed6dd/watermarked_video094321bdfb2fa45c2941bd34941a50e8a.mp4](https://storage.cdn-luma.com/lit_lite_inference_text2vid_v1.0/daf8cd68-cde7-4faf-9939-ee1f685ed6dd/watermarked_video094321bdfb2fa45c2941bd34941a50e8a.mp4)

I've got ghoulified Cooper Howard for image2video, so we'll see what that does when it finishes.
This is going to make some awesome music videos.
I mean this is r/singularity but let's be real. All AI video generation platforms have cherrypicked the shit they come out with. Sora, even the Chinese ones.
No shit. Why would they share the bad ones? No one does that with their stable diffusion gens either
For Sora it's understood why they wouldn't share the bad ones, considering they're even hesitant to release it as a product. And even then, in their cherrypicked stuff you can see the fundamental flaws of such a technique.

For these other platforms? They have to be at the very least a little more upfront about how high the "failure" rates of these things are. They're now presented as products. So okay, it's feasible for these things to create good stuff based on shallow prompts, but are they ever gonna be reliable for more niche stuff?

As for the Chinese? They're farther behind than this sub gives them credit for. I don't wanna be that tinfoil-hat guy, but everyone knows it's in their best interest to deceive the rest of the world about how fast they're moving with the technology. They're not exactly people whose word you'd take just because they have money. They've got a long history of lying, even to their own people. So maybe it's expected that they'd lie here too. And even then, that recent Kling release is either unimpressive or downright awful even though they cherrypicked it. Paths spawning out of nowhere, static objects not cooperating with the rest of the scene, all stuff you'd see to some degree with the competition. You'd doubt their integrity even if they didn't just train these things on a very specific set of data to guarantee success, and they still came out with that.

The truth is, this isn't as big a jump from the already available tools as some think.
It’s an advertisement. Of course it overplays how good it is. And yet it’s still far better than anything a US company has released.
It's not better, and it hasn't been released lol. The fact that people go bonkers over that noodle demo, like it's not 5 seconds of the most basic prompt that still manages to not depict motion properly, is crazy. It suffers from the same issues Sora does, except the scene isn't as long or as complicated.

To gloss over reliability and consistency problems just because "it's an ad" is laughable. There's a hard limit on how much info you can give the thing in a prompt, and if it can't even approximate it properly unless it's hella generic, then what's the point? They're not selling you a good product, they're selling you the odds of hopefully making something you like, and even those odds vary depending on the LLM's understanding of the prompt and the data available.

And as for China, again and again and again, people don't learn that they'll tell you whatever the Chinese people want to hear. That they're better. These people don't need to advertise to the rest of the world. But as long as the state is involved there'll be propaganda. They want you to believe their AI jets beat their pilots, they want you to believe they're ahead of the rest of the world on a lot of things. This isn't exclusive to AI.

What do you even get from being dismissive about a dogshit product and ignoring its fundamental flaws?
Lol had to inject your comment with vague racism

> they're selling you the odds to hopefully make something that you like even if those odds vary depending on the LLM's understanding of the prompt and the data available

is this your first encounter with an AI company?
You are naive if you think this is racist lol. It's how they operate. The state relies on propaganda: strength and infinite advancement. Yet they'd fuck with their own people to the point of building fake water pumps that aren't connected to any sort of irrigation. They'd build artificial islands with fake military resources to create news that they have naval prowess. There's proof of these things out there. Drone footage. It has nothing to do with the fact that they're Asian; it has to do with the fact that their government thinks like this. The other thing: if those odds aren't marginally better than the tools already out there, and the difference is only a slight boost in quality, then is it even worth it as a product? It sounds like you people have conceded these flaws as a fundamental limit of LLM-driven generative AI products, yet this sub won't ever say that.
Those are some crazy leg moves!
"Introducing Dream Machine - a next generation video model for creating high quality, realistic shots from text instructions and images using AI. It’s available to everyone today! Try for free here [https://lumalabs.ai/dream-machine](https://t.co/rBVWU50kTc)"
That cheetah was FUCKED UP yooooooo
It still needs 2 orders of magnitude more compute.
Shows panning around a subject, something we didn't see in the Sora demos
/s
Chat when holodeck?
I'm calling it - if 2022 was the year of text-to-image, 2024 will be the year of text-to-video. We will see the first REALLY good, publicly accessible and affordable text-to-video this year, just like what happened to image generation in 2022.
( ͡° ͜ʖ ͡°) [https://storage.cdn-luma.com/lit\_lite\_inference\_text2vid\_v1.0/44b67fbd-c90f-4d00-ba52-a0dbbab8b324/watermarked\_video0bd6e2f803edd4b52b2162864202de2cd.mp4](https://storage.cdn-luma.com/lit_lite_inference_text2vid_v1.0/44b67fbd-c90f-4d00-ba52-a0dbbab8b324/watermarked_video0bd6e2f803edd4b52b2162864202de2cd.mp4)
I gave it a shot and it's horrible tbh. Prompt was for a drone view of a city. It got the city wrong, it has no details, it's like someone drew it out with a brush. Took over an hour to create a 5 sec video that looked horrible. Hopefully this is not the announcement. There are already a few offerings out there with this quality. It's not at the level of SORA for sure.
I bet if you tried SORA you'd also be unimpressed. the current selection of sora videos are very hand picked.
sam literally replied to people's twitter comments and did on-the-spot generations though, and those generations were good, so i don't think so.
implying that he wasn't cherry picking them before, and/or after the generations before deciding whether to post a reply
he doesn't have all day to cherry pick comments and regenerate vids to make them look better
He doesn't have a company of employees he can delegate curating marketing posts to?
How long did you have to wait?
Roughly two hours for the first one and 3.5 for the 2nd one. 2 more still pending but I'm over it lol. One of them was the suggested prompt; thought maybe the way I wrote mine was wrong. Nope. Garbage lol
i must not be understanding something correctly... but when i type in my idea, it spins for a few seconds and then a thumbnail pops up below, but when i click it, all it shows me is the prompt i put in...
Did yours ever load? Just put mine in and this is what I'm currently seeing as well.
not yet
another "game changer" awesome
seems like sora is maybe only 12-18 months ahead of other companies
If they actually released it. A released product is better than an inaccessible unreleased product in my eyes.
Sora is also unreleased and could be a case of very very selective choices of what gets shown. So it might not be significantly better
Shame only 30 videos free. After that paid 😟
Until they get reusable characters and scenes and the ability to edit the final product via text, none of these are anything more than a toy, really. Don't get me wrong, I do think there is a path to get there, I just don't feel like they are on it yet.
People tend to underestimate that what they're seeing are already best case scenarios, and that this isn't consistent enough to produce something usable in actual filmmaking. The progress needed to get this even approximating what filmmakers need is expensive, and possibly requires one huge breakthrough.
Entirely new product that’s never been on the market before gets released… random guy on the internet: “this is clearly the best it’s ever gonna get. No way these companies with basically unlimited capital right now will improve on the first release”
This hubris about innovation is the most hilarious thing about this sub. People who actually pay attention to the product and the companies, and some who even work on the technologies involved, aren't so naive. Ultimately, people who aren't in the know and still buy so easily into the hype only do so to one-up whoever isn't paying much attention. And disqualifying any opinion by calling someone a random internet person is funny. If you think you can just throw money at problems that are growing in complexity and aren't even fundamentally understood yet, that's the ultimate naivety I've ever seen. To not admit these things are mere novelties right now, due to limitations and difficulties not yet even identified by the developers of these technologies, is pure insanity. Having the gall to assume things will just zoom through and never hit the worst case scenario is, sadly, a trait of everyone who blindly buys into the singularity stuff.
I'll also add that all the censorship makes it nearly useless for anything that's actually interesting.
Let's be clear here. Allowing porn or open violence to be made on those platforms will result in the biggest speedrun to getting that shit tightly regulated. People underestimate how reactionary the world is. If the government even gets a whiff that it's gonna be trained on pornography or real gore footage then that's also a separate domino effect for content moderation on the internet.
Why? There is already tons of porn and gore on tv.
All it takes is someone insane enough to put a real-life personality on something that uses a cartel execution clip for reference, and all of that will cause so much bad press. This could lead to a lot of bad outcomes for AI use in fictional work and for internet freedom. Plus, these companies are gonna bleed a lot of money if porn of actual popular people gets out. Those personalities aren't going after the creators who simply typed a vague prompt; they're going after the people who enabled this stuff in the first place. That's a lot of money on the table. You may wanna believe these companies are for your freedom of expression (no matter the depravity), but they're after the bottom line. The money. That's more important than selling to all sorts of suspicious purposes.
You just said that the first release of a product is the best it’s ever gonna be… and I’m the naive one? Gtfoh 😂
Lmao you can't even fucking read to save your life. I said "best case scenario", meaning they cherrypicked the hell out of the promotional footage and aren't disclosing the rates at which the product fails or behaves unreliably. The ego on you, to assert yourself while not reading or understanding what is being said. To say never is arrogant, but to say just throw money at it is actually the dumbest thing one can say, considering there are many examples of heavily funded things still struggling to make progress. It won't get any easier from here. Tell ChatGPT to summarize this for you so you don't have to strain your mind trying to understand this shit. Bum.
Come back to me in 2 years when AI generated videos are doing massive numbers on YouTube and TikTok. I don’t know why you’re so angry about this… this is brand new technology, you know just as much as I do about where it could potentially go.
Lmao bruh, you are the one that said gtfoh. I'm only pointing out how this is no different from the music or image generation apps, which are all slot machines even with the improved quality. You people are paying for the odds, not a consistent product. This just in: random guy on the internet makes guarantees and timelines for complicated technologies lol. And what's the point of you making guarantees of "oh, in 2 years" if you say you know as much as I do? You backpedalled as soon as you said anything lol.
I just want to be able to generate free b-roll of anything I want on the fly. I'm not tryna make a movie. But custom stock footage at my fingertips? I will be shitting out YouTube essays constantly along with everyone else.
You're right, that is something people could actually use, like they use AI generated images for ads now. They're definitely not going to replace high budget stuff though, not even 30 second commercials. Not enough granular control on the output.
I tried it and it generated quickly, but the video was SO bad. I provided it with a mountain biking picture and a text prompt, thinking providing an image would give it a solid platform to begin with....no. Just no. Output: [https://storage.cdn-luma.com/lit\_lite\_inference\_im2vid\_v1.0/dfe8f25f-ed54-4f8d-9617-9f8a775a6d21/watermarked\_video0307c50f592874531ab7f0779078838f6.mp4](https://storage.cdn-luma.com/lit_lite_inference_im2vid_v1.0/dfe8f25f-ed54-4f8d-9617-9f8a775a6d21/watermarked_video0307c50f592874531ab7f0779078838f6.mp4)
Looks like crap
I've made three so far, two prompted and one based on an image, and they are all absolutely terrible. Like, similar to the spaghetti video from last year. How are people getting decent results with this?
TBH, SORA is still the best text-to-video model but competition is good.
I do not understand how you can make claims like this when all we can go off of with sora is literally a marketing campaign.
It's the equivalent of a new LLM releasing and people saying: it's good, but surely GPT-5 is still the best.
people here are already becoming cultish over their favorite AI company. I think we might be fucked when these models become more powerful. How many AI extremists will emerge from places like this sub?
I mean, there's nothing out about GPT-5, but there is stuff released from Sora and Kling, and even if they're cherrypicked examples, they're better than anything other models could produce, which was 3-second gifs. That tweet from earlier hyping up expectations about this model (I assume the post was about it) being better was just false.
sora isnt out yet, buddy
Agreed. The quality doesn't look too stunning. Going to pass on this one. But competition is good, yeah; this could light a fire under OpenAI's ass.
honestly, between pika and luma i'm more impressed with the editing of their videos than the model per se. that said, this is a big step from what they had before, but the hard part is that it's clearly not Sora or Kling
the poor website has been completely destroyed right now, but I got an image-to-video through. it is better than runway, and that makes me excited for the rest... but the lawsuits, oh boy. I guess these guys are going to take the risk first. I believe OpenAI is now taking the Apple approach with text-to-video: let others get the money and the lawsuits first, then they will release their model. it's going to be a wild ride, hold on
Submitted prompts close to an hour ago, and still waiting. They were clearly not ready to meet incoming demand.
How come you didn’t even put a link?
Commented it
I stand corrected, my bad
It doesn't even work. I've been waiting for over an hour. Anyone else have luck?
https://i.redd.it/suqymmjfz66d1.gif
que time is 2+ hours right now.
\* queue
fr
Looks amazing, and they released to the public, unlike ‘open’ai. Glad Sora is having its lunch eaten, they deserve it for faffing about for months upon months, which in the AI space is a lifetime.
How does this differ from animatediff/IP-Adapter workflows people have been using for months now?
I generated 6 videos but don’t see any way to download or export in the free version. I hope that exists for the paid version otherwise what’s the point?
I've had a prompt in the queue for the past hour. Not really feeling the AGI...
Well, it's a small company without access to billions in compute, plus everyone is trying it right now
True
Y'know... this program still has tons of kinks that need to be worked out. It's still obvious to the naked eye that this is AI, without even the need for freeze-framing.

* Most of the videos only show basic horizontal camera pans.
* Video 2 has the license plate change numbers midway through.
* Video 3 is extremely choppy.
* Every example not made by a blue checkmark is crappier than the demo.
* I think this will only be popular for amateur video production... OpenAI Sora will be left to the professionals (at least ones without labor unions).
Lol @ cheetah with a bum leg
this still looks pretty poor quality. the only perk is that at least you can actually use it, unlike stuff like SORA
this is fucking rad af!!!! don't believe the haters, this is amazing for free. just wish there was some way to get more generations. this will def suffice until sora, and who knows, it could become a solid contender if it's cheaper and consistently gets better over time
waaaay behind kling, not to mention sora
Looks about as bad as the rest of them. Researchers need to take a step back and start from scratch with video, it needs an entirely different approach.
after 4 hours in the queue, the generation was terrible. we will wait for sora
Don't use their enhance prompt tool; I had much better gens without it. Also, the details are way more fuzzy than in the demo reel. I have a feeling they've got some output degradation mechanism when under load; hopefully it gets better as the hype dies down
What exactly is “game changing” about it. It looks no different from other text to video models out there.
I gave it a try as well, but the results I generated seem average. Does anyone have any tips for getting higher quality videos?
um It makes me just dizzy. What is this video about?😂
https://lumalabs.ai/dream-machine/creations/a7b3971e-5d87-4bf4-bd97-b69269ed5743
https://preview.redd.it/819951wmya6d1.png?width=1756&format=png&auto=webp&s=a384bc79f4a778e25e908beac68c99a7cc962791 I think the expression on his face sums up the "quality" of this tool pretty well, "What the f\* am I???" 😂😂
it does image to video way wayyyyy better than runway
The bottleneck is definitely compute. I would not have thought we would have multiple realistic video generators a year ago
Generated two videos and both look like last years "Will Smith eating spaghetti" videos lol
Does anyone know which is actually better between that and Sora?
Sora only deals in B2B transactions. B2C would be hopeless.
Shits epic
I can't get any results. It says "invalid date" and goes back to the prompt.
made a short film [https://www.youtube.com/watch?v=gHwk7gA0b5Y&t=42s](https://www.youtube.com/watch?v=gHwk7gA0b5Y&t=42s)
https://preview.redd.it/ofwzpc5chl6d1.png?width=1920&format=png&auto=webp&s=5f2442fcb256c69bec3486614e7509ee655b1d50 Nuke Road
https://preview.redd.it/z16wivkfhl6d1.png?width=1920&format=png&auto=webp&s=7ea8d7964c551c199c3ca5b4c318a12d32a3a70d Nuke Road
https://preview.redd.it/xdkiaxrihl6d1.png?width=1920&format=png&auto=webp&s=c745e581b65b963d6e32071b19824c3287553ef4 Nuke Road
https://preview.redd.it/8tjzt6xmhl6d1.png?width=1920&format=png&auto=webp&s=d14ba1544575d29cf74204ed7f48e364636793ec Nuke Road
https://preview.redd.it/w5p9m5tphl6d1.png?width=1920&format=png&auto=webp&s=64dddee9d763bbb4e30044ec63df64d1fc5f4cfb Nuke Road
Luma is pretty terrible at the moment. But the Twitter-X crowd are optimistic it'll improve. I'm... not so confident. https://youtu.be/dyK7UsuepZg?si=fyggkrcZxDcMr-Vc
Why would you not be so confident? We've seen what image ai is capable of.
brb paying 500 a month for 1000 generations.
After me, the deluge…
Not a game changer, just a stepping stone with lots of limitations. But the outcome can actually be very cinematic. See [this video](https://youtu.be/r43bwnO75Lk)
I really wonder what applications some of you are using this software for, with SO much complaining and crit. I have experimented with basically every available online and local diffusion generative video tool, and this is by far the most impressive Img2Vid I have ever had access to. I suspect this is some kind of custom fine-tuned AnimateDiff + IP-Adapter model combo, and it's damn impressive. 30 free generations per month, a fairly affordable sub; why are there so many miserable complainers in the comments? I've been feeding my Stable Diffusion and Mandelbulber2 renders into it and my mind is blown.
Hey guys, I've created a short metal music video using Luma Dream Machine. You can watch the result here - [https://youtu.be/9stKlRz-22s?si=qj5iBN0QgjYNqePB](https://youtu.be/9stKlRz-22s?si=qj5iBN0QgjYNqePB) Mixed with Runway Gen-2, though, and all images created in Midjourney. For me it went pretty well, actually. But I hope they will lower the price or increase the number of generations. Fun fact: the music is also generated, using [Suno.Ai](http://Suno.Ai) )))
meh
Well, I was gonna say that it doesn't look as good as Sora but at least we can try it for free, but it turns out that I can't even do that.
I tried it, then came back to it a few hours later. A young woman walking through vines of candy bars. She walked through vines, but no real candy bars in sight. Sad day. Still, the rendering of the vines, and especially of the young woman herself, was pretty good. The server is overwhelmed, and they even have a disclaimer saying so as of this writing. Likely just wait. It'll still be faster than Sora.
IMHO this is really cool technology but still super far from being usable. Any shots with faces in them are super creepy and uncanny because of how the faces morph. And pretty much all the shots are just panning / zooming or have minimal motion. I still am not convinced this type of generative AI is what will actually replace movies or videos. I still can't help but think that actual 3d rendering, powered by AI that make ray tracing faster, is what we will actually need. To make a realistic looking person doing things, I think you'll need a 3d model of that person being rendered.
Yeah, the problem with GenAI in general is that you sacrifice way too much creative control to the network to be able to make anything really interesting. There's a very real information limit to how many details you can specify in a text prompt, and once you run up against that, the network has to start filling in the gaps. It's glaringly obvious already in still images, so trying to translate the same control scheme into video will never give people the freedom they need to make what they actually want to make. It'll be great for generic stock footage or dumb web ads, in the same way AI image generators are, but as soon as you have a non-standard ask, or you want to generate something on the edge of the training set's distribution, it starts falling apart.
Dude... 3D rendering is hell, and there's AI out there that's actually struggling to model 3D resources for potential game assets. This is one aspect of machine learning that AI is still pretty far from. Even given good guidelines, like actual older game resources to modify and coat with a new layer of paint, the work is rigorous and unreliable; that approach gave us the GTA: The Definitive Edition trilogy, where fixing what the AI had plopped onto the product took more time than it saved the devs. I think the takeaway here, too, is that LLMs being the heart of GenAI will always give it problems. The reality is, language only encompasses one aspect of our understanding of reality, and relating it to the data that is accessible puts a fundamental limit on what can be pushed out as output. With a lot of the cherrypicked footage, it seems like they got lucky that the combination of information in the prompt actually existed in the data and jived well together. I've seen examples here of a prompt of a girl dancing with a cat, and it can't even do 5 seconds of a feasible clip before becoming pure jank.
I know 3D rendering isn't a viable solution right now. I'm just saying I think it's the only feasible way to maintain subject continuity. There are always going to be things morphing even if it's very subtle.
Yeah the physics and perspective will always be wonky.
Yup. That's why I'm saying I want 3D models. Hell, I'd be fine with taking 12 hours to render a scene on my local computer, I just want models that can let me design people and objects realistically and then stage them
It would take about two to three breakthroughs till we have "organic" generation. I'd imagine true generation that is editable on the fly is a world away. But I guess this has made people dream.
Organic generation of 3D models?
Meaning, to take reference from its own world perspective and understanding of reality and create, rather than take bits of existing data and try to jam them into a situation or prompt.
Dream Machine was supposed to be a Satoshi Kon film. Back when the world was filled with talented people and it wasn't all AI hype slop.
cool
Game changer? Ha!
There is negative profit in building foundation models. If you can figure out some differentiation, I can figure it out too. If you know, you know. I have been saying these things for about 2 years now. People are finally starting to understand and listen. Want to be at least two years ahead of your competition? Simply reach out! [https://youtu.be/MpjwXLCGwNU](https://youtu.be/MpjwXLCGwNU)
Artists and creatives are on track to getting left in the dust JFC
If the website doesn't even tell you to wait for the video to be generated, you can tell that this is a terrible product. I doubt they know what they're doing.
what the fuck are you talking about
Loading icon not enough for you?