OpenAI's new Sora app has been the main focus of hyper-realistic AI slop over the past few weeks. Sora lets users easily create short videos that seem real enough to most people, including videos that show the likenesses of real people.
But before Sora launched, it was Google that raised concerns about these realistic AI videos. With Veo 3, Google launched an artificial intelligence model that not only created realistic videos, but also generated realistic audio synchronized with the action. Sound effects, ambience, and even dialogue could be generated along with the video itself, completely selling the effect from one simple prompt.
Veo 3.1
Now Google is back with an update to Veo, appropriately named Veo 3.1, which the company announced in a blog post on Wednesday. It's not necessarily a complete overhaul or a revolutionary new video model. Instead, Veo 3.1 builds on Veo 3, adding “richer sound” and “increased realism” that Google says creates “lifelike” textures. The new model also reportedly supports new narrative control tools that pair with new updates to Flow, Google's AI-powered video editor. Flow users now have more granular editing controls and can add audio to existing features such as Ingredients to Video, Frames to Video, and Extend.
What does this mean in practice? According to Google, Ingredients to Video with Veo 3.1 allows users to add reference images, such as a specific person, articles of clothing, or an environment, to their scenes. The new Flow editor can then insert these elements into the finished product, as shown in Google's demo video.
Building on this new feature, Flow now allows you to add new elements to an existing scene. With Insert, you can tell Veo 3.1 to add new characters, details, lighting effects, and more to your clip. Google says it's also working on the opposite, to allow users to remove any elements they don't like from a generation.
Google also now has a new way for users to specify how they want a scene to be framed, called First and Last Frame. Users can select key frames to start and end a scene. Flow with Veo 3.1 will then fill in the gap and generate a scene that starts and ends based on those images.
There's also now a way to create longer videos than previous versions of Flow were able to create. The new Extend feature lets you either continue the action of the current clip or jump to a new scene following it, although Google says the feature is most useful for creating a longer establishing shot. According to the company, Extend can create videos longer than a minute.
Veo 3.1 is available to users of the Gemini app, as well as Vertex AI, if you have a Google AI Pro subscription. Developers can access it through the Gemini API. Google says that Ingredients to Video, First and Last Frame, and Extend are coming to the Gemini API, but “Add Object” and “Remove Object” are not available. Extend is also not yet available in the Vertex AI API.
Is this really good?
Google sees all of these advances as a boon for creative people and creativity, but I'm quite skeptical about them. I could see Veo 3.1 and Flow as a good tool for thinking through shots before shooting or animating them (i.e. a storyboarding tool), or even as a way for new and up-and-coming filmmakers to learn editing by seeing their ideas in a more realized form. Overall, though, I don't think AI-generated content is the future—or at least not the future that most of us want. Sure, there's humor or novelty in some of these AI-generated videos, but I'm willing to bet that most people who enjoy them do so ironically or solely on social media.
The idea of replacing filmmakers and actors with AI generations seems absurd, especially when it exposes us all to the risk of misinformation. Is it really that important for companies like Google and OpenAI to make it easy to create hyper-realistic, fully rendered scenes when those videos can easily be used to deceive the masses? It may be the ramblings of someone who resists change, but I don't think most of us would want to see our favorite shows and movies, made with passion and emotion, replaced by realistic-looking people giving subdued and robotic performances.