For most of photography's roughly 200-year history, convincingly altering a photograph required either a darkroom, some experience with Photoshop, or, at a minimum, a steady hand with scissors and glue. On Tuesday, OpenAI released a tool that reduces the process to typing a sentence.
OpenAI isn't the first company to do this. Although OpenAI has had a conversational image-editing model in development since GPT-4o in 2024, Google beat OpenAI to market in March with a publicly available prototype, then refined it into a popular image model called Nano Banana (and Nano Banana Pro). The enthusiastic response to Google's image-editing model in the AI community caught OpenAI's attention.
OpenAI's new GPT Image 1.5 is an AI image synthesis model that reportedly generates images four times faster than its predecessor and costs about 20 percent less through the API. The model was made available to all ChatGPT users on Tuesday and represents one more step toward making photorealistic image manipulation a simple process that requires no special visual skills.
A “Galactic Queen of the Universe” added to a photo of a room with a sofa using GPT Image 1.5 in ChatGPT.
GPT Image 1.5 is notable because it is a “native multimodal” image model, meaning image generation occurs within the same neural network that processes language prompts. (In contrast, DALL-E 3, OpenAI's earlier image generator previously built into ChatGPT, used a different technique called diffusion to create images.)
This newer type of model, which we covered in more detail in March, treats images and text as the same kind of thing: chunks of data called “tokens” to be predicted, patterns to be completed. If you upload a photo of your father and type “put him in a tuxedo at a wedding,” the model processes your words and the image pixels in a single shared space, then outputs new pixels the same way it would output the next word in a sentence.
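To make the idea of predicting image tokens like words a little more concrete, here is a toy Python sketch. It is not OpenAI's implementation, whose details are not public. The vocabulary sizes, the tokenizer functions, and the `toy_predictor` stand-in are all hypothetical, chosen only to show text and image data sharing one token sequence that gets extended one token at a time.

```python
from typing import Callable, List

# Hypothetical shared vocabulary: ids 0-99 are text tokens, ids 100-199 are image tokens.
TEXT_VOCAB = 100
IMAGE_VOCAB = 100


def text_to_tokens(prompt: str) -> List[int]:
    """Toy text tokenizer: map each word to an id in the text range."""
    return [sum(word.encode()) % TEXT_VOCAB for word in prompt.split()]


def image_to_tokens(pixels: List[int]) -> List[int]:
    """Toy image tokenizer: map each 0-255 pixel value to an id in the image range."""
    return [TEXT_VOCAB + (p * IMAGE_VOCAB) // 256 for p in pixels]


def toy_predictor(sequence: List[int]) -> int:
    """Stand-in for the neural network (an assumption, not a real model):
    deterministically picks the next image token from the sequence so far."""
    return TEXT_VOCAB + (sum(sequence) % IMAGE_VOCAB)


def generate_image_tokens(
    prompt: str,
    pixels: List[int],
    n_new: int,
    predict: Callable[[List[int]], int] = toy_predictor,
) -> List[int]:
    """Autoregressive loop: image and text tokens share one sequence,
    and each new image token is predicted from everything before it."""
    sequence = image_to_tokens(pixels) + text_to_tokens(prompt)
    new_tokens: List[int] = []
    for _ in range(n_new):
        next_token = predict(sequence)
        sequence.append(next_token)  # the prediction becomes context for the next step
        new_tokens.append(next_token)
    return new_tokens


if __name__ == "__main__":
    print(generate_image_tokens("put him in a tuxedo", [0, 128, 255], n_new=4))
```

In a real system, the tokenizers and the predictor would be learned components, and the generated tokens would be decoded back into pixels. The loop only illustrates the "complete the pattern" framing.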
Using this technique, GPT Image 1.5 can alter visual reality more easily than earlier AI image models, changing someone's pose or position, or rendering a scene from a slightly different angle, with varying degrees of success. It can also remove objects, change visual styles, adjust clothing, and refine specific areas while preserving facial likeness across successive edits. You can converse with the AI model about a photo, refining and revising it, the same way you might work through a draft of an email in ChatGPT.






