Visual ChatGPT: Images from Text Requests – All Info

by Pramith

You’ve probably wondered whether it’s possible to generate images from text, or even edit them that way. With Visual ChatGPT from Microsoft, this could soon become a reality.

Visual ChatGPT – the most important info

Visual ChatGPT is a Microsoft research project built on top of OpenAI’s ChatGPT that allows images to be generated and modified from text requests. To do this, the research team has connected the chatbot to several image models such as Stable Diffusion, BLIP and Pix2Pix, and uses the dialogue between users and the AI to make dynamic changes to images.

  • Visual ChatGPT is a planned image tool from Microsoft that generates and changes images from text requests. It is based on the well-known chatbot ChatGPT as well as various image AIs.
  • The difference from conventional image-generating AIs such as Midjourney and DALL-E is that those tools can only ever generate new images and use previous works at most as a reference.
  • Visual ChatGPT, on the other hand, is supposed to be able to dynamically change existing images through dialogue between the users and the AI. For example, it should be possible to replace or adapt individual image elements in a targeted manner.
  • Visual ChatGPT can also describe properties of an image in a text response, for example naming the colour of a wall in an image or giving more complex image descriptions.
  • For now, Visual ChatGPT is available as a public preview on GitHub, but it requires a dedicated GPU with a lot of video memory. It is still unclear what Microsoft will use the tool for, but it could, for example, be integrated into Bing image search or other programmes.
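
The idea described in the points above, a chatbot that passes each request to a suitable image tool and keeps working on the same image across a dialogue, can be sketched in a few lines of Python. This is only an illustration, not Microsoft's implementation: the `Image` class, the three tool functions and the keyword-based `route` function are all hypothetical stand-ins, and in the real system the chatbot itself decides which image model to call.

```python
from dataclasses import dataclass, field


@dataclass
class Image:
    """Hypothetical stand-in for a real image: a description plus an edit history."""
    description: str
    history: list = field(default_factory=list)


def generate(prompt, image=None):
    # Stand-in for a text-to-image model (assumption: something like Stable Diffusion).
    return Image(description=prompt, history=[f"generate: {prompt}"])


def edit(prompt, image):
    # Stand-in for an instruction-based image editor (assumption: something like Pix2Pix).
    image.history.append(f"edit: {prompt}")
    image.description = f"{image.description}, then {prompt}"
    return image


def describe(prompt, image):
    # Stand-in for an image captioning model (assumption: something like BLIP).
    return f"The image shows: {image.description}"


def route(request):
    """Pick a tool by keyword. This simple rule is an assumption for the sketch."""
    text = request.lower()
    if text.startswith(("what", "which", "describe")):
        return describe
    if text.startswith(("replace", "change", "remove")):
        return edit
    return generate


def chat(requests):
    """Run a dialogue; the current image is carried over from turn to turn."""
    image = None
    replies = []
    for request in requests:
        tool = route(request)
        if tool is not generate and image is None:
            replies.append("No image yet, please ask for one first.")
            continue
        result = tool(request, image)
        if isinstance(result, Image):
            image = result
            replies.append(f"[image: {image.description}]")
        else:
            replies.append(result)
    return image, replies


if __name__ == "__main__":
    final_image, answers = chat([
        "a living room with a red wall",
        "change the wall to blue",
        "what colour is the wall?",
    ])
    for answer in answers:
        print(answer)
```

The point of the sketch is the shared state: the second and third requests act on the image produced by the first one instead of starting from scratch, which is what sets the dialogue approach apart from a plain text-to-image generator.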

