Down The Rabbit Hole

#The Rabbit Hole called AI-based image generation

Prior to 2022, several websites offered AI-powered image creation, editing, or manipulation tools. Some of the most popular included DeepArt, Prisma, Adobe Photoshop Camera, and several sites that enabled background removal or image enhancement.

Many of these sites used tools based on neural networks and machine learning algorithms to process and transform images in various ways. Some of the underlying technology was proprietary, while some tools were built using open source software. The image enhancements included features such as applying filters and effects, generating art, and changing the dimensions of images. But once DALL-E 2, Stable Diffusion, and Midjourney came on the scene, the market, options, and opportunities around AI image generation exploded.

#Is AI Imaging for everyone?

One of the greatest advantages of the tools above, and of the hundred-plus other AI imaging tools, is that one does not have to invest heavily in a high-spec computer with a graphics card, the latest processor, etc. in order to generate, manipulate, and enhance images. But the technology does have its limitations.

Online AI-based imaging tools have made it possible for users who do not have a powerful computer with a large amount of resources to generate images. In other words, the need to invest in, say, a multi-core CPU, tens of gigabytes of RAM, an expensive graphics card, and disk drives has been nearly eliminated. With SaaS (Software as a Service) imaging apps and sites, one also does not have to learn how to install, configure, and maintain these imaging tools. Even so, a few barriers remain:

  • First, language could become a big barrier for those who do not know English well, because most imaging tools understand only English-language prompts at this time. Thankfully, there are some great options that take inputs in languages other than English. In times to come, multilingual input (rather than English only) will become more of a mainstream feature.
  • Secondly, sites like Midjourney do require the user to sign up for Discord and understand its interface and commands before they can start generating images. Not everyone might be able to maneuver the interface with aplomb.
  • The third and most important factor is "How do I begin?" That is, which commands or prompts should one enter in the text box so that the AI images get generated? Thankfully, many sites have sample images that one can explore and learn from, while others offer a list of random image generation commands or prompts.
  • In order to make the most effective use of AI imaging technology, one needs to have a basic understanding of different keywords (e.g. matte finish, octane render, etc.) and be familiar with some art styles, colours, etc.
  • There is quite a bit of trial and error involved before one can get comfortable using the different AI imaging tools. In particular, for images involving human characters, there have been several instances where the limbs or body parts were not rendered correctly. Or the results were far from what the user intended.
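To make the point about keywords a little more concrete, here is a minimal Python sketch of how a prompt might be assembled from a subject plus a few style keywords. The function name `build_prompt` and the comma-separated layout are assumptions made for illustration; they follow a convention commonly seen in shared prompts, not the required syntax of any particular tool.

```python
def build_prompt(subject, styles=(), details=()):
    """Join a subject with optional style keywords and details
    into a single comma-separated text prompt (hypothetical helper)."""
    parts = [subject.strip()]
    for item in list(styles) + list(details):
        cleaned = item.strip()
        # Skip blank entries and case-insensitive duplicates.
        if cleaned and cleaned.lower() not in (p.lower() for p in parts):
            parts.append(cleaned)
    return ", ".join(parts)


prompt = build_prompt(
    "a white rabbit beside a burrow",
    styles=["matte finish", "octane render"],
    details=["soft morning light"],
)
print(prompt)
# prints: a white rabbit beside a burrow, matte finish, octane render, soft morning light
```

Swapping one keyword at a time and comparing the resulting images is one simple way to structure the trial and error described above.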

In other words, AI imaging tools are not quite a plug-and-play solution for the uninitiated. There is a bit of a learning curve involved. And in some cases, one can literally go down the Rabbit Hole.

#Various Representations of AI Imaging, Generated by AI Imaging Tools

Representational image for GPT1
Representational image for GPT2
Representational image for GPT3
Machine for content creation, generated using AI

© by Amar Vyas, 2022 - 2023. All Rights Reserved.