The idea of writing this book began as two random streams of thought that converged around the end of October. The first was a question from a graphic designer who wanted to know which site(s) he could use to test AI-based imaging. I gave him a short list of sites, and he came back with several questions, such as "which prompts will help me generate better quality images?"
The second trigger was a blog post that I had planned to write on this very topic. Between August and September 2022, I had generated some images using Dall-E-2, art.elbo.ai, and Stable Diffusion, and created short videos around a central theme.
Some of these themes resulted in fun-filled collages of posters for Hollywood films, famous movie quotes, covers for LP discs and bestselling books, and some retro Bollywood film posters.
Nearly a month and a half later, the convergence of these two streams has evolved into something more comprehensive, namely, this book.
#The Ganesha Connection
The legend goes that the greatest epic ever written, namely the Mahabharat, was narrated by Rishi Vyas and scribed by Ganesh ji.
This book was recorded in audio format and transcribed using Speech to Text. I share my last name with the author of the Mahabharat. Therefore, it is fitting that I affectionately call the Speech to Text tools used for this book "Ganesh".
Note: All images in this book were generated using AI technology. The images on this site have been compressed, which may have reduced their quality compared to the originals.
#Who Should Read This Book?
If you ask any novice author "Who is the intended audience for your book?" they are likely to reply "This book is for everyone." However, life does not work that way, and it is important to have a clear picture of who the likely readers of any book are. The same holds in our context.
This book and its content are intended for novice to intermediate users of graphic arts or images. Maybe you are a graphic designer exploring AI image creation. Or maybe you are an absolute beginner who wants to learn a new technique from scratch.
I belong to the latter category. The learning curve for me was very steep. In four months, I explored different styles of visual art, schools of painting, materials, and the techniques and technology behind generating images. Most importantly, I was able to determine which images are most suitable for my intended use. The primary reason I set out on this path was to find out whether some of these images could be used for our podcasts at gaathastory. Alternatively, maybe there was potential to create some images for my blog. Little did I know that I would go down a rabbit hole. More on that in the following section.
The image below shows an LP disc cover for a 1950s-style Bollywood film.
#What You Can Expect from Reading This Book
The focus of this book remains primarily on image creation, or generation, using AI. But stopping there would do a gross injustice to the immense possibilities created by AI imaging. Therefore, I have devoted a section to some other awesome applications of AI-based imaging. These include image editing, background removal, restoration of old photos, and creating profile pictures or caricatures, among others. Creation of short animated videos is not covered in this book. Depending on the response to the current work, I might take that up as a project for a future edition.
Caveat Emptor: The examples and the included images follow my own learning journey as well as the developments in this rapidly emerging space of AI imaging. I am not a trained graphic designer or an artist, so the images may vary from 'meh' to 'wow', with the median hovering somewhere around 'looks good'. Below is one such image. I leave it to your fine judgement to rate.
Planet earth showing the continents beginning with the letter "A". (Art Elbo)
#Early Dabbling with AI Imaging
A couple of years ago, I was consulting for a European organization that works in the sustainability space. They wanted to showcase some of their work in India, and the reports included images of the projects and testimonials from people. The challenge was that many of the testimonials and images dated from 2014 or 2015, or even earlier, before the data privacy laws came into place. This raised the possibility that many of the images featuring real people could no longer be used, or so felt the communications team.
I disagreed with them. My logic was that the law of the land in India does not require the written consent of the person whose photograph is to be used in digital media. (This might change in the coming years.)
Image source: thispersondoesntexist.com
However, instead of presenting this argument, I thought of a more creative way. Thanks to sites like thispersondoesnotexist, I was able to create many images of different people, some of whom quite closely resembled the people in the case studies and testimonials. In other words, some of the images looked quite usable for the report in question. Soon afterwards, however, COVID-19 arrived, and the conversation around that report was lost in the chaos.
I had previously generated images using sites like thispersondoesnotexist, or created caricatures using apps like toonme.
I did realize during this experiment that the user had very little control over the type of image that would be generated. Namely, the ethnicity, gender, age, profession, hairstyle, and clothing could not be defined in advance. Maybe it was possible, and I missed it at the time. Nonetheless, researching this idea led me to sites like thisartworkdoesnotexist and thiscatdoesnotexist, which presented a good opportunity to spend some time tinkering with creativity during the lockdown in 2020.
Screenshot from thischemicaldoesnotexist
Fun Fact: The first AI art program, called AARON, was developed by Harold Cohen in 1968 at the University of California at San Diego (source: Wikipedia).
#Caricatures - a fun way to explore AI Imaging
In addition to the "this X does not exist" sites, some sites and apps that turn people's profile pictures into caricatures also showed up during this period. On these apps or sites, one can upload an image and get back a caricature based on it. Below are a couple of examples.
My real self and my AI self (toonme)
ProductHunt used to feature such sites quite often in the 2020-21 era. Back then, I thought this space was great for creating avatars for community sites or profile images for social media. I tried many services similar to the ones above, without realizing that these sites were using some form of AI image technology at the core.
#The Real Journey Began in August 2022
I really got my feet wet in this exciting space when I got access to Dall-E-2. One of the first sets of images I created using Dall-E-2 was in August 2022, under the "Har Ghar Tiranga" (National flag in every house) theme. This was a very popular campaign launched by the Government of India to celebrate 75 years of India's independence. The video below shows a collage of images generated using Dall-E-2 under that campaign. Note that these images have been used "as is", that is, the output from Dall-E-2 was used without any editing or modification.
One of my earliest creations using Dall-E-2.
Video: Celebrating #HarGharTiranga Campaign https://vimeo.com/752101269
One of my earliest images using AI imaging in general. Note the obvious oopses.
There is so much more to explore, so much to learn. As a first step, this book should hopefully pave the way for the novice or the AI imaging enthusiast.
Happy reading, and Happy learning!
-Amar Vyas, Bengaluru,
14 27 December 2022
You will find a blue (or bluish) cat origami at the end of each chapter. The cat's name is Cookie.[^1] She will be the designated spirit animal for this book.
[^1] I have a dog at home; his name is Buddy. But many years ago, we used to have a cat in our household. I had taken care of Cookie since she was one week old, when she was abandoned by her mother. The origami avatars of Cookie will follow us on our journey throughout this book. Of course, these images were also generated using AI imaging, during my first weeks of exploring AI image technology. The source sites for these images include Thumbsnap, Stable Diffusion on Huggingface, and others.
Almost all of the images used in this book carry the Creative Commons CC0 license, or the open source license specified by the tool used to create a particular image. The tool or site used to create each image has been mentioned wherever practically possible. You are encouraged to check the license terms of the respective AI image generation sites.