
C2C Digital Magazine (Fall 2022 - Winter 2023)

Colleague 2 Colleague, Author

 


CrAIyon: Putting an art-making AI through its paces

By Shalin Hai-Jew, Kansas State University


One of the keynoters at #SIDLIT2022 happened to mention CrAIyon, an artificial intelligence tool which draws a set of 9 images from various text prompts in English.  This tool is located at https://www.craiyon.com/, and it is available for public usage with various light constraints in their fairly generous official terms of use.  [This is one of a class of tools known as "generative AI."]



Figure 1.  "DALL-E mini" in Transition in Mid-2022




In an earlier iteration (above), this tool was known as DALL-E mini (the name DALL-E evokes Salvador “Dalí” and is also spelled after WALL-E), but when DALL-E mini was renamed to distinguish it from OpenAI's DALL-E, it became CrAIyon (a play on “crayon”).  The system is trained on big data using Google TRC (or “TPU Research Cloud”).  On CrAIyon, the images may be saved as WebP (.webp, pronounced "weppy") files, a lightweight and fast-drawn visual format developed by Google back in 2010.  This format transcodes perfectly fine to the other digital image formats as needed.
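Before transcoding a saved image, it can help to confirm that the file really is WebP.  The following is a minimal sketch, not anything from CrAIyon itself.  It relies only on the WebP container layout (the ASCII bytes "RIFF", a 4-byte size field, then the ASCII bytes "WEBP").  The function names and the sample bytes are hypothetical and made up for illustration.  The actual conversion to PNG or JPEG would need an image library such as Pillow, which is not shown here.

```python
def looks_like_webp(header: bytes) -> bool:
    """Return True if the first 12 bytes match the WebP RIFF container layout."""
    # A WebP file starts with ASCII "RIFF", then a 4-byte little-endian
    # size field, then ASCII "WEBP".
    return len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WEBP"


def file_is_webp(path: str) -> bool:
    """Read only the first 12 bytes of a file on disk and test them."""
    with open(path, "rb") as handle:
        return looks_like_webp(handle.read(12))


# Hypothetical header bytes for illustration only (not a complete image).
sample_header = b"RIFF" + (1024).to_bytes(4, "little") + b"WEBP"
png_header = b"\x89PNG\r\n\x1a\n" + b"\x00" * 4

print(looks_like_webp(sample_header))
print(looks_like_webp(png_header))
```

Calling file_is_webp on a downloaded image path would apply the same check to a real file.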
 


Figure 2.  "AI" in the Modern Art Style (according to CrAIyon)




Given my work, when I come across a technology, I like to “put it through its paces.”  I want to better understand its capabilities and its limits.  I am especially interested in the human-computer interface, with the person inputting the various text combinations and applying their “mental models,” and the AI system running its “mental models.”  Is there an overlap in terms of what the person means and what the machine delivers?  [Craiyon LLC requires the crediting of the program CrAIyon for all works, which is fairly onerous.  The AI features built into digital image editing software programs do not have any such requirements.  Perhaps the thinking is that the AI is doing the heavy lifting in generating the visuals and so should have the majority of the credit?  And the company does have a commercial aspect, which could benefit from positive word-of-mouth and electronic WOM.]

To be fair, when testing a technology, I also do my best to stump it...and even to break it.  Testing is about "testing to break," so that when the technology is in actual use, it functions perfectly (ideally).  One test that I recently ran involved asking CrAIyon for its rendition of an impossible number.  I try to stump AIs out of an abundance of curiosity and gentle mischief, not any malice.  Take my advice, though, and do not put in names of diseases and health ailments that you have no visual idea about.  [I am trying to unsee the AI's sense of what a hernia is.]


Figure 3.  Making (Non)sense of Impossible Numbers





So this is a first user-based review by a new user of this web-facing image-drawing AI.  



The GUI (Graphical User Interface)


The interface is very straightforward to use.  There is a text field near the top for the text prompts.  There is a pane for the 9 images to populate (like a proofsheet or proofing sheet on a light table back in the film and analog days).  Above are some banner ads.  There is a window that appears at the bottom right of the screen with video ads, too.  The visuals take under two minutes to populate.  The images are 72 pixels per inch and are about 10 – 12” on each side of the square visuals.  The images are all two dimensional.  There is sufficient spatial resolution to work with, and there is always the opportunity to interpolate pixels (resample) for a fuller visual.  There is a way to screenshot the entire proofsheet, with the AI name and the text prompts in the prompt window.  
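The size figures above translate into pixel counts with simple arithmetic.  The sketch below works it through, assuming the 72 pixels per inch and 10 to 12 inch sides reported above.  The 300 pixels-per-inch print target is an assumption added for illustration (a common print resolution), not something from CrAIyon, and the function names are hypothetical.

```python
def inches_to_pixels(inches: float, ppi: int) -> int:
    """Pixel length of a side, given its length in inches and pixels per inch."""
    return round(inches * ppi)


def print_size_inches(pixels: int, ppi: int) -> float:
    """Side length in inches if the same pixels are packed at a different density."""
    return pixels / ppi


SCREEN_PPI = 72   # resolution reported for the CrAIyon images
PRINT_PPI = 300   # assumption: a common print target, not from the article

small_side = inches_to_pixels(10, SCREEN_PPI)   # 10 inches at 72 ppi
large_side = inches_to_pixels(12, SCREEN_PPI)   # 12 inches at 72 ppi

# Without resampling, the same pixels cover far less paper at print density.
small_print = print_size_inches(small_side, PRINT_PPI)

# Scale factor needed to keep the original physical size at print density.
upscale_factor = PRINT_PPI / SCREEN_PPI

print(small_side, large_side, small_print, round(upscale_factor, 2))
```

So a 10-inch side is 720 pixels, which shrinks to 2.4 inches at 300 pixels per inch unless the image is resampled up by a factor of roughly 4.17.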

Only once over several days did I receive a "too much traffic" warning.  The text message suggested that I try back in 15 minutes.  


AI, Biases, and Stereotyping


There is a risk of the AI stereotyping.  Look what it came up with when I put in “Kansas.” [I have learned to refresh -> refresh -> refresh with a particular prompt to get a better sense of what additional variety there may be.  This AI does seem to have a stubborn sense of Kansas though as dry fields and flat roads and desert shrubs.  A dirt road did finally show up in one image in the third round.  A bit of a rock wall showed up in one image in the fifth round.  And there is a distinct lack of people in the AI photos.]


Figure 4.  “kansas” as Seeding Text Prompt in CrAIyon




So does this AI have go-tos that become apparent after a few iterations, a few “epochs”?  It would seem so based on some light first-hand experimentation.  

Other seeds-and-searches resulted in a sense that having some wildcard factors, some accidental inputs, would make this stronger and more creative.  

Also, I’ve learned that even though there may be a repeating sensibility, the same visual does not seem to come up in the next iterations.  In cases where I wanted to copy several of the visuals in the 9-pack and made the mistake of hitting the back-arrow key to try to return to the proofsheets, I ended up navigating away from the CrAIyon site and losing the chance at copying the visual.  [This lesson about making copies of whatever it is I want (whether it is data or a digital image or something else) applies to all software I use in my work.  Sometimes, what came before is not re-creatable.  Or it may be that I only have short-term access to the proprietary software or the data, and so on.]


Very Early Insights


On the weekend after #SIDLIT2022, I started playing with CrAIyon and then made a quick slideshow of some of the observations.  The results may be seen on SlideShare.

First, for well-known human artists, in a world with an emulative AI, it would seem that there would be a generic “CliffsNotes” of one’s works, so one should focus mostly on a signature.  Or perhaps the lesson is that the general public and an AI might generally just catch the highlights of an artist’s work and not more.  Or perhaps the lesson is about there being only a few experts in the world who may fully understand an artist’s works?  Or maybe the lesson is about an artist pleasing the internal self and not worrying about a world that may or may not understand them.  Human artists can take a percentage of their lifetimes to acquire new skills, to learn new methods, to participate in art movements; meanwhile, AIs learn at incredible speeds and from each example.  They can be directed by people to become more nuanced in their approaches.  They can internalize rules of aesthetics (although this one is not quite there yet).  Perhaps it is a positive that people do not value machine-made artworks in the way that human-made works are sometimes cherished.

When I used the names of lesser-known artists, the AI could not return anything except generic images of blurry people standing in front of blurry canvases in a blurry gallery space.  [At least the AI returns something.]

Second, it was clear that the tool’s in-world physics did not align with the real world per se.  The machine has its own convenient sense of the world.  The observations of the jigsaw puzzles depicted (in the slideshow) are a case in point.  There are images of buildings with roofs that look like car carpeting.  There is not an inherent physics, it would seem.  And the depictions in the mass-scale sets of images are insufficient for an inferred understanding of our in-world physics (at least in terms of computational detection).  [Perhaps the AI learning should not be all unsupervised and inferential (if it is so)?]


Figure 5.  House with Wonky Roof from CrAIyon  

 



In the “proofsheet” below, can you find the house that seems to have a part created from a mailbox with a red flag?  Can you find the white-rimmed windows lying in a patch of plants?  And yet, the visuals are still compelling (and editable).  


Figure 6.  Various Misplaced Housing Parts


 

As with many new (human) artists, the AI has problems with drawing hands accurately.


Figure 7.  Problems Drawing Hands in CrAIyon

 


To be clear, this AI can make hyper photo-realistic visuals from textual prompts alone.  It is hard to conceptualize that the visual one is seeing is made up out of whole cloth (in a manner of speaking).  This can help one wise up to the so-called "deep fakes," which can include dynamism like sound and motion and facial expressions.

I also wonder if I am wrong to use realness and in-world points-of-comparison instead of letting the various generative AIs make up their own worlds.  People can appreciate the wholly imaginary...because so many of those creations still have the human world as a subtle baseline.


Figure 8.  A Lop-Eared Rabbit Eating Greens in a Garden




CrAIyon is non-committal about language-seeded representations…  Perhaps it is trying for more universalist representations.  Perhaps it is engaging in “strategic ambiguity”.  Perhaps it is going with the visual equivalent of an unintelligible mumble instead of a clear answer.  Perhaps it is designing for an immersive virtual world, which has to be in low res to be engage-able in real-time by various human-embodied avatars.  Some of the visuals have white edging on the vertical sides, as if the tool could not be bothered to fill in the edges to complete the illusions.  


Figure 9.  Visual Senses of “Language” in CrAIyon

 


This artificial art generator does struggle with text. 



Figure 10.  "ASCII text" visuals in CrAIyon




Facial Depictions of a Known Person


Out of an abundance of silliness, my very earliest prompts involved a movie actor whom I like.  First, I just put in his name, and ended up with some very wonky facial features.  In the FAQs, the makers of CrAIyon blame their “image encoder” for the respective twisted faces, with strange eyes.  Some visuals look like they have been put through a distortion field.  Then, I put in the actor’s name and had him holding a bouquet of flowers at the doorway…cooking a meal…and so on. Not much luck there either.  His face in the visuals was highly distorted. 

Images of people read as caricatures, and I didn’t have the heart to share them.  



About Images Requiring Exactitude


Then, I thought I would experiment with maps, visuals that require high exactitude.  The same problem emerged with inexactness:  squiggly edges, meaningless scribbles of color, and what I read as noncommittalness to the text prompt.  Like a noob (newbie) artist, the AI was going with blurriness to perhaps hide a lack of knowledge of the desired target or focal visual.


About Animals and Staying within Species


Then, too, I noticed that when I put in particular animals, their features would be from a variety of biological species of that animal.  For example, I could call up photorealistic images of various owl species on Google, but if I did not specify the owl species (and even when I did), the resulting visuals showed various mixed features of owl species in one animal.  

I realized then that the AI seems to work from features on up…and not with a top-down sense of control (except for compositions or layouts).  


About a Scary Streak in CrAIyon's "Imagination"


A recent prompt involved "Sleep, Sweet Dreams, Sleepwalking," and the following is the result. 


Figure 11.  Sleep, Sweet Dreams, Sleepwalking...Not





Making Better Prompts (This is on the Person)


When I moseyed over to the social forum linked to CrAIyon (located at https://huggingface.co/spaces/dalle-mini/dalle-mini/discussions), and after I had seen some more creative prompts on Twitter, I realized that I needed to up my game in terms of textual prompts.  [CrAIyon does help expand one's senses of language and the possibilities of harnessing textual language to request a machine create visual works, and so switch modalities.]  I can mix subject matters, styles of visuals, scene details, time of day, lighting, and other elements.  I can go full imaginarium and offer ideas not fully conceptualized before.  Perhaps there is a way to see if there is nothing new under the sun.  [CrAIyon works with a number of languages, including Chinese by character and Chinese by pinyin.  The multi-lingual aspect does not necessarily seem to result in any cultural overlay per se, but it did result in a racial overlay.]
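One way to explore mixes of subject, style, and lighting systematically is to generate every combination of a few short lists and then paste the results into the prompt field by hand.  The following is only a sketch.  The word lists, the build_prompts helper, and the comma-joined phrasing are all assumptions made up for illustration; CrAIyon does not document any required prompt syntax that this relies on.

```python
from itertools import product

# Hypothetical ingredient lists; swap in any subjects, styles, or lighting.
subjects = ["a lop-eared rabbit eating greens", "a black swan on a lake"]
styles = ["watercolor", "modern art style"]
lighting = ["at dawn", "under moonlight"]


def build_prompts(subjects, styles, lighting):
    """Return one comma-joined prompt per (subject, style, lighting) combination."""
    return [
        f"{subject}, {style}, {light}"
        for subject, style, light in product(subjects, styles, lighting)
    ]


prompts = build_prompts(subjects, styles, lighting)

for prompt in prompts:
    print(prompt)
```

Two subjects, two styles, and two lighting options yield eight prompts, which makes it easier to compare how a single changed element shifts the resulting visuals.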


Figure 12.  “New Under the Sun” as Seeding Text Prompt in CrAIyon

 


One time, I was making a visual of a “black swan” and wanted some of the moodiness of Edgar Allan Poe to be included (as per his raven).  No such luck!  Rather, I got images of the author mixed in with the black swan.  The wooden literalism is sometimes what users of CrAIyon want.  Other times, not so.

I don’t know if the AI takes into consideration the sequence of words or just treats the text like a bag of words.  I also ran a test to see if the text box would take non-English terms (“好的”).  It did, but the visual it returned was of an Asian male with various twisted facial features (as per the platform and its treatment of all faces).  I also tried math terms.  It accepted them, but it can’t do basic math even as it replies with numbers.
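To make the "bag of words" idea concrete, here is a minimal sketch of what it means to ignore word order.  It says nothing about how CrAIyon actually processes prompts, which remains an open question here.  The example prompts and function names are hypothetical.

```python
from collections import Counter


def bag_of_words(prompt: str) -> Counter:
    """Count each lowercase word, discarding the order in which words appear."""
    return Counter(prompt.lower().split())


def word_sequence(prompt: str) -> list:
    """Keep the lowercase words in their original order."""
    return prompt.lower().split()


first = "dog bites man"
second = "man bites dog"

# As bags of words, the two prompts are indistinguishable.
same_bag = bag_of_words(first) == bag_of_words(second)

# As sequences, they differ, so an order-aware system could tell them apart.
same_sequence = word_sequence(first) == word_sequence(second)

print(same_bag, same_sequence)
```

A practical test along these lines would be to submit two prompts that differ only in word order and compare the resulting proofsheets.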


Figure 13.  “2+2 =” as Seeding Text Prompt in CrAIyon

 


It does seem pretty robust even in a case of poor spelling.  


Sending the AI Back to School


There are some surprises, too.  For example, when I was reading a work and came across the term “necker cube,” I thought that I would go to CrAIyon to see what that was and got a lot of strange visuals combining aspects of the Rubik’s Cube and other influences.  Not.  

It does not seem to know what a “closed loop” is either, or “matrices.”  


Figure 14.  “closed loop” as Seeding Text Prompt in CrAIyon

 


Figure 15.  “matrices” as Seeding Text Prompt in CrAIyon
 




Sentience and Self-Awareness in CrAIyon or Not?   


With recent foci on AI and “sentience” (and the sci-fi idea of the personhood of conscious machines), I wondered how it would self-depict.  What would it see when I asked it to look in the mirror?  The next three visuals involve prompts referring to its former program…and then its current name.  

This machine seems to dream of simple shapes and mostly in monochrome. 


Figure 16. “DALL-E” as Seeding Text Prompt in CrAIyon

 


When it thinks of itself, it dreams of cars and electronic gadgets. 


Figure 17.  “DALL E mini” as Seeding Text Prompt in CrAIyon

 


“CrAIyon” (a disambiguated term) perhaps aspires to be a four-legged mammal.  Perhaps this technology has a sense of a “spirit animal.”


Figure 18. “CrAIyon” as Seeding Text Prompt in CrAIyon

 


Some Concluding Impressions


In a time of the Fourth Industrial Revolution (4IR) and machine vision, of course we should have a web-facing artificial intelligence (AI) program that can learn from big data visuals and offer back some of what it has learned when prompted to respond.  Of course, such an AI should be dynamic instead of pre-defined, but it might be stronger with a little more top-down and a little less bottom-up in cases of within-species depictions of animals and other cases.  Of course, people should have access to a hyper-creative art generator which can evoke a world which sometimes mimics ours perfectly and other times is a shoddy example.

When I’ve conducted trainings in CAQDAS tools in the past, people often did not believe me when I said that the human researcher has the competitive advantage because they have built up expertise over years. They collaborate. They form hunches. They make observations daily.  The technology is only a tool, and people have various ways to harness and use that tool.  I would say the same about CrAIyon.  It can solve a lot of problems I may have for when I need a free visual to use in a non-commercial and educational context.  

The article is not supposed to be a collection of pet peeves, and it isn’t.  It’s about engaging an AI for practical purposes and seeing where its strengths and weaknesses are.  That said, I assume people will have different experiences when they engage this tool.  I am trained to poke and prod technologies, for practical uses, in my professional roles.  

When I stop to think about this AI tool, it is a breathtaking achievement with years and years of top-flight research to enable its functioning at such speeds and with often very cool visual outcomes.  CrAIyon does make me feel very happy.  And when I use the tool, I have moments of surprise and of appreciation and of thankfulness and levity.  I also have times of exasperation.  This is somewhat like mash-ups (of features) on steroids…but intelligent in a way that has never existed prior, but it requires a user to be a super creative, too. And then there are telltale elements that seem to sometimes leak that the work is from CrAIyon.  Perhaps it's the blurriness sometimes.  It's how the eyes are rendered in people.  And it's the insistence of the tool to depict particular objects in particular ways, even ignoring adjectival requests (can it draw a digital network without doing glowy nodes on a black background? ummm).  So many of the visuals read as "style transfer" from perhaps mass-scale visual learning, with the laying down of initial lines to a limited 2d plane, and then transferring style (with some straight mapping).  Where human creativity is a thing of mystique, such as how people transcode their life experiences to something artful, the mystery of computer-generated art is somewhat mechanistic, without direct flourish (the equivalent of randomness, the equivalent of personality).  The innovation comes from the mixes of terms used to start the generative AI.   

I do not see AI as competing with people (although it does).  A better way to think about this is that AI complements what people care about and what people can do. And I do not think people should cede ground to AI per se in matters of art creation, for example.  While I was slightly snarky when I suggested that the AI needs to go back to school, I'll say this about me and other people:  we all need to go back to school on so many things...and just keep learning overall. 

To be presumptuous, I will take a run at the state of the art of generative AI that creates visuals from a user perspective.  Publicly available generative AI has mastered compositing and areas of visual interest.  It is still working on shapes, colors, in-world physics, faces, eyes, hands, and legs.  It makes creatures that do not align with species, in many cases, unless specified.  It has a hard time synthesizing concepts.  The tools each seem to have "tells" that reveal something about the algorithms underneath.

The AI feels sometimes very tentative and sometimes over-confident.   I like the gusto of AI in pursuit of a visual idea.  I like that it keeps trying.  The technologies are advancing quickly, and their learning is inspiring. (Kudos to the humans behind the AI.) 


A Few More Visualizations to Close Out



I decided to wrap with an off-the-top-of-my-head prompt and let this article end on the visuals.  What does CrAIyon make of “human nature”? 


Figure 19. “Human Nature” as Seeding Text Prompt in CrAIyon
 



What about Midjourney?



Then, I heard about Midjourney and Discord.  It took a little noodling around to see how to activate the Midjourney bot for a newbie...through Discord.  Given my trouble trying to get an image of Keanu Reeves out of CrAIyon, I thought I would try on Midjourney. 



Figure 20.  Keanu Reeves Making Breakfast (according to Midjourney AI)




Keanu is much more handsome in Midjourney than in CrAIyon.  It's unclear why the dishes are sort of floating.  But I'll take this over the ones by CrAIyon. 


What about DreamStudio's Stable Diffusion?



Then, after Adobe MAX 2022 (virtual vs. the in-person part in LA), I discovered DreamStudio's Stable Diffusion plug-in to Adobe Photoshop 2023.  The following visual explains itself.  [Initial images are free, but it looks like there are costs that may accrue as more images are made.]


Figure 21.  Keanu Reeves Making Breakfast (according to DreamStudio's Stable Diffusion plug-in to Adobe Photoshop)


 


Then, on a lark, I wondered what CrAIyon thought about Midjourney.



Figure 22.  Asking CrAIyon about Midjourney




The AI took the high road. 



Finally, Will Knight wrote an article titled "This Copyright Lawsuit Could Shape the Future of Generative AI" in Wired Magazine.  The subtitle reads:  "Algorithms that create art, text, and code are spreading fast--but legal challenges could throw a wrench in the works."  There are various complexities in this space, beyond the technological. 

Those who use CrAIyon or other generative AIs will acquire works that are human- and machine-generated and are original to varying degrees.  It seems wise to cite the works if used directly.  There are indirect ways to use the works, such as for inspiration for follow-on works or as reference images (from which derivations can be made).  Digital pixels are their own material, and it helps to think about them as such and to learn the nature of the materials in order to create and co-create.  It seems wise to be as original as possible by actually developing the skills one needs in terms of creating visuals.








About the Author


Shalin Hai-Jew works as an instructional designer and researcher at Kansas State University.  Her email is shalin@ksu.edu.  