I gave a lecture earlier this week in which I surprised myself by how vehemently I argued that image and video generators are (mostly) functionally useless. The problem, I think, is that you can rarely produce exactly what you want through a precise description. I can see how you could stock libraries of generic stock images this way, which someone then chooses from (with all the horrible implications for employment that follow), but I struggle to see how you could use them in an autonomous way, apart from for incredibly generic and straightforward things, e.g. “a photo of a family at the beach looked happy by the sea”. Though having tried this example, the result was creepy as fuck:

However, I tried automatic writing, in the sense of genuine automaticity rather than free writing, to see what happens. You can get some evocative images if you approach them in this way, but they serve no discernible purpose other than momentary (wasteful) amusement:


It’s interesting how ChatGPT converts the free writing into a prompt, piggybacking on the sophistication of the text model to make the image model less crude than it would otherwise be. I can’t shake the feeling there’s an art to this which I’m failing to grasp, but it’s certainly a very different process from text-based prompting.
