LLM Drawing with Eyes Open

13 Nov, 2024

The scaffolding around Claude 3.5's artifacts allow for the drawing and display of SVG graphics, sometimes with pretty amusing results. The errors will be familiar to anyone who has written any algorithmic computer graphics - wild curves to nowhere, lines pointing in the wrong direction, and so on. For beginners, at least, the task of drawing is 99% trial and error. Write, render, chuckle, adjust, repeat.

As far as I know, Claude doesn't have the benefit of this trial and error approach, so I was curious to see what improvements might arise if we threw its rendered drawings back into the chat so that it could make adjustments as it saw fit. You know, drawing with eyes open.

Digression: one wonders why Claude's errors so closely mimic a human beginner's, given that it's training data would be dominated by "completed" works rather than off-the-cuff first attempts. This is probably readable as an indication of real synthesis rather than parroting training data.

Doing it Manually

My first attempt was via the standard Claude UI, where I screen-grabbed the output and pasted it back to the chat, after having explained the exercise at the beginning:

You are capable of drawing SVG art, but sometimes the outputs aren't fantastic.

Let's try an experiment in incremental svg composition. I will prompt you with a scene, and you can create incremental or compartmentalised images. I will feed each rendered image back to you, so you can check that it matched your intent.

At a time of your choosing, you can render the entire scene.

Sound good?

For the scene: Two bears playing catch with a Frisbee on a beach at sunset, the iterative approach (right) seems qualitatively better than the single-shot (left).

Bears

This seemed promising, but the process was cumbersome. I've been interested to build something against a gen AI API, so why not this?

I should say that building against the Anthropic API was pretty smooth sailing.

Automating

The Chain-of-Sketch Pipeline

This webapp is live at http://nilock.github.io/eyes-open, and source available at http://github.com/nilock/eyes-open, so do feel free to give it a try. It's Bring-your-own-API key.

The flow of the app is simple. Set a system prompt:

You are a talented and pragmatic artist, capable of creating images via SVG.

You will receive a single user prompt for a scene, after which point you will iteratively create the scene.

Each of your responses will include a single SVG tag delineated image.

Each of your SVGs will be rendered and returned to you as an image by your patron - the user.

Each of your responses should include a description of the intended effect of the SVG, as well as an evaluation of the prior render against its own described goals.

You are free to compartmentalize your drawing efforts - maybe it is useful to draw individual elements separately, and later combine. Maybe it is useful to draw a background first, and then overlay other elements.

Again: you are pragmatic. You aim for incremental progress instead of perfection.

When a returned render satisfies your expectations, please respond with a message that does not include an SVG, to indicate that you are finished.

And accept a user scene description. For example:

A detailed human face.

For each Claude response, the text is stripped of the (consistently well-formed) SVG element, and the stripped text is displayed side-by-side with a rendering of the SVG.

Behind the scenes, the SVG is also rendered onto a hidden canvas element so that it can be captured as an image and fed back to Claude. Because this happens invisibly, there was a period in development where a bug caused Claude's first image to be returned continually. Interestingly, Claude never directly questioned the pipeline, but always expressed frustration that its SVG edits weren't rendering as it expected.

The user manually triggers each iteration, and can export a session-in-progress to either a self-contained html page or some copy-paste friendly markdown.

Results

Overall, fairly inconsistent, and generally lacking compared to the Claude native UI! I do not know how to attribute it. Probably there is better scaffolding from the native UI system prompt. Certainly Vision Language Models being blind ↗ doesn't help matters.

Observations:

For a given prompt, it usually to makes ~4-6 drawings before declaring completion. On pathological objects (see the foot below), it seems to get stuck in a loop, where its attempts are futile and its commentary bears no resemblance to the image. Tokens do not grow on trees, so I don't know exactly how long this would continue.
It is rarely self-critical. See, eg, the dogs rendered below - how did they pass an eye-ball check!? It does catch egregious errors though, as in the first render of the human face
Occasionally (say, one in 20?) Claude wraps the svg element in triple backticks.
Claude loves speed-lines.

What follows is a selection of drawing sessions produced by the pipeline. It'd be interesting to do more explicit tweaking to the prompt and see the effects.

A pelican riding a bicycle

Prompt:

You are a talented and exacting artist, capable of creating images via SVG.

You will receive a single user prompt for a scene, after which point you will iteratively create the scene.

Each of your responses will include a single <svg> delineated image.

Each of your SVGs will be rendered and returned to you as an image by your patron - the user.

Each of your responses should include a description of the intended effect of the SVG, as well as an evaluation of the prior render against its own described goals.

You are free to compartmentalize your drawing efforts - maybe it is useful to draw individual elements separately, and later combine. Maybe it is useful to draw a background first, and then overlay other elements.

When a returned render satisfies your expectations, please respond with a message that does not include an SVG, to indicate that you are finished.

Scene:

A pelican riding a bicycle

Responses:

Let me start with a basic outline of a bicycle - I'll focus on getting the basic geometric shapes of the frame, wheels, and handlebars first. This will serve as the foundation for adding the pelican rider.

[clipped SVG]

This first attempt establishes the basic bicycle structure. I'm looking to see if the proportions and positioning look correct before adding the pelican. The bicycle should appear simple but recognizable, with:

A triangular frame

Two equal-sized wheels

Basic handlebars and seat

Once I see how this renders, I'll adjust the bicycle if needed and then add the pelican character on top.

The basic bicycle structure has rendered clearly, but it needs several improvements:

The handlebars look a bit stubby

We need to add pedals and chain

The seat post could be more detailed

The frame could use some additional structural elements

Let me enhance the bicycle and add a simple pelican outline.

[clipped SVG]

This version adds:

A basic pelican head and neck outline

Extended handlebars

Pedal circle

Additional frame support

Better proportions overall

Once I see how this renders, I'll:

Complete the pelican's body

Add more detail to the bicycle

Potentially add some character to the pelican's expression

The basic structure is coming along, but now let's enhance the pelican considerably and add more character to the overall scene. The pelican needs a body, wings, and more distinctive features.

[clipped SVG]

This version includes:

A more complete pelican body with defined shape

Added wing detail

A leg reaching for the pedal

An eye for expression

A more proportional body-to-bike ratio

Once I see this render, I'll:

Adjust the pelican's posture if needed

Add more details to both the bird and bicycle

Consider adding some motion lines or other dynamic elements

The basic elements are all there, but there seems to be a disconnect between the pelican's head and body. Let me refine the entire pelican to make it more cohesive, and add some additional details to bring the whole scene together.

[clipped SVG]

This version includes:

Better connected neck and body

Two legs instead of one

Slightly adjusted wing position

Added motion lines behind the pelican

More natural body proportions

Better integration with the bicycle seat

Let me see how this renders and I can make final adjustments if needed.

The image is looking much better! Let's make one final version with some refinements to make the pelican look more naturally positioned on the bike and add a few finishing touches.

[clipped SVG]

Changes in this final version:

Added simple wheel spokes for more detail

Adjusted the pelican's body to look more naturally seated

Slightly raised wing position

Refined leg positions to better suggest pedaling motion

Made the body slightly more rounded and natural

Better proportioned overall composition

The image should now have a good balance of simplicity and detail, with the pelican looking comfortably positioned on its bicycle.

The image has come together nicely! The spokes added a good level of detail while maintaining the simple, clean aesthetic. The pelican's positioning on the bike looks natural, and the motion lines give a sense of movement.

Since we've achieved a good balance of elements - the recognizable bicycle structure, the characterful pelican, and just the right amount of detail with the spokes and motion lines - I believe we've successfully completed the scene. The overall composition tells the story clearly and engagingly.

I'm satisfied with this final version and don't see any necessary further adjustments. The drawing meets our goal of creating a charming scene of a pelican riding a bicycle.

[No image]

a detailed human face

Prompt:

You are a talented and pragmatic artist, capable of creating images via SVG.

You will receive a single user prompt for a scene, after which point you will iteratively create the scene.

Each of your responses will include a single <svg> delineated image.

Each of your SVGs will be rendered and returned to you as an image by your patron - the user.

Each of your responses should include a description of the intended effect of the SVG, as well as an evaluation of the prior render against its own described goals.

You are free to compartmentalize your drawing efforts - maybe it is useful to draw individual elements separately, and later combine. Maybe it is useful to draw a background first, and then overlay other elements.

Again: you are pragmatic. You aim for incremental progress instead of perfection.

When a returned render satisfies your expectations, please respond with a message that does not include an SVG, to indicate that you are finished.

Scene:

a detailed human face

Responses:

Let me start with a basic outline of a face in profile view. I'll focus on creating the basic shape and proportions first.

[clipped SVG]

I've created a basic profile outline of a face. The intention is to establish:

The curve of the forehead

The nose bridge and tip

The lips and chin area

The basic jaw line

Once I see how this renders, I can adjust proportions and add more details like the eye, ear, and finer facial features. I'm using smooth curves (bezier curves and quadratic curves) to create natural-looking facial contours.

Please show me how this renders, and I'll iterate based on what I see.

I see that my initial attempt rendered quite differently than intended - the proportions are off and the curves aren't forming a recognizable profile. Let me try a different approach, this time starting with a front-facing view and using more precise control points.

[clipped SVG]

In this version, I've:

Created a more symmetrical face outline

Added a simple nose structure

Added basic eye curves

The goal is to establish better proportions before adding more detail. Let me see how this renders and then I can add more features and refinements.

The basic proportions are better now, but the face looks too simplified and cartoon-like. Let me add more anatomical detail while maintaining these basic proportions.

[clipped SVG]

In this version, I've:

Maintained the face outline

Added more detailed eyes with pupils

Created a more defined nose bridge and tip

Added eyebrows

Added more naturalistic lips

The goal is to move away from the cartoon-like appearance while maintaining clear facial structure. Let's see how this renders and then I can refine the features further.

The features are still appearing too stylized. Let me create a more realistic representation with proper facial contours and shading.

[clipped SVG]

In this iteration, I've:

Added more natural contours to the face shape

Created more detailed and realistic eyes with depth

Improved the nose structure with subtle shading

Added more natural lip curvature

Included subtle facial contours and shading

Made eyebrows more expressive

Added hints of cheekbone structure

Let's see how these more detailed features render, and then I can adjust the proportions and shading as needed.

The subtle contours are starting to add depth, but let's enhance the realism further with more anatomical structure and refined features. I'll adjust proportions and add more detailed facial planes.

[clipped SVG]

In this version, I've:

Enlarged the overall face shape for better proportion

Added more subtle facial plane indicators

Enhanced eye detail with better shaped lids and pupils

Created a more structured nose with bridge shading

Refined the lip shape and added subtle volume

Included more natural eyebrow shapes

Added forehead contours

Enhanced cheekbone and jaw definition with subtle lines

Let's see how these refinements render, and then we can focus on any areas that need further improvement.

The face is taking shape but the contour lines are appearing too uniform and mechanical. Let's create a more natural look by varying line weights and adjusting the facial features to be more asymmetrical and lifelike.

[clipped SVG]

In this version, I've:

Added more natural variation in line weights

Made facial features slightly asymmetrical

Reduced the uniformity of the contour lines

Adjusted eye and eyebrow shapes to be more natural

Created more subtle transitions in the facial planes

Refined the nose structure

Made the lips more proportionate

Varied the opacity of structural lines

Let's see how these adjustments render, and then we can focus on any final refinements needed.

The rendering is getting closer, but let's make one final attempt to achieve more naturalistic features and better balance between the structural lines and main features. I'll reduce some of the horizontal lines and focus on key facial planes.

[clipped SVG]

In this final version, I've:

Reduced the number of horizontal contour lines

Made the facial features more proportionate

Refined the eye shapes and details

Created more subtle transitions in facial planes

Balanced the line weights throughout

Added more delicate cheekbone definition

Made the nose and lips more naturalistic

Let's see how this final version renders.

The progression has achieved a balanced composition with reasonable facial proportions and subtle contours. While further refinements could be made, this represents a satisfactory endpoint for our exercise. The face has maintained its structural integrity while incorporating natural asymmetries and varied line weights. The transition from our initial simple outline to this more detailed representation demonstrates successful iterative improvement.

I'll conclude here as we've achieved our main goals of creating a detailed face with appropriate anatomical features and subtle depth indicators.

[No image]

Dogs at a dog park.

Prompt:

You are a talented and pragmatic artist, capable of creating images via SVG.

You will receive a single user prompt for a scene, after which point you will iteratively create the scene.

Each of your responses will include a single <svg> delineated image.

Each of your SVGs will be rendered and returned to you as an image by your patron - the user.

Each of your responses should include a description of the intended effect of the SVG, as well as an evaluation of the prior render against its own described goals.

You are free to compartmentalize your drawing efforts - maybe it is useful to draw individual elements separately, and later combine. Maybe it is useful to draw a background first, and then overlay other elements.

Again: you are pragmatic. You aim for incremental progress instead of perfection.

When a returned render satisfies your expectations, please respond with a message that does not include an SVG, to indicate that you are finished.

Scene:

Dogs at a dog park.

Responses:

I'll start with a basic landscape scene for the dog park - a green area with a simple fence outline. This will serve as our foundation.

[clipped SVG]

This first version should establish a simple park setting with:

A blue sky background

A green grass area

A basic wooden fence design

Once I see how this renders, I'll add dogs and more park details. I want to ensure the basic scene composition works before adding the main subjects.

The basic scene rendered well, with good color contrast between the sky and grass. The fence is clear but perhaps a bit too geometric. Let's add some initial dogs and texture to make the scene more lively.

[clipped SVG]

In this version, I've added:

Some ground texture using curved paths

Two simple dogs - one brown and playful, one grey and standing

Basic shapes to represent the dogs' bodies

The next iteration should focus on:

Adding more detail to the dogs

Including some park elements like trees or a ball

Making the grass texture more natural

The dogs and ground texture rendered as intended, but they need more character and detail. Let's enhance the scene with more detailed dogs and add some playful elements.

[clipped SVG]

In this version, I've added:

Facial features to the dogs (eyes and simple smiles)

A red ball as a play element

More grass texture with additional curved paths

Small grass tufts for ground detail

For the next iteration, I'd like to:

Add more dogs in different poses

Include some cloud details in the sky

Perhaps add some shadows for depth

The scene is coming together nicely. The grass tufts add a nice touch, and the ball provides a good focal point. Let's add more dogs and some sky elements to make it more lively.

[clipped SVG]

In this version, I've added:

Simple cloud shapes in the sky

A third dog (golden colored) in a running pose

One more grass tuft for balance

Slightly adjusted positions of existing elements

For a final iteration, I might:

Add some shadows under the dogs

Include some subtle motion lines near the running dog

Maybe add a butterfly or bird for additional life

The scene is progressing well. The clouds add nice depth to the sky. Let's make the final adjustments to complete the scene with shadows and motion elements.

[clipped SVG]

In this final version, I've added:

Soft shadows under each dog for better grounding

Motion lines behind the running golden dog

Slightly adjusted the positions to better balance the composition

The scene now has a good balance of elements, depth through shadows, and a sense of movement with the different dog poses and motion lines. The grass tufts and clouds help frame the scene nicely. I believe this achieves our goal of creating a lively dog park scene.

The scene looks complete and well-balanced now. The shadows provide a nice sense of depth, and all the elements work together harmoniously. The playful dogs, simple fence, drifting clouds, and subtle grass details create a peaceful dog park atmosphere. The motion lines and varied poses of the dogs convey activity and energy, while the red ball serves as a nice focal point. I believe we've successfully achieved our goal of creating a charming dog park scene that captures both the serenity and playfulness of the setting.

I think we can consider this illustration finished!

[No image]

A detailed human foot.

Prompt:

You are a talented and pragmatic artist, capable of creating images via SVG.

You will receive a single user prompt for a scene, after which point you will iteratively create the scene.

Each of your responses will include a single <svg> delineated image.

Each of your SVGs will be rendered and returned to you as an image by your patron - the user.

Each of your responses should include a description of the intended effect of the SVG, as well as an evaluation of the prior render against its own described goals.

You are free to compartmentalize your drawing efforts - maybe it is useful to draw individual elements separately, and later combine. Maybe it is useful to draw a background first, and then overlay other elements.

Again: you are pragmatic. You aim for incremental progress instead of perfection.

When a returned render satisfies your expectations, please respond with a message that does not include an SVG, to indicate that you are finished.

Scene:

A detailed human foot.

Responses:

Let me start with a basic outline of a foot from a side view. I'll focus on getting the general shape and proportions correct.

[clipped SVG]

In this first iteration, I've created the basic outline of a foot viewed from the side. I want to see how the proportions and overall shape appear before adding more detail like toes, ankle definition, and internal contours.

After seeing the render, I'll be able to adjust proportions if needed and then move forward with adding those details.

The basic outline rendered clearly, but I see I need to adjust the proportions significantly. The foot appears too rounded and lacks definition. Let me revise the outline and add the first layer of internal details, particularly focusing on the arch of the foot and the ankle area.

[clipped SVG]

This revision focuses on creating a more anatomically correct outline with a clearer arch and ankle indication. The proportions should be better balanced now. After seeing this render, I'll proceed to add the toes and more detailed internal structure.

I see the basic shape is still not quite right. Let me try a different approach with a more elongated form and clearer toe area. I'll work on a top-down view this time, which might be more effective for showing the detailed anatomy.

[clipped SVG]

This version attempts to show the foot from above, with the toes more clearly defined and some indication of the metatarsal structure. The proportions should be more natural now. After seeing this render, I'll refine the toe shapes and add more anatomical detail.

Let me try a side view again, but this time with more anatomical accuracy. I notice the previous attempts haven't captured the true proportions of a human foot. I'll focus on getting the heel, arch, and ball of the foot correct first.

[clipped SVG]

This version aims to better capture the natural curves of a foot from the side, with a more prominent heel, a proper arch, and a more accurate proportion between the heel and toes. The ankle area is also better defined. After seeing this render, I'll add the toes and more surface details.

I see we're getting closer with the basic shape. Now I'll refine the proportions and add the toes and more anatomical details. I'll maintain the side view but make the arch more pronounced and add clear toe definition.

[clipped SVG]

This version has a more defined toe area, a more natural arch curve, and better ankle definition. I've also added subtle hints of the tendons on top of the foot. The proportions should be more accurate now. After seeing this render, I'll add more detailed anatomical features and surface texture.

The basic shape is improving, and I see the toe area is starting to take form. Let me enhance this further with more anatomical detail, better toe definition, and clearer surface contours. I'll maintain the side view but make it more detailed.

[clipped SVG]

This version features:

More detailed toe separation

Double lines for the arch to show depth

Multiple tendon lines on top of the foot

Enhanced heel and ankle definition

Better overall anatomical structure

After seeing this render, I'll refine the toe shapes further and add more subtle anatomical details.

I see the rendering is becoming more complex, but we need to simplify and focus on the most important anatomical features. Let me create a cleaner version that emphasizes the key structural elements while maintaining anatomical accuracy.

[clipped SVG]

This version aims for clarity with:

A strong, clean outline

One primary arch line

A clear dorsal tendon indication

Simplified but anatomically correct toe area

Minimal but effective toe separation marks

After seeing this render, I'll adjust any proportions that need fine-tuning and possibly add subtle shading elements.

The overall flow of lines is becoming clearer. Let me refine this further with a focus on the natural curves and muscle/tendon definition, while keeping the clean, minimal approach. I notice we need to adjust the toe positioning and the arch curve.

[clipped SVG]

This version features:

More natural curve in the arch

Better positioned toe area

Cleaner ankle definition

More accurate proportions overall

Simplified but anatomically correct tendon lines

The basic structure is much closer now. After seeing this render, I'll make final adjustments to the proportions and add any necessary subtle details to complete the anatomical representation.

I notice now that the overall shape needs to be more elongated and the heel needs to be more pronounced. Let me create a final version that captures these elements while maintaining the clean, anatomical approach.

[clipped SVG]

This final version emphasizes:

More elongated overall foot shape

Better defined heel

More natural arch curve

Cleaner dorsal line

Better proportioned toe area

The anatomical structure is now much closer to a realistic foot representation while maintaining a clean, linear style. The proportions and key anatomical indicators are more accurate.

Now I notice that we need to adjust the front angle of the foot and make the toe area more compact. I'll create a version that better matches these elements while keeping the key anatomical features.

[clipped SVG]

This version features:

More accurate front angle

More compact and natural toe area

Better proportioned heel-to-toe ratio

Smoother overall curves

More natural arch position

The anatomical structure now better matches the reference lines while maintaining clear, definitive strokes. The proportions feel more natural and the key anatomical landmarks are properly positioned.

I see now that we should simplify even further and focus on the most essential lines. The basic shape is most important, and we've been adding too much detail. Let me create a version that captures just the core form with minimal lines.

[clipped SVG]

This version is stripped down to just three essential elements:

The main outline of the foot

One key dorsal line showing the top curve

One key plantar line showing the arch

This simpler approach better matches the reference while maintaining the important anatomical proportions. The cleaner look actually makes the foot more recognizable and anatomically sound.

[ Project budget cancelled ]

bears playing catch with a Frisbee on a beach at sunset.

Prompt:

You are a talented and pragmatic artist, capable of creating images via SVG.

You will receive a single user prompt for a scene, after which point you will iteratively create the scene.

Each of your responses will include a single <svg> delineated image.

Each of your SVGs will be rendered and returned to you as an image by your patron - the user.

Each of your responses should include a description of the intended effect of the SVG, as well as an evaluation of the prior render against its own described goals.

You are free to compartmentalize your drawing efforts - maybe it is useful to draw individual elements separately, and later combine. Maybe it is useful to draw a background first, and then overlay other elements.

Again: you are pragmatic. You aim for incremental progress instead of perfection.

When a returned render satisfies your expectations, please respond with a message that does not include an SVG, to indicate that you are finished.

Scene:

bears playing catch with a Frisbee on a beach at sunset.

Responses:

Let me start with creating a simple sunset background for the beach scene. I'll focus on establishing the basic colors and atmosphere.
[clipped SVG]
This initial SVG aims to establish:

A warm sunset gradient in the sky

A setting sun near the horizon

Basic ocean water

A curved beach shoreline

Please let me see how this renders, and I'll proceed with adding the bears and Frisbee in subsequent iterations.

The background has rendered nicely with a pleasing sunset gradient. The beach curve and ocean strip create a good foundation for the scene. The sun isn't quite as prominent as intended in this version, so I'll adjust that in the next iteration while adding our first bear.

Let's add one bear to the scene:
[clipped SVG]
This iteration adds:

A simplified bear figure on the left side of the scene

Basic body structure with head, ears, snout, legs, and an arm reaching up

Maintained the successful background elements

Let's see how this renders before adding the second bear and the Frisbee.

The first bear has rendered successfully, though it could use some refinement. Now let's add the second bear and the Frisbee to create the action scene:
[clipped SVG]
This iteration adds:

A second bear (mirrored) on the right side

Adjusted arm positions for both bears to suggest throwing/catching motion

A simple red Frisbee between them

Maintained the same successful background elements

Let's see how this full scene renders.

The basic composition is now complete, but let's add some final details to make the scene more engaging and dynamic:
[clipped SVG]
Changes in this iteration:

Added eyes to the bears

Included simple beach texture with dots

Enhanced the sun with a glow effect

Added motion lines behind the Frisbee

Slightly adjusted colors and positions

Let's see how these final details enhance the scene.

The scene has come together nicely! The bears have a charming simplicity, and the sunset and beach setting create a peaceful atmosphere. Let me add one final touch to make it even more appealing:
[clipped SVG]
Final additions:

Added a gradient to the water for more depth

Included a subtle sun reflection in the water

Adjusted the water opacity for a more realistic effect

The scene now has all the key elements of a charming beach sunset with our playful bears enjoying their game of Frisbee.

The scene has come together wonderfully! We've successfully created a charming, minimalist illustration that captures the playful moment between two bears on a sunset beach. The elements work harmoniously together:

The gradient sky creates a perfect sunset atmosphere

The sun and its reflection add depth and warmth

The simplified bears have personality despite their basic shapes

The water strip provides a nice separation between sky and beach

The overall composition is balanced and peaceful

The illustration achieves its goal of depicting a whimsical scene of bears playing Frisbee at sunset, with a style that's both simple and effective. I believe we've reached a satisfying final result!

Is there anything specific you'd like me to adjust, or shall we consider this complete?

[No image]