Generating Consistent Imagery with Gemini: A Practical Guide to Building a Prompt-Based Generation Pipeline

You can transform archive photos into consistent character-driven stories using Gemini 2.5 Flash Image (Nano Banana). This guide demonstrates how to build a prompt-based pipeline that preserves character features across multiple scenes and transformations.

The Challenge: Breathing Life into Visual Archives

Traditional image editing requires specialized skills and complex tools. Most archive images remain unused because modifying them seems impossible. Gemini 2.5 Flash Image changes this by understanding spatial relationships and maintaining visual consistency across generations.

We’ll complete this three-step process:

  1. Start with an archive image
  2. Extract a character to create a reference sheet
  3. Generate a story sequence using only prompts

Setting Up Your Pipeline

Essential Dependencies

Install the required packages:

%pip install --quiet "google-genai>=1.40.0" "networkx[default]"

Configure the Client

from google import genai

client = genai.Client()

Model Configuration

Use Gemini 2.5 Flash Image with these settings:

from google.genai.types import GenerateContentConfig, ImageConfig

GEMINI_2_5_FLASH_IMAGE = "gemini-2.5-flash-image"
RESPONSE_MODALITIES = ["IMAGE"]
ASPECT_RATIO = "16:9"  # Generates 1344×768 images

GENERATION_CONFIG = GenerateContentConfig(
    response_modalities=RESPONSE_MODALITIES,
    image_config=ImageConfig(aspect_ratio=ASPECT_RATIO),
)

Building Your Generation Function

Create a robust generation function with retry logic:

from PIL.Image import Image as PIL_Image

def generate_content(sources: list[PIL_Image], prompt: str) -> PIL_Image | None:
    contents = [*sources, prompt] if sources else prompt

    # get_retrier() is a notebook helper that yields retry attempts
    # (e.g. via tenacity) to absorb transient API errors.
    response = None
    for attempt in get_retrier():
        with attempt:
            response = client.models.generate_content(
                model=GEMINI_2_5_FLASH_IMAGE,
                contents=contents,
                config=GENERATION_CONFIG,
            )

    # Extract and return the first generated image, if any
    if response and response.candidates:
        content = response.candidates[0].content
        if content and content.parts:
            for part in content.parts:
                if sdk_image := part.as_image():
                    return sdk_image._pil_image
    return None
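The extraction logic above returns the first image part found, or None when generation fails. A small offline check makes that behavior easy to verify; the Fake* stub classes below are hypothetical stand-ins for the SDK's response types, not part of the google-genai library:

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins mirroring the shape of the SDK's response objects.
@dataclass
class FakePart:
    image: object = None
    def as_image(self):
        return self.image

@dataclass
class FakeContent:
    parts: list = field(default_factory=list)

@dataclass
class FakeCandidate:
    content: FakeContent = None

@dataclass
class FakeResponse:
    candidates: list = field(default_factory=list)

def first_image(response):
    # The same traversal generate_content uses (minus the SDK-to-PIL unwrap):
    # the first image part wins; anything else yields None.
    if response and response.candidates:
        content = response.candidates[0].content
        if content and content.parts:
            for part in content.parts:
                if image := part.as_image():
                    return image
    return None
```

Text-only parts (as_image() returning None) are skipped, which is why a text-plus-image response still yields the image.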

Creating Your Character Reference

Extract and Transform Your Character

Start with your archive image and create a character sheet:

source_ids = [AssetId.ARCHIVE]
prompt = """
- Scene: Robot character sheet.
- Left: Front view of the extracted robot.
- Right: Back view of the extracted robot (seamless back).
- The robot wears a small, brown-felt backpack with tiny polished-brass buckle.
- Background: Pure white.
- Text: Caption "ROBOT CHARACTER SHEET" at top, "FRONT VIEW" and "BACK VIEW" at bottom.
"""

# generate_image is the notebook's wrapper around generate_content: it resolves
# asset ids to images, runs the generation, and stores the result as AssetId.ROBOT.
generate_image(source_ids, prompt, AssetId.ROBOT)

This approach provides multiple benefits:

  • Preserves all visible character features
  • Adds new elements (like the backpack) consistently
  • Creates a reusable reference for future generations
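A minimal sketch of what such a generate_image wrapper might look like, with the generator passed in so the flow can be exercised without an API key (the AssetStore name and its exact shape are assumptions, not the notebook's actual code):

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class AssetStore:
    # asset id -> generated image (a PIL image in the real pipeline)
    images: dict[str, Any] = field(default_factory=dict)

    def generate_image(self, source_ids: list[str], prompt: str,
                       target_id: str, generate: Callable) -> Any:
        # Resolve ids to previously generated images, run generation, store result.
        sources = [self.images[source_id] for source_id in source_ids]
        image = generate(sources, prompt)
        if image is not None:
            self.images[target_id] = image
        return image
```

Storing every result under a stable id is what lets later scenes reference the character sheet by name instead of passing raw images around.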

Generating Consistent Scenes

Structure Your Prompts Effectively

Use this proven prompt structure:

source_ids = [AssetId.ROBOT, AssetId.PREVIOUS_SCENE]
prompt = """
- Image 1: Robot character sheet.
- Image 2: Previous scene.
- [Specific scene description]
- [Character positioning and actions]
- [Environmental details]
- [Lighting specifications]
"""

Key Prompting Strategies

Reference Your Sources: Always specify which image contains which elements:

  • “Image 1: Robot character sheet”
  • “Image 2: Previous scene”
  • “The robot from image 1 stands…”

Use Descriptive or Imperative Language:

  • Descriptive: “The robot stands on a mountain peak with arms raised”
  • Imperative: “Move the robot to the center and raise both arms”

Control Continuity: Explicitly state what changes:

  • “The robot no longer holds the map”
  • “Remove the ice axes from the previous scene”

Advanced Scene Composition

Spatial Transformations

Gemini understands 3D space and can perform complex transformations:

prompt = """
- Remove the ice axes.
- Move the center mountain to the left edge and add a taller blue mountain to the right.
- Suspend a felt bridge between the mountains with thick wooden planks.
- Place the robot on the bridge center, pointing toward the blue mountain.
"""

Lighting and Atmosphere Control

Change mood by specifying lighting:

  • “Studio lighting, clean and soft”
  • “Golden hour light, soft and diffused”
  • “Dramatic side lighting with long shadows”

Managing Your Asset Pipeline

Track Generation Dependencies

Structure your assets as a directed graph:

from collections.abc import Sequence
from dataclasses import dataclass

from PIL.Image import Image as PIL_Image

@dataclass
class Asset:
    id: str
    source_ids: Sequence[str]  # Parent assets
    prompt: str
    pil_image: PIL_Image

class Assets(dict[str, Asset]):
    def set_asset(self, asset: Asset) -> None:
        self[asset.id] = asset

Visualize Your Pipeline

Create visual documentation of your generation process:

import networkx as nx

def build_graph(assets: Assets) -> nx.DiGraph:
    graph = nx.DiGraph(assets=assets)
    for asset in assets.values():
        graph.add_node(asset.id, asset=asset)
        for source_id in asset.source_ids:
            graph.add_edge(source_id, asset.id)
    return graph
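A quick sanity check on the resulting graph: topological order should always place parent assets before the scenes generated from them. The asset ids below are hypothetical, mirroring the tutorial's robot story:

```python
import networkx as nx

# Hypothetical dependency map: each asset lists its parent assets.
parents = {
    "archive": [],
    "robot": ["archive"],
    "scene_1": ["robot"],
    "scene_2": ["robot", "scene_1"],
}

graph = nx.DiGraph()
for asset_id, source_ids in parents.items():
    graph.add_node(asset_id)
    for source_id in source_ids:
        graph.add_edge(source_id, asset_id)

# Parents always come before children in a topological order.
order = list(nx.topological_sort(graph))
```

This order is also the safe regeneration order if you ever rebuild the whole pipeline from scratch.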

Common Pitfalls and Solutions

Maintaining Character Consistency

Problem: Character features drift across generations.
Solution: Always include the character sheet as the first reference image.

Handling Complex Transformations

Problem: Multiple changes in one prompt create unpredictable results.
Solution: Break complex changes into iterative steps or create specialized reference sheets.

Managing Object Continuity

Problem: Objects from previous scenes appear randomly.
Solution: Explicitly state what happens to objects ("Remove the ice axes", "The robot no longer holds the map").

Production Considerations

Industrialize Your Pipeline

  • Automate regeneration when parent nodes change
  • Generate variations in parallel for different styles
  • Structure prompts with systematic parameters
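Automated regeneration follows directly from the dependency graph: when an asset changes, every descendant is stale and must be regenerated parents-first. A standard-library sketch (the regeneration_order name is an assumption; the graph is assumed acyclic):

```python
def regeneration_order(parents: dict[str, list[str]], changed: str) -> list[str]:
    # Invert the parent map: children[p] = assets generated directly from p.
    children: dict[str, list[str]] = {}
    for asset, sources in parents.items():
        for source in sources:
            children.setdefault(source, []).append(asset)

    # Collect all descendants of the changed asset (these are stale).
    stale: set[str] = set()
    stack = [changed]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            if child not in stale:
                stale.add(child)
                stack.append(child)

    # Order stale assets so parents regenerate before children (Kahn-style).
    order: list[str] = []
    remaining = set(stale)
    while remaining:
        ready = sorted(a for a in remaining
                       if all(p not in remaining for p in parents.get(a, [])))
        order.extend(ready)
        remaining -= set(ready)
    return order
```

Editing the character sheet, for example, would schedule every downstream scene for regeneration, while editing the final scene schedules nothing.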

Optimize for Scale

  • Save prompts and dependencies in image metadata
  • Use consistent naming conventions for assets
  • Implement version control for generated assets

Quality Control

  • Test character consistency across different scenarios
  • Validate spatial relationships in complex scenes
  • Maintain prompt libraries for reusable components

Next Steps

Start with a simple archive image and character extraction. Build your first scene, then expand the story iteratively. Focus on one transformation at a time until you understand how Gemini responds to your prompting style.

The complete source code and examples are available in the official notebook. Experiment with Google AI Studio to test your prompts before building your pipeline.