Generating Consistent Imagery with Gemini: A Practical Guide to Building a Prompt-Based Generation Pipeline

You can transform archive photos into consistent character-driven stories using Gemini 2.5 Flash Image (Nano Banana). This guide demonstrates how to build a prompt-based pipeline that preserves character features across multiple scenes and transformations.

The Challenge: Breathing Life into Visual Archives

Traditional image editing requires specialized skills and complex tools. Most archive images remain unused because modifying them seems impossible. Gemini 2.5 Flash Image changes this by understanding spatial relationships and maintaining visual consistency across generations.

We’ll complete this three-step process:

  1. Start with an archive image
  2. Extract a character to create a reference sheet
  3. Generate a story sequence using only prompts

Setting Up Your Pipeline

Essential Dependencies

Install the required packages:

%pip install --quiet "google-genai>=1.40.0" "networkx[default]"

Configure the Client

from google import genai

client = genai.Client()

Model Configuration

Use Gemini 2.5 Flash Image with these settings:

from google.genai.types import GenerateContentConfig, ImageConfig

GEMINI_2_5_FLASH_IMAGE = "gemini-2.5-flash-image"
RESPONSE_MODALITIES = ["IMAGE"]
ASPECT_RATIO = "16:9"  # Generates 1344×768 images

GENERATION_CONFIG = GenerateContentConfig(
    response_modalities=RESPONSE_MODALITIES,
    image_config=ImageConfig(aspect_ratio=ASPECT_RATIO),
)

Building Your Generation Function

Create a robust generation function with retry logic:

from PIL.Image import Image as PIL_Image

def generate_content(sources: list[PIL_Image], prompt: str) -> PIL_Image | None:
    contents = [*sources, prompt] if sources else prompt

    # get_retrier() is a notebook helper that yields retry attempts
    # (e.g. via tenacity) to absorb transient API errors.
    response = None
    for attempt in get_retrier():
        with attempt:
            response = client.models.generate_content(
                model=GEMINI_2_5_FLASH_IMAGE,
                contents=contents,
                config=GENERATION_CONFIG,
            )

    # Extract and return the first generated image, if any
    if response and response.candidates:
        content = response.candidates[0].content
        if content and content.parts:
            for part in content.parts:
                if sdk_image := part.as_image():
                    return sdk_image._pil_image
    return None
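The extraction logic above returns the first image part found, or None when generation fails. A small offline check makes that behavior easy to verify; the Fake* stub classes below are hypothetical stand-ins for the SDK's response types, not part of the google-genai library:

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins mirroring the shape of the SDK's response objects.
@dataclass
class FakePart:
    image: object = None
    def as_image(self):
        return self.image

@dataclass
class FakeContent:
    parts: list = field(default_factory=list)

@dataclass
class FakeCandidate:
    content: FakeContent = None

@dataclass
class FakeResponse:
    candidates: list = field(default_factory=list)

def first_image(response):
    # The same traversal generate_content uses (minus the SDK-to-PIL unwrap):
    # the first image part wins; anything else yields None.
    if response and response.candidates:
        content = response.candidates[0].content
        if content and content.parts:
            for part in content.parts:
                if image := part.as_image():
                    return image
    return None
```

Text-only parts (as_image() returning None) are skipped, which is why a text-plus-image response still yields the image.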

Creating Your Character Reference

Extract and Transform Your Character

Start with your archive image and create a character sheet:

source_ids = [AssetId.ARCHIVE]
prompt = """
- Scene: Robot character sheet.
- Left: Front view of the extracted robot.
- Right: Back view of the extracted robot (seamless back).
- The robot wears a small, brown-felt backpack with tiny polished-brass buckle.
- Background: Pure white.
- Text: Caption "ROBOT CHARACTER SHEET" at top, "FRONT VIEW" and "BACK VIEW" at bottom.
"""

# generate_image is the notebook's wrapper around generate_content: it resolves
# asset ids to images, runs the generation, and stores the result as AssetId.ROBOT.
generate_image(source_ids, prompt, AssetId.ROBOT)

This approach provides multiple benefits:

  • Preserves all visible character features
  • Adds new elements (like the backpack) consistently
  • Creates a reusable reference for future generations
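A minimal sketch of what such a generate_image wrapper might look like, with the generator passed in so the flow can be exercised without an API key (the AssetStore name and its exact shape are assumptions, not the notebook's actual code):

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class AssetStore:
    # asset id -> generated image (a PIL image in the real pipeline)
    images: dict[str, Any] = field(default_factory=dict)

    def generate_image(self, source_ids: list[str], prompt: str,
                       target_id: str, generate: Callable) -> Any:
        # Resolve ids to previously generated images, run generation, store result.
        sources = [self.images[source_id] for source_id in source_ids]
        image = generate(sources, prompt)
        if image is not None:
            self.images[target_id] = image
        return image
```

Storing every result under a stable id is what lets later scenes reference the character sheet by name instead of passing raw images around.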

Generating Consistent Scenes

Structure Your Prompts Effectively

Use this proven prompt structure:

source_ids = [AssetId.ROBOT, AssetId.PREVIOUS_SCENE]
prompt = """
- Image 1: Robot character sheet.
- Image 2: Previous scene.
- [Specific scene description]
- [Character positioning and actions]
- [Environmental details]
- [Lighting specifications]
"""

Key Prompting Strategies

Reference Your Sources: Always specify which image contains which elements:

  • “Image 1: Robot character sheet”
  • “Image 2: Previous scene”
  • “The robot from image 1 stands…”

Use Descriptive or Imperative Language:

  • Descriptive: “The robot stands on a mountain peak with arms raised”
  • Imperative: “Move the robot to the center and raise both arms”

Control Continuity: Explicitly state what changes:

  • “The robot no longer holds the map”
  • “Remove the ice axes from the previous scene”

Advanced Scene Composition

Spatial Transformations

Gemini understands 3D space and can perform complex transformations:

prompt = """
- Remove the ice axes.
- Move the center mountain to the left edge and add a taller blue mountain to the right.
- Suspend a felt bridge between the mountains with thick wooden planks.
- Place the robot on the bridge center, pointing toward the blue mountain.
"""

Lighting and Atmosphere Control

Change mood by specifying lighting:

  • “Studio lighting, clean and soft”
  • “Golden hour light, soft and diffused”
  • “Dramatic side lighting with long shadows”

Managing Your Asset Pipeline

Track Generation Dependencies

Structure your assets as a directed graph:

from collections.abc import Sequence
from dataclasses import dataclass

from PIL.Image import Image as PIL_Image

@dataclass
class Asset:
    id: str
    source_ids: Sequence[str]  # Parent assets
    prompt: str
    pil_image: PIL_Image

class Assets(dict[str, Asset]):
    def set_asset(self, asset: Asset) -> None:
        self[asset.id] = asset

Visualize Your Pipeline

Create visual documentation of your generation process:

import networkx as nx

def build_graph(assets: Assets) -> nx.DiGraph:
    graph = nx.DiGraph(assets=assets)
    for asset in assets.values():
        graph.add_node(asset.id, asset=asset)
        for source_id in asset.source_ids:
            graph.add_edge(source_id, asset.id)
    return graph
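A quick sanity check on the resulting graph: topological order should always place parent assets before the scenes generated from them. The asset ids below are hypothetical, mirroring the tutorial's robot story:

```python
import networkx as nx

# Hypothetical dependency map: each asset lists its parent assets.
parents = {
    "archive": [],
    "robot": ["archive"],
    "scene_1": ["robot"],
    "scene_2": ["robot", "scene_1"],
}

graph = nx.DiGraph()
for asset_id, source_ids in parents.items():
    graph.add_node(asset_id)
    for source_id in source_ids:
        graph.add_edge(source_id, asset_id)

# Parents always come before children in a topological order.
order = list(nx.topological_sort(graph))
```

This order is also the safe regeneration order if you ever rebuild the whole pipeline from scratch.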

Common Pitfalls and Solutions

Maintaining Character Consistency

Problem: Character features drift across generations.
Solution: Always include the character sheet as the first reference image.

Handling Complex Transformations

Problem: Multiple changes in one prompt create unpredictable results.
Solution: Break complex changes into iterative steps or create specialized reference sheets.

Managing Object Continuity

Problem: Objects from previous scenes appear randomly.
Solution: Explicitly state what happens to objects ("Remove the ice axes", "The robot no longer holds the map").

Production Considerations

Industrialize Your Pipeline

  • Automate regeneration when parent nodes change
  • Generate variations in parallel for different styles
  • Structure prompts with systematic parameters
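Automated regeneration follows directly from the dependency graph: when an asset changes, every descendant is stale and must be regenerated parents-first. A standard-library sketch (the regeneration_order name is an assumption; the graph is assumed acyclic):

```python
def regeneration_order(parents: dict[str, list[str]], changed: str) -> list[str]:
    # Invert the parent map: children[p] = assets generated directly from p.
    children: dict[str, list[str]] = {}
    for asset, sources in parents.items():
        for source in sources:
            children.setdefault(source, []).append(asset)

    # Collect all descendants of the changed asset (these are stale).
    stale: set[str] = set()
    stack = [changed]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            if child not in stale:
                stale.add(child)
                stack.append(child)

    # Order stale assets so parents regenerate before children (Kahn-style).
    order: list[str] = []
    remaining = set(stale)
    while remaining:
        ready = sorted(a for a in remaining
                       if all(p not in remaining for p in parents.get(a, [])))
        order.extend(ready)
        remaining -= set(ready)
    return order
```

Editing the character sheet, for example, would schedule every downstream scene for regeneration, while editing the final scene schedules nothing.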

Optimize for Scale

  • Save prompts and dependencies in image metadata
  • Use consistent naming conventions for assets
  • Implement version control for generated assets

Quality Control

  • Test character consistency across different scenarios
  • Validate spatial relationships in complex scenes
  • Maintain prompt libraries for reusable components

Next Steps

Start with a simple archive image and character extraction. Build your first scene, then expand the story iteratively. Focus on one transformation at a time until you understand how Gemini responds to your prompting style.

The complete source code and examples are available in the official notebook. Experiment with Google AI Studio to test your prompts before building your pipeline.