Generating Consistent Imagery with Gemini: A Practical Guide to Building a Prompt-Based Generation Pipeline
You can transform archive photos into consistent character-driven stories using Gemini 2.5 Flash Image (Nano Banana). This guide demonstrates how to build a prompt-based pipeline that preserves character features across multiple scenes and transformations.
The Challenge: Breathing Life into Visual Archives
Traditional image editing requires specialized skills and complex tools, so many archive images remain unused because modifying them takes too much effort. Gemini 2.5 Flash Image lowers that barrier: it follows natural-language instructions, understands spatial relationships, and maintains visual consistency across generations.
We’ll complete this three-step process:
- Start with an archive image
- Extract a character to create a reference sheet
- Generate a story sequence using only prompts
Setting Up Your Pipeline
Essential Dependencies
Install the required packages:
Configure the Client
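A minimal sketch of client setup, assuming the Google Gen AI Python SDK (installed with `pip install google-genai`) and an API key stored in an environment variable. The variable name `GEMINI_API_KEY` and the helper names `get_api_key` and `make_client` are illustrative choices, not taken from the original notebook; a Vertex AI project would be configured differently.

```python
import os


def get_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Read the API key from the environment and fail early if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


def make_client():
    """Create a Gen AI client (requires: pip install google-genai)."""
    # Imported lazily so the rest of the pipeline can be loaded and tested
    # without the SDK installed.
    from google import genai

    return genai.Client(api_key=get_api_key())
```

Keeping the key lookup separate from client creation makes a missing key fail with a clear message before any request is sent.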
Model Configuration
Use Gemini 2.5 Flash Image with these settings:
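One way to keep the model settings in a single place is sketched below. The model ID is an assumption (preview releases carried a `-preview` suffix, so check the current model list), and `GenerationSettings` and `build_config` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GenerationSettings:
    # Assumed model ID; verify it against the currently published model list.
    model: str = "gemini-2.5-flash-image"
    # Request both modalities so the model can return text next to the image.
    response_modalities: Tuple[str, ...] = ("TEXT", "IMAGE")


def build_config(settings: GenerationSettings):
    """Translate the settings into the SDK's config object."""
    from google.genai import types  # lazy import, needs google-genai

    return types.GenerateContentConfig(
        response_modalities=list(settings.response_modalities)
    )
```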
Building Your Generation Function
Create a robust generation function with retry logic:
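The sketch below shows one possible shape for such a function: it retries with exponential backoff and returns the first image found in the response. It is not the notebook's implementation. The `call_model` argument is a hypothetical zero-argument callable that wraps the actual request, which keeps the retry logic testable without network access.

```python
import time
from typing import Callable, Optional


def extract_image_bytes(response) -> Optional[bytes]:
    """Return the bytes of the first inline image in a response, if any."""
    for candidate in getattr(response, "candidates", None) or []:
        content = getattr(candidate, "content", None)
        for part in getattr(content, "parts", None) or []:
            inline = getattr(part, "inline_data", None)
            if inline is not None and inline.data:
                return inline.data
    return None


def generate_image(
    call_model: Callable[[], object],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call the model, retrying on errors and on responses without an image."""
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        try:
            image = extract_image_bytes(call_model())
            if image is not None:
                return image
            last_error = RuntimeError("response contained no image part")
        except Exception as exc:  # broad on purpose; narrow this in production
            last_error = exc
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError(f"generation failed after {max_attempts} attempts") from last_error


# Example wiring (assumes a configured client, model ID, contents, and config):
# image = generate_image(lambda: client.models.generate_content(
#     model=model_id, contents=contents, config=config))
```

Treating a text-only response as a retryable failure is a design choice here; a stricter pipeline might log the returned text and stop instead.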
Creating Your Character Reference
Extract and Transform Your Character
Start with your archive image and create a character sheet:
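As an illustration, a character-sheet prompt could be assembled like this. The wording is a guess at a reasonable prompt, not the one used in the original notebook, and `character_sheet_prompt` is a hypothetical helper.

```python
from typing import Sequence


def character_sheet_prompt(
    character: str,
    additions: Sequence[str] = (),
    views: Sequence[str] = ("front view", "back view"),
) -> str:
    """Build a prompt that extracts a character and lays it out as a sheet."""
    lines = [
        f"Extract the {character} from the image and create a character sheet.",
        f"Show the {character} in these views: {', '.join(views)}.",
        "Preserve every visible feature of the original exactly.",
    ]
    if additions:
        lines.append(f"Add the following new elements: {', '.join(additions)}.")
    lines.append("Plain white background, studio lighting, no other objects.")
    return "\n".join(lines)


# Example: a robot that gains a backpack on its sheet.
prompt = character_sheet_prompt("robot", additions=["a backpack"])
```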
This approach provides multiple benefits:
- Preserves all visible character features
- Adds new elements (like the backpack) consistently
- Creates a reusable reference for future generations
Generating Consistent Scenes
Structure Your Prompts Effectively
Use a consistent prompt structure for every scene:
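One possible structure, sketched under the assumption that each prompt has three parts: a numbered list of reference images, a scene description, and an optional list of explicit changes. The section labels `Scene:` and `Changes:` and the function `build_scene_prompt` are illustrative, not a format the model requires.

```python
from typing import Sequence


def build_scene_prompt(
    references: Sequence[str],
    scene: str,
    changes: Sequence[str] = (),
) -> str:
    """Combine reference labels, a scene description, and continuity notes."""
    # Number the references from 1 so the scene text can say "image 1".
    labels = [f"Image {i}: {label}" for i, label in enumerate(references, start=1)]
    sections = ["\n".join(labels), f"Scene: {scene}"]
    if changes:
        sections.append("Changes: " + " ".join(changes))
    return "\n\n".join(sections)


prompt = build_scene_prompt(
    references=["Robot character sheet", "Previous scene"],
    scene="The robot from image 1 stands on a mountain peak with arms raised.",
    changes=["The robot no longer holds the map."],
)
```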
Key Prompting Strategies
Reference Your Sources: Always specify which image contains which elements:
- “Image 1: Robot character sheet”
- “Image 2: Previous scene”
- “The robot from image 1 stands…”
Use Descriptive or Imperative Language:
- Descriptive: “The robot stands on a mountain peak with arms raised”
- Imperative: “Move the robot to the center and raise both arms”
Control Continuity: Explicitly state what changes:
- “The robot no longer holds the map”
- “Remove the ice axes from the previous scene”
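These strategies rely on the prompt and the attached images agreeing with each other. The sketch below assembles a request payload with a label before each image and rejects prompts that mention an image number that was not supplied. Interleaving labels and images is one convention among several, and `build_contents` is a hypothetical helper; the image objects are passed through unchanged, so they can be whatever the SDK call accepts.

```python
import re
from typing import Any, List, Sequence, Set, Tuple


def referenced_image_numbers(prompt: str) -> Set[int]:
    """Find every 'image N' mention in the prompt (case-insensitive)."""
    return {int(n) for n in re.findall(r"\bimage\s+(\d+)\b", prompt, flags=re.IGNORECASE)}


def build_contents(labelled_images: Sequence[Tuple[str, Any]], prompt: str) -> List[Any]:
    """Return [label 1, image 1, label 2, image 2, ..., prompt]."""
    count = len(labelled_images)
    missing = sorted(n for n in referenced_image_numbers(prompt) if not 1 <= n <= count)
    if missing:
        raise ValueError(f"prompt refers to images {missing}, but only {count} supplied")
    contents: List[Any] = []
    for i, (label, image) in enumerate(labelled_images, start=1):
        contents.append(f"Image {i}: {label}")
        contents.append(image)
    contents.append(prompt)
    return contents
```

Failing before the request is sent is cheaper than discovering a dangling reference in a generated image.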
Advanced Scene Composition
Spatial Transformations
Gemini reasons about 3D space well enough to carry out complex transformations, such as moving the character to a new position in the scene or changing its pose.
Lighting and Atmosphere Control
Change mood by specifying lighting:
- “Studio lighting, clean and soft”
- “Golden hour light, soft and diffused”
- “Dramatic side lighting with long shadows”
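The lighting phrases above can be kept as named presets so that the same scene is easy to render in several moods. This is a small sketch; the preset keys and the helpers `with_lighting` and `lighting_variations` are illustrative names.

```python
from typing import Dict

LIGHTING: Dict[str, str] = {
    "studio": "Studio lighting, clean and soft",
    "golden_hour": "Golden hour light, soft and diffused",
    "dramatic": "Dramatic side lighting with long shadows",
}


def with_lighting(scene: str, lighting: str) -> str:
    """Append a lighting preset to a scene description."""
    if lighting not in LIGHTING:
        raise KeyError(f"unknown lighting preset: {lighting!r}")
    # Normalize the scene's trailing punctuation before appending.
    return f"{scene.rstrip('. ')}. {LIGHTING[lighting]}."


def lighting_variations(scene: str) -> Dict[str, str]:
    """Return one prompt per preset for the same scene."""
    return {name: with_lighting(scene, name) for name in LIGHTING}
```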
Managing Your Asset Pipeline
Track Generation Dependencies
Structure your assets as a directed graph:
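A minimal sketch of such a graph using only the standard library (`graphlib` needs Python 3.9 or later). Each asset records the prompt that produced it and the names of the assets it was generated from; the `Asset` class and the asset names are hypothetical.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Dict, List


@dataclass
class Asset:
    name: str
    prompt: str = ""
    parents: List[str] = field(default_factory=list)  # names of input assets


def generation_order(assets: Dict[str, Asset]) -> List[str]:
    """Return asset names so that every parent comes before its children."""
    graph = {name: set(asset.parents) for name, asset in assets.items()}
    # static_order() raises graphlib.CycleError if the graph has a cycle.
    return list(TopologicalSorter(graph).static_order())


assets = {
    a.name: a
    for a in [
        Asset("archive"),
        Asset("sheet", "Create a character sheet", ["archive"]),
        Asset("scene_1", "The robot stands on a mountain peak", ["sheet"]),
        Asset("scene_2", "The robot no longer holds the map", ["sheet", "scene_1"]),
    ]
}
order = generation_order(assets)
```

Because every scene lists the character sheet as a parent, the graph also documents which generations depend on it.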
Visualize Your Pipeline
Create visual documentation of your generation process:
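One lightweight option, which swaps in a different technique from a plotting library, is to export the dependency graph as Graphviz DOT text and render it with the `dot` command-line tool. The sketch assumes the graph is available as a mapping from each asset name to its parent names; `to_dot` is a hypothetical helper.

```python
from typing import Dict, Iterable


def to_dot(parents: Dict[str, Iterable[str]], name: str = "pipeline") -> str:
    """Render a child -> parents mapping as a left-to-right DOT digraph."""
    lines = [f"digraph {name} {{", "  rankdir=LR;"]
    for child, its_parents in parents.items():
        lines.append(f'  "{child}";')  # declare the node even if it has no edges
        for parent in its_parents:
            lines.append(f'  "{parent}" -> "{child}";')
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot({
    "archive": [],
    "sheet": ["archive"],
    "scene_1": ["sheet"],
})
```

Saving `dot_text` to `pipeline.dot` and running `dot -Tpng pipeline.dot -o pipeline.png` produces an image, provided Graphviz is installed.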
Common Pitfalls and Solutions
Maintaining Character Consistency
Problem: Character features drift across generations
Solution: Always include the character sheet as the first reference image
Handling Complex Transformations
Problem: Multiple changes in one prompt create unpredictable results
Solution: Break complex changes into iterative steps or create specialized reference sheets
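Iterative steps can be sketched as a loop in which each output becomes the input of the next prompt, with the character sheet always passed first. The `generate` argument is a hypothetical callable that takes a list of images and a prompt and returns a new image; it stands in for whatever generation function the pipeline uses.

```python
from typing import Any, Callable, List, Sequence


def apply_steps(
    generate: Callable[[List[Any], str], Any],
    character_sheet: Any,
    start_image: Any,
    steps: Sequence[str],
) -> List[Any]:
    """Apply one change per generation and return every intermediate image."""
    results: List[Any] = []
    current = start_image
    for step in steps:
        # Character sheet first, latest frame second, then a single change.
        current = generate([character_sheet, current], step)
        results.append(current)
    return results
```

Keeping the intermediate images makes it possible to restart from the last good step when one change goes wrong.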
Managing Object Continuity
Problem: Objects from previous scenes appear randomly
Solution: Explicitly state what happens to objects (“Remove the ice axes”, “The robot no longer holds the map”)
Production Considerations
Industrialize Your Pipeline
- Automate regeneration when parent nodes change
- Generate variations in parallel for different styles
- Structure prompts with systematic parameters
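The first two points can be sketched with the standard library: a breadth-first walk finds every asset downstream of a changed node, and a thread pool runs independent variations concurrently. Both helpers are hypothetical, and `generate` stands in for the pipeline's own generation function.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence


def stale_assets(parents: Dict[str, Sequence[str]], changed: str) -> List[str]:
    """List every descendant of `changed` in breadth-first order."""
    children: Dict[str, List[str]] = {}
    for child, its_parents in parents.items():
        for parent in its_parents:
            children.setdefault(parent, []).append(child)
    stale: List[str] = []
    seen = {changed}
    queue = deque([changed])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                stale.append(child)
                queue.append(child)
    return stale


def generate_variations(
    generate: Callable[[str], object],
    prompts: Dict[str, str],
    max_workers: int = 4,
) -> Dict[str, object]:
    """Run one generation per named prompt in parallel threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so names and results line up.
        return dict(zip(prompts.keys(), pool.map(generate, prompts.values())))
```

The breadth-first order is not guaranteed to be a valid regeneration order for every graph, so sort the stale assets topologically before regenerating them.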
Optimize for Scale
- Save prompts and dependencies in image metadata
- Use consistent naming conventions for assets
- Implement version control for generated assets
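As a standard-library alternative to embedding metadata inside the image file, the sketch below writes a JSON sidecar next to each image with its prompt, its parents, and a content hash. Embedding the same record in the image itself would need an imaging library. The file layout, the `.png` extension, and the `save_asset` helper are assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path
from typing import Sequence


def save_asset(
    directory: Path,
    name: str,
    image_bytes: bytes,
    prompt: str,
    parents: Sequence[str] = (),
) -> Path:
    """Write <name>.png plus a <name>.json sidecar and return the sidecar path."""
    directory.mkdir(parents=True, exist_ok=True)
    (directory / f"{name}.png").write_bytes(image_bytes)
    record = {
        "name": name,
        "prompt": prompt,
        "parents": list(parents),
        # The hash identifies this exact version of the image.
        "sha256": hashlib.sha256(image_bytes).hexdigest(),
    }
    sidecar = directory / f"{name}.json"
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Plain-text sidecars diff cleanly, which helps when the assets are tracked in version control.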
Quality Control
- Test character consistency across different scenarios
- Validate spatial relationships in complex scenes
- Maintain prompt libraries for reusable components
Next Steps
Start with a simple archive image and character extraction. Build your first scene, then expand the story iteratively. Focus on one transformation at a time until you understand how Gemini responds to your prompting style.
The complete source code and examples are available in the official notebook. Experiment with Google AI Studio to test your prompts before building your pipeline.