Available versions:
- gemini-3-pro-image-preview (pay‑as‑you‑go mode)
- gemini-3-pro-image-preview-bs (per‑use billing mode, recommended)
You can try this model directly in the AI Studio module on JuheNext.
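As a minimal sketch, the two version identifiers listed above could be kept in a small lookup so that calling code defaults to the recommended one. Only the two model IDs come from the list; the billing-mode keys (`pay_as_you_go`, `per_use`) and the `pick_model_id` helper are hypothetical names chosen for illustration, not part of any JuheNext API.

```python
# Model identifiers as listed above; the dictionary keys are hypothetical labels.
MODEL_IDS = {
    "pay_as_you_go": "gemini-3-pro-image-preview",
    "per_use": "gemini-3-pro-image-preview-bs",  # marked as recommended above
}


def pick_model_id(billing_mode: str = "per_use") -> str:
    """Return the model identifier for a billing mode.

    Defaults to the per-use variant because the listing marks it as recommended.
    """
    try:
        return MODEL_IDS[billing_mode]
    except KeyError:
        raise ValueError(f"unknown billing mode: {billing_mode!r}") from None


if __name__ == "__main__":
    print(pick_model_id())
    print(pick_model_id("pay_as_you_go"))
```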
Nano Banana Pro is presented as the successor to Nano Banana (Gemini 2.5 Flash Image), extending the original model’s image editing capabilities with a stronger focus on understanding, reasoning, and real‑world knowledge. Where Nano Banana already supported tasks such as restoring old photos or creating playful figurines, Nano Banana Pro is designed to handle more complex visualizations, richer compositions, and tighter control over the final image.
Built on Gemini 3 Pro and grounded in real‑world knowledge
Nano Banana Pro is built on Gemini 3 Pro and aims to bring Gemini’s reasoning and knowledge into image generation and editing. According to the description, this allows the model to:
- Interpret user instructions in a more nuanced way.
- Incorporate factual and real‑time information into images.
- Produce visuals that are not only aesthetically pleasing, but also contextually aligned with the user’s content or real‑world facts.
The model can connect to Google Search so that generated images can reflect current information, such as:
- Real‑time weather data for a grounded weather infographic.
- Up‑to‑date sports or other live information.
- Recipe details, visualized as step‑by‑step guides.
In practice, this makes Nano Banana Pro suitable for turning text or data into visual explanations, rather than producing only decorative images.
Visualizing ideas, concepts, and information
Nano Banana Pro is positioned as a general‑purpose tool for visualizing ideas and designs. It can be used to generate images for:
- Early‑stage prototypes and product concepts.
- Data visualizations and infographics.
- Diagrams derived from textual or handwritten notes.
Given a subject or a piece of content, the model can create “context‑rich” diagrams and infographics. Examples in the description include:
- An infographic about a houseplant, highlighting its origins, care essentials, and growth patterns.
- A step‑by‑step visual guide to making elaichi chai (cardamom tea), showing that the model can turn real‑world recipes into structured visual instructions.
- A pop‑art‑style weather infographic, generated by grounding the image in live weather data via Google Search.
These scenarios illustrate the model’s intended ability to combine three things in a single image:
- Content understanding (what the text or data is about).
- Factual grounding (information drawn from current or general knowledge).
- Visual design (layout, icons, and composition).
Text inside images: legible, expressive, and multilingual
A central focus of Nano Banana Pro is text rendered directly inside images. The product description presents it as the best model in its family for producing:
- Correctly rendered text (e.g., fewer spelling errors or broken letters).
- Legible text, including longer passages, not just short labels.
- Text in multiple languages, using Gemini’s multilingual reasoning.
This is intended to be useful for:
- Posters, mockups, storyboards, and other layouts that rely on clear typography.
- Visual content that combines illustration and copy, such as campaign concepts or book covers.
- Designs that require multiple languages or localized variants.
The examples highlight several aspects of its text capabilities:
- Storyboards: Generating a black‑and‑white storyboard with multiple panels (establishing shot, medium shot, close‑up, POV), showing that the model can organize text and visual beats in sequence.
- Integrated typography: Incorporating the word “BERLIN” into the facades of buildings along a street, while keeping the buildings recognizable as houses and the letter forms subtle.
- Expressive lettering: Creating minimalistic word‑logos where the shape of each word visually reflects its meaning (e.g., onomatopoeic words like “crash” or “roar”), using different textures and font styles.
- Translation in designs: Taking a set of beverage cans with English text and translating all the English wording into Korean, while keeping everything else in the image unchanged.
- Retro graphic design: Designing a “TYPOGRAPHY” graphic with bold, condensed letters in overlapping bright blue and pink layers, halftone dot patterns, and a retro print aesthetic.
- Text as material: Rendering the tongue‑twister “How much wood would a woodchuck chuck if a woodchuck could chuck wood” with the words formed from pieces of wood in a wood‑chopping scene.
Taken together, these examples show that Nano Banana Pro is designed not only to “write” text in images, but also to:
- Handle longer and denser passages of text.
- Work with a variety of visual styles (vector, textured, calligraphic).
- Combine text with textures, materials, and scenes in creative ways.
- Support multilingual output and translation inside the image itself.
High‑fidelity multi‑image composition and character consistency
Nano Banana Pro also extends composition and consistency features. The model is described as being able to:
- Blend up to 14 input images into a single coherent scene.
- Maintain the consistency and resemblance of up to 5 people across a composition.
This is intended to help bridge the gap between idea and final visual by allowing users to:
- Turn hand‑drawn sketches into more polished product images.
- Convert architectural or product blueprints into photorealistic or 3D‑style renderings.
- Build composite lifestyle scenes by combining separate objects, people, and environments.
- Keep branding or character style consistent across multiple elements.
The provided examples illustrate these capabilities:
- 14 characters in one scene: Fourteen fluffy characters from different inputs placed together on a sofa and floor, watching a vintage TV in a cozy, dimly lit living room. Despite the number of characters, their appearance and textures are kept consistent in a single, complex composition.
- Lifestyle composites: Combining separate photos (for example, a gown, plants, and a chair) into a unified cinematic scene, while also changing the dress on the mannequin to match a specified reference dress.
- Surreal environments: Building a futuristic sunset landscape by combining multiple input images and arranging them into a wide‑format (16:9) cinematic scene.
- Fashion editorial with people and a dog: Integrating five people and a dog from different photos into a single fashion‑editorial‑style image, while preserving:
  - The identity and clothing of each person.
  - Natural‑looking lighting and color across all subjects.
  - Plausible variation in camera angle and distance.
These examples demonstrate the model’s intended ability to manage many elements at once while maintaining visual continuity, especially for human subjects.
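The stated limits in this section (up to 14 input images, and consistency for up to 5 people) lend themselves to a simple pre-flight check before a composition job is submitted. The following is a minimal sketch under that assumption. The `CompositionRequest` class, its field names, and `check_composition_limits` are hypothetical and do not correspond to any documented request format; only the two numeric limits come from the description.

```python
from dataclasses import dataclass, field

# Limits taken from the description above.
MAX_INPUT_IMAGES = 14
MAX_CONSISTENT_PEOPLE = 5


@dataclass
class CompositionRequest:
    """Hypothetical container for a multi-image composition job."""
    prompt: str
    image_paths: list[str] = field(default_factory=list)
    people_count: int = 0  # number of distinct people whose likeness should be preserved


def check_composition_limits(request: CompositionRequest) -> list[str]:
    """Return human-readable problems; an empty list means the request is within limits."""
    problems = []
    if len(request.image_paths) > MAX_INPUT_IMAGES:
        problems.append(
            f"{len(request.image_paths)} input images exceeds the stated maximum of {MAX_INPUT_IMAGES}"
        )
    if request.people_count > MAX_CONSISTENT_PEOPLE:
        problems.append(
            f"{request.people_count} people exceeds the stated maximum of {MAX_CONSISTENT_PEOPLE}"
        )
    return problems


if __name__ == "__main__":
    request = CompositionRequest(
        prompt="Fashion editorial with five people and a dog",
        image_paths=[f"photo_{i}.png" for i in range(6)],
        people_count=5,
    )
    print(check_composition_limits(request))
```

Returning a list of problems rather than raising on the first one lets a caller report every violated limit at once.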
Studio‑style creative controls and output formats
Nano Banana Pro includes more advanced controls so that users can refine images beyond a single “one‑shot” generation. The model supports:
- Localized editing:
  - Selecting and transforming specific regions of an image.
  - Refining details in only part of the scene without altering the rest.
- Camera and focus adjustments:
  - Changing camera angles to alter the perspective.
  - Shifting focus between foreground and background elements.
- Color and lighting control:
  - Applying sophisticated color grading to adjust the mood or style.
  - Transforming scene lighting, such as:
    - Converting a daytime scene to night.
    - Adding effects like bokeh (background blur) for a shallow‑depth‑of‑field look.
- Flexible output formats:
  - Supporting a variety of aspect ratios suitable for different platforms (e.g., wide cinematic formats or square and vertical formats).
  - Offering higher‑resolution outputs, including 2K and 4K options, which are suitable for detailed viewing and print‑ready assets.
These tools are intended to give creators more control over the final visual result, letting them iterate and refine images for uses ranging from social media to more formal or professional presentations.
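When an existing image or layout has to be matched to one of several output aspect ratios, a small helper can pick the closest one. The sketch below is an illustration only: the candidate list (`1:1`, `16:9`, `9:16`) is an assumption that mirrors the wide, square, and vertical formats mentioned above, not a confirmed list of what the model supports, and `closest_aspect_ratio` is a hypothetical helper name.

```python
import math


def parse_ratio(ratio: str) -> float:
    """Convert a 'W:H' string such as '16:9' into a width/height float."""
    width, height = ratio.split(":")
    return float(width) / float(height)


def closest_aspect_ratio(width: int, height: int, candidates: list[str]) -> str:
    """Pick the candidate ratio closest to width/height.

    Distance is measured on a log scale so that being too wide by some factor
    and too tall by the same factor count as equally far away.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if not candidates:
        raise ValueError("at least one candidate ratio is required")
    target = math.log(width / height)
    return min(candidates, key=lambda r: abs(math.log(parse_ratio(r)) - target))


if __name__ == "__main__":
    # Illustrative candidates only; check the platform's documentation for the real set.
    candidates = ["1:1", "16:9", "9:16"]
    print(closest_aspect_ratio(1920, 1080, candidates))
    print(closest_aspect_ratio(1080, 1920, candidates))
```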
From Nano Banana to Nano Banana Pro
In summary, Nano Banana Pro (Gemini 3 Pro Image) builds on the earlier Nano Banana model by:
- Adding deeper reasoning and factual grounding through Gemini 3 Pro and Google Search.
- Emphasizing accurate, legible, and stylistically flexible text inside images, including multilingual and translated text.
- Extending composition capabilities to handle many input images and maintain consistency across multiple people and elements.
- Providing more granular creative controls for editing and finishing images, including localized edits, camera and focus adjustments, color grading, and lighting changes.
Nano Banana Pro is thus positioned as a model that helps users turn a wide range of ideas, from rough sketches and handwritten notes to live data and complex design briefs, into tailored visual content.