To avoid service disruption, update to a newer model (for example, gemini-2.5-flash-lite). Learn more.

Generate text using the Gemini API
You can ask a Gemini model to generate text from a text-only prompt or a multimodal prompt. When you use Firebase AI Logic, you can make this request directly from your app.
Multimodal prompts can include multiple types of input (like text along with images, PDFs, plain-text files, audio, and video).
This guide shows how to generate text from a text-only prompt and from a basic multimodal prompt that includes a file.
- Jump to code for text-only input
- Jump to code for multimodal input
- Jump to code for streamed responses
See other guides for additional options for working with text:
- Generate structured output
- Multi-turn chat
- Bidirectional streaming
- Generate text on-device
- Generate images from text
Before you begin
Click your Gemini API provider to view provider-specific content and code on this page.
If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen Gemini API provider, and create a GenerativeModel instance.
This guide assumes you're using the latest Firebase AI Logic SDKs. If you're still using the "Vertex AI in Firebase" SDKs, see the migration guide.
For testing and iterating on your prompts, we recommend using Google AI Studio.

Generate text from text-only input
Before trying this sample, complete the Before you begin section of this guide to set up your project and app. In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.
You can ask a Gemini model to generate text by prompting with text-only input.
Swift
You can call generateContent() to generate text from text-only input.
import FirebaseAI

// Initialize the Gemini Developer API backend service
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Create a `GenerativeModel` instance with a model that supports your use case
let model = ai.generativeModel(modelName: "gemini-2.5-flash")

// Provide a prompt that contains text
let prompt = "Write a story about a magic backpack."

// To generate text output, call generateContent with the text input
let response = try await model.generateContent(prompt)
print(response.text ?? "No text in response.")
Kotlin
You can call generateContent() to generate text from text-only input.
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel("gemini-2.5-flash")

// Provide a prompt that contains text
val prompt = "Write a story about a magic backpack."

// To generate text output, call generateContent with the text input
val response = model.generateContent(prompt)
print(response.text)
Java
You can call generateContent() to generate text from text-only input. For Java, the methods in this SDK return a ListenableFuture.

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel("gemini-2.5-flash");

// Use the GenerativeModelFutures Java compatibility layer which offers
// support for ListenableFuture and Publisher APIs
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a prompt that contains text
Content prompt = new Content.Builder()
        .addText("Write a story about a magic backpack.")
        .build();

// To generate text output, call generateContent with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
Web
You can call generateContent() to generate text from text-only input.
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, { model: "gemini-2.5-flash" });

// Wrap in an async function so you can use await
async function run() {
  // Provide a prompt that contains text
  const prompt = "Write a story about a magic backpack.";

  // To generate text output, call generateContent with the text input
  const result = await model.generateContent(prompt);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();
Dart
You can call generateContent() to generate text from text-only input.
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
final model = FirebaseAI.googleAI().generativeModel(model: 'gemini-2.5-flash');

// Provide a prompt that contains text
final prompt = [Content.text('Write a story about a magic backpack.')];

// To generate text output, call generateContent with the text input
final response = await model.generateContent(prompt);
print(response.text);
Unity
You can call GenerateContentAsync() to generate text from text-only input.
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
var ai = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI());

// Create a `GenerativeModel` instance with a model that supports your use case
var model = ai.GetGenerativeModel(modelName: "gemini-2.5-flash");

// Provide a prompt that contains text
var prompt = "Write a story about a magic backpack.";

// To generate text output, call GenerateContentAsync with the text input
var response = await model.GenerateContentAsync(prompt);
UnityEngine.Debug.Log(response.Text ?? "No text in response.");
Learn how to choose a model appropriate for your use case and app.
Generate text from text-and-file (multimodal) input
Before trying this sample, complete the Before you begin section of this guide to set up your project and app. In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.
You can ask a Gemini model to generate text by prompting with text and a file, providing each input file's mimeType and the file itself. Find requirements and recommendations for input files later on this page.
The following example shows the basics of how to generate text from a file input by analyzing a single video file provided as inline data (a base64-encoded file).
Note that this example shows providing the file inline, but the SDKs also support providing a YouTube URL.

Need a sample video file?
You can use this publicly available file with a MIME type of video/mp4 (view or download the file):
https://storage.googleapis.com/cloud-samples-data/video/animals.mp4
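If you want to feed this sample file to the inline-data examples below, one option is to download it to a local file first. The following is a minimal Kotlin sketch under that assumption; downloadSampleVideo is a hypothetical helper, not part of the SDK, and on Android it must run off the main thread.

import java.io.File
import java.net.URL

// Hypothetical helper: download the publicly available sample video so its
// bytes can be read for an inline-data prompt. Run this off the main thread
// (for example, in a coroutine on Dispatchers.IO).
fun downloadSampleVideo(targetDir: File): File {
    val url = URL("https://storage.googleapis.com/cloud-samples-data/video/animals.mp4")
    val outFile = File(targetDir, "animals.mp4")
    url.openStream().use { input ->
        outFile.outputStream().use { output -> input.copyTo(output) }
    }
    return outFile
}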
Important: The total request size limit is 20 MB. To send large files, review the options for providing files in multimodal requests.
Swift
You can call generateContent() to generate text from multimodal input of text and video files.
import FirebaseAI

// Initialize the Gemini Developer API backend service
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Create a `GenerativeModel` instance with a model that supports your use case
let model = ai.generativeModel(modelName: "gemini-2.5-flash")

// Provide the video as `Data` with the appropriate MIME type
let video = InlineDataPart(data: try Data(contentsOf: videoURL), mimeType: "video/mp4")

// Provide a text prompt to include with the video
let prompt = "What is in the video?"

// To generate text output, call generateContent with the text and video
let response = try await model.generateContent(video, prompt)
print(response.text ?? "No text in response.")
Kotlin
You can call generateContent() to generate text from multimodal input of text and video files.
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel("gemini-2.5-flash")

val contentResolver = applicationContext.contentResolver
contentResolver.openInputStream(videoUri).use { stream ->
    stream?.let {
        val bytes = stream.readBytes()

        // Provide a prompt that includes the video specified above and text
        val prompt = content {
            inlineData(bytes, "video/mp4")
            text("What is in the video?")
        }

        // To generate text output, call generateContent with the prompt
        val response = model.generateContent(prompt)
        Log.d(TAG, response.text ?: "")
    }
}
Java
You can call generateContent() to generate text from multimodal input of text and video files. For Java, the methods in this SDK return a ListenableFuture.

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel("gemini-2.5-flash");

// Use the GenerativeModelFutures Java compatibility layer which offers
// support for ListenableFuture and Publisher APIs
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

ContentResolver resolver = getApplicationContext().getContentResolver();
try (InputStream stream = resolver.openInputStream(videoUri)) {
    File videoFile = new File(new URI(videoUri.toString()));
    int videoSize = (int) videoFile.length();
    byte[] videoBytes = new byte[videoSize];
    if (stream != null) {
        stream.read(videoBytes, 0, videoBytes.length);
        stream.close();

        // Provide a prompt that includes the video specified above and text
        Content prompt = new Content.Builder()
                .addInlineData(videoBytes, "video/mp4")
                .addText("What is in the video?")
                .build();

        // To generate text output, call generateContent with the prompt
        ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
        Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
            @Override
            public void onSuccess(GenerateContentResponse result) {
                String resultText = result.getText();
                System.out.println(resultText);
            }

            @Override
            public void onFailure(Throwable t) {
                t.printStackTrace();
            }
        }, executor);
    }
} catch (IOException e) {
    e.printStackTrace();
} catch (URISyntaxException e) {
    e.printStackTrace();
}
Web
You can call generateContent() to generate text from multimodal input of text and video files.
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, { model: "gemini-2.5-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the video
  const prompt = "What do you see?";

  const fileInputEl = document.querySelector("input[type=file]");
  const videoPart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call generateContent with the text and video
  const result = await model.generateContent([prompt, videoPart]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();
Dart
You can call generateContent() to generate text from multimodal input of text and video files.
import 'dart:io';

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
final model = FirebaseAI.googleAI().generativeModel(model: 'gemini-2.5-flash');

// Provide a text prompt to include with the video
final prompt = TextPart("What's in the video?");

// Prepare video for input
final video = await File('video0.mp4').readAsBytes();

// Provide the video as `Data` with the appropriate mimetype
final videoPart = InlineDataPart('video/mp4', video);

// To generate text output, call generateContent with the text and video
final response = await model.generateContent([
  Content.multi([prompt, videoPart])
]);
print(response.text);
Unity
You can call GenerateContentAsync() to generate text from multimodal input of text and video files.
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
var ai = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI());

// Create a `GenerativeModel` instance with a model that supports your use case
var model = ai.GetGenerativeModel(modelName: "gemini-2.5-flash");

// Provide the video as `data` with the appropriate MIME type
var video = ModelContent.InlineData("video/mp4",
    System.IO.File.ReadAllBytes(System.IO.Path.Combine(
        UnityEngine.Application.streamingAssetsPath, "yourVideo.mp4")));

// Provide a text prompt to include with the video
var prompt = ModelContent.Text("What is in the video?");

// To generate text output, call GenerateContentAsync with the text and video
var response = await model.GenerateContentAsync(new[] { video, prompt });
UnityEngine.Debug.Log(response.Text ?? "No text in response.");
Learn how to choose a model appropriate for your use case and app.
Stream the response
Before trying this sample, complete the Before you begin section of this guide to set up your project and app. In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.
You can achieve faster interactions by not waiting for the entire result from the model generation, and instead use streaming to handle partial results. To stream the response, call generateContentStream.
View example: Stream generated text from text-only input
Swift
You can call generateContentStream() to stream generated text from text-only input.
import FirebaseAI

// Initialize the Gemini Developer API backend service
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Create a `GenerativeModel` instance with a model that supports your use case
let model = ai.generativeModel(modelName: "gemini-2.5-flash")

// Provide a prompt that contains text
let prompt = "Write a story about a magic backpack."

// To stream generated text output, call generateContentStream with the text input
let contentStream = try model.generateContentStream(prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}
Kotlin
You can call generateContentStream() to stream generated text from text-only input.
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel("gemini-2.5-flash")

// Provide a prompt that includes only text
val prompt = "Write a story about a magic backpack."

// To stream generated text output, call generateContentStream and pass in the prompt
var response = ""
model.generateContentStream(prompt).collect { chunk ->
    print(chunk.text)
    response += chunk.text
}
Java
You can call generateContentStream() to stream generated text from text-only input. For Java, the streaming methods in this SDK return a Publisher type from the Reactive Streams library.

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel("gemini-2.5-flash");

// Use the GenerativeModelFutures Java compatibility layer which offers
// support for ListenableFuture and Publisher APIs
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a prompt that contains text
Content prompt = new Content.Builder()
        .addText("Write a story about a magic backpack.")
        .build();

// To stream generated text output, call generateContentStream with the text input
Publisher<GenerateContentResponse> streamingResponse = model.generateContentStream(prompt);

// Subscribe to partial results from the response
final String[] fullResponse = {""};
streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
    @Override
    public void onNext(GenerateContentResponse generateContentResponse) {
        String chunk = generateContentResponse.getText();
        fullResponse[0] += chunk;
    }

    @Override
    public void onComplete() {
        System.out.println(fullResponse[0]);
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

    @Override
    public void onSubscribe(Subscription s) {
        // Request all chunks; a Reactive Streams publisher sends nothing
        // until the subscriber requests items
        s.request(Long.MAX_VALUE);
    }
});
Web
You can call generateContentStream() to stream generated text from text-only input.
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, { model: "gemini-2.5-flash" });

// Wrap in an async function so you can use await
async function run() {
  // Provide a prompt that contains text
  const prompt = "Write a story about a magic backpack.";

  // To stream generated text output, call generateContentStream with the text input
  const result = await model.generateContentStream(prompt);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }

  console.log('aggregated response: ', await result.response);
}

run();
Dart
You can call generateContentStream() to stream generated text from text-only input.
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
final model = FirebaseAI.googleAI().generativeModel(model: 'gemini-2.5-flash');

// Provide a prompt that contains text
final prompt = [Content.text('Write a story about a magic backpack.')];

// To stream generated text output, call generateContentStream with the text input
final response = model.generateContentStream(prompt);
await for (final chunk in response) {
  print(chunk.text);
}
Unity
You can call GenerateContentStreamAsync() to stream generated text from text-only input.
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
var ai = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI());

// Create a `GenerativeModel` instance with a model that supports your use case
var model = ai.GetGenerativeModel(modelName: "gemini-2.5-flash");

// Provide a prompt that contains text
var prompt = "Write a story about a magic backpack.";

// To stream generated text output, call GenerateContentStreamAsync with the text input
var responseStream = model.GenerateContentStreamAsync(prompt);
await foreach (var response in responseStream) {
  if (!string.IsNullOrWhiteSpace(response.Text)) {
    UnityEngine.Debug.Log(response.Text);
  }
}
Learn how to choose a model appropriate for your use case and app.
View example: Stream generated text from multimodal input
Swift
You can call generateContentStream() to stream generated text from multimodal input of text and a single video.
import FirebaseAI

// Initialize the Gemini Developer API backend service
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Create a `GenerativeModel` instance with a model that supports your use case
let model = ai.generativeModel(modelName: "gemini-2.5-flash")

// Provide the video as `Data` with the appropriate MIME type
let video = InlineDataPart(data: try Data(contentsOf: videoURL), mimeType: "video/mp4")

// Provide a text prompt to include with the video
let prompt = "What is in the video?"

// To stream generated text output, call generateContentStream with the text and video
let contentStream = try model.generateContentStream(video, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}
Kotlin
You can call generateContentStream() to stream generated text from multimodal input of text and a single video.
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel("gemini-2.5-flash")

val contentResolver = applicationContext.contentResolver
contentResolver.openInputStream(videoUri).use { stream ->
    stream?.let {
        val bytes = stream.readBytes()

        // Provide a prompt that includes the video specified above and text
        val prompt = content {
            inlineData(bytes, "video/mp4")
            text("What is in the video?")
        }

        // To stream generated text output, call generateContentStream with the prompt
        var fullResponse = ""
        model.generateContentStream(prompt).collect { chunk ->
            Log.d(TAG, chunk.text ?: "")
            fullResponse += chunk.text
        }
    }
}
Java
You can call generateContentStream() to stream generated text from multimodal input of text and a single video. For Java, the streaming methods in this SDK return a Publisher type from the Reactive Streams library.

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel("gemini-2.5-flash");

// Use the GenerativeModelFutures Java compatibility layer which offers
// support for ListenableFuture and Publisher APIs
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

ContentResolver resolver = getApplicationContext().getContentResolver();
try (InputStream stream = resolver.openInputStream(videoUri)) {
    File videoFile = new File(new URI(videoUri.toString()));
    int videoSize = (int) videoFile.length();
    byte[] videoBytes = new byte[videoSize];
    if (stream != null) {
        stream.read(videoBytes, 0, videoBytes.length);
        stream.close();

        // Provide a prompt that includes the video specified above and text
        Content prompt = new Content.Builder()
                .addInlineData(videoBytes, "video/mp4")
                .addText("What is in the video?")
                .build();

        // To stream generated text output, call generateContentStream with the prompt
        Publisher<GenerateContentResponse> streamingResponse =
                model.generateContentStream(prompt);

        final String[] fullResponse = {""};
        streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
            @Override
            public void onNext(GenerateContentResponse generateContentResponse) {
                String chunk = generateContentResponse.getText();
                fullResponse[0] += chunk;
            }

            @Override
            public void onComplete() {
                System.out.println(fullResponse[0]);
            }

            @Override
            public void onError(Throwable t) {
                t.printStackTrace();
            }

            @Override
            public void onSubscribe(Subscription s) {
                // Request all chunks; a Reactive Streams publisher sends nothing
                // until the subscriber requests items
                s.request(Long.MAX_VALUE);
            }
        });
    }
} catch (IOException e) {
    e.printStackTrace();
} catch (URISyntaxException e) {
    e.printStackTrace();
}
Web
You can call generateContentStream() to stream generated text from multimodal input of text and a single video.
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, { model: "gemini-2.5-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the video
  const prompt = "What do you see?";

  const fileInputEl = document.querySelector("input[type=file]");
  const videoPart = await fileToGenerativePart(fileInputEl.files[0]);

  // To stream generated text output, call generateContentStream with the text and video
  const result = await model.generateContentStream([prompt, videoPart]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();
Dart
You can call generateContentStream() to stream generated text from multimodal input of text and a single video.
import 'dart:io';

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
final model = FirebaseAI.googleAI().generativeModel(model: 'gemini-2.5-flash');

// Provide a text prompt to include with the video
final prompt = TextPart("What's in the video?");

// Prepare video for input
final video = await File('video0.mp4').readAsBytes();

// Provide the video as `Data` with the appropriate mimetype
final videoPart = InlineDataPart('video/mp4', video);

// To stream generated text output, call generateContentStream with the text and video
final response = model.generateContentStream([
  Content.multi([prompt, videoPart])
]);
await for (final chunk in response) {
  print(chunk.text);
}
Unity
You can call GenerateContentStreamAsync() to stream generated text from multimodal input of text and a single video.
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
var ai = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI());

// Create a `GenerativeModel` instance with a model that supports your use case
var model = ai.GetGenerativeModel(modelName: "gemini-2.5-flash");

// Provide the video as `data` with the appropriate MIME type
var video = ModelContent.InlineData("video/mp4",
    System.IO.File.ReadAllBytes(System.IO.Path.Combine(
        UnityEngine.Application.streamingAssetsPath, "yourVideo.mp4")));

// Provide a text prompt to include with the video
var prompt = ModelContent.Text("What is in the video?");

// To stream generated text output, call GenerateContentStreamAsync with the text and video
var responseStream = model.GenerateContentStreamAsync(new[] { video, prompt });
await foreach (var response in responseStream) {
  if (!string.IsNullOrWhiteSpace(response.Text)) {
    UnityEngine.Debug.Log(response.Text);
  }
}
Learn how to choose a model appropriate for your use case and app.
Requirements and recommendations for input files
Important: The total request size limit is 20 MB. To send large files, review the options for providing files in multimodal requests.
Note that a file provided as inline data is encoded to base64 in transit, which increases the size of the request. You get an HTTP 413 error if a request is too large.
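Base64 encodes every 3 raw bytes as 4 characters, so inline data grows to roughly 4/3 of its original size. The following is a minimal Kotlin sketch of a pre-flight size check; fitsInRequest is a hypothetical helper, not part of the SDK, and the 20 MB figure is the request limit noted above.

// Total request size limit noted above (20 MB).
const val MAX_REQUEST_BYTES = 20L * 1024 * 1024

// Estimate the base64-encoded size of a file (ceil(rawBytes / 3) * 4) and
// check that it, plus the rest of the request, stays under the limit.
fun fitsInRequest(rawFileBytes: Long, otherRequestBytes: Long = 0L): Boolean {
    val encodedEstimate = ((rawFileBytes + 2) / 3) * 4
    return encodedEstimate + otherRequestBytes < MAX_REQUEST_BYTES
}

For example, a 16 MB video encodes to roughly 21.3 MB, which already exceeds the limit even before the text parts of the request are counted.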
See Supported input files and requirements for the Vertex AI Gemini API to learn detailed information about the following:
- Different options for providing a file in a request (either inline or using the file's URL or URI)
- Supported file types
- Supported MIME types and how to specify them (see the sketch after this list)
- Requirements and best practices for files and multimodal requests
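The examples on this page hard-code the video/mp4 MIME type, but when users pick arbitrary files you need to determine the type at runtime. One option on JVM platforms is the JDK's built-in guesser, shown in this minimal Kotlin sketch; it is a convenience, not something the SDK requires, and you should verify the result against the supported MIME types before sending.

import java.net.URLConnection

// Guess a file's MIME type from its name (for example, "photo.png" -> "image/png").
// Returns null when the type is unknown; validate the result against the
// supported MIME types before including the file in a request.
fun guessMimeType(fileName: String): String? =
    URLConnection.guessContentTypeFromName(fileName)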
What else can you do?
- Learn how to count tokens before sending long prompts to the model.
- Set up Cloud Storage for Firebase so that you can include large files in your multimodal requests and have a more managed solution for providing files in prompts. Files can include images, PDFs, video, and audio.
- Start thinking about preparing for production (see the production checklist), including:
- Setting up Firebase App Check to protect the Gemini API from abuse by unauthorized clients.
- Integrating Firebase Remote Config to update values in your app (like model name) without releasing a new app version.
Try out other capabilities
- Build multi-turn conversations (chat).
- Generate text from text-only prompts.
- Generate structured output (like JSON) from both text and multimodal prompts.
- Generate images from text prompts (Gemini or Imagen).
- Stream input and output (including audio) using the Gemini Live API.
- Use tools (like function calling and grounding with Google Search) to connect a Gemini model to other parts of your app and external systems and information.
Learn how to control content generation
- Understand prompt design, including best practices, strategies, and example prompts.
- Configure model parameters like temperature and maximum output tokens (for Gemini) or aspect ratio and person generation (for Imagen).
- Use safety settings to adjust the likelihood of getting responses that may be considered harmful.
Learn more about the supported models
Learn about the models available for various use cases and their quotas and pricing.