Introduction
In this blog post, I build several Generative AI examples using NestJS and the Gemini API. The examples generate text from 1) a text prompt, 2) a prompt and an image, and 3) a prompt and two images to analyze. The Google team provided the examples in NodeJS, and I ported several of them to NestJS, my favorite framework, which builds on top of the popular Express framework.
Generate Gemini API Key
Go to https://aistudio.google.com/app/apikey to generate an API key on a new or an existing Google Cloud project.
Create a new NestJS Project
```shell
nest new nestjs-gemini-api-demo
```
Install dependencies
```shell
npm i --save-exact @google/generative-ai @nestjs/swagger class-transformer class-validator dotenv compression
npm i --save-exact --save-dev @types/multer
```
Generate a Gemini Module
```shell
nest g mo gemini
nest g co gemini/presenters/http/gemini --flat
nest g s gemini/application/gemini --flat
```

These commands create a Gemini module, a controller, and a service for the API.
Define Gemini environment variables
In `.env.example`, define the environment variables for the Gemini API key, the Gemini Pro model, the Gemini Pro Vision model, and the port number.
```shell
// .env.example
GEMINI_API_KEY=<google_gemini_api_key>
GEMINI_PRO_MODEL=gemini-pro
GEMINI_PRO_VISION_MODEL=gemini-pro-vision
PORT=3000
```
Copy `.env.example` to `.env` and replace the placeholder of `GEMINI_API_KEY` with the real API key.
Add `.env` to the `.gitignore` file to ensure we don't accidentally commit the Gemini API key to the GitHub repo.
```shell
// .gitignore
.env
```
Add configuration files
The project has three configuration files. `validate.config.ts` validates that the payload is valid before any request is routed to the controller.
```typescript
// validate.config.ts
import { ValidationPipe } from '@nestjs/common';

export const validateConfig = new ValidationPipe({
  whitelist: true,
  stopAtFirstError: true,
});
```
`env.config.ts` extracts the environment variables from `process.env` and stores the values in the `env` object.
```typescript
// env.config.ts
import dotenv from 'dotenv';

dotenv.config();

export const env = {
  PORT: parseInt(process.env.PORT || '3000'),
  GEMINI: {
    KEY: process.env.GEMINI_API_KEY || '',
    PRO_MODEL: process.env.GEMINI_PRO_MODEL || 'gemini-pro',
    PRO_VISION_MODEL: process.env.GEMINI_PRO_VISION_MODEL || 'gemini-pro-vision',
  },
};
```
`gemini.config.ts` defines the generation and safety options for the Gemini API.
```typescript
// gemini.config.ts
import {
  GenerationConfig,
  HarmBlockThreshold,
  HarmCategory,
  SafetySetting,
} from '@google/generative-ai';

export const GENERATION_CONFIG: GenerationConfig = {
  maxOutputTokens: 1024,
  temperature: 1,
  topK: 32,
  topP: 1,
};

export const SAFETY_SETTINGS: SafetySetting[] = [
  {
    category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
    threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
    threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
    threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_HARASSMENT,
    threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
  },
];
```
Bootstrap the application
```typescript
// main.ts
function setupSwagger(app: NestExpressApplication) {
  const config = new DocumentBuilder()
    .setTitle('Gemini example')
    .setDescription('The Gemini API description')
    .setVersion('1.0')
    .addTag('google gemini')
    .build();
  const document = SwaggerModule.createDocument(app, config);
  SwaggerModule.setup('api', app, document);
}

async function bootstrap() {
  const app = await NestFactory.create<NestExpressApplication>(AppModule);
  app.enableCors();
  app.useGlobalPipes(validateConfig);
  app.use(express.json({ limit: '1000kb' }));
  app.use(express.urlencoded({ extended: false }));
  app.use(compression());
  setupSwagger(app);
  await app.listen(env.PORT);
}
bootstrap();
```
The `bootstrap` function registers middleware on the application, sets up the Swagger documentation, and applies a global pipe to validate payloads.
I have laid down the groundwork; the next step is to add routes that receive Generative AI inputs and generate text.
Example 1: Generate text from a prompt
```typescript
// generate-text.dto.ts
import { ApiProperty } from '@nestjs/swagger';
import { IsNotEmpty, IsString } from 'class-validator';

export class GenerateTextDto {
  @ApiProperty({
    name: 'prompt',
    description: 'prompt of the question',
    type: 'string',
    required: true,
  })
  @IsNotEmpty()
  @IsString()
  prompt: string;
}
```

The DTO accepts a text prompt that is used to generate text.
```typescript
// gemini.constant.ts
export const GEMINI_PRO_MODEL = 'GEMINI_PRO_MODEL';
export const GEMINI_PRO_VISION_MODEL = 'GEMINI_PRO_VISION_MODEL';
```
```typescript
// gemini.provider.ts
import { GenerativeModel, GoogleGenerativeAI } from '@google/generative-ai';
import { Provider } from '@nestjs/common';
import { env } from '~configs/env.config';
import { GENERATION_CONFIG, SAFETY_SETTINGS } from '~configs/gemini.config';
import { GEMINI_PRO_MODEL, GEMINI_PRO_VISION_MODEL } from './gemini.constant';

export const GeminiProModelProvider: Provider<GenerativeModel> = {
  provide: GEMINI_PRO_MODEL,
  useFactory: () => {
    const genAI = new GoogleGenerativeAI(env.GEMINI.KEY);
    return genAI.getGenerativeModel({
      model: env.GEMINI.PRO_MODEL,
      generationConfig: GENERATION_CONFIG,
      safetySettings: SAFETY_SETTINGS,
    });
  },
};

export const GeminiProVisionModelProvider: Provider<GenerativeModel> = {
  provide: GEMINI_PRO_VISION_MODEL,
  useFactory: () => {
    const genAI = new GoogleGenerativeAI(env.GEMINI.KEY);
    return genAI.getGenerativeModel({
      model: env.GEMINI.PRO_VISION_MODEL,
      generationConfig: GENERATION_CONFIG,
      safetySettings: SAFETY_SETTINGS,
    });
  },
};
```
I define two providers that supply the Gemini Pro model and the Gemini Pro Vision model respectively, so that I can inject them into the Gemini service.
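For the injection to work, the providers must also be registered in the Gemini module. The article omits this file, so the following is a minimal sketch; the file paths assume the directory layout produced by the `nest g` commands earlier and may differ in your project:

```typescript
// gemini.module.ts (a sketch; adjust import paths to your project layout)
import { Module } from '@nestjs/common';
import { GeminiController } from './presenters/http/gemini.controller';
import { GeminiService } from './application/gemini.service';
import {
  GeminiProModelProvider,
  GeminiProVisionModelProvider,
} from './application/gemini.provider';

@Module({
  controllers: [GeminiController],
  // Registering the model providers makes them available for @Inject()
  providers: [GeminiService, GeminiProModelProvider, GeminiProVisionModelProvider],
})
export class GeminiModule {}
```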
```typescript
// content.helper.ts
import { Content, Part } from '@google/generative-ai';

export function createContent(text: string, ...images: Express.Multer.File[]): Content[] {
  const imageParts: Part[] = images.map((image) => {
    return {
      inlineData: {
        mimeType: image.mimetype,
        data: image.buffer.toString('base64'),
      },
    };
  });

  return [
    {
      role: 'user',
      parts: [
        ...imageParts,
        {
          text,
        },
      ],
    },
  ];
}
```
`createContent` is a helper function that creates the content for the model.
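To see the shape the helper produces, here is a small standalone sketch. The `FakeFile` interface is illustrative only; it stands in for the two fields of `Express.Multer.File` that the helper actually reads, and the helper body is inlined so the snippet runs outside the project:

```typescript
// FakeFile mimics the fields of Express.Multer.File used by the helper.
interface FakeFile {
  mimetype: string;
  buffer: Buffer;
}

function createContent(text: string, ...images: FakeFile[]) {
  // Each image becomes an inlineData part carrying base64-encoded bytes.
  const imageParts = images.map((image) => ({
    inlineData: {
      mimeType: image.mimetype,
      data: image.buffer.toString('base64'),
    },
  }));
  // Image parts come first; the text prompt is the last part.
  return [{ role: 'user', parts: [...imageParts, { text }] }];
}

const file: FakeFile = { mimetype: 'image/png', buffer: Buffer.from('png-bytes') };
const [content] = createContent('Describe this image', file);
console.log(content.parts.length); // 2 (one image part, one text part)
```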
```typescript
// gemini.service.ts
// ... omit the import statements to save space

@Injectable()
export class GeminiService {
  constructor(
    @Inject(GEMINI_PRO_MODEL) private readonly proModel: GenerativeModel,
    @Inject(GEMINI_PRO_VISION_MODEL) private readonly proVisionModel: GenerativeModel,
  ) {}

  async generateText(prompt: string): Promise<GenAiResponse> {
    const contents = createContent(prompt);
    const { totalTokens } = await this.proModel.countTokens({ contents });
    const result = await this.proModel.generateContent({ contents });
    const response = await result.response;
    const text = response.text();
    return { totalTokens, text };
  }

  // ... other methods ...
}
```
The `generateText` method accepts a prompt and calls the Gemini API to generate text. The method returns the total number of tokens and the generated text to the controller.
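The call sequence (count tokens, generate content, then read the text) can be sketched in isolation with a stubbed model. The `StubModel` interface and its canned responses below are illustrative only; they mirror the shape of the calls the service makes, not the real `GenerativeModel` API surface:

```typescript
// A stub that mimics the two model calls the service relies on.
interface StubModel {
  countTokens(req: { contents: unknown }): Promise<{ totalTokens: number }>;
  generateContent(req: { contents: unknown }): Promise<{ response: { text(): string } }>;
}

const stub: StubModel = {
  async countTokens() {
    return { totalTokens: 5 }; // canned token count
  },
  async generateContent() {
    return { response: { text: () => 'Hello from Gemini' } }; // canned reply
  },
};

// Same flow as GeminiService.generateText, minus the NestJS wiring.
async function generateText(model: StubModel, prompt: string) {
  const contents = [{ role: 'user', parts: [{ text: prompt }] }];
  const { totalTokens } = await model.countTokens({ contents });
  const result = await model.generateContent({ contents });
  const text = result.response.text();
  return { totalTokens, text };
}

generateText(stub, 'Say hello').then((r) =>
  console.log(`${r.totalTokens} tokens: ${r.text}`), // 5 tokens: Hello from Gemini
);
```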
```typescript
// gemini.controller.ts
// ... omit the import statements to save space

@ApiTags('Gemini')
@Controller('gemini')
export class GeminiController {
  constructor(private service: GeminiService) {}

  @ApiBody({
    description: 'Prompt',
    required: true,
    type: GenerateTextDto,
  })
  @Post('text')
  generateText(@Body() dto: GenerateTextDto): Promise<GenAiResponse> {
    return this.service.generateText(dto.prompt);
  }

  // ... other routes ...
}
```
Example 2: Generate text from a prompt and an image
This example needs both the prompt and an image file.
```typescript
// gemini.service.ts
// ... omit the import statements to save space

@Injectable()
export class GeminiService {
  constructor(
    @Inject(GEMINI_PRO_MODEL) private readonly proModel: GenerativeModel,
    @Inject(GEMINI_PRO_VISION_MODEL) private readonly proVisionModel: GenerativeModel,
  ) {}

  // ... other methods ...

  async generateTextFromMultiModal(
    prompt: string,
    file: Express.Multer.File,
  ): Promise<GenAiResponse> {
    try {
      const contents = createContent(prompt, file);
      const { totalTokens } = await this.proVisionModel.countTokens({ contents });
      const result = await this.proVisionModel.generateContent({ contents });
      const response = await result.response;
      const text = response.text();
      return { totalTokens, text };
    } catch (err) {
      if (err instanceof Error) {
        throw new InternalServerErrorException(err.message, err.stack);
      }
      throw err;
    }
  }
}
```
```typescript
// file-validator.pipe.ts
import { FileTypeValidator, MaxFileSizeValidator, ParseFilePipe } from '@nestjs/common';

export const fileValidatorPipe = new ParseFilePipe({
  validators: [
    new MaxFileSizeValidator({ maxSize: 1 * 1024 * 1024 }),
    // Note: (jpeg|png) is an alternation; [jpeg|png] would be a character class
    new FileTypeValidator({ fileType: /image\/(jpeg|png)/ }),
  ],
});
```
Define `fileValidatorPipe` to validate that the uploaded file is either a JPEG or a PNG file, and that the file does not exceed 1MB.
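A note on the file-type pattern: in a regular expression, `[jpeg|png]` is a character class that matches single characters, not the two MIME subtypes, so the validator needs the alternation form `(jpeg|png)`. A quick standalone check of the intended pattern:

```typescript
// The alternation form matches whole subtype names.
const allowed = /image\/(jpeg|png)/;

console.log(allowed.test('image/png'));  // true
console.log(allowed.test('image/jpeg')); // true
console.log(allowed.test('image/gif'));  // false
```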
```typescript
// gemini.controller.ts
@ApiConsumes('multipart/form-data')
@ApiBody({
  schema: {
    type: 'object',
    properties: {
      prompt: {
        type: 'string',
        description: 'Prompt',
      },
      file: {
        type: 'string',
        format: 'binary',
        description: 'Binary file',
      },
    },
  },
})
@Post('text-and-image')
@UseInterceptors(FileInterceptor('file'))
async generateTextFromMultiModal(
  @Body() dto: GenerateTextDto,
  @UploadedFile(fileValidatorPipe) file: Express.Multer.File,
): Promise<GenAiResponse> {
  return this.service.generateTextFromMultiModal(dto.prompt, file);
}
```
`file` is the key that provides the binary file in the form data.
Example 3: Analyze two images
This example is similar to example 2, except that it needs a prompt and two images to compare and contrast.
```typescript
// gemini.service.ts
async analyzeImages({ prompt, firstImage, secondImage }: AnalyzeImage): Promise<GenAiResponse> {
  try {
    const contents = createContent(prompt, firstImage, secondImage);
    const { totalTokens } = await this.proVisionModel.countTokens({ contents });
    const result = await this.proVisionModel.generateContent({ contents });
    const response = await result.response;
    const text = response.text();
    return { totalTokens, text };
  } catch (err) {
    if (err instanceof Error) {
      throw new InternalServerErrorException(err.message, err.stack);
    }
    throw err;
  }
}
```
```typescript
// gemini.controller.ts
@ApiConsumes('multipart/form-data')
@ApiBody({
  schema: {
    type: 'object',
    properties: {
      prompt: {
        type: 'string',
        description: 'Prompt',
      },
      first: {
        type: 'string',
        format: 'binary',
        description: 'Binary file',
      },
      second: {
        type: 'string',
        format: 'binary',
        description: 'Binary file',
      },
    },
  },
})
@Post('analyse-the-images')
@UseInterceptors(
  FileFieldsInterceptor([
    { name: 'first', maxCount: 1 },
    { name: 'second', maxCount: 1 },
  ]),
)
async analyseImages(
  @Body() dto: GenerateTextDto,
  @UploadedFiles()
  files: {
    first?: Express.Multer.File[];
    second?: Express.Multer.File[];
  },
): Promise<GenAiResponse> {
  if (!files.first?.length) {
    throw new BadRequestException('The first image is missing');
  }
  if (!files.second?.length) {
    throw new BadRequestException('The second image is missing');
  }
  return this.service.analyzeImages({
    prompt: dto.prompt,
    firstImage: files.first[0],
    secondImage: files.second[0],
  });
}
```
`first` is the key that provides the first binary file in the form data, and `second` is the key that provides the second binary file.
Test the endpoints
I can test the endpoints with Postman or the Swagger documentation after starting the application:

```shell
npm run start:dev
```

The URL of the Swagger documentation is http://localhost:3000/api.
(Bonus) Deploy to Google Cloud Run
Install the gcloud CLI on the machine according to the official documentation. On my machine, the installation path is ~/google-cloud-sdk.
Then, I open a new terminal and change to the root of the project. On the command line, I update the environment variables before the deployment:

```shell
~/google-cloud-sdk/bin/gcloud run deploy \
  --update-env-vars GEMINI_API_KEY=<replace with your own key>,GEMINI_PRO_MODEL=gemini-pro,GEMINI_PRO_VISION_MODEL=gemini-pro-vision
```
If the deployment is successful, the NestJS application will run on Google Cloud Run.
This is the end of the blog post on building Generative AI examples with NestJS and the Gemini API. I hope you like the content and continue to follow my learning experience in Angular, NestJS, and other technologies.
Resources:
1. Github Repo: https://github.com/railsstudent/nestjs-gemini-api-demo
2. NodeJS Gemini tutorials: https://ai.google.dev/tutorials/node_quickstart
3. Cloud Run deploy documentation: https://cloud.google.com/run/docs/deploying-source-code