Recognize Text in Images Securely with Cloud Vision using Firebase Auth and Functions on Apple platforms

The Firebase ML Vision SDK for recognizing text in an image is now deprecated (see the outdated docs here). This page describes how, as an alternative to the deprecated SDK, you can call Cloud Vision APIs using Firebase Auth and Firebase Functions to allow only authenticated users to access the API.

In order to call a Google Cloud API from your app, you need to create an intermediate REST API that handles authorization and protects secret values such as API keys. You then need to write code in your mobile app to authenticate to and communicate with this intermediate service.

One way to create this REST API is by using Firebase Authentication and Functions, which gives you a managed, serverless gateway to Google Cloud APIs that handles authentication and can be called from your mobile app with pre-built SDKs.

This guide demonstrates how to use this technique to call the Cloud Vision API from your app. This method will allow all authenticated users to access Cloud Vision billed services through your Cloud project, so consider whether this auth mechanism is sufficient for your use case before proceeding.

Use of the Cloud Vision APIs is subject to the Google Cloud Platform License Agreement and Service Specific Terms, and billed accordingly. For billing information, see the Pricing page.

Looking for on-device text recognition? Try the standalone ML Kit library.

Before you begin

Configure your project

If you have not already added Firebase to your app, do so by following the steps in the getting started guide.

Use Swift Package Manager to install and manage Firebase dependencies.

Visit our installation guide to learn about the different ways you can add Firebase SDKs to your Apple project.
  1. In Xcode, with your app project open, navigate to File > Add Packages.
  2. When prompted, add the Firebase Apple platforms SDK repository:
     https://github.com/firebase/firebase-ios-sdk.git
     Note: New projects should use the default (latest) SDK version, but you can choose an older version if needed.
  3. Choose the Firebase ML library.
  4. Add the -ObjC flag to the Other Linker Flags section of your target's build settings.
  5. When finished, Xcode will automatically begin resolving and downloading your dependencies in the background.

Next, perform some in-app setup:

  1. In your app, import Firebase:

    Swift

    import FirebaseMLModelDownloader

    Objective-C

    @import FirebaseMLModelDownloader;

A few more configuration steps, and we're ready to go:

  1. If you haven't already enabled Cloud-based APIs for your project, do so now:

    1. Open the Firebase ML APIs page in the Firebase console.
    2. If you haven't already upgraded your project to the pay-as-you-go Blaze pricing plan, click Upgrade to do so. (You'll be prompted to upgrade only if your project isn't on the Blaze pricing plan.)

      Only projects on the Blaze pricing plan can use Cloud-based APIs.

    3. If Cloud-based APIs aren't already enabled, click Enable Cloud-based APIs.
  2. Configure your existing Firebase API keys to disallow access to the Cloud Vision API:
    1. Open the Credentials page of the Cloud console.
    2. For each API key in the list, open the editing view, and in the Key Restrictions section, add all of the available APIs except the Cloud Vision API to the list.

Deploy the callable function

Next, deploy the Cloud Function you will use to bridge your app and the Cloud Vision API. The functions-samples repository contains an example you can use.

By default, the function allows only authenticated users of your app to access the Cloud Vision API. You can modify the function for different requirements.

To deploy the function:

  1. Clone or download the functions-samples repo and change to the Node-1st-gen/vision-annotate-image directory:
    git clone https://github.com/firebase/functions-samples
    cd Node-1st-gen/vision-annotate-image
  2. Install dependencies:
    cd functions
    npm install
    cd ..
  3. If you don't have the Firebase CLI, install it.
  4. Initialize a Firebase project in the vision-annotate-image directory. When prompted, select your project in the list.
    firebase init
  5. Deploy the function:
    firebase deploy --only functions:annotateImage

Add Firebase Auth to your app

The callable function deployed above will reject any request from non-authenticated users of your app. If you have not already done so, you will need to add Firebase Auth to your app.
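
For example, here is a minimal sketch of signing the user in before calling the function, using anonymous authentication (this assumes the FirebaseAuth package is installed; the helper name ensureSignedIn is illustrative, and your app may use any other Firebase Auth sign-in provider instead):

Swift

import FirebaseAuth

// Minimal sketch: make sure there is a signed-in user so that calls to the
// callable function carry a Firebase Auth token. Anonymous auth is used here
// only as an example; any Firebase Auth provider works.
func ensureSignedIn() async throws {
    if Auth.auth().currentUser == nil {
        _ = try await Auth.auth().signInAnonymously()
    }
}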

Add necessary dependencies to your app

Use Swift Package Manager to install the Cloud Functions for Firebase library.

Now you are ready to start recognizing text in images.

1. Prepare the input image

In order to call Cloud Vision, the image must be formatted as a base64-encoded string. To process a UIImage:

Swift

guard let imageData = uiImage.jpegData(compressionQuality: 1.0) else { return }
let base64encodedImage = imageData.base64EncodedString()

Objective-C

NSData *imageData = UIImageJPEGRepresentation(uiImage, 1.0f);
NSString *base64encodedImage =
    [imageData base64EncodedStringWithOptions:NSDataBase64Encoding76CharacterLineLength];
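
Large photos can make the request payload slow to upload. As an optional optimization (not something the API requires), you could downscale the image before encoding it; a sketch is shown below, where the 1024-point maximum dimension and the helper name downscaled are illustrative choices, not values from the Cloud Vision documentation:

Swift

import UIKit

// Optional sketch: downscale a UIImage so its longest side is at most
// maxDimension points before base64-encoding it. The 1024 default is just an
// example; pick a size that still preserves the text you need to recognize.
func downscaled(_ image: UIImage, maxDimension: CGFloat = 1024) -> UIImage {
    let largestSide = max(image.size.width, image.size.height)
    guard largestSide > maxDimension else { return image }
    let scale = maxDimension / largestSide
    let newSize = CGSize(width: image.size.width * scale,
                         height: image.size.height * scale)
    return UIGraphicsImageRenderer(size: newSize).image { _ in
        image.draw(in: CGRect(origin: .zero, size: newSize))
    }
}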

2. Invoke the callable function to recognize text

To recognize text in an image, invoke the callable function, passing a JSON Cloud Vision request.

  1. First, initialize an instance of Cloud Functions:

    Swift

    lazy var functions = Functions.functions()

    Objective-C

    @property(strong, nonatomic) FIRFunctions *functions;
  2. Create the request. The Cloud Vision API supports two Types of text detection: TEXT_DETECTION and DOCUMENT_TEXT_DETECTION. See the Cloud Vision OCR docs for the difference between the two use cases.

    Swift

    let requestData: [String: Any] = [
      "image": ["content": base64encodedImage],
      "features": ["type": "TEXT_DETECTION"],
      "imageContext": ["languageHints": ["en"]]
    ]

    Objective-C

    NSDictionary *requestData = @{
      @"image": @{@"content": base64encodedImage},
      @"features": @{@"type": @"TEXT_DETECTION"},
      @"imageContext": @{@"languageHints": @[@"en"]}
    };
  3. Finally, invoke the function:

    Swift

    do {
      let result = try await functions.httpsCallable("annotateImage").call(requestData)
      print(result)
    } catch {
      if let error = error as NSError? {
        if error.domain == FunctionsErrorDomain {
          let code = FunctionsErrorCode(rawValue: error.code)
          let message = error.localizedDescription
          let details = error.userInfo[FunctionsErrorDetailsKey]
        }
        // ...
      }
    }

    Objective-C

    [[_functions HTTPSCallableWithName:@"annotateImage"]
        callWithObject:requestData
            completion:^(FIRHTTPSCallableResult * _Nullable result, NSError * _Nullable error) {
      if (error) {
        if ([error.domain isEqualToString:@"com.firebase.functions"]) {
          FIRFunctionsErrorCode code = error.code;
          NSString *message = error.localizedDescription;
          NSObject *details = error.userInfo[@"details"];
        }
        // ...
      }
      // Function completed successfully
      // Get information about the recognized text
    }];

3. Extract text from blocks of recognized text

If the text recognition operation succeeds, a JSON response of BatchAnnotateImagesResponse will be returned in the task's result. The text annotations can be found in the fullTextAnnotation object.

You can get the recognized text as a string in the text field. For example:

Swift

guard let annotation = (result.data as? [String: Any])?["fullTextAnnotation"] as? [String: Any] else { return }

if let text = annotation["text"] as? String {
  print("Complete annotation: \(text)")
}

Objective-C

NSDictionary *annotation = result.data[@"fullTextAnnotation"];
if (!annotation) { return; }
NSLog(@"\nComplete annotation:");
NSLog(@"\n%@", annotation[@"text"]);

You can also get information specific to regions of the image. For each block, paragraph, word, and symbol, you can get the text recognized in the region and the bounding coordinates of the region. For example:

Swift

guard let pages = annotation["pages"] as? [[String: Any]] else { return }
for page in pages {
  var pageText = ""
  guard let blocks = page["blocks"] as? [[String: Any]] else { continue }
  for block in blocks {
    var blockText = ""
    guard let paragraphs = block["paragraphs"] as? [[String: Any]] else { continue }
    for paragraph in paragraphs {
      var paragraphText = ""
      guard let words = paragraph["words"] as? [[String: Any]] else { continue }
      for word in words {
        var wordText = ""
        guard let symbols = word["symbols"] as? [[String: Any]] else { continue }
        for symbol in symbols {
          let text = symbol["text"] as? String ?? ""
          let confidence = symbol["confidence"] as? Float ?? 0.0
          wordText += text
          print("Symbol text: \(text) (confidence: \(confidence))\n")
        }
        let confidence = word["confidence"] as? Float ?? 0.0
        print("Word text: \(wordText) (confidence: \(confidence))\n\n")
        let boundingBox = word["boundingBox"] as? [Float] ?? [0.0, 0.0, 0.0, 0.0]
        print("Word bounding box: \(boundingBox.description)\n")
        paragraphText += wordText
      }
      print("\nParagraph:\n\(paragraphText)\n")
      let boundingBox = paragraph["boundingBox"] as? [Float] ?? [0.0, 0.0, 0.0, 0.0]
      print("Paragraph bounding box: \(boundingBox)\n")
      let confidence = paragraph["confidence"] as? Float ?? 0.0
      print("Paragraph confidence: \(confidence)\n")
      blockText += paragraphText
    }
    pageText += blockText
  }
}

Objective-C

for (NSDictionary *page in annotation[@"pages"]) {
  NSMutableString *pageText = [NSMutableString new];
  for (NSDictionary *block in page[@"blocks"]) {
    NSMutableString *blockText = [NSMutableString new];
    for (NSDictionary *paragraph in block[@"paragraphs"]) {
      NSMutableString *paragraphText = [NSMutableString new];
      for (NSDictionary *word in paragraph[@"words"]) {
        NSMutableString *wordText = [NSMutableString new];
        for (NSDictionary *symbol in word[@"symbols"]) {
          NSString *text = symbol[@"text"];
          [wordText appendString:text];
          NSLog(@"Symbol text: %@ (confidence: %@)\n", text, symbol[@"confidence"]);
        }
        NSLog(@"Word text: %@ (confidence: %@)\n\n", wordText, word[@"confidence"]);
        NSLog(@"Word bounding box: %@\n", word[@"boundingBox"]);
        [paragraphText appendString:wordText];
      }
      NSLog(@"\nParagraph:\n%@\n", paragraphText);
      NSLog(@"Paragraph bounding box: %@\n", paragraph[@"boundingBox"]);
      NSLog(@"Paragraph confidence: %@\n", paragraph[@"confidence"]);
      [blockText appendString:paragraphText];
    }
    [pageText appendString:blockText];
  }
}
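
Putting the steps together, a sketch of what an end-to-end helper on the Swift side might look like is shown below. The function name recognizeText(in:) is illustrative and not part of the Firebase SDK; the sketch assumes the annotateImage function is deployed as described above and that the user is already signed in with Firebase Auth.

Swift

import FirebaseFunctions
import UIKit

// Illustrative sketch: encode an image, call the deployed annotateImage
// callable function, and return the full recognized text (if any).
func recognizeText(in image: UIImage) async throws -> String? {
    guard let imageData = image.jpegData(compressionQuality: 1.0) else { return nil }
    let requestData: [String: Any] = [
        "image": ["content": imageData.base64EncodedString()],
        "features": ["type": "TEXT_DETECTION"],
        "imageContext": ["languageHints": ["en"]]
    ]
    let result = try await Functions.functions()
        .httpsCallable("annotateImage")
        .call(requestData)
    let annotation = (result.data as? [String: Any])?["fullTextAnnotation"] as? [String: Any]
    return annotation?["text"] as? String
}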
