Identify the language of text with ML Kit on Android

This page describes an old version of the Language Identification API, which was part of ML Kit for Firebase. Development of this API has been moved to the standalone ML Kit SDK, which you can use with or without Firebase. Learn more.

See Identify the language of text with ML Kit on Android for the latest documentation.

You can use ML Kit to identify the language of a string of text. You can get the string's most likely language or get confidence scores for all of the string's possible languages.

ML Kit recognizes text in 103 different languages in their native scripts. In addition, romanized text can be recognized for Arabic, Bulgarian, Chinese, Greek, Hindi, Japanese, and Russian.

Before you begin

  1. If you haven't already, add Firebase to your Android project.
  2. Add the dependencies for the ML Kit Android libraries to your module (app-level) Gradle file (usually app/build.gradle):
    apply plugin: 'com.android.application'
    apply plugin: 'com.google.gms.google-services'

    dependencies {
        // ...

        implementation 'com.google.firebase:firebase-ml-natural-language:22.0.0'
        implementation 'com.google.firebase:firebase-ml-natural-language-language-id-model:20.0.7'
    }

Identify the language of a string

To identify the language of a string, get an instance of FirebaseLanguageIdentification, and then pass the string to the identifyLanguage() method.

For example:

    FirebaseLanguageIdentification languageIdentifier =
            FirebaseNaturalLanguage.getInstance().getLanguageIdentification();
    languageIdentifier.identifyLanguage(text)
            .addOnSuccessListener(
                    new OnSuccessListener<String>() {
                        @Override
                        public void onSuccess(@Nullable String languageCode) {
                            // Use equals() here: != on strings compares object
                            // references in Java, not their contents.
                            if (languageCode != null && !languageCode.equals("und")) {
                                Log.i(TAG, "Language: " + languageCode);
                            } else {
                                Log.i(TAG, "Can't identify language.");
                            }
                        }
                    })
            .addOnFailureListener(
                    new OnFailureListener() {
                        @Override
                        public void onFailure(@NonNull Exception e) {
                            // Model couldn't be loaded or other internal error.
                            // ...
                        }
                    });

If the call succeeds, a BCP-47 language code is passed to the success listener, indicating the language of the text. See the complete list of supported languages. If no language could be confidently detected, the code und (undetermined) is passed.
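If you want to show the result to a user, the BCP-47 code can be turned into a readable name. The following is a minimal sketch that uses only the standard java.util.Locale class. The LanguageNames class and its displayName() method are hypothetical helpers, not part of ML Kit, and mapping und to "Unknown" is an assumption made for illustration.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of ML Kit.
final class LanguageNames {
    private LanguageNames() {}

    // Converts a BCP-47 code such as "fr" into an English display name.
    // Returns "Unknown" for null or for the undetermined code "und".
    static String displayName(String languageCode) {
        if (languageCode == null || "und".equals(languageCode)) {
            return "Unknown";
        }
        String name =
                Locale.forLanguageTag(languageCode).getDisplayLanguage(Locale.ENGLISH);
        // Fall back to the raw code if the runtime has no name for it.
        return name.isEmpty() ? languageCode : name;
    }

    public static void main(String[] args) {
        System.out.println(displayName("fr"));
        System.out.println(displayName("und"));
    }
}
```

In an app, a helper like this could be called from the success listener in place of logging the raw code.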

By default, ML Kit returns a value other than und only when it identifies the language with a confidence value of at least 0.5. You can change this threshold by passing a FirebaseLanguageIdentificationOptions object to getLanguageIdentification():

    FirebaseLanguageIdentification languageIdentifier =
            FirebaseNaturalLanguage.getInstance()
                    .getLanguageIdentification(
                            new FirebaseLanguageIdentificationOptions.Builder()
                                    .setIdentifyLanguageConfidenceThreshold(0.34f)
                                    .build());
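To make the effect of the threshold concrete, here is a small plain-Java sketch of the documented behavior. It is not ML Kit's implementation: the ThresholdSketch class, the pickLanguage() method, and the scores are all made up for illustration. It only shows how the same scores can yield und at the default threshold and a language code at a lower one.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: mimics the documented threshold behavior with
// made-up scores. This is not how ML Kit is implemented.
final class ThresholdSketch {
    static final String UNDETERMINED = "und";

    private ThresholdSketch() {}

    // Returns the highest-scoring code if its confidence is at least
    // the threshold, otherwise "und".
    static String pickLanguage(Map<String, Float> scores, float threshold) {
        String best = UNDETERMINED;
        float bestScore = -1f;
        for (Map.Entry<String, Float> entry : scores.entrySet()) {
            if (entry.getValue() > bestScore) {
                best = entry.getKey();
                bestScore = entry.getValue();
            }
        }
        return bestScore >= threshold ? best : UNDETERMINED;
    }

    public static void main(String[] args) {
        Map<String, Float> scores = new LinkedHashMap<>();
        scores.put("es", 0.40f); // hypothetical confidence values
        scores.put("pt", 0.35f);
        System.out.println(pickLanguage(scores, 0.5f));  // default threshold
        System.out.println(pickLanguage(scores, 0.34f)); // lowered threshold
    }
}
```

With these hypothetical scores, the top candidate falls below 0.5 but above 0.34, so only the lowered threshold produces a language code.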

Get the possible languages of a string

To get the confidence values of a string's most likely languages, get an instance of FirebaseLanguageIdentification, and then pass the string to the identifyAllLanguages() method.

For example:

    FirebaseLanguageIdentification languageIdentifier =
            FirebaseNaturalLanguage.getInstance().getLanguageIdentification();
    languageIdentifier.identifyAllLanguages(text)
            .addOnSuccessListener(
                    // The listener receives a list of candidates, not a single String.
                    new OnSuccessListener<List<IdentifiedLanguage>>() {
                        @Override
                        public void onSuccess(List<IdentifiedLanguage> identifiedLanguages) {
                            for (IdentifiedLanguage identifiedLanguage : identifiedLanguages) {
                                String language = identifiedLanguage.getLanguageCode();
                                float confidence = identifiedLanguage.getConfidence();
                                Log.i(TAG, language + " (" + confidence + ")");
                            }
                        }
                    })
            .addOnFailureListener(
                    new OnFailureListener() {
                        @Override
                        public void onFailure(@NonNull Exception e) {
                            // Model couldn't be loaded or other internal error.
                            // ...
                        }
                    });

If the call succeeds, a list of IdentifiedLanguage objects is passed to the success listener. From each object, you can get the language's BCP-47 code and the confidence that the string is in that language. See the complete list of supported languages. Note that these values indicate the confidence that the entire string is in the given language; ML Kit doesn't identify multiple languages in a single string.

By default, ML Kit returns only languages with confidence values of at least 0.01. You can change this threshold by passing a FirebaseLanguageIdentificationOptions object to getLanguageIdentification():

    FirebaseLanguageIdentification languageIdentifier =
            FirebaseNaturalLanguage.getInstance()
                    .getLanguageIdentification(
                            new FirebaseLanguageIdentificationOptions.Builder()
                                    .setIdentifyAllLanguagesConfidenceThreshold(0.5f)
                                    .build());

If no language meets this threshold, the list will have one item, with the value und.
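The filtering and fallback described above can be sketched in plain Java. This is an illustration under stated assumptions rather than ML Kit's code: Candidate is a hypothetical stand-in for IdentifiedLanguage, the scores are invented, and the confidence attached to the und item is a placeholder because this page does not specify it.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only; not ML Kit's implementation.
final class AllLanguagesSketch {
    private AllLanguagesSketch() {}

    // Hypothetical stand-in for ML Kit's IdentifiedLanguage.
    static final class Candidate {
        final String languageCode;
        final float confidence;

        Candidate(String languageCode, float confidence) {
            this.languageCode = languageCode;
            this.confidence = confidence;
        }
    }

    // Keeps candidates whose confidence is at least the threshold. If none
    // qualify, returns a single "und" item, as the text above describes.
    static List<Candidate> filter(List<Candidate> candidates, float threshold) {
        List<Candidate> kept = new ArrayList<>();
        for (Candidate candidate : candidates) {
            if (candidate.confidence >= threshold) {
                kept.add(candidate);
            }
        }
        if (kept.isEmpty()) {
            // Placeholder confidence: the real value is not documented here.
            kept.add(new Candidate("und", 0f));
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Candidate> input = new ArrayList<>();
        input.add(new Candidate("es", 0.40f)); // hypothetical scores
        input.add(new Candidate("pt", 0.35f));
        input.add(new Candidate("it", 0.005f));
        for (Candidate c : filter(input, 0.01f)) {
            System.out.println(c.languageCode + " (" + c.confidence + ")");
        }
        System.out.println(filter(input, 0.5f).get(0).languageCode);
    }
}
```

With these invented scores, the default threshold of 0.01 keeps two candidates, while a threshold of 0.5 leaves only the und item.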

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-18 UTC.