
With the rise of AI, integrating intelligent features into mobile apps has become essential for an engaging user experience. Google’s ML Kit makes this easy for Android developers by providing a powerful suite of machine learning tools that run directly in your app. This guide will take you from ML Kit basics to real-world implementations. By the end, you’ll be equipped to add AI features to your app, from text recognition to pose estimation and beyond!

What We’ll Cover:

1. What is ML Kit?

2. Setting up ML Kit in an Android project

3. Key ML Kit APIs and their use cases

4. Detailed implementation guides for popular ML Kit APIs

5. Custom models and Firebase integration

6. Best practices and performance tips

7. Conclusion

1. What is ML Kit?

ML Kit is Google’s machine learning SDK that makes it easy to integrate powerful machine learning models into mobile applications. ML Kit offers both on-device and cloud-based APIs, covering a wide range of use cases like text recognition, face detection, image labeling, and pose estimation.

Key Benefits of ML Kit:

• Ease of Use: Pre-trained models save time and resources.

• Performance: On-device processing is fast and secure.

• Cross-platform Support: Available for both Android and iOS.

• Custom Models: Allows integration of your own custom TensorFlow Lite models.

2. Setting Up ML Kit in an Android Project

Before using ML Kit, you’ll need to configure your Android project properly. This includes adding dependencies, setting up permissions, and, if needed, linking to Firebase.

Step 1: Add Dependencies

Add the required ML Kit dependencies to your build.gradle file. Here’s an example setup with Text Recognition, Face Detection, and Image Labeling APIs:

dependencies {
    // For ML Kit Text Recognition
    implementation 'com.google.mlkit:text-recognition:16.0.0'

    // For ML Kit Face Detection
    implementation 'com.google.mlkit:face-detection:16.1.3'
    
    // For ML Kit Image Labeling
    implementation 'com.google.mlkit:image-labeling:17.0.7'
    
    // For custom TensorFlow Lite models via Firebase (legacy artifact; see section 5)
    implementation 'com.google.firebase:firebase-ml-model-interpreter:22.0.4'
}

Step 2: Configure Permissions

ML Kit relies on certain permissions depending on the feature you’re implementing. For instance, camera access is essential for real-time tasks like face detection, barcode scanning, and pose estimation, while network access (Internet permission) is required if you’re using cloud-based APIs or Firebase integration.

Declaring Permissions in AndroidManifest.xml

To start, add the necessary permissions in your AndroidManifest.xml file. This step informs the Android system about the permissions your app intends to use, but it’s only the first step for permissions like the camera.

<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.INTERNET" />

• android.permission.CAMERA: Needed to access the camera for on-device image processing tasks.

• android.permission.INTERNET: Required for cloud-based ML Kit APIs and Firebase functionality.
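
If the camera isn’t strictly required for your app to function, you can also mark it as optional hardware in the manifest so the Play Store doesn’t filter out devices without a camera:

<uses-feature android:name="android.hardware.camera" android:required="false" />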

Requesting Dangerous Permissions at Runtime

The CAMERA permission is categorized as a “dangerous permission” in Android, which means that simply declaring it in the manifest file isn’t enough. You also need to request this permission at runtime and handle the user’s response. Here’s a full approach:

First, check whether the permission is granted: before accessing the camera, check whether the CAMERA permission has already been granted by the user:

private val CAMERA_REQUEST_CODE = 1001

// Call this function to check or request camera permission
fun checkCameraPermission() {
    if (ContextCompat.checkSelfPermission(this, Manifest.permission.CAMERA) != PackageManager.PERMISSION_GRANTED) {
        
        // If permission is not granted, check if we should show an explanation
        if (ActivityCompat.shouldShowRequestPermissionRationale(this, Manifest.permission.CAMERA)) {
            // Show an explanation to the user asynchronously
            AlertDialog.Builder(this)
                .setTitle("Camera Permission Required")
                .setMessage("This app requires access to the camera to perform ML Kit features like face detection.")
                .setPositiveButton("OK") { _, _ ->
                    // Request the permission after explanation
                    ActivityCompat.requestPermissions(this, arrayOf(Manifest.permission.CAMERA), CAMERA_REQUEST_CODE)
                }
                .setNegativeButton("Cancel", null)
                .show()
        } else {
            // No explanation needed; directly request the permission
            ActivityCompat.requestPermissions(this, arrayOf(Manifest.permission.CAMERA), CAMERA_REQUEST_CODE)
        }
        
    } else {
        // Permission is already granted; proceed with camera operations
        startCameraOperations()
    }
}

// Handle the permission request response
override fun onRequestPermissionsResult(requestCode: Int, permissions: Array<out String>, grantResults: IntArray) {
    super.onRequestPermissionsResult(requestCode, permissions, grantResults)
    
    if (requestCode == CAMERA_REQUEST_CODE) {
        if (grantResults.isNotEmpty() && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
            // Permission was granted; proceed with camera operations
            startCameraOperations()
        } else {
            // Permission denied; show a message explaining why the camera is needed
            Toast.makeText(this, "Camera permission is required to use this feature.", Toast.LENGTH_SHORT).show()
        }
    }
}

// Placeholder for starting camera operations
fun startCameraOperations() {
    // Start the camera and your ML Kit feature here
}
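
Alternatively, on AndroidX you can skip the request-code plumbing with the Activity Result API. A minimal sketch, registered as a property of your Activity:

private val requestCameraPermission =
    registerForActivityResult(ActivityResultContracts.RequestPermission()) { granted ->
        if (granted) {
            // Permission granted; proceed with camera operations
            startCameraOperations()
        } else {
            Toast.makeText(this, "Camera permission is required to use this feature.", Toast.LENGTH_SHORT).show()
        }
    }

// Then launch the request wherever you previously called checkCameraPermission():
requestCameraPermission.launch(Manifest.permission.CAMERA)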

3. Key ML Kit APIs and Their Use Cases

1. Text Recognition

• Use case: Scanning and extracting text from images, such as receipts, IDs, or documents.

2. Face Detection

• Use case: Identifying facial features in real-time for AR effects, emotion detection, or filters.

3. Image Labeling

• Use case: Categorizing objects in images automatically for photo galleries, content recommendations, etc.

4. Barcode Scanning

• Use case: Useful in e-commerce and inventory management for quickly scanning product barcodes.

5. Pose Detection

• Use case: Tracking user movements for fitness apps, dance applications, and gesture control.

Each API is designed with flexibility to work either fully offline (on-device) or, in some cases, with cloud processing.
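
Most of these APIs share the same client/process pattern shown in the next section. As a taste, here’s a minimal barcode-scanning sketch (assuming the com.google.mlkit:barcode-scanning dependency has been added):

val scanner = BarcodeScanning.getClient()

scanner.process(image)
    .addOnSuccessListener { barcodes ->
        for (barcode in barcodes) {
            // rawValue holds the decoded payload, e.g. a URL or a product code
            val payload = barcode.rawValue
        }
    }
    .addOnFailureListener { e ->
        // Handle error
    }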

4. Implementation Guide for Popular ML Kit APIs

Let’s dive deeper into implementing some of the most popular ML Kit features.

Text Recognition

1. Initialize the TextRecognizer:

val textRecognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)

2. Prepare the Input Image:

Convert the image into an ML Kit-compatible format:

val image = InputImage.fromBitmap(yourBitmap, rotationDegree)
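
If your frames come from CameraX rather than a Bitmap, you can build the InputImage from the analysis frame instead; a sketch, assuming an ImageAnalysis use case:

@ExperimentalGetImage
fun frameToInputImage(imageProxy: ImageProxy): InputImage? =
    imageProxy.image?.let { mediaImage ->
        // Pass the rotation so ML Kit processes an upright image
        InputImage.fromMediaImage(mediaImage, imageProxy.imageInfo.rotationDegrees)
    }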

3. Process the Image:

Call the process method and handle the result or error:

textRecognizer.process(image)
    .addOnSuccessListener { visionText ->
        // Process recognized text
        for (block in visionText.textBlocks) {
            val text = block.text
            // Use text as needed
        }
    }
    .addOnFailureListener { e ->
        // Handle error
    }


4. Display Results:

Use a TextView or overlay to display the recognized text on the UI.
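
For example, inside the success listener you could join the recognized blocks and show them (textView is a hypothetical view from your layout):

val recognizedText = visionText.textBlocks.joinToString("\n") { it.text }
textView.text = recognizedText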

Face Detection

1. Initialize the FaceDetector:

val faceDetector = FaceDetection.getClient(FaceDetectorOptions.Builder()
    .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_FAST)
    // Classification must be enabled for the eye-open/smiling probabilities used below
    .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
    .build())


2. Process an Image for Faces:

faceDetector.process(image)
    .addOnSuccessListener { faces ->
        for (face in faces) {
            val bounds = face.boundingBox
            val leftEyeOpenProb = face.leftEyeOpenProbability
            // Draw bounding boxes or overlay effects
        }
    }
    .addOnFailureListener { e ->
        // Handle error
    }

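With classification enabled in the options above, you can also read expression probabilities, for example to react when the user smiles:

val smileProb = face.smilingProbability
if (smileProb != null && smileProb > 0.8f) {
    // Likely smiling; trigger a capture or an AR effect
}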

Image Labeling

1. Initialize the ImageLabeler:

val labeler = ImageLabeling.getClient(ImageLabelerOptions.DEFAULT_OPTIONS)

2. Process an Image:

labeler.process(image)
    .addOnSuccessListener { labels ->
        for (label in labels) {
            val text = label.text
            val confidence = label.confidence
            // Display or categorize labels
        }
    }
    .addOnFailureListener { e ->
        // Handle error
    }

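The default options filter out low-confidence labels; you can tune the threshold yourself via ImageLabelerOptions. A sketch:

// Only return labels the model is at least 70% confident about
val options = ImageLabelerOptions.Builder()
    .setConfidenceThreshold(0.7f)
    .build()
val tunedLabeler = ImageLabeling.getClient(options)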


5. Custom Models and Firebase Integration

If you have a specific model that ML Kit doesn’t offer, you can use custom TensorFlow Lite models with Firebase. This enables you to upload and manage your own models in Firebase.

Integrate a Custom Model:

1. Upload your model to the Firebase console (under Machine Learning -> Custom).

2. Download and Use the Model in Code:

val modelInterpreter = FirebaseModelInterpreter.getInstance(firebaseModelOptions)

// The legacy interpreter exposes run(), which takes the wrapped model inputs
// plus a description of the input/output tensors (see the sketch below)
modelInterpreter?.run(modelInputs, inputOutputOptions)
    ?.addOnSuccessListener { result ->
        // Process the results from the custom model
    }
    ?.addOnFailureListener { e ->
        // Handle error
    }

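For context, here’s a minimal sketch of how those helper objects are typically built with the legacy firebase-ml-model-interpreter artifact; the model name, tensor shapes, and inputArray are placeholders for your own model. (Firebase has since superseded this API with the firebase-ml-modeldownloader library plus the TensorFlow Lite interpreter.)

// Reference the model uploaded to the Firebase console (name is a placeholder)
val remoteModel = FirebaseCustomRemoteModel.Builder("your_model_name").build()

// Download it (here: only over Wi-Fi) before first use
val conditions = FirebaseModelDownloadConditions.Builder().requireWifi().build()
FirebaseModelManager.getInstance().download(remoteModel, conditions)

// Build the interpreter options from the remote model
val firebaseModelOptions = FirebaseModelInterpreterOptions.Builder(remoteModel).build()

// Describe the model’s tensors; the shapes below are example values
val inputOutputOptions = FirebaseModelInputOutputOptions.Builder()
    .setInputFormat(0, FirebaseModelDataType.FLOAT32, intArrayOf(1, 224, 224, 3))
    .setOutputFormat(0, FirebaseModelDataType.FLOAT32, intArrayOf(1, 5))
    .build()

// Wrap your preprocessed input (e.g. a normalized float array) for run()
val modelInputs = FirebaseModelInputs.Builder().add(inputArray).build()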

6. Best Practices and Performance Tips

1. Keep Work Off the Main Thread: ML Kit’s process() calls are asynchronous, but image pre-processing and result handling can still block the UI; run them on a background thread (see the coroutine sketch after this list).

2. Optimize Model Size: Smaller models ensure faster and smoother performance, especially for on-device processing.

3. Minimize Camera Access: Release camera resources when not actively using ML Kit to save battery.

4. Use Cloud APIs Wisely: For highly accurate results or complex models, cloud-based APIs are great, but be mindful of data costs.
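
As an illustration of the first tip, ML Kit’s Task results can be awaited off the main thread with the kotlinx-coroutines-play-services artifact (a sketch, assuming that dependency and its await() extension):

// Suspend until recognition completes; post-processing stays off the main thread
suspend fun recognizeText(image: InputImage): Text =
    withContext(Dispatchers.Default) {
        TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)
            .process(image)
            .await()
    }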

7. Conclusion

ML Kit is a powerful and accessible tool for Android developers to integrate AI features directly into their apps. With this guide, you should now be equipped to add text recognition, face detection, image labeling, and even custom machine learning models to your applications. AI-powered apps are not only more interactive but can provide unique value that sets them apart from the competition.

Stay Connected for More!

Thank you for reading! If you found this guide helpful, I’d love for you to follow me here on Medium. I regularly share tips, deep dives, and tutorials on Android development, AI integration, and the latest tools in the mobile development space. Following me ensures you won’t miss out on the latest in Android tech, coding best practices, and everything you need to build apps.

Let’s keep learning and building amazing things together — see you in the next post!

This article was previously published on proandroiddev.com.
