This is Part 4 of a series of articles where I explain how to implement GenAI on Android. [Click here to view the full series.]

Gen-AI also comes with computer vision

This is the coolest bit of SmartWriter so far: pick a photo and the app describes what it sees — entirely on device, no cloud. On my Galaxy S25 Ultra it’s quick too: typically ~1–3 seconds per image after the first model download.

🔗 Full project (with Compose UI): https://github.com/josegbel/smart-writer

💡 What you can build with this
  • Accessibility / alt‑text: auto‑generate descriptive text for images.
  • Smart gallery captions: save human‑like captions with photos.
  • Notes with pictures: drop a photo into a note and get a first‑draft description.
  • Private visual search: tag/cluster images locally for personal search.
  • Social posting helpers: suggest captions users can tweak.

All of this runs locally, so it’s private, fast, and works offline once the model is installed.

⚙️ Setup

Add the dependency to your Version Catalog:

mlkit-genai-image-description = "com.google.mlkit:genai-image-description:1.0.0-beta1"

Then reference it from your module:

dependencies {
    implementation(libs.mlkit.genai.image.description)
}

⚠️ You’ll need a supported device (e.g., Galaxy S25 Ultra, Pixel 9+, …). Emulators don’t run these GenAI models.

🧠 ViewModel — how it works

Below are the important pieces of my ImageDescViewModel and what each one does. (This is the exact implementation used in the app; I’m just showing the key sections here. The full source is in the repo.)

1) User picks an image, then we call the API

We store the selected Uri, create the on‑device client and hand off to the feature‑status flow:

fun onImageSelected(uri: Uri) {
    _uiState.update { it.copy(imageUri = uri) }
}

fun describe(context: Context) {
    _uiState.update { it.copy(isLoading = true) }
    viewModelScope.launch {
        try {
            val options = ImageDescriberOptions.builder(context).build()
            imageDescriber = ImageDescription.getClient(options)
            prepareAndStartImageDesc(context)
        } catch (e: Exception) {
            _uiEvent.emit(ImageDescUiEvent.Error("Error: ${e.message}"))
        }
    }
}
2) Check model availability and handle download

On first run the model may need to be downloaded. We check FeatureStatus and react:

suspend fun prepareAndStartImageDesc(context: Context) {
    val featureStatus = imageDescriber?.checkFeatureStatus()?.await()

    when (featureStatus) {
        FeatureStatus.DOWNLOADABLE -> downloadFeature(context)
        FeatureStatus.DOWNLOADING -> {
            imageDescriber?.let { desc ->
                uiState.value.imageUri?.let { uri ->
                    startImageDescRequest(uri, context, desc)
                }
            }
        }
        FeatureStatus.AVAILABLE -> {
            _uiState.update { it.copy(isLoading = true) }
            imageDescriber?.let { desc ->
                uiState.value.imageUri?.let { uri ->
                    startImageDescRequest(uri, context, desc)
                }
            }
        }
        FeatureStatus.UNAVAILABLE, null -> {
            _uiEvent.emit(
                ImageDescUiEvent.Error("Your device does not support this feature.")
            )
        }
    }
}
3) Download callbacks (first‑time only)

We show progress and immediately run inference once the model is ready:

private fun downloadFeature(context: Context) {
    imageDescriber?.downloadFeature(object : DownloadCallback {
        override fun onDownloadStarted(bytesToDownload: Long) {
            _uiState.update { it.copy(isLoading = true) }
        }
        override fun onDownloadProgress(totalBytesDownloaded: Long) {
            _uiState.update { it.copy(isLoading = true) }
        }
        override fun onDownloadCompleted() {
            _uiState.update { it.copy(isLoading = false) }
            imageDescriber?.let { desc ->
                uiState.value.imageUri?.let { uri ->
                    startImageDescRequest(uri, context, desc)
                }
            }
        }
        override fun onDownloadFailed(e: GenAiException) {
            _uiState.update { it.copy(isLoading = false) }
            _uiEvent.tryEmit(
                ImageDescUiEvent.Error("Download failed: ${e.message}")
            )
        }
    })
}
4) Run inference (decode → request → await)

Decode the Uri to a Bitmap, wrap it in a request, then await the natural‑language description:

fun startImageDescRequest(
    uri: Uri,
    context: Context,
    imageDescriber: ImageDescriber,
) {
    val bitmap = ImageDecoder.decodeBitmap(
        ImageDecoder.createSource(context.contentResolver, uri)
    )
    val request = ImageDescriptionRequest.builder(bitmap).build()
    _uiState.update { it.copy(isLoading = true) }
    viewModelScope.launch {
        try {
            val description = imageDescriber.runInference(request).await().description
            _uiState.update { it.copy(description = description) }
        } catch (e: Exception) {
            _uiEvent.emit(
                ImageDescUiEvent.Error("Error describing the image: ${e.message}")
            )
        } finally {
            _uiState.update { it.copy(isLoading = false) }
        }
    }
}

💡 Tip: Very large images can be memory‑heavy. Consider down‑scaling before building the request if you hit OOMs.
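That down-scaling step could look like this. The 1024-px cap, the `targetSize` helper, and its defaults are my own illustrative choices, not part of the app; the Android-specific scaling call is sketched in a comment:

```kotlin
// Compute a target size whose longest edge is at most maxDim,
// preserving aspect ratio. Pure logic, so it's easy to unit-test.
fun targetSize(width: Int, height: Int, maxDim: Int = 1024): Pair<Int, Int> {
    if (width <= maxDim && height <= maxDim) return width to height
    val scale = maxDim.toDouble() / maxOf(width, height)
    return (width * scale).toInt() to (height * scale).toInt()
}

// On Android you would then scale before building the request, e.g.:
// val (w, h) = targetSize(bitmap.width, bitmap.height)
// val scaled = Bitmap.createScaledBitmap(bitmap, w, h, /* filter = */ true)
// val request = ImageDescriptionRequest.builder(scaled).build()
```

Keeping the size math in a pure function means you can test it without a device or a `Bitmap` in hand.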


🗂️ Exposing data with UiState

Your ImageDescUiState carries:

  • imageUri — the user’s chosen image
  • description — the generated caption / alt‑text
  • isLoading — drives the progress indicator

Transient errors go through SharedFlow<ImageDescUiEvent> so you can show a Snackbar/toast without polluting state.
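A minimal sketch of that state holder, with the field names from the list above. `Uri` is aliased to `String` here only so the snippet compiles off-device; in the app it is `android.net.Uri`:

```kotlin
// Stand-in for android.net.Uri so this sketch is self-contained.
typealias Uri = String

// Immutable UI state: the screen re-renders whenever a copy() lands in the StateFlow.
data class ImageDescUiState(
    val imageUri: Uri? = null,       // the user's chosen image
    val description: String? = null, // the generated caption / alt-text
    val isLoading: Boolean = false,  // drives the progress indicator
)

// One-shot events (errors) stay out of the state so they aren't replayed on recomposition.
sealed interface ImageDescUiEvent {
    data class Error(val message: String) : ImageDescUiEvent
}
```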

⚡ Latency (real‑world)

On a Galaxy S25 Ultra, I’m seeing ~1–3s per image after the first run. Once the model is on device, the feature works offline.

✅ Recap
  • Fully on‑device image descriptions with ML Kit GenAI.
  • Minimal code if you’ve already implemented the other three features — the feature‑status/download pattern is the same.
  • Great for accessibility, captions, and private photo workflows.
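That shared feature-status/download pattern boils down to one small decision function. The enum and `Action` names below are illustrative stand-ins, not the real ML Kit types:

```kotlin
// Mirrors ML Kit's FeatureStatus values (illustrative, not the real enum).
enum class FeatureStatus { DOWNLOADABLE, DOWNLOADING, AVAILABLE, UNAVAILABLE }

sealed interface Action {
    object StartDownload : Action   // kick off downloadFeature()
    object RunInference : Action    // requests queue while DOWNLOADING, so this is safe
    object ShowUnsupported : Action // surface an error to the user
}

// The same branching every GenAI feature in this series uses.
fun nextAction(status: FeatureStatus?): Action = when (status) {
    FeatureStatus.DOWNLOADABLE -> Action.StartDownload
    FeatureStatus.DOWNLOADING,
    FeatureStatus.AVAILABLE -> Action.RunInference
    FeatureStatus.UNAVAILABLE, null -> Action.ShowUnsupported
}
```

Factoring the branching out like this also makes it trivial to reuse (and test) across the four features.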
🎉 Thanks for reading!

That’s the end of the SmartWriter series — I hope you found it useful (and a bit fun). If you enjoyed this, follow me on Medium and hit Subscribe so you don’t miss future Android + Kotlin experiments. I’m planning more hands-on pieces soon.

If you want to try everything yourself, the code’s in the repo — and you can read the other parts below:

Got suggestions or questions? Drop a comment or ping me — I’d love to hear how you’re using ML Kit GenAI in your apps. 🚀

This article was previously published on proandroiddev.com.
