Tile search & area summary

The Tile Search widget handles AI-assisted imagery analysis end to end in a single widget: find relevant satellite tiles by meaning or by area, then summarize a selection with a vision model. It talks to the fusion-analyst API backend.

The flow

  1. Search. Tile Search sends your text, your drawn extent, or both to POST /api/embeddings/search. For a text query, the backend embeds the text with the CLIP text encoder and ranks tiles by cosine similarity against their precomputed image vectors. For an extent, it filters tiles spatially. A combined search applies both.
  2. Review. Results render as image chips (fetched through the GET /api/embeddings/tile-image proxy) and as extent graphics on the source Map.
  3. Select & summarize. Selecting tiles and pressing Summarize runs the area_summary workflow over the selection. The widget shows a single processing-status line, then opens the finished AOI summary in a modal.
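
The ranking in step 1 can be illustrated with a minimal sketch. It assumes the query and tile embeddings are already available as plain lists of floats; the function names and the toy three-dimensional vectors are hypothetical and only demonstrate cosine-similarity ranking, not the backend's actual implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_tiles(query_vector, tile_vectors, limit=10):
    """Return (tile_id, score) pairs, best match first."""
    scored = [
        (tile_id, cosine_similarity(query_vector, vector))
        for tile_id, vector in tile_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy vectors for illustration; real CLIP embeddings are much longer.
tiles = {
    "tile-a": [1.0, 0.0, 0.0],
    "tile-b": [0.0, 1.0, 0.0],
    "tile-c": [0.7, 0.7, 0.0],
}
ranked = rank_tiles([1.0, 0.1, 0.0], tiles, limit=2)
print([tile_id for tile_id, _ in ranked])  # ['tile-a', 'tile-c']
```

Because cosine similarity ignores vector length, only the direction of each embedding matters, which is why the query and the tiles must be embedded into the same vector space.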

Setting up

  1. Add a Map and a Tile Search widget to the same view.
  2. In the Tile Search config drawer's Data section, pick the Map as the source and choose an embedding dataset.
  3. Optionally set an AOI summary focus to steer the vision summary.

Backend

The embeddings API lives under /api/embeddings:

| Endpoint | Purpose |
| --- | --- |
| GET /api/embeddings/datasets | List searchable embedding datasets. |
| POST /api/embeddings/search | Run a query / extent / combined search. |
| GET /api/embeddings/tile-image | Proxy one tile chip as a PNG. |
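
As a rough illustration of the three search modes, the sketch below builds a JSON body for POST /api/embeddings/search. The request schema is not documented on this page, so every field name here (`dataset`, `query`, `extent`, `limit`) and the dataset name are assumptions; check the backend's actual schema before relying on them.

```python
import json


def build_search_body(dataset, text=None, extent=None, limit=20):
    """Build a search body for a query, an extent, or a combined search.

    All field names are hypothetical; the real schema may differ.
    """
    if not text and extent is None:
        raise ValueError("provide a text query, an extent, or both")
    body = {"dataset": dataset, "limit": limit}
    if text:
        body["query"] = text
    if extent is not None:
        xmin, ymin, xmax, ymax = extent
        if xmin >= xmax or ymin >= ymax:
            raise ValueError("extent must be (xmin, ymin, xmax, ymax) with min < max")
        body["extent"] = {"xmin": xmin, "ymin": ymin, "xmax": xmax, "ymax": ymax}
    return body


# Combined search: a text query restricted to a drawn extent.
combined = build_search_body(
    "demo-dataset", text="flooded farmland", extent=(-1.0, 50.0, 1.0, 52.0)
)
print(json.dumps(combined, indent=2))
```

Omitting `text` gives an extent-only search and omitting `extent` gives a text-only search, mirroring the "text and/or drawn extent" behavior described in the flow.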

The area_summary workflow is selected by workflow: "area_summary" on POST /api/runs. It reads the selected tiles from client_context.selected_tiles, runs one vision call per tile (capped at 24), and streams the synthesized summary over the run's SSE stream using the same normalized event taxonomy as the chat analyst.
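
A client-side sketch of this exchange might look like the following. The `workflow` value and the `client_context.selected_tiles` path come from the description above; the shape of each tile entry, the `summary_focus` key, and the sample event names are assumptions for illustration. The page does not say how the backend treats selections above the cap, so the helper only reports the excess.

```python
MAX_TILES = 24  # per-run vision-call cap stated above


def build_area_summary_run(selected_tiles, focus=None):
    """Build a POST /api/runs body that selects the area_summary workflow."""
    if not selected_tiles:
        raise ValueError("select at least one tile")
    payload = {
        "workflow": "area_summary",
        "client_context": {"selected_tiles": list(selected_tiles)},
    }
    if focus:
        payload["client_context"]["summary_focus"] = focus  # hypothetical key
    return payload


def tiles_over_cap(selected_tiles):
    """Number of selected tiles beyond the documented cap (0 if within it)."""
    return max(0, len(selected_tiles) - MAX_TILES)


def _field_value(line, name):
    """Text after 'name:', minus one optional leading space (SSE convention)."""
    value = line[len(name) + 1:]
    return value[1:] if value.startswith(" ") else value


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line ends the current event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):  # comment / keep-alive
            continue
        elif line.startswith("event:"):
            event = _field_value(line, "event")
        elif line.startswith("data:"):
            data.append(_field_value(line, "data"))


# Event names below are made up; the real taxonomy is defined by the backend.
stream = [
    "event: status\n", "data: processing\n", "\n",
    ": keep-alive\n",
    "event: summary\n", "data: line one\n", "data: line two\n", "\n",
]
events = list(parse_sse(stream))
print(events)
```

The parser follows the generic Server-Sent Events framing (fields separated by newlines, events separated by blank lines), so it should apply regardless of which event names the backend emits.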

:::note CLIP text encoder
Query embedding uses the CLIP text encoder (torch + transformers) because the precomputed tile vectors are CLIP image embeddings. Azure OpenAI text embeddings live in a different vector space and can't be compared against them. Azure OpenAI is still used for the vision area summary.
:::

Authentication

Both the search and the summarize steps forward the signed-in user's ArcGIS token as a bearer token in the Authorization header. The backend uses it to authorize secured imagery services when exporting tile chips, both for the search-result chips and for the per-tile vision calls. The widget fetches tile chips as blobs with that header and displays them via object URLs, so the token never appears in an image URL. See Authentication.
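
The header-based approach can be sketched outside the widget as well. The example below builds (but does not send) a request for one tile chip using Python's standard library; the `tile_id` query parameter, the base URL, and the token value are placeholders, since the proxy's actual parameters are not listed here.

```python
import urllib.request
from urllib.parse import urlencode


def build_tile_image_request(base_url, tile_id, token):
    """Build a GET request for one tile chip with the token in a header."""
    query = urlencode({"tile_id": tile_id})  # hypothetical parameter name
    return urllib.request.Request(
        f"{base_url}/api/embeddings/tile-image?{query}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


request = build_tile_image_request("https://example.invalid", "tile-a", "secret-token")
print(request.full_url)
print("secret-token" in request.full_url)  # False: the token stays out of the URL
```

Keeping the token in a header rather than a query string means it does not end up in image URLs, browser history, or referrer headers.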