Tile search & area summary
The Tile Search widget does AI-assisted imagery
analysis end to end: find relevant satellite tiles by meaning or area, then
summarize a selection with a vision model — all in one widget. It talks to the
fusion-analyst api backend.
The flow
- Search. Tile Search sends your text and/or drawn extent to POST /api/embeddings/search. The backend embeds the query with the CLIP text encoder and ranks tiles by cosine similarity against their precomputed image vectors (and/or filters spatially).
- Review. Results render as image chips (fetched through the GET /api/embeddings/tile-image proxy) and as extent graphics on the source Map.
- Select & summarize. Selecting tiles and pressing Summarize runs the area_summary workflow over the selection. The widget shows a single processing-status line, then opens the finished AOI summary in a modal.
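The ranking in the Search step can be illustrated with a small sketch. This is not the backend's code: the function names, the toy 3-dimensional vectors, and the `top_k` parameter are assumptions made for illustration, and real CLIP embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_tiles(query_vector, tile_vectors, top_k=10):
    """Return (tile_id, score) pairs, most similar first (hypothetical helper)."""
    scored = [
        (tile_id, cosine_similarity(query_vector, vector))
        for tile_id, vector in tile_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for precomputed image vectors and an embedded text query.
tiles = {
    "tile-a": [1.0, 0.0, 0.0],
    "tile-b": [0.6, 0.8, 0.0],
    "tile-c": [0.0, 0.0, 1.0],
}
ranked = rank_tiles([1.0, 0.0, 0.0], tiles, top_k=2)
```

Here `tile-a` ranks first because it points in the same direction as the query, and `tile-c` is dropped by `top_k` because it is orthogonal to it.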
Setting up
- Add a Map and a Tile Search widget to the same view.
- In the Tile Search config drawer's Data section, pick the Map as the source and choose an embedding dataset.
- Optionally set an AOI summary focus to steer the vision summary.
Backend
The embeddings API lives under /api/embeddings:
| Endpoint | Purpose |
|---|---|
| GET /api/embeddings/datasets | List searchable embedding datasets. |
| POST /api/embeddings/search | Run a query / extent / combined search. |
| GET /api/embeddings/tile-image | Proxy one tile chip as a PNG. |
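As a rough sketch of the three search modes (query, extent, or combined), a client might assemble the request body as follows. The field names `dataset`, `query`, `extent`, and `limit` are hypothetical; this page does not document the request schema, so check the backend before relying on them.

```python
import json


def build_search_body(dataset, text=None, extent=None, limit=20):
    """Build a body for POST /api/embeddings/search (field names are assumptions)."""
    if text is None and extent is None:
        raise ValueError("provide a text query, an extent, or both")
    body = {"dataset": dataset, "limit": limit}
    if text is not None:
        body["query"] = text  # semantic search by meaning
    if extent is not None:
        body["extent"] = extent  # spatial filter from the drawn extent
    return body


# Combined search: text plus a drawn extent (coordinates are made up).
combined = build_search_body(
    "demo-dataset",
    text="flooded farmland",
    extent={"xmin": -105.3, "ymin": 39.9, "xmax": -105.1, "ymax": 40.1},
)
payload = json.dumps(combined)
```

Omitting `text` would give an extent-only search, and omitting `extent` a query-only search, under the same assumed schema.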
The area_summary workflow is selected by workflow: "area_summary" on
POST /api/runs. It reads the selected tiles from
client_context.selected_tiles, runs one vision call per tile (capped at 24),
and streams the synthesized summary over the run's SSE stream using the same
normalized event taxonomy as the chat analyst.
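A minimal sketch of the client side of this exchange, under stated assumptions: the `workflow` and `client_context.selected_tiles` keys come from the description above, while the shape of each tile entry, the choice to truncate the selection to the first 24 tiles, and the `status` event name in the sample are illustrative guesses. The parser handles only the generic `event:` and `data:` fields of the SSE format, not the backend's actual event taxonomy.

```python
MAX_TILES = 24  # cap on per-tile vision calls mentioned above


def build_run_payload(selected_tiles):
    """Body for POST /api/runs; dropping extras client-side is an assumption."""
    return {
        "workflow": "area_summary",
        "client_context": {"selected_tiles": list(selected_tiles)[:MAX_TILES]},
    }


def iter_sse_events(lines):
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; skip events with no data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)


payload = build_run_payload([{"id": f"tile-{i}"} for i in range(30)])

sample_stream = [
    "event: status\n",
    'data: {"message": "processing"}\n',
    "\n",
    "data: line one\n",
    "data: line two\n",
    "\n",
]
events = list(iter_sse_events(sample_stream))
```

In a real client the lines would come from the run's streaming HTTP response rather than a list, and each `data` value would be decoded according to the event type.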
:::note CLIP text encoder
Query embedding uses the CLIP text encoder (torch + transformers) because the precomputed tile vectors are CLIP image embeddings. Azure OpenAI text embeddings live in a different vector space and can't be compared against them. Azure OpenAI is still used for the vision area summary.
:::
Authentication
Both the search and summarize steps forward the signed-in user's ArcGIS token as an Authorization
bearer token. The backend uses it to authorize secured imagery services when exporting
tile chips, both for the search-result chips and for the per-tile vision calls.
Tile chips are fetched as blobs with the header and shown via object URLs, so the
token never appears in an image URL. See
Authentication.
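The header-based pattern can be sketched outside the browser as well. The widget itself uses blobs and object URLs; the Python below only shows the equivalent request shape, with the token in the Authorization header and nothing sensitive in the URL. The host, the `dataset` and `tile_id` query parameters, and the helper name are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_tile_image_request(base_url, params, token):
    """Prepare (but do not send) a GET for one tile chip."""
    url = f"{base_url}/api/embeddings/tile-image?{urlencode(params)}"
    # The token travels only in the header, never in the query string.
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


request = build_tile_image_request(
    "https://example.invalid",  # placeholder host
    {"dataset": "demo-dataset", "tile_id": "tile-a"},  # assumed parameter names
    "FAKE_TOKEN",
)
```

Passing the prepared request to `urllib.request.urlopen` would perform the fetch. Keeping the token out of the URL means it does not end up in image `src` attributes, browser history, or proxy logs that record URLs.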