Roadmap

EarthKit Agent is under active development. The following is a list of features and enhancements we plan to ship next. We’d love your input on what you’d like to see! Please reach out on Discord if you’d like to chat.

More Modalities & Social Media Import

Geolocation tasks that span multiple images
Video uploads
Automatically importing Tweets, Telegram posts, and other social media sources, likely via Bellingcat’s auto-archiver

Geo-Calibration

Given an initial rough geolocation produced by EarthKit Agent, use a mix of OrienterNet and Sample4Geo to refine the estimate against the geographical features in the proximity of the point of interest.

Expand the set of tools available to Agents

Overpass Turbo querying (similar to the one on EarthKit)
Call ML models: EigenPlaces and Sample4Geo
Python code execution for calculations
MultiOn Web Agent
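
As an illustration of what an Overpass querying tool might hand to the agent, here is a minimal sketch that builds an Overpass QL query for features near a candidate point. The function name `build_overpass_query` and its parameters are hypothetical and not part of EarthKit Agent; the sketch only constructs the query string and does not call any server.

```python
def build_overpass_query(lat, lon, radius_m, tags, timeout_s=25):
    """Build an Overpass QL query for nodes and ways near a point.

    Hypothetical helper for illustration only.
    `tags` maps OSM keys to values, e.g. {"amenity": "cafe"}.
    """
    # One ["key"="value"] filter per requested tag.
    tag_filter = "".join(f'["{key}"="{value}"]' for key, value in tags.items())
    # Restrict matches to a radius (in metres) around the candidate point.
    around = f"(around:{radius_m},{lat},{lon})"
    return (
        f"[out:json][timeout:{timeout_s}];"
        f"(node{tag_filter}{around};way{tag_filter}{around};);"
        "out center;"
    )


# Example: cafes within 300 m of a candidate location.
query = build_overpass_query(37.7955, -122.3937, 300, {"amenity": "cafe"})
print(query)
```

The resulting string can be pasted into Overpass Turbo or sent to an Overpass API endpoint; how the agent would actually dispatch it is left open here.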

Improvements to Geoverification

Support streaming for report generation
Add more content types (e.g., Google Street View embeds)
Bounding box annotations of images
Agentic Geoverification (the current flow is deterministic and uses LMMs as judge and report generator)

Human-in-the-loop

Ability to interrupt the agent workflow, change its behavior, and add additional instructions
Ability for the agent to generate intermediate results / carry out sub-tasks in a geolocation project

Tree-based Agent workflow for complex tasks

Imagine a scenario in which the image could have been taken in San Francisco, Tokyo, or New York. Instead of weighing all three possibilities in a single line of investigation, the agent branches into three separate investigations and collects evidence for each location. A tree-based approach lets the agent explore the three branches in parallel before merging the results back together. This will also enable longer chains of thought and more complex investigations. Reference: Language Agent Tree Search and MultiOn’s Q. Additionally, check out this video for an experimental demo.
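
The branch-and-merge idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not the planned implementation: `KNOWN_FEATURES`, `investigate`, and `explore` are hypothetical names, the "evidence" is a hard-coded lookup rather than real agent tool calls, and the sketch covers only one level of branching (it is not Language Agent Tree Search, which adds search over deeper trees).

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

# Hypothetical stand-in for evidence an agent would gather with its tools.
KNOWN_FEATURES = {
    "San Francisco": {"cable car", "steep hills", "english signage"},
    "Tokyo": {"vending machines", "japanese signage", "left-hand traffic"},
    "New York": {"yellow taxi", "english signage", "fire escapes"},
}


@dataclass
class Branch:
    hypothesis: str
    evidence: list = field(default_factory=list)
    score: float = 0.0


def investigate(hypothesis, clues):
    """Run one branch: collect the clues that support this hypothesis."""
    matched = sorted(clues & KNOWN_FEATURES[hypothesis])
    return Branch(hypothesis, matched, len(matched) / len(clues))


def explore(hypotheses, clues):
    """Investigate every hypothesis in parallel, then merge by score."""
    with ThreadPoolExecutor() as pool:
        branches = list(pool.map(lambda h: investigate(h, clues), hypotheses))
    return sorted(branches, key=lambda b: b.score, reverse=True)


clues = {"cable car", "steep hills", "english signage"}
ranked = explore(["San Francisco", "Tokyo", "New York"], clues)
for branch in ranked:
    print(f"{branch.hypothesis}: {branch.score:.2f} {branch.evidence}")
```

In a real workflow each branch would be a full sub-investigation with its own tool calls, and the merge step could prune weak branches or expand promising ones further.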

Usage and Credits