Roadmap
EarthKit Agent is under active development. The following is a list of features and enhancements we plan to ship next. We’d love your input on what you’d like to see! Please reach out on Discord if you’d like to chat.
More Modalities & Social Media Import
- Geolocation tasks consisting of multiple images
- Video uploads
- Automatically importing Tweets, Telegram posts, and other social media sources, likely via Bellingcat’s auto-archiver
Geo-Calibration
Given an initial rough geolocation produced by EarthKit Agent, use a mix of OrienterNet and Sample4Geo to refine the estimate against the geographic features in the proximity of the point of interest.
Expand the set of tools available to Agents
- Overpass Turbo querying (similar to the one on EarthKit)
- Call ML models: EigenPlaces and Sample4Geo
- Python code execution for calculations
- MultiOn Web Agent
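As an illustration of the Overpass item above, here is a minimal sketch of how a tool might assemble an Overpass QL query for features near a candidate point. The function name `build_overpass_query` and its parameters are hypothetical and are not part of EarthKit Agent; the sketch only builds the query string and does not send it anywhere.

```python
def build_overpass_query(lat: float, lon: float, radius_m: int,
                         key: str, value: str, timeout_s: int = 25) -> str:
    """Build an Overpass QL query for nodes and ways tagged key=value
    within radius_m metres of (lat, lon).

    Hypothetical helper for illustration only; not EarthKit Agent's API.
    """
    # Overpass QL spatial filter: (around:radius,lat,lon)
    around = f"(around:{radius_m},{lat},{lon})"
    # Tag filter: ["key"="value"]
    selector = f'["{key}"="{value}"]'
    return (
        f"[out:json][timeout:{timeout_s}];"
        f"(node{selector}{around};way{selector}{around};);"
        "out center;"
    )


if __name__ == "__main__":
    # Example: cafes within 500 m of a point in San Francisco.
    print(build_overpass_query(37.7749, -122.4194, 500, "amenity", "cafe"))
```

The resulting string can be pasted into Overpass Turbo or sent to an Overpass API endpoint; how the agent would actually execute queries is left open here.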
Improvements to Geoverification
- Support streaming for report generation
- Add more content types (e.g. showing Google Street View embeds)
- Bounding box annotations of images
- Agentic Geoverification (currently a deterministic flow that uses LMMs as judge and report generator)
Human-in-the-loop
- Ability to interrupt the agent workflow, change its behavior, and add additional instructions
- Ability for the agent to generate intermediate results / carry out sub-tasks in a geolocation project
Tree-based Agent workflow for complex tasks
Imagine a scenario in which an image could have been taken in San Francisco, Tokyo, or New York. Instead of pursuing all the possibilities in a single line of reasoning, the agent branches out into three separate investigations, collecting evidence for each location. A tree-based approach lets the agent explore the three branches in parallel before merging the results back together. This will also enable longer chains of thought and more complex investigations. Reference: Language Agent Tree Search and MultiOn’s Q. Additionally, check out this video for an experimental demo.
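The branch-and-merge idea can be sketched in a few lines of Python. This is a toy, one-level fan-out using `asyncio`, not Language Agent Tree Search and not the planned implementation; `Branch`, `investigate`, `explore`, and the hard-coded scores are all hypothetical placeholders.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Branch:
    """One hypothesis in the investigation tree (hypothetical structure)."""
    location: str
    evidence: list[str] = field(default_factory=list)
    score: float = 0.0


async def investigate(location: str, clues: dict[str, float]) -> Branch:
    """Placeholder for a single sub-investigation.

    A real agent would call tools here (map queries, ML models, web search).
    In this sketch the "evidence" is just a lookup in a hard-coded table.
    """
    await asyncio.sleep(0)  # yield control, standing in for tool latency
    score = clues.get(location, 0.0)
    return Branch(location, [f"toy score for {location}: {score}"], score)


async def explore(candidates: list[str], clues: dict[str, float]) -> list[Branch]:
    """Start one investigation per candidate, run them concurrently,
    then merge the results into a single ranking (best first)."""
    branches = await asyncio.gather(*(investigate(c, clues) for c in candidates))
    return sorted(branches, key=lambda b: b.score, reverse=True)


if __name__ == "__main__":
    toy_clues = {"San Francisco": 0.2, "Tokyo": 0.7, "New York": 0.1}
    ranked = asyncio.run(explore(["San Francisco", "Tokyo", "New York"], toy_clues))
    for branch in ranked:
        print(branch.location, branch.score)
```

A deeper tree would let each branch spawn its own child investigations and prune weak ones before merging; the sketch keeps a single level to show only the fan-out and merge steps.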