- OlmoEarth is a platform that integrates multiple AI models to extract meaningful insights from environmental data.
- The platform, developed by the nonprofit Allen Institute for AI, uses models trained on approximately 10 terabytes of Earth observation data.
- It enables researchers and conservation organizations to analyze massive data sets by customizing its AI models with their own data.
Environmental data-gathering technology has proliferated in recent years. But how do you derive meaningful insights from myriad data sources?
A new AI-powered platform aims to solve this problem.
OlmoEarth, developed by the nonprofit Allen Institute for AI (Ai2), is a platform that integrates multiple artificial intelligence models trained on approximately 10 terabytes of Earth observation data. The open-source platform, launched in November, helps extract actionable insights from satellite and sensor data. It allows researchers and organizations to use their own data to customize a foundation model and use it to monitor trends such as forest loss or mangrove health without having to build models from scratch.
“It’s intended to democratize access to this kind of technology in a no-code kind of way,” Patrick Beukema, the OlmoEarth lead at Ai2, told Mongabay in a video interview.
The motivation behind building the platform was to drastically reduce the time scientists spend sifting through enormous volumes of data to extract meaningful information.
“What we set out to do was to flip that on its head and really go from them spending months to literally days to get the same sort of information,” Ted Schmitt, senior director of conservation at Ai2, told Mongabay in a video interview.

Beukema and Schmitt spoke with Mongabay’s Abhishyant Kidangoor about the journey to building OlmoEarth, the gaps they want to overcome, and the need for the platform at this point in time. The following interview has been lightly edited for length and clarity.
Mongabay: How would you describe OlmoEarth to someone who knows nothing about it?
Patrick Beukema: OlmoEarth is an artificial intelligence platform for understanding what’s happening on Earth in real time across varied spatial and temporal contexts. It’s intended to democratize access to this kind of technology in a no-code kind of way. Historically, there’s been recognition that there’s a lot of value in Earth data, but it has been very difficult to unlock. Our team built this platform to enable widespread access and really democratize access to this kind of intelligence for all kinds of organizations.
Ted Schmitt: We have heard that there is so much potential for AI in the environment and conservation community. And yet, what we also heard is that realizing that potential was difficult. We asked a lot of questions as to why that was the case and we ended up where we did with this platform. We knew we could build great AI, and we wanted to provide subject matter experts and communities that have the expertise on the ground with the tools to engage with AI without having to learn AI deeply.
Mongabay: What was the motivation behind building this platform? What was that initial spark?
Ted Schmitt: One of our early partners is the Global Mangrove Alliance and the folks at Wetlands International who run Global Mangrove Watch. They often described to us how they spend literally months wading through data with 80-90% of their time doing data wrangling, and 10-15% of their time actually being ecologists. What we set out to do was to flip that on its head and really go from them spending months to literally days to get the same sort of information. We’ve done that also with the Group on Earth Observations with deforestation and forest loss data as well. It’s about giving them more time to be the experts while leveraging all the AI and data capabilities.
Patrick Beukema: The data and engineering here is actually quite challenging. It’s unlike, say, natural language processing or conventional computer vision where you can input a string of text and get an output on whatever insights you’re looking for. Here, we really need to assimilate very large quantities of multimodal data like radar and optical data and all the other kinds of data at play that might go into how we would understand an ecosystem like where mangroves are or where deforestation is occurring in the Amazon. You need so many different data sources to really do a great job at answering these kinds of questions and that’s extremely difficult to do from an engineering and artificial intelligence perspective.
There’s a need for models that can make sense of all this data. But then there’s also a need for everything else to operationalize these models to make them relevant for organizations that are reliant on this technology. Our platform approach is designed to really close this entire gap. We’ve built everything that a user would need to go from answering their questions on where deforestation is happening in the Amazon or where we are seeing mangrove loss. Or maybe they just want to look at a map or want an alert on deforestation in a particular corner of the Amazon Rainforest. Our goal is to really make this very difficult, challenging, expensive, tedious, labor-intensive process just as simple as clicking a few buttons.
This way, they can concentrate on the things that they want to concentrate on and not worry about how to get information from all these different data sources and harmonize them, and then build some kind of operational application off of that.
We also saw a lot of folks reinventing the wheel and not sharing, or not being able to share, their technological innovations or models with other groups who were kind of mission-aligned and working on similar applications. We thought, “Hey, how about we just build the open infrastructure once, make it accessible, and then everyone can leverage this together.”
Ted Schmitt: We want to serve those folks, whether it’s Global South countries, Indigenous people and local communities, and the people who otherwise wouldn’t have access to this data or technology because it’s just out of reach.

Mongabay: What did the journey of building OlmoEarth look like?
Patrick Beukema: We started with users. Actually, we were approached by so many organizations because folks generally know we’re building AI in this space. So we got approached by hundreds of organizations asking us to build models for them. We couldn’t possibly do as much as we would like to. And so we started thinking about what are the common user problems that cut across all these different organizations. What do they need? What are they asking for? What are the pain points? How can we solve them in a way that can be leveraged simultaneously?
We built new foundation models for this platform that harmonize across a bunch of different data sets. We iterated based on users’ needs, in addition to, of course, academic benchmarks, which are important. We also needed to know we are building something that is scientifically grounded and valid. And so, we simultaneously built our model in such a way that it was very performant on academic research benchmarks, but also anchored in the real world.
Ted Schmitt: There’s a lot of people doing embeddings out there. But we felt like it was really important to go beyond embeddings and enable fine-tuning and allow an ecologist or a local community member to go in and tune our foundation model and deeply inform that model with their expertise.
Patrick Beukema: That was really the central idea of the platform. How do we make these bespoke models? How do we make it as easy as possible to get these very powerful foundational models but tuned for users’ needs and whatever their spatial and temporal interests are?
These organizations are very much experts in, say, mangroves. And so, they can say very precisely “here is where the mangroves are, here is where they were lost, or this is what happened to these mangroves.” They can put that knowledge into our platform in terms of those labels and then we can just enable the model to be fine-tuned.
Ted Schmitt: They’re the experts on presenting this data. We really don’t want to do that. We want to focus on what we’re best at, which is the AI capability.

Mongabay: What are the gaps in the platform that you are working to improve? What does the future look like for OlmoEarth?
Patrick Beukema: We’ve gotten massive interest since we launched. We have a huge number of organizations that we want to support and really make sure they’re successful with their missions. That’s what we’re trying to do, which is to really accelerate their mission. So we have a massive backlog right now that we’re working through.
Ted Schmitt: We have the potential to reach across different spatial scales and time frames: everything from operational managers who need short time frames, to land-use managers who are doing monitoring on a monthly or seasonal or annual basis, all the way out to policymakers looking at very large spaces or even the whole Earth in annual or decadal time frames. We want to fill those adoption gaps in the spatial and temporal sense.
Patrick Beukema: We want to help as many organizations as possible, and we think the existing models are doing a good job in some use cases, but not all use cases. We think that there are opportunities with additional modalities and additional data types.
For example, we think there’s a lot of potential in incorporating weather variables. So you might imagine that knowing the state of the sea is integral to understanding how mangroves are behaving or how ecosystems change. So we believe that we can make the models even more multimodal with additional data on weather. We want to teach the models to harmonize across those data sets in addition to the ones we already have so that we can build out a systems-oriented understanding of what’s happening on the planet, at the ground level but also at the atmospheric level.
So we are looking to make the models more powerful while still preserving their efficiency. It’s quite critical for us to enable organizations to leverage this technology within a reasonable turnaround time. We also need these things to be very cost-effective because we want this platform to be a resource to address and find operationalized solutions to our planet’s complex and quite urgent problems.
Banner image: Surviving mangroves after Typhoon Odette hit a local beach in Barangay Cayhagan, Sipalay City, Negros Occidental, December 2021. Image by Armusaofficial via Wikimedia Commons (CC BY-SA 4.0).
Abhishyant Kidangoor is a staff writer at Mongabay. Find him on 𝕏 @AbhishyantPK.