How AI foundation models help unmask soy facilities in Brazil
Data scientists at Trase and Ode Partners explain how they adapted Earth observation foundation models to map thousands of soy processing and storage facilities in Brazil.

Facilities data is vital for companies seeking to manage environmental and social risks in their supply chains and for financial institutions wanting to understand risks in loan books and investment portfolios.
Maps from Trase show the location, ownership and capacity of facilities in commodity supply chains, including cattle slaughterhouses and soy processing and storage in Brazil, palm oil mills in Indonesia and cocoa cooperatives in Côte d'Ivoire. The open-access data is freely available at our facilities data map page.
Identifying thousands of different types of facilities across vast areas is a huge challenge. Trase researchers have been working with Ode Partners to use artificial intelligence (AI) to detect soy processing and storage facilities in Brazil, enabling businesses to address deforestation in supply chains through product traceability. Trase and Ode have developed a solution that uses two state-of-the-art geospatial foundation models – AlphaEarth and Clay – to systematically identify agricultural facilities at scale.
What are foundation models?
Foundation models are mathematical models trained on large datasets that can be fine-tuned for a wide range of downstream tasks. Well-known examples include Claude, GPT-4 and Gemini. Their design benefits from an important breakthrough in 2017: the invention of the transformer architecture. This marked a shift in AI by introducing self-attention, a mechanism that allows models to process all parts of an input in parallel and capture global relationships. In practice, this means that when processing a sentence such as “I have a dog, a cat and a fish,” the model does not interpret each word in isolation or strictly sequentially. Instead, each part of the sentence is compared to all other parts to determine how strongly they relate to one another, allowing the model to weigh contextual dependencies across the entire sentence. For instance, the words “cat” and “dog” can relate strongly to each other along a “furriness” dimension, in contrast to “fish”.
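The comparison of every word with every other word can be sketched in a few lines of NumPy. This is a minimal, single-head illustration and not how any production model is configured: the two-dimensional "furriness" and "lives in water" embeddings are made-up values, and the identity projection matrices are an assumption chosen to keep the numbers easy to follow.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    X has shape (n_tokens, d_model). Returns (output, attention_weights).
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # every token scored against every token
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V, weights

# Hypothetical toy embeddings; columns are [furriness, lives_in_water].
tokens = ["dog", "cat", "fish"]
X = np.array([[1.0, 0.0],
              [0.9, 0.1],
              [0.0, 1.0]])
identity = np.eye(2)  # assumption: no learned projections in this toy example
out, attn = self_attention(X, identity, identity, identity)

for name, row in zip(tokens, attn):
    print(name, np.round(row, 2))
```

In this toy setup the row for "dog" gives more weight to "cat" than to "fish", which mirrors the furriness example in the text. Real transformers learn the projection matrices and use many attention heads and layers.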
Vision transformers (ViT) extend this idea to images by dividing them into small patches, which are then embedded into vectors and processed using self-attention. Below we see an image of a silo in Brazil decomposed into patches and projected into a two-dimensional principal component analysis (PCA) space for visualization. Patches corresponding to forest regions (green) cluster separately from those representing infrastructure (red). While the visualization shows only two dimensions, the actual embeddings learned by a model exist in a much higher-dimensional space.
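The patch-and-project idea can be sketched as follows. This is a simplified illustration under stated assumptions: the image is random synthetic data, the patch size of 16 pixels is arbitrary, and PCA is applied to raw pixel values, whereas a real ViT would first pass each patch through learned embedding and attention layers.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    if h % patch or w % patch:
        raise ValueError("image height and width must be multiples of patch")
    grid = image.reshape(h // patch, patch, w // patch, patch, c)
    grid = grid.transpose(0, 2, 1, 3, 4)        # (rows, cols, patch, patch, C)
    return grid.reshape(-1, patch * patch * c)  # one row vector per patch

def pca_2d(features):
    """Project row vectors onto their first two principal components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))      # synthetic stand-in for an image chip
patches = patchify(image, patch=16)  # 16 patches, each 16 * 16 * 3 = 768 values
coords = pca_2d(patches)             # shape (16, 2), one point per patch
```

Plotting `coords` as a scatter plot, coloured by land cover, would give the kind of two-dimensional view described above.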
AlphaEarth and Clay foundations
Recent Earth observation foundation models apply this embedding approach to large collections of Earth observation data. They learn general-purpose representations of the Earth’s surface from huge, diverse satellite datasets. In doing so, they turn complex spatial and spectral patterns into compact numerical embeddings that are easy for machine learning models to work with. These representations can then be reused for a wide range of tasks such as finding similar regions, detecting changes over time, grouping similar areas and classification. The main advantage is that they often achieve strong results even when labeled training data is limited.
Trase and Ode mapped soy-related agricultural facilities in Brazil using two Earth observation foundation models: AlphaEarth and Clay. Both models generate embedding representations from satellite imagery that capture the spatial and temporal patterns of agricultural infrastructure. Ode played a central role in adapting the Clay model for this purpose and equipping the Trase team with the technical knowledge needed to apply it effectively. Ode also conducted foundational experiments with the Clay model that shaped the current pipeline, the details of which they have published in their companion article.
While both models serve the same goal in our pipeline, they offer complementary approaches to representing Earth observation data. AlphaEarth provides global, precomputed 64-dimensional embeddings at 10-metre resolution (2017–2025), generated by integrating multimodal inputs such as optical, radar, light detection and ranging (LiDAR), topographic and climatic data at the pixel level. In contrast, Clay uses a self-supervised ViT framework to generate patch-level embeddings, each a one-dimensional vector of 1,024 features. By combining these distinct architectures, we secured a rich, highly contextualized feature set for our downstream localization tasks.
Downstream models to detect soy facilities
For this work we built a two-stage pipeline (outlined below) focused on pixel- and patch-based detection, according to each model’s embedding format. In the first stage, we used the analysis-ready embeddings from AlphaEarth for 2024 to scan the vast Brazilian territory and identify high-potential soy facility locations. We trained a Random Forest, a machine-learning method that here combined the results of 50 decision trees, on the AlphaEarth embeddings as input features, using 321 ground-truth reference points. Since each tree casts a ‘yes/no’ vote, we used the fraction of yes votes as the probability and filtered for pixels with a probability above 80%. This step resulted in 7,400 candidate locations, which were subsequently passed to the second stage for refinement using the Clay model.
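The voting step can be sketched as below. The article does not name the library used, so scikit-learn is an assumption here, and the embeddings and labels are random synthetic stand-ins, not Trase data. Note that scikit-learn's `predict_proba` averages per-tree class probabilities, which is not always identical to counting hard votes, so this sketch counts the votes explicitly to match the description above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Synthetic stand-ins: 321 labelled points with 64-dimensional embeddings.
n_points, n_dims = 321, 64
y_train = rng.integers(0, 2, size=n_points)  # 1 = facility, 0 = other
X_train = rng.normal(size=(n_points, n_dims)) + 2.0 * y_train[:, None]

forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(X_train, y_train)

def vote_fraction(model, X):
    """Fraction of trees whose hard vote is the positive class (label 1)."""
    # Sub-trees predict encoded class indices, so look up the index of label 1.
    positive = np.flatnonzero(model.classes_ == 1)[0]
    votes = np.stack([tree.predict(X) == positive for tree in model.estimators_])
    return votes.mean(axis=0)

# Synthetic "pixels" to score; in practice these would be embedding pixels.
X_pixels = rng.normal(size=(1000, n_dims)) + 2.0 * rng.integers(0, 2, size=(1000, 1))
prob = vote_fraction(forest, X_pixels)
candidates = np.flatnonzero(prob > 0.8)  # keep pixels with more than 80% yes votes
print(f"{candidates.size} of {prob.size} pixels kept")
```

With 50 trees, the vote fraction moves in steps of 0.02, so a threshold above 80% means at least 41 trees must vote yes.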
Because AlphaEarth embeddings primarily capture pixel-level spectral information with limited spatial context, the Clay-based stage was designed to incorporate surrounding landscape structure, which is essential to distinguish soy facilities from visually similar targets such as urban infrastructure and poultry facilities. First, Sentinel-2 imagery from 2024 was queried for the period with typically lower cloud cover (May–October) over the set of candidate locations. Subsequently, a median composite was generated from the stacked images to produce a single representative mosaic. The pixel values were then retrieved through Google Earth Engine as 128 x 128 pixel image patches at 10-metre spatial resolution, centered on each candidate location (each patch covers approximately 1.28 km x 1.28 km, or about 1.64 km²). The retrieved patches were normalized and used as input to the pre-trained Clay model. During inference, the encoder component of the model was executed without masking (all parts of the image were used), and the global embedding representation was extracted from the classification token of the ViT. These embeddings capture both spectral and spatial contextual information from the surrounding landscape.
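The compositing and patch extraction steps can be illustrated with plain NumPy arrays. This is a sketch under stated assumptions, not the Earth Engine or Clay code used in the project: the image stack is synthetic, the function names are hypothetical, and the normalization statistics are placeholder values (the real model expects its own per-band statistics).

```python
import numpy as np

PIXEL_SIZE_M = 10    # Sentinel-2 resolution stated in the text
PATCH_SIZE_PX = 128  # patch width and height in pixels

side_km = PATCH_SIZE_PX * PIXEL_SIZE_M / 1000  # 1.28 km per side
area_km2 = side_km ** 2                        # about 1.64 square km

def median_composite(stack):
    """Per-pixel median over time; NaN marks cloud-masked observations.

    stack has shape (time, bands, height, width).
    """
    return np.nanmedian(stack, axis=0)

def extract_patch(mosaic, row, col, size=PATCH_SIZE_PX):
    """Crop a size x size window centred on (row, col) from (bands, H, W)."""
    half = size // 2
    r0, c0 = row - half, col - half
    if r0 < 0 or c0 < 0 or r0 + size > mosaic.shape[1] or c0 + size > mosaic.shape[2]:
        raise ValueError("patch would extend beyond the mosaic")
    return mosaic[:, r0:r0 + size, c0:c0 + size]

def normalize(patch, mean, std):
    """Standardize each band with per-band mean and standard deviation."""
    return (patch - mean[:, None, None]) / std[:, None, None]

rng = np.random.default_rng(0)
stack = rng.random((10, 4, 256, 256))          # 10 dates, 4 bands (synthetic)
stack[rng.random(stack.shape) < 0.1] = np.nan  # simulate cloud masking
mosaic = median_composite(stack)
patch = extract_patch(mosaic, row=128, col=128)
patch = normalize(patch, mean=np.full(4, 0.5), std=np.full(4, 0.25))  # placeholder stats
```

The resulting `(bands, 128, 128)` array is the kind of input that would then be passed to the model's encoder to obtain one embedding per candidate location.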
At this stage, the final Clay embeddings are ready to be used in the downstream model. We trained a feedforward neural network with a multilayer perceptron (MLP) architecture using the Clay embeddings as input features, again relying on the same 321 ground-truth points used in the AlphaEarth stage. This refinement step reduced false positives by leveraging broader spatial context and semantic differences in land-use structure, yielding 4,200 facility locations.
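A classifier of this kind might look like the sketch below. The library (scikit-learn), hidden layer sizes, scaling step and 0.5 decision threshold are all assumptions made for illustration, since the article does not specify them, and the 1,024-feature inputs are random synthetic stand-ins for Clay embeddings.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)

# Synthetic stand-ins: 321 labelled points with 1,024-feature embeddings.
n_points, n_features = 321, 1024
y = rng.integers(0, 2, size=n_points)  # 1 = facility, 0 = other
X = rng.normal(size=(n_points, n_features)) + 0.5 * y[:, None]

# Hypothetical architecture: two hidden layers after feature scaling.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(128, 32), max_iter=500, random_state=0),
)
model.fit(X, y)

# Score synthetic candidate locations and keep the likely facilities.
X_candidates = rng.normal(size=(200, n_features)) + 0.5 * rng.integers(0, 2, size=(200, 1))
facility_prob = model.predict_proba(X_candidates)[:, 1]
kept = np.flatnonzero(facility_prob >= 0.5)  # assumed decision threshold
print(f"{kept.size} of {facility_prob.size} candidates kept")
```

With only a few hundred labelled points, a small network like this trains in seconds, because the heavy lifting has already been done by the pre-trained encoder that produced the embeddings.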
As a final automated step, we integrated the new facility detections with official storage facility records from the Brazilian National Supply Company (CONAB) and cross-referenced the combined set against the MapBiomas Collection 10 soybean map, keeping only facilities within a 10 km radius of known soy plantations. This resulted in a set of 12,000 facility locations.
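The 10 km proximity filter can be sketched with a great-circle distance check. This is a simplification under stated assumptions: soy plantations are represented here as a handful of hypothetical points, whereas the MapBiomas product is a raster map, and all coordinates are invented for illustration.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def within_radius(facilities, soy_points, radius_km=10.0):
    """True where a facility lies within radius_km of any soy point.

    Both inputs are arrays of (lat, lon) rows in decimal degrees.
    """
    d = haversine_km(facilities[:, None, 0], facilities[:, None, 1],
                     soy_points[None, :, 0], soy_points[None, :, 1])
    return (d <= radius_km).any(axis=1)

# Hypothetical coordinates, for illustration only.
soy_points = np.array([[-12.55, -55.72],
                       [-13.07, -55.91]])
facilities = np.array([[-12.55, -55.80],   # under 10 km from the first soy point
                       [-12.55, -56.50],   # far from both soy points
                       [-13.00, -55.91]])  # under 10 km from the second soy point
mask = within_radius(facilities, soy_points)
print(mask)
```

For thousands of facilities and a dense soy map, a spatial index or a raster distance transform would be far more efficient than this all-pairs comparison.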
Removing the bottleneck
Our work demonstrated that Earth observation foundation models can remove the computational challenge of processing millions of ground-truth samples. By combining AlphaEarth with Clay in a two-stage pipeline using a relatively small set of reference points, we were able to narrow down the entire Brazilian territory to 12,000 candidate locations of soy-related facilities. This pipeline shows that it is possible to monitor agricultural infrastructure at a continental scale with limited resources, opening the way for similar applications across other commodity supply chains and geographies.
Read the Ode Partners article: Using large Earth observation models to map soy infrastructure in Brazil


