October 30, 2025

Peabiru Cibernáutico: CloudWalk’s First AI Conference in São Paulo (Brazil)

The conference takes place on November 4, 2025 at Casa Rockambole (Pinheiros)

R&D and AI team

@ CloudWalk

CloudWalk's R&D/AI team presents: Peabiru Cibernáutico, our first AI Conference in São Paulo (Brazil).

On November 4, 2025, CloudWalk will host Peabiru Cibernáutico, its first open-to-the-public AI conference: a meeting point for researchers, creators, and thinkers exploring usual and unusual topics in AI.

The inaugural edition will feature special guests Avi Loeb (3I/ATLAS project) and Roberta Duarte (Black Hole AI Simulations), as well as presentations by CloudWalk's AI team, including (but not limited to!):

Ecosystem challenges the traditional feature selection approach where you pick the best features once and discard the rest. Instead of throwing away potentially useful features, Ecosystem recycles them through multiple selection rounds to create diverse models with different feature combinations. The framework uses recursive feature elimination to generate multiple feature sets, trains models on each set, and then searches for the optimal ensemble strategy to combine them. This approach recognizes that features deemed weak in one context might be valuable in another, and that diverse models with different strengths can outperform a single optimized model. Presented by Guilherme Petrucci

A Transformer Encoder for Structured Financial Data brings BERT’s approach to financial transaction histories. Instead of treating data as single feature vectors, the system processes entire sequences of transactions over time, using discretization and transformer encoders to capture temporal patterns traditional models miss. After pre-training on masked modeling, it fine-tunes for specific tasks and now matches tabular models in performance while offering a complementary view of the data. Presented by Rafael Katopodis
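To make the discretization and masked-modeling steps concrete, here is a minimal sketch in plain Python. It is not CloudWalk's implementation: the quantile binning, the `fit_bins`, `tokenize`, and `mask_tokens` helpers, the 15% masking rate, and the `MASK_ID` value are all assumptions for illustration, and the transformer encoder itself is left out.

```python
import bisect
import random

MASK_ID = -1  # hypothetical id reserved for the [MASK] token

def fit_bins(values, n_bins):
    """Pick n_bins - 1 quantile edges from the training values."""
    ordered = sorted(values)
    return [ordered[len(ordered) * k // n_bins] for k in range(1, n_bins)]

def tokenize(values, edges):
    """Map each continuous value to the index of its bin."""
    return [bisect.bisect_right(edges, v) for v in values]

def mask_tokens(tokens, rate=0.15, seed=0):
    """Hide a random share of tokens; the model must predict the targets."""
    rng = random.Random(seed)
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < rate:
            inputs.append(MASK_ID)   # hidden from the model
            targets.append(tok)      # what it should reconstruct
        else:
            inputs.append(tok)
            targets.append(None)     # not part of the loss
    return inputs, targets

amounts = [1, 2, 3, 4, 5, 6, 7, 8]        # one merchant's transaction amounts
edges = fit_bins(amounts, n_bins=4)       # [3, 5, 7]
tokens = tokenize(amounts, edges)         # [0, 0, 1, 1, 2, 2, 3, 3]
inputs, targets = mask_tokens(tokens)
```

A real pipeline would tokenize several fields per transaction (amount, time gap, category, and so on) and feed the masked sequence to the encoder; the sketch only shows how continuous values become a vocabulary the model can be pre-trained on.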

Feature Search addresses a fundamental problem with manual feature engineering: it doesn’t scale with computation the way learning and search do. Instead of relying on humans to craft features by hand, this system automates the process by treating feature engineering as a search problem with generation and evaluation in a loop. The evaluation is rigorous, using feature ablation to measure whether each new feature actually improves predictions beyond the existing feature set, tested across multiple models and random seeds to ensure statistical significance. Applied to CloudWalk’s data, the system generated and evaluated 3,000 candidate features in a single day, identifying around 200 that provided meaningful performance gains. Presented by Daniel Gieseler
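The evaluation step can be sketched as a paired comparison across seeds. This is an illustrative outline rather than the Feature Search code: `ablation_gain`, `toy_score`, the feature names, and the "mean gain above two standard errors" rule are all hypothetical, and in practice `score` would train and validate a real model.

```python
import math
import random
import statistics

def ablation_gain(score, base_features, candidate, seeds, k=2.0):
    """Compare scores with and without `candidate` across seeds.

    Returns (keep, mean_gain). `keep` is True when the mean gain exceeds
    k standard errors, a rough stand-in for a significance test.
    """
    diffs = [score(base_features + [candidate], s) - score(base_features, s)
             for s in seeds]
    mean_gain = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_gain - k * std_err > 0, mean_gain

def toy_score(features, seed):
    """Hypothetical scorer: only 'tx_velocity' helps; the seed adds noise."""
    noise = random.Random(seed).uniform(-0.01, 0.01)
    return 0.70 + 0.05 * ("tx_velocity" in features) + noise

base = ["amount", "hour_of_day"]
for name in ["tx_velocity", "random_column"]:
    keep, gain = ablation_gain(toy_score, base, name, seeds=range(5))
    print(f"{name}: keep={keep}, mean gain={gain:.3f}")
```

Using the same seed for both runs makes the comparison paired, so seed-to-seed noise largely cancels and small but real gains are easier to detect. Repeating the check with several model families, as the talk describes, would mean calling `ablation_gain` once per scorer.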

PyEhsa is a Python package that answers “where is activity heating up or cooling down over time?” Previously only available in ArcGIS and R, it detects emerging, intensifying, and fading geographic clusters by combining spatial and temporal statistics. Works with GeoPandas and Python’s geospatial tools to track patterns like crime hotspots or business activity shifts across regions. Presented by Lucas Azevedo
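The temporal half of this kind of analysis can be illustrated with the Mann-Kendall trend test, which is what the ArcGIS emerging hot spot tool applies to each location's time series of hot spot scores. The sketch below assumes PyEhsa follows that formulation; it does not use PyEhsa's API, and `mann_kendall`, `trend_label`, and the 1.96 cutoff are illustrative choices. Tie correction is omitted for brevity.

```python
import math

def mann_kendall(series):
    """Return the Mann-Kendall S statistic and its z-score (no tie correction)."""
    n = len(series)
    # S counts increasing pairs minus decreasing pairs over all i < j.
    s = sum((series[j] > series[i]) - (series[j] < series[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var)
    elif s < 0:
        z = (s + 1) / math.sqrt(var)
    else:
        z = 0.0
    return s, z

def trend_label(series, z_crit=1.96):
    """Label one location's series as heating up, cooling down, or neither."""
    _, z = mann_kendall(series)
    if z > z_crit:
        return "heating up"
    if z < -z_crit:
        return "cooling down"
    return "no significant trend"

weekly_counts = [1, 2, 3, 4, 5, 6, 7, 8]   # activity in one grid cell
print(trend_label(weekly_counts))          # heating up
```

In a full analysis the input series would be a spatial statistic per time step (for example a Getis-Ord Gi* z-score for each cell) rather than raw counts, so that the trend reflects clustering relative to neighbors.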

Understanding Spatial Behavior Through Dispersion Analysis reveals how merchants actually move in the real world by analyzing the geographic patterns of their transactions. Using techniques like Ripley’s K function and DBSCAN clustering, this approach identifies whether a merchant is sedentary (stays in one spot), nomadic (constantly moving), or semi-nomadic (operates from a few locations). The analysis can identify clustering patterns, with applications ranging from fraud detection to understanding business operations. Presented by Stéfany Barbosa
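As a rough illustration of the clustering side, the sketch below runs a small pure-Python DBSCAN over a merchant's transaction coordinates and maps the result to the three labels above. The mapping rules, the `classify_merchant` helper, and every threshold are assumptions made for this example, not the presenters' method; coordinates are treated as planar (say, kilometers), and Ripley's K is not covered.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1               # provisional noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster      # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:    # core point: keep expanding
                queue.extend(reach)
    return labels

def classify_merchant(points, eps=0.5, min_pts=3, noise_cutoff=0.5):
    """Hypothetical rule: mostly noise -> nomadic, 1 cluster -> sedentary."""
    labels = dbscan(points, eps, min_pts)
    n_clusters = len(set(labels) - {-1})
    noise_share = labels.count(-1) / len(labels)
    if n_clusters == 0 or noise_share > noise_cutoff:
        return "nomadic"
    return "sedentary" if n_clusters == 1 else "semi-nomadic"

shop = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.05, 0.05)]
print(classify_merchant(shop))   # sedentary
```

With real latitude and longitude pairs, the Euclidean `math.dist` would need to be replaced by a great-circle distance, and `eps` chosen in matching units.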

GeoQuadHash solves a problem with traditional location indexing: the real world isn’t uniform, so why use a fixed grid? GeoQuadHash adapts its resolution based on where your data actually is: high detail in dense urban areas, low detail in sparse regions. It combines quadtree adaptability with geohash simplicity, using an encoding where the hash string literally maps to the tree path. The result is efficient spatial indexing that matches your data’s natural clustering, useful from ML feature engineering to fast geofencing. Presented by Matheus Inoue
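The idea of a hash string that spells out the tree path can be sketched in a few lines. This is an illustration under assumptions, not GeoQuadHash itself: the `Node`, `build`, and `encode` names, the digit alphabet `0` to `3`, and the split rule (more than `max_points` points per cell) are all invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    x0: float
    y0: float
    x1: float
    y1: float
    children: list = field(default_factory=list)  # empty for a leaf

def quadrant(node, x, y):
    """Child index 0..3: bit 0 is east/west, bit 1 is north/south."""
    mx, my = (node.x0 + node.x1) / 2, (node.y0 + node.y1) / 2
    return (1 if x >= mx else 0) + (2 if y >= my else 0)

def build(node, points, max_points=2, max_depth=8, depth=0):
    """Split a cell only while it holds more than max_points points."""
    if len(points) <= max_points or depth == max_depth:
        return node
    mx, my = (node.x0 + node.x1) / 2, (node.y0 + node.y1) / 2
    boxes = [(node.x0, node.y0, mx, my), (mx, node.y0, node.x1, my),
             (node.x0, my, mx, node.y1), (mx, my, node.x1, node.y1)]
    buckets = [[], [], [], []]
    for x, y in points:
        buckets[quadrant(node, x, y)].append((x, y))
    node.children = [build(Node(*box), pts, max_points, max_depth, depth + 1)
                     for box, pts in zip(boxes, buckets)]
    return node

def encode(root, x, y):
    """Walk down to the leaf containing (x, y); the path is the hash."""
    node, path = root, ""
    while node.children:
        q = quadrant(node, x, y)
        path += str(q)
        node = node.children[q]
    return path

dense = [(1, 1), (1.5, 1.2), (2, 2), (0.5, 3), (3, 0.5), (1, 2.5)]
sparse = [(12, 12)]
root = build(Node(0, 0, 16, 16), dense + sparse)
print(encode(root, 1, 1), encode(root, 12, 12))  # prints: 000 3
```

The dense corner ends up three levels deep while the sparse corner stays at one, and because each character is a step down the tree, two hashes sharing a prefix are guaranteed to share an ancestor cell.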

Wallace is an evolutionary framework that extends genetic algorithms beyond traditional numeric problems to tackle complex language-driven challenges. While classic genetic algorithms work with fixed encodings like binary strings, Wallace uses LLMs to generate custom crossover and mutation operations for text-based solutions. The framework offers plug-and-play solvers ranging from classic approaches to LLM-powered variants, enabling applications from query optimization to mathematical problem solving. At CloudWalk, Wallace serves as a creative partner for data scientists and engineers who need to explore solution spaces that are too complex for traditional optimization methods. Presented by Amanda Nunes

Evolutionary Feature Selection (EFS) takes inspiration from nature: just like organisms evolve to survive their environment, EFS evolves feature sets to find the best combination for your model. Feature subsets act as chromosomes that mutate and crossbreed across generations, with only the fittest surviving. The result is an efficient way to identify which features actually matter for your problem. Presented by Aaryan Dubey
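A minimal genetic loop over feature masks might look like the sketch below. It is a generic illustration, not the EFS implementation: `evolve_features`, the single-point crossover, the keep-the-top-half survival rule, and the toy fitness function are assumptions, and a real fitness function would score a trained model on validation data.

```python
import random

def evolve_features(fitness, n_features, pop_size=20, generations=30,
                    mutation_rate=0.1, seed=0):
    """Evolve boolean feature masks; returns (best_mask, best_fitness_history)."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_features)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        survivors = pop[: pop_size // 2]          # only the fittest survive
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_features)    # single-point crossover
            child = a[:cut] + b[cut:]
            child = [(not gene) if rng.random() < mutation_rate else gene
                     for gene in child]           # bit-flip mutation
            children.append(child)
        pop = survivors + children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history

USEFUL = {0, 2, 5}  # hypothetical: indices of the informative features

def toy_fitness(mask):
    """Reward useful features, charge a small cost per selected feature."""
    hits = sum(1 for i, on in enumerate(mask) if on and i in USEFUL)
    return hits - 0.1 * sum(mask)

best, history = evolve_features(toy_fitness, n_features=8)
print([i for i, on in enumerate(best) if on], history[-1])
```

Because survivors are carried over unchanged, the best fitness never decreases from one generation to the next; the per-feature cost in `toy_fitness` is what pushes the search toward small subsets.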

Conversational AI: JIM and Claudio are CloudWalk’s AI agents for customer interaction. JIM evolved from a single agent to a router system with specialized sub-agents, cutting costs and latency while getting better results with smaller models. Claudio handles customer support troubleshooting, evolving from a simple chatbot to using graph-based workflows that systematically work through issues. Both have improved significantly and now run on CloudWalk’s self-hosted models. Presented by Audrey Vasconcelos and Yoschihiro Kaimoto

JIM Evals makes sure CloudWalk’s AI assistant keeps getting better without anyone babysitting it. Before any new code goes live, the system runs tests to catch broken features. Once deployed, it watches real conversations and uses AI judges to score things like sentiment and whether the assistant actually helped. When it spots problems, the system finds examples of what went wrong, figures out patterns, generates new test cases, and even creates pull requests with suggested fixes. The whole improvement cycle runs automatically from problem detection to deployment. Presented by Laís Alves

The Self-Driven Company project evokes the power of the collective to explore how intelligent systems can act as autonomous entrepreneurs and workforce. Despite the remarkable success of individual LLMs, coupled LMs shine when dealing with real-world interaction. We therefore created a multi-agent system, equipped with different tools, whose agents interact in complex ways to produce a working business, in our case a vending machine. Manager agents, analysts, scrapers, and buyers join forces in an autonomous but transparent and auditable way. Presented by Rodrigo Motta and Pedro Setubal

Tyrell Eldon: Parallel Experiments with Crash Recovery solves a research bottleneck—running thousands of experiments without losing progress when crashes happen. Using SQLite persistence and process/thread pools, it tracks each experiment’s state. Crash at experiment 5,847 of 10,000? Just restart—the system resumes where it stopped and retries automatically, replacing 50+ lines of concurrent.futures boilerplate with two. Ideal for ML hyperparameter searches, large-scale API calls, and parameterized computations. Presented by Yeté Labarca
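The resume-after-crash pattern can be sketched with the standard library alone. This is not Tyrell Eldon's API: `run_resumable`, the `results` table layout, and the use of `repr` as the parameter key are assumptions for illustration. In this sketch a failed experiment is simply left out of the database, so it is retried on the next call rather than immediately.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_resumable(db_path, fn, params, workers=4):
    """Run fn over params in a thread pool, skipping results already stored.

    Returns (rows, failed): rows maps repr(param) -> repr(result) for every
    finished experiment; failed lists params whose call raised this time.
    """
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS results "
                "(param TEXT PRIMARY KEY, value TEXT)")
    con.commit()
    done = {row[0] for row in con.execute("SELECT param FROM results")}
    todo = [p for p in params if repr(p) not in done]
    failed = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(fn, p): p for p in todo}
        for fut in as_completed(futures):
            p = futures[fut]
            try:
                value = fut.result()
            except Exception:
                failed.append(p)      # not stored, so a later run retries it
                continue
            # Writes happen only in the calling thread, one commit per result,
            # so a crash loses at most the experiments still in flight.
            con.execute("INSERT OR REPLACE INTO results VALUES (?, ?)",
                        (repr(p), repr(value)))
            con.commit()
    rows = dict(con.execute("SELECT param, value FROM results"))
    con.close()
    return rows, failed
```

Calling `run_resumable("experiments.db", train_one, grid)` a second time after an interruption would pick up only the parameters missing from the database (`train_one` and `grid` being whatever function and parameter list you supply). Swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` suits CPU-bound work, provided `fn` and its arguments can be pickled.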

Computing in the Age of Hyperscalers asks how we can tackle the challenges of massive computation in the age of hyperscalers. This presentation explores the computational problems faced by modern enterprises and how hyperscalers can be seamlessly used to address them. We’ll discuss the nature of these challenges, the paradigm shift introduced by hyperscalers, and how we’re tackling these issues at CloudWalk. Presented by Igor Morgado