The first edition of the 2025 Nubank Engineering Meetup kicked off with a core topic for those working with distributed architectures, microservices, and system reliability: observability.
The event, held in February, opened the year’s technical meetup calendar. It featured Guto (an engineer at Nu and the night’s host), AWS Solutions Architects Lucas Vieira Souza da Silva and Luis Tiani, and Nubank’s engineering team, represented by Caio (Engineering Manager) and Otávio (Lead Engineer), who shared the behind-the-scenes evolution of our log stack and the creation of the Observability Stream and Alexandria platforms.
The AWS talk focused on how to integrate open source tools with AWS managed services to build scalable, efficient observability pipelines. The session covered the foundations of the three pillars of observability (metrics, logs, and traces) and included practical demos using OpenTelemetry, Prometheus, Grafana, and OpenSearch.
What is observability, really?
The opening question was simple but essential: what does it mean to make a system observable? The answer lies in the ability to answer, with concrete data, questions about the internal behavior of applications in production. This is done using three types of signals: metrics, logs, and traces.
These signals complement one another and form the foundation for building dashboards, setting up alerts, and conducting deep system behavior analysis.
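As a rough illustration of how the three signals complement one another, the sketch below models a single request that emits all three. It is a toy model using only the Python standard library, not the API of any real telemetry SDK; the names `Span`, `LogRecord`, `CounterMetric`, and `handle_request` are hypothetical. The point it tries to show is that the log and the span share a trace id, so they can be joined later, while the metric is only an aggregate.

```python
import time
import uuid
from dataclasses import dataclass


@dataclass
class Span:
    """Trace signal: one timed operation belonging to a trace."""
    trace_id: str
    name: str
    start: float
    end: float

    @property
    def duration_ms(self) -> float:
        return (self.end - self.start) * 1000


@dataclass
class LogRecord:
    """Log signal: a discrete event, tagged with the trace it belongs to."""
    trace_id: str
    level: str
    message: str


@dataclass
class CounterMetric:
    """Metric signal: a numeric aggregate with no per-request detail."""
    name: str
    value: int = 0

    def add(self, amount: int = 1) -> None:
        self.value += amount


def handle_request(requests_total: CounterMetric, logs: list, spans: list) -> str:
    """Simulate one request (hypothetical) that emits all three signals."""
    trace_id = uuid.uuid4().hex
    start = time.monotonic()
    logs.append(LogRecord(trace_id, "INFO", "payment authorized"))
    requests_total.add()
    spans.append(Span(trace_id, "POST /payments", start, time.monotonic()))
    return trace_id
```

Given the returned trace id, filtering `logs` and `spans` on `trace_id` recovers everything recorded for that one request, which is the kind of correlation dashboards and incident investigations rely on.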
The role of OpenTelemetry
One of the most relevant open source tools today is OpenTelemetry. Maintained by the CNCF, it provides, among other components, Collectors that act as agents: receiving, enriching, and exporting observability signals.
With OpenTelemetry, it’s possible to instrument applications, collect telemetry in various formats (including Prometheus), and send it to different backends like OpenSearch, Prometheus, and more.
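The receive, enrich, and export flow of a Collector can be pictured with a small toy model. The sketch below is not the OpenTelemetry Collector or its configuration format; `ToyCollector`, `add_resource_attributes`, and the in-memory "backends" are hypothetical stand-ins, written with the standard library only, to show how one pipeline per signal type might route enriched data to different destinations.

```python
from typing import Callable, Dict, List, Tuple

Signal = dict  # one telemetry record, kept as a plain dict in this sketch
Processor = Callable[[Signal], Signal]
Exporter = Callable[[Signal], None]


def add_resource_attributes(attrs: dict) -> Processor:
    """Build a processor that enriches every signal with fixed attributes."""
    def process(signal: Signal) -> Signal:
        resource = {**signal.get("resource", {}), **attrs}
        return {**signal, "resource": resource}
    return process


class ToyCollector:
    """Routes each signal through the pipeline registered for its kind."""

    def __init__(self) -> None:
        self.pipelines: Dict[str, Tuple[List[Processor], List[Exporter]]] = {}

    def add_pipeline(self, kind: str, processors, exporters) -> None:
        self.pipelines[kind] = (list(processors), list(exporters))

    def receive(self, signal: Signal) -> None:
        processors, exporters = self.pipelines[signal["kind"]]
        for process in processors:   # enrich
            signal = process(signal)
        for export in exporters:     # export
            export(signal)


# In-memory lists stand in for backends such as Prometheus and OpenSearch.
metrics_backend: List[Signal] = []
logs_backend: List[Signal] = []

enrich = add_resource_attributes({"k8s.cluster.name": "demo"})
collector = ToyCollector()
collector.add_pipeline("metric", [enrich], [metrics_backend.append])
collector.add_pipeline("log", [enrich], [logs_backend.append])

collector.receive({"kind": "log", "body": "checkout failed"})
collector.receive({"kind": "metric", "name": "http_requests_total", "value": 1})
```

Keeping processors and exporters as plain callables means a backend can be swapped without touching the application that emits the signals, which is the decoupling the Collector is meant to provide.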
Open source means freedom, but also complexity
The CNCF maintains a rich ecosystem of open source observability tools, from data ingestion to visualization. But building and operating a fully open source stack requires time, expertise, and responsibility—covering infrastructure, upgrades, scalability, and security.
That’s where managed services come in. AWS’s managed offerings aim to simplify operations while preserving the benefits of open technologies. Instead of running your own Prometheus or Grafana instances, you can use fully managed versions, which streamlines integration and scaling.
OpenSearch: from Elasticsearch to vector search
A key highlight was OpenSearch, a fork of Elasticsearch created in 2021 and now maintained by the Linux Foundation. It’s widely used for search and log analytics, and it also supports vector search.
AWS offers OpenSearch in two modes: managed clusters and serverless.
Amazon OpenSearch Service also includes OpenSearch Ingestion, built on Data Prepper, for transforming JSON-formatted data and sending it to the cluster.
Building a managed observability stack
The session also explored how AWS managed services can be integrated to build a comprehensive observability stack.
Real-world demo: OpenTelemetry in an EKS cluster
To make everything more tangible, Lucas presented a live demo of the “OpenTelemetry Demo” application running on EKS. The app, supported by a traffic generator, emitted telemetry signals processed by an OpenTelemetry Collector and sent to Prometheus and OpenSearch.
Using Grafana, the team correlated metrics, logs, and traces into unified dashboards. Grafana variables were used to cross-reference data from Prometheus and OpenSearch, making incident investigation and data correlation faster and easier.
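To give a feel for how a single dashboard variable can drive queries against two data sources, here is a minimal sketch. It imitates Grafana's `$variable` interpolation with Python's `string.Template`, which happens to use the same `$name` syntax. The metric name, the field names, and the `render_panels` helper are assumptions made for illustration; they are not taken from the demo.

```python
from string import Template

# Hypothetical panel queries; metric and field names are assumptions.
PROMQL_PANEL = Template(
    'sum(rate(http_server_requests_total{service="$service"}[5m]))'
)
OPENSEARCH_PANEL = Template(
    'resource.service.name:"$service" AND severity:ERROR'
)


def render_panels(variables: dict) -> dict:
    """Interpolate the same dashboard variables into both panel queries."""
    return {
        "prometheus": PROMQL_PANEL.substitute(variables),
        "opensearch": OPENSEARCH_PANEL.substitute(variables),
    }


panels = render_panels({"service": "checkout"})
print(panels["prometheus"])
print(panels["opensearch"])
```

Changing the one `service` value re-scopes both the metrics panel and the logs panel at once, which is what makes it quick to move from a spike in a graph to the matching error logs.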
Rebuilding Nubank’s logging stack: Scaling and cost challenges
In the second half of the meetup, Caio and Otávio shared insights about the evolution of Nubank’s internal observability platform for logs — a journey shaped by rapid growth, limitations with external vendors, and strategic decisions to ensure cost efficiency and control over data.
The problem: Log volume growth and external vendor costs
With over 3,000 microservices and a growing customer base, Nubank began handling daily log volumes reaching half a petabyte. The original strategy of relying on an external SaaS vendor began to fall short in two key areas: scalability and cost.
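To put half a petabyte per day in perspective, a quick back-of-the-envelope calculation helps. The sketch below assumes decimal units (1 PB = 10^15 bytes) and a perfectly even spread over time and across exactly 3,000 services. Neither assumption comes from the talk, so the results are only rough averages.

```python
PETABYTE = 10**15            # decimal petabyte (assumption)
SECONDS_PER_DAY = 24 * 60 * 60

daily_bytes = 0.5 * PETABYTE
services = 3_000             # "over 3,000", so per-service figures are upper bounds

avg_bytes_per_second = daily_bytes / SECONDS_PER_DAY
avg_bytes_per_service_per_day = daily_bytes / services

print(f"{avg_bytes_per_second / 10**9:.2f} GB/s sustained on average")
print(f"{avg_bytes_per_service_per_day / 10**9:.1f} GB per service per day")
```

Under those assumptions the pipeline has to absorb roughly 5.79 GB every second around the clock, or about 166.7 GB per service per day, before accounting for traffic peaks.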
The solution? Build a fully internal, highly scalable, resilient, and cost-effective platform.
A new platform for data ingestion
The first step in this restructuring was the creation of Observability Stream, our internal platform for collecting and processing telemetry data — starting with logs and later expanding to traces.
Technical requirements
The team established four core requirements.
Micro-batching architecture
To balance performance and technical feasibility, the team implemented a micro-batching model with decoupled processing stages connected via queues (SQS).
This architecture brought robustness and modularity, paving the way for the next step: log querying.
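The general shape of micro-batching with decoupled stages can be sketched in a few lines. This is not Nubank's implementation: an in-process `queue.Queue` stands in for SQS, the stage names `batcher` and `transformer` are hypothetical, and batches are cut on size only (a real system would typically also flush on a timer).

```python
import queue
import threading

SENTINEL = None  # marks the end of the stream between stages


def batcher(records, out_q: queue.Queue, batch_size: int) -> None:
    """Stage 1: group individual records into micro-batches."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) >= batch_size:
            out_q.put(batch)
            batch = []
    if batch:  # flush the final, partially filled batch
        out_q.put(batch)
    out_q.put(SENTINEL)


def transformer(in_q: queue.Queue, out_q: queue.Queue) -> None:
    """Stage 2: process whole batches at its own pace."""
    while (batch := in_q.get()) is not SENTINEL:
        out_q.put([line.upper() for line in batch])
    out_q.put(SENTINEL)


def run(records, batch_size: int = 3) -> list:
    """Wire the stages together with queues and collect the output."""
    raw_q: queue.Queue = queue.Queue()
    done_q: queue.Queue = queue.Queue()
    stages = [
        threading.Thread(target=batcher, args=(records, raw_q, batch_size)),
        threading.Thread(target=transformer, args=(raw_q, done_q)),
    ]
    for stage in stages:
        stage.start()
    results = []
    while (batch := done_q.get()) is not SENTINEL:
        results.append(batch)
    for stage in stages:
        stage.join()
    return results
```

Calling `run([f"log {i}" for i in range(7)], batch_size=3)` yields two full batches and one partial batch. Because the stages only communicate through queues, either one can be scaled, restarted, or replaced without the other needing to know.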
Alexandria: Our internal log search platform
Once all logs were processed and stored, the next step was building Alexandria — Nubank’s internal platform for log querying, used by engineers across the company.
Scalable search with Trino and Parquet
The architecture relies on Trino, a distributed SQL query engine, and on Parquet, a columnar storage format.
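One reason a columnar format such as Parquet suits log search is that a query usually touches only a few fields. The toy example below illustrates that idea with plain Python lists. It does not read or write Parquet and does not involve Trino, and the sample data and helper names are invented for illustration.

```python
from collections import Counter

# Row layout: each log event is stored as one complete record.
rows = [
    {"service": "auth", "level": "ERROR", "message": "token expired"},
    {"service": "auth", "level": "INFO", "message": "login ok"},
    {"service": "billing", "level": "ERROR", "message": "card declined"},
    {"service": "auth", "level": "ERROR", "message": "token expired"},
]


def to_columns(records: list) -> dict:
    """Pivot row records into one list per field (a columnar layout)."""
    return {name: [r[name] for r in records] for name in records[0]}


columns = to_columns(rows)


def errors_per_service(cols: dict) -> Counter:
    """Count ERROR events per service, reading only two of the columns."""
    return Counter(
        service
        for service, level in zip(cols["service"], cols["level"])
        if level == "ERROR"
    )


print(errors_per_service(columns))
```

The `message` column, typically the largest part of a log line, is never read by this query. In a columnar file that translates into skipping most of the bytes on disk.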
Impact and results
Efficient observability with open source and the cloud
Nubank Engineering Meetup #11 offered a deep and practical dive into the world of observability with open source and AWS. It reinforced the importance of metrics, logs, and traces, while showing how to build a modern observability stack that blends the openness of community-driven tools with the convenience of managed cloud services.
With real-world examples, detailed architecture, and live demonstrations, the event was a valuable resource for engineers and platform teams looking to improve visibility and reliability across their systems.
Stay tuned for the next editions of the Nubank Engineering Meetup for more in-depth technical content on the challenges and solutions involved in building simple, secure, and innovative financial products.