Reviewed by Felipe Yukio
In today’s digital age, data plays a pivotal role in driving business strategies and decisions. As a trailblazer in the fintech industry, Nubank understands the power of data and seeks to maximize its potential. This blog post will delve deep into the concept of Core Datasets and how it’s proving to be a game-changer for Nubank.
We’ll also shed light on the practice of data self-service and why it’s an asset to modern businesses. Keep reading!
Understanding data self-service
Imagine a workspace where every employee, regardless of their department, has the ability to access and analyze necessary data whenever required. That’s precisely what data self-service is all about. It democratizes data, allowing for a smoother flow of information across various departments, promoting a culture of data-driven decision-making.
Benefits of data self-service
However, this powerful tool doesn’t come without its set of challenges.
Challenges
Nubank’s Core Datasets in action
Core Datasets are the bedrock of reliable data management built on best practices. They mitigate common issues such as reprocessing and pave the way for consistent and trustworthy data streams. These datasets act as the reference point, ensuring uniformity and reducing discrepancies.
To truly appreciate the value of Core Datasets, let’s explore two use cases from Nubank’s operations:
Customer data challenges
Different business units within Nubank once had to grapple with specialized business rules, which made a consolidated customer analysis incredibly complex. Recognizing the complications this could bring, Nubank looked to Core Datasets as the solution.
By utilizing these datasets, we’ve been able to present a comprehensive view of our customers. What resulted was not just a harmonized customer analysis process but a centralized source of truth. This shift simplified both the maintenance and evolution of our customer data, fostering greater efficiency and clarity in our operations.
Data discrepancies in credit card products
With a diverse range of credit card products under Nubank’s banner, we found ourselves navigating a labyrinth of metrics, each with its unique set of business rules. The breadth of data sources meant that reconciling them became a meticulous task.
To address this, we initiated the unification of business rules for corporate indicators. This process required a deep-rooted collaboration between stakeholders and business units. By clearly defining ownership both functionally and technically, we achieved a cohesive corporate view. This new perspective respected specialized views, ensuring that while we had a comprehensive overview, the unique nuances of each unit were not lost.
Furthermore, this shift bolstered our governance processes, especially in areas concerning data quality, usability, and integrity.
Core Datasets Reference
In practice, Core Datasets have stricter documentation, called design specs. Nubank’s Analytics Engineering team references the following link:
Data Quality at Airbnb
Operational dynamics
No two problems are identical, and thus, their solutions might vary. When working with Core Datasets, the essence is to achieve the properties listed below, ensuring the final dataset is:
At Nubank, two distinct approaches have been implemented to achieve this: Tabular Modelling and EAVT Modelling. Let’s dive into them while understanding the theoretical motivation behind these methodologies.
Kimball’s Dimensional Modeling concepts
Imagine a table depicting transactions. Its ‘grain’ defines what a single row represents, and is typically identified by a primary key. From here, ‘dimension tables’ holding descriptive characteristics are joined to describe these events, culminating in what is termed the ‘Star schema’. Following these principles, Nubank ensures operational efficiency.
This involves mapping business processes, defining the grain, identifying dimensions, and detailing the actual event, leading to a structured database with flexible and scalable properties.
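To make the star schema idea concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table names, columns, and sample rows (fact_transaction, dim_customer, dim_product, amount_cents) are hypothetical and chosen only for illustration; they do not describe Nubank’s actual models.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table and two dimension tables, then load sample rows."""
    conn.executescript(
        """
        -- Dimensions: descriptive attributes of the entities involved in an event.
        CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, segment TEXT);
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
        -- Fact table: the grain is one row per transaction.
        CREATE TABLE fact_transaction (
            transaction_id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES dim_customer(customer_id),
            product_id INTEGER REFERENCES dim_product(product_id),
            amount_cents INTEGER
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_customer VALUES (?, ?)",
        [(1, "retail"), (2, "business")],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?)",
        [(10, "credit_card"), (11, "debit_card")],
    )
    conn.executemany(
        "INSERT INTO fact_transaction VALUES (?, ?, ?, ?)",
        [(100, 1, 10, 2500), (101, 1, 11, 1000), (102, 2, 10, 4000)],
    )


def total_by_segment(conn):
    """Join the fact table to a dimension and aggregate at the dimension level."""
    query = """
        SELECT c.segment, SUM(f.amount_cents)
        FROM fact_transaction AS f
        JOIN dim_customer AS c ON c.customer_id = f.customer_id
        GROUP BY c.segment
        ORDER BY c.segment
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
print(total_by_segment(conn))
```

The fact table stays narrow and only carries keys and measures, while any descriptive attribute used for slicing lives in a dimension table.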
However, technological advancements and changing paradigms have shifted the focus from storage to processing concerns.
The EAVT approach
Given how costly it is to add more information to an existing table, a more column-rich dataset becomes appealing. From this line of thinking emerged the EAVT (Entity, Attribute, Value, Timestamp) model. EAVT can be visualized as a table where columns are stacked up as rows, ready to be pivoted into the desired tabular format when necessary.
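As a rough illustration of the pivot, the sketch below turns a handful of EAVT rows into one wide record per entity, keeping the latest value of each attribute as of a given date. The row contents and the pivot_latest helper are hypothetical; this is not Nubank’s internal framework, only one simple way such a pivot could work.

```python
from collections import defaultdict

# Hypothetical EAVT rows: (entity, attribute, value, timestamp as ISO date string).
EAVT_ROWS = [
    ("customer-1", "status", "active", "2023-01-01"),
    ("customer-1", "credit_limit", 1000, "2023-01-01"),
    ("customer-1", "credit_limit", 1500, "2023-03-01"),
    ("customer-2", "status", "active", "2023-02-01"),
]


def pivot_latest(rows, as_of):
    """Pivot EAVT rows into {entity: {attribute: value}} as of a given date.

    For each (entity, attribute) pair, the value with the most recent
    timestamp at or before `as_of` wins. ISO date strings compare correctly
    as plain strings, so no date parsing is needed here.
    """
    latest = {}  # (entity, attribute) -> (timestamp, value)
    for entity, attribute, value, ts in rows:
        if ts > as_of:
            continue  # ignore facts recorded after the requested date
        key = (entity, attribute)
        if key not in latest or ts >= latest[key][0]:
            latest[key] = (ts, value)

    wide = defaultdict(dict)
    for (entity, attribute), (_, value) in latest.items():
        wide[entity][attribute] = value
    return dict(wide)


print(pivot_latest(EAVT_ROWS, "2023-02-15"))
```

Because every fact carries its own timestamp, the same rows can be pivoted to different points in time without being rewritten.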
The EAVT model, with its emphasis on Entity, Attribute, Value, and Timestamp, presents a refreshing perspective in the realm of data handling. One of its most pronounced advantages is the reduced need for schema modifications. This, in turn, provides a greater degree of modularity and facilitates simpler iterations. When working with large datasets, such a model proves invaluable, allowing data handlers to adapt swiftly to changes.
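The reduced need for schema modifications can be sketched with Python’s built-in sqlite3 module. The customer_wide and customer_eavt tables are hypothetical names used only for this comparison: adding a new attribute to the wide table requires altering its schema, while in the EAVT table it is just another row.

```python
import sqlite3


def column_names(conn, table):
    """Return the column names of a table, in declaration order."""
    # PRAGMA table_info yields one row per column; index 1 holds the name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_wide (customer_id TEXT, status TEXT)")
conn.execute(
    "CREATE TABLE customer_eavt (entity TEXT, attribute TEXT, value TEXT, ts TEXT)"
)

# Wide table: tracking a new attribute means changing the schema.
conn.execute("ALTER TABLE customer_wide ADD COLUMN credit_limit TEXT")

# EAVT table: the same new attribute arrives as data, and the schema is untouched.
conn.execute(
    "INSERT INTO customer_eavt VALUES (?, ?, ?, ?)",
    ("customer-1", "credit_limit", "1000", "2023-01-01"),
)

print(column_names(conn, "customer_wide"))
print(column_names(conn, "customer_eavt"))
```

The trade-off is that values share a single column, so typing and validation have to be handled by the logic that reads and pivots the rows.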
However, every silver lining has a cloud. While the EAVT model is revolutionary in many ways, it may not be the perfect fit for all scenarios. For instance, when dealing with smaller tables, implementing the EAVT model can be seen as excessive, perhaps even cumbersome. Another challenge arises when handling intricate business logic. In such situations, the model demands complex manipulation, which can be daunting for those unacquainted with its details.
Nubank has developed a robust framework for EAVT manipulation, equipped with monitoring tools, alert systems, and business rule trackers.
In conclusion, our journey within the intricate world of data has been both challenging and enlightening. Despite the milestones achieved, it feels as if we’re just starting, and we eagerly look forward to the endless possibilities ahead!