At Nubank, deploying machine learning models in a production environment built primarily on Clojure presents unique challenges and opportunities. 

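This article looks at how machine learning models are integrated into a Clojure-first environment: what production software requires, why standardization complicates the use of models written in other languages, and how the sidecar pattern and ONNX address the problem.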

Read on and find out all about it!

What is production software?

At a financial institution like Nubank, production software is more than the code itself. It is a complete ecosystem that includes:

  • Authentication: ensuring that only authorized services can communicate with the software.
  • Encryption: protecting sensitive data, such as personal identification numbers, during transmission.
  • Logging: keeping detailed records of all interactions with the software to aid in debugging and auditing.
  • Integration testing: verifying that different software components interact correctly in the production environment.
  • Governance: covering data governance (such as compliance with LGPD, Brazil's General Data Protection Law) and management of the overall production environment.

These elements are essential for creating a robust, secure, and compliant production environment.


Standardizing production components with Clojure as the primary language

In a large-scale environment like Nubank, where multiple teams deploy software, standardizing production components is crucial. This standardization ensures that all teams can focus on their core tasks without reinventing the wheel for common requirements like authentication, encryption, and logging. 

However, this standardization also poses challenges when integrating machine learning models, especially when these models are developed in languages like Python, which are not native to the Clojure ecosystem.

Nubank writes most of its production code in Clojure, which runs on the JVM. Many of the standardized components in the Nubank ecosystem are written specifically for Clojure, so they cannot be reused directly by models written in other languages, such as Python or R.

One early approach to this challenge was to rewrite essential components like authentication and encryption in Python to support machine learning models. While this allowed for some integration, it led to high engineering costs and incomplete solutions, as it was challenging to keep up with the rapid development pace in the Clojure ecosystem.

The sidecar pattern: a strategic solution

To overcome the limitations of the initial approach, Nubank adopted the sidecar pattern. This architectural solution involves deploying a Clojure service (the sidecar) alongside the machine learning model.

The sidecar handles all interactions with the broader infrastructure, while the machine learning model focuses solely on predictions. Some of the advantages of this solution include:

  • Reduced code duplication: the sidecar pattern eliminates the need to rewrite components in different languages, reducing the engineering workload.
  • Simplified model deployment: by offloading everything other than prediction (authentication, encryption, logging) to the sidecar, the machine learning model becomes easier to manage and deploy.
  • Leveraging existing infrastructure: the sidecar pattern allows machine learning models to benefit from the robust infrastructure already established for Clojure services at Nubank.
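
The division of labor described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Nubank's implementation: the sidecar in the article is a Clojure service, while here both sides are written in Python for brevity. The names `predict`, `PredictHandler`, `start_model_server`, and `sidecar_handle`, the `/predict` path, the token check, and the fixed weights are all hypothetical.

```python
import json
import logging
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

log = logging.getLogger("sidecar")


def predict(features):
    """Hypothetical stand-in for a trained model: a fixed linear score."""
    weights = [0.5, -0.25]  # illustrative values, not a real model
    return sum(w * x for w, x in zip(weights, features))


class PredictHandler(BaseHTTPRequestHandler):
    """Model side: parse features, return a score, and nothing else."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"score": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # logging is assumed to be the sidecar's responsibility


def start_model_server(port=0):
    """Bind to loopback so only a co-located sidecar can reach the model."""
    server = HTTPServer(("127.0.0.1", port), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def sidecar_handle(request, model_port, allowed_tokens):
    """Sidecar side (a Clojure service in the article; Python here for brevity).

    Handles the cross-cutting concerns, then forwards only the features.
    """
    if request.get("token") not in allowed_tokens:  # hypothetical auth check
        log.warning("rejected unauthenticated request")
        return {"status": 401}
    conn = HTTPConnection("127.0.0.1", model_port, timeout=5)
    try:
        conn.request(
            "POST",
            "/predict",
            body=json.dumps({"features": request["features"]}).encode(),
            headers={"Content-Type": "application/json"},
        )
        result = json.loads(conn.getresponse().read())
    finally:
        conn.close()
    log.info("prediction served")  # audit trail lives in the sidecar
    return {"status": 200, **result}
```

Note that the model process contains no authentication or logging code at all. Swapping in a different model only changes `predict`, while the infrastructure-facing logic stays in one place.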

Exploring ONNX for interoperability

What is ONNX?

Open Neural Network Exchange (ONNX) is an open-source format designed to allow machine learning models to be easily transferred between different frameworks. This format is particularly useful for ensuring interoperability between models developed in different languages and environments.

ONNX in the Nubank context

Nubank explored ONNX as a potential solution for integrating machine learning models with the existing Clojure-based infrastructure. The ONNX Runtime, which supports multiple programming languages, could allow machine learning models to be deployed without needing the original environment in which they were trained. Here are some of the benefits of this format:

  • Cross-language compatibility: a model trained in Python can be exported to ONNX and executed from a Clojure environment, for example through the ONNX Runtime's Java API, without rewriting the model.
  • Simplified dependency management: ONNX reduces the complexity of managing dependencies, particularly in production environments, by requiring only the ONNX Runtime.
  • Canonical model representation: ONNX provides a standardized way to represent and serve models across different deployments, potentially reducing time to market for new models.
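
The idea behind these benefits is that a model becomes data (a graph of operators) that any compatible runtime can execute. The toy below illustrates that idea only. It is not the ONNX format and does not use the ONNX Runtime API: the JSON layout, the `OPS` table, and the functions `export_logistic` and `run` are assumptions made up for this sketch.

```python
import json
import math

# Operators the toy runtime understands. A real interchange format defines a
# much larger, versioned operator set; three are enough for a logistic model.
OPS = {
    "dot": lambda x, w: sum(a * b for a, b in zip(x, w)),
    "add": lambda x, b: x + b,
    "sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
}


def export_logistic(weights, bias):
    """Training side: describe the model as data (a list of operator nodes)."""
    graph = {
        "input": "features",
        "output": "probability",
        "constants": {"w": weights, "b": bias},
        "nodes": [
            {"op": "dot", "inputs": ["features", "w"], "output": "z"},
            {"op": "add", "inputs": ["z", "b"], "output": "logit"},
            {"op": "sigmoid", "inputs": ["logit"], "output": "probability"},
        ],
    }
    return json.dumps(graph)


def run(serialized, features):
    """Serving side: needs only this runtime, not the training code."""
    graph = json.loads(serialized)
    values = dict(graph["constants"])
    values[graph["input"]] = features
    for node in graph["nodes"]:
        args = [values[name] for name in node["inputs"]]
        values[node["output"]] = OPS[node["op"]](*args)
    return values[graph["output"]]
```

The serving function never imports the code that produced the model, which is the property that simplifies dependency management. In a real deployment the serialized artifact would be an ONNX file and the interpreter would be the ONNX Runtime rather than this hand-written loop.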

The future of machine learning deployment at Nubank

Nubank continues to explore innovative solutions like ONNX while maintaining the stability and robustness of its Clojure-first environment. 

The sidecar pattern remains a strategic choice for integrating machine learning models, with ONNX being considered for specific use cases where it can provide clear business value.

By balancing standardization, interoperability, and business needs, Nubank employees are finding creative solutions while staying true to the company’s technological foundation.

That’s how we’re going to build the purple future. That’s working at Nu!
