How Lumi streamlines loan approvals with Amazon SageMaker AI


This post is co-written with Paul Pagnan from Lumi.

Lumi is a leading Australian fintech lender empowering small businesses with fast, flexible, and transparent funding solutions. They use real-time data and machine learning (ML) to offer customized loans that fuel sustainable growth and solve the challenges of accessing capital. Their goal is to provide fast turnaround times (hours instead of days) to set them apart from traditional lenders. This post explores how Lumi uses Amazon SageMaker AI to meet this goal, enhance their transaction processing and classification capabilities, and ultimately grow their business by providing faster processing of loan applications, more accurate credit decisions, and improved customer experience.

Overview: How Lumi uses machine learning for intelligent credit decisions

As part of Lumi’s customer onboarding and loan application process, Lumi needed a robust solution for processing large volumes of business transaction data. The classification process needed to operate with low latency to support Lumi’s market-leading speed-to-decision commitment, and to intelligently categorize transactions based on their descriptions and other contextual factors about the business, so that each transaction maps to the appropriate category. These classified transactions then serve as critical inputs for downstream credit risk AI models, enabling more accurate assessments of a business’s creditworthiness. To achieve this, Lumi developed a classification model based on BERT (Bidirectional Encoder Representations from Transformers), a state-of-the-art natural language processing (NLP) technique, and fine-tuned it on their proprietary dataset using in-house data science expertise. BERT-based models excel at understanding context and nuance in text (a brief sketch follows the list below), making them particularly effective for:

  • Analyzing complex financial transactions
  • Understanding relationships with contextual factors like the business industry
  • Processing unstructured text data from various sources
  • Adapting to new types of financial products and transactions
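
The post doesn’t include Lumi’s model code, which is proprietary. As a rough illustration only, the following minimal sketch shows the general shape of a BERT-based transaction classifier using the Hugging Face Transformers library with TensorFlow; the base checkpoint, category labels, and input format are all assumptions, not Lumi’s actual design.

```python
# Illustrative sketch of a BERT-based transaction classifier.
# Checkpoint, labels, and input format are assumptions, not Lumi's design.
import tensorflow as tf
from transformers import AutoTokenizer, TFBertForSequenceClassification

CATEGORIES = ["revenue", "payroll", "loan_repayment", "other"]  # hypothetical

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = TFBertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(CATEGORIES)
)  # fine-tune on labeled transaction data before using in production

def classify(description: str, industry: str) -> tuple[str, float]:
    # Pair the transaction description with business context, reflecting
    # the post's point about contextual factors such as industry.
    inputs = tokenizer(industry, description, return_tensors="tf",
                       truncation=True, padding=True, max_length=128)
    probs = tf.nn.softmax(model(**inputs).logits, axis=-1)[0]
    idx = int(tf.argmax(probs))
    return CATEGORIES[idx], float(probs[idx])

label, confidence = classify("DIRECT CREDIT STRIPE PAYOUT", "ecommerce")
```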

Operating in the financial services industry, Lumi needs confidence in the accuracy of the model’s output to produce reliable risk assessments. As a result, Lumi implements a human-in-the-loop process that incorporates the expertise of their risk and compliance teams, who review and correct a sample of classifications so that the model remains accurate on an ongoing basis. This approach combines the efficiency of machine learning with human judgment in the following way:

  1. The ML model processes and classifies transactions rapidly.
  2. Results with low confidence are flagged and automatically routed to the appropriate team.
  3. Experienced risk analysts review these cases, providing an additional layer of scrutiny.
  4. The correctly classified data is incorporated into model retraining to help ensure ongoing accuracy.

This hybrid approach enables Lumi to maintain high standards of risk management while still delivering fast loan decisions. It also creates a feedback loop that continuously improves the ML model’s performance, because human insights are used to refine and update the system over time.
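
The post doesn’t specify how low-confidence results are flagged; one common pattern, sketched below under that assumption, is a simple confidence threshold that routes uncertain predictions to a review queue. The threshold value and queue interface here are illustrative.

```python
# Hypothetical confidence-based routing for human-in-the-loop review.
# The threshold value and queue interface are illustrative assumptions.
CONFIDENCE_THRESHOLD = 0.85

def route_classification(txn_id: str, label: str, confidence: float,
                         review_queue: list[dict]) -> str:
    """Return the final label, or flag the transaction for human review."""
    if confidence < CONFIDENCE_THRESHOLD:
        # Low-confidence result: send to the risk and compliance team.
        # Corrected labels later feed back into model retraining.
        review_queue.append({"txn_id": txn_id, "model_label": label,
                             "confidence": confidence})
        return "pending_review"
    return label  # high-confidence results flow to downstream credit models
```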

Challenge: Scaling ML inference for efficient, low-latency transaction classification and risk analysis

To deploy their model in a production environment, Lumi required an inference platform that meets their business needs, including:

  • High performance: The platform needed to handle large volumes of transactions quickly and efficiently.
  • Low latency: To maintain an excellent customer experience and fast turnaround times for loan applications, the platform needed to return results quickly.
  • Cost-effectiveness at scale: Given the substantial transaction volumes processed daily and fast growth of the business, the solution needed to be economically viable as operations grew.
  • Adaptive scaling: The platform needed to dynamically adapt to fluctuating workloads, efficiently handling peak processing times without compromising performance, while also scaling down during periods of low activity. Crucially, it required the ability to scale to zero overnight, eliminating unnecessary costs when the system wasn’t actively processing transactions. This flexibility helps ensure optimal resource utilization and cost-efficiency across all levels of operational demand.
  • Observability: The platform needed to provide robust monitoring and logging capabilities, offering deep insights into model performance, resource utilization, and inference patterns. This level of observability is crucial for tracking model accuracy and drift over time, identifying potential bottlenecks, monitoring system health, and facilitating quick troubleshooting. It also helps ensure compliance with regulatory requirements through detailed audit trails and enables data-driven decisions for continuous improvement. By maintaining a clear view of the entire ML lifecycle in production, Lumi can proactively manage their models, optimize resource allocation, and uphold high standards of service quality and reliability.

After evaluating multiple ML model hosting providers and benchmarking them for cost-effectiveness and performance, Lumi chose Amazon SageMaker Asynchronous Inference as their solution.

Solution: Using asynchronous inference on Amazon SageMaker AI

Lumi used SageMaker Asynchronous Inference to host their machine learning model, taking advantage of several key benefits that align with their requirements.

Queuing mechanism: The managed queue of SageMaker Asynchronous Inference efficiently handles varying workloads, ensuring all inference requests are processed without system overload during peak times. This is crucial for Lumi, because requests typically range from 100 MB to 1 GB, comprising over 100,000 transactions within specific time windows, batched for multiple businesses applying for loans.

Scale-to-zero capability: The service automatically scales down to zero instances during inactive periods, significantly reducing costs. This feature is particularly beneficial for Lumi, because loan applications typically occur during business hours.
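
AWS documents scale-to-zero for asynchronous endpoints through Application Auto Scaling, using a target-tracking policy on the backlog-per-instance metric. The sketch below follows that documented pattern; the endpoint name, variant, capacity limits, and target value are placeholder assumptions, not Lumi’s configuration.

```python
# Sketch: scale-to-zero for an async endpoint via Application Auto Scaling.
# Endpoint/variant names, capacities, and target value are illustrative.
import boto3

aas = boto3.client("application-autoscaling")
resource_id = "endpoint/transaction-classifier/variant/AllTraffic"  # hypothetical

aas.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=0,   # lets the endpoint scale to zero overnight
    MaxCapacity=5,
)

aas.put_scaling_policy(
    PolicyName="backlog-target-tracking",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        # Track queued requests per instance; as the backlog grows the
        # endpoint scales out, and it scales back in (to zero) when idle.
        "TargetValue": 5.0,
        "CustomizedMetricSpecification": {
            "MetricName": "ApproximateBacklogSizePerInstance",
            "Namespace": "AWS/SageMaker",
            "Dimensions": [{"Name": "EndpointName",
                            "Value": "transaction-classifier"}],
            "Statistic": "Average",
        },
    },
)
```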

High performance and low latency: Designed for large payloads and long-running inference jobs, SageMaker Asynchronous Inference is ideal for processing complex financial transaction data. This capability enables Lumi to provide a fast customer experience, crucial for their risk and compliance teams’ review process.

Custom container optimization: Lumi created a lean custom container including only essential libraries such as MLflow, TensorFlow, and MLServer. Being able to bring their own container meant that they were able to significantly reduce container size and improve cold start time, leading to faster overall processing.

Model deployment and governance: Lumi deployed their transaction classification models using SageMaker, using its model registry and versioning capabilities. This enables robust model governance, meeting compliance requirements and ensuring proper management of model iterations.
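
The post doesn’t show Lumi’s registry setup; a minimal boto3 sketch of registering a model version in the SageMaker model registry looks roughly like the following, with the group name, image URI, and artifact path as placeholders.

```python
# Sketch: versioning a model in the SageMaker model registry.
# Group name, image URI, and model artifact path are placeholders.
import boto3

sm = boto3.client("sagemaker")

sm.create_model_package_group(
    ModelPackageGroupName="transaction-classifier",      # hypothetical
    ModelPackageGroupDescription="BERT transaction classification models",
)

sm.create_model_package(
    ModelPackageGroupName="transaction-classifier",
    ModelPackageDescription="v2: retrained with human-corrected labels",
    ModelApprovalStatus="PendingManualApproval",  # gate releases for governance
    InferenceSpecification={
        "Containers": [{
            "Image": "<account>.dkr.ecr.<region>.amazonaws.com/classifier:latest",
            "ModelDataUrl": "s3://<bucket>/models/classifier/model.tar.gz",
        }],
        "SupportedContentTypes": ["application/json"],
        "SupportedResponseMIMETypes": ["application/json"],
    },
)
```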

Integration with existing systems on AWS: Lumi seamlessly integrated SageMaker Asynchronous Inference endpoints with their existing loan processing pipeline. Using Databricks on AWS for model training, they built a pipeline to host the model in SageMaker AI, optimizing data flow and results retrieval. The pipeline uses several AWS services familiar to Lumi’s team. When loan applications arrive, the application, hosted on Amazon Elastic Kubernetes Service (Amazon EKS), initiates asynchronous inference by calling InvokeEndpointAsync. Amazon Simple Storage Service (Amazon S3) stores both the batch data required for inference and the resulting output. Amazon Simple Notification Service (Amazon SNS) notifies relevant stakeholders of job status updates.
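
A minimal sketch of that invocation step, using the SageMaker runtime’s InvokeEndpointAsync API via boto3; the endpoint name and S3 locations are placeholders.

```python
# Sketch: submitting a batch of transactions to the async endpoint.
# Endpoint name, bucket, and key are placeholders.
import boto3

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint_async(
    EndpointName="transaction-classifier",               # hypothetical
    InputLocation="s3://<bucket>/inference-inputs/application-123.json",
    ContentType="application/json",
)

# The call returns immediately; SageMaker queues the job and writes the
# classified transactions to S3 when done. SNS topics configured in the
# endpoint's AsyncInferenceConfig deliver success/error notifications.
print(response["OutputLocation"])
```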

Instance selection and performance benchmarking: To optimize their deployment, Lumi benchmarked latency, cost, and scalability across multiple inference serving options, including real-time endpoints and different instance types. Lumi prepared a series of bank transaction inputs of varying sizes based on an analysis of real production data, then used JMeter to call the asynchronous inference endpoint and simulate real production load on the cluster. The results showed that while real-time inference on larger instances provided lower latency for individual requests, asynchronous inference on ml.c5.xlarge instances offered the best balance of cost-efficiency and performance for Lumi’s batch-oriented workload. This analysis confirmed Lumi’s choice of SageMaker Asynchronous Inference and helped them select the optimal instance size for their needs. After updating the model to use TensorFlow with CUDA support, Lumi conducted further optimization by moving to a GPU-enabled ml.g5.xlarge cluster, which improved performance by 82% while reducing costs by 10%.
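
Lumi used JMeter for load generation. Purely as a lightweight stand-in for illustration, the Python sketch below submits concurrent requests to an asynchronous endpoint and measures end-to-end latency by waiting for each output object to appear in S3; all names and paths are placeholder assumptions.

```python
# Sketch: measuring end-to-end async inference latency (an illustrative
# stand-in for Lumi's JMeter setup; names and paths are placeholders).
import time
import boto3
from concurrent.futures import ThreadPoolExecutor

runtime = boto3.client("sagemaker-runtime")
s3 = boto3.client("s3")

def time_one_request(input_key: str) -> float:
    start = time.time()
    out = runtime.invoke_endpoint_async(
        EndpointName="transaction-classifier",        # hypothetical
        InputLocation=f"s3://<bucket>/{input_key}",
        ContentType="application/json",
    )["OutputLocation"]
    bucket, key = out.replace("s3://", "").split("/", 1)
    # Block until SageMaker writes the result object, then report latency.
    s3.get_waiter("object_exists").wait(
        Bucket=bucket, Key=key, WaiterConfig={"Delay": 2, "MaxAttempts": 150}
    )
    return time.time() - start

# Fire a burst of concurrent requests to simulate peak load.
with ThreadPoolExecutor(max_workers=16) as pool:
    latencies = list(pool.map(time_one_request,
                              [f"inputs/{i}.json" for i in range(32)]))
print(f"p50={sorted(latencies)[len(latencies) // 2]:.1f}s")
```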

Best practices and recommendations

For businesses looking to implement similar solutions, consider the following best practices:

Optimize your container: Follow Lumi’s lead by creating a lean, custom container with only the necessary dependencies. This approach can significantly improve inference speed and reduce costs.

Use asynchronous processing: For workloads with variable volume or long processing times, asynchronous inference can provide substantial benefits in terms of scalability and cost-efficiency.

Plan for scale: Design your ML infrastructure with future growth in mind. SageMaker AI’s flexibility allows you to easily add new models and capabilities as your needs evolve.

Model observability and governance: When evaluating an inference and hosting platform, consider its observability and governance capabilities. SageMaker AI’s robust observability and governance features make it easier to diagnose issues, maintain model performance, ensure compliance, and support continuous improvement and production quality.

Conclusion

By implementing SageMaker AI, Lumi has achieved significant improvements to their business. They have seen a 56% increase in transaction classification accuracy after moving to the new BERT-based model. The ability to handle large batches of transactions asynchronously has reduced the overall processing time for loan applications by 53%. The auto scaling and scale-to-zero features have produced substantial cost savings during off-peak hours, improving the cost-efficiency of the model by 47%. In addition, Lumi can now easily handle sudden spikes in loan applications without compromising on processing speed or accuracy.

“Amazon SageMaker AI has been a game-changer for our business. It’s allowed us to process loan applications faster, more efficiently and more accurately than ever before, while significantly reducing our operational costs. The ability to handle large volumes of transactions during peak times and scale to zero during quiet periods has given us the flexibility we need to grow rapidly without compromising on performance or customer experience. This solution has been instrumental in helping us achieve our goal of providing fast, reliable loan decisions to small businesses.”

– Paul Pagnan, Chief Technology Officer at Lumi

Encouraged by the success of their implementation, Lumi is expanding their use of Amazon SageMaker AI to other models and exploring tools such as Amazon Bedrock to enable generative AI use cases. The company aims to host additional models on the platform to further enhance their lending process through machine learning, including: enhancements to their already sophisticated credit scoring and risk assessment models to assess loan applicability more accurately, customer segmentation models to better understand their customer base and personalize loan offerings, and predictive analytics to proactively identify market trends and adjust lending strategies accordingly.

About the Authors

Paul Pagnan is the Chief Technology Officer at Lumi. Paul drives Lumi’s technology strategy, having led the creation of its proprietary core lending platform from inception. With a diverse background spanning startups, Commonwealth Bank, and Deloitte, he keeps Lumi at the forefront of technology while ensuring its systems are scalable and secure. Under Paul’s leadership, Lumi is setting new standards in FinTech. Follow him on LinkedIn.

Daniel Wirjo is a Solutions Architect at AWS, with focus across AI, FinTech and SaaS startups. As a former startup CTO, he enjoys collaborating with founders and engineering leaders to drive growth and innovation on AWS. Outside of work, Daniel enjoys taking walks with a coffee in hand, appreciating nature, and learning new ideas. Follow him on LinkedIn.

Melanie Li, PhD, is a Senior Generative AI Specialist Solutions Architect at AWS based in Sydney, Australia, where her focus is on working with customers to build solutions leveraging state-of-the-art AI and machine learning tools. She has been actively involved in multiple Generative AI initiatives across APJ, harnessing the power of Large Language Models (LLMs). Prior to joining AWS, Dr. Li held data science roles in the financial and retail industries. Follow her on LinkedIn.
