Finally, The Secret to Risk Modeling with Python

February 7, 2018

Author: Paul Thomas

A Brief Recap on the History of Python

It was the week of Christmas 1989 in Amsterdam, 48°F and cloudy, when Guido van Rossum started tinkering with a small project to keep himself busy while his employer’s office was closed for the holiday. Van Rossum set out to create a language, based on his previous project, ABC, that relied on Unix infrastructure and conventions without being Unix-bound. He placed a high emphasis on readability and uniformity in an era that praised languages like C and Perl (PHP’s high-maintenance personality would crash the party later). The result was a gorgeously elegant open source language named after Monty Python’s Flying Circus.

Since its first public release in 1991, Python has displayed its prowess across multiple industries and in widely varying use cases. It is now the most widely taught introductory language in the top computer science programs worldwide, was named the most in-demand programming language in the U.S. by Forbes’ fintech columnist, and hit #2 on the list of languages with the most GitHub pull requests in 2017.

Python in Finance

One increasingly popular application of Python is credit risk modeling. Many brilliant data scientists and analysts take advantage of Python’s usability to implement machine learning and deep learning algorithms. From simple classification and regression methods like logistic regression, decision trees, random forests, and support vector machines to more advanced methods like clustering and neural networks, Python’s many advantages are opening the door to applying artificial intelligence to credit risk challenges.
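
To make that a little more concrete, here is a minimal sketch of the simplest method on that list, logistic regression, written in plain Python with no libraries. Everything in it is an assumption made for illustration: the two features (debt-to-income ratio and missed payments), the eight toy applicants, and the function names are invented, and a real credit risk model would be trained on real data with a vetted library rather than a hand-rolled loop.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic regression with plain batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to z
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_default_probability(weights, bias, x):
    """Return the modeled probability that an applicant defaults."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Made-up toy data: [debt-to-income ratio, missed payments in the last year]
applicants = [
    [0.10, 0], [0.15, 0], [0.20, 1], [0.25, 0],
    [0.45, 2], [0.50, 3], [0.60, 2], [0.55, 4],
]
defaulted = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = defaulted, 0 = repaid

weights, bias = fit_logistic(applicants, defaulted)
low_risk = predict_default_probability(weights, bias, [0.12, 0])
high_risk = predict_default_probability(weights, bias, [0.58, 3])
print(f"low-risk applicant:  {low_risk:.2f}")
print(f"high-risk applicant: {high_risk:.2f}")
```

The fitted model is nothing more than a short list of weights and a bias, which is worth keeping in mind when the conversation turns to getting models into production.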


Yet, despite the promise that Python and other popular languages bring to financial services, we talk to companies nearly every day who vent frustration with the process of testing and deploying models to production, then maintaining or changing models as business objectives evolve. It has been stated that “The next breakthrough in data analysis may not be in individual algorithms, but in the ability to rapidly combine, deploy, and maintain existing algorithms.” I would venture to guess that most of us in financial services (or any data-driven industry, for that matter) would say, “Of course.” Delays in model deployment are nothing new, and we’re ready for a change.

Challenges Deploying Risk Models

Ben Lorica, the Chief Data Scientist at O’Reilly, once called out the delays between modeling and production, identifying two root causes for the long analytical lifecycle:

  1. Silos – Data Scientists and Production Engineering teams are historically divided, and the wall between the two organizations causes inherent delays.
  2. Recoding – Models often have to be recoded before they can be deployed into production. So, while your data scientist may prefer Python, your production system probably requires Java.
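
To see why recoding is such a drag, it helps to look at what actually has to cross the wall. The sketch below is one hypothetical way to hand a fitted model over as data instead of source code: the parameters are written to JSON, and a small scoring function reads them back. The field names, the coefficients, and both function names are assumptions invented for this example; this is not a description of how Provenir, PMML, or any particular production system works.

```python
import json
import math


def export_model(weights, bias, feature_names):
    """Serialize model parameters to a language-neutral JSON string."""
    return json.dumps({
        "model_type": "logistic_regression",
        "features": feature_names,
        "weights": weights,
        "bias": bias,
    })


def score(model_json, applicant):
    """Score one applicant (a dict of feature values) from exported JSON."""
    model = json.loads(model_json)
    z = model["bias"] + sum(
        w * applicant[name]
        for name, w in zip(model["features"], model["weights"])
    )
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, chosen only to keep the arithmetic easy to follow
model_json = export_model(
    [2.0, 1.5], -3.0, ["debt_to_income", "missed_payments"]
)
probability = score(model_json, {"debt_to_income": 0.5, "missed_payments": 2})
print(f"probability of default: {probability:.3f}")
```

Even in this toy version, someone still has to write and maintain a scorer on the production side that agrees exactly with the data scientist's math, and that duplicated effort is where the delays tend to pile up.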

You’re probably squirming in your seat right now because you get the same familiar anxiety that I do when thinking about this dynamic. But, stay with me.

It Doesn’t Have to Be This Way

If everybody is so frustrated with this problem, why doesn’t anyone fix it?

I’m so glad you asked.

Provenir tackled this challenge head-on in two very distinct ways:

  1. Empowerment – Silos or no silos, your production system should be so easy to use that your data scientist can deploy a model autonomously. Deploying a model into a Provenir decisioning environment is as simple as attaching a document to an email. Just upload the document, visually map your values, and go.
  2. Native Operationalization – Recoding?! That sounds like re-work to me. Why not operationalize models in their native languages? So, Provenir supports the native operationalization of risk models in R, SAS, and Excel along with PMML and MathML. Today, we are delighted to launch support for Python models.

If this is something you’ve struggled with and you’re looking for a better way, we’d love to talk.

Or, see it in action in this step-by-step guide on operationalizing risk models in Provenir

(It’s so easy, it’s only two pages long.)


1 https://queue.acm.org/detail.cfm?id=2431055
2 https://www.oreilly.com/ideas/data-scientists-and-the-analytic-lifecycle
