Using Machine Learning to Optimize Subscription Billing
- 04 Dec, 2017
As a data scientist at Recurly, my job is to use the vast amount of data that we have collected to build products that make subscription businesses more successful. One way to think about data science at Recurly is as an extended R&D department for our customers. We use a variety of tools and techniques, attack problems big and small, but at the end of the day, our goal is to put all of Recurly’s expertise to work in service of your business.
Managing a successful subscription business requires a wide range of decisions. What is the optimum structure for subscription plans and pricing? What are the most effective subscriber acquisition methods? What are the most efficient collection methods for delinquent subscribers? What strategies will reduce churn and increase revenue?
At Recurly, we're focused on building the most flexible subscription management platform, a platform that provides a competitive advantage for your business. We reduce the complexity of subscription billing so you can focus on winning new subscribers and delighting current subscribers.
Recently, we turned to data science to tackle a big problem for subscription businesses: involuntary churn.
The Problem: The Retry Schedule
One of the most important factors in subscription commerce is subscriber retention. Every billing event needs to occur flawlessly to avoid damaging the subscriber relationship or, worse yet, losing that subscriber to churn.
Every time a subscription comes up for renewal, Recurly creates an invoice and initiates a transaction using the customer’s stored billing information, typically a credit card. Sometimes, this transaction is declined by the payment processor or the customer’s bank. When this happens, Recurly sends reminder emails to the customer, checks with the Account Updater service to see if the customer's card has been updated, and also attempts to collect payment at various intervals over a period of time defined by the subscription business. The timing of these collection attempts is called the “retry schedule.”
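To make the idea of a retry schedule concrete, here is a minimal sketch of a fixed schedule. The function name, the day offsets, and the collection window are hypothetical values chosen for illustration; they are not Recurly's actual settings.

```python
from datetime import datetime, timedelta

def static_retry_schedule(declined_at, offsets_days, window_days):
    """Return retry times at fixed day offsets after a decline.

    Offsets that fall outside the collection window set by the
    subscription business are dropped.
    """
    return [
        declined_at + timedelta(days=d)
        for d in sorted(offsets_days)
        if 0 < d <= window_days
    ]

# Hypothetical example: a decline on 4 Dec 2017 and a 20-day window.
declined = datetime(2017, 12, 4, 9, 0)
schedule = static_retry_schedule(declined, [3, 7, 14, 21], window_days=20)
for attempt in schedule:
    print(attempt.isoformat())
```

The retry at day 21 is dropped because it falls after the 20-day window, so only three attempts remain.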
Our ability to correct and successfully retry these cards prevents lost revenue, positively impacts your bottom line, and increases your customer retention rate.
Other subscription providers typically offer a static, one-size-fits-all retry schedule, or leave the schedule up to the subscription business without providing any guidance. In contrast, Recurly can use machine learning to craft a retry schedule tailored to each individual invoice, drawing on historical data from hundreds of millions of transactions. Our approach gives each invoice the best chance of success, without any manual work by our customers.
A key component of Recurly’s values is to test, learn and iterate. How did we call on those values to build this critical component of the Recurly platform?
Applying Machine Learning
We decided to use statistical models that leverage Recurly’s data on transactions (hundreds of millions of transactions built up over years from a wide variety of different businesses) to predict which transactions are likely to succeed. Then, we used these models to craft the ideal retry schedule for each individual invoice. The process of building the models is known as machine learning.
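One way to picture how a model's predictions could turn into a schedule is the sketch below. It scores every candidate retry day and picks the one with the highest predicted chance of success. The scoring function is a hand-written stand-in with made-up weights, not a trained model, and every name and field in it is an assumption for illustration.

```python
from datetime import datetime, timedelta

def toy_success_probability(invoice, retry_at):
    """Hypothetical stand-in for a trained model; returns a value in [0, 1].

    The weights below are invented for illustration only.
    """
    score = 0.30
    if retry_at.day in (1, 15):       # assumed payday effect
        score += 0.20
    if retry_at.weekday() < 5:        # assumed weekday effect
        score += 0.10
    if invoice["decline_reason"] == "insufficient_funds":
        days_waited = (retry_at - invoice["declined_at"]).days
        score += min(days_waited, 10) * 0.01
    return min(score, 1.0)

def best_retry_time(invoice, window_days, predict=toy_success_probability):
    """Score one candidate per day in the window and return the best one."""
    candidates = [
        invoice["declined_at"] + timedelta(days=d)
        for d in range(1, window_days + 1)
    ]
    return max(candidates, key=lambda t: predict(invoice, t))

invoice = {
    "declined_at": datetime(2017, 12, 4, 9, 0),
    "decline_reason": "insufficient_funds",
}
best = best_retry_time(invoice, window_days=14)
print(best.isoformat())
```

Because `predict` is passed in as a parameter, a real trained model could replace the toy function without changing the selection logic.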
The term "machine learning" encompasses many different processes and methods, but at its heart is an effort to go past explicitly programmed logic and allow a computer to arrive at the best logic on its own.
Humans are remarkably good at learning certain tasks (children can speak a new language after simply listening for a few months), and computers can likewise be trained to learn patterns. Aggregating hundreds of millions of transactions to look for the patterns that lead to transaction success is a classic machine learning problem.
A typical machine learning project involves gathering data, training a statistical model on that data, and then evaluating the performance of the model when presented with new data. A model is only as good as the data it’s trained on, and here we had a huge advantage.
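The gather, train, and evaluate loop described above can be sketched in a few lines. Everything here is an assumption for illustration: the data is synthetic, the success rates are invented, and the "model" is just the observed success rate per decline reason, which is far simpler than a production model would be.

```python
import random
from collections import defaultdict

def make_synthetic_transactions(n, seed=0):
    """Fabricated data for illustration only; not real payment data."""
    rng = random.Random(seed)
    true_rate = {"insufficient_funds": 0.8, "card_expired": 0.2}  # made-up rates
    rows = []
    for _ in range(n):
        reason = rng.choice(list(true_rate))
        rows.append((reason, rng.random() < true_rate[reason]))
    return rows

def train(rows):
    """Fit the 'model': observed retry success rate per decline reason."""
    counts = defaultdict(lambda: [0, 0])  # reason -> [successes, total]
    for reason, succeeded in rows:
        counts[reason][0] += int(succeeded)
        counts[reason][1] += 1
    return {reason: s / t for reason, (s, t) in counts.items()}

def accuracy(model, rows):
    """Share of rows where 'predicted rate >= 0.5' matches the outcome."""
    correct = sum(
        (model.get(reason, 0.5) >= 0.5) == succeeded
        for reason, succeeded in rows
    )
    return correct / len(rows)

# 1. Gather data, 2. train on one part, 3. evaluate on data the model has not seen.
data = make_synthetic_transactions(2000)
split = int(len(data) * 0.7)
train_rows, test_rows = data[:split], data[split:]
model = train(train_rows)
held_out_accuracy = accuracy(model, test_rows)
print(model, round(held_out_accuracy, 3))
```

Evaluating on rows held out from training is what shows whether the model has found a real pattern or has only memorized the data it was shown.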