Applying Prediction Results From Amazon's Machine Learning

John Harlan, over 1 year ago

Machine Learning Is More Than Just Making Predictions

We wanted to follow up on our previous post to talk a little more about the actual implementation of Machine Learning applications, specifically the need to handle cases of (1) low-certainty predictions and (2) misprediction. In our opinion, how you use and approach the results can be more important to a successful implementation of a Machine Learning-based application than the actual model itself.

Nobody's Perfect

Your model will not make correct predictions 100% of the time. Everybody knows that. It will also be unsure of the result at times and produce low-certainty results. The most important step in implementing a Machine Learning app in your business processes is building the application to handle these scenarios so that business disruption is limited. It's difficult to imagine a scenario where you would blindly take whatever the highest-scoring prediction from a Machine Learning model is and push it into your workflow or other business processes.

Our Approach

A couple of things to note before we jump into our approach. In this post, we are talking about the use case of (1) a multi-class model and (2) a high-value business workflow process (i.e. mistakes are very costly). We focused on a few points in our implementation of Machine Learning into a client's workflow processes that helped create a successful outcome:

Starting Early Will Catch More Mistakes: The earlier in the work process the model is applied, the better, as you have more opportunity to correct mistakes in later steps and to provide human validation of the model's results. Simple enough. Additionally, this provides a very valuable 'quality control' element to your application.

Set Your Pain Tolerance: Even with a highly accurate model, you will need to decide at what certainty level you will reject predicted results. Is it 99% certainty? Or 80%? It all depends on the cost of an inaccurate prediction entering the workflow versus the cost of rejecting the prediction and handling the instance manually.

Repetitive Processes With Predefined Outcomes: Certain functions lend themselves to machine learning applications more than others. To that end, we focused on functions in our clients' workflows that represent a highly repetitive task with a small, established set of predefined outcomes. For example, we want to predict a document's type out of a total of five potential results, and the function of assigning document types is the same from one document to the next.

Realistic Expectations: By establishing a goal of 'Highly certain predictions 80% of the time', we set ourselves up for a greater chance of success in a shorter timeline. As in all things, there is a diminishing return on investment, and you don't want to get caught in a never-ending cycle of trying to control for edge cases. Your goal isn't to replace human interaction entirely; rather, it's to focus valuable labor hours on handling tough edge cases and higher-value functions.
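
To make the 'pain tolerance' and '80% of the time' ideas concrete, here is a minimal Python sketch of how a rejection threshold might be applied to multi-class results. It assumes each prediction arrives as a plain dictionary mapping class labels to scores between 0 and 1; the function names, labels, and sample scores are hypothetical and not taken from our client implementation or from any particular API.

```python
from typing import Dict, List, Optional, Tuple


def route_prediction(
    scores: Dict[str, float], threshold: float
) -> Tuple[Optional[str], float]:
    """Accept the top-scoring label only if its score meets the threshold.

    Returns (label, score) when accepted, or (None, score) when the
    prediction should be handed to a person for manual handling.
    """
    label, score = max(scores.items(), key=lambda item: item[1])
    if score >= threshold:
        return label, score
    return None, score


def coverage(batch: List[Dict[str, float]], threshold: float) -> float:
    """Fraction of a batch whose top prediction would be accepted."""
    if not batch:
        return 0.0
    accepted = sum(
        1 for scores in batch if route_prediction(scores, threshold)[0] is not None
    )
    return accepted / len(batch)


# Hypothetical document-type scores, for illustration only.
sample_batch = [
    {"invoice": 0.95, "contract": 0.03, "other": 0.02},
    {"invoice": 0.50, "contract": 0.45, "other": 0.05},
]

for scores in sample_batch:
    label, score = route_prediction(scores, threshold=0.9)
    action = f"accept as {label}" if label else "send to manual review"
    print(f"top score {score:.2f}: {action}")

print("coverage at 0.9:", coverage(sample_batch, threshold=0.9))
```

Running `coverage` over a held-out set at several thresholds is one way to check whether a goal like 'highly certain predictions 80% of the time' is realistic before committing to a threshold.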

Should We Do It?

So putting it all together, how do you decide when to implement a machine learning application within an existing business workflow? We think it's a straightforward equation and one you will have seen before:

Savings = TotalManualLaborCost x AcceptanceThreshold x PredictionAccuracyPercentage
ErrorCost = AcceptanceThreshold x PredictionErrorRate x TotalUnits x CostOfError
NetSavingsPerUnit = (Savings - ErrorCost) / TotalUnits
BreakEvenUnits = CostOfImplementation / NetSavingsPerUnit
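
As a rough illustration, the four formulas can be evaluated with a short Python sketch. It reads AcceptanceThreshold as the fraction of units whose predictions are accepted, since the formulas multiply by it as a proportion; that reading is our assumption here. The field names simply mirror the terms above, and every number in the example is made up.

```python
import math
from dataclasses import dataclass


@dataclass
class WorkflowEconomics:
    total_units: int                # TotalUnits
    total_manual_labor_cost: float  # TotalManualLaborCost
    acceptance_threshold: float     # AcceptanceThreshold, assumed to be a 0-1 fraction
    prediction_accuracy: float      # PredictionAccuracyPercentage as a 0-1 fraction
    prediction_error_rate: float    # PredictionErrorRate as a 0-1 fraction
    cost_of_error: float            # CostOfError per mispredicted unit
    cost_of_implementation: float   # CostOfImplementation


def net_savings_per_unit(e: WorkflowEconomics) -> float:
    """(Savings - ErrorCost) / TotalUnits, following the formulas as written."""
    savings = (
        e.total_manual_labor_cost * e.acceptance_threshold * e.prediction_accuracy
    )
    error_cost = (
        e.acceptance_threshold
        * e.prediction_error_rate
        * e.total_units
        * e.cost_of_error
    )
    return (savings - error_cost) / e.total_units


def break_even_units(e: WorkflowEconomics) -> float:
    """CostOfImplementation / NetSavingsPerUnit; infinite if there is no net saving."""
    net = net_savings_per_unit(e)
    if net <= 0:
        return math.inf
    return e.cost_of_implementation / net


# Made-up figures for illustration only.
example = WorkflowEconomics(
    total_units=10_000,
    total_manual_labor_cost=50_000.0,
    acceptance_threshold=0.8,
    prediction_accuracy=0.98,
    prediction_error_rate=0.02,
    cost_of_error=20.0,
    cost_of_implementation=30_000.0,
)

print(f"net savings per unit: {net_savings_per_unit(example):.2f}")
print(f"break-even units: {break_even_units(example):.0f}")
```

Returning infinity when net savings per unit is zero or negative is a choice made for this sketch: it signals that, under those inputs, the implementation cost would never be recovered.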