Deploying and Controlling Intelligent Predictions for Improved Financial Performance

Day5 Analytics’ client completes thousands of residential electrical, HVAC, and plumbing service requests each month, in which technicians attend a problem and attempt to repair it. The cost of each request is affected by various factors, such as when the request is made, the urgency with which the work needs to be completed, and the number of previous calls made to service the same equipment. The final cost of a service is known only once the work is completed. However, the client needs an accurate cost estimate up front to manage cash flow, inform discretionary spending, and assess whether there is value in delivering the repair service. Basic cost estimation methods produce crude, high-level estimates that lead to missed financial opportunities and make decision-making at the field level harder.

Using a combination of automated machine learning, simplified field-deployable applications, and an automated way to keep predictions accurate, Day5 Analytics developed a solution that improves cost estimates by roughly 40%.

Over a 3-part blog series, Day5 Analytics demonstrates how to create and deploy a machine learning solution with an innovative blend of low-code and low-cost technologies.

Part 1 focuses on the production deployment of cost prediction with automated machine learning, using KNIME and PyCaret.
Part 2 will demonstrate how field operators and various systems can use the cost predictor through a simple user interface, using Microsoft Power Apps and KNIME Server.
Part 3 will explore how deployed prediction models can be monitored and automatically retrained for accuracy when drift is detected, using KNIME Server.

Part 1: Improving Cost Forecasts with PyCaret Automated Machine Learning in KNIME

KNIME Analytics Platform is an open-source software program for creating data science workflows. KNIME Server is used to deploy the workflows as analytical applications and services. Workflows developed in KNIME are easy to set up and understand, with no coding required. KNIME’s ease of use accelerates time-to-value, and as an open-source platform, its integrations with a growing number of tools and software allow for near-unlimited flexibility and problem-solving coverage.

PyCaret is an automated machine learning library that performs data preparation and machine learning modelling within minutes. With KNIME’s proprietary Integrated Deployment functionality, users can move from testing to deployment without additional changes or manual rework, which is typically where the deployment of data science applications falters. This accelerates the time to production, reduces errors during the transfer from development to production, increases compliance, and results in a streamlined, automated process.

Model Training and Predictions

Data sourced from an ERP system is consolidated and preprocessed in KNIME for machine learning. Any data preparation steps that need to be repeated at prediction time are contained within a pair of Integrated Deployment nodes (Capture Workflow Start and End), which generates a deployment-ready workflow. The dataset is then split into training and testing sets for machine learning with PyCaret.

PyCaret supports automated machine learning for both regression and classification, among other modelling methods, in a low-code fashion. Instructions for installing PyCaret in KNIME can be found here.

Training data is passed to PyCaret, embedded within KNIME, for predictive modelling. PyCaret tests various models on the dataset and returns the best-performing one. This model is saved locally and loaded into a Python Predictor node, where its accuracy is assessed against test (unseen) data. The prediction steps are also captured with Integrated Deployment nodes.
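The training and prediction steps described above could look roughly like the following inside a KNIME Python node. This is a minimal sketch, not the script used in the actual workflow: it assumes PyCaret 3.x, pandas DataFrames as inputs, and a hypothetical target column named `cost` and model file name `cost_model`.

```python
def train_and_score(train_df, test_df, target="cost", model_name="cost_model"):
    """Train with PyCaret AutoML, save the best model, and score unseen data.

    Assumptions (not from the original workflow): PyCaret 3.x is installed,
    both arguments are pandas DataFrames, and `target` names the cost column.
    """
    # Imported lazily so this module still loads where PyCaret is missing.
    from pycaret.regression import (
        compare_models,
        load_model,
        predict_model,
        save_model,
        setup,
    )

    # Initialise the experiment: PyCaret infers column types and builds
    # its preprocessing pipeline from the training data.
    setup(data=train_df, target=target, session_id=42, verbose=False)

    # Cross-validate the candidate regressors and keep the lowest-MAE model.
    best = compare_models(sort="MAE")

    # Persist the pipeline and model to "<model_name>.pkl", then reload it
    # the way a separate predictor step would.
    save_model(best, model_name)
    model = load_model(model_name)

    # Score the held-out rows; PyCaret 3.x adds a "prediction_label" column.
    return predict_model(model, data=test_df)
```

Sorting by MAE in `compare_models` matches the error metric used later in this post; older PyCaret 2.x releases name the prediction column `Label` instead.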

Model Deployment

The captured data preparation and prediction steps are consolidated and written to a single deployment workflow, for which KNIME generates a corresponding REST API through its Swagger interface. The workflow is then deployed to KNIME Server, where it can receive new data and return cost predictions.

A third workflow (Remote Workflow Call) is used to call the deployment workflow. It receives data via the REST API and passes it to the deployment workflow using KNIME’s Call Workflow functionality. The deployment workflow produces a cost prediction and returns it to the Remote Workflow Call workflow, which in turn sends the prediction back to the mobile application via the REST API.
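From a client's point of view, the round trip above amounts to posting a service request as JSON and reading a prediction back. The sketch below only builds such a request with the Python standard library. The endpoint URL and the input field names are hypothetical placeholders, and authentication is omitted; the real URL and payload shape are whatever the deployed workflow's Swagger page lists.

```python
import json
import urllib.request

# Hypothetical endpoint: the real URL is the one listed for the deployed
# workflow on the KNIME Server's Swagger page.
ENDPOINT = "https://knime.example.com/path/to/cost-prediction-workflow"


def build_prediction_request(service_request, endpoint=ENDPOINT):
    """Wrap one service request as a JSON POST for the prediction workflow."""
    body = json.dumps(service_request).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical input fields, for illustration only.
example = {"service_type": "plumbing", "urgency": "high", "previous_calls": 2}
request = build_prediction_request(example)

# Sending is left commented out because the endpoint above is a placeholder:
# with urllib.request.urlopen(request, timeout=30) as response:
#     prediction = json.loads(response.read())
```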

The API enables integration with various third-party systems and the client’s own internal systems without major integration investment, allowing the client to predict cost accurately as soon as a request is received. This is covered in Part 2 of the blog series.

Cost Prediction Accuracy

The cost of a service request can range from tens to thousands of dollars. Prediction error is assessed using the Mean Absolute Error (MAE): the absolute difference between actual and predicted cost, averaged over all predictions. In all cases, the machine learning model predicts costs closer to the actual amounts, by $140 or more, which improves cost certainty for the business.
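To make the MAE definition concrete, here is a small sketch that computes it for a model and for the simple-average (status quo) method. All figures are made up for illustration and are not the client's data; for brevity the average is taken over the same sample it is scored on.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all service requests."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up costs in dollars, for illustration only (not client data).
actual_costs = [120.0, 450.0, 80.0, 1500.0]
model_predictions = [150.0, 400.0, 100.0, 1300.0]

# Status quo method: predict the average cost for every request.
average_cost = sum(actual_costs) / len(actual_costs)
baseline_predictions = [average_cost] * len(actual_costs)

model_mae = mean_absolute_error(actual_costs, model_predictions)        # 75.0
baseline_mae = mean_absolute_error(actual_costs, baseline_predictions)  # 481.25
```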

In our sample, the MAE with PyCaret is improved by 38% compared to the simple average (status quo) method; a significant improvement. This occurs not only at an aggregate level, but even when compared across months and services – indicating that the performance improvement can be leveraged at deeper levels of the operation.
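The same comparison can be broken down by month or by service type. The sketch below shows one way to compute the percentage MAE reduction per group; the record layout and the numbers are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from collections import defaultdict


def mae_improvement_by_group(records):
    """Return {group: % MAE reduction of the model vs. the baseline}.

    Each record is (group, actual, model_prediction, baseline_prediction).
    """
    model_errors = defaultdict(list)
    baseline_errors = defaultdict(list)
    for group, actual, model_pred, baseline_pred in records:
        model_errors[group].append(abs(actual - model_pred))
        baseline_errors[group].append(abs(actual - baseline_pred))

    improvement = {}
    for group in model_errors:
        model_mae = sum(model_errors[group]) / len(model_errors[group])
        baseline_mae = sum(baseline_errors[group]) / len(baseline_errors[group])
        if baseline_mae == 0:
            continue  # a perfect baseline leaves no relative reduction to report
        improvement[group] = 100.0 * (baseline_mae - model_mae) / baseline_mae
    return improvement


# Made-up records, for illustration only (not client data).
records = [
    ("HVAC", 400.0, 350.0, 300.0),
    ("HVAC", 200.0, 250.0, 300.0),
    ("plumbing", 100.0, 130.0, 150.0),
    ("plumbing", 200.0, 170.0, 150.0),
]
improvement = mae_improvement_by_group(records)
```

Grouping by a month label instead of a service name works the same way, since the group key is just the first element of each record.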

Plotting the distribution of errors for each model allows for a better understanding of error reduction – the PyCaret error distribution skews closer to the real cost than the status quo (average) method.

The model also provides business insight into drivers that affect service request cost, using a feature importance plot.

The ‘previous call’ feature (how many service requests of the same category were received in the prior 6 months) and the ‘day of the service call’ feature (weekday, weekend, or statutory holiday) impact the cost most significantly.
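As an illustration of how these two features might be derived from raw request records, here is a minimal sketch. It is not the workflow's actual preprocessing: the holiday calendar is a hypothetical stand-in, and "6 months" is approximated as 183 days.

```python
from datetime import date, timedelta

# Hypothetical holiday calendar; a real one would come from the business.
HOLIDAYS = {date(2023, 1, 1), date(2023, 7, 1), date(2023, 12, 25)}


def day_type(day, holidays=HOLIDAYS):
    """Classify a service date as 'holiday', 'weekend' or 'weekday'."""
    if day in holidays:
        return "holiday"
    # date.weekday(): Monday is 0, so 5 and 6 are Saturday and Sunday.
    return "weekend" if day.weekday() >= 5 else "weekday"


def previous_calls(history, category, request_day, window_days=183):
    """Count earlier requests of the same category in the prior ~6 months.

    `history` is a list of (category, date) pairs; 183 days is an assumed
    approximation of "6 months".
    """
    window_start = request_day - timedelta(days=window_days)
    return sum(
        1
        for past_category, past_day in history
        if past_category == category and window_start <= past_day < request_day
    )


# Made-up request history, for illustration only.
history = [
    ("HVAC", date(2023, 1, 10)),
    ("HVAC", date(2023, 5, 2)),
    ("plumbing", date(2023, 6, 15)),
    ("HVAC", date(2022, 9, 1)),
]
hvac_calls = previous_calls(history, "HVAC", date(2023, 7, 1))
```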

Next Steps

The KNIME prediction model can be found on the KNIME Hub, here.

Part 1 of the blog series detailed the development, assessment, and deployment of the machine learning cost prediction model, and compared its predictions with the ‘status quo’ average cost method. Part 2 will demonstrate the Microsoft Power Apps mobile application for field deployment, and Part 3 will cover the model monitoring and automatic retraining functionality built in KNIME Server.

To inquire about custom-built and deployed machine learning solutions at your company, contact Day5 Analytics.
