Windows-based object library enables developers to easily create analytics-powered decision models in C++, C#, Java, R and Python, using optimization, Monte Carlo simulation, data mining and machine learning; now includes innovative, fully automated methods for assessing risk of unexpected results from machine learning models when deployed for production use.
INCLINE VILLAGE, Nev., Sept. 20, 2022 /PRNewswire-PRWeb/ -- Frontline Systems is shipping Solver SDK® V2023, a new version of its advanced analytics Software Development Kit for Windows that enables developers to easily create and run models using mathematical optimization, Monte Carlo simulation and risk analysis, data mining and machine learning.
Solver SDK is not new – customers have been using ever-more-powerful versions of this developer tool for analytics since 2005. But now it's the first and only SDK with fully automated methods for risk analysis of previously trained and validated machine learning (ML) models.
Solver SDK handles virtually any type or size of optimization model, up to millions of inter-related decisions, and risk analysis for virtually any type or size of Monte Carlo simulation model. In 2016, Frontline began offering XLMiner® SDK with powerful features for training, validating, and deploying predictive models using machine learning – and in 2021 Solver SDK and XLMiner SDK were merged into one powerful, multi-purpose developer tool for analytics.
Now, Solver SDK V2023 includes an innovative capability for risk analysis of machine learning models that leverages multiple capabilities of the software. Risk analysis changes the focus from how accurately an ML model will predict a single new case to how it will perform in aggregate over thousands or millions of new cases, what the business consequences will be, and the (quantified) risk that this performance will differ from what was expected based on the ML model's training and validation.
"With a patent application now on file to preserve invention rights, developers using our Solver SDK are the first to benefit from these innovative methods", said Daniel Fylstra, Frontline's President and CEO. Frontline Systems is concurrently releasing new versions of RASON®, its cloud platform for analytics, and Analytic Solver®, its toolkit for business analysts using Excel for the Web, Windows and Macintosh, with support for the same innovative methods.
How and Why Machine Learning has Lacked Risk Analysis
For a decade, data science and machine learning (DSML) tools – including Solver SDK – have offered facilities for 'training' a model on one set of data, 'validating' its performance on another set of data, and 'testing' it versus other ML models on a third set of data. But this is not risk analysis: because it relies only on known data, it doesn't assess the risk that the ML model will perform differently on new data when put into production use. While it's common to assess an ML model's performance in use, and to re-train the model if its performance is unexpectedly poor, by that time the risks have already materialized, often with adverse financial consequences. Quantification of such risks "ahead of time" has been missing in practice.
There are many reasons for this situation: Data scientists with expertise in ML methods often are not trained in risk analysis; they think of "features" and even predicted output values as data, not as "random variables" with sampled instances. Even when they are known, conventional risk analysis methods are expensive and time-consuming to apply to machine learning: ML data sets include many (sometimes hundreds of) features, with limited "provenance" of the data's origins. There are hundreds of classical probability distributions that could be 'candidates' to fit each feature. Only some of the features are typically found, after ML model training, to have predictive value; many are found to be correlated with other features and hence 'redundant'. And in typical projects, a great many ML models are built.
How Solver SDK Performs Automated Risk Analysis
Unlike most other DSML libraries, Solver SDK includes powerful algorithms for risk analysis in the same package: probability distribution fitting, correlation fitting, stratified sample generation and Monte Carlo analysis. But asking developers to "quickly master risk analysis" is asking too much. So Frontline Systems has invented ways to automate the entire risk analysis process. Using the new capability is as simple as creating a new object with one line of code in C++, C#, Java, R or Python, setting a few properties, and calling a new method – and the risk analysis typically adds just seconds to a minute to the existing process of training, validating and testing an ML model.
Internally, for each feature in a dataset, Solver SDK tests an entire family of probability distributions – drawing on its first-mover support for the new Metalog family of distributions, created by Dr. Tom Keelin; optimizes all the parameters of each distribution; detects and models correlations among features, using rank order and copula methods; performs synthetic data generation, using Monte Carlo methods for stratified sampling and correlation; computes the ML model's predictions, as well as user-specified financial consequences, for each simulated case; and importantly, assesses and quantifies the differences in performance of the ML model on this simulated data versus the training, validation and test data.
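The general idea behind these steps can be illustrated with a minimal Python sketch. It is not Solver SDK code and does not use its API: it relies only on NumPy and the standard library, substitutes empirical quantiles for Metalog distribution fitting, uses a Gaussian copula on rank-based normal scores as one possible correlation method, and draws plain random samples rather than stratified ones. The function names, the toy data and the stand-in "model" are all hypothetical.

```python
import numpy as np
from statistics import NormalDist

_NORMAL = NormalDist()
_inv_cdf = np.vectorize(_NORMAL.inv_cdf)  # standard normal quantile function
_cdf = np.vectorize(_NORMAL.cdf)          # standard normal CDF


def rank_transform(column):
    """Map each value to (0, 1) using its rank: rank / (n + 1)."""
    ranks = np.argsort(np.argsort(column)) + 1
    return ranks / (len(column) + 1)


def synthesize(data, n_samples, rng):
    """Generate synthetic rows that approximately keep each feature's marginal
    distribution and the rank-based dependence among features.
    Assumes data has at least two columns of continuous values."""
    n_features = data.shape[1]
    # 1. Convert every feature to normal scores via its ranks.
    uniforms = np.column_stack(
        [rank_transform(data[:, j]) for j in range(n_features)]
    )
    scores = _inv_cdf(uniforms)
    # 2. Estimate the correlation matrix of the normal scores.
    corr = np.corrcoef(scores, rowvar=False)
    # 3. Draw correlated normal samples (Gaussian copula).
    sim = rng.multivariate_normal(np.zeros(n_features), corr, size=n_samples)
    # 4. Map each column back through that feature's empirical quantiles.
    u = _cdf(sim)
    return np.column_stack(
        [np.quantile(data[:, j], u[:, j]) for j in range(n_features)]
    )


# Hypothetical toy data: two positively correlated features.
rng = np.random.default_rng(0)
x1 = rng.lognormal(mean=0.0, sigma=0.5, size=2000)
x2 = 0.8 * x1 + rng.normal(0.0, 0.3, size=2000)
data = np.column_stack([x1, x2])

synthetic = synthesize(data, n_samples=5000, rng=rng)


def model(X):
    """Stand-in for an already trained ML model (an assumption)."""
    return 2.0 * X[:, 0] - 1.0 * X[:, 1]


# Compare aggregate model output on original versus simulated cases.
gap = model(synthetic).mean() - model(data).mean()
print("mean prediction gap:", gap)
```

A large gap between the model's aggregate behavior on simulated cases and on the original data would be the kind of signal a risk analysis is meant to surface; in this toy example the two should be close.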
Results of the risk analysis, including key summary statistics, percentiles and risk measures, are available as object properties. And by simply creating and populating a DataFrame with two more lines of code, the Solver SDK user has all the data needed to create charts, or perform further analysis.
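As an illustration of the kind of results described here, the following Python sketch computes summary statistics, percentiles and two common risk measures from a set of simulated outcomes. The release does not say which risk measures Solver SDK reports; Value at Risk and Conditional Value at Risk are chosen here as assumptions, and the function name, dictionary keys and sample outcomes are hypothetical.

```python
import numpy as np


def risk_summary(outcomes, alpha=0.05):
    """Summarize simulated outcomes (for example, profit per case).
    Lower values are treated as worse outcomes."""
    outcomes = np.asarray(outcomes, dtype=float)
    var_level = np.quantile(outcomes, alpha)   # alpha-quantile of outcomes
    tail = outcomes[outcomes <= var_level]     # worst cases at or below it
    return {
        "mean": outcomes.mean(),
        "std": outcomes.std(ddof=1),
        "p05": np.quantile(outcomes, 0.05),
        "p50": np.quantile(outcomes, 0.50),
        "p95": np.quantile(outcomes, 0.95),
        "value_at_risk": var_level,
        "conditional_value_at_risk": tail.mean(),
    }


# Hypothetical simulated outcomes: the values 1, 2, ..., 100.
summary = risk_summary(np.arange(1, 101))
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The returned dictionary plays the role that object properties play in the description above; its values could equally be loaded into a table for charting.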
Synthetic Data Generation as a Side Benefit
Synthetic Data Generation (SDG) has become topical in machine learning in recent years, with a number of companies founded just to supply software and services around this technology. SDG is used when there isn't enough original data, or when use of the original data is restricted by law or regulation. But until now, as far as a patent and literature search shows, SDG has been used only to train ML models better.
Solver SDK V2023 includes a powerful, general-purpose, easy-to-use Synthetic Data Generation facility, usable by simply creating an object SyntheticDataGenerator. Unlike some special-purpose SDG offerings, this facility can accurately model the behavior of nearly any combination of features with continuous values. But Solver SDK also uses synthetic data in an entirely new way, to analyze the risk that an ML model will yield unexpected results "large enough to matter" when deployed for production use.
Works with Already-Available 'Augmented Machine Learning'
Solver SDK's V2022 release included "augmented machine learning" capabilities otherwise found only in sophisticated machine learning tools. The developer simply supplies data in a DataFrame, creates an Estimator object, and adds Learner objects of different types – classification and regression trees, neural networks, linear and logistic regression, discriminant analysis, naïve Bayes, k-nearest neighbors and more. When the developer calls the Estimator "fit" method, Solver SDK automatically tests and fits parameters for all of the Learners (ML algorithms) to the training data, validates and compares them according to user-chosen criteria, and delivers the trained ML model that best fits the data. Again with just a few lines of code, the developer can perform a risk analysis on the "best model" found by the SDK.
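The fit-validate-compare loop described above can be sketched generically in Python. This is not Solver SDK's Estimator or Learner API: the three small learner classes, the `select_best` function, the toy data and the choice of validation RMSE as the comparison criterion are all assumptions made for illustration, and only NumPy is used.

```python
import numpy as np


class MeanLearner:
    """Baseline: always predicts the training mean."""
    def fit(self, X, y):
        self.mean_ = y.mean()
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)


class LinearLearner:
    """Ordinary least squares with an intercept."""
    def fit(self, X, y):
        A = np.column_stack([np.ones(len(X)), X])
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self

    def predict(self, X):
        return np.column_stack([np.ones(len(X)), X]) @ self.coef_


class NearestNeighborLearner:
    """k-nearest neighbors regression with Euclidean distance."""
    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        self.X_, self.y_ = X, y
        return self

    def predict(self, X):
        dists = np.linalg.norm(X[:, None, :] - self.X_[None, :, :], axis=2)
        nearest = np.argsort(dists, axis=1)[:, : self.k]
        return self.y_[nearest].mean(axis=1)


def select_best(learners, X_train, y_train, X_val, y_val):
    """Fit every learner on the training data, score each on the validation
    data by RMSE, and return the name, model and all scores."""
    scores = {}
    for name, learner in learners.items():
        pred = learner.fit(X_train, y_train).predict(X_val)
        scores[name] = float(np.sqrt(np.mean((pred - y_val) ** 2)))
    best = min(scores, key=scores.get)
    return best, learners[best], scores


# Hypothetical data with a linear relationship plus a little noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0.0, 0.1, size=300)

learners = {
    "mean": MeanLearner(),
    "linear": LinearLearner(),
    "knn": NearestNeighborLearner(k=5),
}
best_name, best_model, scores = select_best(
    learners, X[:200], y[:200], X[200:], y[200:]
)
print(best_name, scores)
```

The selected model is then the natural input to a risk analysis step, since it is the one that would be deployed.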
Moving from Single Models to 'ModelOps' and Cloud Deployment
Developers seeking to move to the next stage of 'MLOps' or 'ModelOps' will find that Solver SDK works closely with RASON® Decision Services, Frontline Systems' comprehensive cloud platform for decision intelligence on Microsoft Azure – indeed, the RASON service uses Solver SDK in its own operations. RASON further supports multi-stage "decision flows" that can include data retrieval and in-memory SQL queries; training, validation, risk analysis, and use of machine learning models; direct integration with business rules and optimization models; and delivery of results via REST APIs with JSON and OData endpoints, for use in popular BI and analytics tools like Power BI, and low-code / no-code application development tools like Power Apps and Power Automate. This sophisticated combination makes it easier than ever to build powerful, automated decision intelligence solutions.
Free Trials, Learning and Coaching Resources
Developers can sign up for free trial accounts to evaluate Solver SDK at https://www.solver.com, and RASON at https://rason.com. They can use these tools to create and solve models in C++, C#, Java, R and Python (with Visual Studio example projects) and RASON, exercise the REST API, try out dozens of code examples illustrating use of predictive models and machine learning, optimization and simulation, and download the Solver SDK and RASON User Guides and Reference Guides in PDF form. For more information please contact [email protected].
Frontline Systems Inc. (http://www.solver.com) is the alternative to analytics complexity, helping business analysts and managers gain insights and make better decisions for an uncertain future, without the cost, delays and risk of 'big vendor' tools. Its products integrate forecasting and data mining for "predictive analytics," Monte Carlo simulation for risk analysis, conventional and stochastic optimization for "prescriptive analytics," and business rules and Excel calculations to make the best business decisions. Founded in 1987, Frontline is based in Incline Village, Nevada (775-831-0300).
Microsoft Excel, Office 365, Azure, Power BI, Power Apps, Power Automate and Visual Studio are trademarks of Microsoft Corp. Analytic Solver®, RASON® and Solver SDK® are registered trademarks of Frontline Systems Inc.
Media Contact
Daniel Fylstra, Frontline Systems Inc., 775-831-0300 x120, [email protected]
SOURCE Frontline Systems Inc.