The Benefits of Using Predictive Risk Scoring in Security AI

Invicti recently launched its Predictive Risk Scoring feature, an industry first that generates accurate security risk predictions before vulnerability scanning even begins. The feature uses a custom-built machine learning model that is trained on real-world vulnerability data (not customer data) and operated internally by Invicti, allowing it to closely estimate a site's likely risk level to aid prioritization. See Predictive Risk Scoring for details.

Following up on our previous post introducing this capability and what it could mean for application security, here is a more in-depth look at the technical side. We sat down with Bogdan Calin, Invicti's Principal Security Researcher and the main creator of Predictive Risk Scoring, to talk about the feature itself as well as AI, ML, and the future of application security.

Companies across industries, including security, are incorporating AI features based on large language models (LLMs). What sets Invicti’s AI approach with Predictive Risk Scoring apart from others?

Bogdan Calin: The key to implementing any AI feature effectively is to first identify a real customer problem and then develop a model and approach to solve it. Adding AI to a product just for the sake of having AI is not a good idea. For Predictive Risk Scoring, we started with the problem of prioritization: customers with large numbers of sites and applications need guidance on where to start scanning. We knew from the outset that an LLM would not be the right tool for this problem, so we went with a different type of machine learning model, trained specifically for our requirements.

Why did you opt for a dedicated machine learning model for Predictive Risk Scoring rather than using an LLM? What advantages does this approach offer compared to integrating with ChatGPT or another popular model?

Bogdan Calin: In security, reliability and predictability are paramount. Especially in automated discovery and testing scenarios like ours, an LLM would be too unpredictable and too slow to solve the actual customer problem. To estimate risk levels accurately, we needed a model that takes website attribute data and returns a numeric risk prediction. LLMs are built to process text, not to calculate, which makes them a poor fit here. So we chose to build and train a decision tree-based model tailored to our specific needs.
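To make the approach concrete, here is a minimal sketch of that general technique: a decision-tree-based regressor that maps pre-scan website attributes to a numeric risk score. Everything here is an illustrative assumption (the features, the training data, and the use of scikit-learn's gradient-boosted trees), not Invicti's actual model or inputs.

```python
# Minimal sketch, NOT Invicti's actual model: a decision-tree-based
# regressor that maps pre-scan website attributes to a numeric risk score.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical pre-scan attributes per site, e.g. [server age in years,
# number of discovered endpoints, security headers present (0/1),
# legacy technology detected (0/1)] -- invented for illustration.
X_train = np.array([
    [8, 120, 0, 1],
    [1,  15, 1, 0],
    [5,  60, 0, 0],
    [2,  30, 1, 1],
])
# Risk labels that would come from historical scan results (0.0-1.0).
y_train = np.array([0.9, 0.1, 0.7, 0.3])

# Gradient-boosted decision trees: fast, lightweight, deterministic.
model = GradientBoostingRegressor(n_estimators=50, max_depth=3, random_state=0)
model.fit(X_train, y_train)

# Score a new, as-yet-unscanned site.
new_site = np.array([[7, 90, 0, 1]])
print(f"Predicted risk score: {model.predict(new_site)[0]:.2f}")
```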

A dedicated machine learning model is ideal for this use case because it delivers fast, precise, and secure results. Compared to an LLM, our model is lightweight, so it can process each request quickly using minimal computing resources. That lets us evaluate thousands of sites in a short time and run the model entirely in-house, without depending on a major LLM provider or exposing any site-related data externally.

The main downside of using LLMs as security tools is their lack of explainability and interpretability: they are so complex, with so many internal layers and parameters, that it is extremely difficult to determine how a given result was generated. With decision tree models like the one behind Predictive Risk Scoring, by contrast, the internal decision-making process can be explained. The same input data is also guaranteed to produce the same result, which LLMs cannot ensure. And our model is more resistant to text-based attacks such as prompt injection.
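As a quick illustration of that explainability and determinism, the sketch below trains a tiny decision tree on made-up site attributes and prints its decision rules as readable text. The feature names are hypothetical; the point is only that a tree's reasoning can be inspected directly, unlike an LLM's.

```python
# Sketch of tree-model explainability (hypothetical features and data).
from sklearn.tree import DecisionTreeClassifier, export_text

# [server_age_years, has_security_headers]; 1 = high risk, 0 = low risk
X = [[8, 0], [1, 1], [6, 0], [2, 1]]
y = [1, 0, 1, 0]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# The learned decision path prints as human-readable if/else rules.
print(export_text(tree, feature_names=["server_age_years", "has_security_headers"]))

# Deterministic: identical input always yields an identical prediction.
assert tree.predict([[8, 0]])[0] == tree.predict([[8, 0]])[0]
```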

But the biggest advantage over an LLM is that we can build, train, and fine-tune the model to deliver exactly the accurate results we need. Mathematically speaking, our risk predictions are exactly right in at least 83% of cases, and accuracy in practice is significantly higher, at around 90%.

Could you say more about these accuracy levels? When we say a minimum of 83%, what does accuracy mean in this context? And how is it different from scan accuracy?

Bogdan Calin: The whole point of Predictive Risk Scoring is to assess a site's risk level before scanning, working from far less input data than a full scan would provide. Prediction accuracy therefore expresses our confidence that, even with that limited data, the model can examine a site and predict its exact risk level, which it does in at least 83% of cases.

For the practical purpose of prioritization, the accuracy is noticeably higher. Users mostly don't care about the exact risk score; they want to quickly identify which sites are at risk. Treated as that binary decision, our model is over 90% accurate at singling out the sites that need testing first. Technically speaking, this is close to the best estimate you can get without scanning every site to obtain complete input data, whether you use AI or do it manually.
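A toy calculation shows how these two figures relate. In the invented example below, some predictions miss the exact risk level yet still land on the correct side of a "scan first" threshold, so binary prioritization accuracy comes out higher than exact-level accuracy. The numbers are made up purely for illustration.

```python
# Invented example: exact-level vs. binary prioritization accuracy.
predicted = [3, 2, 3, 1, 2, 3, 1, 1, 2, 3]  # predicted risk level (1-3)
actual    = [3, 3, 3, 1, 2, 3, 1, 2, 2, 3]  # risk level found by scanning

# Exact accuracy: prediction matches the actual level exactly.
exact = sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Binary accuracy: prediction lands on the correct side of the
# "prioritize for scanning" threshold, even if the level is off by one.
THRESHOLD = 2
binary = sum((p > THRESHOLD) == (a > THRESHOLD)
             for p, a in zip(predicted, actual)) / len(actual)

print(f"Exact-level accuracy:  {exact:.0%}")   # 80% in this toy example
print(f"Binary prioritization: {binary:.0%}")  # 90% in this toy example
```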

It's also important not to confuse predictive risk scores with vulnerability scan results. Risk scoring looks at a site before scanning and estimates how vulnerable it is likely to be, based on the characteristics it shares with vulnerable sites in our training data. A high risk score means the site resembles sites known to be vulnerable and is therefore likely to be at higher risk. The DAST scanner, in contrast, actually scans the site and reports the real vulnerabilities its security checks have found, with no predictions or estimates involved.

Many organizations and industries face restrictions on AI usage. How does Predictive Risk Scoring fit into regulated environments?

Bogdan Calin: Regulatory concerns and restrictions around AI mostly apply to LLMs and generative AI. A typical worry is sending confidential data to an external provider, with no certainty about whether that data will be used for model training or accidentally exposed to other users. Some industries also require all software (including AI) to be explainable, which is a problem for LLMs because their huge number of mutually influencing internal parameters makes them effectively opaque.

Predictive Risk Scoring sidesteps these concerns because we don't use an LLM and don't send requests to any external AI service providers. Our machine learning model is explainable and deterministic, and it is not trained on customer data. It also doesn't process natural-language instructions the way LLMs do, which protects it against prompt injection and similar attacks.

AI is growing rapidly in terms of R&D, available implementations, and applications. How do you see this affecting application security in the near future? And what's next for Predictive Risk Scoring?

Bogdan Calin: Today, the widely available AI language models make it hard to generate malicious content such as phishing messages and exploits. But as freely available models like llama3 improve and running uncensored models becomes more practical, future cyberattacks are likely to rely more and more on AI-generated code and text.

I expect small local LLMs to be integrated into Android and iOS devices to talk to users and perform tasks for them, which will make prompt injection attacks much more dangerous. In a similar vein, AI voice cloning is already possible with open-source tools, so voice-based authentication on its own can no longer be trusted. Prompt attacks could arrive through any number of communication channels, compounding the risk even further.

AI-assisted application development is already common and looks set to become the standard way of building applications. Developers who grow used to generating code with AI may skip thorough verification of that code, and given the known security risks of LLM-generated code, overall code security is likely to decline.

As for Predictive Risk Scoring, we are working on refinements that will further improve its results and incorporate additional risk variables.


Ready to go proactive with your application security? Get a free proof-of-concept demo!
