Data science is changing how cybersecurity teams hunt threats

17 Oct 2018

Data science is helping cybersecurity teams focus on threats in more efficient ways.

Data science is transforming the world’s industries. In a society more driven by information than ever before, the valuable insights that we can glean from data are creating new ways of doing business. Silicon Republic spoke to cybersecurity professionals about the profound impact data science is having on their mitigation methods.

Reducing the need for manual tasks

According to Fergal Toomey, chief scientist at network analytics firm Corvil, data science has reduced the manual effort needed for tasks such as “spotting common patterns in data, identifying anomalies or correlating information from disparate sources”. As well as speeding up previously time-consuming tasks, data science can also enable things that were not possible before, such as automatically flagging new threats based on their similarity to known exploits and their behaviour patterns.

Managing real-time alerts can often be overwhelming for security professionals, but Toomey said that data science can help correlate alerts and anomalies to identify common entities – such as user accounts or system resources – that may be under attack. “Data science also helps security teams correlate their internal picture against external threat intelligence, revealing, for example, if their systems are being targeted by a known hacker campaign or exploit.”
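
As a rough illustration of the kind of correlation Toomey describes, the following Python sketch groups alerts by the entities they mention and keeps only those entities reported by more than one source. It is not taken from Corvil or any product mentioned here; the alert records, field names and the `correlate_by_entity` function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical alert records; the field names are assumptions for illustration.
alerts = [
    {"id": 1, "source": "ids", "entities": ["user:alice", "host:db01"]},
    {"id": 2, "source": "auth", "entities": ["user:alice"]},
    {"id": 3, "source": "proxy", "entities": ["host:web02"]},
    {"id": 4, "source": "edr", "entities": ["host:db01", "user:alice"]},
]


def correlate_by_entity(alerts, min_sources=2):
    """Map each entity seen by at least `min_sources` distinct sources
    to the sorted list of alert IDs that mention it."""
    sources_by_entity = defaultdict(set)
    alert_ids_by_entity = defaultdict(list)
    for alert in alerts:
        for entity in alert["entities"]:
            sources_by_entity[entity].add(alert["source"])
            alert_ids_by_entity[entity].append(alert["id"])
    return {
        entity: sorted(alert_ids_by_entity[entity])
        for entity, sources in sources_by_entity.items()
        if len(sources) >= min_sources
    }


suspects = correlate_by_entity(alerts)
for entity, alert_ids in suspects.items():
    print(entity, alert_ids)
```

Requiring several independent sources before flagging an entity is one simple way to cut down on noise; a real deployment would likely also weigh alert severity and timing.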

Sam Curry, chief security officer at endpoint detection and response specialist Cybereason, said: “Data science has made automation and processing at speed better. The biggest struggle in security has been ‘speed versus intelligence’. Trying to keep up at line rates with data and telemetry and then piece it together is like trying to solve million-piece puzzles every minute.”

A game of cat and mouse

“Human enrichment” of the collected data is still critical, said Rory Duncan, security go-to-market leader at Dimension Data, a leading IT consultancy. “It’s a real game of cat and mouse, and data science is critical in determining how the threat landscape is evolving, which helps businesses become predictive in their security posture, rather than reactive.”

Simon Elliston Ball, data scientist and cybersecurity product manager at data software firm Hortonworks, said data science is invaluable when it comes to separating the wheat from the chaff. “The area where a data science-driven approach adds real value is finding the things we already know about, but may not have made the blacklist yet.

“This is how techniques from the world of the data scientist – such as clustering, auto-encoders, embedding and algorithms – make traditional threat intelligence sources stretch much further.”

Fleming Shi, senior vice-president of advanced technologies at IT security firm Barracuda Networks, said that data science can provide crucial early warnings. “By analysing the frequency, volume and geographical blast radius of the attack, we can use a rich set of data and machine-learned models to identify anomalies.

“This use of data science can usually provide us with an early warning on a malicious campaign and enables us to stay vigilant, whilst avoiding distractions from traffic noise.”
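
To give a sense of what a volume-based early warning might look like in its simplest form, the Python sketch below flags time buckets whose event count sits far from the median, using a modified z-score based on the median absolute deviation. This is a basic statistical stand-in, not the machine-learned models Shi refers to; the `volume_anomalies` function, the threshold of 3.5 and the sample counts are assumptions made for illustration.

```python
from statistics import median


def volume_anomalies(counts, threshold=3.5):
    """Return the indices of counts whose modified z-score exceeds `threshold`.

    The score uses the median and the median absolute deviation (MAD),
    which are less distorted by the spikes we are trying to find than
    the mean and standard deviation would be.
    """
    med = median(counts)
    mad = median(abs(c - med) for c in counts)
    if mad == 0:
        # All values sit on the median; there is no spread to measure against.
        return []
    return [
        i for i, c in enumerate(counts)
        if 0.6745 * abs(c - med) / mad > threshold
    ]


# Hypothetical hourly counts of messages matching a suspicious pattern.
hourly_counts = [12, 15, 11, 14, 13, 12, 240, 14, 13, 11]
spikes = volume_anomalies(hourly_counts)
print(spikes)
```

With these sample counts, only the hour containing 240 events is flagged, while the ordinary variation between 11 and 15 is treated as traffic noise.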

Threat intelligence and log analysis

There are many ways data science can help when it comes to threat intelligence and log analysis, said Paolo Passeri, cyber intelligence principal at Netskope, a SaaS security specialist. “The two work very well together since the most recent log analysis technologies do not simply correlate logs, but also integrate with threat intelligence services to augment their detection capability. Data is used for both building the threat intelligence and extracting the actionable information from the logs.”
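
The integration Passeri describes can be pictured, in a very reduced form, as checking each log line against a set of known-bad indicators. The Python sketch below does this for IP addresses only. The log format, the `threat_feed` contents and the `match_indicators` function are hypothetical, and the addresses come from ranges reserved for documentation.

```python
import re

# Loose IPv4 matcher; good enough for a sketch, it does not validate octet ranges.
IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

# Hypothetical threat intelligence indicators (documentation-range addresses).
threat_feed = {"203.0.113.7", "198.51.100.23"}

# Hypothetical firewall log lines; the format is an assumption.
log_lines = [
    "2018-10-17T09:01:12Z ACCEPT src=192.0.2.10 dst=203.0.113.7 port=443",
    "2018-10-17T09:01:15Z ACCEPT src=192.0.2.11 dst=192.0.2.50 port=22",
    "2018-10-17T09:02:03Z DENY src=198.51.100.23 dst=192.0.2.10 port=3389",
]


def match_indicators(lines, indicators):
    """Return (line_number, matched_indicators) for each line that
    mentions at least one address from the indicator set."""
    hits = []
    for line_no, line in enumerate(lines, start=1):
        matched = sorted(set(IP_PATTERN.findall(line)) & indicators)
        if matched:
            hits.append((line_no, matched))
    return hits


hits = match_indicators(log_lines, threat_feed)
for line_no, matched in hits:
    print(line_no, matched)
```

Real log analysis platforms handle many more indicator types, such as domains, file hashes and URLs, and keep their feeds updated continuously, but the basic idea of enriching log events with external intelligence is the same.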

When it comes to incident investigation, data science can help teams establish the so-called ‘kill chain’ of an attack, which can show them what needs to be done to limit damage and prevent recurrence.

Amnon Lotem, head of data science at cloud security player Radware, said that data science is enabling the transition from rule- and signature-based detection to more holistic techniques such as multidimensional anomaly detection (using unsupervised machine learning) and generalisation of malicious behaviour (using supervised machine learning).

Duncan believes that data science in cybersecurity can be helpful in both reactive and predictive risk mitigation strategies. The latter requires some immersion of teams in the dark web. “Forward-thinking security teams are striking off the beaten path and monitoring active hacking communities and mapping these conversations.”

The benefits of data in cybersecurity don’t just extend to real-time threat detection, as Gary McGraw, vice-president of security technology at Synopsys, explained. “One of the challenges of cybersecurity is an overemphasis on hacking, threats and real-time detection. In my view, the only way out of this ‘hamster wheel of pain’ is to build security in in the first place.”

McGraw is co-author of the BSIMM (Building Security In Maturity Model), a data-driven study of existing software security initiatives. It quantifies practices of many different organisations, describing the common ground shared by many as well as the variations.

The future of data science in cybersecurity

As technology advances, cybercriminals and other bad actors will be keeping up in a digital “arms race”, as Toomey puts it – but how will data science change to meet the challenges ahead?

Lotem said that as datasets become larger and algorithms sharper, detection systems will improve. He added that elements of the security operations centre (SOC) may be automated in future, including triage and automatic mitigation.

Ball cautions against buying into the hype of the technology behind data science. “It is, however, worth remembering that data science is not just about a collection of algorithms. Data science is as much a method and an approach; a mindset that can help security practitioners scale and automate their activity by training machines to copy the experts; and a means of augmenting those experts. Data science is a process – from exploration to hypothesis, and testing to conclusions, which can then be turned into defences.”

Curry concluded: “The hope is that we completely reverse the advantage in cyber conflict, which currently favours the attacker, and perhaps one day we will make an AI that can be fast and smart, and sit beside us in the SOC and help us reach further than we could before.”

Updated, 10.21am, 19 October 2018: Upon receiving clarification, this article was amended to refer to a contributor as Simon Elliston Ball, previously cited as Simon Ball.

Ellen Tannam was a journalist with Silicon Republic, covering all manner of business and tech subjects.