Photo by Mike Kononov on Unsplash

Cybersecurity and Machine Learning: Predicting Numeric Outliers

Cybersecurity is all about understanding normality

In detecting cybersecurity threats, we often need to understand what is normal and what is not. This leads to either signature detection (detecting known things) or anomaly detection (detecting things that deviate from the normal). For scripted attacks, we often use a signature detection method, but for more human-focused attacks, we often focus on detecting a move away from the normal. Someone who is committing fraud, for example, will often change their behaviour in some way. It is this change that could identify a possible threat.

| inputlookup hostperf.csv
| eval _time=strptime(_time, "%Y-%m-%dT%H:%M:%S.%3Q%z")
| timechart span=10m max(rtmax) as responsetime
| head 1000
_time                responsetime
2015-02-18 22:10:00  1.275
2015-02-18 22:20:00  5.933
2015-02-18 22:30:00  5.599
2015-02-18 22:40:00  2.839
2015-02-18 22:50:00  3.702
2015-02-18 23:00:00  4.602
2015-02-18 23:10:00  8.361
2015-02-18 23:20:00  11.885
2015-02-18 23:30:00  5.519
2015-02-18 23:40:00  10.044
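
For readers without a Splunk instance to hand, the bucketing step can be approximated in plain Python. This is only a sketch of what `timechart span=10m max(rtmax)` does, not Splunk's implementation: the function name `bucket_max` and the raw samples are hypothetical, it assumes the span divides evenly into an hour, and unlike `timechart` it does not emit rows for empty buckets.

```python
from datetime import datetime

def bucket_max(rows, span_minutes=10):
    """Group (timestamp, value) rows into fixed spans and keep the max per span."""
    buckets = {}
    for ts, value in rows:
        # Floor the timestamp to the start of its span (assumes span divides 60).
        floored = ts.replace(minute=ts.minute - ts.minute % span_minutes,
                             second=0, microsecond=0)
        buckets[floored] = max(value, buckets.get(floored, value))
    return sorted(buckets.items())

# Hypothetical raw samples, invented for illustration only.
raw = [
    (datetime(2015, 2, 18, 22, 11, 5), 0.9),
    (datetime(2015, 2, 18, 22, 17, 40), 1.275),
    (datetime(2015, 2, 18, 22, 23, 12), 5.933),
    (datetime(2015, 2, 18, 22, 29, 59), 2.1),
]
series = bucket_max(raw)
for start, responsetime in series:
    print(start, responsetime)
```
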
| inputlookup hostperf.csv
| eval _time=strptime(_time, "%Y-%m-%dT%H:%M:%S.%3Q%z")
| timechart span=10m max(rtmax) as responsetime
| head 1000
| eventstats avg("responsetime") as avg stdev("responsetime") as stdev
| eval lowerBound=(avg-stdev*exact(4)), upperBound=(avg+stdev*exact(4))
| eval isOutlier=if('responsetime' < lowerBound OR 'responsetime' > upperBound, 1, 0)
| eventstats avg("responsetime") as avg stdev("responsetime") as stdev
lowerBound=(avg-stdev*exact(4))
upperBound=(avg+stdev*exact(4))
isOutlier=if('responsetime' < lowerBound OR 'responsetime' > upperBound, 1, 0)
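
The same standard deviation test can be sketched in Python to check the arithmetic outside Splunk. This is a minimal sketch rather than a port of the query: the function name `stdev_outliers` is hypothetical, the sample standard deviation is assumed to match Splunk's `stdev`, and the input is the ten response times listed earlier repeated three times with a made-up spike of 60.0 appended.

```python
import statistics

def stdev_outliers(values, multiplier=4.0):
    """Flag values outside mean +/- multiplier * sample standard deviation."""
    avg = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation
    lower = avg - stdev * multiplier
    upper = avg + stdev * multiplier
    flags = [1 if (v < lower or v > upper) else 0 for v in values]
    return lower, upper, flags

# The ten response times from the sample output above.
sample = [1.275, 5.933, 5.599, 2.839, 3.702, 4.602, 8.361, 11.885, 5.519, 10.044]
# Illustrative input only: the sample repeated, plus a hypothetical spike.
values = sample * 3 + [60.0]

lower, upper, flags = stdev_outliers(values)
outliers = [v for v, flag in zip(values, flags) if flag]
print(lower, upper, outliers)
```

Note that the spike itself inflates both the mean and the standard deviation, so with a multiplier of 4 and only a handful of points nothing would be flagged. That sensitivity is one reason to consider median-based bounds.
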
| eventstats median("responsetime") as median p25("responsetime") as p25 p75("responsetime") as p75
| eval IQR=(p75-p25)
| eval lowerBound=(median-IQR*exact(4)), upperBound=(median+IQR*exact(4))
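
A rough Python equivalent of the interquartile range bounds is shown here. It is a sketch under assumptions: `iqr_outliers` is a hypothetical name, `statistics.quantiles` uses its default interpolation, which may not match how Splunk computes `p25` and `p75`, and the 60.0 appended to the sample values is a made-up spike.

```python
import statistics

def iqr_outliers(values, multiplier=4.0):
    """Flag values outside median +/- multiplier * interquartile range."""
    median = statistics.median(values)
    p25, _, p75 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = p75 - p25
    lower = median - iqr * multiplier
    upper = median + iqr * multiplier
    flags = [1 if (v < lower or v > upper) else 0 for v in values]
    return lower, upper, flags

# The ten response times from the sample output, plus a hypothetical spike.
values = [1.275, 5.933, 5.599, 2.839, 3.702, 4.602, 8.361, 11.885, 5.519, 10.044, 60.0]

lower, upper, flags = iqr_outliers(values)
print(lower, upper, flags)
```

Because the median and quartiles barely move when a single extreme value is added, the spike is flagged here even with only eleven points.
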
| eventstats median("responsetime") as median
| eval absDev=(abs('responsetime'-median))
| eventstats median(absDev) as medianAbsDev
| eval lowerBound=(median-medianAbsDev*exact(4)), upperBound=(median+medianAbsDev*exact(4))
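
The median absolute deviation variant follows the same pattern, and a small Python sketch makes the two-pass calculation explicit. The function name `mad_outliers` is hypothetical, and the input is again the ten sample response times with a made-up spike of 60.0 appended.

```python
import statistics

def mad_outliers(values, multiplier=4.0):
    """Flag values outside median +/- multiplier * median absolute deviation."""
    median = statistics.median(values)
    # First pass: absolute deviation of every value from the median.
    abs_dev = [abs(v - median) for v in values]
    # Second pass: the median of those deviations.
    median_abs_dev = statistics.median(abs_dev)
    lower = median - median_abs_dev * multiplier
    upper = median + median_abs_dev * multiplier
    flags = [1 if (v < lower or v > upper) else 0 for v in values]
    return median_abs_dev, lower, upper, flags

# The ten response times from the sample output, plus a hypothetical spike.
values = [1.275, 5.933, 5.599, 2.839, 3.702, 4.602, 8.361, 11.885, 5.519, 10.044, 60.0]

mad, lower, upper, flags = mad_outliers(values)
print(mad, lower, upper, flags)
```

On this input the bounds are tighter than the interquartile range version, so the choice of multiplier would probably need tuning per method.
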

Conclusions

If you are interested, we will be releasing a new Cyber&Data programme, supported by The Data Lab.

Written by

Professor of Cryptography. Serial innovator. Believer in fairness, justice & freedom. EU Citizen. Auld Reekie native. Old World Breaker. New World Creator.
