The Exploitation of Machine Learning in Cybersecurity

Today, companies are able to perform tasks that were previously impossible, thanks to technologies that create, store and examine large sets of data. This benefit, however, carries its own set of risks, particularly from a security standpoint.

With huge amounts of data being generated and transferred over networks, monitoring everything is an uphill task for cybersecurity experts, and potential threats can easily go unnoticed. Hiring more security experts would help, but few are available in the market and many vacancies remain unfilled.

Machine learning can provide a solution. It is disrupting numerous industries and is a buzzword in Silicon Valley. With so many jobs today being handed over to artificial intelligence and robots, is it possible to entrust a responsibility as complicated as cybersecurity to machines? Security professionals are debating the topic, with very strong arguments at either end of the spectrum. Meanwhile, security vendors and tech firms are devising ways to add machine learning to their cybersecurity arsenal.

Reality or Pipe Dream?

Simon Crosby, the CTO at Bromium, calls machine learning cybersecurity’s latest pipe dream. He argues that “in security, there is no silver bullet.” The fact behind this argument is that in cybersecurity you are always up against adversarial minds: people who understand how machines and machine learning work and how to circumvent them. Many attacks are carried out through small, inconspicuous steps, often disguised as authorized commands and requests.

Other cybersecurity experts argue that machine learning is the answer for detecting highly advanced breaches, and that it will become even more valuable in securing IT environments as they grow more complex. Today, AI is still not ready to replace human beings, but by automating pattern recognition it can boost human efforts. What cannot be denied is that machine learning has very distinct use cases in the realm of cybersecurity.

Attended Machine Learning

The major argument against security solutions based on unsupervised machine learning is that they churn out numerous false positives, which leads to alert fatigue and desensitized analysts. At the same time, the volume of events and data generated in corporate networks is beyond the capacity of human experts. Neither can fight cyber threats alone. This has led to the development of solutions in which human experts and AI join forces rather than compete with each other.

In this regard, MIT’s CSAIL (Computer Science and Artificial Intelligence Laboratory) has led one of the most remarkable efforts by building AI², an adaptive cybersecurity platform that uses machine learning together with the assistance of expert analysts to adapt and improve over time.

AI², which takes its name from the combination of analyst intuition and artificial intelligence, reviews tens of millions of log lines every day and singles out the ones it finds suspicious. A human analyst then reviews the filtered data and gives feedback to AI² by labeling the legitimate threats. Over time, the system fine-tunes its monitoring, learns from its successes and mistakes, and becomes better at finding real breaches and reducing false positives.
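The feedback loop described above can be illustrated with a toy sketch. This is not the AI² implementation; it only assumes a single hypothetical feature (failed logins per account), a simple median-based outlier score for the unsupervised step, and a threshold learned from analyst labels for the supervised step. All function names and numbers are made up for illustration.

```python
from statistics import median

def outlier_scores(values):
    """Unsupervised step: robust distance of each value from the median."""
    med = median(values)
    # Median absolute deviation; fall back to 1.0 if all values are identical.
    mad = median(abs(v - med) for v in values) or 1.0
    return [abs(v - med) / mad for v in values]

def top_k(scores, k):
    """Indices of the k most suspicious events, the ones shown to the analyst."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

def learn_threshold(labeled):
    """Supervised step: midpoint between the largest benign value and the
    smallest attack value the analyst has labeled so far (None if one class is missing)."""
    attacks = [v for v, is_attack in labeled if is_attack]
    benign = [v for v, is_attack in labeled if not is_attack]
    if not attacks or not benign:
        return None
    return (max(benign) + min(attacks)) / 2

# Hypothetical day of data: failed logins per account.
day_one = [2, 3, 1, 4, 2, 40, 3, 55, 2, 12]
shown = top_k(outlier_scores(day_one), k=3)

# Stand-in for the human analyst: here, anything with 30+ failures is an attack.
labeled = [(day_one[i], day_one[i] >= 30) for i in shown]
threshold = learn_threshold(labeled)

# The next day, the learned threshold filters events before anyone looks at them.
day_two = [3, 30, 10, 2]
flagged = [v for v in day_two if v > threshold]
```

The point of the design is that the analyst only labels the handful of events the unsupervised step surfaces, and each round of labels sharpens what gets surfaced next.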

AI² – Kalyan Veeramachaneni/MIT CSAIL

Kalyan Veeramachaneni, the research lead at MIT CSAIL, says the largest saving is that the system can show the analyst just 100 to 200 events in a single day, notably fewer than the thousands of security events that cybersecurity experts otherwise have to handle every day.

The platform was tested over a period of 90 days, crunching 40 million log lines a day generated by an e-commerce website. After training, AI² could detect 85% of the attacks without human assistance.

F-Secure, a Finnish security vendor, is another company betting on the combination of machine and human intelligence in its recent cybersecurity efforts, with the aim of reducing the time it takes to detect cyberattacks and respond to them. On average, it takes organizations several months to discover a breach. With its Rapid Detection Service, F-Secure is looking to cut this time frame down to 30 minutes.

The system gathers data from a combination of sensors placed on network segments and software installed on customer workstations. This data is fed to behavioral analytics and threat intelligence engines, which use machine learning to categorize incoming samples, establish what normal behavior looks like, and identify anomalies and outliers. The system uses big data analytics on anonymized datasets collected from a large number of clients to spot growing threats, stored data analytics to compare samples against historical data, and real-time analytics to recognize known security threats.
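The idea of learning normal behavior and then flagging outliers can be sketched in a few lines. This is a generic z-score check, not F-Secure’s engine; the host names, the measurement (outbound megabytes per hour) and the 3-sigma limit are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def build_baseline(history):
    """Learn 'normal' per host as the mean and standard deviation of past
    measurements. Each host needs at least two historical values."""
    return {host: (mean(values), stdev(values)) for host, values in history.items()}

def find_anomalies(baseline, current, z_limit=3.0):
    """Return hosts whose current value lies more than z_limit standard
    deviations from their own baseline, plus hosts never seen before."""
    flagged = []
    for host, value in current.items():
        if host not in baseline:
            flagged.append(host)  # no history at all: treat as an outlier
            continue
        mu, sigma = baseline[host]
        deviation = abs(value - mu)
        if sigma == 0:
            is_outlier = deviation > 0  # perfectly constant history
        else:
            is_outlier = deviation / sigma > z_limit
        if is_outlier:
            flagged.append(host)
    return flagged

# Hypothetical historical data: outbound megabytes per hour for each workstation.
history = {
    "ws-01": [100, 110, 90, 105, 95],
    "ws-02": [20, 25, 22, 18, 15],
}
baseline = build_baseline(history)

# Hypothetical real-time sample compared against the stored history.
current = {"ws-01": 108, "ws-02": 400, "ws-03": 5}
suspicious = find_anomalies(baseline, current)
```

In this made-up sample, ws-01 stays within its usual range, ws-02 sends far more than it ever has, and ws-03 has no history, so the last two would be handed to the human experts.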

At the core of the system is a team of cybersecurity experts who examine the results of machine learning and ultimately identify and handle security incidents. Because machine learning does most of the work, the software engineers and experts can focus on more advanced tasks such as improving the overall system, reverse engineering attacks and identifying relationships between threats.

Erka Koivunen, the advisor for the cybersecurity department at F-Secure, says that “the human part is a significant factor. The attackers are humans. So, you cannot depend on machines alone to perceive them. Our experts know the way the attackers think and the tactics the attackers utilize to hide their presence from the standard means of detection.”

Sifting Through Unstructured Data

While the data collected from endpoints and network traffic is useful in recognizing threats, it accounts for only a minor part of the cybersecurity picture. Much of the information and intelligence needed to identify emerging threats and protect enterprises from them lies in unstructured sources such as social media posts, news stories, research papers and blog posts. Extracting value from these sources is where cybersecurity experts still have the edge over machines.

IBM wants to bridge this gap by exploiting the NLP (Natural Language Processing) capabilities of its flagship AI platform, Watson. IBM aims to leverage Watson’s ability to sift through unstructured data so that it can read and learn from several thousand cybersecurity documents every month and use that knowledge to examine, recognize and prevent cybersecurity threats. The interesting difference between teaching one of your children and teaching Watson is that Watson never forgets.
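To make the notion of turning unstructured text into usable intelligence concrete, here is a deliberately simple sketch. It uses plain regular expressions rather than NLP, so it is far cruder than anything Watson does; it only shows the general shape of the task. The sample text, the CVE identifier and the addresses (taken from ranges reserved for documentation) are made up.

```python
import re

# CVE identifiers look like CVE-<year>-<four or more digits>.
CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,}\b")
# Candidate IPv4 addresses; octet ranges are validated separately below.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def extract_indicators(text):
    """Pull de-duplicated CVE identifiers and valid IPv4 addresses out of free text."""
    cves = sorted(set(CVE_RE.findall(text)))
    ips = sorted(
        {
            ip
            for ip in IPV4_RE.findall(text)
            if all(int(part) <= 255 for part in ip.split("."))
        }
    )
    return {"cves": cves, "ips": ips}

# Hypothetical blog post text, invented for this example.
post = (
    "A new blog post claims CVE-2099-0001 is exploited in the wild. "
    "Scans came from 203.0.113.7 and 198.51.100.23, and version 999.1.2.3 "
    "is affected. CVE-2099-0001 has no patch yet."
)
indicators = extract_indicators(post)
```

Pattern matching like this only finds indicators that follow a fixed format; understanding context, such as who is attacking whom, is the part that calls for real language processing.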

IBM wants to combine this capability with the data collected by X-Force Exchange, its threat intelligence platform, to raise Watson to the level of an expert assistant as a means of addressing the industry’s talent shortage. This should also help decrease the rate of false positives. If the experiment is successful, Watson will be deployed to enterprise customers as a cloud service known as Watson for Cybersecurity.

Massive Alliance, a cybersecurity startup, takes a slightly different approach to extracting information from unstructured data. Strixus, its cybersecurity platform, uses a set of sophisticated proprietary tools that anonymously collect data related to its customers from the dark web (Tor-based networks), the deep web (non-indexed pages) and the surface web (public search engines).

A sentiment-based machine learning engine analyzes the collected data and gauges the general emotion of the content. The mechanics behind this technology include mathematical engines that produce adaptive models of threat actors’ behavior and determine the danger they pose to the client. Analysts then receive these results and process the information to spot potential risks.
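Strixus is proprietary, so its internals are not known here. As a rough illustration of what scoring the emotion of collected text can mean, the sketch below uses a tiny hand-written word lexicon, which is a much simpler technique than a trained sentiment model. The lexicon, the weights, the company name and the posts are all invented.

```python
import re

# Hypothetical mini-lexicon: word -> sentiment weight (negative means hostile).
LEXICON = {
    "attack": -3,
    "breach": -2,
    "dump": -2,
    "leak": -2,
    "hate": -2,
    "secure": 1,
    "great": 2,
    "love": 2,
}

def sentiment_score(text):
    """Sum the lexicon weights of every word in the text; unknown words count as 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)

def rank_by_risk(posts):
    """Order posts from most negative to most positive, so analysts see the
    most hostile content first."""
    return sorted(posts, key=sentiment_score)

# Hypothetical collected posts mentioning a made-up client, ExampleCorp.
posts = [
    "Love the new release, great work",
    "Planning to attack ExampleCorp and dump the leak tonight",
    "ExampleCorp had a breach last year",
]
ranked = rank_by_risk(posts)
```

Here the post about a planned attack ranks first and the friendly one last, which mirrors the division of labor in the text: the engine orders the material and the analysts judge what is a real risk.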

This technique gives the cybersecurity organization the ability to monitor billions of results every day, identify and alert on potentially brand-damaging information, and proactively detect and prevent attacks and data loss before they happen.

Will AI be a replacement for cybersecurity experts in the future?

It is too early to determine whether solutions based on machine learning will replace cybersecurity experts. The balance may shift in the future, but for now humans and robots have no choice but to unite against ever-increasing cyber threats.

Author Bio: Savaram Ravindra was born and raised in Hyderabad, popularly known as the ‘City of Pearls’. He is presently working as a Senior Security Engineer at Tekslate.com and Mindmajix.com. His previous professional experience includes working as a Security Engineer at Cognizant Technology Solutions. He holds a Master’s degree in Nanotechnology from VIT University. He enjoys spending time with his friends. He can be contacted at savaramravindra4@gmail.com. You can also connect with him on LinkedIn and Twitter.

Savaram Ravindra is a guest blogger; all opinions are his own.
