Beating Back Cyber Attacks with Big Data + Analytics

There's a downside to the widespread success of cloud computing, SaaS, smart devices and the plethora of customer and end-user data collected and warehoused in today's business environment: public and private sector organizations are more vulnerable than ever to attacks on their data systems.

The issue is no longer _if_ your business will face a cyber attack; the issue is _when_, and whether you will be prepared.

Your organization's financial security, intellectual property and reputation are at risk from those seeking to do harm for profit, spite, social justice or sport.

Cyber attacks are a serious concern for organizations of all sizes, and addressing real-time threats requires complex solutions. Combining big data with analytics is well suited to protecting businesses and organizations from such attacks, because it can identify threats both before and as they occur.

Big Data is Turning the Table on Security Threats
The more comprehensive and sensitive the end-user and customer data you warehouse, and the greater its volume, the more tempting a target you are to someone wanting to do harm. That said, the same data attracting the threat can be used to thwart an attack. Big data includes all events, activities, actions, and occurrences associated with a threat or attack:

  • User: authentication and access location, access date and time, user profiles, privileges, roles, travel and business itineraries, activity behaviors, normal working hours, typical data accessed, application usage
  • Device: type, software revision, security certificates, protocols
  • Network: locations, destinations, date and time, new and non-standard ports, code installation, log data, activity and bandwidth
  • Customer: customer database, credit/debit card numbers, purchase histories, authentication, addresses, personal data
  • Content: documents, files, email, application availability, intellectual property

The more log data you amass, the greater the opportunity to detect and diagnose cyber attacks and to protect the organization from them. The method is to identify anomalies within the data and correlate them with other events that fall outside expected behavior, which together indicate a potential security breach. The challenge lies in analyzing large amounts of data to uncover unexpected patterns in a timely manner. That's where analytics comes into play.
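
As a rough illustration of identifying an anomaly within the data, the sketch below flags values that sit far from a historical baseline using a z-score. The function name `find_anomalies`, the threshold of 3 standard deviations and the sample figures are all assumptions made for this example; a production system would likely use richer models and many more signals.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observed, threshold=3.0):
    """Return (index, value, z_score) for observed values far from the baseline.

    baseline: historical values considered normal (at least two values).
    observed: new values to check against that baseline.
    threshold: hypothetical cutoff, in standard deviations.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any deviation at all is unusual.
        return [(i, v, float("inf")) for i, v in enumerate(observed) if v != mu]
    flagged = []
    for i, value in enumerate(observed):
        z = (value - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, value, z))
    return flagged


# Hypothetical daily outbound traffic (in MB) for a single user.
baseline_mb = [120, 135, 110, 128, 140, 125, 131, 118, 122, 138]
today_mb = [127, 133, 2400]

anomalies = find_anomalies(baseline_mb, today_mb)
for index, value, z in anomalies:
    print(f"reading {index}: {value} MB is {z:.1f} standard deviations from normal")
```

Only the 2400 MB reading is flagged here. On its own, a single outlier proves little; it becomes meaningful when correlated with other unexpected events.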

Leveraging Big Data with Analytics to Catch a Thief
Using analytics, organizations can monitor network and user behaviors in real time, identifying suspicious activity as it occurs. Organizations can model network, user, application and service profiles to create intelligence-driven security measures that quickly identify anomalies and correlate events indicating a threat or attack:

  • Traffic anomalies to, from or between data warehouses
  • Suspicious activity in high value or sensitive resources of your data network
  • Suspicious user behaviors such as unusual access times, privilege levels, locations, information queries and destinations
  • Newly installed software or different protocols used to access sensitive information
  • Ports used to aggregate traffic for external offload of data
  • Unauthorized or dated devices accessing a network
  • Suspicious customer transactions
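
To make the idea of modeling a profile and correlating events more concrete, here is a minimal sketch that compares one access event against a user profile and raises an alert only when several signals coincide. The class names, field names, signal labels and the two-signal rule are hypothetical choices for illustration, not a prescribed design.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class UserProfile:
    """Expected behavior for one user (all fields are illustrative)."""
    user: str
    work_hours: range                      # e.g. range(8, 18) for 08:00-17:59
    known_locations: set = field(default_factory=set)
    known_ports: set = field(default_factory=set)


@dataclass
class AccessEvent:
    user: str
    timestamp: datetime
    location: str
    port: int


def suspicious_signals(profile, event):
    """List every way this event falls outside the user's expected behavior."""
    signals = []
    if event.timestamp.hour not in profile.work_hours:
        signals.append("off_hours_access")
    if event.location not in profile.known_locations:
        signals.append("unfamiliar_location")
    if event.port not in profile.known_ports:
        signals.append("non_standard_port")
    return signals


def is_alert(signals, min_signals=2):
    """Correlate signals: alert only when at least min_signals coincide."""
    return len(signals) >= min_signals


profile = UserProfile("alice", range(8, 18), {"Denver"}, {22, 443})
normal_event = AccessEvent("alice", datetime(2014, 10, 28, 10, 15), "Denver", 443)
odd_event = AccessEvent("alice", datetime(2014, 10, 28, 3, 5), "Unknown", 8081)

for event in (normal_event, odd_event):
    signals = suspicious_signals(profile, event)
    print(event.timestamp, signals, "ALERT" if is_alert(signals) else "ok")
```

Requiring more than one signal before alerting is one simple way to trade sensitivity for fewer false positives; the right threshold depends on the environment.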

Analytics can be highly effective at identifying an attack before it is fully underway and at recommending an action to counter it, thus minimizing or eliminating losses. By analyzing disparate events in a timely way, analytics puts big data to work against attacks of every scale, from the smallest to the largest.

The Big Data Solution to Security Monitoring
If security monitoring is a big data problem, then it requires a big data analytics solution capable of analyzing large amounts of data in real time. The natural place to look for that solution is Apache Hadoop and its ecosystem of related technologies. But although Hadoop does a good job of performing analytics on large amounts of data, it was developed to provide batch analysis, not the real-time streaming analytics required to detect security threats.

In contrast, a solution built for real-time streaming analytics is Apache Storm, a free and open source real-time computation system. Storm functions similarly to Hadoop, but was developed for real-time analytics. Storm is fast and scalable, supporting not only real-time analytics but also machine learning, which is necessary to reduce the number of false positives found in security monitoring. Storm is commonly found in cloud solutions supporting antivirus programs, where large amounts of data are analyzed to identify threats and where quick data processing and anomaly detection are essential.
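
The difference between batch and streaming analysis can be sketched in a few lines. The example below does not use Storm's API; it is plain Python that mimics the kind of per-event, sliding-window computation a stream processor would run continuously. The class name, the window size, the event limit and the sample IP addresses are assumptions for illustration, and the sketch assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Count events per key over a moving time window (illustrative only)."""

    def __init__(self, window_seconds=60, max_events=5):
        self.window_seconds = window_seconds
        self.max_events = max_events
        self._events = defaultdict(deque)   # key -> timestamps inside the window

    def observe(self, key, timestamp):
        """Record one event; return True if the key now exceeds max_events."""
        window = self._events[key]
        window.append(timestamp)
        # Evict timestamps that have aged out of the window.
        while window and timestamp - window[0] >= self.window_seconds:
            window.popleft()
        return len(window) > self.max_events


# Hypothetical stream of failed logins: (source IP, seconds since start).
stream = [
    ("10.0.0.5", 0),
    ("10.0.0.5", 10),
    ("10.0.0.9", 12),
    ("10.0.0.5", 20),
    ("10.0.0.5", 25),
    ("10.0.0.5", 200),
]

counter = SlidingWindowCounter(window_seconds=60, max_events=3)
alerts = [(ip, t) for ip, t in stream if counter.observe(ip, t)]
print(alerts)
```

Each event is evaluated the moment it arrives, so the fourth failed login from the same address within a minute is flagged immediately instead of waiting for a later batch job.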

The key is real-time analysis. Big data contains the activities and events signaling a potential threat, but it takes real-time analytics to make it an effective security tool, and the statistical analysis of data science tools to prevent security breaches.


  • Cyber attacks are a threat to businesses and public sector organizations alike; it's nearly assured that every organization will face an attack at some point.
  • While warehousing as much network, application, service and user data as possible attracts those who seek to do harm, that same data can be effective in thwarting or countering an attack to avoid losses. The challenge in using vast amounts of data to identify a cyber attack is quickly correlating events and identifying suspicious behaviors that point to a potential security breach.
  • Big data and analytics are well suited to spotting unusual behaviors in real time, revealing a potential threat or an attack underway and allowing immediate action to minimize or prevent losses. Analytics leverages big data to provide real-time security, discovering threats that would otherwise go unnoticed until it is too late to counter them.

This blog post was published October 28, 2014.
