[Quicksand] A reliable security data platform in practice

Updated 2025-03-29. Source: SLTechnology News&Howtos (shulou)

Preface

OpenSOC is a big-data security analytics framework that Cisco unveiled at BroCon. It analyzes network packets and flows, combining big-data analytics with security analysis to detect network anomalies in real time, and it scales out across many nodes. It uses the open source projects Hadoop for storage, Elasticsearch for real-time indexing and Storm for online stream analysis.

Drawing on its own circumstances, Yixin has built its own security data platform, the Quicksand platform, which integrates collection, analysis and storage. This article covers the architecture of the Quicksand platform, the optimizations and improvements it makes over OpenSOC, and the lessons learned while putting the platform into production.

I. Architecture of the Quicksand platform

The platform architecture is divided into five layers: acquisition, preprocessing, analysis, storage and response. Where needed, Kafka serves as the message queue between layers, which keeps data reliable in transit.

1.1 Acquisition layer

The acquisition layer collects data and sends it to Kafka. The data collected mainly includes:

Traffic data: parsed with packetbeat

Log data: logs in file form are collected by filebeat; data in syslog form is collected by rsyslog

Operations data: to ease troubleshooting and cluster performance monitoring, metricbeat collects operations data from the Quicksand platform's cluster servers

In actual operation we found that packetbeat has defects in * scenarios, and we developed fixes for them, for example:

Compressed web pages lead to a garbled body

If there is no Content-Type field, the body is not unpacked

Parameters in the body section are added to the params field

Urlparse errors caused by CONNECT requests

Urlparse errors caused by non-standard URL-encoded fields

1.2 Preprocessing layer

The Quicksand platform's preprocessing program (hereinafter "ybridge") is a distributed preprocessing framework that Yixin developed in-house in golang. Inputs, outputs and data-processing functions are customized by writing configuration. By writing plug-ins, each kind of data can be processed separately, which covers the varied needs that arise in practice.
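
The configuration-plus-plug-in idea can be sketched as follows. This is a minimal illustration in Python, not ybridge's actual code (ybridge is written in golang and its configuration format is not shown in this article). The plug-in names, the registry, the field names and the choice of hashing for sensitive fields are all assumptions made for the example.

```python
import gzip
import hashlib
import json
from typing import Callable, Dict, List

Event = Dict[str, object]
Plugin = Callable[[Event], Event]

# Hypothetical plug-in registry: names map to functions that transform one event.
PLUGINS: Dict[str, Plugin] = {}


def plugin(name: str) -> Callable[[Plugin], Plugin]:
    """Register a function under a name that a configuration can refer to."""
    def register(func: Plugin) -> Plugin:
        PLUGINS[name] = func
        return func
    return register


@plugin("gunzip_body")
def gunzip_body(event: Event) -> Event:
    """Decode a gzip-compressed body so later stages see readable text."""
    body = event.get("body")
    if isinstance(body, bytes) and body[:2] == b"\x1f\x8b":  # gzip magic bytes
        event["body"] = gzip.decompress(body).decode("utf-8", errors="replace")
    return event


@plugin("mask_sensitive")
def mask_sensitive(event: Event) -> Event:
    """Replace sensitive values with a truncated SHA-256 digest."""
    for field in ("password", "id_card"):  # assumed field names
        if field in event:
            digest = hashlib.sha256(str(event[field]).encode("utf-8")).hexdigest()
            event[field] = digest[:16]
    return event


@plugin("drop_useless")
def drop_useless(event: Event) -> Event:
    """Remove fields whose value is None, an empty string or empty bytes."""
    return {k: v for k, v in event.items() if v not in (None, "", b"")}


def build_pipeline(config: Dict[str, List[str]]) -> Plugin:
    """Turn a configuration's ordered plug-in names into one callable."""
    stages = [PLUGINS[name] for name in config["plugins"]]

    def run(event: Event) -> Event:
        for stage in stages:
            event = stage(event)
        return event
    return run


if __name__ == "__main__":
    config = json.loads('{"plugins": ["gunzip_body", "mask_sensitive", "drop_useless"]}')
    pipeline = build_pipeline(config)
    raw = {"body": gzip.compress(b"user=alice"), "password": "secret", "referer": ""}
    print(pipeline(raw))
```

Keeping each step as a small named function means a new data type only needs a different list of plug-in names in its configuration, which is the property the paragraph above describes.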

Ybridge has the following advantages:

Flexible function

High performance

No external dependencies

Supports redundant deployment, for high reliability

Supports docker/vm deployment, so it is easy to scale

Can send runtime data to metricbeat for performance monitoring

The main work of ybridge includes the following points:

Support for gzip decoding

Data formatting

Field expansion

Field extraction

Sensitive field coding

Log informatization

Data encryption and decryption

Delete useless data

Data compression

1.3 Analysis layer

Data analysis is the core of a big-data analysis platform. Analysis can be done in Kibana or in a standalone program, but both approaches pull data from ES before analyzing it: timeliness suffers, and heavy reliance on the ES cluster hurts platform stability. For this reason, the Quicksand platform implements an analysis engine based on Spark, which takes Kafka as its data source and stores the analysis results in ES. A rule can be worked out manually in Kibana and then applied to the analysis engine.
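
To make the idea of "a rule applied to the analysis engine" concrete, here is a small sketch of a windowed threshold rule. It is plain Python standing in for a Spark job so that it runs without a cluster; the `Rule` structure, the event fields (`ts` as integer epoch seconds, `src_ip`, `status`) and the example rule are assumptions, not the platform's real rule format.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Rule:
    """A hypothetical detection rule: filter, group, then threshold per window."""
    name: str
    matches: Callable[[Dict], bool]
    group_by: str
    threshold: int
    window_seconds: int = 60


def evaluate(rule: Rule, events: Iterable[Dict]) -> List[Dict]:
    """Count matching events per (window, key); report keys at or above the threshold."""
    counts: Counter = Counter()
    for event in events:
        if rule.matches(event):
            # Align the timestamp to the start of its fixed-size window.
            window_start = event["ts"] - event["ts"] % rule.window_seconds
            counts[(window_start, event[rule.group_by])] += 1
    return [
        {"rule": rule.name, "window_start": w, rule.group_by: key, "count": n}
        for (w, key), n in sorted(counts.items())
        if n >= rule.threshold
    ]


if __name__ == "__main__":
    scan_rule = Rule("many_404", lambda e: e.get("status") == 404, "src_ip", threshold=3)
    events = [{"ts": 120 + i, "src_ip": "10.0.0.5", "status": 404} for i in range(3)]
    events.append({"ts": 125, "src_ip": "10.0.0.9", "status": 200})
    for alert in evaluate(scan_rule, events):
        print(alert)
```

In a streaming engine the same three steps (filter, group by key and window, compare with a threshold) would run continuously over the Kafka stream, with the resulting alerts written to ES.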

The functions implemented in the analysis layer include:

Asset discovery

* Discovery

Information disclosure

Trace the source of internal threats

Business risk control

1.4 Storage layer

The storage layer consists of two ES clusters (ES_all and ES_out) and one HBase cluster. The two ES clusters serve different functions; keeping them separate avoids a single failure making all of ES unavailable and improves platform stability:

ES_all cluster: stores the full raw data, for manual analysis and traceability

ES_out cluster: stores the result data after analysis, for programs to call

ES stores short-term hot data, while HBase stores long-term cold data.

Data is stored in HBase by timestamp, with one rowkey holding one second of data. Through ybridge, users can replay a chosen time range of cold data from HBase into Kafka and then run subsequent operations such as analysis or traceability.
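
The one-rowkey-per-second layout and the replay of a time range can be sketched as below. The in-memory `ColdStore` class is a stand-in for an HBase table, `send` stands in for a Kafka producer, and the zero-padded rowkey format is an assumption; the article does not give the real key format.

```python
import json
from collections import defaultdict
from typing import Callable, Dict, List


def rowkey_for(epoch_seconds: float) -> str:
    """One row key per second; zero-padded so keys sort chronologically."""
    return f"{int(epoch_seconds):010d}"


class ColdStore:
    """In-memory stand-in for an HBase table keyed by second (not a real HBase client)."""

    def __init__(self) -> None:
        self.rows: Dict[str, List[str]] = defaultdict(list)

    def put(self, event: Dict) -> None:
        """Append the JSON-encoded event to the row for its second."""
        self.rows[rowkey_for(event["ts"])].append(json.dumps(event, sort_keys=True))

    def replay(self, start: int, end: int, send: Callable[[str], None]) -> int:
        """Send every event in [start, end) seconds, in time order; return the count."""
        sent = 0
        for second in range(start, end):
            for payload in self.rows.get(rowkey_for(second), []):
                send(payload)
                sent += 1
        return sent


if __name__ == "__main__":
    store = ColdStore()
    for ts in (100, 100.5, 101, 105):
        store.put({"ts": ts, "msg": "example"})
    replayed: List[str] = []
    print(store.replay(100, 102, replayed.append), "events replayed")
```

Because each second maps to exactly one key, replaying a period is a walk over consecutive keys rather than a scan with a filter.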

1.5 Response layer

The response layer analyzes and processes the data for users and responds to what is found. It mainly includes:

Kibana: used for data search, monitoring and display, * tracing, etc.

Monitoring visualization: charts on a big screen present the current major risks, giving a more intuitive view of the security threats the enterprise faces.

AlertAPI: once monitoring finds a problem, a follow-up action is often needed, such as an automatic response or an alarm.

There are several ways to trigger follow-up actions, such as:

Write a program to analyze and raise alarms

Watcher (paid; Elastic's official tool)

ElastAlert (free; a Python-based alerting framework)

Subsequent actions can include:

SMS alarm

Mail alarm

Automatically block malicious IP

Reprocessing the data and rewriting it to ES

Jupyterhub: a multi-user data analysis platform for extracting data from the ES cluster and analyzing it offline with Python.
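
As an illustration of the follow-up actions listed under AlertAPI, the sketch below routes an alert to actions by severity. The function names, the severity levels and the routing table are hypothetical; the action bodies are placeholders where a real SMS gateway, mail client or firewall API call would go.

```python
from typing import Callable, Dict, List

Action = Callable[[Dict], str]


def send_sms(alert: Dict) -> str:
    return f"SMS: {alert['title']}"      # placeholder for a real SMS gateway call


def send_mail(alert: Dict) -> str:
    return f"MAIL: {alert['title']}"     # placeholder for a real mail client call


def block_ip(alert: Dict) -> str:
    return f"BLOCK: {alert['src_ip']}"   # placeholder for a firewall or WAF API call


# Assumed routing policy: which actions run for which severity.
ROUTES: Dict[str, List[Action]] = {
    "low": [send_mail],
    "high": [send_mail, send_sms],
    "critical": [send_mail, send_sms, block_ip],
}


def respond(alert: Dict) -> List[str]:
    """Run every action configured for the alert's severity and return what was done."""
    actions = ROUTES.get(alert.get("severity", "low"), [send_mail])
    return [action(alert) for action in actions]


if __name__ == "__main__":
    print(respond({"title": "port scan", "severity": "critical", "src_ip": "203.0.113.7"}))
```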

II. Comparison with OpenSOC

OpenSOC likewise handles traffic data and log data. After collection, the data is first sent to Kafka, then formatted and field-expanded by Storm, and then written to Hive, ES and HBase. Finally, the data is analyzed through a web service or analysis tools. The architecture of the Quicksand platform is basically the same as OpenSOC's, with a few differences that are described below.

2.1 Using beats for data acquisition

The Quicksand platform relies heavily on beats for data collection. Beats is an official Elastic product with an active community and strong performance and functionality.

Packetbeat parses network traffic, filebeat collects log files, and metricbeat monitors system performance and ybridge performance. Beats has the following advantages:

High performance

Easy to use

Uses the same technology stack as ybridge, so functions are easy to extend or modify

Beats versions are released in step with Elasticsearch

2.2 Splitting the processing layer

OpenSOC's real-time processing part covers data parsing, enrichment, analysis and more. The Quicksand platform splits this part into two layers by function: a preprocessing layer and an analysis layer. The preprocessing layer uses the self-developed ybridge program, which mainly implements ETL and scales horizontally. Its timeliness and processing speed ensure that preprocessing results can be stored directly or passed to Kafka for the next stage. The analysis layer is an analysis framework built on Spark. Spark's timeliness is lower than Storm's, but Spark handles aggregation analysis better and works natively with machine learning and graph computing, so it suits data analysis well, whereas implementing similar functions on Storm costs more.

2.3 Dropping Hive

In OpenSOC each piece of data is stored in Hive, HBase and ES, which demands a large investment in storage. Because Hive queries are slow and data can be analyzed in or extracted from ES directly, the Quicksand platform does not use Hive. Short-term data is queried directly in ES, while long-term data is pulled out of HBase before use. This architecture suits small and medium-sized enterprises better, balancing functionality against resources.

2.4 A different approach to data playback

OpenSOC stores pcap files in HBase, while the Quicksand platform stores preprocessed, formatted data as JSON, which makes playback more convenient and saves storage space.

OpenSOC implements data playback through a web server, while the Quicksand platform uses ybridge.

2.5 Tight integration with threat intelligence

OpenSOC enriches data in its real-time processing layer, and the Quicksand platform likewise integrates threat intelligence. Threat intelligence plays an increasingly important role in enterprise security and helps enterprises find potential security problems.

The Quicksand platform turns internal alarms into internal intelligence, then combines that internal intelligence with external intelligence to form reliable, unique threat intelligence. This is fed back into logs and traffic, making data analysis and decision-making easier for enterprise security analysts.
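
A minimal sketch of combining internal and external intelligence and feeding it back into events might look like this. The indicator format (an `ip` with optional `tags`), the `src_ip` event field and the `threat_intel` output field are assumptions for illustration only.

```python
from typing import Dict, Iterable


def merge_intel(internal: Iterable[Dict], external: Iterable[Dict]) -> Dict[str, Dict]:
    """Index indicators by IP, recording every source that reported each one."""
    merged: Dict[str, Dict] = {}
    for source, feed in (("internal", internal), ("external", external)):
        for item in feed:
            entry = merged.setdefault(item["ip"], {"sources": [], "tags": set()})
            if source not in entry["sources"]:
                entry["sources"].append(source)
            entry["tags"].update(item.get("tags", []))
    return merged


def enrich(event: Dict, intel: Dict[str, Dict]) -> Dict:
    """Return a copy of the event, adding a threat_intel field if its source IP is known."""
    hit = intel.get(event.get("src_ip"))
    if hit is None:
        return dict(event)
    return {**event, "threat_intel": {"sources": hit["sources"], "tags": sorted(hit["tags"])}}


if __name__ == "__main__":
    internal_feed = [{"ip": "198.51.100.4", "tags": ["bruteforce"]}]  # e.g. derived from past alarms
    external_feed = [{"ip": "198.51.100.4", "tags": ["botnet"]}, {"ip": "203.0.113.9"}]
    intel = merge_intel(internal_feed, external_feed)
    print(enrich({"src_ip": "198.51.100.4", "url": "/login"}, intel))
```

An indicator confirmed by both sources carries both labels in `sources`, which an analyst can use as a simple signal of reliability.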

III. Deployment experience

3.1 High availability of the platform

The most important job of the Quicksand platform is to provide stable, reliable data services, so high availability matters a great deal. First, every component of the platform except beats and Kibana is deployed redundantly; even the platform's log-receiving servers are deployed active-active. Second, the preprocessor can be started and stopped at any time, so the program can be upgraded smoothly without users noticing. Finally, to ensure stability, the Quicksand platform adds a large number of monitoring alarms, such as:

ES cluster exception monitoring

Data loss monitoring

Packet loss rate monitoring

Service liveness monitoring

3.2 Solving packet loss

When parsing network traffic, the biggest problem is packet loss. The first step is being able to detect it: periodically send 100 packets with a designated UA, then count how many of them arrive on the ES side. If the number lost exceeds a set value, raise an alarm.
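
The check described above reduces to a small calculation once the two counts are known. In this sketch `sent` is the number of probe packets sent and `received` is the count found on the ES side for the designated UA; how that count is queried is left out, and the 5% threshold is an assumed value.

```python
from typing import Optional


def loss_rate(sent: int, received: int) -> float:
    """Fraction of probe packets that never reached the index."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return max(sent - received, 0) / sent


def check_probe(sent: int, received: int, max_loss: float = 0.05) -> Optional[str]:
    """Return an alarm message when loss exceeds max_loss, otherwise None."""
    rate = loss_rate(sent, received)
    if rate > max_loss:
        return f"packet loss {rate:.1%} exceeds {max_loss:.1%} ({received}/{sent} probes indexed)"
    return None


if __name__ == "__main__":
    for received in (97, 90):
        print(received, check_probe(100, received))
```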

Packet loss can have many causes; the focus here is software packet loss. There are also many solutions, which fall into these broad approaches:

Improve parsing efficiency (for example with pf_ring or DPDK)

Hardware traffic splitting

Software traffic splitting

Choose mirror access points carefully; mirroring core traffic is not recommended, and avoiding it greatly reduces the amount of data to parse

Choose a solution according to your own enterprise's situation. The first approach is the most efficient, but pf_ring is paid software and DPDK usually requires development work, so both carry some cost. Hardware or software traffic splitting can also solve the problem; hardware splitting is recommended because it is simpler and more reliable.

3.3 Centralized management of services

Putting the platform into production requires a large number of servers running different programs, so a centralized means of management is needed. To this end, the Quicksand platform uses the following two tools:

Gosuv: a distributed supervisor framework written in Go that manages programs on servers remotely through a web page.

Consul: a distributed service framework that provides service registration, service discovery and unified configuration management.

Together, gosuv and Consul make it easy to manage programs centrally and monitor their status.

Summary

Data is the basis of security analysis. With data, threat intelligence, situational awareness, * portraits, business risk control, * traceability, * identification and asset discovery are all within reach. Based on Yixin's actual scenario and weighing efficiency, cost and function together, the Quicksand platform makes a number of improvements to OpenSOC and has been put into production.

With the Quicksand platform, security staff can focus most of their energy on data analysis. The platform makes up for the shortcomings of commercial security products and helps the security team build a comprehensive picture of the enterprise's security status. The Quicksand platform is not only a data platform but also an important supplement to existing security measures.

Author: Gao Yang, Security Development

First published by: Yixin Security Emergency Response Center
