
A Vanity Project: Visualizing IP Data

2025-02-24 Update From: SLTechnology News&Howtos

Preface

One thing I care about a lot is whether something looks good, and whether it is interesting, even if it may not be useful.

Below is a picture of the result. It is not very dazzling because the amount of data is limited.

This article visualizes web logs, or any text file containing IP addresses, with echarts and threejs. Put simply, it is for showing off: you can put the animated chart on a big screen.

All source code and related data files are in the following GitHub repository:

https://github.com/youerning/blog/tree/master/ip-visualize

Prerequisites

Familiar with Python and the Flask framework
Familiar with JavaScript

Obtaining the data

IP data

Ways to obtain the data

Log files
ELK stack
Other

Ultimately, the data comes from log files, mainly web logs.

The web log of my own website is used here. Its format is as follows.

'116.24.64.239 - - [12/Mar/2018:18:58:40 +0800] "GET /example HTTP/2.0" 502 365\n116.24.64.239 - - [12/Mar/2018:18:54:55 +0800] "GET / HTTP/2.0" 200 1603\n'

Use the following code to extract the IP addresses.

    # Open the log file
    fp = open("website.log")
    # Create a set of IPs, since only the distinct IP addresses are needed here
    ip_set = set()
    # Read the log one line at a time in a loop, which is recommended for large logs;
    # a small log file can be read all at once
    while True:
        line = fp.readline()
        # readline() returns an empty string only at end of file
        if not line:
            break
        # Skip blank lines instead of stopping at them
        if len(line.strip()) < 1:
            continue
        ip = line.split()[0]
        ip_set.add(ip)
    fp.close()

    # Number of distinct visitor IPs
    len(ip_set)
    # View the first 20 IPs
    list(ip_set)[:20]
    ['111.206.36.133', '220.181.108.183', '40.77.178.63', '220.181.108.146', '119.147.207.152', '112.97.63.49', '66.249.64.16', '138.246.253.19', '123.125.67.164', '40.77.179.59', '66.249.69.170', '119.147.207.144', '66.249.79.108', '157.55.39.23', '123.125.71.80', '42.236.10.84', '123.125.71.79', '111.206.36.10', '106.11.152.155', '66.249.66.148']

To make this more widely applicable, however, a regular expression is used here.

    import re

    pat = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
    ipfind = re.compile(pat)
    line = '116.24.64.239 - - [12/Mar/2018:18:54:55 +0800] "GET / HTTP/2.0" 200 1603\n'
    ip = ipfind.findall(line)
    if ip:
        ip = ip[0]
        print(ip)

Here are the complete steps.

    import re
    from glob import glob

    # Create the IP list
    ip_lis = list()
    # Log files
    files = glob("logs/*")
    # Compile the regex
    pat = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
    ipfind = re.compile(pat)
    # Extract the IP from every file
    for logfile in files:
        with open(logfile) as fp:
            # If the log file is small, readlines() can read everything at once;
            # if it is too large, read it line by line with readline()
            lines = fp.readlines()
            for line in lines:
                if len(line.strip()) < 1:
                    continue
                ip = ipfind.findall(line)
                if ip:
                    ip = ip[0]
                    ip_lis.append(ip)

At this point, the IPs have been extracted from the access logs.

Note: if you run a log cluster such as ELK, obtaining the data is simpler and faster. Only the method differs, so it is not covered here.

Geographic information for IP addresses

Having only the IP data is of no use in this article. To visualize the location of each IP on a map, we need the geographic information for each IP address, that is, its latitude and longitude, city, and so on.

This article uses the free, open-source geoip database provided by dev.maxmind.com.

Download address: https://dev.maxmind.com/geoip/geoip2/geolite2/

The location information for an IP address is not guaranteed to be fully accurate. For better accuracy, you can look for an online geo service.

To use the downloaded database, first install the corresponding module.

    pip install geoip2

Use the following code to get the geographic information for a given IP.

    # Import the module
    import geoip2.database
    # Load the downloaded database file; here it is in the working directory of the running code
    reader = geoip2.database.Reader("GeoLite2-City.mmdb")
    response = reader.city("61.141.65.76")
    # Country name
    response.country.name
    Out[115]: 'China'
    # City name
    response.city.name
    Out[116]: 'Shenzhen'
    response.city.names["zh-CN"]
    Out[117]: '深圳市'
    # Latitude and longitude
    response.location.latitude
    Out[118]: 22.5333
    response.location.longitude
    Out[119]: 114.1333

The above only uses the geoip2 library to look up the city, country, latitude, and longitude. You can explore the rest yourself.
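The regular expression used for extraction also matches strings that are not valid addresses (for example 999.1.1.1), and private or loopback addresses will typically not be found in a geo database. The following is a minimal sketch, assuming you want to filter such values out before calling reader.city. The helper name extract_public_ip is hypothetical and not part of the original code; it relies only on the standard ipaddress module.

```python
import ipaddress
import re

# Same pattern as in the article; it also matches impossible values such as 999.1.1.1
IP_PATTERN = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def extract_public_ip(line):
    """Return the first publicly routable IPv4 address in a log line, or None.

    Hypothetical helper for illustration only.
    """
    for candidate in IP_PATTERN.findall(line):
        try:
            addr = ipaddress.ip_address(candidate)
        except ValueError:
            # The regex matched, but the text is not a valid IP address
            continue
        if addr.is_global:
            # Private, loopback, and reserved ranges are skipped
            return candidate
    return None


if __name__ == "__main__":
    sample = '116.24.64.239 - - [12/Mar/2018:18:54:55 +0800] "GET / HTTP/2.0" 200 1603\n'
    print(extract_public_ip(sample))
```

Whether to drop private addresses at all depends on the log source; for an internal-only site they may be the only addresses present.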
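Processing the data

Before processing the data, we need to know what format to produce. Since drawing charts is time-consuming, laborious work, this article builds on the following echarts demo:

http://echarts.baidu.com/examples/editor.html?c=lines3d-flights&gl=1

The demo's data source is:

http://echarts.baidu.com/examples/data-gl/asset/data/flights.json

The data structure is roughly as follows.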

But this format is a little misleading.

By reading the demo's JS code, you will find that the data format for drawing the flying lines is:

[[[source latitude, source longitude], [target latitude, target longitude]], ...]

The data format required by threejs is as follows:

var data = [['seriesA', [latitude, longitude, magnitude, latitude, longitude, magnitude, ...]], ['seriesB', [latitude, longitude, magnitude, latitude, longitude, magnitude, ...]]];

An annotated version of the demo from the official echarts website can be viewed at the following address.

https://github.com/youerning/blog/blob/master/ip-visualize/ipvis/prototype/lines3d-flights.html

Inserting all of that code here would take up too much space.

The data is processed as follows

    from functools import lru_cache

    @lru_cache(maxsize=512)
    def get_info(ip):
        """
        return info of ip
        Returns:
            city, country, sourceCoord, destCoord
        """
        try:
            resp = reader.city(ip)
            city = resp.city.name
            if not city:
                city = "unknow"
            country = resp.country.names["zh-CN"]
            if not country:
                country = "unknow"
        except Exception as e:
            print("the ip is bad: {}".format(ip))
            print("=" * 30)
            print(e)
            return False
        sourceCoord = [resp.location.longitude, resp.location.latitude]
        # NOTE: destCoord is not defined in this excerpt; it must be defined
        # before get_info is called
        return city, country, sourceCoord, destCoord

    # ip_lis is the IP address list obtained above;
    # ipinfo_lis is the data used for the visualization
    ipinfo_lis = [get_info(ip) for ip in ip_lis]
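To connect the lookup results to the two data formats described earlier, here is a rough sketch of the conversion. It assumes each result is either False or a (city, country, sourceCoord, destCoord) tuple whose coordinates are [longitude, latitude], as returned by get_info. The function names, the series name "visitors", and the choice of magnitude (hit count divided by the largest hit count) are illustrative assumptions, not taken from the original source.

```python
from collections import Counter


def to_lines(ipinfo_lis):
    """Build the flying-line list: [[sourceCoord, destCoord], ...]."""
    # Failed lookups are stored as False and are skipped
    return [[info[2], info[3]] for info in ipinfo_lis if info]


def to_globe_series(ipinfo_lis, name="visitors"):
    """Build one [name, [lat, lng, magnitude, ...]] series."""
    # Count how many requests came from each source coordinate
    counts = Counter(tuple(info[2]) for info in ipinfo_lis if info)
    if not counts:
        return [name, []]
    top = max(counts.values())
    flat = []
    for (lng, lat), hits in counts.items():
        # sourceCoord is [longitude, latitude]; this format wants latitude first.
        # Assumed magnitude: hits scaled so the busiest point is 1.0
        flat.extend([lat, lng, hits / top])
    return [name, flat]


if __name__ == "__main__":
    sample = [
        ("Shenzhen", "China", [114.1333, 22.5333], [0.0, 0.0]),
        False,
        ("Shenzhen", "China", [114.1333, 22.5333], [0.0, 0.0]),
    ]
    print(to_lines(sample))
    print([to_globe_series(sample)])
```

The [0.0, 0.0] destination in the sample is a placeholder only; the real destination would be whatever destCoord is set to in the application.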

After processing the data, you can expose it through an interface that returns JSON.

The page then fetches the data through ajax.
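A minimal sketch of such an interface with Flask is shown here. The route name /data, the "lines" key, and the placeholder data are assumptions for illustration; the actual routes in the repository's app.py may differ.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Placeholder data in the flying-line shape; in the real application this
# would be built from ipinfo_lis. The [0.0, 0.0] destination is a dummy value.
LINES = [[[114.1333, 22.5333], [0.0, 0.0]]]


@app.route("/data")  # hypothetical route name
def data():
    # jsonify serializes the dict and sets the application/json content type
    return jsonify({"lines": LINES})


if __name__ == "__main__":
    app.run()
```

On the page, an ajax request to this URL would receive the JSON body and pass the "lines" array to the chart's series data.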

Real-time data update

This section only sketches the ideas.

Log file

The main idea is to use the file position of Python's file object to detect whether new content has been written to the log. If so, load the new data into the exposed data interface.
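The file-position idea can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: the function name read_new_lines is hypothetical, the caller is assumed to remember the returned offset between polls, and a file that shrinks is assumed to have been truncated or rotated.

```python
import os


def read_new_lines(path, offset):
    """Return (complete lines appended since offset, new offset)."""
    with open(path, "rb") as fp:
        # Jump to the end to learn the current file size
        fp.seek(0, os.SEEK_END)
        if fp.tell() < offset:
            # The file is smaller than before: assume truncation or rotation
            offset = 0
        fp.seek(offset)
        data = fp.read()
    # Only consume up to the last newline so a half-written line is
    # picked up again, in full, on the next call
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


if __name__ == "__main__":
    import time

    position = 0
    while True:
        new_lines, position = read_new_lines("website.log", position)
        for line in new_lines:
            print(line)
        time.sleep(1)
```

Each batch of new lines could then go through the same IP extraction and lookup steps before being added to the data served by the interface.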

Elk stack

This is relatively simple: just query the data periodically.

Demo usage

    # Install the dependencies
    pip install flask geoip2
    # Download the source code
    # Enter the ipvis directory
    # Put the log files in the logs directory
    # Start the app
    python app.py
    # Open the pages in a browser
    http://127.0.0.1/p1
    http://127.0.0.1/p2

Note that lookups against the geoip database are not very fast, so the page feels slow when you open it. This is mainly because querying the IP data takes too long: about 14 seconds for 18,000 records.
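One possible way to reduce the lookup time, sketched under the assumption that many log lines share the same IP, is to look up each distinct address only once and keep the hit count alongside the result. The function name lookup_unique is hypothetical, and lookup stands for any function with the same contract as get_info (a result on success, False on failure). How much time this saves depends on how repetitive the log is; no measurement is claimed here.

```python
from collections import Counter


def lookup_unique(ip_lis, lookup):
    """Look up each distinct IP once; return {ip: (info, hit_count)}."""
    counts = Counter(ip_lis)
    results = {}
    for ip, hits in counts.items():
        info = lookup(ip)
        # Failed lookups return a falsy value and are left out
        if info:
            results[ip] = (info, hits)
    return results


if __name__ == "__main__":
    # Stand-in lookup for demonstration; the real one would query the database
    demo = lookup_unique(["1.1.1.1", "1.1.1.1", "8.8.8.8"], lambda ip: ("city", ip))
    print(demo)
```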

The echarts library may also have a performance problem, at least for the globe chart, and even with the official demo on the echarts website: when you open http://127.0.0.1/p1, CPU usage may soar to 100%.

Shortcomings

Data cannot be loaded in real time
Performance problems when there is too much data
The classification of IP data is not detailed enough
The charts are not polished enough

This is a small project that may be interesting only to me. I wonder whether you have any more interesting ideas.
