Example Analysis of Hadoop Website Logs

2025-04-02 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)05/31 Report--

This article walks through an example of analyzing website logs with Hadoop. The explanation is kept simple and easy to follow, so let's work through it step by step.

I. Project requirements

In this log-processing approach, "log" refers to Web logs only. There is no precise definition; the term covers, among other things, the user access logs generated by front-end Web servers such as apache, lighttpd, nginx, and tomcat, as well as the logs that Web applications themselves output.

II. Requirements analysis: KPI design

PV (PageView): number of views per page

IP: number of unique visitor IPs per page

Time: PV per hour

Source: the referring domains users arrive from

Browser: the browsers (user agents) users access the site with
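To make the first two KPIs concrete, here is a minimal plain-Java sketch of how PV and unique-IP counts could be computed from (ip, page) pairs. The class name KpiSketch, the method names, and the second sample address are assumptions made for illustration; this is not the article's Hadoop job, only the counting logic such a job would need.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: each hit is a two-element array {ip, page}.
final class KpiSketch {

    // PV: every request for a page counts once, repeats included.
    static Map<String, Integer> pageViews(String[][] hits) {
        Map<String, Integer> pv = new HashMap<>();
        for (String[] hit : hits) {
            pv.merge(hit[1], 1, Integer::sum);
        }
        return pv;
    }

    // IP: each distinct client address counts once per page.
    static Map<String, Integer> uniqueIps(String[][] hits) {
        Map<String, Set<String>> seen = new HashMap<>();
        for (String[] hit : hits) {
            seen.computeIfAbsent(hit[1], k -> new HashSet<>()).add(hit[0]);
        }
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : seen.entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }

    public static void main(String[] args) {
        String[][] hits = {
            {"222.68.172.190", "/images/my.jpg"},
            {"222.68.172.190", "/images/my.jpg"},
            {"10.0.0.7", "/images/my.jpg"}, // made-up address for the example
        };
        System.out.println("PV: " + pageViews(hits));
        System.out.println("IP: " + uniqueIps(hits));
    }
}
```

The difference between the two KPIs is that PV counts repeat visits from the same address, while IP uses a set to collapse them.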

Now I will focus on the analysis of browser statistics.

III. Analysis process

1. A sample nginx log record

222.68.172.190 - - [18/Sep/2013:06:49:57 +0000] "GET /images/my.jpg HTTP/1.1" 200 19939 "http://www.angularjs.cn/A00n" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36"

2. Analyze the log records above

remote_addr: the client's IP address, 222.68.172.190

remote_user: the client user name, -

time_local: the access time and time zone, [18/Sep/2013:06:49:57 +0000]

request: the requested URL and HTTP protocol, "GET /images/my.jpg HTTP/1.1"

status: the response status code (200 means success), 200

body_bytes_sent: the size of the response body sent to the client, 19939

http_referer: the page the request was linked from, "http://www.angularjs.cn/A00n"

http_user_agent: information about the client's browser, "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36"
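The field breakdown above can also be expressed as a regular expression. The following is a minimal sketch, assuming the record follows nginx's standard "combined" layout; the class name NginxLineParser and the pattern itself are illustrative assumptions, not something taken from the original project. The sample line is the record from step 1, written on one line with standard nginx spacing.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for one access-log line in the combined layout.
final class NginxLineParser {

    // One capture group per field, in the order the fields appear in the line.
    private static final Pattern COMBINED = Pattern.compile(
        "^(\\S+) - (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\d+) \"([^\"]*)\" \"([^\"]*)\"$");

    private static final String[] FIELDS = {
        "remote_addr", "remote_user", "time_local", "request",
        "status", "body_bytes_sent", "http_referer", "http_user_agent"
    };

    // Returns field name -> value in log order, or null if the line does not match.
    static Map<String, String> parse(String line) {
        Matcher m = COMBINED.matcher(line);
        if (!m.matches()) {
            return null;
        }
        Map<String, String> out = new LinkedHashMap<>();
        for (int i = 0; i < FIELDS.length; i++) {
            out.put(FIELDS[i], m.group(i + 1));
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "222.68.172.190 - - [18/Sep/2013:06:49:57 +0000] "
            + "\"GET /images/my.jpg HTTP/1.1\" 200 19939 "
            + "\"http://www.angularjs.cn/A00n\" "
            + "\"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) "
            + "Chrome/29.0.1547.66 Safari/537.36\"";
        Map<String, String> fields = parse(line);
        if (fields != null) {
            for (Map.Entry<String, String> e : fields.entrySet()) {
                System.out.println(e.getKey() + " = " + e.getValue());
            }
        }
    }
}
```

Unlike a plain split on spaces, the pattern keeps quoted fields such as the request and the user agent in one piece, and it rejects lines that do not fit the layout instead of producing shifted columns.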

3. Parse the log record above in Java (splitting on spaces)

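String line = "222.68.172.190 - - [18/Sep/2013:06:49:57 +0000] \"GET /images/my.jpg HTTP/1.1\" 200 19939 \"http://www.angularjs.cn/A00n\" \"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36\"";
// Split the record on single spaces.
String[] elementList = line.split(" ");
// The source text is cut off after the loop header; printing each token with its index is an assumed loop body.
for (int i = 0; i < elementList.length; i++) {
    System.out.println(i + " : " + elementList[i]);
}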
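Since the focus here is the browser statistic, here is a minimal sketch of how the space-split tokens could feed it. It assumes the record uses the standard nginx spacing, in which case the user agent starts at token index 11 and runs to the end of the line. The class BrowserCount and its methods are hypothetical names, and the counting is done in plain Java rather than in a Hadoop job.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that counts requests per user-agent string.
final class BrowserCount {

    // Index of the first user-agent token after line.split(" "),
    // assuming the request and referer fields contain no extra spaces.
    static final int AGENT_START = 11;

    // Rejoins the user-agent tokens and strips the surrounding quotes.
    static String userAgent(String line) {
        String[] tokens = line.split(" ");
        if (tokens.length <= AGENT_START) {
            return null; // too short to contain a user agent
        }
        String joined = String.join(" ",
            Arrays.copyOfRange(tokens, AGENT_START, tokens.length));
        return joined.replaceAll("^\"|\"$", "");
    }

    // Counts how many lines carry each user-agent string; malformed lines are skipped.
    static Map<String, Integer> countAgents(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            String agent = userAgent(line);
            if (agent != null) {
                counts.merge(agent, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String line = "222.68.172.190 - - [18/Sep/2013:06:49:57 +0000] "
            + "\"GET /images/my.jpg HTTP/1.1\" 200 19939 "
            + "\"http://www.angularjs.cn/A00n\" "
            + "\"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) "
            + "Chrome/29.0.1547.66 Safari/537.36\"";
        System.out.println(countAgents(Arrays.asList(line, line)));
    }
}
```

In a MapReduce job the same logic would typically be divided in two: the map step emits a (user agent, 1) pair per line and the reduce step sums the ones for each key. Note that the fixed index breaks if a request or referer contains a space, which is the main weakness of splitting on spaces.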
