Common ELK Tools
I. Concepts
The mutate plugin modifies the data in an event. Its operations include rename, update, replace, convert, split, gsub, uppercase, lowercase, strip, remove_field, join, and merge.
- rename: renames a field that already exists.
1 | filter { |
- update: updates the content of a field (if the field does not exist, it is not created).
1 | filter { |
- replace: works like update, except that the field is created if it does not exist.
1 | filter { |
- convert: converts a field to another data type.
1 | filter { |
- gsub: replaces text using a regular expression.
1 | filter { |
- uppercase/lowercase: converts a field to upper or lower case.
1 | filter { |
- split: splits an extracted field on a given character.
1 | filter { |
- strip: removes leading and trailing whitespace.
1 | filter { |
- remove_field: deletes fields.
1 | filter { |
- join: joins the elements of an array field into a single string, using the specified character as the separator.
1 | filter { |
- merge: merges fields.
1 | filter { |
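The operations listed above can be combined in the filter section of a pipeline. The following is a minimal sketch rather than a configuration from this post: every field name in it (`hostname`, `bytes`, `labels`, and so on) is a hypothetical placeholder. Each operation sits in its own `mutate` block because, according to the plugin documentation, the operations inside a single `mutate` block run in a fixed internal order rather than in the order they are written.

```
filter {
  # rename: "hostname" must already exist; it becomes "host_name"
  mutate { rename => { "hostname" => "host_name" } }

  # update: changes "status_text" only if the field already exists
  mutate { update => { "status_text" => "checked" } }

  # replace: like update, but creates "source" when it is missing
  mutate { replace => { "source" => "demo-%{host_name}" } }

  # convert: change the data type of "bytes" to integer
  mutate { convert => { "bytes" => "integer" } }

  # gsub: each entry is field name, regular expression, replacement
  mutate { gsub => [ "request_path", "/", "_" ] }

  # uppercase / lowercase: case conversion
  mutate { uppercase => [ "method" ] }
  mutate { lowercase => [ "user_agent" ] }

  # strip: remove leading and trailing whitespace
  mutate { strip => [ "title" ] }

  # split: turn "a,b,c" into the array ["a", "b", "c"]
  mutate { split => { "tags_csv" => "," } }

  # join: turn the array field "labels" into one ";"-separated string
  mutate { join => { "labels" => ";" } }

  # merge: append the value(s) of "extra_tags" to "all_tags"
  mutate { merge => { "all_tags" => "extra_tags" } }

  # remove_field: delete a field from the event
  mutate { remove_field => [ "tmp" ] }
}
```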
II. Usage
- Download the test data
- Extract it to `/Users/your_name/elk/ml-25m/movies.csv`
- Start an Elasticsearch instance
- Edit the Logstash configuration file `logstash.conf`
1 | input { |
- Start a Logstash instance
- Query
curl -XGET "localhost:9200/movies/_search?pretty" -H "content-type:application/json" -d '{"_source":["movieId","title"],"query":{"match":{"title":"liu*"}}}'
curl -XGET "localhost:9200/_search?pretty" -H "content-type:application/json" -d '{"_source":["movieId","title"],"query":{"match":{"title":"liu*"}}}'
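The Logstash configuration for this walkthrough is cut off in this copy, so the following is only a sketch of what such a pipeline could look like, not the original file. It assumes the `file` input, the `csv` filter and the `elasticsearch` output, the extract path given above, the `movies` index used by the first query, and the `movieId,title,genres` header of the MovieLens `movies.csv` file. The `sincedb_path` setting, the `genres` split and the `document_id` are illustrative choices.

```
input {
  file {
    path => "/Users/your_name/elk/ml-25m/movies.csv"
    start_position => "beginning"
    # Assumption: do not persist the read position, so each run re-reads the file.
    sincedb_path => "/dev/null"
  }
}

filter {
  csv {
    separator => ","
    columns => ["movieId", "title", "genres"]
    # Drop the header row instead of indexing it as a document.
    skip_header => true
  }
  mutate {
    # genres look like "Adventure|Animation|Children"; turn them into an array.
    split => { "genres" => "|" }
    # The raw CSV line is no longer needed once the columns are parsed.
    remove_field => ["message"]
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "movies"
    # Assumption: use movieId as the document ID so re-runs overwrite instead of duplicating.
    document_id => "%{movieId}"
  }
}
```

With a file like this, Logstash can be started with `bin/logstash -f logstash.conf`.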
Grok
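I. Concepts
`Grok` is a tool in the `ELK Stack` for parsing logs quickly. Used well, it greatly reduces the effort of log parsing: it turns unstructured log data into structured data that can be queried. It extracts data from log records with regular expressions, whose syntax is similar to that of `Perl` and `Ruby`. The basic syntax is `%{SYNTAX:SEMANTIC}`, where SYNTAX is the name of the pattern to match (either a built-in pattern or a custom pattern) and SEMANTIC is the name given to the matched text.
By default, the text captured under SEMANTIC is stored as a string. The extended form `%{SYNTAX:SEMANTIC:type}` specifies the data type of the matched text; currently only int and float are supported.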
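As a small illustration of the typed form, the sketch below assumes a hypothetical log line such as `GET 200 0.043` arriving in the `message` field. The field names `method`, `status` and `request_time` are made up for this example.

```
filter {
  grok {
    # "method" has no type suffix, so it stays a string.
    # "status" is stored as an integer and "request_time" as a float.
    match => {
      "message" => "%{WORD:method} %{INT:status:int} %{NUMBER:request_time:float}"
    }
  }
}
```

Without the `:int` and `:float` suffixes, all three fields would be indexed as strings, which makes range queries and numeric aggregations on them awkward.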
Type | Meaning | Regex |
---|---|---|
INT | integer | `(?:[+-]?(?:[0-9]+))` |
NUMBER | number | `(?:%{BASE10NUM})` |
DATA | arbitrary data, such as a string (non-greedy match) | `.*?` |
GREEDYDATA | arbitrary data, such as a string (greedy match) | `.*` |
WORD | word | `\b\w+\b` |
IP | IP address, v4 or v6 | `(?:%{IPV6}\|%{IPV4})` |
DATE | date | `%{DATE_US}\|%{DATE_EU}` |
TIME | time | `(?!<[0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])` |
DATESTAMP | date + time | `%{DATE}[- ]%{TIME}` |
PATH | file system path | `(?:%{UNIXPATH}\|%{WINPATH})` |
HOSTNAME | host name | `\b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?\|\b)` |
MAC | MAC address | `(?:%{CISCOMAC}\|%{WINDOWSMAC}\|%{COMMONMAC})` |
UUID | UUID | `[A-Fa-f0-9]{8}-(?:[A-Fa-f0-9]{4}-){3}[A-Fa-f0-9]{12}` |
EMAILADDRESS | email address | `%{EMAILLOCALPART}@%{HOSTNAME}` |
- Custom patterns (rarely needed; the built-in patterns are usually sufficient)
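When a built-in pattern does not fit, the grok filter accepts custom patterns through its `pattern_definitions` option, and a one-off capture can also be written inline as a named group such as `(?<queue_id>[0-9A-F]{10,11})`. The sketch below is illustrative only: the pattern name `QUEUE_ID`, its regular expression, and the field names are hypothetical.

```
filter {
  grok {
    # Hypothetical custom pattern: a 10- or 11-character hexadecimal queue ID.
    pattern_definitions => { "QUEUE_ID" => "[0-9A-F]{10,11}" }
    # The custom pattern is then used exactly like a built-in one.
    match => { "message" => "%{QUEUE_ID:queue_id}: %{GREEDYDATA:detail}" }
  }
}
```

For patterns shared across several pipelines, the filter's `patterns_dir` option points at a directory of pattern files instead.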
II. Usage
- Use a built-in pattern
1 | filter { |
To check that a grok pattern works, test it in a grok debugger.
- nginx log format

      log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

- Test file access.log

      127.0.0.1 - - [26/Apr/2017:16:29:31 +0800] "GET /demo/Demo/jquery.dump.js HTTP/1.1" 200 4482 "http://localhost/index.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36"
      127.0.0.1 - - [26/Apr/2017:16:29:31 +0800] "GET /demo/Demo/main.js HTTP/1.1" 200 3018 "http://localhost/index.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36"

- grok pattern

      %{IPORHOST:remote_addr} - %{USERNAME:remote_user} \[%{HTTPDATE:time_local}\] \"%{DATA:request}\" %{INT:status} %{NUMBER:bytes_sent} \"%{DATA:refer}\" \"%{DATA:http_user_agent}\"

      {
        "remote_addr": [["127.0.0.1"]],
        "HOSTNAME": [["127.0.0.1"]],
        "IP": [[null]],
        "IPV6": [[null]],
        "IPV4": [[null]],
        "remote_user": [["-"]],
        "time_local": [["26/Apr/2017:16:29:31 +0800"]],
        "MONTHDAY": [["26"]],
        "MONTH": [["Apr"]],
        "YEAR": [["2017"]],
        "TIME": [["16:29:31"]],
        "HOUR": [["16"]],
        "MINUTE": [["29"]],
        "SECOND": [["31"]],
        "INT": [["+0800"]],
        "request": [["GET /demo/Demo/jquery.dump.js HTTP/1.1"]],
        "status": [["200"]],
        "bytes_sent": [["4482"]],
        "BASE10NUM": [["4482"]],
        "refer": [["http://localhost/index.php"]],
        "http_user_agent": [["Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36"]]
      }

- grok pattern: `%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}`

      {
        "COMBINEDAPACHELOG": [["127.0.0.1 - - [26/Apr/2017:16:29:31 +0800] \"GET /demo/Demo/jquery.dump.js HTTP/1.1\" 200 4482 \"http://localhost/index.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36\""]],
        "COMMONAPACHELOG": [["127.0.0.1 - - [26/Apr/2017:16:29:31 +0800] \"GET /demo/Demo/jquery.dump.js HTTP/1.1\" 200 4482"]],
        "clientip": [["127.0.0.1"]],
        "HOSTNAME": [["127.0.0.1"]],
        "IP": [[null]],
        "IPV6": [[null]],
        "IPV4": [[null]],
        "ident": [["-"]],
        "USERNAME": [["-", "-"]],
        "auth": [["-"]],
        "timestamp": [["26/Apr/2017:16:29:31 +0800"]],
        "MONTHDAY": [["26"]],
        "MONTH": [["Apr"]],
        "YEAR": [["2017"]],
        "TIME": [["16:29:31"]],
        "HOUR": [["16"]],
        "MINUTE": [["29"]],
        "SECOND": [["31"]],
        "INT": [["+0800"]],
        "verb": [["GET"]],
        "request": [["/demo/Demo/jquery.dump.js"]],
        "httpversion": [["1.1"]],
        "BASE10NUM": [["1.1", "200", "4482"]],
        "rawrequest": [[null]],
        "response": [["200"]],
        "bytes": [["4482"]],
        "referrer": [["\"http://localhost/index.php\""]],
        "QUOTEDSTRING": [["\"http://localhost/index.php\"", "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36\""]],
        "agent": [["\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36\""]],
        "extra_fields": [[""]]
      }
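
To apply the first nginx pattern inside Logstash rather than in a debugger, it can be placed in a `grok` filter. This is a sketch under assumptions, not a configuration from this post: the log line is assumed to arrive in the `message` field, and the `:int` suffixes and the `date` filter are additions that are not part of the pattern as tested.

```
filter {
  grok {
    match => {
      # Same pattern as tested, with status and bytes_sent stored as integers.
      "message" => "%{IPORHOST:remote_addr} - %{USERNAME:remote_user} \[%{HTTPDATE:time_local}\] \"%{DATA:request}\" %{INT:status:int} %{NUMBER:bytes_sent:int} \"%{DATA:refer}\" \"%{DATA:http_user_agent}\""
    }
  }
  date {
    # Parse values such as "26/Apr/2017:16:29:31 +0800" into @timestamp.
    match => [ "time_local", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
```

Events that do not match the pattern are tagged `_grokparsefailure` by default, which is a convenient thing to search for when checking that the pattern fits real traffic.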