Analysis 与 Analyzer

  • Analysis - 文本分析是把全文本转换成一系列的单词(term /token)的过程,也叫分词
  • Analysis 是通过 Analyzer 来实现的
    • 可使用 Elasticsearch 内置的分析器,或按需求定制化分析器
  • 除了在数据写入时转换词条,匹配 Query 语句时也需要用相同的分析器对查询语句进行分析

Analyzer 的组成

  • 分词器是专门处理分词的组件,Analyzer 由三部分组成
    • Character Filters (针对原始文本处理,例如去除 html)
    • Tokenizer(按照规则切分为单词)
    • Token Filter (将切分的单词进行加工,小写,删除 stopwords,增加同义语)
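为了直观理解这三个阶段的先后关系,下面用一段极简的 JavaScript 示意(仅为示意,并非 ES 的实现):先由 character filter 去除 html 标签,再由 tokenizer 按空白切分,最后由 token filter 统一小写。

```javascript
// 三个阶段的极简示意(非 ES 实现)
function analyze(text) {
  const stripped = text.replace(/<[^>]+>/g, "");        // Character Filter:去除 html 标签
  const tokens = stripped.split(/\s+/).filter(Boolean); // Tokenizer:按空白切分
  return tokens.map((t) => t.toLowerCase());            // Token Filter:小写
}

console.log(analyze("<b>Hello</b> World")); // [ 'hello', 'world' ]
```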

Elasticsearch 的内置分词器

  • Standard Analyzer - 默认分词器,按词切分,小写处理
  • Simple Analyzer - 按照非字母切分(符号被过滤),小写处理
  • Stop Analyzer - 小写处理,停用词过滤(the ,a,is)
  • Whitespace Analyzer - 按照空格切分,不转小写
  • Keyword Analyzer - 不分词,直接将输入当做输出
  • Pattern Analyzer - 正则表达式分词,默认 \W+
  • Language - 提供了 30 多种常见语言的分词器
  • Custom Analyzer - 自定义分词器

Standard Analyzer

  • 默认的分词器
  • 按词切分
  • 小写处理
    #standard
    GET _analyze
    {
    "analyzer": "standard",
    "text": "2 running Quick brown-foxes leap over lazy dogs in the summer evening."
    }

Simple Analyzer

  • 按照非字母切分,非字母的都被去除
  • 小写处理
    #simple 按非字母切分,数字和符号被去除
    GET _analyze
    {
    "analyzer": "simple",
    "text": "2 running Quick brown-foxes leap over lazy dogs in the summer evening."
    }

Whitespace Analyzer

  • 空格切分
    #whitespace
    GET _analyze
    {
    "analyzer": "whitespace",
    "text": "2 running Quick brown-foxes leap over lazy dogs in the summer evening."
    }

Stop Analyzer

  • 相比 Simple Analyzer
  • 多了 stop filter
    • 会把 the、a、is、in 等修饰性词语去除
GET _analyze
{
"analyzer": "stop",
"text": "2 running Quick brown-foxes leap over lazy dogs in the summer evening."
}
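stop filter 的效果也可以用一小段 JavaScript 粗略模拟(仅为示意,非 ES 实现,停用词表也只列了其中几个):

```javascript
// 粗略模拟 stop analyzer 的效果:按非字母切分、小写,再去掉停用词
const STOPWORDS = new Set(["the", "a", "an", "is", "in"]);

function stopAnalyze(text) {
  return text
    .toLowerCase()
    .split(/[^a-z]+/)
    .filter((t) => t && !STOPWORDS.has(t));
}

console.log(stopAnalyze("The quick fox is in the box")); // [ 'quick', 'fox', 'box' ]
```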

Keyword Analyzer

  • 不分词,直接将输入当作一个 term 输出
    #keyword
    GET _analyze
    {
    "analyzer": "keyword",
    "text": "2 running Quick brown-foxes leap over lazy dogs in the summer evening."
    }

Pattern Analyzer

  • 通过正则表达式进行分词
  • 默认是 \W+,即按非单词字符(non-word characters)进行切分
    GET _analyze
    {
    "analyzer": "pattern",
    "text": "2 running Quick brown-foxes leap over lazy dogs in the summer evening."
    }
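pattern analyzer 的默认行为(\W+ 切分并小写)可以用 JavaScript 的正则粗略模拟(仅为示意,非 ES 实现):

```javascript
// 按 \W+(非单词字符)切分并小写,空串被丢弃
function patternAnalyze(text) {
  return text
    .toLowerCase()
    .split(/\W+/)
    .filter((t) => t.length > 0);
}

console.log(patternAnalyze("2 running Quick brown-foxes leap."));
// [ '2', 'running', 'quick', 'brown', 'foxes', 'leap' ]
```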

Language Analyzer

  • 各国语言分词
    #english
    GET _analyze
    {
    "analyzer": "english",
    "text": "2 running Quick brown-foxes leap over lazy dogs in the summer evening."
    }

使用 _analyze API

  • 直接指定 Analyzer 进行测试
GET _analyze
{
"analyzer": "standard",
"text" : "Mastering Elasticsearch , elasticsearch in Action"
}
//返回结果
{
"tokens" : [
{
"token" : "mastering",
"start_offset" : 0,
"end_offset" : 9,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "elasticsearch",
"start_offset" : 10,
"end_offset" : 23,
"type" : "<ALPHANUM>",
"position" : 1
},
{
"token" : "elasticsearch",
"start_offset" : 26,
"end_offset" : 39,
"type" : "<ALPHANUM>",
"position" : 2
},
{
"token" : "in",
"start_offset" : 40,
"end_offset" : 42,
"type" : "<ALPHANUM>",
"position" : 3
},
{
"token" : "action",
"start_offset" : 43,
"end_offset" : 49,
"type" : "<ALPHANUM>",
"position" : 4
}
]
}
  • 指定索引的字段进行测试
    POST books/_analyze
    {
    "field": "title",
    "text": "Mastering Elasticesearch"
    }
  • 自定义分词进行测试
POST /_analyze
{
"tokenizer": "standard",
"filter": ["lowercase"],
"text": "Mastering Elasticesearch"
}
//结果返回
{
"tokens" : [
{
"token" : "mastering",
"start_offset" : 0,
"end_offset" : 9,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "elasticesearch",
"start_offset" : 10,
"end_offset" : 24,
"type" : "<ALPHANUM>",
"position" : 1
}
]
}

专题目录

ElasticStack-安装篇
ElasticStack-elasticsearch篇
ElasticStack-logstash篇
elasticSearch-mapping相关
elasticSearch-分词器介绍
elasticSearch-分词器实践笔记
elasticSearch-同义词分词器自定义实践
docker-elk集群实践
filebeat与logstash实践
filebeat之pipeline实践
Elasticsearch 7.x 白金级 破解实践
elk的告警调研与实践

概述

在英语中,一个单词常常是另一个单词的“变形”,如:happy => happiness,这类变形可以通过提取词干(stem)来归一化,happy 叫做 happiness 的词干;而 adult => man, woman 这类则属于同义词的处理。

或者再如下面,需要达到搜索都能搜索出来,达到一定精确度。

裙子,裙
西红柿,番茄
china,中国,中华人民共和国
男生,男士,man
女生,女士,women

于是就有了需要自定义分词器解决同义词的场景。

实践自定义分词器

自定义分词器其实也就是组合

  • Character Filter
  • Tokenizer
  • Token Filter
    这三部分组合的过程。内置分词器也只是这三部分的默认组合。

Character Filters

  • 在 Tokenizer 之前对文本进行处理,例如增加删除及替换字符。可以配置多个 Character Filters。会影响 Tokenizer 的 position 和 offset 信息
  • 一些自带的 Character Filters
    • HTML strip - 去除 html 标签
    • Mapping - 字符串替换
    • Pattern replace - 正则匹配替换

Demo char_filter

html_strip 去除html标签
POST _analyze
{
"tokenizer":"keyword",
"char_filter":["html_strip"],
"text": "<b>hello world</b>"
}
//结果
{
"tokens" : [
{
"token" : "hello world",
"start_offset" : 3,
"end_offset" : 18,
"type" : "word",
"position" : 0
}
]
}
mapping 字符串替换
POST _analyze
{
"tokenizer": "standard",
"char_filter": [
{
"type" : "mapping",
"mappings" : [ "- => _"]
}
],
"text": "123-456, I-test! test-990 650-555-1234"
}
//返回
{
"tokens" : [
{
"token" : "123_456",
"start_offset" : 0,
"end_offset" : 7,
"type" : "<NUM>",
"position" : 0
},
{
"token" : "I_test",
"start_offset" : 9,
"end_offset" : 15,
"type" : "<ALPHANUM>",
"position" : 1
},
{
"token" : "test_990",
"start_offset" : 17,
"end_offset" : 25,
"type" : "<ALPHANUM>",
"position" : 2
},
{
"token" : "650_555_1234",
"start_offset" : 26,
"end_offset" : 38,
"type" : "<NUM>",
"position" : 3
}
]
}
pattern_replace 正则表达式
GET _analyze
{
"tokenizer": "standard",
"char_filter": [
{
"type" : "pattern_replace",
"pattern" : "http://(.*)",
"replacement" : "$1"
}
],
"text" : "http://www.elastic.co"
}
//返回
{
"tokens" : [
{
"token" : "www.elastic.co",
"start_offset" : 0,
"end_offset" : 21,
"type" : "<ALPHANUM>",
"position" : 0
}
]
}

Tokenizer

  • 将原始的文本按照一定的规则,切分为词(term or token)
  • Elasticsearch 内置的 Tokenizers
    • whitespace | standard | uax_url_email | pattern | keyword | path_hierarchy

Demo tokenizer

path_hierarchy 通过路径切分
POST _analyze
{
"tokenizer":"path_hierarchy",
"text":"/user/ymruan/a"
}
{
"tokens" : [
{
"token" : "/user",
"start_offset" : 0,
"end_offset" : 5,
"type" : "word",
"position" : 0
},
{
"token" : "/user/ymruan",
"start_offset" : 0,
"end_offset" : 12,
"type" : "word",
"position" : 0
},
{
"token" : "/user/ymruan/a",
"start_offset" : 0,
"end_offset" : 14,
"type" : "word",
"position" : 0
}
]
}

Token Filter

  • 将 Tokenizer 输出的单词,进行增加、修改、删除
  • 自带的 Token Filters
    • Lowercase |stop| synonym(添加近义词)

Demo filter

stop 与 snowball 过滤(whitespace 切分)
GET _analyze
{
"tokenizer": "whitespace",
"filter": ["stop","snowball"], //on the a
"text": ["The gilrs in China are playing this game!"]
}
{
"tokens" : [
{
"token" : "The", //大写的The 不做过滤
"start_offset" : 0,
"end_offset" : 3,
"type" : "word",
"position" : 0
},
{
"token" : "gilr",
"start_offset" : 4,
"end_offset" : 9,
"type" : "word",
"position" : 1
},
{
"token" : "China",
"start_offset" : 13,
"end_offset" : 18,
"type" : "word",
"position" : 3
},
{
"token" : "play",
"start_offset" : 23,
"end_offset" : 30,
"type" : "word",
"position" : 5
},
{
"token" : "game!",
"start_offset" : 36,
"end_offset" : 41,
"type" : "word",
"position" : 7
}
]
}

自定义 analyzer

  • 官网自定义分词器的标准格式
自定义分析器标准格式是:
PUT /my_index
{
"settings": {
"analysis": {
"char_filter": { ... custom character filters ... },//字符过滤器
"tokenizer": { ... custom tokenizers ... },//分词器
"filter": { ... custom token filters ... }, //词单元过滤器
"analyzer": { ... custom analyzers ... }
}
}
}
  • 自定义分词器
#定义自己的分词器
PUT my_index
{
"settings": {
"analysis": {
"analyzer": {
"my_custom_analyzer":{
"type":"custom",
"char_filter":[
"emoticons"
],
"tokenizer":"punctuation",
"filter":[
"lowercase",
"english_stop"
]
}
},
"tokenizer": {
"punctuation":{
"type":"pattern",
"pattern": "[ .,!?]"
}
},
"char_filter": {
"emoticons":{
"type":"mapping",
"mappings" : [
":) => happy",
":( => sad"
]
}
},
"filter": {
"english_stop":{
"type":"stop",
"stopwords":"_english_"
}
}
}
}
}

自定义同义词需求解决

  • synonym_graph
    我们定义一个my_synonym_filter的filter进行处理同义词,
    同时自定义自己的分词器my_custom_analyzer,并指定字段title使用my_custom_analyzer分词器
    PUT my_index
    {
    "settings": {
    "analysis": {
    "analyzer": {
    "my_custom_analyzer": {
    "type": "custom",
    "tokenizer": "standard",
    "filter": [
    "lowercase",
    "my_synonym_filter"
    ]
    }
    },
    "filter": {
    "my_synonym_filter": {
    "type": "synonym",
    "synonyms": [
    "british,english",
    "queen,monarch"
    ]
    }
    }
    }
    },
    "mappings": {
    "properties": {
    "title": {
    "type": "text",
    "analyzer": "my_custom_analyzer",
    "search_analyzer": "my_custom_analyzer"
    },
    "author": {
    "type": "keyword"
    }
    }
    }
    }

测试一下分词器的效果

POST my_index/_analyze
{
"field": "title",
"text": "Elizabeth is the English queen",
"analyzer": "my_custom_analyzer"
}

我们会发现 "Elizabeth is the English queen" 中包含 english,而我们设置了 british,english 为同义词,所以分词结果中同时包含 english 和 british(类型为 SYNONYM)

{
"tokens" : [
{
"token" : "elizabeth",
"start_offset" : 0,
"end_offset" : 9,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "is",
"start_offset" : 10,
"end_offset" : 12,
"type" : "<ALPHANUM>",
"position" : 1
},
{
"token" : "the",
"start_offset" : 13,
"end_offset" : 16,
"type" : "<ALPHANUM>",
"position" : 2
},
{
"token" : "english",
"start_offset" : 17,
"end_offset" : 24,
"type" : "<ALPHANUM>",
"position" : 3
},
{
"token" : "british",
"start_offset" : 17,
"end_offset" : 24,
"type" : "SYNONYM",
"position" : 3
},
{
"token" : "queen",
"start_offset" : 25,
"end_offset" : 30,
"type" : "<ALPHANUM>",
"position" : 4
},
{
"token" : "monarch",
"start_offset" : 25,
"end_offset" : 30,
"type" : "SYNONYM",
"position" : 4
}
]
}

我们再通过插入数据来测试一下 queen,monarch 同义词


PUT my_index/_doc/1
{
"title": "Elizabeth is the English queen"
}


GET my_index/_search
{
"query": {
"match": {
"title": "monarch"
}
}
}

可以看到,我们能通过 monarch 查询出含 queen 的数据

{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.2876821,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.2876821,
"_source" : {
"title" : "Elizabeth is the English queen"
}
}
]
}
}

自此解决同义词的需求

动态设置同义词的方案

开启热更新,仅适用于 search_analyzer
https://www.elastic.co/guide/en/elasticsearch/reference/7.x/indices-reload-analyzers.html

PUT my_index
{
"settings": {
"analysis": {
"analyzer": {
"my_custom_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"word_syn"
]
}
},
"filter": {
"word_syn": {
"type": "synonym_graph",
"synonyms_path": "analysis/synonym.txt",
"updateable": true // 开启热更新
}
}
}
},
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "english",
"search_analyzer": "my_custom_analyzer"
},
"author": {
"type": "keyword"
}
}
}
}

热更新重载分词器

POST my_index/_reload_search_analyzers


概述

1、为什么明明有包含搜索关键词的文档,但结果里面就没有相关文档呢?
2、我存进去的文档到底被分成哪些词(term)了?

带着这些问题,我们来实践一下

实践

测试一下

让我们从一个实例出发,如下创建一个文档:

PUT test/_doc/1
{
"msg":"Eating an apple a day keeps doctor away"
}

然后我们做一个查询,我们试图通过搜索 eat这个关键词来搜索这个文档

POST test/_search
{
"query":{
"match":{
"msg":"eat"
}
}
}

ES的返回结果为0。这不太对啊,我们用最基本的字符串查找也应该能匹配到上面新建的文档才对啊!

分词原理

搜索引擎的核心是倒排索引,而倒排索引的基础就是分词。所谓分词可以简单理解为将一个完整的句子切割为一个个单词的过程。在 es 中单词对应英文为 term。我们简单看个例子:

ES 的倒排索引即是根据分词后的单词(term)创建的。这也意味着你在搜索的时候,只能用这些分词后的单词才能命中该文档。

实际上 ES 的分词不仅仅发生在文档创建的时候,也发生在搜索的时候:


读时分词发生在用户查询时,ES 会即时地对用户输入的关键词进行分词,分词结果只存在内存中,当查询结束时,分词结果也会随即消失。而写时分词发生在文档写入时,ES 会对文档进行分词后,将结果存入倒排索引,该部分最终会以文件的形式存储于磁盘上,不会因查询结束或者 ES 重启而丢失。

ES 中处理分词的部分被称作分词器,英文是Analyzer,它决定了分词的规则。ES 自带了很多默认的分词器,比如Standard、 Keyword、Whitespace等等,默认是 Standard。当我们在读时或者写时分词时可以指定要使用的分词器。

写时分词

回到上手阶段,我们来看下写入的文档最终分词结果是什么。通过如下 api 可以查看:

POST test/_analyze
{
"field": "msg",
"text": "Eating an apple a day keeps doctor away"
}

其中 test为索引名,_analyze 为查看分词结果的 endpoint,请求体中 field 为要查看的字段名,text为具体值。该 api 的作用就是请告诉我在 test 索引使用 msg 字段存储一段文本时,es 会如何分词。

返回结果如下:

{
"tokens": [
{
"token": "eating",
"start_offset": 0,
"end_offset": 6,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "an",
"start_offset": 7,
"end_offset": 9,
"type": "<ALPHANUM>",
"position": 1
},
{
"token": "apple",
"start_offset": 10,
"end_offset": 15,
"type": "<ALPHANUM>",
"position": 2
},
{
"token": "a",
"start_offset": 16,
"end_offset": 17,
"type": "<ALPHANUM>",
"position": 3
},
{
"token": "day",
"start_offset": 18,
"end_offset": 21,
"type": "<ALPHANUM>",
"position": 4
},
{
"token": "keeps",
"start_offset": 22,
"end_offset": 27,
"type": "<ALPHANUM>",
"position": 5
},
{
"token": "doctor",
"start_offset": 28,
"end_offset": 34,
"type": "<ALPHANUM>",
"position": 6
},
{
"token": "away",
"start_offset": 35,
"end_offset": 39,
"type": "<ALPHANUM>",
"position": 7
}
]
}

返回结果中的每一个 token即为分词后的每一个单词,我们可以看到这里是没有 eat 这个单词的,这也解释了在上手中我们搜索 eat 没有结果的情况。如果你去搜索 eating ,会有结果返回。

写时分词器需要在 mapping 中指定,而且一经指定就不能再修改,若要修改必须新建索引。如下所示我们新建一个名为 msg_english 的字段,指定其分词器为 english:

PUT test/_mapping
{
"properties": {
"msg_english":{
"type":"text",
"analyzer": "english"
}
}
}

读时分词

由于读时分词器默认与写时分词器默认保持一致,拿 上手 中的例子,你搜索 msg 字段,那么读时分词器为 Standard ,搜索 msg_english 时分词器则为 english。这种默认设定也是非常容易理解的,读写采用一致的分词器,才能尽最大可能保证分词的结果是可以匹配的。
然后 ES 允许读时分词器单独设置,如下所示:

POST test/_search
{
"query":{
"match":{
"msg":{
"query": "eating",
"analyzer": "english"
}
}
}
}

如上 analyzer 字段即可以自定义读时分词器,一般来讲不需要特别指定读时分词器。

如果不单独设置分词器,那么读时分词器的验证方法与写时一致;如果是自定义分词器,那么可以使用如下的 api 来自行验证结果。

POST _analyze
{
"text":"eating",
"analyzer":"english"
}
{
"tokens": [
{
"token": "eat",
"start_offset": 0,
"end_offset": 6,
"type": "<ALPHANUM>",
"position": 0
}
]
}

由上可知 english分词器会将 eating处理为 eat,而standard则不作处理,即eating处理为eating。

通过上面的分析,我们就知道为什么查不出数据了:写入时用 standard 分词得到的是 eating,查询 eat 自然无法命中。

解决问题

由于 eating只是 eat的一个变形,我们依然希望输入 eat时可以匹配包含 eating的文档,那么该如何解决呢?
答案很简单,既然原因是在分词结果不匹配,那么我们就换一个分词器呗~ 我们可以先试下 ES 自带的 english分词器,如下:

# 增加字段 msg_english,与 msg 做对比
PUT test/_mapping
{
"properties": {
"msg_english":{
"type":"text",
"analyzer": "english"
}
}
}

# 写入相同文档
PUT test/_doc/1
{
"msg":"Eating an apple a day keeps doctor away",
"msg_english":"Eating an apple a day keeps doctor away"
}

# 搜索 msg_english 字段
POST test/_search
{
"query": {
"match": {
"msg_english": "eat"
}
}
}

执行上面的内容,我们会发现结果有内容了,原因也很简单:english 分词器会将 eating 分词为 eat,此时我们搜索 eat 或者 eating 肯定都可以匹配对应的文档了。至此,需求解决。

自定义分词器

假设有一天产品提了一个需求:西红柿,番茄 的同义词都要能搜索出来,这个时候我们就可以自定义分词器。

专题目录

ElasticStack-安装篇
ElasticStack-elasticsearch篇
ElasticStack-logstash篇
elasticSearch-mapping相关
elasticSearch-分词器介绍
elasticSearch-分词器实践笔记
elasticSearch-同义词分词器自定义实践
docker-elk集群实践
filebeat与logstash实践
filebeat之pipeline实践
Elasticsearch 7.x 白金级 破解实践
elk的告警调研与实践

xyjcvD

概述

最近有需求要将商品同步到es,并做pv、uv的宽表处理,评估一下Flink CDC的能力,实践一下Flink CDC的同步功能。
本次目标是将mysql和pg的表数据,在es进行宽表处理。

  • 本次模拟场景 产品表products、订单表orders在mysql数据库,物流表shipments在postgres数据库,最终的宽表enriched_orders在elasticsearch.

    模拟电商公司的订单表和物流表,需要对订单数据进行统计分析,对于不同的信息需要进行关联后续形成订单的大宽表后,交给下游的业务方使用 ES 做数据分析,这个案例演示了如何只依赖 Flink 不依赖其他组件,借助 Flink 强大的计算能力实时把 Binlog 的数据流关联一次并同步至 ES 。

版本

  • Apache Flink 1.13.1
  • flink-sql-connector-elasticsearch7_2.11-1.13.0.jar
  • flink-sql-connector-mysql-cdc-1.4.0.jar
  • flink-sql-connector-postgres-cdc-1.4.0.jar
  • java8

相关链接

安装

docker方式安装

FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager"
docker network create flink-network

JobManager

docker run \
--rm \
--name=jobmanager \
--network flink-network \
--publish 8081:8081 \
--env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
flink:1.13.1-scala_2.11 jobmanager

TaskManager

docker run \
--rm \
--name=taskmanager \
--network flink-network \
--env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
flink:1.13.1-scala_2.11 taskmanager

下载方式安装

https://flink.apache.org/zh/downloads.html

下载最新版本即可。

# 停止集群
bin/stop-cluster.sh
# 启动集群
bin/start-cluster.sh
# 进入 flink sql 客户端命令行界面
bin/sql-client.sh embedded
# 查看当前运行的 jobs
bin/flink list
# 查看所有的任务,包括失败、成功、取消的
bin/flink list -a
# 取消任务
bin/flink cancel <jobID>

实践

1、下载 docker-compose.yml

先使用 docker 构造一些 mysql 和 pg 的数据

version: '2.1'
services:
postgres:
image: debezium/example-postgres:1.1
ports:
- "5432:5432"
environment:
- POSTGRES_DB=postgres
- POSTGRES_USER=postgres
- POSTGRES_PASSWORD=postgres
mysql:
image: debezium/example-mysql:1.1
ports:
- "3306:3306"
environment:
- MYSQL_ROOT_PASSWORD=123456
- MYSQL_USER=mysqluser
- MYSQL_PASSWORD=mysqlpw
elasticsearch:
image: elastic/elasticsearch:7.6.0
environment:
- cluster.name=docker-cluster
- bootstrap.memory_lock=true
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
- discovery.type=single-node
ports:
- "9200:9200"
- "9300:9300"
ulimits:
memlock:
soft: -1
hard: -1
nofile:
soft: 65536
hard: 65536
kibana:
image: elastic/kibana:7.6.0
ports:
- "5601:5601"
zookeeper:
image: wurstmeister/zookeeper:3.4.6
ports:
- "2181:2181"
kafka:
image: wurstmeister/kafka:2.12-2.2.1
ports:
- "9092:9092"
- "9094:9094"
depends_on:
- zookeeper
environment:
- KAFKA_ADVERTISED_LISTENERS=INSIDE://:9094,OUTSIDE://localhost:9092
- KAFKA_LISTENERS=INSIDE://:9094,OUTSIDE://:9092
- KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
- KAFKA_INTER_BROKER_LISTENER_NAME=INSIDE
- KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
- KAFKA_CREATE_TOPICS="user_behavior:1:1"
volumes:
- /var/run/docker.sock:/var/run/docker.sock

2、进入 mysql 容器,初始化数据:

docker-compose exec mysql mysql -uroot -p123456
-- MySQL
CREATE DATABASE mydb;
USE mydb;
CREATE TABLE products (
id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255) NOT NULL,
description VARCHAR(512)
);
ALTER TABLE products AUTO_INCREMENT = 101;

INSERT INTO products
VALUES (default,"scooter","Small 2-wheel scooter"),
(default,"car battery","12V car battery"),
(default,"12-pack drill bits","12-pack of drill bits with sizes ranging from #40 to #3"),
(default,"hammer","12oz carpenter's hammer"),
(default,"hammer","14oz carpenter's hammer"),
(default,"hammer","16oz carpenter's hammer"),
(default,"rocks","box of assorted rocks"),
(default,"jacket","water resistent black wind breaker"),
(default,"spare tire","24 inch spare tire");

CREATE TABLE orders (
order_id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
order_date DATETIME NOT NULL,
customer_name VARCHAR(255) NOT NULL,
price DECIMAL(10, 5) NOT NULL,
product_id INTEGER NOT NULL,
order_status BOOLEAN NOT NULL -- 是否下单
) AUTO_INCREMENT = 10001;

INSERT INTO orders
VALUES (default, '2020-07-30 10:08:22', 'Jark', 50.50, 102, false),
(default, '2020-07-30 10:11:09', 'Sally', 15.00, 105, false),
(default, '2020-07-30 12:00:30', 'Edward', 25.25, 106, false);


3、进入postgres 容器,初始化数据:

docker-compose exec postgres psql -h localhost -U postgres
-- PG
CREATE TABLE shipments (
shipment_id SERIAL NOT NULL PRIMARY KEY,
order_id SERIAL NOT NULL,
origin VARCHAR(255) NOT NULL,
destination VARCHAR(255) NOT NULL,
is_arrived BOOLEAN NOT NULL
);
ALTER SEQUENCE public.shipments_shipment_id_seq RESTART WITH 1001;
ALTER TABLE public.shipments REPLICA IDENTITY FULL;

INSERT INTO shipments
VALUES (default,10001,'Beijing','Shanghai',false),
(default,10002,'Hangzhou','Shanghai',false),
(default,10003,'Shanghai','Hangzhou',false);

4、下载以下 jar 包到 <FLINK_HOME>/lib/,查看上方相关链接下载。

5、进入 Flink SQL CLI(bin/sql-client.sh embedded),创建源表和宽表:

--FlinkSQL
CREATE TABLE products (
id INT,
name STRING,
description STRING
) WITH (
'connector' = 'mysql-cdc',
'hostname' = 'localhost',
'port' = '3306',
'username' = 'root',
'password' = '123456',
'database-name' = 'mydb',
'table-name' = 'products'
);

CREATE TABLE orders (
order_id INT,
order_date TIMESTAMP(0),
customer_name STRING,
price DECIMAL(10, 5),
product_id INT,
order_status BOOLEAN
) WITH (
'connector' = 'mysql-cdc',
'hostname' = 'localhost',
'port' = '3306',
'username' = 'root',
'password' = '123456',
'database-name' = 'mydb',
'table-name' = 'orders'
);

CREATE TABLE shipments (
shipment_id INT,
order_id INT,
origin STRING,
destination STRING,
is_arrived BOOLEAN
) WITH (
'connector' = 'postgres-cdc',
'hostname' = 'localhost',
'port' = '5432',
'username' = 'postgres',
'password' = 'postgres',
'database-name' = 'postgres',
'schema-name' = 'public',
'table-name' = 'shipments'
);

CREATE TABLE enriched_orders (
order_id INT,
order_date TIMESTAMP(0),
customer_name STRING,
price DECIMAL(10, 5),
product_id INT,
order_status BOOLEAN,
product_name STRING,
product_description STRING,
shipment_id INT,
origin STRING,
destination STRING,
is_arrived BOOLEAN,
PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
'connector' = 'elasticsearch-7',
'hosts' = 'http://localhost:9200',
'index' = 'enriched_orders'
);

INSERT INTO enriched_orders
SELECT o.*, p.name, p.description, s.shipment_id, s.origin, s.destination, s.is_arrived
FROM orders AS o
LEFT JOIN products AS p ON o.product_id = p.id
LEFT JOIN shipments AS s ON o.order_id = s.order_id;

6、修改 mysql 和 postgres 里面的数据,观察 elasticsearch 里的结果。

--MySQL
INSERT INTO orders
VALUES (default, '2020-07-30 15:22:00', 'Jark', 29.71, 104, false);

--PG
INSERT INTO shipments
VALUES (default,10004,'Shanghai','Beijing',false);

--MySQL
UPDATE orders SET order_status = true WHERE order_id = 10004;

--PG
UPDATE shipments SET is_arrived = true WHERE shipment_id = 1004;

--MySQL
DELETE FROM orders WHERE order_id = 10004;

期间遇到的问题

https://github.com/ververica/flink-cdc-connectors/issues/197

概述

在使用 nestjs 梳理 RBAC 架构,顺便把前端框架升到 vue3,详细看了下 Vue3 Composition API,有点意思。

Composition API

1.ref

  • ref可以代理字符串、数字、boolean等基本类型值
  • ref声明的值需要通过.value去改变
  • ref目的是为了引用原始类型值,但仍然可以引用非基本类型值例如对象
  • ref 本质也是reactive 可以简单地把 ref(1) 理解为这个样子 reactive({value: 1})
    //用法一 代理基本类型
    import { ref } from 'vue'

    const refVal = ref(1)

    const add = () => {
    refVal.value++ //值改变,视图更新
    }

    //用法二
    const refObj = ref({ foo: 1 })

    const add = () => {
    //需要通过.value去访问
    refObj.value.foo = 2 //值改变,触发视图更新
    }

    //用法三
    //可以通过ref代理某个对象下面的值,复制修改响应式数据不影响原对象
    const obj = { foo: 1 }

    const refVal = ref(obj.foo)

    const add = () => {
    refVal.value++
    console.log(refVal.value) //值改变,视图更新
    console.log(obj.foo) //值不变
    }
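把 ref(1) 理解为 reactive({value: 1}) 这一点,可以用一个去掉了依赖收集的极简 miniRef 来示意(仅为示意,并非 Vue 源码):

```javascript
// 极简示意:ref 本质上是一个带 value 访问器的对象
// 真实实现会在 get 中收集依赖、在 set 中触发视图更新
function miniRef(initial) {
  let _value = initial;
  return {
    get value() { return _value; },
    set value(v) { _value = v; },
  };
}

const r = miniRef(1);
r.value++;
console.log(r.value); // 2
```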


2.reactive

  • reactive接受一个可代理的对象,但不能是字符串、数字、boolean等基本类型值
  • reactive不需要通过.value去访问属性值
import { reactive } from 'vue'

const obj = reactive({ foo: 1 })

const add = () => {
obj.foo++ //值改变,视图更新
}

3.shallowReactive

  • 用法同reactive,用于定义一个浅响应数据只代理第一层,当数据结构比较复杂时,每层都用proxy代理消耗性能
import { shallowReactive } from 'vue'

const obj = shallowReactive({ foo: { bar: 1 } })

const add = () => {
obj.foo.bar = 2 //值改变,不触发视图更新
obj.foo = { bar: 2 } //值改变,视图更新
}

4.readonly

  • 用于定义一个只可读数据,接受一个Object对象
import { readonly } from 'vue'

const obj = readonly({ text: 'hi' })

const add = () => {
obj.text = 'hello' //报错
}

5.shallowReadonly

import { shallowReadonly } from 'vue'

const obj = shallowReadonly({ foo: { bar: 1 } })

const add = () => {
obj.foo = { bar: 2 } //报错
obj.foo.bar = 2 //有效
}

6.toRef

创建一个ref类型数据, 并和以前的数据关联
相当于引用, 修改响应式数据会影响原始数据
第一个参数为 obj 对象;第二个参数为对象中的属性名

应用场景:如果想让响应式数据和原始的数据关联起来, 并且更新响应式数据之后还不想更新UI, 那么就可以使用toRef

//数据发生改变, 视图也不会自动更新

const obj = { foo: 1 }

//const obj = reactive({ foo: 1}) //reactive创建的对象会触发视图更新

const refVal = toRef(obj, 'foo')

const add = () => {
refVal.value++
console.log(refVal.value) //值改变,视图不更新
console.log(obj.foo) //值改变,视图不更新
}

7.toRefs

//用法一

<template>
<p @click="add">{{ foo }}</p>
</template>

import { reactive, toRefs } from 'vue'

const obj = reactive({ foo: 1 })

return{
...toRefs(obj) //将obj里的每个属性转化为ref响应式数据
}

//用法二

//批量创建ref类型数据, 并和以前数据关联,不触发视图更新

import { reactive, toRefs } from 'vue'

const obj = { foo: 1, num: 1 }
//const obj = reactive({ foo: 1, num: 1 }) //reactive创建的对象会触发视图更新

const state = toRefs(obj)

const add = () => {
state.foo.value = 2
state.num.value = 2
console.log(state.foo.value) // 2 值改变,视图不更新
console.log(obj.foo) // 2 值改变,视图不更新
}

8.shallowRef

这是一个浅层的 ref ,只代理.value 可用于优化性能


import { shallowRef, triggerRef } from 'vue'

const obj = shallowRef({ foo: 1 })

//shallowRef只代理 ref 对象本身,也就是说只有 .value 是被代理的,而 .value 所引用的对象并没有被代理

const add = () => {
obj.value.foo = 2 //值改变,视图不更新
triggerRef(obj) // 可通过修改值后立即驱动视图更新

obj.value = { foo: 2 } //值改变,视图更新

}

9.unref

unref接收一个值,如果这个值是 ref 就返回 .value,否则原样返回
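unref 的行为大致等价于 isRef(val) ? val.value : val,可以用下面的小例子示意(__v_isRef 标识在后文 isRef 一节也会提到;仅为示意,并非 Vue 源码):

```javascript
// unref 行为示意:是 ref 就取 .value,否则原样返回
function miniUnref(val) {
  return val && val.__v_isRef === true ? val.value : val;
}

console.log(miniUnref(5)); // 5
console.log(miniUnref({ __v_isRef: true, value: 7 })); // 7
```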

10.markRaw

markRaw 方法可以将原始数据标记为非响应式的,即使用 ref 或 reactive 将其包装,仍无法实现数据响应式,其接收一个参数,即原始数据,并返回被标记后的数据

markRaw 函数所做的事情,就是在数据对象上定义 __v_skip 属性,从而跳过代理

import { markRaw, reactive } from 'vue'

//通过 markRaw 标记过的对象,即使再用 reactive 包装,也不会触发视图更新
//markRaw 可用于数据改变但不需要视图更新的情况,用于提升性能。
const obj = { foo: 1 }
const obj2 = markRaw(obj)
const state = reactive(obj2)

const add = () => {
state.foo = 2 //值改变,视图不更新
console.log(state) //2
console.log(obj) //2
}

11.toRaw

toRaw 方法用于拿到原始数据,对原始数据进行修改,不会更新 UI 界面,
与 markRaw() 方法类似可用于提升性能。不同的是,markRaw 接收的是未被代理过的原始数据,
而 toRaw 是用于获取 ref 或 reactive 对象的原始数据的。

import { toRaw, reactive, ref } from 'vue'

//代理reactive对象
const obj = reactive({ foo: 1 })
const obj2 = toRaw(obj)

const add = () => {
obj2.foo = 2 //值改变,视图不会更新
}

//代理ref创建的对象

const obj = ref({ foo: 1 })
const obj2 = toRaw(obj.value) //与reactive不同的是,需要用.value去获取原始数据,因为经过Vue处理之后,.value中保存的才是当初创建时传入的那个原始数据

const add = () => {
obj2.foo = 2 //值改变,视图不会更新
console.log(obj.value) //输出 { foo: 2 }
}

12.isRef

用于判断数据是否是ref创建的,Vue3创建ref的时候会增加__v_isRef: true属性来标识ref数据

import { ref, isRef } from 'vue'

const val = ref(1)

console.log(isRef(val)) //true

13.isReactive

判断数据对象是否是 reactive

import { reactive, isReactive } from 'vue'

const obj = reactive({ foo: 1 })

console.log(isReactive(obj)) //true

14.isReadonly

判断数据对象是否是readonly只可读

import { readonly, isReadonly } from 'vue'

const val = readonly({ foo: 1 })

console.log(isReadonly(val)) //true

15.isProxy

用于判断对象是否是reactive 或 readonly 创建的代理对象

import { readonly, reactive, isProxy } from 'vue'

const obj = reactive({ foo: 1 })
const val = readonly({ foo: 1 })

console.log(isProxy(obj)) //true
console.log(isProxy(val)) //true

16.computed

用法与Vue2中的computed一样

import { ref, computed } from 'vue'

//写法一
const val = ref(1)

//vue3 中计算属性如果只传入一个回调函数,表示的是 get
const double = computed(() => val.value * 2)

const add = () => {
val.value = 3
console.log(double.value) // 6 需要通过 .value 去访问
}

//写法二:传入包含 get 和 set 的对象
<template>
<input type="text" v-model="double">
</template>

const val = ref(1)

const double = computed({
get() {
//double 的返回值
return val.value * 2
},
set(value) {
//写你的逻辑代码
val.value = value / 2
}
})

17.watch

watch监听数据变化,需手动传入监听的数据,返回新值和旧值
与vue2不同的是 vue2需要通过computed计算才会返回新值和旧值,否则返回的都是新值。

import { ref, reactive, watch } from 'vue'
//监听ref创建的数据

const val = ref(0)

watch(val,(newVal, oldVal) => {
console.log(newVal) //1 输出新值
console.log(oldVal) //0 输出旧值
},
{
immediate: false, //是否在初始化监听
deep: false //是否开启深度监听
}
)
const add = () => {
val.value = 1 //值改变
}

//监听reactive创建的数据,与ref不同的是需要用箭头函数指向要监听的数据
const obj = reactive({ foo: 0 })

watch(() => obj.foo,(newVal, oldVal) => {
console.log(newVal) //1 输出新值
console.log(oldVal) //0 输出旧值
},
{
immediate: false, //是否在初始化监听
deep: false //是否开启深度监听
}
)

const add = () => {
obj.foo = 1 //值改变
}


//监听多个数据源
const val = ref(1)

const obj = reactive({ foo: 1 })

watch([() => obj.foo, val], ([newFoo, newVal], [oldFoo, oldVal]) => {
console.log(newFoo, oldFoo)
console.log(newVal, oldVal)
})

const add = () => {
val.value += 2
obj.foo++
}

//watch 会返回一个停止函数,调用它即可停止监听


18.watchEffect

watchEffect也是监听数据变化。
与watch不同的是:
1.不需要手动传入依赖
2.每次初始化都会执行
3.无法获取到原值,只能得到变化后的值

import { ref, reactive, watchEffect } from 'vue'

const obj = reactive({ foo: 1 })

const val = ref(0)

watchEffect(() => {
console.log(obj.foo)
console.log(val.value)
})

const add = () => {
val.value++
obj.foo++
}

//watchEffect还接受一个函数作为参数 ,可用于清除副作用
watchEffect(async () => {
const data = await fetch(obj.foo)
})

//当 obj.foo 变化后,意味着将会再次发送请求,那么之前的请求怎么办呢?是否应该将之前的请求标记为 invalidate

watchEffect(async (onInvalidate) => {
let validate = true
onInvalidate(() => {
validate = false
})
const data = await fetch(obj.foo)
if (validate){
/* 正常使用 data */
} else {
/* 说明当前副作用已经无效了,抛弃即可 */
}
})

19.defineComponent & PropType

两者都是为了更好的推断TS类型

import { defineComponent, PropType } from 'vue'

interface Mylist {
name: string
age: number
}

export default defineComponent({
props: {
list: Object as PropType<Mylist[]>
},

setup(){}
})

20.生命周期函数

import {
onBeforeMount,
onMounted,
onBeforeUpdate,
onUpdated,
onBeforeUnmount,
onUnmounted,
onActivated,
onDeactivated,
onErrorCaptured,
//vue3 新增的两个钩子
onRenderTracked,
onRenderTriggered
} from 'vue'

export default {
setup() {
onRenderTriggered((e) => {
debugger
// 检查哪个依赖项导致组件重新渲染
})
}
}


21.customRef

自定义 ref,常用来定义需要异步获取的响应式数据

//可以用customRef实现一个搜索框防抖

<template>
<input type="text" v-model="text">
</template>


import { customRef, watch } from 'vue'

const useDebouncedRef = (value: string, delay = 1000) => {
let timeout: any
/**
* The customRef callback receives two arguments:
* track, which tracks dependencies,
* trigger, which triggers updates.
* The callback must return an object with get and set methods.
*/
return customRef((track, trigger) => {
return {
get() {
track() //track this value as a dependency
return value
},
set(newVal: string) {
clearTimeout(timeout)
timeout = setTimeout(() => {
value = newVal
trigger() // the value changed; update the UI
}, delay)
}
}
})
}
const text = useDebouncedRef('')

watch(text, async (newText) => {
if (!newText) return void 0
console.log(newText) //logs 1 second after typing stops
})

return{ text }

22.defineProps & defineEmits

//used directly inside <script setup> to receive props and emits, with TS type inference
<script setup lang="ts">
import { defineProps, defineEmits } from 'vue'

const props = defineProps<{
foo: number,
age?: number,
}>()

//when there are several events with different payloads
const emit = defineEmits<{
(e: 'close', id: number): void
(e: 'show', name: string, age: number): void
}>()

emit('close', 1)
emit('show', '1', 2)

//when the events share the same payload
const emit = defineEmits<(e: 'close' | 'show', id: number) => void>()

emit('close', 1)
emit('show', 1)
</script>

23.defineAsyncComponent

//used to load a component asynchronously

<script setup lang="ts">
import { defineAsyncComponent } from 'vue'
const AsyncShow = defineAsyncComponent(
() => import('@/components/AsyncShow.vue')
)
</script>


24.script vars

Supports injecting component-state-driven CSS variables into Single File Component styles.

<template>
<!-- the color is driven by script state -->
<p class="text">hello</p>
</template>
<script setup lang="ts">
const color = '#3b6af9'
</script>

<style lang="scss" scoped>
.text {
color: v-bind(color);
}
</style>


25.provide && inject

These work the same as provide and inject in Vue 2, except that in Vue 3 they must be imported from vue by hand.
A quick recap of what the two methods do:

provide: passes data down to child and descendant components. It takes two arguments: the first is the key, i.e. the data's name; the second is the value, i.e. the data itself.
inject: receives data passed down from a parent or ancestor component. It takes one argument, the key, i.e. the name under which the ancestor provided the data.

// A.vue
<script>
import {provide} from 'vue'
export default {
setup() {
const obj= {
name: '前端印象',
age: 22
}
// provide the data named 'info' to child and descendant components
provide('info', obj)
}
}
</script>
// B.vue
<script>
import {inject} from 'vue'
export default {
setup() {
// receive the data passed down from A.vue
inject('info') // {name: '前端印象', age: 22}
}
}
</script>
// C.vue
<script>
import {inject} from 'vue'
export default {
setup() {
// receive the data passed down from A.vue
inject('info') // {name: '前端印象', age: 22}
}
}
</script>

26.getCurrentInstance

Gets the current component instance, the counterpart of `this` in Vue 2; usable only inside the setup function (not recommended).

import { getCurrentInstance } from 'vue'
const { ctx } = getCurrentInstance()
console.log(ctx)

27.vue-router hooks

import { useRoute, useRouter } from 'vue-router'
const route = useRoute()
const router = useRouter()

console.log(route.params.id)
router.push('/xxx/xxx')

28.vuex hooks

import { useStore } from 'vuex'

const store = useStore()


Installing with docker-compose

version: '3.0'
services:
  es01:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.13.0
    container_name: es01
    environment:
      - node.name=es01
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es02,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms2048m -Xmx2048m"
      - TZ=Asia/Shanghai
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - ./elasticsearch/analysis/synonym.txt:/usr/share/elasticsearch/config/analysis/synonym.txt
      - ./elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - ./elasticsearch/config/certs:/usr/share/elasticsearch/config/certs
      - ./elasticsearch/config/crack/x-pack-core-7.13.0.jar:/usr/share/elasticsearch/modules/x-pack-core/x-pack-core-7.13.0.jar
      - ./elasticsearch/data01:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
  es02:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.13.0
    container_name: es02
    environment:
      - node.name=es02
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms2048m -Xmx2048m"
      - TZ=Asia/Shanghai
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - ./elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - ./elasticsearch/config/certs:/usr/share/elasticsearch/config/certs
      - ./elasticsearch/config/crack/x-pack-core-7.13.0.jar:/usr/share/elasticsearch/modules/x-pack-core/x-pack-core-7.13.0.jar
      - ./elasticsearch/data02:/usr/share/elasticsearch/data
  es03:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.13.0
    container_name: es03
    environment:
      - node.name=es03
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es02
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms2048m -Xmx2048m"
      - TZ=Asia/Shanghai
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - ./elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - ./elasticsearch/config/certs:/usr/share/elasticsearch/config/certs
      - ./elasticsearch/config/crack/x-pack-core-7.13.0.jar:/usr/share/elasticsearch/modules/x-pack-core/x-pack-core-7.13.0.jar
      - ./elasticsearch/data03:/usr/share/elasticsearch/data

  kibana:
    image: docker.elastic.co/kibana/kibana:7.13.0
    container_name: kibana
    restart: always
    ports:
      - 5601:5601
    volumes:
      - ./kibana/config:/usr/share/kibana/config
    environment:
      I18N_LOCALE: zh-CN
      ELASTICSEARCH_URL: https://es01:9200
      ELASTICSEARCH_HOSTS: '["https://es01:9200","https://es02:9200","https://es03:9200"]'

  ent-search:
    image: docker.elastic.co/enterprise-search/enterprise-search:7.13.0
    container_name: ent-search
    environment:
      - "JAVA_OPTS=-Xms2048m -Xmx2048m"
    volumes:
      - ./enterprise-search/config/enterprise-search.yml:/usr/share/enterprise-search/config/enterprise-search.yml
      - ./enterprise-search/config/certs:/usr/share/enterprise-search/config/certs
    ports:
      - 3002:3002

  cerebro:
    image: lmenezes/cerebro:0.9.4
    container_name: cerebro
    restart: always
    ports:
      - 8900:9000
    command:
      - -Dhosts.0.host=https://es01:9200
      - -Dplay.ws.ssl.loose.acceptAnyCertificate=true

  es-head:
    image: mobz/elasticsearch-head:5
    container_name: es-head
    restart: always
    ports:
      - 9100:9100

networks:
  default:
    external:
      name: dakewe

Setting up authorized, encrypted access with X-Pack

As Elasticsearch requires, if we enable xpack.security.enabled in a Docker environment we must also enable xpack.security.transport.ssl.enabled. Otherwise we will see the following error:

[1]:Transport SSL must be enabled if security is enabled on a [basic] license. Please set [xpack.security.transport.ssl.enabled] to [true] or disable security by setting [xpack.security.enabled] to [false]

Next we configure authorized, encrypted access for Elasticsearch 7.13.0. The steps below are all required, so read on carefully. X-Pack is an Elasticsearch plugin that provides security for traffic to and from Elasticsearch. With it installed, we can generate certificates for the cluster nodes, configure service access passwords, and use TLS to ensure that communication between HTTP clients and the cluster is encrypted.

docker exec -it es01 bash

Once inside the container, change to the working directory (/usr/share/elasticsearch) and create a certificate authority for the Elasticsearch cluster. The elasticsearch-certutil command outputs a PKCS#12 keystore file, named elastic-stack-ca.p12 by default, containing the CA's public certificate and the private key used to sign each node's certificate.

cd /usr/share/elasticsearch
bin/elasticsearch-certutil ca

Then generate a certificate with the following command:

bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12

The command above uses our CA to generate the certificate elastic-certificates.p12.
Run exit to leave the container, then move the elastic-certificates.p12 certificate into the ./elasticsearch/config/certs folder:

docker cp es01:/usr/share/elasticsearch/elastic-certificates.p12 ./elasticsearch/config/certs
sudo chmod -R 777 ./elasticsearch/config
docker-compose down

Configure the certificate mounts in docker-compose.yaml.
Don't forget to stop the services with docker-compose down, because we are about to change the configuration.

Next, modify config/elasticsearch.yml to enable authenticated, encrypted access.

cluster.name: "docker-cluster"
network.host: 0.0.0.0

http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-headers: Authorization,X-Requested-With,Content-Length,Content-Type


xpack.license.self_generated.type: basic

xpack.security.enabled: true

# Transport layer: the transport protocol is used for internal communication between Elasticsearch nodes
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: certs/elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: certs/elastic-certificates.p12

# HTTP layer: communication from clients to the Elasticsearch cluster
xpack.security.authc.api_key.enabled: true
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: certs/elastic-certificates.p12
xpack.security.http.ssl.truststore.path: certs/elastic-certificates.p12
xpack.security.http.ssl.verification_mode: certificate

xpack.monitoring.collection.enabled: false

For verification_mode we choose certificate: this mode does not check the certificate's CN, it only verifies that the certificate is signed by a trusted authority. If we do need that verification, and IPs are configured in the certificates, change the mode to full.

If the certificates are in PEM format, use the configuration below instead:

xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.key: /home/es/config/node01.key
xpack.security.transport.ssl.certificate: /home/es/config/node01.crt
xpack.security.transport.ssl.certificate_authorities: [ "/home/es/config/ca.crt" ]

Setting the accounts and passwords for authorized access

Start up again, enter the container with docker exec -it es01 bash, and use elasticsearch-setup-passwords to create random passwords for the built-in users:

bin/elasticsearch-setup-passwords auto

You can also set each user's password yourself:

bin/elasticsearch-setup-passwords interactive

The interactive flag lets you choose a password for each built-in user.

Visit localhost:9200 and enter the password for the user elastic; if the correct JSON comes back, X-Pack authorization and encryption are working.

Setting up authentication for Elasticsearch

Built-in users
elastic: the built-in superuser
kibana_system: the user Kibana uses to connect to and communicate with Elasticsearch
logstash_system: the ES user Logstash needs for writing monitoring data
beats_system: the ES user Beats needs for writing monitoring data
apm_system: the ES user APM needs for writing monitoring data
remote_monitoring_user: used by Metricbeat when collecting and storing monitoring information in Elasticsearch

Giving Kibana the account and password

Append these settings to kibana.yml in the kibana section:

elasticsearch.username: "kibana_system"
elasticsearch.password: "XXX"

Giving Logstash the account and password

xpack.monitoring.enabled: true
xpack.monitoring.elasticsearch.hosts: [ "https://es01:9200" ]
xpack.monitoring.elasticsearch.username: "logstash_system"
xpack.monitoring.elasticsearch.password: "XXX"

Then run docker-compose up -d kibana to start the service; after a few minutes, visit localhost:5601. If the password prompt appears, the configuration succeeded.

With that, the deployment of Elasticsearch and Kibana is complete.

Series index

ElasticStack: installation
ElasticStack: elasticsearch
ElasticStack: logstash
elasticSearch: mapping
elasticSearch: an introduction to analyzers
elasticSearch: analyzer practice notes
elasticSearch: custom synonym analyzers in practice
docker: an ELK cluster in practice
filebeat and logstash in practice
filebeat pipelines in practice
Elasticsearch 7.x platinum cracking in practice
ELK alerting: research and practice

Source: English original

A class transformer converts plain JavaScript objects into class instances. What we get from an API endpoint or a JSON file is plain JSON text; normally we turn it into a plain JavaScript object with JSON.parse, but sometimes we want an instance of a class rather than a plain object. For example, when using class-validator to validate a JSON string fetched from a backend API, we need the JSON automatically converted into an instance of the class under validation, not a plain JS object.

For example, suppose we can read a users.json from a remote API with the following content:

[{
"id": 1,
"firstName": "Johny",
"lastName": "Cage",
"age": 27
},
{
"id": 2,
"firstName": "Ismoil",
"lastName": "Somoni",
"age": 50
},
{
"id": 3,
"firstName": "Luke",
"lastName": "Dacascos",
"age": 12
}]

We also have a User class:

export class User {
id: number;
firstName: string;
lastName: string;
age: number;

getName() {
return this.firstName + " " + this.lastName;
}

isAdult() {
return this.age > 36 && this.age < 60;
}
}

Then you want to get an array of User objects from user.json:

fetch("users.json").then((users: User[]) => {
// you can use users here, and type hinting also will be available to you,
// but users are not actually instances of User class
// this means that you can't use methods of User class
});

You can now access users[0].firstName, but because you received plain JS objects rather than instances of the User class, you cannot call users[0].getName(). class-transformer exists precisely to turn plain JS objects into class instances on demand.

Creating a real User[] array is as simple as:

fetch("users.json").then((users: Object[]) => {
const realUsers = plainToClass(User, users);
// now each user in realUsers is instance of User class
});
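
The core of what plainToClass does can be sketched without the library: copy the plain object's properties onto a fresh class instance so its prototype methods become available. This is a simplified illustration only; the real library also handles nesting, decorators, and type metadata.

```typescript
class User {
  id!: number
  firstName!: string
  lastName!: string
  getName() { return this.firstName + ' ' + this.lastName }
}

// Naive stand-in for plainToClass: assign the plain props onto a real instance.
function naivePlainToClass<T extends object>(cls: new () => T, plain: object): T {
  return Object.assign(new cls(), plain)
}

const plain = { id: 1, firstName: 'Johny', lastName: 'Cage' }
const user = naivePlainToClass(User, plain)
console.log(user.getName())        // methods on the prototype now work
console.log(user instanceof User)  // true
```
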

Installation

Install class-transformer:
npm install class-transformer --save
Install reflect-metadata:
npm install reflect-metadata --save

After installing, add import "reflect-metadata"; to a top-level file such as app.ts.

Basic methods

plainToClass

Converts a plain object into a class instance:

import {plainToClass} from "class-transformer";

let users = plainToClass(User, userJson); // converts a plain user object into a User instance; arrays are supported too

plainToClassFromExist

Merges a plain object into an already-created class instance:

const defaultUser = new User();
defaultUser.role = 'user';

let mixedUser = plainToClassFromExist(defaultUser, user); // mixed user should have the value role = user when no value is set otherwise.

classToPlain

Converts a class instance into a plain object.

Afterwards you can use JSON.stringify to get plain JSON text:

import {classToPlain} from "class-transformer";
let photo = classToPlain(photo);

classToClass

Clones a class instance:

import {classToClass} from "class-transformer";
let photo = classToClass(photo);

You can use the ignoreDecorators option to strip all decorators from the source instance.

serialize

Converts a class instance directly into JSON text; arrays work just as well:

import {serialize} from "class-transformer";
let photo = serialize(photo);

deserialize and deserializeArray

Converts JSON text directly into a class instance:

import {deserialize} from "class-transformer";
let photo = deserialize(Photo, photo);

If the JSON text is an array of objects, use the deserializeArray method:

import {deserializeArray} from "class-transformer";
let photos = deserializeArray(Photo, photos);

Enforcing type safety

plainToClass copies every property of the source object onto the class instance, even properties that the class does not declare:

import {plainToClass} from "class-transformer";

class User {
id: number
firstName: string
lastName: string
}

const fromPlainUser = {
unkownProp: 'hello there',
firstName: 'Umed',
lastName: 'Khudoiberdiev',
}

console.log(plainToClass(User, fromPlainUser))

// User {
// unkownProp: 'hello there',
// firstName: 'Umed',
// lastName: 'Khudoiberdiev',
// }

You can combine the excludeExtraneousValues option with the Expose decorator to specify exactly which properties to expose:

import {Expose, plainToClass} from "class-transformer";

class User {
@Expose() id: number;
@Expose() firstName: string;
@Expose() lastName: string;
}

const fromPlainUser = {
unkownProp: 'hello there',
firstName: 'Umed',
lastName: 'Khudoiberdiev',
}

console.log(plainToClass(User, fromPlainUser, { excludeExtraneousValues: true }))

// User {
// id: undefined,
// firstName: 'Umed',
// lastName: 'Khudoiberdiev'
// }
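
Decorators aside, excludeExtraneousValues amounts to a whitelist over the declared keys. A minimal illustration (the helper name is made up, not the library's internals):

```typescript
// Keep only whitelisted keys; anything else from the plain object is dropped.
// Keys that are whitelisted but missing come out as undefined, matching the
// library's output above.
function pickExposed(plain: Record<string, unknown>, exposed: string[]): Record<string, unknown> {
  const out: Record<string, unknown> = {}
  for (const key of exposed) {
    out[key] = plain[key]
  }
  return out
}

pickExposed({ unkownProp: 'hello there', firstName: 'Umed' }, ['id', 'firstName'])
// { id: undefined, firstName: 'Umed' }
```
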

Converting subtypes

Nested objects

Because TypeScript's reflection support is still limited, you must use the @Type decorator to state explicitly which class a nested property belongs to:

import {Type, plainToClass} from "class-transformer";

export class Album {

id: number;

name: string;

@Type(() => Photo)
photos: Photo[];
}

export class Photo {
id: number;
filename: string;
}

let album = plainToClass(Album, albumJson);
// now album is Album object with Photo objects inside

Multiple type options

A nested subtype can also match one of several types; this is done with a discriminator. The discriminator names a property, and the nested object inside the plain JS object must carry a field with that name whose value is the name of the subtype to convert into. The discriminator also lists every subtype value together with its name. A concrete example:

import {Type, plainToClass} from "class-transformer";

const albumJson = {
"id": 1,
"name": "foo",
"topPhoto": {
"id": 9,
"filename": "cool_wale.jpg",
"depth": 1245,
"__type": "underwater"
}
}

export abstract class Photo {
id: number;
filename: string;
}

export class Landscape extends Photo {
panorama: boolean;
}

export class Portrait extends Photo {
person: Person;
}

export class UnderWater extends Photo {
depth: number;
}

export class Album {

id: number;
name: string;

@Type(() => Photo, {
discriminator: {
property: "__type",
subTypes: [
{ value: Landscape, name: "landscape" },
{ value: Portrait, name: "portrait" },
{ value: UnderWater, name: "underwater" }
]
}
})
topPhoto: Landscape | Portrait | UnderWater;

}

let album = plainToClass(Album, albumJson);
// now album is Album object with a UnderWater object without `__type` property.
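
Under the hood, a discriminator is essentially a name-to-constructor lookup on the marker field. A standalone sketch (simplified; no decorator metadata involved):

```typescript
class UnderWater { depth!: number; filename!: string }
class Landscape { panorama!: boolean; filename!: string }

// Map each discriminator value to a constructor.
const subTypes: Record<string, new () => object> = {
  underwater: UnderWater,
  landscape: Landscape,
}

// Pick the constructor by __type, assign the remaining props,
// and drop __type itself (the library's default behaviour).
function fromDiscriminated(plain: { __type: string } & Record<string, unknown>): object {
  const { __type, ...rest } = plain
  const ctor = subTypes[__type]
  return Object.assign(new ctor(), rest)
}

const photo = fromDiscriminated({ __type: 'underwater', filename: 'cool_wale.jpg', depth: 1245 })
console.log(photo instanceof UnderWater) // true
```
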

Additionally, you can set keepDiscriminatorProperty: true to keep the discriminator property in the converted object as well.

Excluding and exposing

Exposing method return values

Add the @Expose decorator to expose getters and method return values:

import {Expose} from "class-transformer";

export class User {

id: number;
firstName: string;
lastName: string;
password: string;

@Expose()
get name() {
return this.firstName + " " + this.lastName;
}

@Expose()
getFullName() {
return this.firstName + " " + this.lastName;
}
}

Exposing a property under a different name

To expose a property under another name, pass the name option to the @Expose decorator:

import {Expose} from "class-transformer";

export class User {

@Expose({ name: "uid" })
id: number;

firstName: string;

lastName: string;

@Expose({ name: "secretKey" })
password: string;

@Expose({ name: "fullName" })
getFullName() {
return this.firstName + " " + this.lastName;
}
}

Skipping specific properties

Sometimes you want to skip some properties during conversion. This is done with the @Exclude decorator:

import {Exclude} from "class-transformer";

export class User {

id: number;

email: string;

@Exclude()
password: string;
}

Now, when you convert a user, the password property is skipped and left out of the result.

Skipping depending on the operation

You can use toClassOnly or toPlainOnly to control which operations a property is excluded from:

import {Exclude} from "class-transformer";

export class User {

id: number;

email: string;

@Exclude({ toPlainOnly: true })
password: string;
}

Now the password property is excluded from the classToPlain operation; toClassOnly works the other way round.

Skipping all properties of a class

You can expose only chosen properties by adding the @Exclude decorator to the class and the @Expose decorator to each property that should be public:

import {Exclude, Expose} from "class-transformer";

@Exclude()
export class User {

@Expose()
id: number;

@Expose()
email: string;

password: string;
}

Alternatively, you can set the exclusion strategy during conversion:

import {classToPlain} from "class-transformer";
let photo = classToPlain(photo, { strategy: "excludeAll" });

Then you no longer need to add the @Exclude decorator.

Skipping private properties, or properties with a given prefix

You can exclude private properties, and properties carrying a given prefix, from being exposed:

import {Expose} from "class-transformer";

export class User {

id: number;
private _firstName: string;
private _lastName: string;
_password: string;

setName(firstName: string, lastName: string) {
this._firstName = firstName;
this._lastName = lastName;
}

@Expose()
get name() {
return this._firstName + " " + this._lastName;
}

}

const user = new User();
user.id = 1;
user.setName("Johny", "Cage");
user._password = "123";

const plainUser = classToPlain(user, { excludePrefixes: ["_"] });
// here plainUser will be equal to
// { id: 1, name: "Johny Cage" }
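
Stripped of decorators, the excludePrefixes behaviour boils down to filtering keys by prefix. A minimal sketch (helper name is illustrative):

```typescript
// Drop own properties whose key starts with any of the listed prefixes.
function excludeByPrefix(obj: Record<string, unknown>, prefixes: string[]): Record<string, unknown> {
  const out: Record<string, unknown> = {}
  for (const [key, value] of Object.entries(obj)) {
    if (!prefixes.some(p => key.startsWith(p))) out[key] = value
  }
  return out
}

excludeByPrefix({ id: 1, _password: 'x', _firstName: 'Johny' }, ['_'])
// { id: 1 }
```
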

Using groups to control which properties are excluded

import {Exclude, Expose} from "class-transformer";

@Exclude()
export class User {

id: number;

name: string;

@Expose({ groups: ["user", "admin"] }) // this means that this data will be exposed only to users and admins
email: string;

@Expose({ groups: ["user"] }) // this means that this data will be exposed only to users
password: string;
}

let user1 = classToPlain(user, { groups: ["user"] }); // will contain id, name, email and password
let user2 = classToPlain(user, { groups: ["admin"] }); // will contain id, name and email
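
Group filtering reduces to an intersection test between a property's declared groups and the groups requested for the conversion. A sketch of the rule (illustrative, not the library's code):

```typescript
// A property is exposed when it declares no groups,
// or shares at least one group with the request.
function exposedForGroups(propGroups: string[] | undefined, requested: string[]): boolean {
  if (!propGroups || propGroups.length === 0) return true
  return propGroups.some(g => requested.includes(g))
}

exposedForGroups(['user', 'admin'], ['admin']) // true
exposedForGroups(['user'], ['admin'])          // false
```
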

Using version ranges to control exposed and excluded properties

If you are building an API with several versions, class-transformer has a very useful tool: you can control which model properties are exposed or excluded in which version. An example:

import {Exclude, Expose} from "class-transformer";

@Exclude()
export class User {

id: number;

name: string;

@Expose({ since: 0.7, until: 1 }) // this means that this property will be exposed for version starting from 0.7 until 1
email: string;

@Expose({ since: 2.1 }) // this means that this property will be exposed for version starting from 2.1
password: string;
}

let user1 = classToPlain(user, { version: 0.5 }); // will contain id and name
let user2 = classToPlain(user, { version: 0.7 }); // will contain id, name and email
let user3 = classToPlain(user, { version: 1 }); // will contain id and name
let user4 = classToPlain(user, { version: 2 }); // will contain id and name
let user5 = classToPlain(user, { version: 2.1 }); // will contain id, name and password
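
The since/until semantics above amount to a half-open version range check. A minimal sketch of the rule (illustrative, not the library's code):

```typescript
// since is inclusive, until is exclusive; either bound may be absent.
function exposedInVersion(version: number, since?: number, until?: number): boolean {
  if (since !== undefined && version < since) return false
  if (until !== undefined && version >= until) return false
  return true
}

exposedInVersion(0.5, 0.7, 1) // false: before since
exposedInVersion(0.7, 0.7, 1) // true:  since is inclusive
exposedInVersion(1,   0.7, 1) // false: until is exclusive
```
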

Special handling

Converting date strings into Date objects

Sometimes your JavaScript object holds a Date received in string format and you want a real JavaScript Date object from it. Just pass Date to the @Type decorator:

When converting back from the class instance to a plain object, registrationDate is turned back into a string.

import {Type} from "class-transformer";

export class User {

id: number;

email: string;

password: string;

@Type(() => Date)
registrationDate: Date;
}

Do the same when you want to convert a value to the Number, String, or Boolean type.

Handling arrays

When converting an array, you must specify the type of the array items with the @Type decorator; you can also use a custom array type.

The same applies to Set and Map:

import {Type} from "class-transformer";

export class AlbumCollection extends Array<Album> {
// custom array functions ...
}

export class Photo {

id: number;

name: string;

@Type(() => Album)
albums: Album[];
// albums: AlbumCollection; use the custom array type instead
}

export class Skill {
name: string;
}

export class Weapon {
name: string;
range: number;
}

export class Player {
name: string;

@Type(() => Skill)
skills: Set<Skill>;

@Type(() => Weapon)
weapons: Map<string, Weapon>;
}

Custom conversion

Basic usage

You can add extra conversion steps with @Transform, for example to take the Date produced from the plain object's date string and convert it further into a moment object:

import {Transform} from "class-transformer";
import * as moment from "moment";
import {Moment} from "moment";

export class Photo {

id: number;

@Type(() => Date)
@Transform(value => moment(value), { toClassOnly: true })
date: Moment;
}

Now, after plainToClass runs, the converted object's date property will be a Moment object. @Transform also supports groups and versions.

Advanced usage

@Transform offers more parameters for building custom conversion logic:

@Transform((value, obj, type) => value)
Parameter: Description
value: the property value before the custom transform runs
obj: the source object of the conversion
type: the type of conversion

Other decorators

Signature: Example
@TransformClassToPlain: @TransformClassToPlain({ groups: ["user"] })
@TransformClassToClass: @TransformClassToClass({ groups: ["user"] })
@TransformPlainToClass: @TransformPlainToClass(User, { groups: ["user"] })
These decorators accept one optional argument, ClassTransformOptions, the transform options such as groups, version, name. An example:


@Exclude()
class User {

id: number;

@Expose()
firstName: string;

@Expose()
lastName: string;

@Expose({ groups: ['user.email'] })
email: string;

password: string;
}

class UserController {

@TransformClassToPlain({ groups: ['user.email'] })
getUser() {
const user = new User();
user.firstName = "Snir";
user.lastName = "Segal";
user.password = "imnosuperman";

return user;
}
}

const controller = new UserController();
const user = controller.getUser();
The user object will contain firstName, lastName, and email.

Working with generics

Because TypeScript's reflection support is still incomplete, generics need a workaround; see this example for details.

Implicit type conversion

If you use class-validator together with class-transformer, you probably do not want this feature enabled.

It enables automatic conversion between built-in types based on the type information TypeScript provides. Disabled by default.

import { IsString } from 'class-validator'

class MyPayload {

@IsString()
prop: string
}


const result1 = plainToClass(MyPayload, { prop: 1234 }, { enableImplicitConversion: true });
const result2 = plainToClass(MyPayload, { prop: 1234 }, { enableImplicitConversion: false });

/**
* result1 will be `{ prop: "1234" }` - notice how the prop value has been converted to string.
* result2 will be `{ prop: 1234 }` - default behaviour
*/

Circular references

If User has a photos array of type Photo, and Photo in turn has a property linking back to User, that property is ignored during conversion, except in the classToClass operation.

Overview

Nest's decorators are a pleasure to use, so I wanted to try writing custom ones.
Two scenarios fit custom decorators very well:
one is a @User decorator that gets the current user from the token,
the other is a @Permissions decorator for role-based guard checks.
So let's put them into practice.

@User

The @User decorator's main job is to get the user info out of the token quickly.
user.decorator.ts

import { createParamDecorator, ExecutionContext } from '@nestjs/common';

export const User = createParamDecorator(
(data: string, ctx: ExecutionContext) => {
const request = ctx.switchToHttp().getRequest();
const user = request.user;
return data ? user && user[data] : user;
},
);

Usage:

@Get()
async findOne(@User() user: UserEntity) {
console.log(user);
}

@Get()
async findOne(@User('firstName') firstName: string) {
console.log(`Hello ${firstName}`);
}

@Permissions

permissions.decorator.ts

A role-based permission decorator; its main job is to attach the permission required for create, read, update, and delete operations so a guard can check it.

// import { applyDecorators, SetMetadata } from '@nestjs/common';
// export const Permissions = (permissions: string) => SetMetadata('permissions', permissions);


import { applyDecorators, SetMetadata, UseGuards } from "@nestjs/common";
export function Permissions(permissions: string): Function {
// applyDecorators lets you build a 'composed decorator'
return applyDecorators(
SetMetadata('permissions', permissions)
)
}

A quick example:
add @Permissions('sys:user:list') to the user-list endpoint GET v1/users.

import { Body, Controller, Delete, Get, Param, ParseIntPipe, Post, Put, Query, UseGuards, UsePipes, ValidationPipe } from '@nestjs/common';
import { AuthGuard } from '@nestjs/passport';
import { ApiBearerAuth, ApiCreatedResponse, ApiOkResponse, ApiOperation, ApiParam, ApiTags } from '@nestjs/swagger';
import { JwtAuthGuard } from 'src/common/guards/jwt-auth.guard';
import { RolesGuard } from 'src/common/guards/roles.guard';
import { Result } from 'src/common/utils/result';
import { QueryUserDto } from './dto/query.dto';
import { UserService } from './user.service';
import { Permissions } from 'src/common/decorator/permissions.decorator'

@ApiTags('Users')
@Controller('v1/users')
@ApiBearerAuth()
@UseGuards(JwtAuthGuard, RolesGuard)
export class UserController {
constructor(
private readonly userService: UserService
) { }

@Get()
@ApiOperation({ summary: 'Query the user list' })
@Permissions('sys:user:list')
async list(@Query() dto: QueryUserDto): Promise<Result> {
console.log(dto)
const res = await this.userService.page(dto)
return Result.ok(res)
// throw new ForbiddenException()
}
}

Role permission checks are done by RolesGuard.
roles.guard.ts

import { Injectable, CanActivate, ExecutionContext, ForbiddenException } from '@nestjs/common';
import { Reflector } from '@nestjs/core';
import { Observable } from 'rxjs';

@Injectable()
export class RolesGuard implements CanActivate {
constructor(private reflector: Reflector) { }
async canActivate(
context: ExecutionContext,
): Promise<boolean> {
const request = context.switchToHttp().getRequest();
const user = request.user;
console.log('current user', user)
// permission required by the current request
const currentPerm = this.reflector.get<string>('permissions', context.getHandler());
console.log('required permission:', currentPerm)
// no permission metadata means the route is open
if (!currentPerm) {
return true;
}
// look up the permissions this user holds, by user id
// const permList = await this.permSerivce.findUserPerms(user.id)
// const perms: string[] = []
// for (let i = 0, len = permList.length; i < len; i++) {
// permList[i]['m_perms'].indexOf(',') > -1 ? perms.push(...permList[i]['m_perms'].split(',')) : perms.push(permList[i]['m_perms'])
// }
// match the permission
// if (perms.includes(currentPerm)) return true
// throw new ForbiddenException()
}
}
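
The commented-out matching logic in the guard flattens comma-joined permission strings and then checks membership. A standalone sketch (the m_perms field name is taken from the comments; the service call that produces the rows is assumed):

```typescript
// Flatten rows whose m_perms field may hold a comma-joined list, then match.
function hasPermission(rows: { m_perms: string }[], required: string): boolean {
  const perms: string[] = []
  for (const row of rows) {
    // split(',') on a string without commas yields the whole string,
    // so single-permission rows need no special case
    perms.push(...row.m_perms.split(','))
  }
  return perms.includes(required)
}

hasPermission([{ m_perms: 'sys:user:list,sys:user:add' }], 'sys:user:list') // true
```
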

Overview

I have recently been rebuilding an RBAC architecture with Nest.
This is a complete walkthrough of Nest registration and login, JWT authorization and authentication, guard interception and validation, and the JWT passport strategy.
Compared with Spring Boot's Shiro, Nest is the more comfortable to write.

Practice

auth.controller.ts
The controller implementation for registration and login:

import { Body, Controller, Post } from '@nestjs/common';
import { ApiOkResponse, ApiOperation, ApiTags } from '@nestjs/swagger';
import { Result } from 'src/common/utils/result';
import { CreateUserDto } from '../user/dto/create.dto';
import { LoginUserDto } from '../user/dto/login.dto';
import { UserEntity } from '../user/user.entity';
import { UserService } from '../user/user.service';
import { AuthService } from './auth.service';

@ApiTags('Login & registration')
@Controller('v1/auth')
export class AuthController {
constructor(
private readonly userService: UserService,
private readonly authService: AuthService
) { }

@Post('register')
@ApiOperation({ summary: 'User registration' })
@ApiOkResponse({ type: UserEntity })
async create(@Body() user: CreateUserDto): Promise<Result> {
console.log('user', user)
const res = await this.userService.create(user)
return Result.ok(res)
}

@Post('login')
@ApiOperation({ summary: 'Login' })
async login(@Body() dto: LoginUserDto): Promise<Result> {
const res = await this.userService.login(dto.account, dto.password)
return Result.ok(res)
}
}

user.service.ts
The service implementation for registration, login, and JWT handling:

import { HttpException, HttpStatus, Injectable, NotFoundException } from '@nestjs/common';
import { InjectRepository } from '@nestjs/typeorm';
import { Like, Repository, UpdateResult } from 'typeorm';
import { UserEntity } from './user.entity';
import { Result } from 'src/common/utils/result';
import { CreateUserDto } from './dto/create.dto';
import { QueryUserDto } from './dto/query.dto';
import { classToPlain, plainToClass } from 'class-transformer';
import { RedisService } from 'nestjs-redis';
import { ConfigService } from '@nestjs/config';
import { genSalt, hash, compare } from 'bcrypt'
import { JwtService } from '@nestjs/jwt';
import { UpdateUserDto } from './dto/update.dto';
@Injectable()
export class UserService {
constructor(
@InjectRepository(UserEntity)
private readonly userRep: Repository<UserEntity>,
private readonly redisService: RedisService,
private readonly config: ConfigService,
private readonly jwtService: JwtService,
) { }

async create(dto: CreateUserDto): Promise<UserEntity | Result> {
console.log(dto)
const existing = await this.findByUsername(dto.username)
if (existing) throw new HttpException('This account already exists; please adjust it and register again!', HttpStatus.NOT_ACCEPTABLE);
const salt = await genSalt()
dto.password = await hash(dto.password, salt)
const user = plainToClass(UserEntity, { salt, ...dto }, { ignoreDecorators: true })
console.log('user', user)
const res = await this.userRep.save(user)
return res
}

// login
async login(account: string, password: string): Promise<object | Result> {
const user = await this.findByUsername(account)
console.log("user", user)
if (!user) throw new HttpException('Wrong account or password', HttpStatus.NOT_FOUND);
console.log('account', account)
console.log('password', password)
console.log('hashed password', user.password)
const checkPassword = await compare(password, user.password)
console.log('passwords match', checkPassword)
if (!checkPassword) throw new HttpException('Wrong account or password', HttpStatus.NOT_FOUND);
// generate the tokens
const data = this.genToken({ id: user.id })
return data
}

// find by id
async findById(id: number): Promise<UserEntity> {
const res = await this.userRep.findOne(id)
if (!res) {
throw new NotFoundException()
}
return res
}

// find by username
async findByUsername(username: string): Promise<UserEntity> {
return await this.userRep.findOne({ username })
}

// generate access and refresh tokens
genToken(payload: { id: number }): Record<string, unknown> {
const accessToken = `Bearer ${this.jwtService.sign(payload)}`
const refreshToken = this.jwtService.sign(payload, { expiresIn: this.config.get('jwt.refreshExpiresIn') })
return { accessToken, refreshToken }
}

// refresh the access token
refreshToken(id: number): string {
return this.jwtService.sign({ id })
}

// verify a token; returns the payload, or 0 if invalid
verifyToken(token: string): number {
try {
if (!token) return 0
const id = this.jwtService.verify(token.replace('Bearer ', ''))
return id
} catch (error) {
return 0
}
}

// validate the user by the id parsed from the JWT
async validateUserByJwt(payload: { id: number }): Promise<UserEntity> {
return await this.findById(payload.id)
}
}
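
The verifyToken helper above strips the Bearer scheme before verification. That normalization can be sketched on its own (helper name is illustrative):

```typescript
// Strip a leading "Bearer " scheme from an Authorization header value, if present.
function stripBearer(token: string): string {
  return token.startsWith('Bearer ') ? token.slice('Bearer '.length) : token
}

stripBearer('Bearer abc.def.ghi') // 'abc.def.ghi'
stripBearer('abc.def.ghi')        // unchanged
```

Checking the prefix explicitly (instead of `replace('Bearer ', '')` as in the service) avoids touching a token that happens to contain that substring elsewhere.
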

jwt-auth.guard.ts
The guard implementation: it intercepts the accessToken and refreshToken and re-issues them when the access token has expired.

import { Injectable, CanActivate, ExecutionContext, UnauthorizedException } from '@nestjs/common';
import { AuthGuard } from '@nestjs/passport';
import { Observable } from 'rxjs';
import { UserService } from 'src/modules/user/user.service';
@Injectable()
export class JwtAuthGuard extends AuthGuard('jwt') {
constructor(
private readonly userService: UserService,
) {
super()
}
async canActivate(
context: ExecutionContext,
): Promise<boolean> {
const req = context.switchToHttp().getRequest()
const res = context.switchToHttp().getResponse()
try {
const accessToken = req.get('Authorization')
if (!accessToken) throw new UnauthorizedException('Please log in first')

const atUserId = this.userService.verifyToken(accessToken)
if (atUserId) return this.activate(context)
console.log(req.user)
const refreshToken = req.get('RefreshToken')
const rtUserId = this.userService.verifyToken(refreshToken)
if (!rtUserId) throw new UnauthorizedException('Your login has expired; please log in again')
const user = await this.userService.findById(rtUserId)
if (user) {
const tokens = this.userService.genToken({ id: rtUserId })
// request headers 对象 prop 属性全自动转成小写,
// 所以 获取 request.headers['authorization'] 或 request.get('Authorization')
// 重置属性 request.headers[authorization] = value
req.headers['authorization'] = tokens.accessToken
req.headers['refreshtoken'] = tokens.refreshToken
// 在响应头中加入新的token,客户端判断响应头有无 Authorization 字段,有则重置
res.header('Authorization', tokens.accessToken)
res.header('RefreshToken', tokens.refreshToken)
// 将当前请求交给下一级
return this.activate(context)
} else {
throw new UnauthorizedException('用户不存在')
}
} catch (error) {
// Logger
return false
}
}

async activate(context: ExecutionContext): Promise<boolean> {
return super.canActivate(context) as Promise<boolean>
}
}
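On the client side, the counterpart of this guard is a response interceptor that watches for an `Authorization` response header and swaps in the renewed tokens. A minimal, framework-agnostic sketch (`applyRenewedTokens` and `TokenStore` are hypothetical names; in practice this would live in an axios/fetch interceptor):

```typescript
interface TokenStore {
  accessToken?: string
  refreshToken?: string
}

// If the response carries an Authorization header, the guard has issued fresh
// tokens; persist both so the next request sends the renewed pair.
// Header names are lowercased, matching what the guard sets.
function applyRenewedTokens(headers: Record<string, string>, store: TokenStore): boolean {
  const accessToken = headers['authorization']
  if (!accessToken) return false
  store.accessToken = accessToken
  store.refreshToken = headers['refreshtoken']
  return true
}
```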

jwt.strategy.ts
Validates the user after the JWT has been decoded.


import { Injectable, UnauthorizedException } from '@nestjs/common';
import { ConfigService } from '@nestjs/config';
import { PassportStrategy } from '@nestjs/passport';
import { ExtractJwt, Strategy } from 'passport-jwt';
import { UserService } from 'src/modules/user/user.service';

@Injectable()
export class JwtStrategy extends PassportStrategy(Strategy) {
  constructor(
    private readonly userService: UserService,
    private readonly config: ConfigService
  ) {
    super({
      jwtFromRequest: ExtractJwt.fromAuthHeaderAsBearerToken(),
      secretOrKey: config.get('jwt.secretkey'),
    })
  }

  async validate(payload: any) {
    const user = await this.userService.validateUserByJwt(payload)
    // Finding a user means the token is still valid; otherwise it has expired
    if (!user) throw new UnauthorizedException()
    return user
  }
}

Overview

With logstash installed as described in ElasticStack-安装篇, we can configure it and start syncing data.
This walkthrough uses logstash to sync MySQL into Elasticsearch.

Configuration

Enter the logstash container to install logstash-input-jdbc:

docker exec -it docker_logstash bash

Install logstash-input-jdbc:

./bin/logstash-plugin install logstash-input-jdbc

Download mysql-connector-java-8.0.24.jar
and place it in /data/dockers/logstash/config/mysql.

Create a jdbc.conf file under /data/dockers/logstash/config/conf.d to configure the MySQL-to-ES sync:

input {
  jdbc {
    # Database connection
    jdbc_connection_string => "jdbc:mysql://47.119.168.111:3306/fob?serverTimezone=Asia/Shanghai&characterEncoding=utf8&useSSL=false"
    jdbc_user => "root"
    jdbc_password => "XXXXX"
    # JDBC driver jar for the connection
    jdbc_driver_library => "/usr/share/logstash/config/mysql/mysql-connector-java-8.0.24.jar"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
    codec => plain { charset => "UTF-8" }

    # Incremental tracking
    # Column to track
    tracking_column => "updated_at"
    # Where the metadata from the previous run is stored
    last_run_metadata_path => "/usr/share/logstash/config/lastrun/logstash_jdbc_last_run"
    # Time zone
    jdbc_default_timezone => "Asia/Shanghai"
    # Path to a SQL file
    # statement_filepath => ""
    # Inline SQL
    statement => "SELECT g.id AS id,g.product_name AS product_name,g.shop_id AS shop_id,g.category_id AS category_id,g.keyword AS keyword,g.status AS status FROM fg_product g WHERE g.updated_at > :sql_last_value"
    # Whether to clear the last_run_metadata_path record; if true, every run re-imports all rows from scratch
    clean_run => false
    # Cron-style schedule for the recurring import (first field is minutes); defaults to once per minute if unset
    schedule => "* * * * *"
  }
}

output {
  elasticsearch {
    # Elasticsearch host to import into
    hosts => "47.119.168.111:9200"
    # Target Elasticsearch index name
    index => "fob_index"
    # Type name (similar to a table name; deprecated in ES 7.x)
    document_type => "fg_product"
    # Document id (maps to the table's primary key)
    document_id => "%{id}"
  }

  stdout {
    # JSON-lines output
    codec => json_lines
  }
}
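On each scheduled run, Logstash replaces :sql_last_value with the value saved in last_run_metadata_path (here the previous run's timestamp, since use_column_value is not enabled), so the query that actually hits MySQL looks roughly like this (timestamp illustrative):

```sql
SELECT g.id AS id, g.product_name AS product_name, g.shop_id AS shop_id,
       g.category_id AS category_id, g.keyword AS keyword, g.status AS status
FROM fg_product g
WHERE g.updated_at > '2021-05-01 00:00:00'
```

Only rows updated since the last run are re-indexed, which is what makes the per-minute schedule cheap.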


Check whether the MySQL data has made it into ES.

Series index

ElasticStack-安装篇
ElasticStack-elasticsearch篇
ElasticStack-logstash篇
elasticSearch-mapping相关
elasticSearch-分词器介绍
elasticSearch-分词器实践笔记
elasticSearch-同义词分词器自定义实践
docker-elk集群实践
filebeat与logstash实践
filebeat之pipeline实践
Elasticsearch 7.x 白金级 破解实践
elk的告警调研与实践
