Elasticsearch之pinyin(ES拼音插件)安装教程
...大约 2 分钟
插件介绍
在电商网站(如:某东,某宝)中搜索商品的时候,输入拼音也能搜索到商品,那么在elasticsearch是如何实现的呢?答案就是安装pinyin插件,下面我会教大家如何安装并简单使用,如有错误,请评论指正,谢谢!
开始安装
基于elasticsearch6.2.4安装,操作系统:Mac OS
1. 使用 elasticsearch-plugin 工具安装
./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-pinyin/releases/download/v6.2.4/elasticsearch-analysis-pinyin-6.2.4.zip
2. 重启elasticsearch
使用栗子
1. 使用自定义拼音分析器创建索引
PUT /medcl/
{
"index" : {
"analysis" : {
"analyzer" : {
"pinyin_analyzer" : {
"tokenizer" : "my_pinyin"
}
},
"tokenizer" : {
"my_pinyin" : {
"type" : "pinyin",
"keep_separate_first_letter" : false,
"keep_full_pinyin" : true,
"keep_original" : true,
"limit_first_letter_length" : 16,
"lowercase" : true,
"remove_duplicated_term" : true
}
}
}
}
}
2. 测试分析器,分析一个中文名字,比如刘德华
GET /medcl/_analyze
{
"text": ["刘德华"],
"analyzer": "pinyin_analyzer"
}
结果:
{
"tokens" : [
{
"token" : "liu",
"start_offset" : 0,
"end_offset" : 1,
"type" : "word",
"position" : 0
},
{
"token" : "de",
"start_offset" : 1,
"end_offset" : 2,
"type" : "word",
"position" : 1
},
{
"token" : "hua",
"start_offset" : 2,
"end_offset" : 3,
"type" : "word",
"position" : 2
},
{
"token" : "刘德华",
"start_offset" : 0,
"end_offset" : 3,
"type" : "word",
"position" : 3
},
{
"token" : "ldh",
"start_offset" : 0,
"end_offset" : 3,
"type" : "word",
"position" : 4
}
]
}
3. 创建mapping
POST /medcl/folks/_mapping
{
"folks": {
"properties": {
"name": {
"type": "keyword",
"fields": {
"pinyin": {
"type": "text",
"store": false,
"term_vector": "with_offsets",
"analyzer": "pinyin_analyzer",
"boost": 10
}
}
}
}
}
}
4. 添加测试文档
POST /medcl/folks/andy
{"name":"刘德华"}
5. 搜索
http://localhost:9200/medcl/folks/_search?q=name:刘德华
curl http://localhost:9200/medcl/folks/_search?q=name.pinyin:刘德华
curl http://localhost:9200/medcl/folks/_search?q=name.pinyin:liu
curl http://localhost:9200/medcl/folks/_search?q=name.pinyin:ldh
curl http://localhost:9200/medcl/folks/_search?q=name.pinyin:de+hua
6. 使用Pinyin-TokenFilter
PUT /medcl1/
{
"index" : {
"analysis" : {
"analyzer" : {
"user_name_analyzer" : {
"tokenizer" : "whitespace",
"filter" : "pinyin_first_letter_and_full_pinyin_filter"
}
},
"filter" : {
"pinyin_first_letter_and_full_pinyin_filter" : {
"type" : "pinyin",
"keep_first_letter" : true,
"keep_full_pinyin" : false,
"keep_none_chinese" : true,
"keep_original" : false,
"limit_first_letter_length" : 16,
"lowercase" : true,
"trim_whitespace" : true,
"keep_none_chinese_in_first_letter" : true
}
}
}
}
}
Token Test: 刘德华 张学友 郭富城 黎明 四大天王
GET /medcl/_analyze
{
"text": ["刘德华 张学友 郭富城 黎明 四大天王"],
"analyzer": "user_name_analyzer"
}
结果:
{
"tokens" : [
{
"token" : "ldh",
"start_offset" : 0,
"end_offset" : 3,
"type" : "word",
"position" : 0
},
{
"token" : "zxy",
"start_offset" : 4,
"end_offset" : 7,
"type" : "word",
"position" : 1
},
{
"token" : "gfc",
"start_offset" : 8,
"end_offset" : 11,
"type" : "word",
"position" : 2
},
{
"token" : "lm",
"start_offset" : 12,
"end_offset" : 14,
"type" : "word",
"position" : 3
},
{
"token" : "sdtw",
"start_offset" : 15,
"end_offset" : 19,
"type" : "word",
"position" : 4
}
]
}
7. Used in phrase query
- option 1
PUT /medcl/
{
"index" : {
"analysis" : {
"analyzer" : {
"pinyin_analyzer" : {
"tokenizer" : "my_pinyin"
}
},
"tokenizer" : {
"my_pinyin" : {
"type" : "pinyin",
"keep_first_letter":false,
"keep_separate_first_letter" : false,
"keep_full_pinyin" : true,
"keep_original" : false,
"limit_first_letter_length" : 16,
"lowercase" : true
}
}
}
}
}
GET /medcl/folks/_search
{
"query": {"match_phrase": {
"name.pinyin": "刘德华"
}}
}
- option2
PUT /medcl/
{
"index" : {
"analysis" : {
"analyzer" : {
"pinyin_analyzer" : {
"tokenizer" : "my_pinyin"
}
},
"tokenizer" : {
"my_pinyin" : {
"type" : "pinyin",
"keep_first_letter":false,
"keep_separate_first_letter" : true,
"keep_full_pinyin" : false,
"keep_original" : false,
"limit_first_letter_length" : 16,
"lowercase" : true
}
}
}
}
}
POST /medcl/folks/andy
{"name":"刘德华"}
GET /medcl/folks/_search
{
"query": {"match_phrase": {
"name.pinyin": "刘德h"
}}
}
GET /medcl/folks/_search
{
"query": {"match_phrase": {
"name.pinyin": "刘dh"
}}
}
GET /medcl/folks/_search
{
"query": {"match_phrase": {
"name.pinyin": "dh"
}}
}
Powered by Waline v2.15.5