What is Elasticsearch?

2025-02-28 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article explains what Elasticsearch is. The explanation is kept simple and easy to follow, so work through it step by step.

(i) Introduction

Elasticsearch (ES) exists to make search work at scale. When the amount of data is small, we can search a relational database through its indexes, but as the data grows, that search becomes very slow, and a distributed search engine is needed. Elasticsearch is a search server built on Lucene. It provides a distributed, multi-tenant full-text search engine exposed through a RESTful web interface.

ES is used mainly for full-text search, structured search, and analytics. It is widely deployed: Wikipedia and GitHub, among others, use ES for search.

(ii) Core concepts

2.1 Data structure

Since ES is used for searching, it must also store data. In a relational database such as MySQL, data storage follows this logic:

A database has multiple tables, each table has multiple rows, and each row consists of multiple columns.

Storage in ES looks like this:

An index is equivalent to a database. Each index has multiple types (equivalent to tables), each type holds multiple documents (equivalent to rows), and each document consists of multiple fields (equivalent to columns).

You can think of ES as a document-oriented database. Its concepts map onto those of a relational database as follows:

Relational database -> ES
database -> index
table -> type
row -> document
column -> field

Note that types are deprecated in ES 7.x and removed entirely in 8.x.

2.2 Indexes and documents

An index in ES is not the same thing as an index in MySQL. An index in ES is a collection of documents and plays the role of a database.

As mentioned earlier, ES is document-oriented. The document is the most important unit in ES, and each document is one record. Two properties of documents matter most:

A document contains multiple key: value pairs.

A document is a JSON object.
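To make the two properties above concrete, here is a minimal Python sketch of a document as a set of key: value pairs serialized to JSON. The field names and values are made up for illustration and do not come from any real index.

```python
import json

# Hypothetical document: field names and values are illustrative only.
document = {
    "name": "javayz",
    "address": "hz",
    "age": 25,
}

# Serialize to the JSON text that would travel in a request body.
body = json.dumps(document)

# Parsing the JSON text gives back the same key: value pairs.
restored = json.loads(body)

print(body)
```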

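2.3 Shards

ES is a distributed search engine. Sharding splits the data of an index across multiple shards. A replica is a copy of a shard, and replicas can serve query requests just as the original shards do.

Now suppose the cluster has two nodes, the number of shards is set to 5, and the number of replicas is set to 1. ES then lays out the data so that each replica sits on a different node from the shard it copies.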
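As a rough sketch of the configuration just described, the snippet below builds the settings body that would accompany an index-creation request, using the number_of_shards and number_of_replicas settings. The helper names are hypothetical, and nothing is sent to a server.

```python
import json

def index_settings(primary_shards: int, replicas: int) -> dict:
    """Build the settings body for creating an index (helper name is hypothetical)."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
        }
    }

def total_shard_copies(primary_shards: int, replicas: int) -> int:
    """Each primary shard exists once plus once per replica."""
    return primary_shards * (1 + replicas)

settings = index_settings(5, 1)
copies = total_shard_copies(5, 1)

print(json.dumps(settings, indent=2))
print(copies)
```

With 5 shards and 1 replica, the cluster holds 10 shard copies in total, spread over the two nodes.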

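2.4 Inverted index

Part of the reason ES searches so quickly is the inverted index it uses. An inverted index maps each term produced by analysis to the documents that contain it.

In the original data, each document ID points to its tags, so answering a query means scanning every document. With an inverted index, a keyword leads directly to the documents that match it.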
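The idea can be sketched in a few lines of Python. This is a toy illustration of the concept, not how Lucene implements it, and the documents and tags are made up.

```python
from collections import defaultdict

# Hypothetical forward data: document ID -> tags (illustrative only).
documents = {
    1: ["java", "python"],
    2: ["python", "linux"],
    3: ["java", "linux", "elasticsearch"],
}

def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            index[term].add(doc_id)
    return index

inverted = build_inverted_index(documents)

def search(term):
    """Look the term up directly instead of scanning every document."""
    return sorted(inverted.get(term, set()))

print(search("python"))  # [1, 2]
print(search("linux"))   # [2, 3]
```

A lookup costs one dictionary access per term, no matter how many documents exist, which is the property the section describes.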

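(iii) Basic ES operations

ES is operated through a RESTful API, so it is easy to pick up for programmers used to writing CRUD code. You can send requests from Kibana or call ES directly with Postman, since in the end these are all plain REST calls. Here I call ES directly with the ES plugin for IDEA.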

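3.1 Creating a document

PUT http://ip:port/index_name/type_name/document_id

{
  "key": "value"
}

Because type names are removed in later versions, _doc can be used here as the default type:

PUT http://ip:port/index_name/_doc/document_id

After the PUT request creates the document, the corresponding data is visible in the head plugin.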
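The same PUT request can be assembled with the Python standard library. This is a sketch under assumptions: the address http://localhost:9200, the index name test_index, and the helper name are placeholders, and the request is only built, not sent.

```python
import json
import urllib.request

def build_put_document(base_url, index, doc_id, document):
    """Build (but do not send) a PUT request that creates or overwrites a document."""
    url = f"{base_url}/{index}/_doc/{doc_id}"
    data = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

# Assumed address and index name; replace with your own ip:port and index.
request = build_put_document("http://localhost:9200", "test_index", 1, {"name": "javayz"})

print(request.get_method(), request.full_url)
# To send it against a running node: urllib.request.urlopen(request)
```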

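3.2 Creating an index with data types

When the data was created in 3.1, no data types were specified. We can also declare data types for an index explicitly:

PUT http://ip:port/index_name

Example body:

{
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "address": {
        "type": "text"
      }
    }
  }
}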

The core data types in ES are:

(1) String types: text, keyword

(2) Numeric types: long, integer, short, byte, double, float, half_float, scaled_float

(3) Date: date

(4) Date with nanosecond precision: date_nanos

(5) Boolean: boolean

(6) Binary: binary

(7) Range: integer_range, float_range, long_range, double_range, date_range
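As an illustration of how these types fit into a mapping, the sketch below builds a mappings body from a field-to-type table and rejects any type that is not in the list above. The helper and the field names (city, age, birthday) are hypothetical.

```python
import json

# The core types listed above.
CORE_TYPES = {
    "text", "keyword",
    "long", "integer", "short", "byte", "double", "float", "half_float", "scaled_float",
    "date", "date_nanos", "boolean", "binary",
    "integer_range", "float_range", "long_range", "double_range", "date_range",
}

def build_mapping(fields: dict) -> dict:
    """Turn {field name: type} into a mappings body, rejecting unknown types."""
    unknown = sorted(t for t in fields.values() if t not in CORE_TYPES)
    if unknown:
        raise ValueError(f"not a core type: {unknown}")
    return {
        "mappings": {
            "properties": {name: {"type": t} for name, t in fields.items()}
        }
    }

# Hypothetical fields chosen only to show several types side by side.
mapping = build_mapping({
    "name": "text",
    "city": "keyword",
    "age": "integer",
    "birthday": "date",
})

print(json.dumps(mapping, indent=2))
```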

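3.3 Viewing index or document data

A GET request returns information about an index or a document:

GET http://ip:port/index_name # view the index

GET http://ip:port/index_name/type_name/document_id # view a document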

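3.4 Modifying data

Modifying data works like creating it. A PUT request overwrites the existing document:

PUT http://ip:port/index_name/type_name/document_id

{
  "key": "value"
}

When the request modifies an existing document, the _version in the response increases.

Another approach is a POST request:

POST http://ip:port/index_name/type_name/document_id/_update

Example body:

{
  "doc": {
    "name": "javayz4"
  }
}

This approach is recommended. A PUT replaces the whole document, so any key you forget to include is lost, whereas _update changes only the fields listed under doc.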
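The difference between the two approaches can be simulated in plain Python without a server. This is only a model of the behaviour for flat documents; the function names are hypothetical, and nested objects are not covered.

```python
def put_replace(stored: dict, new_doc: dict) -> dict:
    """Model of PUT: the stored document is replaced wholesale."""
    return dict(new_doc)

def post_update(stored: dict, partial: dict) -> dict:
    """Model of POST .../_update: only the fields under "doc" change."""
    merged = dict(stored)
    merged.update(partial)
    return merged

stored = {"name": "javayz", "address": "hz"}

after_put = put_replace(stored, {"name": "javayz4"})
after_update = post_update(stored, {"name": "javayz4"})

print(after_put)     # {'name': 'javayz4'}
print(after_update)  # {'name': 'javayz4', 'address': 'hz'}
```

The address field survives only in the second case, which is why the partial update is the safer choice when changing a single field.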

3.5 Deleting data

Data is deleted with a DELETE request:

DELETE http://ip:port/index_name/type_name/document_id # delete a specific document

DELETE http://ip:port/index_name # delete the index

(iv) ES search operations

The most important feature of ES is search.

4.1 Simple search

Put the search parameters directly in the URL:

GET http://ip:port/index_name/_search?q=key:value
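The URL form above can be assembled safely with the standard library, which takes care of percent-encoding the q parameter. The address, index name, and helper name are assumptions for illustration, and the URL is only built, not requested.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def build_uri_search(base_url, index, field, value):
    """Build the URL for a q=field:value search; urlencode escapes special characters."""
    query = urlencode({"q": f"{field}:{value}"})
    return f"{base_url}/{index}/_search?{query}"

# Assumed address and index name.
url = build_uri_search("http://localhost:9200", "test_index", "name", "javayz2")

# Decoding the query string recovers the original key:value expression.
decoded = parse_qs(urlsplit(url).query)

print(url)
print(decoded)
```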

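4.2 Passing parameters in the request body

Besides putting parameters in the URL, you can pass them as a JSON request body. from and size control pagination, query carries the search conditions, and _source lists the fields to return. Omit _source to return all fields.

GET http://ip:port/index_name/_search

Example body:

{
  "from": 0,
  "size": 20,
  "query": {
    "match": {
      "name": "javayz2"
    }
  },
  "_source": ["name", "address"]
}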
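Since from is an offset rather than a page number, a small helper can translate one into the other. The sketch below builds the same kind of body as the example above; the function name and its parameters are hypothetical.

```python
import json

def build_search_body(field, text, page=1, page_size=20, source_fields=None):
    """Build a match query body, converting a 1-based page number into from/size."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    body = {
        "from": (page - 1) * page_size,  # offset of the first hit on this page
        "size": page_size,
        "query": {"match": {field: text}},
    }
    if source_fields:
        body["_source"] = list(source_fields)  # omit to return every field
    return body

first_page = build_search_body("name", "javayz2", source_fields=["name", "address"])
third_page = build_search_body("name", "javayz2", page=3)

print(json.dumps(first_page, indent=2))
```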

Beyond the parameters in the example above, many more are available, such as sorting:

"sort": [
  {
    "age": {
      "order": "desc"
    }
  }
]

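Multi-condition queries: must means that all of the listed conditions have to match. You can also use should, which means that matching any one of the conditions is enough, or must_not, the inverse of must, which means that none of the listed conditions may match.

"query": {
  "bool": {
    "must": [
      {
        "match": {
          "name": "javayz"
        }
      },
      {
        "match": {
          "address": "hz"
        }
      }
    ]
  }
}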
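The three clause types can be combined programmatically. The following sketch composes a bool query from match conditions; the helper names are hypothetical, and the must_not value is made up for illustration.

```python
import json

def match(field, text):
    """A single match condition."""
    return {"match": {field: text}}

def bool_query(must=(), should=(), must_not=()):
    """Combine conditions; empty clause lists are left out of the body."""
    clauses = {}
    if must:
        clauses["must"] = list(must)          # every condition has to match
    if should:
        clauses["should"] = list(should)      # any one condition is enough
    if must_not:
        clauses["must_not"] = list(must_not)  # no condition may match
    return {"query": {"bool": clauses}}

body = bool_query(
    must=[match("name", "javayz"), match("address", "hz")],
    must_not=[match("address", "sh")],  # hypothetical value
)

print(json.dumps(body, indent=2))
```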

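If a field holds a collection of values, you can query for several of them at once by separating the terms with spaces.

Queries also support highlighting:

"highlight": {
  "pre_tags": "",
  "post_tags": "",
  "fields": {
    "name": {}
  }
}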

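(v) Analyzers

An analyzer splits a passage of text into individual terms, and searches then match against those terms. For Chinese, the IK analyzer is a good choice.

Basic usage

Download link: https://github.com/medcl/elasticsearch-analysis-ik/releases

Download the release that matches your ES version, create an ik folder under the plugins directory, unzip the download into it, and restart ES.

The IK analyzer provides two modes:

1. ik_smart: coarsest segmentation (fewest tokens)

2. ik_max_word: finest-grained segmentation

ik_smart produces the fewest tokens the dictionary allows.

ik_max_word segments at the finest granularity and produces the most tokens: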

{
  "analyzer": "ik_max_word",
  "text": "我是Java工程师"
}

Result:

{
  "tokens": [
    {
      "token": "我",
      "start_offset": 0,
      "end_offset": 1,
      "type": "CN_CHAR",
      "position": 0
    },
    {
      "token": "是",
      "start_offset": 1,
      "end_offset": 2,
      "type": "CN_CHAR",
      "position": 1
    },
    {
      "token": "java",
      "start_offset": 2,
      "end_offset": 6,
      "type": "ENGLISH",
      "position": 2
    },
    {
      "token": "工程师",
      "start_offset": 6,
      "end_offset": 9,
      "type": "CN_WORD",
      "position": 3
    },
    {
      "token": "工程",
      "start_offset": 6,
      "end_offset": 8,
      "type": "CN_WORD",
      "position": 4
    },
    {
      "token": "师",
      "start_offset": 8,
      "end_offset": 9,
      "type": "CN_CHAR",
      "position": 5
    }
  ]
}
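A request body like the one above is typically sent to the _analyze endpoint. The sketch below builds such a request and pulls the token strings out of a response shaped like the result shown. The address http://localhost:9200 and the helper names are assumptions, the sample response is abbreviated from the result above, and no request is sent.

```python
import json
import urllib.request

def build_analyze_request(base_url, analyzer, text):
    """Build (but do not send) a request to the _analyze endpoint."""
    data = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/_analyze",
        data=data,
        method="POST",
        headers={"Content-Type": "application/json"},
    )

def token_texts(response: dict) -> list:
    """Extract just the token strings, in order, from an _analyze response."""
    return [entry["token"] for entry in response["tokens"]]

# Assumed address; ik_max_word requires the IK plugin to be installed.
request = build_analyze_request("http://localhost:9200", "ik_max_word", "我是Java工程师")

# Abbreviated copy of the result shown above (first three tokens only).
sample_response = {
    "tokens": [
        {"token": "我", "start_offset": 0, "end_offset": 1, "type": "CN_CHAR", "position": 0},
        {"token": "是", "start_offset": 1, "end_offset": 2, "type": "CN_CHAR", "position": 1},
        {"token": "java", "start_offset": 2, "end_offset": 6, "type": "ENGLISH", "position": 2},
    ]
}

tokens = token_texts(sample_response)
print(request.get_method(), request.full_url)
print(tokens)
```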

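Thank you for reading. That concludes "What is Elasticsearch". After working through this article, you should have a clearer picture of what Elasticsearch is. How it behaves in practice is something to verify through hands-on use.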
