How to solve the data asymmetry between MySQL and Elasticsearch

2025-01-17 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

How do you solve data asymmetry between MySQL and Elasticsearch? This question comes up often in day-to-day work. The reference material below walks through one practical solution.

The Logstash jdbc-input-plugin can only append: it reads rows from the database and writes them incrementally to Elasticsearch. However, rows in the source database may also be deleted or updated, which leaves the database and the search-engine index out of sync. This article describes a workaround for that data asymmetry.

Of course, if you have a development team, you can write code that propagates delete and update operations to the search engine directly. If not, you can try the approach below.

Suppose there is a data table article whose mtime column is defined with ON UPDATE CURRENT_TIMESTAMP, so mtime is refreshed automatically every time a row changes.

```
mysql> desc article;
+-------------+---------------+------+-----+-------------------+-----------------------------+
| Field       | Type          | Null | Key | Default           | Extra                       |
+-------------+---------------+------+-----+-------------------+-----------------------------+
| id          | int(11)       | NO   |     | 0                 |                             |
| title       | mediumtext    | NO   |     | NULL              |                             |
| description | mediumtext    | YES  |     | NULL              |                             |
| author      | varchar       | YES  |     | NULL              |                             |
| source      | varchar       | YES  |     | NULL              |                             |
| content     | longtext      | YES  |     | NULL              |                             |
| status      | enum('Y','N') | NO   |     | N                 |                             |
| ctime       | timestamp     | NO   |     | CURRENT_TIMESTAMP |                             |
| mtime       | timestamp     | YES  |     | NULL              | ON UPDATE CURRENT_TIMESTAMP |
+-------------+---------------+------+-----+-------------------+-----------------------------+
9 rows in set (0.00 sec)
```
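To illustrate how this drives incremental sync, here is a hypothetical session (it assumes a row with id = 1 exists and an arbitrary example cutoff timestamp): any UPDATE refreshes mtime automatically, which is exactly what the incremental query relies on.

```sql
-- Hypothetical session: the mtime column is maintained by MySQL itself.
UPDATE article SET title = 'Revised title' WHERE id = 1;

-- The row's mtime is now the current timestamp, so the next incremental
-- pull (mtime > :sql_last_value) will pick this change up.
SELECT id, title, mtime FROM article WHERE mtime > '2025-01-01 00:00:00';
```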

Add an mtime-based query rule to the Logstash jdbc input, executed every minute:

```
jdbc {
    jdbc_driver_library => "/usr/share/java/mysql-connector-java.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/cms"
    jdbc_user => "cms"
    jdbc_password => "password"
    schedule => "* * * * *"    # cron expression: run every minute
    statement => "select * from article where mtime > :sql_last_value"
    use_column_value => true
    tracking_column => "mtime"
    tracking_column_type => "timestamp"
    record_last_run => true
    last_run_metadata_path => "/var/tmp/article-mtime.last"
}
```
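The snippet above covers only the input side. For completeness, a minimal matching output section might look like the sketch below; the index and type names are taken from the Java code later in the article, while the host address is an assumption to adjust for your deployment.

```
output {
    elasticsearch {
        hosts => ["localhost:9200"]    # assumed ES address
        index => "information"
        document_type => "article"
        document_id => "%{id}"         # reuse the MySQL primary key so updates overwrite in place
    }
}
```

Using the primary key as document_id makes repeated pulls of the same row idempotent: an updated row overwrites its existing document instead of creating a duplicate.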

Create a recycle-bin table. It records rows that are deleted from the database, or disabled with status = 'N'.

```sql
CREATE TABLE `elasticsearch_trash` (
    `id` int(11) NOT NULL,
    `ctime` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
```

Create triggers on the article table:

```sql
DELIMITER $$

CREATE DEFINER=`dba`@`%` TRIGGER `article_BEFORE_UPDATE` BEFORE UPDATE ON `article` FOR EACH ROW
BEGIN
    -- When an article's status changes to 'N', queue it for deletion
    -- from the search engine.
    IF NEW.status = 'N' THEN
        INSERT INTO elasticsearch_trash (id) VALUES (OLD.id);
    END IF;
    -- When the status changes back to 'Y' while the id is still in
    -- elasticsearch_trash, the document would be deleted by mistake,
    -- so remove the recycle-bin record.
    IF NEW.status = 'Y' THEN
        DELETE FROM elasticsearch_trash WHERE id = OLD.id;
    END IF;
END$$

CREATE DEFINER=`dba`@`%` TRIGGER `article_BEFORE_DELETE` BEFORE DELETE ON `article` FOR EACH ROW
BEGIN
    -- When an article row is deleted, record its id in the
    -- search-engine recycle bin.
    INSERT INTO elasticsearch_trash (id) VALUES (OLD.id);
END$$

DELIMITER ;
```

Next we need a simple shell script, run every minute, that extracts the ids from the elasticsearch_trash table and then uses the curl command to call the Elasticsearch RESTful interface to delete the recycled documents.
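The article does not show this script, so here is a minimal sketch. It assumes Elasticsearch at localhost:9200, the index/type names "information"/"article" used by the Java code later, and the MySQL credentials from the Logstash config; adjust all of these for your environment.

```shell
#!/bin/sh
# Hypothetical cleanup script -- a sketch, not the article's exact code.
# Run from cron every minute:  * * * * * /usr/local/bin/clean-es-trash.sh

ES_HOST="http://localhost:9200"
INDEX="information"
TYPE="article"

# Build the Elasticsearch RESTful delete URL for one article id.
delete_url() {
    echo "${ES_HOST}/${INDEX}/${TYPE}/${1}"
}

# Read pending ids from elasticsearch_trash, delete each document via curl,
# then drop the recycle-bin row so it is not processed twice.
clean_trash() {
    mysql -N -u cms -ppassword cms -e 'SELECT id FROM elasticsearch_trash' |
    while read -r id; do
        curl -s -X DELETE "$(delete_url "$id")" > /dev/null
        mysql -u cms -ppassword cms -e "DELETE FROM elasticsearch_trash WHERE id = $id"
    done
}
```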

You can also develop a dedicated program; here is an example using a Spring Boot scheduled task.

Entity

```java
package cn.netkiller.api.domain.elasticsearch;

import java.util.Date;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table
public class ElasticsearchTrash {
    @Id
    private int id;

    @Column(columnDefinition = "TIMESTAMP DEFAULT CURRENT_TIMESTAMP")
    private Date ctime;

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public Date getCtime() {
        return ctime;
    }

    public void setCtime(Date ctime) {
        this.ctime = ctime;
    }
}
```

Repository

```java
package cn.netkiller.api.repository.elasticsearch;

import org.springframework.data.repository.CrudRepository;

import cn.netkiller.api.domain.elasticsearch.ElasticsearchTrash;

public interface ElasticsearchTrashRepository extends CrudRepository<ElasticsearchTrash, Integer> {
}
```

Scheduled task

```java
package cn.netkiller.api.schedule;

import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.rest.RestStatus;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

import cn.netkiller.api.domain.elasticsearch.ElasticsearchTrash;
import cn.netkiller.api.repository.elasticsearch.ElasticsearchTrashRepository;

@Component
public class ScheduledTasks {
    private static final Logger logger = LoggerFactory.getLogger(ScheduledTasks.class);

    @Autowired
    private TransportClient client;

    @Autowired
    private ElasticsearchTrashRepository elasticsearchTrashRepository;

    public ScheduledTasks() {
    }

    @Scheduled(fixedRate = 1000 * 60) // run once per minute
    public void cleanTrash() {
        for (ElasticsearchTrash elasticsearchTrash : elasticsearchTrashRepository.findAll()) {
            DeleteResponse response = client.prepareDelete("information", "article", elasticsearchTrash.getId() + "").get();
            RestStatus status = response.status();
            logger.info("delete {} {}", elasticsearchTrash.getId(), status.toString());
            // Treat NOT_FOUND as success too: the document is already gone.
            if (status == RestStatus.OK || status == RestStatus.NOT_FOUND) {
                elasticsearchTrashRepository.delete(elasticsearchTrash);
            }
        }
    }
}
```

Finally, the Spring Boot main program:

```java
package cn.netkiller.api;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.scheduling.annotation.EnableScheduling;

@SpringBootApplication
@EnableScheduling
public class Application {
    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}
```

Thank you for reading! I hope this article has given you a general understanding of how to solve data asymmetry between MySQL and Elasticsearch.
