ES Source Code Learning: The Implementation Logic of the Get API

2025-04-28 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

When the ES project on GitHub describes its ease of use, it uses the Get API to illustrate how ES works out of the box. The excerpt is as follows:

    # add document
    curl -XPUT 'http://localhost:9200/twitter/doc/1?pretty' -H 'Content-Type: application/json' -d '
    {
        "user": "kimchy",
        "post_date": "2009-11-15T13:12:00",
        "message": "Trying out Elasticsearch, so far so good?"
    }'

    # read document
    curl -XGET 'http://localhost:9200/twitter/doc/1?pretty=true'

The common uses of the Get API are as follows:

1. Check whether an added document matches expectations, which is very useful when troubleshooting.

2. Fetch the full details of a document by id, which is what the fetch phase of a search does.

The Get API is an excellent starting point for studying the internal mechanisms of ES. Through the Get API, you can learn the following:

A. The REST API implementation of ES.

B. The document routing method of ES.

C. The RPC implementation mechanism of ES.

D. The translog of ES.

E. How ES uses Lucene's IndexSearcher.

F. How ES gets the Lucene doc_id from a document id.

G. How ES gets the document details from the Lucene doc_id.

Studying the internal mechanisms of ES helps you unlock its full power. For example, when you develop an ES plugin for your business, the internal request flow is a good reference. The more internal details you know, the less likely you are to fall into a pitfall.

The core flow of the Get API is as follows:

S1: receive client request

The call to controller.registerHandler() makes it easy to see that this class handles an HTTP request:

    public class RestGetAction extends BaseRestHandler {

        @Inject
        public RestGetAction(Settings settings, RestController controller, Client client) {
            super(settings, controller, client);
            // Bind GET /{index}/{type}/{id} to this handler.
            controller.registerHandler(GET, "/{index}/{type}/{id}", this);
        }

        @Override
        public void handleRequest(final RestRequest request, final RestChannel channel, final Client client) {
            ...
            client.get(getRequest, new RestBuilderListener(channel) {...});
        }
    }
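One way to picture what registerHandler sets up is a table from (HTTP method, path pattern) to handler. The sketch below is a toy model, not ES code: MiniRestController, its function-based handlers, and the matching logic are all hypothetical names and simplifications, and query strings such as ?pretty are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical, simplified stand-in for a REST controller; not the ES class. */
class MiniRestController {
    // Key: "METHOD pattern", e.g. "GET /{index}/{type}/{id}".
    private final Map<String, Function<Map<String, String>, String>> handlers = new LinkedHashMap<>();

    void registerHandler(String method, String pattern, Function<Map<String, String>, String> handler) {
        handlers.put(method + " " + pattern, handler);
    }

    /** Finds the first registered pattern that matches the path and invokes its handler. */
    String dispatch(String method, String path) {
        for (Map.Entry<String, Function<Map<String, String>, String>> e : handlers.entrySet()) {
            String[] key = e.getKey().split(" ", 2);
            if (!key[0].equals(method)) {
                continue;
            }
            Map<String, String> params = match(key[1], path);
            if (params != null) {
                return e.getValue().apply(params);
            }
        }
        return "no handler for " + method + " " + path;
    }

    /** Returns the values of the {placeholders}, or null if the path does not match. */
    private static Map<String, String> match(String pattern, String path) {
        String[] p = pattern.split("/");
        String[] s = path.split("/");
        if (p.length != s.length) {
            return null;
        }
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 0; i < p.length; i++) {
            if (p[i].startsWith("{") && p[i].endsWith("}")) {
                params.put(p[i].substring(1, p[i].length() - 1), s[i]);
            } else if (!p[i].equals(s[i])) {
                return null;
            }
        }
        return params;
    }
}
```

With a handler registered for GET and "/{index}/{type}/{id}", dispatching "/twitter/doc/1" hands the handler the parameters index=twitter, type=doc and id=1, which is the information a get request needs.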

S2: execute the request on the current node

    public class NodeClient extends AbstractClient {
        ...
        @Override
        public void doExecute(Action action, Request request, ActionListener listener) {
            // Look up the transport-level implementation registered for this action.
            TransportAction transportAction = actions.get(action);
            ...
            transportAction.execute(request, listener);
        }
    }

This relies on an implicit mapping table from actions to their transport implementations, which is registered as follows:

    public class ActionModule extends AbstractModule {
        ...
        @Override
        protected void configure() {
            ...
            registerAction(GetAction.INSTANCE, TransportGetAction.class);
            ...
        }
    }
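The mapping table idea can be sketched in a few lines. This is only an illustration under simplifying assumptions: MiniNodeClient is a hypothetical class, actions are keyed by a plain string, and requests and responses are strings rather than typed objects with listeners.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of an action registry; not the ES NodeClient. */
class MiniNodeClient {
    // Action name -> the code that actually executes it.
    private final Map<String, Function<String, String>> actions = new HashMap<>();

    void registerAction(String actionName, Function<String, String> transportAction) {
        actions.put(actionName, transportAction);
    }

    /** Looks up the implementation for the action and runs it on the current node. */
    String execute(String actionName, String request) {
        Function<String, String> transportAction = actions.get(actionName);
        if (transportAction == null) {
            throw new IllegalStateException("no transport action registered for " + actionName);
        }
        return transportAction.apply(request);
    }
}
```

The caller only names the action; which implementation runs is decided by what was registered at startup.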

S3: locate the shard where the document resides

Locating a document is simple in principle. By default, a hash function is applied to the document id to compute the ShardId of the shard that holds the document, and the ShardId is then used to find the NodeId. ES maintains a routing-table object internally, in a class named RoutingTable. Through RoutingTable, all shards of an index can be found by index name, and the cluster Node hosting a shard can be found by shard id. From the application's point of view, document location involves two concepts: routing and preference.

    public class TransportGetAction extends TransportSingleShardAction {
        ...
        @Override
        protected ShardIterator shards(ClusterState state, InternalRequest request) {
            return clusterService.operationRouting()
                    .getShards(clusterService.state(), request.concreteIndex(),
                            request.request().type(), request.request().id(),
                            request.request().routing(), request.request().preference());
        }
    }
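The hash-then-lookup idea can be illustrated as follows. This is a sketch under stated assumptions, not the ES algorithm: String.hashCode() stands in for the hash function ES actually uses, the routing table is reduced to a list indexed by shard id, preference is ignored, and MiniRouting is a hypothetical name.

```java
import java.util.List;

/** Hypothetical sketch of id-based shard routing; not the ES OperationRouting. */
class MiniRouting {

    /**
     * Picks a shard for a document. If a custom routing value is given it
     * replaces the id as the hash input, so documents that share a routing
     * value land on the same shard.
     */
    static int shardId(String id, String routing, int numberOfShards) {
        String effectiveRouting = (routing != null) ? routing : id;
        // Stand-in hash (assumption). floorMod keeps the result non-negative.
        return Math.floorMod(effectiveRouting.hashCode(), numberOfShards);
    }

    /** Toy routing table: position i holds the id of the node hosting shard i. */
    static String nodeFor(int shardId, List<String> shardToNode) {
        return shardToNode.get(shardId);
    }
}
```

Because the shard is a pure function of the routing value and the shard count, any node can compute it locally without asking another node, which is what lets every node accept a get request.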

S4: forward the request to the node where the shard is located

Dispatching the request involves the RPC communication of ES. The previous step located the NodeId; this step sends the request to that node. Because every ES Node runs the same code, each Node acts as both Server and Client, which differs from other RPC frameworks. The core methods are transportService.sendRequest() and messageReceived().

    public abstract class TransportSingleShardAction extends TransportAction {

        class AsyncSingleAction {
            public void start() {
                // Client side: send the request to the target node.
                transportService.sendRequest(clusterService.localNode(), transportShardAction,
                        internalRequest.request(), new BaseTransportResponseHandler() {...});
            }
        }

        private class ShardTransportHandler extends TransportRequestHandler {
            @Override
            public void messageReceived(final Request request, final TransportChannel channel) throws Exception {
                ...
                // Server side: run the operation on the local shard and reply.
                Response response = shardOperation(request, request.internalShardId);
                channel.sendResponse(response);
            }
        }
    }
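The "every node is both server and client" point can be sketched with a single class that plays both roles. This is a toy model and not the ES transport layer: MiniNode is hypothetical, a direct method call stands in for the network, and the call is synchronous where ES uses asynchronous response handlers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical node that can both send and receive requests; not ES code. */
class MiniNode {
    final String nodeId;
    // Action name -> handler run when a request for that action arrives.
    private final Map<String, Function<String, String>> requestHandlers = new HashMap<>();

    MiniNode(String nodeId) {
        this.nodeId = nodeId;
    }

    /** Server role: declare how this node answers a given action. */
    void registerRequestHandler(String action, Function<String, String> handler) {
        requestHandlers.put(action, handler);
    }

    /** Server role: called when a request is delivered to this node. */
    String messageReceived(String action, String request) {
        Function<String, String> handler = requestHandlers.get(action);
        if (handler == null) {
            throw new IllegalStateException(nodeId + " has no handler for " + action);
        }
        return handler.apply(request);
    }

    /** Client role: send a request to a target node (here, a plain method call). */
    String sendRequest(MiniNode target, String action, String request) {
        return target.messageReceived(action, request);
    }
}
```

Since both roles live in the same class, the node that received the HTTP request can forward to any peer, and the peer could forward back the same way; sending to itself also works, which covers the case where the shard is local.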

S5: read the index file through id to get the document information corresponding to the id.

There are two phases here.

Step 1: merge type and id into a single field and use it to locate the Lucene doc_id in Lucene's inverted index.

Step 2: fetch the document details from the stored (forward) data by doc_id.

    public final class ShardGetService extends AbstractIndexShardComponent {
        ...
        private GetResult innerGet(String type, String id, String[] gFields, boolean realtime, long version,
                VersionType versionType, FetchSourceContext fetchSourceContext, boolean ignoreErrorsOnGeneratedFields) {
            // Excerpt: the declarations of get, typeX and docMapper are omitted.
            fetchSourceContext = normalizeFetchSourceContent(fetchSourceContext, gFields);
            // Step 1: look up the document by its uid term (type and id combined).
            get = indexShard.get(new Engine.Get(realtime, new Term(UidFieldMapper.NAME, Uid.createUidAsBytes(typeX, id)))
                    .version(version).versionType(versionType));
            // Step 2: load the requested fields from the stored fields.
            return innerGetLoadFromStoredFields(type, id, gFields, fetchSourceContext, get, docMapper, ignoreErrorsOnGeneratedFields);
        }
    }

(Note: if realtime=true, the source is read from the translog first; only if it is not found there is it read from the index.)

S5 involves the internal implementation of Lucene, so I won't go into it here.
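The read order can be sketched with three maps. This is a deliberately rough model and every name in it is hypothetical: MiniShard is not an ES class, the uid is assumed to be type and id joined by "#", the translog is reduced to a map of operations not yet visible in the index, and refresh() simply moves them into the "index".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical toy shard showing the realtime get order; not ES or Lucene code. */
class MiniShard {
    private final Map<String, String> translog = new HashMap<>();      // uid -> source, not yet searchable
    private final Map<String, Integer> uidToDocId = new HashMap<>();   // stands in for the inverted index
    private final Map<Integer, String> storedFields = new HashMap<>(); // doc_id -> source
    private int nextDocId = 0;

    /** Assumption: type and id are merged into one field value. */
    static String uid(String type, String id) {
        return type + "#" + id;
    }

    void index(String type, String id, String source) {
        translog.put(uid(type, id), source);
    }

    /** Makes pending operations visible in the "index" part of the model. */
    void refresh() {
        for (Map.Entry<String, String> e : translog.entrySet()) {
            int docId = nextDocId++;
            uidToDocId.put(e.getKey(), docId);
            storedFields.put(docId, e.getValue());
        }
        translog.clear();
    }

    String get(String type, String id, boolean realtime) {
        String uid = uid(type, id);
        // Realtime: try the translog first.
        if (realtime && translog.containsKey(uid)) {
            return translog.get(uid);
        }
        Integer docId = uidToDocId.get(uid);                   // step 1: uid -> doc_id
        return docId == null ? null : storedFields.get(docId); // step 2: doc_id -> details
    }
}
```

In this model a document indexed but not yet refreshed is returned by a realtime get and missed by a non-realtime one, which is the behavior the note describes.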

Finally, let's sum up:

The Get API is a feature that runs through the whole internal flow of ES. Functionally, it is simple enough; in terms of implementation, it ties together the main flow of ES. Taken as a starting point, it neither stays on the surface like RestMainAction, which displays "You Know, for Search", nor is it as complicated as the implementation of the search API.
