What does a SQL solution for HBase look like? Many newcomers are unclear about this, so in this post I walk through my own attempt in detail; if you have the same need, I hope you get something out of it.
My recent requirement was to query HBase data with SQL, the way you would with MySQL or SQL Server. Not only does that match most people's habits better than HBase's native query language, it also opens the door to frameworks like MyBatis and Hibernate, so there is much less to hand-roll. I searched the Internet for "hbase sql query", and Phoenix had by far the most tutorial posts. Anything used by that many people must have its strengths, and its drawbacks are unlikely to be unbearable, so I decided to try Phoenix first and see how it feels.
First of all, installing Phoenix is not difficult: download the Phoenix build matching your HBase version, unpack it, put the corresponding jar into the lib directory of every HBase RegionServer, and restart HBase. My Hadoop is CDH 6.3.0, and I had no difficulty installing. Afterwards, go to the bin directory of the Phoenix installation and run sqlline.py to enter the Phoenix shell.
If the installation succeeded, try some basic Phoenix operations such as !tables and CREATE TABLE. Everything worked, and the usage is close to MySQL. The big difference is that both inserts and updates go through UPSERT, which matches HBase's put semantics.
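A minimal sqlline session sketch of those basics, assuming a fresh install; the table name and values are illustrative, not from the original post:

-- !tables is a sqlline meta-command that lists the known tables
!tables
CREATE TABLE demo_table (id INTEGER NOT NULL PRIMARY KEY, name VARCHAR);
UPSERT INTO demo_table VALUES (1, 'first');
-- upserting the same key again acts as an update, not a duplicate-key error
UPSERT INTO demo_table VALUES (1, 'renamed');
SELECT * FROM demo_table;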
Using it on Linux was no problem, so the next step was connecting through the Java API. The connection is straightforward: as with other components, configure the connection information and connect. There are plenty of online tutorials for this, so I won't just repeat them; here is one picked at random: "[12] Phoenix Java API operation". With so much homogeneous content around, I should at least say something of my own. It is worth mentioning that Phoenix is connected over JDBC, and the configuration looks just like a MySQL connection, which is obviously very friendly. Following the tutorial everything worked and I was quite pleased: it deserves to be the middleware so many tutorials recommend, very comfortable to use (apart from having to add two Phoenix jars by hand; I don't know whether the pom can be set up to pull them in automatically, but that is a trivial issue). Even a rookie like me could use it easily. However, the tables in the tutorial are all created in the default namespace. What if I want to create them in a specific namespace?
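For reference, a minimal connection sketch, assuming the standard Phoenix thick client; the ZooKeeper address and the query are placeholders of mine, not taken from any tutorial:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PhoenixJdbcDemo {
    public static void main(String[] args) throws Exception {
        // The URL points at the ZooKeeper quorum HBase uses, not at a database
        // server; "zk-host:2181" is a placeholder for your own quorum.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
             Statement stat = conn.createStatement();
             // SYSTEM.CATALOG always exists once Phoenix is installed
             ResultSet rs = stat.executeQuery("select TABLE_NAME from SYSTEM.CATALOG limit 5")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}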
Out of old habit, I simply changed the table name to namespace.table, but an error was reported saying the table namespace.table could not be found. Was the "." the problem? I changed it to ":", but the error remained; the message merely changed from namespace.table to namespace:table, so the error type was identical and the problem lay elsewhere.
Searching the Internet for "phoenix specify namespace" turned up a large number of posts all saying the same thing: to use namespaces, Phoenix needs namespace mapping, which requires additional configuration. I should note that the tutorials are not all identical; some configure more, some less, and in my experience more configuration is less likely to cause problems. Here is a fully configured post: "HBase-Phoenix installation and use tutorial 3 (enable, operate, turn off SCHEMA)". While reading the various posts on enabling Phoenix namespace mapping, I saw some mention that the feature not only failed to work properly for them but also broke functionality that had been fine before. The post above also covers how to turn the mapping off again; I never ran into that problem myself, but I point it out just in case.
The post I followed at the time was for the CDH version, but I can no longer find it. Its configuration was done through the web UI and was fairly extensive; I will show the result after it is done. The steps are basically the same as the post above, only the method differs.
Configure the settings on both the client and the server, and also write the two properties into the hbase-site.xml under the bin directory of Phoenix.
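For the record, here is how I assume the two properties (named again in the Java workaround below) look in hbase-site.xml; they must be set identically on the HBase servers and the Phoenix client:

<property>
  <name>phoenix.schema.isNamespaceMappingEnabled</name>
  <value>true</value>
</property>
<property>
  <name>phoenix.schema.mapSystemTablesToNamespace</name>
  <value>true</value>
</property>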
Then restart HBase, enter Phoenix from the shell, and you can use CREATE SCHEMA to create namespaces.
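For example (the schema name is the one used later in this post):

CREATE SCHEMA IF NOT EXISTS ENTERPRISE_INFORMATION;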
Trying the Java API connection again, I first replaced the previously downloaded hbase-site.xml with the current one and reconnected. The error now concerned exactly the new Phoenix configuration I had added; it looked as if the configuration was not being picked up, and I could not track down why for a long time. As a workaround, I added the configuration manually when creating the Phoenix connection:
Properties props = new Properties();
// set the namespace-mapping switches on the client, matching hbase-site.xml
props.setProperty("phoenix.schema.isNamespaceMappingEnabled", "true");
props.setProperty("phoenix.schema.mapSystemTablesToNamespace", "true");
// pass the properties when creating the connection ("zk-host:2181" is a placeholder)
Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181", props);
Connections created this way work fine. You can query tables under multiple namespaces directly, which is effectively a cross-database query. One thing to note: when you create a table under a namespace from the Phoenix shell, you must first run USE schema before you can create the table, but with the Java API there is no such fuss; just CREATE TABLE schema.table directly. In fact, if you do it the shell way over the API (USE schema first, then CREATE TABLE), it throws an error. I did not see any tutorial mention this, which makes it all the more worth writing down.
For example:
// rs, stat and conn are fields initialized elsewhere in the original class;
// resultSetToJosnArray is the author's helper that converts a ResultSet to a JSONArray
public static void selectTable() throws SQLException {
    String sql = "select * from ENTERPRISE_INFORMATION.TEST_PHOENIX_API a "
               + "left join ENTERPRISE_INFORMATION.TEST_PHOENIX_API2 b on a.mykey = b.mykey";
    rs = stat.executeQuery(sql);
    JSONArray array = resultSetToJosnArray(rs);
    System.out.println(array.toJSONString());
}

public static void testCreateTable() throws SQLException {
    String sql = "create table ENTERPRISE_INFORMATION.test_phoenix_api2 "
               + "(mykey integer not null primary key, mycolumn varchar)";
    stat.executeUpdate(sql);
    conn.commit();
}

public static void upsert() throws SQLException {
    // the values of sql2 and sql3 were lost in the original post;
    // (2, 'test5') and (3, 'test6') are placeholders
    String sql1 = "upsert into ENTERPRISE_INFORMATION.test_phoenix_api2 values (1, 'test4')";
    String sql2 = "upsert into ENTERPRISE_INFORMATION.test_phoenix_api2 values (2, 'test5')";
    String sql3 = "upsert into ENTERPRISE_INFORMATION.test_phoenix_api2 values (3, 'test6')";
    stat.executeUpdate(sql1);
    stat.executeUpdate(sql2);
    stat.executeUpdate(sql3);
    conn.commit();
}

public static void delete() throws SQLException {
    String sql1 = "delete from ENTERPRISE_INFORMATION.test_phoenix_api where mykey = 1";
    stat.executeUpdate(sql1);
    conn.commit();
}

public static void deleteTable() throws SQLException {
    String sql = "drop table ENTERPRISE_INFORMATION.test_phoenix_api";
    stat.executeUpdate(sql);
    conn.commit();
}
Once the connection is created, table names can be written directly as schema.table, and without double quotes both upper and lower case are folded to upper case. Use rs = stat.executeQuery(sql) for SELECT, and stat.executeUpdate(sql) for inserts, deletes and updates; using executeQuery for the latter may throw an error. Also note that Phoenix delimits string literals with single quotes, not double quotes; double quotes cause an error and are not recognized as strings, whether the statement runs in the shell or through the Java API. For example, in my sql above:
"upsert into ENTERPRISE_INFORMATION.test_phoenix_api2 values (1)
It is important to note that test4 is wrapped in single quotation marks.
For a graphical interface to Phoenix I used SQuirreL 3.7.1. Configuration is fairly simple: add the Phoenix driver, then fill in the Phoenix JDBC URL. It works reasonably well and can connect to several databases at once. But after I enabled Phoenix namespace mapping, it could no longer connect; the error was the same one I had seen earlier over the Java API connection, the missing phoenix.schema configuration. I could not solve it. I really could not find anything wrong with the configuration, and none of the many tutorials online helped, so in the end I gave up on SQuirreL and went back to operating Phoenix from the shell.
All of the above are minor problems, still within what I can accept. The next step was mapping Phoenix onto existing HBase tables. My understanding of Phoenix was this: in use it behaves like MySQL, with a schema mapped to a namespace; if the table does not yet exist in HBase, Phoenix creates a new HBase table and maps the two together, and my operations on the Phoenix table are translated into HBase operations; if a table with the same name already exists in HBase, Phoenix just creates the mapping, and operations work the same way. Better still, if the HBase table name is already upper case, I do not have to decorate it with double quotes, which makes the SQL much easier to write.
And in fact I was not wrong: once a table of the same name is mapped, you can modify the HBase data through Phoenix. But when I modified the existing data through Phoenix, I noticed something strange. Phoenix could not see the data that already existed in HBase, yet deleting the Phoenix table also deleted the HBase table, so the mapping itself had clearly succeeded. Why, then, was the data visible in HBase but not in Phoenix? When I then used Phoenix to add data to the mapped table and looked at it from the HBase side, things got stranger still:
HBase creates the table (importing a MySQL table into HBase through sqoop):

sqoop import \
  --connect jdbc:mysql://192.168.49.201:3306/envir \
  --username root \
  --password Hskj123456! \
  --table hs_apply_detailinfo_r \
  --hbase-table ENVIR:HS_APPLY_DETAILINFO_R \
  -m 1 \
  --column-family 0 \
  --hbase-create-table \
  --hbase-row-key ID
Phoenix creates a table:
CREATE TABLE envir.hs_apply_detailinfo_r (ID varchar primary key, APPLY_TYPE varchar, APPLY_ID varchar, MAT_CD varchar, MAT_NUM double, PRODUCE_TIME date, OUTDATE_TIME date, SUP_CD varchar)
Phoenix modifies data:
upsert into envir.hs_apply_detailinfo_r (ID, APPLY_ID) values ('10', '2')
Looking at envir.hs_apply_detailinfo_r in Phoenix, I could see the newly added row, and only that row; none of the data imported into HBase was visible. Then I ran a scan in HBase and found that, instead of modifying the 0:APPLY_ID cell of the existing row with rowkey 10, Phoenix had added a new cell whose column qualifier was a hexadecimal code rather than the APPLY_ID I typed in Phoenix, and whose value was a string of hexadecimal bytes rather than the string '2' I entered. In Hue I likewise could not view the data of the Phoenix-mapped table.
This is a serious problem, one that directly decides whether I keep using Phoenix, and at first I did not really understand what had gone wrong. After much searching I learned roughly this: Phoenix processes characters differently from HBase when storing them, and if Phoenix table fields are to map onto existing HBase table fields, you must add COLUMN_ENCODED_BYTES=0 to the table definition and adjust the field types, using the UNSIGNED_ variants wherever possible. Two posts that describe this:
"Phoenix maps an already-existing HBase table, data cannot be queried"
"Data type problems in Phoenix mapping HBase"
So the Phoenix table creation should be changed to:
CREATE TABLE envir.hs_apply_detailinfo_r (
  "ID" varchar primary key,
  "info"."APPLY_TYPE" VARBINARY,
  "info"."APPLY_ID" VARBINARY,
  "info"."MAT_CD" varchar,
  "info"."MAT_NUM" UNSIGNED_DOUBLE,
  "info"."PRODUCE_TIME" UNSIGNED_DATE,
  "info"."OUTDATE_TIME" UNSIGNED_DATE,
  "info"."SUP_CD" varchar
) COLUMN_ENCODED_BYTES=0;
I changed the column family to info, so the --column-family option in the sqoop import above should also be changed to info. The 0 there was Phoenix's default column family, used when no column family is specified.
After this change, the field-mapping problem was solved: I could modify the value of the info:APPLY_TYPE cell of the row with rowkey 10. But the value problem remained unsolved: the value I entered in Phoenix was '2', and what showed up in HBase was still a hexadecimal code. This one left me rather at a loss; it is too low-level. I read the data type documentation on the Phoenix site (http://phoenix.apache.org/language/datatypes.html) and found no solution there. Other netizens have hit a similar problem, and their fix is: when writing from the HBase side, first encode the value with Bytes.toBytes() into the form Phoenix expects, so that data inserted through HBase can then be read through Phoenix.
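To make the mismatch concrete, here is a small sketch of why raw HBase bytes and Phoenix disagree on numbers but not on strings; this is my own illustration based on the Phoenix datatypes page, not code from the original post:

import org.apache.hadoop.hbase.util.Bytes;

public class EncodingDemo {
    public static void main(String[] args) {
        // Strings are safe: Phoenix VARCHAR is plain UTF-8,
        // identical to what Bytes.toBytes(String) produces.
        byte[] varcharValue = Bytes.toBytes("2");
        System.out.println(Bytes.toStringBinary(varcharValue)); // prints: 2

        // Numbers are not: Bytes.toBytes(int) writes plain big-endian
        // two's complement, while Phoenix INTEGER flips the sign bit so
        // that byte order matches numeric order. Only the UNSIGNED_*
        // Phoenix types match Bytes.toBytes() directly, which is why the
        // mapped table above uses UNSIGNED_DOUBLE and UNSIGNED_DATE.
        byte[] rawInt = Bytes.toBytes(10);
        byte[] phoenixInt = rawInt.clone();
        phoenixInt[0] ^= (byte) 0x80; // what Phoenix INTEGER expects on disk
        System.out.println(Bytes.toStringBinary(rawInt));     // \x00\x00\x00\x0A
        System.out.println(Bytes.toStringBinary(phoenixInt)); // \x80\x00\x00\x0A
    }
}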
In addition, Phoenix has a problem storing HBase ints, reporting that the field length is insufficient. Take the HBase and Phoenix tables above: if I create the Phoenix table first and then import data into HBase through sqoop, I get the field-length error (see the post "the pit encountered by hbase-presto-phoenix"). So if I want to use Phoenix, I have to give up a great deal: data can no longer be imported with sqoop, hand-written map/reduce jobs that query or modify the data are out, and everything must go through Phoenix or it will not be recognized correctly. Some solutions are written up in "a summary of migrating data from mysql to HBase+Phoenix", but I could not follow them; it is all too difficult and troublesome for me. Rather than keep going down this road, it seems better to switch to a middleware that actually fits the requirement.
A colleague of mine used antdb (see the antdb official website). I skimmed how it is used and it feels similar to Phoenix: it creates a virtual MySQL that maps HBase tables, so users can modify HBase data exactly as they would MySQL. But I could not set a primary key, I do not know whether it has Phoenix-like features, and a RegionServer was stopped outright while I was using it, for reasons unknown. It feels too immature, and there are no tutorial posts to teach me how to use it. So, as an alternative, let's look next at how to use simplehbase.