In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-15 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >
Share
Shulou(Shulou.com)06/03 Report--
Good programmer big data learning route hive table query
1. join query
1. Small result sets always drive large result sets (small tables drive large tables, small tables are placed on the left table). Try not to use join, but join is difficult to avoid.
left join, left outer join, left semi join
Hive supports left join since version 0.8
Left join and left outer join have similar effects.
The on in the join of hive can only be connected with the equivalent "=", not with
< >= 90;
5. sort
sort by: sort, partial sort, can only guarantee the result sorting of a single reducer. order by: sort, global sort. Ensure that the results of the entire job are sorted. When there is only one reducer, sort by and order by have the same effect. It is recommended to use sort by usual and: desc asc . (default ascending order)
##Query student average scores sorted in descending order
select
avg(grade) avg_score
from
student join score
on student.id=score.stu_id
group by student.id
order by avg_score desc;
Number of reducers (equal to 1 or 2):
set mapreduce.job.reduces=2
##Order by
select
avg(grade) avg_score
from
student join score
on student.id=score.stu_id
group by student.id
order by avg_score desc;
##Sort by
select
avg(grade) avg_score
from
student join score
on student.id=score.stu_id
group by student.id
sort by avg_score desc;
6.distribute by:
Controls how output from map is reduced. This procedure is used by default for converting the entire hive statement into a job. If it is not written, the hash value in the first column is used by default. When there is only one reducer, it cannot be represented. If distribute by and sort by appear together, pay attention to the order problem? distribute by in front
clusterd by: This is equivalent to distribute by and sort by(ascending order). The following field name needs to be the same clusterd by It has both distribute by and sort by (only in ascending order)
select
id
from
student
distribute by id
sort by id;
select
id
from
student
clusterd by id;
7. limit : Limit the result set.
select
id,name
from student
limit 3;
8.union all:
Merge the result sets of two or more queries together without reordering each result set. The number of fields must be consistent, and the field types must be the same as much as possible.
##Merge ID103 using union all
select
id sid,
name snames
from student
where id103;
9.union:
Merge two or more query result sets together, de-duplicate, sort the merged data
##Merge ID103 using union
select
id sid,
name sname
from student
where id103
order by sname;
10. subquery
A subquery is a SELECT statement embedded within another SQL statement. Hive has limited support for subqueries. It only allows subqueries to appear in the FROM clause of a SELECT statement. Hive supports unrelated subqueries that are queried in WHERE clauses using either IN or EXISTS syntax. Hive currently does not support related subqueries, and the execution of related subqueries depends on the data of external queries.
##Unrelated Subqueries
select
id,name
from
student
where id
in
(select stu_id
from
score);
##Execution of related subqueries depends on data from external queries
select sid,uname
from
(select
id sid,
name uname
from student
) s
order by sid
##Outer queries access the results of subqueries like tables, which is why subqueries must be given an alias (s). Columns in a subquery must have unique names so that outer queries can reference them.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.