What is the point of cluster by in Hive?

1. What is cluster by?

      For a thorough explanation of the difference between order by, sort by, distribute by and cluster by in Hive, see the article "The difference between order by, sort by, distribute by and cluster by in HIVE": https://blog.csdn.net/weixin_42845682/article/details/104953351

2. What is the significance of cluster by?

      You should already know that when distribute by and sort by specify the same field, you can use cluster by instead. But one cannot help asking: does cluster by actually accomplish anything? The rows are already partitioned by field XX, so why sort them by field XX as well?
      The answer is: it makes sense when the number of partitions is smaller than the number of distinct values of the field.
      For example:
      There is a student table and the school has 100 majors, but for performance reasons only 5 partitions can be specified. Each partition then has to hold rows for several majors. Distributing by major and then sorting by major within each partition is meaningful in this case, because the sort brings the rows of each major together inside the partition.
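The student example above can be simulated in a few lines of Python. This is only a sketch of the idea, not how Hive is implemented: the `reducer_for` function is a hypothetical stand-in for Hive's hash partitioning, and the table contents and the column names `student_id` and `major_id` are made up for illustration.

```python
from collections import defaultdict

NUM_REDUCERS = 5   # the 5 partitions from the example
NUM_MAJORS = 100   # the 100 majors from the example


def reducer_for(major_id, num_reducers=NUM_REDUCERS):
    """Assumed stand-in for hash partitioning: key modulo reducer count."""
    return major_id % num_reducers


def cluster_by(rows, key, num_reducers=NUM_REDUCERS):
    """Simulate cluster by: distribute rows by key, then sort each reducer's rows by key."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[reducer_for(row[key], num_reducers)].append(row)
    return {r: sorted(rs, key=lambda row: row[key]) for r, rs in buckets.items()}


# Hypothetical student table: 3 students per major, deliberately interleaved.
students = [
    {"student_id": s * NUM_MAJORS + m, "major_id": m}
    for s in range(3)
    for m in range(NUM_MAJORS)
]

output = cluster_by(students, "major_id")

for reducer, rows in sorted(output.items()):
    majors = [row["major_id"] for row in rows]
    print(f"reducer {reducer}: {len(set(majors))} majors, first rows {majors[:7]}")
```

In this simulation every reducer receives 20 different majors. Without the sort step, rows of those 20 majors would be mixed together in arrival order; with it, each major's rows sit next to each other inside the reducer's output, which is exactly what cluster by adds on top of plain distribution.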
      


Origin blog.csdn.net/weixin_42845682/article/details/104954071