How to pick rows with higher value only in Hive

How to pick rows with higher value only in Hive

I have the following data in Hive sorted by Count column: ItemX ItemXX Count ----- ------ ----- item1 item11 8 item1 item12 2 item1 item13 1 item2 item21 10 item2 item22 1 item3 item31 9 item3 item32 7 item3 item33 5 I need to filter max Count lines only. Like this: item1 item11 8 item2 item21 10 item3 item31 9 How can I do this in Hive QL

Use row_number function: select ItemX, ItemXX, Count from select ItemX, ItemXX, Count, row_number overpartition by ItemX order by Count desc rn from table_name s where rn1

In common cases without sorting you can use these queries: -- 1. SELECT a. FROM tbl a INNER JOIN SELECT ItemX, MAXCount Count FROM tbl GROUP BY ItemX b ON a.ItemX b.ItemX AND a.Count b.Count; -- 2. SELECT a. FROM tbl a LEFT OUTER JOIN tbl b ON a.ItemX b.ItemX AND a.Count b.Count WHERE b.ItemX IS NULL;

Комментарии

Популярные сообщения из этого блога

Как преобразовать вертикальную запись в горизонтальную?

Skipping acquire of configured file 'contrib/binary-i386/Packages' as repository … doesn't support architecture 'i386'

How to delete a folder in remote Windows from Linux