hadoop - How to delete duplicate records from Hive table?
I am trying to learn how to delete duplicate records from a Hive table.
My Hive table is 'dynpart', with columns: id, name, technology
id  name  technology
1   abcd  hadoop
2   efgh  java
3   ijkl  mainframes
2   efgh  java
We have options like 'distinct' to use in a select query, but a select query only retrieves data from the table. Can anyone tell me how to use a delete query to remove the duplicate rows from a Hive table?
I am sure it is not recommended or not standard to delete/update records in Hive, but I want to learn how to do it.
You can use an insert overwrite statement to rewrite the table's data without the duplicates:
insert overwrite table dynpart select distinct * from dynpart;
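To see what the overwrite will collapse, and to handle the case where two rows count as duplicates only because they share a key, something like the following may help. This is a rough sketch, not a definitive recipe: it assumes 'dynpart' is a plain, non-partitioned table with exactly the three columns from the question, and that 'id' alone identifies a record in the second variant (that is my assumption, not something the question states). The row_number() form is a different technique from select distinct, which only removes rows that match on every column.

```sql
-- List rows that appear more than once, with how many copies each has.
-- Run this before the overwrite to see what will be collapsed.
SELECT id, name, technology, COUNT(*) AS copies
FROM dynpart
GROUP BY id, name, technology
HAVING COUNT(*) > 1;

-- Variant (assumes id alone identifies a record): keep one row per id.
-- ROW_NUMBER() numbers the rows within each id; only the first is kept.
-- The ORDER BY decides which copy survives; here it is the smallest name.
INSERT OVERWRITE TABLE dynpart
SELECT id, name, technology
FROM (
  SELECT id, name, technology,
         ROW_NUMBER() OVER (PARTITION BY id ORDER BY name, technology) AS rn
  FROM dynpart
) ranked
WHERE rn = 1;
```

With the sample data from the question, the first query should report the row (2, efgh, java) with 2 copies. Either overwrite replaces the whole table contents, so it is worth trying it on a copy of the table first.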