hadoop - How to delete duplicate records from Hive table?
I am trying to learn how to delete duplicate records from a Hive table.
My Hive table is 'dynpart', with columns: id, name, technology
id  name  technology
1   abcd  hadoop
2   efgh  java
3   ijkl  mainframes
2   efgh  java

We have the 'distinct' option to use in a select query, but a select query only retrieves data from the table. Can anyone tell me how to use a delete query to remove duplicate rows from a Hive table?
I know it is not recommended or standard practice to delete/update records in Hive, but I want to learn how to do it.
You can use an insert overwrite statement to rewrite the table with only its distinct rows:
insert overwrite table dynpart select distinct * from dynpart;
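If it helps, here is a rough sketch of how you might check for duplicates first, and how you could keep one row per id when the duplicates are not identical in every column. It assumes 'dynpart' is a plain, unpartitioned table (a partitioned table would need a partition clause on the insert) and that your Hive version supports window functions. The aliases cnt, rn and t are just names picked for the example.

```sql
-- Step 1: list rows that occur more than once (read-only check).
select id, name, technology, count(*) as cnt
from dynpart
group by id, name, technology
having count(*) > 1;

-- Step 2 (alternative to "select distinct *"): keep exactly one row per id.
-- row_number() numbers the rows within each id; only the first is written back.
-- Assumption: ordering by name is an arbitrary tie-breaker, pick whatever suits your data.
insert overwrite table dynpart
select id, name, technology
from (
  select id, name, technology,
         row_number() over (partition by id order by name) as rn
  from dynpart
) t
where rn = 1;
```

With the sample data in the question both approaches should leave three rows (ids 1, 2 and 3). Since insert overwrite replaces the table contents, it is worth trying the inner select on its own before running the full statement.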