HDFS is an append-only filesystem. To modify any portion of a file that is already written (as an UPDATE or DELETE statement would), you must rewrite the entire file and replace the old one; to insert even a single record, you must write a new file.
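As a rough illustration of what rewrite-and-replace means in practice, here is a minimal sketch that uses the local filesystem as a stand-in for HDFS. The names `delete_record` and `should_delete` are made up for this example and are not part of any Hadoop API; on a real cluster the same pattern would go through an HDFS client or a query engine.

```python
import os
import tempfile


def delete_record(path, should_delete):
    """Rewrite `path` without the lines for which should_delete(line) is true.

    Hypothetical helper: the original file is never edited in place.
    A complete new copy is written next to it and then swapped in.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    kept = 0
    with os.fdopen(fd, "w") as out, open(path) as src:
        for line in src:
            # Copy every record except the ones being "deleted".
            if not should_delete(line.rstrip("\n")):
                out.write(line)
                kept += 1
    # Replace the old file with the freshly written one.
    os.replace(tmp_path, path)
    return kept
```

Removing a single row still costs a full read and a full write of the file, which is why row-level changes are expensive on this kind of storage.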
Compaction isn't an automatic process. You need to write your own code to query one table and then insert the results into another table stored in a format such as Parquet or ORC.
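To show the shape of a hand-written compaction step, here is a minimal sketch that merges many small part files into one larger file on the local filesystem. It is only an analogy: the function name `compact` and the `part-*` naming pattern are assumptions for this example, and a real job would read the source table and write Parquet or ORC through an engine such as Hive or Spark rather than concatenating bytes.

```python
import glob
import os


def compact(part_dir, target_path, pattern="part-*"):
    """Concatenate every small part file in part_dir into one target file.

    Hypothetical helper: returns the list of part files that were merged,
    so the caller can remove them once the compacted file is verified.
    """
    # List the inputs before creating the output file.
    parts = sorted(glob.glob(os.path.join(part_dir, pattern)))
    with open(target_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Copy in 1 MiB chunks to keep memory use flat.
                for chunk in iter(lambda: src.read(1 << 20), b""):
                    out.write(chunk)
    return parts
```

Nothing runs this for you: it has to be scheduled explicitly, and the old small files have to be cleaned up afterwards.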
He who fights too long with dragons becomes a dragon himself; gaze too long into the abyss, and the abyss will gaze back…