hi I have a ~5 million documents in mongodb (replication) each document 43 fields. how to remove duplicate document. I tryed
db.testkdd.ensureIndex({
duration : 1 , protocol_type : 1 , service : 1 ,
flag : 1 , src_bytes : 1 , dst_bytes : 1 ,
land : 1 , wrong_fragment : 1 , urgent : 1 ,
hot : 1 , num_failed_logins : 1 , logged_in : 1 ,
num_compromised : 1 , root_shell : 1 , su_attempted : 1 ,
num_root : 1 , num_file_creations : 1 , num_shells : 1 ,
num_access_files : 1 , num_outbound_cmds : 1 , is_host_login : 1 ,
is_guest_login : 1 , count : 1 , srv_count : 1 ,
serror_rate : 1 , srv_serror_rate : 1 , rerror_rate : 1 ,
srv_rerror_rate : 1 , same_srv_rate : 1 , diff_srv_rate : 1 ,
srv_diff_host_rate : 1 , dst_host_count : 1 , dst_host_srv_count : 1 ,
dst_host_same_srv_rate : 1 , dst_host_diff_srv_rate : 1 ,
dst_host_same_src_port_rate : 1 , dst_host_srv_diff_host_rate : 1 ,
dst_host_serror_rate : 1 , dst_host_srv_serror_rate : 1 ,
dst_host_rerror_rate : 1 , dst_host_srv_rerror_rate : 1 , lable : 1
},
{unique: true, dropDups: true}
)
run this code i get a error "errmsg" : "namespace name generated from index ..
{
"ok" : 0,
"errmsg" : "namespace name generated from index name "project.testkdd.$duration_1_protocol_type_1_service_1_flag_1_src_bytes_1_dst_bytes_1_land_1_wrong_fragment_1_urgent_1_hot_1_num_failed_logins_1_logged_in_1_num_compromised_1_root_shell_1_su_attempted_1_num_root_1_num_file_creations_1_num_shells_1_num_access_files_1_num_outbound_cmds_1_is_host_login_1_is_guest_login_1_count_1_srv_count_1_serror_rate_1_srv_serror_rate_1_rerror_rate_1_srv_rerror_rate_1_same_srv_rate_1_diff_srv_rate_1_srv_diff_host_rate_1_dst_host_count_1_dst_host_srv_count_1_dst_host_same_srv_rate_1_dst_host_diff_srv_rate_1_dst_host_same_src_port_rate_1_dst_host_srv_diff_host_rate_1_dst_host_serror_rate_1_dst_host_srv_serror_rate_1_dst_host_rerror_rate_1_dst_host_srv_rerror_rate_1_lable_1" is too long (127 byte max)",
"code" : 67
}
how can solve the problem ?
See Question&Answers more detail:
os