Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share

Python compare filenames and remove pseudo duplicates

I am still developing this small script to help with my daily workflow. After incorporating the feedback I received on an earlier post, I have run into one final issue.

Task

  • search for a filename pattern
  • copy to a new location
  • rename the files based on a particular scheme
  • delete pseudo duplicates (the filenames include an incrementing number)

This is the code so far:

import os
import shutil

# set source and destination for files
source = "some path"
dest = "some other path"

monat = input('Monat: ')

# join with a path separator instead of plain string concatenation
full_dest = os.path.join(dest, monat)

os.chdir(source)

# user inputs
f_start = int(input('Erste Rechnungsnummer: '))
f_end = int(input('Letzte Rechnungsnummer: '))

# do not fail if the folder is left over from an earlier run
os.makedirs(full_dest, exist_ok=True)

# copy every matching file whose number lies in the requested range
for file in os.listdir("."):
    if any(tag in file for tag in ("_REC-", "_GS-", "_RECA-")):
        if f_start <= int(file.split("-")[1]) <= f_end:
            shutil.copy(file, full_dest)

os.chdir(full_dest)

for f in os.listdir():
    # split base filename and extension
    filename, file_ext = os.path.splitext(f)

    # separate at the '-' character
    f_num_i, f_num_r, f_num_date = filename.split('-')

    # separate the document count from the type of file (e.g. REC)
    f_num_random, f_num_t = f_num_i.split('_')

    # set new name format
    new_name = '{}-{}-{}{}'.format(f_num_r, f_num_t, f_num_random, file_ext)

    # rename all files
    os.rename(f, new_name)

The program produces filenames such as these:

  • 7222-REC-15356.pdf
  • 7222-REC-15358.pdf

Now I only need to keep the file with the highest number (15358 in the example above), but I am not sure how to achieve this. I think I need to group the files that share a common base (here 7222-REC), compare the trailing numbers within each group, keep the file with the highest one, and delete the others. To be honest, I am not sure how to approach this.

I am confident that this is a simple task for a seasoned veteran. Thanks in advance.
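One possible approach is sketched below. It is not a tested solution for the real folder. It assumes every renamed file follows the scheme shown above: everything before the last '-' is the common base, and the part after it is a purely numeric counter. The names find_pseudo_duplicates and remove_pseudo_duplicates are hypothetical helpers introduced only for this illustration. Files are grouped by base and extension, which is also an assumption, so that a .pdf is never treated as a duplicate of a file with a different extension.

```python
import os
from collections import defaultdict


def find_pseudo_duplicates(filenames):
    """Return the names to delete: all but the highest-numbered file per group."""
    groups = defaultdict(list)
    for name in filenames:
        stem, ext = os.path.splitext(name)
        # rpartition splits at the LAST '-':
        # '7222-REC-15356' -> ('7222-REC', '-', '15356')
        base, sep, counter = stem.rpartition('-')
        if not sep or not counter.isdigit():
            continue  # skip names that do not follow the assumed scheme
        # compare counters as integers so that 1000 ranks above 999
        groups[(base, ext)].append((int(counter), name))

    to_delete = []
    for entries in groups.values():
        entries.sort()  # ascending by counter; the last entry is kept
        to_delete.extend(name for _counter, name in entries[:-1])
    return to_delete


def remove_pseudo_duplicates(folder):
    """Delete the lower-numbered pseudo duplicates inside folder."""
    for name in find_pseudo_duplicates(os.listdir(folder)):
        os.remove(os.path.join(folder, name))
```

Calling remove_pseudo_duplicates(full_dest) after the rename loop would apply this to the new folder. Because the deletion cannot be undone, it may be worth printing the result of find_pseudo_duplicates(os.listdir(full_dest)) first and checking the list by hand.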


