Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - How to read the last MB of a very large text file

I am trying to find a string near the end of a text file. The problem is that the file can vary greatly in size, from 3 MB to 4 GB. Every time I run a script to search a file of around 3 GB, my computer runs out of memory. So I was wondering whether there is any way for Python to find the size of the file and then read only the last megabyte of it.

The code I am currently using is below, but as I said, I do not seem to have enough memory to read such large files.

find_str = "ERROR"
file = open(file_directory)
last_few_lines = file.readlines()[-20:]

error = False

for line in last_few_lines:
    if find_str in line:
        error = True

1 Answer


Use file.seek():

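import os

find_str = b"ERROR"  # bytes, because the file is read in binary mode
error = False
# Open file with 'b' to specify binary mode
with open(file_directory, 'rb') as file:
    # Seek to 1 MB before the end, or to the start if the file is smaller
    tail_size = min(1024 * 1024, os.path.getsize(file_directory))
    file.seek(-tail_size, os.SEEK_END)  # Note minus sign
    if find_str in file.read():
        error = True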

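You must specify binary mode when you open the file, or you will get 'undefined behavior.' Under Python 2 it might work anyway (it did for me), but under Python 3 seek() raises an io.UnsupportedOperation exception for a nonzero end-relative offset if the file was opened in the default text mode. Because read() returns bytes in binary mode, under Python 3 the search string must also be a bytes object, such as b"ERROR". The Python 3 documentation describes seek() under the io module. Though it isn't clear from those docs, the SEEK_* constants are still in the os module.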
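
For reuse, the same idea can be wrapped in a small helper. This is only a sketch, assuming Python 3 and a UTF-8 encoded file. The names read_tail and tail_contains are hypothetical and not part of any library. It finds the file size by seeking to the end, so files shorter than the requested tail are read from the start instead of raising an error.

```python
import os


def read_tail(path, max_bytes=1024 * 1024):
    """Return up to the last max_bytes bytes of the file at path."""
    with open(path, "rb") as f:
        # Seeking to the end returns the new position, i.e. the file size.
        size = f.seek(0, os.SEEK_END)
        # Clamp at 0 so small files are read from the beginning.
        f.seek(max(size - max_bytes, 0), os.SEEK_SET)
        return f.read()


def tail_contains(path, needle, max_bytes=1024 * 1024):
    """Check whether needle (a str) occurs in the last max_bytes bytes."""
    # Assumption: the file is UTF-8; encode the needle to compare bytes.
    return needle.encode("utf-8") in read_tail(path, max_bytes)
```

With this in place, the check from the question becomes error = tail_contains(file_directory, "ERROR"). Only the tail is ever held in memory, so the cost does not grow with the size of the file.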

Update: Now using a with statement for safer resource management, as suggested by Chris Betti.

