I have reproduced what you are seeing:
import urllib, os
link = "http://python.org"
print "opening url:", link
site = urllib.urlopen(link)
meta = site.info()
print "Content-Length:", meta.getheaders("Content-Length")[0]
f = open("out.txt", "r")
print "File on disk:",len(f.read())
f.close()
f = open("out.txt", "w")
f.write(site.read())
site.close()
f.close()
f = open("out.txt", "r")
print "File on disk after download:",len(f.read())
f.close()
print "os.stat().st_size returns:", os.stat("out.txt").st_size
Outputs this:
opening url: http://python.org
Content-Length: 16535
File on disk: 16535
File on disk after download: 16535
os.stat().st_size returns: 16861
What am I doing wrong here? Is os.stat().st_size not returning the correct size?
Edit:
OK, I figured out what the problem was:
import urllib, os
link = "http://python.org"
print "opening url:", link
site = urllib.urlopen(link)
meta = site.info()
print "Content-Length:", meta.getheaders("Content-Length")[0]
f = open("out.txt", "rb")
print "File on disk:",len(f.read())
f.close()
f = open("out.txt", "wb")
f.write(site.read())
site.close()
f.close()
f = open("out.txt", "rb")
print "File on disk after download:",len(f.read())
f.close()
print "os.stat().st_size returns:", os.stat("out.txt").st_size
this outputs:
$ python test.py
opening url: http://python.org
Content-Length: 16535
File on disk: 16535
File on disk after download: 16535
os.stat().st_size returns: 16535
Make sure you are opening both files for binary read/write.
// open for binary write
open(filename, "wb")
// open for binary read
open(filename, "rb")
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…