python - 'utf-8' codec can't decode byte 0x80

asked Oct 24, 2021 in Technique by 深蓝 (71.8m points)

I'm trying to download the BVLC-trained model, and I'm stuck with this error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 110: invalid start byte

I think it's because of the following function (complete code)

  # Closure-d function for checking SHA1.
  def model_checks_out(filename=model_filename, sha1=frontmatter['sha1']):
      with open(filename, 'r') as f:
          return hashlib.sha1(f.read()).hexdigest() == sha1

Any idea how to fix this?

1 Answer

深蓝 · Answer 1 · 2021-10-23T20:00:58+0000

You are opening a file that is not UTF-8 encoded, while the default encoding for your system is set to UTF-8.
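To see why this fails, here is a minimal sketch that reproduces the same class of error without any file at all. The byte string is a made-up stand-in for the model file's contents; the point is only that 0x80 is a continuation byte in UTF-8 and can never begin a character, so the decoder rejects it.

```python
# Hypothetical payload: five ASCII bytes followed by 0x80, which is not a
# valid UTF-8 start byte.
raw = b"caffe\x80model"

try:
    raw.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# The exception records which codec failed and where.
print(decode_error)
```

Opening a file in text mode performs this decoding step implicitly on every read, which is where the traceback in the question comes from.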

Since you are calculating a SHA1 hash, you should read the data as binary instead. The hashlib functions require that you pass in bytes:

with open(filename, 'rb') as f:
    return hashlib.sha1(f.read()).hexdigest() == sha1

Note the addition of b to the file mode ('rb' instead of 'r').
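For a version you can run on its own, here is a sketch of the check as a standalone function. The explicit parameters replace the closure defaults from the question (model_filename and frontmatter are not defined here), and the temporary file with its payload is a hypothetical stand-in for the downloaded model.

```python
import hashlib
import os
import tempfile


def model_checks_out(filename, sha1):
    """Return True if the SHA1 hex digest of the file's raw bytes equals sha1."""
    with open(filename, "rb") as f:  # binary mode: no text decoding happens
        return hashlib.sha1(f.read()).hexdigest() == sha1


# Hypothetical stand-in for the model file: bytes that are not valid UTF-8.
payload = b"\x80\x02binary model data"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    model_path = tmp.name

expected_sha1 = hashlib.sha1(payload).hexdigest()
matches = model_checks_out(model_path, expected_sha1)
mismatch = model_checks_out(model_path, "0" * 40)
os.remove(model_path)

print(matches, mismatch)
```

The file starts with byte 0x80, so reading it in text mode with a UTF-8 default would raise the error from the question, while binary mode reads it without complaint.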

See the open() documentation:

mode is an optional string that specifies the mode in which the file is opened. It defaults to 'r' which means open for reading in text mode. [...] In text mode, if encoding is not specified the encoding used is platform dependent: locale.getpreferredencoding(False) is called to get the current locale encoding. (For reading and writing raw bytes use binary mode and leave encoding unspecified.)
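The difference between the two modes can be observed directly. This is a small sketch using a throwaway temporary file (the file and its contents are assumptions for illustration): a text-mode file object reports the encoding it would decode with, while a binary-mode read returns the bytes untouched.

```python
import os
import tempfile

# Write two bytes that are not valid UTF-8 to a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x80\x02")
    path = tmp.name

# Text mode: the file object carries an encoding chosen from the platform
# defaults. Nothing is decoded until you actually read.
with open(path, "r") as f:
    text_encoding = f.encoding

# Binary mode: read() returns bytes exactly as stored on disk.
with open(path, "rb") as f:
    data = f.read()

os.remove(path)
print(text_encoding, data)
```

The printed encoding depends on your platform and Python configuration, so it may not be UTF-8 on every machine.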

and from the hashlib module documentation:

You can now feed this object with bytes-like objects (normally bytes) using the update() method.
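Because update() can be called repeatedly, a large model file does not have to be read into memory in one go. The following is a sketch of that approach, not part of the original code; sha1_of_file and chunk_size are names chosen here for illustration, and the temporary file stands in for the real download.

```python
import hashlib
import os
import tempfile


def sha1_of_file(filename, chunk_size=65536):
    """Hash a file incrementally, feeding fixed-size chunks to update()."""
    digest = hashlib.sha1()
    with open(filename, "rb") as f:
        # f.read() returns b"" at end of file, which stops the iteration.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical stand-in file; chunk_size=1 forces several update() calls.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
    sample_path = tmp.name

chunked_digest = sha1_of_file(sample_path, chunk_size=1)
os.remove(sample_path)

print(chunked_digest)
```

Feeding the data in pieces produces the same digest as hashing the whole content in a single call.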
