UnicodeDecodeError 'utf-8' codec can't decode – using python shapefile reader
Question:
I’m trying to read a shapefile:
r = shapefile.Reader(filepath, encoding="utf-8")
but when I try to get a value from the .records() object, like:
r.records()[0]
I get the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 4: invalid continuation byte
Answers:
That means your file is not encoded in UTF-8. Byte 0xe9 is "é" in ISO8859-1 (Latin-1), so try encoding="ISO8859-1" instead.
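To see why that byte trips the UTF-8 codec, here is a minimal sketch using only the standard library. The byte string is made up for illustration and is not taken from the asker's file; it simply puts 0xe9 at position 4, as in the traceback.

```python
# Hypothetical attribute value stored as Latin-1 bytes: "Math" + 0xe9 + "o"
raw = b"Math\xe9o"

# In UTF-8, 0xe9 announces a three-byte sequence, so the next byte must be
# a continuation byte (0x80-0xBF). Here it is a plain "o", so decoding fails.
try:
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# ISO8859-1 maps every single byte to one character, so it always decodes.
latin = raw.decode("ISO8859-1")

print(error)
print(latin)
```

Because ISO8859-1 accepts any byte, a successful decode does not prove the file really is Latin-1; it is worth checking that accented characters look right afterwards.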
If you are on Linux (or have git-bash on Windows) you can use the file command to find out the encoding.
You can use this piece of code to try different encodings when opening the shapefile. The code also looks for a .cpg file next to the shapefile, which holds the shapefile's encoding.
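import os
import shapefile

shp_path = 'path/to/file.shp'  # placeholder: set this to your shapefile

# Candidate encodings, tried in order. ISO8859-1 goes last because it
# accepts every byte and therefore never raises UnicodeDecodeError.
encodings = ['utf-8', 'ISO8859-1']

# If a .cpg file sits next to the shapefile, try its encoding first
cpg_path = os.path.splitext(shp_path)[0] + '.cpg'
if os.path.exists(cpg_path):
    with open(cpg_path) as cpg_file:
        cpg_encoding = cpg_file.read().strip()
    if cpg_encoding:
        encodings.insert(0, cpg_encoding)

# Try each encoding until the records can be decoded
for e in encodings:
    try:
        with shapefile.Reader(shp_path, encoding=e) as shp:
            records = shp.records()  # record values are decoded here
        print(f'Successfully read the shapefile with encoding: {e}')
        break
    except (UnicodeDecodeError, LookupError):
        # LookupError: the .cpg names a codec Python does not know
        print(f'Error when reading the shapefile with encoding: {e}')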
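The .cpg lookup can also be factored into a small helper that checks whether Python actually knows the declared codec before you pass it on. This is only a sketch: the function name read_cpg_encoding is made up here, and it assumes the .cpg file contains nothing but an encoding name.

```python
import codecs
import os


def read_cpg_encoding(shp_path):
    """Return the Python codec name declared in the sidecar .cpg file, or None."""
    cpg_path = os.path.splitext(shp_path)[0] + ".cpg"
    if not os.path.exists(cpg_path):
        return None
    # Encoding names are plain ASCII; ignoring other bytes drops a stray BOM.
    with open(cpg_path, encoding="ascii", errors="ignore") as cpg_file:
        declared = cpg_file.read().strip()
    if not declared:
        return None
    try:
        # codecs.lookup raises LookupError for names Python does not recognise
        return codecs.lookup(declared).name
    except LookupError:
        return None
```

If the helper returns None, fall back to your own list of candidate encodings; otherwise pass the returned name as the encoding argument of shapefile.Reader.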