Python easy way to set default encoding for opening files in text mode?
Question:
Is there an easy, cross-platform way to set the default encoding for opening files in text mode in Python, so you don’t have to write
open(filename, 'r', encoding='utf-8')
each time and can simply write
open(filename, 'r')
?
Answers:
You could create your own context manager:
import contextlib

@contextlib.contextmanager
def start_transaction(path, mode="r", enc="utf-8"):
    # Open with an explicit encoding and always close the file on exit.
    f = open(path, mode, encoding=enc)
    try:
        yield f
    finally:
        f.close()

with start_transaction("in.txt") as f:
    for line in f:
        print(line)
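Another option is to shadow open in your own module so that the encoding defaults to UTF-8:
from io import open  # for Python 2 compatibility

old_open = open

def open(*args, **kwargs):
    # Fall back to UTF-8 unless the caller passes an encoding keyword.
    encoding = kwargs.pop('encoding', 'utf8')
    return old_open(*args, encoding=encoding, **kwargs)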
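One caveat with the wrapper above: it passes encoding even for binary modes, and open() raises ValueError when a binary mode is combined with an encoding argument. A minimal sketch of a variant that only applies the default in text mode follows; the names open_utf8 and demo are hypothetical, chosen for illustration only.

```python
import builtins
import os
import tempfile


def open_utf8(file, mode="r", *args, **kwargs):
    """Like builtins.open, but text mode defaults to UTF-8 (hypothetical helper)."""
    # Extra positional args follow builtins.open order: (buffering, encoding, ...).
    encoding_given_positionally = len(args) >= 2
    if "b" not in mode and not encoding_given_positionally:
        kwargs.setdefault("encoding", "utf-8")
    return builtins.open(file, mode, *args, **kwargs)


def demo(text="caf\u00e9"):
    """Write text with the helper, then read it back as bytes and as text."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open_utf8(path, "w") as f:
            f.write(text)
        with open_utf8(path, "rb") as f:  # binary mode: no encoding is passed
            raw = f.read()
        with open_utf8(path) as f:
            decoded = f.read()
    finally:
        os.remove(path)
    return raw, decoded
```

Giving the helper its own name, rather than shadowing open, also keeps it obvious at the call site that a non-default encoding policy is in effect.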
Alternatively, use functools.partial to define a helper that pre-fills the encoding argument of open:
import functools
open_file = functools.partial(open, encoding='utf-8')
Then open files with this new function:
f = open_file('some_file.txt', 'r')
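As a quick check of how the partial behaves, here is a small sketch that writes and reads a temporary file through open_file. The round_trip helper is hypothetical and exists only for this illustration; it also shows that a caller can still override the pre-filled encoding with a keyword argument.

```python
import functools
import os
import tempfile

open_file = functools.partial(open, encoding="utf-8")


def round_trip(text, **overrides):
    """Write text as UTF-8, then read it back (hypothetical helper).

    Keyword arguments in `overrides` replace the ones pre-filled by partial.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open_file(path, "w") as f:
            f.write(text)
        with open_file(path, "r", **overrides) as f:
            return f.read()
    finally:
        os.remove(path)


print(round_trip("caf\u00e9"))                      # read back as UTF-8
print(round_trip("caf\u00e9", encoding="latin-1"))  # same bytes, decoded as Latin-1
```

Note that the partial always passes encoding, so it is only suitable for text modes; calling open_file(path, 'rb') raises ValueError.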
Updated answer for Python 3.7 and up: you can set the PYTHONUTF8 environment variable to 1. Reference:
https://docs.python.org/3/library/os.html#utf8-mode
The Python UTF-8 Mode ignores the locale encoding and forces the usage of the UTF-8 encoding
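To see the effect without changing your shell configuration, the sketch below starts a child interpreter with PYTHONUTF8 set and asks it which encoding a text-mode file gets by default. The names CHILD and run_child are hypothetical; sys.flags.utf8_mode is the flag that reports whether UTF-8 Mode is active.

```python
import os
import subprocess
import sys

# Program run in the child interpreter: report the UTF-8 Mode flag and the
# encoding chosen for a text-mode file opened without an encoding argument.
CHILD = (
    "import sys, tempfile\n"
    "with tempfile.TemporaryFile('w+') as f:\n"
    "    print(sys.flags.utf8_mode, f.encoding.lower())\n"
)


def run_child(utf8_mode):
    """Run CHILD with PYTHONUTF8 set to 1 or 0 and return what it printed."""
    env = dict(os.environ)
    env["PYTHONUTF8"] = "1" if utf8_mode else "0"
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print("PYTHONUTF8=1:", run_child(True))
print("PYTHONUTF8=0:", run_child(False))
```

The same mode can also be enabled per invocation with the -X utf8 command-line option, for example python -X utf8 script.py.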
Some background for the curious:
When you do not pass an encoding explicitly, open() calls locale.getpreferredencoding() to get the default encoding (https://docs.python.org/3/library/functions.html#open). The keyword here is locale.
The locale is system dependent and cannot always be retrieved reliably, so Python does its best to guess it (https://docs.python.org/3/library/locale.html#locale.getpreferredencoding).
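The relationship described above can be checked on your own machine. This is a minimal sketch, with hypothetical helper names default_text_encoding and same_codec, that compares the locale's preferred encoding with the encoding open() actually picks when none is given. The names are compared through codecs.lookup because the two sources may spell the same codec differently (for example UTF-8 and utf-8).

```python
import codecs
import locale
import os
import tempfile


def default_text_encoding():
    """Return (locale's preferred encoding, encoding open() chose by default)."""
    preferred = locale.getpreferredencoding(False)
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w") as f:  # encoding deliberately omitted
            chosen = f.encoding
    finally:
        os.remove(path)
    return preferred, chosen


def same_codec(a, b):
    """True if two encoding names refer to the same codec."""
    return codecs.lookup(a).name == codecs.lookup(b).name


preferred, chosen = default_text_encoding()
print(preferred, chosen, same_codec(preferred, chosen))
```

The printed names depend on the platform and locale settings, which is exactly why relying on the default is not portable.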
BTW, setting the environment variable is useful when you don’t have control over the source code, or when the number of files that would need to be changed is huge.