Use isinstance to test for Unicode string
Question:
How can I do something like:
>>> s = u'hello'
>>> isinstance(s,str)
False
But I would like isinstance to return True for this Unicode encoded string. Is there a Unicode string object type?
Answers:
Is there a Unicode string object type?
Yes, it is called unicode:
>>> s = u'hello'
>>> isinstance(s, unicode)
True
>>>
Note that in Python 3.x, this type was removed because all strings are now Unicode.
In Python 3, test for str:
isinstance(unicode_or_bytestring, str)
or, if you must handle bytestrings, test for bytes separately:
isinstance(unicode_or_bytestring, bytes)
The two types are deliberately not interchangeable; use explicit encoding (str -> bytes) and decoding (bytes -> str) to convert between them.
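The explicit conversion described above can be sketched as follows. This is a minimal illustration assuming Python 3 and UTF-8 as the codec; the answer does not name an encoding, so UTF-8 is an assumption here.

```python
# Python 3: str holds text, bytes holds raw bytes.
text = "h\u00e9llo"

# Encoding: str -> bytes. UTF-8 is an assumed codec; use the one your data needs.
data = text.encode("utf-8")

# Decoding: bytes -> str, using the same codec.
round_tripped = data.decode("utf-8")

# Mixing the two types without converting raises TypeError.
try:
    text + data
except TypeError:
    mixed_ok = False
else:
    mixed_ok = True
```

Because neither type is silently converted to the other, the isinstance checks for str and bytes never overlap.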
In Python 2, where the modern Python 3 str type is called unicode and str is the precursor of the Python 3 bytes type, you could use basestring to test for both:
isinstance(unicode_or_bytestring, basestring)
basestring is only available in Python 2, and is the abstract base type of both str and unicode.
If you wanted to test for just unicode, then do so explicitly:
isinstance(unicode_string, unicode)
Is there a Unicode string object type?
Yes, this works:
>>> s = u'hello'
>>> isinstance(s, unicode)
True
>>>
However, this is only useful if you already know that the value is unicode.
Another solution is to use the six package, which saves you from handling the Python 2.x and Python 3.x differences yourself and matches both unicode and str:
>>> import six
>>> unicode_s = u'hello'
>>> s = 'hello'
>>> isinstance(unicode_s, str)
False
>>> isinstance(unicode_s, unicode)
True
>>> isinstance(s, str)
True
>>> isinstance(s, six.string_types)
True
>>> isinstance(unicode_s, six.string_types)
True
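If adding a dependency on six is not an option, a similar check can be sketched by hand. The name string_types below mirrors six.string_types, and is_text is a hypothetical helper introduced only for illustration; it is not part of six or the standard library.

```python
try:
    # Python 2: basestring is the common base type of str and unicode.
    string_types = (basestring,)  # noqa: F821
except NameError:
    # Python 3: basestring no longer exists and str is the only text type.
    string_types = (str,)


def is_text(value):
    """Return True if value is an instance of one of string_types (hypothetical helper)."""
    return isinstance(value, string_types)
```

On Python 3, is_text(b'hello') returns False, so bytes still have to be tested for separately. On Python 2 the same call returns True, because the Python 2 str type derives from basestring.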