Use isinstance to test for Unicode string

Question:

How can I do something like:

>>> s = u'hello'
>>> isinstance(s,str)
False

But I would like isinstance to return True for this Unicode encoded string. Is there a Unicode string object type?

Asked By: A.D


Answers:

Is there a Unicode string object type?

Yes, it is called unicode:

>>> s = u'hello'
>>> isinstance(s, unicode)
True
>>>

Note that in Python 3.x, this type was removed because all strings are now Unicode.
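
As a rough sketch of what that means in practice, assuming a Python 3.3+ interpreter (where the u prefix is accepted again but has no effect); the variable names are only for illustration:

```python
# Python 3: the u prefix is allowed (3.3+) but changes nothing.
s = u'hello'

# Both literal forms produce the same built-in text type, str.
is_text = isinstance(s, str)
same_type = type(s) is type('hello')

# The Python 2 builtin name `unicode` no longer exists.
try:
    unicode
    has_unicode_builtin = True
except NameError:
    has_unicode_builtin = False

print(is_text, same_type, has_unicode_builtin)  # True True False
```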

Answered By: user2555451

In Python 3, test for str:

isinstance(unicode_or_bytestring, str)

or, if you must handle bytestrings, test for bytes separately:

isinstance(unicode_or_bytestring, bytes)

The two types are deliberately not interchangeable; use explicit encoding (str -> bytes) and decoding (bytes -> str) to convert between them.
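
A minimal Python 3 sketch of that round trip. The choice of UTF-8 is an assumption made for the example; use whatever codec your data actually requires:

```python
text = 'h\u00e9llo'                # str: a sequence of Unicode code points
data = text.encode('utf-8')        # str -> bytes (UTF-8 assumed here)
round_trip = data.decode('utf-8')  # bytes -> str

print(isinstance(text, str), isinstance(text, bytes))  # True False
print(isinstance(data, bytes), isinstance(data, str))  # True False
print(round_trip == text)                              # True
print(len(text), len(data))                            # 5 6
```

The length difference shows why the types are kept apart: the single character é occupies two bytes once encoded as UTF-8.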

In Python 2, where the modern Python 3 str type is called unicode and str is the precursor of the Python 3 bytes type, you could use basestring to test for both:

isinstance(unicode_or_bytestring, basestring)

basestring is only available in Python 2, and is the abstract base type of both str and unicode.

If you wanted to test for just unicode, then do so explicitly:

isinstance(unicode_string, unicode)
Answered By: Martijn Pieters

Is there a Unicode string object type?

Yes, this works:

>>> s = u'hello'
>>> isinstance(s, unicode)
True
>>>

However, this check only succeeds for unicode objects; a plain str fails it.
Another solution is to use the six package, which spares you from handling Python 2.x and Python 3.x separately: six.string_types matches both unicode and str.

>>> import six
>>> unicode_s = u'hello'
>>> s = 'hello'
>>> isinstance(unicode_s, str)
False
>>> isinstance(unicode_s, unicode)
True
>>> isinstance(s, str)
True
>>> isinstance(s, six.string_types)
True
>>> isinstance(unicode_s, six.string_types)
True
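
If adding a dependency is not an option, the same idea can be approximated by hand. The following is only a sketch: string_types and is_text are hypothetical names chosen for this example, standing in for six.string_types rather than reproducing the library's code.

```python
# Hypothetical stand-in for six.string_types, without the dependency.
try:
    string_types = (basestring,)  # Python 2: base type of str and unicode
except NameError:
    string_types = (str,)         # Python 3: basestring is gone, str is text


def is_text(value):
    """Return True if value is a string type on the running interpreter."""
    return isinstance(value, string_types)
```

On Python 3, is_text(u'hello') and is_text('hello') are both True, while is_text(b'hello') is False. On Python 2 a bytestring literal would also pass, because str derives from basestring there.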
Answered By: eleijonmarck