Case insensitive dictionary
Question:
I’d like my dictionary to be case insensitive.
I have this example code:
text = "practice changing the color"
words = {'color': 'colour',
         'practice': 'practise'}
def replace(words, text):
    keys = words.keys()
    for i in keys:
        text = text.replace(i, words[i])
    return text
text = replace(words, text)
print(text)
Output = practise changing the colour
I’d like another string, "practice changing the Color" (where Color starts with a capital), to also give the same output.
I believe there is a general way to convert to lowercase using mydictionary[key.lower()], but I’m not sure how best to integrate this into my existing code (if this would be a reasonable, simple approach anyway).
Answers:
If I understand you correctly and you want a way to key dictionaries in a non case-sensitive fashion, one way would be to subclass dict and overload the setter / getter:
class CaseInsensitiveDict(dict):
    def __setitem__(self, key, value):
        super(CaseInsensitiveDict, self).__setitem__(key.lower(), value)
    def __getitem__(self, key):
        return super(CaseInsensitiveDict, self).__getitem__(key.lower())
While a case insensitive dictionary is a solution, and there are answers to how to achieve that, there is a possibly easier way in this case. A case insensitive search is sufficient:
import re
text = "Practice changing the Color"
words = {'color': 'colour', 'practice': 'practise'}
def replace(words, text):
    keys = words.keys()
    for i in keys:
        # re.escape keeps keys that contain regex metacharacters literal
        exp = re.compile(re.escape(i), re.I)
        text = exp.sub(words[i], text)
    return text
text = replace(words, text)
print(text)
Would you consider using string.lower() on your inputs and using a fully lowercase dictionary? It’s a bit of a hacky solution, but it works.
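A minimal Python 3 sketch of that idea, assuming every key in the dictionary is already lowercase and that the text can be split on single spaces (punctuation is not handled). The function name replace_words is made up for this illustration. Lowercasing only the lookup key, rather than the whole text, leaves words that are not in the dictionary untouched:

```python
def replace_words(words, text):
    """Replace whole words using a lowercase lookup table.

    Assumes every key in `words` is lowercase; tokens are split on
    single spaces, so attached punctuation would prevent a match.
    """
    # Look up the lowercased token; fall back to the token itself.
    return " ".join(words.get(token.lower(), token) for token in text.split(" "))


words = {"color": "colour", "practice": "practise"}
result = replace_words(words, "practice changing the Color")
print(result)
```

Note that the replacement comes from the dictionary as-is, so a capitalised "Color" becomes lowercase "colour", which matches the output the question asks for.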
Just for the record: I found an awesome implementation in Requests:
https://github.com/kennethreitz/requests/blob/v1.2.3/requests/structures.py#L37
In my particular instance, I needed a case-insensitive lookup, but I did not want to modify the original case of the key. For example:
>>> d = {}
>>> d['MyConfig'] = 'value'
>>> d['myconfig'] = 'new_value'
>>> d
{'MyConfig': 'new_value'}
You can see that the dictionary still has the original key, however it is accessible case-insensitively. Here’s a simple solution:
class CaseInsensitiveKey(object):
    def __init__(self, key):
        self.key = key
    def __hash__(self):
        return hash(self.key.lower())
    def __eq__(self, other):
        return self.key.lower() == other.key.lower()
    def __str__(self):
        return self.key
The __hash__ and __eq__ overrides are required for both getting and setting entries in the dictionary. This is creating keys that hash to the same position in the dictionary if they are case-insensitively equal.
Now either create a custom dictionary that initializes a CaseInsensitiveKey using the provided key:
class CaseInsensitiveDict(dict):
    def __setitem__(self, key, value):
        key = CaseInsensitiveKey(key)
        super(CaseInsensitiveDict, self).__setitem__(key, value)
    def __getitem__(self, key):
        key = CaseInsensitiveKey(key)
        return super(CaseInsensitiveDict, self).__getitem__(key)
or simply make sure to always pass an instance of CaseInsensitiveKey as the key when using the dictionary.
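As a rough Python 3 illustration of that second option, the sketch below uses a plain dict and wraps every key by hand. The isinstance guard in __eq__ is an addition for this example, not part of the class above; it keeps comparisons against unrelated objects from raising AttributeError.

```python
class CaseInsensitiveKey:
    def __init__(self, key):
        self.key = key

    def __hash__(self):
        return hash(self.key.lower())

    def __eq__(self, other):
        # Added guard (assumption): only compare against other wrappers.
        if not isinstance(other, CaseInsensitiveKey):
            return NotImplemented
        return self.key.lower() == other.key.lower()

    def __str__(self):
        return self.key


d = {}
d[CaseInsensitiveKey("MyConfig")] = "value"
# Equal key: the value is replaced, the first key object is kept.
d[CaseInsensitiveKey("myconfig")] = "new_value"

stored_keys = [str(k) for k in d]
looked_up = d[CaseInsensitiveKey("MYCONFIG")]
print(stored_keys, looked_up)
```

The dictionary keeps the key object from the first insertion, which is why the original spelling "MyConfig" survives the second assignment.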
The currently accepted answer wouldn’t work for lots of cases, so it cannot be used as a drop-in dict replacement. Some tricky points in getting a proper dict replacement:
- overloading all of the methods that involve keys
- properly handling non-string keys
- properly handling the constructor of the class
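These gaps are easy to reproduce. The Python 3 sketch below uses a hypothetical NaiveCaseInsensitiveDict that overrides only __setitem__ and __getitem__, and records what goes wrong in each of the three areas listed above:

```python
class NaiveCaseInsensitiveDict(dict):
    """Hypothetical minimal subclass: only item get/set are overridden."""

    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)

    def __getitem__(self, key):
        return super().__getitem__(key.lower())


# The dict constructor does not go through the overridden __setitem__,
# so "Color" is stored with its original case.
d = NaiveCaseInsensitiveDict({"Color": "colour"})
d["Practice"] = "practise"  # stored under "practice"

constructor_kept_case = "Color" in d and "color" not in d

# get() and `in` are inherited from dict and stay case-sensitive.
get_misses = d.get("PRACTICE") is None
contains_misses = "PRACTICE" not in d

# Non-string keys fail because int has no .lower() method.
try:
    d[1] = "one"
    non_string_key_ok = True
except AttributeError:
    non_string_key_ok = False

print(constructor_kept_case, get_misses, contains_misses, non_string_key_ok)
```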
The following should work much better:
class CaseInsensitiveDict(dict):
    @classmethod
    def _k(cls, key):
        # basestring exists only in Python 2; on Python 3 use str instead
        return key.lower() if isinstance(key, basestring) else key
    def __init__(self, *args, **kwargs):
        super(CaseInsensitiveDict, self).__init__(*args, **kwargs)
        self._convert_keys()
    def __getitem__(self, key):
        return super(CaseInsensitiveDict, self).__getitem__(self.__class__._k(key))
    def __setitem__(self, key, value):
        super(CaseInsensitiveDict, self).__setitem__(self.__class__._k(key), value)
    def __delitem__(self, key):
        return super(CaseInsensitiveDict, self).__delitem__(self.__class__._k(key))
    def __contains__(self, key):
        return super(CaseInsensitiveDict, self).__contains__(self.__class__._k(key))
    # has_key is a Python 2 dict method; it does not exist on Python 3
    def has_key(self, key):
        return super(CaseInsensitiveDict, self).has_key(self.__class__._k(key))
    def pop(self, key, *args, **kwargs):
        return super(CaseInsensitiveDict, self).pop(self.__class__._k(key), *args, **kwargs)
    def get(self, key, *args, **kwargs):
        return super(CaseInsensitiveDict, self).get(self.__class__._k(key), *args, **kwargs)
    def setdefault(self, key, *args, **kwargs):
        return super(CaseInsensitiveDict, self).setdefault(self.__class__._k(key), *args, **kwargs)
    def update(self, E={}, **F):
        super(CaseInsensitiveDict, self).update(self.__class__(E))
        super(CaseInsensitiveDict, self).update(self.__class__(**F))
    def _convert_keys(self):
        for k in list(self.keys()):
            v = super(CaseInsensitiveDict, self).pop(k)
            self.__setitem__(k, v)
I’ve modified the simple yet good solution by pleasemorebacon (thanks!), making it slightly more compact and self-contained, with minor updates to allow construction from {'a':1, 'B':2} and to support the __contains__ protocol.
Finally, since CaseInsensitiveDict.Key is expected to be a string (what else can be case-sensitive or not?), it is a good idea to derive the Key class from str; then it is possible, for instance, to dump a CaseInsensitiveDict with json.dumps out of the box.
# caseinsensitivedict.py
class CaseInsensitiveDict(dict):
    class Key(str):
        def __init__(self, key):
            str.__init__(key)
        def __hash__(self):
            return hash(self.lower())
        def __eq__(self, other):
            return self.lower() == other.lower()
    def __init__(self, data=None):
        super(CaseInsensitiveDict, self).__init__()
        if data is None:
            data = {}
        for key, val in data.items():
            self[key] = val
    def __contains__(self, key):
        key = self.Key(key)
        return super(CaseInsensitiveDict, self).__contains__(key)
    def __setitem__(self, key, value):
        key = self.Key(key)
        super(CaseInsensitiveDict, self).__setitem__(key, value)
    def __getitem__(self, key):
        key = self.Key(key)
        return super(CaseInsensitiveDict, self).__getitem__(key)
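To see the json.dumps point in isolation, here is a small Python 3 sketch with a standalone Key class and a plain dict. It is a simplified stand-in, not the class above; the isinstance check in __eq__ is an assumption added so that comparing with non-strings returns False instead of raising.

```python
import json


class Key(str):
    """A str subclass that hashes and compares case-insensitively."""

    def __hash__(self):
        return hash(self.lower())

    def __eq__(self, other):
        # Added guard (assumption): non-strings are simply unequal.
        return isinstance(other, str) and self.lower() == other.lower()


d = {Key("a"): 1, Key("B"): 2}

# Lookups ignore case because hashing and equality use the lowercase form.
value_for_upper_a = d[Key("A")]

# json accepts the keys because they are still str instances, and the
# original spelling of each key is what gets written out.
s = json.dumps(d)
print(value_for_upper_a, s)
```

A wrapper class that does not inherit from str would be rejected by json.dumps with a TypeError about the key type, which is the practical reason for deriving from str here.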
Here is a basic test script for those who like to check things in action:
# test_CaseInsensitiveDict.py
import json
import unittest
from caseinsensitivedict import *
class Key(unittest.TestCase):
    def setUp(self):
        self.Key = CaseInsensitiveDict.Key
        self.lower = self.Key('a')
        self.upper = self.Key('A')
    def test_eq(self):
        self.assertEqual(self.lower, self.upper)
    def test_hash(self):
        self.assertEqual(hash(self.lower), hash(self.upper))
    def test_str(self):
        self.assertEqual(str(self.lower), 'a')
        self.assertEqual(str(self.upper), 'A')
class Dict(unittest.TestCase):
    def setUp(self):
        self.Dict = CaseInsensitiveDict
        self.d1 = self.Dict()
        self.d2 = self.Dict()
        self.d1['a'] = 1
        self.d1['B'] = 2
        self.d2['A'] = 1
        self.d2['b'] = 2
    def test_contains(self):
        self.assertIn('B', self.d1)
        d = self.Dict({'a':1, 'B':2})
        self.assertIn('b', d)
    def test_init(self):
        d = self.Dict()
        self.assertFalse(d)
        d = self.Dict({'a':1, 'B':2})
        self.assertTrue(d)
    def test_items(self):
        self.assertDictEqual(self.d1, self.d2)
        self.assertEqual(
            [v for v in self.d1.items()],
            [v for v in self.d2.items()])
    def test_json_dumps(self):
        s = json.dumps(self.d1)
        self.assertIn('a', s)
        self.assertIn('B', s)
    def test_keys(self):
        self.assertEqual(self.d1.keys(), self.d2.keys())
    def test_values(self):
        self.assertEqual(
            [v for v in self.d1.values()],
            [v for v in self.d2.values()])
I just set up a function to handle this:
def setLCdict(d, k, v):
    k = k.lower()
    d[k] = v
    return d
myDict = {}
So instead of
myDict['A'] = 1
myDict['B'] = 2
You can:
myDict = setLCdict(myDict, 'A', 1)
myDict = setLCdict(myDict, 'B', 2)
You can then either lowercase the key before looking it up or write a function to do so.
def lookupLCdict(d, k):
    k = k.lower()
    return d[k]
myVal = lookupLCdict(myDict, 'a')
Probably not ideal if you want to do this globally, but it works well if it’s just a subset you wish to use it for.
You can do a dict key case insensitive search with a one liner:
>>> input_dict = {'aBc':1, 'xyZ':2}
>>> search_string = 'ABC'
>>> next((value for key, value in input_dict.items() if key.lower()==search_string.lower()), None)
1
>>> search_string = 'EFG'
>>> next((value for key, value in input_dict.items() if key.lower()==search_string.lower()), None)
>>>
You can place that into a function:
def get_case_insensitive_key_value(input_dict, key):
    return next((value for dict_key, value in input_dict.items() if dict_key.lower() == key.lower()), None)
Note that only the first match is returned.
If you only need to do this once in your code (hence, no point to a function), the most straightforward way to deal with the problem is this:
lowercase_dict = {key.lower(): value for (key, value) in original_dict.items()}
I’m assuming here that the dict in question isn’t all that large–it might be inelegant to duplicate it, but if it’s not large, it isn’t going to hurt anything.
The advantage of this over @Fred’s answer (though that also works) is that it produces the same result as a dict when the key isn’t present: a KeyError.
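A short Python 3 sketch of that behaviour, with made-up sample data and a hypothetical lookup helper. A key that is present is found regardless of case, and a key that is absent raises KeyError just as a normal dict would:

```python
original_dict = {"aBc": 1, "xyZ": 2}  # sample data for illustration

# Build the lowercase copy once; .items() yields (key, value) pairs.
lowercase_dict = {key.lower(): value for key, value in original_dict.items()}


def lookup(key):
    """Hypothetical helper: lowercase the key, then use a normal lookup."""
    return lowercase_dict[key.lower()]


found = lookup("ABC")

try:
    lookup("EFG")
    missing_raises = False
except KeyError:
    missing_raises = True

print(found, missing_raises)
```

If two original keys differ only by case, the later one silently overwrites the earlier one in the copy, so this is best suited to dictionaries where that cannot happen.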
There are multiple approaches to this problem, each with its own set of pros and cons. Just to add to the list (it looks like this option wasn’t mentioned), it’s possible to extend the str class and use it as a key:
class CaseInsensitiveStr(str):
    def __hash__(self) -> 'int':
        return hash(self.lower())
    def __eq__(self, other:'str') -> 'bool':
        return self.lower() == other.lower()
It can work well if the dictionary in question is private and some kind of interface is used to access it.
class MyThing:
    def __init__(self):
        self._d: 'dict[CaseInsensitiveStr, int]' = dict()
    def set(self, key:'str', value:'int'):
        self._d[CaseInsensitiveStr(key)] = value
    def get(self, key:'str') -> 'int':
        return self._d[CaseInsensitiveStr(key)]
Or…if you’d rather use an off-the-shelf product rather than hacking it yourself…try…
https://pypi.org/project/case-insensitive-dictionary/