Using shingles and fuzziness in Elasticsearch Python DSL?


How do you call shingles in Python DSL?

This is a simple example that searches for a phrase in the “name” field and another one in the “surname” field.

import json
from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search, Q

def make_dsl_query(fields):
    """Construct a query."""
    es_client = Elasticsearch()
    my_query = Search(using=es_client, index="my_index", doc_type="my_type")

    if fields['name'] and fields['surname']:
        my_query = my_query.query(Q('bool', should=
                   [Q("match", name=fields['name']),
                    Q("match", surname=fields['surname'])]))
    return my_query

if __name__ == '__main__':

    my_query = make_dsl_query(fields={"name": "Ivan The Terrible", "surname": "Conqueror of the World"})
    response = my_query.execute()

    # print response
    for hit in response:
        print(hit.meta.score, hit.name, hit.surname)

1) Is it possible to use shingles? And how? I’ve tried many things and can’t find anything in the documentation on it.

This would work in a normal Elasticsearch query, but it apparently has to be written differently in the Python DSL:

my_query = my_query.query(Q('bool', should=
                   [Q("match", name.shingles=fields['name']),
                    Q("match", surname.shingles=fields['surname'])]))

2) How do I pass fuzziness parameters to my match? Can’t seem to find anything on it either. Ideally I would be able to do something like this:

my_query = my_query.query(Q('bool', should=
                   [Q("match", name=fields['name'], fuzziness="AUTO", max_expansions=10),
                    Q("match", surname=fields['surname'])]))
Asked By: Ivan Bilan



To use shingles you need to define them in your mappings; it’s too late to try to use them at query time. What you can do at query time is use a match_phrase query.
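
As a rough sketch of what defining shingles in the mappings can look like, the index body below adds a shingle token filter, an analyzer that uses it, and a `shingles` sub-field on `name` and `surname`. The names `build_index_body`, `shingle_filter` and `shingle_analyzer`, the shingle sizes, and the typeless mapping layout (Elasticsearch 7+) are assumptions for illustration, not taken from the question:

```python
import json


def build_index_body(fields=("name", "surname")):
    """Return a hypothetical index body with a `shingles` sub-field per field."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Emit word 2-grams and 3-grams only (sizes are an assumption).
                    "shingle_filter": {
                        "type": "shingle",
                        "min_shingle_size": 2,
                        "max_shingle_size": 3,
                        "output_unigrams": False,
                    }
                },
                "analyzer": {
                    "shingle_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "shingle_filter"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                # Each field keeps its default analysis and gains `<field>.shingles`.
                field: {
                    "type": "text",
                    "fields": {
                        "shingles": {"type": "text", "analyzer": "shingle_analyzer"}
                    },
                }
                for field in fields
            }
        },
    }


index_body = build_index_body()
print(json.dumps(index_body, indent=2))
```

A body like this would be supplied when the index is created; documents indexed before the sub-field existed need to be reindexed before `name.shingles` can match anything.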

my_query = my_query.query(Q('bool', should=
               [Q("match", name.shingles=fields['name']),
                Q("match", surname.shingles=fields['surname'])]))

This should work if written as:

 my_query = my_query.query(Q('bool', should=
               [Q("match", name__shingles=fields['name']),
                Q("match", surname__shingles=fields['surname'])]))

Assuming you have the shingles field defined on both name and surname fields.

Note that you can also use the | operator:

 my_query = Q("match", name__shingles=fields['name']) | Q("match", surname__shingles=fields['surname'])

instead of constructing the bool query yourself.
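
For comparison, here is the plain query body that both forms should serialize to, assuming the `shingles` sub-fields exist. The double underscore in the keyword argument stands in for the dot in the field name, since a dot is not valid in a Python keyword. The helper name `shingle_should_query` is made up for this sketch:

```python
def shingle_should_query(fields):
    """Build a bool/should body over the `.shingles` sub-fields (sketch)."""
    return {
        "bool": {
            "should": [
                # Dotted names address the sub-field defined in the mapping.
                {"match": {"name.shingles": fields["name"]}},
                {"match": {"surname.shingles": fields["surname"]}},
            ]
        }
    }


body = shingle_should_query(
    {"name": "Ivan The Terrible", "surname": "Conqueror of the World"}
)
print(body)
```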

Hope this helps.

Answered By: Honza Král

As of January, 2023: elasticsearch-dsl does support fuzzy matches, but it’s just not very well documented.

For simple fuzzy matches:

Q('fuzzy', fieldName=matchString)

When you want to set a custom fuzziness:

Q({"fuzzy": {"yourFieldName": {"value": matchString, "fuzziness": fuzziness}}})
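
Since Q accepts a raw dict as shown above, the same approach could also carry a match query with fuzziness, which is closer to what the question asked for. The sketch below only builds the dicts; the helper names `fuzzy_term` and `fuzzy_match` and their defaults are assumptions for illustration. A `fuzzy` query is term-level and does not analyze its input, whereas `match` analyzes the text first:

```python
def fuzzy_term(field, value, fuzziness="AUTO"):
    """Body for a term-level fuzzy query on a single field (sketch)."""
    return {"fuzzy": {field: {"value": value, "fuzziness": fuzziness}}}


def fuzzy_match(field, text, fuzziness="AUTO", max_expansions=10):
    """Body for an analyzed match query with fuzziness options (sketch)."""
    return {
        "match": {
            field: {
                "query": text,
                "fuzziness": fuzziness,
                "max_expansions": max_expansions,
            }
        }
    }


term_body = fuzzy_term("name", "ivan", fuzziness=1)
match_body = fuzzy_match("name", "Ivan The Terrible")
print(term_body)
print(match_body)
```

Either dict could then be wrapped as `Q(term_body)` or `Q(match_body)` and combined with other queries in the usual way.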

My understanding is that the fuzzy keyword is just a wrapper for a standard query, see


  1. (solution courtesy of @leberknecht on github)
Answered By: bucephalopod