Using Solr in Docker with Python returns wrong results

Question:

I have a Flask app that runs in a Docker container, and I want to use Solr with it for indexing and searching. I built a Solr container from the official Solr image and connected it to my app with docker-compose.
The app has several types of objects to index, for example type1 and type2, and each type has its own fields. As a result, the documents in Solr have different fields: doc1 might have field1 and field2, while doc2 might have field3, field4 and field5. Every document also has a field called type that specifies its type.

I run two kinds of search. The first looks for documents of one specific type; here is an example URL, which I send with the Python requests package:

response = requests.get("http://solr:8983/solr/myCollection/select?q=*val*&defType=edismax&fq=type:type1&qf=field1^2&qf=field2^1")

The second is an overall search across documents of all types; here is an example URL:

response = requests.get("http://solr:8983/solr/myCollection/select?q=*val*&defType=edismax&fq=type:type1||type2&qf=field1^1&qf=field2^1&qf=field3^1&qf=field4^1&qf=field1^1")
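A side note on the two requests above: in a hand-written URL a literal `+` is decoded as a space, and in `fq=type:type1||type2` the `type:` prefix most likely applies only to `type1`. The following is a minimal sketch of one way to build such a URL, assuming the standard Solr query syntax characters; `escape_solr_term` and `build_select_url` are hypothetical helper names, not part of Solr or requests.

```python
from urllib.parse import urlencode

# Characters that the standard Solr/Lucene query syntax treats as operators.
SOLR_SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\/')


def escape_solr_term(text):
    """Backslash-escape query-syntax characters so they are matched literally."""
    return "".join("\\" + ch if ch in SOLR_SPECIAL_CHARS else ch for ch in text)


def build_select_url(base_url, text, types, boosts):
    """Build a /select URL with every parameter value percent-encoded."""
    params = [
        ("q", escape_solr_term(text)),
        ("defType", "edismax"),
        # Parentheses keep every alternative attached to the `type` field.
        ("fq", "type:(" + " OR ".join(types) + ")"),
        # One qf value holding space-separated field^boost pairs.
        ("qf", " ".join(f"{field}^{boost}" for field, boost in boosts.items())),
    ]
    return base_url + "/select?" + urlencode(params)
```

The returned string can be passed straight to `requests.get`. Whether an escaped value such as `z=x\+y\*f` actually matches still depends on how the field is analyzed at index and query time.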

I have two problems with my work:

  1. Some queries don’t return the results I expect.
  2. Some fields hold values with special characters, such as (z=x+y*f), and escaping those characters in the query doesn’t work.

So, is there something wrong with the queries I wrote? And is there an article or tutorial that could help? I have searched the documentation and the internet extensively but couldn’t find a way to solve these problems.

Note: I didn’t change the schema file; I left it at its default.

Asked By: hasan bilal


Answers:

I solved both problems by configuring tokenizers and filters for indexing and querying.
You can set them through the Schema API that Solr provides.
Here is an example of the JSON data that adds tokenizers and filters to a field type:

{
    "replace-field-type": {
        "name": "field_name",
        "class": "solr.TextField",
        "multiValued": true,
        "indexAnalyzer": {
            "tokenizer": {
                "class": "solr.LowerCaseTokenizerFactory"
            },
            "filters": [
                {
                    "class": "solr.LowerCaseFilterFactory"
                }
            ]
        },
        "queryAnalyzer": {
            "tokenizer": {
                "class": "solr.WhitespaceTokenizerFactory",
                "rule": "java"
            },
            "filters": [
                {
                    "class": "solr.LowerCaseFilterFactory"
                }
            ]
        }
    }
}
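
As a rough sketch of how a payload like the one above could be sent from Python, the snippet below wraps a field type definition in a `replace-field-type` command and POSTs it to the collection's `/schema` endpoint. It uses only the standard library; `build_schema_request` and `send` are hypothetical helper names, and the collection URL in the usage note is taken from the question rather than from any required convention.

```python
import json
import urllib.request


def build_schema_request(collection_url, field_type):
    """Wrap a field type definition in a replace-field-type Schema API command."""
    # json.dumps converts Python's True/False into JSON true/false.
    body = json.dumps({"replace-field-type": field_type}).encode("utf-8")
    return urllib.request.Request(
        collection_url + "/schema",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the prepared request and return Solr's decoded JSON response."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

For example, `send(build_schema_request("http://solr:8983/solr/myCollection", field_type))`, where `field_type` is the inner dictionary of the payload. With requests, `requests.post(url, json=payload)` should achieve the same thing. Documents indexed before the change need to be reindexed for the new index-time analysis to apply.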
Answered By: hasan bilal