curl to python requests conversion for solr query

Question:

I have a bit of a bizarre problem. I have a Solr index, which I query using curl like so:

curl 'http://localhost:8984/solr/my_index/select?indent=on&q="galvin%20life%20sciences"~0&wt=json&sort=_docid_%20desc&rows=5'

and I get (note the q string and the tilde operator which I use for proximity search):

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "q":"\"galvin life sciences\"~0",
      "indent":"on",
      "sort":"_docid_ desc",
      "rows":"5",
      "wt":"json"}},
  "response":{"numFound":61,"start":0,"numFoundExact":true,"docs":[

Now, I am trying to replicate the same thing in Python using:

resp=requests.get('http://localhost:8984/solr/my_index/select?q=' + "galvin%20life%20sciences"+"~0" + '&wt=json&rows=5&start=0&fl=id,org*,score')

and I get this:

[
    {
        "responseHeader": {
            "status": 0,
            "QTime": 0,
            "params": {
                "q": "galvin life sciences~0",
                "fl": "id,org*,score",
                "start": "0",
                "rows": "5",
                "wt": "json"
            }
        },
        "response": {
            "numFound": 3505398,
            "start": 0,
            "maxScore": 9.792607,
            "numFoundExact": true,
            "docs": [

You can see that the queries are somehow different:

curl: "q":"\"galvin life sciences\"~0",
requests: "q": "galvin life sciences~0",

so I am getting wrong results when using requests.

I am not sure what I should do in requests to make the queries match.

I have tried the solution of @Mats:

requests.get('http://localhost:8984/solr/my_index/select', params={
  'q': '"galvin life sciences"~0',
  'wt': 'json',
  'rows': 5,
  'start': 0,
  'fl': 'id,org*,score',
})

but now I can't work out how to pass a variable into it (how annoying). So I have:

q_solr="Galvin life sciences"
requests.get('http://localhost:8984/solr/my_index/select', params={
  'q': q_solr+'~0',
  'wt': 'json',
  'rows': 5,
  'start': 0,
  'fl': 'id,org*,score',
})

but this gives me no results at all.

Asked By: AJW

||

Answers:

You can either use requests’ built-in support for building the URL parameters (which is what I’d recommend, since it keeps the parameters cleanly separated and requests handles the escaping for you):

requests.get('http://localhost:8984/solr/my_index/select', params={
  'q': '"galvin life sciences"~0',
  'wt': 'json',
  'rows': 5,
  'start': 0,
  'fl': 'id,org*,score',
})

Otherwise you can build the URL yourself as you’ve done. The problem in your version is that the double quotes around galvin%20life%20sciences are Python string delimiters, not part of the string: concatenating '...select?q=' with "galvin%20life%20sciences" gives q=galvin..., not q="galvin.... There’s no need to end the previous string when the next piece is a literal anyway; just put the " inside the single-quoted string. If a string ever contains the same quote character that delimits it, escape that character with a backslash.

resp=requests.get('http://localhost:8984/solr/my_index/select?q="galvin%20life%20sciences"~0&wt=json&rows=5&start=0&fl=id,org*,score')
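To see what the concatenation in the question actually produces, the strings can be inspected without touching Solr. This is only a minimal sketch using the standard library; the variable names are mine, and urllib.parse.quote is used just to show one way the quoted phrase can be percent-encoded, which is not necessarily byte-for-byte what requests sends.

```python
from urllib.parse import quote

# The double quotes here only delimit Python strings; they never reach the URL.
broken = 'select?q=' + "galvin%20life%20sciences" + "~0"

# Putting the double quotes inside a single-quoted string keeps them in the value.
fixed = 'select?q="galvin%20life%20sciences"~0'

# Percent-encoding the raw phrase query: '"' becomes %22 and ' ' becomes %20.
encoded = quote('"galvin life sciences"~0')

print(broken)   # select?q=galvin%20life%20sciences~0
print(fixed)    # select?q="galvin%20life%20sciences"~0
print(encoded)
```

The first line printed matches the q that Solr echoed back for the requests call in the question, which is why it was treated as three loose terms instead of a phrase.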

But use the first form unless you’re getting a preformatted URL from a different source.
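As for the follow-up in the question, where the phrase comes from a variable: q_solr+'~0' sends Galvin life sciences~0 with no double quotes, so Solr does not see a phrase query. The quotes have to be added around the variable's value. Below is a rough sketch of that; phrase_query and search are hypothetical helper names, and backslash-escaping any embedded backslashes and double quotes is my assumption about what typical input needs, so check it against the query parser you use.

```python
def phrase_query(phrase, slop=0):
    """Wrap a plain phrase in double quotes and append ~slop (proximity)."""
    # Assumption: escaping embedded backslashes and double quotes is sufficient.
    escaped = phrase.replace('\\', '\\\\').replace('"', '\\"')
    return f'"{escaped}"~{slop}'


def search(phrase, base_url='http://localhost:8984/solr/my_index/select'):
    """Send the phrase query; the URL and field list are taken from the question."""
    import requests  # imported here so phrase_query can be tried without it

    params = {
        'q': phrase_query(phrase),
        'wt': 'json',
        'rows': 5,
        'start': 0,
        'fl': 'id,org*,score',
    }
    return requests.get(base_url, params=params)


q_solr = "Galvin life sciences"
print(phrase_query(q_solr))  # "Galvin life sciences"~0
```

With this, search(q_solr) should send the same kind of q value as the curl command, and requests still takes care of the URL escaping.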

Answered By: MatsLindh