How can I use custom filters in the Stack Exchange API?

Question:

I am trying to fetch questions and answers with StackAPI to train a deep learning model. My problem is that I don’t understand how to use custom filters so that I only get the body of the question.

This is my code:

from stackapi import StackAPI
import torch
import torch.nn as nn

SITE = StackAPI('stackoverflow')
SITE.max_pages = 1
SITE.page_size = 1
data = SITE.fetch('questions', tagged='python', filter='!*SU8CGYZitCB.D*(BDVIficKj7nFMLLDij64nVID)N9aK3GmR9kT4IzT*5iO_1y3iZ)6W.G*', sort='votes')
for quest in data['items']:
    question = quest['title']
    print(question)
    question_id = quest['question_id']
    print(question_id)
    dataAnswer = SITE.fetch('questions/{ids}/answers', ids=[question_id], filter='withbody')
    print(dataAnswer)

My results for dataAnswer:

{'backoff': 0, 'has_more': True, 'page': 1, 'quota_max': 300, 'quota_remaining': 300, 'total': 0, 'items': [{'owner': {'reputation': 404, 'user_id': 11182732, 'user_type': 'registered', 'profile_image': 'https://lh6.googleusercontent.com/-F2a9OP4yGHc/AAAAAAAAAAI/AAAAAAAADVo/Mn4oVgim-m8/photo.jpg?sz=128', 'display_name': 'Aditya patil', 'link': 'https://stackoverflow.com/users/11182732/aditya-patil'}, 'is_accepted': False, 'score': 8, 'last_activity_date': 1609856797, 'last_edit_date': 1609856797, 'creation_date': 1587307868, 'answer_id': 61306333, 'question_id': 231767, 'content_license': 'CC BY-SA 4.0', 'body': '<p><strong>The yield keyword is going to replace return in a function definition to create a generator.</strong></p>\n<pre><code>def create_generator():\n   for i in range(100):\n   yield i\nmyGenerator = create_generator()\nprint(myGenerator)\n# &lt;generator object create_generator at 0x102dd2480&gt;\nfor i in myGenerator:\n   print(i) # prints 0-99\n</code></pre>\n<p>When the returned generator is first used—not in the assignment but the for loop—the function definition will execute until it reaches the yield statement. There, it will pause (see why it’s called yield) until used again. Then, it will pick up where it left off. Upon the final iteration of the generator, any code after the yield command will execute.</p>\n<pre><code>def create_generator():\n   print(&quot;Beginning of generator&quot;)\n   for i in range(4):\n      yield i\n   print(&quot;After yield&quot;)\nprint(&quot;Before assignment&quot;)\n\nmyGenerator = create_generator()\n\nprint(&quot;After assignment&quot;)\nfor i in myGenerator :\n   print(i) # prints 0-3\n&quot;&quot;&quot;\nBefore assignment\nAfter assignment\nBeginning of generator\n0\n1\n2\nAfter yield\n</code></pre>\n<p>The <strong>yield</strong> keyword modifies a function’s behavior to produce a generator that’s paused at each yield command during iteration. The function isn’t executed except upon iteration, which leads to improved resource management, and subsequently, a better overall performance. Use generators (and yielded functions) for creating large data sets meant for single-use iteration.</p>\n'}]}

Now I want to get just the body of the result. Can I replace the withbody filter with a custom one, and if so, which one?

Asked By: Neminem


Answers:

  1. Select your method from the API docs. In this case, it’s the /questions/{ids}/answers one.
  2. Click [edit] next to default filter, edit the fields you want, then click save.
  3. Copy the filter that appears and paste it in your code.
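
Once a filter has been copied, it is passed through the same filter argument that the question's code already uses for withbody. The following is a minimal sketch rather than a definitive implementation: the helper name fetch_answer_bodies is hypothetical, the site argument is assumed to be any object exposing StackAPI's fetch method, and the filter string in the usage comment is a placeholder, not a real filter.

```python
def fetch_answer_bodies(site, question_id, answer_filter):
    """Return the HTML body of every answer to one question.

    `site` is assumed to behave like a StackAPI instance, i.e. to offer
    fetch(endpoint, **params) returning a dict with an 'items' list.
    """
    response = site.fetch(
        'questions/{ids}/answers',
        ids=[question_id],
        filter=answer_filter,
    )
    # Skip items that carry no 'body' key, which happens when the
    # filter does not include answer.body.
    return [item['body'] for item in response.get('items', []) if 'body' in item]


# Hypothetical usage (requires network access and the stackapi package):
#   from stackapi import StackAPI
#   SITE = StackAPI('stackoverflow')
#   bodies = fetch_answer_bodies(SITE, 231767, 'PASTE_YOUR_FILTER_HERE')
```

Keeping the filter as a parameter makes it easy to swap withbody for a custom filter without touching the rest of the code.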

"filter" dropdown from the SE API docs playground

Creating a filter programmatically is complicated due to the lack of (proper) documentation for the /filters/create method. Since you want the body of an answer, you’ll need to include answer.body in the filter, as well as the default .wrapper fields. For example:

from stackapi import StackAPI

# Fields of the default response wrapper; with base='none' they must be listed explicitly
defaultWrapper = '.backoff;.error_id;.error_message;.error_name;.has_more;.items;.quota_max;.quota_remaining;'
# Extra fields to return, separated by semicolons
includes = 'answer.body'

SITE = StackAPI('stackoverflow')
# See https://stackapi.readthedocs.io/en/latest/user/advanced.html#end-points-that-don-t-accept-site-parameter
SITE._api_key = None
data = SITE.fetch('filters/create', base='none', include=defaultWrapper + includes)
# Print the generated filter string, to be passed as filter=... in later requests
print(data['items'][0]['filter'])

Change includes to list whichever fields you need, separated by semicolons.
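
If several fields are needed, the include string can be assembled from a list instead of being typed by hand. This is only a sketch under stated assumptions: build_include and DEFAULT_WRAPPER_FIELDS are hypothetical names, the wrapper fields are copied from the snippet above, and answer.score is used purely as an example of a second field.

```python
# Wrapper fields copied from the answer's snippet
DEFAULT_WRAPPER_FIELDS = [
    '.backoff', '.error_id', '.error_message', '.error_name',
    '.has_more', '.items', '.quota_max', '.quota_remaining',
]


def build_include(fields):
    """Join the wrapper fields and the requested fields with semicolons."""
    return ';'.join(DEFAULT_WRAPPER_FIELDS + list(fields))


# Example: request both the body and the score of each answer
include = build_include(['answer.body', 'answer.score'])
print(include)
```

The resulting string can then be passed as the include argument of the filters/create call.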


Answered By: double-beep