Unable to stop Streaming in tweepy after one minute

Question:

I am trying to stream Twitter data for a fixed period, say 5 minutes, using the Stream.filter() method, and I store the retrieved tweets in a JSON file. The problem is that I cannot stop filter() from within the program; I have to stop the execution manually. I tried stopping the stream based on system time using the time package. That stopped tweets from being written to the JSON file, but the stream itself kept running, so execution never continued to the next line of code.
I am using IPython Notebook to write and execute the code.
Here’s the code:

import time

import tweepy
from tweepy import OAuthHandler, Stream
from tweepy.streaming import StreamListener

auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)

class MyListener(StreamListener):

    def __init__(self, start_time, time_limit=60):
        self.time = start_time
        self.limit = time_limit

    def on_data(self, data):
        while (time.time() - self.time) < self.limit:
            try:
                saveFile = open('abcd.json', 'a')
                saveFile.write(data)
                saveFile.write('\n')
                saveFile.close()
                return True
            except BaseException as e:
                print 'failed ondata,', str(e)
                time.sleep(5)
        return True

    def on_status(self, status):
        if (time.time() - self.time) >= self.limit:
            print 'time is over'
            return false

    def on_error(self, status):
        if (time.time() - self.time) >= self.limit:
            print 'time is over'
            return false
        else:
            print(status)
            return True

start_time = time.time()
stream_data = Stream(auth, MyListener(start_time,20))
stream_data.filter(track=['name1','name2',...list ...,'name n'])#list of the strings I want to track

These questions are similar, but they do not answer my question directly:

Tweepy: Stream data for X minutes?

Stopping Tweepy steam after a duration parameter (# lines, seconds, #Tweets, etc)

Tweepy Streaming – Stop collecting tweets at x amount

I used this link as my reference,
http://stats.seandolinar.com/collecting-twitter-data-using-a-python-stream-listener/

Asked By: Abin


Answers:

  1. In order to close the stream you need to return False from on_data(), or on_status().

  2. Because tweepy.Stream() runs a while loop itself, you don’t need the while loop in on_data().

  3. When initializing MyListener, you didn’t call the parent’s class __init__ method, so it wasn’t initialized properly.

So for what you’re trying to do, the code should be something like:

import time
import tweepy

class MyStreamListener(tweepy.StreamListener):
    def __init__(self, time_limit=60):
        self.start_time = time.time()
        self.limit = time_limit
        self.saveFile = open('abcd.json', 'a')
        super(MyStreamListener, self).__init__()

    def on_data(self, data):
        if (time.time() - self.start_time) < self.limit:
            self.saveFile.write(data)
            self.saveFile.write('\n')
            return True
        else:
            # Time is up: close the file and return False to stop the stream.
            self.saveFile.close()
            return False

myStream = tweepy.Stream(auth=api.auth, listener=MyStreamListener(time_limit=20))
myStream.filter(track=['test'])
Answered By: yprez

Instead of passing MyListener() directly to Stream, keep the listener in a variable, then set its running attribute to False once the timeout has elapsed:

myListener = MyListener()
# timeout code here, such as time.sleep(20)
myListener.running = False 
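
One thing the snippet above leaves implicit is that filter() blocks, so the timeout code can only run if the stream is reading in another thread. The sketch below illustrates that pattern with a hypothetical FakeStream stand-in (it is not Tweepy's Stream class) and a hypothetical run_for helper. It assumes the flag to clear is a running attribute on the stream object, as in the Tweepy source quoted in another answer here.

```python
import threading
import time


class FakeStream:
    """Stand-in for a blocking stream; NOT Tweepy's real Stream class."""

    def __init__(self):
        self.running = False
        self.batches = 0

    def filter(self):
        # Blocks the calling thread for as long as running stays True.
        while self.running:
            self.batches += 1
            time.sleep(0.01)


def run_for(stream, seconds):
    """Run stream.filter() in a background thread for roughly `seconds`."""
    stream.running = True
    worker = threading.Thread(target=stream.filter, daemon=True)
    worker.start()
    time.sleep(seconds)       # the "timeout code"
    stream.running = False    # ask the read loop to stop
    worker.join(timeout=1.0)
    return not worker.is_alive()
```

Calling run_for(FakeStream(), 0.05) returns once the background loop has exited, after which the next line of the program runs normally.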
Answered By: John

I was having this issue as well. Fortunately, Tweepy is open source, so it is easy to dig into the problem.

The important part is this:

def _data(self, data):
    if self.listener.on_data(data) is False:
        self.running = False

This is in the Stream class in streaming.py.

That means that, to close the connection, you just have to return False from the listener's on_data() method.
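
To see why the return value matters, here is a minimal sketch of that mechanism without any network access. FakeStream and TimeLimitedListener are hypothetical stand-ins written for illustration, not Tweepy classes; only the `is False` check mirrors the quoted source. Note that the check is an identity test, so a handler that returns None (for example by falling off the end of the method) does not stop the stream.

```python
import time


class TimeLimitedListener(object):
    """Hypothetical listener: accepts data until time_limit seconds have passed."""

    def __init__(self, time_limit=60, clock=time.time):
        self.clock = clock          # injectable so the demo is deterministic
        self.start_time = clock()
        self.limit = time_limit
        self.received = []

    def on_data(self, data):
        if self.clock() - self.start_time < self.limit:
            self.received.append(data)
            return True
        return False                # tells the stream to stop


class FakeStream(object):
    """Stand-in for the read loop; NOT Tweepy's real Stream class."""

    def __init__(self, listener):
        self.listener = listener
        self.running = False

    def _data(self, data):
        # Same check as in the Tweepy snippet quoted above.
        if self.listener.on_data(data) is False:
            self.running = False

    def run(self, source):
        self.running = True
        for item in source:
            if not self.running:
                break
            self._data(item)
        self.running = False


def demo():
    # Fake clock: each call advances by one "second".
    ticks = iter([0, 1, 2, 3, 4, 5])
    listener = TimeLimitedListener(time_limit=3, clock=lambda: next(ticks))
    stream = FakeStream(listener)
    stream.run(["a", "b", "c", "d", "e"])
    return listener.received, stream.running
```

In demo(), the third item arrives exactly at the limit, on_data() returns False, and the loop exits without processing the remaining items.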

Answered By: Eduardo Rocha

For those who are using Twitter API v2 (the StreamingClient class), here is the solution:

client.disconnect()
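
As a rough sketch of how that call could be tied to a time limit, the mixin below calls disconnect() from on_tweet() once the limit has passed. TimeLimitMixin, time_limit, start_time and tweets are hypothetical names introduced for illustration. The sketch assumes Tweepy 4.x, where StreamingClient provides on_tweet() and disconnect(); the mixin itself does not import Tweepy. The check only runs when a tweet arrives, so a quiet stream may stay open past the limit.

```python
import time


class TimeLimitMixin:
    """Hypothetical mixin: call disconnect() once time_limit seconds have passed."""

    def __init__(self, *args, time_limit=60, **kwargs):
        self.time_limit = time_limit
        self.start_time = time.time()
        self.tweets = []
        super().__init__(*args, **kwargs)

    def on_tweet(self, tweet):
        if time.time() - self.start_time >= self.time_limit:
            self.disconnect()   # provided by the class this is mixed into
        else:
            self.tweets.append(tweet)


# Usage with Tweepy 4.x (assumes a valid bearer token and network access):
#
#   import tweepy
#
#   class LimitedClient(TimeLimitMixin, tweepy.StreamingClient):
#       pass
#
#   client = LimitedClient("BEARER_TOKEN", time_limit=20)
#   client.add_rules(tweepy.StreamRule("test"))
#   client.filter()   # blocks until disconnect() is called
```

Keeping the timing logic in a mixin leaves the Tweepy-specific setup in one place and lets the limit be exercised against any object that has a disconnect() method.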

Answered By: no-stale-reads