Why am I getting RuntimeError: generator raised StopIteration, and how do I solve it?
Question:
I am making bigrams of the tokens stored in the list docToken.
print(docToken[520])
Output: ['sleepy', 'account', 'just', 'man', 'tired', 'twitter', 'case',
'romney', 'candidate', 'looks']
list(nltk.bigrams(docToken[520]))
Output: [('sleepy', 'account'), ('account', 'just'), ('just', 'man'),
('man', 'tired'), ('tired', 'twitter'), ('twitter', 'case'),
('case', 'romney'), ('romney', 'candidate'), ('candidate', 'looks')]
When I call nltk.bigrams(docToken[i]) in a loop, I get the following error once the range reaches 1000 or more:
bigram=[]
for i in range(5000):
    ls=list(nltk.bigrams(docToken[i]))
    for j in ls:
        bigram.append(list(j))
It works fine with range(500) in the first loop, but when the range is 1000 or more it gives me the following error:
StopIteration                             Traceback (most recent call last)
~\Anaconda3\lib\site-packages\nltk\util.py in ngrams(sequence, n, pad_left, pad_right, left_pad_symbol, right_pad_symbol)
    467     while n > 1:
--> 468         history.append(next(sequence))
    469         n -= 1
StopIteration:
The above exception was the direct cause of the following exception:
RuntimeError                              Traceback (most recent call last)
<ipython-input-76-8982951528bd> in <module>()
      1 bigram=[]
      2 for i in range(5000):
----> 3     ls=list(nltk.bigrams(docToken[i]))
      4     for j in ls:
      5         bigram.append(list(j))
~\Anaconda3\lib\site-packages\nltk\util.py in bigrams(sequence, **kwargs)
    489     """
    490
--> 491     for item in ngrams(sequence, 2, **kwargs):
    492         yield item
    493
RuntimeError: generator raised StopIteration
Answers:
I was not able to resolve this error, and I am not sure why nltk.bigrams(docToken[i])
raises it, but I was able to create bigrams with the following code.
bigram = {}
for i in range(len(docToken)):
    ls = []
    # Pair each token with its immediate successor; an empty list yields no pairs.
    for j in range(len(docToken[i]) - 1):
        ls.append([docToken[i][j], docToken[i][j + 1]])
    bigram[i] = ls
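Adjacent pairs can also be built without nltk at all, which sidesteps the error entirely. The following is only a minimal sketch, assuming docToken is a list of token lists as in the question; simple_bigrams is a hypothetical helper name, not something nltk provides.

```python
def simple_bigrams(tokens):
    """Return adjacent token pairs; empty or one-token lists give []."""
    # zip stops at the shorter input, so short lists never raise.
    return list(zip(tokens, tokens[1:]))


# Small stand-in for the real docToken, including an empty document.
docToken = [['the', 'wildlings', 'are', 'dead'], [], ['ser']]

# Keep the original document index as the key, as in the answer above.
bigram = {i: simple_bigrams(tokens) for i, tokens in enumerate(docToken)}
print(bigram)
```

Because zip simply produces nothing for an empty or one-token list, no filtering step is needed before the loop.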
I fixed this by upgrading nltk from 3.3 to 3.4:
pip install nltk==3.4
I faced the same error too. One possible reason is that one of the elements in docToken
is an empty list.
For example, the following code throws the same error when i=1,
because the second element is an empty list.
import nltk
from nltk import bigrams
docToken = [['the', 'wildlings', 'are', 'dead'], [], ['do', 'the', 'dead', 'frighten', 'you', 'ser', 'waymar']]
for i in range(3):
    print(i)
    print(list(nltk.bigrams(docToken[i])))
Output:
0
[('the', 'wildlings'), ('wildlings', 'are'), ('are', 'dead')]
1
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\util.py in ngrams(sequence, n, pad_left, pad_right, left_pad_symbol, right_pad_symbol)
    467     while n > 1:
--> 468         history.append(next(sequence))
    469         n -= 1
StopIteration:
The above exception was the direct cause of the following exception:
RuntimeError                              Traceback (most recent call last)
<ipython-input-58-91f35cae32ed> in <module>
      2 for i in range(3):
      3     print (i)
----> 4     list(nltk.bigrams(docToken[i]))
~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\util.py in bigrams(sequence, **kwargs)
    489     """
    490
--> 491     for item in ngrams(sequence, 2, **kwargs):
    492         yield item
    493
RuntimeError: generator raised StopIteration
You can filter out the empty lists from docToken
and then create bigrams:
docToken = [['the', 'wildlings', 'are', 'dead'], [], ['do', 'the', 'dead', 'frighten', 'you', 'ser', 'waymar']]
docToken = [x for x in docToken if x]
bigram = []
for i in range(len(docToken)):
    bigram.append(["_".join(w) for w in bigrams(docToken[i])])
bigram
Output:
[['the_wildlings', 'wildlings_are', 'are_dead'],
['do_the',
'the_dead',
'dead_frighten',
'frighten_you',
'you_ser',
'ser_waymar']]
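To check whether empty entries really are the cause in your own data, you can list their positions before building any bigrams. This is a small sketch, assuming docToken is a list of token lists; empty_idx is just an illustrative name.

```python
# Small stand-in for the real docToken, with one empty document.
docToken = [['the', 'wildlings', 'are', 'dead'], [], ['do', 'the', 'dead']]

# Indexes of documents with no tokens; these are the ones that trigger the error.
empty_idx = [i for i, tokens in enumerate(docToken) if not tokens]
print(empty_idx)
```

If the list comes back empty, the data is not the problem and the nltk/Python version mismatch is the more likely cause.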
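Another possible reason is that you're using nltk 3.3 on Python 3.7.
Since Python 3.7, a StopIteration that escapes a generator body is converted into a RuntimeError (PEP 479), which is exactly the error shown above.
Please use nltk 3.4; it's the first version with Python 3.7 support, so your issue should be resolved in that version.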
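The behaviour can be reproduced without nltk. The sketch below is not nltk's code; pairs is a hypothetical stand-in that copies the pattern visible in the traceback (a bare next() inside a generator) and assumes Python 3.7 or later, where PEP 479 is always active.

```python
def pairs(sequence):
    """Yield adjacent pairs using a bare next(), like the failing ngrams code."""
    it = iter(sequence)
    prev = next(it)  # raises StopIteration when the sequence is empty
    for item in it:
        yield (prev, item)
        prev = item


# A non-empty input works as expected.
print(list(pairs(['a', 'b', 'c'])))

# An empty input makes next() raise StopIteration inside the generator,
# which Python 3.7+ converts into RuntimeError.
message = None
try:
    list(pairs([]))
except RuntimeError as err:
    message = str(err)
print(message)
```

Guarding the next() call with try/except StopIteration and returning early avoids the RuntimeError for empty input.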
First, uninstall the current version of NLTK:
pip uninstall nltk==3.2.5
Then install a newer version of NLTK (3.6.2 here):
pip install nltk==3.6.2
Then check the NLTK version; it should be 3.6.2:
import nltk
print('The nltk version is {}.'.format(nltk.__version__))
This should fix the problem.
To add to @Leonard's answer, I solved it by simply uninstalling and reinstalling:
pip uninstall nltk
pip install nltk
Don't give version numbers; by default pip uninstalls the version you have and installs the latest one.
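After reinstalling, it can help to confirm which Python and nltk versions the interpreter actually sees, since the error depends on the combination of the two. This is a rough sketch, assuming Python 3.8 or later (for importlib.metadata); nltk_version is a hypothetical helper name.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def nltk_version():
    """Return the installed nltk version string, or None if nltk is missing."""
    try:
        return version("nltk")
    except PackageNotFoundError:
        return None


# Python 3.7+ combined with nltk 3.3 or older is the pairing reported above.
print("Python:", ".".join(map(str, sys.version_info[:3])))
print("nltk:", nltk_version())
```

Running this in the same environment as the notebook shows whether pip actually updated the nltk that the notebook kernel imports.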