AttributeError: 'float' object has no attribute 'lower'
Question:
I'm facing this attribute error and I'm stuck on how to handle float values if they appear in a tweet. The streaming tweet has to be lowercased and tokenized, so I have used the split function.
Can somebody please help me deal with it? Any workaround or solution?
Here's the error I'm getting:
AttributeError Traceback (most recent call last)
<ipython-input-28-fa278f6c3171> in <module>()
1 stop_words = []
----> 2 negfeats = [(word_feats(x for x in p_test.SentimentText[f].lower().split() if x not in stop_words), 'neg') for f in l]
3 posfeats = [(word_feats(x for x in p_test.SentimentText[f].lower().split() if x not in stop_words), 'pos') for f in p]
4
5 trainfeats = negfeats+ posfeats
AttributeError: 'float' object has no attribute 'lower'
Here is my code:
import random
import pandas as pd

p_test = pd.read_csv('TrainSA.csv')
stop_words = []

def word_feats(words):
    return dict([(word, True) for word in words])

# Row indices of the negative (Sentiment == 0) tweets
l = []
for f in range(len(p_test)):
    if p_test.Sentiment[f] == 0:
        l.append(f)

# Row indices of the positive (Sentiment == 1) tweets
p = []
for f in range(len(p_test)):
    if p_test.Sentiment[f] == 1:
        p.append(f)

# These two lines raise the AttributeError shown above
negfeats = [(word_feats(x for x in p_test.SentimentText[f].lower().split() if x not in stop_words), 'neg') for f in l]
posfeats = [(word_feats(x for x in p_test.SentimentText[f].lower().split() if x not in stop_words), 'pos') for f in p]
trainfeats = negfeats + posfeats
print(len(trainfeats))
random.shuffle(trainfeats)
print(len(trainfeats))

p_train = pd.read_csv('TrainSA.csv')
l_t = []
for f in range(len(p_train)):
    if p_train.Sentiment[f] == 0:
        l_t.append(f)
p_t = []
for f in range(len(p_train)):
    if p_train.Sentiment[f] == 1:
        p_t.append(f)
print(len(l_t))
print(len(p_t))
I tried many ways but am still not able to get lower and split to work on these values.
Answers:
I get the feeling that your problem has its root in the pd.read_csv('TrainSA.csv') call. Although you did not post this routine, I assume it is pandas' read_csv. This routine intelligently converts input to Python datatypes. However, this means that in your case some values could be translated to a float (for example, an empty cell is read as NaN, which is a float). You can prevent this intelligent (?) behaviour by specifying which datatypes you expect for each column.
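To make the cause concrete, here is a minimal sketch of how an empty cell ends up as a float. The two-row sample and the use of io.StringIO are assumptions standing in for TrainSA.csv, whose contents were not posted. keep_default_na=False is an existing read_csv option that stops empty cells from being parsed as NaN; whether that suits your data is a judgment call, since it also disables the other default NA markers.

```python
import io
import pandas as pd

# Hypothetical sample standing in for TrainSA.csv: the second tweet is empty.
csv_text = "Sentiment,SentimentText\n0,Bad day\n1,\n"

# Default parsing: the empty cell is read as a missing value (NaN, a float),
# which is what later fails on .lower().
df = pd.read_csv(io.StringIO(csv_text))
missing = df["SentimentText"].isna()

# Declaring the column as str and switching off the default NA handling
# keeps the empty cell as an empty string instead.
df_str = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"SentimentText": str},
    keep_default_na=False,
)
tokens = [text.lower().split() for text in df_str["SentimentText"]]
```

With the second read, every value in SentimentText is a string, so lower() and split() can be called on each row; the empty tweet simply yields an empty token list.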
Thank you @Dick Kniep. Yes, it is the pandas CSV reader. Your suggestion worked.
The following Python code worked for me by specifying the field datatype
(in this case, string):
p_test = pd.read_csv('TrainSA.csv')
p_test.SentimentText = p_test.SentimentText.astype(str)
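One caveat with this approach: depending on the pandas version, astype(str) may turn a missing value into the literal text "nan", which would then show up as a token. A small sketch of one way around that, assuming an empty string is an acceptable stand-in for a missing tweet (the sample Series is made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical column with one real tweet and one missing value.
s = pd.Series(["Good Morning", np.nan], dtype=object)

# Replace missing values with an empty string first, then cast to str,
# so no "nan" text can leak into the tokens.
filled = s.fillna("").astype(str)

# The .str accessor lowercases and splits every row in one pass.
tokens = filled.str.lower().str.split()
```

Rows that were missing end up with an empty token list, so they contribute no features rather than a spurious "nan" word.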
I got a similar error with my dataset. Setting the dtype
parameter didn't help me, so I had to prepare my dataset. The problem was a NaN
column value. Part of the dataset:
Id,Category,Text
1,contract,"Some text with commas, and other "
2,contract,
So my solution: before calling read_csv,
I added dummy text in place of the empty value:
Id,Category,Text
1,contract,"Some text with commas, and other "
2,contract,"NaN"
Now my app works fine.
If you are using a DataFrame, drop the rows that contain NA values using:
df = df.dropna()
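Note that a bare dropna() removes a row if any column is missing, and it leaves gaps in the index. A hedged sketch, reusing the Id/Category/Text layout from the answer above with made-up rows: subset limits the check to the text column, and reset_index(drop=True) renumbers the rows so that positional loops such as range(len(df)) keep working.

```python
import io
import pandas as pd

# Hypothetical sample: row 2 has no Text, row 3 has no Category.
csv_text = "Id,Category,Text\n1,contract,Some text\n2,contract,\n3,,Other text\n"
df = pd.read_csv(io.StringIO(csv_text))

# Drops every row with a missing value in ANY column (rows 2 and 3 here).
all_dropped = df.dropna()

# Drops only rows whose Text is missing, then renumbers the index 0..n-1.
text_only = df.dropna(subset=["Text"]).reset_index(drop=True)
```

Without the reset_index step, a lookup like df.Text[1] would raise a KeyError after the drop, because label 1 no longer exists.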
df = pd.read_excel("locationfile.xlsx")
df.characters = df.characters.astype(str)
I tried this and I got my answer.
You can make sure the DataFrame column contains no null or missing values.
Do the following step before performing any other operations:
df = df[df['ColumnName'].notna()]
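Applied to the question's code, that filter could look like the sketch below. The column names Sentiment and SentimentText come from the question; the three-row sample and the labels mapping are assumptions for illustration. Iterating with zip over the two columns avoids positional lookups, so the gaps that filtering leaves in the index do not matter.

```python
import io
import pandas as pd

def word_feats(words):
    # Same feature format as in the question: {word: True, ...}
    return {word: True for word in words}

# Hypothetical sample standing in for TrainSA.csv; the middle tweet is empty.
csv_text = "Sentiment,SentimentText\n0,Bad DAY\n1,\n1,Great day\n"
df = pd.read_csv(io.StringIO(csv_text))

# Keep only rows that actually have tweet text.
df = df[df["SentimentText"].notna()]

# Assumed mapping from the numeric Sentiment value to the label string.
labels = {0: "neg", 1: "pos"}
trainfeats = [
    (word_feats(text.lower().split()), labels[sentiment])
    for text, sentiment in zip(df["SentimentText"], df["Sentiment"])
]
```

This builds the negative and positive features in one pass; a stop-word filter can be added inside the call to word_feats in the same way as in the original list comprehensions.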