gensim

gensim word2vec: Find number of words in vocabulary

Question: After training a word2vec model using Python gensim, how do you find the number of words in the model’s vocabulary? Asked by: hlin117. Answers: In recent versions, the model.wv property holds the words and vectors, and can itself report a length – the number of …

Total answers: 4

How to check if a key exists in a word2vec trained model or not

Question: I have trained a word2vec model on a corpus of documents with Gensim. Once the model is trained, I use the following code to get the raw feature vector of a word, say “view”: myModel["view"]. However, I …

Total answers: 8

Convert word2vec bin file to text

Question: From the word2vec site I can download GoogleNews-vectors-negative300.bin.gz. The .bin file (about 3.4 GB) is in a binary format that is not useful to me. Tomas Mikolov assures us that “It should be fairly straightforward to convert the binary format to text format (though that will take more disk space). Check the …

Total answers: 10
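
As a rough illustration of the conversion the question asks about, here is a minimal pure-Python sketch. It assumes the layout written by the original word2vec C tool: an ASCII header line giving the vocabulary size and vector dimension, then for each word its bytes terminated by a space, followed by that many float32 values (assumed little-endian) and usually a trailing newline. The function name convert_bin_to_text and the stream-based interface are illustrative choices, not part of any library.

```python
import struct


def convert_bin_to_text(src, dst):
    """Convert a word2vec binary stream to the plain-text format.

    src: binary file object opened for reading ("rb")
    dst: text file object opened for writing
    Returns the number of vectors written.
    """
    # The header is an ASCII line: "<vocab_size> <dim>\n".
    vocab_size, dim = (int(x) for x in src.readline().split())
    dst.write(f"{vocab_size} {dim}\n")

    fmt = f"<{dim}f"  # assumption: little-endian float32 values
    nbytes = struct.calcsize(fmt)

    for _ in range(vocab_size):
        # Read the word byte by byte up to the separating space.
        chars = bytearray()
        while True:
            ch = src.read(1)
            if not ch:
                raise ValueError("unexpected end of data while reading a word")
            if ch == b" ":
                break
            if ch != b"\n":  # skip the newline left after the previous vector
                chars.extend(ch)
        word = chars.decode("utf-8", errors="replace")

        raw = src.read(nbytes)
        if len(raw) != nbytes:
            raise ValueError("unexpected end of data while reading a vector")
        vector = struct.unpack(fmt, raw)

        dst.write(word + " " + " ".join(f"{v:.6f}" for v in vector) + "\n")

    return vocab_size


# Usage with real files (the paths are placeholders):
# with open("vectors.bin", "rb") as src, open("vectors.txt", "w", encoding="utf-8") as dst:
#     convert_bin_to_text(src, dst)
```

If gensim is installed, loading the file with KeyedVectors.load_word2vec_format(..., binary=True) and writing it back with save_word2vec_format(..., binary=False) should achieve the same result with less code.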

Python Gensim: how to calculate document similarity using the LDA model?

Question: I’ve got a trained LDA model and I want to calculate the similarity score between two documents from the corpus I trained my model on. After studying all the Gensim tutorials and functions, I still can’t get my head around it. Can somebody …

Total answers: 3

How to calculate the sentence similarity using word2vec model of gensim with python

Question: According to the Gensim Word2Vec docs, I can use the word2vec model in the gensim package to calculate the similarity between two words, e.g. trained_model.similarity('woman', 'man') gives 0.73723527. However, the word2vec model cannot predict sentence similarity. I found out the LSI model …

Total answers: 14
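
One common baseline for the sentence-level case (not necessarily what the answers to this question recommend) is to average the word vectors of each sentence and compare the averages with cosine similarity. The sketch below uses a small dictionary of made-up vectors in place of a trained model; TOY_VECTORS and all function names are hypothetical.

```python
import math

# Hypothetical 3-dimensional word vectors, invented purely for illustration.
TOY_VECTORS = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.2, 0.0],
    "purr": [0.7, 0.0, 0.3],
    "bark": [0.6, 0.3, 0.1],
    "stocks": [0.0, 0.1, 0.9],
    "fell": [0.1, 0.0, 0.8],
}


def sentence_vector(tokens, vectors):
    """Average the vectors of the tokens that are in the vocabulary."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no token of the sentence is in the vocabulary")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sentence_similarity(s1, s2, vectors):
    """Compare two whitespace-tokenized sentences via their mean vectors."""
    v1 = sentence_vector(s1.lower().split(), vectors)
    v2 = sentence_vector(s2.lower().split(), vectors)
    return cosine(v1, v2)


related = sentence_similarity("cats purr", "dogs bark", TOY_VECTORS)
unrelated = sentence_similarity("cats purr", "stocks fell", TOY_VECTORS)
print(related > unrelated)
```

Averaging ignores word order, so this is only a starting point. With a trained gensim model, the lookups would presumably go through model.wv rather than a toy dictionary.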

How to load sentences into Python gensim?

Question: I am trying to use the word2vec module from the gensim natural language processing library in Python. The docs say to initialize the model with: from gensim.models import Word2Vec; model = Word2Vec(sentences, size=100, window=5, min_count=5, workers=4). What format does gensim expect for the input sentences? I have raw text …

Total answers: 2
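
For context on the input format: gensim's Word2Vec takes an iterable of sentences, where each sentence is a list of string tokens. The sketch below shows one naive way to turn raw text into that shape using only the standard library. The helper name to_sentences and the regex-based splitting are assumptions made for illustration; a real pipeline would likely use a proper tokenizer.

```python
import re


def to_sentences(raw_text):
    """Split raw text into a list of token lists, one list per sentence."""
    sentences = []
    # Naive sentence split: break on whitespace that follows ., ! or ?
    for chunk in re.split(r"(?<=[.!?])\s+", raw_text.strip()):
        # Lowercase and keep runs of letters, digits and apostrophes as tokens.
        tokens = re.findall(r"[a-z0-9']+", chunk.lower())
        if tokens:
            sentences.append(tokens)
    return sentences


sentences = to_sentences("The cat sat on the mat. The dog barked!")
print(sentences)
```

A list like this could then be passed as the sentences argument of Word2Vec. Note that gensim 4 renamed the size parameter shown in the question to vector_size.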

Understanding LDA implementation using gensim

Question: I am trying to understand how the gensim package in Python implements Latent Dirichlet Allocation. I am doing the following. Define the dataset: documents = ["Apple is releasing a new product", "Amazon sells many things", "Microsoft announces Nokia acquisition"]. After removing stopwords, I create the dictionary and the corpus: texts …

Total answers: 5