LDA Mallet Gensim CalledProcessError

Question:

Seems like many people are having issues with Mallet.

import os
from gensim.models.wrappers import LdaMallet

os.environ.update({'MALLET_HOME':r'C:/Users/myusername/Desktop/Topic_Modelling/mallet-2.0.8'})

mallet_path = r'C:/Users/myusername/Desktop/Topic_Modelling/mallet-2.0.8/bin/mallet' 

model = LdaMallet(mallet_path, corpus=corpus, num_topics=num_topics, id2word=id2word)

Getting the following errors:

/bin/sh: C:/Users/myusername/Desktop/Topic_Modelling/mallet-2.0.8/bin/mallet.bat: No such file or directory

CalledProcessError: Command 'C:/Users/myusername/Desktop/Topic_Modelling/mallet-2.0.8/bin/mallet.bat import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /var/folders/ml/lxzrtxwn02vbvq65c80g1b640000gn/T/c52cdc_corpus.txt --output /var/folders/ml/lxzrtxwn02vbvq65c80g1b640000gn/T/c52cdc_corpus.mallet' returned non-zero exit status 127.

I downloaded mallet from http://mallet.cs.umass.edu/dist/mallet-2.0.8.zip and unzipped it in my directory. I’ve tried running the command in the error in the terminal and I’m getting the same ‘no such file found’ error, but it’s there in my directory?

I’ve also followed this: https://ps.au.dk/fileadmin/ingen_mappe_valgt/installing_mallet.pdf

When I go to the directory via the command line and type ./bin/mallet, I get a whole list of commands, which, according to the instructions, is the sign that it has been installed correctly.

I’m running the following on macOS:

  • Python==3.9.6
  • gensim==3.8.3

Anyone have any ideas?

Asked By: user47467


Answers:

As silly as this sounds, I resolved this by changing the path to:

os.environ.update({'MALLET_HOME':r'mallet-2.0.8'})

mallet_path = r'mallet-2.0.8/bin/mallet' 

So if you have the mallet directory in the same one as where your code is, this will work!
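A likely reason this works: the original path was a Windows-style C:/Users/... path, which does not exist on macOS, and exit status 127 from /bin/sh means the command could not be found. A minimal sketch of a check that catches this before the wrapper is called is below. The helper name configure_mallet is hypothetical (not part of gensim), and it assumes the launcher sits at bin/mallet inside the Mallet directory, as in the 2.0.8 zip.

```python
import os
from pathlib import Path


def configure_mallet(mallet_home):
    """Set MALLET_HOME and return the launcher path (hypothetical helper).

    Raises FileNotFoundError early if the launcher is not where we expect,
    instead of letting the subprocess call fail later.
    """
    # Resolve relative paths (e.g. "mallet-2.0.8") against the current directory.
    home = Path(mallet_home).expanduser().resolve()
    launcher = home / "bin" / "mallet"
    if not launcher.is_file():
        raise FileNotFoundError(f"Mallet launcher not found at {launcher}")
    os.environ["MALLET_HOME"] = str(home)
    return str(launcher)


# Usage, assuming mallet-2.0.8 was unzipped next to the script:
# mallet_path = configure_mallet("mallet-2.0.8")
```

The returned path can then be passed to LdaMallet as mallet_path.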

Answered By: user47467

This error can arise if the JDK is not installed on the system, since LDA Mallet needs Java to run. If you are using Colab, follow these steps:

1. Install gensim 3.8 (the wrapper classes are only supported in versions before 4.0):
!pip install --upgrade gensim==3.8

2. Install the JDK in Colab:

import os
def install_java():
    !apt-get install -y openjdk-8-jdk-headless -qq > /dev/null  # install OpenJDK
    os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"  # set environment variable
    !java -version  # check Java version
install_java()

3. Install Mallet:
!wget http://mallet.cs.umass.edu/dist/mallet-2.0.8.zip
!unzip mallet-2.0.8.zip

4. Set the path and run LDA Mallet:
os.environ['MALLET_HOME'] = '/content/mallet-2.0.8'
mallet_path = '/content/mallet-2.0.8/bin/mallet'  # you should NOT need to change this
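Outside Colab, a quick way to confirm that Java is actually reachable before training is to look it up on the PATH. This is only a sketch: the helper name missing_tools is made up for illustration, and finding java on the PATH does not guarantee that the installed version works with Mallet.

```python
import shutil


def missing_tools(tools=("java",)):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("java is available on PATH")
```

If java is reported as missing, install a JDK first and then retry the LdaMallet call.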

Hope this helps.

Answered By: N.U.Akash Reddy