API Reference
Modules:
interfaces – Core gensim interfaces
utils – Various utility functions
matutils – Math utils
_matutils – Compiled extension for math utils
downloader – Downloader API for gensim
corpora.bleicorpus – Corpus in Blei’s LDA-C format
corpora.csvcorpus – Corpus in CSV format
corpora.dictionary – Construct word<->id mappings
corpora.hashdictionary – Construct word<->id mappings
corpora.indexedcorpus – Random access to corpus documents
corpora.lowcorpus – Corpus in GibbsLda++ format
corpora.malletcorpus – Corpus in Mallet format
corpora.mmcorpus – Corpus in Matrix Market format
corpora._mmreader – Read corpus in the Matrix Market format
corpora.sharded_corpus – Corpus stored in separate files
corpora.svmlightcorpus – Corpus in SVMlight format
corpora.textcorpus – Tools for building corpora with dictionaries
corpora.ucicorpus – Corpus in UCI format
corpora.wikicorpus – Corpus from a Wikipedia dump
models.ldamodel – Latent Dirichlet Allocation
models.ldamulticore – parallelized Latent Dirichlet Allocation
models.nmf – Non-Negative Matrix factorization
models.lsimodel – Latent Semantic Indexing
models.ldaseqmodel – Dynamic Topic Modeling in Python
models.tfidfmodel – TF-IDF model
models.rpmodel – Random Projections
models.hdpmodel – Hierarchical Dirichlet Process
models.logentropy_model – LogEntropy model
models.normmodel – Normalization model
models.translation_matrix – Translation Matrix model
models.lsi_dispatcher – Dispatcher for distributed LSI
models.lsi_worker – Worker for distributed LSI
models.lda_dispatcher – Dispatcher for distributed LDA
models.lda_worker – Worker for distributed LDA
models.atmodel – Author-topic models
models.word2vec – Word2vec embeddings
models.keyedvectors – Store and query word vectors
models.doc2vec – Doc2vec paragraph embeddings
models.fasttext – FastText model
models._fasttext_bin – Facebook’s fastText I/O
models.phrases – Phrase (collocation) detection
models.poincare – Train and use Poincare embeddings
viz.poincare – Visualize Poincare embeddings
models.coherencemodel – Topic coherence pipeline
models.basemodel – Core TM interface
models.callbacks – Callbacks for track and viz LDA train process
models.word2vec_inner – Cython routines for training Word2Vec models
models.doc2vec_inner – Cython routines for training Doc2Vec models
models.fasttext_inner – Cython routines for training FastText models
models.wrappers.ldamallet – Latent Dirichlet Allocation via Mallet
models.wrappers.dtmmodel – Dynamic Topic Models (DTM) and Dynamic Influence Models (DIM)
models.wrappers.ldavowpalwabbit – Latent Dirichlet Allocation via Vowpal Wabbit
models.wrappers.wordrank – Word Embeddings from WordRank
models.wrappers.varembed – VarEmbed Word Embeddings
similarities.docsim – Document similarity queries
similarities.termsim – Term similarity queries
similarities.annoy – Approximate Vector Search using Annoy
similarities.nmslib – Approximate Vector Search using NMSLIB
sklearn_api.atmodel – Scikit learn wrapper for Author-topic model
sklearn_api.d2vmodel – Scikit learn wrapper for paragraph2vec model
sklearn_api.hdp – Scikit learn wrapper for Hierarchical Dirichlet Process model
sklearn_api.ldamodel – Scikit learn wrapper for Latent Dirichlet Allocation
sklearn_api.ldaseqmodel – Scikit learn wrapper for LdaSeq model
sklearn_api.lsimodel – Scikit learn wrapper for Latent Semantic Indexing
sklearn_api.phrases – Scikit learn wrapper for phrase (collocation) detection
sklearn_api.rpmodel – Scikit learn wrapper for Random Projection model
sklearn_api.text2bow – Scikit learn wrapper word<->id mapping
sklearn_api.tfidf – Scikit learn wrapper for TF-IDF model
sklearn_api.w2vmodel – Scikit learn wrapper for word2vec model
test.utils – Internal testing functions
topic_coherence.aggregation – Aggregation module
topic_coherence.direct_confirmation_measure – Direct confirmation measure module
topic_coherence.indirect_confirmation_measure – Indirect confirmation measure module
topic_coherence.probability_estimation – Probability estimation module
topic_coherence.segmentation – Segmentation module
topic_coherence.text_analysis – Analyzing the texts of a corpus to accumulate statistical information about word occurrences
scripts.package_info – Information about gensim package
scripts.glove2word2vec – Convert glove format to word2vec
scripts.make_wikicorpus – Convert articles from a Wikipedia dump to vectors
scripts.word2vec_standalone – Train word2vec on text file CORPUS
scripts.make_wiki_online – Convert articles from a Wikipedia dump
scripts.make_wiki_online_lemma – Convert articles from a Wikipedia dump
scripts.make_wiki_online_nodebug – Convert articles from a Wikipedia dump
scripts.word2vec2tensor – Convert the word2vec format to Tensorflow 2D tensor
scripts.segment_wiki – Convert wikipedia dump to json-line format
parsing.porter – Porter Stemming Algorithm
parsing.preprocessing – Functions to preprocess raw text
summarization.bm25 – BM25 ranking function
summarization.commons – Graph functions used in TextRank summarization
summarization.graph – Graph used in TextRank summarization
summarization.keywords – Keywords for TextRank summarization algorithm
summarization.mz_entropy – Keywords for the Montemurro and Zanette entropy algorithm
summarization.pagerank_weighted – Weighted PageRank algorithm
summarization.summarizer – TextRank Summarizer
summarization.syntactic_unit – Syntactic Unit class
summarization.textcleaner – Preprocessing for TextRank summarization
