API Reference
Modules:
interfaces – Core gensim interfaces
utils – Various utility functions
matutils – Math utils
_matutils – Compiled extension for math utils
downloader – Downloader API for gensim
corpora.bleicorpus – Corpus in Blei’s LDA-C format
corpora.csvcorpus – Corpus in CSV format
corpora.dictionary – Construct word<->id mappings
corpora.hashdictionary – Construct word<->id mappings
corpora.indexedcorpus – Random access to corpus documents
corpora.lowcorpus – Corpus in GibbsLda++ format
corpora.malletcorpus – Corpus in Mallet format
corpora.mmcorpus – Corpus in Matrix Market format
corpora._mmreader – Read corpus in the Matrix Market format
corpora.sharded_corpus – Corpus stored in separate files
corpora.svmlightcorpus – Corpus in SVMlight format
corpora.textcorpus – Tools for building corpora with dictionaries
corpora.ucicorpus – Corpus in UCI format
corpora.wikicorpus – Corpus from a Wikipedia dump
models.ldamodel – Latent Dirichlet Allocation
models.ldamulticore – Parallelized Latent Dirichlet Allocation
models.nmf – Non-Negative Matrix Factorization
models.lsimodel – Latent Semantic Indexing
models.ldaseqmodel – Dynamic Topic Modeling in Python
models.tfidfmodel – TF-IDF model
models.rpmodel – Random Projections
models.hdpmodel – Hierarchical Dirichlet Process
models.logentropy_model – LogEntropy model
models.normmodel – Normalization model
models.translation_matrix – Translation Matrix model
models.lsi_dispatcher – Dispatcher for distributed LSI
models.lsi_worker – Worker for distributed LSI
models.lda_dispatcher – Dispatcher for distributed LDA
models.lda_worker – Worker for distributed LDA
models.atmodel – Author-topic models
models.word2vec – Word2vec embeddings
models.keyedvectors – Store and query word vectors
models.doc2vec – Doc2vec paragraph embeddings
models.fasttext – FastText model
models._fasttext_bin – Facebook’s fastText I/O
models.phrases – Phrase (collocation) detection
models.poincare – Train and use Poincare embeddings
viz.poincare – Visualize Poincare embeddings
models.coherencemodel – Topic coherence pipeline
models.basemodel – Core TM interface
models.callbacks – Callbacks for tracking and visualizing the LDA training process
models.word2vec_inner – Cython routines for training Word2Vec models
models.doc2vec_inner – Cython routines for training Doc2Vec models
models.fasttext_inner – Cython routines for training FastText models
models.wrappers.ldamallet – Latent Dirichlet Allocation via Mallet
models.wrappers.dtmmodel – Dynamic Topic Models (DTM) and Dynamic Influence Models (DIM)
models.wrappers.ldavowpalwabbit – Latent Dirichlet Allocation via Vowpal Wabbit
models.wrappers.wordrank – Word Embeddings from WordRank
models.wrappers.varembed – VarEmbed Word Embeddings
similarities.docsim – Document similarity queries
similarities.termsim – Term similarity queries
similarities.annoy – Approximate Vector Search using Annoy
similarities.nmslib – Approximate Vector Search using NMSLIB
sklearn_api.atmodel – Scikit learn wrapper for Author-topic model
sklearn_api.d2vmodel – Scikit learn wrapper for paragraph2vec model
sklearn_api.hdp – Scikit learn wrapper for Hierarchical Dirichlet Process model
sklearn_api.ldamodel – Scikit learn wrapper for Latent Dirichlet Allocation
sklearn_api.ldaseqmodel – Scikit learn wrapper for LdaSeq model
sklearn_api.lsimodel – Scikit learn wrapper for Latent Semantic Indexing
sklearn_api.phrases – Scikit learn wrapper for phrase (collocation) detection
sklearn_api.rpmodel – Scikit learn wrapper for Random Projection model
sklearn_api.text2bow – Scikit learn wrapper for word<->id mapping
sklearn_api.tfidf – Scikit learn wrapper for TF-IDF model
sklearn_api.w2vmodel – Scikit learn wrapper for word2vec model
test.utils – Internal testing functions
topic_coherence.aggregation – Aggregation module
topic_coherence.direct_confirmation_measure – Direct confirmation measure module
topic_coherence.indirect_confirmation_measure – Indirect confirmation measure module
topic_coherence.probability_estimation – Probability estimation module
topic_coherence.segmentation – Segmentation module
topic_coherence.text_analysis – Analyzing the texts of a corpus to accumulate statistical information about word occurrences
scripts.package_info – Information about gensim package
scripts.glove2word2vec – Convert GloVe format to word2vec
scripts.make_wikicorpus – Convert articles from a Wikipedia dump to vectors
scripts.word2vec_standalone – Train word2vec on text file CORPUS
scripts.make_wiki_online – Convert articles from a Wikipedia dump
scripts.make_wiki_online_lemma – Convert articles from a Wikipedia dump
scripts.make_wiki_online_nodebug – Convert articles from a Wikipedia dump
scripts.word2vec2tensor – Convert the word2vec format to TensorFlow 2D tensor
scripts.segment_wiki – Convert Wikipedia dump to JSON-lines format
parsing.porter – Porter Stemming Algorithm
parsing.preprocessing – Functions to preprocess raw text
summarization.bm25 – BM25 ranking function
summarization.commons – Graph functions used in TextRank summarization
summarization.graph – Graph used in TextRank summarization
summarization.keywords – Keywords for TextRank summarization algorithm
summarization.mz_entropy – Keywords for the Montemurro and Zanette entropy algorithm
summarization.pagerank_weighted – Weighted PageRank algorithm
summarization.summarizer – TextRank Summarizer
summarization.syntactic_unit – Syntactic Unit class
summarization.textcleaner – Preprocessing for TextRank summarization