sklearn_api.rpmodel – Scikit learn wrapper for Random Projection model
Scikit-learn interface for RpModel.
Follows scikit-learn API conventions to facilitate using gensim along with scikit-learn.
Examples
>>> from gensim.sklearn_api.rpmodel import RpTransformer
>>> from gensim.test.utils import common_dictionary, common_corpus
>>>
>>> # Initialize and fit the model.
>>> model = RpTransformer(id2word=common_dictionary).fit(common_corpus)
>>>
>>> # Use the trained model to transform a document.
>>> result = model.transform(common_corpus[3])
-
class gensim.sklearn_api.rpmodel.RpTransformer(id2word=None, num_topics=300)
Bases: sklearn.base.TransformerMixin, sklearn.base.BaseEstimator
Base Random Projection module, wraps RpModel.
For more information, see Random projection.
- Parameters
id2word (Dictionary, optional) – Mapping token_id -> token; determined from the corpus if id2word is None.
num_topics (int, optional) – Number of dimensions.
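To make the role of num_topics concrete, here is a minimal pure-Python sketch of what a random projection does to one BOW document. It assumes a projection matrix of random +1/-1 entries scaled by 1/sqrt(num_topics); the function name random_projection, the seed argument, and the sample document are hypothetical, and this is an illustration of the general technique rather than gensim's actual implementation.

```python
import math
import random


def random_projection(bow, num_terms, num_topics, seed=0):
    """Project one BOW document onto num_topics dimensions (illustrative only)."""
    rng = random.Random(seed)
    # One row of random +1/-1 signs per output dimension (an assumed scheme).
    projection = [
        [rng.choice((-1.0, 1.0)) for _ in range(num_terms)]
        for _ in range(num_topics)
    ]
    scale = 1.0 / math.sqrt(num_topics)
    # Only the terms present in the document contribute to each dot product.
    return [
        scale * sum(row[term_id] * weight for term_id, weight in bow)
        for row in projection
    ]


# Hypothetical document: (token_id, count) pairs over a 12-term vocabulary.
doc = [(0, 1), (3, 2), (7, 1)]
vector = random_projection(doc, num_terms=12, num_topics=4)
print(len(vector))  # one value per projected dimension
```

The output has num_topics entries regardless of vocabulary size, which is the property the num_topics parameter controls.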
-
fit(X, y=None)
Fit the model according to the given training data.
- Parameters
X (iterable of list of (int, number)) – Input corpus in BOW format.
- Returns
The trained model.
- Return type
RpTransformer
-
fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
X ({array-like, sparse matrix, dataframe} of shape (n_samples, n_features)) – Input samples.
y (ndarray of shape (n_samples,), default=None) – Target values.
**fit_params (dict) – Additional fit parameters.
- Returns
X_new – Transformed array.
- Return type
ndarray of shape (n_samples, n_features_new)
-
get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
mapping of string to any
-
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so that it is possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
object
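The <component>__<parameter> naming convention can be illustrated with a small pure-Python sketch. The helper split_nested_param and the step name "rp" are hypothetical and only show how such a name separates into its parts; they are not part of gensim or scikit-learn.

```python
def split_nested_param(name):
    """Split 'component__parameter' into (component, parameter)."""
    # Split on the first double underscore only; any remainder would be
    # passed down to the named component unchanged.
    component, sep, parameter = name.partition("__")
    if not sep:
        return None, name  # plain parameter on the estimator itself
    return component, parameter


# "rp" is an assumed name for a pipeline step holding an RpTransformer.
print(split_nested_param("rp__num_topics"))
print(split_nested_param("num_topics"))
```

Under this reading, a plain name such as num_topics targets the estimator itself, while a name such as rp__num_topics targets the num_topics parameter of the component called rp.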
-
transform(docs)
Find the Random Projection factors for docs.
- Parameters
docs ({iterable of iterable of (int, int), list of (int, number)}) – Document or documents to be transformed in BOW format.
- Returns
RP representation for each input document.
- Return type
numpy.ndarray of shape [len(docs), num_topics]
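Because docs may be either a single document or several, the first dimension of the result depends on how the input is interpreted. The sketch below shows one plausible way to normalize the two input forms so that len(docs) is well defined; the helper as_doc_list and the sample data are hypothetical and are not taken from gensim's source.

```python
def as_doc_list(docs):
    """Return a list of documents whether given one document or several."""
    docs = list(docs)
    # A single document is a flat list of (token_id, weight) tuples.
    if docs and isinstance(docs[0], tuple):
        return [docs]
    return docs


single = [(0, 1), (2, 3)]             # one document in BOW format
batch = [[(0, 1)], [(1, 2), (2, 1)]]  # two documents in BOW format
num_topics = 300                      # the default in the class signature

# Expected result shapes under this interpretation.
print((len(as_doc_list(single)), num_topics))
print((len(as_doc_list(batch)), num_topics))
```

Under this assumption, a single document yields one row and a batch yields one row per document, each with num_topics columns.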