TransactionEncoder

TransactionEncoder()

Encoder class for transaction data stored as Python lists of lists.

Parameters

None

Attributes

  • columns_ : list

    List of unique item names in the input list of lists X.

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/preprocessing/TransactionEncoder/

Methods


fit(X)

Learn unique column names from the transaction data.

Parameters

  • X : list of lists

    A Python list of lists, where the outer list stores the n transactions and the inner list stores the items in each transaction.

    For example, [['Apple', 'Beer', 'Rice', 'Chicken'], ['Apple', 'Beer', 'Rice'], ['Apple', 'Beer'], ['Apple', 'Bananas'], ['Milk', 'Beer', 'Rice', 'Chicken'], ['Milk', 'Beer', 'Rice'], ['Milk', 'Beer'], ['Apple', 'Bananas']]
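The column-learning step can be sketched in plain Python. This is a minimal illustration, not mlxtend's implementation: the helper name `learn_columns` is hypothetical, and the sketch assumes the columns are simply the sorted unique items, consistent with the alphabetical ordering described for `transform`.

```python
def learn_columns(transactions):
    """Hypothetical helper: collect the unique items across all transactions, sorted."""
    unique_items = set()
    for transaction in transactions:
        unique_items.update(transaction)
    return sorted(unique_items)


transactions = [
    ['Apple', 'Beer', 'Rice', 'Chicken'],
    ['Apple', 'Beer', 'Rice'],
    ['Apple', 'Beer'],
    ['Apple', 'Bananas'],
    ['Milk', 'Beer', 'Rice', 'Chicken'],
    ['Milk', 'Beer', 'Rice'],
    ['Milk', 'Beer'],
    ['Apple', 'Bananas'],
]

columns = learn_columns(transactions)
print(columns)
# ['Apple', 'Bananas', 'Beer', 'Chicken', 'Milk', 'Rice']
```

The resulting list plays the role of the `columns_` attribute in this sketch.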


fit_transform(X, sparse=False)

Fit a TransactionEncoder and transform a dataset.


get_feature_names_out()

Used to get the column names of pandas output.

This method combined with the `TransformerMixin` exposes the
set_output API to the `TransactionEncoder`. This allows the user
to set the transformed output to a `pandas.DataFrame` by default.

See https://scikit-learn.org/stable/developers/develop.html#developer-api-set-output
for more details.

get_params(deep=True)

Get parameters for this estimator.

Parameters

  • deep : bool, default=True

    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

  • params : dict

    Parameter names mapped to their values.


inverse_transform(array)

Transforms an encoded NumPy array back into transactions.

Parameters

  • array : NumPy array [n_transactions, n_unique_items]

    The NumPy one-hot encoded boolean array of the input transactions, where the columns represent the unique items found in the input array in alphabetical order.

    For example,

    array([[True, False, True, True, False, True],
           [True, False, True, False, False, True],
           [True, False, True, False, False, False],
           [True, True, False, False, False, False],
           [False, False, True, True, True, True],
           [False, False, True, False, True, True],
           [False, False, True, False, True, False],
           [True, True, False, False, False, False]])

    The corresponding column labels are available as self.columns_, e.g., ['Apple', 'Bananas', 'Beer', 'Chicken', 'Milk', 'Rice']

Returns

  • X : list of lists

    A Python list of lists, where the outer list stores the n transactions and the inner list stores the items in each transaction.

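    For example, [['Apple', 'Beer', 'Chicken', 'Rice'], ['Apple', 'Beer', 'Rice'], ['Apple', 'Beer'], ['Apple', 'Bananas'], ['Beer', 'Chicken', 'Milk', 'Rice'], ['Beer', 'Milk', 'Rice'], ['Beer', 'Milk'], ['Apple', 'Bananas']] (items within each transaction follow the column order, which may differ from the order in the original input)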
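Decoding can be sketched as reading off, for each row, the labels of the columns that are set. This is an illustrative plain-Python version, not the library's code; `decode_rows` is a hypothetical name, and plain lists of booleans stand in for the NumPy array.

```python
def decode_rows(encoded, columns):
    """Hypothetical helper: map each boolean row back to the item labels that are set."""
    return [
        [columns[idx] for idx, flag in enumerate(row) if flag]
        for row in encoded
    ]


columns = ['Apple', 'Bananas', 'Beer', 'Chicken', 'Milk', 'Rice']
encoded = [
    [True, False, True, True, False, True],
    [False, False, True, False, True, False],
]

print(decode_rows(encoded, columns))
# [['Apple', 'Beer', 'Chicken', 'Rice'], ['Beer', 'Milk']]
```

Because this sketch walks the columns from left to right, the items in each decoded transaction come back in column order rather than in the order they were originally listed.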


set_output(*, transform=None)

Set output container.

See the "Introducing the set_output API" example in the scikit-learn
documentation for how to use the API.

Parameters

  • transform : {"default", "pandas"}, default=None

    Configure output of transform and fit_transform.

    • "default": Default output format of a transformer
    • "pandas": DataFrame output
    • None: Transform configuration is unchanged

Returns

  • self : estimator instance

    Estimator instance.


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects
(such as `sklearn.pipeline.Pipeline`). The latter have
parameters of the form `<component>__<parameter>` so that it's
possible to update each component of a nested object.

Parameters

  • **params : dict

    Estimator parameters.

Returns

  • self : estimator instance

    Estimator instance.
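The `<component>__<parameter>` naming convention can be illustrated with a small string-splitting sketch. The helper `split_param_name` and the example names (`encoder`, `pipe`) are hypothetical and only show how such a name decomposes; this is not scikit-learn's or mlxtend's code.

```python
def split_param_name(name):
    """Hypothetical helper: split 'component__parameter' at the first double underscore.

    Returns (component, parameter); component is None for a plain parameter name.
    """
    component, delimiter, parameter = name.partition('__')
    if not delimiter:
        # No double underscore: the name refers to a parameter of the estimator itself.
        return None, component
    return component, parameter


print(split_param_name('encoder__sparse'))        # ('encoder', 'sparse')
print(split_param_name('deep'))                   # (None, 'deep')
print(split_param_name('pipe__encoder__sparse'))  # ('pipe', 'encoder__sparse')
```

Splitting only at the first delimiter leaves the remainder intact, so deeper nesting can be resolved one level at a time.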


transform(X, sparse=False)

Transform transactions into a one-hot encoded NumPy array.

Parameters

  • X : list of lists

    A Python list of lists, where the outer list stores the n transactions and the inner list stores the items in each transaction.

    For example, [['Apple', 'Beer', 'Rice', 'Chicken'], ['Apple', 'Beer', 'Rice'], ['Apple', 'Beer'], ['Apple', 'Bananas'], ['Milk', 'Beer', 'Rice', 'Chicken'], ['Milk', 'Beer', 'Rice'], ['Milk', 'Beer'], ['Apple', 'Bananas']]

  • sparse : bool, default=False

    If True, transform returns a Compressed Sparse Row matrix instead of a regular (dense) NumPy array.

Returns

  • array : NumPy array [n_transactions, n_unique_items] if sparse=False (default), Compressed Sparse Row matrix otherwise

    The one-hot encoded boolean array of the input transactions, where the columns represent the unique items found in the input array in alphabetical order. The exact representation depends on the sparse argument.

    For example,

    array([[True, False, True, True, False, True],
           [True, False, True, False, False, True],
           [True, False, True, False, False, False],
           [True, True, False, False, False, False],
           [False, False, True, True, True, True],
           [False, False, True, False, True, True],
           [False, False, True, False, True, False],
           [True, True, False, False, False, False]])

    The corresponding column labels are available as self.columns_, e.g., ['Apple', 'Bananas', 'Beer', 'Chicken', 'Milk', 'Rice']
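The dense one-hot encoding described above can be sketched in plain Python. This is a minimal illustration rather than mlxtend's implementation: `one_hot_encode` is a hypothetical name, lists of booleans stand in for the NumPy array, and skipping items that have no column is an assumption of this sketch.

```python
def one_hot_encode(transactions, columns):
    """Hypothetical helper: one boolean row per transaction, one column per known item."""
    column_index = {item: idx for idx, item in enumerate(columns)}
    encoded = []
    for transaction in transactions:
        row = [False] * len(columns)
        for item in transaction:
            idx = column_index.get(item)
            if idx is not None:  # assumption of this sketch: skip items without a column
                row[idx] = True
        encoded.append(row)
    return encoded


columns = ['Apple', 'Bananas', 'Beer', 'Chicken', 'Milk', 'Rice']
transactions = [
    ['Apple', 'Beer', 'Rice', 'Chicken'],
    ['Milk', 'Beer'],
]

encoded = one_hot_encode(transactions, columns)
print(encoded[0])  # [True, False, True, True, False, True]
print(encoded[1])  # [False, False, True, False, True, False]
```

Each row has exactly one entry per column label, so row and column positions line up with the labels in `columns`.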