apriori

apriori(df, min_support=0.5, use_colnames=False, max_len=None, verbose=0, low_memory=False)

Get frequent itemsets from a one-hot DataFrame

Parameters

  • df : pandas DataFrame

    pandas DataFrame the encoded format. Also supports DataFrames with sparse data; for more info, please see (https://pandas.pydata.org/pandas-docs/stable/ user_guide/sparse.html#sparse-data-structures)

    Please note that the old pandas SparseDataFrame format is no longer supported in mlxtend >= 0.17.2.

    The allowed values are either 0/1 or True/False. For example,

    Apple Bananas Beer Chicken Milk Rice 0 True False True True False True 1 True False True False False True 2 True False True False False False 3 True True False False False False 4 False False True True True True 5 False False True False True True 6 False False True False True False 7 True True False False False False

  • min_support : float (default: 0.5)

    A float between 0 and 1 for minumum support of the itemsets returned. The support is computed as the fraction transactions_where_item(s)_occur / total_transactions.

  • use_colnames : bool (default: False)

    If True, uses the DataFrames' column names in the returned DataFrame instead of column indices.

  • max_len : int (default: None)

    Maximum length of the itemsets generated. If None (default) all possible itemsets lengths (under the apriori condition) are evaluated.

  • verbose : int (default: 0)

    Shows the number of iterations if >= 1 and low_memory is True. If

    =1 and low_memory is False, shows the number of combinations.

  • low_memory : bool (default: False)

    If True, uses an iterator to search for combinations above min_support. Note that while low_memory=True should only be used for large dataset if memory resources are limited, because this implementation is approx. 3-6x slower than the default.

Returns

pandas DataFrame with columns ['support', 'itemsets'] of all itemsets that are >= min_support and < than max_len (if max_len is not None). Each itemset in the 'itemsets' column is of type frozenset, which is a Python built-in type that behaves similarly to sets except that it is immutable (For more info, see https://docs.python.org/3.6/library/stdtypes.html#frozenset).

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/frequent_patterns/apriori/