Mlxtend.plotting

mlxtend version: 0.24.0

category_scatter

category_scatter(x, y, label_col, data, markers='sxo^v', colors=('blue', 'green', 'red', 'purple', 'gray', 'cyan'), alpha=0.7, markersize=20.0, legend_loc='best')

Scatter plot to plot categories in different colors/markerstyles.

Parameters

x : str or int

DataFrame column name of the x-axis values or integer for the numpy ndarray column index.
y : str

DataFrame column name of the y-axis values or integer for the numpy ndarray column index
data : Pandas DataFrame object or NumPy ndarray.
markers : str

Markers that are cycled through the label category.
colors : tuple

Colors that are cycled through the label category.
alpha : float (default: 0.7)

Parameter to control the transparency.
markersize : float (default` : 20.0)

Parameter to control the marker size.
legend_loc : str (default: 'best')

Location of the plot legend {best, upper left, upper right, lower left, lower right} No legend if legend_loc=False

Returns

fig : matplotlig.pyplot figure object

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/plotting/category_scatter/

checkerboard_plot

checkerboard_plot(ary, cell_colors=('white', 'black'), font_colors=('black', 'white'), fmt='%.1f', figsize=None, row_labels=None, col_labels=None, fontsize=None)

Plot a checkerboard table / heatmap via matplotlib.

Parameters

ary : array-like, shape = [n, m]

A 2D Nnumpy array.
cell_colors : tuple or list (default: ('white', 'black'))

Tuple or list containing the two colors of the checkerboard pattern.
font_colors : tuple or list (default: ('black', 'white'))

Font colors corresponding to the cell colors.
figsize : tuple (default: (2.5, 2.5))

Height and width of the figure
fmt : str (default: '%.1f')

Python string formatter for cell values. The default '%.1f' results in floats with 1 digit after the decimal point. Use '%d' to show numbers as integers.
row_labels : list (default: None)

List of the row labels. Uses the array row indices 0 to n by default.
col_labels : list (default: None)

List of the column labels. Uses the array column indices 0 to m by default.
fontsize : int (default: None)

Specifies the font size of the checkerboard table. Uses matplotlib's default if None.

Returns

fig : matplotlib Figure object.

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/plotting/checkerboard_plot/

ecdf

ecdf(x, y_label='ECDF', x_label=None, ax=None, percentile=None, ecdf_color=None, ecdf_marker='o', percentile_color='black', percentile_linestyle='--')

Plots an Empirical Cumulative Distribution Function

Parameters

x : array or list, shape=[n_samples,]

Array-like object containing the feature values
y_label : str (default='ECDF')

Text label for the y-axis
x_label : str (default=None)

Text label for the x-axis
ax : matplotlib.axes.Axes (default: None)

An existing matplotlib Axes. Creates one if ax=None
percentile : float (default=None)

Float between 0 and 1 for plotting a percentile threshold line
ecdf_color : matplotlib color (default=None)

Color for the ECDF plot; uses matplotlib defaults if None
ecdf_marker : matplotlib marker (default='o')

Marker style for the ECDF plot
percentile_color : matplotlib color (default='black')

Color for the percentile threshold if percentile is not None
percentile_linestyle : matplotlib linestyle (default='--')

Line style for the percentile threshold if percentile is not None

Returns

ax : matplotlib.axes.Axes object
percentile_threshold : float

Feature threshold at the percentile or None if percentile=None
percentile_count : Number of if percentile is not None

Number of samples that have a feature less or equal than the feature threshold at a percentile threshold or None if percentile=None

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/plotting/ecdf/

enrichment_plot

enrichment_plot(df, colors='bgrkcy', markers=' ', linestyles='-', alpha=0.5, lw=2, where='post', grid=True, count_label='Count', xlim='auto', ylim='auto', invert_axes=False, legend_loc='best', ax=None)

Plot stacked barplots

Parameters

df : pandas.DataFrame

A pandas DataFrame where columns represent the different categories. colors: str (default: 'bgrcky') The colors of the bars.
markers : str (default: ' ')

Matplotlib markerstyles, e.g, 'sov' for square,circle, and triangle markers.
linestyles : str (default: '-')

Matplotlib linestyles, e.g., '-,--' to cycle normal and dashed lines. Note that the different linestyles need to be separated by commas.
alpha : float (default: 0.5)

Transparency level from 0.0 to 1.0.
lw : int or float (default: 2)

Linewidth parameter.
where : {'post', 'pre', 'mid'} (default: 'post')

Starting location of the steps.
grid : bool (default: True)

Plots a grid if True.
count_label : str (default: 'Count')

Label for the "Count"-axis.
xlim : 'auto' or array-like [min, max] (default: 'auto')

Min and maximum position of the x-axis range.
ylim : 'auto' or array-like [min, max] (default: 'auto')

Min and maximum position of the y-axis range.
invert_axes : bool (default: False)

Plots count on the x-axis if True.
legend_loc : str (default: 'best')

Location of the plot legend {best, upper left, upper right, lower left, lower right} No legend if legend_loc=False
ax : matplotlib axis, optional (default: None)

Use this axis for plotting or make a new one otherwise

Returns

ax : matplotlib axis

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/plotting/enrichment_plot/

heatmap

heatmap(matrix, hide_spines=False, hide_ticks=False, figsize=None, cmap=None, colorbar=True, row_names=None, column_names=None, column_name_rotation=45, cell_values=True, cell_fmt='.2f', cell_font_size=None, text_color_threshold=None)

Plot a heatmap via matplotlib.

Parameters

conf_mat : array-like, shape = [n_rows, n_columns]

And arbitrary 2D array.
hide_spines : bool (default: False)

Hides axis spines if True.
hide_ticks : bool (default: False)

Hides axis ticks if True
figsize : tuple (default: (2.5, 2.5))

Height and width of the figure
cmap : matplotlib colormap (default: None)

Uses matplotlib.pyplot.cm.viridis if None
colorbar : bool (default: True)

Shows a colorbar if True
row_names : array-like, shape = [n_rows] (default: None)

List of row names to be used as y-axis tick labels.
column_names : array-like, shape = [n_columns] (default: None)

List of column names to be used as x-axis tick labels.
column_name_rotation : int (default: 45)

Number of degrees for rotating column x-tick labels.
cell_values : bool (default: True)

Plots cell values if True.
cell_fmt : string (default: '.2f')

Format specification for cell values (if cell_values=True)
cell_font_size : int (default: None)

Font size for cell values (if cell_values=True)
text_color_threshold : float (default: None)

Threshold for the black/white text threshold of the text annotation. Default (None) tried to infer a good threshold automatically using np.max(normed_matrix) / 2.

Returns

fig, ax : matplotlib.pyplot subplot objects

Figure and axis elements of the subplot.

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/plotting/heatmap/

plot_confusion_matrix

plot_confusion_matrix(conf_mat, hide_spines=False, hide_ticks=False, figsize=None, cmap=None, colorbar=False, show_absolute=True, show_normed=False, norm_colormap=None, class_names=None, figure=None, axis=None, fontcolor_threshold=0.5)

Plot a confusion matrix via matplotlib.

Parameters

conf_mat : array-like, shape = [n_classes, n_classes]

Confusion matrix from evaluate.confusion matrix.
hide_spines : bool (default: False)

Hides axis spines if True.
hide_ticks : bool (default: False)

Hides axis ticks if True
figsize : tuple (default: (2.5, 2.5))

Height and width of the figure
cmap : matplotlib colormap (default: None)

Uses matplotlib.pyplot.cm.Blues if None
colorbar : bool (default: False)

Shows a colorbar if True
show_absolute : bool (default: True)

Shows absolute confusion matrix coefficients if True. At least one of show_absolute or show_normed must be True.
show_normed : bool (default: False)

Shows normed confusion matrix coefficients if True. The normed confusion matrix coefficients give the proportion of training examples per class that are assigned the correct label. At least one of show_absolute or show_normed must be True.
norm_colormap : bool (default: False)

Matplotlib color normalization object to normalize the color scale, e.g., matplotlib.colors.LogNorm().
class_names : array-like, shape = [n_classes] (default: None)

List of class names. If not None, ticks will be set to these values.
figure : None or Matplotlib figure (default: None)

If None will create a new figure.
axis : None or Matplotlib figure axis (default: None)

If None will create a new axis.
fontcolor_threshold : Float (default: 0.5)

Sets a threshold for choosing black and white font colors for the cells. By default all values larger than 0.5 times the maximum cell value are converted to white, and everything equal or smaller than 0.5 times the maximum cell value are converted to black.

Returns

fig, ax : matplotlib.pyplot subplot objects

Figure and axis elements of the subplot.

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/plotting/plot_confusion_matrix/

plot_decision_regions

plot_decision_regions(X, y, clf, feature_index=None, filler_feature_values=None, filler_feature_ranges=None, ax=None, X_highlight=None, zoom_factor=1.0, legend=1, hide_spines=True, markers='s^oxv<>', colors='#1f77b4,#ff7f0e,#3ca02c,#d62728,#9467bd,#8c564b,#e377c2,#7f7f7f,#bcbd22,#17becf', scatter_kwargs=None, contourf_kwargs=None, contour_kwargs=None, scatter_highlight_kwargs=None, n_jobs=None)

Plot decision regions of a classifier.

Please note that this functions assumes that class labels are
labeled consecutively, e.g,. 0, 1, 2, 3, 4, and 5. If you have class
labels with integer labels > 4, you may want to provide additional colors
and/or markers as `colors` and `markers` arguments.
See https://matplotlib.org/examples/color/named_colors.html for more
information.

Parameters

X : array-like, shape = [n_samples, n_features]

Feature Matrix.
y : array-like, shape = [n_samples]

True class labels.
clf : Classifier object.

Must have a .predict method.
feature_index : array-like (default: (0,) for 1D, (0, 1) otherwise)

Feature indices to use for plotting. The first index in feature_index will be on the x-axis, the second index will be on the y-axis.
filler_feature_values : dict (default: None)

Only needed for number features > 2. Dictionary of feature index-value pairs for the features not being plotted.
filler_feature_ranges : dict (default: None)

Only needed for number features > 2. Dictionary of feature index-value pairs for the features not being plotted. Will use the ranges provided to select training samples for plotting.
ax : matplotlib.axes.Axes (default: None)

An existing matplotlib Axes. Creates one if ax=None.
X_highlight : array-like, shape = [n_samples, n_features] (default: None)

An array with data points that are used to highlight samples in X.
zoom_factor : float (default: 1.0)

Controls the scale of the x- and y-axis of the decision plot.
hide_spines : bool (default: True)

Hide axis spines if True.
legend : int (default: 1)

Integer to specify the legend location. No legend if legend is 0.
markers : str (default: 's^oxv<>')

Scatterplot markers.
colors : str (default: 'red,blue,limegreen,gray,cyan')

Comma separated list of colors.
scatter_kwargs : dict (default: None)

Keyword arguments for underlying matplotlib scatter function.
contourf_kwargs : dict (default: None)

Keyword arguments for underlying matplotlib contourf function.
contour_kwargs : dict (default: None)

Keyword arguments for underlying matplotlib contour function (which draws the lines between decision regions).
scatter_highlight_kwargs : dict (default: None)

Keyword arguments for underlying matplotlib scatter function.
n_jobs : int or None, optional (default=None)

The number of CPUs to use to do the computation using Python's multiprocessing library. None means 1. -1 means using all processors. New in v0.22.0.

Returns

ax : matplotlib.axes.Axes object

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/plotting/plot_decision_regions/

plot_learning_curves

plot_learning_curves(X_train, y_train, X_test, y_test, clf, train_marker='o', test_marker='^', scoring='misclassification error', suppress_plot=False, print_model=True, title_fontsize=12, style='default', legend_loc='best')

Plots learning curves of a classifier.

Parameters

X_train : array-like, shape = [n_samples, n_features]

Feature matrix of the training dataset.
y_train : array-like, shape = [n_samples]

True class labels of the training dataset.
X_test : array-like, shape = [n_samples, n_features]

Feature matrix of the test dataset.
y_test : array-like, shape = [n_samples]

True class labels of the test dataset.
clf : Classifier object. Must have a .predict .fit method.
train_marker : str (default: 'o')

Marker for the training set line plot.
test_marker : str (default: '^')

Marker for the test set line plot.
scoring : str (default: 'misclassification error')

If not 'misclassification error', accepts the following metrics (from scikit-learn): {'accuracy', 'average_precision', 'f1_micro', 'f1_macro', 'f1_weighted', 'f1_samples', 'log_loss', 'precision', 'recall', 'roc_auc', 'adjusted_rand_score', 'mean_absolute_error', 'mean_squared_error', 'median_absolute_error', 'r2'}
suppress_plot=False : bool (default: False)

Suppress matplotlib plots if True. Recommended for testing purposes.
print_model : bool (default: True)

Print model parameters in plot title if True.
title_fontsize : int (default: 12)

Determines the size of the plot title font.
style : str (default: 'default')

Matplotlib style. For more styles, please see https://matplotlib.org/stable/gallery/style_sheets/style_sheets_reference.html
legend_loc : str (default: 'best')

Where to place the plot legend: {'best', 'upper left', 'upper right', 'lower left', 'lower right'}

Returns

errors : (training_error, test_error): tuple of lists

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/plotting/plot_learning_curves/

plot_linear_regression

plot_linear_regression(X, y, model=LinearRegression(), corr_func='pearsonr', scattercolor='blue', fit_style='k--', legend=True, xlim='auto')

Plot a linear regression line fit.

Parameters

X : numpy array, shape = [n_samples,]

Samples.
y : numpy array, shape (n_samples,)

Target values model: object (default: sklearn.linear_model.LinearRegression) Estimator object for regression. Must implement a .fit() and .predict() method. corr_func: str or function (default: 'pearsonr') Uses pearsonr from scipy.stats if corr_func='pearsonr'. to compute the regression slope. If not 'pearsonr', the corr_func, the corr_func parameter expects a function of the form func(, ) as inputs, which is expected to return a tuple (<correlation_coefficient>, <some_unused_value>). scattercolor: string (default: blue) Color of scatter plot points. fit_style: string (default: k--) Style for the line fit. legend: bool (default: True) Plots legend with corr_coeff coef., fit coef., and intercept values. xlim: array-like (x_min, x_max) or 'auto' (default: 'auto') X-axis limits for the linear line fit.

Returns

regression_fit : tuple

intercept, slope, corr_coeff (float, float, float)

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/plotting/plot_linear_regression/

plot_pca_correlation_graph

plot_pca_correlation_graph(X, variables_names, dimensions=(1, 2), figure_axis_size=6, X_pca=None, explained_variance=None)

Compute the PCA for X and plots the Correlation graph

Parameters

X : 2d array like.

The columns represent the different variables and the rows are the samples of thos variables
variables_names : array like

Name of the columns (the variables) of X

dimensions: tuple with two elements. dimensions to be plotted (x,y)

figure_axis_size : size of the final frame. The figure created is a square with length and width equal to figure_axis_size.
X_pca : np.ndarray, shape = [n_samples, n_components].

Optional. X_pca is the matrix of the transformed components from X. If not provided, the function computes PCA automatically using mlxtend.feature_extraction.PrincipalComponentAnalysis Expected n_componentes >= max(dimensions)
explained_variance : 1 dimension np.ndarray, length = n_components

Optional. explained_variance are the eigenvalues from the diagonalized covariance matrix on the PCA transformatiopn. If not provided, the function computes PCA independently Expected n_componentes == X.shape[1]

Returns

matplotlib_figure, correlation_matrix

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/plotting/plot_pca_correlation_graph/

plot_sequential_feature_selection

plot_sequential_feature_selection(metric_dict, figsize=None, kind='std_dev', color='blue', bcolor='steelblue', marker='o', alpha=0.2, ylabel='Performance', confidence_interval=0.95)

Plot feature selection results.

Parameters

metric_dict : mlxtend.SequentialFeatureSelector.get_metric_dict() object
figsize : tuple (default: None)

Height and width of the figure
kind : str (default: "std_dev")

The kind of error bar or confidence interval in {'std_dev', 'std_err', 'ci', None}.
color : str (default: "blue")

Color of the lineplot (accepts any matplotlib color name)
bcolor : str (default: "steelblue").

Color of the error bars / confidence intervals (accepts any matplotlib color name).
marker : str (default: "o")

Marker of the line plot (accepts any matplotlib marker name).
alpha : float in [0, 1] (default: 0.2)

Transparency of the error bars / confidence intervals.
ylabel : str (default: "Performance")

Y-axis label.
confidence_interval : float (default: 0.95)

Confidence level if kind='ci'.

Returns

fig : matplotlib.pyplot.figure() object

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/plotting/plot_sequential_feature_selection/

remove_borders

remove_borders(axes, left=False, bottom=False, right=True, top=True)

Remove chart junk from matplotlib plots.

Parameters

axes : iterable

An iterable containing plt.gca() or plt.subplot() objects, e.g. [plt.gca()].
left : bool (default: False)

Hide left axis spine if True.
bottom : bool (default: False)

Hide bottom axis spine if True.
right : bool (default: True)

Hide right axis spine if True.
top : bool (default: True)

Hide top axis spine if True.

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/plotting/remove_chartjunk/

scatter_hist

scatter_hist(x, y, xlabel=None, ylabel=None, figsize=(5, 5))

Scatter plot and individual feature histograms along axes.

Parameters

x : 1D array-like or Pandas Series

X-axis values.
y : 1D array-like or Pandas Series

Y-axis values.
xlabel : str (default: None)

Label for the X-axis values. If x is a pandas Series, and xlabel is None, the label is inferred automatically.
ylabel : str (default: None)

Label for the X-axis values. If y is a pandas Series, and ylabel is None, the label is inferred automatically.
figsize : tuple (default: (5, 5))

Matplotlib figure size.

Returns

plot : Matplotlib Figure object

scatterplotmatrix

scatterplotmatrix(X, fig_axes=None, names=None, figsize=(8, 8), alpha=1.0, kwargs)

Lower triangular of a scatterplot matrix

Parameters

X : array-like, shape={num_examples, num_features}

Design matrix containing data instances (examples) with multiple exploratory variables (features).
fix_axes : tuple (default: None)

A (fig, axes) tuple, where fig is an figure object and axes is an axes object created via matplotlib, for example, by calling the pyplot subplot function fig, axes = plt.subplots(...)
names : list (default: None)

A list of string names, which should have the same number of elements as there are features (columns) in X.
figsize : tuple (default: (8, 8))

Height and width of the subplot grid. Ignored if fig_axes is not None.
alpha : float (default: 1.0)

Transparency for both the scatter plots and the histograms along the diagonal.
**kwargs : kwargs

Keyword arguments for the scatterplots.

Returns

fix_axes : tuple

A (fig, axes) tuple, where fig is an figure object and axes is an axes object created via matplotlib, for example, by calling the pyplot subplot function fig, axes = plt.subplots(...)

Examples

For more usage examples, please see https://rasbt.github.io/mlxtend/user_guide/plotting/scatterplotmatrix/

stacked_barplot

stacked_barplot(df, bar_width='auto', colors='bgrcky', labels='index', rotation=90, legend_loc='best')

Function to plot stacked barplots

Parameters

df : pandas.DataFrame

A pandas DataFrame where the index denotes the x-axis labels, and the columns contain the different measurements for each row. bar_width: 'auto' or float (default: 'auto') Parameter to set the widths of the bars. if 'auto', the width is automatically determined by the number of columns in the dataset. colors: str (default: 'bgrcky') The colors of the bars. labels: 'index' or iterable (default: 'index') If 'index', the DataFrame index will be used as x-tick labels. rotation: int (default: 90) Parameter to rotate the x-axis labels.
legend_loc : str (default: 'best')

Location of the plot legend {best, upper left, upper right, lower left, lower right} No legend if legend_loc=False

Returns

fig : matplotlib.pyplot figure object

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/plotting/stacked_barplot/

Keys	Action
`?`	Open this help
`n`	Next page
`p`	Previous page
`s`	Search