Release Notes

The CHANGELOG for the current development version is available at

Version 0.13.0 (2018-07-20)

New Features
  • A meaningful error message is now raised when a cross-validation generator is used with SequentialFeatureSelector. (#377)
  • The SequentialFeatureSelector now accepts custom feature names via the fit method for more interpretable feature subset reports. (#379)
  • The SequentialFeatureSelector is now also compatible with pandas DataFrames and uses DataFrame column names for more interpretable feature subset reports. (#379)
  • ColumnSelector now works with pandas DataFrame columns. (#378 by Manuel Garrido)
  • The ExhaustiveFeatureSelector estimator in mlxtend.feature_selection now is safely stoppable mid-process by control+c. (#380)
  • Two new functions, vectorspace_orthonormalization and vectorspace_dimensionality, were added to mlxtend.math to use the Gram-Schmidt process to convert a set of linearly independent vectors into a set of orthonormal basis vectors, and to compute the dimensionality of a vector space, respectively. (#382)
  • mlxtend.frequent_patterns.apriori now supports pandas SparseDataFrames to generate frequent itemsets. (#404 via Daniel Morales)
  • The plot_confusion_matrix function now has the ability to show normalized confusion matrix coefficients in addition to or instead of absolute confusion matrix coefficients with or without a colorbar. The text display method has been changed so that the full range of the colormap is used. The default size is also now set based on the number of classes.
  • Added support for merging the meta features with the original input features in StackingRegressor (via use_features_in_secondary) like it is already supported in the other Stacking classes. (#418)
  • Added a support_only parameter to the association_rules function, which allows constructing association rules (based on the support metric only) for cropped input DataFrames that don't contain a complete set of antecedent and consequent support values. (#421)
  • Itemsets generated with apriori are now frozensets (#393 by William Laney and #394)
  • apriori now raises an error if an input DataFrame contains values other than 0, 1, True, or False. (#419)
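
The Gram-Schmidt process named in the vectorspace_orthonormalization entry above can be sketched in plain Python. This is an illustrative sketch only, not mlxtend's implementation: the helper names dot and gram_schmidt are hypothetical, vectors are passed as rows, and the input is assumed to be linearly independent.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Return an orthonormal basis spanning the same space as `vectors`.

    Assumes the input vectors are linearly independent (hypothetical helper).
    """
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the components that point along the basis vectors found so far.
        for q in basis:
            coeff = dot(w, q)
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        # Scale the remainder to unit length.
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis


basis = gram_schmidt([[3.0, 1.0], [2.0, 2.0]])
```

Each returned vector has unit length, and any two of them have a dot product of (numerically) zero.
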
Bug Fixes
  • Allow mlxtend estimators to be cloned via scikit-learn's clone function. (#374)
  • Fixed a bug to allow the correct use of refit=False in StackingRegressor and StackingCVRegressor. (#384 and #385 by selay01)
  • Allow StackingClassifier to work with sparse matrices when use_features_in_secondary=True (#408 by Floris Hoogenbook)
  • Allow StackingCVRegressor to work with sparse matrices when use_features_in_secondary=True (#416)
  • Allow StackingCVClassifier to work with sparse matrices when use_features_in_secondary=True (#417)

Version 0.12.0 (2018-04-21)

New Features
  • A new feature_importance_permutation function to compute the feature importance in classifiers and regressors via the permutation importance method. (#358)
  • The fit method of the ExhaustiveFeatureSelector now optionally accepts **fit_params for the estimator that is used for the feature selection. (#354 by Zach Griffith)
  • The fit method of the SequentialFeatureSelector now optionally accepts **fit_params for the estimator that is used for the feature selection. (#350 by Zach Griffith)
  • Replaced the plot_decision_regions colors with a colorblind-friendly palette and added contour lines for decision regions. (#348)
  • All stacking estimators now raise a NotFittedError if any method for inference is called prior to fitting the estimator. (#353)
  • Renamed the refit parameter of both the StackingClassifier and StackingCVClassifier to use_clones to be more explicit and less misleading. (#368)
Bug Fixes
  • Various changes in the documentation and documentation tools to fix formatting issues (#363)
  • Fixes a bug where the StackingCVClassifier's meta features were not stored in the original order when shuffle=True (#370)
  • Many documentation improvements, including links to the User Guides in the API docs (#371)

Version 0.11.0 (2018-03-14)

New Features
  • New function implementing the resampled paired t-test procedure (paired_ttest_resampled) to compare the performance of two models. (#323)
  • New function implementing the k-fold paired t-test procedure (paired_ttest_kfold_cv) to compare the performance of two models (also called k-hold-out paired t-test). (#324)
  • New function implementing the 5x2cv paired t-test procedure (paired_ttest_5x2cv) proposed by Dietterich (1998) to compare the performance of two models. (#325)
  • A refit parameter was added to stacking classes (similar to the refit parameter in the EnsembleVoteClassifier), to support classifiers and regressors that follow the scikit-learn API but are not compatible with scikit-learn's clone function. (#322)
  • The ColumnSelector now has a drop_axis argument to use it in pipelines with CountVectorizers. (#333)
  • Raises an informative error message if predict or predict_meta_features is called prior to calling the fit method in StackingRegressor and StackingCVRegressor. (#315)
  • The plot_decision_regions function now automatically determines the optimal setting based on the feature dimensions and supports anti-aliasing. The old res parameter has been deprecated. (#309 by Guillaume Poirier-Morency)
  • Apriori code is faster due to optimization in onehot transformation and the amount of candidates generated by the apriori algorithm. (#327 by Jakub Smid)
  • The OnehotTransactions class (which is typically used in combination with the apriori function for association rule mining) is now more memory efficient as it uses boolean arrays instead of integer arrays. In addition, the OnehotTransactions class can now be provided with a sparse argument to generate sparse representations of the onehot matrix to further improve memory efficiency. (#328 by Jakub Smid)
  • The OnehotTransactions class has been deprecated and replaced by the TransactionEncoder. (#332)
  • The plot_decision_regions function now has three new parameters, scatter_kwargs, contourf_kwargs, and scatter_highlight_kwargs, that can be used to modify the plotting style. (#342 by James Bourbeau)
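
As a rough illustration of the 5x2cv paired t-test listed above, the sketch below computes the test statistic from per-fold score differences. It is a minimal sketch under stated assumptions, not the code behind paired_ttest_5x2cv: the function name t_statistic_5x2cv and the example differences are hypothetical, and the caller is assumed to have already run five replications of 2-fold cross-validation.

```python
import math


def t_statistic_5x2cv(score_diffs):
    """Compute the 5x2cv paired t statistic.

    `score_diffs` holds five (fold 1, fold 2) pairs; each entry is the
    difference in score between model A and model B on that fold.
    """
    if len(score_diffs) != 5:
        raise ValueError("expected 5 replications of 2-fold cross-validation")
    variance_sum = 0.0
    for p1, p2 in score_diffs:
        mean = (p1 + p2) / 2.0
        # Variance estimate of the two differences within one replication.
        variance_sum += (p1 - mean) ** 2 + (p2 - mean) ** 2
    # Numerator: the difference from the first fold of the first replication.
    return score_diffs[0][0] / math.sqrt(variance_sum / 5.0)


# Hypothetical per-fold accuracy differences (model A minus model B).
diffs = [(0.02, 0.04), (0.01, 0.03), (0.03, 0.01), (0.02, 0.02), (0.04, 0.02)]
t_stat = t_statistic_5x2cv(diffs)
```

The resulting statistic is compared against a t distribution with 5 degrees of freedom to obtain a p-value.
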
Bug Fixes
  • Fixed an issue when class labels were provided to the EnsembleVoteClassifier when refit was set to False. (#322)
  • Allow arrays with 16-bit and 32-bit precision in plot_decision_regions function. (#337)
  • Fixed bug that raised an indexing error if the number of items was <= 1 when computing association rules using the conviction metric. (#340)

Version 0.10.0 (2017-12-22)

New Features
  • New store_train_meta_features parameter for fit in StackingCVRegressor. If True, train meta-features are stored in self.train_meta_features_. New pred_meta_features method for StackingCVRegressor, which can be used to get test meta-features. (#294 via takashioya)
  • The new store_train_meta_features attribute and pred_meta_features method for the StackingCVRegressor were also added to the StackingRegressor, StackingClassifier, and StackingCVClassifier (#299 & #300)
  • New function (evaluate.mcnemar_tables) for creating multiple 2x2 contingency tables from model prediction arrays that can be used in multiple McNemar (post-hoc) tests or Cochran's Q or F tests, etc. (#307)
  • New function (evaluate.cochrans_q) for performing Cochran's Q test to compare the accuracy of multiple classifiers. (#310)
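
To make the 2x2 contingency tables used by McNemar's test concrete, here is a small plain-Python sketch. It is not the evaluate.mcnemar_tables implementation: the function names, the table layout, and the example predictions are assumptions chosen for illustration, and the statistic shown is the chi-squared version with continuity correction.

```python
def mcnemar_counts(y_true, pred_a, pred_b):
    """Return [[both right, only A right], [only B right, both wrong]]."""
    table = [[0, 0], [0, 0]]
    for y, a, b in zip(y_true, pred_a, pred_b):
        row = 0 if a == y else 1  # row 1 means model A was wrong
        col = 0 if b == y else 1  # column 1 means model B was wrong
        table[row][col] += 1
    return table


def mcnemar_chi2(table):
    """Chi-squared statistic with continuity correction from the discordant cells."""
    b, c = table[0][1], table[1][0]
    if b + c == 0:
        raise ValueError("no discordant pairs; the statistic is undefined")
    return (abs(b - c) - 1) ** 2 / (b + c)


# Hypothetical labels and predictions from two models.
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
pred_a = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
pred_b = [0, 1, 1, 0, 1, 1, 0, 0, 1, 0]

table = mcnemar_counts(y_true, pred_a, pred_b)
chi2 = mcnemar_chi2(table)
```

Only the two off-diagonal cells, where exactly one model is correct, enter the statistic.
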
Bug Fixes
  • Improved numerical stability for p-values computed via the exact McNemar test (#306)
  • nose is not required to use the library (#302)

Version 0.9.1 (2017-11-19)

New Features
  • Added mlxtend.evaluate.bootstrap_point632_score to evaluate the performance of estimators using the .632 bootstrap. (#283)
  • New max_len parameter for the frequent itemset generation via the apriori function to allow for early stopping. (#270)
  • All feature index tuples in SequentialFeatureSelector are now in sorted order. (#262)
  • The SequentialFeatureSelector now runs the continuation of the floating inclusion/exclusion as described in Novovicova & Kittler (1994). Note that this didn't cause any difference in performance on any of the test scenarios but could lead to better performance in certain edge cases. (#262)
  • utils.Counter now accepts a name variable to help distinguish between multiple counters; time precision can be set with the 'precision' kwarg, and the new attribute end_time holds the time the last iteration completed. (#278 via Mathew Savage)
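
The idea behind the .632 bootstrap mentioned above can be sketched as follows. This is a simplified illustration, not the bootstrap_point632_score implementation: the helper names and the example scores are hypothetical, and the model fitting step that would produce the training and out-of-bag scores is left out.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw a bootstrap sample and return (in-bag, out-of-bag) index lists."""
    # Sample indices with replacement; unsampled indices form the out-of-bag set.
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in chosen]
    return in_bag, out_of_bag


def point632(train_score, oob_score):
    """Blend the optimistic training score with the pessimistic out-of-bag score."""
    return 0.368 * train_score + 0.632 * oob_score


rng = random.Random(0)
in_bag, out_of_bag = bootstrap_split(10, rng)

# Hypothetical scores from a model fit on the in-bag samples.
estimate = point632(train_score=1.0, oob_score=0.8)
```

In practice the estimate would be averaged over many bootstrap rounds rather than computed once.
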
Bug Fixes
  • Fixed a deprecation error that occurred with the McNemar test when using SciPy 1.0. (#283)

Version 0.9.0 (2017-10-21)

New Features
  • Added evaluate.permutation_test, a permutation test for hypothesis testing (or A/B testing) to test if two samples come from the same distribution. In other words, it is a procedure to test the null hypothesis that two groups are not significantly different (e.g., a treatment and a control group). (#250)
  • Added 'leverage' and 'conviction' as evaluation metrics to the frequent_patterns.association_rules function. (#246 & #247)
  • Added a loadings_ attribute to PrincipalComponentAnalysis to compute the factor loadings of the features on the principal components. (#251)
  • Allow grid search over classifiers/regressors in ensemble and stacking estimators. (#259)
  • New make_multiplexer_dataset function that creates a dataset generated by an n-bit Boolean multiplexer for evaluating supervised learning algorithms. (#263)
  • Added a new BootstrapOutOfBag class, an implementation of the out-of-bag bootstrap to evaluate supervised learning algorithms. (#265)
  • The parameters for StackingClassifier, StackingCVClassifier, StackingRegressor, StackingCVRegressor, and EnsembleVoteClassifier can now be tuned using scikit-learn's GridSearchCV (#254 via James Bourbeau)
  • The 'support' column returned by frequent_patterns.association_rules was changed to compute the support of "antecedent union consequent", and new 'antecedent support' and 'consequent support' columns were added to avoid ambiguity. (#245)
  • Allow the OnehotTransactions to be cloned via scikit-learn's clone function, which is required by e.g., scikit-learn's FeatureUnion or GridSearchCV (via Iaroslav Shcherbatyi). (#249)
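
The permutation test described in the list above can be approximated with random shuffles, as in the sketch below. It is a minimal illustration and not the evaluate.permutation_test implementation: the function signature, the choice of the absolute difference in means as the test statistic, and the sample data are all assumptions.

```python
import random


def permutation_test(x, y, num_rounds=1000, seed=0):
    """Approximate two-sided permutation test on the difference in means."""
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    at_least_as_extreme = 0
    for _ in range(num_rounds):
        # Reassign group labels at random by shuffling the pooled values.
        rng.shuffle(pooled)
        new_x, new_y = pooled[:len(x)], pooled[len(x):]
        stat = abs(sum(new_x) / len(new_x) - sum(new_y) / len(new_y))
        # A small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / num_rounds


# Hypothetical measurements for a treatment and a control group.
treatment = [28.4, 25.8, 29.1, 27.2, 30.5]
control = [22.1, 23.5, 21.7, 24.9, 22.8]
p_value = permutation_test(treatment, control)
```

A small p-value suggests that the observed difference would be unlikely if both groups came from the same distribution.
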
Bug Fixes
  • Fix issues with self._init_time parameter in _IterativeModel subclasses. (#256)
  • Fix imprecision bug that occurred in plot_ecdf when run on Python 2.7. (#264)
  • The vectors from SVD in PrincipalComponentAnalysis are now scaled so that the eigenvalues obtained via solver='eigen' and solver='svd' have the same magnitudes. (#251)

Version 0.8.0 (2017-09-09)

New Features
  • Added mlxtend.evaluate.bootstrap, which implements the ordinary nonparametric bootstrap to bootstrap a single statistic (for example, the mean, median, R^2 of a regression fit, and so forth) #232
  • SequentialFeatureSelector's k_features now accepts a string argument "best" or "parsimonious" for more "automated" feature selection. For instance, if "best" is provided, the feature selector will return the feature subset with the best cross-validation performance. If "parsimonious" is provided as an argument, the smallest feature subset that is within one standard error of the best cross-validation performance will be selected. #238
  • SequentialFeatureSelector now uses np.nanmean over normal mean to support scorers that may return np.nan #211 (via mrkaiser)
  • The skip_if_stuck parameter was removed from SequentialFeatureSelector in favor of a more efficient implementation comparing the conditional inclusion/exclusion results (in the floating versions) to the performances of previously sampled feature sets that were cached #237
  • ExhaustiveFeatureSelector was modified to consume substantially less memory #195 (via Adam Erickson)
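
The "parsimonious" rule described above can be illustrated with a short sketch. This is not the SequentialFeatureSelector code: the function name, the dictionary layout, and the scores are hypothetical, and the sketch assumes that the standard error of the best-scoring subset defines the tolerance.

```python
def select_parsimonious(results):
    """Pick the smallest subset scoring within one standard error of the best.

    `results` maps a subset size to (mean CV score, standard error).
    """
    best_size = max(results, key=lambda size: results[size][0])
    best_score, best_se = results[best_size]
    threshold = best_score - best_se
    # Keep every subset size whose score clears the threshold, then take the smallest.
    candidates = [size for size, (score, _) in results.items() if score >= threshold]
    return min(candidates)


# Hypothetical cross-validation results: size -> (mean score, standard error).
cv_results = {
    1: (0.800, 0.02),
    2: (0.905, 0.02),
    3: (0.915, 0.02),
    4: (0.920, 0.02),
}
chosen = select_parsimonious(cv_results)
```

With these numbers, "best" would correspond to four features, while the parsimonious rule settles for two because their score lies within one standard error of the best.
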
Bug Fixes
  • Fixed a bug where the SequentialFeatureSelector selected a feature subset larger than specified via the k_features tuple max-value #213

Version 0.7.0 (2017-06-22)

New Features
  • The TensorFlow estimators have been removed from mlxtend, since TensorFlow now has very convenient ways to build estimators, which render those implementations obsolete.
  • plot_decision_regions now supports plotting decision regions for more than 2 training features (#189, via James Bourbeau).
  • Parallel execution in mlxtend.feature_selection.SequentialFeatureSelector and mlxtend.feature_selection.ExhaustiveFeatureSelector is now performed over different feature subsets instead of the different cross-validation folds to better utilize machines with multiple processors if the number of features is large (#193, via @whalebot-helmsman).
  • Raise meaningful error messages if pandas DataFrames or Python lists of lists are fed into the StackingCVClassifier as fit arguments (#198).
  • The n_folds parameter of the StackingCVClassifier was changed to cv and can now accept any kind of cross validation technique that is available from scikit-learn. For example, StackingCVClassifier(..., cv=StratifiedKFold(n_splits=3)) or StackingCVClassifier(..., cv=GroupKFold(n_splits=3)) (#203, via Konstantinos Paliouras).
Bug Fixes
  • SequentialFeatureSelector now correctly accepts a None argument for the scoring parameter to infer the default scoring metric from scikit-learn classifiers and regressors (#171).
  • The plot_decision_regions function now supports pre-existing axes objects generated via matplotlib's plt.subplots. (#184, see example)
  • Made math.num_combinations and math.num_permutations numerically stable for large numbers of combinations and permutations (#200).
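
One way to keep combination and permutation counts exact for large inputs is to stay in integer arithmetic, as sketched below. This is an assumption about a workable approach rather than a description of how math.num_combinations and math.num_permutations are implemented; the function names n_choose_k and n_permute_k are hypothetical.

```python
def n_choose_k(n, k):
    """Number of k-element subsets of n items, using exact integer arithmetic."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)  # exploit symmetry to shorten the loop
    result = 1
    for i in range(1, k + 1):
        # After this step result equals C(n - k + i, i), so // is exact.
        result = result * (n - k + i) // i
    return result


def n_permute_k(n, k):
    """Number of ordered arrangements of k items drawn from n items."""
    if k < 0 or k > n:
        return 0
    result = 1
    for i in range(n - k + 1, n + 1):
        result *= i
    return result


large = n_choose_k(1000, 500)
```

Because Python integers have arbitrary precision, neither function overflows or loses digits, unlike a float-based factorial ratio.
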

Version 0.6.0 (2017-03-18)

New Features
  • An association_rules function is implemented that generates rules based on a list of frequent itemsets (via Joshua Goerner).
  • Adds a black edgecolor to plots via plotting.plot_decision_regions to make markers more distinguishable from the background in matplotlib>=2.0.
  • The association submodule was renamed to frequent_patterns.
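
To illustrate how rules are scored once frequent itemsets and their supports are known, the sketch below computes the textbook metrics for a single rule A -> C. It is not the association_rules implementation: the function name, the argument names, and the support values are hypothetical.

```python
def rule_metrics(support_a, support_c, support_ac):
    """Score the rule A -> C from the supports of A, C, and A union C."""
    confidence = support_ac / support_a
    lift = confidence / support_c
    leverage = support_ac - support_a * support_c
    # Conviction is infinite when the rule never fails (confidence of 1).
    if confidence == 1.0:
        conviction = float("inf")
    else:
        conviction = (1.0 - support_c) / (1.0 - confidence)
    return {
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }


# Hypothetical supports: A in 50%, C in 40%, and both in 25% of transactions.
metrics = rule_metrics(support_a=0.5, support_c=0.4, support_ac=0.25)
```

A lift above 1 and a positive leverage both indicate that A and C occur together more often than expected under independence.
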
Bug Fixes
  • The DataFrame index of apriori results is now unique and ordered.
  • Fixed typos in autompg and wine datasets (via James Bourbeau).

Version 0.5.1 (2017-02-14)

New Features
  • The EnsembleVoteClassifier has a new refit attribute that prevents refitting classifiers if refit=False to save computational time.
  • Added a new lift_score function in evaluate to compute lift score (via Batuhan Bardak).
  • StackingClassifier and StackingRegressor support multivariate targets if the underlying models do (via kernc).
  • StackingClassifier has a new use_features_in_secondary attribute like StackingCVClassifier.
  • Changed default verbosity level in SequentialFeatureSelector to 0
  • The EnsembleVoteClassifier now raises a NotFittedError if the estimator wasn't fit before calling predict. (via Anton Loss)
  • Added new TensorFlow variable initialization syntax to guarantee compatibility with TensorFlow 1.0
Bug Fixes
  • Fixed wrong default value for k_features in SequentialFeatureSelector
  • Cast selected feature subsets in the SequentialFeatureSelector as sets to prevent the iterator from getting stuck if the k_idx are different permutations of the same combination (via Zac Wellmer).
  • Fixed an issue with learning curves that caused the performance metrics to be reversed (via ipashchenko)
  • Fixed a bug that could occur in the SequentialFeatureSelector if there are similarly-well performing subsets in the floating variants (via Zac Wellmer).

Version 0.5.0 (2016-11-09)

New Features
  • New ExhaustiveFeatureSelector estimator in mlxtend.feature_selection for evaluating all feature combinations in a specified range
  • The StackingClassifier has a new parameter average_probas that is set to True by default to maintain the current behavior. A deprecation warning was added though, and it will default to False in future releases (0.6.0); average_probas=False will result in stacking of the level-1 predicted probabilities rather than averaging these.
  • New StackingCVClassifier estimator in 'mlxtend.classifier' for implementing a stacking ensemble that uses cross-validation techniques for training the meta-estimator to avoid overfitting (Reiichiro Nakano)
  • New OnehotTransactions encoder class added to the preprocessing submodule for transforming transaction data into a one-hot encoded array
  • The SequentialFeatureSelector estimator in mlxtend.feature_selection now is safely stoppable mid-process by control+c, and deprecated print_progress in favor of a more tunable verbose parameter (Will McGinnis)
  • New apriori function in association to extract frequent itemsets from transaction data for association rule mining
  • New checkerboard_plot function in plotting to plot checkerboard tables / heat maps
  • New mcnemar_table and mcnemar functions in evaluate to compute 2x2 contingency tables and McNemar's test
  • All plotting functions have been moved to mlxtend.plotting for compatibility reasons with continuous integration services and to make the installation of matplotlib optional for users of mlxtend's core functionality
  • Added a compatibility layer for scikit-learn 0.18 using the new model_selection module while maintaining backwards compatibility to scikit-learn 0.17.
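
The one-hot encoding of transaction data that feeds frequent itemset mining can be sketched in plain Python. This is an illustration only, not the OnehotTransactions or apriori code: the function names and the example transactions are hypothetical.

```python
def encode_transactions(transactions):
    """One-hot encode transactions into rows of booleans.

    Returns the sorted item names (the columns) and the boolean matrix.
    """
    columns = sorted({item for transaction in transactions for item in transaction})
    index = {item: i for i, item in enumerate(columns)}
    matrix = []
    for transaction in transactions:
        row = [False] * len(columns)
        for item in transaction:
            row[index[item]] = True
        matrix.append(row)
    return columns, matrix


def itemset_support(columns, matrix, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    positions = [columns.index(item) for item in itemset]
    hits = sum(all(row[i] for i in positions) for row in matrix)
    return hits / len(matrix)


# Hypothetical shopping baskets.
transactions = [
    ["milk", "bread"],
    ["bread", "eggs"],
    ["milk", "bread", "eggs"],
    ["milk"],
]
columns, matrix = encode_transactions(transactions)
support = itemset_support(columns, matrix, ["milk", "bread"])
```

An itemset counts as frequent when its support reaches a chosen minimum threshold.
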
Bug Fixes
  • mlxtend.plotting.plot_decision_regions now draws decision regions correctly if more than 4 class labels are present
  • Raise AttributeError in plot_decision_regions when the X_highlight argument is a 1D array (chkoar)

Version 0.4.2 (2016-08-24)

New Features
  • Added preprocessing.CopyTransformer, a mock class that returns copies of input arrays via transform and fit_transform
  • Added AppVeyor to CI to ensure MS Windows compatibility
  • Datasets are now saved as compressed .txt or .csv files rather than being imported as Python objects
  • feature_selection.SequentialFeatureSelector now supports the selection of k_features using a tuple to specify a "min-max" k_features range
  • Added "SVD solver" option to the PrincipalComponentAnalysis
  • Raise an AttributeError with "not fitted" message in SequentialFeatureSelector if transform or get_metric_dict are called prior to fit
  • Use small, positive bias units in TfMultiLayerPerceptron's hidden layer(s) if the activations are ReLUs in order to avoid dead neurons
  • Added an optional clone_estimator parameter to the SequentialFeatureSelector that defaults to True, avoiding the modification of the original estimator objects
  • More rigorous type and shape checks in the evaluate.plot_decision_regions function
  • DenseTransformer now doesn't raise an error if the input array is not sparse
  • API clean-up using scikit-learn's BaseEstimator as parent class for feature_selection.ColumnSelector
Bug Fixes
  • Fixed a problem when a tuple-range was provided as argument to the SequentialFeatureSelector's k_features parameter and the scoring metric was more negative than -1 (e.g., as in scikit-learn's MSE scoring function) (wahutch)
  • Fixed an AttributeError issue when verbose > 1 in StackingClassifier
  • Fixed a bug in classifier.SoftmaxRegression where the mean values of the offsets were used to update the bias units rather than their sum
  • Fixed rare bug in MLP _layer_mapping functions that caused a swap between the random number generation seed when initializing weights and biases

Version 0.4.1 (2016-05-01)

New Features
  • New TensorFlow estimator for Linear Regression (tf_regressor.TfLinearRegression)
  • New k-means clustering estimator (cluster.Kmeans)
  • New TensorFlow k-means clustering estimator (tf_cluster.Kmeans)
  • Due to refactoring of the estimator classes, the init_weights parameter of the fit methods was globally renamed to init_params
  • Overall performance improvements of estimators due to code clean-up and refactoring
  • Added several additional checks for correct array types and more meaningful exception messages
  • Added optional dropout to the tf_classifier.TfMultiLayerPerceptron classifier for regularization
  • Added an optional decay parameter to the tf_classifier.TfMultiLayerPerceptron classifier for adaptive learning via an exponential decay of the learning rate eta
  • Replaced old NeuralNetMLP by more streamlined MultiLayerPerceptron (classifier.MultiLayerPerceptron); now also with softmax in the output layer and categorical cross-entropy loss.
  • Unified init_params parameter for fit functions to continue training where the algorithm left off (if supported)

Version 0.4.0 (2016-04-09)

New Features
  • New TfSoftmaxRegression classifier using Tensorflow (tf_classifier.TfSoftmaxRegression)
  • New SoftmaxRegression classifier (classifier.SoftmaxRegression)
  • New TfMultiLayerPerceptron classifier using Tensorflow (tf_classifier.TfMultiLayerPerceptron)
  • New StackingRegressor (regressor.StackingRegressor)
  • New StackingClassifier (classifier.StackingClassifier)
  • New function for one-hot encoding of class labels (preprocessing.one_hot)
  • Added GridSearch support to the SequentialFeatureSelector (feature_selection.SequentialFeatureSelector)
  • evaluate.plot_decision_regions improvements:
    • Function now handles y class labels correctly if the array is of type float
    • Correct handling of input arguments markers and colors
    • Accept an existing Axes via the ax argument
  • New print_progress parameter for all generalized models and multi-layer neural networks for printing time elapsed, ETA, and the current cost of the current epoch
  • Minibatch learning for classifier.LogisticRegression, classifier.Adaline, and regressor.LinearRegression plus streamlined API
  • New Principal Component Analysis class via mlxtend.feature_extraction.PrincipalComponentAnalysis
  • New RBF Kernel Principal Component Analysis class via mlxtend.feature_extraction.RBFKernelPCA
  • New Linear Discriminant Analysis class via mlxtend.feature_extraction.LinearDiscriminantAnalysis
  • The column parameter in mlxtend.preprocessing.standardize now defaults to None to standardize all columns more conveniently

Version 0.3.0 (2016-01-31)

New Features
  • Added a progress bar tracker to classifier.NeuralNetMLP
  • Added a function to score predicted vs. target class labels evaluate.scoring
  • Added confusion matrix functions to create (evaluate.confusion_matrix) and plot (evaluate.plot_confusion_matrix) confusion matrices
  • New style parameter and improved axis scaling in mlxtend.evaluate.plot_learning_curves
  • Added loadlocal_mnist for streaming MNIST from local byte files into NumPy arrays
  • New NeuralNetMLP parameters: random_weights, shuffle_init, shuffle_epoch
  • New SFS features such as the generation of pandas DataFrame results tables and plotting functions (with confidence intervals, standard deviation, and standard error bars)
  • Added support for regression estimators in SFS
  • Added Boston housing dataset
  • New shuffle parameter for classifier.NeuralNetMLP
  • The mlxtend.preprocessing.standardize function now optionally returns the parameters, which are estimated from the array, for re-use. A further improvement makes the standardize function smarter in order to avoid zero-division errors
  • Cosmetic improvements to the evaluate.plot_decision_regions function such as hiding plot axes
  • Renaming of classifier.EnsembleClassifier to classifier.EnsembleVoteClassifier
  • Improved random weight initialization in Perceptron, Adaline, LinearRegression, and LogisticRegression
  • Changed learning parameter of mlxtend.classifier.Adaline to solver and added "normal equation" as closed-form solution solver
  • Hide y-axis labels in mlxtend.evaluate.plot_decision_regions in 1 dimensional evaluations
  • Sequential Feature Selection algorithms were unified into a single SequentialFeatureSelector class with parameters to enable floating selection and toggle between forward and backward selection.
  • Stratified sampling of MNIST (now 500x random samples from each of the 10 digit categories)
  • Renaming mlxtend.plotting to mlxtend.general_plotting in order to distinguish general plotting functions from specialized utility functions such as evaluate.plot_decision_regions
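
As a rough picture of what a confusion matrix function computes, the sketch below tallies true versus predicted labels and optionally normalizes each row. It is not the evaluate.confusion_matrix implementation: the function names are hypothetical, labels are assumed to be integers starting at 0, and rows are assumed to index the true class.

```python
def build_confusion_matrix(y_true, y_pred, n_classes):
    """Count predictions; rows index the true class, columns the predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def normalize_rows(matrix):
    """Divide each row by its total so every non-empty row sums to 1."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([count / total if total else 0.0 for count in row])
    return normalized


# Hypothetical labels for a three-class problem.
y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 0, 2]

counts = build_confusion_matrix(y_true, y_pred, n_classes=3)
fractions = normalize_rows(counts)
```

The diagonal of the normalized matrix gives the per-class recall.
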

Version 0.2.9 (2015-07-14)

New Features
  • Sequential Feature Selection algorithms: SFS, SFFS, SBS, and SFBS
  • Changed regularization & lambda parameters in LogisticRegression to single parameter l2_lambda

Version 0.2.8 (2015-06-27)

  • API changes:
    • mlxtend.sklearn.EnsembleClassifier -> mlxtend.classifier.EnsembleClassifier
    • mlxtend.sklearn.ColumnSelector -> mlxtend.feature_selection.ColumnSelector
    • mlxtend.sklearn.DenseTransformer -> mlxtend.preprocessing.DenseTransformer
    • mlxtend.pandas.standardizing -> mlxtend.preprocessing.standardizing
    • mlxtend.pandas.minmax_scaling -> mlxtend.preprocessing.minmax_scaling
    • mlxtend.matplotlib -> mlxtend.plotting
  • Added momentum learning parameter (alpha coefficient) to mlxtend.classifier.NeuralNetMLP.
  • Added adaptive learning rate (decrease constant) to mlxtend.classifier.NeuralNetMLP.
  • mlxtend.pandas.minmax_scaling became mlxtend.preprocessing.minmax_scaling and also supports NumPy arrays now
  • mlxtend.pandas.standardizing became mlxtend.preprocessing.standardizing and now supports both NumPy arrays and pandas DataFrames; also added a ddof parameter to set the degrees of freedom when calculating the standard deviation
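
The effect of a ddof parameter on column standardization can be shown with a small sketch. This is not the mlxtend.preprocessing.standardizing code: the function name is hypothetical, the data is passed as a list of rows, and constant columns are assumed to be centered only so that no division by zero occurs.

```python
import math


def standardize_columns(rows, ddof=0):
    """Standardize each column to zero mean and unit standard deviation.

    Constant columns are only centered, which avoids division by zero.
    """
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    # ddof=0 gives the population standard deviation, ddof=1 the sample one.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / (n - ddof))
        for col, m in zip(columns, means)
    ]
    return [
        [(x - m) / s if s > 0 else x - m for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical data: the second column is constant.
data = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]]
scaled_sample = standardize_columns(data, ddof=1)
scaled_population = standardize_columns(data, ddof=0)
```

With ddof=1 the divisor is n - 1, so the scaled values are slightly smaller in magnitude than with ddof=0.
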

Version 0.2.7 (2015-06-20)

  • Added multilayer perceptron (feedforward artificial neural network) classifier as mlxtend.classifier.NeuralNetMLP.
  • Added 5000 labeled training samples from the MNIST handwritten digits dataset to

Version 0.2.6 (2015-05-08)

  • Added ordinary least squares regression using different solvers (gradient and stochastic gradient descent, and the closed form solution (normal equation))
  • Added option for random weight initialization to logistic regression classifier and updated l2 regularization
  • Added wine dataset to
  • Added invert_axes parameter to mlxtend.matplotlib.enrichment_plot to optionally plot the "Count" on the x-axis
  • New verbose parameter for mlxtend.sklearn.EnsembleClassifier by Alejandro C. Bahnsen
  • Added mlxtend.pandas.standardizing to standardize columns in a pandas DataFrame
  • Added parameters linestyles and markers to mlxtend.matplotlib.enrichment_plot
  • mlxtend.regression.lin_regplot automatically adds np.newaxis and works with Python lists
  • Added tokenizers: mlxtend.text.extract_emoticons and mlxtend.text.extract_words_and_emoticons

Version 0.2.5 (2015-04-17)

  • Added Sequential Backward Selection (mlxtend.sklearn.SBS)
  • Added X_highlight parameter to mlxtend.evaluate.plot_decision_regions for highlighting test data points.
  • Added mlxtend.regression.lin_regplot to plot the fitted line from linear regression.
  • Added mlxtend.matplotlib.stacked_barplot to conveniently produce stacked barplots using pandas DataFrames.
  • Added mlxtend.matplotlib.enrichment_plot

Version 0.2.4 (2015-03-15)

  • Added scoring to mlxtend.evaluate.learning_curves (by user pfsq)
  • Fixed bug caused by the missing README.html file
  • Added matplotlib.category_scatter for pandas DataFrames and NumPy arrays

Version 0.2.3 (2015-03-11)

  • Added Logistic regression
  • Gradient descent and stochastic gradient descent perceptron was changed to Adaline (Adaptive Linear Neuron)
  • Perceptron and Adaline for {0, 1} classes
  • Added mlxtend.preprocessing.shuffle_arrays_unison function to shuffle one or more NumPy arrays.
  • Added shuffle and random seed parameter to stochastic gradient descent classifier.
  • Added rstrip parameter to mlxtend.file_io.find_filegroups to allow trimming of base names.
  • Added ignore_substring parameter to mlxtend.file_io.find_filegroups and find_files.
  • Replaced .rstrip in mlxtend.file_io.find_filegroups with more robust regex.
  • Gridsearch support for mlxtend.sklearn.EnsembleClassifier

Version 0.2.2 (2015-03-01)

  • Improved robustness of EnsembleClassifier.
  • Extended plot_decision_regions() functionality for plotting 1D decision boundaries.
  • Function matplotlib.plot_decision_regions was moved to evaluate.plot_decision_regions.
  • evaluate.plot_learning_curves() function added.
  • Added Rosenblatt, gradient descent, and stochastic gradient descent perceptrons.

Version 0.2.1 (2015-01-20)

  • Added mlxtend.pandas.minmax_scaling - a function to rescale pandas DataFrame columns.
  • Slight update to the EnsembleClassifier interface (additional voting parameter)
  • Fixed EnsembleClassifier to return correct class labels if class labels are not integers from 0 to n.
  • Added new matplotlib function to plot decision regions of classifiers.

Version 0.2.0 (2015-01-13)

  • Improved mlxtend.text.generalize_duplcheck to remove duplicates and prevent an endless looping issue.
  • Added recursive search parameter to mlxtend.file_io.find_files.
  • Added check_ext parameter to mlxtend.file_io.find_files to search based on file extensions.
  • Default parameter to ignore invisible files for mlxtend.file_io.find.
  • Added transform and fit_transform to the EnsembleClassifier.
  • Added mlxtend.file_io.find_filegroups function.

Version 0.1.9 (2015-01-10)

  • Implemented scikit-learn EnsembleClassifier (majority voting rule) class.

Version 0.1.8 (2015-01-07)

  • Improvements to mlxtend.text.generalize_names to handle certain Dutch last name prefixes (van, van der, de, etc.).
  • Added mlxtend.text.generalize_name_duplcheck function to apply mlxtend.text.generalize_names function to a pandas DataFrame without creating duplicates.

Version 0.1.7 (2015-01-07)

  • Added text utilities with name generalization function.
  • Added and file_io utilities.

Version 0.1.6 (2015-01-04)

  • Added combinations and permutations estimators.

Version 0.1.5 (2014-12-11)

  • Added DenseTransformer for pipelines and grid search.

Version 0.1.4 (2014-08-20)

  • mean_centering function is now a class that creates MeanCenterer objects that can be used to fit data via the fit method, and center data at the column means via the transform and fit_transform methods.
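
The fit, transform, and fit_transform pattern described above can be sketched as follows. This is a minimal illustration rather than the MeanCenterer source: the class name ColumnMeanCenterer and the attribute name col_means are hypothetical, and the data is passed as a list of rows.

```python
class ColumnMeanCenterer:
    """Minimal sketch of a fit/transform-style column mean centerer."""

    def fit(self, rows):
        """Learn the column means from the given rows."""
        n = len(rows)
        self.col_means = [sum(col) / n for col in zip(*rows)]
        return self

    def transform(self, rows):
        """Subtract the learned column means from each row."""
        return [[x - m for x, m in zip(row, self.col_means)] for row in rows]

    def fit_transform(self, rows):
        """Fit on the rows and return them centered in one step."""
        return self.fit(rows).transform(rows)


# Hypothetical training and test data.
train = [[1.0, 10.0], [3.0, 20.0]]
test = [[2.0, 15.0]]

centerer = ColumnMeanCenterer()
train_centered = centerer.fit_transform(train)
test_centered = centerer.transform(test)
```

Keeping fit and transform separate lets the means learned from training data be reused on new data.
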

Version 0.1.3 (2014-08-19)

  • Added preprocessing module and mean_centering function.

Version 0.1.2 (2014-08-19)

  • Added matplotlib utilities and remove_borders function.

Version 0.1.1 (2014-08-13)

  • Simplified code for ColumnSelector.