Release Notes
The CHANGELOG for the current development version is available at https://github.com/rasbt/mlxtend/blob/master/docs/sources/CHANGELOG.md.
Version 0.23.3 (15 Nov 2024)
Downloads
New Features and Enhancements
Files updated:
- mlxtend.evaluate.time_series.plot_splits
- Improved plot_splits for better visualization of time series splits
Changes
- mlxtend/feature_selection/exhaustive_feature_selector.py: np.inf update to support NumPy 2.0
Version 0.23.2 (5 Nov 2024)
Downloads
New Features and Enhancements
- Implement the FP-Growth and FP-Max algorithms with the possibility of missing values in the input dataset. Added a new metric Representativity for the association rules generated (#1004 via zazass8). Files updated:
  - mlxtend.frequent_patterns.fpcommon
  - mlxtend.frequent_patterns.fpgrowth
  - mlxtend.frequent_patterns.fpmax
- mlxtend/feature_selection/utilities.py
  - Modified the _calc_score function to ensure compatibility with scikit-learn versions 1.4 and above by dynamically selecting between fit_params and params in cross_val_score.
- mlxtend.feature_selection.SequentialFeatureSelector
  - Updated negative infinity constant to be compatible with old and new (>=2.0) numpy versions
- mlxtend.frequent_patterns.association_rules
  - Implemented three new metrics: Jaccard, Certainty, and Kulczynski. (#1096)
- Integrated scikit-learn's set_output method into TransactionEncoder (#1087 via it176131)
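The three rule metrics named above can be sketched from their usual textbook definitions. This is a minimal illustration, assuming the supports of the antecedent, the consequent, and their union are already known; the function and argument names are hypothetical and this is not the library's own code.

```python
def rule_metrics(support_a, support_c, support_ac):
    """Compute three rule metrics from itemset supports (all in [0, 1]).

    support_a  : support of the antecedent A
    support_c  : support of the consequent C
    support_ac : support of A and C occurring together
    """
    conf_a_to_c = support_ac / support_a  # confidence(A -> C)
    conf_c_to_a = support_ac / support_c  # confidence(C -> A)

    jaccard = support_ac / (support_a + support_c - support_ac)
    # Certainty is undefined when the consequent appears in every transaction.
    if support_c < 1.0:
        certainty = (conf_a_to_c - support_c) / (1.0 - support_c)
    else:
        certainty = float("nan")
    kulczynski = 0.5 * (conf_a_to_c + conf_c_to_a)
    return {"jaccard": jaccard, "certainty": certainty, "kulczynski": kulczynski}


metrics = rule_metrics(support_a=0.5, support_c=0.4, support_ac=0.3)
print(metrics)
```

Jaccard and Kulczynski are symmetric in A and C, while Certainty depends on the direction of the rule.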
Changes
- mlxtend.frequent_patterns.fpcommon: Added the null_values parameter in the valid_input_check signature to check in case the input also includes null values. Changed the return statements and function signatures for setup_fptree and generated_itemsets, respectively, to return the disabled array created and to include it as a parameter. Added code in mlxtend.frequent_patterns.fpcommon and mlxtend.frequent_patterns.association_rules to implement the algorithms in case null values exist when null_values is True.
- mlxtend.frequent_patterns.association_rules: Added optional parameter 'return_metrics' to only return a given list of metrics, rather than every possible metric.
- Add n_classes_ attribute to stacking classifiers for compatibility with scikit-learn 1.3 (#1091)
- Use Scipy's instead of NumPy's decompositions in PCA for improved accuracy in edge cases (#1080 via fkdosilovic)
Version 0.23.1 (5 Jan 2024)
Downloads
Changes
Version 0.23.0 (22 Sep 2023)
Downloads
Changes
- Address NumPy deprecations to make mlxtend compatible with NumPy 1.24
- Changed the signature of the LinearRegression model of sklearn in the test, removing the normalize parameter as it is deprecated. (#1036)
- Add pyproject.toml to support PEP 518 builds (#1065 via jmahlik)
- Fixed installation from sdist failing (#1065 via jmahlik)
- Converted configuration to pyproject.toml (#1065 via jmahlik)
- Remove mlxtend.image submodule with face recognition functions due to poor dlib support in modern environments.
New Features and Enhancements
- Document how to use SequentialFeatureSelector and multiclass ROC AUC.
Version 0.22.0 (4 April 2023)
Downloads
Changes
- When ExhaustiveFeatureSelector is run with n_jobs == 1, joblib is now disabled, which enables more immediate (live) feedback when the verbose mode is enabled. (#985 via Nima Sarajpoor)
- Disabled unnecessary warning in EnsembleVoteClassifier (#941)
- Fixed various documentation issues (#849 and #951 via Lekshmanan Natarajan)
- Fixed "Edit on GitHub" button (#1024)
New Features and Enhancements
- The mlxtend.frequent_patterns.association_rules function has a new metric, Zhang's metric, which measures both association and dissociation. (#980)
- Internal mlxtend.frequent_patterns.fpmax code improvement that avoids casting a sparse DataFrame into a dense NumPy array. (#1000 via Tim Kellogg)
- The plot_decision_regions function now has an n_jobs parameter to parallelize the computation. In a particular use case, on a small dataset, there was a 21x speed-up (449 seconds vs 21 seconds on a local HPC instance with 36 cores). (#998 via Khalid ElHaj)
- Added mlxtend.frequent_patterns.hmine algorithm and documentation for mining frequent itemsets using the H-Mine algorithm. (#1020 via Fatih Sen)
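As a rough illustration of how Zhang's metric captures both association and dissociation, the sketch below computes it from itemset supports using the commonly cited formula. The function name is hypothetical, and returning 0 for a zero denominator is an assumption of this sketch rather than a statement about the library.

```python
def zhangs_metric(support_a, support_c, support_ac):
    """Zhang's metric in [-1, 1]: positive = association, negative = dissociation."""
    numerator = support_ac - support_a * support_c
    denominator = max(
        support_ac * (1.0 - support_a),
        support_a * (support_c - support_ac),
    )
    # Assumption of this sketch: treat a zero denominator as "no association".
    return 0.0 if denominator == 0 else numerator / denominator


associated = zhangs_metric(support_a=0.5, support_c=0.4, support_ac=0.3)
dissociated = zhangs_metric(support_a=0.5, support_c=0.5, support_ac=0.0)
print(associated, dissociated)
```

Two itemsets that never occur together reach the lower bound of -1, which metrics such as confidence cannot express.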
Version 0.21.0 (09/17/2022)
Downloads
New Features and Enhancements
- The mlxtend.evaluate.feature_importance_permutation function has a new feature_groups argument to treat user-specified feature groups as single features, which is useful for one-hot encoded features. (#955)
- The mlxtend.feature_selection.ExhaustiveFeatureSelector and SequentialFeatureSelector also gained support for feature_groups with a behavior similar to the one described above. (#957 and #965 via Nima Sarajpoor)
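The idea behind treating a feature group as a single feature can be sketched in plain Python: all columns of a group are shuffled with the same row permutation, so a one-hot encoded block stays internally consistent. Everything below (function names, the toy model, the data) is hypothetical and only illustrates the concept.

```python
import random


def accuracy(model, X, y):
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def grouped_permutation_importance(model, X, y, feature_groups, seed=0):
    """Importance of a group = baseline accuracy minus accuracy after
    shuffling all columns of that group with one shared row permutation."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for group in feature_groups:
        order = list(range(len(X)))
        rng.shuffle(order)
        X_perm = [list(row) for row in X]
        for target_idx, source_idx in enumerate(order):
            for col in group:
                X_perm[target_idx][col] = X[source_idx][col]
        importances.append(baseline - accuracy(model, X_perm, y))
    return importances


# Toy data: columns 0-2 one-hot encode a categorical feature, column 3 is unrelated.
y = [i % 3 for i in range(30)]
X = [[int(label == k) for k in range(3)] + [i % 2] for i, label in enumerate(y)]


def model(row):
    # Hypothetical stand-in classifier: predicts the index of the active one-hot column.
    return row[:3].index(1)


importances = grouped_permutation_importance(model, X, y, [[0, 1, 2], [3]])
print(importances)
```

Shuffling the three one-hot columns independently could produce rows with no active column or several, which is the situation grouped permutation avoids.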
Changes
- The custom_feature_names parameter was removed from the ExhaustiveFeatureSelector due to redundancy and to simplify the code base. The ExhaustiveFeatureSelector documentation illustrates how the same behavior and outcome can be achieved using pandas DataFrames. (#957)
Bug Fixes
- None
Version 0.20.0 (05/26/2022)
Downloads
New Features and Enhancements
- The mlxtend.evaluate.bootstrap_point632_score now supports fit_params. (#861)
- The mlxtend/plotting/decision_regions.py function now has a contourf_kwargs for matplotlib to change the look of the decision boundaries if desired. (#881 via pbloem)
- Add a norm_colormap parameter to mlxtend.plotting.plot_confusion_matrix, to allow normalizing the colormap, e.g., using matplotlib.colors.LogNorm() (#895)
- Add new GroupTimeSeriesSplit class for evaluation in time series tasks with support of custom groups and additional parameters in comparison with scikit-learn's TimeSeriesSplit. (#915 via Dmitry Labazkin)
Changes
- Due to compatibility issues with newer package versions, certain functions from six.py have been removed, so mlxtend may no longer work with Python 2.7.
- As an internal change to speed up unit testing, unit testing is now facilitated by GitHub workflows, and Travis CI and Appveyor hooks have been removed.
- Improved axis label rotation in mlxtend.plotting.heatmap and mlxtend.plotting.plot_confusion_matrix (#872)
- Fix various typos in McNemar guides.
- Raises a warning if non-bool arrays are used in the frequent pattern functions apriori, fpmax, and fpgrowth. (#934 via NimaSarajpoor)
Bug Fixes
- Fix unreadable labels in heatmap for certain colormaps. (#852)
- Fix an issue in mlxtend.plotting.plot_confusion_matrix when string class names are passed (#894)
Version 0.19.0 (2021-09-02)
Downloads
New Features
- Adds a second "balanced accuracy" interpretation ("balanced") to evaluate.accuracy_score in addition to the existing "average" option to compute the scikit-learn-style balanced accuracy. (#764)
- Adds new scatter_hist function to mlxtend.plotting for generating a scattered histogram. (#757 via Maitreyee Mhasaka)
- The evaluate.permutation_test function now accepts a paired argument to support paired permutation/randomization tests. (#768)
- The StackingCVRegressor now also supports multi-dimensional targets similar to StackingRegressor via StackingCVRegressor(..., multi_output=True). (#802 via Marco Tiraboschi)
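The difference between the two "balanced accuracy" interpretations mentioned above can be sketched as follows. This assumes that "average" means the mean of one-vs-rest binary accuracies and "balanced" means the mean of per-class recall; the helper names are hypothetical and the code is an illustration, not the library implementation.

```python
def average_per_class_accuracy(y_true, y_pred):
    """'average': mean of one-vs-rest binary accuracies, one per class."""
    classes = sorted(set(y_true))
    scores = []
    for c in classes:
        hits = sum((t == c) == (p == c) for t, p in zip(y_true, y_pred))
        scores.append(hits / len(y_true))
    return sum(scores) / len(scores)


def balanced_accuracy(y_true, y_pred):
    """'balanced': mean of per-class recall (scikit-learn style)."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        members = [p for t, p in zip(y_true, y_pred) if t == c]
        recalls.append(sum(p == c for p in members) / len(members))
    return sum(recalls) / len(recalls)


y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 1, 1, 2]
avg = average_per_class_accuracy(y_true, y_pred)
bal = balanced_accuracy(y_true, y_pred)
print(avg, bal)
```

On this toy input the two interpretations give different values (8/9 versus 11/12), which is why the option name matters when reporting results.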
Changes
- Updates unit tests for scikit-learn 0.24.1 compatibility. (#774)
- StackingRegressor now requires setting StackingRegressor(..., multi_output=True) if the target is multi-dimensional; this allows for better input validation. (#802)
- Removes deprecated res argument from plot_decision_regions. (#803)
- Adds a title_fontsize parameter to plot_learning_curves for controlling the title font size; also the plot style is now the matplotlib default. (#818)
- Internal change using 'c': 'none' instead of 'c': '' in mlxtend.plotting.plot_decision_regions's scatterplot highlights to stay compatible with Matplotlib 3.4 and newer. (#822)
- Adds a fontcolor_threshold parameter to the mlxtend.plotting.plot_confusion_matrix function as an additional option for determining the font color cut-off manually. (#827)
- The frequent_patterns.association_rules now raises a ValueError if an empty frequent itemset DataFrame is passed. (#843)
- The .632 and .632+ bootstrap methods implemented in the mlxtend.evaluate.bootstrap_point632_score function now use the whole training set for the resubstitution weighting term instead of the internal training set that is a new bootstrap sample in each round. (#844)
Bug Fixes
- Fixes a typo in the SequentialFeatureSelector documentation (#835 via João Pedro Zanlorensi Cardoso)
Version 0.18.0 (2020-11-25)
Downloads
New Features
- The bias_variance_decomp function now supports optional fit_params for the estimators that are fit on bootstrap samples. (#748)
- The bias_variance_decomp function now supports Keras estimators. (#725 via @hanzigs)
- Adds new mlxtend.classifier.OneRClassifier (One Rule Classifier) class, a simple rule-based classifier that is often used as a performance baseline or simple interpretable model. (#726)
- Adds new create_counterfactual method for creating counterfactuals to explain model predictions. (#740)
Changes
- permutation_test (mlxtend.evaluate.permutation) is corrected to give the proportion of permutations whose statistic is at least as extreme as the one observed. (#721 via Florian Charlier)
- Fixes the McNemar confusion matrix layout to match the convention (and documentation), swapping the upper left and lower right cells. (#744 via mmarius)
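The "at least as extreme" rule for the permutation p-value can be illustrated with a small exact test. This is a sketch using the absolute difference of means as the test statistic; the function name and the floating-point tolerance are assumptions of the example, not the library's code.

```python
from itertools import combinations


def exact_permutation_pvalue(x, y):
    """Two-sided exact permutation test on the absolute difference of means."""

    def stat(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    pooled = list(x) + list(y)
    observed = stat(x, y)
    at_least_as_extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), len(x)):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # ">=" rather than ">" so that ties with the observed statistic count.
        if stat(a, b) >= observed - 1e-12:
            at_least_as_extreme += 1
        total += 1
    return at_least_as_extreme / total


p_value = exact_permutation_pvalue([1, 2, 3], [4, 5, 6])
print(p_value)
```

Of the 20 possible relabelings, only the observed split and its mirror image reach the observed difference, so the p-value is 2/20 and can never be exactly zero.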
Bug Fixes
- The loss in LogisticRegression for logging purposes didn't include the L2 penalty for the first weight in the weight vector (this is not the bias unit). However, since this loss function was only used for logging purposes, and the gradient remains correct, this does not have an effect on the main code. (#741)
- Fixes a bug in bias_variance_decomp where, when the mse loss was used, downcasting to integers caused imprecise results for small numbers. (#749)
Version 0.17.3 (2020-07-27)
Downloads
New Features
- Add predict_proba kwarg to bootstrap methods, to allow bootstrapping of scoring functions that take in probability values. (#700 via Adam Li)
- Add a cell_values parameter to mlxtend.plotting.heatmap() to optionally suppress cell annotations by setting cell_values=False. (#703)
Changes
- Implemented both use_clones and fit_base_estimators (previously refit in EnsembleVoteClassifier) for EnsembleVoteClassifier and StackingClassifier. (#670 via Katrina Ni)
- Switched to using raw strings for regex in mlxtend.text to prevent deprecation warning in Python 3.8 (#688)
- Slice data in sequential forward selection before sending to parallel backend, reducing memory consumption.
Bug Fixes
- Fixes axis DeprecationWarning in matplotlib v3.1.0 and newer. (#673)
- Fixes an issue with using meshgrid in the no_information_rate function used by the bootstrap_point632_score function for the .632+ estimate. (#688)
- Fixes an issue in fpmax that could lead to incorrect support values. (#692 via Steve Harenberg)
Version 0.17.2 (2020-02-24)
Downloads
New Features
- None
Changes
- The previously deprecated OnehotTransactions has been removed in favor of the TransactionEncoder.
- Removed SparseDataFrame support in frequent pattern mining functions in favor of pandas >=1.0's new way of working with sparse data. If you used SparseDataFrame formats, please see pandas' migration guide at https://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating. (#667)
- The plot_confusion_matrix.py now also accepts a matplotlib figure and axis as input to which the confusion matrix plot can be added. (#671 via Vahid Mirjalili)
Bug Fixes
- None
Version 0.17.1 (2020-01-28)
Downloads
New Features
- The SequentialFeatureSelector now supports using pre-specified feature sets via the fixed_features parameter. (#578)
- Adds a new accuracy_score function to mlxtend.evaluate for computing basic classification accuracy, per-class accuracy, and average per-class accuracy. (#624 via Deepan Das)
- StackingClassifier and StackingCVClassifier now have a decision_function method, which serves as a preferred choice over predict_proba in calculating roc_auc and average_precision scores when the meta estimator is a linear model or support vector classifier. (#634 via Qiang Gu)
Changes
- Improve the runtime performance for the apriori frequent itemset generating function when low_memory=True. Setting low_memory=False (default) is still faster for small itemsets, but low_memory=True can be much faster for large itemsets and requires less memory. Also, input validation for apriori, fpgrowth and fpmax takes a significant amount of time when the input pandas DataFrame is large; this is now dramatically reduced when the input contains boolean values (and not zeros/ones), which is the case when using TransactionEncoder. (#619 via Denis Barbier)
- Add support for newer sparse pandas DataFrame for frequent itemset algorithms. Also, input validation for apriori, fpgrowth and fpmax runs much faster on sparse DataFrame when the input pandas DataFrame contains integer values. (#621 via Denis Barbier)
- Let fpgrowth and fpmax work directly on sparse DataFrames; they were previously converted into dense NumPy arrays. (#622 via Denis Barbier)
Bug Fixes
- Fixes a bug in mlxtend.plotting.plot_pca_correlation_graph that caused the explained variances not summing up to 1. Also, improves the runtime performance of the correlation computation and adds a missing function argument for the explained variances (eigenvalues) if users provide their own principal components. (#593 via Gabriel Azevedo Ferreira)
- Behavior of fpgrowth and apriori is now consistent for edge cases such as min_support=0. (#573 via Steve Harenberg)
- fpmax returns an empty data frame now instead of raising an error if the frequent itemset set is empty. (#573 via Steve Harenberg)
- Fixes an issue in mlxtend.plotting.plot_confusion_matrix, where the font-color choice for medium-dark cells was not ideal and hard to read. (#588 via sohrabtowfighi)
- The svd mode of mlxtend.feature_extraction.PrincipalComponentAnalysis now also uses n-1 degrees of freedom instead of n d.o.f. when computing the eigenvalues to match the behavior of eigen. #595
- Disable input validation for StackingCVClassifier because it causes issues if pipelines are used as input. #606
Version 0.17.0 (2019-07-19)
Downloads
New Features
- Added an enhancement to the existing iris_data() such that both the UCI Repository version of the Iris dataset as well as the corrected, original version of the dataset can be loaded, which has a slight difference in two data points (consistent with Fisher's paper; this is also the same as in R). (#539 via janismdhanbad)
- Added optional groups parameter to SequentialFeatureSelector and ExhaustiveFeatureSelector fit() methods for forwarding to sklearn CV (#537 via arc12)
- Added a new plot_pca_correlation_graph function to the mlxtend.plotting submodule for plotting a PCA correlation graph. (#544 via Gabriel-Azevedo-Ferreira)
- Added a zoom_factor parameter to the mlxtend.plotting.plot_decision_regions function that allows users to zoom in and out of the decision region plots. (#545)
- Added a function fpgrowth that implements the FP-Growth algorithm for mining frequent itemsets as a drop-in replacement for the existing apriori algorithm. (#550 via Steve Harenberg)
- New heatmap function in mlxtend.plotting. (#552)
- Added a function fpmax that implements the FP-Max algorithm for mining maximal itemsets as a drop-in replacement for the fpgrowth algorithm. (#553 via Steve Harenberg)
- New figsize parameter for the plot_decision_regions function in mlxtend.plotting. (#555 via Mirza Hasanbasic)
- New low_memory option for the apriori frequent itemset generating function. Setting low_memory=False (default) uses a substantially optimized version of the algorithm that is 3-6x faster than the original implementation (low_memory=True). (#567 via jmayse)
- Added numerically stable OLS methods which use QR decomposition and Singular Value Decomposition (SVD) to LinearRegression in mlxtend.regressor.linear_regression. (#575 via PuneetGrov3r)
Changes
- Now uses the latest joblib library under the hood for multiprocessing instead of sklearn.externals.joblib. (#547)
- Changes to StackingCVClassifier and StackingCVRegressor such that first-level models are allowed to generate output of non-numeric type. (#562)
Bug Fixes
- Fixed documentation of iris_data() under iris.py by adding a note about differences in the iris data in R and UCI machine learning repo.
- Make sure that if the 'svd' mode is used in PCA, the number of eigenvalues is the same as when using 'eigen' (appends zeros in that case) (#565)
Version 0.16.0 (2019-05-12)
Downloads
New Features
- StackingCVClassifier and StackingCVRegressor now support a random_state parameter, which, together with shuffle, controls the randomness in the cv splitting. (#523 via Qiang Gu)
- StackingCVClassifier and StackingCVRegressor now have a new drop_last_proba parameter. If True, it drops the last "probability" column in the feature set because it is redundant: p(y_c) = 1 - (p(y_1) + p(y_2) + ... + p(y_{c-1})). This can be useful for meta-classifiers that are sensitive to perfectly collinear features. (#532)
- Other stacking estimators, including StackingClassifier, StackingCVClassifier and StackingRegressor, support grid search over the regressors and even a single base regressor. (#522 via Qiang Gu)
- Adds multiprocessing support to StackingCVClassifier. (#522 via Qiang Gu)
- Adds multiprocessing support to StackingCVRegressor. (#512 via Qiang Gu)
- Now, the StackingCVRegressor also enables grid search over the regressors and even a single base regressor. When there are level-mixed parameters, GridSearchCV will try to replace hyperparameters in a top-down order (see the documentation for example details). (#515 via Qiang Gu)
- Adds a verbose parameter to apriori to show the current iteration number as well as the itemset size currently being sampled. (#519)
- Adds an optional class_name parameter to the confusion matrix function to display class names on the axis as tick marks. (#487 via sandpiturtle)
- Adds a pca.e_vals_normalized_ attribute to PCA for storing the eigenvalues also in normalized form; this is commonly referred to as variance explained ratios. #545
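To see why the last probability column is redundant when drop_last_proba is enabled, the following sketch drops it and then recovers it from the remaining columns. The helper names are hypothetical, and the example assumes each row of class probabilities sums to 1.

```python
def drop_last_proba(probas):
    """Remove the last class-probability column from each row."""
    return [row[:-1] for row in probas]


def restore_last_proba(reduced):
    """Recover the dropped column: p(y_c) = 1 - (p(y_1) + ... + p(y_{c-1}))."""
    return [row + [1.0 - sum(row)] for row in reduced]


probas = [[0.2, 0.5, 0.3], [0.7, 0.1, 0.2]]
reduced = drop_last_proba(probas)
restored = restore_last_proba(reduced)
print(reduced)
print(restored)
```

Because the dropped column is an exact linear function of the others, keeping it adds no information and makes the meta-features perfectly collinear.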
Changes
- Due to new features, restructuring, and better scikit-learn support (for GridSearchCV, etc.) the StackingCVRegressor's meta regressor is now being accessed via 'meta_regressor__*' in the parameter grid. E.g., if a RandomForestRegressor as meta-regressor was previously tuned via 'randomforestregressor__n_estimators', this has now changed to 'meta_regressor__n_estimators'. (#515 via Qiang Gu)
- The same change mentioned above is now applied to other stacking estimators, including StackingClassifier, StackingCVClassifier and StackingRegressor. (#522 via Qiang Gu)
- Automatically performs mean centering for PCA solver 'SVD' such that using SVD is always equal to using the covariance matrix approach #545
Bug Fixes
- The feature_selection.ColumnSelector now also supports column names of type int (in addition to str names) if the input is a pandas DataFrame. (#500 via tetrar124)
- Fix unreadable labels in plot_confusion_matrix for imbalanced datasets if show_absolute=True and show_normed=True. (#504)
- Raises a more informative error if a SparseDataFrame is passed to apriori and the dataframe has integer column names that don't start with 0 due to current limitations of the SparseDataFrame implementation in pandas. (#503)
- SequentialFeatureSelector now supports DataFrame as input for all operating modes (forward/backward/floating). #506
- mlxtend.evaluate.feature_importance_permutation now correctly accepts scoring functions with proper function signature as metric argument. #528
Version 0.15.0 (2019-01-19)
Downloads
New Features
- Adds a new transformer class to mlxtend.image, EyepadAlign, that aligns face images based on the location of the eyes. (#466 by Vahid Mirjalili)
- Adds a new function, mlxtend.evaluate.bias_variance_decomp, that decomposes the loss of a regressor or classifier into bias and variance terms. (#470)
- Adds a whitening parameter to PrincipalComponentAnalysis, to optionally whiten the transformed data such that the features have unit variance. (#475)
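For the squared-error case, a bias-variance decomposition of the kind described above can be sketched from a matrix of predictions collected over bootstrap rounds. The function name and input layout are assumptions of this sketch; it only illustrates the identity that average loss equals average squared bias plus average variance.

```python
def bias_variance_mse(all_preds, y_true):
    """all_preds[r][i] is the prediction for test sample i in bootstrap round r."""
    n_rounds, n_samples = len(all_preds), len(y_true)
    count = n_rounds * n_samples

    # "Main" prediction per test sample: the mean over all rounds.
    main_preds = [
        sum(all_preds[r][i] for r in range(n_rounds)) / n_rounds
        for i in range(n_samples)
    ]
    avg_loss = sum(
        (all_preds[r][i] - y_true[i]) ** 2
        for r in range(n_rounds) for i in range(n_samples)
    ) / count
    avg_bias = sum(
        (main_preds[i] - y_true[i]) ** 2 for i in range(n_samples)
    ) / n_samples
    avg_var = sum(
        (all_preds[r][i] - main_preds[i]) ** 2
        for r in range(n_rounds) for i in range(n_samples)
    ) / count
    return avg_loss, avg_bias, avg_var


all_preds = [[1.0, 2.0], [3.0, 2.0]]  # two rounds, two test samples
y_true = [2.0, 1.0]
loss, bias, var = bias_variance_mse(all_preds, y_true)
print(loss, bias, var)
```

In this toy case the first sample contributes only variance and the second only bias, and the two terms add up to the average loss.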
Changes
- Changed the default solver in PrincipalComponentAnalysis to 'svd' instead of 'eigen' to improve numerical stability. (#474)
- The mlxtend.image.extract_face_landmarks now returns None if no facial landmarks were detected instead of an array of all zeros. (#466)
Bug Fixes
- The eigenvectors may not have been sorted in certain edge cases if the solver was 'eigen' in PrincipalComponentAnalysis and LinearDiscriminantAnalysis. (#477, #478)
Version 0.14.0 (2018-11-09)
Downloads
New Features
- Added a scatterplotmatrix function to the plotting module. (#437)
- Added sample_weight option to StackingRegressor, StackingClassifier, StackingCVRegressor, StackingCVClassifier, EnsembleVoteClassifier. (#438)
- Added a RandomHoldoutSplit class to perform a random train/valid split without rotation in SequentialFeatureSelector, scikit-learn GridSearchCV etc. (#442)
- Added a PredefinedHoldoutSplit class to perform a train/valid split, based on user-specified indices, without rotation in SequentialFeatureSelector, scikit-learn GridSearchCV etc. (#443)
- Created a new mlxtend.image submodule for working on image processing-related tasks. (#457)
- Added a new convenience function extract_face_landmarks based on dlib to mlxtend.image. (#458)
- Added a method='oob' option to the mlxtend.evaluate.bootstrap_point632_score method to compute the classic out-of-bag bootstrap estimate (#459)
- Added a method='.632+' option to the mlxtend.evaluate.bootstrap_point632_score method to compute the .632+ bootstrap estimate that addresses the optimism bias of the .632 bootstrap (#459)
- Added a new mlxtend.evaluate.ftest function to perform an F-test for comparing the accuracies of two or more classification models. (#460)
- Added a new mlxtend.evaluate.combined_ftest_5x2cv function to perform a combined 5x2cv F-Test for comparing the performance of two models. (#461)
- Added a new mlxtend.evaluate.difference_proportions test for comparing two proportions (e.g., classifier accuracies) (#462)
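How the out-of-bag, .632, and .632+ estimates relate can be sketched for a single bootstrap round. This is a simplified reading of the .632+ weighting (relative overfitting rate clipped to [0, 1]); the function name and arguments are hypothetical, and the no-information rate is passed in rather than computed.

```python
def bootstrap_632_estimates(acc_train, acc_oob, no_info_rate):
    """Combine resubstitution and out-of-bag accuracy for one bootstrap round.

    no_info_rate (gamma) is the error rate expected if features and labels
    were independent; it is only needed for the .632+ variant.
    """
    err_train, err_oob = 1.0 - acc_train, 1.0 - acc_oob

    # Plain .632 bootstrap: fixed weights on out-of-bag and training accuracy.
    acc_632 = 0.632 * acc_oob + 0.368 * acc_train

    # Relative overfitting rate R in [0, 1]; 0 means "no overfitting detected".
    if no_info_rate > err_train and err_oob > err_train:
        r = min(1.0, (err_oob - err_train) / (no_info_rate - err_train))
    else:
        r = 0.0
    weight = 0.632 / (1.0 - 0.368 * r)
    acc_632_plus = 1.0 - ((1.0 - weight) * err_train + weight * err_oob)
    return {"oob": acc_oob, ".632": acc_632, ".632+": acc_632_plus}


overfit = bootstrap_632_estimates(acc_train=1.0, acc_oob=0.8, no_info_rate=0.5)
no_overfit = bootstrap_632_estimates(acc_train=0.8, acc_oob=0.8, no_info_rate=0.5)
print(overfit)
print(no_overfit)
```

When the training accuracy is perfect, the .632+ weight shifts toward the out-of-bag estimate, so the result is less optimistic than the plain .632 value; without overfitting the two coincide.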
Changes
- Addressed deprecation warnings in NumPy 1.15. (#425)
- Because of complications in PR (#459), Python 2.7 support was dropped; since official support for Python 2.7 by the Python Software Foundation is ending in approx. 12 months anyway, this refocusing will hopefully free up some developer time by removing the need to worry about backward compatibility.
Bug Fixes
- Fixed an issue with a missing import in mlxtend.plotting.plot_confusion_matrix. (#428)
Version 0.13.0 (2018-07-20)
Downloads
New Features
- A meaningful error message is now raised when a cross-validation generator is used with SequentialFeatureSelector. (#377)
- The SequentialFeatureSelector now accepts custom feature names via the fit method for more interpretable feature subset reports. (#379)
- The SequentialFeatureSelector is now also compatible with Pandas DataFrames and uses DataFrame column-names for more interpretable feature subset reports. (#379)
- ColumnSelector now works with Pandas DataFrames columns. (#378 by Manuel Garrido)
- The ExhaustiveFeatureSelector estimator in mlxtend.feature_selection now is safely stoppable mid-process by control+c. (#380)
- Two new functions, vectorspace_orthonormalization and vectorspace_dimensionality, were added to mlxtend.math to use the Gram-Schmidt process to convert a set of linearly independent vectors into a set of orthonormal basis vectors, and to compute the dimensionality of a vectorspace, respectively. (#382)
- mlxtend.frequent_patterns.apriori now supports pandas SparseDataFrames to generate frequent itemsets. (#404 via Daniel Morales)
- The plot_confusion_matrix function now has the ability to show normalized confusion matrix coefficients in addition to or instead of absolute confusion matrix coefficients with or without a colorbar. The text display method has been changed so that the full range of the colormap is used. The default size is also now set based on the number of classes.
- Added support for merging the meta features with the original input features in StackingRegressor (via use_features_in_secondary) like it is already supported in the other Stacking classes. (#418)
- Added a support_only parameter to the association_rules function, which allows constructing association rules (based on the support metric only) for cropped input DataFrames that don't contain a complete set of antecedent and consequent support values. (#421)
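The Gram-Schmidt process mentioned above can be sketched in a few lines of plain Python. The function below is a hypothetical stand-in, not the mlxtend.math implementation; as an assumption of this sketch, it skips linearly dependent vectors so the length of the result also gives the dimensionality of the span.

```python
import math


def gram_schmidt(vectors, eps=1e-12):
    """Return an orthonormal basis for the span of `vectors` (lists of floats)."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            # Subtract the component of w that points along basis vector b.
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > eps:  # skip vectors that are linearly dependent on the basis
            basis.append([wi / norm for wi in w])
    return basis


basis = gram_schmidt([[2.0, 0.0], [1.0, 3.0], [4.0, 6.0]])
dimensionality = len(basis)
print(basis, dimensionality)
```

The third input vector lies in the span of the first two, so it contributes nothing and the dimensionality is 2.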
Changes
- Itemsets generated with apriori are now frozensets (#393 by William Laney and #394)
- Now raises an error if an input DataFrame to apriori contains non 0, 1, True, False values. (#419)
Bug Fixes
- Allow mlxtend estimators to be cloned via scikit-learn's clone function. (#374)
- Fixes bug to allow the correct use of refit=False in StackingRegressor and StackingCVRegressor (#384 and #385 by selay01)
- Allow StackingClassifier to work with sparse matrices when use_features_in_secondary=True (#408 by Floris Hoogenbook)
- Allow StackingCVRegressor to work with sparse matrices when use_features_in_secondary=True (#416)
- Allow StackingCVClassifier to work with sparse matrices when use_features_in_secondary=True (#417)
Version 0.12.0 (2018-04-21)
Downloads
New Features
- A new feature_importance_permutation function to compute the feature importance in classifiers and regressors via the permutation importance method (#358)
- The fit method of the ExhaustiveFeatureSelector now optionally accepts **fit_params for the estimator that is used for the feature selection. (#354 by Zach Griffith)
- The fit method of the SequentialFeatureSelector now optionally accepts **fit_params for the estimator that is used for the feature selection. (#350 by Zach Griffith)
Changes
- Replaced plot_decision_regions colors by a colorblind-friendly palette and adds contour lines for decision regions. (#348)
- All stacking estimators now raise NonFittedErrors if any method for inference is called prior to fitting the estimator. (#353)
- Renamed the refit parameter of both the StackingClassifier and StackingCVClassifier to use_clones to be more explicit and less misleading. (#368)
Bug Fixes
- Various changes in the documentation and documentation tools to fix formatting issues (#363)
- Fixes a bug where the StackingCVClassifier's meta features were not stored in the original order when shuffle=True (#370)
- Many documentation improvements, including links to the User Guides in the API docs (#371)
Version 0.11.0 (2018-03-14)
Downloads
New Features
- New function implementing the resampled paired t-test procedure (paired_ttest_resampled) to compare the performance of two models. (#323)
- New function implementing the k-fold paired t-test procedure (paired_ttest_kfold_cv) to compare the performance of two models (also called k-hold-out paired t-test). (#324)
- New function implementing the 5x2cv paired t-test procedure (paired_ttest_5x2cv) proposed by Dietterich (1998) to compare the performance of two models. (#325)
- A refit parameter was added to stacking classes (similar to the refit parameter in the EnsembleVoteClassifier), to support classifiers and regressors that follow the scikit-learn API but are not compatible with scikit-learn's clone function. (#322)
- The ColumnSelector now has a drop_axis argument to use it in pipelines with CountVectorizers. (#333)
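The test statistic of the 5x2cv paired t-test can be sketched from the ten per-fold score differences between two models. The function name and input layout are hypothetical; the example follows the published definition of the statistic and leaves the p-value lookup (t-distribution with 5 degrees of freedom) aside.

```python
import math


def ttest_5x2cv_statistic(score_diffs):
    """score_diffs: five (d1, d2) pairs, one pair per 2-fold replication,
    where d1 and d2 are the score differences between the two models."""
    variances = []
    for d1, d2 in score_diffs:
        mean = (d1 + d2) / 2.0
        variances.append((d1 - mean) ** 2 + (d2 - mean) ** 2)
    # Numerator: the difference from the first fold of the first replication.
    first_diff = score_diffs[0][0]
    return first_diff / math.sqrt(sum(variances) / len(variances))


t_stat = ttest_5x2cv_statistic([(0.01, 0.03)] * 5)
print(t_stat)
```

Only a single difference appears in the numerator, while all five replications contribute to the variance estimate in the denominator.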
Changes
- Raises an informative error message if predict or predict_meta_features is called prior to calling the fit method in StackingRegressor and StackingCVRegressor. (#315)
- The plot_decision_regions function now automatically determines the optimal setting based on the feature dimensions and supports anti-aliasing. The old res parameter has been deprecated. (#309 by Guillaume Poirier-Morency)
- Apriori code is faster due to optimization in onehot transformation and the amount of candidates generated by the apriori algorithm. (#327 by Jakub Smid)
- The OnehotTransactions class (which is often used in combination with the apriori function for association rule mining) is now more memory efficient as it uses boolean arrays instead of integer arrays. In addition, the OnehotTransactions class can now be provided with a sparse argument to generate sparse representations of the onehot matrix to further improve memory efficiency. (#328 by Jakub Smid)
- The OnehotTransactions has been deprecated and replaced by the TransactionEncoder. (#332)
- The plot_decision_regions function now has three new parameters, scatter_kwargs, contourf_kwargs, and scatter_highlight_kwargs, that can be used to modify the plotting style. (#342 by James Bourbeau)
Bug Fixes
- Fixed issue when class labels were provided to the EnsembleVoteClassifier when refit was set to false. (#322)
- Allow arrays with 16-bit and 32-bit precision in plot_decision_regions function. (#337)
- Fixed bug that raised an indexing error if the number of items was <= 1 when computing association rules using the conviction metric. (#340)
Version 0.10.0 (2017-12-22)
Downloads
New Features
- New store_train_meta_features parameter for fit in StackingCVRegressor. If True, train meta-features are stored in self.train_meta_features_. New pred_meta_features method for StackingCVRegressor. People can get test meta-features using this method. (#294 via takashioya)
- The new store_train_meta_features attribute and pred_meta_features method for the StackingCVRegressor were also added to the StackingRegressor, StackingClassifier, and StackingCVClassifier (#299 & #300)
- New function (evaluate.mcnemar_tables) for creating multiple 2x2 contingency tables from model prediction arrays that can be used in multiple McNemar (post-hoc) tests or Cochran's Q or F tests, etc. (#307)
- New function (evaluate.cochrans_q) for performing Cochran's Q test to compare the accuracy of multiple classifiers. (#310)
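A 2x2 contingency table of the kind used by McNemar's test can be sketched as below. The cell layout (rows for model 1, columns for model 2, "correct" first) is an assumption of this example, and the function names are hypothetical rather than the evaluate.mcnemar_tables API.

```python
def mcnemar_table(y_true, y_model1, y_model2):
    """2x2 table: rows = model 1 (correct, wrong), columns = model 2 (correct, wrong)."""
    table = [[0, 0], [0, 0]]
    for t, p1, p2 in zip(y_true, y_model1, y_model2):
        row = 0 if p1 == t else 1
        col = 0 if p2 == t else 1
        table[row][col] += 1
    return table


def mcnemar_chi2(table):
    """Continuity-corrected McNemar statistic from the two discordant cells.
    Assumes at least one discordant pair (b + c > 0)."""
    b, c = table[0][1], table[1][0]
    return (abs(b - c) - 1) ** 2 / (b + c)


y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_model1 = [0, 1, 0, 0, 0, 1, 1, 0, 1, 1]
y_model2 = [0, 0, 1, 1, 0, 1, 1, 0, 0, 0]
table = mcnemar_table(y_true, y_model1, y_model2)
chi2 = mcnemar_chi2(table)
print(table, chi2)
```

Only the off-diagonal cells, where exactly one model is correct, enter the test statistic; the cases both models get right or wrong do not affect it.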
Changes
- Added requirements.txt to setup.py. (#304 via Colin Carrol)
Bug Fixes
- Improved numerical stability for p-values computed via the exact McNemar test (#306)
- nose is not required to use the library (#302)
Version 0.9.1 (2017-11-19)
Downloads
New Features
- Added mlxtend.evaluate.bootstrap_point632_score to evaluate the performance of estimators using the .632 bootstrap. (#283)
- New max_len parameter for the frequent itemset generation via the apriori function to allow for early stopping. (#270)
Changes
- All feature index tuples in SequentialFeatureSelector are now in sorted order. (#262)
- The SequentialFeatureSelector now runs the continuation of the floating inclusion/exclusion as described in Novovicova & Kittler (1994). Note that this didn't cause any difference in performance on any of the test scenarios but could lead to better performance in certain edge cases. (#262)
- utils.Counter now accepts a name variable to help distinguish between multiple counters, time precision can be set with the 'precision' kwarg and the new attribute end_time holds the time the last iteration completed. (#278 via Mathew Savage)
Bug Fixes
- Fixed a deprecation error that occurred with the McNemar test when using SciPy 1.0. (#283)
Version 0.9.0 (2017-10-21)
Downloads
New Features
- Added evaluate.permutation_test, a permutation test for hypothesis testing (or A/B testing) to test if two samples come from the same distribution. Or in other words, a procedure to test the null hypothesis that two groups are not significantly different (e.g., a treatment and a control group). (#250)
- Added 'leverage' and 'conviction' as evaluation metrics to the frequent_patterns.association_rules function. (#246 & #247)
- Added a loadings_ attribute to PrincipalComponentAnalysis to compute the factor loadings of the features on the principal components. (#251)
- Allow grid search over classifiers/regressors in ensemble and stacking estimators. (#259)
- New make_multiplexer_dataset function that creates a dataset generated by an n-bit Boolean multiplexer for evaluating supervised learning algorithms. (#263)
- Added a new BootstrapOutOfBag class, an implementation of the out-of-bag bootstrap to evaluate supervised learning algorithms. (#265)
- The parameters for StackingClassifier, StackingCVClassifier, StackingRegressor, StackingCVRegressor, and EnsembleVoteClassifier can now be tuned using scikit-learn's GridSearchCV (#254 via James Bourbeau)
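The leverage and conviction metrics added in this release can be sketched from their standard definitions. The function names are hypothetical, and returning infinity for a rule with confidence 1 is an assumption of this example.

```python
def leverage(support_a, support_c, support_ac):
    """Observed joint support minus the support expected under independence."""
    return support_ac - support_a * support_c


def conviction(support_a, support_c, support_ac):
    """Ratio of expected to observed frequency of the rule making a wrong prediction."""
    confidence = support_ac / support_a
    if confidence == 1.0:
        return float("inf")  # the rule never fails in the data
    return (1.0 - support_c) / (1.0 - confidence)


lev = leverage(support_a=0.5, support_c=0.4, support_ac=0.3)
conv = conviction(support_a=0.5, support_c=0.4, support_ac=0.3)
print(lev, conv)
```

A leverage of 0 and a conviction of 1 both indicate that antecedent and consequent are independent.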
Changes
- The 'support' column returned by frequent_patterns.association_rules was changed to compute the support of "antecedent union consequent", and new 'antecedant support' and 'consequent support' columns were added to avoid ambiguity. (#245)
- Allow the OnehotTransactions to be cloned via scikit-learn's clone function, which is required by e.g., scikit-learn's FeatureUnion or GridSearchCV (via Iaroslav Shcherbatyi). (#249)
Bug Fixes
- Fix issues with self._init_time parameter in _IterativeModel subclasses. (#256)
- Fix imprecision bug that occurred in plot_ecdf when run on Python 2.7. (#264)
- The vectors from SVD in PrincipalComponentAnalysis are now scaled so that solver='eigen' and solver='svd' store eigenvalues that have the same magnitudes. (#251)
Version 0.8.0 (2017-09-09)
Downloads
New Features
- Added mlxtend.evaluate.bootstrap, which implements the ordinary nonparametric bootstrap to bootstrap a single statistic (for example, the mean, median, R^2 of a regression fit, and so forth) #232
- SequentialFeatureSelector's k_features now accepts a string argument "best" or "parsimonious" for more "automated" feature selection. For instance, if "best" is provided, the feature selector will return the feature subset with the best cross-validation performance. If "parsimonious" is provided as an argument, the smallest feature subset that is within one standard error of the cross-validation performance will be selected. #238
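The "best" and "parsimonious" rules described above can be sketched in plain Python. This is only an illustration: the function name select_k and the results layout (subset size mapped to average score and standard error) are hypothetical, and it assumes the one-standard-error band is taken around the best-scoring subset, which the entry does not state explicitly.

```python
def select_k(results, mode="best"):
    """Pick a feature-subset size from cross-validation results.

    results: dict mapping subset size -> (avg_score, std_err); hypothetical layout.
    """
    # "best": the subset size with the highest average CV score.
    best_k = max(results, key=lambda k: results[k][0])
    if mode == "best":
        return best_k
    if mode == "parsimonious":
        best_score, best_se = results[best_k]
        threshold = best_score - best_se
        # Smallest subset whose score is within one standard error of the best.
        return min(k for k, (score, _) in results.items() if score >= threshold)
    raise ValueError("mode must be 'best' or 'parsimonious'")


cv_results = {
    1: (0.800, 0.02),
    2: (0.915, 0.03),
    3: (0.920, 0.03),
    4: (0.930, 0.02),
}
print(select_k(cv_results, "best"))
print(select_k(cv_results, "parsimonious"))
```

In this toy example the best score belongs to the four-feature subset, while the two-feature subset already falls within one standard error of it.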
Changes
- SequentialFeatureSelector now uses np.nanmean over normal mean to support scorers that may return np.nan #211 (via mrkaiser)
- The skip_if_stuck parameter was removed from SequentialFeatureSelector in favor of a more efficient implementation comparing the conditional inclusion/exclusion results (in the floating versions) to the performances of previously sampled feature sets that were cached #237
- ExhaustiveFeatureSelector was modified to consume substantially less memory #195 (via Adam Erickson)
Bug Fixes
- Fixed a bug where the SequentialFeatureSelector selected a feature subset larger than specified via the k_features tuple max-value #213
Version 0.7.0 (2017-06-22)
Downloads
New Features
- New mlxtend.plotting.ecdf function for plotting empirical cumulative distribution functions (#196).
- New StackingCVRegressor for stacking regressors with out-of-fold predictions to prevent overfitting (#201 via Eike Dehling).
Changes
- The TensorFlow estimators have been removed from mlxtend, since TensorFlow now has very convenient ways to build on estimators, which render those implementations obsolete.
- plot_decision_regions now supports plotting decision regions for more than 2 training features (#189, via James Bourbeau).
- Parallel execution in mlxtend.feature_selection.SequentialFeatureSelector and mlxtend.feature_selection.ExhaustiveFeatureSelector is now performed over different feature subsets instead of the different cross-validation folds to better utilize machines with multiple processors if the number of features is large (#193, via @whalebot-helmsman).
- Raise meaningful error messages if pandas DataFrames or Python lists of lists are fed into the StackingCVClassifier as fit arguments (#198).
- The n_folds parameter of the StackingCVClassifier was changed to cv and can now accept any kind of cross validation technique that is available from scikit-learn. For example, StackingCVClassifier(..., cv=StratifiedKFold(n_splits=3)) or StackingCVClassifier(..., cv=GroupKFold(n_splits=3)) (#203, via Konstantinos Paliouras).
Bug Fixes
- SequentialFeatureSelector now correctly accepts a None argument for the scoring parameter to infer the default scoring metric from scikit-learn classifiers and regressors (#171).
- The plot_decision_regions function now supports pre-existing axes objects generated via matplotlib's plt.subplots. (#184, see example)
- Made math.num_combinations and math.num_permutations numerically stable for large numbers of combinations and permutations (#200).
Version 0.6.0 (2017-03-18)
Downloads
New Features
- Added an association_rules function that generates rules based on a list of frequent itemsets (via Joshua Goerner).
Changes
- Adds a black edgecolor to plots via plotting.plot_decision_regions to make markers more distinguishable from the background in matplotlib>=2.0.
- The association submodule was renamed to frequent_patterns.
Bug Fixes
- The DataFrame index of apriori results is now unique and ordered.
- Fixed typos in autompg and wine datasets (via James Bourbeau).
Version 0.5.1 (2017-02-14)
Downloads
New Features
- The EnsembleVoteClassifier has a new refit attribute that prevents refitting classifiers if refit=False to save computational time.
- Added a new lift_score function in evaluate to compute lift score (via Batuhan Bardak).
- StackingClassifier and StackingRegressor support multivariate targets if the underlying models do (via kernc).
- StackingClassifier has a new use_features_in_secondary attribute like StackingCVClassifier.
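For readers unfamiliar with the lift score mentioned above, here is a minimal plain-Python sketch using one common definition: the precision of the positive predictions divided by the base rate of positives. The function name lift and the sample labels are hypothetical, and it is an assumption that evaluate.lift_score follows this exact definition.

```python
def lift(y_true, y_pred, positive=1):
    """Lift = precision on the positive class / fraction of actual positives."""
    n = len(y_true)
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    predicted_pos = sum(p == positive for p in y_pred)
    actual_pos = sum(t == positive for t in y_true)
    if predicted_pos == 0 or actual_pos == 0:
        raise ValueError("lift is undefined without predicted and actual positives")
    precision = true_pos / predicted_pos
    base_rate = actual_pos / n
    return precision / base_rate


y_true = [0, 0, 1, 0, 0, 1, 1, 1, 1, 1]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
print(lift(y_true, y_pred))
```

A value above 1 suggests the classifier's positive predictions are enriched in true positives relative to picking samples at random.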
Changes
- Changed default verbosity level in SequentialFeatureSelector to 0
- The EnsembleVoteClassifier now raises a NotFittedError if the estimator wasn't fit before calling predict. (via Anton Loss)
- Added new TensorFlow variable initialization syntax to guarantee compatibility with TensorFlow 1.0
Bug Fixes
- Fixed wrong default value for k_features in SequentialFeatureSelector
- Cast selected feature subsets in the SequentialFeatureSelector as sets to prevent the iterator from getting stuck if the k_idx are different permutations of the same combination (via Zac Wellmer).
- Fixed an issue with learning curves that caused the performance metrics to be reversed (via ipashchenko)
- Fixed a bug that could occur in the SequentialFeatureSelector if there are similarly-well performing subsets in the floating variants (via Zac Wellmer).
Version 0.5.0 (2016-11-09)
Downloads
New Features
- New ExhaustiveFeatureSelector estimator in mlxtend.feature_selection for evaluating all feature combinations in a specified range
- The StackingClassifier has a new parameter average_probas that is set to True by default to maintain the current behavior. A deprecation warning was added though, and it will default to False in future releases (0.6.0); average_probas=False will result in stacking of the level-1 predicted probabilities rather than averaging these.
- New StackingCVClassifier estimator in 'mlxtend.classifier' for implementing a stacking ensemble that uses cross-validation techniques for training the meta-estimator to avoid overfitting (Reiichiro Nakano)
- New OnehotTransactions encoder class added to the preprocessing submodule for transforming transaction data into a one-hot encoded array
- The SequentialFeatureSelector estimator in mlxtend.feature_selection now is safely stoppable mid-process by control+c, and deprecated print_progress in favor of a more tunable verbose parameter (Will McGinnis)
- New apriori function in association to extract frequent itemsets from transaction data for association rule mining
- New checkerboard_plot function in plotting to plot checkerboard tables / heat maps
- New mcnemar_table and mcnemar functions in evaluate to compute 2x2 contingency tables and McNemar's test
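The idea behind the McNemar entry above can be sketched without mlxtend: build a 2x2 table of where two models agree and disagree with the ground truth, then compute the chi-squared statistic from the two disagreement cells. The function names, the table layout, and the use of the standard continuity correction are assumptions made for illustration, not a description of the library's implementation.

```python
import math


def contingency_table(y_true, pred_a, pred_b):
    """2x2 table: [[both correct, only A correct], [only B correct, both wrong]]."""
    table = [[0, 0], [0, 0]]
    for t, a, b in zip(y_true, pred_a, pred_b):
        row = 0 if a == t else 1  # row 0: model A correct
        col = 0 if b == t else 1  # col 0: model B correct
        # Map (A correct, B wrong) to cell [0][1] and (A wrong, B correct) to [1][0].
        table[row][col] += 1
    return table


def mcnemar_chi2(table, corrected=True):
    """Chi-squared statistic and p-value (1 degree of freedom) from the off-diagonal cells."""
    b, c = table[0][1], table[1][0]
    if b + c == 0:
        raise ValueError("no disagreements between the two models")
    diff = abs(b - c) - (1.0 if corrected else 0.0)
    chi2 = diff ** 2 / (b + c)
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
pred_a = [0, 1, 0, 0, 0, 1, 1, 0, 1, 1]
pred_b = [0, 0, 1, 1, 0, 1, 1, 0, 1, 1]
table = contingency_table(y_true, pred_a, pred_b)
print(table)
print(mcnemar_chi2(table))
```

With very few disagreements, as in this toy data, the chi-squared approximation is unreliable and an exact binomial test is usually preferred.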
Changes
- All plotting functions have been moved to mlxtend.plotting for compatibility reasons with continuous integration services and to make the installation of matplotlib optional for users of mlxtend's core functionality
- Added a compatibility layer for scikit-learn 0.18 using the new model_selection module while maintaining backwards compatibility to scikit-learn 0.17.
Bug Fixes
- mlxtend.plotting.plot_decision_regions now draws decision regions correctly if more than 4 class labels are present
- Raise AttributeError in plot_decision_regions when the X_highlight argument is a 1D array (chkoar)
Version 0.4.2 (2016-08-24)
Downloads
New Features
- Added preprocessing.CopyTransformer, a mock class that returns copies of input arrays via transform and fit_transform
Changes
- Added AppVeyor to CI to ensure MS Windows compatibility
- Datasets are now saved as compressed .txt or .csv files rather than being imported as Python objects
- feature_selection.SequentialFeatureSelector now supports the selection of k_features using a tuple to specify a "min-max" k_features range
- Added "SVD solver" option to the PrincipalComponentAnalysis
- Raise an AttributeError with "not fitted" message in SequentialFeatureSelector if transform or get_metric_dict are called prior to fit
- Use small, positive bias units in TfMultiLayerPerceptron's hidden layer(s) if the activations are ReLUs in order to avoid dead neurons
- Added an optional clone_estimator parameter to the SequentialFeatureSelector that defaults to True, avoiding the modification of the original estimator objects
- More rigorous type and shape checks in the evaluate.plot_decision_regions function
- DenseTransformer now doesn't raise an error if the input array is not sparse
- API clean-up using scikit-learn's BaseEstimator as parent class for feature_selection.ColumnSelector
Bug Fixes
- Fixed a problem when a tuple-range was provided as argument to the SequentialFeatureSelector's k_features parameter and the scoring metric was more negative than -1 (e.g., as in scikit-learn's MSE scoring function) (wahutch, https://github.com/wahutch)
- Fixed an AttributeError issue when verbose > 1 in StackingClassifier
- Fixed a bug in classifier.SoftmaxRegression where the mean values of the offsets were used to update the bias units rather than their sum
- Fixed rare bug in MLP _layer_mapping functions that caused a swap between the random number generation seed when initializing weights and biases
Version 0.4.1 (2016-05-01)
Downloads
New Features
- New TensorFlow estimator for Linear Regression (tf_regressor.TfLinearRegression)
- New k-means clustering estimator (cluster.Kmeans)
- New TensorFlow k-means clustering estimator (tf_cluster.Kmeans)
Changes
- Due to refactoring of the estimator classes, the init_weights parameter of the fit methods was globally renamed to init_params
- Overall performance improvements of estimators due to code clean-up and refactoring
- Added several additional checks for correct array types and more meaningful exception messages
- Added optional dropout to the tf_classifier.TfMultiLayerPerceptron classifier for regularization
- Added an optional decay parameter to the tf_classifier.TfMultiLayerPerceptron classifier for adaptive learning via an exponential decay of the learning rate eta
- Replaced old NeuralNetMLP by more streamlined MultiLayerPerceptron (classifier.MultiLayerPerceptron); now also with softmax in the output layer and categorical cross-entropy loss.
- Unified init_params parameter for fit functions to continue training where the algorithm left off (if supported)
Version 0.4.0 (2016-04-09)
New Features
- New TfSoftmaxRegression classifier using TensorFlow (tf_classifier.TfSoftmaxRegression)
- New SoftmaxRegression classifier (classifier.SoftmaxRegression)
- New TfMultiLayerPerceptron classifier using TensorFlow (tf_classifier.TfMultiLayerPerceptron)
- New StackingRegressor (regressor.StackingRegressor)
- New StackingClassifier (classifier.StackingClassifier)
- New function for one-hot encoding of class labels (preprocessing.one_hot)
- Added GridSearch support to the SequentialFeatureSelector (feature_selection.SequentialFeatureSelector)
- evaluate.plot_decision_regions improvements:
  - Function now handles the class labels in y correctly if the array is of type float
  - Correct handling of input arguments markers and colors
  - Accept an existing Axes via the ax argument
- New print_progress parameter for all generalized models and multi-layer neural networks for printing time elapsed, ETA, and the current cost of the current epoch
- Minibatch learning for classifier.LogisticRegression, classifier.Adaline, and regressor.LinearRegression plus streamlined API
- New Principal Component Analysis class via mlxtend.feature_extraction.PrincipalComponentAnalysis
- New RBF Kernel Principal Component Analysis class via mlxtend.feature_extraction.RBFKernelPCA
- New Linear Discriminant Analysis class via mlxtend.feature_extraction.LinearDiscriminantAnalysis
Changes
- The column parameter in mlxtend.preprocessing.standardize now defaults to None to standardize all columns more conveniently
Version 0.3.0 (2016-01-31)
Downloads
New Features
- Added a progress bar tracker to classifier.NeuralNetMLP
- Added a function to score predicted vs. target class labels (evaluate.scoring)
- Added confusion matrix functions to create (evaluate.confusion_matrix) and plot (evaluate.plot_confusion_matrix) confusion matrices
- New style parameter and improved axis scaling in mlxtend.evaluate.plot_learning_curves
- Added loadlocal_mnist to mlxtend.data for streaming MNIST from local byte files into NumPy arrays
- New NeuralNetMLP parameters: random_weights, shuffle_init, shuffle_epoch
- New SFS features such as the generation of pandas DataFrame results tables and plotting functions (with confidence intervals, standard deviation, and standard error bars)
- Added support for regression estimators in SFS
- Added Boston housing dataset
- New shuffle parameter for classifier.NeuralNetMLP
Changes
- The mlxtend.preprocessing.standardize function now optionally returns the parameters, which are estimated from the array, for re-use. A further improvement makes the standardize function smarter in order to avoid zero-division errors
- Cosmetic improvements to the evaluate.plot_decision_regions function such as hiding plot axes
- Renaming of classifier.EnsembleClassifier to classifier.EnsembleVoteClassifier
- Improved random weight initialization in Perceptron, Adaline, LinearRegression, and LogisticRegression
- Changed learning parameter of mlxtend.classifier.Adaline to solver and added "normal equation" as closed-form solution solver
- Hide y-axis labels in mlxtend.evaluate.plot_decision_regions in 1-dimensional evaluations
- Sequential Feature Selection algorithms were unified into a single SequentialFeatureSelector class with parameters to enable floating selection and toggle between forward and backward selection.
- Stratified sampling of MNIST (now 500x random samples from each of the 10 digit categories)
- Renaming mlxtend.plotting to mlxtend.general_plotting in order to distinguish general plotting functions from specialized utility functions such as evaluate.plot_decision_regions
Version 0.2.9 (2015-07-14)
Downloads
New Features
- Sequential Feature Selection algorithms: SFS, SFFS, SBS, and SFBS
Changes
- Changed regularization & lambda parameters in LogisticRegression to single parameter l2_lambda
Version 0.2.8 (2015-06-27)
- API changes:
  - mlxtend.sklearn.EnsembleClassifier -> mlxtend.classifier.EnsembleClassifier
  - mlxtend.sklearn.ColumnSelector -> mlxtend.feature_selection.ColumnSelector
  - mlxtend.sklearn.DenseTransformer -> mlxtend.preprocessing.DenseTransformer
  - mlxtend.pandas.standardizing -> mlxtend.preprocessing.standardizing
  - mlxtend.pandas.minmax_scaling -> mlxtend.preprocessing.minmax_scaling
  - mlxtend.matplotlib -> mlxtend.plotting
- Added momentum learning parameter (alpha coefficient) to mlxtend.classifier.NeuralNetMLP.
- Added adaptive learning rate (decrease constant) to mlxtend.classifier.NeuralNetMLP.
- mlxtend.pandas.minmax_scaling became mlxtend.preprocessing.minmax_scaling and also supports NumPy arrays now
- mlxtend.pandas.standardizing became mlxtend.preprocessing.standardizing and now supports both NumPy arrays and pandas DataFrames; it also has a new ddof parameter to set the degrees of freedom when calculating the standard deviation
Version 0.2.7 (2015-06-20)
- Added multilayer perceptron (feedforward artificial neural network) classifier as mlxtend.classifier.NeuralNetMLP.
- Added 5000 labeled training samples from the MNIST handwritten digits dataset to mlxtend.data
Version 0.2.6 (2015-05-08)
- Added ordinary least squares regression using different solvers (gradient and stochastic gradient descent, and the closed-form solution (normal equation))
- Added option for random weight initialization to logistic regression classifier and updated l2 regularization
- Added wine dataset to mlxtend.data
- Added invert_axes parameter to mlxtend.matplotlib.enrichment_plot to optionally plot the "Count" on the x-axis
- New verbose parameter for mlxtend.sklearn.EnsembleClassifier by Alejandro C. Bahnsen
- Added mlxtend.pandas.standardizing to standardize columns in a pandas DataFrame
- Added parameters linestyles and markers to mlxtend.matplotlib.enrichment_plot
- mlxtend.regression.lin_regplot automatically adds np.newaxis and works with Python lists
- Added tokenizers: mlxtend.text.extract_emoticons and mlxtend.text.extract_words_and_emoticons
Version 0.2.5 (2015-04-17)
- Added Sequential Backward Selection (mlxtend.sklearn.SBS)
- Added X_highlight parameter to mlxtend.evaluate.plot_decision_regions for highlighting test data points.
- Added mlxtend.regression.lin_regplot to plot the fitted line from linear regression.
- Added mlxtend.matplotlib.stacked_barplot to conveniently produce stacked barplots using pandas DataFrames.
- Added mlxtend.matplotlib.enrichment_plot
Version 0.2.4 (2015-03-15)
- Added scoring to mlxtend.evaluate.learning_curves (by user pfsq)
- Fixed setup.py bug caused by the missing README.html file
- matplotlib.category_scatter for pandas DataFrames and NumPy arrays
Version 0.2.3 (2015-03-11)
- Added Logistic regression
- Gradient descent and stochastic gradient descent perceptron was changed to Adaline (Adaptive Linear Neuron)
- Perceptron and Adaline for {0, 1} classes
- Added mlxtend.preprocessing.shuffle_arrays_unison function to shuffle one or more NumPy arrays.
- Added shuffle and random seed parameter to stochastic gradient descent classifier.
- Added rstrip parameter to mlxtend.file_io.find_filegroups to allow trimming of base names.
- Added ignore_substring parameter to mlxtend.file_io.find_filegroups and find_files.
- Replaced .rstrip in mlxtend.file_io.find_filegroups with more robust regex.
- Gridsearch support for mlxtend.sklearn.EnsembleClassifier
Version 0.2.2 (2015-03-01)
- Improved robustness of EnsembleClassifier.
- Extended plot_decision_regions() functionality for plotting 1D decision boundaries.
- Function matplotlib.plot_decision_regions was moved to evaluate.plot_decision_regions.
- evaluate.plot_learning_curves() function added.
- Added Rosenblatt, gradient descent, and stochastic gradient descent perceptrons.
Version 0.2.1 (2015-01-20)
- Added mlxtend.pandas.minmax_scaling - a function to rescale pandas DataFrame columns.
- Slight update to the EnsembleClassifier interface (additional voting parameter)
- Fixed EnsembleClassifier to return correct class labels if class labels are not integers from 0 to n.
- Added new matplotlib function to plot decision regions of classifiers.
Version 0.2.0 (2015-01-13)
- Improved mlxtend.text.generalize_duplcheck to remove duplicates and prevent endless looping issue.
- Added recursive search parameter to mlxtend.file_io.find_files.
- Added check_ext parameter to mlxtend.file_io.find_files to search based on file extensions.
- Default parameter to ignore invisible files for mlxtend.file_io.find.
- Added transform and fit_transform to the EnsembleClassifier.
- Added mlxtend.file_io.find_filegroups function.
Version 0.1.9 (2015-01-10)
- Implemented scikit-learn EnsembleClassifier (majority voting rule) class.
Version 0.1.8 (2015-01-07)
- Improvements to mlxtend.text.generalize_names to handle certain Dutch last name prefixes (van, van der, de, etc.).
- Added mlxtend.text.generalize_name_duplcheck function to apply mlxtend.text.generalize_names function to a pandas DataFrame without creating duplicates.
Version 0.1.7 (2015-01-07)
- Added text utilities with name generalization function.
- Added file_io utilities.
Version 0.1.6 (2015-01-04)
- Added combinations and permutations estimators.
Version 0.1.5 (2014-12-11)
- Added DenseTransformer for pipelines and grid search.
Version 0.1.4 (2014-08-20)
- The mean_centering function is now a class that creates MeanCenterer objects that can be used to fit data via the fit method, and center data at the column means via the transform and fit_transform methods.
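The fit/transform pattern described in this entry can be sketched in a few lines of plain Python. The class below is a hypothetical stand-in written for illustration (the name ColumnMeanCenterer, the col_means_ attribute, and the list-of-rows input are assumptions), not the library's MeanCenterer.

```python
class ColumnMeanCenterer:
    """Hypothetical stand-in illustrating fit / transform / fit_transform."""

    def fit(self, rows):
        # Learn one mean per column from the training rows.
        n = len(rows)
        self.col_means_ = [sum(col) / n for col in zip(*rows)]
        return self

    def transform(self, rows):
        # Subtract the learned column means; works on new data as well.
        return [[x - m for x, m in zip(row, self.col_means_)] for row in rows]

    def fit_transform(self, rows):
        return self.fit(rows).transform(rows)


train = [[1.0, 2.0], [3.0, 6.0]]
centerer = ColumnMeanCenterer()
print(centerer.fit_transform(train))
print(centerer.transform([[2.0, 4.0]]))
```

Separating fit from transform lets the means learned on training data be reused to center new data consistently.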
Version 0.1.3 (2014-08-19)
- Added preprocessing module and mean_centering function.
Version 0.1.2 (2014-08-19)
- Added matplotlib utilities and remove_borders function.
Version 0.1.1 (2014-08-13)
- Simplified code for ColumnSelector.