# PCA Correlation Circle

A function to plot a correlation circle for PCA.

from mlxtend.plotting import plot_pca_correlation_graph

In a so-called correlation circle, the correlations between the original dataset features and the principal component(s) are shown via coordinates: each feature is placed at the point given by its correlation with each plotted component.
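To make the idea of "correlations as coordinates" concrete, here is a minimal NumPy-only sketch. It is not necessarily how `plot_pca_correlation_graph` is implemented internally; the synthetic data, the variable names (`base`, `scores`, `coords`), and the use of `np.linalg.eigh` for the PCA step are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): 200 samples, 3 features.
# Feature 1 tracks feature 0 closely; feature 2 is independent noise.
base = rng.normal(size=200)
X = np.column_stack([
    base,
    base + 0.3 * rng.normal(size=200),
    rng.normal(size=200),
])

# Standardize each column, then project onto the two leading principal axes.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(X_std, rowvar=False))
order = np.argsort(eigvals)[::-1]        # components by decreasing variance
scores = X_std @ eigvecs[:, order[:2]]   # PC1 and PC2 scores per sample

# Coordinates of feature j in the circle: (corr(x_j, PC1), corr(x_j, PC2)).
coords = np.array([
    [np.corrcoef(X_std[:, j], scores[:, k])[0, 1] for k in range(2)]
    for j in range(X_std.shape[1])
])
print(coords.round(3))
```

Because the principal component scores are mutually uncorrelated, the squared correlations of a feature with all components sum to 1, so every point computed this way lies inside or on the unit circle. Features 0 and 1, which are positively correlated by construction, receive PC1 coordinates of the same sign.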

## Example

The following correlation circle example visualizes the correlation between the first two principal components and the four original iris dataset features.

• Features with a positive correlation will be grouped together.
• Totally uncorrelated features are orthogonal to each other.
• Features with a negative correlation will be plotted in opposing quadrants of the plot.
from mlxtend.data import iris_data
from mlxtend.plotting import plot_pca_correlation_graph
import numpy as np

X, y = iris_data()

X_norm = X / X.std(axis=0)  # Normalizing the feature columns is recommended

feature_names = [
    'sepal length',
    'sepal width',
    'petal length',
    'petal width']

figure, correlation_matrix = plot_pca_correlation_graph(
    X_norm,
    feature_names,
    pc_dimensions=(1, 2),
    figure_axis_size=10)



correlation_matrix

| | Principal Component 1 | Principal Component 2 |
|---|---|---|
| sepal length | -0.891224 | -0.357352 |
| sepal width | 0.449313 | -0.888351 |
| petal length | -0.991684 | -0.020247 |
| petal width | -0.964996 | -0.062786 |

Further, note that the percentage values shown on the x and y axes denote how much of the variance in the original dataset is explained by each principal component axis. For example, if PC1 lists 72.7% and PC2 lists 23.0% as shown above, then combined, the two principal components explain 95.7% of the total variance.
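The axis percentages can be reproduced by hand from the eigenvalues of the covariance matrix. The following is a rough sketch on synthetic data, assuming NumPy only; the `latent` and `mixing` arrays are made up for illustration, and the numbers it prints will not match the iris values quoted above.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data (an assumption for illustration): 4 observed features
# generated from 2 latent factors plus a little noise.
latent = rng.normal(size=(150, 2))
mixing = np.array([[1.0, 0.8, 0.9, 0.1],
                   [0.0, 0.5, -0.2, 1.0]])
X = latent @ mixing + 0.1 * rng.normal(size=(150, 4))
X_norm = X / X.std(axis=0)  # same scaling as in the iris example

# Eigenvalues of the covariance matrix are the variances along each PC.
# eigvalsh returns them in ascending order, so reverse to get PC1 first.
eigvals = np.linalg.eigvalsh(np.cov(X_norm, rowvar=False))[::-1]
explained = eigvals / eigvals.sum()  # fraction of total variance per PC

pc1_pct, pc2_pct = 100 * explained[:2]
print(f"PC1: {pc1_pct:.1f}%  PC2: {pc2_pct:.1f}%  "
      f"combined: {pc1_pct + pc2_pct:.1f}%")
```

The fractions sum to 1 over all components, so the combined figure for two axes is simply the sum of their individual percentages.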

## API

plot_pca_correlation_graph(X, variables_names, pc_dimensions=(1, 2), figure_axis_size=6, X_pca=None)

Computes PCA for X and plots the correlation circle.

Parameters

• X : 2d array-like.

The columns represent the different variables and the rows are the examples (samples) of those feature variables.

• variables_names : array-like

Column names (the feature variables) of X.

• pc_dimensions : tuple with two elements (default=(1, 2))

Principal component dimensions to be plotted (x, y).

• X_pca : 2d array-like (default=None)

Feature variables after performing PCA. If not provided, this function will carry out the principal component analysis automatically.

• figure_axis_size : (default=6)

Size of the final figure frame. The figure created is square, with length and width equal to figure_axis_size.

Returns

matplotlib_figure, correlation_matrix