Perceptron: A simple binary classifier

Implementation of a Perceptron learning algorithm for classification.

from mlxtend.classifier import Perceptron


The idea behind this "thresholded" perceptron was to mimic how a single neuron in the brain works: It either "fires" or not. A perceptron receives multiple input signals, and if the sum of the input signals exceed a certain threshold it either returns a signal or remains "silent" otherwise. What made this a "machine learning" algorithm was Frank Rosenblatt's idea of the perceptron learning rule: The perceptron algorithm is about learning the weights for the input signals in order to draw linear decision boundary that allows us to discriminate between the two linearly separable classes +1 and -1.

Basic Notation

Before we dive deeper into the algorithm(s) for learning the weights of the perceptron classifier, let us take a brief look at the basic notation. In the following sections, we will label the positive and negative class in our binary classification setting as "1" and "-1", respectively. Next, we define an activation function that takes a linear combination of the input values and weights as input (), and if is greater than a defined threshold we predict 1 and -1 otherwise; in this case, this activation function is a simple "unit step function," which is sometimes also called "Heaviside step function."

$$ g(z) = $$


is the feature vector, and is an -dimensional sample from the training dataset:

In order to simplify the notation, we bring to the left side of the equation and define

so that

$$ g({z}) = $$


Perceptron Rule

Rosenblatt's initial perceptron rule is fairly simple and can be summarized by the following steps:

  1. Initialize the weights to 0 or small random numbers.
  2. For each training sample :
    1. Calculate the output value.
    2. Update the weights.

The output value is the class label predicted by the unit step function that we defined earlier (output ) and the weight update can be written more formally as .

The value for updating the weights at each increment is calculated by the learning rule

where is the learning rate (a constant between 0.0 and 1.0), "target" is the true class label, and the "output" is the predicted class label.

aIt is important to note that all weights in the weight vector are being updated simultaneously. Concretely, for a 2-dimensional dataset, we would write the update as:

Before we implement the perceptron rule in Python, let us make a simple thought experiment to illustrate how beautifully simple this learning rule really is. In the two scenarios where the perceptron predicts the class label correctly, the weights remain unchanged:

However, in case of a wrong prediction, the weights are being "pushed" towards the direction of the positive or negative target class, respectively:

It is important to note that the convergence of the perceptron is only guaranteed if the two classes are linearly separable. If the two classes can't be separated by a linear decision boundary, we can set a maximum number of passes over the training dataset ("epochs") and/or a threshold for the number of tolerated misclassifications.


Example 1 - Classification of Iris Flowers

from import iris_data
from mlxtend.plotting import plot_decision_regions
from mlxtend.classifier import Perceptron
import matplotlib.pyplot as plt

# Loading Data

X, y = iris_data()
X = X[:, [0, 3]] # sepal length and petal width
X = X[0:100] # class 0 and class 1
y = y[0:100] # class 0 and class 1

# standardize
X[:,0] = (X[:,0] - X[:,0].mean()) / X[:,0].std()
X[:,1] = (X[:,1] - X[:,1].mean()) / X[:,1].std()

# Rosenblatt Perceptron

ppn = Perceptron(epochs=5, 
                 print_progress=3), y)

plot_decision_regions(X, y, clf=ppn)
plt.title('Perceptron - Rosenblatt Perceptron Rule')

print('Bias & Weights: %s' % ppn.w_)

plt.plot(range(len(ppn.cost_)), ppn.cost_)
Iteration: 5/5 | Elapsed: 00:00:00 | ETA: 00:00:00


Bias & Weights: [[-0.04500809]
 [ 0.11048855]]



Perceptron(eta=0.1, epochs=50, random_seed=None, print_progress=0)

Perceptron classifier.

Note that this implementation of the Perceptron expects binary class labels in {0, 1}.




For usage examples, please see


fit(X, y, init_params=True)

Learn model from training data.




Predict targets from X.



score(X, y)

Compute the prediction accuracy