one_hot: One-Hot encoding function for class label arrays
A function that performs one-hot encoding for class labels.
from mlxtend.preprocessing import one_hot
Overview
Typical supervised machine learning algorithms for classifications assume that the class labels are nominal (a special case of categorical where no order is implied). A typical example of an nominal feature would be "color" since we can't say (in most applications) that "orange > blue > red".
The one_hot function provides a simple interface to convert class label integers into a so-called one-hot array, where each unique label is represented as a column in the new array.
For example, let's assume we have 5 data points from 3 different classes: 0, 1, and 2.
y = [0, # sample 1, class 0 
     1, # sample 2, class 1 
     0, # sample 3, class 0
     2, # sample 4, class 2
     2] # sample 5, class 2
After one-hot encoding, we then obtain the following array (note that the index position of the "1" in each row denotes the class label of this sample):
y = [[1,  0,  0], # sample 1, class 0 
     [0,  1,  0], # sample 2, class 1  
     [1,  0,  0], # sample 3, class 0
     [0,  0,  1], # sample 4, class 2
     [0,  0,  1]  # sample 5, class 2
     ])
Example 1 - Defaults
from mlxtend.preprocessing import one_hot
import numpy as np
y = np.array([0, 1, 2, 1, 2])
one_hot(y)
array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])
Example 2 - Python Lists
from mlxtend.preprocessing import one_hot
y = [0, 1, 2, 1, 2]
one_hot(y)
array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])
Example 3 - Integer Arrays
from mlxtend.preprocessing import one_hot
y = [0, 1, 2, 1, 2]
one_hot(y, dtype='int')
array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1],
       [0, 1, 0],
       [0, 0, 1]])
Example 4 - Arbitrary Numbers of Class Labels
from mlxtend.preprocessing import one_hot
y = [0, 1, 2, 1, 2]
one_hot(y, num_labels=10)
array([[ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])
API
one_hot(y, num_labels='auto', dtype='float')
One-hot encoding of class labels
Parameters
- 
y: array-like, shape = [n_classlabels]Python list or numpy array consisting of class labels.
 - 
num_labels: int or 'auto'Number of unique labels in the class label array. Infers the number of unique labels from the input array if set to 'auto'.
 - 
dtype: strNumPy array type (float, float32, float64) of the output array.
 
Returns
- 
ary: numpy.ndarray, shape = [n_classlabels]One-hot encoded array, where each sample is represented as a row vector in the returned array.
 
Examples
For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/preprocessing/one_hot/