Luthfi Ramadhan

Summary

Radial Basis Function (RBF) networks are a type of neural network with a distinct architecture, consisting of an input layer, a single hidden layer, and an output layer, which transforms the input into a more linearly separable space using a nonlinear transformation based on Cover's theorem.

Abstract

Radial Basis Function (RBF) networks are a unique type of neural network with a simplified architecture, comprising only an input layer, a single hidden layer, and an output layer. Unlike traditional neural networks, the hidden layer of RBF networks uses a special method to compute its output. This method is based on Cover's theorem, which states that patterns that are not linearly separable can often be transformed into higher-dimensional space using nonlinear transformations, making them more linearly separable. In the hidden layer, each neuron computes the similarity between the input vector and its prototype vector, which is a vector from the training set. The output layer uses a linear activation function for both classification and regression tasks.

Opinions

  • RBF networks have a fundamentally different architecture than most neural network architectures.
  • The power of RBF networks comes from the computations that happen inside the hidden layer.
  • The input layer is not a computation layer; it just receives the input data and feeds it into the hidden layer.
  • The number of neurons in the hidden layer should be greater than the number of neurons in the input layer, based on Cover’s theorem on the separability of patterns.
  • The parameters μ bar and σ are learned in an unsupervised way, for example using some clustering algorithm.
  • The resulting prediction can be used for either classification or regression tasks, depending on the target and loss function.
  • It is possible to stack RBF networks with other networks, for example by replacing the output layer of the RBF network with a multilayer perceptron and training the network end-to-end.


Radial Basis Function Neural Network Simplified

A short introduction to radial basis function neural network

Photo by Zoltan Tasi on Unsplash

Radial basis function (RBF) networks have a fundamentally different architecture than most neural network architectures. Most neural network architectures consist of many layers and introduce nonlinearity by repeatedly applying nonlinear activation functions. An RBF network, on the other hand, consists only of an input layer, a single hidden layer, and an output layer.

Image by author

The input layer is not a computation layer; it just receives the input data and feeds it into the special hidden layer of the RBF network. The computation that happens inside the hidden layer is very different from that of most neural networks, and this is where the power of the RBF network comes from. The output layer performs the prediction task, such as classification or regression.

Input Layer

The input layer simply feeds the data to the hidden layer.

Image by author

As a result, the number of neurons in the input layer should be equal to the dimensionality of the data. No computation is performed in the input layer, as is the case with standard artificial neural networks. The input neurons are fully connected to the hidden neurons and feed their input forward.

Hidden Layer

The hidden layer takes an input whose pattern might not be linearly separable and transforms it into a new space in which it is more likely to be linearly separable. This is based on Cover’s theorem on the separability of patterns, which states that a pattern cast into a higher-dimensional space with a nonlinear transformation is more likely to be linearly separable than in a lower-dimensional space. For this reason, the hidden layer usually has higher dimensionality than the input layer, that is, the number of neurons in the hidden layer should be greater than the number of input neurons. With that said, the number of neurons in the hidden layer should be less than or equal to the number of samples in the training set. When the number of hidden neurons equals the number of training samples, the model can be thought of as roughly equivalent to kernel learners such as kernel regression and kernel support vector machines.
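The effect described by Cover’s theorem can be seen on the XOR problem, which is not linearly separable in its original two-dimensional space. The sketch below is a minimal illustration rather than the author’s code: it assumes a Gaussian similarity function, uses one hidden neuron per training point (four hidden neurons for two inputs), and picks the bandwidth 0.5 arbitrarily. The names rbf_features, prototypes, and sigma are hypothetical.

```python
import numpy as np

def rbf_features(X, prototypes, sigma):
    """Gaussian similarity between every row of X and every prototype."""
    # Squared Euclidean distances, shape (n_samples, n_prototypes)
    sq_dist = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

# XOR: no straight line separates the two classes in the 2-D input space
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 1, 1, 0])

# Assumption for illustration: every training point is used as a prototype
prototypes = X.copy()
sigma = 0.5

phi = rbf_features(X, prototypes, sigma)  # hidden-layer output, shape (4, 4)

# A purely linear model (least squares with a bias column) on the new features
A = np.hstack([phi, np.ones((len(X), 1))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = (A @ w > 0.5).astype(int)
print(pred)  # [0 1 1 0]
```

A linear model fitted directly on X cannot reproduce the XOR labels, while the same linear model fitted on phi does, because the nonlinear mapping has made the classes linearly separable.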

The computations in the hidden layer are based on comparisons with prototype vectors, each of which is a vector taken from (or derived from) the training set.

Image by author

Each neuron in the hidden layer has a prototype vector and a bandwidth, denoted by μ bar and σ respectively. Each neuron computes the similarity between the input vector and its prototype vector. The computation in the hidden layer can be mathematically written as follows:

Image by author
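Assuming the commonly used Gaussian form of the radial basis function (other radial functions are possible), the output of the iᵗʰ hidden neuron can be written as:

```latex
\phi_i(\bar{X}) = \exp\left( -\frac{\lVert \bar{X} - \bar{\mu}_i \rVert^{2}}{2\,\sigma_i^{2}} \right)
```

The output is 1 when the input coincides with the prototype and decays toward 0 as the input moves away from it, at a rate controlled by the bandwidth.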

With:

  • X bar as the input vector
  • μ bar as the iᵗʰ neuron’s prototype vector
  • σ as the iᵗʰ neuron’s bandwidth
  • phi as the iᵗʰ neuron’s output

The parameters μ bar and σ are learned in an unsupervised way, for example using some clustering algorithm.
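One way this unsupervised step might look is sketched below. The assumptions are: k-means as the clustering algorithm, a Gaussian similarity function, and a simple heuristic that sets each bandwidth to the distance to the nearest other prototype. None of these choices is prescribed by the text, and the function names kmeans, bandwidths, and hidden_layer are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns the (k, d) array of cluster centers."""
    rng = np.random.default_rng(seed)
    # Start from k distinct training samples (fancy indexing returns a copy)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every sample to its nearest center
        sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dist.argmin(axis=1)
        # Move every non-empty cluster's center to the mean of its samples
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def bandwidths(centers):
    """Heuristic (an assumption): sigma_i = distance to the nearest other prototype."""
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # ignore the distance of a prototype to itself
    return dist.min(axis=1)

def hidden_layer(X, centers, sigmas):
    """Gaussian similarity of each sample to each prototype, shape (n, k)."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigmas[None, :] ** 2))

# Toy data, made up for the example: 200 two-dimensional samples
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))

mu = kmeans(X, k=10)             # prototype vectors, no labels used
sigma = bandwidths(mu)           # one bandwidth per hidden neuron
phi = hidden_layer(X, mu, sigma)
print(phi.shape)  # (200, 10)
```

No target values appear anywhere in this step, which is what makes it unsupervised. Only the output-layer weights need labels.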

Output Layer

The output layer uses a linear activation function for both classification and regression tasks.

Image by author

The computations in the output layer are performed just as in a standard artificial neural network: a linear combination of the layer's input vector, which here is the vector of hidden-layer outputs, and the weight vector. The computation in the output layer can be mathematically written as follows:

Image by author
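Written out, with m as a hypothetical name for the number of hidden neurons, the linear combination described above is:

```latex
y = \sum_{i=1}^{m} w_i \, \phi_i
```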

With:

  • wᵢ as the weight of the connection from the iᵗʰ hidden neuron to the output
  • phi as the iᵗʰ neuron’s output from the hidden layer
  • y as the prediction result

The resulting prediction can be used for either classification or regression tasks, depending on the target and loss function. The parameters w are learned in a supervised way, for example with gradient descent.
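As a rough sketch of the supervised step, the example below trains only the output weights with batch gradient descent on a mean squared error loss. The setup is assumed for illustration: a noisy sine as toy regression data, ten evenly spaced prototypes with a shared bandwidth of 0.7 in place of learned ones, a Gaussian hidden layer, and a learning rate of 0.05.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data (made up for the example): a noisy sine wave
X = np.linspace(0.0, 2.0 * np.pi, 100)[:, None]
t = np.sin(X[:, 0]) + 0.05 * rng.normal(size=100)

# Fixed hidden layer: 10 evenly spaced prototypes and a shared bandwidth
mu = np.linspace(0.0, 2.0 * np.pi, 10)[:, None]
sigma = 0.7
sq_dist = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
phi = np.exp(-sq_dist / (2.0 * sigma ** 2))  # shape (100, 10)

# Output layer: y = phi @ w, trained by gradient descent on the MSE
w = np.zeros(phi.shape[1])
lr = 0.05
losses = []
for _ in range(2000):
    y = phi @ w                        # linear combination of hidden outputs
    err = y - t
    losses.append(np.mean(err ** 2))   # mean squared error before the update
    grad = 2.0 * phi.T @ err / len(t)  # gradient of the MSE with respect to w
    w -= lr * grad

print(losses[0], losses[-1])
```

Because the output is linear in w, this particular loss is a convex quadratic and could also be minimized in closed form with least squares. Gradient descent is shown here because it carries over to other loss functions, such as cross-entropy for classification.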

Although the output layer of an RBF network can be used as the final output, it is possible to stack RBF networks with other networks. For example, we can replace the output layer of the RBF network with a multilayer perceptron and train the network end-to-end.

Conclusion

The RBF network consists of only a single hidden layer, which has its own way of computing the output. The RBF network is based on Cover’s theorem: it casts the data into a higher-dimensional space using its hidden layer, so the number of neurons in the hidden layer should be greater than the number of neurons in the input layer. The output layer uses a linear activation function, which can also be thought of as having no activation function at all.
