
difference between multilayer perceptron and linear regression

Problem Detail: 

What is the difference between a multilayer perceptron and a linear regression classifier? I am trying to learn a model from numerical attributes and predict a numerical value.

Thanks

Asked By : user20287

Answered By : Peter

If you have a neural network (aka a multilayer perceptron) with only an input and an output layer and with no activation function, that is exactly equal to linear regression. In your case, each attribute corresponds to an input node and your network has one output node, which represents the target value you're trying to predict.
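To make this equivalence concrete, here is a minimal NumPy sketch (the data and variable names are illustrative, not from the question): a network with only input nodes and one output node, and no activation function, computes exactly the linear regression prediction, and "training" it amounts to ordinary least squares.

```python
import numpy as np

# Toy data: 3 numerical attributes, 1 numerical target (an exact linear function).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 3.0

# Append a bias column so the intercept is just another weight.
Xb = np.hstack([X, np.ones((100, 1))])

# "Training" the no-hidden-layer, no-activation network = least squares.
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)

# Forward pass of the network: one output node, no activation function.
y_hat = Xb @ w

print(np.allclose(y_hat, y))
```

Because the target here really is linear in the attributes, the fitted weights reproduce it exactly.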

However, most neural networks are more complex. Usually, the input nodes first feed into a layer of hidden units. These hidden units take the sum of their weighted inputs, as the output node did in the simple network, but apply an activation function to the result (usually a sigmoid-shaped function such as tanh, which compresses the domain $(-\infty, \infty)$ to $(-1, 1)$). The output nodes are each a linear combination of all the nodes in the hidden layer. These days there are even effective methods to train networks with many hidden layers, each feeding into the next.
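The forward pass of such a one-hidden-layer network can be sketched as follows (layer sizes and weight names are illustrative): each hidden unit takes a weighted sum of the inputs and applies tanh, and the output is a linear combination of the hidden units.

```python
import numpy as np

rng = np.random.default_rng(1)

# One hidden layer of 5 units with a tanh activation, one linear output node.
# W1 maps 3 inputs -> 5 hidden units; W2 maps 5 hidden units -> 1 output.
W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)
W2, b2 = rng.normal(size=(5, 1)), np.zeros(1)

def mlp_forward(X):
    hidden = np.tanh(X @ W1 + b1)   # weighted sums, squashed into (-1, 1)
    return hidden @ W2 + b2         # output: linear combination of hidden units

X = rng.normal(size=(10, 3))
out = mlp_forward(X)
print(out.shape)
```

Removing `np.tanh` (and the hidden layer) collapses this back to the linear model above.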

Practically, there are two important differences:

  • Linear regression (and the linear network with no hidden layers) has a closed form solution. You can compute the optimal model directly and efficiently. Once you add an activation function, and possibly hidden layers, you cannot compute an optimal model directly anymore, and you're forced to use an iterative solution: an algorithm that goes through steps, usually improving the model with each step. There are no guarantees that the process will converge, or that you'll find the best model. It's also a lot slower than the direct solution.
  • The flipside is that with a hidden layer and activation function, you have a much more powerful model. You can even show that so long as you have enough nodes in your hidden layer, you can approximate any continuous function with an arbitrarily small error. Linear regression and the simple neural network can only model linear functions. You can however use a design matrix (or basis functions, in neural network terminology) to increase the power of linear regression without losing the closed form solution.
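The first point can be demonstrated on the linear model itself, where both routes are available (data and step size are illustrative): the normal equations give the optimal weights in one shot, while gradient descent approaches the same weights step by step.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5])

# Closed form: the normal equations give the optimal weights directly.
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Iterative: gradient descent on the mean squared error,
# improving the model a little with each step.
w_iter = np.zeros(3)
lr = 0.01
for _ in range(2000):
    grad = X.T @ (X @ w_iter - y) / len(y)
    w_iter -= lr * grad

print(np.allclose(w_closed, w_iter, atol=1e-3))
```

For a network with an activation function, only the second, iterative route exists, and the step size and step count become tuning problems of their own.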

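The basis-function trick mentioned in the second point can also be sketched briefly (the target function and polynomial degree are illustrative): the model is nonlinear in the input but still linear in the weights, so the closed-form least-squares solution still applies.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, size=100)
y = np.sin(3 * x)  # a nonlinear target

# Design matrix of polynomial basis functions: 1, x, x^2, ..., x^5.
# Nonlinear in x, but linear in the weights w, so least squares still works.
Phi = np.vander(x, 6, increasing=True)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)

residual = np.max(np.abs(Phi @ w - y))
print(residual < 0.05)
```

The fit is nonlinear in $x$, yet it was computed directly, with no iterative training.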
Glossary

  • Neural Network: A collection of nodes and arrows. Computation flows through the network along the arrows. The value of node $n$ is determined by the nodes which have arrows from them to $n$. Each arrow contains a weight which determines the strength of the connection.
  • Activation function: A node in a neural network works as follows: it takes the values of all nodes which are connected to it by an arrow, multiplies each by its respective weight, and then sums them. It then passes this sum through an activation function, which keeps its output in a bounded range: $(0, 1)$ for the logistic function, $(-1, 1)$ for tanh. These two are the most common activation functions.
  • Hidden layers: The nodes and arrows in a neural network can be arranged in any way, but the most common arrangement is in layers. This is called a multilayer perceptron. It groups the nodes into layers. Each node in layer $i$ takes all nodes in layer $i-1$ as input. The nodes in the first layer are the input nodes and the nodes in the last layer are the output nodes. The rest of the layers are called hidden layers.

The following image comes from this tutorial.

Schematic of a multilayer perceptron.

Best Answer from StackExchange

Question Source : http://cs.stackexchange.com/questions/28597
