What exactly is the point of the bias input in a perceptron?
A perceptron is essentially an equation for a hyperplane in the N-dimensional input space. E.g. for two inputs, it describes a line; for three, it describes a plane. After learning, the hyperplane partitions the input space into two regions.
The bias input is needed to create a zeroth-order term, a term which doesn't vary with any inputs. E.g. one form of the generic line equation could be written as Y = MX + B. B is a constant (zeroth order) term.
Without a bias input, a perceptron couldn't describe an arbitrary hyperplane, because every such hyperplane would pass through the origin. E.g. no matter what the W vector was, if the input vector was all zeros, the weighted sum would be zero, unless you have the bias.
The output of a perceptron is determined by the dot product of the input vector with its associated weight vector plus the bias.
I*W + b > 0   ->  output 1
I*W + b <= 0  ->  output 0
So, as Cat was saying, it is essentially the constant "shift" in the plane which divides space between "on" and "off" for the perceptron.
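The decision rule above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the thread; the function name is made up.

```python
# Minimal sketch of the perceptron decision rule: fire (output 1) when
# the dot product of inputs and weights plus the bias is positive.

def perceptron_output(inputs, weights, bias):
    activation = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 if activation > 0 else 0

# Using the hyperplane 2X + 4Y + 8 = 0 as the dividing line:
print(perceptron_output([1, 1], [2, 4], 8))      # 2 + 4 + 8 = 14 > 0  -> 1
print(perceptron_output([-10, -10], [2, 4], 8))  # -20 - 40 + 8 < 0   -> 0
```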
But, according to AI Horizon, the bias has a weight itself. Doesn't that mean that the actual value of the bias is irrelevant, as long as it is non-zero, as the weight would compensate?
There are 2 ways to add a bias:
1) Addition after the dot product is taken
2) As a weight (most common method)
For example, say X and Y are variables, and your perceptron implements:
2X + 4Y + 8 > 0
You could do this as:
I = [X Y], W = [2 4], B = 8
but then you have to account for B separately everywhere, and since B is updated just like the entries of W, it's more convenient to make it the last entry in the W-vector, like this:
I = [X Y 1], W = [2 4 8].
In this case, you add another input to the perceptron and fix that input's value at 1. Then the weight associated with this input acts as the bias. This is the more common method of implementing a perceptron; it reduces complexity by treating the bias like any other weight.
The bias doesn't have a weight, it *IS* a weight in this model.
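The equivalence of the two formulations can be checked directly. Again a hedged sketch; the function names are illustrative.

```python
# Two ways to implement the bias, as described above.

def output_with_bias(inputs, weights, bias):
    # Method 1: add the bias after taking the dot product.
    s = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 if s > 0 else 0

def output_augmented(inputs, weights):
    # Method 2: the bias is the last weight; the inputs must have a
    # constant 1 appended so that weight always contributes b * 1 = b.
    s = sum(i * w for i, w in zip(inputs, weights))
    return 1 if s > 0 else 0

# I = [X Y], W = [2 4], B = 8  versus  I = [X Y 1], W = [2 4 8]:
for point in [(0, 0), (3, -5), (-2, -1)]:
    a = output_with_bias(list(point), [2, 4], 8)
    b = output_augmented(list(point) + [1], [2, 4, 8])
    assert a == b  # the two formulations always agree
```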
So, if I use the second model, appending the bias to the inputs, the actual value of it is meaningless, as long as it is constant?
Edit:
Also, does it matter what the threshold value is as long as it remains constant?
1) The bias is not the last input, the bias is the last WEIGHT. Technically, you could append any nonzero constant to the input vector, but tradition says use 1.
Quote:
Originally posted by XSquared
So, if I use the second model, appending the bias to the inputs, the actual value of it is meaningless, as long as it is constant?
Edit:
Also, does it matter what the threshold value is as long as it remains constant?
2) Technically, it could be anything, but 0 is the normal threshold.
I read (and originally posted the link to) the AI Horizon article, and I've read this thread, but it isn't enough for me to really understand what is going on with a Perceptron. Perhaps it's a bit over my head right now, or maybe I just haven't read enough... but has anyone here implemented a program that uses perceptrons for something? The simpler the better, obviously, when trying to learn, but anything will help. So... anyone?
Yes, but it was far from simple... I did an MLP (multilayer perceptron) program once for an advanced course in neural networks.
Dunno how much source I have left, that was years ago under a pretty bad (non-ANSI) compiler.
The biggest thing is to get a library for vector/matrix math.
www.generation5.org has a small example I believe, as well as a decent introductory article.
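For a simple, self-contained starting point, here is a minimal perceptron learning sketch along the lines discussed in this thread: the bias is treated as the last weight, the threshold is 0, and the classic perceptron update rule is used to learn logical AND. The function names, the learning rate, and the epoch count are illustrative choices, not from the thread.

```python
# Train a single perceptron on the AND function using the standard
# perceptron update rule: w <- w + rate * (target - output) * input.

def train_perceptron(samples, epochs=20, rate=0.1):
    # Bias handled as the last weight: inputs get a constant 1 appended.
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for inputs, target in samples:
            x = list(inputs) + [1]                 # append bias input
            s = sum(i * w for i, w in zip(x, weights))
            out = 1 if s > 0 else 0
            error = target - out
            weights = [w + rate * error * i for w, i in zip(weights, x)]
    return weights

and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train_perceptron(and_samples)
for inputs, target in and_samples:
    x = list(inputs) + [1]
    out = 1 if sum(i * wi for i, wi in zip(x, w)) > 0 else 0
    print(inputs, "->", out)  # reproduces the AND truth table
```

AND is linearly separable, so the perceptron convergence theorem guarantees this finds a separating hyperplane; XOR, by contrast, would never converge with a single perceptron.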
>>> The biggest thing is to get a library for vector/matrix math.
I'd go with Blitz++.