When selecting keywords, I would use a simple statistical analysis of the terms, words and phrases used in spam versus non-spam. Any word or phrase whose differentiation between the two classes exceeds some arbitrary threshold, defined by you, can be used as an input to the network. For example, words like 'natural', 'male' and 'enhancement' will have a high correlation with spam, while words like 'the' will be pure noise. If you then train the network on all of the word inputs, it can correlate combinations like 'the all natural male enhancement', which should trigger a high output. You should also run separate statistical analyses for the subject and for the body, plus one using both.

The final network should not be trained until it is perfect; use early stopping. Specifically, stop as soon as the training set yields a firm differentiation between the classes. For each example, run the network in feed-forward mode and check the output. Then see if the outputs for spam all fall above or below the outputs for non-spam. If they do not, retrain using the examples from each class that overlap, then recheck. Repeat until the network properly differentiates all your examples, and keep the threshold at which it makes this choice.

Choosing an arbitrary threshold like 0.5 will tend to warp the hyperplane in ways that, while mathematically feasible, are difficult to attain on real-world hardware, which has finite precision. Trying to force the most optimal hyperplane can, and often does, cause the training to disregard solutions that are less bound to the parameters of perfection but are nonetheless correct in their differentiation.

Do not forget to add new examples to the training set and to retrain as often as possible. Usually the point at which a new example is added is the perfect time to retrain, as most people are more than happy to let their computer spend lots of time learning to ignore the most recent piece of spam to get through.
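
To make the keyword selection step concrete, here is a rough sketch of one way it could be done. It is not the code used in MARGO. The names DocumentFrequency and SelectKeywords are made up for illustration, and the "differentiation" measure is assumed to be the absolute gap between the fraction of spam messages and the fraction of non-spam messages that contain a word. Any other statistic could be swapped in.

```cpp
#include <cmath>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: fraction of messages in `docs` that contain each word.
// Words are split on whitespace only; real mail would need better tokenizing.
std::map<std::string, double> DocumentFrequency(const std::vector<std::string>& docs) {
    std::map<std::string, double> freq;
    if (docs.empty()) return freq;
    for (const std::string& doc : docs) {
        std::istringstream in(doc);
        std::set<std::string> seen;  // count each word once per message
        std::string word;
        while (in >> word) seen.insert(word);
        for (const std::string& w : seen) freq[w] += 1.0;
    }
    for (auto& kv : freq) kv.second /= static_cast<double>(docs.size());
    return freq;
}

// Hypothetical selector: keep every word whose spam/non-spam frequency gap
// is greater than `threshold` (the arbitrary setting you choose yourself).
std::vector<std::string> SelectKeywords(const std::vector<std::string>& spam,
                                        const std::vector<std::string>& ham,
                                        double threshold) {
    const std::map<std::string, double> s = DocumentFrequency(spam);
    const std::map<std::string, double> h = DocumentFrequency(ham);
    std::set<std::string> vocab;
    for (const auto& kv : s) vocab.insert(kv.first);
    for (const auto& kv : h) vocab.insert(kv.first);
    std::vector<std::string> keywords;
    for (const std::string& w : vocab) {
        const double fs = s.count(w) ? s.at(w) : 0.0;
        const double fh = h.count(w) ? h.at(w) : 0.0;
        if (std::fabs(fs - fh) > threshold) keywords.push_back(w);
    }
    return keywords;
}
```

A word like 'the' appears in nearly every message of both classes, so its gap is close to zero and it is dropped. To follow the advice above, you would call SelectKeywords three times: once on the subject lines, once on the bodies, and once on both together.
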
Here is some code, which I used to build MARGO. This is just one of the feedforward functions.
Code:
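void CNode::FastFeedForward(double* Input, double* Output){
    // Accumulate the weighted sum of the inputs in Temp[0].
    this->Temp[0] = 0.0;
    // this is the pure C/C++ implementation
    for(DWORD x = 0; x < this->NumInputs; x++){
        this->Temp[0] += (Input[x] * (this->Weights[x]));
    }
    // sin(atan(s)) equals s / sqrt(1 + s*s), which squashes the sum into (-1, 1).
    *Output = sin(atan(this->Temp[0]));
    return;
}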
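
For the early-stopping check described earlier, something along these lines could work. Treat it as a sketch under assumptions, not the MARGO code. FeedForward is a standalone stand-in for the node above, Separation and CheckSeparation are hypothetical names, spam is assumed to score higher than non-spam, and the threshold is assumed to sit midway between the lowest spam output and the highest non-spam output.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Standalone stand-in for a single node: weighted sum, then sin(atan(sum)).
double FeedForward(const std::vector<double>& input, const std::vector<double>& weights) {
    double sum = 0.0;
    for (std::size_t i = 0; i < input.size() && i < weights.size(); ++i) {
        sum += input[i] * weights[i];
    }
    return std::sin(std::atan(sum));
}

// Result of the check: whether the classes are cleanly split, and where.
struct Separation {
    bool separated;    // true if every spam output is above every non-spam output
    double threshold;  // midpoint between the two closest outputs (an assumption)
};

// Hypothetical check to run after each training pass. If `separated` is false,
// the spam examples scoring at or below the highest non-spam output, and the
// non-spam examples scoring at or above the lowest spam output, are the
// overlapping ones to retrain on.
Separation CheckSeparation(const std::vector<double>& spamOut,
                           const std::vector<double>& hamOut) {
    if (spamOut.empty() || hamOut.empty()) return {false, 0.0};
    const double minSpam = *std::min_element(spamOut.begin(), spamOut.end());
    const double maxHam = *std::max_element(hamOut.begin(), hamOut.end());
    return {minSpam > maxHam, 0.5 * (minSpam + maxHam)};
}
```

Training would stop the first time CheckSeparation reports separated as true, and the returned threshold is the one to keep, rather than forcing a fixed value like 0.5.
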