Programming my first ANN, but I have a few unanswered questions
Hello, I'm getting started on writing my own neural network code. So far I've only written the input/output system, so that data can be fed into it, and I haven't started on the ANN itself. I've been doing quite a lot of reading about neural networks, though, and have a few questions.
Question 1 - Can you use a neural network to produce a non-binary output, or should the output of an ANN always be either 1 or 0? If it always has to be 1 or 0, that's a little limiting for most situations, isn't it?
Question 2 - From the reading I've been doing, it seems that most ANNs implement only one hidden layer. Why aren't more hidden layers used in most cases?
Question 3 - I understand the point of the training set and the validation set, but how is the validation set any different from the generalization set when it comes to learning? If you feed in the training set to calculate the network weights, and then use the validation set to test the network's accuracy, what does the generalization set do?
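To make the question concrete, this is roughly how I imagine cutting the data into the three sets. It is only a sketch: the `Sample` and `DataSplit` types, the `splitData` function and the idea of choosing the sizes by fractions are all my own assumptions, not something taken from a book or library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// One example: the input values and the outputs the network should produce.
struct Sample {
    std::vector<double> inputs;
    std::vector<double> targets;
};

// The three sets I keep reading about.
struct DataSplit {
    std::vector<Sample> training;
    std::vector<Sample> validation;
    std::vector<Sample> generalization;  // the one I'm unsure about
};

// Shuffle the data, then cut it into three parts.
// trainFrac and validFrac are guesses on my part; whatever is left over
// becomes the generalization set.
DataSplit splitData(std::vector<Sample> data, double trainFrac,
                    double validFrac, unsigned seed) {
    assert(trainFrac >= 0.0 && validFrac >= 0.0 && trainFrac + validFrac <= 1.0);

    std::mt19937 rng(seed);
    std::shuffle(data.begin(), data.end(), rng);

    const std::size_t nTrain = static_cast<std::size_t>(data.size() * trainFrac);
    const std::size_t nValid = static_cast<std::size_t>(data.size() * validFrac);

    DataSplit split;
    split.training.assign(data.begin(), data.begin() + nTrain);
    split.validation.assign(data.begin() + nTrain, data.begin() + nTrain + nValid);
    split.generalization.assign(data.begin() + nTrain + nValid, data.end());
    return split;
}
```

For example, `splitData(samples, 0.5, 0.25, 42u)` on 8 samples would give sets of 4, 2 and 2 samples.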
Question 4 - Since the performance of a neural network on a given problem depends quite substantially on the structure chosen, would it be possible to get the program to find its own optimum network structure? I was thinking of implementing a system with a big population of similar neural networks (probably 2000 to 5000 ANNs). After each network has learned, I would take an average of their accuracies and then use an evolutionary algorithm to allow random adjustments to the structure within a given framework (e.g. the input and output layers must keep the same structure, but additional neurons or hidden layers could be added in the middle), i.e. the neural networks with the best accuracy carry their traits on to the next generation. Would that be possible? Would it be a good idea, so that the neural network self-improves? I am concerned that it would just carry on adding neurons and never find an optimized structure, in which case I would have to add stopping conditions such as a maximum network size, an accuracy threshold, etc.
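Here is a minimal sketch of the loop I have in mind for question 4. Everything in it is hypothetical: `Candidate`, `mutate` and `evolve` are names I made up, and `trainAndScore` is a pure placeholder that pretends 8 hidden neurons is ideal, since I haven't written the actual network yet. The real version would build and train a network and return its measured accuracy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <random>
#include <vector>

// A candidate structure: neuron count for each hidden layer.
// The input and output layers are fixed, so they are not stored.
struct Candidate {
    std::vector<int> hiddenLayers;
    double accuracy;
};

// PLACEHOLDER, not a real network: the score peaks when the hidden
// neurons add up to 8, just so the loop has something to optimise.
double trainAndScore(const Candidate& c) {
    int total = 0;
    for (int n : c.hiddenLayers) total += n;
    return 1.0 / (1.0 + std::abs(total - 8));
}

// Randomly add a neuron, remove a neuron, or append a hidden layer,
// never exceeding the size limits (the "maximum network size" condition).
void mutate(Candidate& c, std::mt19937& rng, int maxLayers, int maxNeurons) {
    std::uniform_int_distribution<int> action(0, 2);
    std::uniform_int_distribution<std::size_t> pick(0, c.hiddenLayers.size() - 1);
    const std::size_t layer = pick(rng);
    switch (action(rng)) {
        case 0: if (c.hiddenLayers[layer] < maxNeurons) ++c.hiddenLayers[layer]; break;
        case 1: if (c.hiddenLayers[layer] > 1) --c.hiddenLayers[layer]; break;
        default:
            if (static_cast<int>(c.hiddenLayers.size()) < maxLayers)
                c.hiddenLayers.push_back(1);
            break;
    }
}

// Run the evolution until the accuracy threshold or generation limit is hit.
Candidate evolve(std::size_t populationSize, int maxGenerations,
                 double targetAccuracy, int maxLayers, int maxNeurons,
                 unsigned seed) {
    assert(populationSize >= 2);
    std::mt19937 rng(seed);

    Candidate start;
    start.hiddenLayers.push_back(1);  // begin with one hidden neuron
    start.accuracy = 0.0;
    std::vector<Candidate> population(populationSize, start);

    for (int gen = 0; gen < maxGenerations; ++gen) {
        for (Candidate& c : population) c.accuracy = trainAndScore(c);

        // Best candidates first.
        std::sort(population.begin(), population.end(),
                  [](const Candidate& a, const Candidate& b) {
                      return a.accuracy > b.accuracy;
                  });
        if (population.front().accuracy >= targetAccuracy) break;

        // Keep the better half as it is; overwrite the rest with
        // mutated copies of candidates from the better half.
        const std::size_t half = populationSize / 2;
        for (std::size_t i = half; i < populationSize; ++i) {
            population[i] = population[i - half];
            mutate(population[i], rng, maxLayers, maxNeurons);
        }
    }
    return population.front();  // best candidate from the last evaluation
}
```

A call like `evolve(20, 200, 1.0, 3, 16, 1234u)` would use a population of 20, at most 200 generations, and a cap of 3 hidden layers with 16 neurons each. With real training in place of the placeholder, each generation would obviously be far more expensive, which is part of why I'm asking whether this is a sensible idea.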
Question 5 - How many outputs can an ANN give? In all the examples I've seen, the network only ever gives out a single binary value. Ideally I want to get 4 or 5 outputs from my network. How feasible is that?
Question 6 - Would it be possible to take the idea suggested in question 4, but use it to optimize the momentum based on the data instead? That is, add in random variation to the momentum and see which values perform best?
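By "random variation" I mean something like the sketch below. The function name `mutateMomentum`, the Gaussian noise and the clamp to the range 0 to 0.99 are all assumptions of mine, not values I've seen recommended anywhere.

```cpp
#include <algorithm>
#include <random>

// Nudge a momentum value by Gaussian noise with standard deviation sigma,
// then clamp it so it stays between 0.0 and 0.99 (my own choice of limits).
double mutateMomentum(double momentum, double sigma, std::mt19937& rng) {
    std::normal_distribution<double> noise(0.0, sigma);
    return std::min(0.99, std::max(0.0, momentum + noise(rng)));
}
```

Each network in the population would get its own mutated momentum, and the values belonging to the best-performing networks would be carried forward.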
Needless to say if I decide to include all that evolution stuff my program is gonna be a bit of a monster to program, especially since I've never done any AI programming before, but I'm pretty good at C/C++ so should be able to cope. As the project becomes more complete I'll start making source code posts.