Programming my first ANN, but got a few unanswered questions

This is a discussion on Programming my first ANN, but got a few unanswered questions within the General AI Programming forums, part of the Cprogramming.com and AIHorizon.com's Artificial Intelligence Boards category.

  1. #1
    Registered User Swarvy
    Join Date
    Apr 2008
    Location
    United Kingdom
    Posts
    195

    Hello, I'm getting started on writing my own neural network code. So far I've only written the input/output system, so that data can be fed into it, and haven't started on the ANN itself. I've been doing quite a lot of reading about neural networks, though, and have a few questions.

    Question 1 - Can you use a neural network to produce a non-binary output, or should the output of an ANN always be either 1 or 0? If so, that's a little limited for most situations, isn't it?

    Question 2 - From the reading I've been doing, it seems that most ANNs implement only one hidden layer. Why aren't more hidden layers used in most cases?

    Question 3 - I understand the point of the training and validation sets, but how is the validation set any different from the generalization set when it comes to learning? If you use the training set to calculate the network weights and then use the validation set to test the network's accuracy, what does the generalization set do?

    Question 4 - Since the performance of a neural network at solving a given problem depends quite substantially on the structure of the network chosen, would it be possible to get the program to find its own optimum network structure? I was thinking of implementing a system with a big population of similar neural networks (probably 2000 to 5000 ANNs). After each network has learned, I would take an average of their accuracies and then allow random adjustments to the structure within a given framework (e.g. the input and output layers must keep the same structure, but additional neurons or hidden layers could be added in the middle) using an evolutionary algorithm, i.e. the neural networks with the best accuracy carry their traits on to the next generation. Would that be possible? Would it be a good idea, so that the neural network self-improves? I am concerned that it would keep adding neurons and never find an optimized structure, in which case I would have to add stopping conditions such as a maximum network size or an accuracy threshold.

    Question 5 - How many outputs can an ANN give? In all the examples I've seen, they only ever give out a single binary value. Ideally I want to get 4 or 5 outputs from my network. How feasible is that?

    Question 6 - Would it be possible to use the idea suggested in question 4 to optimize the momentum based on the data instead, i.e. add in random variation and see which values perform best?

    Needless to say if I decide to include all that evolution stuff my program is gonna be a bit of a monster to program, especially since I've never done any AI programming before, but I'm pretty good at C/C++ so should be able to cope. As the project becomes more complete I'll start making source code posts.

  2. #2
    Registered User
    Join Date
    Aug 2010
    Posts
    35
    I'm no expert, but since nobody else has replied, I'll see if I can provide some useful info.

    1. Yes, you can. A step (threshold) activation gives a binary output, which mimics the behaviour of a neuron firing at a given threshold, but it is only one of a number of standard activation functions. A sigmoid, for example, gives a continuous value between 0 and 1, and a linear output neuron can give any real value. I see no reason why other functions can't be used, although training by backpropagation needs a differentiable one. You just have to remember that using overly complex activation functions will undermine the point of the neural network, which is that the relationship between the functions and the values of the weights does the deciding, not the functions themselves.

    2. Though the number of layers can be arbitrary, there are a few reasons why more hidden layers are not usually implemented. First is the cost. For every layer i with k neurons, adding a layer i+1 with k' neurons means training another k*k' weights. Too many layers can also promote overfitting, as they may allow you to train your ANN to be extremely accurate for your training set, which may detract from its ability to generalise as its parameters become too specific. It's also often more manageable to abstract and link together multiple neural networks than simply to have one large one.

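    3. The validation set is used during training to check that overfitting is not occurring: the error on it is monitored alongside the training error, and the network is taken to be at its best when the validation error is at its minimum, i.e. just before it starts rising again. The generalization set, on the other hand, is used outside of training, for example for comparing two or more models. Because it plays no part in choosing the weights or deciding when to stop, it gives a fairer estimate of performance on unseen data.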

    4. It is certainly possible to create a neural network that can dynamically add nodes, but it is far from straightforward. I'm also not sure how useful it is, particularly given that you should already have some intelligent selection of inputs and, as I noted earlier, you should already be trying to keep the number of layers/nodes down to help avoid overfitting. I think you're much better off just designing a few different models and comparing the results after training.

    5. You can have any number of outputs. There's an example of this (and question 1) here. Generally, though, if you want to make multiple decisions, you are probably better off having a separate model for each task (unless the tasks are very closely related, as in the example). Don't get into the mindset that you are trying to build a brain. Rather, you are trying to build a model that is designed to make a single decision.

    6. Again, this is probably possible, but I'm not sure why you would want to subject yourself to something so complicated for so little reward, especially as someone with little previous experience in AI programming.
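
    To make points 1, 2 and 5 a bit more concrete, here is a minimal C++ sketch, not a definitive implementation. It assumes a plain fully connected feed-forward layer; the names `Layer`, `forward`, `sigmoid` and `countWeights`, and the weight layout, are my own choices for illustration. A sigmoid layer gives continuous values between 0 and 1, a linear layer gives unbounded real values, and a layer has one output per neuron, so 4 or 5 outputs just means 4 or 5 rows of weights. `countWeights` applies the k*k' sum from point 2 to a whole network.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Logistic sigmoid: a smooth activation whose output lies in (0, 1).
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// One fully connected layer (hypothetical layout chosen for this sketch):
// weights[j][i] links input i to neuron j, bias[j] is neuron j's bias.
struct Layer {
    std::vector<std::vector<double>> weights;
    std::vector<double> bias;
    bool linear;  // true: identity activation, false: sigmoid activation
};

// Computes the layer's outputs, one value per neuron.
std::vector<double> forward(const Layer& layer, const std::vector<double>& in) {
    assert(layer.weights.size() == layer.bias.size());
    std::vector<double> out(layer.weights.size());
    for (std::size_t j = 0; j < out.size(); ++j) {
        assert(layer.weights[j].size() == in.size());
        double sum = layer.bias[j];
        for (std::size_t i = 0; i < in.size(); ++i)
            sum += layer.weights[j][i] * in[i];
        out[j] = layer.linear ? sum : sigmoid(sum);
    }
    return out;
}

// Number of connection weights between consecutive layers (biases excluded).
std::size_t countWeights(const std::vector<std::size_t>& layerSizes) {
    std::size_t total = 0;
    for (std::size_t l = 0; l + 1 < layerSizes.size(); ++l)
        total += layerSizes[l] * layerSizes[l + 1];
    return total;
}
```

    For a network with 3 inputs, one hidden layer of 4 neurons and 5 outputs, `countWeights({3, 4, 5})` is 32. Adding a second hidden layer of 4 neurons, `countWeights({3, 4, 4, 5})`, raises that to 48, which is the extra training cost mentioned in point 2.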
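
    On point 3, the way the validation set is used during training is often called early stopping. The sketch below is one way it could look, under my own assumptions: `trainOneEpoch` and `validationError` are hypothetical callbacks that you would supply (the first updates the weights from the training set only, the second measures the error on the validation set), and `patience` is an assumed parameter for how many epochs without improvement to tolerate before giving up.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>

// Result of a training run: the epoch with the lowest validation error.
struct EarlyStopResult {
    std::size_t bestEpoch;
    double bestValidationError;
};

// Trains until the validation error has failed to improve for `patience`
// consecutive epochs, or until maxEpochs is reached.
EarlyStopResult trainWithEarlyStopping(
    const std::function<void()>& trainOneEpoch,      // hypothetical callback
    const std::function<double()>& validationError,  // hypothetical callback
    std::size_t maxEpochs, std::size_t patience) {
    assert(maxEpochs > 0);
    EarlyStopResult best{0, std::numeric_limits<double>::infinity()};
    std::size_t sinceImprovement = 0;
    for (std::size_t epoch = 1; epoch <= maxEpochs; ++epoch) {
        trainOneEpoch();
        const double err = validationError();
        if (err < best.bestValidationError) {
            best = {epoch, err};
            sinceImprovement = 0;
            // A real implementation would also save a copy of the weights here.
        } else if (++sinceImprovement >= patience) {
            break;  // validation error is no longer improving
        }
    }
    return best;
}
```

    The generalization set never appears in this loop. You would run it once at the end, on the saved best weights, to get the number you use when comparing models.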
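
    On point 6, it may help to see how small a thing momentum is. In the usual gradient descent update it is a single number that controls how much of the previous weight change is carried into the next one. The sketch below shows that update under my own naming (`momentumStep`, `velocity`, `learningRate` are illustrative, not from any library). Since there is only one value to tune, training a handful of networks with different fixed values and comparing them is much less work than evolving it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Applies one gradient descent step with momentum, in place.
// velocity holds the previous weight change for each weight and should
// start out as all zeros.
void momentumStep(std::vector<double>& weights, std::vector<double>& velocity,
                  const std::vector<double>& gradient,
                  double learningRate, double momentum) {
    assert(weights.size() == velocity.size());
    assert(weights.size() == gradient.size());
    for (std::size_t i = 0; i < weights.size(); ++i) {
        // New change = fraction of the old change minus the scaled gradient.
        velocity[i] = momentum * velocity[i] - learningRate * gradient[i];
        weights[i] += velocity[i];
    }
}
```

    With `momentum` set to 0 this reduces to plain gradient descent, so the same function can be used to compare runs with and without it.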
