Thread: why doesn't my XOR network work?

  1. #16
    Dr Dipshi++ mike_g's Avatar
    Join Date
    Oct 2006
    Location
    On me hyperplane
    Posts
    1,218
    I guess I did something wrong...? And yes, I included <math.h>
If you are compiling with gcc on linux you have to add the -lm switch when you use math.h. I don't know why, but for some reason you do.

    After that you still have a lot of things that will need changing before this will work. Perhaps the first step would be to use randomly generated weights between -1.0 and +1.0.

  2. #17
    Registered User yann's Avatar
    Join Date
    Sep 2009
    Location
    Zagreb, Croatia
    Posts
    186
So, instead of gcc name.c -o name it's gcc name.c -lm name?

    why do i need randomly generated weights? OK, i will add them, but tell me those other things i need...pretty please?
    Arduino rocks!

  3. #18
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by yann View Post
So, instead of gcc name.c -o name it's gcc name.c -lm name?
    No, linker flags at the end:

    gcc name.c -o name -lm

-lm just means "link the math library". Some headers only declare functions; the compiled code for them lives in a separate pre-compiled library object that has to be linked in.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  4. #19
    Dr Dipshi++ mike_g's Avatar
    Join Date
    Oct 2006
    Location
    On me hyperplane
    Posts
    1,218
So, instead of gcc name.c -o name it's gcc name.c -lm name?
    use: gcc name.c -lm -o name
    why do i need randomly generated weights?
It's so you don't get a bias toward certain outcomes. It also means that each trained network behaves slightly differently.
    OK, i will add them, but tell me those other things i need...pretty please?
First off, learn how to use a loop. Then you could try producing a diagram of where your data is flowing through your network, with arrows and stuff. That would probably give you a better understanding of what you are doing. Then write some pseudocode listing the steps the program will take for your feed-forward and feedback phases. That's what I would do anyway.

  5. #20
    Registered User yann's Avatar
    Join Date
    Sep 2009
    Location
    Zagreb, Croatia
    Posts
    186
First off, learn how to use a loop. Then you could try producing a diagram of where your data is flowing through your network, with arrows and stuff. That would probably give you a better understanding of what you are doing.
    OK thank you...
    Arduino rocks!

  6. #21
    Registered User yann's Avatar
    Join Date
    Sep 2009
    Location
    Zagreb, Croatia
    Posts
    186
OK, this is my code. I still need to "account for the derivative of the activation when calculating error"
and I have no idea how to do that, so feel free to change my code any way you want.

    Code:
    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <math.h>
    
    float error[100];
    int i;
    float weight[100];
    double percept[100]; /* was bool, which truncated every activation to 0/1 */
    int input[100];
    float tres[100];
    double sum[100];
    int successive_right = 0;
    int total_go_round = 1;
    const float learning_rate = 0.1f;
    const float learning_rate2 = 0.25f;
    const float learning_rate3 = 0.1f;
    
    
    double f(double x){
        return sin(atan(x));
    }
    
    /*double f(double x){
        if(x > 0.0) return 1.0;
        return 0.0;
    }*/
    
    /*double f(double x){
        return 1.0 / (1.0 + exp(0.0 - x));
    }*/
    
    
    bool target(int y, int z) {
        return (y && !z) || (z && !y);
    }
    
    
    void educate(void) {
        input[0] = 1; /* bias, always 1 */
        input[3] = 1;
        input[4] = 1;
        input[5] = 1;
        input[6] = 1;
        input[1] = rand() % 2;
        input[2] = rand() % 2;
        bool goal = target(input[1], input[2]);
        /* compute each layer's activations before the layer that reads them */
        sum[0] = weight[0]*input[0] + weight[1]*input[1] + weight[2]*input[2];
        sum[1] = weight[3]*input[3] + weight[4]*input[1] + weight[5]*input[2];
        percept[0] = f(sum[0]);
        percept[1] = f(sum[1]);
        sum[2] = weight[6]*input[4] + weight[7]*percept[0] + weight[8]*percept[1];
        sum[3] = weight[9]*input[5] + weight[10]*percept[0] + weight[11]*percept[1];
        percept[2] = f(sum[2]);
        percept[3] = f(sum[3]);
        sum[4] = weight[12]*input[6] + weight[13]*percept[2] + weight[14]*percept[3];
        percept[4] = f(sum[4]);
    
        if ((percept[4] > 0.5) == goal) { /* threshold; exact == never matched */
            successive_right++;
        }
        else {
            successive_right = 0;
            error[0] = goal - percept[4];
            error[1] = error[0]*weight[14];
            error[2] = error[0]*weight[13];
            error[3] = (error[1]*weight[1]) + (error[2]*weight[2]);
            error[4] = (error[1]*weight[4]) + (error[2]*weight[5]);
    
            weight[0]  += learning_rate*error[4]*input[0];   //input
            weight[1]  += learning_rate*error[4]*input[1];   //input
            weight[2]  += learning_rate*error[4]*input[2];   //input
            weight[3]  += learning_rate*error[4]*input[3];   //input
            weight[4]  += learning_rate*error[3]*percept[1]; //input
            weight[5]  += learning_rate*error[3]*percept[0]; //input
            weight[6]  += learning_rate2*error[3]*input[4];  //hidden
            weight[7]  += learning_rate2*error[2]*percept[0];//hidden
            weight[8]  += learning_rate2*error[2]*percept[1];//hidden
            weight[9]  += learning_rate2*error[2]*input[5];  //hidden
            weight[10] += learning_rate2*error[1]*percept[0];//hidden
            weight[11] += learning_rate2*error[1]*percept[1];//hidden
            weight[12] += learning_rate3*error[1]*input[6];  //output
            weight[13] += learning_rate3*error[0]*percept[4];//output
            weight[14] += learning_rate3*error[0]*percept[3];//output
        }
    }
    
    
    int main(void){
        tres[0] = 0;
        tres[1] = 0;
        tres[2] = 0;
        tres[3] = 0;
        tres[4] = 0;
        srand(time(NULL));
        for(i = 0; i < 15; i++){ /* was i < 14, which skipped weight[14] */
            weight[i] = 0.5;
        }
        total_go_round = 1;
        while(total_go_round <= 5000){
            educate();
            total_go_round++;
        }
        printf("inputs:\n");
        scanf("%d", &input[1]);
        scanf("%d", &input[2]);
        percept[0] = (weight[0]*input[0] + weight[1]*input[1] + weight[2]*input[2] > tres[0]);
        percept[1] = (weight[3]*input[3] + weight[4]*input[1] + weight[5]*input[2] > tres[1]);
        percept[2] = (weight[6]*input[4] + weight[7]*percept[0] + weight[8]*percept[1] > tres[2]);
        percept[3] = (weight[9]*input[5] + weight[10]*percept[0] + weight[11]*percept[1] > tres[3]);
        percept[4] = (weight[12]*input[6] + weight[13]*percept[2] + weight[14]*percept[3] > tres[4]);
        printf("%d\n", (int)percept[4]);
        return 0;
    }
    Arduino rocks!

  7. #22
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    Quote Originally Posted by yann View Post
Huh, the first thing you posted is just like the linear threshold function; it still gets things wrong. In the second thing, sin(atan(x)), I get these error messages: And yes, I included <math.h>
Hmm, that's what I was going to say: include math.h, which should have sin and atan declared in it. If it doesn't, then your math.h is broken. You may also try <cmath>.

    you can also try just using tanh, which depending on the implementation might be faster, but less accurate.

    sin(atan()) doesn't produce 'better' results, it just executes faster, because sin and atan are both supported in hardware.

The derivative of a function is the function that describes the change in that function, or put another way, a function that outputs the instantaneous slope of that function at any given point. That also makes the sin(atan()) easier: the derivative of the arc tangent is 1 / (1 + x^2), the derivative of sin is cos, so the derivative of sin(atan(x)) would be
    Code:
    cos( 1.0 / (1.0 + x * x))
    or something, my calculus is a bit rusty.

The suggestion to draw out the neural network is a good one; being able to visualize what is happening helps a lot.
    Last edited by abachler; 09-28-2009 at 01:23 PM.

  8. #23
    Registered User yann's Avatar
    Join Date
    Sep 2009
    Location
    Zagreb, Croatia
    Posts
    186
    sooo...this:

    weight[14] += learning_rate3*error[0]*cos( 1.0 / (1.0 + x * x)) *percept[3];
    Right?
    Everything is fine with math now...
    Arduino rocks!

  9. #24
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    Quote Originally Posted by yann View Post
    sooo...this:
    Right?
    Everything is fine with math now...
Ummm, who gave you that feedback equation? Most perceptrons just use
    Code:
    Weight += Error * Input * Alpha;
where alpha is the learning rate, typically 0.1; this implements Hebbian learning. You want to reduce alpha for larger numbers of inputs to stabilize the learning.
    Last edited by abachler; 09-28-2009 at 01:57 PM.

  10. #25
    Registered User yann's Avatar
    Join Date
    Sep 2009
    Location
    Zagreb, Croatia
    Posts
    186
Uhhh... this is complicated... I used to have input there, I don't know where I lost it... Now, what is the right "Weight += Error * Input * Alpha;"?
Where do I put the cos( 1.0 / (1.0 + x * x)) thing?
(Please write the whole equation, I mean please, I had a very busy day at school and I hurt my head (nothing serious...) and my head hurts and my dad is yelling at me because of something and I am totally confused... sorry...)

    this is my actual code:
weight[0] += learning_rate*error[4]*input[0]; (instead of input the perceptron is like the input, they are connected...)
    Arduino rocks!

  11. #26
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    Quote Originally Posted by yann View Post
Uhhh... this is complicated... I used to have input there, I don't know where I lost it... Now, what is the right "Weight += Error * Input * Alpha;"?
Where do I put the cos( 1.0 / (1.0 + x * x)) thing?
(Please write the whole equation, I mean please, I had a very busy day at school and I hurt my head (nothing serious...) and my head hurts and my dad is yelling at me because of something and I am totally confused... sorry...)

    this is my actual code:
weight[0] += learning_rate*error[4]*input[0]; (instead of input the perceptron is like the input, they are connected...)
You don't put the cos anywhere; it's not part of the learning rule. What I gave you is the entire basic learning rule; it's all you need.

    Code:
    void CNeuron::ApplyError(double* pInputs, double* Feedback, double Error){
        double Temp;
    
        this->Bias += Error * this->Bias * this->Alpha;
        for(unsigned long x = 0; x < this->InputCount; x++){
            Temp = pInputs[x] * Error * this->Alpha;
            this->UWeights[x] += Temp + this->UAdjustments[x];
            this->UAdjustments[x] = 0.0;
            Feedback[x] += Temp;
        }
        return;
    }
    Last edited by abachler; 09-28-2009 at 02:07 PM.

  12. #27
    Registered User
    Join Date
    Sep 2009
    Posts
    63
    Quote Originally Posted by yann View Post
    OK, this is my code, I still need to "account for the derivative of the activation when calculating error"
    and i have no idea of how to do that, so be free to change my code anyway you want.
    Well, to account for the derivative, you'd have to know calculus. Such a thing is independent of whether or not you can code, so we'll probably be willing to help you if you don't know.

I'd guess this is a case where you'd use partial derivatives. I think your activation function is a function of your various weights and input values. I'm trying to remember propagation of uncertainty, to see if that would apply here. Let's see here.....

    I think it would be |∂f/∂x_0| * error(x_0) + |∂f/∂x_1| * error(x_1) + ... + |∂f/∂x_n| * error(x_n).

    Hopefully someone who knows the math better than I do will step up

  13. #28
    Registered User
    Join Date
    Sep 2009
    Posts
    63
    Quote Originally Posted by abachler View Post
    That also makes the sin(atan()) easier, the derivative of the arc tangeant is 1 / (1 + x^2) the derivative of sin is cos, so the derivative of sin(atan(x)) would be
    Code:
    cos( 1.0 / (1.0 + x * x))
    or something, my calculus is a bit rusty.
    You're right about arctan(x), and right about sin(x), but wrong about sin(atan(x)).

-(x^2/(1 + x^2)^(3/2)) + 1/sqrt(1 + x^2) is the derivative. You have to use the product rule and the chain rule.
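For what it's worth, a quicker route gets the same answer. By the chain rule (using cos(arctan x) = 1/sqrt(1 + x^2), which you can read off a right triangle with legs 1 and x):

```latex
\frac{d}{dx}\sin(\arctan x)
  = \cos(\arctan x)\cdot\frac{1}{1+x^2}
  = \frac{1}{\sqrt{1+x^2}}\cdot\frac{1}{1+x^2}
  = (1+x^2)^{-3/2}
```

and the product-rule form reduces to the same thing:

```latex
-\frac{x^2}{(1+x^2)^{3/2}} + \frac{1}{\sqrt{1+x^2}}
  = \frac{-x^2 + (1+x^2)}{(1+x^2)^{3/2}}
  = (1+x^2)^{-3/2}
```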

  14. #29
    Registered User yann's Avatar
    Join Date
    Sep 2009
    Location
    Zagreb, Croatia
    Posts
    186
    void CNeuron::ApplyError(double* pInputs, double* Feedback, double Error){
        double Temp;
    
        this->Bias += Error * this->Bias * this->Alpha;
        for(unsigned long x = 0; x < this->InputCount; x++){
            Temp = pInputs[x] * Error * this->Alpha;
            this->UWeights[x] += Temp + this->UAdjustments[x];
            this->UAdjustments[x] = 0.0;
            Feedback[x] += Temp;
        }
        return;
    }


Whuh, I am so confused right now, I don't know the names of these variables, anything....
    Last edited by yann; 09-28-2009 at 02:15 PM.
    Arduino rocks!

  15. #30
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by mike_g View Post
    Its so you don't get bias to certain outcomes. It also means that each trained network behaves slightly differently.
    Hmm.

    In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6.
    "What are you doing?", asked Minsky.
    "I am training a randomly wired neural net to play Tic-tac-toe", Sussman replied.
    "Why is the net wired randomly?", asked Minsky.
    "I do not want it to have any preconceptions of how to play", Sussman said.
    Minsky then shut his eyes.
    "Why do you close your eyes?" Sussman asked his teacher.
    "So that the room will be empty."
    At that moment, Sussman was enlightened.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}
