Thread: why doesn't my XOR network work?

  1. #1
    Registered User yann
    Join Date
    Sep 2009
    Location
    Zagreb, Croatia
    Posts
    186

    Why doesn't my XOR network work?

    Hi, I made a network that should learn the XOR function. It has 3 layers (input, output, and one hidden) and 5 neurons. Here is the code:

    Code:
    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    
    int i;
    float weight[100];
    bool percept[100];
    int input[100];
    float tres[100];
    int successive_right = 0;
    int total_go_round = 1;
    const float learning_rate = 0.1f;
    
    int educate() {
      
            input[0] = 1; //bias, always 1
            input[3] = 1;
            input[4] = 1;
            input[5] = 1;
            input[6] = 1;
            printf("inputs:\n");
            scanf("%d", &input[1]);
            scanf("%d", &input[2]);
            int goal;
            printf("goal:\n");
            scanf("%d", &goal);
       	percept[0] =  (weight[0]*input[0]+ weight[1]*input[1]+weight[2]*input[2] > tres[0]);
    	percept[1] =  (weight[3]*input[3]+weight[4]*input[1]+weight[5]*input[2] > tres[1]);
     	percept[2] =  (weight[6]*input[4]+weight[7]*percept[0]+weight[8]*percept[1] > tres[2]);
       	percept[3] =  (weight[9]*input[5]+weight[10]*percept[0]+weight[11]*percept[1] > tres[3]);
        	percept[4] =  (weight[12]*input[6]+weight[13]*percept[2]+weight[14]*percept[3] > tres[4]);
            if (percept[4] == goal) {
                successive_right++;
            } else {
                successive_right = 0;
                int sign = goal ? 1 : -1; //sign of (y-f(x))
                weight[0] += learning_rate*sign*input[0];
                weight[1] += learning_rate*sign*input[1];
                weight[2] += learning_rate*sign*input[2];
                weight[3] += learning_rate*sign*input[3];
                weight[4] += learning_rate*sign*input[1];
                weight[5] += learning_rate*sign*input[2];
                weight[6] += learning_rate*sign*input[4];
                weight[7] += learning_rate*sign*percept[0];
                weight[8] += learning_rate*sign*percept[1];
                weight[9] += learning_rate*sign*input[5];
                weight[10] += learning_rate*sign*percept[0];
                weight[11] += learning_rate*sign*percept[1];
                weight[12] += learning_rate*sign*input[6];
                weight[13] += learning_rate*sign*percept[2];
                weight[14] += learning_rate*sign*percept[3];
               for(i=0;i<15;i++)
               {
               printf("%f ", weight[i]);
               }     
          }
        printf("\n");
        printf("output:\n");
        printf("%d\n", percept[4]);
        return 0;
    }
    
    
    int main(){
       tres[0] = 0.5;
       tres[1] = 0.5;
       tres[2] = 0.5;
       tres[3] = 0.5;
       tres[4] = 0.5;
       srand(time(NULL));
       total_go_round=1;
       while(total_go_round <= 1500){
          educate();
          total_go_round++;
    }
    return 0;
    }
    So, I don't know why it doesn't work. It has no errors, but it can't learn XOR... maybe I'm not teaching it well. Help? Please?
    Arduino rocks!

  2. #2
    spurious conceit MK27
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Well yann, part of the reason I tidied up that last piece of code was because I was hoping to communicate something to you: you will only be hampered in your efforts if you cannot use the C language well. I think you are overly excited about solving your problem, and so you fail to give enough attention to learning how to code. Unfortunately, you will have to learn to code before you learn to implement complex algorithms.

    This seems to be a pretty "classic" illustration of the consequence, because it looks to me like you are trying to write a multi-layer perceptron network that can learn XOR when in fact you have not bothered to learn how to do an XOR yourself!

    Code:
            printf("inputs:\n");
            scanf("%d", &input[1]);
            scanf("%d", &input[2]);
            int goal;
            printf("goal:\n");
            scanf("%d", &goal);
    Now, I presume you are piping in a file of values here and not sitting there entering numbers 4500 times! But why do you even need to do that? Here is a version of the target() function that performs a XOR: *

    Code:
    bool target(int y, int z) {
    	if ((y && !z) || (z && !y)) return 1;
    	return 0;
    }
    Does that make sense? If you had understood the use of logical operators, a basic element of the language, this would have been easy for you to do. Instead, presumably, you have wasted a lot of time on some kind of work around. Please correct me if I am wrong about my assumption.
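For comparison, here is a more compact sketch of the same function. It assumes that any nonzero int should count as true; the name target_compact is hypothetical and not from the thread.

```c
#include <stdbool.h>

/* XOR of two ints treated as booleans: true when exactly one is nonzero.
   target_compact is a hypothetical name used only for this sketch. */
bool target_compact(int y, int z) {
	return (y != 0) != (z != 0);
}
```

Normalising each argument with != 0 first means inputs such as 5 and 0 still behave like true and false, which a bare y != z would not guarantee.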

    So here's another illustration of how to exploit the "power" of basic C syntax. It is basic and will not be hard for you to learn if you take the time to try:
    Code:
    int x, value;
    for (i=0; i<15; i++) {
    	if ((i == 0) || (i == 7) || (i == 10)) x = 0;
    	else if ((i == 1) || (i == 4) || (i == 8) || (i == 11)) x = 1;
    	else if ((i == 2) || (i == 5) || (i == 13)) x = 2;
    	else if ((i == 3) || (i == 14)) x = 3;
    	else if (i == 6) x = 4;
    	else if (i == 9) x = 5;
    	else x = 6; /* i == 12 */
    	/* input[] is int and percept[] is bool, so read the value instead of aliasing both through one int pointer */
    	if ((i == 9) || (i==12) || (i<7)) value = input[x];
    	else value = percept[x];
    	weight[i] += learning_rate*sign*value;
    	printf("%f ", weight[i]);
    }
    [corrected]

    This replaces the series of assignments beginning on line 38. Now, there is not a big difference, but notice
    1. it takes fewer lines
    2. it is comparable efficiency-wise
    3. it is more logically organized


    The more you expand this list, the more organizing it this way will help.

    Enough "not a big difference" type things will add up to a very big difference as the code you are working on gets longer and more complex. Readability and logic are important. Right now, you are relying on your own memory: you understand the logic of the code because you wrote it. But once you get up to a few hundred or thousand lines of code, you will not be able to do that as easily. So by writing in a concise, well-organized way, you will save yourself time and headaches.

    * ps I did try that with the single perceptron, it truly cannot learn XOR.
    Last edited by MK27; 09-27-2009 at 12:13 PM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  3. #3
    Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Your network will never learn XOR. It is mathematically impossible. We've been through this already.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  4. #4
    Registered User yann
    Join Date
    Sep 2009
    Location
    Zagreb, Croatia
    Posts
    186
    Quote Originally Posted by brewbuck View Post
    Your network will never learn XOR. It is mathematically impossible. We've been through this already.
    Yes, I know. I built a new one that can, but it sometimes gets it wrong... it uses backpropagation...

    Code:
    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    
    float error[100];
    int i;
    float weight[100];
    bool percept[100];
    int input[100];
    float tres[100];
    int successive_right = 0;
    int total_go_round = 1;
    const float learning_rate = 0.1f;
    const float learning_rate2 = 0.25f;
    const float learning_rate3 = 0.1f;
    
    bool target(int y, int z) {
    	if ((y && !z) || (z && !y)) return 1;
    	return 0;
    }
    
    
    int educate() {
      
    	input[0] = 1; //bias, always 1
            input[3] = 1;
            input[4] = 1;
            input[5] = 1;
            input[6] = 1;
            input[1] = rand() % 2;
            input[2] = rand() % 2;
            bool goal = target(input[1], input[2]);
       	percept[0] =  (weight[0]*input[0]+ weight[1]*input[1]+weight[2]*input[2] > tres[0]);
    	percept[1] =  (weight[3]*input[3]+weight[4]*input[1]+weight[5]*input[2] > tres[1]);
     	percept[2] =  (weight[6]*input[4]+weight[7]*percept[0]+weight[8]*percept[1] > tres[2]);
       	percept[3] =  (weight[9]*input[5]+weight[10]*percept[0]+weight[11]*percept[1] > tres[3]);
        	percept[4] =  (weight[12]*input[6]+weight[13]*percept[2]+weight[14]*percept[3] > tres[4]);
            if (percept[4] == goal) {
            	successive_right++;
            }
            else {
                	successive_right = 0;
                	error[0] = goal - percept[4];
                	error[1] = (error[0]*weight[14]);
           	     	error[2] = (error[0]*weight[13]);
                    error[3] = (error[1]*weight[1])+(error[2]*weight[2]);
                	error[4] = (error[1]*weight[4])+(error[2]*weight[5]);
                              
                	weight[0]  += learning_rate*error[4]*input[0];//input
                	weight[1]  += learning_rate*error[4]*input[1];//input
                	weight[2]  += learning_rate*error[4]*input[2];//input
                	weight[3]  += learning_rate*error[4]*input[3];//input
                	weight[4]  += learning_rate*error[3]*percept[1];//input
                	weight[5]  += learning_rate*error[3]*percept[0];//input
                	weight[6]  += learning_rate2*error[3]*input[4];//hiden
                	weight[7]  += learning_rate2*error[2]*percept[0];//hiden
                	weight[8]  += learning_rate2*error[2]*percept[1];//hiden
                	weight[9]  += learning_rate2*error[2]*input[5];//hiden
                	weight[10] += learning_rate2*error[1]*percept[0];//hiden
                	weight[11] += learning_rate2*error[1]*percept[1];//hiden
                	weight[12] += learning_rate3*error[1]*input[6];//output
                	weight[13] += learning_rate3*error[0]*percept[4];//output
                	weight[14] += learning_rate3*error[0]*percept[3];//output
    
    }
    	return 0;
    }
    
    
    int main(){
       	tres[0] = 0;
       	tres[1] = 0;
       	tres[2] = 0;
       	tres[3] = 0;
       	tres[4] = 0;
       	srand(time(NULL));
       	for(i=0;i<15;i++){
       		weight[i]= 0.5;
       	}    
       	total_go_round=1;
       	while(total_go_round <= 5000){
            	educate();
          		total_go_round++;
    	}
            printf("inputs:\n");
            scanf("%d", &input[1]);
            scanf("%d", &input[2]);
       	percept[0] =  (weight[0]*input[0]+ weight[1]*input[1]+weight[2]*input[2] > tres[0]);
    	percept[1] =  (weight[3]*input[3]+weight[4]*input[1]+weight[5]*input[2] > tres[1]);
     	percept[2] =  (weight[6]*input[4]+weight[7]*percept[0]+weight[8]*percept[1] > tres[2]);
       	percept[3] =  (weight[9]*input[5]+weight[10]*percept[0]+weight[11]*percept[1] > tres[3]);
        	percept[4] =  (weight[12]*input[6]+weight[13]*percept[2]+weight[14]*percept[3] > tres[4]);
            printf("%d\n", percept[4]);   	
    	return 0;
    }
    Now, this network will answer 0 for inputs 1 and 1, and 1 for inputs 0 and 1, but it will sometimes get it wrong. Do you know why?

    MK27, before I start following your advice, I think I should just finish what I started.

  5. #5
    spurious conceit MK27
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by yann View Post
    MK27, before I start following your advice, I think I should just finish what I started.
    Sure, if you can. But it will probably take you longer that way. Honestly. I have to deal with this issue. Everyone does. You know what happens? You write some code and realize two months later how "stupid" it was, i.e., that it would have been much easier if you had had a better grasp of the essentials first. Meaning the code you wrote is now next to useless.

    But no one wants to just sit and do "hello world" exercises out of a book all day. Try and strike a balance -- take some interest in the language. Most learning comes while you are trying to accomplish a goal. So slow down, and remember to learn as much as you can while you are at it, rather than being blinded by your need to "get it done". In the end, it will probably

    1. take less time
    2. teach you more
    3. be done better


    What you are doing now is like being given the world's most incredible calculator without understanding most of the buttons. So to calculate 4^5, you go:

    4x4x4x4x4

    Yes, it will work, but there is a pow() button. If you are doing a long, complicated exercise with a lot of exponents, you are going to waste a huge amount of time. How 'bout 123^12314? I guess it will be a huge accomplishment just to get that finished! Here's a simple rule: if you think there is a function or a method that might help, write a short, separate program to explore that method and save it. Even if it turns out to be not useful to you now, it probably will be later. The same is true if you see a method used that you do not understand or have not used before. Experiment. If you want to program, you might as well try and enjoy programming...

    Also, you will get more respect from people (who could give you help) if you demonstrate some level of proficiency with the syntax, which right now you are not doing. You're a smart person, yann; try and behave like one.
    Last edited by MK27; 09-27-2009 at 02:10 PM.

  6. #6
    Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by yann View Post
    Yes, I know. I built a new one that can, but it sometimes gets it wrong... it uses backpropagation...
    It "sometimes gets it wrong" because you are trying to do the impossible. Look, try it on paper to convince yourself.

    Use a piece of graph paper. At coordinates (0,0) and (1,1), draw two green dots. At coordinates (0,1) and (1,0) draw two red dots. Now, try to draw a straight line such that the green dots are both on one side of the line, and the red dots are both on the other side of the line.

    It's not possible. That's why a linear machine can never learn the XOR function. In order to learn this function, the machine must learn a curve, not a straight line.
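The graph-paper argument can also be checked mechanically. The following is a minimal sketch, assuming a coarse grid of candidate weights; the grid bounds, the step of 0.25, and the name count_linear_fits are my own choices rather than anything from the thread. It counts how many single threshold units reproduce a given two-input truth table. A grid search like this only illustrates the point; it is not a proof.

```c
#include <stdbool.h>

/* Count the (w1, w2, bias) triples on a coarse grid for which the single
   threshold unit  w1*x1 + w2*x2 + bias > 0  reproduces a 4-row truth table.
   Rows are ordered (0,0), (0,1), (1,0), (1,1).
   The grid (-4.0 .. 4.0 in steps of 0.25) is an arbitrary choice. */
int count_linear_fits(const bool truth[4]) {
	static const int x1[4] = {0, 0, 1, 1};
	static const int x2[4] = {0, 1, 0, 1};
	int fits = 0;
	for (int a = -16; a <= 16; a++)
		for (int b = -16; b <= 16; b++)
			for (int c = -16; c <= 16; c++) {
				double w1 = a * 0.25, w2 = b * 0.25, bias = c * 0.25;
				bool ok = true;
				for (int k = 0; k < 4; k++) {
					bool out = (w1 * x1[k] + w2 * x2[k] + bias) > 0.0;
					if (out != truth[k]) ok = false;
				}
				if (ok) fits++;
			}
	return fits;
}
```

Called with the XOR table {0, 1, 1, 0} it finds no fit at all, while the AND table {0, 0, 0, 1} is matched by many weight combinations (for example w1 = 1, w2 = 1, bias = -1.5).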

  7. #7
    Malum in se abachler
    Join Date
    Apr 2007
    Posts
    3,195
    You have to apply a sigmoid to the output before it is passed on to the next neuron, or it will never learn a non-linear function like XOR. It is, as we have all tried to explain to you, impossible: not 'we don't know how to do it' impossible, but mathematically impossible. It has been proven time and time again; it is the proof that nearly destroyed the entire field of neural network research.

    This is why no one uses classical perceptrons any more. We use newer neuron models that have non-linear activation functions, which let the neuron map a curved hyperplane, not just a linear one. Using multiple layers increases the number of folds the manifold can achieve. A simple problem like XOR needs at least 2 layers. It is the classical problem used to teach the fact that simple perceptrons cannot ever learn that function. No matter how many layers you have, a 5 billion layer perceptron will NEVER learn a problem that is not linearly separable. It has been proven that a perceptron of any number of layers can be reduced to a single layer network that yields the exact same outputs.
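The reduction to a single layer can be seen directly in a small case. The sketch below assumes every unit passes on its raw weighted sum (an identity activation, with no threshold or sigmoid between layers); that assumption, the 2-2-1 shape, and the function names are mine. Under it, two stacked layers compute exactly what one layer with weights W2*W1 and bias W2*b1 + b2 computes.

```c
/* Two stacked layers with NO activation function between them (identity).
   W1 and b1 belong to the two hidden units, W2 and b2 to the output unit. */
double two_linear_layers(const double W1[2][2], const double b1[2],
                         const double W2[2], double b2,
                         const double x[2]) {
	double h[2];
	for (int j = 0; j < 2; j++)
		h[j] = W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j];
	return W2[0] * h[0] + W2[1] * h[1] + b2;
}

/* The equivalent single layer: weights w = W2*W1, bias b = W2*b1 + b2. */
double collapsed_layer(const double W1[2][2], const double b1[2],
                       const double W2[2], double b2,
                       const double x[2]) {
	double w0 = W2[0] * W1[0][0] + W2[1] * W1[1][0];
	double w1 = W2[0] * W1[0][1] + W2[1] * W1[1][1];
	double b  = W2[0] * b1[0] + W2[1] * b1[1] + b2;
	return w0 * x[0] + w1 * x[1] + b;
}
```

Both functions return the same value for any input, so the extra layer adds nothing until a non-linear function is applied to the hidden sums.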

  8. #8
    Registered User yann
    Join Date
    Sep 2009
    Location
    Zagreb, Croatia
    Posts
    186
    But this has 3 layers, which is enough for XOR, and it can do XOR, but it only gets it right 1 time in 3. I did it with backpropagation, but I think I connected something wrong...

    What I did wrong, I think, is that I don't have that "nonlinear activation". How do I achieve that?

    Could you post an example like this:
    if weight1*input1+weight2*input2+weightb*bias > threshold then out=1...?


    And thank you, people, for your effort...

  9. #9
    Malum in se abachler
    Join Date
    Apr 2007
    Posts
    3,195
    3 layers or 20, it doesn't make a difference if you don't have a non-linear activation function.

    There are a few, and it's fairly simple to apply them. After you compute the sum of products of the inputs and weights, you simply take the result and run it through f(x). f(x) could be any of the following:

    Code:
    double f(double x){
       if(x > 0.0) return 1.0;
       return 0.0;
       }
    Code:
    double f(double x){
       return sin(atan(x));
       }
    Code:
    double f(double x){
       return 1.0 / (1.0 + pow(2.71828182846 , 0.0 - x));
       }
    Personally I use the sin(atan()) method because it's faster than the logistic method, but the logistic method converges faster in my experience. So it basically comes down to when you want the network to be faster: during development or in the field. Since my stuff actually goes out the door, I choose the field and just spend extra time in development.
    Last edited by abachler; 09-28-2009 at 09:37 AM.

  10. #10
    Registered User yann
    Join Date
    Sep 2009
    Location
    Zagreb, Croatia
    Posts
    186
    OK, thank you, I will apply it to my code!

  11. #11
    Dr Dipshi++ mike_g
    Join Date
    Oct 2006
    Location
    On me hyperplane
    Posts
    1,218
    The last activation function abachler posted is the logistic sigmoid, which is what I learnt to use with the "delta rule". Converging faster means that you need shorter training times, so that may be better for initial testing. I might have to have a play around with the return sin(atan(x)) one myself if it produces better results when trained.

  12. #12
    Registered User yann
    Join Date
    Sep 2009
    Location
    Zagreb, Croatia
    Posts
    186
    Huh, the first thing you posted is just like the linear threshold function, and it still gets things wrong. With the second one, sin(atan(x)), I get these error messages:

    tmp/ccKzDcmq.o: In function `f':
    mozak.c:(.text+0x19): undefined reference to `atan'
    mozak.c:(.text+0x21): undefined reference to `sin'
    collect2: ld returned 1 exit status
    This is how I did it:
    Code:
       	
    
    double f(double x){
       return sin(atan(x));
       }
    ...
    sum[0] =  (weight[0]*input[0]+weight[1]*input[1]+weight[2]*input[2]);
    sum[1] =  (weight[3]*input[3]+weight[4]*input[1]+weight[5]*input[2]);
    sum[2] =  (weight[6]*input[4]+weight[7]*percept[0]+weight[8]*percept[1]);
    sum[3] =  (weight[9]*input[5]+weight[10]*percept[0]+weight[11]*percept[1]);
    sum[4] =  (weight[12]*input[6]+weight[13]*percept[2]+weight[14]*percept[3]);
    percept[0] = f(sum[0]);
    percept[1] = f(sum[1]);
    percept[2] = f(sum[2]);
    percept[3] = f(sum[3]);
    percept[4] = f(sum[4]);
    I guess I did something wrong...? And yes, I included <math.h>

  13. #13
    Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Just adding a nonlinear activation won't help anything. You need to adjust your learning/backprop function to account for the derivative of the activation when calculating error.

  14. #14
    Registered User yann
    Join Date
    Sep 2009
    Location
    Zagreb, Croatia
    Posts
    186
    Oh... and what should I actually do? (Code examples are welcome.) Thanks (:

  15. #15
    Registered User yann
    Join Date
    Sep 2009
    Location
    Zagreb, Croatia
    Posts
    186
    And how do I eliminate those errors?
