C Board  

Old 01-05-2007, 09:45 AM   #1
Registered User
 
Join Date: Oct 2005
Posts: 27
Best method for agent learning?

I've made an artificial life simulator where agents begin with simple, randomly generated neural networks but develop strategies and complex behaviors through natural selection. However, each agent keeps the behavior it was born with until it dies and cannot change it. I was hoping to somehow modify the neural networks, use reinforcement algorithms, or do something so that they could adapt as they lived. The inputs right now are whether an herbivore/carnivore/plant is to their left/right/front/proximity, and their health status. Energy is gained by eating the type of food they're supposed to. Right now there is no way for herbivores to judge whether carnivores are dangerous, since they will just be eaten, but I guess I could change it so that an herbivore has a certain probability of surviving. What would be the best way to implement adaptive learning for this kind of simulator?
Crazy Glue is offline   Reply With Quote
Old 01-06-2007, 08:26 AM   #2
Fear the Reaper...
 
Join Date: Aug 2005
Location: Toronto, Ontario, Canada
Posts: 625
What exactly is it that you would like the agents to learn?

Because as of now there doesn't seem to be that much of a need for AI. Are you trying to create some sort of uber animal?
__________________
Teacher: "You connect with Internet Explorer, but what is your browser? You know, Yahoo, Webcrawler...?" It's great to see the educational system moving in the right direction
Happy_Reaper is offline   Reply With Quote
Old 01-06-2007, 11:39 PM   #3
Registered User
 
Join Date: Oct 2005
Posts: 27
lol this is just my senior project that I'm working on for high school. I just want to add some method for agents to learn how to act on their own as they live.
Crazy Glue is offline   Reply With Quote
Old 01-07-2007, 12:46 AM   #4
Fear the Reaper...
 
Join Date: Aug 2005
Location: Toronto, Ontario, Canada
Posts: 625
How strong do you want it to be?

And how much time are you willing to allocate to this thing?

For something like a school project you could do a half-decent agent just by making it prefer moves away from its predators and towards its food. And this can be done very quickly.
Happy_Reaper is offline   Reply With Quote
Old 01-07-2007, 11:31 AM   #5
Registered User
 
Join Date: Oct 2005
Posts: 27
Nah, this needs to be actual AI learning, not just probability actions. By the way, I'm sending this to MIT too, and I care more about them being impressed than my school. I have until the end of January to finish this thing for MIT. Since I now have study hall every day at school, I should have enough time to add some sort of AI learning. The basic simulator itself I finished in two weeks over the summer.
Crazy Glue is offline   Reply With Quote
Old 01-07-2007, 12:56 PM   #6
Fear the Reaper...
 
Join Date: Aug 2005
Location: Toronto, Ontario, Canada
Posts: 625
In that case, real learning usually starts with some sort of utility functions for actions, based on a certain number of factors. The learning, then, is tweaking the weights as you go along.
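
To make that concrete, here is a minimal sketch of a utility function whose weights get tweaked from experience. The factor names, the starting weights, the step size and the "health change" outcome are all assumptions made up for illustration, not anything from the simulator in this thread.

```python
# Factor names are hypothetical; the thread only describes the general idea.
FACTORS = ("food_nearby", "predator_nearby")

def utility(weights, factors):
    """Weighted sum of the factor values for one candidate action."""
    return sum(weights[name] * factors[name] for name in FACTORS)

def choose_action(weights, candidates):
    """Pick the action whose factor values score highest."""
    return max(candidates, key=lambda action: utility(weights, candidates[action]))

def tweak_weights(weights, factors, observed, step=0.1):
    """Nudge each weight so the utility moves toward the observed outcome."""
    error = observed - utility(weights, factors)
    return {name: weights[name] + step * error * factors[name] for name in FACTORS}

# Assumed scenario: food is straight ahead, but so is a carnivore.
weights = {"food_nearby": 1.0, "predator_nearby": 0.0}
candidates = {
    "forward": {"food_nearby": 1.0, "predator_nearby": 1.0},
    "turn": {"food_nearby": 0.0, "predator_nearby": 0.0},
}
first_choice = choose_action(weights, candidates)  # the agent initially walks forward

# Each time it walks forward here it loses health (assumed outcome of -1.0).
for _ in range(10):
    weights = tweak_weights(weights, candidates["forward"], observed=-1.0)

later_choice = choose_action(weights, candidates)  # now it prefers to turn
```

After a few bad experiences the weight on the predator factor goes negative while the food weight stays positive, so the agent starts turning away without anyone hard-coding that rule.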
Happy_Reaper is offline   Reply With Quote
Old 01-07-2007, 04:59 PM   #7
Registered User
 
Join Date: Oct 2005
Posts: 27
I don't see how that would work though for any agent action besides eating. How would an herbivore, for example, know that turning when a carnivore is in front of it would be a good thing? From what I've been reading, maybe reinforcement learning might be a better idea than neural networks. What do you think?
Crazy Glue is offline   Reply With Quote
Old 01-07-2007, 09:29 PM   #8
Crazy Fool
 
Join Date: Jan 2003
Location: Canada
Posts: 2,588
>Nah, this needs to be actual AI learning, not just probability actions

Boy, are you gonna be disappointed when you take your first AI class. Have a look at Bayesian Nets: it's just a bunch of probabilistic decisions where the "learning" updates the probabilities. Or Neural Nets, where training data defines the probabilities of activation of the nodes (neurons).
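
As a rough illustration of "learning updates the probabilities", here is a sketch of a single conditional probability table estimated from counts, which is the kind of thing one node in a Bayesian network holds. The class name, the observation and outcome labels, and the add-one smoothing are assumptions for the example, not part of any particular library.

```python
from collections import defaultdict

class ConditionalTable:
    """Estimates P(outcome | observation) from counts, with add-one smoothing."""

    def __init__(self, outcomes):
        self.outcomes = tuple(outcomes)
        # Every row starts with a count of 1 per outcome (uniform prior).
        self.counts = defaultdict(lambda: {o: 1 for o in self.outcomes})

    def observe(self, observation, outcome):
        """Record one experience: this outcome followed this observation."""
        self.counts[observation][outcome] += 1

    def probability(self, observation, outcome):
        row = self.counts[observation]
        return row[outcome] / sum(row.values())

table = ConditionalTable(["hurt", "safe"])
prior = table.probability("carnivore_ahead", "hurt")      # no data yet

for _ in range(8):                                         # eight bad encounters
    table.observe("carnivore_ahead", "hurt")

posterior = table.probability("carnivore_ahead", "hurt")  # counts are now 9 vs 1
```

The "decision" is still just a probability lookup; the learning is nothing more than the counts changing as the agent lives.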
Perspective is offline   Reply With Quote
Old 01-08-2007, 07:16 AM   #9
Fear the Reaper...
 
Join Date: Aug 2005
Location: Toronto, Ontario, Canada
Posts: 625
Quote:
How would an herbivore, for example, know that turning when a carnivore is in front of it would be a good thing?
You could just add an extra factor that represents the "distance to closest predator"; the greater that distance, the better.

And even if you do neural nets, as Perspective said, you're going to be doing essentially the same thing.

Quote:
Boy, are you gonna be disappointed when you take your first AI class
I'd agree with this one. I was quite disappointed when I discovered that modern AI is so ridiculously simplistic.

Last edited by Happy_Reaper; 01-08-2007 at 07:22 AM.
Happy_Reaper is offline   Reply With Quote
Old 01-13-2007, 07:03 PM   #10
Registered User
 
Join Date: Oct 2005
Posts: 27
OK, I looked for a while, and I tried using temporal difference learning with neural networks, kind of like in TD-Gammon. Here's the algorithm I used:

1. The agent acts based on which output cell has the highest value.
2. Store the agent's inputs when it acted.
3. Set the reward equal to the difference in the agent's health from before it acted to its current health.
4. Re-perceive the new state of the agent.
5. Store the new output cell with the highest value.
6. Error = reward + learningRate * (new value of highest output cell) - (value of output cell from the agent's previous action)
7. Find the weights tied from any non-zero inputs of the agent when it acted to the output cell of its action, and add the error to each of those weights.

Does this sound right? The agents that do this don't really seem to be doing any better than the ones that would just evolve, maybe a bit worse even.
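
For comparison, the steps above can be sketched with the usual textbook form of the update. In that form the number multiplying the new output value is a discount factor, and a separate, small learning rate scales how much of the error is added to each weight. The sketch assumes a single layer of linear output cells (one per action), and the names `GAMMA`, `ALPHA`, `outputs` and `td_update` are made up for illustration. Because it uses the highest new output as the target, it is really a Q-learning style TD update rather than exactly what TD-Gammon did.

```python
GAMMA = 0.9   # discount factor (assumed value)
ALPHA = 0.1   # learning rate / step size (assumed value)

def outputs(weights, inputs):
    """One linear output cell per action: sum of weights[action][i] * inputs[i]."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def td_update(weights, inputs, action, reward, next_inputs):
    """One TD step for the action just taken. Modifies weights in place
    and returns the TD error."""
    old_value = outputs(weights, inputs)[action]
    best_next = max(outputs(weights, next_inputs))
    error = reward + GAMMA * best_next - old_value
    # Only weights fed by non-zero inputs change, scaled by ALPHA and the input.
    for i, x in enumerate(inputs):
        weights[action][i] += ALPHA * error * x
    return error

# Two inputs, two actions, all weights zero to start (assumed toy setup).
weights = [[0.0, 0.0], [0.0, 0.0]]
# Action 0 taken with input 0 active earns a reward of 1; nothing is sensed afterwards.
for _ in range(200):
    td_update(weights, [1.0, 0.0], 0, 1.0, [0.0, 0.0])
```

In this toy run the weight from input 0 to action 0 settles near 1.0, the reward that action reliably earns, instead of growing without bound.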
Crazy Glue is offline   Reply With Quote
Old 01-13-2007, 09:17 PM   #11
Fear the Reaper...
 
Join Date: Aug 2005
Location: Toronto, Ontario, Canada
Posts: 625
That sounds about right to me. TD-learning doesn't necessarily guarantee good results. Also, at the outset, TD-learning won't beat your evolution thing either. Think of how TD-Gammon got so good: it started out terrible, but by playing for a long time it got very good.

It could be that you're not carrying over the data from previous experiments to subsequent ones.
Happy_Reaper is offline   Reply With Quote
Old 01-13-2007, 10:19 PM   #12
Registered User
 
Join Date: Oct 2005
Posts: 27
I think I'm screwing up with how the weights are adjusted. I looked at some of the agent weights in the simulator and the numbers were pretty huge. I could see how this would happen, since if the error keeps getting added to the weights, the state values increase as well, and so everything just keeps increasing or decreasing like crazy. Is there a better way to apportion the error so that the weights will stay within a reasonable range?
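
One possible cause, sketched below under made-up numbers: if the full error is added to every weight with a non-zero input, the output value changes by (number of active inputs) x error in a single step, which can overshoot and oscillate with growing size. Scaling the change by a small step size keeps it stable. The function name `run`, the five active inputs, the reward of 1 and the discount of 0.5 are all assumptions for the demonstration.

```python
def run(step, steps=50, n_inputs=5, reward=1.0, discount=0.5):
    """Repeat the same transition (same state before and after) and return
    the final output value. step=1.0 mimics 'add the error to each weight'."""
    weights = [0.0] * n_inputs
    inputs = [1.0] * n_inputs          # every input is active
    for _ in range(steps):
        value = sum(w * x for w, x in zip(weights, inputs))
        error = reward + discount * value - value
        weights = [w + step * error * x for w, x in zip(weights, inputs)]
    return sum(w * x for w, x in zip(weights, inputs))

raw = run(step=1.0)      # full error added to each weight: value blows up
scaled = run(step=0.1)   # error scaled down first: value settles

# The value this loop should settle on is reward / (1 - discount) = 2.0.
```

With these numbers the scaled run ends very close to 2.0, while the raw run swings between huge positive and negative values, which looks a lot like weights that are "really high or really low".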
Crazy Glue is offline   Reply With Quote
Old 01-14-2007, 08:04 AM   #13
Fear the Reaper...
 
Join Date: Aug 2005
Location: Toronto, Ontario, Canada
Posts: 625
Well, shouldn't you also have negative rewards?
Happy_Reaper is offline   Reply With Quote
Old 01-14-2007, 11:18 AM   #14
Crazy Fool
 
Join Date: Jan 2003
Location: Canada
Posts: 2,588
How are you training the neural net? Are you just plugging in default values at the beginning of each simulation?
Perspective is offline   Reply With Quote
Old 01-14-2007, 11:28 AM   #15
Registered User
 
Join Date: Oct 2005
Posts: 27
The neural nets aren't trained; they just begin with random values. And yes, there are negative rewards. The weights turn out to be really high or really low.
Crazy Glue is offline   Reply With Quote