Best method for agent learning?

This is a discussion on Best method for agent learning? within the General AI Programming forums, part of the Cprogramming.com and AIHorizon.com's Artificial Intelligence Boards category; The general idea of neural nets is that you train them with (a lot of) data, then you apply the ...

  1. #16
    Crazy Fool Perspective's Avatar
    Join Date
    Jan 2003
    Location
    Canada
    Posts
    2,640
    The general idea of neural nets is that you train them with (a lot of) data, then you apply the trained system to the problem. You seem to be measuring your results based on the training phase. An untrained neural net performs (as you might expect) randomly.
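To make the train-then-apply split concrete, here is a minimal sketch (a toy perceptron on a made-up task, not the poster's simulator): the weights only change during the training loop, and evaluation happens afterwards on the frozen net.

```python
# Toy supervised task: output 1 if either input is set (logical OR).
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

w = [0.0, 0.0]
bias = 0.0
lr = 0.1

def predict(x):
    s = w[0] * x[0] + w[1] * x[1] + bias
    return 1 if s > 0 else 0

# Training phase: repeatedly adjust weights on the training data...
for epoch in range(20):
    for x, target in data:
        err = target - predict(x)
        w[0] += lr * err * x[0]
        w[1] += lr * err * x[1]
        bias += lr * err

# ...then a separate evaluation phase on the now-fixed weights.
accuracy = sum(predict(x) == t for x, t in data) / len(data)
print(accuracy)
```

Measuring `accuracy` inside the training loop instead would mostly show the early random behaviour, which is the point being made above.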

  2. #17
    Crazy Fool Perspective's Avatar
    Join Date
    Jan 2003
    Location
    Canada
    Posts
    2,640
(I can't edit my post.)

    I don't think your simulation runs long enough to both train the net and see meaningful behaviour of it in the same run.

  3. #18
    Crazy Fool Perspective's Avatar
    Join Date
    Jan 2003
    Location
    Canada
    Posts
    2,640
    Also, what is the topology of your net and how did you choose it? Are you using linear or non-linear neurons? What are the input and output nodes representing?

    There are a lot of theory/design issues to tackle here to use neural nets productively.

  4. #19
    Registered User
    Join Date
    Oct 2005
    Posts
    27
    If I want to, I can train the agents as long as I need to. All I'd have to do is kill off the carnivores or keep them from successfully eating, then set the energy loss for each turn to 0.
    As for the neural net, the inputs are whether a plant/carnivore/herbivore is to the left/right/front/proximity of an agent, with each combination getting an input cell. The outputs are turn left, turn right, move forward, or eat. I'm not sure what the difference between linear and non-linear neurons is, but I just have the inputs with weights connecting to each of the output cells, with no hidden cells.
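That description of the net could be sketched roughly like this (the names, encoding, and weight values are my guesses for illustration, not the poster's actual code): one input cell per (object type, position) pair, one weight from each input to each output, and the highest-scoring output wins.

```python
# Hypothetical encoding, following the post: one input cell per
# (object type, relative position) combination, four output cells.
OBJECTS = ["plant", "carnivore", "herbivore"]
POSITIONS = ["left", "right", "front", "proximity"]
ACTIONS = ["turn_left", "turn_right", "move_forward", "eat"]

INPUTS = [(o, p) for o in OBJECTS for p in POSITIONS]  # 12 input cells

# Single-layer linear net: one weight per (input cell, output cell) pair.
weights = {(i, a): 0.0 for i in INPUTS for a in ACTIONS}

def choose_action(active_inputs):
    """Sum the weights of the active input cells into each output, pick the max."""
    scores = {a: sum(weights[(i, a)] for i in active_inputs) for a in ACTIONS}
    return max(scores, key=scores.get)

# Example: a plant directly in front, nothing else sensed.
weights[(("plant", "front"), "move_forward")] = 1.0
print(choose_action([("plant", "front")]))
```

With no hidden layer, each output is just a weighted sum of the inputs, which is exactly the "single layer linear net" named later in the thread.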

  5. #20
    Fear the Reaper...
    Join Date
    Aug 2005
    Location
    Toronto, Ontario, Canada
    Posts
    625
    Ok, but the data you accumulate from one experiment needs to carry on to the next one.
    Teacher: "You connect with Internet Explorer, but what is your browser? You know, Yahoo, Webcrawler...?" It's great to see the educational system moving in the right direction

  6. #21
    Registered User
    Join Date
    Oct 2005
    Posts
    27
    Yeah, I can do that. My simulator can save the agents' neural nets and then reload them if needed. But that doesn't solve the problem of the really large/small neural net weights.

  7. #22
    Fear the Reaper...
    Join Date
    Aug 2005
    Location
    Toronto, Ontario, Canada
    Posts
    625
    Ok, then decrease the amount by which you change your weights, but increase the number of experiments.

  8. #23
    Crazy Fool Perspective's Avatar
    Join Date
    Jan 2003
    Location
    Canada
    Posts
    2,640
    >>>
    The outputs are turn left, turn right, move forward, or eat. I'm not sure what the difference between linear and non-linear neurons is, but I just have the inputs with weights connecting to each of the output cells, with no hidden cells.
    <<<

    Ok, so this is a single-layer linear net. The best way to train it is with a supervised learning procedure (i.e. test data and the "solutions" to the test cases), e.g. herbivore left, whatchamacallit right, food over there => some output. Correct answer is <some other output>. The difference between the two forms the error derivative, which you propagate.

    What you're describing now sounds like unsupervised learning; neural nets have never been particularly well suited to that. Too many parameters to tune, which the whole system is sensitive to.
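A supervised update for a single-layer linear net, of the kind described above, is usually the delta rule: each weight moves in proportion to the error at its output and the activity on its input. This is a generic sketch with made-up sizes and targets, not the poster's setup.

```python
def train_case(weights, inputs, targets, lr=0.05):
    """weights[i][j] connects input i to output j; one delta-rule step."""
    n_out = len(weights[0])
    outputs = [sum(weights[i][j] * inputs[i] for i in range(len(inputs)))
               for j in range(n_out)]
    errors = [targets[j] - outputs[j] for j in range(n_out)]
    # Gradient step: weight change = lr * output error * input activity.
    for i in range(len(inputs)):
        for j in range(n_out):
            weights[i][j] += lr * errors[j] * inputs[i]
    return errors

w = [[0.0, 0.0], [0.0, 0.0]]  # 2 inputs, 2 outputs
for _ in range(200):
    train_case(w, [1.0, 0.0], [1.0, 0.0])  # repeat one labelled case
final_err = train_case(w, [1.0, 0.0], [1.0, 0.0])
print(final_err)  # errors shrink toward zero
```

Note that scaling the update by the input's activity is exactly the "proportion the error based on the values of the input cells" idea that comes up later in the thread.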

  9. #24
    Registered User
    Join Date
    Oct 2005
    Posts
    27
    I don't want to use any training data whatsoever, though. I want agents to be able to learn on their own what's good and bad for them without any prior knowledge of the environment. If there's a better way of doing this, please let me know.

  10. #25
    Fear the Reaper...
    Join Date
    Aug 2005
    Location
    Toronto, Ontario, Canada
    Posts
    625
    To be honest, I see this as being a search-tree problem. There would be no learning involved, but your situation just screams it, almost.

    I'd say right now your learning is difficult because all your agents are trying to learn at once, which will lead them to random behaviour. You need at least one that will be somewhat deterministic, or else your agents will never learn anything.

  11. #26
    Registered User
    Join Date
    Oct 2005
    Posts
    27
    NNNnnnnooo! Anything but decision trees! I'd rather go with trained neural nets if it comes to that. Well, I'm gonna try to figure this out on my own, I guess. I wanna stick with reinforcement learning. Maybe if I proportion the error based on the values of the input cells, that'll reduce the effect. I'm disappointed, though. This doesn't seem to be that complex a simulator, and I thought there'd be some AI algorithm that'd do exactly what I wanted.

  12. #27
    Fear the Reaper...
    Join Date
    Aug 2005
    Location
    Toronto, Ontario, Canada
    Posts
    625
    Although learning is an interesting field, from my experience decision trees more correctly simulate most situations. That's largely due, I think, to the fact that most AI problems are very specific (like finding a path between two points, not running into obstacles, etc...) and that therefore learning often teaches them things that are not necessary to know. This leads them to usually be much weaker than straight decision tree makers.

  13. #28
    Crazy Fool Perspective's Avatar
    Join Date
    Jan 2003
    Location
    Canada
    Posts
    2,640
    You can also "learn" using decision trees. You train the tree to identify the most relevant branching conditions.
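"Identifying the most relevant branching condition" is typically done by picking the attribute with the highest information gain, as in ID3. A toy sketch (the attribute names and data are invented for illustration):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def best_attribute(rows, labels, attrs):
    """Return the attribute whose split reduces entropy the most."""
    base = entropy(labels)
    def gain(a):
        split = {}
        for row, lab in zip(rows, labels):
            split.setdefault(row[a], []).append(lab)
        rem = sum(len(ls) / len(labels) * entropy(ls) for ls in split.values())
        return base - rem
    return max(attrs, key=gain)

# "sees_plant" perfectly predicts the action, "facing_left" is noise.
rows = [{"sees_plant": 1, "facing_left": 0},
        {"sees_plant": 1, "facing_left": 1},
        {"sees_plant": 0, "facing_left": 0},
        {"sees_plant": 0, "facing_left": 1}]
labels = ["eat", "eat", "move", "move"]
print(best_attribute(rows, labels, ["sees_plant", "facing_left"]))
```

The tree builder applies this recursively: branch on the winning attribute, then repeat on each subset.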

  14. #29
    Registered User
    Join Date
    Oct 2005
    Posts
    27
    Yeah, but I'm not really concerned about how well an agent does what it should so much as the way it learns. I want agents to have the least possible preprogrammed knowledge. If I just decided to add some new creature to the environment, I want them to be able to learn by experience how to interact with it, with little, if any, modification to the agents. I want them specifically to learn by trial and error at runtime, not with any prior training, and not by just going through a decision tree and doing the best action they find.

  15. #30
    Registered User
    Join Date
    Oct 2005
    Posts
    27
    I just started comparing the data from the TD agents to the plain evolving agents. It actually looks like whatever I did works. When I plot the herbivores' max age vs. time, it appears that the TD agents learn to stay alive much longer. At the end of 2000 turns, the TD herbivores' max age was 1230 and the evolving herbivores' was 709.
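For readers landing here: "TD" is temporal-difference learning, which fits the no-training-data requirement above because values are learned online from reward alone. A minimal TD(0)-style sketch (the states and rewards are made up; this is not the poster's simulator):

```python
def td_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """Move V[s] toward the TD target r + gamma * V[s_next]."""
    V[s] += alpha * (r + gamma * V[s_next] - V[s])

V = {"near_plant": 0.0, "ate_plant": 0.0, "empty": 0.0}
for _ in range(500):
    td_update(V, "ate_plant", 1.0, "empty")       # eating yields reward...
    td_update(V, "near_plant", 0.0, "ate_plant")  # ...and value flows backward

print(V["ate_plant"] > V["near_plant"] > V["empty"])
```

The key property is that the reward for eating propagates backward to the states that lead to eating, with no labelled training examples anywhere.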
