Best method for agent learning?

**Perspective** · 01-14-2007

The general idea of neural nets is that you train them with (a lot of) data, then you apply the trained system to the problem. You seem to be measuring your results based on the training phase. An untrained neural net performs (as you might expect) randomly.

**Perspective** · 01-14-2007

(I can't edit my post.)

I don't think your simulation runs long enough to both train the net and see meaningful behaviour from it in the same run.

**Perspective** · 01-14-2007

Also, what is the topology of your net and how did you choose it? Are you using linear or non-linear neurons? What do the input and output nodes represent?

There are a lot of theory/design issues to tackle here to use neural nets productively.

**Crazy Glue** · 01-14-2007

If I want to, I can train the agents as long as I need to. All I'd have to do is kill off the carnivores or keep them from successfully eating, then set the energy loss for each turn to 0.
As for the neural net, the inputs are whether a plant/carnivore/herbivore is to the left/right/front/proximity of an agent, with each combination getting an input cell. The outputs are turn left, turn right, move forward, or eat. I'm not sure what the difference between linear and non-linear neurons is, but I just have the inputs with weights connecting to each of the output cells, with no hidden cells.
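
A minimal sketch of the net described above, to make the layout concrete. It assumes one 0/1 input cell per (creature, position) pair, plain weighted sums for the four outputs, and picking the action with the largest output. The names `encode`, `make_weights`, `forward` and `choose_action` are made up for illustration and are not taken from the simulator.

```python
import random

ENTITIES = ["plant", "carnivore", "herbivore"]
POSITIONS = ["left", "right", "front", "proximity"]
ACTIONS = ["turn_left", "turn_right", "move_forward", "eat"]

def encode(seen):
    """Turn a set of (entity, position) pairs into a 12-element 0/1 input vector."""
    return [1.0 if (e, p) in seen else 0.0 for e in ENTITIES for p in POSITIONS]

def make_weights(rng, scale=0.1):
    """One weight per (action, input) pair, with small random starting values."""
    n_inputs = len(ENTITIES) * len(POSITIONS)
    return [[rng.uniform(-scale, scale) for _ in range(n_inputs)] for _ in ACTIONS]

def forward(weights, inputs):
    """Each output is a plain weighted sum of the inputs (no hidden cells)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def choose_action(weights, inputs):
    """Assumed policy: take the action whose output is largest."""
    outputs = forward(weights, inputs)
    return ACTIONS[outputs.index(max(outputs))]

rng = random.Random(0)
weights = make_weights(rng)
inputs = encode({("plant", "front"), ("carnivore", "left")})
action = choose_action(weights, inputs)
```

Because each output is just a weighted sum, this is a linear net: a non-linear neuron would pass that sum through a squashing function such as a sigmoid before using it.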

**Happy_Reaper** · 01-15-2007

Ok, but the data you accumulate from one experiment needs to carry on to the next one.

**Crazy Glue** · 01-15-2007

Yeah, I can do that. My simulator can save the agents' neural nets and then reload them if needed. But that doesn't solve the problem of the really big/low neural net weights.

**Happy_Reaper** · 01-15-2007

Ok, then decrease the amount by which you change your weights, but increase the number of experiments.
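
A rough illustration of that trade-off, using a single weight rather than a whole net. The noisy reward signal, the step sizes and the helper names `run` and `spread` are all assumptions chosen for the example. A large step makes the weight chase every noisy sample, while a small step needs many more updates but settles near the true value.

```python
import random

def run(step_size, n_steps, seed=0):
    """Nudge one weight toward a noisy reward signal and record its path."""
    rng = random.Random(seed)
    w = 0.0
    history = []
    for _ in range(n_steps):
        reward = 1.0 + rng.uniform(-1.0, 1.0)  # true value 1.0, plus noise
        w += step_size * (reward - w)          # move a fraction of the way there
        history.append(w)
    return history

def spread(values):
    """How far the weight wanders over a stretch of updates."""
    return max(values) - min(values)

big = run(step_size=0.9, n_steps=200)      # large changes, few experiments
small = run(step_size=0.01, n_steps=2000)  # small changes, many experiments

big_spread = spread(big[-100:])
small_spread = spread(small[-100:])
print(f"large step wanders by {big_spread:.2f}, small step by {small_spread:.2f}")
```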

**Perspective** · 01-15-2007

>>>
The outputs are turn left, turn right, move forward, or eat. I'm not sure what the difference between linear and non-linear neurons is, but I just have the inputs with weights connecting to each of the output cells, with no hidden cells.
<<<

Ok, so this is a single-layer linear net. The best way to train it is with a supervised learning procedure (i.e. test data and the "solutions" to the test cases), e.g. herbathing left, whatchamacolit right, food over there => some output. The correct answer is <some other output>. The difference between the two forms the error derivative, which you propagate.
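
For a net with no hidden cells, that supervised procedure reduces to the delta rule: each weight moves by the learning rate times the output error times its input. The sketch below assumes a cut-down net with three inputs and four outputs, and the two training cases are invented for illustration.

```python
def forward(weights, inputs):
    """Linear outputs: one weighted sum per output cell."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def delta_rule_step(weights, inputs, target, rate=0.1):
    """One supervised update: shift each weight by rate * error * input."""
    outputs = forward(weights, inputs)
    errors = [t - y for t, y in zip(target, outputs)]
    for row, err in zip(weights, errors):
        for j, x in enumerate(inputs):
            row[j] += rate * err * x
    return sum(e * e for e in errors)  # squared error measured before the update

# Hypothetical inputs:  [plant ahead, carnivore ahead, herbivore ahead]
# Hypothetical outputs: [turn_left, turn_right, move_forward, eat]
examples = [
    ([1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]),  # plant ahead -> eat
    ([0.0, 1.0, 0.0], [1.0, 0.0, 0.0, 0.0]),  # carnivore ahead -> turn left
]

weights = [[0.0] * 3 for _ in range(4)]
losses = []
for _ in range(100):
    losses.append(sum(delta_rule_step(weights, x, t) for x, t in examples))

plant_outputs = forward(weights, [1.0, 0.0, 0.0])
carnivore_outputs = forward(weights, [0.0, 1.0, 0.0])
```

After the loop, the largest output for "plant ahead" is the eat cell and the largest for "carnivore ahead" is the turn-left cell, and the summed squared error in `losses` shrinks from pass to pass.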

What you're describing now sounds like unsupervised learning; neural nets have never been particularly well suited to that. There are too many parameters to tune, and the whole system is sensitive to them.

**Crazy Glue** · 01-15-2007

I don't want to use any training data whatsoever though. I want agents to be able to learn on their own what's good and bad for them without any prior knowledge of the environment. If there's a better way of doing this, please let me know.

**Happy_Reaper** · 01-15-2007

To be honest, I see this as being a Search-Tree problem. There would be no learning involved, but your situation just screams it, almost.

I'd say right now, your learning is difficult because all your agents are trying to learn at once, which will lead them to random behaviour. You need at least one that will be somewhat deterministic, or else your agents will never learn anything.

**Crazy Glue** · 01-15-2007

NNNnnnnooo! Anything but decision trees! I'd rather go with trained neural nets if it comes to that. Well, I'm gonna try to figure this out on my own I guess. I wanna stick with reinforcement learning. Maybe if I proportion the error based on the values of the input cells, that'll reduce the effect. I'm disappointed though. This doesn't seem to be that complex a simulator and I thought there'd be some AI algorithm that'd do exactly what I wanted.
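
One way the "proportion the error based on the values of the input cells" idea could look is sketched below. The thread does not say which reinforcement learning variant the simulator uses, so this assumes a Q-learning style temporal-difference update on a net with no hidden cells. The function names, the reward and the tiny two-input state are made up for illustration.

```python
def q_values(weights, inputs):
    """One estimated value per action: a weighted sum of the input cells."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def td_update(weights, inputs, action, reward, next_inputs, rate=0.05, discount=0.9):
    """Q-learning style update for a single-layer net.

    Only the weights of the action that was taken change, and each one
    changes in proportion to its input cell, so inactive inputs are untouched.
    """
    current = q_values(weights, inputs)[action]
    best_next = max(q_values(weights, next_inputs))
    td_error = reward + discount * best_next - current
    for j, x in enumerate(inputs):
        weights[action][j] += rate * td_error * x
    return td_error

# Hypothetical inputs: [plant ahead, nothing ahead]; actions: 0 = move, 1 = eat
weights = [[0.0, 0.0], [0.0, 0.0]]
plant_ahead, nothing_ahead = [1.0, 0.0], [0.0, 1.0]

for _ in range(200):
    # Eating a plant is assumed to give reward 1.0, after which nothing is ahead
    td_update(weights, plant_ahead, action=1, reward=1.0, next_inputs=nothing_ahead)

eat_value = q_values(weights, plant_ahead)[1]
```

In this toy run the value of eating when a plant is ahead climbs toward the reward of 1.0, while the weights for the untouched action and the inactive input stay at zero.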

**Happy_Reaper** · 01-15-2007

Although learning is an interesting field, from my experience decision trees more correctly simulate most situations. That's largely due, I think, to the fact that most AI problems are very specific (like finding a path between two points, not running into obstacles, etc.) and that therefore learning often teaches agents things that are not necessary to know. This usually leaves them much weaker than straight decision tree makers.

**Perspective** · 01-16-2007

You can also "learn" using decision trees. You train the tree to identify the most relevant branching conditions.
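
A common way to score how relevant a branching condition is uses information gain, i.e. how much a split reduces uncertainty about the outcome. The sketch below assumes that criterion, and the observations and labels are invented for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Uncertainty (in bits) about which label a random row carries."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, feature):
    """How much splitting on `feature` reduces uncertainty about the label."""
    remainder = 0.0
    for value in set(row[feature] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[feature] == value]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder

def best_split(rows, labels):
    """Pick the condition with the highest information gain as the next branch."""
    return max(rows[0].keys(), key=lambda f: information_gain(rows, labels, f))

# Made-up observations of what a herbivore saw and what it should have done
rows = [
    {"plant_ahead": True,  "carnivore_near": False},
    {"plant_ahead": True,  "carnivore_near": True},
    {"plant_ahead": False, "carnivore_near": True},
    {"plant_ahead": False, "carnivore_near": False},
]
labels = ["eat", "flee", "flee", "wander"]

root = best_split(rows, labels)
root_gain = information_gain(rows, labels, root)
```

Here the tree would branch on `carnivore_near` first, since knowing it settles the "flee" cases outright, and would only then look at `plant_ahead`.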

**Crazy Glue** · 01-16-2007

Yeah, but I'm not really concerned about how well an agent does what it should so much as the way it learns. I want agents to have the least preprogrammed knowledge possible. If I just decided to add some new creature to the environment, I want them to be able to learn by experience how to interact with it with little, if any, modification to the agents. I want them specifically to learn by trial and error at runtime, not with any prior training, and not by just going through a decision tree and doing the best action it finds.

**Crazy Glue** · 01-16-2007

I just started comparing the data from the TD agents to the plain evolving agents. It actually looks like whatever I did works. When I plot the herbivores' max age vs. time, it appears that the TD agents learn to stay alive much longer. At the end of 2000 turns, the TD herbivores' max age was 1230 and the evolving herbivores' was 709.