Some embedded OSes have diagnostic tools that tell you how much stack a task has needed "so far".
So allocating "enough", running for a while, and then picking a tuned size is a reasonable way to go.
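
If your OS has no such tool, a common way to get the same number is to fill the stack with a known pattern before the task starts and later count how much of the pattern is still intact. A minimal sketch, assuming a stack that grows downward (toward lower addresses) and a byte buffer you own; the names `STACK_FILL`, `stack_paint`, `stack_unused` and `stack_peak_used` are made up for illustration and do not come from any particular RTOS:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed fill pattern; any value the task is unlikely to write works. */
#define STACK_FILL 0xA5u

/* Fill the whole stack region with the pattern before the task starts. */
void stack_paint(uint8_t *stack, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        stack[i] = STACK_FILL;
    }
}

/* Count bytes at the low end that still hold the pattern.
 * Assumes the stack grows downward, so the low end is touched last. */
size_t stack_unused(const uint8_t *stack, size_t size)
{
    size_t n = 0;
    while (n < size && stack[n] == STACK_FILL) {
        n++;
    }
    return n;
}

/* Peak usage "so far" is whatever is no longer pattern. */
size_t stack_peak_used(const uint8_t *stack, size_t size)
{
    return size - stack_unused(stack, size);
}
```

Call `stack_paint` once at task creation, then read `stack_peak_used` after the system has run through its worst cases. The result is only a lower bound on what the task can need, and it can undercount slightly if the task happens to write the fill value itself, so leave some margin when you pick the tuned size.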

Go with as much as you can spare during development, since stack overruns are pretty weird things to debug. If stacks are consecutive in memory and T1 overruns into T2, then it's usually T2 that dies, for no reason that makes any kind of sense.
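
One cheap way to make that kind of failure less mysterious is a guard word at the overflow end of every stack, checked periodically (say, from the idle task or a timer tick). The sketch below assumes downward-growing stacks, so the guard sits at the lowest address of each region; `task_stack_t`, `STACK_GUARD` and the `stack_guard_*` functions are hypothetical names, not part of any real OS API:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed guard value; anything unlikely to appear by accident. */
#define STACK_GUARD 0xDEADBEEFu

/* Hypothetical descriptor for one task's stack region. */
typedef struct {
    uint32_t *base;   /* lowest address of the region */
    size_t    words;  /* region size in 32-bit words */
} task_stack_t;

/* Place the guard at the end the stack would overflow through. */
void stack_guard_init(task_stack_t *ts)
{
    ts->base[0] = STACK_GUARD;
}

/* True while the guard word has not been overwritten. */
bool stack_guard_intact(const task_stack_t *ts)
{
    return ts->base[0] == STACK_GUARD;
}

/* Return the index of the first task whose guard is damaged, or -1. */
int stack_guard_find_overrun(const task_stack_t *tasks, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (!stack_guard_intact(&tasks[i])) {
            return (int)i;
        }
    }
    return -1;
}
```

Because a task has to write through its own guard before it reaches its neighbour, a damaged guard points at the task that overran rather than the one that crashed. It will not catch an overrun that jumps over the guard word (a large local array, for example), so treat it as a debugging aid, not a guarantee.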