I understand that because this is so intermittent, it is a very difficult issue to diagnose. Is there any way I can help you figure out what is going on? Could we get a diagnostic dump of the router when it gets into this state, or some detailed diagnostic traces, to figure out what triggers this? This is a very significant problem, and it is forcing those of us using these routers in business applications to look at alternative solutions.
Otherwise, I love your firmware. Just need to figure out a way to fix this problem.
These types of issues are often best diagnosed with a serial TTL adapter hooked to the internal serial console, but that's not something most end users would be able to do.
Another method would be to implement some active monitoring on the router, to collect regular data on memory usage and see whether a trend appears. That data has to be sent somewhere off the router so it does not get lost in a reboot.
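As a rough illustration of that kind of monitoring, here is a minimal sketch. It assumes a Python 3 interpreter is available on the router (stock firmware may not provide one, and the same idea can be done with a shell loop). The collector address 192.168.1.10, UDP port 5140, and the 60-second interval are made-up values, and the function names are only for illustration.

```python
import socket
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def format_sample(fields, now=None):
    """Build one timestamped text line from the parsed fields."""
    ts = int(time.time() if now is None else now)
    total = fields.get("MemTotal", 0)
    free = fields.get("MemFree", 0)
    return "ts=%d mem_total_kb=%d mem_free_kb=%d" % (ts, total, free)


def send_sample(message, host, port):
    """Send one sample as a UDP datagram so it survives a router reboot."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("ascii"), (host, port))


def main(host="192.168.1.10", port=5140, interval=60):
    # host, port and interval are hypothetical; adjust to your network.
    while True:
        with open("/proc/meminfo") as f:
            fields = parse_meminfo(f.read())
        send_sample(format_sample(fields), host, port)
        time.sleep(interval)
```

Calling main() starts the loop. Anything listening on that UDP port on another machine can append the lines to a file, and a steadily shrinking mem_free_kb in the hours before a lockup would point toward a leak.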
Using a remote syslog server may also help you access the last log entries before an actual crash, provided anything shows up in the system log. Some low-level errors are only visible at the kernel logging level, by running dmesg for example.
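If you don't already have a syslog server, a very small receiver running on a PC is enough for this purpose. The sketch below assumes the router sends classic "<PRI>message" syslog lines over UDP. The port 5514 and the file name router.log are arbitrary choices (the standard syslog port 514 usually needs root privileges to bind), and the function names are only for illustration.

```python
import socket


def split_priority(raw):
    """Split a syslog line '<PRI>text' into (facility, severity, text).

    PRI is facility * 8 + severity; facility 0 is the kernel.
    """
    if raw.startswith("<"):
        end = raw.find(">")
        if end > 1 and raw[1:end].isdigit():
            pri = int(raw[1:end])
            return pri // 8, pri % 8, raw[end + 1:]
    return None, None, raw  # no valid priority prefix


def receive(log_path="router.log", bind_addr="0.0.0.0", port=5514):
    # log_path and port are hypothetical defaults; adjust as needed.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    with open(log_path, "a") as log:
        while True:
            data, (src, _) = sock.recvfrom(4096)
            line = data.decode("utf-8", "replace").rstrip()
            facility, severity, text = split_priority(line)
            log.write("%s facility=%s severity=%s %s\n"
                      % (src, facility, severity, text))
            log.flush()  # write each entry out immediately
```

Run receive() on the PC and point the router's remote log server setting at that PC's address and port. The last lines in router.log before a gap are then the ones worth looking at.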