[Commotion-discuss] [Commotion-dev] Memory Issues and Nightly Builds
Ben West
me at benwest.name
Wed Nov 6 17:40:06 UTC 2013
Hi Dan,
Thanks for offering more detail, especially that you see the nodes
spontaneously reboot rather than simply having services crash.
I would again point out that the Picostations will have a finite limit for
simultaneous clients. 15 to 20 clients is quite a few, with each client
requiring a portion of the available RAM. It may be that a single
Picostation is not going to be able to sustain all of them.
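
To put a rough number on that limit, here is a back-of-the-envelope sketch.
The inputs are assumptions: 32 MB of RAM on a Picostation, and the ~26 MB
boot-time usage I see on my own Nanostation nodes, which may not match a
Commotion build. The function name is made up for illustration.

```python
def per_client_budget_kib(total_mib, baseline_mib, clients):
    """Return the RAM left per client, in KiB, after baseline usage.

    All inputs are rough estimates, not measurements from a Commotion node.
    """
    if clients <= 0:
        raise ValueError("clients must be positive")
    free_kib = (total_mib - baseline_mib) * 1024
    return free_kib / clients


# Assumed figures: 32 MiB total, ~26 MiB already in use after boot.
for n in (15, 20):
    print(n, "clients:", round(per_client_budget_kib(32, 26, n)), "KiB each")
```

With those assumed figures, that works out to roughly 410 KiB per client at
15 clients and 307 KiB at 20, and that share still has to cover everything
else the node allocates under load.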
On Wed, Nov 6, 2013 at 10:58 AM, Dan Staples <
danstaples at opentechinstitute.org> wrote:
> Regarding logging, I'm not sure that will work well since the nodes are
> spontaneously rebooting themselves (due to OOM conditions), not the user
> rebooting them. What we're going to try to do is attach a serial console
> (thanks Will!) and try to slam the router with simultaneous users and
> traffic.
>
> Also, I don't think Dan is hosting local apps on the router itself
> (correct me if I'm wrong), but just advertising them using the Commotion
> apps portal. And that just takes a little space for the Avahi service
> file...so hopefully that's not a problem.
>
> We'll certainly report what we find with our stress testing.
>
> Dan
>
> On 11/06/2013 10:37 AM, Ben West wrote:
> > I am also seeing sporadic memory consumption issues operating mesh nodes
> > running AA r38347 in WasabiNet on Nanostation Loco M2.
> >
> > That is, using the same ath9k wifi driver and same underlying OS, but
> > without the Commotion-specific tools like commotiond and servald. I
> > will see nodes boot up with ~26Mbytes memory usage and then gradually
> > increase over the next few days until sporadic nodes start crashing with
> > page allocation failures (aka memory exhausted). This all is happening
> > despite having 3Mbytes of compressed swap space allocated. When I am
> > able to log into crashed nodes to inspect, I will occasionally find the
> > current memory usage to be /less/ than the average observed on bootup,
> > along with ~500Kbytes sitting in swap.
> >
> > This seems to suggest something is very sporadically allocating itself a
> > large chunk (multiple MBytes), but not residing in memory as such, and
> > causing other processes to crash in consequence. I do use the
> > coovachilli captive portal in WasabiNet, which could be a culprit and
> > thus unrelated to Commotion, but there could also be an underlying
> > memory leak in the kernel or wifi driver.
> >
> > What are thoughts for having crashed nodes try to collect a debug report
> > about themselves when a crash condition is detected (e.g. no Internet
> > access, "page allocation failure" detected in syslog), and then write
> > that report to flash somewhere before the node gets rebooted by its
> > frustrated user?
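
A minimal sketch of that idea, to make it concrete. It is written in Python
only for readability; on a node this would more likely be a small shell
script reading the output of logread. The function names, the report layout,
and the report path in the usage note are all hypothetical.

```python
import time

# Marker string taken from the crash symptom described above.
CRASH_MARKERS = ("page allocation failure",)


def find_crash_lines(log_lines, markers=CRASH_MARKERS):
    """Return the log lines containing any crash marker (case-insensitive)."""
    return [ln for ln in log_lines if any(m in ln.lower() for m in markers)]


def build_report(log_lines, meminfo_text, now=None):
    """Return a plain-text debug report, or None when no marker was seen.

    meminfo_text is whatever memory snapshot the caller collected,
    for example the contents of /proc/meminfo.
    """
    hits = find_crash_lines(log_lines)
    if not hits:
        return None
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    parts = ["=== crash report " + stamp + " UTC ==="]
    parts.append("--- matching log lines ---")
    parts.extend(hits)
    parts.append("--- memory snapshot ---")
    parts.append(meminfo_text)
    return "\n".join(parts) + "\n"


def write_report(path, report):
    """Append the report to a file that survives a reboot."""
    with open(path, "a") as fh:
        fh.write(report)
```

A caller would feed it the current syslog lines and a memory snapshot, then
pass any non-None result to write_report() with a path on persistent storage
(for example a hypothetical /etc/crash-report.txt). Appending keeps earlier
reports, at the cost of some flash wear if the condition repeats often.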
> >
> > Besides that, do note that nodes with only 32MBytes of RAM, like UBNT
> > Picostations, are going to have difficulties hosting local apps for many
> > users. If Dan Hastings would be able to use an alternate device with
> > 64Mbytes+ RAM, e.g. a UBNT Rocket, Unifi, or even an indoor TP-Link
> > router (all of which should be able to run Commotion-OpenWRT), that may
> > be a viable workaround in case chasing down memory leaks becomes too
> > ornery.
> >
> >
> >
> > On Wed, Nov 6, 2013 at 8:54 AM, Dan Staples
> > <danstaples at opentechinstitute.org> wrote:
> >
> > +commotion-dev
> >
> > If your nodes are crashing w/ 15-20 clients, while both serval and
> > commotion-splash are disabled, that is very worrisome!
> >
> > I propose to the Commotion dev team that we urgently need to come up
> > with a way to simulate network load, so we can identify and fix the
> > causes of these types of crashes. Does anyone have ideas or experiences
> > with this? Perhaps we can take the technical discussion over to the
> > commotion-dev list only.
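
One possible starting point for simulating load, offered as a sketch rather
than a proper test harness: release N HTTP requests at the same instant,
roughly like a class logging in together. It only exercises whatever serves
the URL, not wifi association or per-client state in the driver, so it is at
best a partial stand-in for real clients. The function names and the address
in the usage comment are hypothetical.

```python
import threading
import urllib.request


def http_fetch(url, timeout=10):
    """Fetch url once and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def simulate_clients(n, fetch):
    """Call fetch() from n threads that are all released at the same moment.

    Returns a list of (ok, value) tuples, one per simulated client, where
    value is fetch()'s return value or the repr of the exception it raised.
    """
    barrier = threading.Barrier(n)
    results = [None] * n

    def one_client(i):
        barrier.wait()  # hold every thread until all n are ready
        try:
            results[i] = (True, fetch())
        except Exception as exc:
            results[i] = (False, repr(exc))

    threads = [threading.Thread(target=one_client, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Example use against a node (hypothetical address):
# results = simulate_clients(20, lambda: http_fetch("http://192.168.1.20/"))
# print(sum(ok for ok, _ in results), "of", len(results), "requests succeeded")
```

Passing the fetch function in as an argument keeps the concurrency part
separate from the traffic itself, so the same helper could drive a different
kind of request later.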
> >
> > And just an update for you Dan, earlier this week I found and fixed a
> > significant memory leak in Serval...not sure how much that will affect
> > the instability we've seen, but we'll soon know with some testing. The
> > fix will make its way into the nightly builds probably by the end of the
> > week.
> >
> > As long as the rest of your network is DR1 or newer, the nightly builds
> > should be compatible.
> >
> > Dan
> >
> > On 11/06/2013 04:07 AM, Dan Hastings wrote:
> > > I was just checking to see if there had been any progress made on the
> > > nightly builds with fixing the memory overload causing the nodes to
> > > crash. To try and prevent my node from crashing I disabled serval and
> > > the splash page. However, whenever I have 15 to 20 students log in to a
> > > local app at the start of class my node crashes instantly. I'm wondering
> > > if upgrading to the latest nightly build might fix this issue. Lastly,
> > > if I upgrade to the latest nightly build, will it still work with the
> > > other nodes that do not have the latest build, or do I have to (or is it
> > > recommended that I) upgrade all of the other nodes to the latest build as
> > > well? Thanks for all the hard work. Commotion is otherwise working
> > > wonders over here in the horn.
> > >
> > > Dan
> > >
> > > _______________________________________________
> > > Commotion-discuss mailing list
> > > Commotion-discuss at lists.chambana.net
> > <mailto:Commotion-discuss at lists.chambana.net>
> > > https://lists.chambana.net/mailman/listinfo/commotion-discuss
> > >
> >
> > --
> > Dan Staples
> >
> > Open Technology Institute
> > https://commotionwireless.net
> > OpenPGP key: http://disman.tl/pgp.asc
> > Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
> > _______________________________________________
> > Commotion-dev mailing list
> > Commotion-dev at lists.chambana.net
> > <mailto:Commotion-dev at lists.chambana.net>
> > https://lists.chambana.net/mailman/listinfo/commotion-dev
> >
> >
> >
> >
> > --
> > Ben West
> > http://gowasabi.net
> > ben at gowasabi.net
> > 314-246-9434
>
> --
> Dan Staples
>
> Open Technology Institute
> https://commotionwireless.net
> OpenPGP key: http://disman.tl/pgp.asc
> Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
--
Ben West
me at benwest.name