[Commotion-discuss] [Commotion-dev] Memory Issues and Nightly Builds

Ben West ben at gowasabi.net
Wed Nov 6 15:37:46 UTC 2013


I am also seeing sporadic memory consumption issues while operating mesh
nodes running AA r38347 in WasabiNet on Nanostation Loco M2 devices.

That is, these nodes use the same ath9k wifi driver and the same underlying
OS, but without Commotion-specific tools like commotiond and servald.  I
see nodes boot up with ~26 MBytes of memory in use, and usage then gradually
increases over the next few days until individual nodes start crashing with
page allocation failures (i.e. memory exhausted).  This happens despite
having 3 MBytes of compressed swap space allocated.  When I am able to log
into a crashed node to inspect it, I occasionally find the current memory
usage to be *less* than the average observed at bootup, with ~500 KBytes
sitting in swap.
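
For anyone wanting to track this kind of growth over days, here is a minimal
sketch of how the "memory in use" and "swap in use" figures can be derived
from /proc/meminfo.  It is written in Python for readability only; the
function names are hypothetical, and counting used memory as MemTotal minus
MemFree, Buffers and Cached is an assumption about what "usage" means here.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: kilobytes}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def used_kb(info):
    """Memory in use, not counting buffers and page cache (an assumption)."""
    return (info["MemTotal"] - info["MemFree"]
            - info.get("Buffers", 0) - info.get("Cached", 0))


def swap_used_kb(info):
    """Swap currently in use; 0 if the node reports no swap fields."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


def sample(path="/proc/meminfo"):
    """Read one sample and return (used_kb, swap_used_kb)."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return used_kb(info), swap_used_kb(info)
```

Calling sample() from a periodic job and appending the result to a log
would give a per-node usage curve to compare against the time of a crash.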

This suggests that something very sporadically allocates a large chunk of
memory (multiple MBytes) without holding on to it for long, causing other
processes to crash as a consequence.  I do use the coovachilli captive
portal in WasabiNet, which could be the culprit and thus unrelated to
Commotion, but there could also be an underlying memory leak in the kernel
or wifi driver.

What are thoughts on having crashed nodes collect a debug report about
themselves when a crash condition is detected (e.g. no Internet access,
"page allocation failure" detected in syslog), and then write that report
to flash somewhere before the node gets rebooted by its frustrated user?
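
A rough sketch of what such a self-report could look like, assuming the
syslog text and the contents of /proc/meminfo have already been captured
as strings.  Python is used for readability only; on a 32 MByte node a
small shell script would probably be more realistic.  All function names,
the report layout, and the 20-line default are hypothetical, and only the
"page allocation failure" condition is covered, not the loss of Internet
access.

```python
import time

# Marker quoted in the message above; other conditions need their own checks.
CRASH_MARKER = "page allocation failure"


def find_crash_lines(log_text, marker=CRASH_MARKER):
    """Return the log lines that contain the crash marker."""
    return [line for line in log_text.splitlines() if marker in line]


def build_report(log_text, meminfo_text, context=20, now=None):
    """Return a plain-text debug report, or None if no marker is present.

    log_text     -- recent syslog output (e.g. captured from `logread`)
    meminfo_text -- contents of /proc/meminfo at detection time
    context      -- number of trailing log lines to keep (hypothetical default)
    now          -- seconds since the epoch; defaults to the current time
    """
    hits = find_crash_lines(log_text)
    if not hits:
        return None
    stamp = time.strftime("%Y-%m-%d %H:%M:%S UTC", time.gmtime(now))
    tail = log_text.splitlines()[-context:]
    sections = [
        "debug report generated " + stamp,
        "--- matching log lines (%d) ---" % len(hits),
        "\n".join(hits),
        "--- last %d log lines ---" % len(tail),
        "\n".join(tail),
        "--- /proc/meminfo ---",
        meminfo_text.strip(),
    ]
    return "\n".join(sections) + "\n"


def write_report(report, path):
    """Write the report in one go; path is a hypothetical location on flash."""
    with open(path, "w") as f:
        f.write(report)
```

Run from a periodic job, build_report() returns None on a healthy node, so
nothing is written to flash unless the marker actually shows up in the log.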

Besides that, do note that nodes with only 32 MBytes of RAM, like UBNT
Picostations, are going to have difficulty hosting local apps for many
users.  If Dan Hastings is able to use an alternate device with 64 MBytes
or more of RAM, e.g. a UBNT Rocket, Unifi, or even an indoor TP-Link router
(all of which should be able to run Commotion-OpenWRT), that may be a
viable workaround in case chasing down memory leaks becomes too ornery.



On Wed, Nov 6, 2013 at 8:54 AM, Dan Staples <
danstaples at opentechinstitute.org> wrote:

> +commotion-dev
>
> If your nodes are crashing w/ 15-20 clients, while both serval and
> commotion-splash are disabled, that is very worrisome!
>
> I propose to the Commotion dev team that we urgently need to come up
> with a way to simulate network load, so we can identify and fix the
> causes of these types of crashes. Does anyone have ideas or experiences
> with this? Perhaps we can take the technical discussion over to the
> commotion-dev list only.
>
> And just an update for you Dan, earlier this week I found and fixed a
> significant memory leak in Serval...not sure how much that will affect
> the instability we've seen, but we'll soon know with some testing. The
> fix will make its way into the nightly builds probably by the end of the
> week.
>
> As long as the rest of your network is DR1 or newer, the nightly builds
> should be compatible.
>
> Dan
>
> On 11/06/2013 04:07 AM, Dan Hastings wrote:
> > I was just checking to see if there had been any progress made on the
> > nightly builds with fixing the memory overload causing the nodes to
> > crash. To try and prevent my node from crashing I disabled serval and
> > the splash page. However, whenever I have 15 to 20 students login to a
> > local app at the start of class my node crashes instantly. I'm wondering
> > if upgrading to the latest nightly build might fix this issue. Lastly,
> > if I upgrade to the latest nightly build will it still work with the
> > other nodes that do not have the latest build or do I have to or is it
> > recommended that I upgrade all of the other nodes to latest build as
> > well?  Thanks for all the hard work.  Commotion is otherwise working
> > wonders over here in the horn.
> >
> > Dan
> >
> > _______________________________________________
> > Commotion-discuss mailing list
> > Commotion-discuss at lists.chambana.net
> > https://lists.chambana.net/mailman/listinfo/commotion-discuss
> >
>
> --
> Dan Staples
>
> Open Technology Institute
> https://commotionwireless.net
> OpenPGP key: http://disman.tl/pgp.asc
> Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
> _______________________________________________
> Commotion-dev mailing list
> Commotion-dev at lists.chambana.net
> https://lists.chambana.net/mailman/listinfo/commotion-dev
>
>


-- 
Ben West
http://gowasabi.net
ben at gowasabi.net
314-246-9434