[Commotion-discuss] [Commotion-dev] Memory Issues and Nightly Builds

Dan Staples danstaples at opentechinstitute.org
Wed Nov 6 16:58:01 UTC 2013


Regarding logging, I'm not sure that will work well since the nodes are
spontaneously rebooting themselves (due to OOM conditions), not the user
rebooting them. What we're going to try to do is attach a serial console
(thanks Will!) and try to slam the router with simultaneous users and
traffic.
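
For anyone who wants to try something similar, here is a rough sketch of
what "simultaneous users" could look like from a laptop on the mesh. It
is only an illustration, not the test we will actually run: the function
names, the client counts, and the target URL are all made up, and it
assumes the load is plain HTTP requests against something reachable
through the node. Each simulated client waits on a barrier so that all
of them start at the same moment, roughly like a classroom logging in at
once.

```python
import threading
import urllib.request


def http_fetch(url, timeout=5.0):
    """Fetch one URL and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()  # drain the body so the full response crosses the node
        return resp.status


def simulate_clients(url, n_clients=20, requests_each=10, fetch=http_fetch):
    """Start n_clients threads at the same instant; each one issues
    requests_each requests. Returns (successes, failures)."""
    start = threading.Barrier(n_clients)  # released once every client is ready
    results = [(0, 0)] * n_clients

    def client(index):
        ok = failed = 0
        start.wait()
        for _ in range(requests_each):
            try:
                fetch(url)
                ok += 1
            except Exception:
                # Timeouts and refused connections both count as failures.
                failed += 1
        results[index] = (ok, failed)

    threads = [threading.Thread(target=client, args=(i,)) for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(r[0] for r in results), sum(r[1] for r in results)


# Example call (the address is hypothetical):
# ok, failed = simulate_clients("http://192.168.1.20/", n_clients=20)
```

The fetch argument is there so a different kind of request can be
swapped in without touching the threading part. A sudden jump in
failures partway through a run would be the point to look at on the
serial console.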

Also, I don't think Dan is hosting local apps on the router itself
(correct me if I'm wrong), but just advertising them using the Commotion
apps portal. And that just takes a little space for the Avahi service
file...so hopefully that's not a problem.

We'll certainly report what we find with our stress testing.
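
Since the nodes reboot before anything can be saved on the router, one
option is to capture the serial console output on the attached machine
and pick it apart there. The following is a minimal sketch under that
assumption. The helper names are invented for illustration, the first
function expects the standard Linux /proc/meminfo layout ("Name: value
kB"), and the second simply looks for the "page allocation failure" text
Ben mentioned.

```python
import re

# Matches lines such as "MemFree:            1540 kB". Fields without a
# kB unit (e.g. HugePages_Total) are deliberately skipped.
MEMINFO_LINE = re.compile(r"^([\w()]+):[ \t]+(\d+)[ \t]*kB", re.MULTILINE)


def parse_meminfo(text):
    """Return {field name: kilobytes} from /proc/meminfo-style text."""
    return {name: int(kb) for name, kb in MEMINFO_LINE.findall(text)}


def find_alloc_failures(log_lines):
    """Return the captured log lines that mention a page allocation failure."""
    return [ln.rstrip("\n") for ln in log_lines if "page allocation failure" in ln]
```

Running "cat /proc/meminfo" on the console at intervals during a test
and feeding each dump to parse_meminfo would give a MemFree series to
compare against the time of the first allocation failure.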

Dan

On 11/06/2013 10:37 AM, Ben West wrote:
> I am also seeing sporadic memory consumption issues operating mesh nodes
> running AA r38347 in WasabiNet on Nanostation Loco M2.
> 
> That is, using the same ath9k wifi driver and same underlying OS, but
> without the Commotion-specific tools like commotiond and servald.  I
> will see nodes boot up with ~26Mbytes memory usage and then gradually
> increase over the next few days until sporadic nodes start crashing with
> page allocation failures (aka memory exhausted).  This all is happening
> despite having 3Mbytes of compressed swap space allocated.  When I am
> able to log into crashed nodes to inspect, I will occasionally find the
> current memory usage to be /less/ than the average observed on bootup,
> along with ~500Kbytes sitting in swap.
> 
> This seems to suggest something is very sporadically allocating itself a
> large chunk (multiple MBytes), but not residing in memory as such, and
> causing other processes to crash in consequence.  I do use the
> coovachilli captive portal in WasabiNet, which could be a culprit and
> thus unrelated to Commotion, but there could also be an underlying
> memory leak in the kernel or wifi driver.
> 
> What are thoughts for having crashed nodes try to collect a debug report
> about themselves when a crash condition is detected (e.g. no Internet
> access, "page allocation failure" detected in syslog), and then write
> that report to flash somewhere before the node gets rebooted by its
> frustrated user?
> 
> Besides that, do note that nodes with only 32MBytes of RAM, like UBNT
> Picostations, are going to have difficulties hosting local apps for many
> users.  If Dan Hasting would be able to use an alternate device with
> 64Mbytes+ RAM, e.g. a UBNT Rocket, Unifi, or even an indoor TP-Link
> router (all of which should be able to run Commotion-OpenWRT), that may
> be a viable workaround in case chasing down memory leaks becomes too
> ornery.
> 
> 
> 
> On Wed, Nov 6, 2013 at 8:54 AM, Dan Staples
> <danstaples at opentechinstitute.org
> <mailto:danstaples at opentechinstitute.org>> wrote:
> 
>     +commotion-dev
> 
>     If your nodes are crashing w/ 15-20 clients, while both serval and
>     commotion-splash are disabled, that is very worrisome!
> 
>     I propose to the Commotion dev team that we urgently need to come up
>     with a way to simulate network load, so we can identify and fix the
>     causes of these types of crashes. Does anyone have ideas or experiences
>     with this? Perhaps we can take the technical discussion over to the
>     commotion-dev list only.
> 
>     And just an update for you Dan, earlier this week I found and fixed a
>     significant memory leak in Serval...not sure how much that will affect
>     the instability we've seen, but we'll soon know with some testing. The
>     fix will make its way into the nightly builds probably by the end of the
>     week.
> 
>     As long as the rest of your network is DR1 or newer, the nightly builds
>     should be compatible.
> 
>     Dan
> 
>     On 11/06/2013 04:07 AM, Dan Hastings wrote:
>     > I was just checking to see if there had been any progress made on the
>     > nightly builds with fixing the memory overload causing the nodes to
>     > crash. To try and prevent my node from crashing I disabled serval and
>     > the splash page. However, whenever I have 15 to 20 students login to a
>     > local app at the start of class my node crashes instantly. I'm wondering
>     > if upgrading to the latest nightly build might fix this issue. Lastly,
>     > if I upgrade to the latest nightly build will it still work with the
>     > other nodes that do not have the latest build, or do I have to (or is it
>     > recommended that I) upgrade all of the other nodes to the latest build as
>     > well?  Thanks for all the hard work.  Commotion is otherwise working
>     > wonders over here in the horn.
>     >
>     > Dan
>     >
>     > _______________________________________________
>     > Commotion-discuss mailing list
>     > Commotion-discuss at lists.chambana.net
>     <mailto:Commotion-discuss at lists.chambana.net>
>     > https://lists.chambana.net/mailman/listinfo/commotion-discuss
>     >
> 
>     --
>     Dan Staples
> 
>     Open Technology Institute
>     https://commotionwireless.net
>     OpenPGP key: http://disman.tl/pgp.asc
>     Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
>     _______________________________________________
>     Commotion-dev mailing list
>     Commotion-dev at lists.chambana.net
>     <mailto:Commotion-dev at lists.chambana.net>
>     https://lists.chambana.net/mailman/listinfo/commotion-dev
> 
> 
> 
> 
> -- 
> Ben West
> http://gowasabi.net
> ben at gowasabi.net <mailto:ben at gowasabi.net>
> 314-246-9434

-- 
Dan Staples

Open Technology Institute
https://commotionwireless.net
OpenPGP key: http://disman.tl/pgp.asc
Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9

