[Commotion-discuss] [Commotion-dev] Memory Issues and Nightly Builds

Ryan Gerety gerety at opentechinstitute.org
Wed Nov 6 18:00:58 UTC 2013


I was under the impression that PicoStations could support roughly 30 users?
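
For a rough sense of scale, here is a back-of-the-envelope sketch in Python. It borrows figures quoted further down the thread (32 MB of RAM on a PicoStation, about 26 MB in use after boot, 3 MB of compressed swap). The 26 MB figure was reported for a different device and firmware, so every number here is an assumption rather than a measurement, and the function name is hypothetical.

```python
def per_client_budget_kb(total_ram_mb, used_at_boot_mb, swap_mb, clients):
    """Average memory (in KB) left per client, counting swap as headroom.

    All inputs are assumed values taken from the thread, not measurements.
    """
    if clients <= 0:
        raise ValueError("clients must be positive")
    # Headroom = RAM not already used at boot, plus swap, converted to KB.
    headroom_kb = (total_ram_mb - used_at_boot_mb + swap_mb) * 1024
    return headroom_kb / clients


if __name__ == "__main__":
    for n in (15, 20, 30):
        print(n, "clients:", round(per_client_budget_kb(32, 26, 3, n)), "KB each")
```

Under those assumed numbers, 30 clients would have roughly 300 KB each, swap included, which is why I am asking where the per-client RAM actually goes.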


On 11/6/2013 12:54 PM, Dan Staples wrote:
> Since Dan Hastings has seen this happen with a lot of simultaneous
> clients and with high-memory components disabled, it sounds like that is
> likely the cause. Do you know exactly where that RAM is used for each
> connecting client?
> 
> Dan, can you provide any more detailed info on exactly what was
> happening when you saw the node crash? How many simultaneous users,
> and what were they doing (viewing a webpage on the internet, or viewing
> the node's administrative web interface, etc.)?
> 
> On Wed 06 Nov 2013 12:40:06 PM EST, Ben West wrote:
>> Hi Dan,
>>
>> Thanks for offering more detail, especially that you see the nodes
>> spontaneously reboot rather than simply have services crash.
>>
>> I would again point out that the Picostations will have a finite limit
>> for simultaneous clients.  15 to 20 clients is quite a few, each
>> client requiring a portion of available RAM.  It may be that a single
>> Picostation is not going to be able to sustain all of them.
>>
>>
>>
>> On Wed, Nov 6, 2013 at 10:58 AM, Dan Staples
>> <danstaples at opentechinstitute.org> wrote:
>>
>>     Regarding logging, I'm not sure that will work well since the nodes
>>     are spontaneously rebooting themselves (due to OOM conditions), not
>>     the user rebooting them. What we're going to try to do is attach a
>>     serial console (thanks Will!) and try to slam the router with
>>     simultaneous users and traffic.
>>
>>     Also, I don't think Dan is hosting local apps on the router itself
>>     (correct me if I'm wrong), but just advertising them using the
>>     Commotion apps portal. And that just takes a little space for the
>>     Avahi service file...so hopefully that's not a problem.
>>
>>     We'll certainly report what we find with our stress testing.
>>
>>     Dan
>>
>>     On 11/06/2013 10:37 AM, Ben West wrote:
>>     > I am also seeing sporadic memory consumption issues operating mesh
>>     > nodes running AA r38347 in WasabiNet on Nanostation Loco M2.
>>     >
>>     > That is, using the same ath9k wifi driver and same underlying OS,
>>     > but without the Commotion-specific tools like commotiond and
>>     > servald.  I will see nodes boot up with ~26Mbytes memory usage and
>>     > then gradually increase over the next few days until sporadic nodes
>>     > start crashing with page allocation failures (aka memory
>>     > exhausted).  This all is happening despite having 3Mbytes of
>>     > compressed swap space allocated.  When I am able to log into
>>     > crashed nodes to inspect, I will occasionally find the current
>>     > memory usage to be /less/ than the average observed on bootup,
>>     > along with ~500Kbytes sitting in swap.
>>     >
>>     > This seems to suggest something is very sporadically allocating
>>     > itself a large chunk (multiple MBytes), but not residing in memory
>>     > as such, and causing other processes to crash in consequence.  I do
>>     > use the coovachilli captive portal in WasabiNet, which could be a
>>     > culprit and thus unrelated to Commotion, but there could also be an
>>     > underlying memory leak in the kernel or wifi driver.
>>     >
>>     > What are thoughts for having crashed nodes try to collect a debug
>>     > report about themselves when a crash condition is detected (e.g. no
>>     > Internet access, "page allocation failure" detected in syslog), and
>>     > then write that report to flash somewhere before the node gets
>>     > rebooted by its frustrated user?
>>     >
>>     > Besides that, do note that nodes with only 32MBytes of RAM, like
>>     > UBNT Picostations, are going to have difficulties hosting local
>>     > apps for many users.  If Dan Hastings would be able to use an
>>     > alternate device with 64Mbytes+ RAM, e.g. a UBNT Rocket, Unifi, or
>>     > even an indoor TP-Link router (all of which should be able to run
>>     > Commotion-OpenWRT), that may be a viable workaround in case chasing
>>     > down memory leaks becomes too ornery.
>>     >
>>     >
>>     >
>>     > On Wed, Nov 6, 2013 at 8:54 AM, Dan Staples
>>     > <danstaples at opentechinstitute.org> wrote:
>>     >
>>     >     +commotion-dev
>>     >
>>     >     If your nodes are crashing w/ 15-20 clients, while both serval
>>     >     and commotion-splash are disabled, that is very worrisome!
>>     >
>>     >     I propose to the Commotion dev team that we urgently need to
>>     >     come up with a way to simulate network load, so we can identify
>>     >     and fix the causes of these types of crashes. Does anyone have
>>     >     ideas or experiences with this? Perhaps we can take the
>>     >     technical discussion over to the commotion-dev list only.
>>     >
>>     >     And just an update for you Dan, earlier this week I found and
>>     >     fixed a significant memory leak in Serval...not sure how much
>>     >     that will affect the instability we've seen, but we'll soon
>>     >     know with some testing. The fix will make its way into the
>>     >     nightly builds probably by the end of the week.
>>     >
>>     >     As long as the rest of your network is DR1 or newer, the
>>     >     nightly builds should be compatible.
>>     >
>>     >     Dan
>>     >
>>     >     On 11/06/2013 04:07 AM, Dan Hastings wrote:
>>     >     > I was just checking to see if there had been any progress made
>>     >     > on the nightly builds with fixing the memory overload causing
>>     >     > the nodes to crash. To try and prevent my node from crashing I
>>     >     > disabled serval and the splash page. However, whenever I have
>>     >     > 15 to 20 students log in to a local app at the start of class
>>     >     > my node crashes instantly. I'm wondering if upgrading to the
>>     >     > latest nightly build might fix this issue. Lastly, if I upgrade
>>     >     > to the latest nightly build will it still work with the other
>>     >     > nodes that do not have the latest build, or do I have to (or is
>>     >     > it recommended that I) upgrade all of the other nodes to the
>>     >     > latest build as well?  Thanks for all the hard work.  Commotion
>>     >     > is otherwise working wonders over here in the horn.
>>     >     >
>>     >     > Dan
>>     >     >
>>     >     > _______________________________________________
>>     >     > Commotion-discuss mailing list
>>     >     > Commotion-discuss at lists.chambana.net
>>     >     > https://lists.chambana.net/mailman/listinfo/commotion-discuss
>>     >     >
>>     >
>>     >     --
>>     >     Dan Staples
>>     >
>>     >     Open Technology Institute
>>     >     https://commotionwireless.net
>>     >     OpenPGP key: http://disman.tl/pgp.asc
>>     >     Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
>>     >     _______________________________________________
>>     >     Commotion-dev mailing list
>>     >     Commotion-dev at lists.chambana.net
>>     >     https://lists.chambana.net/mailman/listinfo/commotion-dev
>>     >
>>     >
>>     >
>>     >
>>     > --
>>     > Ben West
>>     > http://gowasabi.net
>>     > ben at gowasabi.net
>>     > 314-246-9434
>>
>>     --
>>     Dan Staples
>>
>>     Open Technology Institute
>>     https://commotionwireless.net
>>     OpenPGP key: http://disman.tl/pgp.asc
>>     Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
>>
>>
>>
>>
>> -- 
>> Ben West
>> me at benwest.name
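
Ben's suggestion above (have a node collect a debug report when "page allocation failure" shows up in syslog) could be sketched roughly as follows. This is only an illustration in Python, not anything that exists in Commotion: the function names and the report layout are hypothetical, and the memory fields are assumed to come from text in the usual /proc/meminfo format.

```python
import time

CRASH_MARKER = "page allocation failure"  # string quoted in the thread


def find_crash_lines(log_lines):
    """Return the syslog lines that mention the crash marker."""
    return [line for line in log_lines if CRASH_MARKER in line]


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: kB} dictionary."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def build_report(log_lines, meminfo_text, now=None):
    """Return a text report if a crash marker is found, otherwise None.

    `now` is seconds since the epoch; None means the current time.
    """
    hits = find_crash_lines(log_lines)
    if not hits:
        return None
    mem = parse_meminfo(meminfo_text)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    lines = ["debug report " + stamp + " UTC"]
    for key in ("MemTotal", "MemFree", "SwapTotal", "SwapFree"):
        lines.append("%s: %s kB" % (key, mem.get(key, "unknown")))
    lines.append("matching syslog lines:")
    lines.extend(hits)
    return "\n".join(lines) + "\n"
```

On a node, one might feed this the current syslog output and the contents of /proc/meminfo on a timer, and write any non-None result to a file on flash. Where that file should live, and whether a node under memory pressure can still run such a check at all, are open questions this sketch does not answer.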


