[Commotion-discuss] [Commotion-dev] Memory Issues and Nightly Builds
Ryan Gerety
gerety at opentechinstitute.org
Wed Nov 6 18:00:58 UTC 2013
I was under the impression that PicoStations could support roughly 30 users?
On 11/6/2013 12:54 PM, Dan Staples wrote:
> Since Dan Hastings has seen this happen with a lot of simultaneous
> clients and with high-memory components disabled, it sounds like that is
> likely the cause. Do you know exactly where that RAM is used for each
> connecting client?
>
> Dan, can you provide any more detailed info on exactly what was
> happening when you saw the node crash? How many simultaneous users,
> and what were they doing (viewing a webpage on the internet, or viewing
> the node's administrative web interface, etc.)?
>
> On Wed 06 Nov 2013 12:40:06 PM EST, Ben West wrote:
>> Hi Dan,
>>
>> Thanks for offering more detail, especially that you see the nodes
>> spontaneously reboot rather than simply have services crash.
>>
>> I would again point out that the Picostations will have a finite limit
>> for simultaneous clients. 15 to 20 clients is quite a few, each
>> client requiring a portion of the available RAM. It may be that a single
>> Picostation is not going to be able to sustain all of them.
>>
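A rough way to see why the client count matters, using figures quoted further down this thread (32 MB of RAM on a Picostation, roughly 26 MB in use after boot on the Nanostation nodes described there). Treating those two numbers as applying to the same device is an assumption, and the function name is illustrative only:

```python
def per_client_headroom_kb(total_mb, used_at_boot_mb, clients):
    """Free RAM per client in KB, assuming the headroom is shared evenly."""
    free_kb = (total_mb - used_at_boot_mb) * 1024
    return free_kb / clients


# Assumed inputs: 32 MB total and ~26 MB used at boot, both taken from the thread.
for clients in (15, 20, 30):
    print(clients, "clients:", round(per_client_headroom_kb(32, 26, clients)), "KB each")
```

With those assumed inputs this comes to roughly 410, 307, and 205 KB per client, before counting any growth in kernel or driver memory.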
>>
>>
>> On Wed, Nov 6, 2013 at 10:58 AM, Dan Staples
>> <danstaples at opentechinstitute.org> wrote:
>>
>> Regarding logging, I'm not sure that will work well since the nodes are
>> spontaneously rebooting themselves (due to OOM conditions), not the user
>> rebooting them. What we're going to try to do is attach a serial console
>> (thanks Will!) and try to slam the router with simultaneous users and
>> traffic.
>>
>> Also, I don't think Dan is hosting local apps on the router itself
>> (correct me if I'm wrong), but just advertising them using the Commotion
>> apps portal. And that just takes a little space for the Avahi service
>> file...so hopefully that's not a problem.
>>
>> We'll certainly report what we find with our stress testing.
>>
>> Dan
>>
>> On 11/06/2013 10:37 AM, Ben West wrote:
>> > I am also seeing sporadic memory consumption issues operating mesh nodes
>> > running AA r38347 in WasabiNet on Nanostation Loco M2.
>> >
>> > That is, using the same ath9k wifi driver and same underlying OS, but
>> > without the Commotion-specific tools like commotiond and servald. I
>> > will see nodes boot up with ~26Mbytes memory usage, which then gradually
>> > increases over the next few days until sporadic nodes start crashing with
>> > page allocation failures (aka memory exhausted). This all is happening
>> > despite having 3Mbytes of compressed swap space allocated. When I am
>> > able to log into crashed nodes to inspect, I will occasionally find the
>> > current memory usage to be /less/ than the average observed on bootup,
>> > along with ~500Kbytes sitting in swap.
>> >
>> > This seems to suggest something is very sporadically allocating itself a
>> > large chunk (multiple MBytes), but not residing in memory as such, and
>> > causing other processes to crash in consequence. I do use the
>> > coovachilli captive portal in WasabiNet, which could be a culprit and
>> > thus unrelated to Commotion, but there could also be an underlying
>> > memory leak in the kernel or wifi driver.
>> >
>> > What are thoughts for having crashed nodes try to collect a debug report
>> > about themselves when a crash condition is detected (e.g. no Internet
>> > access, "page allocation failure" detected in syslog), and then write
>> > that report to flash somewhere before the node gets rebooted by its
>> > frustrated user?
>> >
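The debug-report idea above could be sketched roughly as follows. This is a minimal illustration, not anything from Commotion or WasabiNet: the function names and report layout are hypothetical, the only trigger handled is the "page allocation failure" string mentioned above, and on a 32 MB node the same logic would more likely live in a small shell script than in Python.

```python
import re
import time
from pathlib import Path

# Marker taken from the thread; exact kernel wording may vary between versions.
OOM_PATTERN = re.compile(r"page allocation failure", re.IGNORECASE)


def find_oom_lines(log_lines):
    """Return the log lines that mention a page allocation failure."""
    return [line for line in log_lines if OOM_PATTERN.search(line)]


def write_debug_report(log_lines, meminfo_text, report_dir):
    """Write a report file if the marker is present; return its path, else None."""
    hits = find_oom_lines(log_lines)
    if not hits:
        return None
    report_dir = Path(report_dir)
    report_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: one file per detection, keyed by Unix time.
    report = report_dir / f"crash-{int(time.time())}.txt"
    report.write_text(
        "== matching syslog lines ==\n" + "\n".join(hits)
        + "\n== memory snapshot ==\n" + meminfo_text + "\n"
    )
    return report
```

On a node, `log_lines` might come from the output of `logread` and `meminfo_text` from `/proc/meminfo`; the directory passed as `report_dir` would need to be on persistent storage for the report to survive a reboot.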
>> > Besides that, do note that nodes with only 32MBytes of RAM, like UBNT
>> > Picostations, are going to have difficulties hosting local apps for many
>> > users. If Dan Hastings would be able to use an alternate device with
>> > 64Mbytes+ RAM, e.g. a UBNT Rocket, Unifi, or even an indoor TP-Link
>> > router (all of which should be able to run Commotion-OpenWRT), that may
>> > be a viable workaround in case chasing down memory leaks becomes too
>> > ornery.
>> >
>> >
>> >
>> > On Wed, Nov 6, 2013 at 8:54 AM, Dan Staples
>> > <danstaples at opentechinstitute.org> wrote:
>> >
>> > +commotion-dev
>> >
>> > If your nodes are crashing w/ 15-20 clients, while both serval and
>> > commotion-splash are disabled, that is very worrisome!
>> >
>> > I propose to the Commotion dev team that we urgently need to come up
>> > with a way to simulate network load, so we can identify and fix the
>> > causes of these types of crashes. Does anyone have ideas or experiences
>> > with this? Perhaps we can take the technical discussion over to the
>> > commotion-dev list only.
>> >
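One simple way to approximate the load described above is to fire a batch of simultaneous HTTP requests at a page served through the node. The sketch below is only an illustration under stated assumptions: the function names are made up for this example, it runs from a single machine, and so it exercises HTTP handling but not the per-client wifi association overhead that separate devices would add.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_status(url, timeout=10):
    """Fetch one URL and return the HTTP status code, or None if the request fails."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except Exception:
        # Any failure (timeout, refused connection, HTTP error) counts as a miss.
        return None


def simulate_clients(url, clients=20, fetch=fetch_status):
    """Issue `clients` requests concurrently and count how many returned 200."""
    with ThreadPoolExecutor(max_workers=clients) as pool:
        results = list(pool.map(fetch, [url] * clients))
    ok = sum(1 for status in results if status == 200)
    return {"clients": clients, "ok": ok, "failed": clients - ok}
```

A call such as `simulate_clients("http://192.168.1.20/", clients=20)` (the address is a placeholder) would mimic the 15 to 20 students loading a local app at once. The `fetch` parameter exists so the function can be tried without a network by passing a stub.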
>> > And just an update for you Dan, earlier this week I found and fixed a
>> > significant memory leak in Serval...not sure how much that will affect
>> > the instability we've seen, but we'll soon know with some testing. The
>> > fix will make its way into the nightly builds probably by the end of the
>> > week.
>> >
>> > As long as the rest of your network is DR1 or newer, the nightly builds
>> > should be compatible.
>> >
>> > Dan
>> >
>> > On 11/06/2013 04:07 AM, Dan Hastings wrote:
>> > > I was just checking to see if there had been any progress made on the
>> > > nightly builds with fixing the memory overload causing the nodes to
>> > > crash. To try and prevent my node from crashing I disabled serval and
>> > > the splash page. However, whenever I have 15 to 20 students log in to a
>> > > local app at the start of class my node crashes instantly. I'm wondering
>> > > if upgrading to the latest nightly build might fix this issue. Lastly,
>> > > if I upgrade to the latest nightly build will it still work with the
>> > > other nodes that do not have the latest build, or do I have to (or is it
>> > > recommended that I) upgrade all of the other nodes to the latest build as
>> > > well? Thanks for all the hard work. Commotion is otherwise working
>> > > wonders over here in the horn.
>> > >
>> > > Dan
>> > >
>> > > _______________________________________________
>> > > Commotion-discuss mailing list
>> > > Commotion-discuss at lists.chambana.net
>> > > https://lists.chambana.net/mailman/listinfo/commotion-discuss
>> > >
>> >
>> > --
>> > Dan Staples
>> >
>> > Open Technology Institute
>> > https://commotionwireless.net
>> > OpenPGP key: http://disman.tl/pgp.asc
>> > Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
>> > _______________________________________________
>> > Commotion-dev mailing list
>> > Commotion-dev at lists.chambana.net
>> > https://lists.chambana.net/mailman/listinfo/commotion-dev
>> >
>> >
>> >
>> >
>> > --
>> > Ben West
>> > http://gowasabi.net
>> > ben at gowasabi.net
>> > 314-246-9434
>>
>> --
>> Ben West
>> me at benwest.name