[Commotion-dev] [Commotion-discuss] Memory Issues and Nightly Builds

Ben West ben at gowasabi.net
Wed Nov 6 18:09:48 UTC 2013


It's relevant to point out that devices like the Picostation and Nanostation
are normally intended for use as thin hotspots in high-usage environments,
i.e. with DHCP and NAT routing not done on the device itself.  So,
Commotion-OpenWRT issuing DHCP leases and performing NAT for one or two local
LANs onboard does consume memory that otherwise would go to serving 802.11n
clients.  This is an inherent limitation of the chosen architecture.

Besides that, I would assume at least these processes need to devote a
portion of available RAM to each client on the public AP in
Commotion-OpenWRT:

   - /proc/net/nf_conntrack entries
   - nodogsplash (although possibly only on initial portal page viewing)
   - uhttpd (again, only on portal page viewing)
   - the ath9k driver itself
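
To put rough numbers on the list above, here is a minimal Python sketch (my
own illustration, not a tool from Commotion or OpenWRT) that samples free
memory and the size of the conntrack table.  It assumes Python is available
on the node, or that the two /proc files are copied off and parsed elsewhere.
The function names are hypothetical.

```python
"""Rough memory snapshot for an OpenWRT-style node (illustrative sketch)."""


def parse_meminfo(text):
    """Parse /proc/meminfo content into {field: integer value}.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def count_conntrack(text):
    """Count tracked connections: one non-empty line per entry."""
    return sum(1 for line in text.splitlines() if line.strip())


def snapshot(meminfo_path="/proc/meminfo",
             conntrack_path="/proc/net/nf_conntrack"):
    """Return (MemFree value, conntrack entry count) for this host."""
    with open(meminfo_path) as f:
        mem = parse_meminfo(f.read())
    try:
        with open(conntrack_path) as f:
            entries = count_conntrack(f.read())
    except OSError:
        entries = None  # conntrack not loaded, or not readable as this user
    return mem.get("MemFree"), entries
```

Logging the result of snapshot() at intervals while clients associate should
show whether free memory drops in step with the conntrack count, or whether
something else is consuming it.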




On Wed, Nov 6, 2013 at 11:54 AM, Dan Staples <
danstaples at opentechinstitute.org> wrote:

> Since Dan Hastings has seen this happen with a lot of simultaneous
> clients and with high-memory components disabled, it sounds like that is
> likely the cause. Do you know exactly where that RAM is used for each
> connecting client?
>
> Dan, can you provide any more detailed info on exactly what was
> happening when you see the node crashing? How many simultaneous users,
> and what were they doing (viewing a webpage on the internet, or viewing
> the node's administrative web interface, etc)?
>
> On Wed 06 Nov 2013 12:40:06 PM EST, Ben West wrote:
> > Hi Dan,
> >
> > Thanks for offering more detail, especially that you see the nodes
> > spontaneously reboot rather than simply have services crash.
> >
> > I would again point out that the Picostations will have a finite limit
> > for simultaneous clients.  15 to 20 clients is quite a few, each
> > client requiring a portion of available RAM.  It may be that a single
> > Picostation is not going to be able to sustain all of them.
> >
> >
> >
> > On Wed, Nov 6, 2013 at 10:58 AM, Dan Staples
> > <danstaples at opentechinstitute.org> wrote:
> >
> > >     Regarding logging, I'm not sure that will work well since the nodes
> > >     are spontaneously rebooting themselves (due to OOM conditions), not
> > >     the user rebooting them. What we're going to try to do is attach a
> > >     serial console (thanks Will!) and try to slam the router with
> > >     simultaneous users and traffic.
> >
> > >     Also, I don't think Dan is hosting local apps on the router itself
> > >     (correct me if I'm wrong), but just advertising them using the
> > >     Commotion apps portal. And that just takes a little space for the
> > >     Avahi service file...so hopefully that's not a problem.
> >
> >     We'll certainly report what we find with our stress testing.
> >
> >     Dan
> >
> >     On 11/06/2013 10:37 AM, Ben West wrote:
> > >     > I am also seeing sporadic memory consumption issues operating mesh
> > >     > nodes running AA r38347 in WasabiNet on Nanostation Loco M2.
> >     >
> > >     > That is, using the same ath9k wifi driver and same underlying OS,
> > >     > but without the Commotion-specific tools like commotiond and
> > >     > servald.  I will see nodes boot up with ~26Mbytes memory usage and
> > >     > then gradually increase over the next few days until sporadic nodes
> > >     > start crashing with page allocation failures (aka memory
> > >     > exhausted).  This all is happening despite having 3Mbytes of
> > >     > compressed swap space allocated.  When I am able to log into
> > >     > crashed nodes to inspect, I will occasionally find the current
> > >     > memory usage to be /less/ than the average observed on bootup,
> > >     > along with ~500Kbytes sitting in swap.
> >     >
> > >     > This seems to suggest something is very sporadically allocating
> > >     > itself a large chunk (multiple MBytes), but not residing in memory
> > >     > as such, and causing other processes to crash in consequence.  I do
> > >     > use the coovachilli captive portal in WasabiNet, which could be a
> > >     > culprit and thus unrelated to Commotion, but there could also be an
> > >     > underlying memory leak in the kernel or wifi driver.
> >     >
> > >     > What are thoughts for having crashed nodes try to collect a debug
> > >     > report about themselves when a crash condition is detected (e.g. no
> > >     > Internet access, "page allocation failure" detected in syslog), and
> > >     > then write that report to flash somewhere before the node gets
> > >     > rebooted by its frustrated user?
> >     >
> > >     > Besides that, do note that nodes with only 32MBytes of RAM, like
> > >     > UBNT Picostations, are going to have difficulties hosting local
> > >     > apps for many users.  If Dan Hastings would be able to use an
> > >     > alternate device with 64Mbytes+ RAM, e.g. a UBNT Rocket, Unifi, or
> > >     > even an indoor TP-Link router (all of which should be able to run
> > >     > Commotion-OpenWRT), that may be a viable workaround in case chasing
> > >     > down memory leaks becomes too ornery.
> >     >
> >     >
> >     >
> > >     > On Wed, Nov 6, 2013 at 8:54 AM, Dan Staples
> > >     > <danstaples at opentechinstitute.org> wrote:
> >     >
> >     >     +commotion-dev
> >     >
> > >     >     If your nodes are crashing w/ 15-20 clients, while both serval
> > >     >     and commotion-splash are disabled, that is very worrisome!
> >     >
> > >     >     I propose to the Commotion dev team that we urgently need to come
> > >     >     up with a way to simulate network load, so we can identify and
> > >     >     fix the causes of these types of crashes. Does anyone have ideas
> > >     >     or experiences with this? Perhaps we can take the technical
> > >     >     discussion over to the commotion-dev list only.
> >     >
> > >     >     And just an update for you Dan, earlier this week I found and
> > >     >     fixed a significant memory leak in Serval...not sure how much
> > >     >     that will affect the instability we've seen, but we'll soon know
> > >     >     with some testing. The fix will make its way into the nightly
> > >     >     builds probably by the end of the week.
> >     >
> > >     >     As long as the rest of your network is DR1 or newer, the nightly
> > >     >     builds should be compatible.
> >     >
> >     >     Dan
> >     >
> >     >     On 11/06/2013 04:07 AM, Dan Hastings wrote:
> > >     >     > I was just checking to see if there had been any progress made
> > >     >     > on the nightly builds with fixing the memory overload causing
> > >     >     > the nodes to crash. To try and prevent my node from crashing I
> > >     >     > disabled serval and the splash page. However, whenever I have
> > >     >     > 15 to 20 students log in to a local app at the start of class
> > >     >     > my node crashes instantly. I'm wondering if upgrading to the
> > >     >     > latest nightly build might fix this issue. Lastly, if I upgrade
> > >     >     > to the latest nightly build will it still work with the other
> > >     >     > nodes that do not have the latest build, or do I have to (or is
> > >     >     > it recommended that I) upgrade all of the other nodes to the
> > >     >     > latest build as well?  Thanks for all the hard work.  Commotion
> > >     >     > is otherwise working wonders over here in the horn.
> >     >     >
> >     >     > Dan
> >     >     >
> >     >     > _______________________________________________
> >     >     > Commotion-discuss mailing list
> > >     >     > Commotion-discuss at lists.chambana.net
> > >     >     > https://lists.chambana.net/mailman/listinfo/commotion-discuss
> >     >     >
> >     >
> >     >     --
> >     >     Dan Staples
> >     >
> >     >     Open Technology Institute
> >     >     https://commotionwireless.net
> >     >     OpenPGP key: http://disman.tl/pgp.asc
> >     >     Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
> >     >     _______________________________________________
> >     >     Commotion-dev mailing list
> > >     >     Commotion-dev at lists.chambana.net
> >     >     https://lists.chambana.net/mailman/listinfo/commotion-dev
> >     >
> >     >
> >     >
> >     >
> >     > --
> >     > Ben West
> >     > http://gowasabi.net
> > >     > ben at gowasabi.net
> > >     > 314-246-9434
> >
> >     --
> >     Dan Staples
> >
> >     Open Technology Institute
> >     https://commotionwireless.net
> >     OpenPGP key: http://disman.tl/pgp.asc
> >     Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
> >
> >
> >
> >
> > --
> > Ben West
> > me at benwest.name
> --
> Dan Staples
>
> Open Technology Institute
> https://commotionwireless.net
> OpenPGP key: http://disman.tl/pgp.asc
> Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
>



-- 
Ben West
http://gowasabi.net
ben at gowasabi.net
314-246-9434