[Commotion-discuss] [Commotion-dev] Memory Issues and Nightly Builds

Ben West ben at gowasabi.net
Wed Nov 6 18:41:01 UTC 2013


30 simultaneous users would be what I'd expect when using a Picostation M2
just as a stand-alone hotspot, i.e. without having it issue DHCP leases or
do NAT routing.  Because Commotion (and by extension most mesh node
firmware projects) runs additional services directly on the
hotspots, e.g. serval, olsrd, uhttpd, commotiond, wpa_supplicant, along
with the additional virtual wireless interface(s), this will reduce the
number of simultaneous AP clients per node that can be reliably supported.
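
For anyone curious where the memory actually goes on a live node, here is a
rough sketch (assuming a plain Linux /proc filesystem and BusyBox-compatible
awk/sort) that lists the top resident-memory processes; names like serval,
olsrd, uhttpd, or commotiond in the output show what each service costs:

```shell
# Walk /proc and print each process's resident set size (VmRSS, in KB),
# largest first.  Kernel threads have no VmRSS line and are skipped.
for pid in /proc/[0-9]*; do
    rss=$(awk '/^VmRSS:/ {print $2}' "$pid/status" 2>/dev/null)
    name=$(awk '/^Name:/ {print $2}' "$pid/status" 2>/dev/null)
    [ -n "$rss" ] && echo "$rss KB  $name"
done | sort -rn | head -5
```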

10 simultaneous users per node may be a better limit, albeit on the
assumption that additional memory leak fixes or optimizations are done to
shave a couple Mbytes or so off the firmware's current memory footprint.

Again, do remember the underlying nature of mesh node networks assumes each
node to be rather lightweight.  The "meshy" approach to supporting many
users is to distribute them across multiple nodes.
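
As a back-of-envelope sanity check on that 10-user figure (every number below
is an assumption drawn from this thread, not a measurement): a 32MB
Picostation booting at ~26Mbytes used, with some headroom held back so the
OOM killer stays quiet, leaves only a few MBytes to split among clients.

```shell
# Hypothetical budget: 32 MB total RAM, ~26 MB used at boot (the figure
# reported later in this thread), 2 MB reserved as headroom, and a
# guessed ~400 KB per client for driver state, conntrack, and DHCP.
total_kb=32768
used_kb=26624
reserve_kb=2048
per_client_kb=400
free_kb=$((total_kb - used_kb - reserve_kb))
max_clients=$((free_kb / per_client_kb))
echo "headroom: ${free_kb} KB, estimated max clients: ${max_clients}"
# -> headroom: 4096 KB, estimated max clients: 10
```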



On Wed, Nov 6, 2013 at 12:00 PM, Ryan Gerety
<gerety at opentechinstitute.org>wrote:

> I was under the impression that PicoStations could support roughly 30
> users?
>
>
> On 11/6/2013 12:54 PM, Dan Staples wrote:
> > Since Dan Hastings has seen this happen with a lot of simultaneous
> > clients and with high-memory components disabled, it sounds like that is
> > likely the cause. Do you know exactly where that RAM is used for each
> > connecting client?
> >
> > Dan, can you provide any more detailed info on exactly what was
> > happening when you see the node crashing? How many simultaneous users,
> > and what were they doing (viewing a webpage on the internet, or viewing
> > the node's administrative web interface, etc)?
> >
> > On Wed 06 Nov 2013 12:40:06 PM EST, Ben West wrote:
> >> Hi Dan,
> >>
> >> Thanks for offering more detail, especially that you see the nodes
> >> spontaneously reboot rather than simply have services crash.
> >>
> >> I would again point out that the Picostations will have a finite limit
> >> for simultaneous clients.  15 to 20 clients is quite a few, each
> >> client requiring a portion of the available RAM.  It may be that a single
> >> Picostation is simply not able to sustain all of them.
> >>
> >>
> >>
> >> On Wed, Nov 6, 2013 at 10:58 AM, Dan Staples
> >> <danstaples at opentechinstitute.org
> >> <mailto:danstaples at opentechinstitute.org>> wrote:
> >>
> >>     Regarding logging, I'm not sure that will work well since the
> >>     nodes are
> >>     spontaneously rebooting themselves (due to OOM conditions), not
> >>     the user
> >>     rebooting them. What we're going to try to do is attach a serial
> >>     console
> >>     (thanks Will!) and try to slam the router with simultaneous users
> and
> >>     traffic.
> >>
> >>     Also, I don't think Dan is hosting local apps on the router itself
> >>     (correct me if I'm wrong), but just advertising them using the
> >>     Commotion
> >>     apps portal. And that just takes a little space for the Avahi
> >>     service
> >>     file...so hopefully that's not a problem.
> >>
> >>     We'll certainly report what we find with our stress testing.
> >>
> >>     Dan
> >>
> >>     On 11/06/2013 10:37 AM, Ben West wrote:
> >>     > I am also seeing sporadic memory consumption issues operating
> >>     mesh nodes
> >>     > running AA r38347 in WasabiNet on Nanostation Loco M2.
> >>     >
> >>     > That is, using the same ath9k wifi driver and same underlying
> >>     OS, but
> >>     > without the Commotion-specific tools like commotiond and servald.
>  I
> >>     > will see nodes boot up with ~26Mbytes memory usage and then
> >>     gradually
> >>     > increase over the next few days until sporadic nodes start
> >>     crashing with
> >>     > page allocation failures (aka memory exhausted).  This all is
> >>     happening
> >>     > despite having 3Mbytes of compressed swap space allocated.
> >>      When I am
> >>     > able to log into crashed nodes to inspect, I will occasionally
> >>     find the
> >>     > current memory usage to be /less/ than the average observed on
> >>     bootup,
> >>     > along with ~500Kbytes sitting in swap.
> >>     >
> >>     > This seems to suggest something is very sporadically allocating
> >>     itself a
> >>     > large chunk (multiple MBytes) that does not remain resident in
> >>     memory, and
> >>     > causes other processes to crash as a consequence.  I do use the
> >>     > coovachilli captive portal in WasabiNet, which could be a
> >>     culprit and
> >>     > thus unrelated to Commotion, but there could also be an underlying
> >>     > memory leak in the kernel or wifi driver.
> >>     >
> >>     > What are thoughts for having crashed nodes try to collect a
> >>     debug report
> >>     > about themselves when a crash condition is detected (e.g. no
> >>     Internet
> >>     > access, "page allocation failure" detected in syslog), and then
> >>     write
> >>     > that report to flash somewhere before the node gets rebooted by its
> >>     > frustrated user?
> >>     >
> >>     > Besides that, do note that nodes with only 32MBytes of RAM, like
> >>     UBNT
> >>     > Picostations, are going to have difficulties hosting local apps
> >>     for many
> >>     > users.  If Dan Hastings is able to use an alternate device
> with
> >>     > 64Mbytes+ RAM, e.g. a UBNT Rocket, Unifi, or even an indoor
> TP-Link
> >>     > router (all of which should be able to run Commotion-OpenWRT),
> >>     that may
> >>     > be a viable workaround in case chasing down memory leaks
> >>     becomes too
> >>     > ornery.
> >>     >
> >>     >
> >>     >
> >>     > On Wed, Nov 6, 2013 at 8:54 AM, Dan Staples
> >>     > <danstaples at opentechinstitute.org
> >>     <mailto:danstaples at opentechinstitute.org>
> >>     > <mailto:danstaples at opentechinstitute.org
> >>     <mailto:danstaples at opentechinstitute.org>>> wrote:
> >>     >
> >>     >     +commotion-dev
> >>     >
> >>     >     If your nodes are crashing w/ 15-20 clients, while both
> >>     serval and
> >>     >     commotion-splash are disabled, that is very worrisome!
> >>     >
> >>     >     I propose to the Commotion dev team that we urgently need to
> >>     come up
> >>     >     with a way to simulate network load, so we can identify and
> >>     fix the
> >>     >     causes of these types of crashes. Does anyone have ideas or
> >>     experiences
> >>     >     with this? Perhaps we can take the technical discussion over
> >>     to the
> >>     >     commotion-dev list only.
> >>     >
> >>     >     And just an update for you Dan, earlier this week I found
> >>     and fixed a
> >>     >     significant memory leak in Serval...not sure how much that
> >>     will affect
> >>     >     the instability we've seen, but we'll soon know with some
> >>     testing. The
> >>     >     fix will make its way into the nightly builds probably by
> >>     the end of the
> >>     >     week.
> >>     >
> >>     >     As long as the rest of your network is DR1 or newer, the
> >>     nightly builds
> >>     >     should be compatible.
> >>     >
> >>     >     Dan
> >>     >
> >>     >     On 11/06/2013 04:07 AM, Dan Hastings wrote:
> >>     >     > I was just checking to see if there had been any progress
> >>     made on the
> >>     >     > nightly builds with fixing the memory overload causing the
> >>     nodes to
> >>     >     > crash. To try and prevent my node from crashing I disabled
> >>     serval and
> >>     >     > the splash page. However, whenever I have 15 to 20
> >>     students log in to a
> >>     >     > local app at the start of class my node crashes instantly.
> I'm
> >>     >     wondering
> >>     >     > if upgrading to the latest nightly build might fix this
> >>     issue. Lastly,
> >>     >     > if I upgrade to the latest nightly build will it still
> >>     work with the
> >>     >     > other nodes that do not have the latest build, or is it
> >>     >     > recommended that I upgrade all of the other nodes to latest
> >>     build as
> >>     >     > well?  Thanks for all the hard work.  Commotion is
> >>     otherwise working
> >>     >     > wonders over here in the horn.
> >>     >     >
> >>     >     > Dan
> >>     >     >
> >>     >     > _______________________________________________
> >>     >     > Commotion-discuss mailing list
> >>     >     > Commotion-discuss at lists.chambana.net
> >>     <mailto:Commotion-discuss at lists.chambana.net>
> >>     >     <mailto:Commotion-discuss at lists.chambana.net
> >>     <mailto:Commotion-discuss at lists.chambana.net>>
> >>     >     >
> https://lists.chambana.net/mailman/listinfo/commotion-discuss
> >>     >     >
> >>     >
> >>     >     --
> >>     >     Dan Staples
> >>     >
> >>     >     Open Technology Institute
> >>     >     https://commotionwireless.net
> >>     >     OpenPGP key: http://disman.tl/pgp.asc
> >>     >     Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
> >>     >     _______________________________________________
> >>     >     Commotion-dev mailing list
> >>     >     Commotion-dev at lists.chambana.net
> >>     <mailto:Commotion-dev at lists.chambana.net>
> >>     >     <mailto:Commotion-dev at lists.chambana.net
> >>     <mailto:Commotion-dev at lists.chambana.net>>
> >>     >     https://lists.chambana.net/mailman/listinfo/commotion-dev
> >>     >
> >>     >
> >>     >
> >>     >
> >>     > --
> >>     > Ben West
> >>     > http://gowasabi.net
> >>     > ben at gowasabi.net <mailto:ben at gowasabi.net>
> >>     <mailto:ben at gowasabi.net <mailto:ben at gowasabi.net>>
> >>     > 314-246-9434 <tel:314-246-9434>
> >>
> >>
> >>
> >>
> >>
> >> --
> >> Ben West
> >> me at benwest.name <mailto:me at benwest.name>
>
>
>


-- 
Ben West
http://gowasabi.net
ben at gowasabi.net
314-246-9434

