[Commotion-discuss] Commotion-discuss Digest, Vol 14, Issue 5

Daniel Hastings dhastings at abaarsotech.org
Wed Nov 6 18:57:38 UTC 2013


I had exactly 18 students all simultaneously trying to connect to OwnCloud,
a local application hosted on our server (not on the routers).  I've had
this happen a few random times in the past week or so, but only once a day,
almost always with the first class I have in the morning.  The second class
that followed had more students, and we began that class the same way, with
everyone connecting to OwnCloud, but I did not have any issues.

I'll ssh into the node tomorrow morning during my first class and see what
processes are running if the node happens to crash.  Also, I know I disabled
commotion-splash by checking "immediately authenticate" under Captive
Portal, but is there another way on the command line to completely stop that
process from running all the time?  I just ran top on one of my nodes and I
still see that nodogsplash is running.
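
One possible way to keep an eye on memory during that first class is sketched
below. It is only a sketch under assumptions: Python runs on a laptop rather
than on the node, the text of /proc/meminfo is captured over ssh, and the
names parse_meminfo and summarize are illustrative, not part of Commotion.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: kilobytes}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def summarize(text):
    """Return a one-line summary of free versus total memory in kB."""
    info = parse_meminfo(text)
    return "MemFree={} kB of MemTotal={} kB".format(
        info.get("MemFree", 0), info.get("MemTotal", 0))
```

Feeding summarize() the output of "cat /proc/meminfo" every few seconds while
the students connect should show whether free memory collapses just before
the reboot.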


On Wed, Nov 6, 2013 at 9:16 PM,
<commotion-discuss-request at lists.chambana.net> wrote:

> Send Commotion-discuss mailing list submissions to
>         commotion-discuss at lists.chambana.net
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         https://lists.chambana.net/mailman/listinfo/commotion-discuss
> or, via email, send a message with subject or body 'help' to
>         commotion-discuss-request at lists.chambana.net
>
> You can reach the person managing the list at
>         commotion-discuss-owner at lists.chambana.net
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Commotion-discuss digest..."
>
>
> Today's Topics:
>
>    1. Re: [Commotion-dev] Memory Issues and Nightly Builds (Dan Staples)
>    2. Re: [Commotion-dev] Memory Issues and Nightly Builds (Ryan Gerety)
>    3. Re: [Commotion-dev] Memory Issues and Nightly Builds (Ben West)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 06 Nov 2013 12:54:10 -0500
> From: Dan Staples <danstaples at opentechinstitute.org>
> To: me at benwest.name, commotion-discuss
>         <commotion-discuss at lists.chambana.net>, Commotion Development List
>         <commotion-dev at lists.chambana.net>
> Subject: Re: [Commotion-discuss] [Commotion-dev] Memory Issues and
>         Nightly Builds
> Message-ID: <527A8242.4040107 at opentechinstitute.org>
> Content-Type: text/plain; charset=UTF-8
>
> Since Dan Hastings has seen this happen with a lot of simultaneous
> clients and with high-memory components disabled, it sounds like that is
> likely the cause. Do you know exactly where that RAM is used for each
> connecting client?
>
> Dan, can you provide any more detailed info on exactly what was
> happening when you see the node crashing? How many simultaneous users,
> and what were they doing (viewing a webpage on the internet, or viewing
> the node's administrative web interface, etc)?
>
> On Wed 06 Nov 2013 12:40:06 PM EST, Ben West wrote:
> > Hi Dan,
> >
> > Thanks for offering more detail, especially that you see the nodes
> > spontaneously reboot rather than simply have services crash.
> >
> > I would again point out that the Picostations will have a finite limit
> > for simultaneous clients.  15 to 20 clients is quite a few, each
> > client requiring a portion of available RAM.  It may be that a single
> > Picostation is not going to be able to sustain all of them.
> >
> >
> >
> > On Wed, Nov 6, 2013 at 10:58 AM, Dan Staples
> > <danstaples at opentechinstitute.org> wrote:
> >
> >     Regarding logging, I'm not sure that will work well since the nodes
> >     are spontaneously rebooting themselves (due to OOM conditions), not
> >     the user rebooting them. What we're going to try to do is attach a
> >     serial console (thanks Will!) and try to slam the router with
> >     simultaneous users and traffic.
> >
> >     Also, I don't think Dan is hosting local apps on the router itself
> >     (correct me if I'm wrong), but just advertising them using the
> >     Commotion apps portal. And that just takes a little space for the
> >     Avahi service file...so hopefully that's not a problem.
> >
> >     We'll certainly report what we find with our stress testing.
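
As a rough illustration of "slamming" a node with simultaneous users, the
sketch below fires a batch of concurrent HTTP requests at one URL. It rests on
assumptions: the function names, the default of 18 clients, and the idea of
pointing it at a page reached through the node are all hypothetical. Threads
on one machine share a single wifi association, so this exercises routing and
connection tracking but not 18 separate 802.11 clients.

```python
import concurrent.futures
import http.client
import urllib.request


def fetch(url, timeout=10):
    """Fetch one URL; return the HTTP status code, or None on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except (OSError, http.client.HTTPException):
        return None


def simulate_clients(url, clients=18, timeout=10, fetch_fn=fetch):
    """Issue `clients` requests at the same time and return their results."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=clients) as pool:
        futures = [pool.submit(fetch_fn, url, timeout) for _ in range(clients)]
        return [future.result() for future in futures]
```

Counting the None entries in the returned list gives a crude failure rate to
compare between runs with different numbers of clients.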
> >
> >     Dan
> >
> >     On 11/06/2013 10:37 AM, Ben West wrote:
> >     > I am also seeing sporadic memory consumption issues operating mesh
> >     > nodes running AA r38347 in WasabiNet on Nanostation Loco M2.
> >     >
> >     > That is, using the same ath9k wifi driver and same underlying OS,
> >     > but without the Commotion-specific tools like commotiond and
> >     > servald.  I will see nodes boot up with ~26Mbytes memory usage and
> >     > then gradually increase over the next few days until sporadic nodes
> >     > start crashing with page allocation failures (aka memory
> >     > exhausted).  This all is happening despite having 3Mbytes of
> >     > compressed swap space allocated.  When I am able to log into
> >     > crashed nodes to inspect, I will occasionally find the current
> >     > memory usage to be /less/ than the average observed on bootup,
> >     > along with ~500Kbytes sitting in swap.
> >     >
> >     > This seems to suggest something is very sporadically allocating
> >     > itself a large chunk (multiple MBytes), but not residing in memory
> >     > as such, and causing other processes to crash in consequence.  I do
> >     > use the coovachilli captive portal in WasabiNet, which could be a
> >     > culprit and thus unrelated to Commotion, but there could also be an
> >     > underlying memory leak in the kernel or wifi driver.
> >     >
> >     > What are thoughts for having crashed nodes try to collect a debug
> >     > report about themselves when a crash condition is detected (e.g. no
> >     > Internet access, "page allocation failure" detected in syslog), and
> >     > then write that report to flash somewhere before the node gets
> >     > rebooted by its frustrated user?
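
A minimal sketch of that debug-report idea follows, assuming the syslog text
and a /proc/meminfo snapshot have already been read into strings. The marker
tuple, the function names, and the report file layout are all hypothetical,
and whether a node has enough free memory left to run anything like this at
crash time is an open question.

```python
import os
import time

# Substrings that are treated as evidence of a crash condition.
CRASH_MARKERS = ("page allocation failure",)


def find_crash_lines(log_text, markers=CRASH_MARKERS):
    """Return the syslog lines that contain any of the crash markers."""
    return [line for line in log_text.splitlines()
            if any(marker in line for marker in markers)]


def write_report(log_text, meminfo_text, report_dir):
    """Write a report if a marker is found; return its path, else None."""
    hits = find_crash_lines(log_text)
    if not hits:
        return None
    os.makedirs(report_dir, exist_ok=True)
    path = os.path.join(report_dir, "crash-%d.txt" % int(time.time()))
    with open(path, "w") as report:
        report.write("== matching syslog lines ==\n")
        report.write("\n".join(hits) + "\n")
        report.write("== /proc/meminfo ==\n")
        report.write(meminfo_text)
    return path
```

Pointing report_dir at persistent storage rather than a RAM-backed tmpfs is
what would let the report survive the reboot.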
> >     >
> >     > Besides that, do note that nodes with only 32MBytes of RAM, like
> >     > UBNT Picostations, are going to have difficulties hosting local
> >     > apps for many users.  If Dan Hastings would be able to use an
> >     > alternate device with 64Mbytes+ RAM, e.g. a UBNT Rocket, Unifi, or
> >     > even an indoor TP-Link router (all of which should be able to run
> >     > Commotion-OpenWRT), that may be a viable workaround in case chasing
> >     > down memory leaks becomes too ornery.
> >     >
> >     >
> >     >
> >     > On Wed, Nov 6, 2013 at 8:54 AM, Dan Staples
> >     > <danstaples at opentechinstitute.org> wrote:
> >     >
> >     >     +commotion-dev
> >     >
> >     >     If your nodes are crashing w/ 15-20 clients, while both serval
> >     >     and commotion-splash are disabled, that is very worrisome!
> >     >
> >     >     I propose to the Commotion dev team that we urgently need to
> >     >     come up with a way to simulate network load, so we can identify
> >     >     and fix the causes of these types of crashes. Does anyone have
> >     >     ideas or experiences with this? Perhaps we can take the
> >     >     technical discussion over to the commotion-dev list only.
> >     >
> >     >     And just an update for you Dan, earlier this week I found and
> >     >     fixed a significant memory leak in Serval...not sure how much
> >     >     that will affect the instability we've seen, but we'll soon
> >     >     know with some testing. The fix will make its way into the
> >     >     nightly builds probably by the end of the week.
> >     >
> >     >     As long as the rest of your network is DR1 or newer, the
> >     >     nightly builds should be compatible.
> >     >
> >     >     Dan
> >     >
> >     >     On 11/06/2013 04:07 AM, Dan Hastings wrote:
> >     >     > I was just checking to see if there had been any progress
> >     >     > made on the nightly builds with fixing the memory overload
> >     >     > causing the nodes to crash. To try and prevent my node from
> >     >     > crashing I disabled serval and the splash page. However,
> >     >     > whenever I have 15 to 20 students log in to a local app at
> >     >     > the start of class my node crashes instantly. I'm wondering
> >     >     > if upgrading to the latest nightly build might fix this
> >     >     > issue. Lastly, if I upgrade to the latest nightly build will
> >     >     > it still work with the other nodes that do not have the
> >     >     > latest build, or do I have to (or is it recommended that I)
> >     >     > upgrade all of the other nodes to the latest build as well?
> >     >     > Thanks for all the hard work.  Commotion is otherwise working
> >     >     > wonders over here in the horn.
> >     >     >
> >     >     > Dan
> >     >     >
> >     >
> >     >     --
> >     >     Dan Staples
> >     >
> >     >     Open Technology Institute
> >     >     https://commotionwireless.net
> >     >     OpenPGP key: http://disman.tl/pgp.asc
> >     >     Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
> >     >     _______________________________________________
> >     >     Commotion-dev mailing list
> >     >     Commotion-dev at lists.chambana.net
> >     <mailto:Commotion-dev at lists.chambana.net>
> >     >     <mailto:Commotion-dev at lists.chambana.net
> >     <mailto:Commotion-dev at lists.chambana.net>>
> >     >     https://lists.chambana.net/mailman/listinfo/commotion-dev
> >     >
> >     >
> >     >
> >     >
> >     > --
> >     > Ben West
> >     > http://gowasabi.net
> >     > ben at gowasabi.net
> >     > 314-246-9434
> >
> > --
> > Ben West
> > me at benwest.name
> --
> Dan Staples
>
> Open Technology Institute
> https://commotionwireless.net
> OpenPGP key: http://disman.tl/pgp.asc
> Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 06 Nov 2013 13:00:58 -0500
> From: Ryan Gerety <gerety at opentechinstitute.org>
> To: commotion-discuss at lists.chambana.net
> Subject: Re: [Commotion-discuss] [Commotion-dev] Memory Issues and
>         Nightly Builds
> Message-ID: <527A83DA.6050507 at opentechinstitute.org>
> Content-Type: text/plain; charset=ISO-8859-1
>
> I was under the impression that PicoStations could support roughly 30
> users?
>
>
>
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 6 Nov 2013 12:09:48 -0600
> From: Ben West <ben at gowasabi.net>
> To: Dan Staples <danstaples at opentechinstitute.org>
> Cc: commotion-discuss <commotion-discuss at lists.chambana.net>,
>         Commotion Development List <commotion-dev at lists.chambana.net>
> Subject: Re: [Commotion-discuss] [Commotion-dev] Memory Issues and
>         Nightly Builds
> Message-ID:
>         <CADSh-SPWYwsXsXHkX-AiB4Mt+3beB8Ly62vm7EscfQZZb89THg at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> It's relevant to point out that devices like Picostation and Nanostation
> are normally intended for use as thin hotspots in high-usage environments,
> i.e. DHCP and NAT routing not done on the device itself.  So,
> Commotion-OpenWRT issuing DHCP leases and performing NAT for one or two
> local LANs onboard does consume memory that otherwise would go to serving
> 802.11n clients.  This is an inherent limitation of the chosen architecture.
>
> Besides that, I would assume at least the following need to devote a
> portion of available RAM to each client on the public AP in
> Commotion-OpenWRT:
>
>    - /proc/net/nf_conntrack entries
>    - nodogsplash (although possibly only on initial portal page viewing)
>    - uhttpd (again, only on portal page viewing)
>    - the ath9k driver itself
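
To get a feel for the first item on that list, the sketch below counts
connection-tracking entries per client from the text of
/proc/net/nf_conntrack. It assumes the usual line layout, where the original
direction comes first so the first src= field is the client address; the
function name is illustrative, and it only counts entries rather than
measuring the memory each one takes.

```python
import collections
import re

# Matches the first "src=<address>" field on a conntrack line.
SRC_PATTERN = re.compile(r"\bsrc=(\S+)")


def count_by_source(conntrack_text):
    """Count tracked connections per originating address."""
    counts = collections.Counter()
    for line in conntrack_text.splitlines():
        match = SRC_PATTERN.search(line)  # search() stops at the first src=
        if match:
            counts[match.group(1)] += 1
    return counts
```

Calling count_by_source(text).most_common(5) on a capture taken while a class
is connecting would show which clients hold the most tracked connections.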
>
>
>
>
>
>
>
> --
> Ben West
> http://gowasabi.net
> ben at gowasabi.net
> 314-246-9434


-- 
*Dan Hastings*
*Abaarso School Computer Science Department*
dhastings at abaarsotech.org