<div dir="ltr"><div>It's relevant to point out that devices like Picostation and Nanostation are normally intended for use as thin hotspots in high-usage environments, i.e. with DHCP and NAT routing not done on the device itself. So, having Commotion-OpenWRT issue DHCP leases and perform NAT for one or two local LANs onboard does consume memory that would otherwise go to serving 802.11n clients. This is an inherent limitation of the chosen architecture.<br>
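<br>
As a rough way to check how much of that NAT state each client accounts for, here is a minimal Python sketch. It assumes a copy of /proc/net/nf_conntrack has been saved from the node and is parsed on another machine; the function names and the sample lines are made up for illustration, and the per-entry size is something you would have to measure on your own kernel, not a known figure.<br>
<br>
```python
import re
from collections import Counter

# The first "src=" field on a conntrack line is the address that
# originated the connection (the client, for traffic leaving the AP).
SRC_RE = re.compile(r"\bsrc=(\S+)")


def conntrack_entries_per_client(conntrack_text):
    """Count tracked connections per originating IP address.

    conntrack_text is the saved contents of /proc/net/nf_conntrack.
    Function name is hypothetical, not part of any Commotion tool.
    """
    counts = Counter()
    for line in conntrack_text.splitlines():
        match = SRC_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def estimate_table_bytes(counts, bytes_per_entry):
    """Multiply the total entry count by a caller-supplied entry size.

    bytes_per_entry is an assumption the caller must provide; it
    depends on the kernel version and build options.
    """
    return sum(counts.values()) * bytes_per_entry


# Made-up sample in the usual nf_conntrack text layout.
SAMPLE = "\n".join([
    "ipv4     2 tcp      6 431999 ESTABLISHED src=192.168.1.10 dst=203.0.113.5 "
    "sport=51234 dport=80 src=203.0.113.5 dst=192.168.1.10 sport=80 dport=51234 "
    "[ASSURED] mark=0 use=2",
    "ipv4     2 udp      17 25 src=192.168.1.10 dst=203.0.113.53 sport=40000 "
    "dport=53 src=203.0.113.53 dst=192.168.1.10 sport=53 dport=40000 mark=0 use=2",
    "ipv4     2 tcp      6 120 TIME_WAIT src=192.168.1.11 dst=203.0.113.5 "
    "sport=51300 dport=443 src=203.0.113.5 dst=192.168.1.11 sport=443 dport=51300 "
    "[ASSURED] mark=0 use=2",
])

if __name__ == "__main__":
    per_client = conntrack_entries_per_client(SAMPLE)
    for address, entries in per_client.most_common():
        print(address, entries)
```
<br>
Comparing the per-client counts before and after a burst of logins would at least show whether connection tracking grows in step with the number of clients.<br>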
<br></div>Besides that, I would assume at least these components need to devote a portion of available RAM to each client on the public AP in Commotion-OpenWRT:<br><ul><li>/proc/net/nf_conntrack entries <br></li><li>nodogsplash (although possibly only on initial portal page viewing)</li>
<li>uhttpd (again, only on portal page viewing)<br></li><li>the ath9k driver itself<br></li></ul><br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Nov 6, 2013 at 11:54 AM, Dan Staples <span dir="ltr"><<a href="mailto:danstaples@opentechinstitute.org" target="_blank">danstaples@opentechinstitute.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Since Dan Hastings has seen this happen with a lot of simultaneous<br>
clients and with high-memory components disabled, it sounds like that is<br>
likely the cause. Do you know exactly where that RAM is used for each<br>
connecting client?<br>
<br>
Dan, can you provide any more detailed info on exactly what is<br>
happening when you see the node crash? How many simultaneous users,<br>
and what were they doing (viewing a webpage on the internet, or viewing<br>
the node's administrative web interface, etc)?<br>
<div class="im"><br>
On Wed 06 Nov 2013 12:40:06 PM EST, Ben West wrote:<br>
> Hi Dan,<br>
><br>
> Thanks for offering more detail, especially that you see the nodes<br>
> spontaneously reboot rather than simply have services crash.<br>
><br>
> I would again point out that the Picostations will have a finite limit<br>
> for simultaneous clients. 15 to 20 clients is quite a few, each<br>
> client requiring a portion of available RAM. It may be that a single<br>
> Picostation is not going to be able to sustain all of them.<br>
><br>
><br>
><br>
> On Wed, Nov 6, 2013 at 10:58 AM, Dan Staples<br>
> <<a href="mailto:danstaples@opentechinstitute.org">danstaples@opentechinstitute.org</a>><br>
</div><div><div class="h5">> wrote:<br>
><br>
> Regarding logging, I'm not sure that will work well since the nodes are<br>
> spontaneously rebooting themselves (due to OOM conditions), not the user<br>
> rebooting them. What we're going to try to do is attach a serial console<br>
> (thanks Will!) and try to slam the router with simultaneous users and<br>
> traffic.<br>
><br>
> Also, I don't think Dan is hosting local apps on the router itself<br>
> (correct me if I'm wrong), but just advertising them using the Commotion<br>
> apps portal. And that just takes a little space for the Avahi service<br>
> file...so hopefully that's not a problem.<br>
><br>
> We'll certainly report what we find with our stress testing.<br>
><br>
> Dan<br>
><br>
> On 11/06/2013 10:37 AM, Ben West wrote:<br>
> > I am also seeing sporadic memory consumption issues operating mesh nodes<br>
> > running AA r38347 in WasabiNet on Nanostation Loco M2.<br>
> ><br>
> > That is, using the same ath9k wifi driver and same underlying OS, but<br>
> > without the Commotion-specific tools like commotiond and servald. I<br>
> > will see nodes boot up with ~26Mbytes memory usage and then gradually<br>
> > increase over the next few days until sporadic nodes start crashing with<br>
> > page allocation failures (aka memory exhausted). This all is happening<br>
> > despite having 3Mbytes of compressed swap space allocated. When I am<br>
> > able to log into crashed nodes to inspect, I will occasionally find the<br>
> > current memory usage to be /less/ than the average observed on bootup,<br>
> > along with ~500Kbytes sitting in swap.<br>
> ><br>
> > This seems to suggest something is very sporadically allocating itself a<br>
> > large chunk (multiple MBytes), but not residing in memory as such, and<br>
> > causing other processes to crash in consequence. I do use the<br>
> > coovachilli captive portal in WasabiNet, which could be a culprit and<br>
> > thus unrelated to Commotion, but there could also be an underlying<br>
> > memory leak in the kernel or wifi driver.<br>
> ><br>
> > What are thoughts for having crashed nodes try to collect a debug report<br>
> > about themselves when a crash condition is detected (e.g. no Internet<br>
> > access, "page allocation failure" detected in syslog), and then write<br>
> > that report to flash somewhere before the node gets rebooted by its<br>
> > frustrated user?<br>
> ><br>
> > Besides that, do note that nodes with only 32MBytes of RAM, like UBNT<br>
> > Picostations, are going to have difficulties hosting local apps for many<br>
> > users. If Dan Hastings would be able to use an alternate device with<br>
> > 64Mbytes+ RAM, e.g. a UBNT Rocket, Unifi, or even an indoor TP-Link<br>
> > router (all of which should be able to run Commotion-OpenWRT), that may<br>
> > be a viable workaround in case chasing down memory leaks becomes too<br>
> > ornery.<br>
> ><br>
> ><br>
> ><br>
> > On Wed, Nov 6, 2013 at 8:54 AM, Dan Staples<br>
> > <<a href="mailto:danstaples@opentechinstitute.org">danstaples@opentechinstitute.org</a>><br>
</div></div>
<div><div class="h5">> > wrote:<br>
> ><br>
> > +commotion-dev<br>
> ><br>
> > If your nodes are crashing w/ 15-20 clients, while both serval and<br>
> > commotion-splash are disabled, that is very worrisome!<br>
> ><br>
> > I propose to the Commotion dev team that we urgently need to come up<br>
> > with a way to simulate network load, so we can identify and fix the<br>
> > causes of these types of crashes. Does anyone have ideas or experiences<br>
> > with this? Perhaps we can take the technical discussion over to the<br>
> > commotion-dev list only.<br>
> ><br>
> > And just an update for you Dan, earlier this week I found and fixed a<br>
> > significant memory leak in Serval...not sure how much that will affect<br>
> > the instability we've seen, but we'll soon know with some testing. The<br>
> > fix will make its way into the nightly builds probably by the end of the<br>
> > week.<br>
> ><br>
> > As long as the rest of your network is DR1 or newer, the nightly builds<br>
> > should be compatible.<br>
> ><br>
> > Dan<br>
> ><br>
> > On 11/06/2013 04:07 AM, Dan Hastings wrote:<br>
> > > I was just checking to see if there had been any progress made on the<br>
> > > nightly builds with fixing the memory overload causing the nodes to<br>
> > > crash. To try and prevent my node from crashing I disabled serval and<br>
> > > the splash page. However, whenever I have 15 to 20 students log in to a<br>
> > > local app at the start of class my node crashes instantly. I'm wondering<br>
> > > if upgrading to the latest nightly build might fix this issue. Lastly,<br>
> > > if I upgrade to the latest nightly build will it still work with the<br>
> > > other nodes that do not have the latest build, or do I have to (or is it<br>
> > > recommended that I) upgrade all of the other nodes to the latest build as<br>
> > > well? Thanks for all the hard work. Commotion is otherwise working<br>
> > > wonders over here in the horn.<br>
> > ><br>
> > > Dan<br>
> > ><br>
> > > _______________________________________________<br>
> > > Commotion-discuss mailing list<br>
> > > <a href="mailto:Commotion-discuss@lists.chambana.net">Commotion-discuss@lists.chambana.net</a><br>
</div></div><div class="im">
> > > <a href="https://lists.chambana.net/mailman/listinfo/commotion-discuss" target="_blank">https://lists.chambana.net/mailman/listinfo/commotion-discuss</a><br>
> > ><br>
> ><br>
> > --<br>
> > Dan Staples<br>
> ><br>
> > Open Technology Institute<br>
> > <a href="https://commotionwireless.net" target="_blank">https://commotionwireless.net</a><br>
> > OpenPGP key: <a href="http://disman.tl/pgp.asc" target="_blank">http://disman.tl/pgp.asc</a><br>
> > Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9<br>
> > _______________________________________________<br>
> > Commotion-dev mailing list<br>
> > <a href="mailto:Commotion-dev@lists.chambana.net">Commotion-dev@lists.chambana.net</a><br>
</div><div class="im">
> > <a href="https://lists.chambana.net/mailman/listinfo/commotion-dev" target="_blank">https://lists.chambana.net/mailman/listinfo/commotion-dev</a><br>
> ><br>
> ><br>
> ><br>
> ><br>
> > --<br>
> > Ben West<br>
> > <a href="http://gowasabi.net" target="_blank">http://gowasabi.net</a><br>
> > <a href="mailto:ben@gowasabi.net">ben@gowasabi.net</a><br>
</div>> > <a href="tel:314-246-9434" value="+13142469434">314-246-9434</a><br>
<div class="im">><br>
</div><div class="im">
><br>
><br>
><br>
><br>
> --<br>
> Ben West<br>
</div>> <a href="mailto:me@benwest.name">me@benwest.name</a><br>
<div class="HOEnZb"><div class="h5">
</div></div></blockquote></div><br><br clear="all"><br>-- <br>Ben West<div><a href="http://gowasabi.net" target="_blank">http://gowasabi.net</a><br><a href="mailto:ben@gowasabi.net" target="_blank">ben@gowasabi.net</a><br>
314-246-9434<br></div>
</div>