[Commotion-dev] Setting masq & MTU values correctly with OLSRd SmartGateway

Ben West ben at gowasabi.net
Sun Aug 4 18:08:20 UTC 2013


To distill the important points about my suspicions of mismatched MTUs:

1. The OLSRd readme for SmartGateway specifically recommends clamping
MTU=1480 for all packets leaving the mesh on a gateway node, to align with
the MTU used by tunnels b/w nodes that SmartGateway creates.

2. Before adding the iptables rule for #1, I did see certain repeater nodes
lose Internet connectivity (aka pings to the outside fail) *immediately
after* their local OLSRd instance added the tnl_* tunnel to the gateway
node.

3. Likewise, trial and error with the 'mtu_fix' option on the firewalls of
gateway and repeater nodes did appear effective in restoring repeater
nodes' route to the outside, but not in any discernibly consistent
fashion.  mtu_fix = "enable MSS clamping for *outgoing* zone traffic."  So,
mtu_fix seems to imperfectly or incompletely perform the same function as
the iptables rule in #1.  I've since disabled mtu_fix on all mesh nodes
affected, to avoid throughput loss from any unnecessary MTU clamping.

4. Finally, the 'masq' option appears necessary for the mesh and WAN
firewall zones on the gateway node, and likewise for the AP/LAN zones on
repeater nodes.  It should not be enabled on the AP/LAN zones on the
gateway node, nor on the mesh zone for repeater nodes.  (This last item
could be in error.  Theoretically, one would expect masq to be required for
all zones.)
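
As a rough check on the numbers in #1 and #3, the MTU/MSS arithmetic can be
sketched in a few lines of Python. This assumes the SmartGateway tunnels are
plain IPv4 IP-in-IP (20-byte outer header) over a 1500-byte Ethernet MTU, and
that the IPv4 and TCP headers carry no options. The function names are made up
for illustration.

```python
# Sketch only: header sizes assume IPv4 and TCP without options.
IPV4_HEADER = 20   # bytes; also the overhead of one IP-in-IP encapsulation
TCP_HEADER = 20    # bytes; no TCP options


def tunnel_mtu(link_mtu: int, encap_overhead: int = IPV4_HEADER) -> int:
    """Largest inner packet that fits in one outer packet on the link."""
    return link_mtu - encap_overhead


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload per segment that fits in a packet of size mtu."""
    return mtu - IPV4_HEADER - TCP_HEADER


def packet_size_for_mss(mss: int) -> int:
    """Size of the IPv4 packet that carries a full segment of size mss."""
    return mss + IPV4_HEADER + TCP_HEADER


if __name__ == "__main__":
    mtu = tunnel_mtu(1500)
    print("tunnel MTU:", mtu)
    print("MSS that fits the tunnel MTU:", mss_for_mtu(mtu))
    print("packet size implied by MSS 1480:", packet_size_for_mss(1480))
```

Under these assumptions an MSS of 1480 still allows 1520-byte packets, so a
value nearer 1440, or the TCPMSS target's --clamp-mss-to-pmtu option, may be
worth testing. This has not been verified on the nodes discussed here.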

All of these details do suggest that a node needing to switch from gateway
to repeater role on-the-fly, i.e. when a wired uplink fails, will likely
need to reboot.
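
To make the role-switch concern concrete, the masq settings from point 4 can
be written as a small table and compared. A minimal Python sketch follows; the
zone labels (such as ap_lan) and the function name are made up for
illustration, and the repeater has no WAN entry because point 4 does not state
one.

```python
# Sketch only: zone labels and table layout are illustrative, taken from
# point 4 above; this is not how commotiond or UCI store these settings.
MASQ_BY_ROLE = {
    "gateway":  {"mesh": True,  "wan": True, "ap_lan": False},
    "repeater": {"mesh": False, "ap_lan": True},
}


def masq_changes(old_role: str, new_role: str) -> dict:
    """Return {zone: (old, new)} for zones whose masq setting differs.

    None means no setting is stated for that zone in that role.
    """
    old, new = MASQ_BY_ROLE[old_role], MASQ_BY_ROLE[new_role]
    changes = {}
    for zone in sorted(set(old) | set(new)):
        before, after = old.get(zone), new.get(zone)
        if before != after:
            changes[zone] = (before, after)
    return changes


if __name__ == "__main__":
    for zone, (before, after) in masq_changes("gateway", "repeater").items():
        print(f"{zone}: masq {before} -> {after}")
```

Under this reading, every listed zone's masq setting differs between the two
roles, which fits the suspicion that a live role switch needs more than a
routing change.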



On Sun, Aug 4, 2013 at 12:27 PM, Seamus Tuohy <s2e at opentechinstitute.org> wrote:

> Yeah, I think with the info you have, plus the debugging info on the
> networks where we have seen this (if those who have seen it could send
> that), we can create a close approximation and monitor it closely.
>
>
> Dan Staples <danstaples at opentechinstitute.org> wrote:
>>
>> On Sun 04 Aug 2013 12:25:35 PM EDT, Ben West wrote:
>>
>>>
>>> It unfortunately wouldn't be possible for me to revert these nodes back
>>> to a known bad configuration and do packet capture, since the nodes are
>>> under active use by folks other than me (their patience can be finite
>>> ;).  Usually, my encounters with this issue spring from complaints
>>> about nodes losing their Internet route and needing recovery.  So, the
>>> testing results I would be able to share for this instance are going
>>> to be limited to largely empirical results.
>>>
>>>
>>> On Sun, Aug 4, 2013 at 11:02 AM, Dan Staples
>>> <danstaples at opentechinstitute.org> wrote:
>>>
>>> On Sat 03 Aug 2013 10:08:56 PM EDT, Will Hawkins wrote:
>>>
>>>> Ben,
>>>>
>>>> Thank you for sending this out to the list. Keep us updated on your
>>>> progress. We will work through your recommendations on our end and see
>>>> what comes of it. Thanks again!
>>>>
>>>> Will
>>>>
>>>> On 08/03/2013 08:38 PM, Ben West wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I'm emailing here in follow-up to a recent thread on commotion-discuss
>>>>> about repeater nodes not reliably having their internet-bound traffic
>>>>> routed.
>>>>>
>>>>> In particular, I was seeing my own repeater nodes, both 1-hop away and
>>>>> especially 2-hops away, apparently losing their connection to the
>>>>> Internet once the local OLSRd instance had inserted its tnl_* interface
>>>>> (i.e. a few minutes after power up).  I'm highlighting the instance of a
>>>>> node 2-hops away, since in practice that can be difficult to replicate.
>>>>>
>>>>> Below is the readme about the SmartGateway feature.  Do please note in
>>>>> particular the recommendation to add an iptables rule to the gateway
>>>>> node, clamping all packets leaving the mesh to the same MTU as what is
>>>>> used by the OLSRd SmartGateway tunnels.
>>>>
>>>>
>>>>
>>>> http://svn.dd-wrt.com/browser/src/router/olsrd/README-Olsr-Extensions
>>>
>>>
>>>>> This seemed to resonate with a sporadic problem I'd been having with
>>>>> repeater nodes occasionally not seeing their Internet-bound traffic
>>>>> correctly routed, despite all routing tables appearing valid.  These
>>>>> repeater nodes restored their Internet connection when I enabled the
>>>>> 'mtu_fix' option on their local firewall.  But not consistently so,
>>>>> making the problem very challenging to resolve.
>>>>>
>>>>> So, following the advice from the readme, I added this to
>>>>> /etc/firewall.user on my gateway node (eth0 is its wired uplink):
>>>>>
>>>>> iptables -A FORWARD -o eth0 -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1480
>>>>>
>>>>> On the gateway node with wired WAN and LAN ports, in addition to the
>>>>> mesh interface, I set these firewall zones in /etc/config/firewall:
>>>>>
>>>>> config zone
>>>>> option name 'mesh'
>>>>> option network 'mesh'
>>>>> option input 'ACCEPT'
>>>>> option output 'ACCEPT'
>>>>> option forward 'ACCEPT'
>>>>> option 'masq' '1'
>>>>>
>>>>> config zone
>>>>> option name 'wan'
>>>>> option output 'ACCEPT'
>>>>> option masq '1'
>>>>> option input 'DROP'
>>>>> option forward 'ACCEPT'
>>>>>
>>>>> config zone
>>>>> option input 'ACCEPT'
>>>>> option output 'ACCEPT'
>>>>> option forward 'ACCEPT'
>>>>> option name 'lan'
>>>>> option network 'lan'
>>>>>
>>>>> config forwarding
>>>>> option src 'mesh'
>>>>> option dest 'wan'
>>>>>
>>>>> config forwarding
>>>>> option src 'lan'
>>>>> option dest 'wan'
>>>>>
>>>>> config 'forwarding'
>>>>> option 'src' 'mesh'
>>>>> option 'dest' 'mesh'
>>>>>
>>>>> Next, on repeater nodes with only one LAN port (counter-intuitively
>>>>> labeled 'wan'), these are the firewall zones:
>>>>>
>>>>> config 'zone'
>>>>> option 'name' 'mesh'
>>>>> option 'input' 'ACCEPT'
>>>>> option 'output' 'ACCEPT'
>>>>> option 'forward' 'ACCEPT'
>>>>>
>>>>> config zone
>>>>> option name wan
>>>>> option input ACCEPT
>>>>> option output ACCEPT
>>>>> option forward ACCEPT
>>>>> option masq 1
>>>>>
>>>>> config 'forwarding'
>>>>> option 'src' 'wan'
>>>>> option 'dest' 'mesh'
>>>>>
>>>>>
>>>>> config 'forwarding'
>>>>> option 'src' 'mesh'
>>>>> option 'dest' 'mesh'
>>>>>
>>>>> In particular, note how the 'masq' option is enabled on the zone 'mesh'
>>>>> only for the gateway node, but not on the repeater nodes.  Likewise,
>>>>> 'masq' is enabled on the zone corresponding to the local LAN port of the
>>>>> repeater nodes, but not on the LAN port of the gateway node.  For
>>>>> Commotion-OpenWRT, the firewall zones of the LAN ports would be
>>>>> equivalent to those of the APs of each node.
>>>>>
>>>>> This configuration described above appears to be what works for using
>>>>> the SmartGateway feature with OLSRd v0.6.5.4-commotion-0.1-1.
>>>>>
>>>>> I'm trying to review the firewall rules that commotiond is generating
>>>>> for gateway and repeater nodes, to see if they follow.  However, I'm
>>>>> posting to the listserv now in case someone else happens to see an
>>>>> oversight in how Commotion-OpenWRT is deploying OLSRd and firewall
>>>>> config.
>>>>>
>>>
>>>>> On Thu, Aug 1, 2013 at 1:38 PM, Ben West <ben at gowasabi.net> wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> I can confirm having encountered similar issues, specifically
>>>>>> repeater nodes not correctly masq'ing AP traffic to the Internet,
>>>>>> while gateway nodes do.  My own problems were encountered running
>>>>>> both private APs and coovachilli APs (as opposed to nodogsplash) on
>>>>>> nodes running the WasabiNet firmware, not Commotion-OpenWRT.  However,
>>>>>> the firewall config is quite similar.
>>>>>>
>>>>>> So, on repeater nodes I have the 'masq' option enabled for zones
>>>>>> mesh, ap1, and ap2 (i.e. public and private APs).  On gateway nodes,
>>>>>> it seems that I need to have 'masq' disabled for zones ap1 and ap2.
>>>>>>
>>
>>>>>> On Wed, Jul 31, 2013 at 5:18 PM, Ryan Gerety
>>>>>> <gerety at opentechinstitute.org> wrote:
>>>>
>>>>>>> After a further chat with Preston, this seems like it *might* be
>>>>>>> the same problem I encountered at the Hackerspace in Tunis.
>>>>>>> When using the AP of the gateway node the client can access the
>>>>>>> internet and when on another mesh node (say via ssh) you can
>>>>>>> access the internet, however, when you are on the AP of another
>>>>>>> node you cannot access the internet.
>>>>>>>
>>>>>>> I had sent this to the tech list about two weeks ago, and in the
>>>>>>> office Seamus and Griffin thought it might be a zone issue.
>>>>>>> Seamus and Griffin did you discover what the problem actually
>>>>>>> was?  Were you able to replicate the problem?
>>>>>>>
>>>>>>> Best,
>>>>>>> Ryan
>>>>
>>>>
>>>> <snip>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Ben West
>>>> http://gowasabi.net
>>>> ben at gowasabi.net
>>>> 314-246-9434
>>
>>
>>
>>
>>
>>
>> ------------------------------
>>
>> Commotion-dev mailing list
>> Commotion-dev at lists.chambana.net
>> https://lists.chambana.net/mailman/listinfo/commotion-dev
>>>
>>>
>>
>>
>>
>> Would we be able to verify the MTU problem by doing packet captures
>> along the route from a repeater to a gateway, and checking the MTU of
>> packets that get through versus those that get dropped?
>>
>> --
>>
>> Dan Staples
>>
>> Open Technology Institute
>> https://commotionwireless.net
>> OpenPGP key: http://disman.tl/pgp.asc
>>
>> Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
>>
>>
>>
>>
>> --
>> Ben West
>> http://gowasabi.net
>> ben at gowasabi.net <mailto:ben at gowasabi.net>
>> 314-246-9434
>>
>>
> We could try it on our test networks to diagnose the problem, if you
> think it would lead to useful data. I'm just trying to think of ways we
> could test your hypothesis about the MTU issue...
>
> --
> Dan Staples
>
> Open Technology Institute
> https://commotionwireless.net
> OpenPGP key: http://disman.tl/pgp.asc
> Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
>
>
>
> --
> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>



-- 
Ben West
http://gowasabi.net
ben at gowasabi.net
314-246-9434