[Commotion-dev] Setting masq & MTU values correctly with OLSRd SmartGateway

Ben West ben at gowasabi.net
Mon Aug 5 16:24:24 UTC 2013


Sorry for the error.  The iptables rule mentioned in my earlier message
(quoted below) to clamp MTU=1480 for traffic leaving the mesh zone on the
gateway node is incorrect.

OpenWRT deploys a rather convoluted firewall by default, and the iptables
chain "FORWARD" suggested in the OLSRd README is not the right chain.  I
think the correct chain is zone_wan (or possibly zone_mesh_forward).  Anyone
with better familiarity with OpenWRT's iptables conventions, please do
chime in.

# Clamp TCP traffic leaving via the uplink to match the OLSRd tunnel MTU
iptables -A zone_wan -o eth0 -p tcp --tcp-flags SYN,RST SYN \
    -j TCPMSS --set-mss 1480
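
A note on the value passed to --set-mss: that option sets the TCP MSS, not
the MTU.  If 1480 is the tunnel MTU (1500 bytes minus a 20-byte IPIP header,
which is an assumption about how the SmartGateway tunnels are encapsulated),
then the largest IPv4 TCP segment that fits would be 1480 - 40 = 1440.  The
sketch below only prints a candidate rule so the arithmetic can be checked.
mss_for_mtu and clamp_rule are hypothetical helper names, and the chain and
interface are the ones guessed at above, not verified values.

```shell
#!/bin/sh
# TCP MSS that fits a given IPv4 MTU: MTU minus the 20-byte IPv4 header
# and the 20-byte TCP header (assumes no IP or TCP options).
mss_for_mtu() {
    echo $(( $1 - 40 ))
}

# Print (do not run) a clamp rule for a chain, uplink interface and MTU.
# Chain and interface names are assumptions; check them on the node first.
clamp_rule() {
    chain=$1; iface=$2; mtu=$3
    echo "iptables -A $chain -o $iface -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss $(mss_for_mtu "$mtu")"
}

clamp_rule zone_wan eth0 1480
# prints: iptables -A zone_wan -o eth0 -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1440
```

Running "iptables -S | grep zone_" on the gateway beforehand would show
which zone chains actually exist on that firmware build.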



On Sun, Aug 4, 2013 at 1:08 PM, Ben West <ben at gowasabi.net> wrote:

>
> To distill the important points about my suspicions of mismatched MTUs:
>
> 1. The OLSRd readme for SmartGateway specifically recommends clamping
> MTU=1480 for all packets leaving the mesh on a gateway node, to align with
> the MTU used by tunnels b/w nodes that SmartGateway creates.
>
> 2. Before adding the iptables rule for #1, I did see certain repeater
> nodes lose Internet connectivity (aka pings to the outside fail) *immediately
> after* their local OLSRd instance added the tnl_* tunnel to the gateway
> node.
>
> 3. Likewise, trial and error with the 'mtu_fix' option on the firewalls of
> gateway and repeater nodes did appear effective in restoring repeater
> nodes' route to the outside, but not in any discernibly consistent
> fashion.  mtu_fix = "enable MSS clamping for *outgoing* zone traffic."
> So, mtu_fix seems to imperfectly or incompletely perform the same function
> as the iptables rule in #1.  I've since disabled mtu_fix on all mesh nodes
> affected, to avoid throughput loss from any unnecessary MTU clamping.
>
> 4. Finally, the 'masq' option appears necessary for the mesh and WAN
> firewall zones on the gateway node, and likewise for the AP/LAN zones on
> repeater nodes.  It should not be enabled on the AP/LAN zones on the
> gateway node, nor on the mesh zone for repeater nodes.  (This last item
> could be in error.  Theoretically, one would expect masq to be required for
> all zones.)
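
One way to probe the MTU suspicion in points 1 to 3 without a packet capture
would be to ping an outside host from a repeater with the Don't Fragment bit
set, once at the tunnel MTU and once at the usual Ethernet MTU.  This is a
minimal sketch, assuming iputils ping (the -M do option; BusyBox ping on a
stock image may not have it).  icmp_payload_for_mtu and probe_cmds are
hypothetical helpers, and example.org stands in for any reachable host.

```shell
#!/bin/sh
# ICMP echo payload that yields an IPv4 packet of exactly MTU bytes:
# MTU minus the 20-byte IPv4 header and the 8-byte ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Print the probe commands for the two MTUs of interest; run them by hand.
probe_cmds() {
    target=$1
    for mtu in 1480 1500; do
        echo "ping -c 3 -M do -s $(icmp_payload_for_mtu "$mtu") $target"
    done
}

probe_cmds example.org
# prints:
# ping -c 3 -M do -s 1452 example.org
# ping -c 3 -M do -s 1472 example.org
```

If the 1452-byte probe is answered while the 1472-byte one fails once the
tnl_* interface is up, that would be consistent with the path being limited
to the tunnel MTU, though it would not by itself prove the cause.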
>
> All of these details do suggest that a node needing to switch from gateway
> to repeater role on-the-fly, i.e. when a wired uplink fails, will likely
> need to reboot.
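
The masq pattern from point 4 can be written down as a small lookup, which
might help if a script ever has to flip a node between roles.  This is only
a sketch: masq_for is a hypothetical helper, the zone names are the ones used
in this thread, and point 4 itself is flagged as possibly in error.

```shell
#!/bin/sh
# masq_for ROLE ZONE: print 1 if masq should be on, 0 if off, per point 4.
# Note that on the repeaters described in this thread the LAN port zone is
# labeled 'wan'; it is treated here under its functional name 'lan'.
masq_for() {
    case "$1:$2" in
        gateway:mesh|gateway:wan)   echo 1 ;;
        gateway:lan|gateway:ap)     echo 0 ;;
        repeater:lan|repeater:ap)   echo 1 ;;
        repeater:mesh)              echo 0 ;;
        *) echo "unknown role/zone: $1 $2" >&2; return 1 ;;
    esac
}

# Print the plan for both roles, skipping combinations point 4 does not cover.
for role in gateway repeater; do
    for zone in mesh wan lan ap; do
        v=$(masq_for "$role" "$zone" 2>/dev/null) || continue
        echo "$role $zone masq=$v"
    done
done

# Applying a value would look roughly like this, where N is the index of
# the zone in /etc/config/firewall (not run here):
#   uci set firewall.@zone[N].masq="$(masq_for gateway mesh)"
#   uci commit firewall
#   /etc/init.d/firewall restart
```

Whether a firewall restart alone is enough for a role switch, rather than the
reboot suggested above, is untested here.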
>
>
>
> On Sun, Aug 4, 2013 at 12:27 PM, Seamus Tuohy <s2e at opentechinstitute.org> wrote:
>
>> Yeah, I think that with the info you have, plus the debugging info from
>> the networks where we have seen this (if those who have seen it could send
>> that), we can create a close approximation and monitor it closely.
>>
>>
>> Dan Staples <danstaples at opentechinstitute.org> wrote:
>>>
>>> On Sun 04 Aug 2013 12:25:35 PM EDT, Ben West wrote:
>>>
>>>>
>>>> It unfortunately wouldn't be possible for me to revert these nodes back
>>>> to a known bad configuration and do packet capture, since the nodes are
>>>> under active use by folks other than me (their patience can be finite
>>>> ;).  Usually, my encounters with this issue spring from complaints
>>>> about nodes losing their Internet route and needing recovery.  So, the
>>>> testing results I would be able to share for this instance are going
>>>> to be limited to largely empirical results.
>>>>
>>>>
>>>>
>>>> On Sun, Aug 4, 2013 at 11:02 AM, Dan Staples
>>>> <danstaples at opentechinstitute.org> wrote:
>>>>
>>>>
>>>> On Sat 03 Aug 2013 10:08:56 PM EDT, Will Hawkins wrote:
>>>>
>>>>> Ben,
>>>>>
>>>>> Thank you for sending this out to the list. Keep us updated on your
>>>>> progress. We will work through your recommendations on our end and see
>>>>> what comes of it. Thanks again!
>>>>>
>>>>> Will
>>>>>
>>>>> On 08/03/2013 08:38 PM, Ben West wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> I'm emailing here in follow-up to a recent thread on commotion-discuss
>>>>>> about repeater nodes not reliably having their internet-bound traffic
>>>>>> routed.
>>>>>>
>>>>>> In particular, I was seeing my own repeater nodes, both 1-hop away and
>>>>>> especially 2-hops away, apparently losing their connection to the
>>>>>> Internet once the local OLSRd instance had inserted its tnl_*
>>>>>> interface (i.e. a few minutes after power up).  I'm highlighting the
>>>>>> instance of a node 2-hops away, since in practice that can be
>>>>>> difficult to replicate.
>>>>>>
>>>>>> Below is the readme about the SmartGateway feature.  Do please note in
>>>>>> particular the recommendation to add an iptables rule to the gateway
>>>>>> node, clamping all packets leaving the mesh to the same MTU as what is
>>>>>> used by the OLSRd SmartGateway tunnels.
>>>>>>
>>>>>> http://svn.dd-wrt.com/browser/src/router/olsrd/README-Olsr-Extensions
>>>>
>>>>
>>>>
>>>>>>
>>>>>> This seemed to resonate with a sporadic problem I'd been having with
>>>>>> repeater nodes occasionally not seeing their Internet-bound traffic
>>>>>> correctly routed, despite all routing tables appearing valid.  These
>>>>>> repeater nodes restored their Internet connection when I enabled the
>>>>>> 'mtu_fix' option on their local firewall.  But not consistently so,
>>>>>> making the problem very challenging to resolve.
>>>>>>
>>>>>> So, following the advice from the readme, I added this to
>>>>>> /etc/firewall.user on my gateway node (eth0 is its wired uplink):
>>>>>>
>>>>>> iptables -A FORWARD -o eth0 -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1480
>>>>>>
>>>>>> On the gateway node with wired WAN and LAN ports, in addition to the
>>>>>> mesh interface, I set these firewall zones in /etc/config/firewall:
>>>>>>
>>>>>> config zone
>>>>>>     option name 'mesh'
>>>>>>     option network 'mesh'
>>>>>>     option input 'ACCEPT'
>>>>>>     option output 'ACCEPT'
>>>>>>     option forward 'ACCEPT'
>>>>>>     option masq '1'
>>>>>>
>>>>>> config zone
>>>>>>     option name 'wan'
>>>>>>     option input 'DROP'
>>>>>>     option output 'ACCEPT'
>>>>>>     option forward 'ACCEPT'
>>>>>>     option masq '1'
>>>>>>
>>>>>> config zone
>>>>>>     option name 'lan'
>>>>>>     option network 'lan'
>>>>>>     option input 'ACCEPT'
>>>>>>     option output 'ACCEPT'
>>>>>>     option forward 'ACCEPT'
>>>>>>
>>>>>> config forwarding
>>>>>>     option src 'mesh'
>>>>>>     option dest 'wan'
>>>>>>
>>>>>> config forwarding
>>>>>>     option src 'lan'
>>>>>>     option dest 'wan'
>>>>>>
>>>>>> config forwarding
>>>>>>     option src 'mesh'
>>>>>>     option dest 'mesh'
>>>>>>
>>>>>> Next, on repeater nodes with only one LAN port (counter-intuitively
>>>>>> labeled 'wan'), these are the firewall zones:
>>>>>>
>>>>>>
>>>>>> config zone
>>>>>>     option name 'mesh'
>>>>>>     option input 'ACCEPT'
>>>>>>     option output 'ACCEPT'
>>>>>>     option forward 'ACCEPT'
>>>>>>
>>>>>> config zone
>>>>>>     option name 'wan'
>>>>>>     option input 'ACCEPT'
>>>>>>     option output 'ACCEPT'
>>>>>>     option forward 'ACCEPT'
>>>>>>     option masq '1'
>>>>>>
>>>>>> config forwarding
>>>>>>     option src 'wan'
>>>>>>     option dest 'mesh'
>>>>>>
>>>>>> config forwarding
>>>>>>     option src 'mesh'
>>>>>>     option dest 'mesh'
>>>>>>
>>>>>> In particular, note how the 'masq' option is enabled on the zone 'mesh'
>>>>>> only for the gateway node, but not on the repeater nodes.  Likewise,
>>>>>> 'masq' is enabled on the zone corresponding to the local LAN port of
>>>>>> the repeater nodes, but not on the LAN port of the gateway node.  For
>>>>>> Commotion-OpenWRT, the firewall zones of the LAN ports would be
>>>>>> equivalent to those of the APs of each node.
>>>>>>
>>>>>> This configuration described above appears to be what works for using
>>>>>> the SmartGateway feature with OLSRd v0.6.5.4-commotion-0.1-1.
>>>>>>
>>>>>> I'm trying to review the firewall rules that commotiond is generating
>>>>>> for gateway and repeater nodes, to see if they follow.  However, I'm
>>>>>> posting to the listserv now in case someone else happens to see an
>>>>>> oversight in how Commotion-OpenWRT is deploying OLSRd and firewall
>>>>>> config.
>>>>
>>>>
>>>>> On Thu, Aug 1, 2013 at 1:38 PM, Ben West <ben at gowasabi.net> wrote:
>>>>>>
>>>>> Hi All,
>>>>>
>>>>> I can confirm having encountered similar issues, specifically
>>>>> repeater nodes not correctly masq'ing AP traffic to the Internet,
>>>>> while gateway nodes do.  My own problems were encountered running
>>>>> both private APs and coovachilli APs (as opposed to nodogsplash) on
>>>>> nodes running the WasabiNet firmware, not Commotion-OpenWRT.  However,
>>>>> the firewall config is quite similar.
>>>>>
>>>>> So, on repeater nodes I have the 'masq' option enabled for zones
>>>>> mesh, ap1, and ap2 (i.e. public and private APs).  On gateway nodes,
>>>>> it seems that I need to have 'masq' disabled for zones ap1 and ap2.
>>>
>>>
>>>>> On Wed, Jul 31, 2013 at 5:18 PM, Ryan Gerety
>>>>> <gerety at opentechinstitute.org> wrote:
>>>>>
>>>>> After a further chat with Preston, this seems like it *might* be
>>>>> the same problem I encountered at the Hackerspace in Tunis.
>>>>> When using the AP of the gateway node the client can access the
>>>>> internet, and when on another mesh node (say via ssh) you can
>>>>> access the internet; however, when you are on the AP of another
>>>>> node you cannot access the internet.
>>>>>
>>>>> I had sent this to the tech list about two weeks ago, and in the
>>>>> office Seamus and Griffin thought it might be a zone issue.
>>>>> Seamus and Griffin, did you discover what the problem actually
>>>>> was?  Were you able to replicate the problem?
>>>>>
>>>>> Best,
>>>>> Ryan
>>>>>
>>>>>
>>>>> <snip>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Ben West
>>>>> http://gowasabi.net
>>>>> ben at gowasabi.net
>>>>> 314-246-9434
>>>
>>>
>>>
>>>
>>>
>>>
>>>>>
>>>
>>>
>>>
>>>>> ------------------------------
>>>>>
>>>>> Commotion-dev mailing list
>>>>> Commotion-dev at lists.chambana.net
>>>>> https://lists.chambana.net/mailman/listinfo/commotion-dev
>>>>
>>>>
>>>
>>>
>>>
>>>
>>> Would we be able to verify the MTU problem by doing packet captures
>>> along the route from a repeater to a gateway, and checking the MTU of
>>> packets that get through versus those that get dropped?
>>>
>>> --
>>>
>>>
>>> Dan Staples
>>>
>>> Open Technology Institute
>>> https://commotionwireless.net
>>> OpenPGP key: http://disman.tl/pgp.asc
>>>
>>>
>>> Fingerprint: 2480 095D 4B16 436F 35AB 7305 F670 74ED BD86 43A9
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>> We could try it on our test networks to diagnose the problem, if you
>> think it would lead to useful data. I'm just trying to think of ways we
>> could test your hypothesis about the MTU issue...
>>
>>
>>
>>
>>
>
>
>
>



-- 
Ben West
http://gowasabi.net
ben at gowasabi.net
314-246-9434