[Commotion-dev] meshing over ethernet

Will Hawkins hawkinsw at opentechinstitute.org
Tue Jul 8 22:36:40 EDT 2014


A fix for the control-message-as-the-first-message-in-an-olsr-packet
problem has been conceived and tested. There is now only one thing left
to fix:

When a node sends out a challenge control message to a node (a), if it
receives a challenge message from (a) before it receives a challenge
response control message, the whole thing goes to pot. This is incredibly
common in the scenario where nodes are meshing over multiple interfaces.
Once this problem is resolved, I think we will have a complete solution.
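
To make the crossing-challenge case concrete, here is a minimal sketch of
one way a node could tolerate it: keep the challenge we sent and the
challenge we answered in separate slots per peer, so an incoming challenge
never clobbers the one we are still waiting on. This is only an
illustration, not the fix being worked on. The names PeerState, Handshake
and nonce are hypothetical, and signing, timeouts and the
response-response step are left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class PeerState:
    """Hypothetical per-peer handshake state (not the plugin's structure)."""
    sent_nonce: Optional[int] = None      # challenge we issued, still unanswered
    answered_nonce: Optional[int] = None  # last challenge from the peer we answered
    validated: bool = False               # peer answered our challenge correctly


class Handshake:
    def __init__(self) -> None:
        self.peers: Dict[str, PeerState] = {}

    def _state(self, peer: str) -> PeerState:
        return self.peers.setdefault(peer, PeerState())

    def send_challenge(self, peer: str, nonce: int) -> Tuple[str, int]:
        # Remember what we asked so the response can be matched later.
        self._state(peer).sent_nonce = nonce
        return ("challenge", nonce)

    def on_challenge(self, peer: str, nonce: int) -> Tuple[str, int]:
        # Answer the peer's challenge WITHOUT touching sent_nonce, so a
        # challenge that crosses ours on the wire does not reset our side.
        self._state(peer).answered_nonce = nonce
        return ("challenge-response", nonce)

    def on_challenge_response(self, peer: str, nonce: int) -> bool:
        state = self._state(peer)
        if state.sent_nonce is None or nonce != state.sent_nonce:
            return False  # unsolicited or stale response
        state.sent_nonce = None
        state.validated = True
        return True
```

With this layout, a challenge from (a) that arrives while we are still
waiting for (a)'s challenge response is answered immediately, and the late
response is still accepted when it shows up.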

Will

On 07/07/2014 06:25 PM, Will Hawkins wrote:
> Okay!
> 
> So, please (apparently) disregard my previous messages. The root of the
> problem is conceptually simpler (although I haven't yet started
> thinking about the fix):
> 
> It appears that MDP expects
> challenge/challenge-response/response-response messages (i.e. MDP
> control packets) to be the very first message in any OLSR packet. That
> means that if any of those messages is not first, it will get
> missed. This is obviously a problem.
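
For illustration, here is a rough sketch of walking every message in a
packet instead of looking only at the first one. It assumes the IPv4
packet and message header layout from RFC 3626. The three control type
codes are placeholders rather than values taken from the plugin, and the
function names (iter_messages, find_control_messages, build_message,
build_packet) are made up for this example.

```python
import struct

# Placeholder type codes for the three MDP control messages (assumption).
CHALLENGE, CHALLENGE_RESPONSE, RESPONSE_RESPONSE = 11, 12, 13
CONTROL_TYPES = {CHALLENGE, CHALLENGE_RESPONSE, RESPONSE_RESPONSE}

# RFC 3626 headers, network byte order, IPv4 originator address.
PACKET_HEADER = struct.Struct("!HH")          # packet length, packet seqno
MESSAGE_HEADER = struct.Struct("!BBH4sBBH")   # type, vtime, size, originator,
                                              # ttl, hop count, message seqno


def iter_messages(packet: bytes):
    """Yield (type, originator, payload) for each message in the packet."""
    if len(packet) < PACKET_HEADER.size:
        return
    packet_len, _seqno = PACKET_HEADER.unpack_from(packet, 0)
    end = min(packet_len, len(packet))
    offset = PACKET_HEADER.size
    while offset + MESSAGE_HEADER.size <= end:
        msg_type, _vtime, size, originator, _ttl, _hops, _mseq = (
            MESSAGE_HEADER.unpack_from(packet, offset))
        # The size field covers the message header plus its payload.
        if size < MESSAGE_HEADER.size or offset + size > end:
            break  # malformed message, stop parsing
        yield msg_type, originator, packet[offset + MESSAGE_HEADER.size:offset + size]
        offset += size


def find_control_messages(packet: bytes):
    """Return every control message, wherever it sits in the packet."""
    return [m for m in iter_messages(packet) if m[0] in CONTROL_TYPES]


# Small helpers for trying the parser out by hand.
def build_message(msg_type: int, originator: bytes, payload: bytes = b"") -> bytes:
    size = MESSAGE_HEADER.size + len(payload)
    return MESSAGE_HEADER.pack(msg_type, 0, size, originator, 1, 0, 0) + payload


def build_packet(messages) -> bytes:
    body = b"".join(messages)
    return PACKET_HEADER.pack(PACKET_HEADER.size + len(body), 0) + body
```

A packet built as a HELLO (type 1) followed by a challenge is the case
that gets missed when only the first message is inspected.
find_control_messages still returns the challenge for that packet.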
> 
> The pico stations never seem to place any messages before an MDP control
> packet. The Buffalo router does. I think it's more of a timing/packet
> size issue, but the Buffalo router is a good test case because it
> exercises this little "gem".
> 
> Now, on to the fix. I am hoping to get something going tonight. I will
> keep everyone posted!
> 
> Will
> 
> On 07/03/2014 06:18 PM, Will Hawkins wrote:
>> Further debugging seems to indicate that if one of the two nodes is
>> meshing over a single interface, the other node may be set to mesh over
>> two interfaces. In other words, the problem seems to exist only when
>> both nodes are meshing over multiple interfaces.
>>
>> Go figure?
>>
>> I'm getting more and more flummoxed by what is going on, but we are
>> working hard at fixing the problem.
>>
>> Will
>>
>> On 07/02/2014 08:49 PM, Will Hawkins wrote:
>>> Miles,
>>>
>>> We have uncovered the root of the problem and wanted to share the findings.
>>>
>>> First of all, thank you for your patience with us as we debugged this
>>> issue. Without your input, we would never have realized that this was a
>>> problem.
>>>
>>> In cases like yours, olsrd is meshing over two different interfaces.
>>> There is a primary interface address that labels the node throughout the
>>> network and there are other, secondary, addresses that label the
>>> individual interfaces.
>>>
>>> In the Serval route signing plugin, we use those labels to index a table
>>> of timeouts/timestamps. The values from this table are used to locate
>>> the proper key, the proper timestamp skew, etc.
>>>
>>> When a node has multiple interfaces, the plugin gets confused about
>>> which label to use to index that table. As a result, the skews never
>>> converge and the routes cannot be signed.
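
As a sketch of the indexing problem described above: if every lookup is
first mapped to the node's primary address, all of a node's interfaces
land on the same table entry and a single skew gets updated. OLSR already
distributes the interface-to-primary mapping in MID messages (RFC 3626).
The TimestampTable class, its method names and the "skew" field are
hypothetical, and nothing here says the plugin does or will work this way.

```python
import ipaddress


class TimestampTable:
    """Hypothetical table keyed by a node's primary (main) address."""

    def __init__(self):
        self._main_of = {}   # interface (alias) address -> primary address
        self._entries = {}   # primary address -> per-node entry

    def add_alias(self, main, alias):
        # In OLSR this mapping would be learned from MID messages.
        self._main_of[ipaddress.ip_address(alias)] = ipaddress.ip_address(main)

    def _key(self, addr):
        addr = ipaddress.ip_address(addr)
        # Unknown addresses are treated as primary addresses.
        return self._main_of.get(addr, addr)

    def entry(self, addr):
        # Every interface of a node resolves to the same entry.
        return self._entries.setdefault(self._key(addr), {"skew": 0})

    def __len__(self):
        return len(self._entries)
```

For example, after add_alias("10.0.0.1", "192.168.1.1"), a skew stored via
entry("192.168.1.1") is the same one read back via entry("10.0.0.1").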
>>>
>>> We are going to start looking at possible solutions for this problem as
>>> soon as possible. We cannot promise a fix before the start of Toorcamp
>>> next week, but we are going to do our best. We will keep you posted on
>>> our progress and send you any fixes.
>>>
>>> In the meantime, the only way to work around the problem is to mesh on a
>>> single interface per node.
>>>
>>> I hope this information helps. As I said, we will keep you posted!
>>>
>>> Thanks again for all the input you've given us!
>>> Will
>>>
>>> On 07/02/2014 05:29 PM, Dan Staples wrote:
>>>> Hey Miles,
>>>>
>>>> That sounds like a good plan B to me, if we can't fix this issue. But we
>>>> (and by that I mean folks at the office other than me) did some testing
>>>> today to see if we could figure out the problem you're seeing. Here's
>>>> what they found:
>>>>
>>>> Serval route signing between Buffalo and Ubiquiti routers causes
>>>> commotiond and olsrd to seg fault (but works fine in Ubiq-only meshes).
>>>> Debugging it indicates that it's a memory-related architecture-specific
>>>> problem in commotiond. The devices we used to replicate the issue were
>>>> a Ubiquiti Picostation and a Buffalo WZR-HP-G300NH.
>>>>
>>>> We already have one open memory-related fix for commotiond that may or
>>>> may not solve the problem:
>>>> https://github.com/opentechinstitute/commotiond/pull/103. We'll do some
>>>> more testing today and tomorrow and let you know anything else we find.
>>>>
>>>> Thanks for your patience with this and hopefully we'll be able to
>>>> resolve the problem.
>>>>
>>>> Dan
>>>>
>>>> On 07/02/2014 11:44 AM, Myles wrote:
>>>>> So plan B for meshing in production is to use WPA on the mesh interface and firewall OLSR so it is unreachable from non-mesh interfaces. Right?
>>>>>
>>>>> Sent from my mobile
>>>>>
>>>>>> On Jul 2, 2014, at 7:25 AM, Chris Ritzo <critzo at opentechinstitute.org> wrote:
>>>>>>
>>>>>> Miles,
>>>>>> I was discussing this thread with some other team members this morning,
>>>>>> and we think you've confirmed a bug that we found in our 1.1rc2
>>>>>> connectivity tests.
>>>>>>
>>>>>> Those tests confirm that two nodes meshed via ethernet will work when
>>>>>> not signed and fail when signed. Your report that turning off Serval
>>>>>> signing makes the center Buffalo node work properly matches that.
>>>>>>
>>>>>> Our team is still debugging this and will be pushing feedback to Serval
>>>>>> about it. In the interim, however, turning off route signing via Serval
>>>>>> should solve this for you.
>>>>>>
>>>>>> I'm sure Josh and Will can weigh in on more specifics related to the bug.
>>>>>>
>>>>>> -Chris
>>>>>>
>>>>>>> On Wed 02 Jul 2014 07:06:52 AM EDT, Dan Staples wrote:
>>>>>>> The current master branch is now using an upgraded version of olsrd,
>>>>>>> version 0.6.6, but doing a diff b/w the versions doesn't show anything
>>>>>>> that would affect the route signing. So it should be fully compatible.
>>>>>>>
>>>>>>> Is your setup something like this?
>>>>>>>
>>>>>>> [ubiquiti]---wifi---[ubiquiti]---ethernet---[buffalo center]---wifi---[buffalo]
>>>>>>>
>>>>>>> I can try to recreate a similar setup and test it tomorrow when I have
>>>>>>> access to a test network. I'm not sure if we've extensively tested mixed
>>>>>>> wifi/ethernet meshing and route signing together.
>>>>>>>
>>>>>>> Did you see any log output from the center or ubiquiti devices when
>>>>>>> route signing was turned on that could indicate what the problem was?
>>>>>>>
>>>>>>> Also CCing a couple other folks that might have some good
>>>>>>> troubleshooting ideas.
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>> On 07/02/2014 03:24 AM, miles wrote:
>>>>>>>> This is giving me no end of trouble. I've now tested with all
>>>>>>>> firewalls turned off:
>>>>>>>>
>>>>>>>> 3 ubiquiti nodes will mesh using serval over wifi. As you said, it takes
>>>>>>>> a few minutes (but not more than 5) to settle.
>>>>>>>> 2 Buffalo nodes will mesh over wifi. 
>>>>>>>> 1 buffalo node "Center" is connected to one ubiquiti over ethernet.
>>>>>>>> Turning off serval signing makes everything work as expected through
>>>>>>>> node Center.
>>>>>>>>
>>>>>>>> Turn on serval, and center sees buffalos, but will not communicate with
>>>>>>>> the ubiquiti device.  
>>>>>>>>
>>>>>>>> Thoughts for what to test/debug next? 
>>>>>>>>
>>>>>>>> The buffalos were built using master last week. Ubiquitis are 1.1rc2.
>>>>>>>> Does master play nicely with 1.1 right now?  The next thing I can think
>>>>>>>> of to try is to rebuild with commotion feed as 1.1 and see if getting
>>>>>>>> the same olsrd version will magically fix things. 
>>>>>>>>
>>>>>>>>
>>>>>>>> On Jul 1, 2014, at 7:48 AM, Dan Staples
>>>>>>>> <danstaples at opentechinstitute.org> wrote:
>>>>>>>>
>>>>>>>>> Serval signed routes will work without a gateway/NTP. However, it will
>>>>>>>>> definitely take up to 5 minutes for the timestamps to converge. They
>>>>>>>>> *will* converge though, even if the starting clocks on the nodes are
>>>>>>>>> days or months apart. Give it a few minutes and see if it starts working
>>>>>>>>> again.
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Commotion-dev mailing list
>>>>>>>> Commotion-dev at lists.chambana.net
>>>>>>>> https://lists.chambana.net/mailman/listinfo/commotion-dev

