[Commotion-dev] meshing over ethernet

Will Hawkins hawkinsw at opentechinstitute.org
Wed Jul 9 19:42:10 EDT 2014


On that note, please see
https://github.com/opentechinstitute/olsrd/pull/25 as a potential fix
for this issue.

I've got this code running in (basically) Miles' setup and see good
routes. We are going to test this extensively over the next few days and
will see what happens. We'll keep everyone posted.

Will

On 07/09/2014 07:39 PM, Will Hawkins wrote:
> 
> 
> On 07/09/2014 07:35 PM, Dan Staples wrote:
>> Is this something that would affect the olsrd-secure plugin as well?
> 
> Yes. All of the "fixes" that I put in my pull request affect the
> olsrd-secure plugin too. That's a big ol' :-(
> 
> Will
> 
>>
>> Dan
>>
>> On 07/08/2014 10:36 PM, Will Hawkins wrote:
>>> A fix for the control-message-as-the-first-message-in-an-olsr-packet
>>> problem has been conceived and tested. There is now only one thing left
>>> to fix:
>>>
>>> When a node sends out a challenge control message (a), if it receives a
>>> challenge message from (a) before it receives a challenge response
>>> control message, the whole thing goes to pot. This is incredibly common
>>> in the scenario when nodes are meshing over multiple interfaces. Once
>>> this problem is resolved, I think we will have a complete solution.
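
To make the crossing-challenge case concrete, here is a rough Python sketch.
It is not the plugin's code and it is not taken from any pull request. The
Handshake class, its states and the integer nonces are all made up for
illustration, and the real exchange also has a response-response step and
signatures that are left out here. The only point is that answering the
peer's challenge does not have to disturb our own outstanding challenge.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    AWAITING_RESPONSE = auto()
    VERIFIED = auto()


class Handshake:
    """Handshake state kept for a single peer (hypothetical, simplified)."""

    def __init__(self):
        self.state = State.IDLE
        self.pending_nonce = None

    def send_challenge(self, nonce):
        # Remember what we asked so the response can be matched later.
        self.state = State.AWAITING_RESPONSE
        self.pending_nonce = nonce
        return ("challenge", nonce)

    def on_challenge(self, their_nonce):
        # Answer the peer, but leave our own state and pending nonce alone,
        # so a challenge that crosses ours on the wire does no harm.
        return ("challenge-response", their_nonce)

    def on_challenge_response(self, echoed_nonce):
        # Only a response that echoes our outstanding nonce completes it.
        if self.state is State.AWAITING_RESPONSE and echoed_nonce == self.pending_nonce:
            self.state = State.VERIFIED
            self.pending_nonce = None
            return True
        return False
```

With this shape, calling send_challenge(), then on_challenge() for the
peer's crossing challenge, then on_challenge_response() with the original
nonce still ends in VERIFIED.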
>>>
>>> Will
>>>
>>> On 07/07/2014 06:25 PM, Will Hawkins wrote:
>>>> Okay!
>>>>
>>>> So, please (apparently) disregard my previous messages. The root of the
>>>> problem is conceptually simpler (although I haven't yet started
>>>> thinking about the fix):
>>>>
>>>> It appears that MDP expects challenge/challenge-response/response-response
>>>> messages (i.e. MDP control packets) to be the very first message in any
>>>> OLSR packet. That means that if one of those messages is not first, it
>>>> will get missed. This is obviously a problem.
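
As an illustration of the difference, here is a minimal Python sketch that
walks an OLSR packet using the IPv4 header layout from RFC 3626 (a 4-byte
packet header followed by messages with 12-byte headers). The three type
codes and all function names are assumptions made for this example; the
real values and the real parsing live in the plugin source, which this
does not reproduce.

```python
import struct

# Hypothetical type codes for the three handshake messages.
CHALLENGE, CHALLENGE_RESPONSE, RESPONSE_RESPONSE = 11, 12, 13
CONTROL_TYPES = {CHALLENGE, CHALLENGE_RESPONSE, RESPONSE_RESPONSE}

PACKET_HEADER = struct.Struct("!HH")         # packet length, packet seqno
MESSAGE_HEADER = struct.Struct("!BBH4sBBH")  # type, vtime, size, originator,
                                             # ttl, hop count, message seqno


def iter_messages(packet):
    """Yield (msg_type, originator, body) for every message in the packet."""
    packet_len, _seqno = PACKET_HEADER.unpack_from(packet, 0)
    end = min(packet_len, len(packet))
    offset = PACKET_HEADER.size
    while offset + MESSAGE_HEADER.size <= end:
        fields = MESSAGE_HEADER.unpack_from(packet, offset)
        msg_type, size, originator = fields[0], fields[2], fields[3]
        if size < MESSAGE_HEADER.size or offset + size > end:
            break  # malformed size field; stop rather than loop forever
        yield msg_type, originator, packet[offset + MESSAGE_HEADER.size:offset + size]
        offset += size


def first_message_only(packet):
    """The behaviour described above: inspect the first message, nothing else."""
    first = next(iter_messages(packet), None)
    return [first] if first and first[0] in CONTROL_TYPES else []


def all_control_messages(packet):
    """Scan every message, so a control message is found wherever it sits."""
    return [m for m in iter_messages(packet) if m[0] in CONTROL_TYPES]


def build_message(msg_type, originator, body=b""):
    """Helper for experiments: pack one message with a minimal header."""
    size = MESSAGE_HEADER.size + len(body)
    return MESSAGE_HEADER.pack(msg_type, 0, size, originator, 1, 0, 0) + body


def build_packet(*messages):
    """Helper for experiments: wrap messages in a packet header."""
    payload = b"".join(messages)
    return PACKET_HEADER.pack(PACKET_HEADER.size + len(payload), 0) + payload
```

Building a packet with a HELLO (type 1) in front of a challenge and
passing it to both functions shows the gap: first_message_only() returns
nothing, while all_control_messages() still finds the challenge.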
>>>>
>>>> The pico stations never seem to put any other message ahead of an MDP control
>>>> packet. The Buffalo router does. I think it's more of a timing/packet
>>>> size issue, but the Buffalo router is a good test case because it
>>>> exercises this little "gem".
>>>>
>>>> Now, on to the fix. I am hoping to get something going tonight. I will
>>>> keep everyone posted!
>>>>
>>>> Will
>>>>
>>>> On 07/03/2014 06:18 PM, Will Hawkins wrote:
>>>>> Further debugging seems to indicate that as long as one of the two nodes
>>>>> is meshing over a single interface, the other node can be set to mesh
>>>>> over two interfaces. In other words, the problem seems to exist only when
>>>>> both nodes are meshing over multiple interfaces.
>>>>>
>>>>> Go figure?
>>>>>
>>>>> I'm getting more and more flummoxed by what is going on, but we are
>>>>> working hard at fixing the problem.
>>>>>
>>>>> Will
>>>>>
>>>>> On 07/02/2014 08:49 PM, Will Hawkins wrote:
>>>>>> Miles,
>>>>>>
>>>>>> We have uncovered the root of the problem and wanted to share the findings.
>>>>>>
>>>>>> First of all, thank you for your patience with us as we debugged this
>>>>>> issue. Without your input, we would never have realized that this was a
>>>>>> problem.
>>>>>>
>>>>>> In cases like yours, olsrd is meshing over two different interfaces.
>>>>>> There is a primary interface address that labels the node throughout the
>>>>>> network and there are other, secondary, addresses that label the
>>>>>> individual interfaces.
>>>>>>
>>>>>> In the Serval route signing plugin, we use those labels to index a table
>>>>>> of timeouts/timestamps. The values from this table are used to locate
>>>>>> the proper key, the proper timestamp skew, etc.
>>>>>>
>>>>>> When a node has multiple interfaces, the plugin gets confused about
>>>>>> which label to use to index that table. As a result, the skews never
>>>>>> converge and the routes cannot be signed.
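
A small Python sketch of that failure mode, under stated assumptions: the
PeerTable and PeerState names, their fields and the addresses are invented
for illustration and are not the plugin's data structures. It only shows
how a table keyed by a node's primary address misses when it is indexed by
one of that node's secondary interface addresses, and how mapping the
address back to the primary one first avoids the miss.

```python
from dataclasses import dataclass


@dataclass
class PeerState:
    """Per-node signing state; the field names are illustrative only."""
    key_id: str
    clock_skew: int = 0


class PeerTable:
    def __init__(self):
        self._by_main = {}  # primary address -> PeerState
        self._alias = {}    # interface address -> primary address

    def add_peer(self, main_addr, state, interface_addrs=()):
        self._by_main[main_addr] = state
        for addr in interface_addrs:
            self._alias[addr] = main_addr

    def lookup_raw(self, addr):
        """Index the table with whatever address the packet came from."""
        return self._by_main.get(addr)

    def lookup(self, addr):
        """Resolve an interface address to the primary address first."""
        return self._by_main.get(self._alias.get(addr, addr))
```

For a node registered under primary address 10.0.0.1 with a second
interface at 192.168.1.1, lookup_raw("192.168.1.1") finds nothing, while
lookup("192.168.1.1") returns the same state as lookup("10.0.0.1").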
>>>>>>
>>>>>> We are going to start looking at possible solutions for this problem as
>>>>>> soon as possible. We cannot promise a fix before the start of Toorcamp
>>>>>> next week, but we are going to do our best. We will keep you posted on
>>>>>> our progress and send you any fixes.
>>>>>>
>>>>>> In the meantime, the only way to work around the problem is to mesh on a
>>>>>> single interface per node.
>>>>>>
>>>>>> I hope this information helps. As I said, we will keep you posted!
>>>>>>
>>>>>> Thanks again for all the input you've given us!
>>>>>> Will
>>>>>>
>>>>>> On 07/02/2014 05:29 PM, Dan Staples wrote:
>>>>>>> Hey Miles,
>>>>>>>
>>>>>>> That sounds like a good plan B to me, if we can't fix this issue. But we
>>>>>>> (and by that I mean folks at the office other than me) did some testing
>>>>>>> today to see if we could figure out the problem you're seeing. Here's
>>>>>>> what they found:
>>>>>>>
>>>>>>> Serval route signing between Buffalo and Ubiquiti routers causes
>>>>>>> commotiond and olsrd to seg fault (but works fine in Ubiq-only meshes).
>>>>>>> Debugging it indicates that it's a memory-related architecture-specific
>>>>>>> problem in commotiond. The hardware we used to replicate the issue was a
>>>>>>> Ubiquiti Picostation and a Buffalo WZR-HP-G300NH.
>>>>>>>
>>>>>>> We already have one open memory-related fix for commotiond that may or
>>>>>>> may not solve the problem:
>>>>>>> https://github.com/opentechinstitute/commotiond/pull/103. We'll do some
>>>>>>> more testing today and tomorrow and let you know anything else we find.
>>>>>>>
>>>>>>> Thanks for your patience with this and hopefully we'll be able to
>>>>>>> resolve the problem.
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>> On 07/02/2014 11:44 AM, Myles wrote:
>>>>>>>> So plan B for meshing in production is to use WPA on the mesh interface
>>>>>>>> and firewall OLSR to be unreachable from non-mesh interfaces. Right?
>>>>>>>>
>>>>>>>>> On Jul 2, 2014, at 7:25 AM, Chris Ritzo <critzo at opentechinstitute.org> wrote:
>>>>>>>>>
>>>>>>>>> Miles,
>>>>>>>>> I was discussing this thread with some other team members this morning,
>>>>>>>>> and we think you've confirmed a bug that we found in our 1.1rc2
>>>>>>>>> connectivity tests.
>>>>>>>>>
>>>>>>>>> Those tests confirm that two nodes meshed via ethernet will work when
>>>>>>>>> not signed and fail when signed. Your report that turning off Serval
>>>>>>>>> signing makes the center Buffalo node work properly is consistent
>>>>>>>>> with that.
>>>>>>>>>
>>>>>>>>> Our team is still debugging this and will be pushing feedback to Serval
>>>>>>>>> about it. In the interim, however, turning off route signing via Serval
>>>>>>>>> should solve this for you.
>>>>>>>>>
>>>>>>>>> I'm sure Josh and Will can weigh in on more specifics related to the bug.
>>>>>>>>>
>>>>>>>>> -Chris
>>>>>>>>>
>>>>>>>>>> On Wed 02 Jul 2014 07:06:52 AM EDT, Dan Staples wrote:
>>>>>>>>>> The current master branch is now using an upgraded version of olsrd,
>>>>>>>>>> version 0.6.6, but doing a diff between the versions doesn't show anything
>>>>>>>>>> that would affect the route signing. So it should be fully compatible.
>>>>>>>>>>
>>>>>>>>>> Is your setup something like this?
>>>>>>>>>>
>>>>>>>>>> [ubiquiti]---wifi---[ubiquiti]---ethernet---[buffalo center]---wifi---[buffalo]
>>>>>>>>>>
>>>>>>>>>> I can try to recreate a similar setup and test it tomorrow when I have
>>>>>>>>>> access to a test network. I'm not sure if we've extensively tested mixed
>>>>>>>>>> wifi/ethernet meshing and route signing together.
>>>>>>>>>>
>>>>>>>>>> Did you see any log output from the center or ubiquiti devices when
>>>>>>>>>> route signing was turned on that could indicate what the problem was?
>>>>>>>>>>
>>>>>>>>>> Also CCing a couple other folks that might have some good
>>>>>>>>>> troubleshooting ideas.
>>>>>>>>>>
>>>>>>>>>> Dan
>>>>>>>>>>
>>>>>>>>>>> On 07/02/2014 03:24 AM, miles wrote:
>>>>>>>>>>> This is giving me no end of trouble. I've now tested with all
>>>>>>>>>>> firewalls turned off:
>>>>>>>>>>>
>>>>>>>>>>> 3 ubiquiti nodes will mesh using serval over wifi. As you said, it takes
>>>>>>>>>>> a few minutes (but not more than 5) to settle.
>>>>>>>>>>> 2 Buffalo nodes will mesh over wifi.
>>>>>>>>>>> 1 buffalo node "Center" is connected to one ubiquiti over ethernet.
>>>>>>>>>>> Turning off serval signing makes everything work as expected through
>>>>>>>>>>> node Center.
>>>>>>>>>>>
>>>>>>>>>>> Turn on serval, and center sees buffalos, but will not communicate with
>>>>>>>>>>> the ubiquiti device.  
>>>>>>>>>>>
>>>>>>>>>>> Thoughts for what to test/debug next? 
>>>>>>>>>>>
>>>>>>>>>>> The buffalos were built using master last week. Ubiquitis are 1.1rc2.
>>>>>>>>>>> Does master play nicely with 1.1 right now? The next thing I can think
>>>>>>>>>>> of to try is to rebuild with commotion feed as 1.1 and see if getting
>>>>>>>>>>> the same olsrd version will magically fix things.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Jul 1, 2014, at 7:48 AM, Dan Staples
>>>>>>>>>>> <danstaples at opentechinstitute.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Serval signed routes will work without a gateway/NTP. However, it will
>>>>>>>>>>>> definitely take up to 5 minutes for the timestamps to converge. They
>>>>>>>>>>>> *will* converge though, even if the starting clocks on the nodes are
>>>>>>>>>>>> days or months apart. Give it a few minutes and see if it starts working
>>>>>>>>>>>> again.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Commotion-dev mailing list
>>>>>>>>>>> Commotion-dev at lists.chambana.net
>>>>>>>>>>> https://lists.chambana.net/mailman/listinfo/commotion-dev