[Commotion-admin] [commotion-router] Mesh interfaces occasionally break on boot (#94)

Seamus Tuohy notifications at github.com
Fri Jan 17 20:07:02 UTC 2014


This intermittent bug does not cause any process to fail, but it stops the node from sending any mesh packets over the network. I believe it is associated with many of the other bugs we have been seeing. It is most easily reproduced by renaming the mesh SSID in the basic user interface, and it does not seem to depend on the name chosen or on its length. That said, and the law of large numbers being what it is, the first instance I saw used a string longer than nine characters with a dash in the middle and a number as the last character. Such a name is not required, and it does not trigger the bug every time, but it triggers it more consistently than other names.

cc: @jheretic, as this is most likely an issue with hotplug/netifd and the kernel, as shown in the logs further down.

A broken node is most easily identified by looking at httpinfo and checking the Gateway column of the HNA table (Destination / Gateway). On a broken node the gateway is all zeros (0.0.0.0).

$ netcat localhost 2006
> hit return again to have it return the data
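
The check described above can be sketched in Python. This is only a rough illustration, not part of Commotion or olsrd: the function names (fetch_info, hna_entries, broken_hna) are made up for this example, and the table layout ("Table: HNA", then a "Destination Gateway" header, then one row per entry) is assumed from the captures in this report. Sending a single newline stands in for the extra Return mentioned above and is also an assumption.

```python
import socket


def fetch_info(host="localhost", port=2006, timeout=5.0):
    """Read the text tables from the info port, like `netcat localhost 2006`."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"\n")  # assumption: mimics pressing Return a second time
        while True:
            try:
                data = sock.recv(4096)
            except socket.timeout:
                break  # give up quietly if the node keeps the socket open
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def hna_entries(text):
    """Return (destination, gateway) pairs found under 'Table: HNA'."""
    entries = []
    in_hna = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Table:"):
            # Any table heading starts a new section; only HNA is of interest.
            in_hna = stripped == "Table: HNA"
            continue
        if not in_hna or not stripped:
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] != "Destination":
            entries.append((fields[0], fields[1]))
    return entries


def broken_hna(text):
    """HNA entries whose gateway is 0.0.0.0, the symptom described above."""
    return [(dest, gw) for dest, gw in hna_entries(text) if gw == "0.0.0.0"]
```

On a node, broken_hna(fetch_info()) would return a non-empty list when the symptom is present. The parsing is kept separate from the socket code so that it can also be run against saved captures.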

Here are some examples from various logs I have taken.

fakeUser at fakeTerm:~/temp$ grep -E -A 2 "Table\:\sHNA" *

8charwork001:Table: HNA
8charwork001-Destination	Gateway
8charwork001-10.138.149.0/24	100.120.138.149
--
8xbroke001:Table: HNA
8xbroke001-Destination	Gateway
8xbroke001-10.138.149.0/24	0.0.0.0
--
9charbroke001:Table: HNA
9charbroke001-Destination	Gateway
9charbroke001-10.138.149.0/24	0.0.0.0
--
9xbroke001:Table: HNA
9xbroke001-Destination	Gateway
9xbroke001-10.138.149.0/24	0.0.0.0
--
broken_sw001:Table: HNA
broken_sw001-Destination	Gateway
broken_sw001-10.138.149.0/24	0.0.0.0
--
debug_dnsmasq:Table: HNA
debug_dnsmasq-Destination	Gateway
debug_dnsmasq-10.138.149.0/24	0.0.0.0
--
debug_NC_B002:Table: HNA
debug_NC_B002-Destination	Gateway
debug_NC_B002-10.138.149.0/24	0.0.0.0
--
debug_NC_W001:Table: HNA
debug_NC_W001-Destination	Gateway
debug_NC_W001-10.138.149.0/24	100.120.138.149
--
debugW001:Table: HNA
debugW001-Destination	Gateway
debugW001-10.138.149.0/24	100.120.138.149
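
The grep output above can be tallied mechanically. The following is a minimal sketch, assuming the output format shown here (a "NAME:Table: HNA" match line, "NAME-" context lines, and "--" between groups) and a single HNA row per capture, which is all that -A 2 returns. The function names are hypothetical.

```python
def tally_hna_gateways(grep_output):
    """Map each capture name to the HNA gateway shown in the grep output."""
    marker = ":Table: HNA"
    gateways = {}
    current = None
    for line in grep_output.splitlines():
        if line == "--" or not line.strip():
            current = None  # group separator: forget the current capture
            continue
        if line.endswith(marker):
            current = line[: -len(marker)]  # file name before the match
            continue
        if current and line.startswith(current + "-"):
            fields = line[len(current) + 1:].split()
            # Skip the "Destination Gateway" header, keep the data row.
            if len(fields) == 2 and fields[0] != "Destination":
                gateways[current] = fields[1]
    return gateways


def broken_captures(gateways):
    """Names of captures whose HNA gateway is 0.0.0.0, sorted."""
    return sorted(name for name, gw in gateways.items() if gw == "0.0.0.0")
```

Applied to the listing above, this separates the captures with gateway 100.120.138.149 from those with 0.0.0.0.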


As best I can tell, in the instances where this occurs the kernel fails to trigger a new scan to find an IBSS to join. See the "kern.info" messages in the "GOOD" log excerpts below. Each "BAD" excerpt covers the same stretch of the log from a broken restart. The first "BAD" excerpt shows the last kernel messages that appear in the bad logs, so I assume this is where the kernel error actually occurs. The final "GOOD" excerpt shows what I assume to be the set of kernel actions that are missing in bad restarts.

BAD
===============
Jan 16 14:55:48 commotion daemon.info avahi-daemon[1566]: Withdrawing address record for fe80::a2f3:c1ff:fef8:8a95 on wlan0.
Jan 16 14:55:48 commotion kern.info kernel: [ 1371.490000] ADDRCONF(NETDEV_UP): wlan0-1: link is not ready
Jan 16 14:55:49 commotion kern.info kernel: [ 1372.710000] br-lan: port 2(wlan0) entered disabled state
Jan 16 14:55:50 commotion daemon.notice netifd: meshtest (10924): commotion.proto: Running protocol handler.
================

GOOD
================
Jan 16 14:53:22 commotion daemon.info avahi-daemon[1566]: Withdrawing address record for fe80::a2f3:c1ff:fef8:8a95 on wlan0.
Jan 16 14:53:22 commotion kern.info kernel: [ 1225.920000] ADDRCONF(NETDEV_UP): wlan0-1: link is not ready
Jan 16 14:53:23 commotion kern.info kernel: [ 1227.160000] br-lan: port 2(wlan0) entered disabled state
Jan 16 14:53:24 commotion kern.info kernel: [ 1227.270000] wlan0-1: Trigger new scan to find an IBSS to join
Jan 16 14:53:24 commotion daemon.notice netifd: meshtest (10006): commotion.proto: Running protocol handler.
================

BAD
=================
Jan 16 14:55:50 commotion daemon.notice netifd: meshtest (10924): commotion.proto: proto_add_dns_search: 
Jan 16 14:55:50 commotion user.notice commotion.proto: proto_add_dns_search: 
Jan 16 14:55:51 commotion daemon.notice netifd: meshtest (10924): meshtest(): Interface type not supported
Jan 16 14:55:51 commotion user.notice dnsmasq: DNS rebinding protection is active, will discard upstream RFC1918 responses!
Jan 16 14:55:51 commotion user.notice dnsmasq: Allowing 127.0.0.0/8 responses
Jan 16 14:55:51 commotion daemon.notice netifd: meshtest (10924): meshtest(): Interface type not supported
Jan 16 14:55:51 commotion daemon.notice netifd: meshtest (10924): commotion.proto: Sending update for meshtest
=================


GOOD
===============
Jan 16 14:53:25 commotion user.notice commotion.proto: proto_add_dns_search: 
Jan 16 14:53:26 commotion kern.info kernel: [ 1229.320000] wlan0-1: Trigger new scan to find an IBSS to join
Jan 16 14:53:26 commotion user.notice dnsmasq: DNS rebinding protection is active, will discard upstream RFC1918 responses!
Jan 16 14:53:26 commotion user.notice dnsmasq: Allowing 127.0.0.0/8 responses
Jan 16 14:53:26 commotion daemon.notice netifd: meshtest (10006): meshtest(): Interface type not supported
Jan 16 14:53:26 commotion daemon.notice netifd: meshtest (10006): meshtest(): Interface type not supported
Jan 16 14:53:26 commotion daemon.notice netifd: meshtest (10006): commotion.proto: Sending update for meshtest
==============

BAD
=============
Jan 16 14:55:53 commotion user.notice commotion.hotplug.olsrd: meshed: 1
Jan 16 14:55:53 commotion user.notice commotion.hotplug.olsrd: announced: 0
Jan 16 14:55:55 commotion daemon.info dnsmasq[11222]: started, version 2.66 cachesize 150
=============

GOOD
===============
Jan 16 14:53:27 commotion user.notice commotion.hotplug.olsrd: meshed: 1
Jan 16 14:53:27 commotion user.notice commotion.hotplug.olsrd: announced: 0
Jan 16 14:53:28 commotion kern.info kernel: [ 1231.370000] wlan0-1: Trigger new scan to find an IBSS to join
Jan 16 14:53:29 commotion daemon.info dnsmasq[10304]: started, version 2.66 cachesize 150
=================

BAD
Jan 16 14:55:55 commotion daemon.info dnsmasq-dhcp[11222]: read /etc/ethers - 0 addresses
Jan 16 14:55:56 commotion daemon.info olsrd[11240]: Writing '1' (was 1) to /proc/sys/net/ipv4/ip_forward


GOOD
Jan 16 14:53:29 commotion daemon.info dnsmasq-dhcp[10304]: read /etc/ethers - 0 addresses
Jan 16 14:53:30 commotion kern.info kernel: [ 1233.420000] wlan0-1: Trigger new scan to find an IBSS to join
Jan 16 14:53:30 commotion daemon.info olsrd[10318]: Writing '1' (was 1) to /proc/sys/net/ipv4/ip_forward

FINAL BAD
============
Jan 16 14:55:56 commotion daemon.info olsrd[11240]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/all/rp_filter
Jan 16 14:56:01 commotion daemon.info olsrd[11240]: olsr.org -  0.6.5.4-git_4c19cba-hash_3667acb4ad7e32204039db1f6b9bc660  - successfully started
Jan 16 14:56:01 commotion cron.info crond[729]: crond: USER root pid 11315 cmd /usr/bin/commotion-bigboard-send
Jan 16 14:56:01 commotion cron.info crond[729]: crond: USER root pid 11317 cmd /usr/sbin/ff_olsr_test_gw.sh
===========


FINAL GOOD !!!!!!
======================
Jan 16 14:53:30 commotion daemon.info olsrd[10318]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/all/rp_filter
Jan 16 14:53:31 commotion kern.info kernel: [ 1234.280000] wlan0-1: Creating new IBSS network, BSSID b2:d5:ac:6c:01:01
Jan 16 14:53:31 commotion kern.info kernel: [ 1234.290000] ADDRCONF(NETDEV_CHANGE): wlan0-1: link becomes ready
Jan 16 14:53:31 commotion kern.info kernel: [ 1234.500000] br-lan: port 2(wlan0) entered forwarding state
Jan 16 14:53:31 commotion kern.info kernel: [ 1234.510000] br-lan: port 2(wlan0) entered forwarding state
Jan 16 14:53:31 commotion kern.info kernel: [ 1234.520000] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
Jan 16 14:53:32 commotion daemon.info avahi-daemon[1566]: Registering new address record for fe80::a0f3:c1ff:fef8:8a95 on wlan0-1.*.
Jan 16 14:53:33 commotion kern.info kernel: [ 1236.510000] br-lan: port 2(wlan0) entered forwarding state
Jan 16 14:53:35 commotion daemon.info olsrd[10318]: olsr.org -  0.6.5.4-git_4c19cba-hash_3667acb4ad7e32204039db1f6b9bc660  - successfully started
Jan 16 14:53:35 commotion daemon.info olsrd[10318]: Writing '0' (was 1) to /proc/sys/net/ipv4/conf/wlan0-1/send_redirects
Jan 16 14:53:35 commotion daemon.info olsrd[10318]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/wlan0-1/rp_filter
Jan 16 14:53:35 commotion daemon.info olsrd[10318]: Adding interface wlan0-1
Jan 16 14:53:35 commotion daemon.info olsrd[10318]: New main address: 100.120.138.149
Jan 16 14:54:01 commotion cron.info crond[729]: crond: USER root pid 10398 cmd /usr/bin/commotion-bigboard-send
Jan 16 14:54:01 commotion cron.info crond[729]: crond: USER root pid 10399 cmd /usr/sbin/ff_olsr_test_gw.sh
============================
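
The difference between the GOOD and BAD excerpts above can be checked with a small script. This is a heuristic sketch based only on the log lines in this report, not a diagnosis: it counts the "Trigger new scan to find an IBSS to join" kernel messages for wlan0-1 and notes whether "Creating new IBSS network" ever appears. The function name summarize_restart and the regular expression are assumptions made for this example.

```python
import re

# Matches lines such as:
#   ... kern.info kernel: [ 1227.270000] wlan0-1: Trigger new scan ...
KERNEL_LINE = re.compile(
    r"kern\.info kernel: \[\s*(?P<uptime>\d+\.\d+)\] "
    r"(?P<iface>[\w-]+): (?P<msg>.*)$"
)


def summarize_restart(log_text, iface="wlan0-1"):
    """Count IBSS scan triggers and detect IBSS creation for one interface."""
    scans = 0
    created = False
    for line in log_text.splitlines():
        match = KERNEL_LINE.search(line)
        if not match or match.group("iface") != iface:
            continue  # not a kernel message for the mesh interface
        msg = match.group("msg")
        if msg.startswith("Trigger new scan to find an IBSS"):
            scans += 1
        elif msg.startswith("Creating new IBSS network"):
            created = True
    return {"scan_triggers": scans, "ibss_created": created}
```

Run against the excerpts above, the GOOD restart shows scan triggers followed by an IBSS being created, while the BAD restart shows neither.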




---
Reply to this email directly or view it on GitHub:
https://github.com/opentechinstitute/commotion-router/issues/94