- You have basic knowledge of Fabric servers, clusters and ensembles and their functions
- You can navigate through Linux servers and know the basic commands
- You are authorised to make changes on the servers; please test locally first
- All Fuse Fabric processes are shut down
How do you know the ZooKeeper server cluster is not running or is broken?
org.fusesource.fabric.api.FabricException: java.lang.IllegalStateException: Error waiting for ZooKeeper connection
In the FuseESB/FuseMQ root container logs you will see the following error
05:47:11,059 | WARN | 0.0/0.0.0.0:2182 | NIOServerCnxn | 51 - org.fusesource.fabric.fabric-linkedin-zookeeper - 7.1.0.fuse-047 | Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
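The log check described above can be sketched as a small shell function. This is only an illustration: the function name `zk_log_errors` is hypothetical, and the log path in the usage comment is the FuseMQ log used later in this post, so adjust it for your container.

```shell
# zk_log_errors LOGFILE
# Prints the number of lines in LOGFILE that contain one of the two
# error signatures quoted above. (Hypothetical helper name.)
zk_log_errors() {
    local logfile=$1
    grep -c -e 'ZooKeeperServer not running' \
            -e 'Error waiting for ZooKeeper connection' "$logfile"
}

# Usage:
#   zk_log_errors /opt/FuseMQEnterprise-7.1.0/data/log/fusemq.log
```

A count above zero suggests the container could not reach a running ZooKeeper server.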
Impact
Disaster Recovery
fuse@VBoxSvr1:$ netstat -nltp | grep :2182
This will give you the port number, the process id and the process name
tcp 0 0 :::2182 :::* LISTEN 28767/java
You can also find out which of the two root containers is the server by searching for the PID (28767/java)
fuse@VBoxSvr1:~$ ps -eaf | grep 28767
fuse 892 864 0 18:28 pts/0 00:00:00 grep --color=auto 28767
fuse 28767 28765 0 05:28 ? 00:04:07 java -Dkaraf.home=/opt/FuseESBEnterprise-7.1.0 -Dkaraf.base=/opt/FuseESBEnterprise-7.1.0 -Dkaraf.data=/opt/FuseESBEnterprise-7.1.0/data -Dcom.sun.management.jmxremote -Dkaraf.startLocalConsole=false -Dkaraf.startRemoteShell=true -Djava.endorsed.dirs=/usr/java/jdk1.7.0_17/jre/lib/endorsed:/usr/java/jdk1.7.0_17/lib/endorsed:/opt/FuseESBEnterprise-7.1.0/lib/endorsed -Djava.ext.dirs=/usr/java/jdk1.7.0_17/jre/lib/ext:/usr/java/jdk1.7.0_17/lib/ext:/opt/FuseESBEnterprise-7.1.0/lib/ext -Xmx512m -Djava.library.path=/opt/FuseESBEnterprise-7.1.0/lib/ -classpath /opt/FuseESBEnterprise-7.1.0/lib/karaf-wrapper.jar:/opt/FuseESBEnterprise-7.1.0/lib/karaf.jar:/opt/FuseESBEnterprise-7.1.0/lib/karaf-jaas-boot.jar:/opt/FuseESBEnterprise-7.1.0/lib/karaf-wrapper-main.jar -Dwrapper.key=aUKIyldpL0xq60Xs -Dwrapper.port=32001 -Dwrapper.jvm.port.min=31000 -Dwrapper.jvm.port.max=31999 -Dwrapper.pid=28765 -Dwrapper.version=3.2.3 -Dwrapper.native_library=wrapper -Dwrapper.service=TRUE -Dwrapper.cpu.timeout=10 -Dwrapper.jvmid=1 org.apache.karaf.shell.wrapper.Main
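The two lookups above (port to PID, then PID to root container) can be combined in a short script. The sketch below assumes the `netstat -nltp` and `ps -eaf` output formats shown in this post; the function names `pid_on_port` and `karaf_home_of` are made up for illustration.

```shell
# pid_on_port PORT
# Reads `netstat -nltp` output on stdin and prints the PID of the
# process listening on PORT. (Hypothetical helper name.)
pid_on_port() {
    local port=$1
    awk -v port="$port" '
        $6 == "LISTEN" && $4 ~ (":" port "$") {
            split($7, parts, "/")   # last column looks like 28767/java
            print parts[1]
        }'
}

# karaf_home_of
# Reads a `ps -eaf` line on stdin and prints the value of -Dkaraf.home,
# which identifies the root container the process belongs to.
karaf_home_of() {
    grep -o -e '-Dkaraf\.home=[^ ]*' | head -n 1 | cut -d= -f2-
}

# Usage:
#   pid=$(netstat -nltp 2>/dev/null | pid_on_port 2182)
#   ps -eaf | awk -v pid="$pid" '$2 == pid' | karaf_home_of
```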
- org_apache_felix_cm_impl_DynamicBindings.config
- zookeeper.config
- factory.config
- {uniqueservername}.config
/opt/FuseESBEnterprise-7.1.0/data/cache/bundle5/data/config/org_apache_felix_cm_impl_DynamicBindings.config
/opt/FuseESBEnterprise-7.1.0/data/cache/bundle5/data/config/org/fusesource/fabric/zookeeper.config
/opt/FuseESBEnterprise-7.1.0/data/cache/bundle5/data/config/org/fusesource/fabric/zookeeper/server/factory.config
/opt/FuseESBEnterprise-7.1.0/data/cache/bundle5/data/config/org/fusesource/fabric/zookeeper/server/af327e34-501b-1576-9abf-7ad6eb7eb582.config
View the server configuration file for the running server
cat /opt/FuseESBEnterprise-7.1.0/data/cache/bundle5/data/config/org/fusesource/fabric/zookeeper/server/af327e34-501b-1576-9abf-7ad6eb7eb582.config
Note the server.id and the service.pid. They are unique to each fabric server.
server.3="VBoxSvr1:2889:3889"
server.2="VBoxSvr1:2888:3888"
server.1="VBoxSvr0:2888:3888"
server.id="3"
clientPort="2182"
service.factoryPid="org.fusesource.fabric.zookeeper.server"
tickTime="2000"
fabric.zookeeper.pid="org.fusesource.fabric.zookeeper.server-0001"
initLimit="10"
syncLimit="20"
service.pid="org.fusesource.fabric.zookeeper.server.af327e34-501b-1576-9abf-7ad6eb7eb582"
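To note down server.id and service.pid without reading the whole file, a small helper can pull a single key out of a config file. This is a minimal sketch, assuming the file on disk uses one key=value pair per line with plain double quotes around the value; `cfg_get` is a hypothetical name.

```shell
# cfg_get FILE KEY
# Prints the value of KEY from a key="value" config file with the
# surrounding double quotes removed. (Hypothetical helper name.)
cfg_get() {
    local file=$1 key=$2
    awk -F= -v key="$key" '
        $1 == key {
            value = substr($0, length(key) + 2)   # everything after "key="
            gsub(/^"|"$/, "", value)              # drop the enclosing quotes
            print value
            exit
        }' "$file"
}

# Usage:
#   cfg=/opt/FuseESBEnterprise-7.1.0/data/cache/bundle5/data/config/org/fusesource/fabric/zookeeper/server/af327e34-501b-1576-9abf-7ad6eb7eb582.config
#   cfg_get "$cfg" server.id
#   cfg_get "$cfg" service.pid
```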
The zookeeper config can be found in the following directory
fuse@VBoxSvr0:$ cat /opt/FuseESBEnterprise-7.1.0/data/cache/bundle5/data/config/org/fusesource/fabric/zookeeper.config
The contained information should look like this
service.pid="org.fusesource.fabric.zookeeper"
zookeeper.password="7Aiqgl25jIMr9rc3"
zookeeper.url="VBoxSvr0.localdomain:2182,VBoxSvr1.localdomain:2181,VBoxSvr1.localdomain:2182"
fabric.zookeeper.pid="org.fusesource.fabric.zookeeper"
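The zookeeper.url value lists every member of the ensemble, so it is a handy source for the host and client port of each server. The sketch below splits it into one entry per line; it assumes plain double quotes in the file on disk, and `zk_urls` is a hypothetical name.

```shell
# zk_urls FILE
# Prints each host:port entry of zookeeper.url on its own line.
# (Hypothetical helper name.)
zk_urls() {
    sed -n 's/^zookeeper\.url="\([^"]*\)".*/\1/p' "$1" | tr ',' '\n'
}

# Usage:
#   zk_urls /opt/FuseESBEnterprise-7.1.0/data/cache/bundle5/data/config/org/fusesource/fabric/zookeeper.config
```

For the file shown above this gives three entries, one per ensemble member.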
The factory configuration can be found further down inside the zookeeper folder
fuse@VBoxSvr1:$ cat /opt/FuseESBEnterprise-7.1.0/data/cache/bundle5/data/config/org/fusesource/fabric/zookeeper/server/factory.config
The information it contains links the config file to the unique server id.
factory.pid="org.fusesource.fabric.zookeeper.server"
factory.pidList=["org.fusesource.fabric.zookeeper.server.af327e34-501b-1576-9abf-7ad6eb7eb582"]
What to look for?
- How many Zookeeper config files are missing?
- How many server config files are missing?
- How many factory.config files are missing or contain incorrect information?
- How many DynamicBindings.config files are missing the server information?
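The checklist above can be run against each root container with a short script. This is a sketch only: it assumes the config files live under `data/cache/bundle5/data/config` as in the paths shown earlier (the bundle number may differ on your install), and `missing_zk_configs` is a hypothetical name. It checks only whether the files exist, not whether their contents are correct.

```shell
# missing_zk_configs KARAF_HOME
# Prints one "missing: ..." line for each of the four ZooKeeper-related
# config files that cannot be found. (Hypothetical helper name.)
missing_zk_configs() {
    local cfg=$1/data/cache/bundle5/data/config
    local f
    for f in \
        "$cfg/org_apache_felix_cm_impl_DynamicBindings.config" \
        "$cfg/org/fusesource/fabric/zookeeper.config" \
        "$cfg/org/fusesource/fabric/zookeeper/server/factory.config"
    do
        [ -f "$f" ] || echo "missing: $f"
    done
    # The server config has a unique name, so look for any .config file
    # in the server directory other than factory.config.
    if ! ls "$cfg"/org/fusesource/fabric/zookeeper/server/*.config 2>/dev/null \
            | grep -qv '/factory\.config$'; then
        echo "missing: $cfg/org/fusesource/fabric/zookeeper/server/{uniqueservername}.config"
    fi
}

# Usage:
#   missing_zk_configs /opt/FuseESBEnterprise-7.1.0
#   missing_zk_configs /opt/FuseMQEnterprise-7.1.0
```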
{uniqueservername}.config
Work through the above-mentioned files in turn. First up is the unique server name string. To find this
unique string for the fabric servers that are not running, search the
log files of the Fuse ESB and MQ root containers; the logs will contain the
information. The example below shows the search for the FuseMQ fabric server string.
2013-06-30 19:54:48,068 | INFO | okeeper.server]) | ZKServerFactoryBean | ternal.BaseManagedServiceFactory 69 | 50 - org.fusesource.fabric.fabric-zookeeper - 7.1.0.fuse-047 | Configuration org.fusesource.fabric.zookeeper.server.56084def-f94c-4af6-bc5a-0a818848883d updated: {server.3=VBoxSvr1:2889:3889, server.2=VBoxSvr1:2888:3888, server.1=VBoxSvr0:2888:3888, server.id=2, clientport=2181, service.factorypid=org.fusesource.fabric.zookeeper.server, ticktime=2000, fabric.zookeeper.pid=org.fusesource.fabric.zookeeper.server-0001, synclimit=5, initlimit=10, service.pid=org.fusesource.fabric.zookeeper.server.56084def-f94c-4af6-bc5a-0a818848883d, datadir=data/zookeeper/0001}
2013-06-30 19:54:48,111 | INFO | pool-10-thread-1 | ZKServerFactoryBean | per.internal.ZKServerFactoryBean 44 | 50 - org.fusesource.fabric.fabric-zookeeper - 7.1.0.fuse-047 | Creating zookeeper server with properties: {server.3=VBoxSvr1:2889:3889, server.2=VBoxSvr1:2888:3888, server.1=VBoxSvr0:2888:3888, server.id=2, clientport=2181, service.factorypid=org.fusesource.fabric.zookeeper.server, ticktime=2000, fabric.zookeeper.pid=org.fusesource.fabric.zookeeper.server-0001, synclimit=5, initlimit=10, service.pid=org.fusesource.fabric.zookeeper.server.56084def-f94c-4af6-bc5a-0a818848883d, datadir=data/zookeeper/0001}
2013-06-30 19:54:49,010 | INFO | use-047-thread-2 | ZooKeeperConfigAdminBridge | admin.ZooKeeperConfigAdminBridge 361 | 149 - org.fusesource.fabric.fabric-configadmin - 7.1.0.fuse-047 | Deleting configuration org.fusesource.fabric.zookeeper.server.56084def-f94c-4af6-bc5a-0a818848883d
2013-06-30 19:54:49,012 | INFO | 5a-0a511818883d) | ZKServerFactoryBean | ternal.BaseManagedServiceFactory 88 | 50 - org.fusesource.fabric.fabric-zookeeper - 7.1.0.fuse-047 | Configuration org.fusesource.fabric.zookeeper.server.56084def-f94c-4af6-bc5a-0a818848883d delete
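Rather than reading the log by eye, the unique server string can be extracted with a regular expression. The sketch below assumes the string always has the form `org.fusesource.fabric.zookeeper.server.` followed by a UUID, as in the log lines above; `zk_server_pids` is a hypothetical name.

```shell
# zk_server_pids LOGFILE...
# Prints every distinct org.fusesource.fabric.zookeeper.server.<uuid>
# string found in the given log files. (Hypothetical helper name.)
zk_server_pids() {
    grep -ohE 'org\.fusesource\.fabric\.zookeeper\.server\.[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}' "$@" \
        | sort -u
}

# Usage:
#   zk_server_pids /opt/FuseMQEnterprise-7.1.0/data/log/fusemq.log
```

If more than one string is printed, compare the surrounding log entries (server.id, clientport) to decide which one belongs to the server you are restoring.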
If you do not find any information in the logs, search the full folder using grep
grep -irl "org.fusesource.fabric.zookeeper.server.*" /opt/FuseMQEnterprise-7.1.0/*
vim /opt/FuseMQEnterprise-7.1.0/data/cache/bundle5/data/config/org/fusesource/fabric/zookeeper/server/56084def-f94c-4af6-bc5a-0a818848883d.config
server.3="VBoxSvr1:2889:3889"
server.2="VBoxSvr1:2888:3888"
server.1="VBoxSvr0:2888:3888"
server.id="2"
clientPort="2181"
service.factoryPid="org.fusesource.fabric.zookeeper.server"
tickTime="2000"
fabric.zookeeper.pid="org.fusesource.fabric.zookeeper.server-0001"
initLimit="10"
syncLimit="20"
service.pid="org.fusesource.fabric.zookeeper.server.56084def-f94c-4af6-bc5a-0a818848883d"
zookeeper.config
org_apache_felix_cm_impl_DynamicBindings.config
Open this file and add the line given below. Then save and close.
org.fusesource.fabric.zookeeper.server.56084def-f94c-4af6-bc5a-0a818848883d="mvn:org.fusesource.fabric/fabric-zookeeper/7.1.0.fuse-047"
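If you prefer to script this step instead of editing by hand, the line can be appended only when it is not already present, so running the step twice does no harm. This is a sketch with a hypothetical helper name `ensure_line`; it assumes the file ends with a newline, and the file path is left to the caller.

```shell
# ensure_line FILE LINE
# Appends LINE to FILE unless an identical line is already there.
# (Hypothetical helper name.)
ensure_line() {
    local file=$1 line=$2
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (DYNAMIC_BINDINGS is a placeholder for the path to
# org_apache_felix_cm_impl_DynamicBindings.config):
#   ensure_line "$DYNAMIC_BINDINGS" \
#     'org.fusesource.fabric.zookeeper.server.56084def-f94c-4af6-bc5a-0a818848883d="mvn:org.fusesource.fabric/fabric-zookeeper/7.1.0.fuse-047"'
```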
factory.config
vim /opt/FuseMQEnterprise-7.1.0/data/cache/bundle5/data/config/org/fusesource/fabric/zookeeper/server/factory.config
Here the factory.pidList value should match the service.pid in the server config. Both files should be in the same directory.
factory.pid="org.fusesource.fabric.zookeeper.server"
factory.pidList=["org.fusesource.fabric.zookeeper.server.56084def-f94c-4af6-bc5a-0a818848883d"]
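The consistency rule between the two files can be checked before restarting anything. The sketch below assumes both files use plain double quotes on disk; `pid_in_factory` is a hypothetical name.

```shell
# pid_in_factory SERVER_CONFIG FACTORY_CONFIG
# Succeeds (exit status 0) when the service.pid from SERVER_CONFIG is
# listed in factory.pidList of FACTORY_CONFIG. (Hypothetical helper name.)
pid_in_factory() {
    local server_cfg=$1 factory_cfg=$2 pid
    pid=$(sed -n 's/^service\.pid="\(.*\)"$/\1/p' "$server_cfg")
    [ -n "$pid" ] &&
        grep '^factory\.pidList=' "$factory_cfg" | grep -qF -e "\"$pid\""
}

# Usage:
#   dir=/opt/FuseMQEnterprise-7.1.0/data/cache/bundle5/data/config/org/fusesource/fabric/zookeeper/server
#   pid_in_factory "$dir/56084def-f94c-4af6-bc5a-0a818848883d.config" "$dir/factory.config" \
#     && echo "factory.pidList matches" || echo "MISMATCH"
```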
service fuseesb-service start
service fusemq-service start
tail -f /opt/FuseMQEnterprise-7.1.0/data/log/fusemq.log
Try to log in to the console using a client connection
ssh admin@VBoxSvr0 -p 8101
Then run container-list to view all the containers.
FuseMQ:admin@fusemq-root-node02> container-list | grep true
FuseManagementConsole    1.3    true    fmc
fusemq-root-node01       1.3    true    fabric, fabric-ensemble-0001-1    success
fusemq-root-node02*      1.3    true    fabric, fabric-ensemble-0001-2    success
fuseesb-root-node01      1.3    true    fabric, fuse-esb-full             success
fuseesb-child-node01     1.3    true    fabric, default                   success
fuseesb-child2-node01    1.3    true    fabric, default                   success
fuseesb-root-node02      1.3    true    fabric, fabric-ensemble-0001-3    success
FuseMQ:admin@fusemq-root-node02> ensemble-list
[id]
fusemq-root-node01
fusemq-root-node02
fuseesb-root-node02
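As a final cross-check, the containers that carry a fabric-ensemble profile and report success in the container-list output should be the same ones that ensemble-list prints. The sketch below filters saved container-list output accordingly; it assumes the column layout shown above, and `healthy_ensemble_members` is a hypothetical name.

```shell
# healthy_ensemble_members
# Reads container-list output on stdin and prints the names of the
# containers that have a fabric-ensemble profile and whose last column
# is "success". (Hypothetical helper name.)
healthy_ensemble_members() {
    awk '/fabric-ensemble-/ && $NF == "success" {
        name = $1
        sub(/\*$/, "", name)   # the console marks the current container with *
        print name
    }'
}

# Usage (containers.txt is a placeholder for saved container-list output):
#   healthy_ensemble_members < containers.txt
```

For the output shown above this prints fusemq-root-node01, fusemq-root-node02 and fuseesb-root-node02, matching the ensemble-list result.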