[Pacemaker] problem starting new instance of pacemaker (via corosync)
John White
jwhite at lbl.gov
Fri Sep 7 19:33:45 UTC 2012
I actually just chased my tail down this path to no avail (mounted /var/run from a local disk). I'll give the tmpfs a try. Here is my /proc/mounts:
rootfs / rootfs rw,relatime 0 0
/proc /proc proc rw,relatime 0 0
/sys /sys sysfs rw,relatime 0 0
/proc/bus/usb /proc/bus/usb usbfs rw,relatime 0 0
none /dev devtmpfs rw,relatime,size=32878032k,nr_inodes=8219508,mode=755 0 0
devpts /dev/pts devpts rw,relatime,gid=5,mode=620,ptmxmode=000 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
bluearc-fc:/perceus/ /var/lib/perceus nfs4 ro,relatime,vers=4,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.0.200.14,minorversion=0,local_lock=none,addr=10.0.0.51 0 0
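To see which filesystem actually backs /var/run in a listing like the one above, here is a minimal sketch. The helper name backing_mount is hypothetical, and SAMPLE is just a subset of the mounts shown above; on a live node one would pass in the contents of /proc/mounts instead. It does not handle the octal escapes (such as \040 for a space) that /proc/mounts uses in unusual path names.

```python
import os.path


def backing_mount(mounts_text, path):
    """Return (device, mount_point, fstype) for the mount that covers `path`.

    Picks the longest mount point that is a prefix of `path`. On a tie the
    later entry wins, since later mounts shadow earlier ones.
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mount_point, fstype = fields[0], fields[1], fields[2]
        mp = mount_point.rstrip("/") or "/"
        covers = mp == "/" or path == mp or path.startswith(mp + "/")
        if covers and (best is None or len(mp) >= len(best[1])):
            best = (device, mp, fstype)
    return best


# Subset of the /proc/mounts output quoted above (illustration only).
SAMPLE = """\
rootfs / rootfs rw,relatime 0 0
/proc /proc proc rw,relatime 0 0
none /dev devtmpfs rw,relatime,size=32878032k,nr_inodes=8219508,mode=755 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
"""

print(backing_mount(SAMPLE, "/var/run"))
```

With no separate entry for /var/run or /var, the lookup falls through to the root filesystem.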
----------------
John White
HPC Systems Engineer
(510) 486-7307
One Cyclotron Rd, MS: 50C-3209C
Lawrence Berkeley National Lab
Berkeley, CA 94720
On Sep 7, 2012, at 11:58 AM, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
> 07.09.2012 18:28, John White wrote:
>> An odd update to this. We run in a stateless environment (nodes are
>> pxe booted and have NFS roots, etc). Trying the same install on a VM
>> works just fine. I wonder if anyone has experience with pacemaker and
>> stateless nodes.
>
> I run it with an ISO image loaded from a PXE server into RAM.
> State data and cluster-wide configuration are on CIFS. Volatile RW data
> is on tmpfs.
>
> You probably have some trouble with the communication paths used for
> interconnection. Try mounting /var/run on tmpfs. Or where is that socket
> on Linux?
>
> memset (&address, 0, sizeof (struct sockaddr_un));
> address.sun_family = AF_UNIX;
> #if defined(COROSYNC_LINUX)
> sprintf (address.sun_path + 1, "%s", socket_name);
> #else
> sprintf (address.sun_path, "%s/%s", SOCKETDIR, socket_name);
> #endif
>
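The snippet above writes the socket name at sun_path + 1 in the Linux branch, leaving the first byte zero. On Linux that selects the abstract socket namespace, where the socket has no entry in the filesystem, while the other branch creates a path under SOCKETDIR. A minimal Python sketch of that difference, assuming a Linux host; the helper names and the socket name "example-ipc" are made up for illustration and are not what corosync uses:

```python
import os
import socket
import tempfile


def bind_abstract(name):
    """Bind a Unix socket in the Linux abstract namespace (leading NUL byte)."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.bind(b"\0" + name.encode())
    return s


def bind_on_disk(directory, name):
    """Bind a Unix socket to a regular path, which creates a socket file."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.bind(os.path.join(directory, name))
    return s


# Hypothetical name; the PID suffix avoids clashes between concurrent runs.
name = "example-ipc-%d" % os.getpid()

abstract = bind_abstract(name)
with tempfile.TemporaryDirectory() as d:
    on_disk = bind_on_disk(d, name)
    print(os.listdir(d))        # the path-based socket appears as a file
    on_disk.close()
print(abstract.getsockname())   # abstract address, begins with a NUL byte
abstract.close()
```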
>>> Sep 06 14:42:52 n0014.lustre cib: [13223]: info: init_ais_connection_classic: Connection to our AIS plugin (10) failed: Library error (2)
> It is ENOENT (2) /* No such file or directory */
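Assuming the (2) in that log line is a plain errno value, as it is read here, the number can be checked against the system's errno table. A small sketch:

```python
import errno
import os

code = 2  # the number reported in the log line, treated as an errno (assumption)

# errno.errorcode maps the numeric value to its symbolic name;
# os.strerror gives the platform's message for it.
print(errno.errorcode[code], "-", os.strerror(code))
```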
> Could you provide content of /proc/mounts?
>
> Vladislav
>
>
>>
>> On Sep 6, 2012, at 2:49 PM, John White <jwhite at lbl.gov> wrote:
>>
>>> Hello Folks,
>>> I'm having a very hard time getting a basic pacemaker setup going. I've gotten corosync up and running just fine from what I can tell, but once I start with pacemaker commands, I get CIB errors everywhere:
>>>
>>> -bash-4.1# crm configure
>>> Signon to CIB failed: connection failed
>>> Init failed, could not perform requested operations
>>> ERROR: cannot parse xml: no element found: line 1, column 0
>>> crm(live)configure#
>>>
>>> Digging deeper, I see both attrd and cib failing to connect to the AIS plugin:
>>>
>>> Sep 06 14:42:52 n0014.lustre attrd: [13225]: notice: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin)
>>> Sep 06 14:42:52 n0014.lustre attrd: [13225]: ERROR: main: HA Signon failed
>>> Sep 06 14:42:52 n0014.lustre attrd: [13225]: ERROR: main: Aborting startup
>>> -snip-
>>> Sep 06 14:42:52 n0014.lustre cib: [13223]: info: get_cluster_type: Cluster type is: 'openais'
>>> Sep 06 14:42:52 n0014.lustre cib: [13223]: notice: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin)
>>> Sep 06 14:42:52 n0014.lustre cib: [13223]: info: init_ais_connection_classic: Creating connection to our Corosync plugin
>>> Sep 06 14:42:52 n0014.lustre cib: [13223]: info: init_ais_connection_classic: Connection to our AIS plugin (10) failed: Library error (2)
>>> Sep 06 14:42:52 n0014.lustre cib: [13223]: CRIT: cib_init: Cannot sign in to the cluster… terminating
>>>
>>>
>>> I'm really at a loss here after 3 days. Any ideas or hints as to where I might find a solution? More logging is available upon request.
>>>
>>>
>>>
>>>
>>
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>
>