[Pacemaker] CTS outputs Single search timed out.
nozawat
nozawat at gmail.com
Thu Jan 27 07:17:31 UTC 2011
Hi,
I was able to complete a CTS run.
The following points needed attention:
1) Python 2.5 or later is required.
However, RHEL 5.5 ships Python 2.4, so I ran CTS with Python 2.6.5 on RHEL 6.0 instead.
2) The following environment variables are required:
* cluster_log=/share/ha/logs/ha-log-local7
* cluster_hosts="cts0101 cts0102"
3) I needed to add the stonith-enabled option to cib-bootstrap-options in the
cib.xml that CTS loads (a sketch of the fragment follows the error output below).
The following errors occur unless this is done:
-----
Jan 27 12:02:39 BadNews: Jan 27 12:01:17 cts0201 pengine: [14630]: ERROR:
unpack_resources: Resource start-up disabled since no STONITH resources have
been defined
Jan 27 12:02:39 BadNews: Jan 27 12:01:17 cts0201 pengine: [14630]: ERROR:
unpack_resources: Either configure some or disable STONITH with the
stonith-enabled option
Jan 27 12:02:39 BadNews: Jan 27 12:01:17 cts0201 pengine: [14630]: ERROR:
unpack_resources: NOTE: Clusters with shared data need STONITH to ensure
data integrity
----
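For reference, the fragment I mean in point 3 looks roughly like this inside the
cib.xml passed via --cib-filename (only a sketch: the nvpair id is just
illustrative, and I use value="false" because I run CTS with --stonith no):
-----
<crm_config>
  <cluster_property_set id="cib-bootstrap-options">
    <!-- disables STONITH so the pengine errors above do not appear -->
    <nvpair id="cib-bootstrap-options-stonith-enabled"
            name="stonith-enabled" value="false"/>
  </cluster_property_set>
</crm_config>
-----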
The log from when I ran CTS is as follows:
---
[buildbot at bbs02 /usr/share/pacemaker/tests/cts]$ python CTSlab.py --nodes
"cts0201 cts0202" --at-boot 1 --stack corosync --stonith no --logfile
/share/ha/logs/ha-log-local7 --syslog-facility local7 --cib-filename
/share/ha/cib.xml 10
Jan 27 15:48:41 Random seed is: 1296110921
Jan 27 15:48:41 >>>>>>>>>>>>>>>> BEGINNING 10 TESTS
Jan 27 15:48:41 Stack: corosync (flatiron)
Jan 27 15:48:41 Schema: pacemaker-1.0
Jan 27 15:48:41 Scenario: Random Test Execution
Jan 27 15:48:41 Random Seed: 1296110921
Jan 27 15:48:41 System log files: /share/ha/logs/ha-log-local7
Jan 27 15:48:41 Cluster nodes:
Jan 27 15:48:41 * cts0201
Jan 27 15:48:41 * cts0202
Jan 27 15:48:53 Testing for syslog logs
Jan 27 15:48:53 Testing for remote logs
Jan 27 15:49:31 Continuing with remote-based log reader
Jan 27 15:49:42 Stopping Cluster Manager on all nodes
Jan 27 15:49:42 Starting Cluster Manager on all nodes.
Jan 27 15:49:42 Starting crm-flatiron on node cts0201
Jan 27 15:51:07 Starting crm-flatiron on node cts0202
Jan 27 15:52:40 Running test SimulStop (cts0202) [ 1]
Jan 27 15:53:39 Running test NearQuorumPoint (cts0202) [ 2]
Jan 27 15:55:34 Running test ComponentFail (cts0201) [ 3]
Jan 27 15:57:40 Running test Reattach (cts0202) [ 4]
Jan 27 16:02:35 Running test SimulStop (cts0201) [ 5]
Jan 27 16:03:26 Running test SpecialTest1 (cts0201) [ 6]
Jan 27 16:06:19 Running test ComponentFail (cts0201) [ 7]
Jan 27 16:07:17 Running test SpecialTest1 (cts0201) [ 8]
Jan 27 16:10:47 Running test ComponentFail (cts0201) [ 9]
Jan 27 16:11:42 BadNews: Jan 27 16:11:03 cts0201 crmd: [23399]: ERROR:
stonithd_op_result_ready: not signed on
Jan 27 16:11:45 Running test ResourceRecover (cts0202) [ 10]
Jan 27 16:11:46 No active resources on cts0202
Jan 27 16:12:03 Stopping Cluster Manager on all nodes
Jan 27 16:12:03 Stopping crm-flatiron on node cts0201
Jan 27 16:12:26 Stopping crm-flatiron on node cts0202
Jan 27 16:13:09 ****************
Jan 27 16:13:09 Overall Results:{'failure': 0, 'skipped': 0, 'success': 10,
'BadNews': 1}
Jan 27 16:13:09 ****************
Jan 27 16:13:09 Test Summary
Jan 27 16:13:09 Test Flip: {'auditfail': 0, 'failure': 0,
'skipped': 0, 'calls': 0}
Jan 27 16:13:09 Test Restart: {'auditfail': 0, 'failure': 0,
'skipped': 0, 'calls': 0}
Jan 27 16:13:09 Test StartOnebyOne: {'auditfail': 0, 'failure': 0,
'skipped': 0, 'calls': 0}
Jan 27 16:13:09 Test SimulStart: {'auditfail': 0, 'failure': 0,
'skipped': 0, 'calls': 0}
Jan 27 16:13:09 Test SimulStop: {'auditfail': 0, 'failure': 0,
'skipped': 0, 'calls': 2}
Jan 27 16:13:09 Test StopOnebyOne: {'auditfail': 0, 'failure': 0,
'skipped': 0, 'calls': 0}
Jan 27 16:13:09 Test RestartOnebyOne: {'auditfail': 0, 'failure': 0,
'skipped': 0, 'calls': 0}
Jan 27 16:13:09 Test PartialStart: {'auditfail': 0, 'failure': 0,
'skipped': 0, 'calls': 0}
Jan 27 16:13:09 Test Standby: {'auditfail': 0, 'failure': 0,
'skipped': 0, 'calls': 0}
Jan 27 16:13:09 Test ResourceRecover: {'auditfail': 0, 'failure': 0,
'skipped': 1, 'calls': 1}
Jan 27 16:13:09 Test ComponentFail: {'auditfail': 0, 'failure': 0,
'skipped': 0, 'calls': 3}
Jan 27 16:13:09 Test Reattach: {'auditfail': 0, 'failure': 0,
'skipped': 0, 'calls': 1}
Jan 27 16:13:09 Test SpecialTest1: {'auditfail': 0, 'failure': 0,
'skipped': 0, 'calls': 2}
Jan 27 16:13:09 Test NearQuorumPoint: {'auditfail': 0, 'failure': 0,
'skipped': 0, 'calls': 1}
Jan 27 16:13:09 <<<<<<<<<<<<<<<< TESTS COMPLETED
-----
For reference, the relevant URL is as follows:
http://www.clusterlabs.org/wiki/Release_Testing
Regards,
Tomo
2011/1/26 20:01 nozawat <nozawat at gmail.com>:
> Hi Andrew
>
> Where is the filename used by cts_log_watcher.py set?
> cts_log_watcher.py is created in /tmp, but the filename inside it does not
> seem to be changed from /var/log/messages.
> In other words, the filename does not seem to be passed in by CTSlab.py.
>
> Regards,
> Tomo
>
> 2011/1/22 11:15 nozawat <nozawat at gmail.com>:
>
> Hi
>>
>> Thank you for your reply.
>> I stopped the script with Ctrl+C after "Audit LogAudit FAILED" was output.
>> * bbs01-console.log -> central server console log
>> * ha-log-local7-bbs01 -> central server
>> * ha-log-local7-cts0101 -> cts server 1
>> * ha-log-local7-cts0102 -> cts server 2
>>
>> The real file name of the server log is ha-log-local7.
>> I renamed the files in order to send them by email.
>> They are created on all servers under /share/ha/logs.
>>
>> BTW, a file is created in /tmp.
>> Are the file permissions below a problem?
>> ---
>> [11:14:23][root at bbs01 ~]$ ll /tmp
>> -rw-r--r-- 1 root root 1612 Jan 22 10:44 cts_log_watcher.py
>> [11:12:39][root at cts0101 ~]$ ll /tmp
>> -rw-r--r-- 1 root root 1612 Jan 22 10:44 cts_log_watcher.py
>> [11:13:36][root at cts0102 ~]$ ll /tmp
>> -rw-r--r-- 1 root root 1612 Jan 22 10:44 cts_log_watcher.py
>> ---
>>
>> Regards,
>> Tomo
>>
>> 2011/1/22 Andrew Beekhof <andrew at beekhof.net>
>>
>> On Fri, Jan 21, 2011 at 4:38 PM, nozawat <nozawat at gmail.com> wrote:
>>> > Hi
>>> >
>>> > Thank you for your reply.
>>> > I am logging to the central server and also on both of the servers under
>>> > CTS test.
>>> > The test message is output on both the central server and the servers
>>> > under test.
>>>
>>> Can we see it please?
>>>
>>> >
>>> > Regards,
>>> > Tomo
>>> >
>>> > 2011/1/21 Andrew Beekhof <andrew at beekhof.net>
>>> >>
>>> >> On Fri, Jan 21, 2011 at 6:03 AM, nozawat <nozawat at gmail.com> wrote:
>>> >> > Hi
>>> >> >
>>> >> > I ran CTS in the following environment.
>>> >> > * OS:RHEL5.5-x86_64
>>> >> > * pacemaker-1.0.9.1-1.15.el5
>>> >> > * TDN(bbs01)
>>> >> > * TNNs(cts0101 cts0102)
>>> >> >
>>> >> > It is probably a phenomenon like the following:
>>> >> > http://www.gossamer-threads.com/lists/linuxha/pacemaker/69322
>>> >> >
>>> >> > Passwordless SSH login -> OK.
>>> >> > Syslog message transfer via syslog-ng -> OK.
>>> >>
>>> >> You're logging to a central server? The same server you're running
>>> CTS
>>> >> on?
>>> >> If so, what is the contents of /share/ha/logs/ha-log-local7 on that
>>> >> machine? Because that is where CTS is looking.
>>> >>
>>> >> >
>>> >> > -------
>>> >> > $ python /usr/share/pacemaker/tests/cts/CTSlab.py --nodes "cts0101
>>> >> > cts0102"
>>> >> > --at-boot 1 --stack heartbeat --stonith no --logfile
>>> >> > /share/ha/logs/ha-log-local7 --syslog-facility local7 1
>>> >> > Jan 21 13:23:08 Random seed is: 1295583788
>>> >> > Jan 21 13:23:08 >>>>>>>>>>>>>>>> BEGINNING 1 TESTS
>>> >> > Jan 21 13:23:08 Stack: heartbeat
>>> >> > Jan 21 13:23:08 Schema: pacemaker-1.0
>>> >> > Jan 21 13:23:08 Scenario: Random Test Execution
>>> >> > Jan 21 13:23:08 Random Seed: 1295583788
>>> >> > Jan 21 13:23:08 System log files: /share/ha/logs/ha-log-local7
>>> >> > Jan 21 13:23:08 Cluster nodes:
>>> >> > Jan 21 13:23:08 * cts0101
>>> >> > Jan 21 13:23:08 * cts0102
>>> >> > Jan 21 13:23:12 Testing for syslog logs
>>> >> > Jan 21 13:23:12 Testing for remote logs
>>> >> > Jan 21 13:24:16 Restarting logging on: ['cts0101', 'cts0102']
>>> >> > Jan 21 13:25:49 Restarting logging on: ['cts0101', 'cts0102']
>>> >> > Jan 21 13:28:21 Restarting logging on: ['cts0101', 'cts0102']
>>> >> > Jan 21 13:31:54 Restarting logging on: ['cts0101', 'cts0102']
>>> >> > Jan 21 13:35:54 ERROR: Cluster logging unrecoverable.
>>> >> > Jan 21 13:35:54 Audit LogAudit FAILED.
>>> >> > -----
>>> >> >
>>> >> > I ran it with heartbeat, but a similar error occurs with corosync.
>>> >> > The "Single search timed out" error appears in the log, and it seems
>>> >> > to retry.
>>> >> > -----
>>> >> > Jan 21 13:23:11 bbs01 CTS: debug: Audit DiskspaceAudit passed.
>>> >> > Jan 21 13:23:12 bbs01 CTS: Testing for syslog logs
>>> >> > Jan 21 13:23:12 bbs01 CTS: Testing for remote logs
>>> >> > Jan 21 13:23:12 bbs01 CTS: debug: lw:
>>> >> > cts0101:/share/ha/logs/ha-log-local7:
>>> >> > Installing /tmp/cts_log_watcher.py on cts0101
>>> >> > Jan 21 13:23:12 bbs01 CTS: debug: lw:
>>> >> > cts0102:/share/ha/logs/ha-log-local7:
>>> >> > Installing /tmp/cts_log_watcher.py on cts0102
>>> >> > Jan 21 13:23:13 cts0102 logger: Test message from cts0102
>>> >> > Jan 21 13:23:13 cts0101 logger: Test message from cts0101
>>> >> > Jan 21 13:23:44 bbs01 CTS: debug: lw: LogAudit: Single search timed
>>> out:
>>> >> > timeout=30, start=1295583793, limit=1295583824, now=1295583824
>>> >> > Jan 21 13:24:16 bbs01 CTS: debug: lw: LogAudit: Single search timed
>>> out:
>>> >> > timeout=30, start=1295583824, limit=1295583855, now=1295583856
>>> >> > Jan 21 13:24:16 bbs01 CTS: Restarting logging on: ['cts0101',
>>> 'cts0102']
>>> >> > Jan 21 13:24:16 bbs01 CTS: debug: cmd: async: target=cts0101,
>>> rc=22203:
>>> >> > /etc/init.d/syslog-ng restart 2>&1 > /dev/null
>>> >> > Jan 21 13:24:16 bbs01 CTS: debug: cmd: async: target=cts0102,
>>> rc=22204:
>>> >> > /etc/init.d/syslog-ng restart 2>&1 > /dev/null
>>> >> > Jan 21 13:25:17 cts0102 logger: Test message from cts0102
>>> >> > Jan 21 13:25:17 cts0101 logger: Test message from cts0101
>>> >> > -----
>>> >> >
>>> >> > The test cases seem to be carried out after this error.
>>> >> > However, the script finishes with an error, because "Audit LogAudit
>>> >> > FAILED" occurs.
>>> >> > Is this the expected result of a CTS run?
>>> >> >
>>> >> > Regards,
>>> >> > Tomo
>>> >> >
>>> >> >