Training-high-availability

Latest revision as of 11:13, 15 February 2010

Concepts

The Cluster Manager tells the Cluster Resource Agents to "start up" or "shut down" their resources. Virtual IP addresses are managed by the cluster.
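
To illustrate the start/stop contract, here is a minimal sketch of a resource agent. It is not one of the agents used in this setup: the state file stands in for a real service, and the names STATE_FILE and agent_action are made up for the example. The return codes follow the OCF convention (0 = success, 7 = not running, 3 = action not implemented).

```shell
#!/bin/sh
# Hypothetical resource agent skeleton. The cluster manager calls the agent
# with one action; STATE_FILE stands in for a real service (assumption).
STATE_FILE="${STATE_FILE:-/tmp/demo_resource.state}"

OCF_SUCCESS=0
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7

agent_action() {
    case "$1" in
        start)
            # "Start" the resource by creating the marker file.
            touch "$STATE_FILE"
            return $OCF_SUCCESS
            ;;
        stop)
            # Stopping an already stopped resource still counts as success.
            rm -f "$STATE_FILE"
            return $OCF_SUCCESS
            ;;
        monitor|status)
            if [ -e "$STATE_FILE" ]; then
                return $OCF_SUCCESS
            fi
            return $OCF_NOT_RUNNING
            ;;
        *)
            return $OCF_ERR_UNIMPLEMENTED
            ;;
    esac
}
```

A real agent would end with something like agent_action "$1"; exit $? so that the cluster manager sees the return code.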

Host-based mirroring

Host-based mirroring is done with mdadm. Other replication methods are:

  • log shipping
  • synchronous mirroring

Autoyast

install=http://... autoyast=http://...

Migration

A migration in the sense of SAP is a change of OS or DB, but not an upgrade or update.

URLs

  • http://www.suse.de/~fabian
  • http://www.swdist.org
  • http://www.easymarketplace.de

IPs

10.31.19.101 sapna1ci
10.31.19.102 sapna1db

Definitions

  • SID is NA1
  • user na1adm id 2000, group 2000
  • user sdb id 2001, group 2001
  • user sqdna1 id 2002
  • ascs gets instance number 00
  • ci gets instance number 02
  • we use IBM Java 1.4.2

Decisions

  • We use a simple stack: if a server goes down, the complete stack switches over to the other node, and we accept the downtime this causes.

Progress

Start multipath

/etc/init.d/boot.multipath start
/etc/init.d/multipathd start
multipath -ll

fdisk

Create a small partition for the quorum (SFEX) and a large partition for everything else.

mdadm

mdadm.conf:

DEVICE /dev/disk/by-id/scsi-360a98000486e5337524a4f6a4e43542d-part2
ARRAY /dev/md0 level=raid0 num-devices=1 UUID=357d721b:86058e8c:8d746cf9:a620ae3c

Assemble the array with this configuration:

mdadm --assemble --config /clusterconf/NA1/mdadm.conf /dev/md0
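
Before assembling, it can help to check that the devices named in mdadm.conf are visible on the node. This is a sketch only: the function names are made up for the example, and it only looks at DEVICE lines (shell glob patterns on those lines are expanded by the for loop).

```shell
# List the devices named on DEVICE lines of an mdadm.conf-style file.
mdadm_conf_devices() {
    awk '$1 == "DEVICE" { for (i = 2; i <= NF; i++) print $i }' "$1"
}

# Return 0 only if every listed device exists as a block device.
mdadm_conf_devices_present() {
    missing=0
    for dev in $(mdadm_conf_devices "$1"); do
        if [ ! -b "$dev" ]; then
            echo "missing: $dev" >&2
            missing=1
        fi
    done
    return $missing
}
```

Possible usage: mdadm_conf_devices_present /clusterconf/NA1/mdadm.conf && mdadm --assemble --config /clusterconf/NA1/mdadm.conf /dev/md0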

Create lvm

pvcreate /dev/md0
vgcreate sapvg /dev/md0
lvcreate -L 145G -n sapdb sapvg
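
The mount step further down expects three logical volumes (usrsap, sapdb, sapmnt). A small sketch that prints the lvcreate commands for a list of name:size pairs, without running them, could look like this. Only the 145G for sapdb comes from this page; the other two sizes are placeholders, and plan_lvs is a made-up helper name.

```shell
# Print the lvcreate commands for a list of "name:size" pairs.
# Nothing is executed; review the output, then pipe it to sh to apply it.
plan_lvs() {
    vg="$1"; shift
    for spec in "$@"; do
        name="${spec%%:*}"   # part before the first colon
        size="${spec#*:}"    # part after the first colon
        echo "lvcreate -L $size -n $name $vg"
    done
}

# sapdb 145G is from this page; 20G and 10G are placeholder sizes.
plan_lvs sapvg sapdb:145G usrsap:20G sapmnt:10G
```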

Start lvm

vgscan 
vgdisplay
lvdisplay
vgchange -a y sapvg

Mount the logical volumes

mount /dev/sapvg/usrsap /usr/sap
mount /dev/sapvg/sapdb /sapdb
mount /dev/sapvg/sapmnt /sapmnt
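
The three mounts can also be wrapped so that a volume that is already mounted is skipped. This is a sketch under assumptions: mount_all and DRY_RUN are made-up names, and with the default DRY_RUN=1 the commands are only printed.

```shell
# Mount each "device mountpoint" pair read from stdin unless it is
# already mounted. With DRY_RUN=1 (default) commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

mount_all() {
    while read -r dev mnt; do
        [ -n "$dev" ] || continue          # skip empty lines
        if mountpoint -q "$mnt" 2>/dev/null; then
            echo "already mounted: $mnt"
        elif [ "$DRY_RUN" = 1 ]; then
            echo "mount $dev $mnt"
        else
            mount "$dev" "$mnt"
        fi
    done
}

mount_all <<'EOF'
/dev/sapvg/usrsap /usr/sap
/dev/sapvg/sapdb /sapdb
/dev/sapvg/sapmnt /sapmnt
EOF
```

Set DRY_RUN=0 to actually mount.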

Java

Install java

yast -i java-1.4.2-ibm

SAP

Install

ASCS

./sapinst SAPINST_USE_HOSTNAME=sapna1as

Choose Netweaver -> Application Server ABAP -> MaxDB -> High Availability -> ASCS

DB

./sapinst SAPINST_USE_HOSTNAME=sapna1db

Choose Netweaver -> Application Server ABAP -> MaxDB -> High Availability -> db instance

CI

./sapinst SAPINST_USE_HOSTNAME=sapna1ci

Start

startsap sapna1as
startsap sapna1ci

Cluster

  • Test the quorum
/usr/lib64/heartbeat/sfex_stat  /dev/disk/by-id/scsi-360a98000486e5337524a4f6a4e43542d-part1
/usr/lib64/heartbeat/sfex_lock  /dev/disk/by-id/scsi-360a98000486e5337524a4f6a4e43542d-part1
/usr/lib64/heartbeat/sfex_stat  /dev/disk/by-id/scsi-360a98000486e5337524a4f6a4e43542d-part1
/usr/lib64/heartbeat/sfex_unlock  /dev/disk/by-id/scsi-360a98000486e5337524a4f6a4e43542d-part1
/usr/lib64/heartbeat/sfex_stat  /dev/disk/by-id/scsi-360a98000486e5337524a4f6a4e43542d-part1

Example output, first on rx3009 and then on rx3010:

rx3009:~ # /usr/lib64/heartbeat/sfex_stat  /dev/disk/by-id/scsi-360a98000486e5337524a4f6a4e43542d-part1
control data:
  magic: 0x01, 0x1f, 0x71, 0x7f
  version: 1
  revision: 3
  blocksize: 512
  numlocks: 10
lock data #1:
  status: lock
  count: 16
  nodename: rx3009
status is LOCKED.
rx3010:~ # /usr/lib64/heartbeat/sfex_stat  /dev/disk/by-id/scsi-360a98000486e5337524a4f6a4e43542d-part1
control data:
  magic: 0x01, 0x1f, 0x71, 0x7f
  version: 1
  revision: 3
  blocksize: 512
  numlocks: 10
lock data #1:
  status: lock
  count: 28
  nodename: rx3009
status is UNLOCKED.
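
For scripted checks, the two interesting fields can be pulled out of a report like the ones above. This is a sketch: sfex_holder and sfex_state are made-up helper names, and it assumes sfex_stat writes the report to standard output in exactly the format shown.

```shell
# Both helpers read an sfex_stat report on stdin.

# Print the node name recorded in the lock data.
sfex_holder() {
    awk '$1 == "nodename:" { print $2; exit }'
}

# Print LOCKED or UNLOCKED from the final "status is ..." line.
sfex_state() {
    awk '/^status is / { sub(/\.$/, "", $3); print $3; exit }'
}
```

Possible usage: /usr/lib64/heartbeat/sfex_stat /dev/disk/by-id/scsi-360a98000486e5337524a4f6a4e43542d-part1 | sfex_state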
  • establish passwordless login between the nodes
  • write the heartbeat configuration
vi /etc/ha.d/ha.cf
cat /etc/ha.d/authkeys
  • propagate it
ha_propagate
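
Heartbeat expects /etc/ha.d/authkeys to be readable by root only (mode 600) and to contain an "auth" line plus a numbered key line. A sketch for generating such a file with one random sha1 key is shown below; write_authkeys is a made-up helper name, and the file on this cluster may have been created differently.

```shell
# Write a heartbeat authkeys file with a single random sha1 key.
write_authkeys() {
    target="$1"
    # 32 random bytes rendered as 64 hex characters.
    key="$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"
    # Create the file with a restrictive umask so it is never world-readable.
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$target" )
    chmod 600 "$target"
}
```

Possible usage: write_authkeys /etc/ha.d/authkeys, then propagate the file to the other node.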

Start the cluster configuration

hb_gui
  • make sure you have one of the virtual IPs
  • start the cluster
/etc/init.d/heartbeat start
  • look at the status
crm_mon
  • find out what the cluster did recently
cluster_actions
  • monitor the cluster
crm_mon

Pitfalls

  • Forgot to have the LUNs available before installing.
  • Forgot to use SAPINST_USE_HOSTNAME, so ASCS had to be uninstalled.