
RHCS on CentOS 6.5 with corosync+cman+rgmanager, and a cluster filesystem with iscsi+gfs2+clvm


I. Environment Preparation

Server list:

ansible server : 192.168.8.40  
node2.chinasoft.com: 192.168.8.39  
node3.chinasoft.com: 192.168.8.41  
node4.chinasoft.com: 192.168.8.42  
iSCSI target: 192.168.8.43


1. Configure the /etc/hosts file on each node so that hostnames resolve without DNS:
192.168.8.39 node2.chinasoft.com node2 
192.168.8.41 node3.chinasoft.com node3 
192.168.8.42 node4.chinasoft.com node4  

2. Synchronize time, and configure passwordless SSH access from the ansible server to node2, node3 and node4.
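A minimal sketch of the passwordless-SSH setup (assuming root logins and the default key path):
# ssh-keygen -t rsa -P '' -f /root/.ssh/id_rsa
# for node in node2 node3 node4; do ssh-copy-id -i /root/.ssh/id_rsa.pub root@$node.chinasoft.com; done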


Add an rhcs host group on the ansible server:
# vim /etc/ansible/hosts
Append the following:

[rhcs]
node2.chinasoft.com
node3.chinasoft.com
node4.chinasoft.com

Synchronize time:
# ansible rhcs -m shell -a "ntpdate -u 192.168.8.102"
node2.chinasoft.com | success | rc=0 >>
 3 May 09:13:11 ntpdate[1791]: step time server 192.168.8.102 offset 393253.345476 sec


node4.chinasoft.com | success | rc=0 >>
 3 May 09:13:11 ntpdate[1775]: step time server 192.168.8.102 offset 393211.983109 sec


node3.chinasoft.com | success | rc=0 >>
 3 May 09:13:12 ntpdate[1803]: step time server 192.168.8.102 offset 339279.739826 sec


3. Disable the firewall and SELinux on all nodes (to avoid interference).
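One way to do this from the ansible server (a sketch; note that setenforce 0 only switches SELinux to permissive, and the config edit takes effect at the next reboot):
# ansible rhcs -m service -a "name=iptables state=stopped enabled=no"
# ansible rhcs -m shell -a "setenforce 0"
# ansible rhcs -m shell -a "sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config"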

II. Install corosync, cman and rgmanager, and configure the cluster

# ansible rhcs -m yum -a"name=corosync state=present"


# ansible rhcs -m yum -a "name=cman state=present"


# ansible rhcs -m yum -a "name=rgmanager state=present"


Note:
NetworkManager is not supported on cluster nodes. If NetworkManager is already installed on the cluster nodes, you should remove or disable it.
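If it is installed, one way to disable it on all nodes is via ansible (the classic network service then manages the interfaces):
# ansible rhcs -m service -a "name=NetworkManager state=stopped enabled=no"
# ansible rhcs -m service -a "name=network state=started enabled=yes"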


The RHCS configuration file is /etc/cluster/cluster.conf. Every node must carry an identical copy. It does not exist by default, so it has to be created first; the ccs_tool command can do this. A cluster identifies itself by its cluster ID, so a cluster name must be chosen when the configuration file is created; here we call it mycluster. The command only needs to be run on one of the cluster nodes.
Create the cluster mycluster on one of the nodes:
# ccs_tool create mycluster
# cd /etc/cluster
# ls
cluster.conf cman-notify.d
# vim cluster.conf
The newly created configuration file contains no node information yet:
<?xml version="1.0"?>
<cluster config_version="1" name="mycluster">

  <clusternodes>
  </clusternodes>

  <fencedevices>
  </fencedevices>

  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>

Use ccs_tool -h to view the command help:
# ccs_tool addnode
Usage: ccs_tool addnode [options] <nodename> [<fencearg>=<value>]...
 -n --nodeid        Nodeid (required)
 -v --votes         Number of votes for this node (default 1)
 -a --altname       Alternative name/interface for multihomed hosts
 -f --fence_type    Name reference of fencing to use
 -c --configfile    Name of configuration file (/etc/cluster/cluster.conf)
 -o --outputfile    Name of output file (defaults to same as --configfile)
 -h --help          Display this help text


Examples:


Add a new node to default configuration file:
  ccs_tool addnode newnode1 -n 1 -f wti7 port=1


Add a new node and dump config file to stdout rather than save it
  ccs_tool addnode -o- newnode2 -n 2 -f apc port=1


Add the cluster nodes node2, node3 and node4,
each with 1 vote and node IDs 1, 2 and 3:
# ccs_tool addnode node2.chinasoft.com -n 1 -v 1
# ccs_tool addnode node3.chinasoft.com -n 2 -v 1
# ccs_tool addnode node4.chinasoft.com -n 3 -v 1


The cluster.conf file now contains the node definitions:
<cluster config_version="4" name="mycluster">

  <clusternodes>
    <clusternode name="node2.chinasoft.com" nodeid="1" votes="1"/>
    <clusternode name="node3.chinasoft.com" nodeid="2" votes="1"/>
    <clusternode name="node4.chinasoft.com" nodeid="3" votes="1"/>
  </clusternodes>

  <fencedevices>
  </fencedevices>

  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>

Copy the configuration file from node2 to node3 and node4:
# scp cluster.conf node3.chinasoft.com:/etc/cluster
# scp cluster.conf node4.chinasoft.com:/etc/cluster


Install the ricci tool on the nodes, start the service and enable it at boot:
# ansible rhcs -m yum -a "name=ricci state=present"
# ansible rhcs -m service -a "name=ricci state=started enabled=yes"


Now start the cman service on node2, node3 and node4:
# service cman start
Starting cluster: 
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman...                                        [  OK  ]
   Waiting for quorum...                                   [  OK  ]
   Starting fenced...                                      [  OK  ]
   Starting dlm_controld...                                [  OK  ]
   Tuning DLM kernel config...                             [  OK  ]
   Starting gfs_controld...                                [  OK  ]
   Unfencing self...                                       [  OK  ]
   Joining fence domain...                                 [  OK  ]


Running ccs_tool lsnode now lists all three nodes:
# ccs_tool lsnode


Cluster name: mycluster, config_version: 4


Nodename                        Votes Nodeid Fencetype
node2.chinasoft.com                1    1    
node3.chinasoft.com                1    2    
node4.chinasoft.com                1    3 
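Membership can also be cross-checked on any node with standard RHCS tools (not part of the original transcript):
# cman_tool nodes
# clustat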


III. Configure the iSCSI target and initiators

Install the SCSI target software on the 192.168.8.43 node:
# yum install -y scsi-target-utils


On the target host, a disk (sdb) serves as backing storage for the clients; create two 20G primary partitions on it:


# fdisk /dev/sdb
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x21c2120e.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.


Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)


WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').


Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-15665, default 1): 
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-15665, default 15665): +20G


Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (2613-15665, default 2613): 
Using default value 2613
Last cylinder, +cylinders or +size{K,M,G} (2613-15665, default 15665): +20G


Command (m for help): w
The partition table has been altered!


Calling ioctl() to re-read partition table.
Syncing disks.



Make the new partitions take effect (the BLKPG "Device or resource busy" errors below just mean the kernel has already registered the partitions):
# partx -a /dev/sdb
BLKPG: Device or resource busy
error adding partition 1
BLKPG: Device or resource busy
error adding partition 2


# cat /proc/partitions 
major minor  #blocks  name


   8       16  125829120 sdb
   8       17   20980858 sdb1
   8       18   20980890 sdb2
   8        0   83886080 sda
   8        1     512000 sda1
   8        2   83373056 sda2
 253        0   20480000 dm-0
 253        1    4096000 dm-1
 253        2   15360000 dm-2
 253        3   20480000 dm-3
 253        4   10240000 dm-4


Configure the target by defining a target IQN and exporting the two partitions:
# vim /etc/tgt/targets.conf

<target iqn.2016-05.com.chinasoft.san:1>
    backing-store /dev/sdb1
    backing-store /dev/sdb2
    initiator-address 192.168.8.0/24
</target>
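The same LUNs can alternatively be created at runtime with tgtadm instead of editing targets.conf; a sketch (settings made this way do not survive a tgtd restart):
# tgtadm -L iscsi -m target -o new -t 1 -T iqn.2016-05.com.chinasoft.san:1
# tgtadm -L iscsi -m logicalunit -o new -t 1 -l 1 -b /dev/sdb1
# tgtadm -L iscsi -m logicalunit -o new -t 1 -l 2 -b /dev/sdb2
# tgtadm -L iscsi -m target -o bind -t 1 -I 192.168.8.0/24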

Start the tgtd service:
# service tgtd start
Starting SCSI target daemon: [ OK ]
Port 3260 is now listening, which shows the service started correctly:
# ss -tnlp | grep tgtd
LISTEN 0 128 :::3260 :::* users:(("tgtd",1700,5),("tgtd",1701,5))
LISTEN 0 128 *:3260 *:* users:(("tgtd",1700,4),("tgtd",1701,4))


View the target configuration:
# tgtadm -L iscsi -o show -m target
Target 1: iqn.2016-05.com.chinasoft.san:1
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET     00010000
            SCSI SN: beaf10
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            Backing store type: null
            Backing store path: None
            Backing store flags: 
        LUN: 1
            Type: disk
            SCSI ID: IET     00010001
            SCSI SN: beaf11
            Size: 21484 MB, Block size: 512
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            Backing store type: rdwr
            Backing store path: /dev/sdb1
            Backing store flags: 
        LUN: 2
            Type: disk
            SCSI ID: IET     00010002
            SCSI SN: beaf12
            Size: 21484 MB, Block size: 512
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            Backing store type: rdwr
            Backing store path: /dev/sdb2
            Backing store flags: 
    Account information:
    ACL information:
        192.168.8.0/24




Install the iSCSI initiator on each node:
# ansible rhcs -m yum -a "name=iscsi-initiator-utils state=present"


Generate a unique initiator name on each node (the redirection must sit outside the quoted echo argument, otherwise the name is only printed instead of written to the file):
# ansible rhcs -m shell -a 'echo "InitiatorName=$(iscsi-iname -p iqn.2016-05.com.chinasoft)" > /etc/iscsi/initiatorname.iscsi'

Verify the generated names:
# ansible rhcs -m shell -a "cat /etc/iscsi/initiatorname.iscsi"
node2.chinasoft.com | success | rc=0 >>
InitiatorName=iqn.2016-05.com.chinasoft:705b9bd97fc7

node3.chinasoft.com | success | rc=0 >>
InitiatorName=iqn.2016-05.com.chinasoft:27b437d5e50

node4.chinasoft.com | success | rc=0 >>
InitiatorName=iqn.2016-05.com.chinasoft:b68414f44a7f


Start the iscsi and iscsid services on each node and enable them at boot:
# ansible rhcs -m service -a "name=iscsi state=started enabled=yes"


# ansible rhcs -m service -a "name=iscsid state=started enabled=yes"


Have each initiator discover the target and log in:
# ansible rhcs -m shell -a "iscsiadm -m discovery -t sendtargets -p 192.168.8.43"
# ansible rhcs -m shell -a "iscsiadm -m node -T iqn.2016-05.com.chinasoft.san:1 -p 192.168.8.43 -l"

Verify that the devices are visible:
# ansible rhcs -m shell -a "fdisk -l /dev/sd[a-z]"


IV. Configure and use the GFS2 filesystem

Install gfs2-utils on the cluster nodes:
# ansible rhcs -m yum -a "name=gfs2-utils state=present"


mkfs.gfs2 is the GFS2 filesystem creation tool. Its most commonly used options are:

-b BlockSize: filesystem block size; minimum 512, default 4096.
-J MegaBytes: size of each GFS2 journal; default 128MB, minimum 8MB.
-j Number: number of journals to create; in general, one journal is needed for each node that will mount the filesystem.
-p LockProtoName: locking protocol to use, normally lock_dlm or lock_nolock.
-t LockTableName: lock table name, in the form clustername:fsname. A cluster filesystem needs a lock table name so that cluster nodes know which cluster filesystem a file lock relates to. The clustername part must match the cluster name in the cluster configuration file, so only nodes of that cluster can access the filesystem; in addition, each filesystem name must be unique within one cluster.
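For instance, the invocation used later in this section combines these options: three journals (one per mounting node), DLM locking, and the lock table mycluster:webstore for the cluster mycluster:
# mkfs.gfs2 -j 3 -t mycluster:webstore -p lock_dlm /dev/sdb1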


On one of the nodes (node2), create a 10G primary partition on the shared disk:


# fdisk /dev/sdb
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x0a8dee2d.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.


Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)


WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').


Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-20489, default 1): 
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-20489, default 20489): +10G


Command (m for help): w
The partition table has been altered!


Calling ioctl() to re-read partition table.
Syncing disks.


Check that the partition has taken effect:
# cat /proc/partitions 
major minor  #blocks  name


   8        0  125829120 sda
   8        1     512000 sda1
   8        2  125316096 sda2
 253        0   30720000 dm-0
 253        1    4096000 dm-1
 253        2   25600000 dm-2
 253        3   30720000 dm-3
 253        4   10240000 dm-4
   8       16   20980858 sdb
   8       17   10486768 sdb1
   8       32   20980890 sdc


Before creating the filesystem, confirm the cluster is quorate:
# cman_tool status
Version: 6.2.0
Config Version: 4
Cluster Name: mycluster
Cluster Id: 65461
Cluster Member: Yes
Cluster Generation: 16
Membership state: Cluster-Member
Nodes: 3
Expected votes: 3
Total votes: 3
Node votes: 1
Quorum: 2  
Active subsystems: 7
Flags: 
Ports Bound: 0  
Node name: node2.chinasoft.com
Node ID: 1
Multicast addresses: 239.192.255.181 
Node addresses: 192.168.8.39


Format sdb1 as a GFS2 filesystem with the lock_dlm (distributed lock) protocol:
# mkfs.gfs2 -j 3 -t mycluster:webstore -p lock_dlm /dev/sdb1
This will destroy any data on /dev/sdb1.
It appears to contain: data


Are you sure you want to proceed? [y/n] y


Device: /dev/sdb1
Blocksize: 4096
Device Size 10.00 GB (2621692 blocks)
Filesystem Size: 10.00 GB (2621689 blocks)
Journals: 3
Resource Groups: 41
Locking Protocol: "lock_dlm"
Lock Table: "mycluster:webstore"
UUID: a5a68ae5-4a70-2f52-bd0b-cea491a46475


Mount /dev/sdb1 on /mnt:
# mount /dev/sdb1 /mnt
# cd /mnt
# ls
# mount
/dev/mapper/vg_node2-root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext4 (rw)
/dev/mapper/vg_node2-data on /data type ext4 (rw)
/dev/mapper/vg_node2-usr on /usr type ext4 (rw)
/dev/mapper/vg_node2-web on /web type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
none on /sys/kernel/config type configfs (rw)
/dev/sdb1 on /mnt type gfs2 (rw,relatime,hostdata=jid=0)


Copy fstab into the mounted filesystem (we are still in /mnt):
# cp /etc/fstab ./
and append some test lines, e.g. (these reappear in the tail output below):
# echo "new line" >> ./fstab
# echo "node2.chinasoft.com" >> ./fstab

On node3 and node4, mount it on /mnt as well:
# partx -a /dev/sdb
# mount -t gfs2 /dev/sdb1 /mnt
Each node now sees the copied fstab:
# cat /mnt/fstab

Watch the file from one of the nodes:
# tail -f /mnt/fstab 
/dev/mapper/vg_node2-data /data                   ext4    defaults        1 2
/dev/mapper/vg_node2-usr /usr                    ext4    defaults        1 2
/dev/mapper/vg_node2-web /web                    ext4    defaults        1 2
/dev/mapper/vg_node2-swap swap                    swap    defaults        0 0
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
new line
node2.chinasoft.com
hello world
hello world2



The hello world lines above were written from another node; changes propagate to all nodes:
# echo "hello world" >> /mnt/fstab
# echo "hello world2" >> /mnt/fstab


Add one more journal:
# gfs2_jadd -j 1 /dev/sdb1
Filesystem: /mnt
Old Journals 3
New Journals 4


Freeze the filesystem:
# gfs2_tool freeze /mnt
Writes now block:
# echo "hllo world 3" >> /mnt/fstab
After unfreezing, the pending write completes:
# gfs2_tool unfreeze /mnt
Show the tunable parameters:
# gfs2_tool gettune /mnt
incore_log_blocks = 8192
log_flush_secs = 60
quota_warn_period = 10
quota_quantum = 60
max_readahead = 262144
complain_secs = 10
statfs_slow = 0
quota_simul_sync = 64
statfs_quantum = 30
quota_scale = 1.0000 (1, 1)
new_files_jdata = 0


Raise the journal flush interval to 120 seconds:
# gfs2_tool settune /mnt log_flush_secs 120
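The change can be confirmed with gettune:
# gfs2_tool gettune /mnt | grep log_flush_secs
log_flush_secs = 120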


Show the journal information:
# gfs2_tool journals /mnt
journal2 - 128MB
journal3 - 128MB
journal1 - 128MB
journal0 - 128MB
4 journal(s) found.


On each node, add the filesystem to the mount configuration:
# vim /etc/fstab

/dev/sdb1 /mnt gfs2 defaults 0 0
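Since the device is iSCSI-backed, the _netdev mount option is worth considering so the mount is deferred until networking is up (an alternative to the plain defaults entry above):
/dev/sdb1 /mnt gfs2 defaults,_netdev 0 0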

Create another 10G partition and change its type to Linux LVM (8e):
# fdisk /dev/sdb


WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').


Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (10242-20489, default 10242): 
Using default value 10242
Last cylinder, +cylinders or +size{K,M,G} (10242-20489, default 20489): +10G


Command (m for help): t
Partition number (1-4): 2
Hex code (type L to list codes): 8e
Changed system type of partition 2 to 8e (Linux LVM)


Command (m for help): w
The partition table has been altered!


Calling ioctl() to re-read partition table.


WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.


# partx -a /dev/sdb
BLKPG: Device or resource busy
error adding partition 1
# partx -a /dev/sdb
BLKPG: Device or resource busy
error adding partition 1
BLKPG: Device or resource busy
error adding partition 2

Run partx -a /dev/sdb twice so that every node recognizes sdb2:
# ansible rhcs -m shell -a "partx -a /dev/sdb"

V. Configure and use cLVM (clustered LVM)

The core components of RHCS are cman and rgmanager: cman is the openais-based cluster infrastructure layer, and rgmanager is the resource manager. Cluster resources are configured through the main configuration file /etc/cluster/cluster.conf, which many users find challenging to edit by hand, so RHEL provides the system-config-cluster GUI tool; it only needs to be installed on one cluster node, whereas cman and rgmanager must be installed on every node. For clustered logical volumes, the lvm2-cluster package is likewise needed on every node, which can be done from the ansible jump host:
# ansible rhcs -m yum -a "name=lvm2-cluster state=present"


Edit /etc/lvm/lvm.conf on each node and set locking_type to 3,
since type 3 uses built-in clustered locking:
# ansible rhcs -m shell -a 'sed -i "s@^\([[:space:]]*locking_type\).*@\1 = 3@g" /etc/lvm/lvm.conf'

or, equivalently:

# ansible rhcs -m shell -a "lvmconf --enable-cluster"
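Either way, the result can be verified across the nodes, for example:
# ansible rhcs -m shell -a "grep '^[[:space:]]*locking_type' /etc/lvm/lvm.conf"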

Start the clvmd service and enable it at boot:
# ansible rhcs -m service -a "name=clvmd state=started enabled=yes"


Configure cLVM on node2.
Create the physical volume, volume group and logical volume:
# cd
# pvcreate /dev/sdb2


# pvs
  PV         VG       Fmt  Attr PSize   PFree
  /dev/sda2  vg_node2 lvm2 a--  119.51g 22.83g
  /dev/sdb2           lvm2 a--   10.00g 10.00g


# vgcreate clustervg /dev/sdb2
# lvcreate -L 5G -n clusterlv clustervg
-j 2 specifies two journals, i.e. room for two nodes to mount:
# mkfs.gfs2 -p lock_dlm -j 2 -t mycluster:clvm /dev/clustervg/clusterlv

Mount it on each node:
# mount /dev/clustervg/clusterlv /media/ -t gfs2


Mounting on the third node fails; the -j 2 used when formatting is the reason:
# mount /dev/clustervg/clusterlv /media
Too many nodes mounting filesystem, no free journals


Add a journal from a node where the filesystem is already mounted; the mount then succeeds:
# gfs2_jadd -j 1 /dev/clustervg/clusterlv
Filesystem: /media
Old Journals 2
New Journals 3


To grow the filesystem, extend clusterlv by 2G and let GFS2 expand into the new space:
# lvextend -L +2G /dev/clustervg/clusterlv
# gfs2_grow /dev/clustervg/clusterlv
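The new size can then be verified from any node, for example:
# df -h /media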

With this, the cluster filesystem on an iSCSI device based on corosync+cman is complete.
