Why Active/Active ? - Business RequirementsSENARIO OF TWO NODE QREPObjects needed in WebSphere MQOBJECTS IN QPREPARE ACTIONQ CONTROL TABLES -PARMTABLESCreate Q MapsCreate Q subscriptionsQ HOUSE-KEEPINT WORK ACTIONCHANLE ACTION:MQ QUEUE ACTIONQAPPLYLatency analysisLATENCY OUTLOOKCAPTURE LATENCY ANALYSISQ Apply latencyQREP RECOVERY
现在IDC大行其道,同城/异地双活甚至多活的概念已经日益普及。12年的时候,写过一篇mysql中关于MASTER/SLAVE的实现mysql replication---master/salve IN ONE PC 。针对不同的DBMS,双活具体的实现不同,但都基于相同的理论或是基于SQL REPLICATION,或是基于LOG REPLICATION。这里介绍一下IBM中关于A/A的实现,关于Q的架构以及实现后面专门抽出一章来进行介绍,这里是实践贴,使你对Q有一个感性认识。关于本文最后的Q latency部分,也是每一个系统运维人员都必须要掌握的,因为它对于RPO/RTO的重要性不言而喻。
Why Active/Active ? - Business Requirements

Replication Technologies for Business Continuity
Requirements and trade-offs to consider in selecting technologies:
What needs to be recovered?: Application Data vs. DBMS vs. Entire Systems How long does it take? Recovery Time Objective (RTO): One hours or more vs. few seconds How much data could you lose? Recovery Point Objective (RPO): No data loss vs. seconds of data Distance required between sites?: 10s of kilometers vs. 100s of kilometers Hardware Utilization: Standby vs. Active Impact on applications: Direct overhead (synchronous technologies) vs. no impact (async technologies) CPU Overhead: Negligible (hardware e.g., PPRC) vs. Proportional to the workload (transaction replay technology)
SENARIO OF TWO NODE QREP
Objects needed in WebSphere MQ

OBJECTS IN Q

OBJECT DEFINE DESCRIPTION
| OBJECT NAMES | NODE1 | NODE2 |
|---|---|---|
| HOSTNAME | SITEA | SITEB |
| IP | 10.1.1.1 | 10.2.1.1 |
| DB2 VERSION | 10 | 11 |
| SUBSYSTEM | DB1 | DB2 |
| LOCATION | DB0A | DB0B |
| DATABASE PORT | 8000 | 9000 |
| DATABASE | DB1 | DB2 |
| Q MGR | QMGR1 | QMGR2 |
| MQ PROT | 8001 | 9001 |
| RESTARTQ | MQ1.RESTARTQ | MQ2.RESTARTQ |
| ADMINQ | MQ1.ADMINQ | MQ2.ADMINQ |
| SENDQ | MQ1.SENDQ.Q1 | MQ2.SENDQ.Q1 |
| SENDQ | MQ1.SENDQ.Q2 | MQ2.SENDQ.Q2 |
| SENDQ | MQ1.SENDQ.Q3 | MQ2.SENDQ.Q3 |
| REP SCHEMA | ASN1 | ASN2 |
PREPARE ACTION
-
BIND REPLICATION PROGRAM
-
APF AUTHORIZE Q-RELATED PROGRAM
-
ENABLE Q TABEL DATA CAPTURES
ALTER TABLE QTABLE DATA CAPTURE CHANGES;
-
CONFIG CDB TABLE
远程DB2连接使用DRDA协议,在使用之前需要配置对应的CDB信息。CDB 主要是Qapply用来连接Qcapture的属性配置,第一条SQLP配置了本端的连接信息,后面3条SQL配置了,对端的连接信息。
NODE1:
INSERT INTO SYSIBM.LOCATIONS(LOCATION, LINKNAME, PORT) VALUES ('DB0A', 'DB0A', ''); INSERT INTO SYSIBM.IPNAMES(LINKNAME, SECURITY_OUT, USERNAMES, IPADDR) VALUES ('DB0B', 'P', 'O', 'demo.TEST.com'); INSERT INTO SYSIBM.USERNAMES(TYPE, LINKNAME, NEWAUTHID, PASSWORD) VALUES ('O', 'DB0B', 'TESTUSER', 'NOTTELLYOU'); INSERT INTO SYSIBM.LOCATIONS(LOCATION, LINKNAME, PORT) VALUES ('DB0B', 'DB0B', '9000');NODE2:
INSERT INTO SYSIBM.LOCATIONS(LOCATION, LINKNAME, PORT) VALUES ('DB0B', 'DB0B', ''); INSERT INTO SYSIBM.IPNAMES(LINKNAME, SECURITY_OUT, USERNAMES, IPADDR) VALUES ('DB0A', 'P', 'O', 'DEMO.TEST.com'); INSERT INTO SYSIBM.USERNAMES(TYPE, LINKNAME, NEWAUTHID, PASSWORD) VALUES ('O', 'DB0A', 'BRIDDEL', 'XXXXXXXX'); INSERT INTO SYSIBM.LOCATIONS(LOCATION, LINKNAME, PORT) VALUES ('DB0A', 'DB0A', '8000');MQ OBJECT DEFINE
定义MQ传输使用的各种对象。
DEFPSIST
YES means that unless instructed otherwise, the Queue Manager logs transactions to these queues and can recover those messages in the event of failure or restart.
Q CONTROL TABLES -PARMTABLES
All control tables that are located on a node have to have the same schema.both the Q Capture and Q Apply on NODE 1 will use the replication schema ASN1 AND CONTROL TABLE ON NODE USE ASN2.
这里配置Q脚本是通过 ASNCLP配置的。它既支持交互方式,也支持批量方式。
举例说明
SET QMANAGER "MQ1A" FOR NODE 1; SET QMANAGER "MQ1B" FOR NODE 2; CREATE CONTROL TABLES FOR NODE 1 USING CAPPARMS RESTARTQ "MQ1.RESTARTQ" ADMINQ "MQ1.ADMINQ" MONITOR INTERVAL 10000 APPPARMS IN ZOS PAGE LOCK DB DEMODB QCNTLAP CREATE ROW LOCK DB DEMODB2 QCNTLAR CREATE MONITOR INTERVAL 10000; CREATE CONTROL TABLES FOR NODE 2 USING CAPPARMS RESTARTQ "MQ2.RESTARTQ" ADMINQ "MQ2.ADMINQ" MONITOR INTERVAL 10000 APPPARMS IN ZOS PAGE LOCK DB RBBDEMO QCNTLAP CREATE ROW LOCK DB RBBDEMO QCNTLAR CREATE MONITOR INTERVAL 10000;
Create Q Maps
Two Q Maps were used for the first part of this exercise. Even though a single Q
Map can efficiently contain hundreds of Q Subscriptions, you might want to
separate your subscriptions into multiple Q Maps for administrative purposes.
这里即确定了表的订阅关系。
CREATE REPLQMAP BIDIND1_TO_BIDIND2_MAP1 ( NODE 1, NODE 2 ) USING ADMINQ "BIDIND1.ADMINQ" RECVQ "BIDIND1.TO.BIDIND2.DATAQ1" SENDQ "BIDIND1.TO.BIDIND2.DATAQ1" NUM APPLY AGENTS 2; CREATE REPLQMAP BIDIND1_TO_BIDIND2_MAP2 ( NODE 1, NODE 2 ) USING ADMINQ "BIDIND1.ADMINQ" RECVQ "BIDIND1.TO.BIDIND2.DATAQ2" SENDQ "BIDIND1.TO.BIDIND2.DATAQ2" NUM APPLY AGENTS 2;
Create Q subscriptions
定义一张表属于哪一个Qmap,以及是否进行Q复制。
# # Use MAP1 for TABLE1 and TABLE2 # SET CONNECTION SOURCE DB1A.ASNB1 TARGET DB1C.ASNB2 REPLQMAP BIDIND1_TO_BIDIND2_MAP1 ; SET TABLES (DB1.ASN1.TABLE1); CREATE QSUB SUBTYPE U FROM NODE DB1A.ASNB1 SOURCE HAS LOAD PHASE N TARGET CONFLICT ACTION I FROM NODE DB1C.ASNB2 SOURCE HAS LOAD PHASE N TARGET CONFLICT ACTION F;
Q HOUSE-KEEPINT WORK ACTION
CHANLE ACTION:
START CHANNEL(MQ1A.TO.MQ1B)
DISPLAY CHSTATUS(MQ1BA.TO.MQ1B)
START CHANNEL(MQ1B.TO.MQ1A)
DISPLAY CHSTATUS(MQ1B.TO.MQ1A)
MQ QUEUE ACTION
+MQ2 CLEAR QLOCA(MQ2.SENDQ.Q1) DISPLAY QLOCAL(MQ2.SENDQ.Q1) CURDEPTH
QAPPLY
Latency analysis
关于Q latency它与我们搭建Q的目标息息相关,特别是对IDC RPT/ROT 两个指标的影响巨大。因此分析Q lantency变成了DBA一项必须掌握的技能。要想进行latency分析,你必须明确latency是如何定义的,现有的工具以及如何进行调整。
LATENCY OUTLOOK
Control tables for replication monitoring
? IBMQREP_CAPMON
? IBMQREP_CAPQMON
? IBMQREP_APPLYMON
上图中每一个数字对应的动作如下:
t1 = transaction committed at the source site
t2 = the Q Capture program read the transaction from the log
t3 = the Q Capture program committed the message to the send queue
t4 = the Q Apply program read the message from the receive queue
t5 = the Q Apply successfully committed the transaction at the target site
对应的Q END-TO-END latency=T5-T1
即一个交易在目标端与源端commit之间的时间间隔。影响Q latency的因素一个有10个,具体的分布如下:
需要说明的是背景为蓝色的影响因素对Q 的配置以及SQL语句关联较大。而白色背景部分的影响相对可控,前提是MQ,DBMS WORK WELL.
下面给出上述10个影响因素的定义,对应的monitor信息,以及如何调优。
| seq | Factor impacting latency | Relevant statistics |
|---|---|---|
| 1 | Q Capture log latency | IBMQREP_CAPMON:CURRENT_LOG_TIME |
| 2 | Q Capture time reading DB2 log records (DB2 IFI calls) | IBMQREP_CAPMON:LOGREAD_API_TIME |
| 3 | Q Capture time waiting for memory | IBMQREP_CAPMON:LOGRDR_SLEEPTIME,NUM_END_OF_LOGS |
| 4 | Q Capture time to put WebSphere MQ messages in a send queue and Q Capture time to commit all WebSphere MQ transactions | IBMQREP_CAPQMON: MQPUT_TIME, QFULL_ERROR_COUNT IBMQREP_CAPMON:MQCMIT_TIME |
| 5 | WebSphere MQ time sending and staging WebSphere MQ messages | not saved in monitor tables |
| 6 | Q Apply time to retrieve WebSphere MQ messages from a receive queue | IBMQREP_APPLYMON:MQGET_TIME |
| 7 | Q Apply time waiting due to transaction dependencies | IBMQREP_APPLYMON:DEPENDENCY_DELAY |
| 8 | Q Apply waiting for agents to pick up DBMS transactions ready to be applied | IBMQREP_APPLYMON:WORKQ_WAIT_TIME |
| 9 | Q Apply time retrying SQL due to RI, unique violation,or deadlock | IBMQREP_APPLYMON:RETRY_TIME |
| 10 | Q Apply time in DB2 processing successful transactions | IBMQREP_APPLYMON:DBMS_TIME |
CAPTURE LATENCY ANALYSIS
LOGREAD_API_TIME:IBMQREP_CAPMON The time spent in DB2 returning log records to Q Capture NUM_END_OF_LOGS:IBMQREP_CAPMON table, lists the number of times that the Q Capture program reached the end of the DB2 log LOGRDR_SLEEPTIME:IBMQREP_CAPMON LOGREADER SLEEP TIME FOR NUM_END_OF+_LOGS OR it reached its memory limit threshold MQPUT_TIME:IBMQREP_CAPMONThe time delay spent in WebSphere MQ to put new messages in the send queue MQCMIT_TIME: IBMQREP_CAPMON The time delay spent in WebSphere MQ to commit the MQ transactions XMITQDEPTH : IBMQREP_CAPQMON The transmit queue depth valu If the transmit queue depth keeps growing and there is a lot of I/O to the physical pageset of the transmit queue, this means that the capacity of the MQ channel has been exceeded
TRAN_BATCH_SZIE
平均每一个交易的大小 = MQ_BYTES/TRANS_PUBLISHED
平均每一个消息所包含的交易数目=MQ_MESSAGES /TRANS_PUBLISHED,该值应该近似等于TRAN_BATCH_SIZE
Q Apply latency
WORKQ_WAIT_TIME:IBMQREP_APPLYMON table tracks how long it takes transactions that are ready to be applied at the target site to be picked up by a Q Apply agent RETRY_TIME :IBMQREP_APPLYMON Q Apply is retrying some of the SQL statements included in the replicated transactions,SUCH RI,UI,TIMEOUT,DEADLOCK The DBMS_TIME :IBMQREP_APPLYMON tracks the average time spent in DB2 applying the transactions for a given Q Apply monitor interval. The transactions include user tables (replicated tables) and Q Replication tables (control table data that tracks the Q Apply progress)
QREP RECOVERY
STARTLRSN: lsn value from where it is safe to start reading the DB2 log
commit lsn: lsn value up to which it is safe to skip committed transactions as Q Capture is reading the log records, because the transactions were previously published.

本文所有的内容均整理自互联网,仅供参考学习,如有涉及版权问题,请自行删除本文,谢谢。