2009/05/01

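About dual-chassis, dual-supervisor redundancy configuration

Cisco dual-chassis and dual-supervisor redundancy explained:

Dual-chassis redundancy relies on Cisco's HSRP or the standard VRRP, and both are simple to configure.
Dual-supervisor redundancy uses RPR and SSO:

4500 series: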
Catalyst 4500 series switches allow a redundant supervisor engine to take over if the active supervisor engine fails. In software, supervisor engine redundancy is enabled by running the redundant supervisor engine in route processor redundancy (RPR) or stateful switchover (SSO) operating mode.

With supervisor engine redundancy enabled, if the active supervisor engine fails or if a manual
switchover is performed, the redundant supervisor engine becomes the active supervisor engine. The redundant supervisor engine has been automatically initialized with the startup configuration of the active supervisor engine, shortening the switchover time (30 seconds or longer in RPR mode, depending on the configuration; subseconds in SSO mode).

When power is first applied to a switch, the supervisor engine that boots first becomes the active
supervisor engine and remains active until a switchover occurs.
A switchover will occur when one or more of the following events take place:
1,The active supervisor engine fails (due to either hardware or software function) or is removed.
2,A user forces a switchover.
3,A user reloads the active supervisor engine.

Supervisor Engine Redundancy Guidelines and Restrictions:
1,The Catalyst 4507R switch and the 4510R switch are the only Catalyst 4500 series switches that support supervisor engine redundancy.
2,The Catalyst 4510R switch supports the WS-X4516 supervisor engine only. The Catalyst 4507R supports supervisor engines WS-X4013+, WS-X4515, and WS-X4516.
3,Redundancy requires both supervisor engines in the chassis to be of the same supervisor engine model and to use the same Cisco IOS software image.
4,Routed ports are not supported when SSO redundancy mode is configured.
5,When you use the WS-X4013+ and WS-X4515 supervisor engines in RPR or SSO mode, only the Gig1/1 and Gig2/1 Gigabit Ethernet interfaces are available; the Gig1/2 and Gig2/2 uplink ports are unavailable.
6,When the WS-X4516 active and redundant supervisor engines are installed in the same chassis, the four uplink ports (Gig1/1, Gig2/1, Gig 1/2, and Gig2/2) are available.
7,The active and redundant supervisor engines in the chassis must be in slots 1 and 2.
8,Supervisor engine redundancy does not provide supervisor engine load balancing.
9,The Cisco Express Forwarding (CEF) table is cleared on a switchover. As a result, routed traffic is interrupted until route tables reconverge. This reconvergence time is minimal because the SSO feature reduces the supervisor engine redundancy switchover time from 30+ seconds to subseconds, so Layer 3 also has a faster failover time if the switch is configured for SSO.
10,Static IP routes are maintained across a switchover because they are configured from entries in the configuration file.
11,Information about Layer 3 dynamic states that is maintained on the active supervisor engine is not synchronized to the redundant supervisor engine and is lost on switchover.
12,If you are running (or upgrading to) Release 12.2(20)EWA or Release 12.2(25)EW and are using a single supervisor engine in a redundant chassis (Catalyst 4507R or Catalyst 4510R series switch), and you intend to use routed ports, do one of the following:
-Use SVIs instead of routed ports.
-Change the redundancy mode from SSO to RPR.

Configuring Supervisor Engine Redundancy:

Switch(config)# redundancy
Switch(config-red)# mode {sso | rpr}
Switch# show running-config
Switch# show redundancy [clients | counters | history | states]

Switch(config-red)# mode {sso | rpr}--->Configures SSO or RPR. When this command is entered, the redundant supervisor engine is reloaded and begins to work in SSO or RPR mode.


Switch> enable
Switch# configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
Switch(config)# redundancy
Switch(config-red)# mode sso
Switch(config-red)# end
Switch# show redundancy

This example shows how to display redundancy facility state information:
Switch# show redundancy states

Synchronizing the Supervisor Engine Configurations:

Switch(config)# redundancy
Switch(config-red)# main-cpu ---Enters main-cpu configuration submode.
Switch(config-r-mc)# auto-sync {startup-config | config-register | bootvar | standard}
Switch(config-r-mc)# end
Switch# copy running-config startup-config

In SSO mode, the running-config is always synchronized.

This example shows how to reenable the default automatic synchronization feature using the auto-sync standard command to synchronize the startup-config and config-register configuration of the active supervisor engine with the redundant supervisor engine. Updates for the boot variables are automatic and cannot be disabled.
Switch(config)# redundancy
Switch(config-red)# main-cpu
Switch(config-r-mc)# [no] auto-sync standard
Switch(config-r-mc)# end
Switch# copy running-config startup-config

Performing a Manual Switchover (manually switching supervisor engines):
Switch# show redundancy
Switch# redundancy force-switchover

Be aware of these usage guidelines:
1,To force a switchover, the redundant supervisor engine must be in a standby hot state. You can verify the state with the show redundancy command. If the state is not standby hot, the
redundancy force-switchover command will not execute.
2,Use the redundancy force-switchover command, rather than the reload command, to initiate a switchover. The redundancy force-switchover command will first check that the redundant
supervisor engine is in the correct state. If you issue the reload command and the status is not
standby hot, the reload command will reset the current supervisor engine only.
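The first guideline can be turned into a small pre-check in a script. The sketch below is only an illustration: the function name standby_is_hot and the sample text are hypothetical, and the exact layout of the "show redundancy states" output is an assumption that may differ between platforms and IOS releases.

```python
import re

# Assumed shape of the relevant lines of "show redundancy states".
# Real output has more lines and may be laid out differently.
SAMPLE_STATES = """\
       my state = 13 -ACTIVE
     peer state = 8  -STANDBY HOT
"""


def standby_is_hot(states_output: str) -> bool:
    """Return True only if the peer state line reports STANDBY HOT."""
    for line in states_output.splitlines():
        # Look for the line that describes the peer (redundant) supervisor.
        if re.match(r"\s*peer state\s*=", line, re.IGNORECASE):
            return "STANDBY HOT" in line.upper()
    # No peer state line found: treat the standby as not ready.
    return False
```

A script would call standby_is_hot() on the captured output and issue redundancy force-switchover only when it returns True.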

6500 series: Supervisor Engine 720

The redundant supervisor engines must be the same type with the same model PFC and MSFC.

During the startup of the standby MSFC, image version information is exchanged between MSFCs and one of the following occurs:
1,If the image version information matches and both MSFCs are configured as SSO or have the default (SSO) configuration, the system runs in SSO mode.
2,If the image version information does not match or if one of the MSFCs is configured for route processor redundancy (RPR), the system runs in RPR mode.
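The two outcomes above amount to a simple decision rule. Here is a minimal sketch of that rule; the function name and arguments are hypothetical, and it models only the two conditions stated in the text, not the actual negotiation between the MSFCs.

```python
def negotiated_mode(active_image: str, standby_image: str,
                    active_cfg: str = "sso", standby_cfg: str = "sso") -> str:
    """Return the redundancy mode the pair ends up running in.

    SSO requires matching image versions and both MSFCs configured for
    SSO (the default). Anything else falls back to RPR.
    """
    images_match = active_image == standby_image
    both_sso = active_cfg == "sso" and standby_cfg == "sso"
    return "sso" if images_match and both_sso else "rpr"
```

For example, negotiated_mode("12.2(33)SXH2", "12.2(18)SXF1") yields "rpr" even though both sides are configured for SSO.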

In NSF/SSO mode, one MSFC is active and the other MSFC is in a hot-standby mode. The hot-standby MSFC maintains a constant readiness state by receiving state information from the active MSFC. At any given moment, the standby MSFC may be called on by the supervisor engine to take over the responsibilities held by the active MSFC. The supervisor engine monitors the active MSFC and if the MSFC does not respond, the supervisor engine declares the MSFC as lost or down and proceeds to reset the MSFC. The standby MSFC has the up-to-date state information necessary to resume processing (the standby MSFC is fully initialized, but the VLANs are kept in an administrative down state until a switchover occurs).

With NSF, the switching modules and switch fabric continue to forward packets while the MSFC switchover is in progress.


Note :
High availability on the supervisor engine operates independently of the MSFC high-availability feature. However, high availability on the supervisor engine must be enabled to ensure the correct operation of the MSFC SSO feature.
If you run the MSFC in SSO mode and fail to run the high-availability feature on the supervisor engine, any switchover that may occur will result in a nonstateful switchover and the standby MSFC will reset itself and reload at the time of the switchover. This reset/reload of the standby MSFC occurs because there is insufficient state information on the supervisor engine to support a stateful switchover of the MSFC. This reset/reload of the standby MSFC interrupts service.

RPR is a cold standby mode. When a switchover occurs, the standby MSFC must go completely through its initialization. RPR mode is used primarily for the fast software upgrade (FSU). In RPR mode, the startup configuration is synchronized to the standby MSFC; however, it is not processed in any way until the switchover occurs. The running configuration is not synchronized to the standby MSFC.

When the active MSFC boots completely, no state information is exchanged between the MSFCs. If the active MSFC fails, the standby MSFC processes its startup configuration file and begins its initialization.

If there is an image compatibility problem, the active MSFC boots fully, but the standby MSFC suspends its startup before processing the startup configuration file. If the active MSFC fails, a switchover is triggered and the suspended standby MSFC begins to initialize and become the active MSFC.

Configuration Guidelines and Restrictions :
1, During a switchover, there will be traffic loss for traffic that is routed by the MSFC. NSF only applies to traffic that is hardware switched by modules and the switch fabric. New flows are not allowed until the switchover is complete.
2, In cases where the MSFC has failed and is unable to notify the supervisor engine of the failure, the supervisor engine may take 30 to 40 seconds before it realizes that the MSFC has failed and a switchover is triggered. If the supervisor engine receives the failure notification, the switchover is triggered immediately.
3, The Frame Relay, ATM, and PPP protocols are not supported in SSO mode.
4, Standby supervisor engine/MSFC insertion—With NSF/SSO redundancy, you can hot swap the standby supervisor engine/MSFC for maintenance. When you hot insert the standby MSFC, the active MSFC detects the presence of the standby MSFC and starts to drive the standby MSFC state transition to hot-standby. When you remove the standby MSFC, the synchronization between the active and standby MSFC is stopped, any pending updates to the standby MSFC are discarded, and the system enters simplex mode. The standby MSFC state is displayed by entering the show redundancy states command.

Configuring SSO

SSO is the default mode. By default, even if you do not configure the system explicitly as SSO, the system comes up in SSO mode. However, we recommend that you explicitly configure SSO mode.
Router(config)# redundancy
Router(config-red)# mode sso
Router(config-red)# end
Router# show redundancy states


Configuring CEF NSF
CEF NSF operates by default while the networking device is running in SSO mode. No configuration is necessary.

This example shows how to verify that CEF is NSF-capable:
router# show cef state

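The following notes are excerpted from the Study Guide (《学习指南》):

Because redundant subsystems inside a device are activated only when the primary component fails, they are usually kept in hot-standby mode and contribute no additional performance. On high-end and future Catalyst switches, however, suitable configuration can overcome this drawback.

One compromise approach to building a highly available network is to ensure reliability by providing redundancy in the network topology, not only inside the network devices.

NSF with SSO provides the highest level of availability on the Catalyst 4500 and Catalyst 6500 series switches.

After a supervisor engine switchover, RPR and RPR+ restore traffic forwarding on the switch within about 1 minute. In SSO mode, the redundant supervisor engine boots into a fully initialized state and synchronizes with both the startup configuration and the running configuration of the active supervisor engine. When the hardware or software state of an SSO-aware feature changes, the standby supervisor engine (in SSO mode) stays synchronized with the active supervisor engine. If an SSO-aware feature on the active supervisor engine is interrupted, the switch fails over seamlessly to the redundant supervisor engine.

Because no link-state change occurs in SSO mode, no spanning-tree topology change occurs either.

On Catalyst 6500 series switches, Layer 2 traffic returns to normal within 0 to 3 seconds after a supervisor failure. When SSO is combined with SRM (Single Router Mode), Layer 2 and Layer 3 traffic see almost no interruption after a supervisor engine switchover.

Router redundancy with Single Router Mode on Catalyst 6500 series switches:
Although only one supervisor engine is active at any time while the other is in standby, by default the MSFCs on both supervisor engines are active.
A typical network design includes redundant paths and uses two switches to provide chassis redundancy. That yields four active routers, which makes network faults hard to diagnose. Cisco introduced Single Router Mode (SRM) redundancy as an alternative to the internal redundant (dual) MSFC configuration, in which both MSFCs are active at the same time. SRM reduces the number of active routers from four to two.

Configuring SRM:
1, Enable SRM on the designated router, then enable SRM on the non-designated router, as shown below.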
MSFC_1(config)#redundancy
MSFC_1(config-r)#high-availability
MSFC_1(config-r-ha)#single-router-mode
MSFC_2(config)#redundancy
MSFC_2(config-r)#high-availability
MSFC_2(config-r-ha)#single-router-mode
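2, Run wr on the designated router to save the running configuration to the startup configuration, and make sure the startup configuration of the non-designated router also contains these SRM commands.
3, Reload the non-designated router. When prompted to save the configuration, enter no, as shown below: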
MSFC_2#reload
System configuration has been modified. Save? [yes/no]: no
Proceed with reload?[confirm]
4, After it boots, the non-designated router enters the standby state.
Verify with: show redundancy


A classic case:
Question: how should a 6509 with dual Supervisor Engine 720s be configured?
Module status:
show module all
Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
  1     6 Firewall Module                        WS-SVC-FWM-1       SAD122300FB
  2    48 48-port 10/100/1000 RJ45 EtherModule   WS-X6148A-GE-TX    SAL1226UW54
  3    24 CEF720 24 port 1000mb SFP              WS-X6724-SFP       SAL1226VBVP
  5     2 Supervisor Engine 720 (Active)         WS-SUP720-3B       SAL1226UYHE
  6     2 Supervisor Engine 720 (Cold)           WS-SUP720-3B       SAL1230Y9ZJ

Mod MAC addresses                      Hw     Fw           Sw           Status
--- ---------------------------------- ------ ------------ ------------ -------
  1 0021.5517.aecc to 0021.5517.aed3   4.2    7.2(1)       2.3(4)       Ok
  2 001d.70c5.e380 to 001d.70c5.e3af   1.6    8.4(1)       8.7(0.22)H2A Ok
  3 0021.d874.4eb0 to 0021.d874.4ec7   3.1    12.2(18r)S1  12.2(33)SXH2 Ok
  5 001c.58d0.b098 to 001c.58d0.b09b   5.6    8.5(2)       12.2(33)SXH2 Ok
  6 0016.9de6.ab68 to 0016.9de6.ab6b   5.6    8.5(2)       12.2(18)SXF1 Ok

Mod  Sub-Module                  Model              Serial      Hw      Status
---- --------------------------- ------------------ ----------- ------- -------
   3 Centralized Forwarding Card WS-F6700-CFC       SAL1226V63Q 4.1     Ok
   5 Policy Feature Card 3       WS-F6K-PFC3B       SAL1225UERJ 2.3     Ok
   5 MSFC3 Daughterboard         WS-SUP720          SAL1226UWA7 3.1     Ok
   6 Policy Feature Card 3       WS-F6K-PFC3B       SAL1230YE06 2.3     Ok
   6 MSFC3 Daughterboard         WS-SUP720          SAL1230YD9R 3.1     Ok

Mod  Online Diag Status
---- -------------------
   1 Pass
   2 Pass
   3 Pass
   5 Pass
   6 Pass

Relevant configuration:
#show run | begin redu
redundancy
 keepalive-enable
 mode sso
 main-cpu
  auto-sync running-config

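The problem: when I reset the active supervisor engine as a test, the second supervisor engine seemed to take a long time to take over. Normally a switchover in SSO mode takes 1 to 2 seconds.

Question: what needs to be configured to get a fast redundant switchover between the supervisor engines? What is causing the long switchover time described above?

Solution: the two supervisor engines are loaded with different IOS versions (slot 5 runs 12.2(33)SXH2, slot 6 runs 12.2(18)SXF1). Although SSO is configured, the switch detects that the IOS versions on the two supervisor engines do not match and automatically falls back to RPR mode. The Cold state normally appears only in RPR mode.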
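This kind of mismatch is easy to miss by eye, so a quick check over the show module output can help. What follows is a rough Python sketch, not a tested tool: the function names and regular expressions are my own, and they assume the column order seen in the output above (slot, ports, card type in the first table; slot, MAC range, Hw, Fw, Sw, Status in the second).

```python
import re

# A few rows copied from the "show module" output in this post.
SAMPLE = """\
5 2 Supervisor Engine 720 (Active) WS-SUP720-3B SAL1226UYHE
6 2 Supervisor Engine 720 (Cold) WS-SUP720-3B SAL1230Y9ZJ
3 0021.d874.4eb0 to 0021.d874.4ec7 3.1 12.2(18r)S1 12.2(33)SXH2 Ok
5 001c.58d0.b098 to 001c.58d0.b09b 5.6 8.5(2) 12.2(33)SXH2 Ok
6 0016.9de6.ab68 to 0016.9de6.ab6b 5.6 8.5(2) 12.2(18)SXF1 Ok
"""

# Card table row for a supervisor: slot, port count, "Supervisor Engine ...".
SUP_ROW = re.compile(r"^\s*(\d+)\s+\d+\s+Supervisor\s+Engine\s+")
# MAC table row: slot, MAC range, Hw, Fw, Sw, Status. Captures slot and Sw.
MAC_ROW = re.compile(
    r"^\s*(\d+)\s+[0-9a-f.]+\s+to\s+[0-9a-f.]+\s+\S+\s+\S+\s+(\S+)\s+\S+\s*$",
    re.IGNORECASE,
)


def supervisor_sw_versions(show_module_output: str) -> dict:
    """Map each supervisor slot number to its software version string."""
    sup_slots = set()
    sw_by_slot = {}
    for line in show_module_output.splitlines():
        m = SUP_ROW.match(line)
        if m:
            sup_slots.add(int(m.group(1)))
            continue
        m = MAC_ROW.match(line)
        if m:
            sw_by_slot[int(m.group(1))] = m.group(2)
    # Keep only the slots that hold supervisor engines.
    return {s: sw_by_slot[s] for s in sorted(sup_slots) if s in sw_by_slot}


def supervisor_images_match(show_module_output: str) -> bool:
    """True if all supervisor engines report the same software version."""
    return len(set(supervisor_sw_versions(show_module_output).values())) <= 1
```

If supervisor_images_match() returns False, bring both supervisor engines to the same IOS image before expecting SSO switchover times.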
