
Operations Center Disaster Recovery Procedure

This topic describes how to perform a disaster recovery from a Primary Operations Center to a Standby Operations Center. When you perform a disaster recovery, you first restore the database archive on the Standby Operations Center and then migrate all collectors from the Primary Operations Center to the Standby Operations Center.

To fully configure the Standby Operations Center, you will need a second product license for the disaster recovery system with the same licensing entitlements as the Primary Operations Center license. Contact your Infoblox sales representative for more information.

Note

Before performing this procedure, ensure that the Standby Operations Center appliance and any standby collectors run the same NetMRI software release as the systems in production.

To perform a disaster recovery:

  1. Log in to the Standby Operations Center command line via SSH using the admin/admin system credentials.
  2. Execute the following administrative shell CLI commands on a newly installed or reset Standby Operations Center instance:
    1. Define the management port IP configuration for the Standby Operations Center:
      admin-na206.corp100.com> configure server
    2. Install the license for the Standby Operations Center:
      • For a physical appliance, generate a license by running the license generate command. For more information, see license generate command.
      • For a virtual appliance, install the license file by running the license command: admin-na206.corp100.com> license <license filename>.gpg
    3. Define server settings for the Standby Operations Center:
      admin-na206.corp100.com> configure server
      Make a note of your settings. You will need them again in Step 6 of this procedure.
      Note: The configure server command also generates a new self-signed certificate for the Standby Operations Center. In cases where a CA-signed certificate is used in the original Operations Center, the HTTPS certificates need to be configured using the procedures described in the topic NetMRI Security Settings in the NetMRI Administrator Guide and in the NetMRI online help.
  3. Verify your settings by entering the following commands:
    1. List the complete config settings for the Standby Operations Center:
      admin-na206.corp100.com> show settings
    2. Show the installed license for the Standby Operations Center.
      admin-na206.corp100.com> show license
  4. Via SCP, manually transfer the Primary Operations Center database archive to the Standby Operations Center.
    Note: You can also configure the database backup for the Primary as an automated transfer, using the Settings icon > Database Settings > Scheduled Archive screen on the Primary Operations Center to archive the OC database to the system designated as the Standby. The backup directory, in this case, should be set as "Backup". For more information, see Database Archiving Functions in the Admin Guide and in the online Help.
    When using the automated database backup, you must first log in to the Standby Operations Center through your web browser and set the admin password to a value different from the "admin" factory default.
    If you use the automated backup, then after the Standby Operations Center is activated as the Primary, click the Settings icon > Database Settings > Scheduled Archive tab and define another remote system to receive the new Operations Center's database archives.
    If you schedule the transfer to occur within six hours of the start of weekly maintenance, no new archive will be created. Instead, the archive generated by weekly maintenance will be used. For large deployments with large volumes of data, configuring backups to occur more often than weekly may affect overall system performance.
  5. Using the administrative shell on the Standby Operations Center, restore the database archive on the Standby Operations Center. Restore time depends upon the size of the database, and may take several hours for a large system.
    admin-na206.corp100.com> restore ExampleNet_4050201203200004-20130221-641
    Note: The admin credentials (that default to admin/admin) are changed on the Standby Operations Center following the database restore operation. The Standby Operations Center will use the admin credentials that previously applied to the Primary Operations Center.
  6. When the database restoration task finishes on the Standby Operations Center, run configure server a second time to regenerate the Standby Operations Center's self-signed certificate for HTTPS access. Keep the settings that you defined in Step 2 of this procedure.
  7. In the administrative shell on the Standby Operations Center, configure the VPN tunnel server on the Standby Operations Center using the same VPN subnet and other settings as on the Primary. When asked for the Server Public Name or IP address, be sure to enter the correct value for the Standby Operations Center. Do not configure a reference collector. The following listing is a sample capture for an entire session:
    admin-na206.corp100.com> configure tunserver
    +++ Configuring CA Settings
    CA key expiry in days [5475]:
    CA key size in bits [1024]:
    +++ Configuring Server Settings
    Server key expiry in days [5475]:
    Server key size in bits [1024]:
    Server Public Name or IP address: 172.23.27.170 <new IP address for Standby>
    Protocol (tcp, udp, udp6) [tcp]:
    Tunnel network base [5.0.0.0]:
    Block cipher:
    0. None (RSA auth)
    1. Blowfish-CBC
    2. AES-128-CBC
    3. Triple DES
    4. AES-256-CBC
    Enter Choice [2]:
    Use compression [y]:
    You can optionally designate a NetMRI client system as a "reference" system that will be used as a source of common settings.
    Enter reference system serial number or RETURN to skip: <press Enter here>
    Use these settings? (y/n) [n]: y

    +++ Initializing CA (may take a minute) ...
    +++ Creating Server Params and Keypair ...
    Generating DH parameters, 1024 bit long safe prime, generator 2
    This is going to take a long time
    ....++*++*++*
    +++ Creating Server Config ...
    Successfully configured Tunnel CA and Server
    The server needs to be restarted for these changes to take effect.
    Do you wish to restart the server now? (y/n) [y]: y
    +++ Restarting Server ... OK
  8. Before proceeding, check the Standby Operations Center's VPN tunnel server settings, which are used for communications between the Operations Center and its collectors:
    example-oc> show tunserver
    CA configured: Yes

    Server configured: Yes
    ServerPublicName: 172.23.27.170
    Proto: tcp
    Port: 443
    KeySize: 1024
    Network: 5.0.0.0
    Cipher: AES-128-CBC
    Compression: Yes
    Service running: Yes
    Reference NetMRI SN: N/A
    Reference NetMRI Import: Skipped
    Client Sessions:
    UnitSerialNo: 1200201202100020
    UnitName: oc-170-coll-1
    UnitIPAddress: 5.0.0.15
    Network: ExampleNet
    UnitID: 1
    Status: Offline: Last seen 2013-02-21 03:01:01
    ...
  9. Using a Web browser, log in to the Standby Operations Center. Note that the admin password for the Standby Operations Center system will now be set to the password of the Primary Operations Center.
  10. To re-enable all data collectors needed for the configuration, click the Settings icon > Setup > Collection and Groups.
    Note: You must re-enable SNMP collection on this page, as it is automatically disabled on a restore.
  11. To verify that all collectors are listed, click the Settings icon > Setup > Tunnels and Collectors.
  12. Register the collectors to the Standby Operations Center by executing the following commands on each collector. Use these commands to specify the Standby Operations Center IP address and the new admin credentials:
    admin-collector111.corp100.com> reset tunclient
    admin-collector111.corp100.com> register
  13. Verify Operations Center collector registration and communication by entering the following:
    example-oc> show tunclient
    Client configured: Yes
    Server: 172.23.27.182
    Proto: tcp
    Port: 443
    Cipher: AES-128-CBC
    Compression: On
    Tunnel Server IP: 5.0.0.1
    Tunnel Client IP: 5.0.0.10
    Server reachable: Yes
    Service running: Yes
    Latest Service Log Entries:
    Apr 10 17:02:51 localhost openvpn[20804]: VERIFY KU OK
    Apr 10 17:02:51 localhost openvpn[20804]: Validating certificate extended key usage
    Apr 10 17:02:51 localhost openvpn[20804]: ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication
    Apr 10 17:02:51 localhost openvpn[20804]: VERIFY EKU OK
    Apr 10 17:02:51 localhost openvpn[20804]: VERIFY OK: depth=0, /C=US/ST=CA/L=Santa_Clara/O=Infoblox/OU=na_Operations_Center/CN=OC182/name=Tunnel-Server/emailAddress=support@infoblox.com
    Apr 10 17:02:51 localhost openvpn[20804]: Data Channel Encrypt: Cipher 'AES-128-CBC' initialized with 128 bit key
    Apr 10 17:02:51 localhost openvpn[20804]: Data Channel Encrypt: Using 160 bit message hash 'SHA1' for HMAC authentication
    Apr 10 17:02:51 localhost openvpn[20804]: Data Channel Decrypt: Cipher 'AES-128-CBC' initialized with 128 bit key
    Apr 10 17:02:51 localhost openvpn[20804]: Data Channel Decrypt: Using 160 bit message hash 'SHA1' for HMAC authentication
    Apr 10 17:02:51 localhost openvpn[20804]: Control Channel: TLSv1, cipher TLSv1/SSLv3 DHE-RSA-AES256-SHA, 1024 bit RSA
    example-oc>
  14. In the NetMRI UI, log in to the Standby Operations Center.
  15. In Settings > Setup > Tunnels and Collectors, verify that each of the registered collectors is online. The Operations Center will begin receiving data from collectors immediately after the connection is established. Data processing and analysis will catch up over a period roughly equal to the time the collectors were offline.
  16. In Settings > Database Settings > Scheduled Archive, define the archiving settings that you will need for the new Operations Center system, including enabling automatic archiving, defining the recurrence pattern, and defining the remote systems that will receive the periodic archives.
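
The collector checks in Steps 8 and 15 can also be scripted against a saved copy of the show tunserver output. The following Python sketch is illustrative only and is not part of NetMRI. It assumes that the output follows the "Key: Value" layout shown in the Step 8 sample capture, that each collector entry begins with a UnitSerialNo line, and that a connected collector reports a Status value beginning with "Online" (the sample capture shows only the Offline form, so this is an assumption). The function names parse_client_sessions and offline_collectors are hypothetical.

```python
# Sketch: list collectors that are not online, based on saved `show tunserver` text.
# Assumptions (not confirmed by the product documentation):
#   - collector entries follow the "Client Sessions:" line as "Key: Value" pairs
#   - each entry starts with a "UnitSerialNo" line
#   - a connected collector has a Status value that starts with "Online"

def parse_client_sessions(show_tunserver_output: str) -> list[dict[str, str]]:
    """Return one dict per collector listed under 'Client Sessions:'."""
    sessions: list[dict[str, str]] = []
    current = None
    in_sessions = False
    for raw in show_tunserver_output.splitlines():
        line = raw.strip()
        if line == "Client Sessions:":
            in_sessions = True
            continue
        if not in_sessions or ":" not in line:
            continue  # skip server settings, prompts, and filler lines
        # Split on the first colon only, so "Offline: Last seen ..." stays intact.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "UnitSerialNo":
            current = {}
            sessions.append(current)
        if current is not None:
            current[key] = value
    return sessions


def offline_collectors(sessions: list[dict[str, str]]) -> list[str]:
    """Return the UnitName of every collector whose Status does not start with 'Online'."""
    return [
        s.get("UnitName", "<unknown>")
        for s in sessions
        if not s.get("Status", "").startswith("Online")
    ]


# Excerpt of the Step 8 sample capture, used here as test input.
SAMPLE = """\
CA configured: Yes
Server configured: Yes
ServerPublicName: 172.23.27.170
Client Sessions:
UnitSerialNo: 1200201202100020
UnitName: oc-170-coll-1
UnitIPAddress: 5.0.0.15
Network: ExampleNet
UnitID: 1
Status: Offline: Last seen 2013-02-21 03:01:01
"""

if __name__ == "__main__":
    for name in offline_collectors(parse_client_sessions(SAMPLE)):
        print(f"Collector not online: {name}")
```

Run against the Step 8 sample, the sketch reports oc-170-coll-1, which is expected before the collectors are re-registered in Step 12. After Step 15, an empty result would suggest that all listed collectors have reconnected, provided the Status assumption above holds for your release.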