Configuring Reporting Clustering
You can add higher scale, performance, and reliability to the Reporting and Analytics solution by using the reporting clustering feature. Through reporting clustering, you can combine and configure multiple reporting members in a cluster.
The reporting clustering feature offers high availability and disaster recovery. You can configure one or more reporting appliances in multiple locations (sites). Reporting data is replicated among these reporting appliances to ensure redundancy and continuous service even if one of the servers fails. For example, if one of the reporting members has operational issues, reports and dashboards use backup copies of the data on other reporting members in the cluster to ensure continuous reporting service. When a new reporting member joins the cluster, you do not need to reconfigure and restart your forwarders to send data to the new reporting member, because the Grid Master automatically notifies all forwarders about the new member. In addition, data indexed on the new reporting member participates in searches that support reports and dashboards. Thus, reporting clustering increases scale, offers higher reporting performance, and greatly improves the reliability of the Reporting and Analytics solution.
For more information about how reporting clustering works and the types of clustering mode, see Clustering Overview and Reporting Cluster Modes below.
Note
Reporting clusters are not supported in a Multi-Grid configuration.
Clustering Overview
The concept of reporting clustering is to set up a group of reporting members within one site (location) or across multiple sites. When you configure multiple reporting members within one site, you are setting up a single-site cluster. Configuring multiple reporting members across different sites gives you a multi-site cluster, as illustrated in the Sample Multi-Site Reporting Cluster figure below. Both single-site clusters and multi-site clusters offer the benefits of high availability and disaster recovery. Without reporting clustering, a reporting member is known as a single indexer.
A reporting cluster, either single-site or multi-site, consists of the following components that work together to perform reporting and clustering activities:
- Cluster Master: The cluster master coordinates all clustering activities and always runs on the Grid Master.
- Indexer (also known as cluster peer): An indexer collects, processes, and indexes reporting data. It can also function as the originating indexer (source peer) or a replication target (target peer).
- Forwarder: A forwarder sends reporting data to the indexer for processing.
- Search Head: A search head handles search queries and distributes search requests to indexers in the cluster. One of the reporting members in the cluster serves as both an indexer and the search head.
- Replication Factor: This factor defines the number of copies of reporting data the cluster replicates and maintains. This is set to 2 by default for both single-site and multi-site clusters so the clusters can tolerate one reporting member failure without losing any data (since there will still be another copy of data available in the cluster).
- Search Factor: This factor defines the number of copies of searchable data. This is set to 2 for single-site clusters and set to 1 for multi-site clusters so the cluster can tolerate one reporting member failure without impacting search results (since there will still be a searchable copy of data available in the cluster).
- ReportingSite Extensible Attribute: This is an extensible attribute that you associate with reporting members in a multi-site cluster. For more information, see ReportingSite Extensible Attribute below.
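The default factors listed above can be summarized in a short Python sketch. This is an illustrative model only, not NIOS or Splunk code: the `ClusterFactors` class, its method, and the `DEFAULTS` dictionary are hypothetical names. It assumes that each copy of the data resides on a different reporting member, so a cluster keeping N copies can lose N - 1 of those members before the last copy is gone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterFactors:
    """Hypothetical container for the two factors described above."""
    replication_factor: int  # copies of reporting data kept in the cluster
    search_factor: int       # copies of that data kept searchable

    def failures_without_data_loss(self) -> int:
        # Assumption: each copy is on a different member, so N copies
        # survive the loss of N - 1 members.
        return self.replication_factor - 1


# Default values stated in this section.
DEFAULTS = {
    "single-site": ClusterFactors(replication_factor=2, search_factor=2),
    "multi-site": ClusterFactors(replication_factor=2, search_factor=1),
}

for mode, factors in DEFAULTS.items():
    print(mode, "tolerates", factors.failures_without_data_loss(), "member failure(s)")
```

With the default replication factor of 2, both modes tolerate one reporting member failure without losing data, which matches the description above.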
In a Grid that includes a reporting cluster, the Grid Master coordinates various activities across reporting members.
In the reporting cluster, a reporting member can act as an indexer and/or a search head. It also participates in peer-to-peer data replication.
To configure a reporting cluster, you must first set up all the reporting members and enable the reporting service in the Grid. You can then select the reporting clustering type for your cluster. For more information about cluster types, see Reporting Cluster Modes below. When you configure the reporting cluster, you must use an NTP server to synchronize the time of the Grid Master, Grid members and reporting members.
Reporting Cluster Modes
You must first enable the reporting service and configure one or more reporting members as needed before configuring the reporting cluster. When you enable reporting clustering, the Grid Master, forwarders, and reporting members use specific ports for network communication. The figures below (Ports Required for IPv4 and IPv6 Single Indexer, Port Requirement for IPv4 and IPv6 Single-site Clustering, and Port Requirement for IPv4 and IPv6 Multi-site Clustering) illustrate whether the network communication is over TCP/SSL or VPN, and the ports that you can use for the reporting service.
- Single Indexer: This is the traditional configuration that works on one reporting server (indexer). The forwarder sends reporting data to the indexer and the indexer indexes the data. This is the default configuration when you enable the reporting service for new installations.
Ports Required for IPv4 and IPv6 Single Indexer
- Single-Site Cluster: In a single-site cluster, the Grid Master is also the cluster master and all reporting members are cluster indexer peers. NIOS selects a peer and configures it as the search head to handle search queries. If the selected search head goes down, NIOS automatically selects another search head among the reporting members in the same site. All other Grid members (non-reporting members) are considered forwarders that send reporting data to the cluster peers for processing. You must configure at least two reporting members that are located in the same site (location). By default, the replication factor and search factor for a single-site cluster are set to 2. Note that you can upgrade your configuration from a single-site cluster to a multi-site cluster. However, once configured, you cannot change your configuration back to a single indexer. For information about how to configure a single-site cluster, see Configuring Reporting Clusters below.
Port Requirement for IPv4 and IPv6 Single-site Clustering
- Multi-Site Cluster: A multi-site clustering configuration is useful when you want to manage multiple reporting sites at different locations, with each site having its own set of indexers. The multi-site clustering configuration is valid only when you associate all the reporting members in the cluster with the predefined ReportingSite extensible attribute. For information about the ReportingSite extensible attribute, see ReportingSite Extensible Attribute below. In a multi-site cluster, you configure one of the sites as the primary site, and then plan other sites in a specific order. This order defines the next site of indexers to which the forwarders send data when the primary site is out of service. Note that all Grid members send data only to indexers in the primary site. You can designate a new primary site either by using the Grid Reporting Properties editor or by using the set promote_master CLI command. For more information about the CLI command, refer to the Infoblox CLI Guide. A multi-site cluster must have at least two sites with two reporting members in each site, as illustrated in the Sample Multi-Site Reporting Cluster figure. The first reporting site that you configure is the primary site, which also hosts the search head for the cluster. If the search head goes down, the Grid Master automatically chooses an available reporting member in the same site as the new search head. If all the indexers in a site go down, or if you want to move the search head to another site, you must manually redefine the primary site. Note that you must make one of the active sites the primary site. In a multi-site cluster, the search factor (also known as the site search factor) determines both the number of searchable copies that the entire cluster maintains and the number of copies that each site maintains. By default, the search factor is set to 1 and the replication factor is set to 2 in a multi-site cluster.
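The role of the site order in a multi-site cluster can be sketched as follows. This is a simplified illustration under stated assumptions, not how the Grid Master is implemented: the function name `select_forwarding_site` and the site names are hypothetical, and the sketch only models the rule that forwarders use the primary site first and then the next site in the configured order when the primary site is out of service.

```python
from typing import Optional, Sequence, Set


def select_forwarding_site(site_order: Sequence[str],
                           sites_in_service: Set[str]) -> Optional[str]:
    """Return the first site in the configured order that is in service.

    site_order[0] is the primary site; later entries are the fallbacks.
    """
    for site in site_order:
        if site in sites_in_service:
            return site
    return None  # no site can accept data


# Hypothetical site order; "site1" is the primary site.
order = ["site1", "site2", "site3"]

print(select_forwarding_site(order, {"site1", "site2", "site3"}))  # site1
print(select_forwarding_site(order, {"site2", "site3"}))           # site2
```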
Note
You can change your configuration from a single indexer to a single-site cluster or multi-site cluster and from a single-site cluster to a multi-site cluster. However, you cannot revert your configuration from a multi-site cluster to a single-site cluster or to a single indexer.
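The one-way nature of these mode changes can be expressed as a small lookup table. The sketch below is illustrative only; the mode labels and the `can_change_mode` helper are hypothetical names and do not correspond to NIOS settings or commands.

```python
# Allowed cluster mode changes, as described in the note above.
ALLOWED_TRANSITIONS = {
    "single-indexer": {"single-site", "multi-site"},
    "single-site": {"multi-site"},
    "multi-site": set(),  # a multi-site cluster cannot be reverted
}


def can_change_mode(current: str, target: str) -> bool:
    """Return True if the configuration can move from current to target."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


print(can_change_mode("single-indexer", "multi-site"))  # True
print(can_change_mode("multi-site", "single-site"))     # False
```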
Clustering Data Replication
When you change the configuration from a single indexer to a single-site or multi-site cluster, or from a single-site cluster to a multi-site cluster, replication starts only for new data that is created after you have completed the cluster mode configuration. Any data created prior to switching are restored on the primary site and are not replicated on the secondary site. To manage your reporting clustering data efficiently, see Guidelines for Deploying Reporting Clusters.
Sample Multi-Site Reporting Cluster
For more information about how reporting clusters work, refer to the Splunk documentation at https://docs.splunk.com/Documentation/Splunk/8.2.4/Indexer/Basicclusterarchitecture.
Port Requirement for IPv4 and IPv6 Multi-site Clustering
ReportingSite Extensible Attribute
NIOS defines the ReportingSite extensible attribute for use by the multi-site cluster reporting configuration. You must associate a ReportingSite extensible attribute value with the reporting members defined in the cluster. For more information, see Assigning a ReportingSite EA Value to a Multi-Site Cluster below. Note that your multi-site cluster configuration is invalid unless you assign ReportingSite values to all the reporting members that are part of the cluster.
You can add up to five ReportingSite extensible attributes, and you can view and edit them in the Administration tab -> Extensible Attributes tab in Grid Manager. You can view the ReportingSite extensible attribute values in the Grid -> Grid Manager -> Members tab in Grid Manager. The ReportingSite column is not available if you customize the Results table. You can enable the ReportingSite column by selecting Columns -> Edit Columns. For information about customizing tables, see About the Grid Manager Interface. You can also use the Group Results function to group reporting members that have the same ReportingSite extensible attribute value. For information about grouping members by extensible attributes, see Grouping Members by Extensible Attribute in Adding Grid Members.
As illustrated in Sample Multi-Site Reporting Cluster above, the ReportingSite value "site1" is assigned to a site within a multi-site cluster and to the reporting members RM1, RM2 and RM3. The ReportingSite value "site2" is assigned to a different site in the cluster and to reporting members RM4, RM5 and RM6. If the search head goes down, the Grid Master automatically chooses an available reporting member in the same site to be the search head.
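The same-site rule for replacing a failed search head can be sketched with the site assignments from this example. This is a hedged illustration only: the `choose_search_head` function is hypothetical, and picking the first available member by name is an assumption made for the sketch, because this section does not describe how the Grid Master ranks candidates.

```python
from typing import Dict, Optional, Set

# Site assignments from the example above.
MEMBER_SITES: Dict[str, str] = {
    "RM1": "site1", "RM2": "site1", "RM3": "site1",
    "RM4": "site2", "RM5": "site2", "RM6": "site2",
}


def choose_search_head(failed_head: str, available: Set[str],
                       member_sites: Dict[str, str] = MEMBER_SITES) -> Optional[str]:
    """Pick a replacement search head from the failed head's own site."""
    site = member_sites[failed_head]
    candidates = sorted(
        member for member, member_site in member_sites.items()
        if member_site == site and member != failed_head and member in available
    )
    # Assumption: take the first candidate by name. The real selection
    # rule is not documented in this section.
    return candidates[0] if candidates else None


print(choose_search_head("RM1", {"RM2", "RM3", "RM4"}))  # RM2
print(choose_search_head("RM1", {"RM4", "RM5"}))         # None
```

When no member in the same site is available, the sketch returns `None`, which corresponds to the case where you must manually redefine the primary site.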
Note
When you modify the ReportingSite extensible attribute value for any indexers in a multi-site cluster, ensure that you validate the configuration again, as described in Validating Reporting Clustering Configuration below.
Monitoring Reporting Cluster Status
After you have set up reporting members and defined clustering type, you can monitor the cluster status through the following:
- View the reporting member service status, as described in Monitoring Grid Services in Monitoring Services.
- Check license usage by the reporting member, as described in Home Dashboards.
Promoting the Grid Master Candidate in Multi-Site Clustering
If the Grid Master fails and all other reporting members are up and running in a multi-site reporting cluster, you must promote the Grid Master Candidate to the Grid Master by using the CLI command set promote_master. For information about the CLI command, refer to the Infoblox CLI Guide.
Reporting Categories and Related Data Sources
The reporting member uses two types of data sources to generate reports: file-based data sources and script-based data sources. When the reporting member is down or unreachable, file-based data sources are queued until the reporting member is up and running. However, the script-based data sources are lost if the size of the queued data exceeds 500 KB.
The amount of data in the queue is managed as follows:
- Syslog files: The reporting syslog files (extracted from /var/log/messages) rotate at 120 MB, retaining one older file. The amount of data in the queue depends on the file size when the reporting member becomes unreachable.
- CSV files: The CSV files overwrite the oldest data with the new data at regular intervals, so a CSV file contains only the latest events.
Note
To keep older information, use any of the export methods to export this data to a file daily.
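The queue limits described in this section amount to simple arithmetic, sketched below. The constant and function names are hypothetical and chosen for illustration; only the 120 MB rotation size, the single retained older file, and the 500 KB limit for script-based data come from this document.

```python
SYSLOG_ROTATE_MB = 120        # syslog file size that triggers rotation
RETAINED_OLDER_FILES = 1      # older syslog files kept after rotation
SCRIPT_QUEUE_LIMIT_KB = 500   # script-based data is lost above this size


def max_queued_syslog_mb() -> int:
    """Upper bound on queued syslog data: current file plus retained copies."""
    return SYSLOG_ROTATE_MB * (1 + RETAINED_OLDER_FILES)


def script_data_survives(queued_kb: float) -> bool:
    """Script-based data is lost once the queued data exceeds the limit."""
    return queued_kb <= SCRIPT_QUEUE_LIMIT_KB


print(max_queued_syslog_mb())      # 240
print(script_data_survives(320))   # True
print(script_data_survives(512))   # False
```

The 240 MB upper bound agrees with the "between 120 MB and 240 MB" range given for syslog-based reports in the table of report categories.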
The table below lists, for each report provided by the reporting server, the report category, source type, data source type (file-based or script-based), and queue data update frequency:
Report Categories, Related Data Sources, and Update Frequencies
Report Category | Reports | Source Type | Data Source (file-based or script-based) | Update Frequency |
---|---|---|---|---|
Device | Inactive IP Addresses | ib:reserved2 | file-based (syslog) | Rotates at 120 MB; retains one older copy; queued data is between 120 MB and 240 MB |
Device | Port Capacity Utilization by Device; Port Capacity Trend; Port Capacity Delta by Device | ib:reserved2 | file-based (csv) | Overwritten every 6 hours |
Device | End Host History | ib:discovery:end_host_activity | file-based (csv) | Overwritten every 24 hours |
Device | IP Address Inventory | | | |
Device | Network Inventory | | | |
Device | IPAMv4 Device Networks | | | |
Device | Device Interface Inventory | | | |
Device | Device Inventory | ib:reserved2 | file-based (csv) | Overwritten every 24 hours |
Device | Device Components | | | |
Device | Device Advisor | ib:reserved2 | file-based (csv) | Overwritten every 24 hours |
DHCP Performance | DHCP Message Rate Trend | ib:dhcp:message | file-based (csv) | Overwritten every 1 minute |
DHCP Performance | DHCPv4 Usage Trend; DHCPv4 Range Utilization Trend | ib:dhcp:range | file-based (csv) | Overwritten every 1 hour |
DHCP Lease History | DHCP Lease History; DHCP Top Lease Clients | ib:dhcp:lease_history | file-based (syslog) | Rotates at 120 MB; retains one older copy; queued data is between 120 MB and 240 MB |
DHCP Lease History | Top Devices Identified; Device Trend; Device Class Trend; Top Device Classes | ib:dhcp:lease_history | file-based (syslog) | Based on summary search report, which is updated during the 16th and 46th minutes of each hour |
DHCP Lease History | Top Devices Denied an IP Address | ib:dhcp:lease_history | file-based (syslog) | Based on summary search report, which is updated during the 19th and 49th minutes of each hour |
DHCP Lease History | Device Fingerprint Change Detected | ib:dhcp:lease_history | file-based (syslog) | Executed every 24 hours |
DNS Performance | DNS Response Latency Trend | ib:dns:perf | script-based | Executed every 1 minute |
DNS Record Scavenging | DNS Scavenged Object Count Trend | ib:dns:reclamation | file-based (csv) | Updated whenever reclamation tasks are executed |
DNS Query Capture | DNS Domain Query Trend; DNS Domains Queried by Client; Top DNS Clients by Query Type; Top DNS Clients Querying MX Records | ib:dns:capture | file-based (csv) | Updated whenever the Data Collection VM collects capture query data from a Grid member |
DDNS | DDNS Update Rate Trend | ib:ddns | file-based (syslog) | Rotates at 120 MB; retains one older copy; queued data is between 120 MB and 240 MB |
DNS Traffic Control | DNS Traffic Control Resource Availability Trend | ib:dns:reserved | file-based (csv) | Based on summary search report, which is updated once every six hours at the 47th minute of the hour. With each execution, it summarizes raw events indexed from 370 minutes ago to 10 minutes ago. |
DNS Traffic Control | DNS Traffic Control Resource Availability Status | ib:dns:reserved | file-based (csv) | Based on summary search report, which is updated once every six hours at the 47th minute of the hour. With each execution, it summarizes raw events indexed from 370 minutes ago to 10 minutes ago. |
DNS Traffic Control | DNS Traffic Control Resource Pool Availability Trend | ib:dns:reserved | file-based (csv) | Based on summary search report, which is updated once every six hours at the 23rd minute of the hour. With each execution, it summarizes raw events indexed from 370 minutes ago to 10 minutes ago. |
DNS Traffic Control | DNS Traffic Control Resource Pool Availability Status | ib:dns:reserved | file-based (csv) | Based on summary search report, which is updated once every six hours at the 23rd minute of the hour. With each execution, it summarizes raw events indexed from 370 minutes ago to 10 minutes ago. |
DNS Traffic Control | DNS Traffic Control Response Distribution Trend | ib:dns:reserved | file-based (csv) | Based on summary search report, which is updated once every six hours at the 37th minute of the hour. With each execution, it summarizes raw events indexed from 370 minutes ago to 10 minutes ago. |
DDI Utilization | DHCPv4 Usage Statistics; DHCPv4 Top Utilized Networks | ib:dhcp:network | file-based (csv) | Overwritten every 1 hour |
DDI Utilization | IPAM Network Usage; IPAM Top Networks | ib:ipam:network | file-based (csv) | Overwritten every 1 hour |
DDI Utilization | DNS Zone Statistics Per DNS View | ib:dns:view | file-based (csv) | Overwritten every 24 hours |
DDI Utilization | DNS Statistics per Zone | ib:dns:zone | file-based (csv) | Overwritten every 24 hours |
DDI Utilization | DNS Object Count Trend for Flex Grid License | ib:dns:ibflex_zone_counts | file-based (csv) | Generated once every 24 hours; the average is calculated over 5 days |
System Utilization | CPU Utilization Trend; Memory Utilization Trend; Traffic Rate by Member | ib:system | script-based | Executed every 1 minute |
System Utilization | License Pool Utilization | ib:system | file-based (csv) | Overwritten every 24 hours |
System Utilization | SPLA Grid Licensing Features Enabled | ib:system | | Generated once every 24 hours for all IB-FLEX members on the Grid |
System Capacity | System Capacity Prediction | ib:system_capacity:objects | | Updated whenever a relevant event occurs |
DNS Query | DNS Replies Trend | ib:dns:stats | script-based | Executed every 1 minute |
DNS Query | DNS Cache Hit Rate Trend | ib:dns:query:cache_hit_rate | script-based | Executed every 1 minute |
DNS Query | DNS Query Rate by Query Type | ib:dns:query:qps | script-based | Executed every 1 minute |
DNS Query | DNS Query Rate by Member; DNS Daily Query Rate by Member; DNS Daily Peak Hour Query Rate by Member | ib:dns:query:by_member | script-based | Executed every 1 minute |
DNS Query | DNS Top Clients | ib:dns:query:top_clients | script-based | Executed every 10 minutes |
DNS Query | DNS Top Requested Domain Names | ib:dns:query:top_requested_domain_names | script-based | Executed every 10 minutes |
DNS Query | DNS Top Clients Per Domain; DNS Top NXDOMAIN / NOERROR (no data); DNS Top SERVFAIL Errors Received; DNS Top SERVFAIL Errors Sent; DNS Top Timed-Out Recursive Queries | ib:dns:reserved | script-based | Executed every 10 minutes |
DNS Query | DNS Query Trend per IP Block Group | ib:dns:reserved | script-based | Executed every 5 minutes |
DNS Query | DNS Effective Peak Usage Trend for Flex Grid License | ib:dns:query:qps | | Executed every 10 minutes; the average is calculated over five days |
Security | DNS Top RPZ Hits | ib:dns:reserved | script-based | Executed every 10 minutes |
Security | DNS Top RPZ Hits by Clients | ib:dns:reserved | script-based | Executed every 10 minutes |
Security | Top DNS Firewall Hits | ib:dns:reserved | script-based | Executed every 10 minutes |
Security | Malicious Activity by Client | ib:dns:reserved | script-based | Executed every 10 minutes |
Security | DNS Firewall Executive Threat | ib:dns:reserved | script-based | Executed every 10 minutes |
Security | FireEye Alerts | ib:syslog | script-based | Updated immediately when alerts are logged in the syslog |
Security | Threat Protection Event Count By Severity Trend; Threat Protection Event Count By Member Trend; Threat Protection Event Count By Rule; Threat Protection Event Count By Time; Threat Protection Event Count By Category; Threat Protection Event Count By Member | ib:reserved1 | file-based (csv) | Overwritten every 5 minutes |
Security | DNS Top Tunneling Activity; DNS Tunneling Traffic by Category; Top Malware and DNS Tunneling Events by Client | ib:reserved1 | file-based (csv) | Overwritten every 5 minutes |
Network User | User Login History | ib:reserved1 | file-based (csv) | |
Ecosystem Subscription | Subscription Data | ib:reserved1 | file-based (csv) | Updated whenever an event is received from the vendor to which NIOS subscribes |
Ecosystem Publication | Publish Data | ib:reserved1 | file-based (csv) | Updated whenever a relevant RPZ, IPAM, or DHCP lease event occurs |
Cloud | VM Address History | ib:reserved2 | file-based (csv) | Updated immediately when there is a change related to the VM IP address. Rotates at 300 MB and retains one older copy. |
Audit Log | Audit Log Events | ib:audit | file-based (audit log) | Updated immediately when the audit log is updated |
Audit Log | Audit Log WAPI Events | ib:audit | file-based (audit log) | Updated immediately when the audit log is updated |
Syslog | Syslog Events | ib:syslog | file-based (syslog) | Updated immediately when alerts are logged in the syslog |
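For the DNS Traffic Control reports, the table states that each summary search covers raw events indexed from 370 minutes ago to 10 minutes ago. The sketch below works out that window for a given execution time. The `summary_window` function and the example timestamp are hypothetical and used only for illustration.

```python
from datetime import datetime, timedelta
from typing import Tuple


def summary_window(run_time: datetime) -> Tuple[datetime, datetime]:
    """Return the (start, end) of the raw-event window for one execution."""
    start = run_time - timedelta(minutes=370)
    end = run_time - timedelta(minutes=10)
    return start, end


# Hypothetical execution at the 47th minute of an hour.
run = datetime(2024, 1, 1, 12, 47)
start, end = summary_window(run)
print(start)        # 2024-01-01 06:37:00
print(end)          # 2024-01-01 12:37:00
print(end - start)  # 6:00:00
```

The window is exactly six hours long, so consecutive executions six hours apart cover adjacent windows.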
Configuring Reporting Clusters
You can configure a reporting single indexer, a single-site cluster, or a multi-site cluster. When you configure reporting clustering, make sure that you configure two or more reporting appliances and that all indexers are online.
Note
There is no action required if you see intermittent "Too many streaming errors" and "Skip indexing" messages in the Messages menu of the Reporting tab. This can be caused by network connectivity issues between the reporting nodes.
During NIOS upgrade, when configuring reporting clusters, ignore the "Unable to establish a connection to peer" message displayed on the Reporting tab.
To configure a reporting cluster:
1. From the Administration tab -> Reporting tab, click Grid Reporting Properties from the Toolbar.
2. In the Grid Reporting Properties editor, select the Reporting Clustering tab and complete the following:
   - Single Indexer: Select this to configure only one reporting server. This is the default reporting cluster mode.
   - Single-Site Cluster: Select this if you want to configure two or more reporting servers in the same site (location). The data is replicated on multiple reporting servers. You can upgrade your configuration to the multi-site clustering mode, but you cannot revert this configuration to the single indexer mode.
   - Multi-Site Cluster: Select this if you want to configure multiple reporting servers at different sites (locations). You must assign the ReportingSite extensible attributes to all the reporting members that you have configured in the same site within the cluster. You can configure the same ReportingSite extensible attribute with multiple reporting members. The reporting members that are configured with the same ReportingSite extensible attributes are tagged to the same site. Click the Add icon and select the ReportingSite extensible attribute that you have configured on the reporting member. The first site that you add is considered to be the primary site, which hosts the search head. You can change the order of the sites by clicking the up and down arrows.
For more information about the reporting cluster type, see Reporting Cluster Modes above.
Note
Your multi-site configuration is invalid if you do not add the correct ReportingSite extensible attribute values to the reporting members. You can validate your configuration as described in Validating Reporting Clustering Configuration below.
3. Click Save & Close.
Assigning a ReportingSite EA Value to a Multi-Site Cluster
The multi-site clustering configuration is valid only when you associate all the reporting members in the cluster with the specified ReportingSite extensible attribute values. Make sure that you select the ReportingSite values from those that are specified for the multi-site cluster in the Grid Reporting Properties editor. After you assign extensible attribute values to the reporting members, you can validate the multi-site cluster configuration as described in Validating Reporting Clustering Configuration below.
To associate the ReportingSite extensible attribute with the reporting member:
- From the Grid tab, select the Grid Manager tab -> Members tab -> member checkbox, and then click Extensible Attributes in the Toolbar.
- Click the Add icon in the Extensible Attributes table to enter extensible attributes. The appliance adds a row to the table each time you click the Add icon. Select the row and the attribute name from the drop-down list, and then enter the value.
- Optionally, select an extensible attribute and click Delete to delete it.
- Click Save & Close.
Validating Reporting Clustering Configuration
After you have configured the reporting cluster mode, you can verify its validity. Whenever you make changes to the reporting configuration through Grid Manager or hardware replacement, make sure that you validate the configuration.
When you verify a multi-site cluster configuration, NIOS validates the following:
- The extensible attribute ReportingSite is specified for all reporting members.
- The set of extensible attributes configured in the Grid Reporting Properties editor is equal to the set of ReportingSite extensible attributes defined for the reporting members.
- For each ReportingSite extensible attribute, the number of reporting peers must be greater than or equal to the replication factor in each site.
- For each ReportingSite extensible attribute, the search factor must be less than or equal to the replication factor in each site.
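The four checks listed above can be sketched as a small validation function. This is an illustrative model written for this guide, not the validation code that NIOS runs: the function name `validate_multi_site`, its parameters, and the error messages are hypothetical. The default factors of 2 and 1 are the multi-site defaults stated earlier.

```python
from collections import Counter
from typing import Dict, List, Optional, Set


def validate_multi_site(member_sites: Dict[str, Optional[str]],
                        configured_sites: Set[str],
                        replication_factor: int = 2,
                        search_factor: int = 1) -> List[str]:
    """Return a list of problems; an empty list means the sketch found none.

    member_sites maps each reporting member to its ReportingSite value
    (None if unset). configured_sites holds the values selected in the
    Grid Reporting Properties editor.
    """
    errors: List[str] = []

    # Check 1: ReportingSite is specified for all reporting members.
    missing = sorted(m for m, site in member_sites.items() if not site)
    if missing:
        errors.append("ReportingSite not set for: " + ", ".join(missing))

    # Check 2: configured sites equal the sites assigned to members.
    assigned = {site for site in member_sites.values() if site}
    if assigned != configured_sites:
        errors.append("Configured sites do not match member sites")

    # Check 3: each site has at least replication_factor reporting peers.
    peers_per_site = Counter(site for site in member_sites.values() if site)
    for site in sorted(configured_sites):
        if peers_per_site[site] < replication_factor:
            errors.append(f"{site} has fewer peers than the replication factor")

    # Check 4: the search factor does not exceed the replication factor.
    if search_factor > replication_factor:
        errors.append("Search factor exceeds replication factor")

    return errors


members = {"RM1": "site1", "RM2": "site1", "RM3": "site1",
           "RM4": "site2", "RM5": "site2", "RM6": "site2"}
print(validate_multi_site(members, {"site1", "site2"}))  # []
```

The example uses the RM1 through RM6 assignments from the Sample Multi-Site Reporting Cluster description, which pass all four checks.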
To verify the reporting cluster-mode configuration:
- From the Grid tab -> Grid Manager tab, click the Reporting service.
- In the vertical Toolbar, click Verify Cluster Configuration.
The Verify Reporting Cluster Configuration dialog box displays an error message if the configuration is invalid. Make sure that you associate the ReportingSite extensible attributes with all the reporting members that you have configured.
- Click OK to close the dialog box.