Capacity Planning for Data Replication and Clustering
The tables in this topic help you provision your deployment to support data replication and clustering based on your expected usage rates in physical appliance and cloud-based (AWS and Azure) deployments.
Note: Refer to Data Durability and Availability in Stellar Cyber for a discussion of best practices related to maintaining data availability.
Capacity Planning for Physical Appliances
The # Appliances column in the table below indicates the number of data nodes. Keep in mind that a multi-node cluster will include one additional appliance operating as the Master DP. The Master DP doesn't process data itself but distributes it to workers and coordinates the cluster.
Data Replica | # Appliances | Ingestion (GB/day) | # Reports | # Playbooks | # Tenants | # Concurrent Sessions |
---|---|---|---|---|---|---|
No | 1 | 300 | 100 | 1000 | 50 | 15 |
No | 2 | 600 | 200 | 2000 | 100 | 30 |
No | 3 | 900 | 300 | 3000 | 150 | 45 |
No | 4 | 1200 | 400 | 4000 | 200 | 60 |
No | 5 | 1500 | 500 | 5000 | 250 | 75 |
No | 6 | 1800 | 600 | 6000 | 300 | 90 |
No | 7 | 2100 | 700 | 7000 | 350 | 105 |
Yes | 2 | 400 | 200 | 2000 | 100 | 30 |
Yes | 3 | 600 | 300 | 3000 | 150 | 45 |
Yes | 4 | 800 | 400 | 4000 | 200 | 60 |
Yes | 5 | 1000 | 500 | 5000 | 250 | 75 |
Yes | 6 | 1200 | 600 | 6000 | 300 | 90 |
Yes | 7 | 1400 | 700 | 7000 | 350 | 105 |
Yes | 8 | 1600 | 800 | 8000 | 400 | 120 |
Yes | 9 | 1800 | 900 | 9000 | 450 | 135 |
Yes | 10 | 2000 | 1000 | 10000 | 500 | 150 |
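The figures in the table above scale linearly with the number of data nodes: each node adds 300 GB/day of ingestion without data replication, or 200 GB/day with it. As a rough illustration only, the following Python sketch estimates a node count from those per-node figures. The function names and constants are hypothetical, derived from the table rather than from any Stellar Cyber tool, and the table itself remains the authoritative source.

```python
import math

# Per-data-node ingestion read off the table above (assumed to scale linearly).
GB_PER_NODE = {False: 300, True: 200}        # keyed by "data replica enabled?"
NODE_RANGE = {False: (1, 7), True: (2, 10)}  # node counts the table covers


def data_nodes_needed(ingestion_gb_per_day: float, replica: bool) -> int:
    """Estimate the number of data nodes for a daily ingestion rate."""
    if ingestion_gb_per_day <= 0:
        raise ValueError("ingestion must be positive")
    lowest, highest = NODE_RANGE[replica]
    nodes = max(lowest, math.ceil(ingestion_gb_per_day / GB_PER_NODE[replica]))
    if nodes > highest:
        raise ValueError("ingestion exceeds the range covered by the table")
    return nodes


def total_appliances(data_nodes: int) -> int:
    """Add the Master DP that a multi-node cluster includes."""
    return data_nodes + 1 if data_nodes > 1 else data_nodes
```

For example, `data_nodes_needed(1000, replica=True)` gives 5 data nodes, which matches the 1000 GB/day row with data replication, and `total_appliances(5)` gives 6 once the Master DP is counted.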
Capacity Planning for Cloud-Based Deployments
The DL Count column in the AWS and Azure tables below indicates the number of DL instances that actually store data. Keep in mind that a cloud-based deployment will include one additional Dedicated DL-Master node when the cluster scales to daily ingestion greater than 250 GB and requires more than a single DL node. The Dedicated DL-Master node provides storage management and Elasticsearch operations but does not store data itself.
AWS Capacity Planning
Data Replica? | Data Ingestion (GB/day) | # Tenants | # Reports | # ATH | # Concurrent Sessions | DA Instance | DA Count | Per DA CPU | Per DA Memory | DL Instance | DL Count | Per DL CPU | Per DL Memory |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
N/A | 50 | 10 | 10 | 100 | 5 | N/A | N/A | N/A | N/A | r5.4xlarge | 1 | 16 | 128GB |
N/A | 100 to 250 | 25 | 100 | 1000 | 15 | m5.4xlarge | 1 | 16 | 64GB | r5.4xlarge | 1 | 16 | 128GB |
N/A | 300 | 50 | 100 | 1000 | 15 | m5.4xlarge | 1 | 16 | 64GB | r5.4xlarge | 2 | 16 | 128GB |
N/A | 350 | 50 | 100 | 1000 | 15 | m5.4xlarge | 2 | 16 | 64GB | r5.4xlarge | 2 | 16 | 128GB |
No | 500 | 75 | 200 | 1500 | 20 | m5.4xlarge | 2 | 16 | 64GB | r5.4xlarge | 2 | 16 | 128GB |
No | 600 | 75 | 300 | 2000 | 30 | m5.4xlarge | 2 | 16 | 64GB | r5.4xlarge | 3 | 16 | 128GB |
No | 900 | 100 | 400 | 3000 | 45 | m5.4xlarge | 3 | 16 | 64GB | r5.4xlarge | 4 | 16 | 128GB |
Yes | 400 | 75 | 300 | 2000 | 30 | m5.4xlarge | 2 | 16 | 64GB | r5.4xlarge | 3 | 16 | 128GB |
Yes | 600 | 100 | 400 | 3000 | 45 | m5.4xlarge | 2 | 16 | 64GB | r5.4xlarge | 4 | 16 | 128GB |
Yes | 800 | 100 | 400 | 2000 | 45 | m5.4xlarge | 3 | 16 | 64GB | r5.4xlarge | 4 | 16 | 128GB |
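One way to read the AWS table is as a lookup: pick the first row whose daily ingestion covers your expected rate, then add the Dedicated DL-Master node where the rule stated earlier applies. The Python sketch below illustrates that reading. It rests on several assumptions that are not stated in the table: rates that fall between rows round up to the next row, rows marked N/A for data replica are treated as usable only when replication is not requested, and all names (`Sizing`, `aws_sizing`, `dl_instances_including_master`) are hypothetical.

```python
from typing import NamedTuple, Optional


class Sizing(NamedTuple):
    max_gb_per_day: int
    da_count: Optional[int]  # None where the table lists N/A
    dl_count: int


# Rows copied from the AWS table above, in ascending order of ingestion.
# Assumption: the "N/A" replica rows belong with the no-replica rows.
NO_REPLICA = [
    Sizing(50, None, 1), Sizing(250, 1, 1), Sizing(300, 1, 2),
    Sizing(350, 2, 2), Sizing(500, 2, 2), Sizing(600, 2, 3),
    Sizing(900, 3, 4),
]
WITH_REPLICA = [Sizing(400, 2, 3), Sizing(600, 2, 4), Sizing(800, 3, 4)]


def aws_sizing(gb_per_day: float, replica: bool) -> Sizing:
    """Return the first table row that covers the given daily ingestion."""
    for row in (WITH_REPLICA if replica else NO_REPLICA):
        if gb_per_day <= row.max_gb_per_day:
            return row
    raise ValueError("ingestion exceeds the range covered by the table")


def dl_instances_including_master(row: Sizing, gb_per_day: float) -> int:
    """DL Count plus one Dedicated DL-Master when the stated rule applies."""
    needs_master = gb_per_day > 250 and row.dl_count > 1
    return row.dl_count + (1 if needs_master else 0)
```

Under these assumptions, 450 GB/day without replication maps to the 500 GB/day row (2 DA, 2 DL), and the DL-Master rule brings the DL instance total to 3.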
Azure Capacity Planning
Data Replica? | Data Ingestion (GB/day) | # Tenants | # Reports | # ATH | # Concurrent Sessions | DA Instance | DA Count | Per DA CPU | Per DA Memory | DL Instance | DL Count | Per DL CPU | Per DL Memory |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
N/A | 50 | 10 | 10 | 100 | 5 | Standard_D16s_v3 | N/A | N/A | N/A | Standard_E16s_v3 | 1 | 16 | 128GB |
N/A | 100 to 250 | 25 | 100 | 1000 | 15 | Standard_D16s_v3 | 1 | 16 | 64GB | Standard_E16s_v3 | 1 | 16 | 128GB |
N/A | 300 | 50 | 100 | 1000 | 15 | Standard_D16s_v3 | 1 | 16 | 64GB | Standard_E16s_v3 | 2 | 16 | 128GB |
N/A | 350 | 50 | 100 | 1000 | 15 | Standard_D16s_v3 | 2 | 16 | 64GB | Standard_E16s_v3 | 3 | 16 | 128GB |
No | 500 | 75 | 200 | 1500 | 20 | Standard_D16s_v3 | 2 | 16 | 64GB | Standard_E16s_v3 | 3 | 16 | 128GB |
No | 750 | 75 | 300 | 2000 | 30 | Standard_D16s_v3 | 3 | 16 | 64GB | Standard_E16s_v3 | 4 | 16 | 128GB |
No | 1000 | 100 | 400 | 3000 | 45 | Standard_D16s_v3 | 4 | 16 | 64GB | Standard_E16s_v3 | 5 | 16 | 128GB |
Yes | 300 | 75 | 300 | 2000 | 30 | Standard_D16s_v3 | 2 | 16 | 64GB | Standard_E16s_v3 | 3 | 16 | 128GB |
Yes | 450 | 100 | 400 | 3000 | 45 | Standard_D16s_v3 | 2 | 16 | 64GB | Standard_E16s_v3 | 4 | 16 | 128GB |
Yes | 600 | 100 | 400 | 2000 | 45 | Standard_D16s_v3 | 2 | 16 | 64GB | Standard_E16s_v3 | 5 | 16 | 128GB |
Data Sinks and Capacity Planning
If you enable a data sink in your deployment, stay within the guidelines in the tables above and provision enough DA nodes for your anticipated ingestion. Do not exceed 300 GB of daily ingestion per DA node.
Data sink performance depends heavily on the I/O bandwidth between the DA nodes and the data sink itself, and adding a data sink can reduce DA performance by 30-40%. For this reason, plan to load each DA node with no more than 300 GB of daily ingestion, the per-node maximum reflected in the tables above, when a data sink is enabled. If your current configuration exceeds this per-DA load, add DA nodes to your cluster before enabling the data sink.
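The per-DA limit above reduces to simple arithmetic: divide the expected daily ingestion by 300 GB and round up to get the minimum DA node count. The following Python sketch shows that check. The function name and constant are hypothetical and are not part of any Stellar Cyber tooling.

```python
import math

MAX_GB_PER_DA_WITH_SINK = 300  # per-DA daily ingestion ceiling stated above


def extra_da_nodes_for_sink(ingestion_gb_per_day: float, current_da_nodes: int) -> int:
    """Return how many DA nodes to add before enabling a data sink."""
    required = math.ceil(ingestion_gb_per_day / MAX_GB_PER_DA_WITH_SINK)
    return max(0, required - current_da_nodes)
```

For example, a cluster ingesting 700 GB/day on 2 DA nodes carries 350 GB per node, so `extra_da_nodes_for_sink(700, 2)` returns 1.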