Deploy Infrastructure Observability (IO)
Infrastructure Observability (IO) in the Virtana Platform is a self-contained monitoring component that observes the infrastructure environment where it is deployed and transmits relevant operational data to GlobalView (GV), the centralized management layer of the Virtana Platform. Deploy IO to gain deep visibility into your infrastructure health, performance, and resource utilization. IO enables proactive monitoring, alarm management, analytics, and integration with infrastructure technologies such as VMware, Nutanix, storage arrays, and network devices. IO operates independently of GV and can be deployed in a separate environment. It communicates with GV through unidirectional HTTPS requests: IO initiates all calls, and GV responds with configurations and policies.
Infrastructure Observability architecture and prerequisites
IO continuously monitors the infrastructure where it resides and transmits operational data to GV through a logically unidirectional communication model. IO initiates all HTTPS requests to the GV endpoint on port 443, while GV responds with configurations, policies, and handshake mechanisms, but never independently initiates outbound calls to IO.
Communication between IO and GV
IO initiates all communication. It sends HTTPS requests (port 443) to the GV endpoint.
GV responds with configurations, policies, and handshake data, but never independently initiates outbound calls to IO.
This maintains a logically unidirectional communication model (IO > GV), with GV acting purely in a responding role.
Prerequisites
Before deploying IO, the following prerequisites must be met:
The GV endpoint (port 443) must be reachable from the IO environment.
Allow outbound traffic on port 443 (HTTPS) from IO to GV.
Obtain the Client ID and Client Secret specific to your tenant from GV.
A running Kubernetes cluster (or OpenShift) with Helm installed.
The virtana-repo Helm repository must be added and accessible.
Valid DockerHub (or private registry) login credentials.
A reachable time server for clock synchronization.
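The prerequisites above can be spot-checked from a shell in the IO environment before you start. The following is a minimal sketch, not an official Virtana tool: the helper names (have_cmd, check_tools, check_gv) and the gv.example.com host are assumptions made for illustration, and curl is assumed to be the HTTP client available on the host.

```shell
# Pre-flight sketch for the IO prerequisites (illustrative only).
# GV_HOST is a placeholder; replace it with your tenant's GV endpoint.
GV_HOST="${GV_HOST:-gv.example.com}"

# Return 0 if the named command is on PATH, non-zero otherwise.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Print "ok: <tool>" or "missing: <tool>" for each tool name given.
check_tools() {
  for tool in "$@"; do
    if have_cmd "$tool"; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
    fi
  done
}

# Print the HTTP status returned by the GV endpoint over port 443.
# Any status code means the endpoint is reachable; a curl error means it is not.
check_gv() {
  curl -sS -o /dev/null --connect-timeout 10 -w '%{http_code}\n' "https://$1/"
}

check_tools kubectl helm curl

# Run these manually once the tools are present:
#   check_gv "$GV_HOST"         # outbound HTTPS from IO to GV
#   helm repo list              # virtana-repo should be listed
#   kubectl get storageclass    # storage classes to reference in io-values.yaml
```

The network and cluster checks are left as manual commands so that the script itself has no side effects beyond printing which tools are installed.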
Create the Infrastructure Observability values file
Create a file named io-values.yaml. This is the central configuration file that drives the entire IO deployment.
Global settings
global:
  dockerRegistryCredentials:
    DOCKER_SERVER: "https://index.docker.io/v2/"
    DOCKER_USERNAME: "username"
    DOCKER_PASSWORD: "password"
  serviceAccount:
    create: false
    name: "virtana-io-sa"
    annotations: {}
  filestorage:
    storageClassName: "<filestorage-storageclass-name>"
  blockstorage:
    storageClassName: "<blockstorage-storageclass-name>"
  brandingtheme:
    name: virtana
  image:
    registryUrl: index.docker.io/virtana
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
  ntp:
    servers:
      - "time.server"
  deploymentSize: small
| Field | Description | Default value |
|---|---|---|
| dockerRegistryCredentials.DOCKER_SERVER | URL of the Docker registry. | https://index.docker.io/v2/ |
| dockerRegistryCredentials.DOCKER_USERNAME | Docker registry username for pulling container images. | |
| dockerRegistryCredentials.DOCKER_PASSWORD | Docker registry password or access token. | |
| serviceAccount.create | Whether to create a new Kubernetes service account. Set to false if using a pre-existing one. | false |
| serviceAccount.name | Name of the Kubernetes service account. | virtana-io-sa |
| serviceAccount.annotations | Optional annotations for the service account. | {} |
| filestorage.storageClassName | Name of an RWX (ReadWriteMany) storage class. Required for shared file storage across pods. | |
| blockstorage.storageClassName | Name of an RWO (ReadWriteOnce) storage class. Required for database and single-pod volumes. | |
| brandingtheme.name | UI branding theme. | virtana |
| image.registryUrl | Base URL of the container image registry. | index.docker.io/virtana |
| securityContext.runAsUser | UID under which all containers run. | 1000 |
| securityContext.runAsGroup | GID under which all containers run. | 1000 |
| securityContext.fsGroup | Filesystem group ID applied to mounted volumes. | 1000 |
| ntp.servers | NTP server addresses for time synchronization. Replace "time.server" with your actual NTP server. | |
| deploymentSize | Deployment sizing profile that determines default resource allocations. | small |
Dependency check settings
dependencyCheck:
  enabled: true
  releases:
    - virtana-io-core
    - virtana-io-dbs
    - virtana-io-infra
| Field | Description |
|---|---|
| enabled | Enables pre-deployment validation to ensure dependent Helm releases exist before proceeding. |
| releases | Helm release names that must be present before the current release can deploy. Ensures correct installation order. |
Integration toggles
The integration section controls which infrastructure probes, translators, and connectors are enabled. Set each toggle to true to enable the component, or to false to disable it.
integration:
  probevm-vcenter-proxy: false
  probesw-snmp-proxy: false
  probevm-aix-proxy: false
  probenetflow: false
  probe-scaleio-proxy: false
  probe-os-proxy: false
  probevm-windows-proxy: false
  probe-hw-proxy: false
  integration-ibmsvc: false
  integration-emcisilon: false
  integration-emcvmax: false
  integration-selfmonitor: true
  translator-aidc: false
  translator-kvm: false
  translator-ntap: false
  translator-linux: false
  translator-solaris: false
  translator-windows: false
  translator-pure: false
  translator-cbts-infinibox: false
  translator-powerstore: false
  translator-unity: false
  translator-objectscale: false
  translator-hpe3par: false
  translator-vicm-xtremio: false
  translator-vplex: false
  translator-psdo-ucs: false
  translator-nexus: false
  translator-hvsp: false
  vw-integration-slack-app: false
  integration-vcs: true
  translator-hcp: false
  translator-ucp: false
  translator-ds8k: false
  translator-hnas: false
  translator-nutanix: false
  vw-ciscogrpc-proxy: false
  translator-snmp-switch: false
  translator-datadomain: false
Probes (Data Collectors)
Agents that gather raw metrics directly from infrastructure sources such as hypervisors, network devices, and storage systems.
| Field | Description |
|---|---|
| probevm-vcenter-proxy | VMware vCenter probe for virtual machine monitoring. |
| probesw-snmp-proxy | SNMP probe for switch or network device monitoring. |
| probevm-aix-proxy | AIX virtual machine probe. |
| probenetflow | NetFlow collector for network traffic analysis. |
| probe-scaleio-proxy | Dell ScaleIO or PowerFlex storage probe. |
| probe-os-proxy | OS-level probe for host monitoring. |
| probevm-windows-proxy | Windows virtual machine probe. |
| probe-hw-proxy | Hardware probe for physical server monitoring. |
Platform integrations
Connectors that link the platform to external services, storage systems, and notification channels.
| Field | Description |
|---|---|
| integration-ibmsvc | IBM SAN Volume Controller (SVC) integration. |
| integration-emcisilon | Dell EMC Isilon (PowerScale) storage integration. |
| integration-emcvmax | Dell EMC VMAX or PowerMax storage integration. |
| integration-selfmonitor | Self-monitoring of the IO platform itself. |
| integration-vcs | VCS (Virtana Cloud Services) integration for GV communication. |
| vw-integration-slack-app | Slack integration for alert notifications. |
| vw-ciscogrpc-proxy | Cisco gRPC telemetry proxy. |
Translators (Data Normalizers)
Components that transform and normalize vendor-specific data into a unified format for analysis and reporting.
| Field | Description |
|---|---|
| translator-aidc | AIDC (Automatic Infrastructure Data Collection) translator. |
| translator-kvm | KVM hypervisor translator. |
| translator-ntap | NetApp storage translator. |
| translator-linux | Linux OS translator. |
| translator-solaris | Solaris OS translator. |
| translator-windows | Windows OS translator. |
| translator-pure | Pure Storage translator. |
| translator-cbts-infinibox | CBTS or Infinidat InfiniBox storage translator. |
| translator-powerstore | Dell PowerStore storage translator. |
| translator-unity | Dell EMC Unity storage translator. |
| translator-objectscale | Dell ObjectScale (object storage) translator. |
| translator-hpe3par | HPE 3PAR or Primera storage translator. |
| translator-vicm-xtremio | Dell EMC XtremIO storage translator. |
| translator-vplex | Dell EMC VPLEX storage translator. |
| translator-psdo-ucs | Cisco UCS (Unified Computing System) translator. |
| translator-nexus | Cisco Nexus switch translator. |
| translator-hvsp | Hyper-V Server Platform translator. |
| translator-hcp | Hitachi Content Platform translator. |
| translator-ucp | Hitachi Unified Compute Platform translator. |
| translator-ds8k | IBM DS8000 storage translator. |
| translator-hnas | Hitachi NAS translator. |
| translator-nutanix | Nutanix hyperconverged infrastructure translator. |
| translator-snmp-switch | Generic SNMP switch translator. |
| translator-datadomain | Dell Data Domain (PowerProtect DD) translator. |
Platform-specific configuration
The Virtana IO platform is designed to run on standard Kubernetes clusters, but certain deployment environments, such as Red Hat OpenShift and FIPS-compliant infrastructures, require additional configuration to align with their unique security models, networking conventions, and compliance requirements. OpenShift, for instance, uses Routes instead of standard Ingress resources and enforces Security Context Constraints (SCCs). Similarly, organizations operating under Federal Information Processing Standards (FIPS) must override specific container image tags to use FIPS-validated cryptographic builds.
You need to apply these configurations to ensure IO deploys correctly, adheres to platform-specific security policies, and remains compliant with regulatory standards.
OpenShift deployments
Add the following settings to io-values.yaml if deploying on Red Hat OpenShift:
global:
  isOpenShift: true
  public_dns_host_name: "virtanaio.example.com"
| Field | Description |
|---|---|
| isOpenShift | Set to true to enable OpenShift-specific configurations (Routes, SCCs, etc.). |
| public_dns_host_name | Public DNS hostname used by OpenShift Route resources to expose IO services externally. |
OpenShift with FIPS compliance
For FIPS-enabled OpenShift environments, add the following override to io-values.yaml:
global:
  image:
    tag:
      couchdb_ubi_clouseau: couchdb-3.4.3-fips-virtana
This setting overrides the CouchDB image tag to use a FIPS-compliant build. It is required only in FIPS-enabled environments.
Deployment instructions
Infrastructure Observability is deployed in a phased sequence using Helm. Each phase installs a specific layer.
Perform the following steps:
1. Check the latest chart version.
helm search repo virtana-repo/virtana-io
2. Deploy infrastructure components to install foundational services, such as networking, ingress, or shared utilities.
helm upgrade --install virtana-io-infra virtana-repo/virtana-io \
  --namespace virtana-io --create-namespace \
  -f io-values.yaml --set tags.infra=true \
  --version <LATEST_VERSION>
3. Deploy databases to install database services, such as Vertica, CouchDB, and Redis, that core services depend on.
helm upgrade --install virtana-io-dbs virtana-repo/virtana-io \
  --namespace virtana-io --create-namespace \
  -f io-values.yaml --set tags.dbs=true \
  --version <LATEST_VERSION>
4. Deploy core services to install the main IO application services, such as alarm, analytics, query, or graph store.
helm upgrade --install virtana-io-core virtana-repo/virtana-io \
  --namespace virtana-io --create-namespace \
  -f io-values.yaml --set tags.core_services=true \
  --version <LATEST_VERSION>
5. Deploy integrations to install the integration layer (probes, translators, connectors) based on your io-values.yaml toggles.
helm upgrade --install virtana-io-integrations virtana-repo/virtana-io-integrations \
  --namespace virtana-io --create-namespace \
  -f io-values.yaml \
  --version <LATEST_VERSION>
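After the phases complete, it can help to confirm that every release exists before moving on. The following is a small sketch, assuming the release names used in the commands above. The helper name missing_releases is hypothetical; it reads release names on standard input so that it can be fed from helm list.

```shell
# Print each expected IO release that is absent from the names read on stdin.
# The helper name is illustrative; the release names come from the deployment steps.
missing_releases() {
  installed=$(cat)
  for release in virtana-io-infra virtana-io-dbs virtana-io-core virtana-io-integrations; do
    # -x matches whole lines only, -F treats the name as a fixed string.
    printf '%s\n' "$installed" | grep -qxF -- "$release" || echo "$release"
  done
}

# Usage against a live cluster (prints nothing when all four releases exist):
#   helm list --namespace virtana-io --short | missing_releases
#   kubectl get pods --namespace virtana-io
```

An empty result only shows that the releases were installed. Pod health still needs to be checked separately, for example with the kubectl command in the usage comment.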
Advanced overrides (Optional)
These overrides provide fine-grained control over CPU, memory, and JVM heap per service. They are not mandatory. Use them only when specific tuning is needed.
global:
  resources_limits:
    small:
      vw_alarm:
        resources:
          requests:
            memory: "1Gi"
            cpu: "50m"
          limits:
            memory: "1Gi"
            cpu: "100m"
      vw_case_management:
        resources:
          requests:
            memory: "1Gi"
            cpu: "50m"
          limits:
            memory: "1Gi"
            cpu: "100m"
      vw_fabric_service:
        resources:
          requests:
            memory: "2Gi"
            cpu: "50m"
          limits:
            memory: "2Gi"
            cpu: "100m"
      vw_graph_store_service:
        main:
          resources:
            requests:
              memory: "8Gi"
              cpu: "800m"
            limits:
              memory: "12Gi"
              cpu: "1000m"
        db_export_controller:
          resources:
            requests:
              memory: "2Gi"
              cpu: "200m"
            limits:
              memory: "4Gi"
              cpu: "400m"
      vw_metrix_service:
        resources:
          requests:
            memory: "1Gi"
            cpu: "50m"
          limits:
            memory: "2Gi"
            cpu: "100m"
      vw_job_manager:
        resources:
          requests:
            memory: "1Gi"
            cpu: "50m"
          limits:
            memory: "1Gi"
            cpu: "100m"
      vw_scheduler_service:
        resources:
          requests:
            memory: "1Gi"
            cpu: "10m"
          limits:
            memory: "1Gi"
            cpu: "50m"
      vw_integrations_service:
        resources:
          requests:
            memory: "1Gi"
            cpu: "10m"
          limits:
            memory: "1Gi"
            cpu: "50m"
      vw_analytics_service:
        resources:
          requests:
            memory: "1Gi"
            cpu: "100m"
          limits:
            memory: "1Gi"
            cpu: "250m"
      vw_query_service:
        resources:
          requests:
            memory: "2Gi"
            cpu: "50m"
          limits:
            memory: "3Gi"
            cpu: "100m"
      vw_application_service:
        resources:
          requests:
            memory: "1Gi"
            cpu: "10m"
          limits:
            memory: "1Gi"
            cpu: "50m"
      vw_integration_mgmt_service:
        resources:
          requests:
            memory: "2Gi"
            cpu: "50m"
          limits:
            memory: "4Gi"
            cpu: "100m"
      vw_investigation_service:
        resources:
          requests:
            memory: "1Gi"
            cpu: "10m"
          limits:
            memory: "1Gi"
            cpu: "20m"
      vw_tomcat:
        resources:
          requests:
            memory: "4Gi"
            cpu: "100m"
          limits:
            memory: "6Gi"
            cpu: "250m"
      vw_capi:
        resources:
          requests:
            memory: "2Gi"
            cpu: "100m"
          limits:
            memory: "4Gi"
            cpu: "250m"
      vw_license_manager:
        resources:
          requests:
            memory: "1Gi"
            cpu: "10m"
          limits:
            memory: "1Gi"
            cpu: "250m"
      vw_export:
        resources:
          requests:
            memory: "1Gi"
            cpu: "10m"
          limits:
            memory: "1Gi"
            cpu: "250m"
      vw_papi_integration:
        resources:
          requests:
            memory: "1Gi"
            cpu: "10m"
          limits:
            memory: "1Gi"
            cpu: "50m"
      vw_actions_service:
        resources:
          requests:
            memory: "1Gi"
            cpu: "50m"
          limits:
            memory: "1Gi"
            cpu: "150m"
      vw_dynamicalarm_integration:
        resources:
          requests:
            memory: "1Gi"
            cpu: "100m"
          limits:
            memory: "1Gi"
            cpu: "250m"
      vw_unification_service:
        resources:
          requests:
            memory: "1Gi"
            cpu: "50m"
          limits:
            memory: "1Gi"
            cpu: "150m"
      vw_am_service:
        resources:
          requests:
            memory: "1Gi"
            cpu: "50m"
          limits:
            memory: "1Gi"
            cpu: "150m"
      vw_vertica:
        resources:
          requests:
            memory: "8Gi"
            cpu: "1000m"
          limits:
            memory: "8Gi"
            cpu: "1000m"
      vw_redis:
        resources:
          requests:
            memory: "1Gi"
            cpu: "100m"
          limits:
            memory: "2Gi"
            cpu: "250m"
      couchdb:
        resources:
          requests:
            memory: "1Gi"
            cpu: "100m"
          limits:
            memory: "2Gi"
            cpu: "250m"
      vw_log_collector:
        resources:
          requests:
            memory: "1Gi"
            cpu: "10m"
          limits:
            memory: "1Gi"
            cpu: "50m"
      vw_updater:
        resources:
          requests:
            memory: "1Gi"
            cpu: "10m"
          limits:
            memory: "1Gi"
            cpu: "50m"
      restore_ipm:
        resources:
          requests:
            memory: "1Gi"
            cpu: "10m"
          limits:
            memory: "1Gi"
            cpu: "50m"
      # Integrations
      vw_emcisilon_integration:
        resources:
          requests:
            memory: "2Gi"
            cpu: "100m"
          limits:
            memory: "4Gi"
            cpu: "200m"
      vw_emcvmax_integration:
        resources:
          requests:
            memory: "2Gi"
            cpu: "100m"
          limits:
            memory: "4Gi"
            cpu: "500m"
      vw_integration_ibmsvc:
        resources:
          requests:
            memory: "2Gi"
            cpu: "100m"
          limits:
            memory: "4Gi"
            cpu: "200m"
      vw_selfmonitor_integration:
        resources:
          requests:
            memory: "500Mi"
            cpu: "100m"
          limits:
            memory: "1Gi"
            cpu: "200m"
      vw_vcs_integration:
        resources:
          requests:
            memory: "2Gi"
            cpu: "100m"
          limits:
            memory: "4Gi"
            cpu: "200m"
      vw_probe_hw_proxy:
        resources:
          requests:
            memory: "1Gi"
            cpu: "50m"
          limits:
            memory: "1Gi"
            cpu: "100m"
      vw_os_proxy:
        resources:
          requests:
            memory: "1Gi"
            cpu: "50m"
          limits:
            memory: "1Gi"
            cpu: "100m"
      vw_scaleio_proxy:
        resources:
          requests:
            memory: "2Gi"
            cpu: "100m"
          limits:
            memory: "4Gi"
            cpu: "200m"
      vw_netflow_proxy:
        resources:
          requests:
            memory: "256Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "200m"
      vw_probesw_proxy:
        resources:
          requests:
            memory: "2Gi"
            cpu: "50m"
          limits:
            memory: "4Gi"
            cpu: "100m"
      vw_aix_proxy:
        resources:
          requests:
            memory: "1Gi"
            cpu: "100m"
          limits:
            memory: "2Gi"
            cpu: "200m"
      vw_vmware_proxy:
        resources:
          requests:
            memory: "2Gi"
            cpu: "100m"
          limits:
            memory: "4Gi"
            cpu: "200m"
      vw_windows_proxy:
        resources:
          requests:
            memory: "1Gi"
            cpu: "100m"
          limits:
            memory: "2Gi"
            cpu: "200m"
      translator_aidc:
        resources:
          requests:
            memory: "1Gi"
            cpu: "50m"
          limits:
            memory: "2Gi"
            cpu: "100m"
      translator_cbts_infinibox:
        resources:
          requests:
            memory: "2Gi"
            cpu: "300m"
          limits:
            memory: "3Gi"
            cpu: "500m"
      translator_hcp:
        resources:
          requests:
            memory: "2Gi"
            cpu: "300m"
          limits:
            memory: "3Gi"
            cpu: "500m"
      translator_ucp:
        resources:
          requests:
            memory: "1Gi"
            cpu: "100m"
          limits:
            memory: "2Gi"
            cpu: "200m"
      translator_ds8k:
        resources:
          requests:
            memory: "1Gi"
            cpu: "50m"
          limits:
            memory: "2Gi"
            cpu: "100m"
      translator_hnas:
        resources:
          requests:
            memory: "2Gi"
            cpu: "300m"
          limits:
            memory: "3Gi"
            cpu: "500m"
      translator_hpe3par:
        resources:
          requests:
            memory: "1Gi"
            cpu: "300m"
          limits:
            memory: "2Gi"
            cpu: "500m"
      translator_hvsp:
        resources:
          requests:
            memory: "1Gi"
            cpu: "300m"
          limits:
            memory: "2Gi"
            cpu: "500m"
      translator_kvm:
        resources:
          requests:
            memory: "1Gi"
            cpu: "100m"
          limits:
            memory: "2Gi"
            cpu: "200m"
      translator_linux:
        resources:
          requests:
            memory: "1Gi"
            cpu: "50m"
          limits:
            memory: "2Gi"
            cpu: "100m"
      translator_nexus:
        resources:
          requests:
            memory: "2Gi"
            cpu: "100m"
          limits:
            memory: "3Gi"
            cpu: "300m"
      translator_ntap:
        resources:
          requests:
            memory: "1Gi"
            cpu: "300m"
          limits:
            memory: "2Gi"
            cpu: "500m"
      nutanix_exporter:
        resources:
          requests:
            memory: "128Mi"
            cpu: "50m"
          limits:
            memory: "256Mi"
            cpu: "100m"
      translator_nutanix:
        resources:
          requests:
            memory: "2Gi"
            cpu: "100m"
          limits:
            memory: "4Gi"
            cpu: "200m"
      translator_powerstore:
        resources:
          requests:
            memory: "2Gi"
            cpu: "200m"
          limits:
            memory: "4Gi"
            cpu: "500m"
      translator_psdo_ucs:
        resources:
          requests:
            memory: "2Gi"
            cpu: "300m"
          limits:
            memory: "4Gi"
            cpu: "500m"
      translator_pure:
        resources:
          requests:
            memory: "1Gi"
            cpu: "300m"
          limits:
            memory: "2Gi"
            cpu: "500m"
      translator_solaris:
        resources:
          requests:
            memory: "1Gi"
            cpu: "100m"
          limits:
            memory: "2Gi"
            cpu: "200m"
      translator_unity:
        resources:
          requests:
            memory: "1Gi"
            cpu: "300m"
          limits:
            memory: "2Gi"
            cpu: "500m"
      translator_vicm_xtremio:
        resources:
          requests:
            memory: "2Gi"
            cpu: "200m"
          limits:
            memory: "4Gi"
            cpu: "500m"
      translator_vplex:
        resources:
          requests:
            memory: "2Gi"
            cpu: "200m"
          limits:
            memory: "3Gi"
            cpu: "300m"
      translator_windows:
        resources:
          requests:
            memory: "1Gi"
            cpu: "100m"
          limits:
            memory: "2Gi"
            cpu: "200m"
      vw_ciscogrpc_proxy:
        resources:
          requests:
            memory: "2Gi"
            cpu: "300m"
          limits:
            memory: "4Gi"
            cpu: "500m"
      vw_integration_slack_app:
        resources:
          requests:
            memory: "1Gi"
            cpu: "100m"
          limits:
            memory: "2Gi"
            cpu: "200m"
      translator_snmp_switch:
        resources:
          requests:
            memory: "1Gi"
            cpu: "150m"
          limits:
            memory: "2Gi"
            cpu: "200m"
      translator_datadomain:
        resources:
          requests:
            memory: "1Gi"
            cpu: "150m"
          limits:
            memory: "2Gi"
            cpu: "200m"
      translator_objectscale:
        resources:
          requests:
            memory: "1Gi"
            cpu: "150m"
          limits:
            memory: "2Gi"
            cpu: "200m"
  vwHeapSize:
    vw_action_service: 1g
    vw_alarm: 1g
    vw_analytics_service: 1g
    vw_application_service: 1g
    vw_capi: 4g
    vw_case_management: 1g
    vw_dynamicalarm_integration: 2g
    vw_fabric_service: 2g
    vw_graph_store_service: 4g
    vw_integration_mgmt_service: 4g
    vw_integrations_service: 1g
    vw_investigation_service: 1g
    vw_job_manager: 1g
    vw_license_manager: 2g
    vw_metrix_service: 4g
    vw_papi_integration: 2g
    vw_query_service: 4g
    vw_scheduler_service: 1g
    vw_tomcat: 8g
    vw_vertica: 8g
    vw_redis: 2g
    vw_emcisilon_integration: 2g
    vw_windows_proxy: 2g
    vw_emcvmax_integration: 2g
    vw_integration_ibmsvc: 2g
    vw_vcs_integration: 2g
    vw_probe_hw_proxy: 2g
    vw_os_proxy: 2g
    vw_scaleio_proxy: 2g
    vw_netflow_proxy: 2g
    vw_probesw_proxy: 2g
    vw_aix_proxy: 2g
    vw_vmware_proxy: 2g
    vw_ciscogrpc_proxy: 2g
    vw_unification_service: 2g
Fixing issues with overrides
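| Issue | Override to apply |
|---|---|
| Pod is OOMKilled | Increase the container memory limit (resources.limits.memory). |
| Java service runs out of heap | Increase the JVM heap (vwHeapSize) for that service. |
| CPU throttling observed | Increase the CPU limit (resources.limits.cpu). |
| Single service needs tuning | Apply a one-off override for that service only. |
| Full capacity planning | Override all services (large, planned deployments only). |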
Important
For Java services, increasing container memory alone is not enough. The JVM heap (vwHeapSize) must also be increased. As a rule of thumb, set the heap to 80–90% of the container memory limit.
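The rule of thumb is simple arithmetic. The sketch below converts a container memory limit in GiB into a suggested heap size in MiB. The helper name heap_mib is hypothetical, and whether vwHeapSize accepts an m suffix is an assumption to verify against your chart version; the examples in this guide use whole g values, so round down to whole gigabytes if in doubt.

```shell
# Suggest a JVM heap size in MiB for a container memory limit given in GiB.
# $1: memory limit in GiB (integer). $2: percentage of the limit, default 80.
# Integer arithmetic truncates, so the result never exceeds the chosen percentage.
heap_mib() {
  limit_gib=$1
  percent=${2:-80}
  echo $(( limit_gib * 1024 * percent / 100 ))
}

heap_mib 12       # prints 9830
heap_mib 12 85    # prints 10444
```

For a 12Gi limit, 80 percent works out to roughly 9.6 GiB, which leaves headroom for non-heap JVM memory such as metaspace and thread stacks.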
Kubernetes resource overrides
Kubernetes resource overrides control requests (scheduling guarantees) and limits (hard caps). For example, to override vw_graph_store_service after an out-of-memory (OOM) failure:
global:
  resources_limits:
    small:
      vw_graph_store_service:
        main:
          resources:
            requests:
              memory: "8Gi"
              cpu: "800m"
            limits:
              memory: "12Gi"
              cpu: "1000m"
| Field | Description |
|---|---|
| resources_limits | Top-level key for all Kubernetes resource overrides. |
| small | Overrides are scoped to the small deployment size. |
| resources | CPU and memory requests and limits for the service. |
JVM Heap Size Overrides
JVM heap size overrides control the maximum heap (-Xmx) for Java-based services. Adjust the heap whenever you increase a Java service's container memory.
Note
The heap must always be less than the container memory.
global:
  vwHeapSize:
    vw_graph_store_service: 10g   # ~80% of 12Gi limit
    vw_query_service: 2g
    vw_tomcat: 4g                 # ~65% of 6Gi limit
| Field | Description |
|---|---|
| vwHeapSize | Top-level key for JVM heap overrides. |
| <service_name> | Max JVM heap for the service (e.g., 10g). Must be less than the container memory limit. |