Container Observability – South Deployment Guide

In this deployment, you install the Virtana CO cluster components into your Kubernetes or OpenShift cluster. These components continuously collect metrics, logs, and Kubernetes metadata from the cluster and securely forward that telemetry to the Virtana backend. A South deployment is required to enable end-to-end observability for a specific cluster. Without it, the platform cannot discover workloads, export metrics, or collect node and container signals, so dashboards, health status, alerting, and troubleshooting views in the UI can be incomplete or unavailable for that cluster.

Prerequisites

Ensure the following requirements are met before starting the deployment or configuration process:

Cluster-admin access to the target Kubernetes cluster.
Credentials for Docker Hub (or a private registry) and for the Keycloak client used by the CO backend.

Get South `values.yaml`

The South values.yaml file contains the tenant-specific and cluster-specific configuration needed to deploy CO South correctly. It includes Org identifiers, backend endpoints, and any pre-configured module defaults expected for your environment. This ensures that the deployment connects the South components to the correct Virtana tenant and applies the right settings for your selected cluster.

Perform the following steps to get the South values.yaml file:

Open https://GLOBAL_VIEW_HOSTNAME/ui in a browser.
Log in to Virtana Platform using your org email and password.
Navigate to Container Observability > Cluster.
In the top right of the CO default page, click System Status and select South Deployment Guide.
Click Generate Token to Download YAML to generate a download link for the South values.yaml file.
Copy the generated URL and run it on your machine to download the YAML file.

Run the commands provided under Deploy Opscruise, or use the following commands.

helm repo add virtana-repo https://virtana.gitlab.io/helm-charts
helm repo update
helm search repo virtana-repo/virtana-co

Save the downloaded file as <ORG_ID>-<CLUSTER_NAME>-virtana-co-values.yaml, which is the file name used by the deployment commands in this guide.

Deploy with Helm (CLI)

Deploy the South components directly from your terminal using native Helm command-line tools.

helm upgrade --install opscruise-bundle virtana-repo/virtana-co --namespace opscruise \
  --create-namespace -f <ORG_ID>-<CLUSTER_NAME>-virtana-co-values.yaml \
  --version <LATEST_VERSION>

Table 55.

Field	Description
`--namespace opscruise`	Target namespace for all components.
`--create-namespace`	Creates the namespace if absent.
`--version <LATEST_VERSION>`	Specific chart version to deploy.
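
As a minimal sketch of how the parts of the command above fit together, the following script derives the values file name from the organization ID and cluster name and prints the resulting Helm command so it can be reviewed before it is run. The values my-org, my-cluster, and 1.2.3 are hypothetical placeholders, not real identifiers or chart versions.

```shell
#!/bin/sh
# Hypothetical placeholder values; replace them with your own.
ORG_ID="my-org"
CLUSTER_NAME="my-cluster"
CHART_VERSION="1.2.3"

# File name convention used by the Helm command in this guide.
VALUES_FILE="${ORG_ID}-${CLUSTER_NAME}-virtana-co-values.yaml"

# Assemble the command as text so it can be reviewed before execution.
HELM_CMD="helm upgrade --install opscruise-bundle virtana-repo/virtana-co --namespace opscruise --create-namespace -f ${VALUES_FILE} --version ${CHART_VERSION}"

echo "${HELM_CMD}"
```

After running the reviewed command, `helm status opscruise-bundle -n opscruise` and `kubectl get pods -n opscruise` can be used to check that the release and its pods are up.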

Deploy with Argo CD

Create an Argo CD Application to manage the Helm chart declaratively.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: virtana-CLUSTER_NAME-south
  namespace: argocd
  finalizers:
  - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  destination:
    server: https://kubernetes.default.svc   
    namespace: opscruise
  source:
    chart: virtana-co
    repoURL: "https://virtana.gitlab.io/helm-charts"
    targetRevision: <LATEST_VERSION>
    helm:
      releaseName: opscruise-bundle
      valueFiles:
      - values.yaml
      values: |
        global:
          gatewayCreds:
            environment:
              DOCKER_SERVER: "https://index.docker.io/v2/"
              DOCKER_USERNAME: "xxxxx"
              DOCKER_PASSWORD: "xxxxx"
              OPSCRUISE_ENDPOINT: "xxxxxxxxxxxxx-xxxxxxxxxxxxx.elb.us-east-2.amazonaws.com:443"
              KEYCLOAK_ENABLED: "true"
              KEYCLOAK_URL: "https://xxxxxx.example.com:443"
              KEYCLOAK_CLIENT_ID: "xxxxxx"
              KEYCLOAK_CLIENT_SECRET: "xxxxxx"
              KEYCLOAK_REALM: "xxxxxx"
              OPSCRUISE_ACCOUNT_ID: "xxxxxx"
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true

The following table describes each field in the configuration file.

Table 56.

Field	Description	Default value
`apiVersion`	The Kubernetes API version for the Argo CD Application Custom Resource.	`argoproj.io/v1alpha1`
`metadata.name`	The name of the Argo CD Application object.	`virtana-CLUSTER_NAME-south`
`metadata.namespace`	The namespace where Argo CD is installed.	`argocd`
`metadata.finalizers`	Prevents the Application from being deleted until Argo CD has cleaned up the resources it created.	`- resources-finalizer.argocd.argoproj.io`
`spec.project`	The Argo CD Project this Application belongs to.	`default`
`spec.destination.server`	The Kubernetes API server address for the destination cluster.	`https://kubernetes.default.svc`
`spec.source.chart`	The Helm chart name to install.	`virtana-co`
`spec.source.repoURL`	The Helm repository URL hosting the chart.	`https://virtana.gitlab.io/helm-charts`
`spec.source.targetRevision`	The chart version Argo CD should deploy.	`<LATEST_VERSION>`
`spec.source.helm`	Helm-specific settings for the Argo CD Application, such as the release name and values.	-
`spec.syncPolicy.automated`	Enables automatic sync without manual intervention.	-
`spec.syncPolicy.syncOptions`	Tells Argo CD to create the destination namespace if it doesn’t exist.	`- CreateNamespace=true`

Deploy with Terraform

You can also integrate the deployment into your Infrastructure as Code pipelines by using the Terraform Helm provider.

resource "helm_release" "south" {
  create_namespace = true
  chart            = "virtana-co"
  name             = "opscruise-bundle"
  namespace        = "opscruise"
  repository       = "https://virtana.gitlab.io/helm-charts"
  version          = var.helm_version
  values = [
    templatefile("${path.module}/../values/opscruise-values.yaml", {
      docker_password          = var.docker_password
      docker_username          = var.docker_username
      keycloak_hostname        = var.keycloak_hostname
      opscruise_kafka_endpoint = var.opscruise_kafka_endpoint
      kafka_client_id          = var.kafka_client_id
      kafka_client_secret      = var.kafka_client_secret
      tenant_name              = var.tenant_name
      cluster_name             = var.cluster_name
    })
  ]
}

The following table describes each field in the configuration file.

Table 57.

Field	Description	Default value
`create_namespace`	Creates the opscruise namespace automatically if it does not exist.	`true`
`chart/repository/version`	Chart identity and version to install.	-
`name/namespace`	Helm release name and namespace.	-
`values`	Rendered values file using Terraform templatefile with variables for credentials and cluster metadata.	-

Optional settings

Use the following optional settings to pull images from a private registry or to set the resource profile for the South modules.

Using a private image registry

Add the following settings to <ORG_ID>-<CLUSTER_NAME>-virtana-co-values.yaml to pull images from a private image registry.

global:
  useGlobalRepository: true
  globalRepositoryName: example.io

The following table describes each field in the configuration file.

Table 58.

Field	Description	Default value
`global.useGlobalRepository`	Enables a global override of the container image registry for supported components.	`false`
`global.globalRepositoryName`	Fully qualified registry or repository path, for example `example.io`.	`""`

Update resource profile

Virtana CO South supports resource profiles to simplify sizing across modules. Instead of manually configuring CPU or memory for every component, you can select a machine_type profile, such as small, medium, or large, and the chart applies the corresponding default resource limits.

Global-level resource profile

Use this setting when you want a single sizing profile to apply across all CO South components, except for modules that override it.

global:
  machine_type: "small"

The global.machine_type setting selects the default sizing profile used by modules that do not define their own machine_type. Valid values are small, medium, and large.

Module-level override

Use this setting when a particular component needs a different sizing profile from the one the global profile provides. To change the machine_type for a specific module, add it to that module's section.

loggw-loki:
  machine_type: "large"

The loggw-loki.machine_type setting overrides global.machine_type for this module only. Valid values are small, medium, and large.
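
The precedence described above can be sketched as a small shell function. This only illustrates the rule (module value first, global value as fallback); it is not the chart's actual template logic, and the function name effective_machine_type is hypothetical.

```shell
#!/bin/sh
# Resolve the effective sizing profile for one module.
# $1: module-level machine_type (may be empty), $2: global.machine_type
effective_machine_type() {
  if [ -n "$1" ]; then
    echo "$1"
  else
    echo "$2"
  fi
}

GLOBAL_TYPE="small"
# loggw-loki sets its own machine_type, so the override wins.
LOGGW_TYPE=$(effective_machine_type "large" "$GLOBAL_TYPE")
# k8sgw sets nothing, so it falls back to the global profile.
K8SGW_TYPE=$(effective_machine_type "" "$GLOBAL_TYPE")

echo "loggw-loki: ${LOGGW_TYPE}"
echo "k8sgw: ${K8SGW_TYPE}"
```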

Custom resource values for a module

When the predefined resource profiles (small, medium, or large) do not meet the requirements of a specific module, you can define custom resource values. To do this, set machine_type for the module and define the resources fields under the corresponding machine type section (small_machine, medium_machine, or large_machine). Ensure that the value of machine_type matches the machine type section where the custom resources are defined.

loggw-loki:
    machine_type: "large"
    large_machine:
        resources:
            requests:
                memory: 2Gi
                cpu: 1
            limits:
                memory: 4Gi
                cpu: 2
        xss: -Xss4m
        xms: -Xms1500m
        xmx: -Xmx3500m

The following table describes each field in the configuration file.

Table 59.

Field	Description	Default value
`loggw-loki.machine_type`	Chooses which profile is active for this module.	`small/medium/large`
`loggw-loki.large_machine`	A profile-specific override block for the `large` profile.
`resources`	Kubernetes resource configuration to apply to the module’s pods.
`resources.requests`	Guaranteed minimum resources the scheduler uses for placement. `requests.cpu`: minimum CPU. `requests.memory`: minimum memory.
`resources.limits`	The maximum resources the container is allowed to consume. `limits.cpu`: maximum CPU. `limits.memory`: maximum memory.
`xss`, `xms`, `xmx`	JVM tuning options, such as thread stack size, initial heap, and max heap. These apply only to Java Gateways such as loggw, tracegw, gcpgw, and azuregw.	`xss: -Xss4m` `xms: -Xms1500m` `xmx: -Xmx3500m`

OpenShift deployment specifics

OpenShift-specific deployment steps are only required if your Kubernetes platform is Red Hat OpenShift, because OpenShift enforces additional security and runtime constraints compared to upstream Kubernetes. In particular, OpenShift uses Security Context Constraints (SCC) that can prevent CO South components from running with the permissions they need unless the correct SCCs are granted to their service accounts.

Follow the steps below to prepare the cluster before running the normal South deployment flow.

Download the virtana-co-values.yaml generated by the Virtana UI. This file includes cluster-specific settings required for the deployment.
Update the node-exporter port to 9200 instead of the default 9100 in virtana-co-values.yaml.
```
global:
  nodeExporterPort: 9200
```
The global.nodeExporterPort setting defines the port used by Node Exporter.
Enter the following command to create the South namespace if it does not already exist. Here, opscruise is the namespace where the CO South components are deployed.
```
kubectl create ns opscruise
```

Grant required SCC permissions to the CO South service accounts. OpenShift uses SCCs to control privileges.

The following commands bind SCCs to the service accounts used by South modules so pods can run correctly.

oc adm policy add-scc-to-user anyuid -z k8sgw-service-account -n opscruise
oc adm policy add-scc-to-user anyuid -z loggw-service-account -n opscruise
oc adm policy add-scc-to-user anyuid -z promgw-service-account -n opscruise
oc adm policy add-scc-to-user anyuid -z prometheus-service-account -n opscruise
oc adm policy add-scc-to-user privileged -z loggw-service-account -n opscruise
oc adm policy add-scc-to-user privileged -z prometheus-service-account -n opscruise
oc adm policy add-scc-to-user privileged -z ne-service-account -n opscruise
oc adm policy add-scc-to-user privileged -z opscruise-bundle-loki -n opscruise
oc adm policy add-scc-to-user privileged -z opscruise-bundle-promtail -n opscruise

After completing the OpenShift-specific prerequisite steps above, deploy CO South using your preferred method: Helm CLI, Argo CD, or Terraform.

Create secrets manually

The Virtana CO South components require secret values, such as Keycloak client credentials and Docker registry credentials, to authenticate to the Opscruise backend and pull images. If these secrets are not provided through the values file, they must already exist in the target namespace with the expected names.

Create these secrets manually before deploying South. This is typically needed when your organization’s security policy requires secrets to be managed outside Helm, for example through Vault, Fuze, or a dedicated secrets pipeline.

Create a Keycloak client secret `oc-kc-secret`

If secret_source is set to none, create the Keycloak client secret oc-kc-secret manually. The secret contains KEYCLOAK_CLIENT_SECRET, KEYCLOAK_CLIENT_ID, and KEYCLOAK_CLIENT_TOKEN.

On Linux, enter the following commands:

export KEYCLOAK_CLIENT_ID="xxxx"
export KEYCLOAK_CLIENT_SECRET="xxxx"

kubectl create secret generic oc-kc-secret \
  --from-literal=KEYCLOAK_CLIENT_SECRET="${KEYCLOAK_CLIENT_SECRET}" \
  --from-literal=KEYCLOAK_CLIENT_ID="${KEYCLOAK_CLIENT_ID}" \
  --from-literal=KEYCLOAK_CLIENT_TOKEN="Basic $(echo -n "${KEYCLOAK_CLIENT_ID}:${KEYCLOAK_CLIENT_SECRET}" | base64 -w 0)" \
  -n opscruise

cat <<EOF
{
  "KEYCLOAK_CLIENT_SECRET": "${KEYCLOAK_CLIENT_SECRET}",
  "KEYCLOAK_CLIENT_ID": "${KEYCLOAK_CLIENT_ID}",
  "KEYCLOAK_CLIENT_TOKEN": "Basic $(echo -n "${KEYCLOAK_CLIENT_ID}:${KEYCLOAK_CLIENT_SECRET}" | base64 -w 0)"
}
EOF

On macOS, enter the following commands:

export KEYCLOAK_CLIENT_ID="xxxx"
export KEYCLOAK_CLIENT_SECRET="xxxx"

kubectl create secret generic oc-kc-secret \
  --from-literal=KEYCLOAK_CLIENT_SECRET="${KEYCLOAK_CLIENT_SECRET}" \
  --from-literal=KEYCLOAK_CLIENT_ID="${KEYCLOAK_CLIENT_ID}" \
  --from-literal=KEYCLOAK_CLIENT_TOKEN="Basic $(echo -n "${KEYCLOAK_CLIENT_ID}:${KEYCLOAK_CLIENT_SECRET}" | base64 -b 0)" \
  -n opscruise

cat <<EOF
{
  "KEYCLOAK_CLIENT_SECRET": "${KEYCLOAK_CLIENT_SECRET}",
  "KEYCLOAK_CLIENT_ID": "${KEYCLOAK_CLIENT_ID}",
  "KEYCLOAK_CLIENT_TOKEN": "Basic $(echo -n "${KEYCLOAK_CLIENT_ID}:${KEYCLOAK_CLIENT_SECRET}" | base64 -b 0)"
}
EOF
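
The Linux and macOS variants differ only in the base64 flag that disables line wrapping. As a sketch of one way to avoid that difference, the token can be built by stripping newlines from the base64 output instead. The function name keycloak_basic_token is hypothetical, and the credentials user and pass are dummy values for illustration only.

```shell
#!/bin/sh
# Build "Basic <base64(id:secret)>" without platform-specific base64 flags.
# $1: Keycloak client ID, $2: Keycloak client secret
keycloak_basic_token() {
  # tr removes any line wrapping that base64 adds to long input.
  encoded=$(printf '%s' "$1:$2" | base64 | tr -d '\n')
  echo "Basic ${encoded}"
}

# Dummy credentials for illustration only.
TOKEN=$(keycloak_basic_token "user" "pass")
echo "${TOKEN}"   # prints: Basic dXNlcjpwYXNz
```

In the kubectl command, the result could then be passed as `--from-literal=KEYCLOAK_CLIENT_TOKEN="$(keycloak_basic_token "${KEYCLOAK_CLIENT_ID}" "${KEYCLOAK_CLIENT_SECRET}")"`.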

Create Docker registry credentials

Enter the following command to create a secret directly in Kubernetes.

export DOCKER_USERNAME="xxxx"
export DOCKER_PASSWORD="xxxx"

kubectl create secret docker-registry oc-ns-docker-creds \
  --docker-server=https://index.docker.io/v2/ \
  --docker-username="${DOCKER_USERNAME}" \
  --docker-password="${DOCKER_PASSWORD}" \
  -n opscruise

Enter the following commands to generate the JSON payload for a secret that is stored in Fuze or Vault.

registry_server="https://index.docker.io/v2/"
registry_username="xxxx"
registry_password="xxxx"

encoded_registry_auth=$(echo -n "${registry_username}:${registry_password}" | base64 | tr -d '\n')
DOCKER_CONFIG=$(echo -n "{\"auths\": {\"${registry_server}\": {\"auth\": \"${encoded_registry_auth}\"}}}")

cat <<EOF
{
  "dockerconfigjson": "${DOCKER_CONFIG}"
}
EOF
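
Because the Docker config document itself contains double quotes, embedding it in a JSON string value may require escaping. The following is a minimal sketch of that step, assuming your secrets tool expects the document as an escaped JSON string; that expectation is an assumption and should be checked against your Fuze or Vault setup. The variable ESCAPED_CONFIG is hypothetical, and the credentials are dummy values.

```shell
#!/bin/sh
# Dummy values for illustration only.
registry_server="https://index.docker.io/v2/"
registry_username="user"
registry_password="pass"

# base64 of "username:password", with any line wrapping removed.
encoded_registry_auth=$(printf '%s' "${registry_username}:${registry_password}" | base64 | tr -d '\n')

# Docker config document in the same shape as DOCKER_CONFIG above.
DOCKER_CONFIG=$(printf '{"auths": {"%s": {"auth": "%s"}}}' "${registry_server}" "${encoded_registry_auth}")

# Escape backslashes and double quotes so the document can sit inside a JSON string.
ESCAPED_CONFIG=$(printf '%s' "${DOCKER_CONFIG}" | sed 's/\\/\\\\/g; s/"/\\"/g')

printf '{\n  "dockerconfigjson": "%s"\n}\n' "${ESCAPED_CONFIG}"
```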

Environment-specific settings

This section covers three commonly used optional configurations for CO South deployments: enabling TLS and Basic Authentication for Prometheus, deploying the Zenoss Kubernetes Agent, and enabling GKE Autopilot compatibility. Apply only the subsections that match your environment or requirements by updating your virtana-co-values.yaml.

Prometheus: Enable TLS and Basic Authentication

Use this configuration when you want Prometheus endpoints protected using HTTPS (TLS) and Basic Authentication.

prometheus:
  security:
    authentication:
      enabled: true         
      password: prom123     
    tls:
      enabled: true            
      selfSignedCerts: true    
      hostname: ""

The following table describes each field in the configuration file.

Table 60.

Field	Description	Default value
`prometheus`	Configuration section for the Prometheus component deployed as part of CO South.
`prometheus.security.authentication.enabled`	Enables or disables Basic Authentication.	`false`
`prometheus.security.authentication.password`	Password used for Basic Auth.	`prom123`
`prometheus.security.tls.enabled`	Enables or disables TLS (HTTPS).	`false`
`prometheus.security.tls.selfSignedCerts`	When `true`, Prometheus uses self-signed certificates.	`true`
`prometheus.security.tls.hostname`	Optional DNS name for Prometheus.	`""`

Deploy the Zenoss Kubernetes agent

Enable this when you want CO South to deploy the Zenoss Kubernetes Agent and connect it to your Zenoss environment.

zenoss-agent-kubernetes:
  enabled: true        
  zenoss:
    clusterName: ""
    address: ""
    apiKey: ""

The following table describes each field in the configuration file.

Table 61.

Field	Description	Default value
`zenoss-agent-kubernetes.enabled`	Turns the Zenoss agent deployment on or off.	`false`
`zenoss-agent-kubernetes.zenoss.clusterName`	Cluster identifier or name as it should appear in Zenoss.	`""`
`zenoss-agent-kubernetes.zenoss.address`	Zenoss endpoint or address used by the agent to communicate.	`""`
`zenoss-agent-kubernetes.zenoss.apiKey`	API key or token used to authenticate to Zenoss.	`""`

GKE Autopilot support

Enable this only when your CO South target cluster is a GKE Autopilot cluster (as opposed to standard GKE). The default value of gkeAutoPilot is set to false.

global:
  gkeAutoPilot: false

When gkeAutoPilot is false (the default), the chart assumes standard GKE behavior. Set the value to true to apply Autopilot-compatible behavior.

Container Observability South base values file for reference

This section provides a curated snapshot of the most important virtana-co-values.yaml settings used to deploy and operate CO South, focusing on the parameters you most commonly review or tune during installation and troubleshooting. Use it as a quick reference to understand how global credentials, registry settings, backend connectivity, scheduling controls (tolerations and affinity), and key module configurations, such as k8sgw, promgw, and Prometheus, fit together. Adjust only what is required for your environment and keep the UI-generated tenant and cluster values intact.

CO South base values.yaml

##### Opscruise credentials #####
global:

  opscruiseChartVersion: TO_BE_DEFINED
  secret_source: "valuesfile"
  imagePullSecrets:
    - name: oc-ns-docker-creds

  ##### AWS Credentials #####
  awsCredentials:
    regions:
    - us-east-1
    aws_access_key_id: aws_access_key_id
    aws_secret_access_key: aws_secret_access_key

    roleArn: ""

  gkeAutoPilot: false

  ##### gateway Credentials #####
  gatewayCreds:
    environment:
      DOCKER_SERVER: "https://index.docker.io/v1/"
      DOCKER_USERNAME: "<DOCKER_USERNAME>"
      DOCKER_PASSWORD: "<DOCKER_PASSWORD>"
      DOCKER_EMAIL: "<DOCKER_EMAIL>"
      OPSCRUISE_ENDPOINT: "<OPSCRUISE_BACKEND_KAFKA_ENDPOINT>:443"
      KEYCLOAK_ENABLED: "true"
      KEYCLOAK_URL: "https://auth.opscruise.io:443"
      KEYCLOAK_CLIENT_ID: "<KAFKA_CLIENT_ID>"
      KEYCLOAK_CLIENT_SECRET: "<KEYCLOAK_CLIENT_SECRET>"
      KEYCLOAK_REALM: "<KEYCLOAK_REALM>"
      OPSCRUISE_ACCOUNT_ID: "<KEYCLOAK_CLUSTERID>"

  externalCadvisor: false

  nodeExporterPort: 9100

  useGlobalRepository: false
  globalRepositoryName: ""       

  k8sClusterFqdn: "cluster.local"

  metricScraper: "prometheus"

  metricScrapeInterval: 60

  machine_type: "small"

  ## namespace filtering ##
  # namespaceFiltering:
  #   namespaceAllowList:
  #   - kube-system
  #   - collectors
  #   - opscruise

  ## allow whitelisted pod labels ##
  # whitelistedPodLabels:
  # - app

  tolerations:
    - key: opscruise
      effect: NoSchedule
      operator: Exists

  daemonsetTolerations:
    - key: node-role.kubernetes.io/master
      effect: NoSchedule 
      operator: Exists

  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1

  daemonsetAffinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1

  # awsgw:
  #   enabled: false
  # azuregw:
  #   enabled: false
  # gcpgw:
  #   enabled: false
  # k8sgw:
  #   enabled: true
  # promgw:
  #   enabled: true
  # loggw-loki:
  #   enabled: true
  # tracegw:
  #   enabled: false
  # eventgw:
  #   enabled: false
  # trace-router:
  #   enabled: false
  # opscruise-node-exporter:
  #   enabled: false
  # opscruise-node-exporter-new:
  #   enabled: true
  # otel-metric-collector:
  #   enabled: true
  # kube-state-metrics:
  #   enabled: true
  # prometheus:
  #   enabled: true
  # loki-stack:
  #   enabled: true
  # prometheus-yace-exporter:
  #   enabled: false
  # jaeger:
  #   enabled: false
  # jaeger-operator:
  #   enabled: false
  # prometheus-postgres-exporter:
  #   enabled: false
  # prometheus-mongodb-exporter:
  #   enabled: false
  # kafka-exporter:
  #   enabled: false
  # fluent-bit:
  #   enabled: false
  # prometheus-mysql-exporter:
  #   enabled: false
  # influxdb-exporter:
  #   enabled: false
  # x509-certificate-exporter:
  #   enabled: false
  # prometheus-redis-exporter:
  #   enabled: false
  # nginx-prometheus-exporter:
  #   enabled: false
  # beyla:
  #   enabled: false
  # alloy:
  #   enabled: true
  # otel-trace-collector:
  #   enabled: false

##### Awsgw configs #####
awsgw:
  enabled: false
  logLevel: "info"
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  # Resources
  resources:
    limits:
      cpu: 500m
      memory: 250Mi
    requests:
      cpu: 50m
      memory: 50Mi

##### K8sgw configs #####
k8sgw:
  logLevel: "info"
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  # Resources
  resources:
    limits:
      cpu: 500m
      memory: 500Mi
    requests:
      cpu: 50m
      memory: 50Mi
  ## namespace filtering
  # configMap:
  #   config:
  #     kubernetes:
  #       namespace_allow_list:
  #       - kube-system
  #       - collectors
  #       - opscruise

##### Promgw configs #####
promgw:
  logLevel: "info"
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  # Resources
  resources:
    limits:
      cpu: 500m
      memory: 300Mi
    requests:
      cpu: 50m
      memory: 50Mi

##### Loggw config #####
loggw-loki:
  enabled: true
  logLevel: "INFO"
  config:
    oauthAcceptUnsecureServer: "true"
    jgateway:
      lokiHost: "opscruise-bundle-loki.opscruise.svc.K8S_CLUSTER_FQDN:3100"
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  # Resources
  resources:
    limits:
      cpu: 500m
      memory: 1024Mi
    requests:
      cpu: 50m
      memory: 256Mi

##### Azuregw configs #####
azuregw:
  enabled: false
  logLevel: "INFO"
  azureCredentials:
  - azureauth_clientId: azureauth_clientId
    azureauth_tenantId: azureauth_tenantId
    azureauth_clientSecret: azureauth_clientSecret
    azureauth_subId: azureauth_subId
    name: "credential_name"
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  #Resources
  resources:
    limits:
      cpu: 500m
      memory: 1024Mi
    requests:
      cpu: 50m
      memory: 256Mi

##### Gcpgw configs #####
gcpgw:
  enabled: false
  logLevel: "INFO"
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  #Resources
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
    requests:
      cpu: 50m
      memory: 128Mi

##### Tracegw configs #####
tracegw:
  enabled: false
  logLevel: "INFO"
  persistentType: ebs    
  storageClassName: ""   
  slo_storage_size: 20Gi
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  service:
    type: ClusterIP

  #Resources
  resources:
    limits:
      cpu: 500m
      memory: 3Gi
    requests:
      cpu: 50m
      memory: 128Mi
  # env config
  config:
    tracegw:
      filterTagsKey: notag
      filterTagsValue: notag
      traceDataFromJeager: "true"
      mode: poll #or listen

##### eventgw #####
eventgw:
  enabled: false
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""

##### trace-router #####
trace-router:
  enabled: false
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  service:
    type: ClusterIP
  #Resources
  resources:
    limits:
      cpu: 500m
      memory: 3Gi
    requests:
      cpu: 50m
      memory: 128Mi

##### Node-Exporter-New configs #####
opscruise-node-exporter-new:
  ## Only lowercase accepted
  #logLevel: "info" - Keep the logLevel as info for GKE Autopilot clusters
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
    #   operator: Exists
  priorityClassName: ""
  #args:
    #btfFilePath: path-to-btf-file
    #kconfigFilePath: path-to-kconfig
    #customArgs:
      #- "--collector.ocflowbpfcollector.skip-bpf-verification"
      #- "--collector.ocflowbpfcollector.enable-dns-tracking"
      #- "--no-collector.ocflowbpfcollector.retain-original-public-ip-sources"
  # publicIPAggregationSubnetPatterns:
  #   -   pattern: "172.16.0.86/32"
  #       aggregate_to: "1.2.3.4"
  #       aggregate_name: ""
  #   -   pattern: "1.1.1.1/0"
  #       aggregate_name: ""

##### BEYLA #####
beyla:
  # podLabels:
  #   key: value
  annotations:
  #   #key: value
  # podAnnotations:
  #   key: value
  tolerations:
  #   # - key: node-role.kubernetes.io/master
  #   #   effect: NoSchedule
  #   #   operator: Exists
  priorityClassName: ""
  affinity:

##### OTEL Metric Collector #####
otel-metric-collector:
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
    #   operator: Exists
  priorityClassName: ""

  # additional_receivers_configs:
  #   prometheus:
  #     config:
  #       scrape_configs:
  #       - job_name: new-job-exporter
  #         static_configs:
  #         - targets:
  #           - '172.16.71.143:9256'
  #           - '172.16.73.102:9256'
  #         scheme: http
  #         tls_config:
  #           insecure_skip_verify: true

##### cadvisor configs #####
cadvisor:
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
    #   operator: Exists
  priorityClassName: ""
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  hostNetwork: false
  #Resources
  resources:
    limits:
      cpu: 300m
      memory: 512Mi
    requests:
      cpu: 50m
      memory: 128Mi
  # customArgs:
  #   - --enable_load_reader=false

##### KSM configs #####
kube-state-metrics:
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  #Resources
  resources:
    requests:
      cpu: 50m
      memory: 30Mi
    limits:
      cpu: 300m
      memory: 250Mi

##### prometheus configs #####
prometheus:
  # logLevel: info
  enableIstio: false
  enablePersistent: false
  prometheusEcsDiscovery:
    enabled: false
    awsCredentials:
      regions:
      - us-east-1
      aws_access_key_id: aws_access_key_id
      aws_secret_access_key: aws_secret_access_key
      roleArn: ""

  # scrape_interval: 30
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/<node>
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  #Resources
  resources:
    requests:
      cpu: 50m
      memory: 1000Mi
    limits:
      memory: 5Gi

  nonIstioConfigMap:
    enabledScrapeJobs:
    - oc-kubernetes-pods
    - oc-app-exporters
    - kubernetes-nodes
    - kubernetes-nodes-cadvisor
    - kubernetes-apiservers
    - kube-scheduler
    additionalScrapeConfigs:
    # - job_name: ecs-task-targets
    #   file_sd_configs:
    #   - files: ['/mnt/*.yml']
    #     refresh_interval: 1m
    # - job_name: kubernetes-nodes-cadvisor
    #   scheme: https  
    #   tls_config:
    #     ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    #   bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    #   kubernetes_sd_configs:
    #     - role: node
    #   relabel_configs:
    #     - action: labelmap
    #       regex: __meta_kubernetes_node_label_(.+)
    #     - target_label: __address__
    #       replacement: kubernetes.default.svc:443
    #     - source_labels: [__meta_kubernetes_node_name]
    #       regex: (.+)
    #       target_label: __metrics_path__
    #       replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    #   metric_relabel_configs:
    #     - action: replace
    #       source_labels: [id]
    #       regex: '^/machine.slice/machine-rkt\x2d([^\]+)\.+/([^/]+).service$'
    #       target_label: rkt_container_name
    #       replacement: '${2}-${1}'
    #     - action: replace
    #       source_labels: [id]
    #       regex: '^/system.slice/(.+).service$'
    #       target_label: systemd_service_name
    #       replacement: '${1}'
    #     - action: replace
    #       source_labels: [container]
    #       regex: (.*)
    #       target_label: container_label_io_kubernetes_container_name
    #       replacement: ${1}
    #     - action: replace
    #       source_labels: [pod]
    #       regex: (.*)
    #       target_label: container_label_io_kubernetes_pod_name
    #       replacement: ${1}
    #     - action: replace
    #       source_labels: [namespace]
    #       regex: (.*)
    #       target_label: container_label_io_kubernetes_pod_namespace
    #       replacement: ${1}
    #     - action: replace
    #       source_labels: [id]
    #       regex: '.+?pod([^\.g-z]+?)[\.\/\s](.*)'
    #       target_label: container_label_io_kubernetes_pod_uid
    #       replacement: ${1}

##### Prometheus Postgres Exporter #####
prometheus-postgres-exporter:
  enabled: false
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 50m
      memory: 128Mi
  config:
    datasource:
      # Specify either datasource or datasourceSecret
      host: "<POSTGRESQL_SERVICENAME.NAMESPACE.svc.cluster.local>"
      user: "postgres_exporter"
      # Only one of password and passwordSecret can be specified
      password: "<POSTGRES_EXPORTER_PASSWORD>"
      # Specify passwordSecret if DB password is stored in secret.
      passwordSecret: {}
      #  name: <Secret name>
      #  key: <Password key inside secret>
      database: "<DB_NAME>"
      sslmode: disable
    autoDiscoverDatabases: false
    excludeDatabases: []
    includeDatabases: []
  tolerations:
    # - key: node-role.kubernetes.io/<node>
    #   effect: NoSchedule
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1
##### Prometheus MongoDB Exporter #####
prometheus-mongodb-exporter:
  enabled: false
  mongodb:
    uri: "mongodb://${USERNAME}:${PASSWORD}@<SERVICE_NAME>.<NAMESPACE>.svc.cluster.local"
  existingSecret:
    name: ""
    key: "mongodb-uri"
  resources:
    limits:
      cpu: 250m
      memory: 192Mi
    requests:
      cpu: 50m
      memory: 128Mi
  tolerations:
    # - key: node-role.kubernetes.io/<node>
    #   effect: NoSchedule
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1

##### Kafka Exporter #####
kafka-exporter:
  enabled: false
  args:
    - --kafka.server=<KAFKA_SERVICE>.<NAMESPACE>.svc.cluster.local:<KAFKA_SERVICE_PORT>
    - --zookeeper.server=<ZOOKEEPER_SERVICE>.<NAMESPACE>.svc.cluster.local:<ZOOKEEPER_SERVICE_PORT>
  mutiple_kafka_zookeepers:
  #  - <KAFKA_SERVICE_1>.<NAMESPACE>.svc.cluster.local:<KAFKA_SERVICE_PORT>,<ZOOKEEPER_SERVICE_1>.<NAMESPACE>.svc.cluster.local:<ZOOKEEPER_SERVICE_PORT>
  #  - <KAFKA_SERVICE_2>.<NAMESPACE>.svc.cluster.local:<KAFKA_SERVICE_PORT>,<ZOOKEEPER_SERVICE_2>.<NAMESPACE>.svc.cluster.local:<ZOOKEEPER_SERVICE_PORT>
  tolerations:
    # - key: node-role.kubernetes.io/<node>
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  resources:
    requests:
      cpu: 50m
      memory: 256Mi
    limits:
      cpu: "0.5"
      memory: 256Mi

##### Prometheus MYSQL Exporter #####
prometheus-mysql-exporter:
  enabled: false
  resources:
    limits:
      cpu: 100m
      memory: 200Mi
    requests:
      cpu: 50m
      memory: 128Mi
  mysql:
    db: "<DB_NAME>"
    host: "<MYSQL_SERVICE>.<NAMESPACE>.svc.cluster.local"
    param: ""
    # If "existingPasswordSecret" is specified, "pass" can be ignored
    pass: ""
    port: 3306
    user: "mysql_exporter"
    # If "pass" is specified, "existingPasswordSecret" can be ignored
    existingPasswordSecret:
      name: ""
      key: ""
  tolerations:
    # - key: node-role.kubernetes.io/<node>
    #   effect: NoSchedule
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1

##### InfluxDB Exporter #####
influxdb-exporter:
  enabled: false
  # common_dns: If InfluxDB is running within cluster but different namespace/externally on a VM and is accessible with DNS
  # external_ip: If InfluxDB is running externally on a VM and is accessible with IP address only
  endpoint_type: common_dns
  common_dns_name: ""
  common_dns_port: ""
  external_ip_address: ""
  external_ip_port: ""
  serviceLabels: {}
  annotations: {}

##### Loki-stack Promtail Integration #####
loki-stack:
  promtail:
    tolerations:
    - key: node-role.kubernetes.io/master
      effect: NoSchedule
    resources:
      limits:
        cpu: 200m
        memory: 512Mi
      requests:
        cpu: 50m
        memory: 128Mi
    pipelineStages:
    - docker: {}
    - replace:
        # Example to remove: "2025-09-25T07:18:51.353495749Z stderr F "
        expression: '(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z (?:stdout|stderr) [A-Z] )'
        replace: ''
    - multiline:
        # Combine multiple patterns for detecting the first line of a multiline log:
        # Pattern 1: ISO8601 with milliseconds and Zulu timezone (e.g., 2025-09-25T07:18:51.353Z)
        # Pattern 2: Date + time without T separator (e.g., 2025-09-25 07:18:51)
        # Pattern 3: Date + time with space separator enclosed in square bracket (e.g., [2025-09-25 07:18:51])
        # Pattern 4: Custom prefix starting with "Unexpected character" (example pattern)
        # Pattern 5: Matches log level prefixes like "info: " or "error: "
        # Pattern 6: Identify zero-width space (not visible space) as first line of a multiline block.
        firstline: '(^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z)|(^\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2})|(^\[\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2}\])|(^\w{10} \w{9} \(\?\) \w{2} \w{8} \d{1,2})|(^(?:info|error): )|(^\x{200B}\[)'
        max_wait_time: 3s
#     affinity:
#       nodeAffinity:
#         preferredDuringSchedulingIgnoredDuringExecution:
#         - preference:
#             matchExpressions:
#               - key: opscruise
#                 operator: In
#                 values:
#                   - "true"
#           weight: 1
#     config:
#       client:
#         external_labels:
#           cluster_name: CLUSTER_NAME
    extraVolumes:
    - name: varlog
      hostPath:
        path: /var/log
    extraVolumeMounts:
    - name: varlog
      mountPath: /var/log
      readOnly: true
    extraScrapeConfigs:
    - job_name: kubernetes-nodes-debian
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - replacement: kubelet-logs
        target_label: namespace
      - replacement: /var/log/syslog
        target_label: __path__
      - source_labels: [kubernetes_io_hostname]
        target_label: host
    - job_name: kubernetes-nodes-redhat
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - replacement: kubelet-logs
        target_label: namespace
      - replacement: /var/log/messages
        target_label: __path__
      - source_labels: [kubernetes_io_hostname]
        target_label: host
  loki:
    resources:
      limits:
        cpu: 300m
        memory: 4Gi
      requests:
        cpu: 50m
        memory: 512Mi
#     config:
#       limits_config:
#         reject_old_samples_max_age: 24h
#       table_manager:
#         retention_deletes_enabled:  true
#         retention_period: 24h
#     tolerations:
#     - key: node-role.kubernetes.io/<node>
#       effect: NoSchedule
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
                - key: opscruise
                  operator: In
                  values:
                    - "true"
            weight: 1

##### Prometheus Redis Exporter #####
prometheus-redis-exporter:
  enabled: false
  redisAddress: redis://<REDIS_IP/FQDN>:6379
  tolerations:
    # - key: node-role.kubernetes.io/<node>
    #   effect: NoSchedule
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1

##### Nginx Prometheus Exporter #####
nginx-prometheus-exporter:
  enabled: false
  args:
    - -nginx.scrape-uri=http://NGINX_ENDPOINT:PORT/METRIC_ENDPOINT
  tolerations:
    # - key: node-role.kubernetes.io/<node>
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 20m
      memory: 128Mi

##### prometheus-yace-exporter #####
prometheus-yace-exporter:
  enabled: false
  image:
    repository: ghcr.io/nerdswords/yet-another-cloudwatch-exporter
    tag: v0.61.2
    pullPolicy: IfNotPresent
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 20m
      memory: 128Mi

Global Configuration

The global section is the foundation of the CO South values file. It defines site-wide defaults, such as chart versioning, secret management strategy, image pull credentials, cluster type, metric collection settings, and resource sizing profiles, that are inherited by all modules unless explicitly overridden at the individual module level.

Chart version and secret source

This subsection sets the chart version and the secret source, together with the other cluster-wide defaults: image pull secrets, cluster type flags, scrape settings, and the sizing profile.

global:
  opscruiseChartVersion: TO_BE_DEFINED
  secret_source: "valuesfile"
  imagePullSecrets:
    - name: oc-ns-docker-creds
  gkeAutoPilot: false
  externalCadvisor: false
  nodeExporterPort: 9100
  useGlobalRepository: false
  globalRepositoryName: ""
  k8sClusterFqdn: "cluster.local"
  metricScraper: "prometheus"
  metricScrapeInterval: 60
  machine_type: "small"

Table 62.

Field	Description	Default value
`opscruiseChartVersion`	The version of the OpsCruise Helm chart being deployed.	`TO_BE_DEFINED`
`secret_source`	Defines where the deployment should read secrets from.	`valuesfile`
`imagePullSecrets`	A list of Kubernetes secrets used to authenticate with private container image registries when pulling images.	`- name: oc-ns-docker-creds`
`gkeAutoPilot`	Enables or disables configuration adjustments specific to GKE Autopilot clusters.	`false`
`externalCadvisor`	When set to `true`, the deployment uses an externally managed cAdvisor instance for container metrics instead of deploying its own. Set to `false` to use the built-in cAdvisor.	`false`
`nodeExporterPort`	The port on which the Prometheus Node Exporter listens for host-level metrics.	`9100`
`metricScraper`	Specifies the metrics collection backend.	`"prometheus"`
`metricScrapeInterval`	The interval (in seconds) at which metrics are scraped from targets.	`60`
`useGlobalRepository`	When set to `true`, all charts pull images from the repository specified in `globalRepositoryName` instead of their individually configured repositories. Set to `false` to use per-chart image settings.	`false`
`globalRepositoryName`	The global image repository path used when `useGlobalRepository` is `true`.	`""` (an empty string means no global override is active)
`k8sClusterFqdn`	The fully qualified domain name (FQDN) of the Kubernetes cluster's internal DNS.	`"cluster.local"`
`machine_type`	A sizing profile that controls default resource allocations (CPU, memory) across services. Valid `machine_type` values are `small`, `medium`, and `large`.	`"small"`
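
As an illustration, the following override fragment pulls all images from a single private registry and selects a larger sizing profile. The registry path `registry.example.com/virtana` is a hypothetical placeholder, and the sketch assumes that the remaining global fields keep their defaults.

```yaml
global:
  # Pull every chart image from one repository instead of per-chart settings.
  useGlobalRepository: true
  # Hypothetical registry path; replace it with your own.
  globalRepositoryName: "registry.example.com/virtana"
  # Sizing profile: small, medium, or large.
  machine_type: "medium"
```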

Global AWS credentials

This subsection configures the AWS credentials used by cloud-aware modules to authenticate with AWS services and specify which regions to monitor.

global:
  awsCredentials:
    regions:
      - us-east-1
    aws_access_key_id: aws_access_key_id
    aws_secret_access_key: aws_secret_access_key
    roleArn: ""

Table 63.

Field	Description	Default value
`awsCredentials`	Configuration for authenticating with AWS services.
`awsCredentials.regions`	A list of AWS regions the deployment interacts with.	`us-east-1`
`awsCredentials.aws_access_key_id`	The AWS access key ID used for authentication. Replace with your actual key.	`aws_access_key_id` (placeholder)
`awsCredentials.aws_secret_access_key`	The AWS secret access key paired with the access key ID. Replace with your actual secret.	`aws_secret_access_key` (placeholder)
`awsCredentials.roleArn`	An optional AWS IAM Role ARN to assume for cross-account access or scoped permissions.	`""`
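
A minimal sketch of monitoring two regions through an IAM role follows. The second region, the account ID, and the role name in the ARN are hypothetical. Whether the access key fields are still required alongside `roleArn` depends on your environment, so treat this as illustrative only.

```yaml
global:
  awsCredentials:
    regions:
      - us-east-1
      - eu-west-1   # hypothetical additional region
    aws_access_key_id: aws_access_key_id          # placeholder from the defaults
    aws_secret_access_key: aws_secret_access_key  # placeholder from the defaults
    # Hypothetical role ARN; replace it with the role you want assumed.
    roleArn: "arn:aws:iam::123456789012:role/co-south-readonly"
```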

Global gateway credentials

This subsection provides the backend connectivity and image-registry credentials that all South gateway modules use to authenticate with the Opscruise or Virtana backend (via Keycloak) and to pull container images from the Docker registry.

global:
  gatewayCreds:
    environment:
      DOCKER_SERVER: "https://index.docker.io/v1/"
      DOCKER_USERNAME: "<DOCKER_USERNAME>"
      DOCKER_PASSWORD: "<DOCKER_PASSWORD>"
      DOCKER_EMAIL: "<DOCKER_EMAIL>"
      OPSCRUISE_ENDPOINT: "<OPSCRUISE_BACKEND_KAFKA_ENDPOINT>:443"
      KEYCLOAK_ENABLED: "true"
      KEYCLOAK_URL: "https://auth.opscruise.io:443"
      KEYCLOAK_CLIENT_ID: "<KAFKA_CLIENT_ID>"
      KEYCLOAK_CLIENT_SECRET: "<KEYCLOAK_CLIENT_SECRET>"
      KEYCLOAK_REALM: "<KEYCLOAK_REALM>"
      OPSCRUISE_ACCOUNT_ID: "<KEYCLOAK_CLUSTERID>"

Table 64.

Field	Description	Default value
`DOCKER_SERVER`	Docker registry server URL.	`"https://index.docker.io/v1/"`
`DOCKER_USERNAME`	Docker registry username.	`"<DOCKER_USERNAME>"`
`DOCKER_PASSWORD`	Docker registry password.	`"<DOCKER_PASSWORD>"`
`DOCKER_EMAIL`	Docker registry email.	`"<DOCKER_EMAIL>"`
`OPSCRUISE_ENDPOINT`	Backend Kafka endpoint to which South components send telemetry.	`"<OPSCRUISE_BACKEND_KAFKA_ENDPOINT>:443"`
`KEYCLOAK_ENABLED`	Enables or disables Keycloak-based authentication.	`"true"`
`KEYCLOAK_URL`	Keycloak server URL.	`"https://auth.opscruise.io:443"`
`KEYCLOAK_CLIENT_ID`	Keycloak client ID.	`"<KAFKA_CLIENT_ID>"`
`KEYCLOAK_CLIENT_SECRET`	Keycloak client secret.	`"<KEYCLOAK_CLIENT_SECRET>"`
`KEYCLOAK_REALM`	Keycloak realm name.	`"<KEYCLOAK_REALM>"`
`OPSCRUISE_ACCOUNT_ID`	Opscruise account or cluster identifier.	`"<KEYCLOAK_CLUSTERID>"`

Global namespace filtering and pod label whitelisting

These optional settings let you restrict which namespaces are monitored and control which pod labels CO South collects or forwards. Both settings are commented out by default, so all namespaces are monitored and no pod label whitelist is applied.

global:
  # namespaceFiltering:
  #   namespaceAllowList:
  #     - kube-system
  #     - collectors
  #     - opscruise
  # whitelistedPodLabels:
  #   - app

Table 65.

Field	Description
`namespaceFiltering.namespaceAllowList`	Restricts monitoring to only the listed namespaces.
`whitelistedPodLabels`	Allows specific pod labels to be collected or forwarded.
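
To apply the filters, uncomment the keys and list your own values. In the sketch below, the `payments` namespace and the `team` label key are hypothetical names used only for illustration.

```yaml
global:
  namespaceFiltering:
    namespaceAllowList:
      - kube-system
      - opscruise
      - payments   # hypothetical application namespace
  whitelistedPodLabels:
    - app
    - team         # hypothetical label key
```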

Global tolerations and affinity

These settings control pod scheduling across the cluster. Global tolerations and affinity rules determine which nodes CO South pods can (or prefer to) run on. Separate daemonset variants apply only to daemonset-based modules that need to run on every node, including master or control-plane nodes.

global:
  tolerations:
    - key: opscruise
      effect: NoSchedule
      operator: Exists
  daemonsetTolerations:
    - key: node-role.kubernetes.io/master
      effect: NoSchedule
      operator: Exists
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1
  daemonsetAffinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1

Table 66.

Field	Description
`tolerations`	Global tolerations for non-daemonset modules.
`daemonsetTolerations`	Additional tolerations for daemonsets.
`affinity`	Global node affinity for non-daemonset modules.
`daemonsetAffinity`	Additional node affinity for daemonsets.
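
Newer Kubernetes releases taint control-plane nodes with `node-role.kubernetes.io/control-plane` rather than `node-role.kubernetes.io/master`. If daemonset pods are not being scheduled on those nodes, one option (a sketch, not a confirmed chart default) is to tolerate both taints:

```yaml
global:
  daemonsetTolerations:
    # Taint used by older clusters, as in the default values.
    - key: node-role.kubernetes.io/master
      effect: NoSchedule
      operator: Exists
    # Taint used by newer clusters; added here as an assumption about your environment.
    - key: node-role.kubernetes.io/control-plane
      effect: NoSchedule
      operator: Exists
```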

Global module enable/disable flags

This subsection provides a centralized toggle for every CO South module. Modules enabled by default include node-exporter, KSM, loggw-loki, loki, promgw, k8sgw, promtail, and prometheus. You can override these flags here or via the Helm CLI.

global:
  # awsgw:
  #   enabled: false
  # k8sgw:
  #   enabled: true
  # promgw:
  #   enabled: true
  # loggw-loki:
  #   enabled: true
  # tracegw:
  #   enabled: false  
  # eventgw:
  #   enabled: false
  # trace-router:
  #   enabled: false
  # opscruise-node-exporter:
  #   enabled: false
  # opscruise-node-exporter-new:
  #   enabled: true
  # otel-metric-collector:
  #   enabled: true
  # kube-state-metrics:
  #   enabled: true
  # prometheus:
  #   enabled: true
  # loki-stack:
  #   enabled: true
  # prometheus-yace-exporter:
  #   enabled: false
  # jaeger:
  #   enabled: false
  # jaeger-operator:
  #   enabled: false
  # prometheus-postgres-exporter:
  #   enabled: false
  # prometheus-mongodb-exporter:
  #   enabled: false
  # kafka-exporter:
  #   enabled: false
  # fluent-bit:
  #   enabled: false
  # prometheus-mysql-exporter:
  #   enabled: false
  # influxdb-exporter:
  #   enabled: false
  # x509-certificate-exporter:
  #   enabled: false
  # prometheus-redis-exporter:
  #   enabled: false
  # nginx-prometheus-exporter:
  #   enabled: false
  # beyla:
  #   enabled: false
  # alloy:
  #   enabled: true
  # otel-trace-collector:
  #   enabled: false

Core gateway modules

The Core Gateway Modules serve as the essential data conduits between your monitored environment and the Opscruise/Virtana backend. These modules function as specialized collectors that gather metrics, metadata, logs, events, and traces from diverse sources that include Kubernetes clusters and major cloud providers (AWS, Azure, GCP), standardizing and streaming the data for unified observability and analysis.

AWS gateway

The AWS Gateway collects AWS cloud metrics from the configured AWS regions and forwards them to the Opscruise backend. Enable this module only if your environment includes AWS resources you want to monitor alongside your Kubernetes workloads.

awsgw:
  enabled: false
  logLevel: "info"
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  resources:
    limits:
      cpu: 500m
      memory: 250Mi
    requests:
      cpu: 50m
      memory: 50Mi

Table 67.

Field	Description	Default value
`enabled`	Enables or disables the AWS Gateway module.	`false`
`logLevel`	Log verbosity level.	`"info"`
`labels`	Custom Kubernetes labels applied to awsgw pods.	`{}`
`annotations`	Custom Kubernetes annotations applied to awsgw pods.	`{}`
`tolerations`	Module-specific tolerations.	`[]`
`affinity`	Module-specific node affinity.	`{}`
`priorityClassName`	Kubernetes PriorityClass name for pod scheduling priority.	`""`
`resources`	CPU and memory requests and limits for the pod.

Kubernetes gateway

The Kubernetes Gateway is a core CO South module that discovers and collects Kubernetes cluster metadata, including pods, deployments, services, nodes, and events, and streams it to the Opscruise backend. It supports optional namespace-level filtering to limit which namespaces are monitored.

k8sgw:
  logLevel: "info"
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  resources:
    limits:
      cpu: 500m
      memory: 500Mi
    requests:
      cpu: 50m
      memory: 50Mi
  # configMap:
  #   config:
  #     kubernetes:
  #       namespace_allow_list:
  #         - kube-system
  #         - collectors
  #         - opscruise

Table 68.

Field	Description
`logLevel`	Sets the logging verbosity level for the Kubernetes Gateway. Common values are "info", "debug", "warn", and "error".
`labels`	Custom Kubernetes labels to apply to the k8sgw pod(s). Useful for organizing, filtering, or selecting resources.
`annotations`	Custom Kubernetes annotations to attach to the k8sgw pod(s). Often used for integrations with monitoring tools, ingress controllers, or policy engines.
`tolerations`	A list of Kubernetes tolerations that allow the k8sgw pod(s) to be scheduled on nodes with matching taints.
`affinity`	Kubernetes affinity or anti-affinity rules that control which nodes the k8sgw pod(s) can be scheduled on, based on node labels or other pod locations.
`priorityClassName`	The name of a Kubernetes PriorityClass to assign to the k8sgw pod(s). Determines scheduling and eviction priority relative to other pods.
`resources`	CPU and memory requests and limits for the k8sgw pod(s).
`configMap.config.kubernetes.namespace_allow_list`	Module-level namespace filtering for k8sgw only.
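
The module-level filter uses the commented keys from the block above. A minimal sketch, assuming you want k8sgw to watch only two namespaces:

```yaml
k8sgw:
  configMap:
    config:
      kubernetes:
        # Applies to k8sgw only; other modules are not affected by this list.
        namespace_allow_list:
          - kube-system
          - opscruise
```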

Prometheus gateway

The Prometheus Gateway receives scraped metrics from Prometheus and forwards them to the Opscruise backend over the authenticated Kafka channel. It acts as the bridge between the local Prometheus instance and the Virtana cloud platform.

promgw:
  logLevel: "info"
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  resources:
    limits:
      cpu: 500m
      memory: 300Mi
    requests:
      cpu: 50m
      memory: 50Mi

Table 69.

Field	Description
`logLevel`	Sets the logging verbosity level for the Prometheus Gateway. Common values are "info", "debug", "warn", and "error".
`labels`	Custom Kubernetes labels to apply to the promgw pod(s). Useful for organizing, filtering, or selecting resources.
`annotations`	Custom Kubernetes annotations to attach to the promgw pod(s). Often used for integrations with monitoring tools, ingress controllers, or policy engines.
`tolerations`	A list of Kubernetes tolerations that allow the promgw pod(s) to be scheduled on nodes with matching taints.
`affinity`	Kubernetes affinity or anti-affinity rules that control which nodes the promgw pod(s) can be scheduled on, based on node labels or other pod locations.
`priorityClassName`	The name of a Kubernetes PriorityClass to assign to the promgw pod(s). Determines scheduling and eviction priority relative to other pods.
`resources.limits`	The maximum CPU and memory the promgw container is allowed to consume.
`resources.requests`	The minimum CPU and memory guaranteed to the promgw container.

Log gateway - Loki

The Log Gateway is a Java-based gateway that reads logs from the in-cluster Loki instance and forwards them to the Opscruise backend. It handles OAuth authentication and connects to Loki using the cluster-internal DNS endpoint derived from global.k8sClusterFqdn.

loggw-loki:
  enabled: true
  logLevel: "info"
  config:
    oauthAcceptUnsecureServer: "true"
    jgateway:
      lokiHost: "opscruise-bundle-loki.opscruise.svc.K8S_CLUSTER_FQDN:3100"
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  resources:
    limits:
      cpu: 500m
      memory: 1024Mi
    requests:
      cpu: 50m
      memory: 256Mi

Table 70.

Field	Description	Default value
`enabled`	Enables or disables the Log Gateway.	`true`
`config.oauthAcceptUnsecureServer`	Allows OAuth connections to non-TLS servers.	`"true"`
`config.jgateway.lokiHost`	Internal Loki endpoint used by the log gateway. `K8S_CLUSTER_FQDN` is replaced by `global.k8sClusterFqdn`.	`opscruise-bundle-loki.opscruise.svc.K8S_CLUSTER_FQDN:3100`
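
To make the substitution concrete, the sketch below uses a hypothetical cluster DNS domain, `corp.internal`. The domain is an assumption for illustration, and the comment shows the host name the default `lokiHost` would become.

```yaml
global:
  k8sClusterFqdn: "corp.internal"   # hypothetical cluster DNS domain

loggw-loki:
  config:
    jgateway:
      # K8S_CLUSTER_FQDN is replaced by global.k8sClusterFqdn, giving
      # opscruise-bundle-loki.opscruise.svc.corp.internal:3100
      lokiHost: "opscruise-bundle-loki.opscruise.svc.K8S_CLUSTER_FQDN:3100"
```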

Azure gateway

The Azure Gateway collects Azure cloud metrics from one or more Azure subscriptions and forwards them to the Opscruise backend. It supports multiple credential sets for monitoring resources across different subscriptions or tenants.

azuregw:
  enabled: false
  logLevel: "INFO"
  azureCredentials:
    - azureauth_clientId: azureauth_clientId
      azureauth_tenantId: azureauth_tenantId
      azureauth_clientSecret: azureauth_clientSecret
      azureauth_subId: azureauth_subId
      name: "credential_name"
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  resources:
    limits:
      cpu: 500m
      memory: 1024Mi
    requests:
      cpu: 50m
      memory: 256Mi

Table 71.

Field	Description	Default value
`enabled`	Enables or disables the Azure Gateway.	`false`
`azureCredentials`	List of Azure credential sets.
`azureCredentials[].azureauth_clientId`	Azure AD application (client) ID.
`azureCredentials[].azureauth_tenantId`	Azure AD tenant ID.
`azureCredentials[].azureauth_clientSecret`	Azure AD client secret.
`azureCredentials[].azureauth_subId`	Azure subscription ID.
`azureCredentials[].name`	Readable name for Azure credential set.
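
Because `azureCredentials` is a list, monitoring a second subscription means adding a second entry. The sketch below assumes two subscriptions; every value in angle brackets and both `name` values are hypothetical placeholders.

```yaml
azuregw:
  enabled: true
  azureCredentials:
    # First credential set (placeholder values).
    - azureauth_clientId: "<CLIENT_ID_A>"
      azureauth_tenantId: "<TENANT_ID_A>"
      azureauth_clientSecret: "<CLIENT_SECRET_A>"
      azureauth_subId: "<SUBSCRIPTION_ID_A>"
      name: "prod-subscription"
    # Second credential set for another subscription or tenant.
    - azureauth_clientId: "<CLIENT_ID_B>"
      azureauth_tenantId: "<TENANT_ID_B>"
      azureauth_clientSecret: "<CLIENT_SECRET_B>"
      azureauth_subId: "<SUBSCRIPTION_ID_B>"
      name: "dev-subscription"
```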

GCP gateway

The GCP Gateway collects Google Cloud Platform metrics and forwards them to the Opscruise backend. Enable this module only if your environment includes GCP resources you want to monitor.

gcpgw:
  enabled: false
  logLevel: "INFO"
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
    requests:
      cpu: 50m
      memory: 128Mi

Trace gateway

The Trace Gateway collects distributed tracing data and forwards trace spans to the Opscruise backend for analysis and SLO tracking. It supports persistent storage for SLO data and can operate in either poll (pull) or listen (push) mode.

tracegw:
  enabled: false
  logLevel: "INFO"
  persistentType: ebs
  storageClassName: ""
  slo_storage_size: 20Gi
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  service:
    type: ClusterIP
  resources:
    limits:
      cpu: 500m
      memory: 3Gi
    requests:
      cpu: 50m
      memory: 128Mi
  config:
    tracegw:
      filterTagsKey: notag
      filterTagsValue: notag
      traceDataFromJeager: "true"
      mode: poll

Table 72.

Field	Description	Default value
`enabled`	Enables or disables the Trace Gateway.	`false`
`persistentType`	Persistent volume type. Supported: `ebs`, `hostpath`.	`ebs`
`storageClassName`	Kubernetes StorageClass name. Leave blank for default.	`""`
`slo_storage_size`	Persistent volume size for SLO data.	`20Gi`
`service.type`	Kubernetes Service type.	`ClusterIP`
`config.tracegw.filterTagsKey`	Tag key used for trace filtering.	`notag`
`config.tracegw.filterTagsValue`	Tag value used for trace filtering.	`notag`
`config.tracegw.traceDataFromJeager`	Whether trace data is sourced from Jaeger.	`"true"`
`config.tracegw.mode`	Data retrieval mode. Supported: `poll`, `listen`.	`poll`
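
As a sketch of the non-default options listed in the table, the fragment below enables the Trace Gateway with `hostpath` storage and `listen` (push) mode. It assumes the other fields keep their defaults. Whether `traceDataFromJeager` should also change in `listen` mode is not stated in this guide, so it is left untouched.

```yaml
tracegw:
  enabled: true
  # Use a host path instead of the default ebs volume type.
  persistentType: hostpath
  slo_storage_size: 20Gi
  config:
    tracegw:
      # Receive pushed trace data instead of polling for it.
      mode: listen
```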

Event gateway

The Event Gateway captures Kubernetes cluster events, such as pod scheduling, node conditions, and resource warnings, and forwards them to the Opscruise backend for correlation with metrics and logs.

eventgw:
  enabled: false
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""

Trace router

The Trace Router acts as an intermediary routing layer for distributed traces, directing trace data between trace sources and the Trace Gateway. It is typically used in complex tracing topologies where traces need to be filtered, sampled, or routed before reaching the gateway.

trace-router:
  enabled: false
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  service:
    type: ClusterIP
  resources:
    limits:
      cpu: 500m
      memory: 3Gi
    requests:
      cpu: 50m
      memory: 128Mi

Table 73.

Field	Description
`enabled`	Controls whether the Trace Router module is deployed. Set to true to enable or false to disable.
`labels`	Custom Kubernetes labels to apply to the trace-router pod(s). Useful for organizing, filtering, or selecting resources.
`annotations`	Custom Kubernetes annotations to attach to the trace-router pod(s). Often used for integrations with monitoring tools, ingress controllers, or policy engines.
`tolerations`	A list of Kubernetes tolerations that allow the trace-router pod(s) to be scheduled on nodes with matching taints.
`affinity`	Kubernetes affinity or anti-affinity rules that control which nodes the trace-router pod(s) can be scheduled on, based on node labels or other pod locations.
`priorityClassName`	The name of a Kubernetes PriorityClass to assign to the trace-router pod(s). Determines scheduling and eviction priority relative to other pods.
`service.type`	Specifies the Kubernetes Service type used to expose the Trace Router. `ClusterIP` makes it accessible only within the cluster. Other possible values include `NodePort` and `LoadBalancer`.
`resources.limits`	The maximum CPU and memory the trace-router container is allowed to consume.
`resources.requests`	The minimum CPU and memory guaranteed to the trace-router container.

Metric collection components

The Metric Collection Components are the specialized agents and collectors responsible for gathering raw performance data from your infrastructure. These modules operate at the host, container, and application levels, utilizing technologies like eBPF and OpenTelemetry to capture granular resource usage, network flows, and request-level metrics. They provide the raw data foundation that allows Opscruise to visualize health and troubleshoot performance bottlenecks across the entire stack.

Node Exporter (New)

The new Node Exporter is the recommended replacement for the legacy version. It collects host-level metrics and additionally supports eBPF-based flow collection (network flow visibility, DNS tracking) via custom arguments. It also supports public IP aggregation patterns for network analytics.

opscruise-node-exporter-new:
  labels: {}
  annotations: {}
  tolerations: []
  priorityClassName: ""
  # args:
  #   btfFilePath: path-to-btf-file
  #   kconfigFilePath: path-to-kconfig
  #   customArgs:
  #     - "--collector.ocflowbpfcollector.skip-bpf-verification"
  #     - "--collector.ocflowbpfcollector.enable-dns-tracking"
  # publicIPAggregationSubnetPatterns:
  #   - pattern: "172.16.0.86/32"
  #     aggregate_to: "1.2.3.4"
  #     aggregate_name: ""

Table 74.

Field	Description
`logLevel`	Log verbosity. Keep as `info` for GKE Autopilot.
`args.btfFilePath`	Path to BTF file for eBPF-based collectors.
`args.kconfigFilePath`	Path to kernel config file.
`args.customArgs`	Additional CLI arguments.
`publicIPAggregationSubnetPatterns`	Patterns for aggregating public IPs into representative addresses.
`publicIPAggregationSubnetPatterns[].pattern`	CIDR pattern to match.
`publicIPAggregationSubnetPatterns[].aggregate_to`	IP address to aggregate matched traffic to.
`publicIPAggregationSubnetPatterns[].aggregate_name`	An optional readable name for the aggregated IP.
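
As an illustration of the fields in the table above, the commented-out options in the base values could be enabled as in the following sketch. The custom argument is taken from the base values shown earlier. The CIDR range, target address, and name are placeholders (an RFC 5737 documentation range), not values shipped with the chart.

```yaml
opscruise-node-exporter-new:
  args:
    customArgs:
      # Flag copied from the commented example in the base values.
      - "--collector.ocflowbpfcollector.enable-dns-tracking"
  publicIPAggregationSubnetPatterns:
    # Placeholder values: traffic matching this CIDR is reported
    # under a single representative address and name.
    - pattern: "203.0.113.0/24"
      aggregate_to: "203.0.113.1"
      aggregate_name: "partner-network"
```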

Beyla

Beyla provides automatic, zero-code instrumentation of applications using eBPF. It captures HTTP/gRPC request metrics and traces without requiring any changes to application code or container images, making it ideal for gaining instant observability into services that are not yet manually instrumented.

beyla:
  annotations: {}
  tolerations: []
  priorityClassName: ""
  affinity: {}

Table 75.

Field	Description
`podLabels`	Custom labels applied to Beyla pods.
`annotations`	Custom annotations on the Beyla resource.
`podAnnotations`	Custom annotations applied to Beyla pods.

OTEL metric collector

The OpenTelemetry Metric Collector receives, processes, and exports metrics using the OpenTelemetry protocol. It can be extended with additional Prometheus scrape configurations to collect metrics from custom exporters or services not covered by the default CO South modules.

otel-metric-collector:
  labels: {}
  annotations: {}
  tolerations: []
  priorityClassName: ""
  # additional_receivers_configs:
  #   prometheus:
  #     config:
  #       scrape_configs:
  #         - job_name: new-job-exporter
  #           static_configs:
  #             - targets:
  #                 - '172.16.71.143:9256'
  #           scheme: http
  #           tls_config:
  #             insecure_skip_verify: true

Use additional_receivers_configs to add custom Prometheus scrape jobs to the OTEL collector alongside the default targets.

cAdvisor

cAdvisor runs as a daemonset on every node and collects container-level resource usage and performance metrics (CPU, memory, filesystem, network) for all running containers. It provides the per-container granularity that complements node-level metrics from Node Exporter.

cadvisor:
  labels: {}
  annotations: {}
  tolerations: []
  priorityClassName: ""
  affinity: {}
  hostNetwork: false
  resources:
    limits:
      cpu: 300m
      memory: 512Mi
    requests:
      cpu: 50m
      memory: 128Mi

Table 76.

Field	Description	Default value
`hostNetwork`	Whether cAdvisor pods use the host network namespace.	`false`
`customArgs`	Additional CLI arguments	set to `enable_load_reader=false`
`resources`	CPU and memory requests and limits for the container.

Kube State metrics

Kube State Metrics (KSM) generates Prometheus-format metrics about the state of Kubernetes objects, such as deployments, pods, nodes, jobs, and config maps, by listening to the Kubernetes API server. It provides the "desired vs. actual" state visibility that raw resource metrics alone cannot offer.

kube-state-metrics:
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  resources:
    requests:
      cpu: 50m
      memory: 30Mi
    limits:
      cpu: 300m
      memory: 250Mi

Prometheus

Prometheus is the primary metric scraping engine in CO South. It discovers and scrapes metrics from Kubernetes nodes, pods, cAdvisor, kube-state-metrics, and any custom exporters, then makes them available to the Prometheus Gateway (promgw) for forwarding to the backend. It also supports Istio integration and ECS service discovery for hybrid environments.

prometheus:
  enableIstio: false
  enablePersistent: false
  prometheusEcsDiscovery:
    enabled: false
    awsCredentials:
      regions:
        - us-east-1
      aws_access_key_id: aws_access_key_id
      aws_secret_access_key: aws_secret_access_key
      roleArn: ""
  resources:
    requests:
      cpu: 50m
      memory: 1000Mi
    limits:
      memory: 5Gi
  nonIstioConfigMap:
    enabledScrapeJobs:
      - oc-kubernetes-pods
      - oc-app-exporters
      - kubernetes-nodes
      - kubernetes-nodes-cadvisor
      - kubernetes-apiservers
      - kube-scheduler
    additionalScrapeConfigs: []

Table 77.

Field	Description	Default value
`enableIstio`	Enables or disables Istio service mesh integration for Prometheus.	`false`
`enablePersistent`	Enables persistent storage for Prometheus data.	`false`
`scrape_interval`	Module-level scrape interval.
`prometheusEcsDiscovery.enabled`	Enables ECS service discovery for Prometheus.	`false`
`prometheusEcsDiscovery.awsCredentials`	AWS credentials for ECS discovery.
`nonIstioConfigMap.enabledScrapeJobs`	List of default scrape job names enabled when Istio is not used.
`nonIstioConfigMap.additionalScrapeConfigs`	Custom Prometheus scrape configurations, for example, ECS targets or a custom cAdvisor job.	`[ ]`
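
A custom scrape job might be added as in the sketch below. It assumes that entries under `additionalScrapeConfigs` follow the standard Prometheus `scrape_config` schema. The job name and target address are hypothetical.

```yaml
prometheus:
  nonIstioConfigMap:
    additionalScrapeConfigs:
      # Hypothetical job; the fields are standard Prometheus
      # scrape_config fields.
      - job_name: custom-app-exporter
        metrics_path: /metrics
        scrape_interval: 60s
        static_configs:
          - targets:
              - 'custom-exporter.monitoring.svc.cluster.local:9100'
```

Helm replaces list values rather than merging them, so if you also override `enabledScrapeJobs`, list every job you want to keep.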

Database and middleware exporters

The Database and Middleware Exporters are specialized bridge components designed to extract deep visibility from stateful services and messaging systems. Since databases and middleware often operate as "black boxes" with their own internal telemetry formats, these exporters scrape service-specific data, such as query latency, connection pools, and queue depths, and translate them into a unified Prometheus-compatible format. This allows Opscruise to correlate the health of your data layer directly with the performance of your application services.

Prometheus PostgreSQL exporter

The PostgreSQL Exporter scrapes database-level metrics, such as connections, transactions, locks, replication lag, or table/index statistics, from a PostgreSQL instance and exposes them in Prometheus format. It supports direct password configuration or Kubernetes Secret-based credential management.

prometheus-postgres-exporter:
  enabled: false
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 50m
      memory: 128Mi
  config:
    datasource:
      host: "<POSTGRESQL_SERVICENAME.NAMESPACE.svc.cluster.local>"
      user: "postgres_exporter"
      password: "<POSTGRES_EXPORTER_PASSWORD>"
      passwordSecret: {}
      database: "<DB_NAME>"
      sslmode: disable
    autoDiscoverDatabases: false
    excludeDatabases: []
    includeDatabases: []

Table 78.

Field	Description	Default value
`enabled`	Enables or disables the PostgreSQL exporter.	`false`
`config.datasource.host`	PostgreSQL service DNS or IP.
`config.datasource.user`	Database user for the exporter.	`postgres_exporter`
`config.datasource.password`	Database password. Use only one of `password` or `passwordSecret`.
`config.datasource.passwordSecret.name`	Kubernetes Secret name containing the password.
`config.datasource.passwordSecret.key`	Key inside the Secret.
`config.datasource.database`	Database name to connect to.
`config.datasource.sslmode`	PostgreSQL SSL mode.	`disable`
`config.autoDiscoverDatabases`	Automatically discover and monitor all databases.	`false`
`config.excludeDatabases`	Databases to exclude from auto-discovery.	`[ ]`
`config.includeDatabases`	Databases to include in auto-discovery.	`[ ]`
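
To keep the password out of values.yaml, the Secret-based option could look like the following sketch. It assumes a Secret named `postgres-exporter-credentials` with a `password` key already exists in the namespace where CO South is deployed. Both names are hypothetical.

```yaml
prometheus-postgres-exporter:
  enabled: true
  config:
    datasource:
      host: "<POSTGRESQL_SERVICENAME.NAMESPACE.svc.cluster.local>"
      user: "postgres_exporter"
      # No inline password here; the exporter reads it from the Secret.
      passwordSecret:
        name: postgres-exporter-credentials   # hypothetical Secret name
        key: password                         # hypothetical key in that Secret
      database: "<DB_NAME>"
      sslmode: disable
```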

Prometheus MongoDB exporter

The MongoDB Exporter scrapes database-level metrics from a MongoDB instance and exposes them in Prometheus format. It supports both inline connection URI and Kubernetes Secret-based credential management.

prometheus-mongodb-exporter:
  enabled: false
  mongodb:
    uri: "mongodb://${USERNAME}:${PASSWORD}@<SERVICE_NAME>.<NAMESPACE>.svc.cluster.local"
  existingSecret:
    name: ""
    key: "mongodb-uri"
  resources:
    limits:
      cpu: 250m
      memory: 192Mi
    requests:
      cpu: 50m
      memory: 128Mi

Table 79.

Field	Description	Default value
`enabled`	Enables or disables the MongoDB exporter.	`false`
`mongodb.uri`	MongoDB connection URI. Ignored if `existingSecret` is provided.
`existingSecret.name`	Name of an existing Kubernetes Secret containing the URI.	`" "`
`existingSecret.key`	Key inside the Secret holding the connection URI.	`mongodb-uri`
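
The Secret-based option can be sketched as two pieces: a Kubernetes Secret that holds the connection URI, and a values fragment that points to it. The Secret name `mongodb-exporter-uri` is hypothetical, and the Secret is assumed to live in the namespace where CO South is deployed.

```yaml
# Kubernetes Secret, applied separately from the Helm values.
apiVersion: v1
kind: Secret
metadata:
  name: mongodb-exporter-uri   # hypothetical name
type: Opaque
stringData:
  mongodb-uri: "mongodb://<USERNAME>:<PASSWORD>@<SERVICE_NAME>.<NAMESPACE>.svc.cluster.local"
```

```yaml
# values.yaml fragment referencing the Secret above.
prometheus-mongodb-exporter:
  enabled: true
  existingSecret:
    name: mongodb-exporter-uri
    key: "mongodb-uri"
```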

Kafka exporter

The Kafka Exporter scrapes Kafka broker and consumer group metrics and exposes them in Prometheus format. It connects to both Kafka brokers and ZooKeeper, and supports multi-broker/multi-ZooKeeper configurations.

kafka-exporter:
  enabled: false
  args:
    - --kafka.server=<KAFKA_SERVICE>.<NAMESPACE>.svc.cluster.local:<KAFKA_SERVICE_PORT>
    - --zookeeper.server=<ZOOKEEPER_SERVICE>.<NAMESPACE>.svc.cluster.local:<ZOOKEEPER_SERVICE_PORT>
  mutiple_kafka_zookeepers: []
  resources:
    requests:
      cpu: 50m
      memory: 256Mi
    limits:
      cpu: "0.5"
      memory: 256Mi

Table 80.

Field	Description	Default value
`enabled`	Enables or disables the Kafka exporter.	`false`
`args`	CLI arguments specifying Kafka and ZooKeeper server endpoints.
`mutiple_kafka_zookeepers`	List of additional Kafka or ZooKeeper endpoint pairs for multi-broker setups.	`[ ]`
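
For a single Kafka cluster, a filled-in configuration might look like the sketch below. The service names and the `messaging` namespace are hypothetical. Ports 9092 and 2181 are the conventional Kafka and ZooKeeper defaults and may differ in your environment.

```yaml
kafka-exporter:
  enabled: true
  args:
    # Hypothetical in-cluster service names and namespace.
    - --kafka.server=kafka.messaging.svc.cluster.local:9092
    - --zookeeper.server=zookeeper.messaging.svc.cluster.local:2181
```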

Prometheus MySQL exporter

The MySQL Exporter scrapes database-level metrics from a MySQL instance and exposes them in Prometheus format. It supports both inline password and Kubernetes Secret-based credential management.

prometheus-mysql-exporter:
  enabled: false
  resources:
    limits:
      cpu: 100m
      memory: 200Mi
    requests:
      cpu: 50m
      memory: 128Mi
  mysql:
    db: "<DB_NAME>"
    host: "<MYSQL_SERVICE>.<NAMESPACE>.svc.cluster.local"
    param: ""
    pass: ""
    port: 3306
    user: "mysql_exporter"
    existingPasswordSecret:
      name: ""
      key: ""

Table 81.

Field	Description	Default value
`enabled`	Enables or disables the MySQL exporter.	`false`
`mysql.db`	Database name.
`mysql.host`	MySQL service DNS or IP.
`mysql.param`	Additional DSN parameters.	`" "`
`mysql.pass`	Database password. Ignored if `existingPasswordSecret` is set.	`" "`
`mysql.port`	MySQL port.	`3306`
`mysql.user`	Database user.	`mysql_exporter`
`mysql.existingPasswordSecret.name`	Kubernetes Secret name containing the password.	`" "`
`mysql.existingPasswordSecret.key`	Key inside the Secret.	`" "`
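
A Secret-based setup might look like the following sketch, assuming a Secret named `mysql-exporter-credentials` with a `password` key already exists in the deployment namespace. Both names are hypothetical.

```yaml
prometheus-mysql-exporter:
  enabled: true
  mysql:
    db: "<DB_NAME>"
    host: "<MYSQL_SERVICE>.<NAMESPACE>.svc.cluster.local"
    port: 3306
    user: "mysql_exporter"
    pass: ""                              # left empty; the Secret supplies the password
    existingPasswordSecret:
      name: mysql-exporter-credentials    # hypothetical Secret name
      key: password                       # hypothetical key in that Secret
```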

InfluxDB exporter

The InfluxDB Exporter collects InfluxDB metrics and exposes them in Prometheus format. It supports two connectivity modes: DNS-based access (for in-cluster or DNS-resolvable instances) and direct IP-based access.

influxdb-exporter:
  enabled: false
  endpoint_type: common_dns
  common_dns_name: ""
  common_dns_port: ""
  external_ip_address: ""
  external_ip_port: ""
  serviceLabels: {}
  annotations: {}

Table 82.

Field	Description	Default value
`enabled`	Enables or disables the InfluxDB exporter.	`false`
`endpoint_type`	How to reach InfluxDB. `common_dns` = DNS-accessible; `external_ip` = IP-only access.	`common_dns`
`common_dns_name`	DNS name of the InfluxDB instance. Used when `endpoint_type: common_dns`.	`" "`
`common_dns_port`	Port of the InfluxDB instance. Used when `endpoint_type: common_dns`.	`" "`
`external_ip_address`	IP address. Used when `endpoint_type: external_ip`.	`" "`
`external_ip_port`	Port used when `endpoint_type: external_ip`.	`" "`
`serviceLabels`	Custom labels on the exporter Service.	`{ }`
`annotations`	Custom annotations.	`{ }`
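
The two connectivity modes could be configured as in the sketch below. The DNS name and IP address are placeholders, and 8086 is the conventional InfluxDB HTTP port. Ports are quoted because the base values define them as strings.

```yaml
influxdb-exporter:
  enabled: true
  # Mode 1: InfluxDB is reachable through a DNS name.
  endpoint_type: common_dns
  common_dns_name: "influxdb.data.svc.cluster.local"   # hypothetical name
  common_dns_port: "8086"
  # Mode 2: InfluxDB is reachable only by IP. To use it, set
  # endpoint_type to external_ip and fill in these two fields instead.
  # external_ip_address: "192.0.2.10"                  # placeholder address
  # external_ip_port: "8086"
```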

Loki-stack promtail integration

The Loki Stack deploys Promtail and Loki. Promtail tails container and node logs, applies pipeline stages, such as parsing, multiline merging, or timestamp stripping, and ships them to Loki. The Log Gateway (loggw-loki) then reads from Loki and forwards logs to the Opscruise backend.

Promtail

Promtail is the log shipping agent that runs on every node, discovers pod and node log files, applies configurable pipeline stages, and pushes the processed logs to the in-cluster Loki instance.

loki-stack:
  promtail:
    tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
    resources:
      limits:
        cpu: 200m
        memory: 512Mi
      requests:
        cpu: 50m
        memory: 128Mi
    pipelineStages:
      - docker: {}
      - replace:
          expression: '(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z (?:stdout|stderr) [A-Z] )...'
          replace: ''
      - multiline:
          firstline: '(^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z)|(^\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2})|(^\[\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2}\])|(^\w{10} \w{9} \(\?\) \w{2} \w{8} \d{1,2})|(^(?:info|error): )|(^\x{200B}\[)...'
          max_wait_time: 3s
    extraVolumes:
      - name: varlog
        hostPath:
          path: /var/log
    extraVolumeMounts:
      - name: varlog
        mountPath: /var/log
        readOnly: true
    extraScrapeConfigs:
      - job_name: kubernetes-nodes-debian
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - replacement: kubelet-logs
          target_label: namespace
        - replacement: /var/log/syslog
          target_label: __path__
        - source_labels: [kubernetes_io_hostname]
          target_label: host
      - job_name: kubernetes-nodes-redhat
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - replacement: kubelet-logs
          target_label: namespace
        - replacement: /var/log/messages
          target_label: __path__
        - source_labels: [kubernetes_io_hostname]
          target_label: host

Table 83.

Field	Description	Default value
`tolerations`	Tolerations for Promtail daemonset pods.	`key: node-role.kubernetes.io/master` `effect: NoSchedule`
`resources`	CPU and memory requests and limits for the container.
`pipelineStages`	Promtail log processing pipeline. Includes docker (CRI parsing), replace (strip kubelet prefix timestamps), and multiline (combine multi-line logs).
`pipelineStages[].multiline.firstline`	Regex patterns to detect the first line of a multi-line log entry.
`pipelineStages[].multiline.max_wait_time`	Maximum time to wait for additional lines before flushing.	`3s`
`extraVolumes`	Additional volumes mounted into Promtail pods.	`[{name: varlog, hostPath: {path: /var/log}}]`
`extraVolumeMounts`	Mount points for extra volumes.	`[{name: varlog, mountPath: /var/log, readOnly: true}]`
`extraScrapeConfigs`	Additional Promtail scrape configurations for node-level logs.	Debian + RedHat node log jobs

Loki

Loki is the log aggregation backend that stores and indexes logs shipped by Promtail. It provides the query interface used by the Log Gateway (loggw-loki) to retrieve and forward logs to the Opscruise backend. It supports configurable retention policies and sample age limits.

loki-stack:
  loki:
    resources:
      limits:
        cpu: 300m
        memory: 4Gi
      requests:
        cpu: 50m
        memory: 512Mi
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
                - key: opscruise
                  operator: In
                  values:
                    - "true"
            weight: 1

Table 84.

Field	Description
`config.limits_config.reject_old_samples_max_age`	Maximum age of log samples accepted.
`config.table_manager.retention_deletes_enabled`	Enables automatic deletion of old log data.
`config.table_manager.retention_period`	Duration to retain log data.
`affinity`	Node affinity for Loki pods.

Prometheus Redis exporter

The Redis Exporter scrapes Redis server metrics, such as memory usage, connected clients, commands processed, keyspace statistics, and replication info, and exposes them in Prometheus format for monitoring Redis instances alongside your Kubernetes workloads.

prometheus-redis-exporter:
  enabled: false
  redisAddress: redis://<REDIS_IP/FQDN>:6379

Table 85.

Field	Description
`enabled`	Enables or disables the Redis exporter.
`redisAddress`	Redis connection URI.

Nginx Prometheus exporter

The Nginx Exporter scrapes Nginx server metrics from the Nginx stub_status or metrics endpoint and exposes them in Prometheus format for monitoring Nginx instances running in or alongside your cluster.

nginx-prometheus-exporter:
  enabled: false
  args:
    - -nginx.scrape-uri=http://NGINX_ENDPOINT:PORT/METRIC_ENDPOINT
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 20m
      memory: 128Mi

Table 86.

Field	Description
`enabled`	Enables or disables the Nginx exporter.
`args`	CLI arguments. `-nginx.scrape-uri` specifies the Nginx stub_status or metrics endpoint URL.
`resources`	CPU and memory requests and limits for the container.
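
A filled-in scrape URI might look like the sketch below. The service name, namespace, port, and the `/stub_status` path are hypothetical. Use the location where your Nginx configuration exposes `stub_status`.

```yaml
nginx-prometheus-exporter:
  enabled: true
  args:
    # Hypothetical in-cluster Nginx endpoint.
    - -nginx.scrape-uri=http://nginx.web.svc.cluster.local:8080/stub_status
```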

YACE - AWS CloudWatch exporter

YACE (Yet Another CloudWatch Exporter) scrapes AWS CloudWatch metrics and exposes them in Prometheus format. It enables monitoring of AWS-managed services (for example, RDS, ELB, Lambda, and SQS) that do not run inside the Kubernetes cluster but are part of the overall application infrastructure.

prometheus-yace-exporter:
  enabled: false
  image:
    repository: ghcr.io/nerdswords/yet-another-cloudwatch-exporter
    tag: v0.61.2
    pullPolicy: IfNotPresent
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 20m
      memory: 128Mi

Table 87.

Field	Description	Default value
`enabled`	Enables or disables the YACE CloudWatch exporter.	`false`
`image.repository`	Container image repository.	`ghcr.io/nerdswords/yet-another-cloudwatch-exporter`
`image.tag`	Container image tag/version.	`v0.61.2`
`image.pullPolicy`	Kubernetes image pull policy.	`IfNotPresent`
`resources`	CPU and memory requests and limits for the container.
