Container Observability – South Deployment Guide

In this deployment, you install the Virtana Container Observability (CO) cluster components into your Kubernetes or OpenShift cluster. These components continuously collect metrics, logs, and Kubernetes metadata and securely forward that telemetry to the Virtana backend. You need a South deployment to enable end-to-end observability for a specific cluster. Without it, the platform cannot discover workloads, export metrics, or collect node and container signals, so dashboards, health status, alerting, and troubleshooting views in the UI are incomplete or unavailable for that cluster.

Prerequisites

Ensure the following requirements are met before starting the deployment or configuration process:

  • Cluster-admin access to the target Kubernetes cluster.

  • Credentials for Docker Hub (or a private registry) and for the Keycloak client used by the CO backend.
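
The prerequisites above can be checked from a terminal before you start. The following is a minimal sketch, not part of the official procedure: the function names require_cmd and preflight are hypothetical, and the list of tools (kubectl and helm) is an assumption based on the commands used later in this guide. The kubectl auth can-i query asks the API server whether the current user may perform any verb on any resource, which approximates cluster-admin access.

```shell
# require_cmd NAME: succeeds if NAME is an executable on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# preflight: report missing tools, then ask the API server whether the
# current user may perform any verb on any resource in all namespaces.
preflight() {
  missing=0
  for tool in kubectl helm; do
    if ! require_cmd "$tool"; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ] || return 1
  kubectl auth can-i '*' '*' --all-namespaces
}

# Usage (not run automatically):
#   preflight && echo "ready to deploy"
```

On OpenShift you would probably add oc to the tool list, since the SCC commands later in this guide depend on it.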

Get South values.yaml

The South values.yaml file contains the tenant-specific and cluster-specific configuration needed to deploy CO South correctly. It includes org identifiers, backend endpoints, and any pre-configured module defaults expected for your environment. This ensures that the deployment connects the South components to the correct Virtana tenant and applies the right settings for your selected cluster.

Perform the following steps to get the South values.yaml file:

  1. Open the following URL, https://GLOBAL_VIEW_HOSTNAME/ui.

  2. Log in to Virtana Platform using your org email and password.

  3. Navigate to Container Observability > Cluster.

  4. In the top right of the CO default page, click System Status and select South Deployment Guide.

  5. Click Generate Token to Download YAML to download the South values.yaml file.

  6. Copy the URL generated and run it on your machine to download the YAML file.

  7. Run the commands provided under Deploy Opscruise, or use the following commands.

    helm repo add virtana-repo https://virtana.gitlab.io/helm-charts
    helm repo update
    helm search repo virtana-repo/virtana-co
  8. Save the downloaded file as <ORG_ID>-<CLUSTER_NAME>-virtana-co-values.yaml, the file name that the Helm deployment command expects.

Deploy the South components directly from your terminal using native Helm command-line tools.

helm upgrade --install opscruise-bundle virtana-repo/virtana-co --namespace opscruise \
  --create-namespace -f <ORG_ID>-<CLUSTER_NAME>-virtana-co-values.yaml \
  --version <LATEST_VERSION>
Table 55.

| Field | Description |
| --- | --- |
| --namespace opscruise | Target namespace for all components. |
| --create-namespace | Creates the namespace if absent. |
| --version <LATEST_VERSION> | Specific chart version to deploy. |
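
The Helm command above can be wrapped so that a missing values file is caught before Helm runs. This is a sketch under stated assumptions: the function names values_file and deploy_south are hypothetical, and the release name, chart, and namespace are copied from the command above. The helm status and kubectl get pods calls are ordinary follow-up checks, not a step the guide prescribes.

```shell
# values_file ORG_ID CLUSTER_NAME: print the values file name used by the helm command.
values_file() {
  printf '%s-%s-virtana-co-values.yaml\n' "$1" "$2"
}

# deploy_south ORG_ID CLUSTER_NAME CHART_VERSION
# Fails early if the values file is absent, then installs or upgrades the
# release and lists the release status and pods for a quick visual check.
deploy_south() {
  file=$(values_file "$1" "$2")
  [ -f "$file" ] || { echo "values file not found: $file" >&2; return 1; }
  helm upgrade --install opscruise-bundle virtana-repo/virtana-co \
    --namespace opscruise --create-namespace \
    -f "$file" --version "$3" &&
  helm status opscruise-bundle --namespace opscruise &&
  kubectl get pods --namespace opscruise
}

# Usage (placeholders as in the guide):
#   deploy_south <ORG_ID> <CLUSTER_NAME> <LATEST_VERSION>
```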



Create an Argo CD Application to manage the Helm chart declaratively.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: virtana-CLUSTER_NAME-south
  namespace: argocd
  finalizers:
  - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  destination:
    server: https://kubernetes.default.svc   
    namespace: opscruise
  source:
    chart: virtana-co
    repoURL: "https://virtana.gitlab.io/helm-charts"
    targetRevision: <LATEST_VERSION>
    helm:
      releaseName: opscruise-bundle
      valueFiles:
      - values.yaml
      values: |
        global:
          gatewayCreds:
            environment:
              DOCKER_SERVER: "https://index.docker.io/v2/"
              DOCKER_USERNAME: "xxxxx"
              DOCKER_PASSWORD: "xxxxx"
              OPSCRUISE_ENDPOINT: "xxxxxxxxxxxxx-xxxxxxxxxxxxx.elb.us-east-2.amazonaws.com:443"
              KEYCLOAK_ENABLED: "true"
              KEYCLOAK_URL: "https://xxxxxx.example.com:443"
              KEYCLOAK_CLIENT_ID: "xxxxxx"
              KEYCLOAK_CLIENT_SECRET: "xxxxxx"
              KEYCLOAK_REALM: "xxxxxx"
              OPSCRUISE_ACCOUNT_ID: "xxxxxx"
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true

The following table describes each field in the configuration file.

Table 56.

| Field | Description | Default value |
| --- | --- | --- |
| apiVersion | The Kubernetes API version for the Argo CD Application Custom Resource. | argoproj.io/v1alpha1 |
| metadata.name | The name of the Argo CD Application object. | virtana-CLUSTER_NAME-south |
| metadata.namespace | The namespace where Argo CD is installed. | argocd |
| metadata.finalizers | Prevents the Application from being deleted until Argo CD has cleaned up the resources it created. | resources-finalizer.argocd.argoproj.io |
| spec.project | The Argo CD Project this Application belongs to. | default |
| spec.destination.server | The Kubernetes API server address for the destination cluster. | https://kubernetes.default.svc |
| spec.source.chart | The Helm chart name to install. | virtana-co |
| spec.source.repoURL | The Helm repository URL hosting the chart. | https://virtana.gitlab.io/helm-charts |
| spec.source.targetRevision | The chart version Argo CD should deploy. | <LATEST_VERSION> |
| spec.source.helm | Helm-specific settings under Argo CD. | - |
| spec.syncPolicy.automated | Enables automatic sync, so no manual sync action is needed. | - |
| spec.syncPolicy.syncOptions | Tells Argo CD to create the destination namespace if it doesn’t exist. | CreateNamespace=true |
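
The Application manifest above contains the placeholders CLUSTER_NAME and <LATEST_VERSION>. The following sketch shows one way to fill them in before applying the manifest. The function name render_placeholders, the file name application.yaml, and the sample values prod-east and 1.2.3 are all hypothetical. The sed expressions assume the substituted values contain no slash characters.

```shell
# render_placeholders CLUSTER_NAME CHART_VERSION
# Filters a manifest on stdin, replacing the two placeholders used in the
# Application example, and writes the result to stdout.
render_placeholders() {
  sed -e "s/CLUSTER_NAME/$1/g" -e "s/<LATEST_VERSION>/$2/g"
}

# Usage (requires kubectl access to the cluster where Argo CD runs):
#   render_placeholders prod-east 1.2.3 < application.yaml | kubectl apply -f -
#   kubectl get applications.argoproj.io virtana-prod-east-south -n argocd
```

Rendering to stdout first makes it easy to review the final manifest before piping it to kubectl.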



You can also integrate the deployment into your Infrastructure as Code pipelines by using the Terraform Helm provider.

resource "helm_release" "south" {
  create_namespace = true
  chart            = "virtana-co"
  name             = "opscruise-bundle"
  namespace        = "opscruise"
  repository       = "https://virtana.gitlab.io/helm-charts"
  version          = var.helm_version
  values = [
    templatefile("${path.module}/../values/opscruise-values.yaml", {
      docker_password          = var.docker_password
      docker_username          = var.docker_username
      keycloak_hostname        = var.keycloak_hostname
      opscruise_kafka_endpoint = var.opscruise_kafka_endpoint
      kafka_client_id          = var.kafka_client_id
      kafka_client_secret      = var.kafka_client_secret
      tenant_name              = var.tenant_name
      cluster_name             = var.cluster_name
    })
  ]
}

The following table describes each field in the configuration file.

Table 57.

| Field | Description | Default value |
| --- | --- | --- |
| create_namespace | Create opscruise namespace automatically. | true |
| chart/repository/version | Chart identity and version to install. | - |
| name/namespace | Helm release name and namespace. | - |
| values | Rendered values file using Terraform templatefile with variables for credentials and cluster metadata. | - |
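
The helm_release example reads its inputs from Terraform variables (var.docker_username and so on). Terraform accepts variable values from environment variables named TF_VAR_<name>, so one possible way to supply them without writing credentials to a file is sketched below. The function names set_tf_var and load_south_tf_vars are hypothetical, the values are placeholders, and the sketch assumes each variable is declared in your Terraform configuration.

```shell
# set_tf_var NAME VALUE: export VALUE as TF_VAR_NAME so Terraform can read var.NAME.
set_tf_var() {
  export "TF_VAR_$1=$2"
}

# load_south_tf_vars: set every variable referenced by the helm_release example.
# Replace the placeholder values with your own before use.
load_south_tf_vars() {
  set_tf_var helm_version "<LATEST_VERSION>"
  set_tf_var docker_username "xxxx"
  set_tf_var docker_password "xxxx"
  set_tf_var keycloak_hostname "xxxx"
  set_tf_var opscruise_kafka_endpoint "xxxx"
  set_tf_var kafka_client_id "xxxx"
  set_tf_var kafka_client_secret "xxxx"
  set_tf_var tenant_name "xxxx"
  set_tf_var cluster_name "xxxx"
}

# Usage (not run automatically):
#   load_south_tf_vars && terraform init && terraform plan
```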



Optional settings

Use the following optional settings to pull images from a private registry or to set the resource profile for the South modules.

Using a private image registry

Add the following settings to <ORG_ID>-<CLUSTER_NAME>-virtana-co-values.yaml to pull images from a private image registry.

global:
  useGlobalRepository: true
  globalRepositoryName: example.io

The following table describes each field in the configuration file.

Table 58.

| Field | Description | Default value |
| --- | --- | --- |
| global.useGlobalRepository | Enables a global override of the container image registry for supported components. Set to true to use a private registry. | false |
| global.globalRepositoryName | Fully qualified registry/repository path, for example example.io. | "" |



Update resource profile

Virtana CO South supports resource profiles to simplify sizing across modules. Instead of manually configuring CPU or memory for every component, you can select a machine_type profile, such as small, medium, or large, and the chart applies the corresponding default resource limits.

Global-level resource profile

You can use this when you want a single sizing profile to apply across CO South components, unless a module overrides it.

global:
  machine_type: "small"   

The global.machine_type setting selects the default sizing profile used by modules that do not define their own machine_type. Valid values are small, medium, and large.

Module-level override

Use a module-level override when a particular component needs different resources from those the global profile provides. To change the machine_type for a specific module, add it to that module's section.

loggw-loki:
  machine_type: "large"

The loggw-loki.machine_type setting overrides global.machine_type for that module. Valid values are small, medium, and large.
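
The precedence between the global and module-level settings can be expressed as a small function. This sketch only illustrates the rule described above (a module's machine_type wins when it is set, otherwise the global value applies). It is not how the chart itself is implemented, and the function name effective_machine_type is hypothetical.

```shell
# effective_machine_type GLOBAL_TYPE MODULE_TYPE
# Prints the profile a module ends up with: the module-level value when it is
# non-empty, otherwise the global value.
effective_machine_type() {
  if [ -n "$2" ]; then
    printf '%s\n' "$2"
  else
    printf '%s\n' "$1"
  fi
}

# Usage:
#   effective_machine_type small large   # module override set
#   effective_machine_type small ""      # no override, global applies
```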

Custom resource values for a module

When the predefined resource profiles (small/medium/large) do not meet the requirements of a specific module, you can define custom resource values. To do this, set machine_type for the module and define the resources fields under the corresponding machine type section (small_machine, medium_machine, or large_machine). Ensure that the value of machine_type matches the machine type section where the custom resources are defined.

loggw-loki:
    machine_type: "large"
    large_machine:
        resources:
            requests:
                memory: 2Gi
                cpu: 1
            limits:
                memory: 4Gi
                cpu: 2
        xss: -Xss4m
        xms: -Xms1500m
        xmx: -Xmx3500m

The following table describes each field in the configuration file.

Table 59.

| Field | Description | Default value |
| --- | --- | --- |
| loggw-loki.machine_type | Chooses which profile is active for this module. | small/medium/large |
| loggw-loki.large_machine | A profile-specific override block for the large profile. | - |
| resources | Kubernetes resource configuration to apply to the module’s pods. | - |
| resources.requests | Guaranteed minimum resources the scheduler uses for placement. requests.cpu: minimum CPU. requests.memory: minimum memory. | - |
| resources.limits | The maximum resources the container is allowed to consume. limits.cpu: maximum CPU. limits.memory: maximum memory. | - |
| xss, xms, xmx | JVM tuning options, such as thread stack size, initial heap, and max heap. These apply only to Java gateways such as loggw, tracegw, gcpgw, and azuregw. | xss: -Xss4m, xms: -Xms1500m, xmx: -Xmx3500m |



OpenShift deployment specifics

OpenShift-specific deployment steps are only required if your Kubernetes platform is Red Hat OpenShift, because OpenShift enforces additional security and runtime constraints compared to upstream Kubernetes. In particular, OpenShift uses Security Context Constraints (SCC) that can prevent CO South components from running with the permissions they need unless the correct SCCs are granted to their service accounts.

Follow the steps below to prepare the cluster before running the normal South deployment flow.

  1. Download the virtana-co-values.yaml generated by the Virtana UI. This file includes cluster-specific settings required for the deployment.

  2. Update the node-exporter port to 9200 instead of the default 9100 in virtana-co-values.yaml.

    global:
      nodeExporterPort: 9200

    The global.nodeExporterPort setting defines the port used by Node Exporter.

  3. Run the following command to create the South namespace if it does not already exist. CO South components are deployed in the opscruise namespace.

    kubectl create ns opscruise
  4. Grant required SCC permissions to the CO South service accounts. OpenShift uses SCCs to control privileges.

    The following commands bind SCCs to the service accounts used by South modules so pods can run correctly.

    oc adm policy add-scc-to-user anyuid -z k8sgw-service-account -n opscruise
    oc adm policy add-scc-to-user anyuid -z loggw-service-account -n opscruise
    oc adm policy add-scc-to-user anyuid -z promgw-service-account -n opscruise
    oc adm policy add-scc-to-user anyuid -z prometheus-service-account -n opscruise
    oc adm policy add-scc-to-user privileged -z loggw-service-account -n opscruise
    oc adm policy add-scc-to-user privileged -z prometheus-service-account -n opscruise
    oc adm policy add-scc-to-user privileged -z ne-service-account -n opscruise
    oc adm policy add-scc-to-user privileged -z opscruise-bundle-loki -n opscruise
    oc adm policy add-scc-to-user privileged -z opscruise-bundle-promtail -n opscruise
  5. After completing the OpenShift-specific prerequisite steps above, deploy CO South using your preferred method, such as Helm CLI, Argo CD, or Terraform.
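
The nine SCC bindings in step 4 follow a regular pattern, so they can be generated from two lists. The sketch below prints the same commands as step 4 for review rather than running them. The function name scc_commands is hypothetical, and the namespace defaults to opscruise as in the steps above.

```shell
# scc_commands [NAMESPACE]
# Prints one "oc adm policy add-scc-to-user" command per SCC and service
# account pair listed in step 4. Nothing is executed.
scc_commands() {
  ns=${1:-opscruise}
  for sa in k8sgw-service-account loggw-service-account \
            promgw-service-account prometheus-service-account; do
    echo "oc adm policy add-scc-to-user anyuid -z $sa -n $ns"
  done
  for sa in loggw-service-account prometheus-service-account \
            ne-service-account opscruise-bundle-loki opscruise-bundle-promtail; do
    echo "oc adm policy add-scc-to-user privileged -z $sa -n $ns"
  done
}

# Usage:
#   scc_commands opscruise        # review the commands
#   scc_commands opscruise | sh   # apply them (requires oc and sufficient rights)
```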

Create secrets manually

The Virtana CO South components require credentials, such as the Keycloak client credentials and Docker registry credentials, to authenticate to the Opscruise backend and pull images. If these secrets are not provided through the values file, they must already exist in the target namespace with the expected names.

Create the secrets manually before deploying South when your organization’s security policy requires secrets to be managed outside Helm, for example through Vault, Fuze, or a dedicated secrets pipeline.

Create a Keycloak client secret oc-kc-secret

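Create the Keycloak client secret oc-kc-secret manually when secret_source is set to none. The secret contains KEYCLOAK_CLIENT_SECRET, KEYCLOAK_CLIENT_ID, and KEYCLOAK_CLIENT_TOKEN. The following commands use base64 -w 0, the GNU coreutils (Linux) option that disables line wrapping.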

export KEYCLOAK_CLIENT_ID="xxxx"
export KEYCLOAK_CLIENT_SECRET="xxxx"

kubectl create secret generic oc-kc-secret \
  --from-literal=KEYCLOAK_CLIENT_SECRET="${KEYCLOAK_CLIENT_SECRET}" \
  --from-literal=KEYCLOAK_CLIENT_ID="${KEYCLOAK_CLIENT_ID}" \
  --from-literal=KEYCLOAK_CLIENT_TOKEN="Basic $(echo -n "${KEYCLOAK_CLIENT_ID}:${KEYCLOAK_CLIENT_SECRET}" | base64 -w 0)" \
  -n opscruise

cat <<EOF
{
  "KEYCLOAK_CLIENT_SECRET": "${KEYCLOAK_CLIENT_SECRET}",
  "KEYCLOAK_CLIENT_ID": "${KEYCLOAK_CLIENT_ID}",
  "KEYCLOAK_CLIENT_TOKEN": "Basic $(echo -n "${KEYCLOAK_CLIENT_ID}:${KEYCLOAK_CLIENT_SECRET}" | base64 -w 0)"
}
EOF

On macOS, base64 uses -b 0 instead of -w 0 to disable line wrapping, so use the following variant:

export KEYCLOAK_CLIENT_ID="xxxx"
export KEYCLOAK_CLIENT_SECRET="xxxx"

kubectl create secret generic oc-kc-secret \
  --from-literal=KEYCLOAK_CLIENT_SECRET="${KEYCLOAK_CLIENT_SECRET}" \
  --from-literal=KEYCLOAK_CLIENT_ID="${KEYCLOAK_CLIENT_ID}" \
  --from-literal=KEYCLOAK_CLIENT_TOKEN="Basic $(echo -n "${KEYCLOAK_CLIENT_ID}:${KEYCLOAK_CLIENT_SECRET}" | base64 -b 0)" \
  -n opscruise

cat <<EOF
{
  "KEYCLOAK_CLIENT_SECRET": "${KEYCLOAK_CLIENT_SECRET}",
  "KEYCLOAK_CLIENT_ID": "${KEYCLOAK_CLIENT_ID}",
  "KEYCLOAK_CLIENT_TOKEN": "Basic $(echo -n "${KEYCLOAK_CLIENT_ID}:${KEYCLOAK_CLIENT_SECRET}" | base64 -b 0)"
}
EOF
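
KEYCLOAK_CLIENT_TOKEN is an HTTP Basic credential: the word Basic followed by the base64 encoding of client ID, a colon, and the client secret. The sketch below builds that value without relying on the -w or -b options, by stripping newlines with tr instead. The function name keycloak_basic_token is hypothetical, and the sample ID and secret are illustrative only.

```shell
# keycloak_basic_token CLIENT_ID CLIENT_SECRET
# Prints "Basic <base64(id:secret)>" on one line. Newlines that some base64
# implementations insert when wrapping long output are removed with tr.
keycloak_basic_token() {
  printf 'Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')"
}

# Usage:
#   keycloak_basic_token user pass
#   # prints: Basic dXNlcjpwYXNz
```

Using printf instead of echo -n avoids differences in how shells treat the -n flag.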

Create Docker registry credentials

Run the following commands to create the secret directly in Kubernetes.

export DOCKER_USERNAME="xxxx"
export DOCKER_PASSWORD="xxxx"

kubectl create secret docker-registry oc-ns-docker-creds \
  --docker-server=https://index.docker.io/v2/ \
  --docker-username="${DOCKER_USERNAME}" \
  --docker-password="${DOCKER_PASSWORD}" \
  -n opscruise

Run the following commands to generate the JSON used to create the secret in Fuze or Vault.

registry_server="https://index.docker.io/v2/"
registry_username="xxxx"
registry_password="xxxx"

# Quote the credentials and strip newlines so long values are not split or wrapped.
encoded_registry_auth=$(echo -n "${registry_username}:${registry_password}" | base64 | tr -d '\n')
DOCKER_CONFIG=$(echo -n "{\"auths\": {\"${registry_server}\": {\"auth\": \"${encoded_registry_auth}\"}}}")

cat <<EOF
{
  "dockerconfigjson": "${DOCKER_CONFIG}"
}
EOF
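
The Docker config document built above has a fixed shape: an auths map keyed by registry server, whose auth field is the base64 encoding of username, a colon, and the password. The following sketch produces the same structure from three arguments. The function name docker_config_json is hypothetical, the sample credentials are illustrative, and the sketch assumes the inputs contain no characters that need JSON escaping.

```shell
# docker_config_json SERVER USERNAME PASSWORD
# Prints the Docker config JSON with the auth field set to base64(user:pass).
docker_config_json() {
  auth=$(printf '%s:%s' "$2" "$3" | base64 | tr -d '\n')
  printf '{"auths": {"%s": {"auth": "%s"}}}\n' "$1" "$auth"
}

# Usage:
#   docker_config_json "https://index.docker.io/v2/" u p
#   # prints: {"auths": {"https://index.docker.io/v2/": {"auth": "dTpw"}}}
#
# To inspect the secret created with kubectl (requires cluster access):
#   kubectl get secret oc-ns-docker-creds -n opscruise \
#     -o jsonpath='{.data.\.dockerconfigjson}' | base64 --decode
```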

Environment-specific settings

This section covers three optional configurations for CO South deployments: enabling TLS and Basic Authentication for Prometheus, deploying the Zenoss Kubernetes Agent, and enabling GKE Autopilot compatibility. Apply only the subsections that match your environment by updating your virtana-co-values.yaml.

Prometheus: Enable TLS and Basic Authentication

Use this configuration when you want Prometheus endpoints protected using HTTPS (TLS) and Basic Authentication.

prometheus:
  security:
    authentication:
      enabled: true         
      password: prom123     
    tls:
      enabled: true            
      selfSignedCerts: true    
      hostname: ""             

The following table describes each field in the configuration file.

Table 60.

| Field | Description | Default value |
| --- | --- | --- |
| prometheus | Configuration section for the Prometheus component deployed as part of CO South. | - |
| prometheus.security.authentication.enabled | Enables or disables Basic Authentication. | false |
| prometheus.security.authentication.password | Password used for Basic Auth. | prom123 |
| prometheus.security.tls.enabled | Enables or disables TLS (HTTPS). | false |
| prometheus.security.tls.selfSignedCerts | When true, Prometheus uses self-signed certificates. | true |
| prometheus.security.tls.hostname | Optional DNS name for Prometheus. | "" |



Deploy the Zenoss Kubernetes agent

Enable this when you want CO South to deploy the Zenoss Kubernetes Agent and connect it to your Zenoss environment.

zenoss-agent-kubernetes:
  enabled: true        
  zenoss:
    clusterName: ""
    address: ""
    apiKey: ""

The following table describes each field in the configuration file.

Table 61.

| Field | Description | Default value |
| --- | --- | --- |
| zenoss-agent-kubernetes.enabled | Turns the Zenoss agent deployment on or off. | false |
| zenoss-agent-kubernetes.zenoss.clusterName | Cluster identifier or name as it should appear in Zenoss. | "" |
| zenoss-agent-kubernetes.zenoss.address | Zenoss endpoint or address used by the agent to communicate. | "" |
| zenoss-agent-kubernetes.zenoss.apiKey | API key or token used to authenticate to Zenoss. | "" |



GKE autopilot support

Enable this only when your CO South target cluster is a GKE Autopilot cluster rather than a standard GKE cluster. The default value of gkeAutoPilot is false.

global:
  gkeAutoPilot: false

With the default value of false, the chart applies standard GKE behavior. Set the value to true to apply Autopilot-compatible behavior.

Container Observability South base values file for reference

This section provides a curated snapshot of the most important virtana-co-values.yaml settings used to deploy and operate CO South, focusing on the parameters you most commonly review or tune during installation and troubleshooting. Use it as a quick reference to understand how global credentials, registry settings, backend connectivity, scheduling controls (tolerations and affinity), and key module configurations, such as k8sgw, promgw, and Prometheus, fit together. Adjust only what your environment requires and keep the UI-generated tenant and cluster values intact.

##### Opscruise credentials #####
global:

  opscruiseChartVersion: TO_BE_DEFINED
  secret_source: "valuesfile"
  imagePullSecrets:
    - name: oc-ns-docker-creds

  ##### AWS Credentials #####
  awsCredentials:
    regions:
    - us-east-1
    aws_access_key_id: aws_access_key_id
    aws_secret_access_key: aws_secret_access_key

    roleArn: ""

  gkeAutoPilot: false

  ##### gateway Credentials #####
  gatewayCreds:
    environment:
      DOCKER_SERVER: "https://index.docker.io/v1/"
      DOCKER_USERNAME: "<DOCKER_USERNAME>"
      DOCKER_PASSWORD: "<DOCKER_PASSWORD>"
      DOCKER_EMAIL: "<DOCKER_EMAIL>"
      OPSCRUISE_ENDPOINT: "<OPSCRUISE_BACKEND_KAFKA_ENDPOINT>:443"
      KEYCLOAK_ENABLED: "true"
      KEYCLOAK_URL: "https://auth.opscruise.io:443"
      KEYCLOAK_CLIENT_ID: "<KAFKA_CLIENT_ID>"
      KEYCLOAK_CLIENT_SECRET: "<KEYCLOAK_CLIENT_SECRET>"
      KEYCLOAK_REALM: "<KEYCLOAK_REALM>"
      OPSCRUISE_ACCOUNT_ID: "<KEYCLOAK_CLUSTERID>"

  externalCadvisor: false

  nodeExporterPort: 9100

  useGlobalRepository: false
  globalRepositoryName: ""       

  k8sClusterFqdn: "cluster.local"

  metricScraper: "prometheus"

  metricScrapeInterval: 60

  machine_type: "small"

  ## namespace filtering ##
  # namespaceFiltering:
  #   namespaceAllowList:
  #   - kube-system
  #   - collectors
  #   - opscruise

  ## allow whitelisted pod labels ##
  # whitelistedPodLabels:
  # - app

  tolerations:
    - key: opscruise
      effect: NoSchedule
      operator: Exists

  daemonsetTolerations:
    - key: node-role.kubernetes.io/master
      effect: NoSchedule 
      operator: Exists

  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1

  daemonsetAffinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1

  # awsgw:
  #   enabled: false
  # azuregw:
  #   enabled: false
  # gcpgw:
  #   enabled: false
  # k8sgw:
  #   enabled: true
  # promgw:
  #   enabled: true
  # loggw-loki:
  #   enabled: true
  # tracegw:
  #   enabled: false
  # eventgw:
  #   enabled: false
  # trace-router:
  #   enabled: false
  # opscruise-node-exporter:
  #   enabled: false
  # opscruise-node-exporter-new:
  #   enabled: true
  # otel-metric-collector:
  #   enabled: true
  # kube-state-metrics:
  #   enabled: true
  # prometheus:
  #   enabled: true
  # loki-stack:
  #   enabled: true
  # prometheus-yace-exporter:
  #   enabled: false
  # jaeger:
  #   enabled: false
  # jaeger-operator:
  #   enabled: false
  # prometheus-postgres-exporter:
  #   enabled: false
  # prometheus-mongodb-exporter:
  #   enabled: false
  # kafka-exporter:
  #   enabled: false
  # fluent-bit:
  #   enabled: false
  # prometheus-mysql-exporter:
  #   enabled: false
  # influxdb-exporter:
  #   enabled: false
  # x509-certificate-exporter:
  #   enabled: false
  # prometheus-redis-exporter:
  #   enabled: false
  # nginx-prometheus-exporter:
  #   enabled: false
  # beyla:
  #   enabled: false
  # alloy:
  #   enabled: true
  # otel-trace-collector:
  #   enabled: false

##### Awsgw configs #####
awsgw:
  enabled: false
  logLevel: "info"
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  # Resources
  resources:
    limits:
      cpu: 500m
      memory: 250Mi
    requests:
      cpu: 50m
      memory: 50Mi

##### K8sgw configs #####
k8sgw:
  logLevel: "info"
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  # Resources
  resources:
    limits:
      cpu: 500m
      memory: 500Mi
    requests:
      cpu: 50m
      memory: 50Mi
  ## namespace filtering
  # configMap:
  #   config:
  #     kubernetes:
  #       namespace_allow_list:
  #       - kube-system
  #       - collectors
  #       - opscruise

##### Promgw configs #####
promgw:
  logLevel: "info"
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  # Resources
  resources:
    limits:
      cpu: 500m
      memory: 300Mi
    requests:
      cpu: 50m
      memory: 50Mi

##### Loggw config #####
loggw-loki:
  enabled: true
  logLevel: "INFO"
  config:
    oauthAcceptUnsecureServer: "true"
    jgateway:
      lokiHost: "opscruise-bundle-loki.opscruise.svc.K8S_CLUSTER_FQDN:3100"
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  # Resources
  resources:
    limits:
      cpu: 500m
      memory: 1024Mi
    requests:
      cpu: 50m
      memory: 256Mi

##### Azuregw configs #####
azuregw:
  enabled: false
  logLevel: "INFO"
  azureCredentials:
  - azureauth_clientId: azureauth_clientId
    azureauth_tenantId: azureauth_tenantId
    azureauth_clientSecret: azureauth_clientSecret
    azureauth_subId: azureauth_subId
    name: "credential_name"
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  #Resources
  resources:
    limits:
      cpu: 500m
      memory: 1024Mi
    requests:
      cpu: 50m
      memory: 256Mi

##### Gcpgw configs #####
gcpgw:
  enabled: false
  logLevel: "INFO"
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  #Resources
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
    requests:
      cpu: 50m
      memory: 128Mi

##### Tracegw configs #####
tracegw:
  enabled: false
  logLevel: "INFO"
  persistentType: ebs    
  storageClassName: ""   
  slo_storage_size: 20Gi
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  service:
    type: ClusterIP

  #Resources
  resources:
    limits:
      cpu: 500m
      memory: 3Gi
    requests:
      cpu: 50m
      memory: 128Mi
  # env config
  config:
    tracegw:
      filterTagsKey: notag
      filterTagsValue: notag
      traceDataFromJeager: "true"
      mode: poll #or listen

##### eventgw #####
eventgw:
  enabled: false
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""

##### trace-router #####
trace-router:
  enabled: false
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  service:
    type: ClusterIP
  #Resources
  resources:
    limits:
      cpu: 500m
      memory: 3Gi
    requests:
      cpu: 50m
      memory: 128Mi

##### Node-Exporter-New configs #####
opscruise-node-exporter-new:
  ## Only lowercase accepted
  #logLevel: "info" - Keep the logLevel as info for GKE Autopilot clusters
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
    #   operator: Exists
  priorityClassName: ""
  #args:
    #btfFilePath: path-to-btf-file
    #kconfigFilePath: path-to-kconfig
    #customArgs:
      #- "--collector.ocflowbpfcollector.skip-bpf-verification"
      #- "--collector.ocflowbpfcollector.enable-dns-tracking"
      #- "--no-collector.ocflowbpfcollector.retain-original-public-ip-sources"
  # publicIPAggregationSubnetPatterns:
  #   -   pattern: "172.16.0.86/32"
  #       aggregate_to: "1.2.3.4"
  #       aggregate_name: ""
  #   -   pattern: "1.1.1.1/0"
  #       aggregate_name: ""

##### BEYLA #####
beyla:
  # podLabels:
  #   key: value
  annotations:
  #   #key: value
  # podAnnotations:
  #   key: value
  tolerations:
  #   # - key: node-role.kubernetes.io/master
  #   #   effect: NoSchedule
  #   #   operator: Exists
  priorityClassName: ""
  affinity:

##### OTEL Metric Collector #####
otel-metric-collector:
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
    #   operator: Exists
  priorityClassName: ""

  # additional_receivers_configs:
  #   prometheus:
  #     config:
  #       scrape_configs:
  #       - job_name: new-job-exporter
  #         static_configs:
  #         - targets:
  #           - '172.16.71.143:9256'
  #           - '172.16.73.102:9256'
  #         scheme: http
  #         tls_config:
  #           insecure_skip_verify: true

##### cadvisor configs #####
cadvisor:
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
    #   operator: Exists
  priorityClassName: ""
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  hostNetwork: false
  #Resources
  resources:
    limits:
      cpu: 300m
      memory: 512Mi
    requests:
      cpu: 50m
      memory: 128Mi
  # customArgs:
  #   - --enable_load_reader=false

##### KSM configs #####
kube-state-metrics:
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/master
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  #Resources
  resources:
    requests:
      cpu: 50m
      memory: 30Mi
    limits:
      cpu: 300m
      memory: 250Mi

##### prometheus configs #####
prometheus:
  # logLevel: info
  enableIstio: false
  enablePersistent: false
  prometheusEcsDiscovery:
    enabled: false
    awsCredentials:
      regions:
      - us-east-1
      aws_access_key_id: aws_access_key_id
      aws_secret_access_key: aws_secret_access_key
      roleArn: ""

  # scrape_interval: 30
  labels:
    #key: value
  annotations:
    #key: value
  tolerations:
    # - key: node-role.kubernetes.io/<node>
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  priorityClassName: ""
  #Resources
  resources:
    requests:
      cpu: 50m
      memory: 1000Mi
    limits:
      memory: 5Gi

  nonIstioConfigMap:
    enabledScrapeJobs:
    - oc-kubernetes-pods
    - oc-app-exporters
    - kubernetes-nodes
    - kubernetes-nodes-cadvisor
    - kubernetes-apiservers
    - kube-scheduler
    additionalScrapeConfigs:
    # - job_name: ecs-task-targets
    #   file_sd_configs:
    #   - files: ['/mnt/*.yml']
    #     refresh_interval: 1m
    # - job_name: kubernetes-nodes-cadvisor
    #   scheme: https  
    #   tls_config:
    #     ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    #   bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    #   kubernetes_sd_configs:
    #     - role: node
    #   relabel_configs:
    #     - action: labelmap
    #       regex: __meta_kubernetes_node_label_(.+)
    #     - target_label: __address__
    #       replacement: kubernetes.default.svc:443
    #     - source_labels: [__meta_kubernetes_node_name]
    #       regex: (.+)
    #       target_label: __metrics_path__
    #       replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    #   metric_relabel_configs:
    #     - action: replace
    #       source_labels: [id]
    #       regex: '^/machine.slice/machine-rkt\x2d([^\]+)\.+/([^/]+).service$'
    #       target_label: rkt_container_name
    #       replacement: '${2}-${1}'
    #     - action: replace
    #       source_labels: [id]
    #       regex: '^/system.slice/(.+).service$'
    #       target_label: systemd_service_name
    #       replacement: '${1}'
    #     - action: replace
    #       source_labels: [container]
    #       regex: (.*)
    #       target_label: container_label_io_kubernetes_container_name
    #       replacement: ${1}
    #     - action: replace
    #       source_labels: [pod]
    #       regex: (.*)
    #       target_label: container_label_io_kubernetes_pod_name
    #       replacement: ${1}
    #     - action: replace
    #       source_labels: [namespace]
    #       regex: (.*)
    #       target_label: container_label_io_kubernetes_pod_namespace
    #       replacement: ${1}
    #     - action: replace
    #       source_labels: [id]
    #       regex: '.+?pod([^\.g-z]+?)[\.\/\s](.*)'
    #       target_label: container_label_io_kubernetes_pod_uid
    #       replacement: ${1}

##### Prometheus Postgres Exporter #####
prometheus-postgres-exporter:
  enabled: false
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 50m
      memory: 128Mi
  config:
    datasource:
      # Specify either datasource or datasourceSecret, not both
      host: "<POSTGRESQL_SERVICENAME.NAMESPACE.svc.cluster.local>"
      user: "postgres_exporter"
      # Only one of password and passwordSecret can be specified
      password: "<POSTGRES_EXPORTER_PASSWORD>"
      # Specify passwordSecret if DB password is stored in secret.
      passwordSecret: {}
      #  name: <Secret name>
      #  key: <Password key inside secret>
      database: "<DB_NAME>"
      sslmode: disable
    autoDiscoverDatabases: false
    excludeDatabases: []
    includeDatabases: []
  tolerations:
    # - key: node-role.kubernetes.io/<node>
    #   effect: NoSchedule
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1
##### Prometheus MongoDB Exporter #####
prometheus-mongodb-exporter:
  enabled: false
  mongodb:
    uri: "mongodb://${USERNAME}:${PASSWORD}@<SERVICE_NAME>.<NAMESPACE>.svc.cluster.local"
  existingSecret:
    name: ""
    key: "mongodb-uri"
  resources:
    limits:
      cpu: 250m
      memory: 192Mi
    requests:
      cpu: 50m
      memory: 128Mi
  tolerations:
    # - key: node-role.kubernetes.io/<node>
    #   effect: NoSchedule
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1

##### Kafka Exporter #####
kafka-exporter:
  enabled: false
  args:
    - --kafka.server=<KAFKA_SERVICE>.<NAMESPACE>.svc.cluster.local:<KAFKA_SERVICE_PORT>
    - --zookeeper.server=<ZOOKEEPER_SERVICE>.<NAMESPACE>.svc.cluster.local:<ZOOKEEPER_SERVICE_PORT>
  mutiple_kafka_zookeepers:
  #  - <KAFKA_SERVICE_1>.<NAMESPACE>.svc.cluster.local:<KAFKA_SERVICE_PORT>,<ZOOKEEPER_SERVICE_1>.<NAMESPACE>.svc.cluster.local:<ZOOKEEPER_SERVICE_PORT>
  #  - <KAFKA_SERVICE_2>.<NAMESPACE>.svc.cluster.local:<KAFKA_SERVICE_PORT>,<ZOOKEEPER_SERVICE_2>.<NAMESPACE>.svc.cluster.local:<ZOOKEEPER_SERVICE_PORT>
  tolerations:
    # - key: node-role.kubernetes.io/<node>
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  resources:
    requests:
      cpu: 50m
      memory: 256Mi
    limits:
      cpu: "0.5"
      memory: 256Mi

##### Prometheus MYSQL Exporter #####
prometheus-mysql-exporter:
  enabled: false
  resources:
    limits:
      cpu: 100m
      memory: 200Mi
    requests:
      cpu: 50m
      memory: 128Mi
  mysql:
    db: "<DB_NAME>"
    host: "<MYSQL_SERVICE>.<NAMESPACE>.svc.cluster.local"
    param: ""
    # If "existingPasswordSecret" is specified, "pass" can be ignored
    pass: ""
    port: 3306
    user: "mysql_exporter"
    # If "pass" is specified, "existingPasswordSecret" can be ignored
    existingPasswordSecret:
      name: ""
      key: ""
  tolerations:
    # - key: node-role.kubernetes.io/<node>
    #   effect: NoSchedule
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1

##### InfluxDB Exporter #####
influxdb-exporter:
  enabled: false
  # common_dns: If InfluxDB is running within cluster but different namespace/externally on a VM and is accessible with DNS
  # external_ip: If InfluxDB is running externally on a VM and is accessible with IP address only
  endpoint_type: common_dns
  common_dns_name: ""
  common_dns_port: ""
  external_ip_address: ""
  external_ip_port: ""
  serviceLabels: {}
  annotations: {}

##### Loki-stack Promtail Integration #####
loki-stack:
  promtail:
    tolerations:
    - key: node-role.kubernetes.io/master
      effect: NoSchedule
    resources:
      limits:
        cpu: 200m
        memory: 512Mi
      requests:
        cpu: 50m
        memory: 128Mi
    pipelineStages:
    - docker: {}
    - replace:
        # Example to remove: "2025-09-25T07:18:51.353495749Z stderr F "
        expression: '(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z (?:stdout|stderr) [A-Z] )'
        replace: ''
    - multiline:
        # Combine multiple patterns for detecting the first line of a multiline log:
        # Pattern 1: ISO8601 with milliseconds and Zulu timezone (e.g., 2025-09-25T07:18:51.353Z)
        # Pattern 2: Date + time without T separator (e.g., 2025-09-25 07:18:51)
        # Pattern 3: Date + time with space separator enclosed in square bracket (e.g., [2025-09-25 07:18:51])
        # Pattern 4: Custom prefix starting with "Unexpected character" (example pattern)
        # Pattern 5: Matches log level prefixes like "info: " or "error: "
        # Pattern 6: Identify zero-width space (not visible space) as first line of a multiline block.
        firstline: '(^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z)|(^\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2})|(^\[\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2}\])|(^\w{10} \w{9} \(\?\) \w{2} \w{8} \d{1,2})|(^(?:info|error): )|(^\x{200B}\[)'
        max_wait_time: 3s
#     affinity:
#       nodeAffinity:
#         preferredDuringSchedulingIgnoredDuringExecution:
#         - preference:
#             matchExpressions:
#               - key: opscruise
#                 operator: In
#                 values:
#                   - "true"
#           weight: 1
#     config:
#       client:
#         external_labels:
#           cluster_name: CLUSTER_NAME
    extraVolumes:
    - name: varlog
      hostPath:
        path: /var/log
    extraVolumeMounts:
    - name: varlog
      mountPath: /var/log
      readOnly: true
    extraScrapeConfigs:
    - job_name: kubernetes-nodes-debian
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - replacement: kubelet-logs
        target_label: namespace
      - replacement: /var/log/syslog
        target_label: __path__
      - source_labels: [kubernetes_io_hostname]
        target_label: host
    - job_name: kubernetes-nodes-redhat
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - replacement: kubelet-logs
        target_label: namespace
      - replacement: /var/log/messages
        target_label: __path__
      - source_labels: [kubernetes_io_hostname]
        target_label: host
  loki:
    resources:
      limits:
        cpu: 300m
        memory: 4Gi
      requests:
        cpu: 50m
        memory: 512Mi
#     config:
#       limits_config:
#         reject_old_samples_max_age: 24h
#       table_manager:
#         retention_deletes_enabled:  true
#         retention_period: 24h
#     tolerations:
#     - key: node-role.kubernetes.io/<node>
#       effect: NoSchedule
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
                - key: opscruise
                  operator: In
                  values:
                    - "true"
            weight: 1

##### Prometheus Redis Exporter #####
prometheus-redis-exporter:
  enabled: false
  redisAddress: redis://<REDIS_IP/FQDN>:6379
  tolerations:
    # - key: node-role.kubernetes.io/<node>
    #   effect: NoSchedule
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1

##### Nginx Prometheus Exporter #####
nginx-prometheus-exporter:
  enabled: false
  args:
    - -nginx.scrape-uri=http://NGINX_ENDPOINT:PORT/METRIC_ENDPOINT
  tolerations:
    # - key: node-role.kubernetes.io/<node>
    #   effect: NoSchedule
  affinity:
    # nodeAffinity:
    #   preferredDuringSchedulingIgnoredDuringExecution:
    #     - preference:
    #         matchExpressions:
    #           - key: opscruise
    #             operator: In
    #             values:
    #               - "true"
    #       weight: 1
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 20m
      memory: 128Mi

##### prometheus-yace-exporter #####
prometheus-yace-exporter:
  enabled: false
  image:
    repository: ghcr.io/nerdswords/yet-another-cloudwatch-exporter
    tag: v0.61.2
    pullPolicy: IfNotPresent
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 20m
      memory: 128Mi

Global Configuration

The global section is the foundation of the CO South values file. It defines site-wide defaults, such as chart versioning, secret management strategy, image pull credentials, cluster type, metric collection settings, and resource sizing profiles, that are inherited by all modules unless explicitly overridden at the individual module level.

Chart version and secret source

These top-level keys set the chart version, the secret source, image pull secrets, cluster-type flags, metric scraping defaults, and the resource sizing profile.

global:
  opscruiseChartVersion: TO_BE_DEFINED
  secret_source: "valuesfile"
  imagePullSecrets:
    - name: oc-ns-docker-creds
  gkeAutoPilot: false
  externalCadvisor: false
  nodeExporterPort: 9100
  useGlobalRepository: false
  globalRepositoryName: ""
  k8sClusterFqdn: "cluster.local"
  metricScraper: "prometheus"
  metricScrapeInterval: 60
  machine_type: "small"
Table 62.

| Field | Description | Default value |
| --- | --- | --- |
| opscruiseChartVersion | The version of the OpsCruise Helm chart being deployed. | TO_BE_DEFINED |
| secret_source | Defines where the deployment reads secrets from. | "valuesfile" |
| imagePullSecrets | A list of Kubernetes secrets used to authenticate with private container image registries when pulling images. | - name: oc-ns-docker-creds |
| gkeAutoPilot | Enables or disables configuration adjustments specific to GKE Autopilot clusters. | false |
| externalCadvisor | When set to true, the deployment uses an externally managed cAdvisor instance for container metrics instead of deploying its own. Set to false to use the built-in cAdvisor. | false |
| nodeExporterPort | The port on which the Prometheus Node Exporter listens for host-level metrics. | 9100 |
| metricScraper | Specifies the metrics collection backend. | "prometheus" |
| metricScrapeInterval | The interval (in seconds) at which metrics are scraped from targets. | 60 |
| useGlobalRepository | When set to true, all charts pull images from the repository specified in globalRepositoryName instead of their individually configured repositories. Set to false to use per-chart image settings. | false |
| globalRepositoryName | The global image repository path used when useGlobalRepository is true. An empty string means no global override is active. | "" |
| k8sClusterFqdn | The fully qualified domain name (FQDN) of the Kubernetes cluster's internal DNS. | "cluster.local" |
| machine_type | A sizing profile that controls default resource allocations (CPU, memory) across services. Valid values are small, medium, and large. | "small" |
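
As an illustration of how these global keys combine, the fragment below is a minimal sketch of an override that selects a larger sizing profile, shortens the scrape interval, and pulls all images from a single registry. The registry path registry.example.com/virtana is a hypothetical placeholder, and the values are examples only, not recommendations.

```yaml
# Sketch only: example overrides for the global section.
global:
  machine_type: "medium"        # one of small, medium, large
  metricScrapeInterval: 30      # seconds; the sample file above uses 60
  useGlobalRepository: true     # pull every chart's images from one registry
  globalRepositoryName: "registry.example.com/virtana"  # hypothetical path
```

If these overrides are kept in a separate file and passed to Helm with an additional -f flag, Helm merges them over the downloaded values.yaml, so keys that are not listed here keep their existing values.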



Global AWS credentials

This subsection configures the AWS credentials used by cloud-aware modules to authenticate with AWS services and specify which regions to monitor.

global:
  awsCredentials:
    regions:
      - us-east-1
    aws_access_key_id: aws_access_key_id
    aws_secret_access_key: aws_secret_access_key
    roleArn: ""
Table 63.

| Field | Description | Default value |
| --- | --- | --- |
| awsCredentials | Configuration for authenticating with AWS services. | |
| awsCredentials.regions | A list of AWS regions the deployment interacts with. | - us-east-1 |
| awsCredentials.aws_access_key_id | The AWS access key ID used for authentication. Replace with your actual key. | |
| awsCredentials.aws_secret_access_key | The AWS secret access key paired with the access key ID. Replace with your actual secret. | |
| awsCredentials.roleArn | An optional AWS IAM Role ARN to assume for cross-account access or scoped permissions. | "" |
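
The following fragment is a hedged sketch of a filled-in awsCredentials block that monitors two regions and assumes an IAM role. The second region, the account number 123456789012, and the role name co-south-readonly are hypothetical; the bracketed values are placeholders to replace with your own credentials.

```yaml
# Sketch only: two regions plus an optional role to assume.
global:
  awsCredentials:
    regions:
      - us-east-1
      - us-west-2                                   # hypothetical second region
    aws_access_key_id: "<AWS_ACCESS_KEY_ID>"         # placeholder
    aws_secret_access_key: "<AWS_SECRET_ACCESS_KEY>" # placeholder
    roleArn: "arn:aws:iam::123456789012:role/co-south-readonly"  # hypothetical ARN
```

Leave roleArn as an empty string when the access key itself carries the required permissions.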



Global gateway credentials

This subsection provides the backend connectivity and image-registry credentials that all South gateway modules use to authenticate with the Opscruise or Virtana backend (via Keycloak) and to pull container images from the Docker registry.

global:
  gatewayCreds:
    environment:
      DOCKER_SERVER: "https://index.docker.io/v1/"
      DOCKER_USERNAME: "<DOCKER_USERNAME>"
      DOCKER_PASSWORD: "<DOCKER_PASSWORD>"
      DOCKER_EMAIL: "<DOCKER_EMAIL>"
      OPSCRUISE_ENDPOINT: "<OPSCRUISE_BACKEND_KAFKA_ENDPOINT>:443"
      KEYCLOAK_ENABLED: "true"
      KEYCLOAK_URL: "https://auth.opscruise.io:443"
      KEYCLOAK_CLIENT_ID: "<KAFKA_CLIENT_ID>"
      KEYCLOAK_CLIENT_SECRET: "<KEYCLOAK_CLIENT_SECRET>"
      KEYCLOAK_REALM: "<KEYCLOAK_REALM>"
      OPSCRUISE_ACCOUNT_ID: "<KEYCLOAK_CLUSTERID>"
Table 64.

| Field | Description | Default value |
| --- | --- | --- |
| DOCKER_SERVER | Docker registry server URL. | "https://index.docker.io/v1/" |
| DOCKER_USERNAME | Docker registry username. | "<DOCKER_USERNAME>" |
| DOCKER_PASSWORD | Docker registry password. | "<DOCKER_PASSWORD>" |
| DOCKER_EMAIL | Docker registry email. | "<DOCKER_EMAIL>" |
| OPSCRUISE_ENDPOINT | Backend Kafka endpoint to which South components send telemetry. | "<OPSCRUISE_BACKEND_KAFKA_ENDPOINT>:443" |
| KEYCLOAK_ENABLED | Enables or disables Keycloak-based authentication. | "true" |
| KEYCLOAK_URL | Keycloak server URL. | "https://auth.opscruise.io:443" |
| KEYCLOAK_CLIENT_ID | Keycloak client ID. | "<KAFKA_CLIENT_ID>" |
| KEYCLOAK_CLIENT_SECRET | Keycloak client secret. | "<KEYCLOAK_CLIENT_SECRET>" |
| KEYCLOAK_REALM | Keycloak realm name. | "<KEYCLOAK_REALM>" |
| OPSCRUISE_ACCOUNT_ID | Opscruise account or cluster identifier. | "<KEYCLOAK_CLUSTERID>" |



Global namespace filtering and pod label whitelisting

These optional settings let you restrict which namespaces are monitored and control which pod labels are collected or forwarded by CO South. Both settings are commented out in the sample file, so by default all namespaces are monitored and no pod label allowlist is applied.

global:
  # namespaceFiltering:
  #   namespaceAllowList:
  #     - kube-system
  #     - collectors
  #     - opscruise
  # whitelistedPodLabels:
  #   - app
Table 65.

| Field | Description |
| --- | --- |
| namespaceFiltering.namespaceAllowList | Restricts monitoring to only the listed namespaces. |
| whitelistedPodLabels | Allows specific pod labels to be collected or forwarded. |
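
To make the effect concrete, here is a minimal sketch of the same block with both settings uncommented. The payments namespace and the team label key are hypothetical examples and do not come from the sample file.

```yaml
# Sketch only: monitor three namespaces and forward two pod labels.
global:
  namespaceFiltering:
    namespaceAllowList:
      - kube-system
      - opscruise
      - payments      # hypothetical application namespace
  whitelistedPodLabels:
    - app
    - team            # hypothetical label key
```

With an allow list in place, namespaces that are not listed are expected to be excluded from monitoring, so include every namespace whose workloads should appear in the UI.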



Global tolerations and affinity

These settings control pod scheduling across the cluster. Global tolerations and affinity rules determine which nodes CO South pods can (or prefer to) run on. Separate daemonset variants apply only to daemonset-based modules that need to run on every node, including master or control-plane nodes.

global:
  tolerations:
    - key: opscruise
      effect: NoSchedule
      operator: Exists
  daemonsetTolerations:
    - key: node-role.kubernetes.io/master
      effect: NoSchedule
      operator: Exists
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1
  daemonsetAffinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: opscruise
                operator: In
                values:
                  - "true"
          weight: 1
Table 66.

| Field | Description |
| --- | --- |
| tolerations | Global tolerations for non-daemonset modules. |
| daemonsetTolerations | Additional tolerations for daemonsets. |
| affinity | Global node affinity for non-daemonset modules. |
| daemonsetAffinity | Additional node affinity for daemonsets. |
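
Recent Kubernetes releases typically taint control-plane nodes with node-role.kubernetes.io/control-plane rather than node-role.kubernetes.io/master. The sketch below assumes that daemonset pods should run on control-plane nodes under either taint name; confirm the taints actually present on your nodes before applying it.

```yaml
# Sketch only: tolerate both the legacy and the current control-plane taint.
global:
  daemonsetTolerations:
    - key: node-role.kubernetes.io/master          # legacy taint name
      effect: NoSchedule
      operator: Exists
    - key: node-role.kubernetes.io/control-plane   # current taint name
      effect: NoSchedule
      operator: Exists
```

Because the affinity rules use preferredDuringSchedulingIgnoredDuringExecution, nodes labeled opscruise=true are preferred but not required, so pods can still be scheduled when no such node exists.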



Global module enable/disable flags

This subsection provides a centralized toggle for every CO South module. Modules enabled by default include node-exporter, KSM, loggw-loki, loki, promgw, k8sgw, promtail, and prometheus. You can override these flags here or via the Helm CLI.

global:
  # awsgw:
  #   enabled: false
  # k8sgw:
  #   enabled: true
  # promgw:
  #   enabled: true
  # loggw-loki:
  #   enabled: true
  # tracegw:
  #   enabled: false  
  # eventgw:
  #   enabled: false
  # trace-router:
  #   enabled: false
  # opscruise-node-exporter:
  #   enabled: false
  # opscruise-node-exporter-new:
  #   enabled: true
  # otel-metric-collector:
  #   enabled: true
  # kube-state-metrics:
  #   enabled: true
  # prometheus:
  #   enabled: true
  # loki-stack:
  #   enabled: true
  # prometheus-yace-exporter:
  #   enabled: false
  # jaeger:
  #   enabled: false
  # jaeger-operator:
  #   enabled: false
  # prometheus-postgres-exporter:
  #   enabled: false
  # prometheus-mongodb-exporter:
  #   enabled: false
  # kafka-exporter:
  #   enabled: false
  # fluent-bit:
  #   enabled: false
  # prometheus-mysql-exporter:
  #   enabled: false
  # influxdb-exporter:
  #   enabled: false
  # x509-certificate-exporter:
  #   enabled: false
  # prometheus-redis-exporter:
  #   enabled: false
  # nginx-prometheus-exporter:
  #   enabled: false
  # beyla:
  #   enabled: false
  # alloy:
  #   enabled: true
  # otel-trace-collector:
  #   enabled: false
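
As a minimal sketch, uncommenting a flag and changing its value is how a module would be switched on or off from this block. The choice of modules below is purely illustrative, and whether a module also needs its own top-level section configured (for example, an exporter's target address) is an assumption to verify for your chart version.

```yaml
# Sketch only: enable two optional modules and disable one default module.
global:
  tracegw:
    enabled: true
  prometheus-redis-exporter:
    enabled: true
  loki-stack:
    enabled: false
```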

Core gateway modules

The Core Gateway Modules serve as the essential data conduits between your monitored environment and the Opscruise/Virtana backend. These modules function as specialized collectors that gather metrics, metadata, logs, events, and traces from diverse sources that include Kubernetes clusters and major cloud providers (AWS, Azure, GCP), standardizing and streaming the data for unified observability and analysis.

AWS gateway

The AWS Gateway collects AWS cloud metrics from the configured AWS regions and forwards them to the Opscruise backend. Enable this module only if your environment includes AWS resources you want to monitor alongside your Kubernetes workloads.

awsgw:
  enabled: false
  logLevel: "info"
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  resources:
    limits:
      cpu: 500m
      memory: 250Mi
    requests:
      cpu: 50m
      memory: 50Mi
Table 67.

| Field | Description | Default value |
| --- | --- | --- |
| enabled | Enables or disables the AWS Gateway module. | false |
| logLevel | Log verbosity level. | "info" |
| labels | Custom Kubernetes labels applied to awsgw pods. | {} |
| annotations | Custom Kubernetes annotations applied to awsgw pods. | {} |
| tolerations | Module-specific tolerations. | [] |
| affinity | Module-specific node affinity. | {} |
| priorityClassName | Kubernetes PriorityClass name for pod scheduling priority. | "" |
| resources | CPU and memory requests and limits for the pod. | |



Kubernetes gateway

The Kubernetes Gateway is a core CO South module that discovers and collects Kubernetes cluster metadata, including pods, deployments, services, nodes, and events, and streams it to the Opscruise backend. It supports optional namespace-level filtering to limit which namespaces are monitored.

k8sgw:
  logLevel: "info"
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  resources:
    limits:
      cpu: 500m
      memory: 500Mi
    requests:
      cpu: 50m
      memory: 50Mi
  # configMap:
  #   config:
  #     kubernetes:
  #       namespace_allow_list:
  #         - kube-system
  #         - collectors
  #         - opscruise
Table 68.

| Field | Description |
| --- | --- |
| logLevel | Sets the logging verbosity level for the Kubernetes Gateway. Common values are "info", "debug", "warn", and "error". |
| labels | Custom Kubernetes labels to apply to the k8sgw pod(s). Useful for organizing, filtering, or selecting resources. |
| annotations | Custom Kubernetes annotations to attach to the k8sgw pod(s). Often used for integrations with monitoring tools, ingress controllers, or policy engines. |
| tolerations | A list of Kubernetes tolerations that allow the k8sgw pod(s) to be scheduled on nodes with matching taints. |
| affinity | Kubernetes affinity or anti-affinity rules that control which nodes the k8sgw pod(s) can be scheduled on, based on node labels or other pod locations. |
| priorityClassName | The name of a Kubernetes PriorityClass to assign to the k8sgw pod(s). Determines scheduling and eviction priority relative to other pods. |
| resources | CPU and memory requests and limits for the k8sgw pod(s). |
| configMap.config.kubernetes.namespace_allow_list | Module-level namespace filtering for k8sgw only. |
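
The fragment below is a hedged sketch of the commented-out configMap block in its uncommented form, limiting k8sgw discovery to three namespaces. The namespace names are taken from the sample above; substitute your own.

```yaml
# Sketch only: restrict k8sgw to an explicit namespace allow list.
k8sgw:
  logLevel: "info"
  configMap:
    config:
      kubernetes:
        namespace_allow_list:
          - kube-system
          - collectors
          - opscruise
```

This setting affects k8sgw only. To apply one allow list across modules, use global.namespaceFiltering.namespaceAllowList instead.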



Prometheus gateway

The Prometheus Gateway receives scraped metrics from Prometheus and forwards them to the Opscruise backend over the authenticated Kafka channel. It acts as the bridge between the local Prometheus instance and the Virtana cloud platform.

promgw:
  logLevel: "info"
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  resources:
    limits:
      cpu: 500m
      memory: 300Mi
    requests:
      cpu: 50m
      memory: 50Mi
Table 69.

| Field | Description |
| --- | --- |
| logLevel | Sets the logging verbosity level for the Prometheus Gateway. Common values are "info", "debug", "warn", and "error". |
| labels | Custom Kubernetes labels to apply to the promgw pod(s). Useful for organizing, filtering, or selecting resources. |
| annotations | Custom Kubernetes annotations to attach to the promgw pod(s). Often used for integrations with monitoring tools, ingress controllers, or policy engines. |
| tolerations | A list of Kubernetes tolerations that allow the promgw pod(s) to be scheduled on nodes with matching taints. |
| affinity | Kubernetes affinity or anti-affinity rules that control which nodes the promgw pod(s) can be scheduled on, based on node labels or other pod locations. |
| priorityClassName | The name of a Kubernetes PriorityClass to assign to the promgw pod(s). Determines scheduling and eviction priority relative to other pods. |
| resources.limits | The maximum CPU and memory the promgw container is allowed to consume. |
| resources.requests | The minimum CPU and memory guaranteed to the promgw container. |



Log gateway - Loki

The Log Gateway is a Java-based gateway that reads logs from the in-cluster Loki instance and forwards them to the Opscruise backend. It handles OAuth authentication and connects to Loki using the cluster-internal DNS endpoint derived from global.k8sClusterFqdn.

loggw-loki:
  enabled: true
  logLevel: "info"
  config:
    oauthAcceptUnsecureServer: "true"
    jgateway:
      lokiHost: "opscruise-bundle-loki.opscruise.svc.K8S_CLUSTER_FQDN:3100"
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  resources:
    limits:
      cpu: 500m
      memory: 1024Mi
    requests:
      cpu: 50m
      memory: 256Mi
Table 70.

| Field | Description | Default value |
| --- | --- | --- |
| enabled | Enables or disables the Log Gateway. | true |
| config.oauthAcceptUnsecureServer | Allows OAuth connections to non-TLS servers. | "true" |
| config.jgateway.lokiHost | Internal Loki endpoint used by the log gateway. K8S_CLUSTER_FQDN is replaced by global.k8sClusterFqdn. | "opscruise-bundle-loki.opscruise.svc.K8S_CLUSTER_FQDN:3100" |
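
To illustrate the K8S_CLUSTER_FQDN substitution, the sketch below assumes a cluster whose internal DNS domain is k8s.example.internal, which is a hypothetical value. The lokiHost string keeps the literal token and only the global setting changes.

```yaml
# Sketch only: custom cluster DNS domain with the default Loki endpoint.
global:
  k8sClusterFqdn: "k8s.example.internal"   # hypothetical cluster domain
loggw-loki:
  enabled: true
  config:
    jgateway:
      # Expected to resolve to
      # opscruise-bundle-loki.opscruise.svc.k8s.example.internal:3100
      lokiHost: "opscruise-bundle-loki.opscruise.svc.K8S_CLUSTER_FQDN:3100"
```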



Azure gateway

The Azure Gateway collects Azure cloud metrics from one or more Azure subscriptions and forwards them to the Opscruise backend. It supports multiple credential sets for monitoring resources across different subscriptions or tenants.

azuregw:
  enabled: false
  logLevel: "INFO"
  azureCredentials:
    - azureauth_clientId: azureauth_clientId
      azureauth_tenantId: azureauth_tenantId
      azureauth_clientSecret: azureauth_clientSecret
      azureauth_subId: azureauth_subId
      name: "credential_name"
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  resources:
    limits:
      cpu: 500m
      memory: 1024Mi
    requests:
      cpu: 50m
      memory: 256Mi
Table 71.

| Field | Description | Default value |
| --- | --- | --- |
| enabled | Enables or disables the Azure Gateway. | false |
| azureCredentials | List of Azure credential sets. | |
| azureCredentials[].azureauth_clientId | Azure AD application (client) ID. | |
| azureCredentials[].azureauth_tenantId | Azure AD tenant ID. | |
| azureCredentials[].azureauth_clientSecret | Azure AD client secret. | |
| azureCredentials[].azureauth_subId | Azure subscription ID. | |
| azureCredentials[].name | Human-readable name for the Azure credential set. | |
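
Because azureCredentials is a list, monitoring several subscriptions means adding one entry per subscription. The following is a minimal sketch with two entries; the names prod-subscription and dev-subscription are hypothetical, and the bracketed values are placeholders for your own IDs and secrets.

```yaml
# Sketch only: two Azure credential sets, one per subscription.
azuregw:
  enabled: true
  azureCredentials:
    - name: "prod-subscription"                    # hypothetical name
      azureauth_clientId: "<PROD_CLIENT_ID>"
      azureauth_tenantId: "<PROD_TENANT_ID>"
      azureauth_clientSecret: "<PROD_CLIENT_SECRET>"
      azureauth_subId: "<PROD_SUBSCRIPTION_ID>"
    - name: "dev-subscription"                     # hypothetical name
      azureauth_clientId: "<DEV_CLIENT_ID>"
      azureauth_tenantId: "<DEV_TENANT_ID>"
      azureauth_clientSecret: "<DEV_CLIENT_SECRET>"
      azureauth_subId: "<DEV_SUBSCRIPTION_ID>"
```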



GCP gateway

The GCP Gateway collects Google Cloud Platform metrics and forwards them to the Opscruise backend. Enable this module only if your environment includes GCP resources you want to monitor.

gcpgw:
  enabled: false
  logLevel: "INFO"
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
    requests:
      cpu: 50m
      memory: 128Mi

Trace gateway

The Trace Gateway collects distributed tracing data and forwards trace spans to the Opscruise backend for analysis and SLO tracking. It supports persistent storage for SLO data and can operate in either poll (pull) or listen (push) mode.

tracegw:
  enabled: false
  logLevel: "INFO"
  persistentType: ebs
  storageClassName: ""
  slo_storage_size: 20Gi
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  service:
    type: ClusterIP
  resources:
    limits:
      cpu: 500m
      memory: 3Gi
    requests:
      cpu: 50m
      memory: 128Mi
  config:
    tracegw:
      filterTagsKey: notag
      filterTagsValue: notag
      traceDataFromJeager: "true"
      mode: poll
Table 72.

| Field | Description | Default value |
| --- | --- | --- |
| enabled | Enables or disables the Trace Gateway. | false |
| persistentType | Persistent volume type. Supported: ebs, hostpath. | ebs |
| storageClassName | Kubernetes StorageClass name. Leave blank for default. | "" |
| slo_storage_size | Persistent volume size for SLO data. | 20Gi |
| service.type | Kubernetes Service type. | ClusterIP |
| config.tracegw.filterTagsKey | Tag key used for trace filtering. | notag |
| config.tracegw.filterTagsValue | Tag value used for trace filtering. | notag |
| config.tracegw.traceDataFromJeager | Whether trace data is sourced from Jaeger. | "true" |
| config.tracegw.mode | Data retrieval mode. Supported: poll, listen. | poll |
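
As an example of combining these options, the sketch below enables the Trace Gateway in listen (push) mode with hostpath storage. It assumes that the remaining keys can stay at the values shown in the sample above; whether listen mode needs further settings on the trace source side is not covered here.

```yaml
# Sketch only: Trace Gateway in listen mode with hostpath persistence.
tracegw:
  enabled: true
  persistentType: hostpath   # supported: ebs, hostpath
  slo_storage_size: 20Gi
  service:
    type: ClusterIP
  config:
    tracegw:
      mode: listen           # supported: poll, listen
```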



Event gateway

The Event Gateway captures Kubernetes cluster events, such as pod scheduling, node conditions, and resource warnings, and forwards them to the Opscruise backend for correlation with metrics and logs.

eventgw:
  enabled: false
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""

Trace router

The Trace Router acts as an intermediary routing layer for distributed traces, directing trace data between trace sources and the Trace Gateway. It is typically used in complex tracing topologies where traces need to be filtered, sampled, or routed before reaching the gateway.

trace-router:
  enabled: false
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  service:
    type: ClusterIP
  resources:
    limits:
      cpu: 500m
      memory: 3Gi
    requests:
      cpu: 50m
      memory: 128Mi
Table 73.

| Field | Description |
| --- | --- |
| enabled | Controls whether the Trace Router module is deployed. Set to true to enable or false to disable. |
| labels | Custom Kubernetes labels to apply to the trace-router pod(s). Useful for organizing, filtering, or selecting resources. |
| annotations | Custom Kubernetes annotations to attach to the trace-router pod(s). Often used for integrations with monitoring tools, ingress controllers, or policy engines. |
| tolerations | A list of Kubernetes tolerations that allow the trace-router pod(s) to be scheduled on nodes with matching taints. |
| affinity | Kubernetes affinity or anti-affinity rules that control which nodes the trace-router pod(s) can be scheduled on, based on node labels or other pod locations. |
| priorityClassName | The name of a Kubernetes PriorityClass to assign to the trace-router pod(s). Determines scheduling and eviction priority relative to other pods. |
| service.type | Specifies the Kubernetes Service type used to expose the Trace Router. ClusterIP makes it accessible only within the cluster. Other possible values include NodePort and LoadBalancer. |
| resources.limits | The maximum CPU and memory the trace-router container is allowed to consume. |
| resources.requests | The minimum CPU and memory guaranteed to the trace-router container. |



Metric collection components

The Metric Collection Components are the specialized agents and collectors responsible for gathering raw performance data from your infrastructure. These modules operate at the host, container, and application levels, utilizing technologies like eBPF and OpenTelemetry to capture granular resource usage, network flows, and request-level metrics. They provide the raw data foundation that allows Opscruise to visualize health and troubleshoot performance bottlenecks across the entire stack.

Node Exporter (New)

The new Node Exporter is the recommended replacement for the legacy version. It collects host-level metrics and additionally supports eBPF-based flow collection (network flow visibility, DNS tracking) via custom arguments. It also supports public IP aggregation patterns for network analytics.

opscruise-node-exporter-new:
  labels: {}
  annotations: {}
  tolerations: []
  priorityClassName: ""
  # args:
  #   btfFilePath: path-to-btf-file
  #   kconfigFilePath: path-to-kconfig
  #   customArgs:
  #     - "--collector.ocflowbpfcollector.skip-bpf-verification"
  #     - "--collector.ocflowbpfcollector.enable-dns-tracking"
  # publicIPAggregationSubnetPatterns:
  #   - pattern: "172.16.0.86/32"
  #     aggregate_to: "1.2.3.4"
  #     aggregate_name: ""
Table 74.

Field

Description

logLevel

Log verbosity. Keep as info for GKE Autopilot.

args.btfFilePath

Path to BTF file for eBPF-based collectors.

args.kconfigFilePath

Path to kernel config file.

args.customArgs

Additional CLI arguments.

publicIPAggregationSubnetPatterns

Patterns for aggregating public IPs into representative addresses.

publicIPAggregationSubnetPatterns[].pattern

CIDR pattern to match.

publicIPAggregationSubnetPatterns[].aggregate_to

IP address to aggregate matched traffic to.

publicIPAggregationSubnetPatterns[].aggregate_name

An optional readable name for the aggregated IP.
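
As a minimal sketch, the commented-out options in the block above could be enabled as shown below. The BTF and kernel config paths are hypothetical placeholders for files that must exist on your nodes, and the CIDR, target address, and aggregate name are illustrative values only.

```yaml
opscruise-node-exporter-new:
  args:
    # Hypothetical paths; point these at files that exist on your nodes.
    btfFilePath: /sys/kernel/btf/vmlinux
    kconfigFilePath: /boot/config
    customArgs:
      # Flag taken from the commented example in the default values.
      - "--collector.ocflowbpfcollector.enable-dns-tracking"
  publicIPAggregationSubnetPatterns:
    # Illustrative: report all traffic to 203.0.113.0/24 under one address.
    - pattern: "203.0.113.0/24"
      aggregate_to: "203.0.113.1"
      aggregate_name: "example-partner-network"
```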



Beyla

Beyla provides automatic, zero-code instrumentation of applications using eBPF. It captures HTTP/gRPC request metrics and traces without requiring any changes to application code or container images, making it ideal for gaining instant observability into services that are not yet manually instrumented.

beyla:
  annotations: {}
  tolerations: []
  priorityClassName: ""
  affinity: {}
Table 75.

Field

Description

podLabels

Custom labels applied to Beyla pods.

annotations

Custom annotations on the Beyla resource.

podAnnotations

Custom annotations applied to Beyla pods.



OTEL metric collector

The OpenTelemetry Metric Collector receives, processes, and exports metrics using the OpenTelemetry protocol. It can be extended with additional Prometheus scrape configurations to collect metrics from custom exporters or services not covered by the default CO South modules.

otel-metric-collector:
  labels: {}
  annotations: {}
  tolerations: []
  priorityClassName: ""
  # additional_receivers_configs:
  #   prometheus:
  #     config:
  #       scrape_configs:
  #         - job_name: new-job-exporter
  #           static_configs:
  #             - targets:
  #                 - '172.16.71.143:9256'
  #           scheme: http
  #           tls_config:
  #             insecure_skip_verify: true

Use additional_receivers_configs to add custom Prometheus scrape jobs to the OTEL collector alongside the default targets.
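
The following sketch shows what an enabled custom scrape job might look like. It assumes that entries under scrape_configs accept standard Prometheus scrape options, as the commented example suggests. The job name and target address are hypothetical.

```yaml
otel-metric-collector:
  additional_receivers_configs:
    prometheus:
      config:
        scrape_configs:
          # Hypothetical job name and target; replace with your exporter.
          - job_name: custom-process-exporter
            scrape_interval: 60s
            metrics_path: /metrics
            scheme: http
            static_configs:
              - targets:
                  - 'process-exporter.monitoring.svc.cluster.local:9256'
```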

cAdvisor

cAdvisor runs as a DaemonSet on every node and collects container-level resource usage and performance metrics (CPU, memory, filesystem, network) for all running containers. It provides the per-container granularity that complements node-level metrics from Node Exporter.

cadvisor:
  labels: {}
  annotations: {}
  tolerations: []
  priorityClassName: ""
  affinity: {}
  hostNetwork: false
  resources:
    limits:
      cpu: 300m
      memory: 512Mi
    requests:
      cpu: 50m
      memory: 128Mi
Table 76.

Field

Description

Default value

hostNetwork

Whether cAdvisor pods use the host network namespace.

false

customArgs

Additional CLI arguments.

set to enable_load_reader=false

resources

CPU and memory requests and limits for the container.



Kube State metrics

Kube State Metrics (KSM) generates Prometheus-format metrics about the state of Kubernetes objects, such as deployments, pods, nodes, jobs, and config maps, by listening to the Kubernetes API server. It provides the "desired vs. actual" state visibility that raw resource metrics alone cannot offer.

kube-state-metrics:
  labels: {}
  annotations: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  resources:
    requests:
      cpu: 50m
      memory: 30Mi
    limits:
      cpu: 300m
      memory: 250Mi

Prometheus

Prometheus is the primary metric scraping engine in CO South. It discovers and scrapes metrics from Kubernetes nodes, pods, cAdvisor, kube-state-metrics, and any custom exporters, then makes them available to the Prometheus Gateway (promgw) for forwarding to the backend. It also supports Istio integration and ECS service discovery for hybrid environments.

prometheus:
  enableIstio: false
  enablePersistent: false
  prometheusEcsDiscovery:
    enabled: false
    awsCredentials:
      regions:
        - us-east-1
      aws_access_key_id: aws_access_key_id
      aws_secret_access_key: aws_secret_access_key
      roleArn: ""
  resources:
    requests:
      cpu: 50m
      memory: 1000Mi
    limits:
      memory: 5Gi
  nonIstioConfigMap:
    enabledScrapeJobs:
      - oc-kubernetes-pods
      - oc-app-exporters
      - kubernetes-nodes
      - kubernetes-nodes-cadvisor
      - kubernetes-apiservers
      - kube-scheduler
    additionalScrapeConfigs: []
Table 77.

Field

Description

Default value

enableIstio

Enables or disables Istio service mesh integration for Prometheus.

false

enablePersistent

Enables persistent storage for Prometheus data.

false

scrape_interval

Module-level scrape interval.

prometheusEcsDiscovery.enabled

Enables ECS service discovery for Prometheus.

false

prometheusEcsDiscovery.awsCredentials

AWS credentials for ECS discovery.

nonIstioConfigMap.enabledScrapeJobs

List of default scrape job names enabled when Istio is not used.

nonIstioConfigMap.additionalScrapeConfigs

Custom Prometheus scrape configurations, for example, ECS targets or a custom cAdvisor job.

[ ]
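
A hedged sketch of a customized scrape setup follows. It assumes the chart enables only the job names listed under enabledScrapeJobs and passes additionalScrapeConfigs entries through as standard Prometheus scrape configurations. The custom job name and target are hypothetical.

```yaml
prometheus:
  nonIstioConfigMap:
    # Helm replaces list values as a whole, so restate every job you want to keep.
    enabledScrapeJobs:
      - oc-kubernetes-pods
      - oc-app-exporters
      - kubernetes-nodes
      - kubernetes-nodes-cadvisor
    additionalScrapeConfigs:
      # Hypothetical job and target.
      - job_name: custom-app-exporter
        scrape_interval: 30s
        static_configs:
          - targets:
              - 'app-exporter.default.svc.cluster.local:9100'
```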



Database and middleware exporters

The Database and Middleware Exporters are specialized bridge components designed to extract deep visibility from stateful services and messaging systems. Since databases and middleware often operate as "black boxes" with their own internal telemetry formats, these exporters scrape service-specific data, such as query latency, connection pools, and queue depths, and translate them into a unified Prometheus-compatible format. This allows Opscruise to correlate the health of your data layer directly with the performance of your application services.

Prometheus PostgreSQL exporter

The PostgreSQL Exporter scrapes database-level metrics, such as connections, transactions, locks, replication lag, or table/index statistics, from a PostgreSQL instance and exposes them in Prometheus format. It supports direct password configuration or Kubernetes Secret-based credential management.

prometheus-postgres-exporter:
  enabled: false
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 50m
      memory: 128Mi
  config:
    datasource:
      host: "<POSTGRESQL_SERVICENAME.NAMESPACE.svc.cluster.local>"
      user: "postgres_exporter"
      password: "<POSTGRES_EXPORTER_PASSWORD>"
      passwordSecret: {}
      database: "<DB_NAME>"
      sslmode: disable
    autoDiscoverDatabases: false
    excludeDatabases: []
    includeDatabases: []
Table 78.

Field

Description

Default value

enabled

Enables or disables the PostgreSQL exporter.

false

config.datasource.host

PostgreSQL service DNS or IP.

config.datasource.user

Database user for the exporter.

postgres_exporter

config.datasource.password

Database password. Only one of password or passwordSecret should be used.

config.datasource.passwordSecret.name

Kubernetes Secret name containing the password.

config.datasource.passwordSecret.key

Key inside the Secret.

config.datasource.database

Database name to connect to.

config.datasource.sslmode

PostgreSQL SSL mode.

disable

config.autoDiscoverDatabases

Automatically discover and monitor all databases.

false

config.excludeDatabases

Databases to exclude from auto-discovery.

[ ]

config.includeDatabases

Databases to include in auto-discovery.

[ ]
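
To keep the password out of values.yaml, the Secret-based option could look like the sketch below. The Secret name, key, and host are hypothetical, and the example assumes the Secret lives in the same namespace as the exporter pod.

```yaml
# secret.yaml: Kubernetes Secret holding the exporter password
# (hypothetical name and key; use the namespace of your South deployment).
apiVersion: v1
kind: Secret
metadata:
  name: postgres-exporter-credentials
  namespace: "<NAMESPACE>"
type: Opaque
stringData:
  password: "<POSTGRES_EXPORTER_PASSWORD>"
---
# values.yaml fragment: reference the Secret instead of an inline password.
prometheus-postgres-exporter:
  enabled: true
  config:
    datasource:
      host: "postgresql.databases.svc.cluster.local"  # hypothetical service
      user: "postgres_exporter"
      passwordSecret:
        name: postgres-exporter-credentials
        key: password
      database: "<DB_NAME>"
      sslmode: disable
```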



Prometheus MongoDB exporter

The MongoDB Exporter scrapes database-level metrics from a MongoDB instance and exposes them in Prometheus format. It supports both inline connection URI and Kubernetes Secret-based credential management.

prometheus-mongodb-exporter:
  enabled: false
  mongodb:
    uri: "mongodb://${USERNAME}:${PASSWORD}@<SERVICE_NAME>.<NAMESPACE>.svc.cluster.local"
  existingSecret:
    name: ""
    key: "mongodb-uri"
  resources:
    limits:
      cpu: 250m
      memory: 192Mi
    requests:
      cpu: 50m
      memory: 128Mi
Table 79.

Field

Description

Default value

enabled

Enables or disables the MongoDB exporter.

false

mongodb.uri

MongoDB connection URI. Ignored if existingSecret is provided.

existingSecret.name

Name of an existing Kubernetes Secret containing the URI.

""

existingSecret.key

Key inside the Secret holding the connection URI.

mongodb-uri
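
As a sketch of the Secret-based option, the connection URI can be stored in a Secret and referenced by name. The Secret name is hypothetical. The key matches the documented default, and 27017 is the standard MongoDB port.

```yaml
# secret.yaml: stores the full connection URI under the default key.
apiVersion: v1
kind: Secret
metadata:
  name: mongodb-exporter-uri   # hypothetical name
  namespace: "<NAMESPACE>"
type: Opaque
stringData:
  mongodb-uri: "mongodb://<USERNAME>:<PASSWORD>@<SERVICE_NAME>.<NAMESPACE>.svc.cluster.local:27017"
---
# values.yaml fragment: point the exporter at the Secret.
prometheus-mongodb-exporter:
  enabled: true
  existingSecret:
    name: mongodb-exporter-uri
    key: mongodb-uri
```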



Kafka exporter

The Kafka Exporter scrapes Kafka broker and consumer group metrics and exposes them in Prometheus format. It connects to both Kafka brokers and ZooKeeper, and supports multi-broker/multi-ZooKeeper configurations.

kafka-exporter:
  enabled: false
  args:
    - --kafka.server=<KAFKA_SERVICE>.<NAMESPACE>.svc.cluster.local:<KAFKA_SERVICE_PORT>
    - --zookeeper.server=<ZOOKEEPER_SERVICE>.<NAMESPACE>.svc.cluster.local:<ZOOKEEPER_SERVICE_PORT>
  mutiple_kafka_zookeepers: []
  resources:
    requests:
      cpu: 50m
      memory: 256Mi
    limits:
      cpu: "0.5"
      memory: 256Mi
Table 80.

Field

Description

Default value

enabled

Enables or disables the Kafka exporter.

false

args

CLI arguments specifying Kafka and ZooKeeper server endpoints.

mutiple_kafka_zookeepers

List of additional Kafka or ZooKeeper endpoint pairs for multi-broker setups.

[ ]
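
For illustration, a single-cluster configuration with the placeholders filled in might look like this. The service names and namespace are hypothetical. 9092 and 2181 are the conventional Kafka and ZooKeeper ports, so confirm the ports your services actually expose.

```yaml
kafka-exporter:
  enabled: true
  args:
    # Hypothetical in-cluster service names and namespace.
    - --kafka.server=kafka.messaging.svc.cluster.local:9092
    - --zookeeper.server=zookeeper.messaging.svc.cluster.local:2181
```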



Prometheus MySQL exporter

The MySQL Exporter scrapes database-level metrics from a MySQL instance and exposes them in Prometheus format. It supports both inline password and Kubernetes Secret-based credential management.

prometheus-mysql-exporter:
  enabled: false
  resources:
    limits:
      cpu: 100m
      memory: 200Mi
    requests:
      cpu: 50m
      memory: 128Mi
  mysql:
    db: "<DB_NAME>"
    host: "<MYSQL_SERVICE>.<NAMESPACE>.svc.cluster.local"
    param: ""
    pass: ""
    port: 3306
    user: "mysql_exporter"
    existingPasswordSecret:
      name: ""
      key: ""
Table 81.

Field

Description

Default value

enabled

Enables or disables the MySQL exporter.

false

mysql.db

Database name.

mysql.host

MySQL service DNS or IP.

mysql.param

Additional DSN parameters.

""

mysql.pass

Database password. Ignored if existingPasswordSecret is set.

""

mysql.port

MySQL port.

3306

mysql.user

Database user.

mysql_exporter

mysql.existingPasswordSecret.name

Kubernetes Secret name containing the password.

""

mysql.existingPasswordSecret.key

Key inside the Secret.

""

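A minimal sketch of the Secret-based option is shown below. It assumes a Secret with the hypothetical name mysql-exporter-credentials already exists in the exporter's namespace and stores the password under the key password. The host is also hypothetical.

```yaml
prometheus-mysql-exporter:
  enabled: true
  mysql:
    db: "<DB_NAME>"
    host: "mysql.databases.svc.cluster.local"  # hypothetical service
    port: 3306
    user: "mysql_exporter"
    # Leave pass empty; the password is read from the Secret below.
    pass: ""
    existingPasswordSecret:
      name: mysql-exporter-credentials  # hypothetical, must already exist
      key: password
```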


InfluxDB exporter

The InfluxDB Exporter collects InfluxDB metrics and exposes them in Prometheus format. It supports two connectivity modes: DNS-based access (for in-cluster or DNS-resolvable instances) and direct IP-based access.

influxdb-exporter:
  enabled: false
  endpoint_type: common_dns
  common_dns_name: ""
  common_dns_port: ""
  external_ip_address: ""
  external_ip_port: ""
  serviceLabels: {}
  annotations: {}
Table 82.

Field

Description

Default value

enabled

Enables or disables the InfluxDB exporter.

false

endpoint_type

How to reach InfluxDB: common_dns for DNS-accessible instances, external_ip for IP-only access.

common_dns

common_dns_name

DNS name of the InfluxDB instance. Used when endpoint_type: common_dns.

""

common_dns_port

Port of the InfluxDB instance. Used when endpoint_type: common_dns.

""

external_ip_address

IP address of the InfluxDB instance. Used when endpoint_type: external_ip.

""

external_ip_port

Port of the InfluxDB instance. Used when endpoint_type: external_ip.

""

serviceLabels

Custom labels on the exporter Service.

{ }

annotations

Custom annotations.

{ }
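
The two connectivity modes could be configured as sketched below. Use one of the two documents, not both. The DNS name and IP address are hypothetical, 8086 is the default InfluxDB HTTP port, and the ports are quoted on the assumption that the chart expects strings, as the empty-string defaults suggest.

```yaml
# Option 1: InfluxDB reachable through DNS (hypothetical service name).
influxdb-exporter:
  enabled: true
  endpoint_type: common_dns
  common_dns_name: "influxdb.databases.svc.cluster.local"
  common_dns_port: "8086"
---
# Option 2: InfluxDB reachable only by IP (documentation-range address).
influxdb-exporter:
  enabled: true
  endpoint_type: external_ip
  external_ip_address: "192.0.2.10"
  external_ip_port: "8086"
```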



Loki-stack promtail integration

The Loki Stack deploys Promtail and Loki. Promtail tails container and node logs, applies pipeline stages, such as parsing, multiline merging, or timestamp stripping, and ships them to Loki. The Log Gateway (loggw-loki) then reads from Loki and forwards logs to the Opscruise backend.

Promtail

Promtail is the log shipping agent that runs on every node, discovers pod and node log files, applies configurable pipeline stages, and pushes the processed logs to the in-cluster Loki instance.

loki-stack:
  promtail:
    tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
    resources:
      limits:
        cpu: 200m
        memory: 512Mi
      requests:
        cpu: 50m
        memory: 128Mi
    pipelineStages:
      - docker: {}
      - replace:
          expression: '(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z (?:stdout|stderr) [A-Z] )...'
          replace: ''
      - multiline:
          firstline: '(^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z)|(^\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2})|(^\[\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2}\])|(^\w{10} \w{9} \(\?\) \w{2} \w{8} \d{1,2})|(^(?:info|error): )|(^\x{200B}\[)...'
          max_wait_time: 3s
    extraVolumes:
      - name: varlog
        hostPath:
          path: /var/log
    extraVolumeMounts:
      - name: varlog
        mountPath: /var/log
        readOnly: true
    extraScrapeConfigs:
      - job_name: kubernetes-nodes-debian
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - replacement: kubelet-logs
          target_label: namespace
        - replacement: /var/log/syslog
          target_label: __path__
        - source_labels: [kubernetes_io_hostname]
          target_label: host
      - job_name: kubernetes-nodes-redhat
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - replacement: kubelet-logs
          target_label: namespace
        - replacement: /var/log/messages
          target_label: __path__
        - source_labels: [kubernetes_io_hostname]
          target_label: host
Table 83.

Field

Description

Default value

tolerations

Tolerations for Promtail DaemonSet pods.

[{key: node-role.kubernetes.io/master, effect: NoSchedule}]

resources

CPU and memory requests and limits for the container.

pipelineStages

Promtail log processing pipeline. Includes docker (parses Docker JSON-formatted log lines), replace (strips the leading timestamp and stream prefix), and multiline (combines multi-line log entries).

pipelineStages[].multiline.firstline

Regex patterns to detect the first line of a multi-line log entry.

pipelineStages[].multiline.max_wait_time

Maximum time to wait for additional lines before flushing.

3s

extraVolumes

Additional volumes mounted into Promtail pods.

[{name: varlog, hostPath: {path: /var/log}}]

extraVolumeMounts

Mount points for extra volumes.

[{name: varlog, mountPath: /var/log, readOnly: true}]

extraScrapeConfigs

Additional Promtail scrape configurations for node-level logs.

Debian + RedHat node log jobs
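
Newer Kubernetes releases taint control-plane nodes with node-role.kubernetes.io/control-plane rather than the master key. If Promtail should also run on those nodes, a sketch of the extended toleration list could look like this. The second entry is an addition for illustration and is not part of the documented defaults.

```yaml
loki-stack:
  promtail:
    tolerations:
      # Default toleration from the values shown earlier in this section.
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      # Added for illustration: tolerate the control-plane taint as well.
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
```

Because Helm replaces list values as a whole, the default entry is restated alongside the new one.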



Loki

Loki is the log aggregation backend that stores and indexes logs shipped by Promtail. It provides the query interface used by the Log Gateway (loggw-loki) to retrieve and forward logs to the Opscruise backend. It supports configurable retention policies and sample age limits.

loki-stack:
  loki:
    resources:
      limits:
        cpu: 300m
        memory: 4Gi
      requests:
        cpu: 50m
        memory: 512Mi
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
                - key: opscruise
                  operator: In
                  values:
                    - "true"
            weight: 1
Table 84.

Field

Description

config.limits_config.reject_old_samples_max_age

Maximum age of log samples accepted.

config.table_manager.retention_deletes_enabled

Enables automatic deletion of old log data.

config.table_manager.retention_period

Duration to retain log data.

affinity

Node affinity for Loki pods.
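
The retention fields in the table could be set as in the following sketch. It assumes the fields nest under loki-stack.loki.config as the field paths suggest. The 168h (7-day) values are illustrative, not recommended defaults.

```yaml
loki-stack:
  loki:
    config:
      limits_config:
        # Reject log samples older than 7 days (illustrative value).
        reject_old_samples_max_age: 168h
      table_manager:
        # Delete stored log data once it exceeds the retention period.
        retention_deletes_enabled: true
        retention_period: 168h
```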



Prometheus Redis exporter

The Redis Exporter scrapes Redis server metrics, such as memory usage, connected clients, commands processed, keyspace statistics, and replication info, and exposes them in Prometheus format for monitoring Redis instances alongside your Kubernetes workloads.

prometheus-redis-exporter:
  enabled: false
  redisAddress: redis://<REDIS_IP/FQDN>:6379
Table 85.

Field

Description

enabled

Enables or disables the Redis exporter.

redisAddress

Redis connection URI.



Nginx Prometheus exporter

The Nginx Exporter scrapes Nginx server metrics from the Nginx stub_status or metrics endpoint and exposes them in Prometheus format for monitoring Nginx instances running in or alongside your cluster.

nginx-prometheus-exporter:
  enabled: false
  args:
    - -nginx.scrape-uri=http://NGINX_ENDPOINT:PORT/METRIC_ENDPOINT
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 20m
      memory: 128Mi
Table 86.

Field

Description

enabled

Enables or disables the Nginx exporter.

args

CLI arguments. -nginx.scrape-uri specifies the Nginx stub_status or metrics endpoint URL.

resources

CPU and memory requests and limits for the container.
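
As an illustration, the scrape URI placeholder might be filled in as follows. The service name, namespace, port, and path are hypothetical and assume an Nginx instance that exposes stub_status at /stub_status.

```yaml
nginx-prometheus-exporter:
  enabled: true
  args:
    # Hypothetical in-cluster Nginx Service exposing stub_status on port 8080.
    - -nginx.scrape-uri=http://nginx.web.svc.cluster.local:8080/stub_status
```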



YACE - AWS CloudWatch exporter

YACE (Yet Another CloudWatch Exporter) scrapes AWS CloudWatch metrics and exposes them in Prometheus format. It enables monitoring of AWS-managed services (for example, RDS, ELB, Lambda, and SQS) that do not run inside the Kubernetes cluster but are part of the overall application infrastructure.

prometheus-yace-exporter:
  enabled: false
  image:
    repository: ghcr.io/nerdswords/yet-another-cloudwatch-exporter
    tag: v0.61.2
    pullPolicy: IfNotPresent
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 20m
      memory: 128Mi
Table 87.

Field

Description

Default value

enabled

Enables or disables the YACE CloudWatch exporter.

false

image.repository

Container image repository.

ghcr.io/nerdswords/yet-another-cloudwatch-exporter

image.tag

Container image tag/version.

v0.61.2

image.pullPolicy

Kubernetes image pull policy.

IfNotPresent

resources

CPU and memory requests and limits for the container.