helm-charts

Kubernetes monitoring on VictoriaMetrics stack. Includes VictoriaMetrics Operator, Grafana dashboards, ServiceScrapes and VMRules

Overview

This chart is an all-in-one solution for monitoring a Kubernetes cluster. It installs multiple dependency charts, such as grafana, node-exporter, kube-state-metrics and victoria-metrics-operator. It also installs Custom Resources such as VMSingle, VMCluster, VMAgent and VMAlert.

By default, the operator converts all existing prometheus-operator API objects into corresponding VictoriaMetrics Operator objects.
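
As an illustration of the conversion, the manifest below sketches a prometheus-operator ServiceMonitor of the kind the operator would pick up. The name example-app, the namespace default, the app label and the port name metrics are hypothetical placeholders. With the default conversion enabled, the operator is expected to create a matching VMServiceScrape object for it.

```yaml
# Hypothetical prometheus-operator object; all names are placeholders.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: default
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
    # Name of the Service port that exposes metrics (assumed).
    - port: metrics
      interval: 30s
```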

To enable metrics collection for Kubernetes, this chart installs multiple scrape configurations for Kubernetes components such as kubelet and kube-proxy. Metrics collection is done by VMAgent. If you want to ship metrics to an external VictoriaMetrics database, you can disable the VMSingle installation by setting vmsingle.enabled to false and pointing the url of a vmagent.spec.remoteWrite entry to your external VictoriaMetrics database.
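
A minimal values sketch for that setup is shown below. It assumes the vmagent.spec layout described in the VictoriaMetrics components section; the URL is a placeholder for your own endpoint.

```yaml
# Do not deploy the bundled single-node storage.
vmsingle:
  enabled: false

# VMAgent keeps scraping and forwards metrics to the external database.
vmagent:
  spec:
    remoteWrite:
      # Placeholder URL: replace with your external VictoriaMetrics write endpoint.
      - url: "https://victoriametrics.example.com/api/v1/write"
```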

This chart also installs a set of dashboards and recording rules from the kube-prometheus project.

Configuration

Configuration of this chart is done through helm values.

Dependencies

Dependencies can be enabled or disabled by setting enabled to true or false in the values.yaml file.

Important: for dependency charts, anything that you can find in the values.yaml of the dependency chart can be configured in this chart under the key for that dependency. For example, to configure Grafana, look up the available options in the Grafana chart's values.yaml and set them in the values for this chart under the grafana: key. To configure grafana.persistence.enabled, set it in values.yaml like this:

#################################################
###              dependencies               #####
#################################################
# Grafana dependency chart configuration. For possible values refer to https://github.com/grafana/helm-charts/tree/main/charts/grafana#configuration
grafana:
  enabled: true
  persistence:
    type: pvc
    enabled: false

VictoriaMetrics components

This chart installs multiple VictoriaMetrics components using Custom Resources that are managed by victoria-metrics-operator. Each resource can be configured using the spec of that resource from the API docs of victoria-metrics-operator. For example, to configure VMAgent, look up the available options in the API docs and set them in the values for this chart under the vmagent.spec key. To configure remoteWrite.url, set it in values.yaml like this:

vmagent:
  spec:
    remoteWrite:
      - url: "https://insert.vmcluster.domain.com/insert/0/prometheus/api/v1/write"

ArgoCD issues

Operator self signed certificates

When the K8s stack is deployed using ArgoCD without Cert Manager (.Values.victoria-metrics-operator.admissionWebhooks.certManager.enabled: false), the operator’s webhook certificates are rerendered on each sync, because ArgoCD does not respect the Helm lookup function. To prevent this, update spec.syncPolicy and spec.ignoreDifferences of your K8s stack Application with the following:

apiVersion: argoproj.io/v1alpha1
kind: Application
...
spec:
  ...
  destination:
    ...
    namespace: <k8s-stack-namespace>
  ...
  syncPolicy:
    syncOptions:
    # https://argo-cd.readthedocs.io/en/stable/user-guide/sync-options/#respect-ignore-difference-configs
    # argocd must also ignore difference during apply stage
    # otherwise it'll silently override changes and cause a problem
    - RespectIgnoreDifferences=true
  ignoreDifferences:
    - group: ""
      kind: Secret
      name: <fullname>-validation
      namespace: <k8s-stack-namespace>
      jsonPointers:
        - /data
    - group: admissionregistration.k8s.io
      kind: ValidatingWebhookConfiguration
      name: <fullname>-admission
      jqPathExpressions:
      - '.webhooks[]?.clientConfig.caBundle'

where <fullname> is the operator’s full resource name in your setup

metadata.annotations: Too long: must have at most 262144 bytes on dashboards

If one of the dashboard ConfigMaps fails with the error Too long: must have at most 262144 bytes, make sure you’ve added the argocd.argoproj.io/sync-options: ServerSideApply=true annotation to your dashboards:

defaultDashboards:
  annotations:
    argocd.argoproj.io/sync-options: ServerSideApply=true

Resources are not completely removed after chart uninstallation

This chart uses a pre-delete Helm hook to clean up resources managed by the operator, but ArgoCD does not support this hook and ignores it. To control resource removal, consider either using ArgoCD sync phases and waves or installing the operator chart separately.

Rules and dashboards

By default, this chart installs multiple dashboards and recording rules from kube-prometheus. You can disable dashboards with defaultDashboards.enabled: false and experimentalDashboardsEnabled: false. Rules can be configured under defaultRules.
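
As a sketch, the values below turn off the bundled dashboards and skip one rule group while keeping the rest. The key names are taken from the paragraph above and from the Parameters table; the choice of the etcd group is only an example.

```yaml
# Turn off the bundled dashboards.
defaultDashboards:
  enabled: false
experimentalDashboardsEnabled: false

# Keep default rules, but skip a single group (etcd is just an example).
defaultRules:
  create: true
  groups:
    etcd:
      create: false
```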

Adding external dashboards

By default, this chart uses a sidecar to provision default dashboards. If you want to add your own dashboards, there are two ways to do it:
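
One common approach with the Grafana sidecar is to ship a dashboard in a ConfigMap that the sidecar discovers by label. The sketch below assumes the sidecar watches the grafana_dashboard label (the Grafana chart's usual default; verify it against your grafana.sidecar.dashboards settings). The ConfigMap name, file name and dashboard JSON are hypothetical.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # Hypothetical name.
  name: my-custom-dashboard
  labels:
    # Assumed sidecar label; adjust if your sidecar is configured differently.
    grafana_dashboard: "1"
data:
  # Hypothetical file name; the value is the exported dashboard JSON.
  my-dashboard.json: |
    {
      "title": "My custom dashboard",
      "panels": []
    }
```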

Prometheus scrape configs

This chart installs multiple scrape configurations for Kubernetes monitoring. They are configured under the #ServiceMonitors section of the values.yaml file. For example, to configure the scrape config for kubelet, set it in values.yaml like this:

kubelet:
  enabled: true
  vmScrape:
    # spec for VMNodeScrape crd
    # https://docs.victoriametrics.com/operator/api#vmnodescrapespec
    spec:
      interval: "30s"

Using externally managed Grafana

If you want to use an externally managed Grafana instance but still want to use the dashboards provided by this chart you can set grafana.enabled to false and set defaultDashboards.enabled to true. This will install the dashboards but will not install Grafana.

For example:

defaultDashboards:
  enabled: true

grafana:
  enabled: false

This will create ConfigMaps with dashboards to be imported into Grafana.

If additional labels or annotations are needed to import the dashboards into an existing Grafana, set defaultDashboards.labels or defaultDashboards.annotations in values.yaml. For example:

defaultDashboards:
  enabled: true
  labels:
    key: value
  annotations:
    key: value

Prerequisites

helm repo add grafana https://grafana.github.io/helm-charts
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

How to install

Access a Kubernetes cluster.

Setup chart repository (can be omitted for OCI repositories)

Add the chart Helm repository with the following commands:

helm repo add vm https://victoriametrics.github.io/helm-charts/

helm repo update

List the versions of the vm/victoria-metrics-k8s-stack chart available for installation:

helm search repo vm/victoria-metrics-k8s-stack -l

Install victoria-metrics-k8s-stack chart

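Export default values of victoria-metrics-k8s-stack chart to file values.yaml:

helm show values vm/victoria-metrics-k8s-stack > values.yaml

Change the values in the values.yaml file according to the needs of your environment.

Test the installation with command:

helm install vmks vm/victoria-metrics-k8s-stack -f values.yaml -n NAMESPACE --debug --dry-run

Install chart with command:

helm install vmks vm/victoria-metrics-k8s-stack -f values.yaml -n NAMESPACE

Get the list of pods by running this command: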

kubectl get pods -A | grep 'vmks'

Get the application by running this command:

helm list -f vmks -n NAMESPACE

See the history of versions of vmks application with command.

helm history vmks -n NAMESPACE

Install operator separately

To control the order in which managed resources are removed, or to be able to remove a whole namespace with managed resources, it’s recommended to disable the operator in the k8s-stack chart (victoria-metrics-operator.enabled: false) and install it separately. To move the operator from an existing k8s-stack release to a separate one, follow the steps below:

If you’re planning to delete k8s-stack by removing a whole namespace, consider deploying the operator in a separate namespace. Because the removal order cannot be controlled, the process can hang if the operator is removed while at least one resource it manages still exists.
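
In values terms, disabling the bundled operator comes down to the fragment below. It is a sketch: it assumes a separately installed victoria-metrics-operator release is already running and manages the Custom Resources created by this chart.

```yaml
# Skip the operator dependency; a separately installed operator is assumed.
victoria-metrics-operator:
  enabled: false
```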

Install locally (Minikube)

To run the VictoriaMetrics stack locally, you can use Minikube. To avoid issues with dashboards and alert rules, follow the steps below:

Run Minikube cluster

minikube start --container-runtime=containerd --extra-config=scheduler.bind-address=0.0.0.0 --extra-config=controller-manager.bind-address=0.0.0.0 --extra-config=etcd.listen-metrics-urls=http://0.0.0.0:2381

Install helm chart

helm install [RELEASE_NAME] vm/victoria-metrics-k8s-stack -f values.yaml -f values.minikube.yaml -n NAMESPACE --debug --dry-run

How to uninstall

Remove application with command.

helm uninstall vmks -n NAMESPACE

CRDs created by this chart are not removed by default and should be manually cleaned up:

kubectl get crd | grep victoriametrics.com | awk '{print $1 }' | xargs -I {} kubectl delete crd {}

Troubleshooting

Upgrade guide

Usually, helm upgrade doesn’t require manual actions. Just execute the command:

$ helm upgrade [RELEASE_NAME] vm/victoria-metrics-k8s-stack

However, a release that includes a CRD update can only be patched manually with kubectl. Since Helm does not update CRDs, we recommend always performing these steps when updating the chart version:

# 1. check the changes in CRD
$ helm show crds vm/victoria-metrics-k8s-stack --version [YOUR_CHART_VERSION] | kubectl diff -f -

# 2. apply the changes (update CRD)
$ helm show crds vm/victoria-metrics-k8s-stack --version [YOUR_CHART_VERSION] | kubectl apply -f - --server-side

All other upgrades that require manual actions are listed below:

Upgrade to 0.29.0

To provide more flexibility for VMAuth configuration all <component>.vmauth params were moved to vmauth.spec. Also .vm.write and .vm.read variables are available in vmauth.spec, which represent vmsingle, vminsert, externalVM.write and vmsingle, vmselect, externalVM.read parsed URLs respectively.

If your configuration in version < 0.29.0 looked like below:

vmcluster:
  vmauth:
    vmselect:
      - src_paths:
          - /select/.*
        url_prefix:
          - /
    vminsert:
      - src_paths:
          - /insert/.*
        url_prefix:
          - /

In 0.29.0 it should look like:

vmauth:
  spec:
    unauthorizedAccessConfig:
      - src_paths:
          - '/.*'
        url_prefix:
          - '/'
      - src_paths:
          - '/.*'
        url_prefix:
          - '/'

Upgrade to 0.13.0

kubectl delete daemonset -l app=prometheus-node-exporter

Upgrade to 0.6.0

All CRDs must be updated to the latest version with command:

kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/helm-charts/master/charts/victoria-metrics-k8s-stack/crds/crd.yaml

Upgrade to 0.4.0

All CRDs must be updated to the v1 version with command:

kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/helm-charts/master/charts/victoria-metrics-k8s-stack/crds/crd.yaml

Upgrade from 0.2.8 to 0.2.9

Update the VMAgent CRD with command:

kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/operator/v0.16.0/config/crd/bases/operator.victoriametrics.com_vmagents.yaml

Upgrade from 0.2.5 to 0.2.6

New CRDs, VMUser and VMAuth, were added to the operator, and new fields were added to existing CRDs. Apply them manually with the commands:

kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/operator/v0.15.0/config/crd/bases/operator.victoriametrics.com_vmusers.yaml
kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/operator/v0.15.0/config/crd/bases/operator.victoriametrics.com_vmauths.yaml
kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/operator/v0.15.0/config/crd/bases/operator.victoriametrics.com_vmalerts.yaml
kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/operator/v0.15.0/config/crd/bases/operator.victoriametrics.com_vmagents.yaml
kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/operator/v0.15.0/config/crd/bases/operator.victoriametrics.com_vmsingles.yaml
kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/operator/v0.15.0/config/crd/bases/operator.victoriametrics.com_vmclusters.yaml

Documentation of Helm Chart

Install helm-docs following the instructions in this tutorial.

Generate docs with helm-docs command.

cd charts/victoria-metrics-k8s-stack

helm-docs

The markdown generation is entirely go template driven. The tool parses metadata from charts and generates a number of sub-templates that can be referenced in a template file (by default README.md.gotmpl). If no template file is provided, the tool has a default internal template that will generate a reasonably formatted README.

Parameters

The following table lists the configurable parameters of the chart and their default values.

Change the values in the victoria-metrics-k8s-stack/values.yaml file according to the needs of your environment.

Key Type Default Description
additionalVictoriaMetricsMap string
null

Provide custom recording or alerting rules to be deployed into the cluster.

alertmanager.annotations object
{}

Alertmanager annotations

alertmanager.config object
receivers:
    - name: blackhole
route:
    receiver: blackhole
templates:
    - /etc/vm/configs/**/*.tmpl

Alertmanager configuration

alertmanager.enabled bool
true

Create VMAlertmanager CR

alertmanager.ingress object
annotations: {}
enabled: false
extraPaths: []
hosts:
    - alertmanager.domain.com
labels: {}
path: ''
pathType: Prefix
tls: []

Alertmanager ingress configuration

alertmanager.ingress.extraPaths list
[]

Extra paths to prepend to every host configuration. This is useful when working with annotation based services.

alertmanager.monzoTemplate object
enabled: true

Better alert templates for slack source

alertmanager.spec object
configSecret: ""
externalURL: ""
image:
    tag: v0.27.0
port: "9093"
replicaCount: 1
routePrefix: /
selectAllByDefault: true

Full spec for VMAlertmanager CRD. Allowed values described here

alertmanager.spec.configSecret string
""

If this is defined, it will be used for the alertmanager configuration and the config parameter will be ignored

alertmanager.templateFiles object
{}

Extra alert templates

argocdReleaseOverride string
""

If this chart is used in ArgoCD with the releaseName field set, VMServiceScrapes can’t select the proper services. To make it work, set the value argocdReleaseOverride=$ARGOCD_APP_NAME

coreDns.enabled bool
true

Enable CoreDNS metrics scraping

coreDns.service.enabled bool
true

Create service for CoreDNS metrics

coreDns.service.port int
9153

CoreDNS service port

coreDns.service.selector object
k8s-app: kube-dns

CoreDNS service pod selector

coreDns.service.targetPort int
9153

CoreDNS service target port

coreDns.vmScrape object
spec:
    endpoints:
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          port: http-metrics
    jobLabel: jobLabel
    namespaceSelector:
        matchNames:
            - kube-system

Spec for VMServiceScrape CRD is here

defaultDashboards.annotations object
{}

defaultDashboards.dashboards object
node-exporter-full:
    enabled: true
victoriametrics-operator:
    enabled: true
victoriametrics-vmalert:
    enabled: true

Create dashboards as ConfigMaps even if the dependency they require is not installed

defaultDashboards.dashboards.node-exporter-full object
enabled: true

In ArgoCD, with client-side apply this dashboard reaches the annotations size limit and causes k8s issues unless server-side apply is used. See this issue

defaultDashboards.defaultTimezone string
utc

defaultDashboards.enabled bool
true

Enable custom dashboards installation

defaultDashboards.grafanaOperator.enabled bool
false

Create dashboards as CRDs (requires grafana-operator to be installed)

defaultDashboards.grafanaOperator.spec.allowCrossNamespaceImport bool
false

defaultDashboards.grafanaOperator.spec.instanceSelector.matchLabels.dashboards string
grafana

defaultDashboards.labels object
{}

defaultDatasources.alertmanager object
datasources:
    - access: proxy
      jsonData:
        implementation: prometheus
      name: Alertmanager
perReplica: false

List of alertmanager datasources. Alertmanager generated url will be added to each datasource in template if alertmanager is enabled

defaultDatasources.alertmanager.perReplica bool
false

Create per replica alertmanager compatible datasource

defaultDatasources.extra list
[]

Configure additional grafana datasources (passed through tpl). Check here for details

defaultDatasources.victoriametrics.datasources list
- isDefault: true
  name: VictoriaMetrics
  type: prometheus
- isDefault: false
  name: VictoriaMetrics (DS)
  type: victoriametrics-datasource

List of prometheus compatible datasource configurations. VM url will be added to each of them in templates.

defaultDatasources.victoriametrics.perReplica bool
false

Create per replica prometheus compatible datasource

defaultRules object
alerting:
    spec:
        annotations: {}
        labels: {}
annotations: {}
create: true
group:
    spec:
        params: {}
groups:
    alertmanager:
        create: true
        rules: {}
    etcd:
        create: true
        rules: {}
    general:
        create: true
        rules: {}
    k8sContainerCpuLimits:
        create: true
        rules: {}
    k8sContainerCpuRequests:
        create: true
        rules: {}
    k8sContainerCpuUsageSecondsTotal:
        create: true
        rules: {}
    k8sContainerMemoryCache:
        create: true
        rules: {}
    k8sContainerMemoryLimits:
        create: true
        rules: {}
    k8sContainerMemoryRequests:
        create: true
        rules: {}
    k8sContainerMemoryRss:
        create: true
        rules: {}
    k8sContainerMemorySwap:
        create: true
        rules: {}
    k8sContainerMemoryWorkingSetBytes:
        create: true
        rules: {}
    k8sContainerResource:
        create: true
        rules: {}
    k8sPodOwner:
        create: true
        rules: {}
    kubeApiserver:
        create: true
        rules: {}
    kubeApiserverAvailability:
        create: true
        rules: {}
    kubeApiserverBurnrate:
        create: true
        rules: {}
    kubeApiserverHistogram:
        create: true
        rules: {}
    kubeApiserverSlos:
        create: true
        rules: {}
    kubePrometheusGeneral:
        create: true
        rules: {}
    kubePrometheusNodeRecording:
        create: true
        rules: {}
    kubeScheduler:
        create: true
        rules: {}
    kubeStateMetrics:
        create: true
        rules: {}
    kubelet:
        create: true
        rules: {}
    kubernetesApps:
        create: true
        rules: {}
        targetNamespace: .*
    kubernetesResources:
        create: true
        rules: {}
    kubernetesStorage:
        create: true
        rules: {}
        targetNamespace: .*
    kubernetesSystem:
        create: true
        rules: {}
    kubernetesSystemApiserver:
        create: true
        rules: {}
    kubernetesSystemControllerManager:
        create: true
        rules: {}
    kubernetesSystemKubelet:
        create: true
        rules: {}
    kubernetesSystemScheduler:
        create: true
        rules: {}
    node:
        create: true
        rules: {}
    nodeNetwork:
        create: true
        rules: {}
    vmHealth:
        create: true
        rules: {}
    vmagent:
        create: true
        rules: {}
    vmcluster:
        create: true
        rules: {}
    vmoperator:
        create: true
        rules: {}
    vmsingle:
        create: true
        rules: {}
labels: {}
recording:
    spec:
        annotations: {}
        labels: {}
rule:
    spec:
        annotations: {}
        labels: {}
rules: {}
runbookUrl: https://runbooks.prometheus-operator.dev/runbooks

Create default rules for monitoring the cluster

defaultRules.alerting object
spec:
    annotations: {}
    labels: {}

Common properties for VMRules alerts

defaultRules.alerting.spec.annotations object
{}

Additional annotations for VMRule alerts

defaultRules.alerting.spec.labels object
{}

Additional labels for VMRule alerts

defaultRules.annotations object
{}

Annotations for default rules

defaultRules.group object
spec:
    params: {}

Common properties for VMRule groups

defaultRules.group.spec.params object
{}

Optional HTTP URL parameters added to each rule request

defaultRules.groups object
alertmanager:
    create: true
    rules: {}
etcd:
    create: true
    rules: {}
general:
    create: true
    rules: {}
k8sContainerCpuLimits:
    create: true
    rules: {}
k8sContainerCpuRequests:
    create: true
    rules: {}
k8sContainerCpuUsageSecondsTotal:
    create: true
    rules: {}
k8sContainerMemoryCache:
    create: true
    rules: {}
k8sContainerMemoryLimits:
    create: true
    rules: {}
k8sContainerMemoryRequests:
    create: true
    rules: {}
k8sContainerMemoryRss:
    create: true
    rules: {}
k8sContainerMemorySwap:
    create: true
    rules: {}
k8sContainerMemoryWorkingSetBytes:
    create: true
    rules: {}
k8sContainerResource:
    create: true
    rules: {}
k8sPodOwner:
    create: true
    rules: {}
kubeApiserver:
    create: true
    rules: {}
kubeApiserverAvailability:
    create: true
    rules: {}
kubeApiserverBurnrate:
    create: true
    rules: {}
kubeApiserverHistogram:
    create: true
    rules: {}
kubeApiserverSlos:
    create: true
    rules: {}
kubePrometheusGeneral:
    create: true
    rules: {}
kubePrometheusNodeRecording:
    create: true
    rules: {}
kubeScheduler:
    create: true
    rules: {}
kubeStateMetrics:
    create: true
    rules: {}
kubelet:
    create: true
    rules: {}
kubernetesApps:
    create: true
    rules: {}
    targetNamespace: .*
kubernetesResources:
    create: true
    rules: {}
kubernetesStorage:
    create: true
    rules: {}
    targetNamespace: .*
kubernetesSystem:
    create: true
    rules: {}
kubernetesSystemApiserver:
    create: true
    rules: {}
kubernetesSystemControllerManager:
    create: true
    rules: {}
kubernetesSystemKubelet:
    create: true
    rules: {}
kubernetesSystemScheduler:
    create: true
    rules: {}
node:
    create: true
    rules: {}
nodeNetwork:
    create: true
    rules: {}
vmHealth:
    create: true
    rules: {}
vmagent:
    create: true
    rules: {}
vmcluster:
    create: true
    rules: {}
vmoperator:
    create: true
    rules: {}
vmsingle:
    create: true
    rules: {}

Rule group properties

defaultRules.groups.etcd.rules object
{}

Common properties for all rules in a group

defaultRules.labels object
{}

Labels for default rules

defaultRules.recording object
spec:
    annotations: {}
    labels: {}

Common properties for VMRules recording rules

defaultRules.recording.spec.annotations object
{}

Additional annotations for VMRule recording rules

defaultRules.recording.spec.labels object
{}

Additional labels for VMRule recording rules

defaultRules.rule object
spec:
    annotations: {}
    labels: {}

Common properties for all VMRules

defaultRules.rule.spec.annotations object
{}

Additional annotations for all VMRules

defaultRules.rule.spec.labels object
{}

Additional labels for all VMRules

defaultRules.rules object
{}

Per rule properties

defaultRules.runbookUrl string
https://runbooks.prometheus-operator.dev/runbooks

Runbook url prefix for default rules

externalVM object
read:
    url: ""
write:
    url: ""

External VM read and write URLs

extraObjects list
[]

Add extra objects dynamically to this chart

fullnameOverride string
""

Resource full name override

global.cluster.dnsDomain string
cluster.local.

K8s cluster domain suffix, used for building storage pods’ FQDN. Details are here

global.clusterLabel string
cluster

Cluster label to use for dashboards and rules

global.license object
key: ""
keyRef: {}

Global license configuration

grafana object
enabled: true
forceDeployDatasource: false
ingress:
    annotations: {}
    enabled: false
    extraPaths: []
    hosts:
        - grafana.domain.com
    labels: {}
    path: /
    pathType: Prefix
    tls: []
sidecar:
    dashboards:
        defaultFolderName: default
        enabled: true
        folder: /var/lib/grafana/dashboards
        multicluster: false
        provider:
            name: default
            orgid: 1
    datasources:
        enabled: true
        initDatasources: true
        label: grafana_datasource
vmScrape:
    enabled: true
    spec:
        endpoints:
            - port: ''
        selector:
            matchLabels:
                app.kubernetes.io/name: ''

Grafana dependency chart configuration. For possible values refer here

grafana.forceDeployDatasource bool
false

Create datasource configmap even if grafana deployment has been disabled

grafana.ingress.extraPaths list
[]

Extra paths to prepend to every host configuration. This is useful when working with annotation based services.

grafana.vmScrape object
enabled: true
spec:
    endpoints:
        - port: ''
    selector:
        matchLabels:
            app.kubernetes.io/name: ''

Grafana VM scrape config

grafana.vmScrape.spec object
endpoints:
    - port: ''
selector:
    matchLabels:
        app.kubernetes.io/name: ''

Scrape configuration for Grafana

kube-state-metrics object
enabled: true
vmScrape:
    enabled: true
    spec:
        endpoints:
            - honorLabels: true
              metricRelabelConfigs:
                - action: labeldrop
                  regex: (uid|container_id|image_id)
              port: http
        jobLabel: app.kubernetes.io/name
        selector:
            matchLabels:
                app.kubernetes.io/instance: ''
                app.kubernetes.io/name: ''

kube-state-metrics dependency chart configuration. For possible values check here

kube-state-metrics.vmScrape object
enabled: true
spec:
    endpoints:
        - honorLabels: true
          metricRelabelConfigs:
            - action: labeldrop
              regex: (uid|container_id|image_id)
          port: http
    jobLabel: app.kubernetes.io/name
    selector:
        matchLabels:
            app.kubernetes.io/instance: ''
            app.kubernetes.io/name: ''

Scrape configuration for Kube State Metrics

kubeApiServer.enabled bool
true

Enable Kube Api Server metrics scraping

kubeApiServer.vmScrape object
spec:
    endpoints:
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          port: https
          scheme: https
          tlsConfig:
            caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            serverName: kubernetes
    jobLabel: component
    namespaceSelector:
        matchNames:
            - default
    selector:
        matchLabels:
            component: apiserver
            provider: kubernetes

Spec for VMServiceScrape CRD is here

kubeControllerManager.enabled bool
true

Enable kube controller manager metrics scraping

kubeControllerManager.endpoints list
[]

If your kube controller manager is not deployed as a pod, specify IPs it can be found on

kubeControllerManager.service.enabled bool
true

Create service for kube controller manager metrics scraping

kubeControllerManager.service.port int
10257

Kube controller manager service port

kubeControllerManager.service.selector object
component: kube-controller-manager

Kube controller manager service pod selector

kubeControllerManager.service.targetPort int
10257

Kube controller manager service target port

kubeControllerManager.vmScrape object
spec:
    endpoints:
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          port: http-metrics
          scheme: https
          tlsConfig:
            caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            serverName: kubernetes
    jobLabel: jobLabel
    namespaceSelector:
        matchNames:
            - kube-system

Spec for VMServiceScrape CRD is here

kubeDns.enabled bool
false

Enable KubeDNS metrics scraping

kubeDns.service.enabled bool
false

Create Service for KubeDNS metrics

kubeDns.service.ports object
dnsmasq:
    port: 10054
    targetPort: 10054
skydns:
    port: 10055
    targetPort: 10055

KubeDNS service ports

kubeDns.service.selector object
k8s-app: kube-dns

KubeDNS service pods selector

kubeDns.vmScrape object
spec:
    endpoints:
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          port: http-metrics-dnsmasq
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          port: http-metrics-skydns
    jobLabel: jobLabel
    namespaceSelector:
        matchNames:
            - kube-system

Spec for VMServiceScrape CRD is here

kubeEtcd.enabled bool
true

Enable KubeETCD metrics scraping

kubeEtcd.endpoints list
[]

If your etcd is not deployed as a pod, specify IPs it can be found on

kubeEtcd.service.enabled bool
true

Enable service for ETCD metrics scraping

kubeEtcd.service.port int
2379

ETCD service port

kubeEtcd.service.selector object
component: etcd

ETCD service pods selector

kubeEtcd.service.targetPort int
2379

ETCD service target port

kubeEtcd.vmScrape object
spec:
    endpoints:
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          port: http-metrics
          scheme: https
          tlsConfig:
            caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    jobLabel: jobLabel
    namespaceSelector:
        matchNames:
            - kube-system

Spec for VMServiceScrape CRD is here

kubeProxy.enabled bool
false

Enable kube proxy metrics scraping

kubeProxy.endpoints list
[]

If your kube proxy is not deployed as a pod, specify IPs it can be found on

kubeProxy.service.enabled bool
true

Enable service for kube proxy metrics scraping

kubeProxy.service.port int
10249

Kube proxy service port

kubeProxy.service.selector object
k8s-app: kube-proxy

Kube proxy service pod selector

kubeProxy.service.targetPort int
10249

Kube proxy service target port

kubeProxy.vmScrape object
spec:
    endpoints:
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          port: http-metrics
          scheme: https
          tlsConfig:
            caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    jobLabel: jobLabel
    namespaceSelector:
        matchNames:
            - kube-system

Spec for VMServiceScrape CRD is here

kubeScheduler.enabled bool
true

Enable KubeScheduler metrics scraping

kubeScheduler.endpoints list
[]

If your kube scheduler is not deployed as a pod, specify IPs it can be found on

kubeScheduler.service.enabled bool
true

Enable service for KubeScheduler metrics scrape

kubeScheduler.service.port int
10259

KubeScheduler service port

kubeScheduler.service.selector object
component: kube-scheduler

KubeScheduler service pod selector

kubeScheduler.service.targetPort int
10259

KubeScheduler service target port

kubeScheduler.vmScrape object
spec:
    endpoints:
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          port: http-metrics
          scheme: https
          tlsConfig:
            caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    jobLabel: jobLabel
    namespaceSelector:
        matchNames:
            - kube-system

Spec for VMServiceScrape CRD is here

kubelet object
enabled: true
vmScrape:
    kind: VMNodeScrape
    spec:
        bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
        honorLabels: true
        honorTimestamps: false
        interval: 30s
        metricRelabelConfigs:
            - action: labeldrop
              regex: (uid)
            - action: labeldrop
              regex: (id|name)
            - action: drop
              regex: (rest_client_request_duration_seconds_bucket|rest_client_request_duration_seconds_sum|rest_client_request_duration_seconds_count)
              source_labels:
                - __name__
        relabelConfigs:
            - action: labelmap
              regex: __meta_kubernetes_node_label_(.+)
            - sourceLabels:
                - __metrics_path__
              targetLabel: metrics_path
            - replacement: kubelet
              targetLabel: job
        scheme: https
        scrapeTimeout: 5s
        tlsConfig:
            caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            insecureSkipVerify: true
vmScrapes:
    cadvisor:
        enabled: true
        spec:
            path: /metrics/cadvisor
    kubelet:
        spec: {}
    probes:
        enabled: true
        spec:
            path: /metrics/probes
    resources:
        enabled: true
        spec:
            path: /metrics/resource

Component scraping the kubelets

kubelet.vmScrape object
kind: VMNodeScrape
spec:
    bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    honorLabels: true
    honorTimestamps: false
    interval: 30s
    metricRelabelConfigs:
        - action: labeldrop
          regex: (uid)
        - action: labeldrop
          regex: (id|name)
        - action: drop
          regex: (rest_client_request_duration_seconds_bucket|rest_client_request_duration_seconds_sum|rest_client_request_duration_seconds_count)
          source_labels:
            - __name__
    relabelConfigs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - sourceLabels:
            - __metrics_path__
          targetLabel: metrics_path
        - replacement: kubelet
          targetLabel: job
    scheme: https
    scrapeTimeout: 5s
    tlsConfig:
        caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecureSkipVerify: true

Spec for VMNodeScrape CRD is here

kubelet.vmScrapes.cadvisor object
enabled: true
spec:
    path: /metrics/cadvisor

Enable scraping /metrics/cadvisor from kubelet’s service

kubelet.vmScrapes.probes object
enabled: true
spec:
    path: /metrics/probes

Enable scraping /metrics/probes from kubelet’s service

kubelet.vmScrapes.resources object
enabled: true
spec:
    path: /metrics/resource

Enable scraping /metrics/resource from kubelet’s service
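
The three kubelet endpoints above can be toggled independently. A minimal sketch that keeps cadvisor and resource metrics but skips probe metrics:

```yaml
kubelet:
    enabled: true
    vmScrapes:
        cadvisor:
            enabled: true
        probes:
            # Skip /metrics/probes while leaving the other endpoints scraped.
            enabled: false
        resources:
            enabled: true
```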

nameOverride string
""

Override chart name

prometheus-node-exporter object
enabled: true
extraArgs:
    - --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)
    - --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
service:
    labels:
        jobLabel: node-exporter
vmScrape:
    enabled: true
    spec:
        endpoints:
            - metricRelabelConfigs:
                - action: drop
                  regex: /var/lib/kubelet/pods.+
                  source_labels:
                    - mountpoint
              port: metrics
        jobLabel: jobLabel
        selector:
            matchLabels:
                app.kubernetes.io/name: ''

prometheus-node-exporter dependency chart configuration. For possible values check here

prometheus-node-exporter.vmScrape object
enabled: true
spec:
    endpoints:
        - metricRelabelConfigs:
            - action: drop
              regex: /var/lib/kubelet/pods.+
              source_labels:
                - mountpoint
          port: metrics
    jobLabel: jobLabel
    selector:
        matchLabels:
            app.kubernetes.io/name: ''

Node Exporter VM scrape config

prometheus-node-exporter.vmScrape.spec object
endpoints:
    - metricRelabelConfigs:
        - action: drop
          regex: /var/lib/kubelet/pods.+
          source_labels:
            - mountpoint
      port: metrics
jobLabel: jobLabel
selector:
    matchLabels:
        app.kubernetes.io/name: ''

Scrape configuration for Node Exporter

prometheus-operator-crds object
enabled: false

Install prometheus operator CRDs

tenant string
"0"

Tenant to use for Grafana datasources and remote write

victoria-metrics-operator object
crds:
    cleanup:
        enabled: true
        image:
            pullPolicy: IfNotPresent
            repository: bitnami/kubectl
    plain: true
enabled: true
operator:
    disable_prometheus_converter: false
serviceMonitor:
    enabled: true

VictoriaMetrics Operator dependency chart configuration. More values can be found here. See also here for the possible ENV variables that configure operator behaviour

victoria-metrics-operator.operator.disable_prometheus_converter bool
false

By default, the operator converts prometheus-operator objects. Set to true to disable this conversion.
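
For clusters that do not want prometheus-operator objects converted, a values override along these lines would turn the converter off:

```yaml
victoria-metrics-operator:
    enabled: true
    operator:
        # Stop converting prometheus-operator objects into VictoriaMetrics Operator objects.
        disable_prometheus_converter: true
```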

vmagent.additionalRemoteWrites list
[]

Additional remote write configuration for VMAgent; allowed parameters are defined in the spec
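
A minimal sketch of shipping a second copy of the metrics to another endpoint. The URL is a hypothetical placeholder; `url` is the only field shown here, and other fields should be taken from the referenced spec:

```yaml
vmagent:
    additionalRemoteWrites:
        # Hypothetical external VictoriaMetrics endpoint.
        - url: https://metrics.example.com/api/v1/write
```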

vmagent.annotations object
{}

VMAgent annotations

vmagent.enabled bool
true

Create VMAgent CR

vmagent.ingress object
annotations: {}
enabled: false
extraPaths: []
hosts:
    - vmagent.domain.com
labels: {}
path: ""
pathType: Prefix
tls: []

VMAgent ingress configuration
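
A sketch of exposing VMAgent through an ingress, using only the keys listed in the default above. The host name is a hypothetical placeholder and the path is an assumption:

```yaml
vmagent:
    ingress:
        enabled: true
        hosts:
            # Hypothetical host name; replace with one your ingress controller serves.
            - vmagent.example.com
        path: /
        pathType: Prefix
```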

vmagent.spec object
externalLabels: {}
extraArgs:
    promscrape.dropOriginalLabels: "true"
    promscrape.streamParse: "true"
port: "8429"
scrapeInterval: 20s
selectAllByDefault: true

Full spec for VMAgent CRD. Allowed values are described here

vmalert.additionalNotifierConfigs object
{}

Allows configuring static notifiers and discovering notifiers via Consul and DNS; see the specification here. This configuration is created as a separate secret and mounted into the VMAlert pod.
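
As a sketch of the static case, assuming the notifier configuration follows the vmalert notifier file format with a `static_configs` section. The Alertmanager address is a hypothetical placeholder:

```yaml
vmalert:
    additionalNotifierConfigs:
        static_configs:
            - targets:
                  # Hypothetical Alertmanager address.
                  - alertmanager.example.svc:9093
```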

vmalert.annotations object
{}

VMAlert annotations

vmalert.enabled bool
true

Create VMAlert CR

vmalert.ingress object
annotations: {}
enabled: false
extraPaths: []
hosts:
    - vmalert.domain.com
labels: {}
path: ""
pathType: Prefix
tls: []

VMAlert ingress config

vmalert.ingress.extraPaths list
[]

Extra paths to prepend to every host configuration. This is useful when working with annotation based services.

vmalert.remoteWriteVMAgent bool
false

Controls whether VMAlert should use VMAgent or VMInsert as a target for remote write
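
A minimal sketch, assuming (from the option name) that `true` selects VMAgent as the remote write target:

```yaml
vmalert:
    enabled: true
    # Assumption: true routes VMAlert remote write through VMAgent.
    remoteWriteVMAgent: true
```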

vmalert.spec object
evaluationInterval: 15s
externalLabels: {}
extraArgs:
    http.pathPrefix: /
port: "8080"
selectAllByDefault: true

Full spec for VMAlert CRD. Allowed values are described here

vmalert.templateFiles object
{}

Extra VMAlert annotation templates
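
A sketch of what an entry might look like, assuming each key is a file name and each value is a Go template body containing `define` blocks. The file name and template name are hypothetical:

```yaml
vmalert:
    templateFiles:
        # Hypothetical file name and template name.
        greeting.tmpl: |-
            {{ define "greeting" -}}
            hello from VMAlert
            {{- end }}
```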

vmauth.annotations object
{}

VMAuth annotations

vmauth.enabled bool
false

Enable VMAuth CR

vmauth.spec object
port: "8427"
unauthorizedUserAccessSpec:
    discover_backend_ips: true
    url_map:
        - src_paths:
            - '/.*'
          url_prefix:
            - '/'

Full spec for VMAuth CRD. Allowed values are described here. Two predefined variables can be used in the spec: one resolves to the parsed vmselect, vmsingle or externalVM.read URL, and the other to the parsed vminsert, vmsingle or externalVM.write URL.

vmcluster.annotations object
{}

VMCluster annotations

vmcluster.enabled bool
false

Create VMCluster CR

vmcluster.ingress.insert.annotations object
{}

Ingress annotations

vmcluster.ingress.insert.enabled bool
false

Enable deployment of ingress for vminsert component

vmcluster.ingress.insert.extraPaths list
[]

Extra paths to prepend to every host configuration. This is useful when working with annotation based services.

vmcluster.ingress.insert.hosts list
[]

Array of host objects

vmcluster.ingress.insert.ingressClassName string
""

Ingress controller class name

vmcluster.ingress.insert.labels object
{}

Ingress extra labels

vmcluster.ingress.insert.path string
''

Ingress default path

vmcluster.ingress.insert.pathType string
Prefix

Ingress path type

vmcluster.ingress.insert.tls list
[]

Array of TLS objects

vmcluster.ingress.select.annotations object
{}

Ingress annotations

vmcluster.ingress.select.enabled bool
false

Enable deployment of ingress for vmselect component

vmcluster.ingress.select.extraPaths list
[]

Extra paths to prepend to every host configuration. This is useful when working with annotation based services.

vmcluster.ingress.select.hosts list
[]

Array of host objects

vmcluster.ingress.select.ingressClassName string
""

Ingress controller class name

vmcluster.ingress.select.labels object
{}

Ingress extra labels

vmcluster.ingress.select.path string
''

Ingress default path

vmcluster.ingress.select.pathType string
Prefix

Ingress path type

vmcluster.ingress.select.tls list
[]

Array of TLS objects

vmcluster.ingress.storage.annotations object
{}

Ingress annotations

vmcluster.ingress.storage.enabled bool
false

Enable deployment of ingress for vmstorage component

vmcluster.ingress.storage.extraPaths list
[]

Extra paths to prepend to every host configuration. This is useful when working with annotation based services.

vmcluster.ingress.storage.hosts list
[]

Array of host objects

vmcluster.ingress.storage.ingressClassName string
""

Ingress controller class name

vmcluster.ingress.storage.labels object
{}

Ingress extra labels

vmcluster.ingress.storage.path string
""

Ingress default path

vmcluster.ingress.storage.pathType string
Prefix

Ingress path type

vmcluster.ingress.storage.tls list
[]

Array of TLS objects

vmcluster.spec object
replicationFactor: 2
retentionPeriod: "1"
vminsert:
    extraArgs: {}
    port: "8480"
    replicaCount: 2
    resources: {}
vmselect:
    cacheMountPath: /select-cache
    extraArgs: {}
    port: "8481"
    replicaCount: 2
    resources: {}
    storage:
        volumeClaimTemplate:
            spec:
                resources:
                    requests:
                        storage: 2Gi
vmstorage:
    replicaCount: 2
    resources: {}
    storage:
        volumeClaimTemplate:
            spec:
                resources:
                    requests:
                        storage: 10Gi
    storageDataPath: /vm-data

Full spec for VMCluster CRD. Allowed values are described here

vmcluster.spec.retentionPeriod string
"1"

Data retention period. Possible unit characters: h(ours), d(ays), w(eeks), y(ears); if no unit character is specified, the value is in months. The minimum retention period is 24h. See these docs
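
A sketch of switching from the default VMSingle to VMCluster with a two-week retention, using the unit characters described above. The replica count is an illustrative choice, not a recommendation:

```yaml
vmsingle:
    enabled: false
vmcluster:
    enabled: true
    spec:
        # "2w" means two weeks; the default "1" means one month.
        retentionPeriod: "2w"
        replicationFactor: 2
        vmstorage:
            replicaCount: 3
```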

vmsingle.annotations object
{}

VMSingle annotations

vmsingle.enabled bool
true

Create VMSingle CR

vmsingle.ingress.annotations object
{}

Ingress annotations

vmsingle.ingress.enabled bool
false

Enable deployment of ingress for vmsingle component

vmsingle.ingress.extraPaths list
[]

Extra paths to prepend to every host configuration. This is useful when working with annotation based services.

vmsingle.ingress.hosts list
[]

Array of host objects

vmsingle.ingress.ingressClassName string
""

Ingress controller class name

vmsingle.ingress.labels object
{}

Ingress extra labels

vmsingle.ingress.path string
""

Ingress default path

vmsingle.ingress.pathType string
Prefix

Ingress path type

vmsingle.ingress.tls list
[]

Array of TLS objects

vmsingle.spec object
extraArgs: {}
port: "8429"
replicaCount: 1
retentionPeriod: "1"
storage:
    accessModes:
        - ReadWriteOnce
    resources:
        requests:
            storage: 20Gi

Full spec for VMSingle CRD. Allowed values are described here

vmsingle.spec.retentionPeriod string
"1"

Data retention period. Possible unit characters: h(ours), d(ays), w(eeks), y(ears); if no unit character is specified, the value is in months. The minimum retention period is 24h. See these docs
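
A sketch of a VMSingle override that keeps 30 days of data and requests a larger volume. The sizes are illustrative only. Helm merges maps, so keys not listed here, such as `storage.accessModes`, keep their defaults:

```yaml
vmsingle:
    enabled: true
    spec:
        # "30d" means thirty days.
        retentionPeriod: "30d"
        storage:
            resources:
                requests:
                    storage: 50Gi
```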