Kubernetes monitoring on VictoriaMetrics stack. Includes VictoriaMetrics Operator, Grafana dashboards, ServiceScrapes and VMRules
This chart is an all-in-one solution to start monitoring a Kubernetes cluster. It installs multiple dependency charts such as grafana, node-exporter, kube-state-metrics and victoria-metrics-operator. It also installs Custom Resources such as VMSingle, VMCluster, VMAgent and VMAlert.
By default, the operator converts all existing prometheus-operator API objects into corresponding VictoriaMetrics Operator objects.
To enable metrics collection for Kubernetes, this chart installs multiple scrape configurations for Kubernetes components such as kubelet and kube-proxy. Metrics collection is done by VMAgent, so if you want to ship metrics to an external VictoriaMetrics database you can disable the VMSingle installation by setting vmsingle.enabled to false and setting vmagent.spec.remoteWrite.url to your external VictoriaMetrics database.
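As a minimal sketch of the setup described above, the values below disable VMSingle and point VMAgent at an external database. The vmagent.spec.remoteWrite layout follows the VMAgent example given later in this document; the URL is a placeholder assumption and must be replaced with your own remote write endpoint.

```yaml
# Do not deploy the bundled single-node storage.
vmsingle:
  enabled: false

# Ship everything VMAgent scrapes to an external VictoriaMetrics instance.
vmagent:
  spec:
    remoteWrite:
      # Placeholder URL (assumption): replace with your remote write endpoint.
      - url: "https://victoria-metrics.example.com/api/v1/write"
```

Dashboards and rules installed by the chart are not affected by these values; only the storage target changes.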
This chart also installs a bunch of dashboards and recording rules from the kube-prometheus project.
Configuration of this chart is done through helm values.
Dependencies can be enabled or disabled by setting enabled to true or false in the values.yaml file.
Important: for dependency charts, anything that you can find in the values.yaml of the dependency chart can be configured in this chart under the key for that dependency. For example, to configure grafana, you can find all possible configuration options in its values.yaml and set them in the values for this chart under the grafana: key. For example, to configure grafana.persistence.enabled, set it in values.yaml like this:
#################################################
### dependencies #####
#################################################
# Grafana dependency chart configuration. For possible values refer to https://github.com/grafana/helm-charts/tree/main/charts/grafana#configuration
grafana:
  enabled: true
  persistence:
    type: pvc
    enabled: false
This chart installs multiple VictoriaMetrics components using Custom Resources that are managed by victoria-metrics-operator. Each resource can be configured using the spec of that resource from the API docs of victoria-metrics-operator. For example, to configure VMAgent, you can find all possible configuration options in the API docs and set them in the values for this chart under the vmagent.spec key. For example, to configure remoteWrite.url, set it in values.yaml like this:
vmagent:
  spec:
    remoteWrite:
      - url: "https://insert.vmcluster.domain.com/insert/0/prometheus/api/v1/write"
When deploying the K8s stack using ArgoCD without Cert Manager (.Values.victoria-metrics-operator.admissionWebhooks.certManager.enabled: false), it will re-render the operator’s webhook certificates on each sync, since the Helm lookup function is not respected by ArgoCD.
To prevent this, please update your K8s stack Application spec.syncPolicy and spec.ignoreDifferences with the following:
apiVersion: argoproj.io/v1alpha1
kind: Application
...
spec:
  ...
  destination:
    ...
    namespace: <k8s-stack-namespace>
  ...
  syncPolicy:
    syncOptions:
      # https://argo-cd.readthedocs.io/en/stable/user-guide/sync-options/#respect-ignore-difference-configs
      # argocd must also ignore difference during apply stage
      # otherwise it'll silently override changes and cause a problem
      - RespectIgnoreDifferences=true
  ignoreDifferences:
    - group: ""
      kind: Secret
      name: <fullname>-validation
      namespace: <k8s-stack-namespace>
      jsonPointers:
        - /data
    - group: admissionregistration.k8s.io
      kind: ValidatingWebhookConfiguration
      name: <fullname>-admission
      jqPathExpressions:
        - '.webhooks[]?.clientConfig.caBundle'
where <fullname> is the output of `` for your setup
metadata.annotations: Too long: must have at most 262144 bytes on dashboards
If one of the dashboard ConfigMaps is failing with the error Too long: must have at most 262144 bytes, please make sure you’ve added the argocd.argoproj.io/sync-options: ServerSideApply=true annotation to your dashboards:
defaultDashboards:
  annotations:
    argocd.argoproj.io/sync-options: ServerSideApply=true
This chart uses a pre-delete Helm hook to clean up resources managed by the operator, but this hook is not supported in ArgoCD and is ignored.
To have control over resource removal, please consider either using ArgoCD sync phases and waves or installing the operator chart separately.
By default, this chart installs multiple dashboards and recording rules from kube-prometheus. You can disable dashboards with defaultDashboards.enabled: false and experimentalDashboardsEnabled: false, and rules can be configured under defaultRules.
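A minimal sketch of the values that switch the bundled dashboards off, using only the two keys named above. The sub-keys of defaultRules are not described in this document, so they are left out here rather than guessed.

```yaml
# Do not create the default dashboards shipped with the chart.
defaultDashboards:
  enabled: false

# Do not create the experimental dashboards either.
experimentalDashboardsEnabled: false
```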
By default, this chart uses a sidecar to provision the default dashboards. If you want to add your own dashboards, there are two ways to do it:
1. Add the dashboard as a ConfigMap carrying the grafana_dashboard label:
apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    grafana_dashboard: "1"
  name: grafana-dashboard
data:
  dashboard.json: |-
      {...}
2. Disable the sidecar and define the dashboards in the grafana values:
grafana:
  sidecar:
    dashboards:
      enabled: false
  dashboards:
    vmcluster:
      gnetId: 11176
      revision: 38
      datasource: VictoriaMetrics
When using this approach, you can find dashboards for VictoriaMetrics components published here.
This chart installs multiple scrape configurations for Kubernetes monitoring. They are configured under the #ServiceMonitors section in the values.yaml file. For example, to configure the scrape config for kubelet, set it in values.yaml like this:
kubelet:
  enabled: true
  # spec for VMNodeScrape crd
  # https://docs.victoriametrics.com/operator/api#vmnodescrapespec
  spec:
    interval: "30s"
If you want to use an externally managed Grafana instance but still want to use the dashboards provided by this chart, you can set grafana.enabled to false and set defaultDashboards.enabled to true. This will install the dashboards but will not install Grafana.
For example:
defaultDashboards:
  enabled: true
grafana:
  enabled: false
This will create ConfigMaps with dashboards to be imported into Grafana.
If additional configuration for labels or annotations is needed in order to import the dashboards into an existing Grafana, you can set .grafana.sidecar.dashboards.additionalDashboardLabels or .grafana.sidecar.dashboards.additionalDashboardAnnotations in values.yaml.
For example:
defaultDashboards:
  enabled: true
  labels:
    key: value
  annotations:
    key: value
All images of VictoriaMetrics components are available on Docker Hub and Quay. It is possible to override the default image registry for the operator itself and for all components it deploys by using the following values:
victoria-metrics-operator:
  image:
    registry: "quay.io"
  env:
    - name: "VM_USECUSTOMCONFIGRELOADER"
      value: "true"
    - name: VM_CUSTOMCONFIGRELOADERIMAGE
      value: "quay.io/victoriametrics/operator:config-reloader-v0.53.0"
    - name: VM_VLOGSDEFAULT_IMAGE
      value: "quay.io/victoriametrics/victoria-logs"
    - name: "VM_VMALERTDEFAULT_IMAGE"
      value: "quay.io/victoriametrics/vmalert"
    - name: "VM_VMAGENTDEFAULT_IMAGE"
      value: "quay.io/victoriametrics/vmagent"
    - name: "VM_VMSINGLEDEFAULT_IMAGE"
      value: "quay.io/victoriametrics/victoria-metrics"
    - name: "VM_VMCLUSTERDEFAULT_VMSELECTDEFAULT_IMAGE"
      value: "quay.io/victoriametrics/vmselect"
    - name: "VM_VMCLUSTERDEFAULT_VMSTORAGEDEFAULT_IMAGE"
      value: "quay.io/victoriametrics/vmstorage"
    - name: "VM_VMCLUSTERDEFAULT_VMINSERTDEFAULT_IMAGE"
      value: "quay.io/victoriametrics/vminsert"
    - name: "VM_VMBACKUP_IMAGE"
      value: "quay.io/victoriametrics/vmbackupmanager"
    - name: "VM_VMAUTHDEFAULT_IMAGE"
      value: "quay.io/victoriametrics/vmauth"
    - name: "VM_VMALERTMANAGER_ALERTMANAGERDEFAULTBASEIMAGE"
      value: "quay.io/prometheus/alertmanager"
Install the following packages: git, kubectl, helm, helm-docs. See this tutorial.
Add dependency chart repositories
helm repo add grafana https://grafana.github.io/helm-charts
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
Access a Kubernetes cluster.
Add the chart Helm repository with the following commands:
helm repo add vm https://victoriametrics.github.io/helm-charts/
helm repo update
List the versions of the vm/victoria-metrics-k8s-stack chart available for installation:
helm search repo vm/victoria-metrics-k8s-stack -l
victoria-metrics-k8s-stack chart
Export the default values of the victoria-metrics-k8s-stack chart to the file values.yaml:
For HTTPS repository
helm show values vm/victoria-metrics-k8s-stack > values.yaml
For OCI repository
helm show values oci://ghcr.io/victoriametrics/helm-charts/victoria-metrics-k8s-stack > values.yaml
Change the values in the values.yaml file according to the needs of the environment.
Test the installation with the command:
For HTTPS repository
helm install vmks vm/victoria-metrics-k8s-stack -f values.yaml -n NAMESPACE --debug --dry-run
For OCI repository
helm install vmks oci://ghcr.io/victoriametrics/helm-charts/victoria-metrics-k8s-stack -f values.yaml -n NAMESPACE --debug --dry-run
Install the chart with the command:
For HTTPS repository
helm install vmks vm/victoria-metrics-k8s-stack -f values.yaml -n NAMESPACE
For OCI repository
helm install vmks oci://ghcr.io/victoriametrics/helm-charts/victoria-metrics-k8s-stack -f values.yaml -n NAMESPACE
Get the list of pods by running this command:
kubectl get pods -A | grep 'vmks'
Get the application by running this command:
helm list -f vmks -n NAMESPACE
See the version history of the vmks application with the command:
helm history vmks -n NAMESPACE
To have control over the order in which managed resources are removed, or to be able to remove a whole namespace together with its managed resources, it’s recommended to disable the operator in the k8s-stack chart (victoria-metrics-operator.enabled: false) and install it separately. To move the operator from an existing k8s-stack release to a separate one, please follow the steps below:
1. disable the CRD cleanup (victoria-metrics-operator.crds.cleanup.enabled: false) and apply changes
2. disable the operator (victoria-metrics-operator.enabled: false) and apply changes
3. deploy the operator separately with crds.plain: true
If you’re planning to delete k8s-stack by removing a whole namespace, please consider deploying the operator in a separate namespace: because the removal order is uncontrollable, the process can hang if the operator is removed before at least one resource it manages.
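The end state of the move described above can be sketched as two values files, shown here as two YAML documents for brevity. The file names in the comments are hypothetical, and only keys mentioned in this section are used; the intermediate step of disabling crds.cleanup is not repeated.

```yaml
# File 1 (hypothetical name: k8s-stack-values.yaml), used by the k8s-stack release.
# The operator is no longer installed as a dependency of the stack.
victoria-metrics-operator:
  enabled: false
---
# File 2 (hypothetical name: operator-values.yaml), used by the separate
# victoria-metrics-operator release.
crds:
  plain: true
```

Keep the two documents in separate files, since each Helm release reads its own values.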
To run the VictoriaMetrics stack locally, it’s possible to use Minikube. To avoid issues with dashboards and alert rules, please follow the steps below:
Run Minikube cluster
minikube start --container-runtime=containerd --extra-config=scheduler.bind-address=0.0.0.0 --extra-config=controller-manager.bind-address=0.0.0.0 --extra-config=etcd.listen-metrics-urls=http://0.0.0.0:2381
Install helm chart
helm install [RELEASE_NAME] vm/victoria-metrics-k8s-stack -f values.yaml -f values.minikube.yaml -n NAMESPACE --debug --dry-run
Remove the application with the command:
helm uninstall vmks -n NAMESPACE
CRDs created by this chart are not removed by default and should be manually cleaned up:
kubectl get crd | grep victoriametrics.com | awk '{print $1 }' | xargs -I {} kubectl delete crd {}
Installation may fail with the error configmap already exist. This can happen because of name collisions if the release name is too long.
By default, Kubernetes allows at most 63 characters in resource names, and Helm trims all resource names to 63 characters.
To mitigate this, use a shorter Helm release name, for example:
# stack - is short enough
helm upgrade -i stack vm/victoria-metrics-k8s-stack
Or use override for helm chart release name:
helm upgrade -i some-very-long-name vm/victoria-metrics-k8s-stack --set fullnameOverride=stack
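The same override can be kept in values.yaml instead of being passed with --set. A minimal sketch, reusing the stack name from the command above:

```yaml
# Short base name used for generated resource names,
# regardless of how long the Helm release name is.
fullnameOverride: stack
```

Keeping the override in the values file means it cannot be forgotten on a later upgrade.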
Usually, helm upgrade doesn’t require manual actions. Just execute the command:
$ helm upgrade [RELEASE_NAME] vm/victoria-metrics-k8s-stack
However, a release that includes a CRD update can only be patched manually with kubectl. Since Helm does not update CRDs, we recommend that you always perform the following steps when updating the chart version:
# 1. check the changes in CRD
$ helm show crds vm/victoria-metrics-k8s-stack --version [YOUR_CHART_VERSION] | kubectl diff -f -
# 2. apply the changes (update CRD)
$ helm show crds vm/victoria-metrics-k8s-stack --version [YOUR_CHART_VERSION] | kubectl apply -f - --server-side
All other manual upgrade actions are listed below:
To provide more flexibility for VMAuth configuration, all <component>.vmauth params were moved to vmauth.spec.
Also, the .vm.write and .vm.read variables are available in vmauth.spec. They represent the parsed URLs of vmsingle, vminsert, externalVM.write and of vmsingle, vmselect, externalVM.read respectively.
If your configuration in version < 0.29.0 looked like below:
vmcluster:
  vmauth:
    vmselect:
      - src_paths:
          - /select/.*
        url_prefix:
          - /
    vminsert:
      - src_paths:
          - /insert/.*
        url_prefix:
          - /
In 0.29.0 it should look like:
vmauth:
  spec:
    unauthorizedAccessConfig:
      - src_paths:
          - '/.*'
        url_prefix:
          - '/'
      - src_paths:
          - '/.*'
        url_prefix:
          - '/'
kubectl delete daemonset -l app=prometheus-node-exporter
The scrape configuration for Kubernetes components was moved from the vmServiceScrape.spec section to the spec section. If you previously modified the scrape configuration, you need to update your values.yaml.
grafana.defaultDashboardsEnabled was renamed to defaultDashboardsEnabled (moved to the top level). You may need to update it in your values.yaml.
All CRDs must be updated to the latest version with the command:
kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/helm-charts/master/charts/victoria-metrics-k8s-stack/crds/crd.yaml
All CRDs must be updated to the v1 version with the command:
kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/helm-charts/master/charts/victoria-metrics-k8s-stack/crds/crd.yaml
Update the VMAgent CRD with the command:
kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/operator/v0.16.0/config/crd/bases/operator.victoriametrics.com_vmagents.yaml
### Upgrade from 0.2.5 to 0.2.6
New CRDs, VMUser and VMAuth, were added to the operator, and new fields were added to the existing CRDs.
Manual commands:
kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/operator/v0.15.0/config/crd/bases/operator.victoriametrics.com_vmusers.yaml
kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/operator/v0.15.0/config/crd/bases/operator.victoriametrics.com_vmauths.yaml
kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/operator/v0.15.0/config/crd/bases/operator.victoriametrics.com_vmalerts.yaml
kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/operator/v0.15.0/config/crd/bases/operator.victoriametrics.com_vmagents.yaml
kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/operator/v0.15.0/config/crd/bases/operator.victoriametrics.com_vmsingles.yaml
kubectl apply -f https://raw.githubusercontent.com/VictoriaMetrics/operator/v0.15.0/config/crd/bases/operator.victoriametrics.com_vmclusters.yaml
Install helm-docs following the instructions in this tutorial.
Generate docs with the helm-docs command:
cd charts/victoria-metrics-k8s-stack
helm-docs
The markdown generation is entirely go template driven. The tool parses metadata from charts and generates a number of sub-templates that can be referenced in a template file (by default README.md.gotmpl). If no template file is provided, the tool has a default internal template that will generate a reasonably formatted README.
The following table lists the configurable parameters of the chart and their default values.
Change the values in the victoria-metrics-k8s-stack/values.yaml file according to the needs of the environment.