fix: Fixed the volsync jitter.
@@ -3,7 +3,7 @@
{"id":"homelab-5wg","title":"Fix network configuration conflicts (etcd + routes)","description":"Multiple network configuration issues on the cluster nodes:\n\n**Issue 1: etcd Peer URL Conflict**\nNode esxi-2cu-8g-01 (10.0.0.146) has duplicate peer URLs in etcd (10.0.0.128 and 10.0.0.146), causing \"Peer URLs already exists\" error. Node is currently unreachable.\n\n**Issue 2: Network Route Conflict**\nNodes are showing route conflict errors:\n```\nerror adding route: netlink receive: file exists\ngateway: 10.0.0.129\n```\n\nThis is because nodes were previously configured with `/24` subnet and gateway `10.0.0.1`, but now configured with `/27` subnet and gateway `10.0.0.129`. Old routes persist.\n\n**Root Cause:**\nConfiguration changed from:\n- Old: 10.0.0.0/24, gateway 10.0.0.1\n- New: 10.0.0.128/27, gateway 10.0.0.129\n\n**Solution:**\n1. Reset ALL nodes to clear old network config\n2. Re-apply Talos configuration\n3. Bootstrap cluster fresh\n\nCommands:\n```bash\n# Reset each node\ntalosctl -n 10.0.0.145 reset --graceful=false --reboot\ntalosctl -n 10.0.0.146 reset --graceful=false --reboot \ntalosctl -n 10.0.0.147 reset --graceful=false --reboot\n\n# Wait for nodes to boot into maintenance mode, then:\ntask bootstrap:talos\n```","acceptance_criteria":"- Member ceeb52e03fde8032 is removed from etcd cluster\n- Node 10.0.0.146 is reset and reconfigured\n- Node rejoins etcd cluster with correct peer URL\n- `talosctl etcd members` shows only one peer URL per member\n- All three nodes are healthy in etcd cluster","notes":"**Recommended Fix: Full Cluster Reset (Option 1)**\n\nAll nodes are currently offline. 
Once nodes are back online, execute:\n\n```bash\n# Reset all nodes to maintenance mode\ntalosctl -n 10.0.0.145 reset --graceful=false --reboot --insecure\ntalosctl -n 10.0.0.146 reset --graceful=false --reboot --insecure\ntalosctl -n 10.0.0.147 reset --graceful=false --reboot --insecure\n\n# Wait for nodes to boot into maintenance mode (~2-3 min)\n# Verify with: nmap -Pn -n -p 50000 10.0.0.145-147 -vv\n\n# Re-bootstrap\ntask bootstrap:talos\ntask bootstrap:apps\n```\n\nThis is the cleanest approach to clear all lingering network config and etcd state issues. Estimated time: ~15 minutes total.","status":"closed","priority":1,"issue_type":"bug","owner":"laur.ivan@ec.europa.eu","created_at":"2026-02-07T01:10:22.498887798+01:00","created_by":"Laur IVAN","updated_at":"2026-02-10T22:59:48.077254996+01:00","closed_at":"2026-02-10T22:59:48.077254996+01:00","close_reason":"Fixed - etcd cluster healthy with 3 members, each with single peer URL. No route conflicts. All cluster health checks passed.","labels":["etcd","talos","urgent"]}
{"id":"homelab-7k4","title":"Push talhelper encrypted secret to git","description":"After installing Talos, commit and push the talhelper encrypted secret to the repository","acceptance_criteria":"- Changes are staged with `git add -A`\n- Commit is created with message \"chore: add talhelper encrypted secret :lock:\"\n- Changes are pushed to remote repository","status":"closed","priority":2,"issue_type":"task","owner":"laur.ivan@ec.europa.eu","created_at":"2026-02-07T00:32:05.950780413+01:00","created_by":"Laur IVAN","updated_at":"2026-02-07T00:44:58.80046492+01:00","closed_at":"2026-02-07T00:44:58.80046492+01:00","close_reason":"Successfully staged, committed, and pushed talhelper encrypted secret to git repository","labels":["bootstrap","git"]}
{"id":"homelab-82o","title":"Verify Flux status and resources","description":"Check the status of Flux and verify all Flux resources are up-to-date and in a ready state","acceptance_criteria":"- Command `flux check` passes all checks\n- Command `flux get sources git flux-system` shows ready state\n- Command `flux get ks -A` shows all kustomizations ready\n- Command `flux get hr -A` shows all helm releases ready","status":"closed","priority":2,"issue_type":"task","owner":"laur.ivan@ec.europa.eu","created_at":"2026-02-07T00:32:43.666513198+01:00","created_by":"Laur IVAN","updated_at":"2026-02-10T23:03:07.067406014+01:00","closed_at":"2026-02-10T23:03:07.067406014+01:00","close_reason":"Verified - Flux check passed. All controllers ready (helm, kustomize, notification, source). GitRepository synced. All Kustomizations applied successfully.","labels":["flux","verification"]}
{"id":"homelab-c68","title":"Fix volsync MutatingAdmissionPolicy API version","description":"Kustomization storage-system/volsync is failing with error:\\n\\n```\\nMutatingAdmissionPolicy/storage-system/volsync-mover-jitter dry-run failed: no matches for kind \\\"MutatingAdmissionPolicy\\\" in version \\\"admissionregistration.k8s.io/v1beta1\\\"\\n```\\n\\nThis indicates that MutatingAdmissionPolicy v1beta1 API is not available in Kubernetes 1.34. This API was introduced in 1.30 as v1alpha1 and promoted to v1beta1 in 1.32, but may have been removed or changed in 1.34.\\n\\nFix: Update volsync configuration to use correct API version or remove the MutatingAdmissionPolicy if not needed.","status":"open","priority":2,"issue_type":"bug","owner":"laur.ivan@ec.europa.eu","created_at":"2026-02-11T00:51:29.41277186+01:00","created_by":"Laur IVAN","updated_at":"2026-02-11T00:51:29.41277186+01:00","labels":["api-version","storage","volsync"]}
{"id":"homelab-c68","title":"Fix volsync MutatingAdmissionPolicy API version","description":"Kustomization storage-system/volsync is failing with error:\n\n```\nMutatingAdmissionPolicy/storage-system/volsync-mover-jitter dry-run failed: no matches for kind \"MutatingAdmissionPolicy\" in version \"admissionregistration.k8s.io/v1beta1\"\n```\n\n**Root Cause:**\nMutatingAdmissionPolicy API does not exist in Kubernetes 1.34. Only `admissionregistration.k8s.io/v1` is available, which only includes MutatingWebhookConfiguration.\n\nThe MutatingAdmissionPolicy feature was experimental and appears to have been removed or never graduated to stable.\n\n**What the policy does:**\nAdds a jitter initContainer to volsync jobs to randomize start times (sleep 0-30 seconds). This is optional functionality.\n\n**Fix:**\nRemove or comment out the mutating-admission-policy.yaml file from kubernetes/apps/storage-system/volsync/app/kustomization.yaml since this feature is not available in K8s 1.34 and is non-critical.","notes":"**Verified in K8s 1.35.0:**\nIssue still exists. MutatingAdmissionPolicy API is not available in Kubernetes 1.35.0.\n\nOnly `admissionregistration.k8s.io/v1` exists, which includes:\n- MutatingWebhookConfiguration\n- ValidatingWebhookConfiguration\n\nMutatingAdmissionPolicy/MutatingAdmissionPolicyBinding are not available.\n\nThe fix remains the same: remove or comment out the mutating-admission-policy.yaml file.","status":"open","priority":2,"issue_type":"bug","owner":"laur.ivan@ec.europa.eu","created_at":"2026-02-11T00:51:29.41277186+01:00","created_by":"Laur IVAN","updated_at":"2026-02-11T10:24:51.047684333+01:00","labels":["api-version","storage","volsync"]}
{"id":"homelab-f7u","title":"Tidy up repository (remove templates)","description":"Clean up the repository by removing the templates directory and templating-related files to eliminate clutter and resolve Renovate warnings","acceptance_criteria":"- Command `task template:tidy` completes successfully\n- Templates directory is removed\n- Templating-related files are cleaned up\n- Changes are committed with message \"chore: tidy up :broom:\"\n- Changes are pushed to git","status":"open","priority":3,"issue_type":"task","owner":"laur.ivan@ec.europa.eu","created_at":"2026-02-07T00:33:32.475687645+01:00","created_by":"Laur IVAN","updated_at":"2026-02-07T00:33:32.475687645+01:00","labels":["cleanup","git"]}
{"id":"homelab-gqj","title":"Bootstrap cluster applications (cilium, coredns, spegel, flux)","description":"Install cilium, coredns, spegel, flux and sync the cluster to the repository state","acceptance_criteria":"- Command `task bootstrap:apps` completes successfully\n- Cilium is installed\n- CoreDNS is installed\n- Spegel is installed\n- Flux is installed\n- Cluster is synced to repository state","status":"closed","priority":2,"issue_type":"task","owner":"laur.ivan@ec.europa.eu","created_at":"2026-02-07T00:32:15.371162045+01:00","created_by":"Laur IVAN","updated_at":"2026-02-07T15:50:03.091375279+01:00","closed_at":"2026-02-07T15:50:03.091375279+01:00","close_reason":"Successfully installed cilium, coredns, spegel, cert-manager, flux-operator. Flux-instance is reconciling (timeout is normal). All nodes are Ready.","labels":["apps","bootstrap"]}
{"id":"homelab-hmc","title":"Finish monitoring system setup","description":"Uncomment the grafana and kube-prometheus-stack resources in kubernetes/apps/monitoring-system/kustomization.yaml to enable the full monitoring stack with Grafana dashboards and Prometheus metrics collection","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-09T22:53:49.071709362+01:00","updated_at":"2026-02-09T22:53:49.071709362+01:00","labels":["grafana","monitoring","prometheus"]}
@@ -11,6 +11,7 @@
{"id":"homelab-k3j","title":"Verify DNS resolution for echo subdomain","description":"Check that DNS resolution works for the echo subdomain and resolves to the Cloudflare gateway address","acceptance_criteria":"- Command `dig @${cluster_dns_gateway_addr} echo.${cloudflare_domain}` resolves successfully\n- DNS resolves to ${cloudflare_gateway_addr}\n- DNS resolution is working correctly","status":"closed","priority":2,"issue_type":"task","owner":"laur.ivan@ec.europa.eu","created_at":"2026-02-07T00:33:02.539037288+01:00","created_by":"Laur IVAN","updated_at":"2026-02-10T23:03:01.06585734+01:00","closed_at":"2026-02-10T23:03:01.06585734+01:00","close_reason":"Verified - DNS resolution working. echo.laurivan.com resolves to 10.0.0.158 (envoy-external gateway) via k8s-gateway","labels":["dns","verification"]}
{"id":"homelab-mbk","title":"Verify TCP connectivity to gateways","description":"Check TCP connectivity to both the internal and external gateways on port 443","acceptance_criteria":"- Command `nmap -Pn -n -p 443 ${cluster_gateway_addr} ${cloudflare_gateway_addr} -vv` succeeds\n- Port 443 is open on both internal and external gateways\n- TCP connectivity is confirmed","status":"open","priority":2,"issue_type":"task","owner":"laur.ivan@ec.europa.eu","created_at":"2026-02-07T00:32:54.223562688+01:00","created_by":"Laur IVAN","updated_at":"2026-02-07T00:32:54.223562688+01:00","labels":["network","verification"]}
{"id":"homelab-n0h","title":"Verify Cilium status","description":"Verify that Cilium is installed and running correctly","acceptance_criteria":"- Command `cilium status` runs successfully\n- Cilium reports healthy status\n- All Cilium components are operational","status":"closed","priority":2,"issue_type":"task","owner":"laur.ivan@ec.europa.eu","created_at":"2026-02-07T00:32:34.123646456+01:00","created_by":"Laur IVAN","updated_at":"2026-02-10T23:01:46.996445944+01:00","closed_at":"2026-02-10T23:01:46.996445944+01:00","close_reason":"Verified - Cilium OK, Operator OK, 3/3 DaemonSet ready, 1/1 Operator ready, 29/29 cluster pods managed","labels":["cilium","verification"]}
{"id":"homelab-oqx","title":"Fix tuppr HelmRelease - invalid ServiceAccount API version","description":"Tuppr HelmRelease is failing with error:\n\n```\nHelm install failed: resource mapping not found for name: \"tuppr-talosconfig\" namespace: \"system-upgrade\" from \"\": no matches for kind \"ServiceAccount\" in version \"talos.dev/v1alpha1\"\nensure CRDs are installed first\n```\n\nThe tuppr chart is trying to create a ServiceAccount with apiVersion `talos.dev/v1alpha1`, which is invalid. ServiceAccount should use `v1` API version.\n\nThis appears to be a bug in the tuppr chart itself (version 0.0.52). The chart is incorrectly using a Talos-specific API version for a standard Kubernetes ServiceAccount resource.\n\nPossible fixes:\n1. Wait for upstream chart fix\n2. Use a different version of tuppr\n3. Apply a patch to fix the ServiceAccount apiVersion\n4. Disable tuppr if not critical","status":"open","priority":2,"issue_type":"bug","owner":"laur.ivan@ec.europa.eu","created_at":"2026-02-11T00:51:35.813199154+01:00","created_by":"Laur IVAN","updated_at":"2026-02-11T01:01:47.963406638+01:00","labels":["oci-repository","system-upgrade","tuppr"]}
{"id":"homelab-rzs","title":"Verify wildcard Certificate status","description":"Check the status of the wildcard Certificate in the network namespace","acceptance_criteria":"- Command `kubectl -n network describe certificates` runs successfully\n- Certificate status shows Ready condition\n- Certificate is valid and not expired","status":"open","priority":2,"issue_type":"task","owner":"laur.ivan@ec.europa.eu","created_at":"2026-02-07T00:33:12.166198226+01:00","created_by":"Laur IVAN","updated_at":"2026-02-07T00:33:12.166198226+01:00","labels":["certificates","verification"]}
{"id":"homelab-u3p","title":"Install homepage dashboard","description":"Create the homepage application manifests (helmrelease, ocirepository, kustomization) in kubernetes/apps/default/homepage/app/ directory and configure the ks.yaml to deploy it","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-09T22:53:44.511470131+01:00","updated_at":"2026-02-09T22:53:44.511470131+01:00","labels":["dashboard","deployment","homepage"]}
{"id":"homelab-xpp","title":"Install home assistant for home automation","description":"Create home assistant application manifests (helmrelease, ocirepository, kustomization) in kubernetes/apps/default/home-assistant/app/ directory and configure deployment.\n\nNote: Ensure the application has network access to the IoT VLAN where most smart home devices are located. This may require configuring network policies or multus CNI for VLAN access.","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-09T22:57:31.4810088+01:00","updated_at":"2026-02-09T22:57:31.4810088+01:00","labels":["automation","home-assistant","iot","networking"]}
@@ -1,3 +1,4 @@
---
cluster:
  allowSchedulingOnControlPlanes: true
  apiServer:
@@ -6,6 +7,9 @@ cluster:
    extraArgs:
      # https://kubernetes.io/docs/tasks/extend-kubernetes/configure-aggregation-layer/
      enable-aggregator-routing: true
      # Enable MutatingAdmissionPolicy feature gate (beta in K8s 1.35)
      feature-gates: MutatingAdmissionPolicy=true
      runtime-config: admissionregistration.k8s.io/v1beta1=true
  controllerManager:
    extraArgs:
      bind-address: 0.0.0.0
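A quick way to confirm the patch took effect, once the control-plane nodes have picked up the new kube-apiserver arguments, is to check whether the `v1beta1` MutatingAdmissionPolicy resource is actually served. The following is a minimal sketch, not part of this commit: `map_api_served` is a hypothetical helper name, and it assumes the usual `kubectl api-resources` row layout (resource name first, API version later on the same line).

```shell
#!/bin/sh
# Sketch: read `kubectl api-resources` output on stdin and succeed only if
# the v1beta1 MutatingAdmissionPolicy resource is listed.
# `map_api_served` is a hypothetical helper, not something this repo ships.
map_api_served() {
  # Assumed row shape:
  # mutatingadmissionpolicies  admissionregistration.k8s.io/v1beta1  false  MutatingAdmissionPolicy
  grep -Eq '^mutatingadmissionpolicies[[:space:]].*admissionregistration\.k8s\.io/v1beta1'
}

# Usage against a live cluster (needs kubectl and a working kubeconfig):
#   kubectl api-resources --api-group=admissionregistration.k8s.io | map_api_served \
#     && echo "v1beta1 MutatingAdmissionPolicy is served"
```

If the check fails, the `volsync-mover-jitter` dry-run error quoted in `homelab-c68` is expected to persist, since the Kustomization still references `admissionregistration.k8s.io/v1beta1`.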