Extract the existing-node lookup logic from HandleNodeFromPreAuthKey
into a separate method. This reduces the cyclomatic complexity from
32 to 28, below the gocyclo limit of 30.
Updates #3077
Tagged nodes no longer have user_id set, so ListNodes(user) cannot
find them. Update integration tests to use ListNodes() (all nodes)
when looking up tagged nodes.
Add a findNode helper to locate nodes by predicate from an
unfiltered list, used in ACL tests that have multiple nodes per
scenario.
Updates #3077
Test the tagged-node-survives-user-deletion scenario at two layers:
DB layer (users_test.go):
- success_user_only_has_tagged_nodes: tagged nodes with nil
user_id do not block user deletion and survive it
- error_user_has_tagged_and_owned_nodes: user-owned nodes
still block deletion even when tagged nodes coexist
App layer (grpcv1_test.go):
- TestDeleteUser_TaggedNodeSurvives: full registration flow
with tagged PreAuthKey verifies nil UserID after registration,
absence from nodesByUser index, user deletion succeeds, and
tagged node remains in global node list
Also update auth_tags_test.go assertions to expect nil UserID
on tagged nodes, consistent with the new invariant.
Updates #3077
Tagged nodes are owned by their tags, not a user. Enforce this
invariant at every write path:
- createAndSaveNewNode: do not set UserID for tagged PreAuthKey
registration; clear UserID when advertise-tags are applied
during OIDC/CLI registration
- SetNodeTags: clear UserID/User when tags are assigned
- processReauthTags: clear UserID/User when tags are applied
during re-authentication
- validateNodeOwnership: reject tagged nodes with non-nil UserID
- NodeStore: skip nodesByUser indexing for tagged nodes since
they have no owning user
- HandleNodeFromPreAuthKey: add fallback lookup for tagged PAK
re-registration (tagged nodes indexed under UserID(0)); guard
against nil User deref for tagged nodes in different-user check
Since tagged nodes now have user_id = NULL, ListNodesByUser
will not return them and DestroyUser naturally allows deleting
users whose nodes have all been tagged. The ON DELETE CASCADE
FK cannot reach tagged nodes through a NULL foreign key.
Also tone down shouty comments throughout state.go.
Fixes#3077
Tagged nodes are owned by their tags, not a user. Previously
user_id was kept as "created by" tracking, but this prevents
deleting users whose nodes have all been tagged, and the
ON DELETE CASCADE FK would destroy the tagged nodes.
Add a migration that sets user_id = NULL on all existing tagged
nodes. Subsequent commits enforce this invariant at write time.
Updates #3077
Add --disable flag to "headscale nodes expire" CLI command and
disable_expiry field handling in the gRPC API to allow disabling
key expiry for nodes. When disabled, the node's expiry is set to
NULL and IsExpired() returns false.
The CLI follows the new grpcRunE/RunE/printOutput patterns
introduced in the recent CLI refactor.
Also fix NodeSetExpiry to persist directly to the database instead
of going through persistNodeToDB which omits the expiry field.
Fixes#2681
Co-authored-by: Marco Santos <me@marcopsantos.com>
Add bool disable_expiry field (field 3) to ExpireNodeRequest proto
and regenerate all protobuf, gRPC gateway, and OpenAPI files.
Fixes#2681
Co-authored-by: Marco Santos <me@marcopsantos.com>
Add end-to-end test cases to TestUnmarshalPolicy that verify bracketed
IPv6 addresses are correctly parsed through the full policy pipeline
(JSON unmarshal -> splitDestinationAndPort -> parseAlias -> parsePortRange)
and survive JSON round-trips.
Cover single port, multiple ports, wildcard port, CIDR prefix, port
range, bracketed IPv4, and hostname rejection.
Updates #2754
Headscale rejects IPv6 addresses with square brackets in ACL policy
destinations (e.g. "[fd7a:115c:a1e0::87e1]:80,443"), while Tailscale
SaaS accepts them. The root cause is that splitDestinationAndPort uses
strings.LastIndex(":") which leaves brackets on the destination string,
and netip.ParseAddr does not accept brackets.
Add a bracket-handling branch at the top of splitDestinationAndPort that
uses net.SplitHostPort for RFC 3986 parsing when input starts with "[".
The extracted host is validated with netip.ParseAddr/ParsePrefix to
ensure brackets are only accepted around IP addresses and CIDR prefixes,
not hostnames or other alias types like tags and groups.
Fixes#2754
This just fixes a small issue I noticed reading the docs: the two 'scenarios' listed in the scaling section end up showing up as a numbered list of five items, instead of the desired two items + their descriptions.
Remove dead if-resp-nil checks in tagCmd and approveRoutesCmd; gRPC
returns either an error or a valid response, never (nil, nil).
Rename HasMachineOutputFlag to hasMachineOutputFlag since it has a
single internal caller in root.go.
Add bypassDatabase() to consolidate the repeated LoadServerConfig +
NewHeadscaleDatabase pattern in getPolicy and setPolicy.
Replace os.Open + io.ReadAll with os.ReadFile in setPolicy and
checkPolicy, removing the manual file-handle management.
Move errMissingParameter from users.go to utils.go alongside the
other shared sentinel errors; the variable is referenced by
api_key.go and preauthkeys.go.
Move the Error constant-error type from debug.go to mockoidc.go,
its only consumer.
Replace five hardcoded "2006-01-02 15:04:05" strings with the
HeadscaleDateTimeFormat constant already defined in utils.go.
Replace two literal 10 base arguments to strconv.FormatUint with
util.Base10 to match the convention in api_key.go and nodes.go.
Flags registered on a cobra.Command cannot fail to read at runtime;
GetString/GetUint64/GetStringSlice only error when the flag name is
unknown. The error-handling blocks for these calls are unreachable
dead code.
Adopt the value, _ := pattern already used in api_key.go,
preauthkeys.go and users.go, removing ~40 lines of dead code from
nodes.go and debug.go.
The --namespace flag on nodes list/register and debug create-node was
never wired to the --user flag, so its value was silently ignored.
Remove it along with the deprecateNamespaceMessage constant.
Also remove the namespace/ns command aliases on users and
machine/machines aliases on nodes, which have been deprecated since
the naming changes in 0.23.0.
Add expirationFromFlag helper that parses the --expiration flag into a
timestamppb.Timestamp, replacing identical duration-parsing blocks in
api_key.go and preauthkeys.go.
Add apiKeyIDOrPrefix helper to validate the mutually-exclusive --id and
--prefix flags, replacing the duplicated switch block in expireAPIKeyCmd
and deleteAPIKeyCmd.
Centralise the repeated force-flag-check + YesNo-prompt logic into a
single confirmAction(cmd, prompt) helper. Callers still decide what
to return on decline (error, message, or nil).
Replace three inconsistent MarkFlagRequired error-handling styles
(stdlib log.Fatal, zerolog log.Fatal, silently discarded) with a
single mustMarkRequired helper that panics on programmer error.
Also fixes a bug where renameNodeCmd.MarkFlagRequired("new-name")
targeted the wrong command (should be renameUserCmd), making the
--new-name flag effectively never required on "headscale users rename".
Add a helper that checks the --output flag and either serialises as
JSON/YAML or invokes a table-rendering callback. This removes the
repeated format,_ := cmd.Flags().GetString("output") + if-branch from
the five list commands.
Convert the 10 commands that were still using Run with
ErrorOutput/SuccessOutput or log.Fatal/os.Exit:
- backfillNodeIPsCmd: use grpcRunE-style manual connection with
error returns; simplify the confirm/force logic
- getPolicy, setPolicy, checkPolicy: replace ErrorOutput with
fmt.Errorf returns in both the bypass-gRPC and gRPC paths
- serveCmd, configTestCmd: replace log.Fatal with error returns
- mockOidcCmd: replace log.Error+os.Exit with error return
- versionCmd, generatePrivateKeyCmd: replace SuccessOutput with
printOutput
- dumpConfigCmd: return the error instead of swallowing it
Rename grpcRun to grpcRunE: the inner closure now returns error
and the wrapper returns a cobra RunE-compatible function.
Change newHeadscaleCLIWithConfig to return an error instead of
calling log.Fatal/os.Exit, making connection failures propagate
through the normal error path.
Add formatOutput (returns error) and printOutput (writes to stdout)
as non-exiting replacements for the old output/SuccessOutput pair.
Extract output format string literals into package-level constants.
Mark the old ErrorOutput, SuccessOutput and output helpers as
deprecated; they remain temporarily for the unconverted commands.
Convert all 22 grpcRunE commands from Run+ErrorOutput+SuccessOutput
to RunE+fmt.Errorf+printOutput. Change usernameAndIDFromFlag to
return an error instead of calling ErrorOutput directly.
Update backfillNodeIPsCmd and policy.go callers of
newHeadscaleCLIWithConfig for the new 5-return signature while
keeping their Run-based pattern for now.
Set SilenceErrors and SilenceUsage on the root command so that
cobra never prints usage text for runtime errors. A SetFlagErrorFunc
callback re-enables usage output specifically for flag-parsing
errors (the kubectl pattern).
Add printError to utils.go and switch Execute() to ExecuteC() so
the returned error is formatted as JSON/YAML when --output requests
machine-readable output.
Add a grpcRun helper that wraps cobra RunFuncs, injecting a ready
gRPC client and context. The connection lifecycle (cancel, close)
is managed by the wrapper, eliminating the duplicated 3-line
boilerplate (newHeadscaleCLIWithConfig + defer cancel + defer
conn.Close) from 22 command handlers across 7 files.
Three call sites are intentionally left unconverted:
- backfillNodeIPsCmd: creates the client only after user confirmation
- getPolicy/setPolicy: conditionally use gRPC vs direct DB access
Docker 29 (shipped with runner-images 20260209.23.1) breaks docker
build via Go client libraries (broken pipe writing build context)
and docker load/save with certain tarball formats. Add Docker's
official apt repository and install docker-ce 28.5.x in all CI
jobs that interact with Docker.
See https://github.com/actions/runner-images/issues/13474
Updates #3058
Use new(users["name"]) instead of extracting to intermediate
variables that staticcheck does not recognise as used with
Go 1.26 new(value) syntax.
Updates #3058
Fix issues found by the upgraded golangci-lint:
- wsl_v5: add required whitespace in CLI files
- staticcheck SA4006: replace new(var.Field) with &localVar
pattern since staticcheck does not recognize Go 1.26
new(value) as a use of the variable
- staticcheck SA5011: use t.Fatal instead of t.Error for
nil guard checks so execution stops
- unused: remove dead ptrTo helper function
Use GORM AutoMigrate instead of raw SQL to create the
database_versions table, since PostgreSQL does not support the
datetime type used in the raw SQL (it requires timestamp).
Add a version check that runs before database migrations to ensure
users do not skip minor versions or downgrade. This protects database
migrations and allows future cleanup of old migration code.
Rules enforced:
- Same minor version: always allowed (patch changes either way)
- Single minor upgrade (e.g. 0.27 -> 0.28): allowed
- Multi-minor upgrade (e.g. 0.25 -> 0.28): blocked with guidance
- Any minor downgrade: blocked
- Major version change: blocked
- Dev builds: warn but allow, preserve stored version
The version is stored in a purpose-built database_versions table
after migrations succeed. The table is created with raw SQL before
gormigrate runs to avoid circular dependencies.
Updates #3058
Replace tiangolo/issue-manager with custom logic that distinguishes
bot comments from human responses. The issue-manager action treated
all comments equally, so the bot's own instruction comment would
trigger label removal on the next scheduled run.
Split into two jobs:
- remove-label-on-response: triggers on issue_comment from non-bot
users, removes the needs-more-info label immediately
- close-stale: runs on daily schedule, uses nushell to iterate open
needs-more-info issues, checks for human comments after the label
was added, and closes after 3 days with no response
Remove the `issues: labeled` trigger from the timer workflow.
When both workflows triggered on label addition, the comment workflow
would post the bot comment, and by the time the timer workflow ran,
issue-manager would see "a comment was added after the label" and
immediately remove the label due to `remove_label_on_comment: true`.
The timer workflow now only runs on:
- Daily cron (to close stale issues)
- issue_comment (to remove label when humans respond)
- workflow_dispatch (for manual testing)
Add workflow that automatically closes issues labeled as
support-request with a message directing users to Discord
for configuration and support questions.
The workflow:
- Triggers when support-request label is added
- Posts a comment explaining this tracker is for bugs/features
- Links to documentation and Discord
- Closes the issue as "not planned"