1
0
mirror of https://github.com/juanfont/headscale.git synced 2025-10-23 11:19:19 +02:00
Commit Graph

40 Commits

Author SHA1 Message Date
Kristoffer Dalby
fddc7117e4
stability and race conditions in auth and node store (#2781)
This PR addresses some consistency issues that was introduced or discovered with the nodestore.

nodestore:
Now returns the node that is being put or updated when it is finished. This closes a race condition where when we read it back, we do not necessarily get the node with the given change and it ensures we get all the other updates from that batch write.

auth:
Authentication paths have been unified and simplified. It removes a lot of bad branches and ensures we only do the minimal work.
A comprehensive auth test set has been created so we do not have to run integration tests to validate auth and it has allowed us to generate test cases for all the branches we currently know of.

integration:
added a lot more tooling and checks to validate that nodes reach the expected state when they come up and down. Standardised between the different auth models. A lot of this is to support or detect issues in the changes to nodestore (races) and auth (inconsistencies after login and reaching correct state)

This PR was assisted, particularly tests, by claude code.
2025-10-16 12:17:43 +02:00
Andrey Bobelev
c4a8c038cd fix: return valid AuthUrl in followup request on expired reg id
- tailscale client gets a new AuthUrl and sets entry in the regcache
- regcache entry expires
- client doesn't know about that
- client always polls followup request а gets error

When user clicks "Login" in the app (after cache expiry), they visit
invalid URL and get "node not found in registration cache". Some clients
on Windows for e.g. can't get a new AuthUrl without restart the app.

To fix that we can issue a new reg id and return user a new valid
AuthUrl.

RegisterNode is refactored to be created with NewRegisterNode() to
autocreate channel and other stuff.
2025-10-11 05:57:39 +02:00
Kristoffer Dalby
9d236571f4 state/nodestore: in memory representation of nodes
Initial work on a nodestore which stores all of the nodes
and their relations in memory with relationship for peers
precalculated.

It is a copy-on-write structure, replacing the "snapshot"
when a change to the structure occurs. It is optimised for reads,
and while batches are not fast, they are grouped together
to do less of the expensive peer calculation if there are many
changes rapidly.

Writes will block until commited, while reads are never
blocked.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-09-09 09:40:00 +02:00
Florian Preinstorfer
30a1f7e68e Log registrationID to simplify interactive node registration
Some clients such as Android make it hard to transfer the registrationID
to the server, its easier to get it from the server logs.
2025-08-15 17:11:38 +02:00
Kristoffer Dalby
a058bf3cd3
mapper: produce map before poll (#2628) 2025-07-28 11:15:53 +02:00
Kristoffer Dalby
c6d7b512bd
integration: replace time.Sleep with assert.EventuallyWithT (#2680) 2025-07-10 23:38:55 +02:00
Kristoffer Dalby
b904276f2b poll: use nodeview everywhere
There was a bug in HA subnet router handover where we used stale node data
from the longpoll session that we handed to Connect. This meant that we got
some odd behaviour where routes would not be deactivated correctly.

This commit changes to the nodeview is used through out, and we load the
current node to be updated in the write path and then handle it all there
to be consistent.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-07-08 21:05:15 +02:00
Kristoffer Dalby
1553f0ab53 state: introduce state
this commit moves all of the read and write logic, and all different parts
of headscale that manages some sort of persistent and in memory state into
a separate package.

The goal of this is to clearly define the boundry between parts of the app
which accesses and modifies data, and where it happens. Previously, different
state (routes, policy, db and so on) was used directly, and sometime passed to
functions as pointers.

Now all access has to go through state. In the initial implementation,
most of the same functions exists and have just been moved. In the future
centralising this will allow us to optimise bottle necks with the database
(in memory state) and make the different parts talking to eachother do so
in the same way across headscale components.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-06-24 07:58:54 +02:00
Kristoffer Dalby
eb1ecefd9e
auth: ensure that routes are autoapproved when the node is stored (#2550)
* integration: ensure route is set before node joins, reproduce

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* auth: ensure that routes are autoapproved when the node is stored

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-05-01 07:05:42 +02:00
Kristoffer Dalby
5a18e91317
fix auto approver on register and new policy (#2506)
* fix issue auto approve route on register bug

This commit fixes an issue where routes where not approved
on a node during registration. This cause the auto approval
to require the node to readvertise the routes.

Fixes #2497
Fixes #2485

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* hsic: only set db policy if exist

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy: calculate changed based on policy and filter

v1 is a bit simpler than v2, it does not pre calculate the auto approver map
and we cannot tell if it is changed.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-03-31 15:55:07 +02:00
Kristoffer Dalby
7891378f57
Redo route code (#2422)
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-02-26 16:22:55 +01:00
Kristoffer Dalby
16868190c8
fix double login URL with OIDC (#2445)
* factor out login url parser

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* move to not trigger test gen checker

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* return regresp or err after waiting for registration

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* update changelog

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-02-25 18:16:07 +01:00
Kristoffer Dalby
1f0110fe06
use helper function for constructing state updates (#2410)
This helps preventing messages being sent with the wrong update type
and payload combination, and it is shorter/neater.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-02-07 13:49:59 +01:00
Kristoffer Dalby
45752db0f6
Return better web errors to the user (#2398)
* add dedicated http error to propagate to user

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* classify user errors in http handlers

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* move validation of pre auth key out of db

This move separates the logic a bit and allow us to
write specific errors for the caller, in this case the web
layer so we can present the user with the correct error
codes without bleeding web stuff into a generic validate.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* update changelog

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-02-01 15:25:18 +01:00
Kristoffer Dalby
d57a55c024
Rewrite authentication flow (#2374) 2025-02-01 09:16:51 +00:00
Kristoffer Dalby
4c8e847f47
use dedicated registration ID for auth flow (#2337) 2025-01-26 22:20:11 +01:00
Kristoffer Dalby
380fcdba17
Add worker reading extra_records_path from file (#2271)
* consolidate scheduled tasks into one goroutine

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* rename Tailcfg dns struct

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* add dns.extra_records_path option

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* prettier lint

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* go-fmt

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-12-13 07:52:40 +00:00
Kristoffer Dalby
f7b0cbbbea
wrap policy in policy manager interface (#2255)
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-11-26 15:16:06 +01:00
Kristoffer Dalby
218138afee
Redo OIDC configuration (#2020)
expand user, add claims to user

This commit expands the user table with additional fields that
can be retrieved from OIDC providers (and other places) and
uses this data in various tailscale response objects if it is
available.

This is the beginning of implementing
https://docs.google.com/document/d/1X85PMxIaVWDF6T_UPji3OeeUqVBcGj_uHRM5CI-AwlY/edit
trying to make OIDC more coherant and maintainable in addition
to giving the user a better experience and integration with a
provider.

remove usernames in magic dns, normalisation of emails

this commit removes the option to have usernames as part of MagicDNS
domains and headscale will now align with Tailscale, where there is a
root domain, and the machine name.

In addition, the various normalisation functions for dns names has been
made lighter not caring about username and special character that wont
occur.

Email are no longer normalised as part of the policy processing.

untagle oidc and regcache, use typed cache

This commits stops reusing the registration cache for oidc
purposes and switches the cache to be types and not use any
allowing the removal of a bunch of casting.

try to make reauth/register branches clearer in oidc

Currently there was a function that did a bunch of stuff,
finding the machine key, trying to find the node, reauthing
the node, returning some status, and it was called validate
which was very confusing.

This commit tries to split this into what to do if the node
exists, if it needs to register etc.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-10-02 14:50:17 +02:00
Kristoffer Dalby
064c46f2a5
move logic for validating node names (#2127)
* move logic for validating node names

this commits moves the generation of "given names" of nodes
into the registration function, and adds validation of renames
to RenameNode using the same logic.

Fixes #2121

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* fix double arg

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-11 18:27:49 +02:00
greizgh
8571513e3c
reformat code (#2019)
* reformat code

This is mostly an automated change with `make lint`.
I had to manually please golangci-lint in routes_test because of a short
variable name.

* fix start -> strategy which was wrongly corrected by linter
2024-07-22 08:56:00 +02:00
Kristoffer Dalby
7e62031444
replace ephemeral deletion logic (#2008)
* replace ephemeral deletion logic

this commit replaces the way we remove ephemeral nodes,
currently they are deleted in a loop and we look at last seen
time. This time is now only set when a node disconnects and
there was a bug (#2006) where nodes that had never disconnected
was deleted since they did not have a last seen.

The new logic will start an expiry timer when the node disconnects
and delete the node from the database when the timer is up.

If the node reconnects within the expiry, the timer is cancelled.

Fixes #2006

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* use uint64 as authekyid and ptr helper in tests

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* add test db helper

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* add list ephemeral node func

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* schedule ephemeral nodes for removal on startup

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* fix gorm query for postgres

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* add godoc

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-07-18 10:01:59 +02:00
Kristoffer Dalby
5ad0aa44cb
update tailscale go dep (#1948)
* update tailscale go dep

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* update gorm go dep

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* update grpc go dep

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* update golang.org go dep

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* update rest of go dep

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-05-17 08:58:33 -04:00
MichaelKo
7fd2485000
Restore foreign keys and add constraints (#1562)
* fix #1482, restore foregin keys, add constraints

* #1562, fix tests, fix formatting

* #1562: fix tests

* #1562: fix local run of test_integration
2024-05-15 20:40:14 -04:00
Kristoffer Dalby
1c6bfc503c
fix preauth key logging in as previous user (#1920)
* add test case to reproduce #1885

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* fix preauth key issue logging in as wrong user

Fixes #1885

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* add test to gh

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-05-02 11:53:16 +02:00
Kristoffer Dalby
ba614a5e6c
metrics, tuning in tests, db cleanups, fix concurrency issue (#1895) 2024-04-21 18:28:17 +02:00
Kristoffer Dalby
2ce23df45a
Migrate IP fields in database to dedicated columns (#1869) 2024-04-17 07:03:06 +02:00
Kristoffer Dalby
58c94d2bd3 Rework map session
This commit restructures the map session in to a struct
holding the state of what is needed during its lifetime.

For streaming sessions, the event loop is structured a
bit differently not hammering the clients with updates
but rather batching them over a short, configurable time
which should significantly improve cpu usage, and potentially
flakyness.

The use of Patch updates has been dialed back a little as
it does not look like its a 100% ready for prime time. Nodes
are now updated with full changes, except for a few things
like online status.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-04-15 12:31:53 +02:00
Kristoffer Dalby
384ca03208
new IP allocator and add postgres to integration tests. (#1756) 2024-02-18 19:31:29 +01:00
Kristoffer Dalby
68a8ecee7a
Prepare notify channel before sending first update (#1730)
* create channel before sending first update

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* do not notify on register, wait for connect

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-02-12 09:11:17 +01:00
Kristoffer Dalby
83769ba715
Replace database locks with transactions (#1701)
This commits removes the locks used to guard data integrity for the
database and replaces them with Transactions, turns out that SQL had
a way to deal with this all along.

This reduces the complexity we had with multiple locks that might stack
or recurse (database, nofitifer, mapper). All notifications and state
updates are now triggered _after_ a database change.


Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-02-08 17:28:19 +01:00
DeveloperDragon
cbf57e27a7
Login with OIDC after having been logged out (#1719) 2024-02-05 10:45:35 +01:00
Kristoffer Dalby
f65f4eca35
ensure online status and route changes are propagated (#1564) 2023-12-09 18:09:24 +01:00
Kristoffer Dalby
a59aab2081
Remove support for non-noise clients (pre-1.32) (#1611) 2023-11-23 08:31:33 +01:00
Kristoffer Dalby
ed4e19996b
Use tailscale key types instead of strings (#1609)
* upgrade tailscale

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* make Node object use actualy tailscale key types

This commit changes the Node struct to have both a field for strings
to store the keys in the database and a dedicated Key for each type
of key.

The keys are populated and stored with Gorm hooks to ensure the data
is stored in the db.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* use key types throughout the code

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* make sure machinekey is concistently used

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* use machine key in auth url

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* fix web register

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* use key type in notifier

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* fix relogin with webauth

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-11-19 22:37:04 +01:00
Kristoffer Dalby
c0fd06e3f5
remove the use key stripping and store the proper keys (#1603) 2023-11-16 17:55:29 +01:00
Juan Font
0030af3fa4
Rename Machine to Node (#1553) 2023-09-24 06:42:05 -05:00
Kristoffer Dalby
13a7285658 fix lint
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-09-19 10:20:21 -05:00
Kristoffer Dalby
eff529f2c5 introduce rw lock for db, ish...
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-09-19 10:20:21 -05:00
Kristoffer Dalby
0562260fe0 rename handler files
This commit renames a bunch of files to try to make it a bit less confusing;

protocol_ is now auth as they contained registration, auth and login/out flow
protocol_.*_poll is now poll.go
api.go and other generic handlers are now handlers.go

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-06-08 16:34:15 +02:00