Vitest Pros:
* Automated failing test comments on github PRs
* A nice local UI with incremental testing when changing files (`yarn
test:ui`)
* Also nicely supported in all major IDEs, click to run test works (so
we won't miss what we had with jest).
* Works well with ESM
Vitest Cons:
* The ESBuild transformer vitest uses takes a little longer to transform
than our current SWC/jest setup, however, it is possible to setup SWC as
the transformer for vitest as well (though it only does one transform,
so we're paying ~7-10 seconds instead of ~ 2-3 seconds in transform
phase).
* Exposes how slow our tests are (tongue in cheek here)
We're migrating to ESM, which will allow us to import the latest
versions of our dependencies.
Co-Authored-By: Christopher Kolstad <chriswk@getunleash.io>
https://linear.app/unleash/issue/2-3406/hold-unknown-flags-in-memory-and-show-them-in-the-ui-somehow
This PR introduces a suggestion for a “unknown flags” feature.
When clients report metrics for flags that don’t exist in Unleash (e.g.
due to typos), we now track a limited set of these unknown flag names
along with the appnames that reported them. The goal is to help users
identify and clean up incorrect flag usage across their apps.
We store up to 10 unknown flag + appName combinations, keeping only the
most recent reports. Data is collected in-memory and flushed
periodically to the DB, with deduplication and merging to ensure we
don’t exceed the cap even across pods.
We were especially careful to make this implementation defensive, as
unknown flags could be reported in very high volumes. Writes are
batched, deduplicated, and hard-capped to avoid DB pressure.
No UI has been added yet — this is backend-only for now and intended as
a step toward better visibility into client misconfigurations.
I would suggest starting with a simple banner that opens a dialog
showing the list of unknown flags and which apps reported them.
<img width="497" alt="image"
src="https://github.com/user-attachments/assets/b7348e0d-0163-4be4-a7f8-c072e8464331"
/>
## About the changes
Builds on top of #8766 to use memoized results from stats-service.
Because stats service depends on version service, and to avoid making
the version service depend on stats service creating a cyclic
dependency. I've introduced a telemetry data provider. It's not clean
code, but it does the job.
After validating this works as expected I'll clean up
Added an e2e test validating that the replacement is correct:
[8475492](847549234c)
and it did:
https://github.com/Unleash/unleash/actions/runs/11861854341/job/33060032638?pr=8776#step:9:294
Finally, cleaning up version service
## About the changes
We have many aggregation queries that run on a schedule:
f63496d47f/src/lib/metrics.ts (L714-L719)
These staticCounters are usually doing db query aggregations that
traverse tables and we run all of them in parallel:
f63496d47f/src/lib/metrics.ts (L410-L412)
This can add strain to the db. This PR suggests a way of handling these
queries in a more structured way, allowing us to run them sequentially
(therefore spreading the load):
f02fe87835/src/lib/metrics-gauge.ts (L38-L40)
As an additional benefit, we get both the gauge definition and the
queries in a single place:
f02fe87835/src/lib/metrics.ts (L131-L141)
This PR only tackles 1 metric, and it only focuses on gauges to gather
initial feedback. The plan is to migrate these metrics and eventually
incorporate more types (e.g. counters)
---------
Co-authored-by: Nuno Góis <github@nunogois.com>
## About the changes
This PR removes the feature flag `queryMissingTokens` that was fully
rolled out.
It introduces a new way of checking edgeValidTokens controlled by the
flag `checkEdgeValidTokensFromCache` that relies in the cached data but
hits the DB if needed.
The assumption is that most of the times edge will find tokens in the
cache, except for a few cases in which a new token is queried. From all
tokens we expect at most one to hit the DB and in this case querying a
single token should be better than querying all the tokens.
I've tried to use/add the audit info to all events I could see/find.
This makes this PR necessarily huge, because we do store quite a few
events.
I realise it might not be complete yet, but tests
run green, and I think we now have a pattern to follow for other events.
## About the changes
- Removes the feature flag for the created_by migrations.
- Adds a configuration option in IServerOption for
`ENABLE_SCHEDULED_CREATED_BY_MIGRATION` that defaults to `false`
- the new configuration option when set on startup enables scheduling of
the two created_by migration services (features+events)
- Removes the dependency on flag provider in EventStore as it's no
longer needed
- Adds a brief description of the new configuration option in
`configuring-unleash.md`
- Sets the events created_by migration interval to 15 minutes, up from
2.
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
## About the changes
This PR provides a service that allows a scheduled function to run in a
single instance. It's currently not in use but tests show how to wrap a
function to make it single-instance:
65b7080e05/src/lib/features/scheduler/job-service.test.ts (L26-L32)
The key `'test'` is used to identify the group and most likely should
have the same name as the scheduled job.
---------
Co-authored-by: Christopher Kolstad <chriswk@getunleash.io>
## About the changes
App stats is mainly used to cap the number of applications reported to
Unleash based on the last 7 days information:
cc2ccb1134/src/lib/middleware/response-time-metrics.ts (L24-L28)
Instead of getting all stats, just calculate appCount statistics
Use scheduler service instead of setInterval
When a stop signal is sent to Unleash the scheduler-service should
cancel any scheduled jobs. This also applies to the job scheduled for
initial execution with jitter.
We observed that initial jobs was executed after the database
connections are terminated. This appears after v5.9.0 of Unleash.
```
Error: aborted
at Object.queryBuilder (/unleash/node_modules/knex/lib/knex-builder/make-knex.js:112:26)
at createQueryBuilder (/unleash/node_modules/knex/lib/knex-builder/make-knex.js:320:26)
at EventStore.knex [as db] (/unleash/node_modules/knex/lib/knex-builder/make-knex.js:101:12)
at EventStore.setUnannouncedToAnnounced (/unleash/node_modules/unleash-server/dist/lib/features/events/event-store.js:286:33)
at EventStore.publishUnannouncedEvents (/unleash/node_modules/unleash-server/dist/lib/features/events/event-store.js:293:35)
at EventAnnouncer.publishUnannouncedEvents (/unleash/node_modules/unleash-server/dist/lib/services/event-announcer-service.js:9:32)
at runScheduledFunctionWithEvent (/unleash/node_modules/unleash-server/dist/lib/features/scheduler/scheduler-service.js:30:23)
at Timeout.<anonymous> (/unleash/node_modules/unleash-server/dist/lib/features/scheduler/scheduler-service.js:50:27)
at runNextTicks (node:internal/process/task_queues:60:5)
at process.processTimers (node:internal/timers:509:9)
```
## About the changes
Schedules a best-effort task setting the value of
events.created_by_user_id based on what is found in the created_by
column and if it's capable of resolving that to a userid/a system id.
The process is executed in the events-store, it takes a chunk of events
that haven't been processed yet, attempts to join users and api_tokens
tables on created_by = username/email, loops through and tries to figure
out an id to set. Then updates the record.
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
## About the changes
Adds a scheduled task that every 5 seconds updates 500 entries in the
features table setting `created_by_user_id`.
It does this by looking at the related event, checks created_by and
joins users table for match on username or email, and joins api_tokens
table on username matches. Then picks either a users id if set, or uses
-42 (admin token user)
So, this was causing a lot of ERROR in our logs, due to the metric
having gotten an extra label the last month.
Two things for this fix.
1. add the missing label to the two calls that did not have it added
2. update the log line to include the error as another argument to the
logger, so we actually get a stacktrace from the error.
### What
Adds `createdByUserId` to all events exposed by unleash. In addition
this PR updates all tests and usages of the methods in this codebase to
include the required number.
## PR Description
https://linear.app/unleash/issue/2-1645/address-post-mortem-action-point-all-flags-should-be-runtime
Refactor with the goal of ensuring that flags are runtime controllable,
mostly focused on the current scheduler logic.
This includes the following changes:
- Moves scheduler into its own "scheduler" feature folder
- Reverts dependency: SchedulerService takes in the MaintenanceService,
not the other way around
- Scheduler now evaluates maintenance mode at runtime instead of relying
only on its mode state (active / paused)
- Favors flag checks to happen inside the scheduled methods, instead of
controlling whether the method is scheduled at all (favor runtime over
startup)
- Moves "account last seen update" to scheduler
- Updates tests accordingly
- Boyscouting
Here's a manual test showing this behavior, where my local instance was
controlled by a remote instance. Whenever I toggle `maintenanceMode`
through a flag remotely, my scheduled functions stop running:
https://github.com/Unleash/unleash/assets/14320932/ae0a7fa9-5165-4c0b-9b0b-53b9fb20de72
Had a look through all of our current flags and it *seems to me* that
they are all used in a runtime controllable way, but would still feel
more comfortable if this was double checked, since it can be complex to
ensure this.
The only exception to this was `migrationLock`, which I believe is OK,
since the migration only happens at the start anyways.
## Discussion / Questions
~~Scheduler `mode` (active / paused) is currently not *really* being
used, along with its respective methods, except in tests. I think this
could be a potential footgun. Should we remove it in favor of only
controlling the scheduler state through maintenance mode?~~ Addressed in
7c52e3f638
~~The config property `disableScheduler` is still a startup
configuration, but perhaps that makes sense to leave as is?~~
[Answered](https://github.com/Unleash/unleash/pull/5363#issuecomment-1819005445)
by @FredrikOseberg, leaving as is.
Are there any other tests we should add?
Is there anything I missed?
Identified some `setInterval` and `setTimeout` that may make sense to
leave as is instead of moving over to the scheduler service:
- ~~`src/lib/metrics` - This is currently considered a `MetricsMonitor`.
Should this be refactored to a service instead and adapt these
setIntervals to use the scheduler instead? Is there anything special
with this we need to take into account? @chriswk @ivarconr~~
[Answered](https://github.com/Unleash/unleash/pull/5363#issuecomment-1820501511)
by @ivarconr, leaving as is.
- ~~`src/lib/proxy/proxy-repository.ts` - This seems to have a complex
and specific logic currently. Perhaps we should leave it alone for now?
@FredrikOseberg~~
[Answered](https://github.com/Unleash/unleash/pull/5363#issuecomment-1819005445)
by @FredrikOseberg, leaving as is.
- `src/lib/services/user-service.ts` - This one also seems to be a bit
more specific, where we generate new timeouts for each receiver id.
Might not belong in the scheduler service. @Tymek