## About the changes
This PR removes the feature flag `queryMissingTokens` that was fully
rolled out.
It introduces a new way of checking edgeValidTokens controlled by the
flag `checkEdgeValidTokensFromCache` that relies in the cached data but
hits the DB if needed.
The assumption is that most of the times edge will find tokens in the
cache, except for a few cases in which a new token is queried. From all
tokens we expect at most one to hit the DB and in this case querying a
single token should be better than querying all the tokens.
I've tried to use/add the audit info to all events I could see/find.
This makes this PR necessarily huge, because we do store quite a few
events.
I realise it might not be complete yet, but tests
run green, and I think we now have a pattern to follow for other events.
## About the changes
- Removes the feature flag for the created_by migrations.
- Adds a configuration option in IServerOption for
`ENABLE_SCHEDULED_CREATED_BY_MIGRATION` that defaults to `false`
- the new configuration option when set on startup enables scheduling of
the two created_by migration services (features+events)
- Removes the dependency on flag provider in EventStore as it's no
longer needed
- Adds a brief description of the new configuration option in
`configuring-unleash.md`
- Sets the events created_by migration interval to 15 minutes, up from
2.
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
## About the changes
This PR provides a service that allows a scheduled function to run in a
single instance. It's currently not in use but tests show how to wrap a
function to make it single-instance:
65b7080e05/src/lib/features/scheduler/job-service.test.ts (L26-L32)
The key `'test'` is used to identify the group and most likely should
have the same name as the scheduled job.
---------
Co-authored-by: Christopher Kolstad <chriswk@getunleash.io>
## About the changes
App stats is mainly used to cap the number of applications reported to
Unleash based on the last 7 days information:
cc2ccb1134/src/lib/middleware/response-time-metrics.ts (L24-L28)
Instead of getting all stats, just calculate appCount statistics
Use scheduler service instead of setInterval
When a stop signal is sent to Unleash the scheduler-service should
cancel any scheduled jobs. This also applies to the job scheduled for
initial execution with jitter.
We observed that initial jobs was executed after the database
connections are terminated. This appears after v5.9.0 of Unleash.
```
Error: aborted
at Object.queryBuilder (/unleash/node_modules/knex/lib/knex-builder/make-knex.js:112:26)
at createQueryBuilder (/unleash/node_modules/knex/lib/knex-builder/make-knex.js:320:26)
at EventStore.knex [as db] (/unleash/node_modules/knex/lib/knex-builder/make-knex.js:101:12)
at EventStore.setUnannouncedToAnnounced (/unleash/node_modules/unleash-server/dist/lib/features/events/event-store.js:286:33)
at EventStore.publishUnannouncedEvents (/unleash/node_modules/unleash-server/dist/lib/features/events/event-store.js:293:35)
at EventAnnouncer.publishUnannouncedEvents (/unleash/node_modules/unleash-server/dist/lib/services/event-announcer-service.js:9:32)
at runScheduledFunctionWithEvent (/unleash/node_modules/unleash-server/dist/lib/features/scheduler/scheduler-service.js:30:23)
at Timeout.<anonymous> (/unleash/node_modules/unleash-server/dist/lib/features/scheduler/scheduler-service.js:50:27)
at runNextTicks (node:internal/process/task_queues:60:5)
at process.processTimers (node:internal/timers:509:9)
```
## About the changes
Schedules a best-effort task setting the value of
events.created_by_user_id based on what is found in the created_by
column and if it's capable of resolving that to a userid/a system id.
The process is executed in the events-store, it takes a chunk of events
that haven't been processed yet, attempts to join users and api_tokens
tables on created_by = username/email, loops through and tries to figure
out an id to set. Then updates the record.
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
## About the changes
Adds a scheduled task that every 5 seconds updates 500 entries in the
features table setting `created_by_user_id`.
It does this by looking at the related event, checks created_by and
joins users table for match on username or email, and joins api_tokens
table on username matches. Then picks either a users id if set, or uses
-42 (admin token user)
So, this was causing a lot of ERROR in our logs, due to the metric
having gotten an extra label the last month.
Two things for this fix.
1. add the missing label to the two calls that did not have it added
2. update the log line to include the error as another argument to the
logger, so we actually get a stacktrace from the error.
### What
Adds `createdByUserId` to all events exposed by unleash. In addition
this PR updates all tests and usages of the methods in this codebase to
include the required number.
## PR Description
https://linear.app/unleash/issue/2-1645/address-post-mortem-action-point-all-flags-should-be-runtime
Refactor with the goal of ensuring that flags are runtime controllable,
mostly focused on the current scheduler logic.
This includes the following changes:
- Moves scheduler into its own "scheduler" feature folder
- Reverts dependency: SchedulerService takes in the MaintenanceService,
not the other way around
- Scheduler now evaluates maintenance mode at runtime instead of relying
only on its mode state (active / paused)
- Favors flag checks to happen inside the scheduled methods, instead of
controlling whether the method is scheduled at all (favor runtime over
startup)
- Moves "account last seen update" to scheduler
- Updates tests accordingly
- Boyscouting
Here's a manual test showing this behavior, where my local instance was
controlled by a remote instance. Whenever I toggle `maintenanceMode`
through a flag remotely, my scheduled functions stop running:
https://github.com/Unleash/unleash/assets/14320932/ae0a7fa9-5165-4c0b-9b0b-53b9fb20de72
Had a look through all of our current flags and it *seems to me* that
they are all used in a runtime controllable way, but would still feel
more comfortable if this was double checked, since it can be complex to
ensure this.
The only exception to this was `migrationLock`, which I believe is OK,
since the migration only happens at the start anyways.
## Discussion / Questions
~~Scheduler `mode` (active / paused) is currently not *really* being
used, along with its respective methods, except in tests. I think this
could be a potential footgun. Should we remove it in favor of only
controlling the scheduler state through maintenance mode?~~ Addressed in
7c52e3f638
~~The config property `disableScheduler` is still a startup
configuration, but perhaps that makes sense to leave as is?~~
[Answered](https://github.com/Unleash/unleash/pull/5363#issuecomment-1819005445)
by @FredrikOseberg, leaving as is.
Are there any other tests we should add?
Is there anything I missed?
Identified some `setInterval` and `setTimeout` that may make sense to
leave as is instead of moving over to the scheduler service:
- ~~`src/lib/metrics` - This is currently considered a `MetricsMonitor`.
Should this be refactored to a service instead and adapt these
setIntervals to use the scheduler instead? Is there anything special
with this we need to take into account? @chriswk @ivarconr~~
[Answered](https://github.com/Unleash/unleash/pull/5363#issuecomment-1820501511)
by @ivarconr, leaving as is.
- ~~`src/lib/proxy/proxy-repository.ts` - This seems to have a complex
and specific logic currently. Perhaps we should leave it alone for now?
@FredrikOseberg~~
[Answered](https://github.com/Unleash/unleash/pull/5363#issuecomment-1819005445)
by @FredrikOseberg, leaving as is.
- `src/lib/services/user-service.ts` - This one also seems to be a bit
more specific, where we generate new timeouts for each receiver id.
Might not belong in the scheduler service. @Tymek