This PR adds an endpoint to Unleash that accepts an error message and an
optional error stack and logs it as an error. This allows us to leverage
error-log observability to catch UI errors consistently.
I considered adding a test, but this endpoint only accepts and logs input, so
I'm not sure how useful it would be.
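A minimal sketch of what such an endpoint could look like, assuming an Express router; the route path, payload fields, and logger wiring are illustrative rather than the exact implementation:

```ts
import { Router, type Request, type Response } from 'express';

// Hypothetical route: accept a UI error report and log it server-side so it
// shows up in our error-log observability.
export const createClientErrorRouter = (logger: { error: (msg: string) => void }): Router => {
    const router = Router();

    router.post('/record-ui-error', (req: Request, res: Response) => {
        const { errorMessage, errorStack } = req.body ?? {};
        logger.error(`UI error: ${errorMessage}\n${errorStack ?? ''}`);
        res.status(204).end();
    });

    return router;
};
```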
## About the changes
App stats is mainly used to cap the number of applications reported to
Unleash based on the last 7 days of information:
cc2ccb1134/src/lib/middleware/response-time-metrics.ts (L24-L28)
- Instead of getting all stats, just calculate the appCount statistics.
- Use the scheduler service instead of setInterval (see the sketch below).
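A minimal sketch of the setInterval-to-scheduler change; the service shape and method names are illustrative, not the exact API:

```ts
import { minutesToMilliseconds } from 'date-fns';

// Hypothetical shape; the real scheduler-service API may differ.
interface SchedulerService {
    schedule(job: () => Promise<void>, intervalMs: number, id: string): void;
}

export function scheduleAppCountRefresh(
    scheduler: SchedulerService,
    refreshAppCountSnapshot: () => Promise<void>,
): void {
    // Before: setInterval(() => refreshAppCountSnapshot(), minutesToMilliseconds(5));
    // After: a named job owned by the scheduler service, cancellable on shutdown.
    scheduler.schedule(
        refreshAppCountSnapshot,
        minutesToMilliseconds(5),
        'refresh-app-count-snapshot',
    );
}
```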
This PR updates the change request email sending method to handle the
recent changes we have made. That means that the email now:
- says that change requests have been suspended instead of saying that the
application will fail.
- handles cases where segments or strategies have been updated, causing
potential conflicts.
I have updated the email templates and made some adjustments to the
email sending method. To make the transition from one to the other
easier, I have kept the original method as an interim solution until
enterprise has switched over.
## About the changes
getAllActive from the api-tokens store is the second most frequent query:
![image](https://github.com/Unleash/unleash/assets/455064/63c5ae76-bb62-41b2-95b4-82aca59a7c16)
To prevent starving our db connections, we can cache this data, which rarely
changes, and clear the cache when we see changes. Because we will only clear
the cache in the node receiving the change, we cache the data for just 1
minute.
This should give us some room to test whether this solution will work.
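A minimal sketch of the caching idea, not the exact store implementation; the method and type shapes are illustrative:

```ts
const CACHE_TTL_MS = 60_000; // 1 minute

let cachedTokens: unknown[] | undefined;
let cachedAt = 0;

// Serve getAllActive() from a short-lived cache to spare db connections.
async function getAllActiveCached(store: {
    getAllActive: () => Promise<unknown[]>;
}): Promise<unknown[]> {
    const now = Date.now();
    if (cachedTokens && now - cachedAt < CACHE_TTL_MS) {
        return cachedTokens;
    }
    cachedTokens = await store.getAllActive();
    cachedAt = now;
    return cachedTokens;
}

// Called whenever this node creates, updates, or deletes a token; other nodes
// rely on the 1 minute TTL to pick up the change.
function invalidateTokenCache(): void {
    cachedTokens = undefined;
}
```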
---------
Co-authored-by: Nuno Góis <github@nunogois.com>
Adds a new Inactive Users list component to admin/users for easier cleanup of users that are counted as inactive: no sign of activity (logins or API token usage) in the last 180 days.
---------
Co-authored-by: David Leek <david@getunleash.io>
In the beginning we used process.nextTick() as a trick to load some data
initially in the constructor of a service.
This is a bad pattern and we should generally avoid any async operations
in the constructor. Today we have two alternatives (sketched below):
1. Defer loading until the data is needed (wrap it in async)
2. Use the scheduler service.
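A minimal sketch of the first alternative, deferring the load until the data is first requested; the class shape is illustrative:

```ts
// Hypothetical read model: instead of kicking off an async load from the
// constructor via process.nextTick(), load lazily on first use.
class CachedReadModel<T> {
    private loaded: Promise<T[]> | undefined;

    constructor(private readonly loadAll: () => Promise<T[]>) {
        // No async work in the constructor.
    }

    async getAll(): Promise<T[]> {
        // Start the load on first access and reuse the same promise afterwards.
        if (!this.loaded) {
            this.loaded = this.loadAll();
        }
        return this.loaded;
    }
}
```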
When a stop signal is sent to Unleash, the scheduler service should
cancel any scheduled jobs. This also applies to the job scheduled for
initial execution with jitter.
We observed that initial jobs were executed after the database
connections had been terminated. This appeared after v5.9.0 of Unleash.
```
Error: aborted
at Object.queryBuilder (/unleash/node_modules/knex/lib/knex-builder/make-knex.js:112:26)
at createQueryBuilder (/unleash/node_modules/knex/lib/knex-builder/make-knex.js:320:26)
at EventStore.knex [as db] (/unleash/node_modules/knex/lib/knex-builder/make-knex.js:101:12)
at EventStore.setUnannouncedToAnnounced (/unleash/node_modules/unleash-server/dist/lib/features/events/event-store.js:286:33)
at EventStore.publishUnannouncedEvents (/unleash/node_modules/unleash-server/dist/lib/features/events/event-store.js:293:35)
at EventAnnouncer.publishUnannouncedEvents (/unleash/node_modules/unleash-server/dist/lib/services/event-announcer-service.js:9:32)
at runScheduledFunctionWithEvent (/unleash/node_modules/unleash-server/dist/lib/features/scheduler/scheduler-service.js:30:23)
at Timeout.<anonymous> (/unleash/node_modules/unleash-server/dist/lib/features/scheduler/scheduler-service.js:50:27)
at runNextTicks (node:internal/process/task_queues:60:5)
at process.processTimers (node:internal/timers:509:9)
```
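A minimal sketch of the cancellation idea, assuming the scheduler keeps references to the timers it creates; the real scheduler-service implementation may differ:

```ts
// Hypothetical scheduler: track both the jittered initial timeout and the
// recurring interval so stop() can cancel everything that is still pending.
class SchedulerService {
    private timeouts: NodeJS.Timeout[] = [];
    private intervals: NodeJS.Timeout[] = [];

    schedule(job: () => Promise<void>, intervalMs: number, jitterMs: number): void {
        // The jittered first run must be tracked too, otherwise it can fire
        // after the database connections have been torn down.
        this.timeouts.push(setTimeout(() => job(), jitterMs));
        this.intervals.push(setInterval(() => job(), intervalMs));
    }

    stop(): void {
        // Cancel every pending job when Unleash receives a stop signal.
        this.timeouts.forEach((t) => clearTimeout(t));
        this.intervals.forEach((i) => clearInterval(i));
        this.timeouts = [];
        this.intervals = [];
    }
}
```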
## About the changes
Queries on the client-feature-toggle store serve many purposes depending on
the requestType, making the query more or less complex depending on the use
case. Each use case also has a different frequency (e.g. the playground is
expected to be used rarely).
The name for the store metrics was wrong, copy-pasted from:
7b04db0547/src/lib/features/feature-toggle/feature-toggle-store.ts (L107)
The same name was also present in the feature-tag metrics:
7b04db0547/src/lib/db/feature-tag-store.ts (L37)
With this, we'll have more granularity to understand the execution time
and frequency of each query type.
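A sketch of the kind of per-query timing this enables, assuming a prom-client histogram; the metric and label names are illustrative, not the exact ones used by Unleash:

```ts
import { Histogram } from 'prom-client';

// Hypothetical metric: one timer per store action so admin, client, and
// playground queries can be observed separately.
const dbQueryDuration = new Histogram({
    name: 'db_query_duration_seconds',
    help: 'DB query duration per store and action',
    labelNames: ['store', 'action'],
});

async function timed<T>(store: string, action: string, query: () => Promise<T>): Promise<T> {
    const end = dbQueryDuration.labels(store, action).startTimer();
    try {
        return await query();
    } finally {
        end();
    }
}

// Usage: timed('client-feature-toggle', 'getClient', () => getAllForClient(query));
```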
Usually maintenance mode is disabled. If the call throws, which we see a lot
of when an Unleash instance is in a terminating state, we should return a
default value.
By having it throw inside of the memoizee function, the response is not
cached, and it will trigger new calls until it returns a cacheable result.
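A minimal sketch of the approach with memoizee, assuming a 60-second maxAge; the settings lookup passed in is illustrative:

```ts
import memoizee from 'memoizee';

// Hypothetical factory: catch errors inside the memoized function so a failing
// settings lookup still produces a cacheable default (false) instead of
// triggering new DB calls on every invocation.
export const createMaintenanceCheck = (isMaintenanceEnabled: () => Promise<boolean>) =>
    memoizee(
        async (): Promise<boolean> => {
            try {
                return await isMaintenanceEnabled();
            } catch (e) {
                // Default: assume maintenance mode is off if the DB cannot be reached.
                return false;
            }
        },
        { promise: true, maxAge: 60_000 },
    );
```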
## About the changes
Every scheduled job will now check if maintenance is enabled. This ends
up querying the settings table in the db at least once per second per
running Unleash instance. This small fix caches the result of that query for
60 seconds to reduce the load somewhat.
We should reconsider this solution for the long term, but this is a
great improvement in the short term.
**Logs after this fix running locally.**
We can observe that we resolve settings from the DB once per minute.
![image](https://github.com/Unleash/unleash/assets/158948/c313cf38-8d86-4b86-a0ba-4f4df60d50d6)
We should also consider adding a warning to the section where you enable
maintenance mode, saying that it can take up to a minute to propagate.
## About the changes
The created_by_user_id data migration that resolves events.created_by
(for both events and features) now emits events reporting how many rows were
updated.
Adds listeners for these events that record these metrics with
Prometheus.
![image](https://github.com/Unleash/unleash/assets/707867/3bb02645-0919-4a9a-83fe-a07383ac0be1)
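A sketch of the listener pattern, assuming an EventEmitter-based event bus and a prom-client counter; the event and metric names are illustrative:

```ts
import { Counter } from 'prom-client';
import type { EventEmitter } from 'events';

// Hypothetical metric for the migration progress.
const rowsUpdated = new Counter({
    name: 'created_by_user_id_rows_updated_total',
    help: 'Rows updated by the created_by_user_id data migration',
    labelNames: ['source'],
});

export function registerMigrationMetrics(eventBus: EventEmitter): void {
    // The migration emits how many rows it touched in each chunk.
    eventBus.on('events-created-by-migrated', (count: number) =>
        rowsUpdated.labels('events').inc(count),
    );
    eventBus.on('features-created-by-migrated', (count: number) =>
        rowsUpdated.labels('features').inc(count),
    );
}
```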
We were sending `user.id` to the service, but if an admin token is used,
there is no `user.id`. Instead, there is
`user.internalAdminTokenUserId`, so we need to use the special method
`extractUserIdFromUser`.
This PR adds this implementation, and now the service correctly
retrieves the appropriate ID for admins.
Related to: https://github.com/Unleash/unleash/pull/5924
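A minimal sketch of what such an extraction helper does; the exact implementation in Unleash may differ:

```ts
// Hypothetical shape: admin tokens carry internalAdminTokenUserId instead of id.
interface RequestUser {
    id?: number;
    internalAdminTokenUserId?: number;
}

// Prefer the regular user id, fall back to the admin-token user id.
export function extractUserIdFromUser(user: RequestUser): number | undefined {
    return user.id ?? user.internalAdminTokenUserId;
}

// Before: service.doSomething(user.id)            // undefined for admin tokens
// After:  service.doSomething(extractUserIdFromUser(user))
```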
## About the changes
Schedules a best-effort task that sets the value of
events.created_by_user_id based on what is found in the created_by
column, when that can be resolved to a user id or the system id.
The process is executed in the events-store: it takes a chunk of events
that haven't been processed yet, attempts to join the users and api_tokens
tables on created_by = username/email, loops through, and tries to figure
out an id to set. It then updates the record.
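A rough sketch of the kind of lookup involved, written as a knex query; the exact columns and query in the events-store may differ:

```ts
import type { Knex } from 'knex';

// Hypothetical chunked lookup: events without created_by_user_id joined against
// users (on username or email) and api_tokens (on username) to resolve an id.
async function fetchUnresolvedEvents(db: Knex) {
    return db('events as e')
        .leftJoin('users as u', function () {
            this.on('e.created_by', '=', 'u.username').orOn('e.created_by', '=', 'u.email');
        })
        .leftJoin('api_tokens as t', 'e.created_by', 't.username')
        .whereNull('e.created_by_user_id')
        .limit(500)
        .select('e.id', 'u.id as userId', 't.username as tokenName');
}
```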
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
## About the changes
Changes the sorting of the features to migrate created_by_user_id for, and
filters out unresolvable features/users.
The query was tested manually in enterprise.
## About the changes
Sets the data migration of features and events created_by_user_id to
disabled by default.
Maps to promises and awaits them all in the created_by_user_id migration for features.
Escaping with `??` double-escaped the LIKE query, causing no results.
This updates the code to use whereLike, which does the correct escaping for
the string query.
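A sketch of the difference, assuming knex; the table and column names are illustrative:

```ts
import type { Knex } from 'knex';

async function searchEventCreators(db: Knex, name: string) {
    // Before (hypothetical): `??` interpolates as an identifier, so the LIKE
    // pattern ended up double-escaped and matched nothing.
    // db('events').whereRaw('created_by LIKE ??', [`${name}%`]);

    // After: whereLike escapes the string pattern correctly.
    return db('events').whereLike('created_by', `${name}%`);
}
```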
## About the changes
Adds a scheduled task that every 5 seconds updates 500 entries in the
features table, setting `created_by_user_id`.
It does this by looking at the related event, checking created_by, joining the
users table for a match on username or email, and joining the api_tokens
table on username matches. It then picks either the user's id if set, or uses
-42 (the admin token user).
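A rough sketch of how each chunk could be processed, with illustrative names; the real task lives in the store layer:

```ts
// Hypothetical shapes; the admin-token placeholder id comes from the description above.
interface FeatureToMigrate {
    featureName: string;
    userId: number | null; // resolved from users/api_tokens via the related event
}

const ADMIN_TOKEN_USER_ID = -42;

async function migrateChunk(
    getFeaturesToMigrate: (limit: number) => Promise<FeatureToMigrate[]>,
    setCreatedByUserId: (feature: string, userId: number) => Promise<void>,
): Promise<void> {
    const rows = await getFeaturesToMigrate(500);
    await Promise.all(
        rows.map((row) =>
            // Prefer the resolved user id, otherwise fall back to the admin token user.
            setCreatedByUserId(row.featureName, row.userId ?? ADMIN_TOKEN_USER_ID),
        ),
    );
}

// Scheduled every 5 seconds (scheduler API and job name are illustrative):
// schedulerService.schedule(() => migrateChunk(...), 5_000, 'migrate-features-created-by');
```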
## About the changes
This PR replaces the old systemUser -1 in user-service.ts with the new
SYSTEM_USER -1337 and adds a migration to move events with created_by = -1 to
-1337.
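A rough sketch of what such a migration could look like, in the db-migrate style used by the repository; the exact column name and SQL are assumptions based on the description above:

```ts
// Hypothetical migration: re-attribute events from the old system user (-1)
// to the new SYSTEM_USER id (-1337). Column name assumed; the description
// refers to created_by = -1.
exports.up = function (db: any, callback: () => void): void {
    db.runSql(
        'UPDATE events SET created_by_user_id = -1337 WHERE created_by_user_id = -1;',
        callback,
    );
};

exports.down = function (db: any, callback: () => void): void {
    db.runSql(
        'UPDATE events SET created_by_user_id = -1 WHERE created_by_user_id = -1337;',
        callback,
    );
};
```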
## Discussion points
Does it make sense to do both of these things? Or should we skip the
migration? How would this behave in a large system with hundreds of
thousands of events, should this be split up?
Only triggers if there are any rows in client instances that have
sdk_version: unleash-edge with a version < 17.0.0.
The function that checks this memoizes the check for 10 minutes to avoid
scanning the client instances table too often.
Previously we used a killswitch and returned 404 if the feature was
enabled. This flips that to a default-disabled toggle, which has to be
turned on to handle old Edge (pre 17.0.0) instances posting bulk metrics.
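A minimal sketch of the memoized check and the flipped flag; the flag, store, and method names are illustrative, not the exact ones in the codebase:

```ts
import memoizee from 'memoizee';

// Hypothetical lookup: are any connected instances reporting
// sdk_version unleash-edge with a version below 17.0.0?
type HasOldEdgeCheck = () => Promise<boolean>;

export const createOldEdgeCheck = (queryClientInstances: HasOldEdgeCheck) =>
    // Memoized for 10 minutes so we don't scan client_instances on every request.
    memoizee(queryClientInstances, { promise: true, maxAge: 10 * 60 * 1000 });

// Before (hypothetical): a killswitch that returned 404 when enabled.
// if (flags.isEnabled('disableBulkMetrics')) return res.status(404).end();
//
// After: a default-disabled toggle that must be explicitly turned on, and the
// legacy handling only runs while old Edge instances are still posting metrics:
// if (flags.isEnabled('oldEdgeBulkMetrics') && (await hasOldEdgeInstances())) { ... }
```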