## About the changes
Migrations that:
- Add the `is_system` column to the `users` table
- Insert the `unleash_system_user` (id -1337) into `users`

Also includes `is_system: false` in the `activeUsers` and `activeAccounts` where filters.
Tested by running:

```sql
select * into users_pre_check from users where id > -1;
delete from users where id > -1;
```

before starting Unleash, then inspecting the users table after Unleash has
started and verifying that an 'admin' user has been created.
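For reference, the migrations could look roughly like this, in the db-migrate style the repo uses; the system user's display name is a placeholder, while the column name, username, and id -1337 come from the description above:

```typescript
// Rough sketch only, not the actual migration files.
exports.up = (db: any, callback: (err?: Error) => void) => {
    db.runSql(
        `
        ALTER TABLE users ADD COLUMN IF NOT EXISTS is_system BOOLEAN NOT NULL DEFAULT false;

        INSERT INTO users (id, name, username, is_system)
        VALUES (-1337, 'Unleash System', 'unleash_system_user', true)
        ON CONFLICT DO NOTHING;
        `,
        callback,
    );
};

exports.down = (db: any, callback: (err?: Error) => void) => {
    db.runSql(
        `
        DELETE FROM users WHERE id = -1337;
        ALTER TABLE users DROP COLUMN IF EXISTS is_system;
        `,
        callback,
    );
};
```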
---------
Co-authored-by: Christopher Kolstad <chriswk@getunleash.ai>
## About the changes
Adds the new nullable column `created_by_user_id` to the data used by
feature-tag-store and feature-tag-service. Also updates the OpenAPI schemas.
### What
Adds `createdByUserId` to all events exposed by Unleash. In addition,
this PR updates all tests and usages of the affected methods in this
codebase to include the now-required number.
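Purely as an illustration of the new field (only `createdByUserId` is the point here; the other property names follow common Unleash event fields but aren't guaranteed to match the exact types in the codebase):

```typescript
// Hypothetical event payload showing the new numeric field.
const featureTaggedEvent = {
    type: 'feature-tagged',
    featureName: 'my-feature',
    createdBy: 'admin@example.com',
    createdByUserId: 42, // new: numeric id of the acting user, nullable in the DB
    data: { type: 'simple', value: 'my-tag' },
};
```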
This PR fixes the issue discussed in SR-234, where you would get a 200
OK response even if your POST request to
`/api/admin/projects/<project-name>/access` contains invalid data (and
nothing is persisted).
The previous check would return `false` if the value was 0, causing a
bug where the usage data wouldn't be included.
This also adds tests to ensure that usage data for CR segments is
propagated correctly because that's where I first encountered the issue.
Before this fix, if the values were 0, the data would display like the
bottom element in the screenshot:
![image](https://github.com/Unleash/unleash/assets/17786332/9642b945-12c4-4217-aec9-7fef4a88e9af)
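The gist of the fix, as a minimal sketch with hypothetical property names:

```typescript
type SegmentUsage = { usedInFeatures?: number; usedInProjects?: number };

// Before: a truthiness check drops legitimate zero values.
const hasUsageDataBuggy = (segment: SegmentUsage) =>
    Boolean(segment.usedInFeatures && segment.usedInProjects);

// After: explicitly check for undefined so that 0 still counts as data.
const hasUsageData = (segment: SegmentUsage) =>
    segment.usedInFeatures !== undefined && segment.usedInProjects !== undefined;

hasUsageDataBuggy({ usedInFeatures: 0, usedInProjects: 0 }); // false (the bug)
hasUsageData({ usedInFeatures: 0, usedInProjects: 0 }); // true
```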
Otherwise, we might accidentally display CR data to open source users.
But more importantly, it might keep them from being able to delete a
segment that's in use by a CR in their database that they can't touch.
So by checking that they're on an enterprise instance, we avoid this
potential blocker.
I've added the `includeChangeRequestUsageData` parameter as a boolean
now, but I'm open to other suggestions.
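A rough sketch of how the boolean could be used; the read-model interface and method names are assumptions, not the actual code:

```typescript
interface SegmentUsageReadModel {
    getUsage(segmentId: number): Promise<number>;
    getUsageIncludingChangeRequests(segmentId: number): Promise<number>;
}

// Only enterprise instances include change request usage in the count.
const getSegmentUsageCount = async (
    readModel: SegmentUsageReadModel,
    segmentId: number,
    includeChangeRequestUsageData: boolean,
): Promise<number> =>
    includeChangeRequestUsageData
        ? readModel.getUsageIncludingChangeRequests(segmentId)
        : readModel.getUsage(segmentId);
```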
This PR updates the segment usage counting to also include segment usage
in pending change requests.
The changes include:
- Updating the schema to explicitly call out that change request usage
is included.
- Adding two tests to verify the new features
- Writing an alternate query to count this data
Specifically, it'll update the part of the UI that tells you how many
places a segment is used:
![image](https://github.com/Unleash/unleash/assets/17786332/a77cf932-d735-4a13-ae43-a2840f7106cb)
## Implementation
Implementing this was a little tricky. Previously, we'd just count
distinct instances of feature names and project names on the
feature_strategy table. However, to merge this with change request data,
we can't just count existing usage and change request usage separately,
because that could cause duplicates.
Instead of turning this into a complex DB query, I've broken it up into
a few separate queries and done the merging in JS (a rough sketch of the
merging follows the breakdown below). I think that's more readable and
easier to reason about.
Here's the breakdown:
1. Get the list of pending change requests. We need their IDs and their
project.
2. Get the list of updateStrategy and addStrategy events that have
segment data.
3. Take the result from step 2 and turn it into a dictionary of segment
id to usage data.
4. Query the feature_strategy_segment and feature_strategies tables to
get existing segment usage data.
5. Fold that data into the change request data.
6. Perform the preexisting segment query (without counting logic) to get
the other segment data.
7. Enrich the results of the query from step 6 with usage data.
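Here's a rough sketch of the merging part (steps 5-7); all names are illustrative and the real read model differs in detail:

```typescript
type SegmentUsage = { features: Set<string>; projects: Set<string> };

// Fold existing usage into change request usage (step 5).
const mergeUsage = (
    changeRequestUsage: Map<number, SegmentUsage>,
    existingUsage: Map<number, SegmentUsage>,
): Map<number, SegmentUsage> => {
    const merged = new Map<number, SegmentUsage>();
    for (const source of [changeRequestUsage, existingUsage]) {
        for (const [segmentId, usage] of source) {
            const current = merged.get(segmentId) ?? {
                features: new Set<string>(),
                projects: new Set<string>(),
            };
            // Sets de-duplicate usage that appears both in existing
            // strategies and in pending change requests.
            usage.features.forEach((f) => current.features.add(f));
            usage.projects.forEach((p) => current.projects.add(p));
            merged.set(segmentId, current);
        }
    }
    return merged;
};

// Enrich the plain segment rows with the merged counts (step 7).
const enrich = (
    segments: Array<{ id: number }>,
    usage: Map<number, SegmentUsage>,
) =>
    segments.map((segment) => ({
        ...segment,
        usedInFeatures: usage.get(segment.id)?.features.size ?? 0,
        usedInProjects: usage.get(segment.id)?.projects.size ?? 0,
    }));
```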
## Discussion points
I feel like this could be done in a nicer way, so any ideas on how to
achieve that (whether that's as a db query or just breaking up the code
differently) are very welcome.
Second, using multiple queries obviously yields more overhead than just
a single one. However, I do not think this is in the hot path, so I
don't consider performance to be critical here, but I'm open to hearing
opposing thoughts on this of course.
https://linear.app/unleash/issue/SR-169/ticket-1107-project-feature-flag-limit-is-not-correctly-updated

Fixes #5315, an issue where it would not be possible to set an empty
flag limit.
This also fixes the UI behavior: Before, when the flag limit field was
emptied, it would disappear from the UI.
I'm a bit unsure of the original intent of the `(data.defaultStickiness
!== undefined || data.featureLimit !== undefined)` condition. We're in
an update method, triggered by a PUT endpoint, so I think it's safe to
assume that we'll always want to set these values to whatever they come
in as; we just need to convert them to `null` in case they are not
present (i.e. `undefined`).
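In other words, something like this minimal sketch (the row/column names are assumptions):

```typescript
// In a PUT-driven update, always persist the incoming values and coerce
// absent ones to null instead of skipping them.
const toUpdateRow = (data: {
    defaultStickiness?: string;
    featureLimit?: number;
}) => ({
    default_stickiness: data.defaultStickiness ?? null,
    feature_limit: data.featureLimit ?? null,
});
```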
## About the changes
This fixes a bug when updating a project where optional data
(defaultStickiness and featureLimit) is not part of the payload.
The problem happens because:
1. ProjectController does not use the UpdateProjectSchema type for the
request body (will be addressed in another PR in unleash-enterprise)
2. Project Store interface does not match UpdateProjectSchema (but it
relies on accepting `additional properties: true`, which is what we
agreed on for input)
3. Feature limit is not defined in UpdateProjectSchema (also addressed
in the other PR)
Sorts array items before running the comparison. Feature flags certain
strategy properties that were previously not present in the
`/api/admin/features` endpoint.
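As an illustration of the sort-before-compare idea (the `name` key and lodash's `isEqual` are assumptions, not the actual diffing code):

```typescript
import { isEqual } from 'lodash';

// Sort both arrays by a stable key before the deep comparison so ordering
// differences don't register as changes.
const sortByName = <T extends { name: string }>(items: T[]): T[] =>
    [...items].sort((a, b) => a.name.localeCompare(b.name));

const strategiesAreEqual = (
    a: Array<{ name: string }>,
    b: Array<{ name: string }>,
): boolean => isEqual(sortByName(a), sortByName(b));
```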
## About the changes
This small improvement aims to help developers when instantiating
services. They need to be constructed without injecting services or
stores created elsewhere so they can be bound to the same transactional
scope.
This suggests that you need to create the services and stores on your
own.
This PR is the first step in separating the client and admin stores.
Currently our feature toggle service uses the client store to serve
multiple purposes.
Admin API uses the feature toggle service to serve both the feature
toggle list and playground features, while the client API uses the
feature toggle service to serve client features. The admin API can
change often and have very different requirements than the client API,
which changes infrequently and generally keeps the same stable structure
for long periods of time. This architecture is error prone, because when
you need to make changes to the admin API, you can very easily affect
the client API.
I aim to put up a stone wall between the two APIs: complete separation
between them, at the cost of some duplication.
In this PR I have created a feature-oriented architecture for client
features and disconnected the client API from the feature toggle
service. It now goes through its own service to its own store. For the
feature toggle service I have duplicated and replaced the functionality
that serves /api/admin/features. I have kept a lot of the ugliness in
the code and haven't removed anything in order to avoid breaking
changes.
Next steps:
* Move playground to admin API
* Remove client-feature-toggle-store from feature-toggle-service
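Very roughly, the separation looks like this; class names and shapes here are illustrative, not the actual Unleash code:

```typescript
interface ClientFeature {
    name: string;
    enabled: boolean;
    strategies: Array<{ name: string; parameters: Record<string, string> }>;
}

interface ClientFeatureToggleStore {
    getClientFeatures(environment: string): Promise<ClientFeature[]>;
}

// The client API talks only to this service and store; the admin-oriented
// feature toggle service is no longer involved in the client payload.
class ClientFeatureToggleService {
    constructor(private readonly store: ClientFeatureToggleStore) {}

    async getClientFeatures(environment: string): Promise<ClientFeature[]> {
        return this.store.getClientFeatures(environment);
    }
}
```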
As part of more telemetry on the usage of Unleash, this PR adds a new
`stat_`-prefixed table as well as a trigger on the events table that
fires on each insert to increment a counter per environment per day.
The trigger fires on every insert into the events table, but filters
and only increments the counter for events that actually have the
environment set (there are events, like user-created, that do not
relate to a specific environment).
I'm a bit wary of this, but since we truncate down to one row per (day,
environment) combo, finding the conflict and incrementing shouldn't take
too long here.
@ivarconr was it something like this you were considering?
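For reference, a rough sketch of the kind of migration described above; the table, function, and trigger names are illustrative, and only the mechanics (an insert trigger plus a per-day, per-environment upsert) follow the description:

```typescript
exports.up = (db: any, callback: (err?: Error) => void) => {
    db.runSql(
        `
        CREATE TABLE IF NOT EXISTS stat_environment_updates (
            day DATE NOT NULL,
            environment TEXT NOT NULL,
            updates BIGINT NOT NULL DEFAULT 0,
            PRIMARY KEY (day, environment)
        );

        CREATE OR REPLACE FUNCTION unleash_update_stat_environment_changes_counter()
        RETURNS trigger AS $$
        BEGIN
            -- Only count events that actually carry an environment.
            IF NEW.environment IS NOT NULL THEN
                INSERT INTO stat_environment_updates (day, environment, updates)
                VALUES (DATE_TRUNC('day', NEW.created_at), NEW.environment, 1)
                ON CONFLICT (day, environment)
                DO UPDATE SET updates = stat_environment_updates.updates + 1;
            END IF;
            RETURN NULL;
        END;
        $$ LANGUAGE plpgsql;

        CREATE TRIGGER unleash_update_stat_environment_changes
            AFTER INSERT ON events
            FOR EACH ROW
            EXECUTE PROCEDURE unleash_update_stat_environment_changes_counter();
        `,
        callback,
    );
};
```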
This PR cleans up and refactors the feature-strategy-store method
getFeatureOverview to join on the new table and attempts to make the
function more readable by extracting some of the logic into separate
functions. Keeping the LastSeenMapper for now in case there is a reason
to use it for the other endpoints.
Fixes an issue where SSO group sync would delete a syncable group that a
user was manually added to.
## Discussion points
Is this the long-term fix for this? Or would we want another column in
the mapping table for future-proofing?
## About the changes
This transactional implementation decorates a service with a
`transactional` method, removing the need to start transactions in the
code that uses the service.
This is a gradual rollout with a feature toggle, just because
transactions are not easy.
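A minimal sketch of the decorator idea, assuming a Knex connection; `withTransactional` and `createProjectService` are illustrative names here, and the real helper is more involved:

```typescript
import type { Knex } from 'knex';

type ServiceFactory<S> = (db: Knex) => S;

type WithTransactional<S> = S & {
    transactional: <R>(fn: (service: S) => Promise<R>) => Promise<R>;
};

function withTransactional<S>(
    factory: ServiceFactory<S>,
    db: Knex,
): WithTransactional<S> {
    const service = factory(db) as WithTransactional<S>;

    // Every call inside `fn` uses a service whose stores were created from
    // the same transaction, so the caller never starts one manually.
    service.transactional = async <R>(fn: (s: S) => Promise<R>) =>
        db.transaction(async (trx) => fn(factory(trx)));

    return service;
}

// Usage sketch:
// const projectService = withTransactional(createProjectService, db);
// await projectService.transactional((svc) => svc.updateProject(id, data));
```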
## About the changes
In our staging setup, we create ad-hoc environments and import Unleash
state from production. After Unleash is deployed on such an environment,
the import job kicks in and feeds the Unleash instance with feature
flags from production.
Between Unleash being up and running and the import job running, some
applications start polling Unleash. They get an empty feature toggle
list with `meta.revisionId=0`. The apps then use this as part of the
`ETag` header in subsequent requests. Even though the Unleash server
finally has toggles to serve after the import, it doesn't serve them,
because it calculates the _max revision id_ based only on toggle updates
(non-null `feature_name` column in the query) or `SEGMENT_UPDATED`
events.
This change adds an extra condition to the query so that a feature
toggle import is considered something that should invalidate the cache.
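Conceptually, the revision query grows an extra condition along these lines (sketch only; the event type strings are assumptions based on the description):

```typescript
import type { Knex } from 'knex';

// Include import events so a feature import bumps the revision id and
// invalidates cached ETags.
async function getMaxRevisionId(db: Knex): Promise<number> {
    const row = await db('events')
        .max('id as id')
        .where((builder) =>
            builder
                .whereNotNull('feature_name')
                .orWhereIn('type', ['segment-updated', 'features-imported']),
        )
        .first();
    return row?.id ?? 0;
}
```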
This commit changes our linter/formatter to Biome (https://biomejs.dev/),
causing our pre-commit hook to run almost instantly and our "yarn lint"
task to run in under 100 ms.
Some trade-offs:
* Biome isn't quite as well established as ESLint
* Are we ready to install a different VS Code plugin (the Biome plugin)
instead of the Prettier plugin?
The configuration set for Biome also has a set of recommended rules that
is turned on by default. In order to get to something mergeable, I have
turned off a couple of the rules we seemed to violate the most, which we
also explicitly told ESLint to ignore.
## About the changes
Adds a partial index on events by `announced`. This should help avoid a
`Seq Scan on events` when the majority of events have `announced=true`.
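A sketch of the migration, in the db-migrate style used elsewhere in the repo; the index name is illustrative:

```typescript
exports.up = (db: any, callback: (err?: Error) => void) => {
    db.runSql(
        `CREATE INDEX IF NOT EXISTS idx_events_unannounced
            ON events (announced)
            WHERE announced = false;`,
        callback,
    );
};

exports.down = (db: any, callback: (err?: Error) => void) => {
    db.runSql('DROP INDEX IF EXISTS idx_events_unannounced;', callback);
};
```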
---
Co-authored-by: Ivar Østhus <ivar@getunleash.io>
Co-authored-by: Gard Rimestad <gard@getunleash.io>
## About the changes
When the events table is large, we might be doing a full table scan
searching for unannounced events. We spotted it due to a performance
alert and confirmed it in AWS Performance Insights.
![image](https://github.com/Unleash/unleash/assets/455064/8e815fa3-7a1b-4453-881a-98a148eae119)
The proposal is to limit this operation to 500 events (rule of thumb)
per round
f82ae354eb/src/lib/services/index.ts (L141-L147)
and also ignore events older than a day (because it seems
reasonable).
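A sketch of the bounded query (column names are assumptions):

```typescript
import type { Knex } from 'knex';

// At most 500 unannounced events per round, and nothing older than a day.
async function getUnannouncedEvents(db: Knex) {
    return db('events')
        .where('announced', false)
        .andWhere('created_at', '>', db.raw(`now() - interval '1 day'`))
        .orderBy('id', 'asc')
        .limit(500);
}
```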
## Discussion points
**Idea**: split the `events` table into `recent_events` and
`historical_events`. "Recent" could be anything from a day to a week to
a month. This would help with recurring queries that rely on recent data
from the events table, such as the optimal 304 calculation or even this
scheduled task that sends unannounced events.
## About the changes
This enables us to use names instead of permission ids across all our
APIs, at the computational cost of searching for the ids in the DB, but
improving the API user experience.
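The lookup cost mentioned above amounts to something like this sketch; the table and column names are assumptions, not the actual access store:

```typescript
import type { Knex } from 'knex';

// Resolve human-readable permission names to the ids used internally.
async function resolvePermissionIds(
    db: Knex,
    permissionNames: string[],
): Promise<number[]> {
    const rows = await db('permissions')
        .select('id')
        .whereIn('permission', permissionNames);
    return rows.map((row: { id: number }) => row.id);
}
```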
## Open topics
We're using methods that are test-only and circumvent our business
logic. This makes our tests rely on assumptions that are not always
true, because these assumptions are not validated frequently.
For example, we expect that after removing a permission it's no longer
there, but to test this, the permission has to be there beforehand:
78273e4ff3/src/test/e2e/services/access-service.e2e.test.ts (L367-L375)
But it seems that's not the case.
We'll look into improving this later.
## About the changes
- `getActiveUsers` is using multiple stores, so it is refactored into a
read model
- Refactored the instance stats service into `features` to co-locate
related code
Closes https://linear.app/unleash/issue/UNL-230/active-users-prometheus
### Important files
`src/lib/features/instance-stats/getActiveUsers.ts`
## Discussion points
`getActiveUsers` is coded in a less _class-based_ way than previous
similar read models: one file instead of 3 (read-model interface, fake
read model, SQL read model). I find types and functions way more
readable, but I'm ready to refactor it to interfaces and classes if
consistency is more important.
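For illustration, the function-based shape might look like this sketch (the column names and the 30-day window are assumptions, not the exact query in `getActiveUsers.ts`):

```typescript
import type { Knex } from 'knex';

export type GetActiveUsers = () => Promise<number>;

// SQL-backed read model: a factory returning a plain function.
export const createGetActiveUsers =
    (db: Knex): GetActiveUsers =>
    async () => {
        const row = await db('users')
            .count('id as count')
            .where('is_system', false)
            .andWhere('seen_at', '>', db.raw(`now() - interval '30 days'`))
            .first();
        return Number(row?.count ?? 0);
    };

// The fake for tests is just another function with the same type.
export const createFakeGetActiveUsers =
    (count = 0): GetActiveUsers =>
    async () => count;
```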