- Create 2 new events to replace the SCHEDULED_CHANGE_REQUEST_EXECUTED
event
- Handle the 3 events in slack-app and webhook addon definitions
3 events handled:
- CHANGE_REQUEST_SCHEDULED
- CHANGE_REQUEST_SCHEDULED_APPLICATION_SUCCESS
- CHANGE_REQUEST_SCHEDULED_APPLICATION_FAILURE
Closes #
[1-1555](https://linear.app/unleash/issue/1-1555/update-change-request-scheduled-and-scheduled-change-request-executed)
Note: SCHEDULED_CHANGE_REQUEST_EXECUTED will be removed in follow up PR
not to break current enterprise build
---------
Signed-off-by: andreas-unleash <andreas@getunleash.ai>
This PR addresses some cleanup related to removing the
useLastSeenRefactor flag:
* Added fallback last seen to the feature table last_seen_at column
* Remove foreign key on environment since we can not guarantee that we
will get valid data in this field
* Add environments to cleanup function
* Add test for cleanup environments
This PR changes the behavior of the API a little bit. Instead of
removing any strategies from `changeRequestStrategies` that are also
in `strategies`, we keep them in instead.
The reason for this is that the overview of where a segment is used is
incomplete if it shows only strategies but not CRs. Imagine this:
You want to delete a segment, but you're told it's only used in strategy
S.
So you go and remove it from strategy S, but then you're told it's
suddenly used in CRs A, B, and C. This is now a two-step operation
with a bad surprise. Instead, we could show you immediately that this
segment is used in strategy S and CRs A, B, and C.
Otherwise, we might accidentally display CR data to open source users.
But more importantly, it might keep them from being able to delete a
segment that's in use by a CR in their database that they can't touch.
So by checking that they're on an enterprise instance, we avoid this
potential blocker.
I've added the `includeChangeRequestUsageData` parameter as a boolean
now, but I'm open to other suggestions.
This PR handles the case where a single strategy is used in multiple
change requests. Instead of listing the strategy several times in the
output, we consolidate the entries and add a new `changeRequestIds`
property. This is a non-empty list that points to all the change
requests it is used in.
This is required for us to be able to link back to the change requests
from the UI overview.
This change is just a refactor, removing code that's no longer used. Instead of
checking just whether a segment is in use, we now extract the list of
strategies that use this segment. This is slightly more costly,
perhaps, but it will be necessary for the upcoming implementation.
This PR changes the payload of the strategiesBySegment endpoint when the
flag is active. In addition to returning just the strategies, the object
will also contain a new property, called `changeRequestStrategies`
containing the strategies that are used in change requests.
This PR does not update the schema. That can be done later when the
changes go into beta. This also allows us some time to iterate on the
payload without changing the public API.
## Discussion points:
Should `strategies` and `changeRequestStrategies` ever contain
duplicates? Take this scenario:
- Strategy S uses segment T.
- There is an open change request that updates the list of segments for
S to T and a new segment U.
- In this case, strategy S would show up both in `strategies` _and_ in
`changeRequestStrategies`.
We have two options:
1. Filter the list of change request strategies, so that they don't
contain any duplicates (this is currently how it's implemented)
2. Ignore the duplicates and just send both lists as is.
We're doing option 2 for now.
Removing a user from a project was impossible if you only had 1 owner.
It worked fine when having more than an owner. This should fix it and
we'll add tests later
In short the issue is that after our last seen improvements, we did not
update where we are getting last_seen field. It was still using features
table, which is not the source of last seen anymore.
## PR Description
https://linear.app/unleash/issue/2-1645/address-post-mortem-action-point-all-flags-should-be-runtime
Refactor with the goal of ensuring that flags are runtime controllable,
mostly focused on the current scheduler logic.
This includes the following changes:
- Moves scheduler into its own "scheduler" feature folder
- Reverts dependency: SchedulerService takes in the MaintenanceService,
not the other way around
- Scheduler now evaluates maintenance mode at runtime instead of relying
only on its mode state (active / paused)
- Favors flag checks to happen inside the scheduled methods, instead of
controlling whether the method is scheduled at all (favor runtime over
startup)
- Moves "account last seen update" to scheduler
- Updates tests accordingly
- Boyscouting
Here's a manual test showing this behavior, where my local instance was
controlled by a remote instance. Whenever I toggle `maintenanceMode`
through a flag remotely, my scheduled functions stop running:
https://github.com/Unleash/unleash/assets/14320932/ae0a7fa9-5165-4c0b-9b0b-53b9fb20de72
Had a look through all of our current flags and it *seems to me* that
they are all used in a runtime controllable way, but would still feel
more comfortable if this was double checked, since it can be complex to
ensure this.
The only exception to this was `migrationLock`, which I believe is OK,
since the migration only happens at the start anyways.
## Discussion / Questions
~~Scheduler `mode` (active / paused) is currently not *really* being
used, along with its respective methods, except in tests. I think this
could be a potential footgun. Should we remove it in favor of only
controlling the scheduler state through maintenance mode?~~ Addressed in
7c52e3f638
~~The config property `disableScheduler` is still a startup
configuration, but perhaps that makes sense to leave as is?~~
[Answered](https://github.com/Unleash/unleash/pull/5363#issuecomment-1819005445)
by @FredrikOseberg, leaving as is.
Are there any other tests we should add?
Is there anything I missed?
Identified some `setInterval` and `setTimeout` that may make sense to
leave as is instead of moving over to the scheduler service:
- ~~`src/lib/metrics` - This is currently considered a `MetricsMonitor`.
Should this be refactored to a service instead and adapt these
setIntervals to use the scheduler instead? Is there anything special
with this we need to take into account? @chriswk @ivarconr~~
[Answered](https://github.com/Unleash/unleash/pull/5363#issuecomment-1820501511)
by @ivarconr, leaving as is.
- ~~`src/lib/proxy/proxy-repository.ts` - This seems to have a complex
and specific logic currently. Perhaps we should leave it alone for now?
@FredrikOseberg~~
[Answered](https://github.com/Unleash/unleash/pull/5363#issuecomment-1819005445)
by @FredrikOseberg, leaving as is.
- `src/lib/services/user-service.ts` - This one also seems to be a bit
more specific, where we generate new timeouts for each receiver id.
Might not belong in the scheduler service. @Tymek
This PR adds the ability to detect which strategies use a specific
segment in active change requests.
It does not wire this functionality up to anything just yet. Follow-up
PRs will integrate this with the segment service and eventually with the
front end.
The issue was that we all features were created exactly in same time,
and our feature counter waas expecting time to be unique to feature,
which was not the case.
Instead of throwing an error when the project doesn't exist, we say that
the names are valid, because we have nothing to say that they're not.
Presumably there is already something in place to prevent you from
importing into a non-existent project.
## About the changes
This feature allows our Enterprise customers to configure banners to be
displayed on their Unleash instance for all their users to see and
interact with. Previously known as "internal message banners".
Optimizations:
1. Removed extra round trip to database to count environments
2. Removed extra round trip to database to count features
Fixes:
Currently, we were using a very optimistic query to set correct limit
and offset. This breaks as soon we we join tags.
` query = query
.select(selectColumns)
.limit(limit * environmentCount)
.offset(offset * environmentCount);`
The solution was to use common table expressions, so we could count and
rank features.
Rename event to SCHEDULED_CHANGE_REQUEST_EXECUTED
This event will be triggered when the executor runs a scheduled change
request.
The ChangeRequestApplied event will remain as is (going out to project
members - but will have a scheduled = true property in the data if it
scheduled.
This new event will fire on execution of the schedule and have a result
= "failed" | "succeeded" property.
Because notifications are tied to events, this notification will go out
to the creator and the applier
---------
Signed-off-by: andreas-unleash <andreas@getunleash.ai>
This PR updates the segment usage counting to also include segment usage
in pending change requests.
The changes include:
- Updating the schema to explicitly call out that change request usage
is included.
- Adding two tests to verify the new features
- Writing an alternate query to count this data
Specifically, it'll update the part of the UI that tells you how many
places a segment is used:
![image](https://github.com/Unleash/unleash/assets/17786332/a77cf932-d735-4a13-ae43-a2840f7106cb)
## Implementation
Implementing this was a little tricky. Previously, we'd just count
distinct instances of feature names and project names on the
feature_strategy table. However, to merge this with change request data,
we can't just count existing usage and change request usage separately,
because that could cause duplicates.
Instead of turning this into a complex DB query, I've broken it up into
a few separate queries and done the merging in JS. I think that's more
readable and it was easier to reason about.
Here's the breakdown:
1. Get the list of pending change requests. We need their IDs and their
project.
2. Get the list of updateStrategy and addStrategy events that have
segment data.
3. Take the result from step 2 and turn it into a dictionary of segment
id to usage data.
4. Query the feature_strategy_segment and feature_strategies table, to
get existing segment usage data
5. Fold that data into the change request data.
6. Perform the preexisting segment query (without counting logic) to get
other segment data
7. Enrich the results of the query from step 2 with usage data.
## Discussion points
I feel like this could be done in a nicer way, so any ideas on how to
achieve that (whether that's as a db query or just breaking up the code
differently) is very welcome.
Second, using multiple queries obviously yields more overhead than just
a single one. However, I do not think this is in the hot path, so I
don't consider performance to be critical here, but I'm open to hearing
opposing thoughts on this of course.
https://linear.app/unleash/issue/SR-169/ticket-1107-project-feature-flag-limit-is-not-correctly-updatedFixes#5315, an issue where it would not be possible to set an empty
flag limit.
This also fixes the UI behavior: Before, when the flag limit field was
emptied, it would disappear from the UI.
I'm a bit unsure of the original intent of the `(data.defaultStickiness
!== undefined || data.featureLimit !== undefined)` condition. We're in
an update method, triggered by a PUT endpoint - I think it's safe to
assume that we'll always want to set these values to whatever they come
as, we just need to convert them to `null` in case they are not present
(i.e. `undefined`).
This fixes an edge case not caught originally in
https://github.com/Unleash/unleash/pull/5304 - When creating a new
segment on the global level:
- There is no `projectId`, either in the params or body
- The `UPDATE_PROJECT_SEGMENT` is still a part of the permissions
checked on the endpoint
- There is no `id` on the params
This made it so that we would run `segmentStore.get(id)` with an
undefined `id`, causing issues.
The fix was simply checking for the presence of `params.id` before
proceeding.
This PR hooks up the changes introduced in #5301 to the API and puts
them behind a feature flag. A new test has been added and the test setup
has been slightly tweaked to allow this test.
When the flag is enabled, the API will now not let you delete a segment
that's used in any active CRs.
https://linear.app/unleash/issue/SR-164/ticket-1106-user-with-createedit-project-segment-is-not-able-to-edit-a
Fixes a bug where the `UPDATE_PROJECT_SEGMENT` permission is not
respected, both on the UI and on the API. The original intention was
stated
[here](https://github.com/Unleash/unleash/pull/3346#discussion_r1140434517).
This was easy to fix on the UI, since we were simply missing the extra
permission on the button permission checks.
Unfortunately the API can be tricky. Our auth middleware tries to grab
the `project` information from either the params or body object, but our
`DELETE` method does not contain this information. There is no body and
the endpoint looks like `/admin/segments/:id`, only including the
segment id.
This means that, in the rbac middleware when we check the permissions,
we need to figure out if we're in such a scenario and fetch the project
information from the DB, which feels a bit hacky, but it's something
we're seemingly already doing for features, so at least it's somewhat
consistent.
Ideally what we could do is leave this API alone and create a separate
one for project segments, with endpoints where we would have project as
a param, like so:
`http://localhost:4242/api/admin/projects/:projectId/segments/1`.
This PR opts to go with the quick and hacky solution for now since this
is an issue we want to fix quickly, but this is something that we should
be aware of. I'm also unsure if we want to create a new API for project
segments. If we decide that we want a different solution I don't mind
either adapting this PR or creating a follow up.
This test was flaky because it relied on the order of the array
returned. To make it less flaky, we now turn the array into an object
instead and compare that.
This PR adds a way to tell if a specific segment is being used in any
active change requests. It's the first step towards preventing segments
that are being used in change requests from being deleted.
It does that by checking the db for any unclosed CRs and using those CR
ids to look for "addStrategy" and "updateStrategy" events in the cr
events table.
## Upcoming PRs
This only puts in a way to detect it, but doesn't add that to anything.
That'll be in an upcoming iteration.
The `dataPath` was present (but not in the type) in previous versions of
the
error library that we use. But with the recent major upgrade, it's
been removed and the `instancePath` property has finally come into use.
This PR removes all the handling for the previous property and
replaces it with `instancePath`. Because the `dataPath` used full
stops and the `instancePath` uses slashes, we need to change a little
bit of the handling too.
Switch the express-openapi implementation from our internal fork to the
upstream version. We have upstreamed our changes and a new version has
been released, so this should be the last step before we can retire our
fork.
Because some of the dependencies have been updated since our internal
fork, we also need to update some of our error handling to reflect this.
Expose new interface while also getting rid of unneeded compiler ignores
None of the changes should add new security risks, despite this report:
> Code scanning results / CodeQL Failing after 4s — 2 new alerts
including 2 high severity security vulnerabilities
Not sure what that means, maybe a removed ignore...
Sort the items before inserting them into the database in order to
reduce the chance of deadlocks happening when multiple pods are
inserting at the same time.
For a while we ran a diffing algorithm in production to verify that the
results of the refactor did not differ from the previous results. As the
experiment has run it's course and new attributes have been added on top
of the new flow, this will remove the logging and associated code.
### What
This PR makes the rate limit for user creation and simple login (our
password based login) configurable in the same way you can do
metricsRateLimiting.
### Worth noting
In addition this PR adds a `rate_limit{endpoint, method}` prometheus
gauge, which gets the data from the UnleashConfig.
As #4475 says, MD5 is not available in secure places anymore. This PR
swaps out gravatar-url with an inline function using crypto:sha256 which
is FIPS-140-2 compliant. Since we only used this method for generating
avatar URLs the extra customization wasn't needed and we could hard code
the URL parameters.
fixes: Linear
https://linear.app/unleash/issue/SR-112/gh-support-swap-out-gravatar-url-libcloses: #4475
To prepare for 5.6 GA,
I've done a find through both Frontend and Backend here to remove the
usages of the flag. Seems like the flag was only in use in the frontend.
@nunogois can you confirm?
This PR adds a cleanup job that removes unknown feature flags from
last_seen_at_metrics table every 24 hours since we no longer have a
foreign key on the name column in the features table.
## About the changes
This fixes a bug updating a project, when optional data
(defaultStickiness and featureLimit are not part of the payload).
The problem happens due to:
1. ProjectController does not use the type: UpdateProjectSchema for the
request body (will be addressed in another PR in unleash-enterprise)
2. Project Store interface does not match UpdateProjectSchema (but it
relies on accepting `additional properties: true`, which is what we
agreed on for input)
3. Feature limit is not defined in UpdateProjectSchema (also addressed
in the other PR)
https://linear.app/unleash/issue/2-1531/rename-message-banners-to-banners
This renames "message banners" to "banners".
I also added support for external banners coming from a `banner` flag
instead of only `messageBanner` flag, so we can eventually migrate to
the new one in the future if we want.
## About the changes
This makes sure that projects have at least one owner, either a group or
a user. This is to prevent accidentally losing access to a project.
We check this when removing a user/group or when changing the role of a
user/group
**Note**: We can still leave a group empty as the only owner of the
project, but that's okay because we can still add more users to the
group
Sort array items before running compare. Feature flag certain properties
of strategy that were previously not present in the /api/admin/features
endpoint.
### What
The heaviest requests we serve are the register and metrics POSTs from
our SDKs/clients.
This PR adds ratelimiting to /api/client/register, /api/client/metrics,
/api/frontend/register and /api/frontend/metrics with a default set to
6000 requests per minute (or 100 rps) for each of the endpoints.
It will be overrideable by the environment variables documented.
### Points of discussion
@kwasniew already suggested using featuretoggles with variants to
control the rate per clientId. I struggled to see if we could
dynamically update the middleware after initialisation, so this attempt
will need a restart of the pod to update the request limit.
## About the changes
This small improvement aims to help developers when instantiating
services. They need to be constructed without injecting services or
stores created elsewhere so they can be bound to the same transactional
scope.
This suggests that you need to create the services and stores on your
own
This fixes a return type error by changing the logic of
`extractUsernameFromUser` to never return undefined.
In the previous code, `user` could be truthy, but that doesn't mean
`email` or `username` were defined. This assumes we always fallback to
"unknown" in those scenarios.
This PR is the first step in separating the client and admin stores.
Currently our feature toggle services uses the client store to serve
multiple purposes.
Admin API uses the feature toggle service to serve both the feature
toggle list and playground features, while the client API uses the
feature toggle service to serve client features. The admin API can
change often and have very different requirements than the client API,
which changes infrequently and generally keeps the same stable structure
for long periods of time. This architecture is error prone, because when
you need to make changes to the admin API, you can very easily affect
the client API.
I aim to put up a stone wall between the two APIs. Complete separation
between the two APIs, at the cost of some duplication.
In this PR I have created a feature oriented architecture for client
features and disconnected the client API from the feature toggle
service. It now goes through it's own service to it's own store. For
feature toggle service I have duplicated and replaced the functionality
that serves /api/admin/features, I have kept a lot of the ugliness in
the code and haven't removed anything in order to avoid breaking
changes.
Next steps:
* Move playground to admin API
* Remove client-feature-toggle-store from feature-toggle-service