Only triggers if there is any rows in client instances that have
sdk_version: unleash-edge with version < 17.0.0
The function that checks this memoizes the check for 10 minutes to avoid
scanning the client instances table too often.
Previously we used a killswitch and returned 404 if the feature was
enabled. This flips that to a default disabled toggle, that has to be
turned on to handle old Edge (pre 17.0.0) posting bulk metrics
We should use the enhanced flagResolver
Tested locally:
```
9:44:13 AM - Starting compilation in watch mode...
[dev:backend]
[dev:backend]
[dev:backend] 9:44:26 AM - Found 0 errors. Watching for file changes.
[dev:backend] [2024-01-23T09:44:27.498] [INFO] server-impl.js - DB migration: start
[dev:backend] [2024-01-23T09:44:27.499] [INFO] server-impl.js - Running migration with lock
[dev:backend] [2024-01-23T09:44:29.884] [INFO] server-impl.js - DB migration: end
```
This PR will allow us to use a feature flag with variants to control
whether or not we should show the comments field of the feedback form.
This will allow us to see whether we can increase feedback collection if
we reduce the load on the customer.
## About the changes
This was spotted while testing automated actions. Steps to reproduce:
1. Add an editor user
2. Get a PAT for the editor user
3. As Admin create a feature in a project where the editor user is not a
member and enable the feature
4. Try using the editor's PAT to modify the feature
5. As the editor create a project (you'd be made owner) and try the same
request but just change the project name for the new project just
created (don't change anything else)
**Expected behavior**: you can't disable the feature
**Actual behavior**: the feature is disabled
This does not happen when trying to turn on a flag because during the
turn-on process we do validate if the feature belongs to project when we
call updateStrategy:
c18a7c0dc2/src/lib/features/feature-toggle/feature-toggle-service.ts (L1751-L1764)
https://linear.app/unleash/issue/2-1856/add-typesafe-wrappers-over-prom-clients-metrics
As discussed on the latest knowledge sharing session, this adds typesafe
wrappers over prom client's metrics, requiring us to specify all the
configured labels for each metric.
This uses a functional approach and only exposes the methods that are
currently relevant to us, while also exposing the underlying instance of
the metric for an easy access if needed.
Since we often chain `labels` with `inc` in counters, this adds a
convenience `increment` method for counters which does both in a single
call.
Uses a new `URL_SAFE_BASIC` regex constant that checks for characters
that are commonly used in URL path sections: alphanumeric lowercase
characters, dashes and underscores.
This will allow us to re-use this constant in our server-side
validation.
Follow up of https://github.com/Unleash/unleash/issues/4303
We are adding primary keys to all tables missing them, currently
**role_permission**, **api_token_project**, and **project_stats**.
By adding primary keys, the issue with migrations failing during
upgrades in replicated database setups will be resolved.
So, this was causing a lot of ERROR in our logs, due to the metric
having gotten an extra label the last month.
Two things for this fix.
1. add the missing label to the two calls that did not have it added
2. update the log line to include the error as another argument to the
logger, so we actually get a stacktrace from the error.
### What
Adds Read and Write permissions for project administration settings
(user access, change request settings, default strategy, other).
### Why
On request from two large customers that wanted our RBAC controls to be
more granulated to easier be able to limit the access they granted their
users.
## About the changes
Whenever we get a call from an admin token we want to associate it with
the [admin token
user](4d42093a07/src/lib/types/core.ts (L34-L41)).
This should give us the needed audit for this type of calls that
currently were lacking a user id (we only stored a string with the token
name in the event log).
We consciously decided not to use `id` as the property to prevent any
unforeseen side effects. The reason is that only `IUser` type has an id
and adding an id to `IApiUser` might lead to confusion.
Since we've now added PAT's we really do recommend switching to those,
or for enterprises, we recommend using service accounts.
Admin tokens have an obvious disadvantage in that they're not connected
to any user, so actions performed by them are harder to audit.
This PR adds a killswitch for turning it off, in preparation for
deprecating them and ultimately removing them in the future.
## About the changes
This admin token user will help us differentiate actions performed by
the system from actions performed with an admin token.
Events created with an admin token should have the id of this user as
createdByUserId property and the username of the token used as the
createdBy property. i.e.
```json
{
"id": 11,
"type": "pat-created",
"createdBy": "admin-token",
"createdAt": "2024-01-16T13:16:27.887Z",
"createdByUserId": -42,
"data": {
"description": "admin-pat",
"expiresAt": "2024-02-15T13:16:25.586Z",
"secret": "***",
"userId": 1
},
"preData": null,
"tags": [],
"featureName": null,
"project": null,
"environment": null
}
```
## About the changes
EventsService is a dependency in most of our services. This creates
helper methods to create them easily and replace a few places where
we're creating them manually
This change removes the system user's email from the definition, instead
setting it to `null`. It also changes the name to "Unleash System".
The IUser interface doesn't allow `null` email addresses, so we change
the type definition of the system user to get around it. However, using
`null` (instead of just removing the property entirely) is useful
because when you get the system user from the DB, it's email value will
be null (after it has been nulled out).
As of today, there is nowhere in the Unleash system (OSS or Enterprise)
where we use the system user as an IUser (we only use username and ID).
So this change shouldn't break anything.
This should follow https://github.com/Unleash/unleash/pull/5849.
Updates it from 'system@getunleash.io' to `null`. We don't have that
address registered (and probably don't want it), so we'll leave it
empty.
This is a companion PR to
https://github.com/Unleash/unleash/pull/5893. With both of those
merged, the system user in the DB should match the one defined in
`core.ts`
Lots of work here, mostly because I didn't want to turn off the
`noImplicitAnyLet` lint. This PR tries its best to type all the untyped
lets biome complained about (Don't ask me how many hours that took or
how many lints that was >200...), which in the future will force test
authors to actually type their global variables setup in `beforeAll`.
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
This change adjusts the exported `SYSTEM_USER` constant in `core.ts` to
match the one created in the migration in
`src/migrations/20231222071533-unleash-system-user.js`
The slight discrepancy between these two caused me some minor headache
when trying to write a test in enterprise.
It also removes the email because we have no inbox at that address (and
we probably don't want one).
For reference, the migration looks like this:
``` sql
ALTER TABLE users ADD COLUMN IF NOT EXISTS is_system BOOLEAN NOT NULL DEFAULT FALSE;
INSERT INTO users
(id, name, username, email, created_by_user_id, is_system)
VALUES
(-1337, 'Unleash System', 'unleash_system_user', 'system@getunleash.io', -1337, true);
```
This adds a bulk endpoint under `/api/client/metrics`. Accessible under
`/api/client/metrics/bulk`.
This allows us to piggyback on the need for an API user with access.
This PR mostly copies the behaviour from our `/edge/metrics` endpoint,
but it filters metrics to only include the environment that the token
has access to.
So a client token that has access to the `production` will not be
allowed to report metrics for the `development` environment. More
importantly, a `development` token will not be allowed to post metrics
for the `production` environment.
This PR adds the schedule suspended event to the slack-app and webhook
definitions.
It also slightly tweaks the markdown formatting of change requests to
add a definite article. This means the snapshot also needs to be
updated.
This PR adds a new `reason` column to the change request schedules table
and populates it with the data that is in the `failure_reason` column.
This is the expand phase of the expand/contract pattern. The code in
enterprise will be updated to try and use the new column name, but fall
back to the old one if no value is present.
The old column can be removed later.
This metric was used while developing the optimal304 feature. The
feature flag has been removed and this data is not longer being
collected and this will remove the metric from Prometheus.
## About the changes
This allows us to encrypt emails at signup for demo users to further
secure our demo instance. Currently, emails are anonymized before
displaying events performed by demo users. But this means that emails
are stored at rest in our DB. By encrypting the emails at login, we're
adding another layer of protection.
This can be enabled with a flag and requires the encryption key and the
initialization vector (IV for short) to be present as environment
variables.
Related to our work for making Edge bulk metrics a 1st class citizen of
Unleash, this PR adds an X-Unleash-Version header to the response from
client registration.
Based on when we add the new `/api/client/metrics/bulk` endpoint, Edge
can use the response header from upstream to decide whether to post
metrics to `/edge/metrics` or `/api/client/metrics/bulk`.
If the kill switch is enabled unleash returns 404 and a json body explaining why a 404 was given, encouraging users to upgrade to the most recent version of Edge.
## About the changes
Creating an incoming webhook with an admin token means we can't
correlate the action with a real user. In this case we should support
null.
Was having some trouble running these migration tests locally due to
`dbm` not correctly picking up the passed in config. This fixes it by
setting the custom config property after it has been initialized, always
overriding any wrong values.
PS: I think I found the issue. `dbm` was prioritizing my `DATABASE_URL`
for some reason, as I started having issues when it was set, and stopped
having issues when I unset it.
I still think this is a good change, as it prevents similar
hard-to-debug issues in the future.
To help clarify this, running this locally:
- `export
DATABASE_URL=postgres://unleash_user:passord@localhost:5432/unleash`
- `yarn test dedupe-permissions`
Fails on `main`, but passes on this branch. For some reason the `dbm`
instance prioritizes whatever is set in `DATABASE_URL` instead of the
options that are passed in `getInstance`.
We've had a couple of misunderstandings from people surprised that
Unleash allows posts against the `/edge/validate` endpoint without an
API key. It is intentional that this endpoint does not require an
Authorization header, so this PR updates our OpenAPI spec to clarify
that there is no security required for `/edge/validate`
## About the changes
Migrations for:
- Adds column is_system to users
- Inserts unleash_system_user id -1337 to users
includes `is_system: false` in the activeUsers and activeAccounts where filter
Tested by running:
`
select * into users_pre_check from users where id > -1;
delete from users where id > -1;
`
before starting unleash, then inspecting users table after unleash has
started and verifying that an 'admin' user has been created.
---------
Co-authored-by: Christopher Kolstad <chriswk@getunleash.ai>
This backwards compatible change allows us to specify a schema `id`
(full path) which to me feels a bit better than specifying the schema
name as a string, since a literal string is prone to typos.
### Before
```ts
requestBody: createRequestSchema(
'createResourceSchema',
),
responses: {
...getStandardResponses(400, 401, 403, 415),
201: resourceCreatedResponseSchema(
'resourceSchema',
),
},
```
### After
```ts
requestBody: createRequestSchema(
createResourceSchema.$id,
),
responses: {
...getStandardResponses(400, 401, 403, 415),
201: resourceCreatedResponseSchema(
resourceSchema.$id,
),
},
```
With the recent changes it's common that we'll need both the id and
processed username from the auth user in the request, so this PR
provides some helper methods to simplify this.
## About the changes
Adds the new nullable column created_by_user_id to the data used by
feature-tag-store and feature-tag-service. Also updates openapi schemas.
## About the changes
Replaces #5616
Renamed newly added `created_by` columns to `created_by_user_id` for
these tables:
features
feature_tag
feature_strategies
feature_types
role_permission
role_user
roles
users
api_tokens
Two changes were needed to sort better
1. Since we are still using `last seen` from `features` table for
backwards compatibility, we needed to add it to sort condition.
2. Nulls break the order, so now sorting nulls as last.
I noticed I was getting warnings logged in my local instance when
visiting the users page (`/admin/users`)
```json
{
"schema": "#/components/schemas/publicSignupTokensSchema",
"errors": [
{
"instancePath": "/tokens/0/users/0/username",
"schemaPath": "#/components/schemas/userSchema/properties/username/type",
"keyword": "type",
"params": {
"type": "string"
},
"message": "must be string"
}
]
}
```
It was complaining because one of my users doesn't have a username, so
the value returned from the API was:
```json
{
"users": [
{
"id": 2,
"name": "2mas",
"username": null
}
]
}
```
This adjustment fixes that oversight by allowing `null` values for the
username.
## Why
Currently AWS API Gateway doesn't have compression enabled by default,
this PR will make it easier to for example deploy Unleash over to AWS
Lambda without further configuration in API Gateway, frameworks like
Serverless requires a bit more work to set up compression and some times
one might not need compression at all.
## How
Create a new config flag called `disableCompression` which will not
include `compression` middleware in express' instance when set as true.
## About the changes
This adds a Makefile to make it easy to test migrations from one version
of Unleash to another.
The script depends on [docker compose
V2](https://docs.docker.com/compose/migrate/)
**Before starting**: make sure you're inside test-migrations folder and
run `make clean` to be in a clean state.
We can run 2 versions of Unleash side by side with a shared database
(the second version will apply migrations to the DB):
```shell
UNLEASH_DOCKER_IMAGE=unleashorg/unleash-server:5.6.10 make start-unleash # defaults to port 4242
UNLEASH_DOCKER_IMAGE=unleashorg/unleash-server:latest make start-another-unleash # defaults to port 4243
make test # run basic UI tests against port 4242 (first image)
EXPOSED_PORT=4243 make test # run basic UI tests against port 4243
```
This also enables us to test our local repository with our code of
Unleash server running at port 4244 (`EXPOSE_PORT=4444 make run-current`
if you want to change it):
```shell
UNLEASH_DOCKER_IMAGE=unleashorg/unleash-server:5.6.10 make start-unleash # defaults to port 4242
make run-current # exposes the current backend at 4244
```
You can also connect the latest UI to any of the ports specified above,
starting the UI at port 3000:
```shell
EXPOSED_PORT=4242 make run-current-ui # exposed port defaults to 4244 which is the port of the current backend
```
### What
Adds `createdByUserId` to all events exposed by unleash. In addition
this PR updates all tests and usages of the methods in this codebase to
include the required number.
## About the changes
Adds the column `created_by_user_id` to `events` table and adds index
for it
---------
Co-authored-by: Christopher Kolstad <chriswk@getunleash.ai>
This PR fixes the issue discussed in SR-234, where you would get a 200
OK response even if your POST request to
`/api/admin/projects/<project-name>/access` contains invalid data (and
nothing is persisted).
Today we include a lot of "secutiry headers" for all API calls. Quite a
lot of them are only relevent when we return a HTML document for the
browser.
This PR removes and simplify these headers for API calls, so that we do
not include unecessary data in the HTTP headers.
Each header have been carfully examied by following best practices from
these source:
-
https://cheatsheetseries.owasp.org/cheatsheets/REST_Security_Cheat_Sheet.html
- https://owasp.org/www-project-secure-headers/
This feature is protected with feature flag named 'stripHeadersOnAPI'.
As it says on the tin. In an attempt to make all operations in Unleash
traceable to an originator. This PR adds created_by to role_permission,
which will show which user assigned a permission to a role.
This change adds a property to the segmentStrategiesSchema to make sure
that change request strategies are listed in the openapi spec
It also renames the files that contains that schema and its tests from
`admin-strategies-schema` to `segment-strategies-schema`.
Adding new project overview endpoint and deprecating the old one.
The new one has extra info about feature types, but does not have
features anymore, because features are coming from search endpoint.
## About the changes
Add user ids to group changes. This also modifies the payload of group created to include only the user id and creates events for SSO sync functionality
This adds more data to the setting events, so that its possible to see
what has changed
Used to look like:
```
{
"id": "maintenance.mode"
}
```
Now it looks like this:
```
{
"id": "maintenance.mode",
"enabled": false
}
```
because this is setting events, the default behaviour is to hide the content.
This PR checks that the unleash instance is an enterprise instance
before fetching change request data. This is to prevent Change Request
usage from preventing OSS users from deleting segments (when they don't
have access to change requests).
This PR also does a little bit of refactoring (which we can remove if
you want)
This PR updates the returned value about segments to also include the CR
title and to be one list item per strategy per change request. This
means that if the same strategy is used multiple times in multiple
change requests, they each get their own line (as has been discussed
with Nicolae).
Because of this, this pr removes a collection step in the query and
fixes some test cases.
The previous check would return `false` if the value was 0, causing a
bug where the usage data wouldn't be included.
This also adds tests to ensure that usage data for CR segments is
propagated correctly because that's where I first encountered the issue.
Before this fix, if the values were 0, the data would display like the
bottom element in the screenshot:
![image](https://github.com/Unleash/unleash/assets/17786332/9642b945-12c4-4217-aec9-7fef4a88e9af)
- Create 2 new events to replace the SCHEDULED_CHANGE_REQUEST_EXECUTED
event
- Handle the 3 events in slack-app and webhook addon definitions
3 events handled:
- CHANGE_REQUEST_SCHEDULED
- CHANGE_REQUEST_SCHEDULED_APPLICATION_SUCCESS
- CHANGE_REQUEST_SCHEDULED_APPLICATION_FAILURE
Closes #
[1-1555](https://linear.app/unleash/issue/1-1555/update-change-request-scheduled-and-scheduled-change-request-executed)
Note: SCHEDULED_CHANGE_REQUEST_EXECUTED will be removed in follow up PR
not to break current enterprise build
---------
Signed-off-by: andreas-unleash <andreas@getunleash.ai>
This PR addresses some cleanup related to removing the
useLastSeenRefactor flag:
* Added fallback last seen to the feature table last_seen_at column
* Remove foreign key on environment since we can not guarantee that we
will get valid data in this field
* Add environments to cleanup function
* Add test for cleanup environments
This PR changes the behavior of the API a little bit. Instead of
removing any strategies from `changeRequestStrategies` that are also
in `strategies`, we keep them in instead.
The reason for this is that the overview of where a segment is used is
incomplete if it shows only strategies but not CRs. Imagine this:
You want to delete a segment, but you're told it's only used in strategy
S.
So you go and remove it from strategy S, but then you're told it's
suddenly used in CRs A, B, and C. This is now a two-step operation
with a bad surprise. Instead, we could show you immediately that this
segment is used in strategy S and CRs A, B, and C.
Otherwise, we might accidentally display CR data to open source users.
But more importantly, it might keep them from being able to delete a
segment that's in use by a CR in their database that they can't touch.
So by checking that they're on an enterprise instance, we avoid this
potential blocker.
I've added the `includeChangeRequestUsageData` parameter as a boolean
now, but I'm open to other suggestions.
This PR handles the case where a single strategy is used in multiple
change requests. Instead of listing the strategy several times in the
output, we consolidate the entries and add a new `changeRequestIds`
property. This is a non-empty list that points to all the change
requests it is used in.
This is required for us to be able to link back to the change requests
from the UI overview.
This change is just a refactor, removing code that's no longer used. Instead of
checking just whether a segment is in use, we now extract the list of
strategies that use this segment. This is slightly more costly,
perhaps, but it will be necessary for the upcoming implementation.
This PR changes the payload of the strategiesBySegment endpoint when the
flag is active. In addition to returning just the strategies, the object
will also contain a new property, called `changeRequestStrategies`
containing the strategies that are used in change requests.
This PR does not update the schema. That can be done later when the
changes go into beta. This also allows us some time to iterate on the
payload without changing the public API.
## Discussion points:
Should `strategies` and `changeRequestStrategies` ever contain
duplicates? Take this scenario:
- Strategy S uses segment T.
- There is an open change request that updates the list of segments for
S to T and a new segment U.
- In this case, strategy S would show up both in `strategies` _and_ in
`changeRequestStrategies`.
We have two options:
1. Filter the list of change request strategies, so that they don't
contain any duplicates (this is currently how it's implemented)
2. Ignore the duplicates and just send both lists as is.
We're doing option 2 for now.
Removing a user from a project was impossible if you only had 1 owner.
It worked fine when having more than an owner. This should fix it and
we'll add tests later
In short the issue is that after our last seen improvements, we did not
update where we are getting last_seen field. It was still using features
table, which is not the source of last seen anymore.
## PR Description
https://linear.app/unleash/issue/2-1645/address-post-mortem-action-point-all-flags-should-be-runtime
Refactor with the goal of ensuring that flags are runtime controllable,
mostly focused on the current scheduler logic.
This includes the following changes:
- Moves scheduler into its own "scheduler" feature folder
- Reverts dependency: SchedulerService takes in the MaintenanceService,
not the other way around
- Scheduler now evaluates maintenance mode at runtime instead of relying
only on its mode state (active / paused)
- Favors flag checks to happen inside the scheduled methods, instead of
controlling whether the method is scheduled at all (favor runtime over
startup)
- Moves "account last seen update" to scheduler
- Updates tests accordingly
- Boyscouting
Here's a manual test showing this behavior, where my local instance was
controlled by a remote instance. Whenever I toggle `maintenanceMode`
through a flag remotely, my scheduled functions stop running:
https://github.com/Unleash/unleash/assets/14320932/ae0a7fa9-5165-4c0b-9b0b-53b9fb20de72
Had a look through all of our current flags and it *seems to me* that
they are all used in a runtime controllable way, but would still feel
more comfortable if this was double checked, since it can be complex to
ensure this.
The only exception to this was `migrationLock`, which I believe is OK,
since the migration only happens at the start anyways.
## Discussion / Questions
~~Scheduler `mode` (active / paused) is currently not *really* being
used, along with its respective methods, except in tests. I think this
could be a potential footgun. Should we remove it in favor of only
controlling the scheduler state through maintenance mode?~~ Addressed in
7c52e3f638
~~The config property `disableScheduler` is still a startup
configuration, but perhaps that makes sense to leave as is?~~
[Answered](https://github.com/Unleash/unleash/pull/5363#issuecomment-1819005445)
by @FredrikOseberg, leaving as is.
Are there any other tests we should add?
Is there anything I missed?
Identified some `setInterval` and `setTimeout` that may make sense to
leave as is instead of moving over to the scheduler service:
- ~~`src/lib/metrics` - This is currently considered a `MetricsMonitor`.
Should this be refactored to a service instead and adapt these
setIntervals to use the scheduler instead? Is there anything special
with this we need to take into account? @chriswk @ivarconr~~
[Answered](https://github.com/Unleash/unleash/pull/5363#issuecomment-1820501511)
by @ivarconr, leaving as is.
- ~~`src/lib/proxy/proxy-repository.ts` - This seems to have a complex
and specific logic currently. Perhaps we should leave it alone for now?
@FredrikOseberg~~
[Answered](https://github.com/Unleash/unleash/pull/5363#issuecomment-1819005445)
by @FredrikOseberg, leaving as is.
- `src/lib/services/user-service.ts` - This one also seems to be a bit
more specific, where we generate new timeouts for each receiver id.
Might not belong in the scheduler service. @Tymek
This PR adds the ability to detect which strategies use a specific
segment in active change requests.
It does not wire this functionality up to anything just yet. Follow-up
PRs will integrate this with the segment service and eventually with the
front end.
The issue was that we all features were created exactly in same time,
and our feature counter waas expecting time to be unique to feature,
which was not the case.
Instead of throwing an error when the project doesn't exist, we say that
the names are valid, because we have nothing to say that they're not.
Presumably there is already something in place to prevent you from
importing into a non-existent project.
## About the changes
This feature allows our Enterprise customers to configure banners to be
displayed on their Unleash instance for all their users to see and
interact with. Previously known as "internal message banners".
Optimizations:
1. Removed extra round trip to database to count environments
2. Removed extra round trip to database to count features
Fixes:
Currently, we were using a very optimistic query to set correct limit
and offset. This breaks as soon we we join tags.
` query = query
.select(selectColumns)
.limit(limit * environmentCount)
.offset(offset * environmentCount);`
The solution was to use common table expressions, so we could count and
rank features.
Rename event to SCHEDULED_CHANGE_REQUEST_EXECUTED
This event will be triggered when the executor runs a scheduled change
request.
The ChangeRequestApplied event will remain as is (going out to project
members - but will have a scheduled = true property in the data if it
scheduled.
This new event will fire on execution of the schedule and have a result
= "failed" | "succeeded" property.
Because notifications are tied to events, this notification will go out
to the creator and the applier
---------
Signed-off-by: andreas-unleash <andreas@getunleash.ai>
This PR updates the segment usage counting to also include segment usage
in pending change requests.
The changes include:
- Updating the schema to explicitly call out that change request usage
is included.
- Adding two tests to verify the new features
- Writing an alternate query to count this data
Specifically, it'll update the part of the UI that tells you how many
places a segment is used:
![image](https://github.com/Unleash/unleash/assets/17786332/a77cf932-d735-4a13-ae43-a2840f7106cb)
## Implementation
Implementing this was a little tricky. Previously, we'd just count
distinct instances of feature names and project names on the
feature_strategy table. However, to merge this with change request data,
we can't just count existing usage and change request usage separately,
because that could cause duplicates.
Instead of turning this into a complex DB query, I've broken it up into
a few separate queries and done the merging in JS. I think that's more
readable and it was easier to reason about.
Here's the breakdown:
1. Get the list of pending change requests. We need their IDs and their
project.
2. Get the list of updateStrategy and addStrategy events that have
segment data.
3. Take the result from step 2 and turn it into a dictionary of segment
id to usage data.
4. Query the feature_strategy_segment and feature_strategies table, to
get existing segment usage data
5. Fold that data into the change request data.
6. Perform the preexisting segment query (without counting logic) to get
other segment data
7. Enrich the results of the query from step 2 with usage data.
## Discussion points
I feel like this could be done in a nicer way, so any ideas on how to
achieve that (whether that's as a db query or just breaking up the code
differently) is very welcome.
Second, using multiple queries obviously yields more overhead than just
a single one. However, I do not think this is in the hot path, so I
don't consider performance to be critical here, but I'm open to hearing
opposing thoughts on this of course.
https://linear.app/unleash/issue/SR-169/ticket-1107-project-feature-flag-limit-is-not-correctly-updatedFixes#5315, an issue where it would not be possible to set an empty
flag limit.
This also fixes the UI behavior: Before, when the flag limit field was
emptied, it would disappear from the UI.
I'm a bit unsure of the original intent of the `(data.defaultStickiness
!== undefined || data.featureLimit !== undefined)` condition. We're in
an update method, triggered by a PUT endpoint - I think it's safe to
assume that we'll always want to set these values to whatever they come
as, we just need to convert them to `null` in case they are not present
(i.e. `undefined`).
This fixes an edge case not caught originally in
https://github.com/Unleash/unleash/pull/5304 - When creating a new
segment on the global level:
- There is no `projectId`, either in the params or body
- The `UPDATE_PROJECT_SEGMENT` is still a part of the permissions
checked on the endpoint
- There is no `id` on the params
This made it so that we would run `segmentStore.get(id)` with an
undefined `id`, causing issues.
The fix was simply checking for the presence of `params.id` before
proceeding.
This PR hooks up the changes introduced in #5301 to the API and puts
them behind a feature flag. A new test has been added and the test setup
has been slightly tweaked to allow this test.
When the flag is enabled, the API will now not let you delete a segment
that's used in any active CRs.
https://linear.app/unleash/issue/SR-164/ticket-1106-user-with-createedit-project-segment-is-not-able-to-edit-a
Fixes a bug where the `UPDATE_PROJECT_SEGMENT` permission is not
respected, both on the UI and on the API. The original intention was
stated
[here](https://github.com/Unleash/unleash/pull/3346#discussion_r1140434517).
This was easy to fix on the UI, since we were simply missing the extra
permission on the button permission checks.
Unfortunately the API can be tricky. Our auth middleware tries to grab
the `project` information from either the params or body object, but our
`DELETE` method does not contain this information. There is no body and
the endpoint looks like `/admin/segments/:id`, only including the
segment id.
This means that, in the rbac middleware when we check the permissions,
we need to figure out if we're in such a scenario and fetch the project
information from the DB, which feels a bit hacky, but it's something
we're seemingly already doing for features, so at least it's somewhat
consistent.
Ideally what we could do is leave this API alone and create a separate
one for project segments, with endpoints where we would have project as
a param, like so:
`http://localhost:4242/api/admin/projects/:projectId/segments/1`.
This PR opts to go with the quick and hacky solution for now since this
is an issue we want to fix quickly, but this is something that we should
be aware of. I'm also unsure if we want to create a new API for project
segments. If we decide that we want a different solution I don't mind
either adapting this PR or creating a follow up.
This test was flaky because it relied on the order of the array
returned. To make it less flaky, we now turn the array into an object
instead and compare that.
This PR adds a way to tell if a specific segment is being used in any
active change requests. It's the first step towards preventing segments
that are being used in change requests from being deleted.
It does that by checking the db for any unclosed CRs and using those CR
ids to look for "addStrategy" and "updateStrategy" events in the cr
events table.
## Upcoming PRs
This only puts in a way to detect it, but doesn't add that to anything.
That'll be in an upcoming iteration.
The `dataPath` was present (but not in the type) in previous versions of
the
error library that we use. But with the recent major upgrade, it's
been removed and the `instancePath` property has finally come into use.
This PR removes all the handling for the previous property and
replaces it with `instancePath`. Because the `dataPath` used full
stops and the `instancePath` uses slashes, we need to change a little
bit of the handling too.
Switch the express-openapi implementation from our internal fork to the
upstream version. We have upstreamed our changes and a new version has
been released, so this should be the last step before we can retire our
fork.
Because some of the dependencies have been updated since our internal
fork, we also need to update some of our error handling to reflect this.
Expose new interface while also getting rid of unneeded compiler ignores
None of the changes should add new security risks, despite this report:
> Code scanning results / CodeQL Failing after 4s — 2 new alerts
including 2 high severity security vulnerabilities
Not sure what that means, maybe a removed ignore...
Sort the items before inserting them into the database in order to
reduce the chance of deadlocks happening when multiple pods are
inserting at the same time.
For a while we ran a diffing algorithm in production to verify that the
results of the refactor did not differ from the previous results. As the
experiment has run it's course and new attributes have been added on top
of the new flow, this will remove the logging and associated code.
`EXECUTE FUNCTION` was introduced in Postgres v11. In Postgres v10 the
syntax was `EXECUTE PROCEDURE`. This fix changes the syntax to `EXECUTE
PROCEDURE`, which is perfectly fine sense our function does not return
anything.
### What
This PR makes the rate limit for user creation and simple login (our
password based login) configurable in the same way you can do
metricsRateLimiting.
### Worth noting
In addition this PR adds a `rate_limit{endpoint, method}` prometheus
gauge, which gets the data from the UnleashConfig.
This PR adds a db table for CR schedules. The table has two columns:
1. `change_request` :: This acts as both a foreign key and as the
primary key for this table.
2. `scheduled_at` :: When the change is scheduled to be applied.
We could use a separate ID column for these rows and put a `unique`
constraint on the `change_request` FK, but I don't think that adds any
more value. However, I'm happy to hear other thoughts around it.
As #4475 says, MD5 is not available in secure places anymore. This PR
swaps out gravatar-url with an inline function using crypto:sha256 which
is FIPS-140-2 compliant. Since we only used this method for generating
avatar URLs the extra customization wasn't needed and we could hard code
the URL parameters.
fixes: Linear
https://linear.app/unleash/issue/SR-112/gh-support-swap-out-gravatar-url-libcloses: #4475
To prepare for 5.6 GA,
I've done a find through both Frontend and Backend here to remove the
usages of the flag. Seems like the flag was only in use in the frontend.
@nunogois can you confirm?
This PR adds a cleanup job that removes unknown feature flags from
last_seen_at_metrics table every 24 hours since we no longer have a
foreign key on the name column in the features table.
## About the changes
This fixes a bug updating a project, when optional data
(defaultStickiness and featureLimit are not part of the payload).
The problem happens due to:
1. ProjectController does not use the type: UpdateProjectSchema for the
request body (will be addressed in another PR in unleash-enterprise)
2. Project Store interface does not match UpdateProjectSchema (but it
relies on accepting `additional properties: true`, which is what we
agreed on for input)
3. Feature limit is not defined in UpdateProjectSchema (also addressed
in the other PR)
https://linear.app/unleash/issue/2-1531/rename-message-banners-to-banners
This renames "message banners" to "banners".
I also added support for external banners coming from a `banner` flag
instead of only `messageBanner` flag, so we can eventually migrate to
the new one in the future if we want.
## About the changes
This makes sure that projects have at least one owner, either a group or
a user. This is to prevent accidentally losing access to a project.
We check this when removing a user/group or when changing the role of a
user/group
**Note**: We can still leave a group empty as the only owner of the
project, but that's okay because we can still add more users to the
group
Sort array items before running compare. Feature flag certain properties
of strategy that were previously not present in the /api/admin/features
endpoint.
### What
The heaviest requests we serve are the register and metrics POSTs from
our SDKs/clients.
This PR adds ratelimiting to /api/client/register, /api/client/metrics,
/api/frontend/register and /api/frontend/metrics with a default set to
6000 requests per minute (or 100 rps) for each of the endpoints.
It will be overrideable by the environment variables documented.
### Points of discussion
@kwasniew already suggested using featuretoggles with variants to
control the rate per clientId. I struggled to see if we could
dynamically update the middleware after initialisation, so this attempt
will need a restart of the pod to update the request limit.
## About the changes
This small improvement aims to help developers when instantiating
services. They need to be constructed without injecting services or
stores created elsewhere so they can be bound to the same transactional
scope.
This suggests that you need to create the services and stores on your
own
This fixes a return type error by changing the logic of
`extractUsernameFromUser` to never return undefined.
In the previous code, `user` could be truthy, but that doesn't mean
`email` or `username` were defined. This assumes we always fallback to
"unknown" in those scenarios.
This PR is the first step in separating the client and admin stores.
Currently our feature toggle services uses the client store to serve
multiple purposes.
Admin API uses the feature toggle service to serve both the feature
toggle list and playground features, while the client API uses the
feature toggle service to serve client features. The admin API can
change often and have very different requirements than the client API,
which changes infrequently and generally keeps the same stable structure
for long periods of time. This architecture is error prone, because when
you need to make changes to the admin API, you can very easily affect
the client API.
I aim to put up a stone wall between the two APIs. Complete separation
between the two APIs, at the cost of some duplication.
In this PR I have created a feature oriented architecture for client
features and disconnected the client API from the feature toggle
service. It now goes through it's own service to it's own store. For
feature toggle service I have duplicated and replaced the functionality
that serves /api/admin/features, I have kept a lot of the ugliness in
the code and haven't removed anything in order to avoid breaking
changes.
Next steps:
* Move playground to admin API
* Remove client-feature-toggle-store from feature-toggle-service
## About the changes
This splits the interfaces for import and export, especially because the
import functionality has to be replaced in enterprise repo.
This is a breaking change because of the service renames, but I'll have
the PR for the other repository ready so we reduce the time to fix. I
intentionally avoided doing it backward compatible because of time.
https://linear.app/unleash/issue/2-1494/re-order-message-banners
- Re-orders message banners to fit into this logic:
>1. Maintenance banner
>2. External message banner(s) - Most likely coming from Unleash
>3. Internal message banner(s)
- Renames the feature flag to better reflect the feature behavior;
- Lays a basic skeleton structure for this new feature;
As part of more telemetry on the usage of Unleash.
This PR adds a new `stat_` prefixed table as well as a trigger on the
events table trigger on each insert to increment a counter per
environment per day.
The trigger will trigger on every insert into the events base, but will
filter and only increment the counter for events that actually have the
environment set. (there are events, like user-created, that does not
relate to a specific environment).
Bit wary on this, but since we truncate down to row per (day,
environment) combo, finding conflict and incrementing shouldn't take too
long here.
@ivarconr was it something like this you were considering?
This PR cleans up and refactors the feature-strategy-store method
getFeatureOverview to join on the new table and attempts to make the
function more readable by extracting some of the logic into separate
functions. Keeping the LastSeenMapper for now in case there is a reason
to use it for the other endpoints.
## About the changes
Segment changes in predata and data columns were both showing the new
segments list
Adds formatting of what's changed with segments to feature strategy
update events, so when a user changes the strategy from using
constraints, to using segments instead, it's communicated in event
updates
results in:
admin updated
[sample-toggle](http://localhost/projects/default/features/sample-toggle)
in project [default](http://localhost/projects/default) by updating
strategy Sample Strategy in development constraints from [userId is one
of (1,2,3)] to empty set of constraints; segments from empty set of
segments to (1)
Closes #
#4912
### Important files
- `src/lib/services/feature-toggle-service.ts` - Segment changes in
preData and data
- `src/lib/addons/feature-event-formatter-md.ts` - Formatting segments
## Discussion points
This is an SR least effort PR - we should plan a task where we look at
how to render this list of segments in a more comprehensible way (it's
just rendering ids now)
Fixes an issue where SSO group sync would delete a syncable group that a
user was manually added to
## Discussion points
Is this the longterm fix for this? Or would we want another column in
the mapping table for future-proofing this?
## About the changes
This transactional implementation decorates a service with a
transactional method that removes the need to start transactions in the
method using the service.
This is a gradual rollout with a feature toggle, just because
transactions are not easy.
https://linear.app/unleash/issue/2-1253/add-support-for-more-events-in-the-slack-app-integration
Adds support for a lot more events in our integrations. Here is how the
full list looks like:
- ADDON_CONFIG_CREATED
- ADDON_CONFIG_DELETED
- ADDON_CONFIG_UPDATED
- API_TOKEN_CREATED
- API_TOKEN_DELETED
- CHANGE_ADDED
- CHANGE_DISCARDED
- CHANGE_EDITED
- CHANGE_REQUEST_APPLIED
- CHANGE_REQUEST_APPROVAL_ADDED
- CHANGE_REQUEST_APPROVED
- CHANGE_REQUEST_CANCELLED
- CHANGE_REQUEST_CREATED
- CHANGE_REQUEST_DISCARDED
- CHANGE_REQUEST_REJECTED
- CHANGE_REQUEST_SENT_TO_REVIEW
- CONTEXT_FIELD_CREATED
- CONTEXT_FIELD_DELETED
- CONTEXT_FIELD_UPDATED
- FEATURE_ARCHIVED
- FEATURE_CREATED
- FEATURE_DELETED
- FEATURE_ENVIRONMENT_DISABLED
- FEATURE_ENVIRONMENT_ENABLED
- FEATURE_ENVIRONMENT_VARIANTS_UPDATED
- FEATURE_METADATA_UPDATED
- FEATURE_POTENTIALLY_STALE_ON
- FEATURE_PROJECT_CHANGE
- FEATURE_REVIVED
- FEATURE_STALE_OFF
- FEATURE_STALE_ON
- FEATURE_STRATEGY_ADD
- FEATURE_STRATEGY_REMOVE
- FEATURE_STRATEGY_UPDATE
- FEATURE_TAGGED
- FEATURE_UNTAGGED
- GROUP_CREATED
- GROUP_DELETED
- GROUP_UPDATED
- PROJECT_CREATED
- PROJECT_DELETED
- SEGMENT_CREATED
- SEGMENT_DELETED
- SEGMENT_UPDATED
- SERVICE_ACCOUNT_CREATED
- SERVICE_ACCOUNT_DELETED
- SERVICE_ACCOUNT_UPDATED
- USER_CREATED
- USER_DELETED
- USER_UPDATED
I added the events that I thought were relevant based on my own
discretion. Know of any event we should add? Let me know and I'll add it
🙂
For now I only added these events to the new Slack App integration, but
we can add them to the other integrations as well since they are now
supported.
The event formatter was refactored and changed quite a bit in order to
make it easier to maintain and add new events in the future. As a
result, events are now posted with different text. Do we consider this a
breaking change? If so, I can keep the old event formatter around,
create a new one and only use it for the new Slack App integration.
I noticed we don't have good 404 behaviors in the UI for things that are
deleted in the meantime, that's why I avoided some links to specific
resources (like feature strategies, integration configurations, etc),
but we could add them later if we improve this.
This PR also tries to add some consistency to the the way we log events.
## About the changes
In our staging setup, we create ad-hoc environments and import Unleash
state from production. After unleash is deployed on such environment,
the import job kicks in and feeds Unleash instance with feature flags
from production.
Between Unleash being up and running and the import job running, some
applications start polling Unleash. They get an empty feature toggle
list with `meta.revisionId=0`. Then apps use this as part of `eTag`
header in subsequent requests. Even though after import Unleash server
finally has toggles to serve, it doesn't because it calculates _max
revision id_ based on toggle updates (not null `feature_name` column in
query) or `SEGMENT_UPDATED`.
This change adds an extra condition to query so feature toggles import
is considered something that should invalidate the cache.
This commit changes our linter/formatter to biome (https://biomejs.dev/)
Causing our prehook to run almost instantly, and our "yarn lint" task to
run in sub 100ms.
Some trade-offs:
* Biome isn't quite as well established as ESLint
* Are we ready to install a different vscode plugin (the biome plugin)
instead of the prettier plugin
The configuration set for biome also has a set of recommended rules,
this is turned on by default, in order to get to something that was
mergeable I have turned off a couple the rules we seemed to violate the
most, that we also explicitly told eslint to ignore.
## About the changes
This fixes a bunch of openHandles from our tests
I've used this script to find out the ones that leave them:
`find src -name "*.test.ts" -printf "%f\n" | xargs -i sh -c "echo =====
{} && yarn test {}"`
If there's an issue, the script will halt and the last filename will be
the one that has to be fixed.
Each commit fixes one problem so it's easy to review
## About the changes
Add partial index on events by announced. This should help avoid `Seq
Scan on events` when the majority of events are announced=true
---
Co-authored-by: Ivar Østhus <ivar@getunleash.io>
Co-authored-by: Gard Rimestad <gard@getunleash.io>
## About the changes
When the events table is large we might be doing a full table scan
searching for unannounced events. We spotted it due to a performance
alert and confirmed in AWS performance insights
![image](https://github.com/Unleash/unleash/assets/455064/8e815fa3-7a1b-4453-881a-98a148eae119)
The proposal is to limit this operation to 500 events (rule of thumb)
per round
f82ae354eb/src/lib/services/index.ts (L141-L147)
and also ignore the events older than a day (because it seems
reasonable)
## Discussion points
**Idea**: split the `events` table into `recent_events` and
`historical_events`. Recent can be anything from a day/week/month. This
would help with recurrent queries that rely on recent data from the
event's table such as optimal 304 calculation or event this scheduled
task that sends unannounced events.
https://linear.app/unleash/issue/2-1403/consider-refactoring-the-way-tags-are-fetched-for-the-events
This adds 2 methods to `EventService`:
- `storeEvent`;
- `storeEvents`;
This allows us to run event-specific logic inside these methods. In the
case of this PR, this means fetching the feature tags in case the event
contains a `featureName` and there are no tags specified in the event.
This prevents us from having to remember to fetch the tags in order to
store feature-related events except for very specific cases, like the
deletion of a feature - You can't fetch tags for a feature that no
longer exists, so in that case we need to pre-fetch the tags before
deleting the feature.
This also allows us to do any event-specific post-processing to the
event before reaching the DB layer.
In general I think it's also nicer that we reference the event service
instead of the event store directly.
There's a lot of changes and a lot of files touched, but most of it is
boilerplate to inject the `eventService` where needed instead of using
the `eventStore` directly.
Hopefully this will be a better approach than
https://github.com/Unleash/unleash/pull/4729
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
## About the changes
Improvement to the description of the datadog integration. Adds 2
missing event types, removes an event type that is deprecated and about
to be completely removed, adds missing description of extra json headers
and source type name, and adds description for the new configuration
option for JSON body support
---------
Co-authored-by: Nuno Góis <github@nunogois.com>
Co-authored-by: Thomas Heartman <thomas@getunleash.ai>
https://linear.app/unleash/issue/2-1235/docs-slack-app-integration-documentation
This adds a new reference doc for the new Unleash Slack App integration
and marks the previous Slack integration as deprecated.
As a side-effect this PR also fixes an issue where we wouldn't be able
to delete tags with special characters.
---------
Co-authored-by: David Leek <david@getunleash.io>
Co-authored-by: Thomas Heartman <thomas@getunleash.ai>
We love all open-source Unleash users. in 2022 we built the [segment
capability](https://docs.getunleash.io/reference/segments) (v4.13) as an
enterprise feature, simplify life for our customers.
Now it is time to contribute it to the world 🌏
---------
Co-authored-by: Thomas Heartman <thomas@getunleash.io>
## About the changes
Adds optional support for specifying JSON templates for datadog message
payload
![image](https://github.com/Unleash/unleash/assets/707867/eb7c838a-7abf-441e-972e-ddd7ada07efa)
### Important files
<!-- PRs can contain a lot of changes, but not all changes are equally
important. Where should a reviewer start looking to get an overview of
the changes? Are any files particularly important? -->
`frontend/src/component/integrations/IntegrationForm/IntegrationParameters/IntegrationParameter/IntegrationParameterEnableWithDropdown.tsx`
- a new component comprising of a text field and a dropdown menu
`src/lib/addons/datadog.ts` - Where the integration is taking place
## Discussion points
<!-- Anything about the PR you'd like to discuss before it gets merged?
Got any questions or doubts? -->
- Should I have implemented the new component type as a specifiable
addon parameter type in definitions? Felt a bit YAGNI/Premature
- Would like input on naming and the new component etc
## About the changes
This enables us to use names instead of permission ids across all our
APIs at the computational cost of searching for the ids in the DB but
improving the API user experience
## Open topics
We're using methods that are test-only and circumvent our business
logic. This makes our test to rely on assumptions that are not always
true because these assumptions are not validated frequently.
i.e. We are expecting that after removing a permission it's no longer
there, but to test this, the permission has to be there before:
78273e4ff3/src/test/e2e/services/access-service.e2e.test.ts (L367-L375)
But it seems that's not the case.
We'll look into improving this later.
## About the changes
Open API code generator does not get along with `oneOf` alongside
`properties`:
```shell
$ openapi-generator-cli validate -i modified-openapi.json --recommend
Validating spec (modified-openapi.json)
Warnings:
- Schemas defining properties and oneOf are not clearly defined in the OpenAPI
Specification. While our tooling supports this, it may cause issues with other tools.
```
bab67e44e4/modules/openapi-generator/src/main/java/org/openapitools/codegen/validations/oas/OpenApiSchemaValidations.java (L25-L29)
This PR adds a meta-schema rule to validate this and fixes one issue
## About the changes
- `getActiveUsers` is using multiple stores, so it is refactored into
read-model
- Refactored Instance stats service into `features` to co-locate related
code
Closes https://linear.app/unleash/issue/UNL-230/active-users-prometheus
### Important files
`src/lib/features/instance-stats/getActiveUsers.ts`
## Discussion points
`getActiveUsers` is coded less _class-based_ then previous similar
read-models. In one file instead of 3 (read-model interface, fake read
model, sql read model). I find types and functions way more readable,
but I'm ready to refactor it to interfaces and classes if consistency is
more important.
https://linear.app/unleash/issue/2-1393/drop-the-always-post-to-default-channels-field
This drops the "Always post to default channels" field in the Slack App
integration in favor of always posting to the configured channels. This
should simplify the configuration of this integration.
Here's a breakdown of the logic with this change:
- Always post to the configured Slack channels, regardless of tags;
- Tags are still respected. E.g. if we have a configured channel
"channel-1" and a tag for "channel-2", then we post to both channels;
- As channels are optional, if you would like to skip default channels
for certain events and handle everything through tags, you can just
create a new configuration without any default channels;
This also updates the labels and changes the tests to better reflect the
intended behavior.
![image](https://github.com/Unleash/unleash/assets/14320932/a2427bdd-4b92-44b3-9bad-8adb0f94c34d)
https://linear.app/unleash/issue/2-1401/misc-fixes-and-improvements-related-to-the-new-slack-app-integration
This includes multiple UI-related misc fixes and improvements that are
not only related with the new Slack App integration but also
integrations in general.
- Improves the styling in the "how does it work" section;
- Improves the text in the `IntegrationMultiSelector`s;
- Switches "Configure" and "Open" around to match designs;
- Properly handles click event on `IntegrationCardMenu` (fix navigation
on dialog click);
- Fixes titles and contents for "enable/disable" and "delete"
integration dialogs to match designs;
- Updates Slack App integration "how does it work" section to better
reflect the intended behavior;
- Removes redundant alerts after previous point;
- Adds an alert in the old Slack integration configuration warning of
its deprecation and suggesting the new Slack App integration instead;
- Fixes typos;
- Slight refactors;
![image](https://github.com/Unleash/unleash/assets/14320932/17b09742-f00b-4be2-829f-8248ffe67996)
Co-authored by @nicolaesocaciu
Seems like when 2 pods are trying to POST lastSeen metrics, the db gets
into a deadlock state.
This is an attempt to fix the deadlock by sorting the toggleNames before
the update.
The hypothesis is that sorted toggle names will reduce the chance of
working on the same row at the same exact time
Closes #
[1-1382](https://linear.app/unleash/issue/1-1382/order-data-before-updating-the-lastseen-to-reduce-change-of-deadlock)
Signed-off-by: andreas-unleash <andreas@getunleash.ai>