This PR allows you to gradually lower constraint values, even if they're
above the limits.
It does, however, come with a few caveats because of how Unleash deals
with constraints:
Constraints are just json blobs. They have no IDs or other
distinguishing features. Because of this, we can't compare the current
and previous state of a specific constraint.
What we can do instead, is to allow you to lower the amount of
constraint values if and only if the number of constraints hasn't
changed. In this case, we assume that you also haven't reordered the
constraints (not possible from the UI today). That way, we can compare
constraint values between updated and existing constraints based on
their index in the constraint list.
It's not foolproof, but it's a workaround that you can use. There's a
few edge cases that pop up, but that I don't think it's worth trying to
cover:
Case: If you **both** have too many constraints **and** too many
constraint values
Result: You won't be allowed to lower the amount of constraints as long
as the amount of strategy values is still above the limit.
Workaround: First, lower the amount of constraint values until you're
under the limit and then lower constraints. OR, set the constraint you
want to delete to a constraint that is trivially true (e.g. `currentTime
> yesterday` ). That will essentially take that constraint out of the
equation, achieving the same end result.
Case: You re-order constraints and at least one of them has too many
values
Result: You won't be allowed to (except for in the edge case where the
one with too many values doesn't move or switches places with another
one with the exact same amount of values).
Workaround: We don't need one. The order of constraints has no effect on
the evaluation.
https://linear.app/unleash/issue/2-2450/register-integration-events-webhook
Registers integration events in the **Webhook** integration.
Even though this touches a lot of files, most of it is preparation for
the next steps. The only actual implementation of registering
integration events is in the **Webhook** integration. The rest will
follow on separate PRs.
Here's an example of how this looks like in the database table:
```json
{
"id": 7,
"integration_id": 2,
"created_at": "2024-07-18T18:11:11.376348+01:00",
"state": "failed",
"state_details": "Webhook request failed with status code: ECONNREFUSED",
"event": {
"id": 130,
"data": null,
"tags": [],
"type": "feature-environment-enabled",
"preData": null,
"project": "default",
"createdAt": "2024-07-18T17:11:10.821Z",
"createdBy": "admin",
"environment": "development",
"featureName": "test",
"createdByUserId": 1
},
"details": {
"url": "http://localhost:1337",
"body": "{ \"id\": 130, \"type\": \"feature-environment-enabled\", \"createdBy\": \"admin\", \"createdAt\": \"2024-07-18T17: 11: 10.821Z\", \"createdByUserId\": 1, \"data\": null, \"preData\": null, \"tags\": [], \"featureName\": \"test\", \"project\": \"default\", \"environment\": \"development\" }"
}
}
```
This PR updates the limit validation for constraint numbers on a single
strategy. In cases where you're already above the limit, it allows you
to still update the strategy as long as you don't add any **new**
constraints (that is: the number of constraints doesn't increase).
A discussion point: I've only tested this with unit tests of the method
directly. I haven't tested that the right parameters are passed in from
calling functions. The main reason being that that would involve
updating the fake strategy and feature stores to sync their flag lists
(or just checking that the thrown error isn't a limit exceeded error),
because right now the fake strategy store throws an error when it
doesn't find the flag I want to update.
https://linear.app/unleash/issue/2-2453/validate-patched-data-against-schema
This adds schema validation to patched data, fixing potential issues of
patching data to an invalid state.
This can be easily reproduced by patching a strategy constraints to be
an object (invalid), instead of an array (valid):
```sh
curl -X 'PATCH' \
'http://localhost:4242/api/admin/projects/default/features/test/environments/development/strategies/8cb3fec6-c40a-45f7-8be0-138c5aaa5263' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '[
{
"path": "/constraints",
"op": "replace",
"from": "/constraints",
"value": {}
}
]'
```
Unleash will accept this because there's no validation that the patched
data actually looks like a proper strategy, and we'll start seeing
Unleash errors due to the invalid state.
This PR adapts some of our existing logic in the way we handle
validation errors to support any dynamic object. This way we can perform
schema validation with any object and still get the benefits of our
existing validation error handling.
This PR also takes the liberty to expose the full instancePath as
propertyName, instead of only the path's last section. We believe this
has more upsides than downsides, especially now that we support the
validation of any type of object.
![image](https://github.com/user-attachments/assets/f6503261-f6b5-4e1d-9ec3-66547d0d061f)
This PR adds prometheus metrics for when users attempt to exceed the
limits for a given resource.
The implementation sets up a second function exported from the
ExceedsLimitError file that records metrics and then throws the error.
This could also be a static method on the class, but I'm not sure that'd
be better.
This PR updates the OpenAPI error converter to also work for errors with
query parameters.
We previously only sent the body of the request along with the error,
which meant that query parameter errors would show up incorrectly.
For instance given a query param with the date format and the invalid
value `01-2020-01`, you'd previously get the message:
> The `from` value must match format "date". You sent undefined
With this change, you'll get this instead:
> The `from` value must match format "date". You sent "01-2020-01".
The important changes here are two things:
- passing both request body and query params
- the 3 lines in `fromOpenApiValidationError` that check where we should
get the value you sent from.
The rest of it is primarily updating tests to send the right arguments
and some slight rewording to more accurately reflect that this can be
either request body or query params.
https://linear.app/unleash/issue/2-2435/create-migration-for-a-new-integration-events-table
Adds a DB migration that creates the `integration_events` table:
- `id`: Auto-incrementing primary key;
- `integration_id`: The id of the respective integration (i.e.
integration configuration);
- `created_at`: Date of insertion;
- `state`: Integration event state, as text. Can be anything we'd like,
but I'm thinking this will be something like:
- Success ✅
- Failed ❌
- SuccessWithErrors ⚠️
- `state_details`: Expands on the previous column with more details, as
text. Examples:
- OK. Status code: 200
- Status code: 429 - Rate limit reached
- No access token provided
- `event`: The whole event object, stored as a JSON blob;
- `details`: JSON blob with details about the integration execution.
Will depend on the integration itself, but for example:
- Webhook: Request body
- Slack App: Message text and an array with all the channels we're
posting to
I think this gives us enough flexibility to cover all present and
(possibly) future integrations, but I'd like to hear your thoughts.
I'm also really torn on what to call this table:
- `integration_events`: Consistent with the feature name. Addons are now
called integrations, so this would be consistent with the new thing;
- `addon_events`: Consistent with the existing `addons` table.
Our CSP reports that unsafe-inline is not recommended for styleSrc. This
PR adds a flag for making it possible to remove this element of our CSP
headers. It should allow us to see what (if anything) breaks hard.
We'll store hashes for the last 5 passwords, fetch them all for the user
wanting to change their password, and make sure the password does not
verify against any of the 5 stored hashes.
Includes some password-related UI/UX improvements and refactors. Also
some fixes related to reset password rate limiting (instead of an
unhandled exception), and token expiration on error.
---------
Co-authored-by: Nuno Góis <github@nunogois.com>
PR #7519 introduced the pattern of using `createApiTokenService` instead
of newing it up. This usage was introduced in a concurrent PR (#7503),
so we're just cleaning up and making the usage consistent.
Deletes API tokens bound to specific projects when the last project they're mapped to is deleted.
---------
Co-authored-by: Tymoteusz Czech <2625371+Tymek@users.noreply.github.com>
Co-authored-by: Thomas Heartman <thomas@getunleash.io>
If you have SDK tokens scoped to projects that are deleted, you should
not get access to any flags with those.
---------
Co-authored-by: David Leek <david@getunleash.io>
This PR adds a feature flag limit to Unleash. It's set up to be
overridden in Enterprise, where we turn the limit up.
I've also fixed a couple bugs in the fake feature flag store.
This adds an extended metrics format to the metrics ingested by Unleash
and sent by running SDKs in the wild. Notably, we don't store this
information anywhere new in this PR, this is just streamed out to
Victoria metrics - the point of this project is insight, not analysis.
Two things to look out for in this PR:
- I've chosen to take extend the registration event and also send that
when we receive metrics. This means that the new data is received on
startup and on heartbeat. This takes us in the direction of collapsing
these two calls into one at a later point
- I've wrapped the existing metrics events in some "type safety", it
ain't much because we have 0 type safety on the event emitter so this
also has some if checks that look funny in TS that actually check if the
data shape is correct. Existing tests that check this are more or less
preserved
This PR adds the back end for API token resource limits.
It adds the limit to the schema and checks the limit in the service.
## Discussion points
The PAT service uses a different service and different store entirely,
so I have not included testing any edge cases where PATs are included.
However, that could be seen as "knowing too much". We could add tests
that check both of the stores in tandem, but I think it's overkill for
now.
This PR updates the Unleash UI to use the new environment limit.
As it turns out, we already had an environment limit in the UI, but it
was hardcoded (luckily, its value is the same as the new default value
🥳).
In addition to the existing places this limit was used, it also disables
the "new environment" button if you've reached the limit. Because this
limit already exists, I don't think we need a flag for it. The only
change is that you can't click a button (that should be a link!) that
takes you to a page you can't do anything on.
This PR adds limits for environments to the resource limit schema. The
actual limiting will have to be done in Enterprise, however, so this is
just laying the groundwork.
This fixes the issue where project names that are 100 characters long
or longer would cause the project creation to fail. This is because
the resulting ID would be longer than the 100 character limit imposed
by the back end.
We solve this by capping the project ID to 90 characters, which leaves
us with 10 characters for the suffix, meaning you can have 1 billion
projects (999,999,999 + 1) that start with the same 90
characters (after slugification) before anything breaks.
It's a little shorter than what it strictly has to be (we could
probably get around with 95 characters), but at this point, you're
reaching into edge case territory anyway, and I'd rather have a little
too much wiggle room here.
This PR removes the last two flags related to the project managament
improvements project, making the new project creation form GA.
In doing so, we can also delete the old project creation form (or at
least the page, the form is still in use in the project settings).
This PR:
- adds a flag to anonymize user emails in the new project cards
- performs the anonymization using the existing `anonymise` function
that we have.
It does not anonymize the system user, nor does it anonymize groups. It
does, however, leave the gravatar url unchanged, as that is already
hashed (but we may want to hide that too).
This PR also does not affect the user's name or username. Considering
the target is the demo instance where the vast majority of users don't
have this (and if they do, they've chosen to set it themselves), this
seems an appropriate mitigation.
With the flag turned off:
![image](https://github.com/Unleash/unleash/assets/17786332/10a84562-c025-4e5c-b642-f949595b4e7e)
With the flag on:
![image](https://github.com/Unleash/unleash/assets/17786332/6fc35203-e2fa-4208-9650-0a87d3898996)
Fix project role assignment for users with `ADMIN` permission, even if
they don't have the Admin root role. This happens when e.g. users
inherit the `ADMIN` permission from a group root role, but are not
Admins themselves.
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
This PR adds metrics tracking for:
- "maxConstraintValues": the highest number of constraint values that
are in use
- "maxConstraintsPerStrategy": the highest number of constraints used on
a strategy
It updates the existing feature strategy read model that returns max
metrics for other strategy-related things.
It also moves one test into a more fitting describe block.
Instead of running exists on every row, we are joining the exists, which
runs the query only once.
This decreased load time on my huge dataset from 2000ms to 200ms.
Also added tests that values still come through as expected.
Instead of running exists on every row, we are joining the exists, which
runs the query only once.
This decreased load time on my huge dataset from 2000ms to 200ms.
Also added tests that values still come through as expected.
**Upgrade to React v18 for Unleash v6. Here's why I think it's a good
time to do it:**
- Command Bar project: We've begun work on the command bar project, and
there's a fantastic library we want to use. However, it requires React
v18 support.
- Straightforward Upgrade: I took a look at the upgrade guide
https://react.dev/blog/2022/03/08/react-18-upgrade-guide and it seems
fairly straightforward. In fact, I was able to get React v18 running
with minimal changes in just 10 minutes!
- Dropping IE Support: React v18 no longer supports Internet Explorer
(IE), which is no longer supported by Microsoft as of June 15, 2022.
Upgrading to v18 in v6 would be a good way to align with this change.
TS updates:
* FC children has to be explicit:
https://stackoverflow.com/questions/71788254/react-18-typescript-children-fc
* forcing version 18 types in resolutions:
https://sentry.io/answers/type-is-not-assignable-to-type-reactnode/
Test updates:
* fixing SWR issue that we have always had but it manifests more in new
React (https://github.com/vercel/swr/issues/2373)
---------
Co-authored-by: kwasniew <kwasniewski.mateusz@gmail.com>
This change adds a test to the tags API to ensure that even if you
can't create tags that are pure whitespace anymore, you'll still
receive pre-existing tags from the API that fit this description.
The test is here to ensure that we don't break this in future versions
of Unleash.
This PR fixes how Unleash handles tag values. Specifically, it does
these things:
1. Trims leading and trailing whitespace from tag values before
inserting them into the database
2. Updates OpenAPI validation to not allow whitespace-only and to ignore
leading and trailing whitespace
Additionally, it moves the tag length constants into the constants file
from the Joi tag schema file. This is because importing the values
previously rendered them as undefined (probably due to a circular
dependency somewhere in the system). This means that the previous values
were also ignored by OpenAPI.
UI updates reflecting this wil follow.
## Background
When you tag a flag, there's nothing stopping you from using an entirely
empty tag or a tag with leading/trailing whitespace.
Empty tags make little sense and leading trailing whitespace differences
are incredibly subtle:
![image](https://github.com/Unleash/unleash/assets/17786332/ec8fe193-8837-4c6a-b7bf-8766eff34eed)
Additionally, leading and trailing whitespace is not shown in the
dropdown list, so you'd have to guess at which is the right one.
![image](https://github.com/Unleash/unleash/assets/17786332/a14698ab-2bfd-4ec3-8814-b8e876d0aadb)
Joining might not always be the best solution. If a table contains too
much data, and you later run sorting on top of it, it will be slow.
In this case, we will first reduce the instances table to a minimal
version because instances usually share the same SDK versions. Only
after that, we join.
Based on some customer data, we reduced query time from 3000ms to 60ms.
However, this will vary based on the number of instances the customer
has.
This PR removes the flag for the new project card design, making it GA.
It also removes deprecated components and updates one reference (in the
groups card) to the new components instead.
This PR removes all the feature flags related to the project list split
and updates the snapshot.
Now the project list will always contain "my projects" and "other
projects"
## About the changes
After an internal conversation, we concluded that syncExternalGroups is
an action that Unleash performs as a system, not something triggered by
the user. We keep the method and just write the event log that the
action was performed by the system user.
We have an issue that if you open up Insights, the Time to Production
chart was showing nothing because it was taking the median across all
projects. You might have many new or empty projects where the median was
0 (no data).
For example, the median from [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1.7, 50.3, 140] was 0.
Now, we will remove the 0 values to have a more reasonable median.
## About the changes
Removes the deprecated state endpoint, state-service (despite the
service itself not having been marked as deprecated), and the file
import in server-impl. Leaves a TODO in place of where file import was
as traces for a replacement file import based on the new import/export
functionality
Removes /edge/metrics. This has been superseded by
/api/client/metrics/bulk. Once this is merged, Unleash 6.0 will require
Edge > 17.0.0. (We recommend at least v19.1.3)
We are keeping the UI hidden for mdsol behind kill switch, but I feel
like we can remove the flag completely for backend, so everyone will
keep collecting data.
Co-authored-by: Gitar Bot <noreply@gitar.co>
## About the changes
This aligns us with the requirement of having ip in all events. After
tackling the enterprise part we will be able to make the ip field
mandatory here:
2c66a4ace4/src/lib/types/events.ts (L362)
In preparation for v6, this PR removes usage and references to
`error.description` instead favoring `error.message` (as mentioned
#4380)
I found no references in the front end, so this might be (I believe it
to be) all the required changes.
This PR is part of #4380 - Remove legacy `/api/feature` endpoint.
## About the changes
### Frontend
- Removes the useFeatures hook
- Removes the part of StrategyView that displays features using this
strategy (not been working since v4.4)
- Removes 2 unused features entries from routes
### Backend
- Removes the /api/admin/features endpoint
- Moves a couple of non-feature related tests (auth etc) to use
/admin/projects endpoint instead
- Removes a test that was directly related to the removed endpoint
- Moves a couple of tests to the projects/features endpoint
- Reworks some tests to fetch features from projects features endpoint
and strategies from project strategies
## About the changes
EdgeService is the only place where we use active tokens validation in
bulk. By switching to validating from the cache, we no longer need a
method to return all active tokens from the DB.
We talked with @nunogois that this test is testing migration from 2023,
but time has passed and the migration is working properly, so we think
to cut down on test run time, we can remove it.
## About the changes
We've identified that Bearer token middleware is not working for
/enterprise instance.
Looking at a few lines below:
88e3b1b79e/src/lib/app.ts (L81-L84)
we can see that we were missing the basePath in the use definition.
## About the changes
Current state, when returning the HTML entry point from the server,
there are no headers attached. We encountered an issue with a deployment
and this had an impact for us.
A brief description:
1. We deployed the most recent version. Noticed an unrelated issue.
2. Users tried to use the most recent version and due to their client
cache, requested assets that did not exist in the newest version.
3. Our cache layer cached the assets that were not there with the HTML
response. It had to infer the type based on the filename because there
was no attached `Content-Type` header. This cache was very sticky.
4. After rolling back we saw the HTML response (from the cache) instead
of the appropriate response from the upstream Unleash application.
This PR does a few things.
1. When responding with the HTML entry point, it adds header
(`Content-Type: text/html`).
2. When the client is requesting an asset (a path that ends with an
extension), it also instructs the resource not to be cached
(`Cache-Control: no-cache`) and returns a 404. This will prevent misses
from getting cached.
## Discussion points
To me, there doesn't seem to be a lot of test infra on serving the SPA
application. If that is an error, please indicate where that is and an
appropriate test can be added.
Adds a postgres_version gauge to allow us to see postgres_version in
prometheus and to post it upstream when version checking. Depends on
https://github.com/bricks-software/version-function/pull/20 to be merged
first to ensure our version-function doesn't crash when given the
postgres-version data.
1. Added new schema and tests
2. Controller also accepts the data
3. Also sending fake data from frontend currently
Next steps, implement service/store layer and frontend
This fixes the case when a customer have thousands of strategies causing
the react UI to crash. We still consider it incorrect to use that amount
of strategies and this is more a workaround to help the customer out of
a crashing state.
We put it behind a flag called `manyStrategiesPagination` and plan to
only enable it for the customer in trouble.
Now we are also sending project id to prometheus, also querying from
database. This sets us up for grafana dashboard.
Also put the metrics behind flag, just incase it causes cpu/memory
issues.
This PR updates the project service to automatically create a project id
if it is not provided. The feature is behind a flag. If an ID is
provided, it will still attempt to use that ID instead.
This PR adds a function to automatically generate a project ID on
creation. Using this when the id is missing will be handled in following
PRs.
The function uses the existing `slug` package to create a slug, and then
takes the 12 characters of a uuidv4 string to generate an ID.
The included tests check that the 12 character hash is added and that
the resulting string is url friendly (by checking that
`encodeURIComponent` doesn't change it).
We could also test a lot of edge cases (such as dealing with double
spaces, trimming the string, etc), but I think that's better handled by
the library itself (but you can check out what I removed in
2d9bcb6390
for an idea).
The function doesn't really need to be in the service; it could be moved to a util. But for proximity, I'll create it here first.
Regarding ticket #6892:
I would like to enable the use of a CA certificate without requiring
other certificates. This would be useful for AWS Helm, as AWS only
provides a single PEM file for DB connections.
We are getting questions from engineers, why I do not see lifecycle. The
same will happen with our customers. Now customers will see lifecycle
component unified across features.
Final rank has always been ordering correctly by default. But after 5.12
I see some issues that sometimes it is not ordered. Just to be extra
sure, I am for ordering it.
Add a flag to enable/disable the new UI for project creation.
This flag is separate from the impl on the back end so that we can
enable one without the other (but uses flag dependencies in Unleash, so
that we can never enable the new UI without the new back end).
I have not set the flag to `true` in server startup because the form
doesn't work yet, so it's a manual step for now.
This PR removes the workaround introduced in
https://github.com/Unleash/unleash/pull/6931. After
https://github.com/ivarconr/unleash-enterprise/pull/1268 has been
merged, this should be safe to apply.
Notably, this PR:
- tightens up the type for the enable change request function, so we can
use that to inform the code
- skips trying to do anything with an empty array
The last point is less important than it might seem because both the env
validation and the current implementation of the callback is essentially
a no-op when there are no envs. However, that's hard to enforce. If we
just exit out early, then at least we know nothing happens.
Optionally, we could do something like this instead, but I'm not sure
it's better or worse. Happy to take input.
```ts
const crEnvs = newProject.changeRequestEnvironments ?? []
await this.validateEnvironmentsExist(crEnvs.map((env) => env.name));
const changeRequestEnvironments =
await enableChangeRequestsForSpecifiedEnvironments(crEnvs,);
data.changeRequestEnvironments = changeRequestEnvironments;
```
This PR improves the handling of change request enables on project
creation in two ways:
1. We now verify that the envs you try to enable CRs for exist before
passing them on to the enterprise functionality.
2. We include data about environments and change request environments in
the project created events.
Due to how we handle redirects of embedded proxy, we ended up counting
the same request twice. This PR adds a boolean to res.locals which we
then check if set to avoid double counting.
## About the changes
What's going on is the following:
1. When a token is not found in the token's cache we try to find it in
the db
2. To prevent a denial of service attack using invalid tokens, we cache
the invalid tokens so we don't hit the db.
3. The issue is that we stored this token in the cache regardless we
found it or not. And if the token was valid the first time we'd add a
timestamp to avoid querying this token again the next time.
4. The next iteration the token should be in the cache:
54383a6578/src/lib/services/api-token-service.ts (L162)
but for some reason it is not and therefore we have to make a query. But
this is where the query prevention mechanism kicks in because it finds
the token in the cache and kicks us out. This PR fixes this by only
storing in the cache for misses if not found:
54383a6578/src/lib/services/api-token-service.ts (L164-L165)
The token was added to the cache because we were not checking if it had
expired. Now we added a check and we also have a log for expired tokens.
Some improvement opportunities:
- I don't think we display that a token has expired in the UI which
probably led to this issue
- When a token expired we don't display a specific error message or
error response saying that which is not very helpful for users
This PR introduces a configuration option (`authentication.demoAllowAdminLogin`) that allows you to log in as admin when using demo authentication. To do this, use the username `admin`.
## About the changes
The `admin` user currently cannot be accessed in `demo` authentication
mode, as the auth mode requires only an email to log in, and the admin
user is not created with an email. This change allows for logging in as
the admin user only if an `AUTH_DEMO_ALLOW_ADMIN_LOGIN` is set to `true`
(or the corresponding `authDemoAllowAdminLogin` config is enabled).
<!-- Does it close an issue? Multiple? -->
Closes#6398
### Important files
[demo-authentication.ts](https://github.com/Unleash/unleash/compare/main...00Chaotic:unleash:feat/allow_admin_login_using_demo_auth?expand=1#diff-c166f00f0a8ca4425236b3bcba40a8a3bd07a98d067495a0a092eec26866c9f1R25)
## Discussion points
Can continue discussion of [this
comment](https://github.com/Unleash/unleash/pull/6447#issuecomment-2042405647)
in this PR.
---------
Co-authored-by: Thomas Heartman <thomasheartman+github@gmail.com>
This commit adds an `environments` property to the project created
payload. The list contains only the projects that the project has
enabled.
The point of adding it is that it gives you a better overview over
what you have created.
This PR adds the `projectListNewCards` flag to the constant defined in
`experimental.ts`. This should allow the API to pass that value to the
front end.
## About the changes
Add time metrics to relevant queries:
- get
- getAll
- bulkInsert
- count
- exists
- get
Ignored because might not be that relevant:
- insert
- delete
- deleteAll
- update
## About the changes
This PR removes the feature flag `queryMissingTokens` that was fully
rolled out.
It introduces a new way of checking edgeValidTokens controlled by the
flag `checkEdgeValidTokensFromCache` that relies in the cached data but
hits the DB if needed.
The assumption is that most of the times edge will find tokens in the
cache, except for a few cases in which a new token is queried. From all
tokens we expect at most one to hit the DB and in this case querying a
single token should be better than querying all the tokens.
This makes it configurable either through a single JSON file with all
three certificates as separate keys or via separate files per
ca/cert/key key.
fixes#6718
I've tried to use/add the audit info to all events I could see/find.
This makes this PR necessarily huge, because we do store quite a few
events.
I realise it might not be complete yet, but tests
run green, and I think we now have a pattern to follow for other events.
This PR adds an optional function parameter to the `createProject`
function that is intended to enable change requests for the newly
created project.
The assumption is that all the logic within will be decided in the
enterprise impl. The only thing we want to verify here is that it is
called after the project has been created.
This PR adds functionality to the `createProject` function to choose
which environments should be enabled when you create a new project. The
new `environments` property is optional and omitting it will make it
work exactly as it does today.
The current implementation is fairly strict. We have some potential
ideas to make it easier to work with, but we haven't agreed on any yet.
Making it this strict means that we can always relax the rules later.
The rules are (codified in tests):
- If `environments` is not provided, all non-deprecated environments are
enabled
- If `environments` is provided, only the environments listed are
enabled, regardless of whether they're deprecated or not
- If `environments` is provided and is an empty array, the service
throws an error. The API should dilsallow that via the schema anyway,
but this catches it in case it sneaks in some other way.
- If `environments` is provided and contains one or more environments
that don't exist, the service throws an error. While we could ignore
them, that would lead to more complexity because we'd have to also check
that the at least one of the environments is valid. It also leads to
silent ignoring of errors, which may or may not be good for the user
experience.
The API endpoint for this sits in enterprise, so no customer-facing
changes are part of this.
We encountered an issue with a customer because this query was returning
3 million rows. The problem arose from each instance reporting
approximately 100 features, with a total of 30,000 instances. The query
was joining these, thus multiplying the data. This approach was fine for
a reasonable number of instances, but in this extreme case, it did not
perform well.
This PR modifies the logic; instead of performing outright joins, we are
now grouping features by environment into an array, resulting in just
one row returned per instance.
I tested locally with the same dataset. Previously, loading this large
instance took about 21 seconds; now it has reduced to 2 seconds.
Although this is still significant, the dataset is extensive.
Previously, we were not validating that the ID was a number, which
sometimes resulted in returning our database queries (source code) to
the frontend. Now, we have validation middleware.
Previously, we were extracting the project from the token, but now we
will retrieve it from the session, which contains the full list of
projects.
This change also resolves an issue we encountered when the token was a
multi-project token, formatted as []:dev:token. Previously, it was
unable to display the exact list of projects. Now, it will show the
exact project names.
<details>
<summary>Feature Flag Cleanup</summary>
| Stale Flag | Value |
| ---------- | ------- |
| stripClientHeadersOn304 | true |
</details>
<details>
<summary>Trigger</summary>
https://github.com/Unleash/unleash/issues/6559#issuecomment-2058848984
</details>
<details>
<summary>Bot Commands</summary>
`@gitar-bot cleanup stale_flag=value` will cleanup a stale feature flag.
Replace `stale_flag` with the name of the stale feature flag and `value`
with either `true` or `false`.
</details>
---------
Co-authored-by: Gitar Bot <noreply@gitar.co>
## About the changes
- Removes the feature flag for the created_by migrations.
- Adds a configuration option in IServerOption for
`ENABLE_SCHEDULED_CREATED_BY_MIGRATION` that defaults to `false`
- the new configuration option when set on startup enables scheduling of
the two created_by migration services (features+events)
- Removes the dependency on flag provider in EventStore as it's no
longer needed
- Adds a brief description of the new configuration option in
`configuring-unleash.md`
- Sets the events created_by migration interval to 15 minutes, up from
2.
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
## About the changes
This PR provides a service that allows a scheduled function to run in a
single instance. It's currently not in use but tests show how to wrap a
function to make it single-instance:
65b7080e05/src/lib/features/scheduler/job-service.test.ts (L26-L32)
The key `'test'` is used to identify the group and most likely should
have the same name as the scheduled job.
---------
Co-authored-by: Christopher Kolstad <chriswk@getunleash.io>
This PR adds a counter in Prometheus for counting the number of
"environment disabled" events we get per project. The purpose of this is
to establish a baseline for one of the "project management UI" project's
key results.
## On gauges vs counters
This PR uses a counter. Using a gauge would give you the total number of
envs disabled, not the number of disable events. The difference is
subtle, but important.
For projects that were created before the new feature, the gauge might
be appropriate. Because each disabled env would require at least one
disabled event, we can get a floor of how many events were triggered for
each project.
However, for projects created after we introduce the planned change,
we're not interested in the total envs anymore, because you can disable
a hundred envs on creation with a single action. In this case, a gauge
showing 100 disabled envs would be misleading, because it didn't take
100 events to disable them.
So the interesting metric here is how many times did you specifically
disable an environment in project settings, hence the counter.
## Assumptions and future plans
To make this easier on ourselves, we make the follow assumption: people
primarily disable envs **when creating a project**.
This means that there might be a few lagging indicators granting some
projects a smaller number of events than expected, but we may be able to
filter those out.
Further, if we had a metric for each project and its creation date, we
could correlate that with the metrics to answer the question "how many
envs do people disable in the first week? Two weeks? A month?". Or
worded differently: after creating a project, how long does it take for
people to configure environments?
Similarly, if we gather that data, it will also make filtering out the
number of events for projects created **after** the new changes have
been released much easier.
The good news: Because the project creation metric with dates is a
static aggregate, it can be applied at any time, even retroactively, to
see the effects.
This PR expands upon #6773 by returning the list of removed properties
in the API response. To achieve this, I added a new top-level `warnings`
key to the API response and added an `invalidContextProperties` property
under it. This is a list with the keys that were removed.
## Discussion points
**Should we return the type of each removed key's value?** We could
expand upon this by also returning the type that was considered invalid
for the property, e.g. `invalidProp: 'object'`. This would give us more
information that we could display to the user. However, I'm not sure
it's useful? We already return the input as-is, so you can always
cross-check. And the only type we allow for non-`properties` top-level
properties is `string`. Does it give any useful info? I think if we want
to display this in the UI, we might be better off cross-referencing with
the input?
**Can properties be invalid for any other reason?** As far as I can
tell, that's the only reason properties can be invalid for the context.
OpenAPI will prevent you from using a type other than string for the
context fields we have defined and does not let you add non-string
properties to the `properties` object. So all we have to deal with are
top-level properties. And as long as they are strings, then they should
be valid.
**Should we instead infer the diff when creating the model?** In this
first approach, I've amended the `clean-context` function to also return
the list of context fields it has removed. The downside to this approach
is that we need to thread it through a few more hoops. Another approach
would be to compare the input context with the context used to evaluate
one of the features when we create the view model and derive the missing
keys from that. This would probably work in 98 percent of cases.
However, if your result contains no flags, then we can't calculate the
diff. But maybe that's alright? It would likely be fewer lines of code
(but might require additional testing), although picking an environment
from feels hacky.
Don't include invalid context properties in the contexts that we
evaluate.
This PR removes any non-`properties` fields that have a non-string
value.
This prevents the front end from crashing when trying to render an
object.
Expect follow-up PRs to include more warnings/diagnostics we can show to
the end user to inform them of what fields have been removed and why.
## About the changes
This PR establishes a simple yet effective mechanism to avoid DDoS
against our DB while also protecting against memory leaks.
This will enable us to release the flag `queryMissingTokens` to make our
token validation consistent across different nodes
---------
Co-authored-by: Nuno Góis <github@nunogois.com>
Converts `newContextFieldUI` release flag to
`disableShowContextFieldSelectionValues` kill switch.
The kill switch controls whether we show the value selection above the
search filed when > 100 values
---------
Signed-off-by: andreas-unleash <andreas@getunleash.ai>
Adds a bearer token middleware that adds support for tokens prefixed
with "Bearer" scheme. Prefixing with "Bearer" is optional and the old
way of authenticating still works, so we now support both ways.
Also, added as part of our OpenAPI spec which now displays authorization
as follows:
![image](https://github.com/Unleash/unleash/assets/455064/77b17342-2315-4c08-bf34-4655e12a1cc3)
Related to #4630. Doesn't fully close the issue as we're still using
some invalid characters for the RFC, in particular `*` and `[]`
For safety reasons this is behind a feature flag
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
This change fixes the OpenAPI schema to disallow non-string properties
on the top level of the context (except, of course, the `properties`
object).
This means that we'll no longer be seeing issues with rendering
invalid contexts, because we don't accept them in the first place.
This solution comes with some tradeoffs discussed in the [PR](https://github.com/Unleash/unleash/pull/6676). Following on from that, this solution isn't optimal, but it's a good stop gap. A better solution (proposed in the PR discussion) has been added as an idea for future projects.
The bulk of the discussion around the solution is included here for reference:
@kwasniew:
Was it possible to pass non string properties with our UI before?
Is there a chance that something will break after this change?
@thomasheartman:
Good question and good looking out 😄
You **could** pass non-string, top-level properties into the API before. In other words, this would be allowed:
```js
{
appName: "my-app",
nested: { object: "accepted" }
}
```
But notably, non-string values under `properties` would **not** be accepted:
```js
{
appName: "my-app",
properties: {
nested: { object: "not accepted" }
}
}
```
**However**, the values would not contribute to the evaluation of any constraints (because their type is invalid), so they would effectively be ignored.
Now, however, you'll instead get a 400 saying that the "nested" value must be a string.
I would consider this a bug fix because:
- if you sent a nested object before, it was most likely an oversight
- if you sent the nested object on purpose, expecting it to work, you would be perplexed as to why it didn't work, as the API accepted it happily
Furthermore, the UI will also tell you that the property must be a string now if you try to do it from the UI.
On the other hand, this does mean that while you could send absolute garbage in before and we would just ignore it, we don't do that anymore. This does go against how we allow you to send anything for pretty much all other objects in our API.
However, the SDK context is special. Arbitrary keys aren't ignored, they're actually part of the context itself and as such should have a valid value.
So if anything breaks, I think it breaks in a way that tells you why something wasn't working before. However, I'd love to hear your take on it and we can re-evaluate whether this is the right fix, if you think it isn't.
@kwasniew:
Coming from the https://en.wikipedia.org/wiki/Robustness_principle mindset I'm thinking if ignoring the fields that are incorrect wouldn't be a better option. So we'd accept incorrect value and drop it instead of:
* failing with client error (as this PR) or
* saving incorrect value (as previous code we had)
@thomasheartman:
Yeah, I considered that too. In fact, that was my initial idea (for the reason you stated). However, there's a couple tradeoffs here (as always):
1. If we just ignore those values, the end user doesn't know what's happened unless they go and dig through the responses. And even then, they don't necessarily know why the value is gone.
2. As mentioned, for the context, arbitrary keys can't be ignored, because we use them to build the context. In other words, they're actually invalid input.
Now, I agree that you should be liberal in what you accept and try to handle things gracefully, but that means you need to have a sensible default to fall back to. Or, to quote the Wikipedia article (selectively; with added emphasis):
> programs that receive messages should accept non-conformant input **as long as the meaning is clear**.
In this case, the meaning isn't clear when you send extra context values that aren't strings.
For instance, what's the meaning here:
```js
{
appName: "my-app",
nested: { object: "accepted", more: { further: "nesting" } }
}
```
If you were trying to use the `nested` value as an object, then that won't work. Ideally, you should be alerted.
Should we "unwind" the object and add all string keys as context values? That doesn't sound very feasible **or** necessarily like the right thing.
Did you just intend to use the `appName` and for the `nested` object to be ignored?
And it's because of this caveat that I'm not convinced just ignoring the keys are the right thing to do. Because if you do, the user never knows they were ignored or why.
----
**However**, I'd be in favor of ignoring they keys if we could **also** give the users warnings at the same time. (Something like what we do in the CR api, right? Success with warnings?)
If we can tell the user that "we ignored the `a`, `b`, and `c` keys in the context you sent because they are invalid values. Here is the result of the evaluation without taking those keys into account: [...]", then I think that's the ideal solution.
But of course, the tradeoff is that that increases the complexity of the API and the complexity of the task. It also requires UI adjustments etc. This means that it's not a simple fix anymore, but more of a mini-project.
But, in the spirit of the playground, I think it would be a worthwhile thing to do because it helps people learn and understand how Unleash works.
Via the API you can currently create gradualRollout strategies without
any parameters set, when visiting the UI afterwards, you can edit this,
because the UI reads the parameter list from the database and sees that
some parameters are required, and refuses to accept the data. This PR
adds defaults for gradualRollout strategies created from the API, making
sure gradual rollout strategies always have `rollout`, `groupId` and
`stickiness` set.
Provides store method for retrieving traffic usage data based on
period parameter, and UI + ui hook with the new chart for displaying
traffic usage data spread out over selectable month.
![Skjermbilde 2024-03-21 kl 12 40
38](https://github.com/Unleash/unleash/assets/707867/539c6c98-b6f6-488a-97fb-baf4fccec687)
In this PR we copied and adapted a plugin written by DX for highlighting
a column in the chart:
![image](https://github.com/Unleash/unleash/assets/707867/70532b22-44ed-44c0-a9b4-75f65ed6a63d)
There are some minor improvements planned which will come in a separate
PR, reversing the order in legend and tooltip so the colors go from
light to dark, and adding a month -sum below the legend
## Discussion points
- Should any of this be extracted as a separate reusable component?
---------
Co-authored-by: Nuno Góis <github@nunogois.com>
## About the changes
There seems to be a typo in the authorization header. We're keeping the
old typo as preferred just in case, but if not present we'll default to
the authorization header (not authorisation).
Not sure about the impact of this bug, as all registrations might be
using default project.
The changes to arbitraries here is to make typescript agree with our
schema types. Seems like somewhere between 4.8.4 and 5.4.2, typescript
got stricter.
## About the changes
We see some logs with: `Failed to store events: Error: The query is
empty` which suggests we're not sending events to batchStore. This will
help us confirm that and will give us better insights
An entrypoint for enterprise to register a hook which will be called
before the database and scheduler services are torn down. That way
enterprise can also perform graceful shutdown.
There was no need to join the entire metrics table, as it is a huge
table. We only needed all combinations of app_name, environment, and
feature_name. The new query retrieves all this data, which will then be
joined into the main query.