### What
The heaviest requests we serve are the register and metrics POSTs from
our SDKs/clients.
This PR adds ratelimiting to /api/client/register, /api/client/metrics,
/api/frontend/register and /api/frontend/metrics with a default set to
6000 requests per minute (or 100 rps) for each of the endpoints.
It will be overrideable by the environment variables documented.
### Points of discussion
@kwasniew already suggested using featuretoggles with variants to
control the rate per clientId. I struggled to see if we could
dynamically update the middleware after initialisation, so this attempt
will need a restart of the pod to update the request limit.
This PR is the first step in separating the client and admin stores.
Currently our feature toggle services uses the client store to serve
multiple purposes.
Admin API uses the feature toggle service to serve both the feature
toggle list and playground features, while the client API uses the
feature toggle service to serve client features. The admin API can
change often and have very different requirements than the client API,
which changes infrequently and generally keeps the same stable structure
for long periods of time. This architecture is error prone, because when
you need to make changes to the admin API, you can very easily affect
the client API.
I aim to put up a stone wall between the two APIs. Complete separation
between the two APIs, at the cost of some duplication.
In this PR I have created a feature oriented architecture for client
features and disconnected the client API from the feature toggle
service. It now goes through it's own service to it's own store. For
feature toggle service I have duplicated and replaced the functionality
that serves /api/admin/features, I have kept a lot of the ugliness in
the code and haven't removed anything in order to avoid breaking
changes.
Next steps:
* Move playground to admin API
* Remove client-feature-toggle-store from feature-toggle-service
As part of more telemetry on the usage of Unleash.
This PR adds a new `stat_` prefixed table as well as a trigger on the
events table trigger on each insert to increment a counter per
environment per day.
The trigger will trigger on every insert into the events base, but will
filter and only increment the counter for events that actually have the
environment set. (there are events, like user-created, that does not
relate to a specific environment).
Bit wary on this, but since we truncate down to row per (day,
environment) combo, finding conflict and incrementing shouldn't take too
long here.
@ivarconr was it something like this you were considering?
This commit changes our linter/formatter to biome (https://biomejs.dev/)
Causing our prehook to run almost instantly, and our "yarn lint" task to
run in sub 100ms.
Some trade-offs:
* Biome isn't quite as well established as ESLint
* Are we ready to install a different vscode plugin (the biome plugin)
instead of the prettier plugin
The configuration set for biome also has a set of recommended rules,
this is turned on by default, in order to get to something that was
mergeable I have turned off a couple the rules we seemed to violate the
most, that we also explicitly told eslint to ignore.
https://linear.app/unleash/issue/2-1403/consider-refactoring-the-way-tags-are-fetched-for-the-events
This adds 2 methods to `EventService`:
- `storeEvent`;
- `storeEvents`;
This allows us to run event-specific logic inside these methods. In the
case of this PR, this means fetching the feature tags in case the event
contains a `featureName` and there are no tags specified in the event.
This prevents us from having to remember to fetch the tags in order to
store feature-related events except for very specific cases, like the
deletion of a feature - You can't fetch tags for a feature that no
longer exists, so in that case we need to pre-fetch the tags before
deleting the feature.
This also allows us to do any event-specific post-processing to the
event before reaching the DB layer.
In general I think it's also nicer that we reference the event service
instead of the event store directly.
There's a lot of changes and a lot of files touched, but most of it is
boilerplate to inject the `eventService` where needed instead of using
the `eventStore` directly.
Hopefully this will be a better approach than
https://github.com/Unleash/unleash/pull/4729
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
We love all open-source Unleash users. in 2022 we built the [segment
capability](https://docs.getunleash.io/reference/segments) (v4.13) as an
enterprise feature, simplify life for our customers.
Now it is time to contribute it to the world 🌏
---------
Co-authored-by: Thomas Heartman <thomas@getunleash.io>
## About the changes
- `getActiveUsers` is using multiple stores, so it is refactored into
read-model
- Refactored Instance stats service into `features` to co-locate related
code
Closes https://linear.app/unleash/issue/UNL-230/active-users-prometheus
### Important files
`src/lib/features/instance-stats/getActiveUsers.ts`
## Discussion points
`getActiveUsers` is coded less _class-based_ then previous similar
read-models. In one file instead of 3 (read-model interface, fake read
model, sql read model). I find types and functions way more readable,
but I'm ready to refactor it to interfaces and classes if consistency is
more important.
This PR adds feature name pattern validation to the import validation
step. When errors occur, they are rendered with all the offending
features, the pattern to match, plus the pattern's description and
example if available.
![image](https://github.com/Unleash/unleash/assets/17786332/69956090-afc6-41c8-8f6e-fb45dfaf0a9d)
To achieve this I've added an extra method to the feature toggle service
that checks feature names without throwing errors (because catching `n`
async errors in a loop became tricky and hard to grasp). This method is
also reused in the existing feature name validation method and handles
the feature enabled chcek.
In doing so, I've also added tests to check that the pattern is applied.
Adds a first iteration of feature flag naming patterns. Currently behind a flag.
Signed-off-by: andreas-unleash <andreas@getunleash.ai>
Co-authored-by: Thomas Heartman <thomas@getunleash.io>
Co-authored-by: andreas-unleash <andreas@getunleash.ai>
Co-authored-by: Thomas Heartman <thomas@getunleash.ai>
## About the changes
Returns Not Found on create and get project api tokens when given a
project id that doesn't exist
## Discussion points
- This is an extra lookup per execution of the endpoint
## About the changes
Returns either 400 or 404 when token isn't found or doesn't match single
project must be provided projectId criteria
<!-- Does it close an issue? Multiple? -->
Closes #
Linear 2-1003
## Discussion points
<!-- Anything about the PR you'd like to discuss before it gets merged?
Got any questions or doubts? -->
Is projects.length > 1 a 400?
https://linear.app/unleash/issue/2-1136/custom-root-roles-documentation
- [Adds documentation referencing custom root
roles](https://unleash-docs-git-docs-custom-root-roles-unleash-team.vercel.app/reference/rbac);
- [Adds a "How to create and assign custom root roles" how-to
guide](https://unleash-docs-git-docs-custom-root-roles-unleash-team.vercel.app/how-to/how-to-create-and-assign-custom-root-roles);
- Standardizes "global" roles to "root" roles;
- Standardizes "standard" roles to "predefined" roles to better reflect
their behavior and what is shown in our UI;
- Updates predefined role descriptions and makes them consistent;
- Updates the side panel description of the user form;
- Includes some boy scouting with some tiny fixes of things identified
along the way (e.g. the role form was persisting old data when closed
and re-opened);
Questions:
- Is it worth expanding the "Assigning custom root roles" section in the
"How to create and assign custom root roles" guide to include the steps
for assigning a root role for each entity (user, service account,
group)?
- Should this PR include an update to the existing "How to create and
assign custom project roles" guide? We've since updated the UI;
---------
Co-authored-by: Thomas Heartman <thomas@getunleash.ai>
## About the changes
Open API is creating 2 resources under the same URL path, we want them
aggregated:
```shell
$ curl -s https://us.app.unleash-hosted.com/ushosted/docs/openapi.json | jq '.paths."/api/admin/strategies/{name}" | keys'
[
"delete",
"get"
]
gaston@gaston-Summit-E16Flip-A12UCT: ~/poc/terraform-provider-unleash on main [!$]
$ curl -s https://us.app.unleash-hosted.com/ushosted/docs/openapi.json | jq '.paths."/api/admin/strategies/{strategyName}" | keys'
[
"put"
]
```
Note one is under `{name}` while the other is under `{strategyName}`
https://linear.app/unleash/issue/2-1311/add-a-new-prometheus-metric-with-custom-root-roles-in-use
As a follow-up to https://github.com/Unleash/unleash/pull/4435, this PR
adds a metric for total custom root roles in use by at least one entity:
users, service accounts, groups.
`custom_root_roles_in_use_total`
Output from `http://localhost:4242/internal-backstage/prometheus`:
```
# HELP process_cpu_user_seconds_total Total user CPU time spent in seconds.
# TYPE process_cpu_user_seconds_total counter
process_cpu_user_seconds_total 0.060755
# HELP process_cpu_system_seconds_total Total system CPU time spent in seconds.
# TYPE process_cpu_system_seconds_total counter
process_cpu_system_seconds_total 0.01666
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0.077415
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1691420275
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 199196672
# HELP nodejs_eventloop_lag_seconds Lag of event loop in seconds.
# TYPE nodejs_eventloop_lag_seconds gauge
nodejs_eventloop_lag_seconds 0
# HELP nodejs_eventloop_lag_min_seconds The minimum recorded event loop delay.
# TYPE nodejs_eventloop_lag_min_seconds gauge
nodejs_eventloop_lag_min_seconds 0.009076736
# HELP nodejs_eventloop_lag_max_seconds The maximum recorded event loop delay.
# TYPE nodejs_eventloop_lag_max_seconds gauge
nodejs_eventloop_lag_max_seconds 0.037683199
# HELP nodejs_eventloop_lag_mean_seconds The mean of the recorded event loop delays.
# TYPE nodejs_eventloop_lag_mean_seconds gauge
nodejs_eventloop_lag_mean_seconds 0.011063251638989169
# HELP nodejs_eventloop_lag_stddev_seconds The standard deviation of the recorded event loop delays.
# TYPE nodejs_eventloop_lag_stddev_seconds gauge
nodejs_eventloop_lag_stddev_seconds 0.0013618102764025837
# HELP nodejs_eventloop_lag_p50_seconds The 50th percentile of the recorded event loop delays.
# TYPE nodejs_eventloop_lag_p50_seconds gauge
nodejs_eventloop_lag_p50_seconds 0.011051007
# HELP nodejs_eventloop_lag_p90_seconds The 90th percentile of the recorded event loop delays.
# TYPE nodejs_eventloop_lag_p90_seconds gauge
nodejs_eventloop_lag_p90_seconds 0.011321343
# HELP nodejs_eventloop_lag_p99_seconds The 99th percentile of the recorded event loop delays.
# TYPE nodejs_eventloop_lag_p99_seconds gauge
nodejs_eventloop_lag_p99_seconds 0.013688831
# HELP nodejs_active_resources Number of active resources that are currently keeping the event loop alive, grouped by async resource type.
# TYPE nodejs_active_resources gauge
nodejs_active_resources{type="FSReqCallback"} 1
nodejs_active_resources{type="TTYWrap"} 3
nodejs_active_resources{type="TCPSocketWrap"} 5
nodejs_active_resources{type="TCPServerWrap"} 1
nodejs_active_resources{type="Timeout"} 1
nodejs_active_resources{type="Immediate"} 1
# HELP nodejs_active_resources_total Total number of active resources.
# TYPE nodejs_active_resources_total gauge
nodejs_active_resources_total 12
# HELP nodejs_active_handles Number of active libuv handles grouped by handle type. Every handle type is C++ class name.
# TYPE nodejs_active_handles gauge
nodejs_active_handles{type="WriteStream"} 2
nodejs_active_handles{type="ReadStream"} 1
nodejs_active_handles{type="Socket"} 5
nodejs_active_handles{type="Server"} 1
# HELP nodejs_active_handles_total Total number of active handles.
# TYPE nodejs_active_handles_total gauge
nodejs_active_handles_total 9
# HELP nodejs_active_requests Number of active libuv requests grouped by request type. Every request type is C++ class name.
# TYPE nodejs_active_requests gauge
nodejs_active_requests{type="FSReqCallback"} 1
# HELP nodejs_active_requests_total Total number of active requests.
# TYPE nodejs_active_requests_total gauge
nodejs_active_requests_total 1
# HELP nodejs_heap_size_total_bytes Process heap size from Node.js in bytes.
# TYPE nodejs_heap_size_total_bytes gauge
nodejs_heap_size_total_bytes 118587392
# HELP nodejs_heap_size_used_bytes Process heap size used from Node.js in bytes.
# TYPE nodejs_heap_size_used_bytes gauge
nodejs_heap_size_used_bytes 89642552
# HELP nodejs_external_memory_bytes Node.js external memory size in bytes.
# TYPE nodejs_external_memory_bytes gauge
nodejs_external_memory_bytes 1601594
# HELP nodejs_heap_space_size_total_bytes Process heap space size total from Node.js in bytes.
# TYPE nodejs_heap_space_size_total_bytes gauge
nodejs_heap_space_size_total_bytes{space="read_only"} 0
nodejs_heap_space_size_total_bytes{space="old"} 70139904
nodejs_heap_space_size_total_bytes{space="code"} 3588096
nodejs_heap_space_size_total_bytes{space="map"} 2899968
nodejs_heap_space_size_total_bytes{space="large_object"} 7258112
nodejs_heap_space_size_total_bytes{space="code_large_object"} 1146880
nodejs_heap_space_size_total_bytes{space="new_large_object"} 0
nodejs_heap_space_size_total_bytes{space="new"} 33554432
# HELP nodejs_heap_space_size_used_bytes Process heap space size used from Node.js in bytes.
# TYPE nodejs_heap_space_size_used_bytes gauge
nodejs_heap_space_size_used_bytes{space="read_only"} 0
nodejs_heap_space_size_used_bytes{space="old"} 66992120
nodejs_heap_space_size_used_bytes{space="code"} 2892640
nodejs_heap_space_size_used_bytes{space="map"} 2519280
nodejs_heap_space_size_used_bytes{space="large_object"} 7026824
nodejs_heap_space_size_used_bytes{space="code_large_object"} 983200
nodejs_heap_space_size_used_bytes{space="new_large_object"} 0
nodejs_heap_space_size_used_bytes{space="new"} 9236136
# HELP nodejs_heap_space_size_available_bytes Process heap space size available from Node.js in bytes.
# TYPE nodejs_heap_space_size_available_bytes gauge
nodejs_heap_space_size_available_bytes{space="read_only"} 0
nodejs_heap_space_size_available_bytes{space="old"} 1898360
nodejs_heap_space_size_available_bytes{space="code"} 7328
nodejs_heap_space_size_available_bytes{space="map"} 327888
nodejs_heap_space_size_available_bytes{space="large_object"} 0
nodejs_heap_space_size_available_bytes{space="code_large_object"} 0
nodejs_heap_space_size_available_bytes{space="new_large_object"} 16495616
nodejs_heap_space_size_available_bytes{space="new"} 7259480
# HELP nodejs_version_info Node.js version info.
# TYPE nodejs_version_info gauge
nodejs_version_info{version="v18.16.0",major="18",minor="16",patch="0"} 1
# HELP nodejs_gc_duration_seconds Garbage collection duration by kind, one of major, minor, incremental or weakcb.
# TYPE nodejs_gc_duration_seconds histogram
# HELP http_request_duration_milliseconds App response time
# TYPE http_request_duration_milliseconds summary
# HELP db_query_duration_seconds DB query duration time
# TYPE db_query_duration_seconds summary
db_query_duration_seconds{quantile="0.1",store="api-tokens",action="getAllActive"} 0.03091475
db_query_duration_seconds{quantile="0.5",store="api-tokens",action="getAllActive"} 0.03091475
db_query_duration_seconds{quantile="0.9",store="api-tokens",action="getAllActive"} 0.03091475
db_query_duration_seconds{quantile="0.95",store="api-tokens",action="getAllActive"} 0.03091475
db_query_duration_seconds{quantile="0.99",store="api-tokens",action="getAllActive"} 0.03091475
db_query_duration_seconds_sum{store="api-tokens",action="getAllActive"} 0.03091475
db_query_duration_seconds_count{store="api-tokens",action="getAllActive"} 1
# HELP feature_toggle_update_total Number of times a toggle has been updated. Environment label would be "n/a" when it is not available, e.g. when a feature toggle is created.
# TYPE feature_toggle_update_total counter
# HELP feature_toggle_usage_total Number of times a feature toggle has been used
# TYPE feature_toggle_usage_total counter
# HELP feature_toggles_total Number of feature toggles
# TYPE feature_toggles_total gauge
feature_toggles_total{version="5.3.0"} 31
# HELP users_total Number of users
# TYPE users_total gauge
users_total 1011
# HELP projects_total Number of projects
# TYPE projects_total gauge
projects_total 4
# HELP environments_total Number of environments
# TYPE environments_total gauge
environments_total 10
# HELP groups_total Number of groups
# TYPE groups_total gauge
groups_total 5
# HELP roles_total Number of roles
# TYPE roles_total gauge
roles_total 11
# HELP custom_root_roles_total Number of custom root roles
# TYPE custom_root_roles_total gauge
custom_root_roles_total 3
# HELP custom_root_roles_in_use_total Number of custom root roles in use
# TYPE custom_root_roles_in_use_total gauge
custom_root_roles_in_use_total 2
# HELP segments_total Number of segments
# TYPE segments_total gauge
segments_total 5
# HELP context_total Number of context
# TYPE context_total gauge
context_total 7
# HELP strategies_total Number of strategies
# TYPE strategies_total gauge
strategies_total 5
# HELP client_apps_total Number of registered client apps aggregated by range by last seen
# TYPE client_apps_total gauge
client_apps_total{range="allTime"} 0
client_apps_total{range="30d"} 0
client_apps_total{range="7d"} 0
# HELP saml_enabled Whether SAML is enabled
# TYPE saml_enabled gauge
saml_enabled 1
# HELP oidc_enabled Whether OIDC is enabled
# TYPE oidc_enabled gauge
oidc_enabled 0
# HELP client_sdk_versions Which sdk versions are being used
# TYPE client_sdk_versions counter
# HELP optimal_304_diffing Count the Optimal 304 diffing with status
# TYPE optimal_304_diffing counter
# HELP db_pool_min Minimum DB pool size
# TYPE db_pool_min gauge
db_pool_min 0
# HELP db_pool_max Maximum DB pool size
# TYPE db_pool_max gauge
db_pool_max 4
# HELP db_pool_free Current free connections in DB pool
# TYPE db_pool_free gauge
db_pool_free 0
# HELP db_pool_used Current connections in use in DB pool
# TYPE db_pool_used gauge
db_pool_used 4
# HELP db_pool_pending_creates how many asynchronous create calls are running in DB pool
# TYPE db_pool_pending_creates gauge
db_pool_pending_creates 0
# HELP db_pool_pending_acquires how many acquires are waiting for a resource to be released in DB pool
# TYPE db_pool_pending_acquires gauge
db_pool_pending_acquires 24
```
This PR stabilizes the playground and feature types endpoints by
changing their tag from 'Unstable' to respectively 'Playground' and
'Feature Types'. It also moves the existing feature type endpoint (which
was previously tagged as 'Features') to the 'Feature Types' tag.
This PR adds an e2e test to the OpenAPI tests that checks that all
openapi operations have both summaries and descriptions. It also fixes
the few schemas that were missing one or the other.
Enable strict schema validation by default. It can still be overridden
by explicitly setting it to false.
I've also fixed the validation errors that appeared when turning it on.
I've opted for the simplest route and changed the schemas to comply with
the tests.
This PR updates the feature type service by adding a new
`updateLifetime` method. This method handles the connection between the
API (#4256) and the store (#4252).
I've also added some new e2e tests to ensure that the API behaves as
expected.
This PR adds an operation and accompanying openapi docs for the new
"update feature type lifetime" API operation.
It also fixes an oversight where the other endpoint on the same
controller didn't use `respondWithValidation`.
Note: the API here is a suggestion. I'd like to hear whether you agree
with this implementation or not.
<!-- Thanks for creating a PR! To make it easier for reviewers and
everyone else to understand what your changes relate to, please add some
relevant content to the headings below. Feel free to ignore or delete
sections that you don't think are relevant. Thank you! ❤️ -->
When reordering strategies for a feature environment:
- Adds stop when CR are enabled
- Emits an event
## About the changes
<!-- Describe the changes introduced. What are they and why are they
being introduced? Feel free to also add screenshots or steps to view the
changes if they're visual. -->
<!-- Does it close an issue? Multiple? -->
Closes #
<!-- (For internal contributors): Does it relate to an issue on public
roadmap? -->
<!--
Relates to [roadmap](https://github.com/orgs/Unleash/projects/10) item:
#
-->
### Important files
<!-- PRs can contain a lot of changes, but not all changes are equally
important. Where should a reviewer start looking to get an overview of
the changes? Are any files particularly important? -->
## Discussion points
<!-- Anything about the PR you'd like to discuss before it gets merged?
Got any questions or doubts? -->
---------
Signed-off-by: andreas-unleash <andreas@getunleash.ai>
This reverts commit 16e3799b9a.
<!-- Thanks for creating a PR! To make it easier for reviewers and
everyone else to understand what your changes relate to, please add some
relevant content to the headings below. Feel free to ignore or delete
sections that you don't think are relevant. Thank you! ❤️ -->
## About the changes
<!-- Describe the changes introduced. What are they and why are they
being introduced? Feel free to also add screenshots or steps to view the
changes if they're visual. -->
<!-- Does it close an issue? Multiple? -->
Closes #
<!-- (For internal contributors): Does it relate to an issue on public
roadmap? -->
<!--
Relates to [roadmap](https://github.com/orgs/Unleash/projects/10) item:
#
-->
### Important files
<!-- PRs can contain a lot of changes, but not all changes are equally
important. Where should a reviewer start looking to get an overview of
the changes? Are any files particularly important? -->
## Discussion points
<!-- Anything about the PR you'd like to discuss before it gets merged?
Got any questions or doubts? -->
<!-- Thanks for creating a PR! To make it easier for reviewers and
everyone else to understand what your changes relate to, please add some
relevant content to the headings below. Feel free to ignore or delete
sections that you don't think are relevant. Thank you! ❤️ -->
Wraps the whole `registerClientMetrics` function with try/catch to
return 400 on error
## About the changes
<!-- Describe the changes introduced. What are they and why are they
being introduced? Feel free to also add screenshots or steps to view the
changes if they're visual. -->
<!-- Does it close an issue? Multiple? -->
Closes #
[1-1037](https://linear.app/unleash/issue/1-1037/return-4xx-error-for-incorrect-metrics-input)
<!-- (For internal contributors): Does it relate to an issue on public
roadmap? -->
<!--
Relates to [roadmap](https://github.com/orgs/Unleash/projects/10) item:
#
-->
### Important files
<!-- PRs can contain a lot of changes, but not all changes are equally
important. Where should a reviewer start looking to get an overview of
the changes? Are any files particularly important? -->
## Discussion points
<!-- Anything about the PR you'd like to discuss before it gets merged?
Got any questions or doubts? -->
![Screenshot 2023-07-10 at 14 23
13](https://github.com/Unleash/unleash/assets/104830839/5417fb39-ce24-4b70-b3d3-c63374a29a12)
---------
Signed-off-by: andreas-unleash <andreas@getunleash.ai>
## About the changes
- Adding descriptions and examples to tag and tag types schemas
- Adding standard errors, summaries, and descriptions to tag and tag
types endpoints
- Some improvements on compilation errors
---------
Co-authored-by: Thomas Heartman <thomas@getunleash.ai>
<!-- Thanks for creating a PR! To make it easier for reviewers and
everyone else to understand what your changes relate to, please add some
relevant content to the headings below. Feel free to ignore or delete
sections that you don't think are relevant. Thank you! ❤️ -->
Adds description and summary to `favorite` endpoints
## About the changes
<!-- Describe the changes introduced. What are they and why are they
being introduced? Feel free to also add screenshots or steps to view the
changes if they're visual. -->
<!-- Does it close an issue? Multiple? -->
Closes #
[1-1098](https://linear.app/unleash/issue/1-1098/openapi-features-favorites)
<!-- (For internal contributors): Does it relate to an issue on public
roadmap? -->
<!--
Relates to [roadmap](https://github.com/orgs/Unleash/projects/10) item:
#
-->
### Important files
<!-- PRs can contain a lot of changes, but not all changes are equally
important. Where should a reviewer start looking to get an overview of
the changes? Are any files particularly important? -->
## Discussion points
<!-- Anything about the PR you'd like to discuss before it gets merged?
Got any questions or doubts? -->
---------
Signed-off-by: andreas-unleash <andreas@getunleash.ai>
Co-authored-by: Thomas Heartman <thomas@getunleash.ai>
In some of the places we used `NoAccessError` for permissions, other
places we used it for a more generic 403 error with a different
message. This refactoring splits the error type into two distinct
types instead to make the error messages more consistent.
## About the changes
Fix OpenAPI definitions for endpoint
`/api/admin/projects/{projectId}/features/{featureName}/environments/{environment}`
and similar.
### What
This PR adds documentation for our endpoints that are covered by our
"Events" tag. It also adds a type for all valid events, and then uses
this as valid values for type argument.
This PR updates endpoints and schemas for the API tokens tag.
As part of that, they also handle oneOf openapi validation errors and improve the console output for the enforcer tests.
## What
This adds openapi documentation for the Auth tagged operations and
connected schemas.
## Discussion points
Our user schema seems to be exposing quite a bit of internal fields, I
flagged the isApi field as deprecated, I can imagine quite a few of
these fields also being deprecated to prepare for removal in next major
version, but I was unsure which ones were safe to do so with.
## Observation
We have some technical debt around the shape of the schema we're
claiming we're returning and what we actually are returning. I believe
@gastonfournier also observed this when we turned on validation for our
endpoints.
---------
Co-authored-by: Thomas Heartman <thomas@getunleash.ai>
<!-- Thanks for creating a PR! To make it easier for reviewers and
everyone else to understand what your changes relate to, please add some
relevant content to the headings below. Feel free to ignore or delete
sections that you don't think are relevant. Thank you! ❤️ -->
## About the changes
<!-- Describe the changes introduced. What are they and why are they
being introduced? Feel free to also add screenshots or steps to view the
changes if they're visual. -->
This removes the experimental feature flag that defaulted to turn off
telemetry collection
<!-- Thanks for creating a PR! To make it easier for reviewers and
everyone else to understand what your changes relate to, please add some
relevant content to the headings below. Feel free to ignore or delete
sections that you don't think are relevant. Thank you! ❤️ -->
## About the changes
<!-- Describe the changes introduced. What are they and why are they
being introduced? Feel free to also add screenshots or steps to view the
changes if they're visual. -->
Adds a UI that shows current status of version and feature usage
collection configuration, and a presence in the configuration menu +
menu bar.
Configuring these features is done by setting environment variables. The
version info collection is an existing feature that we're making more
visible, the feature usage collection feature is a new feature that has
it's own environment configuration but also depends on version info
collection being active to work.
When version collection is turned off and the experimental feature flag
for feature usage collection is turned off:
<img width="1269" alt="image"
src="https://github.com/Unleash/unleash/assets/707867/435a07da-d238-4b5b-a150-07e3bd6b816f">
When version collection is turned on and the experimental feature flag
is off:
<img width="1249" alt="image"
src="https://github.com/Unleash/unleash/assets/707867/8d1a76c5-99c9-4551-9a4f-86d477bbbf6f">
When the experimental feature flag is enabled, and version+telemetry is
turned off:
<img width="1239" alt="image"
src="https://github.com/Unleash/unleash/assets/707867/e0bc532b-be94-4076-bee1-faef9bc48a5b">
When version collection is turned on, the experimental feature flag is
enabled, and telemetry collection is turned off:
<img width="1234" alt="image"
src="https://github.com/Unleash/unleash/assets/707867/1bd190c1-08fe-4402-bde3-56f863a33289">
When version collection is turned on, the experimental feature flag is
enabled, and telemetry collection is turned on:
<img width="1229" alt="image"
src="https://github.com/Unleash/unleash/assets/707867/848912cd-30bd-43cf-9b81-c58a4cbad1e4">
When version collection is turned off, the experimental feature flag is
enabled, and telemetry collection is turned on:
<img width="1241" alt="image"
src="https://github.com/Unleash/unleash/assets/707867/d2b981f2-033f-4fae-a115-f93e0653729b">
---------
Co-authored-by: sighphyre <liquidwicked64@gmail.com>
Co-authored-by: Nuno Góis <github@nunogois.com>
Co-authored-by: Thomas Heartman <thomas@getunleash.ai>
### What
Added documentation for context fields and endpoints tagged with the
"Context" tag.
---------
Co-authored-by: Thomas Heartman <thomas@getunleash.ai>
## About the changes
This PR enables or disables create API token button based on the
permissions.
**Note:** the button is only displayed if you have READ permissions on
some API token. This is a minor limitation as having CREATE permissions
should also grant READ permissions, but right now this is up to the user
to set up the custom role with the correct permissions
**Note 2:** Project-specific API tokens are also ruled by the
project-specific permission to create API tokens in a project (just
having the root permissions to create a client token or frontend token
does not grant access to create a project-specific API token). The
permissions to access the creation of a project-specific API token then
rely on the root permissions to allow the user to create either a client
token or a frontend token.
---------
Co-authored-by: David Leek <david@getunleash.io>
https://github.com/Unleash/unleash/pull/4019 introduced a bug where API
token filtering was not taking into account ADMIN permissions, which
means the API tokens were not being displayed on the UI.
## What
As part of the move to enable custom-root-roles, our permissions model
was found to not be granular enough to allow service accounts to only be
allowed to create read-only tokens (client, frontend), but not be
allowed to create admin tokens to avoid opening up a path for privilege
escalation.
## How
This PR adds 12 new roles, a CRUD set for each of the three token types
(admin, client, frontend). To access the `/api/admin/api-tokens`
endpoints you will still need the existing permission (CREATE_API_TOKEN,
DELETE_API_TOKEN, READ_API_TOKEN, UPDATE_API_TOKEN). Once this PR has
been merged the token type you're modifying will also be checked, so if
you're trying to create a CLIENT api-token, you will need
`CREATE_API_TOKEN` and `CREATE_CLIENT_API_TOKEN` permissions. If the
user performing the create call does not have these two permissions or
the `ADMIN` permission, the creation will be rejected with a `403 -
FORBIDDEN` status.
### Discussion points
The test suite tests all operations using a token with
operation_CLIENT_API_TOKEN permission and verifies that it fails trying
to do any of the operations against FRONTEND and ADMIN tokens. During
development the operation_FRONTEND_API_TOKEN and
operation_ADMIN_API_TOKEN permission has also been tested in the same
way. I wonder if it's worth it to re-add these tests in order to verify
that the permission checker works for all operations, or if this is
enough. Since we're running them using e2e tests, I've removed them for
now, to avoid hogging too much processing time.
## About the changes
Implements custom root roles, encompassing a lot of different areas of
the project, and slightly refactoring the current roles logic. It
includes quite a clean up.
This feature itself is behind a flag: `customRootRoles`
This feature covers root roles in:
- Users;
- Service Accounts;
- Groups;
Apologies in advance. I may have gotten a bit carried away 🙈
### Roles
We now have a new admin tab called "Roles" where we can see all root
roles and manage custom ones. We are not allowed to edit or remove
*predefined* roles.
![image](https://github.com/Unleash/unleash/assets/14320932/1ad8695c-8c3f-440d-ac32-39746720d588)
This meant slightly pushing away the existing roles to `project-roles`
instead. One idea we want to explore in the future is to unify both
types of roles in the UI instead of having 2 separate tabs. This
includes modernizing project roles to fit more into our current design
and decisions.
Hovering the permissions cell expands detailed information about the
role:
![image](https://github.com/Unleash/unleash/assets/14320932/81c4aae7-8b4d-4cb4-92d1-8f1bc3ef1f2a)
### Create and edit role
Here's how the role form looks like (create / edit):
![image](https://github.com/Unleash/unleash/assets/14320932/85baec29-bb10-48c5-a207-b3e9a8de838a)
Here I categorized permissions so it's easier to visualize and manage
from a UX perspective.
I'm using the same endpoint as before. I tried to unify the logic and
get rid of the `projectRole` specific hooks. What distinguishes custom
root roles from custom project roles is the extra `root-custom` type we
see on the payload. By default we assume `custom` (custom project role)
instead, which should help in terms of backwards compatibility.
### Delete role
When we delete a custom role we try to help the end user make an
informed decision by listing all the entities which currently use this
custom root role:
![image](https://github.com/Unleash/unleash/assets/14320932/352ed529-76be-47a8-88da-5e924fb191d4)
~~As mentioned in the screenshot, when deleting a custom role, we demote
all entities associated with it to the predefined `Viewer` role.~~
**EDIT**: Apparently we currently block this from the API
(access-service deleteRole) with a message:
![image](https://github.com/Unleash/unleash/assets/14320932/82a8e50f-8dc5-4c18-a2ba-54e2ae91b91c)
What should the correct behavior be?
### Role selector
I added a new easy-to-use role selector component that is present in:
- Users
![image](https://github.com/Unleash/unleash/assets/14320932/76953139-7fb6-437e-b3fa-ace1d9187674)
- Service Accounts
![image](https://github.com/Unleash/unleash/assets/14320932/2b80bd55-9abb-4883-b715-15650ae752ea)
- Groups
![image](https://github.com/Unleash/unleash/assets/14320932/ab438f7c-2245-4779-b157-2da1689fe402)
### Role description
I also added a new role description component that you can see below the
dropdown in the selector component, but it's also used to better
describe each role in the respective tables:
![image](https://github.com/Unleash/unleash/assets/14320932/a3eecac1-2a34-4500-a68c-e3f62ebfa782)
I'm not listing all the permissions of predefined roles. Those simply
show the description in the tooltip:
![image](https://github.com/Unleash/unleash/assets/14320932/7e5b2948-45f0-4472-8311-bf533409ba6c)
### Role badge
Groups is a bit different, since it uses a list of cards, so I added yet
another component - Role badge:
![image](https://github.com/Unleash/unleash/assets/14320932/1d62c3db-072a-4c97-b86f-1d8ebdd3523e)
I'm using this same component on the profile tab:
![image](https://github.com/Unleash/unleash/assets/14320932/214272db-a828-444e-8846-4f39b9456bc6)
## Discussion points
- Are we being defensive enough with the use of the flag? Should we
cover more?
- Are we breaking backwards compatibility in any way?
- What should we do when removing a role? Block or demote?
- Maybe some existing permission-related issues will surface with this
change: Are we being specific enough with our permissions? A lot of
places are simply checking for `ADMIN`;
- We may want to get rid of the API roles coupling we have with the
users and SAs and instead use the new hooks (e.g. `useRoles`)
explicitly;
- We should update the docs;
- Maybe we could allow the user to add a custom role directly from the
role selector component;
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
## About the changes
Edit application under
https://app.unleash-hosted.com/demo/applications/test-app is currently
not working as the appName is expected to come in the request body, but
it's actually part of the url. We have two options here:
1. We change the UI to adapt to the expectations of the request by
adding appName to the request body (and eventually removing appName from
the URL, which would be a breaking change)
2. We remove the restriction of only sending the appName in the body and
take the one that comes in the URL. We have a validation that verifies
that at least one of the two sets the appName
([here](e376088668/src/lib/services/client-metrics/instance-service.ts (L208))
we validate using [this
schema](e376088668/src/lib/services/client-metrics/schema.ts (L55-L70)))
In terms of REST API, we can assume that the appName will be present in
the resource `/api/admin/metrics/applications` (an endpoint we don't
have), but when we're updating an application we should refer to that
application by its URL: `/api/admin/metrics/applications/<appName>` and
the presence of an appName in the body might indicate that we're trying
to change the name of the application (something we currently not
support)
Based on the above, I believe going with the second option is best, as
it adheres to REST principles and does not require a breaking change.
Despite that, we only support updating applications as the creation is
done from metrics ingestion
Fixes: #3580
This PR aims to handle the increased log alarm volume seen by the SREs.
It appears that we get a large number of alarms because a client
disconnects early from the front-end API. These errors are then
converted into 500s because of missing handling. These errors appear to
be caused by the `http-errors` library in a dependency.
We also introduced a log line whenever we see errors now a while back,
and I don't think we need this logging (I was also the one who
introduced it).
The changes in this PR are specifically:
- When converting from arbitrary errors, give `BadRequestError` a 400
status code, not a 500.
- Add a dependency on `http-errors` (which is already a transitive
dependency because of the body parser) and use that to check whether an
error is an http-error.
- If an error is an http error, then propagate it to the user with the
status code and message.
- Remove warning logs when an error occurs. This was introduced to make
it easier to correlate an API error and the logs, but the system hasn't
been set up for that (yet?), so it's just noise now.
- When logging errors as errors, only do that if their status code would
be 500.
This change makes the logs that happen when we encounter an error a
little bit clearer. It logs the error message before the error ID and
also logs the full serialized message just in case.
This PR improves our handling of internal Joi errors, to make them more
sensible to the end user. It does that by providing a better description
of the errors and by telling the user what they value they provided was.
Previous conversion:
```json
{
"id": "705a8dc0-1198-4894-9015-f1e5b9992b48",
"name": "BadDataError",
"message": "\"value\" must contain at least 1 items",
"details": [
{
"message": "\"value\" must contain at least 1 items",
"description": "\"value\" must contain at least 1 items"
}
]
}
```
New conversion:
```json
{
"id": "87fb4715-cbdd-48bb-b4d7-d354e7d97380",
"name": "BadDataError",
"message": "Request validation failed: your request body contains invalid data. Refer to the `details` list for more information.",
"details": [
{
"description": "\"value\" must contain at least 1 items. You provided [].",
"message": "\"value\" must contain at least 1 items. You provided []."
}
]
}
```
## Restructuring
This PR moves some code out of `unleash-error.ts` and into a new file.
The purpose of this is twofold:
1. avoid circular dependencies (we need imports from both UnleashError
and BadDataError)
2. carve out a clearer separation of concerns, keeping `unleash-error` a
little more focused.
This change lowers the log level from warning to debug for when we see
unexpected error types.
Right now this triggers for Joi errors, which we still rely on pretty
heavily. Lowering this should clear up logs for most users.
https://linear.app/unleash/issue/2-1071/prevent-users-from-disabling-password-authentication-when-there-are-no
Improves the behavior of disabling password based login by adding some
relevant information and a confirmation dialog with a warning. This felt
better than trying to disable the toggle, by still allowing the end
users to make the decision, except now it should be a properly informed
decision with confirmation.
![image](https://github.com/Unleash/unleash/assets/14320932/2ca754d8-cfa2-4fda-984d-0c34b89750f3)
- **Password based administrators**: Admin accounts that have a password
set;
- **Other administrators**: Other admin users that do not have a
password. May be SSO, but may also be users that did not set a password
yet;
- **Admin service accounts**: Service accounts that have the admin root
role. Depending on how you're using the SA this may not necessarily mean
locking yourself out of an admin account, especially if you secured its
token beforehand;
- **Admin API tokens**: Similar to the above. If you secured an admin
API token beforehand, you still have access to all features through the
API;
Each one of them link to the respective page inside Unleash (e.g. users
page, service accounts page, tokens page...);
If you try to disable and press "save", and only in that scenario, you
are presented with the following confirmation dialog:
![image](https://github.com/Unleash/unleash/assets/14320932/5ad6d105-ad47-4d31-a1df-04737aed4e00)