In order to stop privilege escalation via
`/api/admin/projects/:project/users/:userId/roles` and
`/api/admin/projects/:project/groups/:groupId/roles` this PR adds the
same check we added to setAccess methods to the methods updating access
for these two methods.
Also adds tests that verify that we throw an exception if you try to
assign roles you do not have.
Thank you @nunogois for spotting this during testing.
In order to prevent users from being able to assign roles/permissions
they don't have, this PR adds a check that the user performing the
action either is Admin, Project owner or has the same role they are
trying to grant/add.
This addAccess method is only used from Enterprise, so there will be a
separate PR there, updating how we return the roles list for a user, so
that our frontend can only present the roles a user is actually allowed
to grant.
This adds the validation to the backend to ensure that even if the
frontend thinks we're allowed to add any role to any user here, the
backend can be smart enough to stop it.
We should still update frontend as well, so that it doesn't look like we
can add roles we won't be allowed to.
Created a build script that generates orval schemas with automatic
cleanup. Also generating new ones.
1. yarn gen:api **(generates schemas)**
2. rm -rf src/openapi/apis **(remove apis)**
3. sed -i '1q' src/openapi/index.ts **(remove all rows except first)**
## About the changes
Implements a new store for collected traffic data usage that connects to
the new table `stat_traffic_data` primary key'd on [day, trafficGroup,
status_code_series].
Day being a date
Traffic group being which endpoint is being counted for, ie /api/admin,
/api/frontend etc
Status code series grouping 2xx status responses and 304 into their
respective 200 / 300 series.
No service here, this is for pro/enterprise
This PR adds an endpoint to Unleash that accepts an error message and
option error stack and logs it as an error. This allows us to leverage
errors in logs observability to catch UI errors consistently.
Considered a test, but this endpoint only accepts and logs input, so I'm
not sure how useful it would be.
## About the changes
App stats is mainly used to cap the number of applications reported to
Unleash based on the last 7 days information:
cc2ccb1134/src/lib/middleware/response-time-metrics.ts (L24-L28)
Instead of getting all stats, just calculate appCount statistics
Use scheduler service instead of setInterval
In the beginning we used process.nextTick() as a trick to load some data
initally in the constructor of a service.
This is a bad pattern and we should generally avoid any async operations
in the constructor. Today we have two alternatives:
1. Defer loading until data is needed (wrap it in async)
2. Use the schdule-service.
When a stop signal is sent to Unleash the scheduler-service should
cancel any scheduled jobs. This also applies to the job scheduled for
initial execution with jitter.
We observed that initial jobs was executed after the database
connections are terminated. This appears after v5.9.0 of Unleash.
```
Error: aborted
at Object.queryBuilder (/unleash/node_modules/knex/lib/knex-builder/make-knex.js:112:26)
at createQueryBuilder (/unleash/node_modules/knex/lib/knex-builder/make-knex.js:320:26)
at EventStore.knex [as db] (/unleash/node_modules/knex/lib/knex-builder/make-knex.js:101:12)
at EventStore.setUnannouncedToAnnounced (/unleash/node_modules/unleash-server/dist/lib/features/events/event-store.js:286:33)
at EventStore.publishUnannouncedEvents (/unleash/node_modules/unleash-server/dist/lib/features/events/event-store.js:293:35)
at EventAnnouncer.publishUnannouncedEvents (/unleash/node_modules/unleash-server/dist/lib/services/event-announcer-service.js:9:32)
at runScheduledFunctionWithEvent (/unleash/node_modules/unleash-server/dist/lib/features/scheduler/scheduler-service.js:30:23)
at Timeout.<anonymous> (/unleash/node_modules/unleash-server/dist/lib/features/scheduler/scheduler-service.js:50:27)
at runNextTicks (node:internal/process/task_queues:60:5)
at process.processTimers (node:internal/timers:509:9)
```
## About the changes
Queries on client-feature-toggle store have many purposes depending on
the requestType, making the query more complex or not depending on the
use case. Also, each use case has different frequencies (i.e. playground
is expected to be used rarely).
The name for the store metrics was wrong, copy&pasted from:
7b04db0547/src/lib/features/feature-toggle/feature-toggle-store.ts (L107)
Which was also present in feature-tag metrics:
7b04db0547/src/lib/db/feature-tag-store.ts (L37)
With this, we'll have more granularity to understand the execution time
and frequency of each
Usually maintenance mode is disabled. If the call throws, which we see a
lot of when a unleash instance is in terminating state, we should return
a default value.
By having it throw inside of the memoizee function, the response is not
cached, and it will trigger new calls until it return a cachable result.
## About the changes
Every schedule job will now check if maintenance is enabled. This ends
up querying the settings table in the db at least once per second per
running unleash instance. This small fix caches this query for 60
seconds to reduce the load somewhat.
We should reconsider this solution for the long term, but this will be a
great improvement on the short term.
**Logs after this fix running locally.**
We can observe that we resolve settings from the DB once per minute.
![image](https://github.com/Unleash/unleash/assets/158948/c313cf38-8d86-4b86-a0ba-4f4df60d50d6)
Also we should consider giving a warning in section where you enable
maintenance mode that it can take up to a minute to propagate.
## About the changes
the created_by_user_id data migration from resolving events.created_by
(for both events and features) now emits events on how many rows were
updated.
Adds listeners for these events that records these metrics with
prometheus
![image](https://github.com/Unleash/unleash/assets/707867/3bb02645-0919-4a9a-83fe-a07383ac0be1)
We were sending `user.id` to the service, but if an admin token is used,
there is no `user.id.` Instead, there is
`user.internalAdminTokenUserId`. so we need to use the special method
`extractUserIdFromUser`.
This PR adds this implementation, and now the service correctly
retrieves the appropriate ID for admins.
Related to: https://github.com/Unleash/unleash/pull/5924
## About the changes
Schedules a best-effort task setting the value of
events.created_by_user_id based on what is found in the created_by
column and if it's capable of resolving that to a userid/a system id.
The process is executed in the events-store, it takes a chunk of events
that haven't been processed yet, attempts to join users and api_tokens
tables on created_by = username/email, loops through and tries to figure
out an id to set. Then updates the record.
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
## About the changes
Change the sorting of features to migrate created_by_user_id for, and
filter out unresolvable feature/users
Query tested manually in enterprise
## About the changes
Sets data migration of features and events created_by_user_id to
disabled by default
Map to promise and await all in created by user id migration for features
## About the changes
Adds a scheduled task that every 5 seconds updates 500 entries in the
features table setting `created_by_user_id`.
It does this by looking at the related event, checks created_by and
joins users table for match on username or email, and joins api_tokens
table on username matches. Then picks either a users id if set, or uses
-42 (admin token user)
Only triggers if there is any rows in client instances that have
sdk_version: unleash-edge with version < 17.0.0
The function that checks this memoizes the check for 10 minutes to avoid
scanning the client instances table too often.
## About the changes
This was spotted while testing automated actions. Steps to reproduce:
1. Add an editor user
2. Get a PAT for the editor user
3. As Admin create a feature in a project where the editor user is not a
member and enable the feature
4. Try using the editor's PAT to modify the feature
5. As the editor create a project (you'd be made owner) and try the same
request but just change the project name for the new project just
created (don't change anything else)
**Expected behavior**: you can't disable the feature
**Actual behavior**: the feature is disabled
This does not happen when trying to turn on a flag because during the
turn-on process we do validate if the feature belongs to project when we
call updateStrategy:
c18a7c0dc2/src/lib/features/feature-toggle/feature-toggle-service.ts (L1751-L1764)
So, this was causing a lot of ERROR in our logs, due to the metric
having gotten an extra label the last month.
Two things for this fix.
1. add the missing label to the two calls that did not have it added
2. update the log line to include the error as another argument to the
logger, so we actually get a stacktrace from the error.
## About the changes
EventsService is a dependency in most of our services. This creates
helper methods to create them easily and replace a few places where
we're creating them manually
Lots of work here, mostly because I didn't want to turn off the
`noImplicitAnyLet` lint. This PR tries its best to type all the untyped
lets biome complained about (Don't ask me how many hours that took or
how many lints that was >200...), which in the future will force test
authors to actually type their global variables setup in `beforeAll`.
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
## About the changes
Adds the new nullable column created_by_user_id to the data used by
feature-tag-store and feature-tag-service. Also updates openapi schemas.
Two changes were needed to sort better
1. Since we are still using `last seen` from `features` table for
backwards compatibility, we needed to add it to sort condition.
2. Nulls break the order, so now sorting nulls as last.
### What
Adds `createdByUserId` to all events exposed by unleash. In addition
this PR updates all tests and usages of the methods in this codebase to
include the required number.
This change adds a property to the segmentStrategiesSchema to make sure
that change request strategies are listed in the openapi spec
It also renames the files that contains that schema and its tests from
`admin-strategies-schema` to `segment-strategies-schema`.
Adding new project overview endpoint and deprecating the old one.
The new one has extra info about feature types, but does not have
features anymore, because features are coming from search endpoint.
This PR checks that the unleash instance is an enterprise instance
before fetching change request data. This is to prevent Change Request
usage from preventing OSS users from deleting segments (when they don't
have access to change requests).
This PR also does a little bit of refactoring (which we can remove if
you want)
This PR updates the returned value about segments to also include the CR
title and to be one list item per strategy per change request. This
means that if the same strategy is used multiple times in multiple
change requests, they each get their own line (as has been discussed
with Nicolae).
Because of this, this pr removes a collection step in the query and
fixes some test cases.
This PR addresses some cleanup related to removing the
useLastSeenRefactor flag:
* Added fallback last seen to the feature table last_seen_at column
* Remove foreign key on environment since we can not guarantee that we
will get valid data in this field
* Add environments to cleanup function
* Add test for cleanup environments
This PR handles the case where a single strategy is used in multiple
change requests. Instead of listing the strategy several times in the
output, we consolidate the entries and add a new `changeRequestIds`
property. This is a non-empty list that points to all the change
requests it is used in.
This is required for us to be able to link back to the change requests
from the UI overview.
This change is just a refactor, removing code that's no longer used. Instead of
checking just whether a segment is in use, we now extract the list of
strategies that use this segment. This is slightly more costly,
perhaps, but it will be necessary for the upcoming implementation.
This PR changes the payload of the strategiesBySegment endpoint when the
flag is active. In addition to returning just the strategies, the object
will also contain a new property, called `changeRequestStrategies`
containing the strategies that are used in change requests.
This PR does not update the schema. That can be done later when the
changes go into beta. This also allows us some time to iterate on the
payload without changing the public API.
## Discussion points:
Should `strategies` and `changeRequestStrategies` ever contain
duplicates? Take this scenario:
- Strategy S uses segment T.
- There is an open change request that updates the list of segments for
S to T and a new segment U.
- In this case, strategy S would show up both in `strategies` _and_ in
`changeRequestStrategies`.
We have two options:
1. Filter the list of change request strategies, so that they don't
contain any duplicates (this is currently how it's implemented)
2. Ignore the duplicates and just send both lists as is.
We're doing option 2 for now.
In short the issue is that after our last seen improvements, we did not
update where we are getting last_seen field. It was still using features
table, which is not the source of last seen anymore.
## PR Description
https://linear.app/unleash/issue/2-1645/address-post-mortem-action-point-all-flags-should-be-runtime
Refactor with the goal of ensuring that flags are runtime controllable,
mostly focused on the current scheduler logic.
This includes the following changes:
- Moves scheduler into its own "scheduler" feature folder
- Reverts dependency: SchedulerService takes in the MaintenanceService,
not the other way around
- Scheduler now evaluates maintenance mode at runtime instead of relying
only on its mode state (active / paused)
- Favors flag checks to happen inside the scheduled methods, instead of
controlling whether the method is scheduled at all (favor runtime over
startup)
- Moves "account last seen update" to scheduler
- Updates tests accordingly
- Boyscouting
Here's a manual test showing this behavior, where my local instance was
controlled by a remote instance. Whenever I toggle `maintenanceMode`
through a flag remotely, my scheduled functions stop running:
https://github.com/Unleash/unleash/assets/14320932/ae0a7fa9-5165-4c0b-9b0b-53b9fb20de72
Had a look through all of our current flags and it *seems to me* that
they are all used in a runtime controllable way, but would still feel
more comfortable if this was double checked, since it can be complex to
ensure this.
The only exception to this was `migrationLock`, which I believe is OK,
since the migration only happens at the start anyways.
## Discussion / Questions
~~Scheduler `mode` (active / paused) is currently not *really* being
used, along with its respective methods, except in tests. I think this
could be a potential footgun. Should we remove it in favor of only
controlling the scheduler state through maintenance mode?~~ Addressed in
7c52e3f638
~~The config property `disableScheduler` is still a startup
configuration, but perhaps that makes sense to leave as is?~~
[Answered](https://github.com/Unleash/unleash/pull/5363#issuecomment-1819005445)
by @FredrikOseberg, leaving as is.
Are there any other tests we should add?
Is there anything I missed?
Identified some `setInterval` and `setTimeout` that may make sense to
leave as is instead of moving over to the scheduler service:
- ~~`src/lib/metrics` - This is currently considered a `MetricsMonitor`.
Should this be refactored to a service instead and adapt these
setIntervals to use the scheduler instead? Is there anything special
with this we need to take into account? @chriswk @ivarconr~~
[Answered](https://github.com/Unleash/unleash/pull/5363#issuecomment-1820501511)
by @ivarconr, leaving as is.
- ~~`src/lib/proxy/proxy-repository.ts` - This seems to have a complex
and specific logic currently. Perhaps we should leave it alone for now?
@FredrikOseberg~~
[Answered](https://github.com/Unleash/unleash/pull/5363#issuecomment-1819005445)
by @FredrikOseberg, leaving as is.
- `src/lib/services/user-service.ts` - This one also seems to be a bit
more specific, where we generate new timeouts for each receiver id.
Might not belong in the scheduler service. @Tymek
This PR adds the ability to detect which strategies use a specific
segment in active change requests.
It does not wire this functionality up to anything just yet. Follow-up
PRs will integrate this with the segment service and eventually with the
front end.
The issue was that we all features were created exactly in same time,
and our feature counter waas expecting time to be unique to feature,
which was not the case.
Instead of throwing an error when the project doesn't exist, we say that
the names are valid, because we have nothing to say that they're not.
Presumably there is already something in place to prevent you from
importing into a non-existent project.
Optimizations:
1. Removed extra round trip to database to count environments
2. Removed extra round trip to database to count features
Fixes:
Currently, we were using a very optimistic query to set correct limit
and offset. This breaks as soon we we join tags.
` query = query
.select(selectColumns)
.limit(limit * environmentCount)
.offset(offset * environmentCount);`
The solution was to use common table expressions, so we could count and
rank features.
This PR hooks up the changes introduced in #5301 to the API and puts
them behind a feature flag. A new test has been added and the test setup
has been slightly tweaked to allow this test.
When the flag is enabled, the API will now not let you delete a segment
that's used in any active CRs.
https://linear.app/unleash/issue/SR-164/ticket-1106-user-with-createedit-project-segment-is-not-able-to-edit-a
Fixes a bug where the `UPDATE_PROJECT_SEGMENT` permission is not
respected, both on the UI and on the API. The original intention was
stated
[here](https://github.com/Unleash/unleash/pull/3346#discussion_r1140434517).
This was easy to fix on the UI, since we were simply missing the extra
permission on the button permission checks.
Unfortunately the API can be tricky. Our auth middleware tries to grab
the `project` information from either the params or body object, but our
`DELETE` method does not contain this information. There is no body and
the endpoint looks like `/admin/segments/:id`, only including the
segment id.
This means that, in the rbac middleware when we check the permissions,
we need to figure out if we're in such a scenario and fetch the project
information from the DB, which feels a bit hacky, but it's something
we're seemingly already doing for features, so at least it's somewhat
consistent.
Ideally what we could do is leave this API alone and create a separate
one for project segments, with endpoints where we would have project as
a param, like so:
`http://localhost:4242/api/admin/projects/:projectId/segments/1`.
This PR opts to go with the quick and hacky solution for now since this
is an issue we want to fix quickly, but this is something that we should
be aware of. I'm also unsure if we want to create a new API for project
segments. If we decide that we want a different solution I don't mind
either adapting this PR or creating a follow up.
This test was flaky because it relied on the order of the array
returned. To make it less flaky, we now turn the array into an object
instead and compare that.
This PR adds a way to tell if a specific segment is being used in any
active change requests. It's the first step towards preventing segments
that are being used in change requests from being deleted.
It does that by checking the db for any unclosed CRs and using those CR
ids to look for "addStrategy" and "updateStrategy" events in the cr
events table.
## Upcoming PRs
This only puts in a way to detect it, but doesn't add that to anything.
That'll be in an upcoming iteration.
For a while we ran a diffing algorithm in production to verify that the
results of the refactor did not differ from the previous results. As the
experiment has run it's course and new attributes have been added on top
of the new flow, this will remove the logging and associated code.
As #4475 says, MD5 is not available in secure places anymore. This PR
swaps out gravatar-url with an inline function using crypto:sha256 which
is FIPS-140-2 compliant. Since we only used this method for generating
avatar URLs the extra customization wasn't needed and we could hard code
the URL parameters.
fixes: Linear
https://linear.app/unleash/issue/SR-112/gh-support-swap-out-gravatar-url-libcloses: #4475
## About the changes
This makes sure that projects have at least one owner, either a group or
a user. This is to prevent accidentally losing access to a project.
We check this when removing a user/group or when changing the role of a
user/group
**Note**: We can still leave a group empty as the only owner of the
project, but that's okay because we can still add more users to the
group
Sort array items before running compare. Feature flag certain properties
of strategy that were previously not present in the /api/admin/features
endpoint.
## About the changes
This small improvement aims to help developers when instantiating
services. They need to be constructed without injecting services or
stores created elsewhere so they can be bound to the same transactional
scope.
This suggests that you need to create the services and stores on your
own
This PR is the first step in separating the client and admin stores.
Currently our feature toggle services uses the client store to serve
multiple purposes.
Admin API uses the feature toggle service to serve both the feature
toggle list and playground features, while the client API uses the
feature toggle service to serve client features. The admin API can
change often and have very different requirements than the client API,
which changes infrequently and generally keeps the same stable structure
for long periods of time. This architecture is error prone, because when
you need to make changes to the admin API, you can very easily affect
the client API.
I aim to put up a stone wall between the two APIs. Complete separation
between the two APIs, at the cost of some duplication.
In this PR I have created a feature oriented architecture for client
features and disconnected the client API from the feature toggle
service. It now goes through it's own service to it's own store. For
feature toggle service I have duplicated and replaced the functionality
that serves /api/admin/features, I have kept a lot of the ugliness in
the code and haven't removed anything in order to avoid breaking
changes.
Next steps:
* Move playground to admin API
* Remove client-feature-toggle-store from feature-toggle-service