This PR fixes three things that were wrong with the lifecycle summary
count query:
1. When counting the number of flags in each stage, it does not take
into account whether a flag has moved out of that stage. So if you have
a flag that's gone through initial -> pre-live -> live, it'll be counted
for each one of those steps, not just the last one.
2. Some flags that have been archived don't have the corresponding
archived state row in the db. This causes them to count towards their
other recorded lifecycle stages, even when they shouldn't. This is
related to the previous one, but slightly different. Cross-reference the
features table's archived_at to make sure it hasn't been archived
3. The archived number should probably be all flags ever archived in the
project, regardless of whether they were archived before or after
feature lifecycles. So we should check the feature table's archived_at
flag for the count there instead
This PR:
- conditionally deprecates the project health report endpoint. We only
use this for technical debt dashboard that we're removing. Now it's
deprecated once you turn the simplifiy flag on.
- extracts the calculate project health function into the project health
functions file in the appropriate domain folder. That same function is
now shared by the project health service and the project status service.
For the last point, it's a little outside of how we normally do things,
because it takes its stores as arguments, but it slots in well in that
file. An option would be to make a project health read model and then
wire that up in a couple places. It's more code, but probably closer to
how we do things in general. That said, I wanted to suggest this because
it's quick and easy (why do much work when little work do trick?).
This PR updates the project status service (and schemas and UI) to use
the project's current health instead of the 4-week average.
I nabbed the `calculateHealthRating` from
`src/lib/services/project-health-service.ts` instead of relying on the
service itself, because that service relies on the project service,
which relies on pretty much everything in the entire system.
However, I think we can split the health service into a service that
*does* need the project service (which is used for 1 of 3 methods) and a
service (or read model) that doesn't. We could then rely on the second
one for this service without too much overhead. Or we could extract the
`calculateHealthRating` into a shared function that takes its stores as
arguments. ... but I suggest doing that in a follow-up PR.
Because the calculation has been tested other places (especially if we
rely on a service / shared function for it), I've simplified the tests
to just verify that it's present.
I've changed the schema's `averageHealth` into an object in case we want
to include average health etc. in the future, but this is up for debate.
This PR fixes the counting of unhealthy flags for the project status
page. The issue was that we were looking for `archived = false`, but we
don't set that flag in the db anymore. Instead, we set the `archived_at`
date, which should be null if the flag is unarchived.
**This migration introduces a query that calculates the licensed user
counts and inserts them into the licensed_users table.**
**The logic ensures that:**
1. All users created up to a specific date are included as active users
until they are explicitly deleted.
2. Deleted users are excluded after their deletion date, except when
their deletion date falls within the last 30 days or before their
creation date.
3. The migration avoids duplicating data by ensuring records are only
inserted if they don’t already exist in the licensed_users table.
**Logic Breakdown:**
**Identify User Events (user_events):** Extracts email addresses from
user-related events (user-created and user-deleted) and tracks the type
and timestamp of the event. This step ensures the ability to
differentiate between user creation and deletion activities.
**Generate a Date Range (dates):** Creates a continuous range of dates
spanning from the earliest recorded event up to the current date. This
ensures we analyze every date, even those without events.
**Determine Active Users (active_emails):** Links dates with user events
to calculate the status of each email address (active or deleted) on a
given day. This step handles:
- The user's creation date.
- The user's deletion date (if applicable).
**Calculate Daily Active User Counts (result):**
For each date, counts the distinct email addresses that are active based
on the conditions:
- The user has no deletion date.
- The user's deletion date is within the last 30 days relative to the
current date.
- The user's creation date is before the deletion date.
This PR adds the option to select potentially stale flags from the UI.
It also updates the name we use for parsing from the API: instead of
`potentiallyStale` we use `potentially-stale`. This follows the
precedent set by "kill switch" (which we send as 'kill-switch'), the
only other multi-word option that I could find in our filters.
This PR adds support for the `potentiallyStale` value in the feature
search API. The value is added as a third option for `state` (in
addition to `stale` and `active`). Potentially stale is a subset of
active flags, so stale flags are never considered potentially stale,
even if they have the flag set in the db.
Because potentially stale is a separate column in the db, this
complicates the query a bit. As such, I've created a specialized
handling function in the feature search store: if the query doesn't
include `potentiallyStale`, handle it as we did before (the mapping has
just been moved). If the query *does* contain potentially stale, though,
the handling is quite a bit more involved because we need to check
multiple different columns against each other.
In essence, it's based on this logic:
when you’re searching for potentially stale flags, you should only get flags that are active and marked as potentially stale. You should not get stale flags.
This can cause some confusion, because in the db, we don’t clear the potentially stale status when we mark a flag as stale, so we can get flags that are both stale and potentially stale.
However, as a user, if you’re looking for potentially stale flags, I’d be surprised to also get (only some) stale flags, because if a flag is stale, it’s definitely stale, not potentially stale.
This leads us to these six different outcomes we need to handle when your search includes potentially stale and stale or active:
1. You filter for “potentially stale” flags only. The API will give you only flags that are active and marked as potentially stale. You will not get stale flags.
2. You filter only for flags that are not potentially stale. You will get all flags that are active and not potentially stale and all stale flags.
3. You search for “is any of stale, potentially stale”. This is our “unhealthy flags” metric. You get all stale flags and all flags that are active and potentially stale
4. You search for “is none of stale, potentially stale”: This gives you all flags that are active and not potentially stale. Healthy flags, if you will.
5. “is any of active, potentially stale”: you get all active flags. Because we treat potentially stale as a subset of active, this is the same as “is active”
6. “is none of active, potentially stale”: you get all stale flags. As in the previous point, this is the same as “is not active”
This change adds a db migration to make the potentially_stale column
non-nullable. It'll set any NULL values to `false`.
In the down-migration, make the column nullable again.
## About the changes
- Remove `idNumberMiddleware` method and change to use `parameters`
field in `openApiService.validPath` method for the flexibility.
- Remove unnecessary `Number` type converting method and change them to
use `<{id: number}>` to specify the type.
### Reference
The changed response looks like the one below.
```JSON
{
"id":"8174a692-7427-4d35-b7b9-6543b9d3db6e",
"name":"BadDataError",
"message":"Request validation failed: your request body or params contain invalid data. Refer to the `details` list for more information.",
"details":[
{
"message":"The `/params/id` property must be integer. You sent undefined.",
"path":"/params/id"
}
]
}
```
I think it might be better to customize the error response, especially
`"You sent undefined."`, on another pull request if this one is
accepted. I prefer to separate jobs to divide the context and believe
that it helps reviewer easier to understand.
Remove everything related to the connected environment count for project
status. We decided that because we don't have anywhere to link it to at
the moment, we don't want to show it yet.
## About the changes
Builds on top of #8766 to use memoized results from stats-service.
Because stats service depends on version service, and to avoid making
the version service depend on stats service creating a cyclic
dependency. I've introduced a telemetry data provider. It's not clean
code, but it does the job.
After validating this works as expected I'll clean up
Added an e2e test validating that the replacement is correct:
[8475492](847549234c)
and it did:
https://github.com/Unleash/unleash/actions/runs/11861854341/job/33060032638?pr=8776#step:9:294
Finally, cleaning up version service
## About the changes
Our stats are used for many places and many times to publish prometheus
metrics and some other things.
Some of these queries are heavy, traversing all tables to calculate
aggregates.
This adds a feature flag to be able to memoize 1 minute (by default) how
long to keep the calculated values in memory.
We can use the key of the function to individually control which ones
are memoized or not and for how long using a numeric variant.
Initially, this will be disabled and we'll test in our instances first
This PR adds stale flag count to the project status payload. This is
useful for the project status page to show the number of stale flags in
the project.
Adding email_hash column to users table.
We will update all existing users to have hashed email.
All new users will also get the hash.
We are fine to use md5, because we just need uniqueness. We have emails
in events table stored anyways, so it is not sensitive.
This PR moves the project lifecycle summary to its own subdirectory and
adds files for types (interface) and a fake implementation.
It also adds a query for archived flags within the last 30 days taken
from `getStatusUpdates` in `src/lib/features/project/project-service.ts`
and maps the gathered data onto the expected structure. The expected
types have also been adjusted to account for no data.
Next step will be hooking it up to the project status service, adding
schema, and exposing it in the controller.
This PR adds more tests to check a few more cases for the lifecycle
calculation query. Specifically, it tests that:
- If we don't have any data for a stage, we return `null`.
- We filter on projects
- It correctly takes `0` days into account when calculating averages.
This PR adds a project lifecycle read model file along with the most
important (and most complicated) query that runs with it: calculating
the average time spent in each stage.
The calculation relies on the following:
- when calculating the average of a stage, only flags who have gone into
a following stage are taken into account.
- we'll count "next stage" as the next row for the same feature where
the `created_at` timestamp is higher than the current row
- if you skip a stage (go straight to live or archived, for instance),
that doesn't matter, because we don't look at that.
The UI only shows the time spent in days, so I decided to go with
rounding to days directly in the query.
## Discussion point:
This one uses a subquery, but I'm not sure it's possible to do without
it. However, if it's too expensive, we can probably also cache the value
somehow, so it's not calculated more than every so often.
This change introduces a new method `countProjectTokens` on the
`IApiTokenStore` interface. It also swaps out the manual filtering for
api tokens belonging to a project in the project status service.
Do not count stale flags as potentially stale flags to remove
duplicates.
Stale flags feel like more superior state and it should not show up
under potentially stale.
This PR adds member, api token, and segment counts to the project status
payload. It updates the schemas and adds the necessary stores to get
this information. It also adds a new query to the segments store for
getting project segments.
I'll add tests in a follow-up.
As part of the release plan template work. This PR adds the three events
for actions with the templates.
Actually activating milestones should probably trigger existing
FeatureStrategyAdd events when adding and FeatureStrategyRemove when
changing milestones.
This PR wires up the connectedenvironments data from the API to the
resources widget.
Additionally, it adjusts the orval schema to add the new
connectedEnvironments property, and adds a loading state indicator for
the resource values based on the project status endpoint response.
As was discussed in a previous PR, I think this is a good time to update
the API to include all the information required for this view. This
would get rid of three hooks, lots of loading state indicators (because
we **can** do them individually; check out
0a334f9892)
and generally simplify this component a bit.
Here's the loading state:
![image](https://github.com/user-attachments/assets/c9938383-afcd-4f4b-92df-c64b83f5b1df)
In some cases, people want to disable database migration. For example,
some people or companies want to grant whole permissions to handle the
schema by DBAs, not by application level hence I use
`parseEnvVarBoolean` to handle `disableMigration` option by environment
variable. I set the default value as `false` for the backward
compatibility.
This PR adds connected environments to the project status payload.
It's done by:
- adding a new `getConnectedEnvironmentCountForProject` method to the
project store (I opted for this approach instead of creating a new view
model because it already has a `getEnvironmentsForProject` method)
- adding the project store to the project status service
- updating the schema
For the schema, I opted for adding a `resources` property, under which I
put `connectedEnvironments`. My thinking was that if we want to add the
rest of the project resources (that go in the resources widget), it'd
make sense to group those together inside an object. However, I'd also
be happy to place the property on the top level. If you have opinions
one way or the other, let me know.
As for the count, we're currently only counting environments that have
metrics and that are active for the current project.
This was an oversight. The test would always fail after 2024-11-04,
because yesterday is no longer 2024-11-03. This way, we're using the
same string in both places.
Adding project status schema definition, controller, service, e2e test.
Next PR will add functionality for activity object.
---------
Co-authored-by: Thomas Heartman <thomas@getunleash.io>
The two lints being turned off are new for 1.9.x and caused a massive
diff inside frontend if activated. To reduce impact, these were turned off for
the merge. We might want to look at turning them back on once we're
ready to have a semantic / a11y refactor of our frontend.
Archived features can be searched now.
This is the backend and small parts of frontend preparing to add
filters, buttons etc in next PR.
---------
Co-authored-by: Thomas Heartman <thomas@getunleash.io>
This change fixes a bug in the event filter's `to` query parameter.
The problem was that in an attempt to make it inclusive, we also
stripped it of the `IS:` prefix, which meant it had no effect. This
change fixes that by first splitting the value, fixing only the
date (because we want it to include the entire day), and then joining
it back together.
We do some validation on flag names, but there's some cases that slip
through. These are some cases that we should handle better.
With `..` as a name, you can't go into the flag in Unleash and you can't
activate any environments because the it is interpreted as "go up a
level".
## About the changes
We have many aggregation queries that run on a schedule:
f63496d47f/src/lib/metrics.ts (L714-L719)
These staticCounters are usually doing db query aggregations that
traverse tables and we run all of them in parallel:
f63496d47f/src/lib/metrics.ts (L410-L412)
This can add strain to the db. This PR suggests a way of handling these
queries in a more structured way, allowing us to run them sequentially
(therefore spreading the load):
f02fe87835/src/lib/metrics-gauge.ts (L38-L40)
As an additional benefit, we get both the gauge definition and the
queries in a single place:
f02fe87835/src/lib/metrics.ts (L131-L141)
This PR only tackles 1 metric, and it only focuses on gauges to gather
initial feedback. The plan is to migrate these metrics and eventually
incorporate more types (e.g. counters)
---------
Co-authored-by: Nuno Góis <github@nunogois.com>
Give the ability to change when users are considered inactive via an
environment variable `USER_INACTIVITY_THRESHOLD_IN_DAYS` or
configuration option: `userInactivityThresholdInDays`. Default remains
180 days
## About the changes
This fixes#8029. How to reproduce the issue is in the ticket.
The issue happens because when a web app is hosted in the same domain as
Unleash UI and the web app uses unleash SDK to make requests to Unleash,
the browser automatically includes the cookie in the request headers,
because:
- The request URL matches the cookie's Path attribute (which it does in
this case).
- The request is sent to the same domain (which it is, since both apps
are under the same domain).
And this is by design in the HTTP cookie specification:
https://datatracker.ietf.org/doc/html/rfc6265
This PR avoids overriding the API user with the session user if there's
already an API user in the request. It's an alternative to
https://github.com/Unleash/unleash/pull/8434Closes#8029
This gives us better types for our wrapTimer function.
Maybe the type `(args: any) => any` could also be improved
---------
Co-authored-by: Nuno Góis <nuno@getunleash.io>
https://linear.app/unleash/issue/2-2787/add-openai-api-key-to-our-configuration
Adds the OpenAI API key to our configuration and exposes a new
`unleashAIAvailable` boolean in our UI config to let our frontend know
that we have configured this. This can be used together with our flag to
decide whether we should enable our experiment for our users.
This PR fixes a bug where the default project would have no listed
owners. The issue was that the default project has no user owners by
default, so we didn't get a result back when looking for user owners.
Now we check whether we have any owners for that project, and if we
don't, then we return the system user as an owner instead.
This also fixes an issue for the default project where you have no roles
(because by default, you don't) by updating the schema to allow an empty
list.
This PR contains a number of small updates to the dashboard schemas,
including rewording descriptions, changing numbers to integers, setting
minimum values.
This property does not seem to be used anywhere, so we can remove it.
Can't find any references in code here or in enterprise. Let's try it
and see if it breaks.
This PR updates the personal dashboard project endpoint to return owners
and roles. It also adds the impl for getting roles (via the access
store).
I'm filtering the roles for a project to only include project roles for
now, but we might wanna change this later.
Tests and UI update will follow.
This PR is part 1 of returning project owners and your project roles for
the personal dashboard single-project endpoint.
It moves the responsibility of adding owners and roles to the project to
the service from the controller and adds a new method to the project
owners read model to take care of it.
I'll add roles and tests in follow-up PRs.
Splitting #8271 into smaller pieces. This first PR will focus on making
access service handle empty string inputs gracefully and converting them
to null before inserting them into the database.