Accepts the new impact metrics into the singleton registry and then does
nothing with them. If the relevant flag is off, the metrics are stripped
from the existing metrics data format and dropped on the floor.
**BREAKING CHANGE**: DEFAULT_ENV changed from `default` (should not be
used anymore) to `development`
## About the changes
- Only delete the default env if the install is a fresh one.
- Consider `development` the new default. The main consequence of this
change is that the default environment is no longer considered a
`type=production` environment, which also affects frontend tokens due to
this assumption:
724c4b78a2/src/lib/schema/api-token-schema.test.ts (L54-L59)
(I believe this is mostly due to the [support for admin
tokens](https://github.com/Unleash/unleash/pull/10080#discussion_r2126871567))
- The `feature_toggle_update_total` metric reports `n/a` for environment
and environment type, as it's not environment-specific.
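As a rough illustration (not the actual Unleash code; the label names here are assumptions), this is how such a counter would report `n/a` with `prom-client`:

```ts
import { Counter } from 'prom-client';

// Hypothetical counter definition; the real metric lives in Unleash's metrics setup.
const featureToggleUpdateTotal = new Counter({
    name: 'feature_toggle_update_total',
    help: 'Number of feature flag updates',
    labelNames: ['toggle', 'project', 'environment', 'environment_type'],
});

// The update event is not environment-specific, so both environment labels
// are reported as 'n/a'.
featureToggleUpdateTotal.labels('my-flag', 'default', 'n/a', 'n/a').inc();
```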
Now we can receive custom metrics, return them for the UI, and expose an
extra Prometheus endpoint for them.
---------
Co-authored-by: Christopher Kolstad <chriswk@getunleash.io>
Vitest Pros:
* Automated failing test comments on GitHub PRs
* A nice local UI with incremental testing when changing files (`yarn
test:ui`)
* Also nicely supported in all major IDEs; click-to-run-test works (so
we won't miss what we had with Jest).
* Works well with ESM
Vitest Cons:
* The ESBuild transformer Vitest uses takes a little longer to transform
than our current SWC/Jest setup. However, it is possible to set up SWC as
the transformer for Vitest as well (though it only does one transform, so
we're paying ~7-10 seconds instead of ~2-3 seconds in the transform
phase); see the config sketch after this list.
* Exposes how slow our tests are (tongue in cheek here)
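For reference, a minimal sketch of what swapping the transformer could look like, assuming the `unplugin-swc` package (an assumption, not part of this PR):

```ts
// vitest.config.ts — hypothetical: use SWC instead of esbuild for transforms.
import { defineConfig } from 'vitest/config';
import swc from 'unplugin-swc';

export default defineConfig({
    plugins: [
        // let SWC handle the TypeScript transform instead of esbuild
        swc.vite(),
    ],
    test: {
        environment: 'node',
    },
});
```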
https://linear.app/unleash/issue/2-3564/remove-filterexistingflagnames-feature-flag
We're removing the `filterExistingFlagNames` feature flag since we've
decided we want this to be the default behavior.
We don't need to rush to merge it, in case we need to disable this
behavior for any reason. However, it should also be pretty easy to just
revert if needed.
Changes in tests are a bit tricky since they assumed the previous
behavior, where we always registered metrics, even for non-existing flag
names. `cachedFeatureNames` is also memoized with a TTL of 10s, so the
easiest way to overcome this was to override `cachedFeatureNames` to
return what we expected. As long as it returns the same flag names that
we report, we're able to register their metrics.
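A minimal sketch of that override, assuming Vitest's `vi` and stand-in names (`metricsService`, `registerBulkMetrics`, and `clientMetricsFor` are hypothetical, not the exact test code):

```ts
import { vi } from 'vitest';

// Force the memoized cachedFeatureNames to return the flag names the test is
// about to report metrics for, bypassing the 10s TTL.
metricsService.cachedFeatureNames = vi
    .fn()
    .mockResolvedValue(['my-existing-flag']);

// Metrics for 'my-existing-flag' are now accepted instead of filtered out.
await metricsService.registerBulkMetrics(clientMetricsFor('my-existing-flag'));
```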
Let me know if you can think of a better approach.
We're migrating to ESM, which will allow us to import the latest
versions of our dependencies.
Co-Authored-By: Christopher Kolstad <chriswk@getunleash.io>
https://linear.app/unleash/issue/2-3406/hold-unknown-flags-in-memory-and-show-them-in-the-ui-somehow
This PR introduces a suggestion for an “unknown flags” feature.
When clients report metrics for flags that don’t exist in Unleash (e.g.
due to typos), we now track a limited set of these unknown flag names
along with the app names that reported them. The goal is to help users
identify and clean up incorrect flag usage across their apps.
We store up to 10 unknown flag + appName combinations, keeping only the
most recent reports. Data is collected in-memory and flushed
periodically to the DB, with deduplication and merging to ensure we
don’t exceed the cap even across pods.
We were especially careful to make this implementation defensive, as
unknown flags could be reported in very high volumes. Writes are
batched, deduplicated, and hard-capped to avoid DB pressure.
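A minimal sketch of the cap-and-dedupe idea (names here are hypothetical, not the PR's actual classes):

```ts
// In-memory buffer: dedupe on flag + app name and keep only the most recent
// reports, up to a hard cap, before flushing to the DB.
type UnknownFlagReport = { name: string; appName: string; seenAt: Date };

class UnknownFlagsCache {
    private static readonly MAX = 10;
    private reports = new Map<string, UnknownFlagReport>();

    register(report: UnknownFlagReport): void {
        this.reports.set(`${report.name}:${report.appName}`, report); // newest wins
        if (this.reports.size > UnknownFlagsCache.MAX) {
            // drop the oldest entry to stay under the hard cap
            const oldest = [...this.reports.values()].sort(
                (a, b) => a.seenAt.getTime() - b.seenAt.getTime(),
            )[0];
            this.reports.delete(`${oldest.name}:${oldest.appName}`);
        }
    }

    getAll(): UnknownFlagReport[] {
        return [...this.reports.values()];
    }
}
```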
No UI has been added yet — this is backend-only for now and intended as
a step toward better visibility into client misconfigurations.
I would suggest starting with a simple banner that opens a dialog
showing the list of unknown flags and which apps reported them.
<img width="497" alt="image"
src="https://github.com/user-attachments/assets/b7348e0d-0163-4be4-a7f8-c072e8464331"
/>
This test was flaky once an hour because subtracting 3 minutes made it
fall into the wrong bucket when tests were run exactly on, or a few
minutes after, the top of the hour.
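For illustration (assuming date-fns), the failure mode looks like this:

```ts
import { subMinutes, startOfHour } from 'date-fns';

// Running the test within the first few minutes of an hour means the
// 3-minute subtraction crosses back into the previous hour's bucket.
const ranAt = new Date('2024-01-01T10:01:00'); // one minute past the hour
const sampled = subMinutes(ranAt, 3);           // 09:58, previous hour
const bucket = startOfHour(sampled);            // 09:00, not the expected 10:00 bucket
```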
Also, the databases were created with the system clock's timezone. I
altered that to be explicitly UTC.
As part of the preparation for ESM and Node/TSC updates, this PR makes
Unleash build with `strictNullChecks` set to `true`, since that's what's
in our tsconfig file. Hence, this PR also removes the `--strictNullChecks
false` flag from our compile tasks in package.json.
TL;DR - Clean up your code rather than turning off compiler security
features :)
This fixes a problem where the yggdrasil-engine does not send the
correct value for bucket.start. In practice, clients send metrics every
60s, so it does not matter whether we use the start or the stop timestamp
to resolve the nearest full hour.
## About the changes
Based on the first hypothesis from
https://github.com/Unleash/unleash/pull/9264, I decided to find an
alternative way of initializing the DB, mainly trying to run migrations
only once and removing that from the actual test run.
I found [Postgres template
databases](https://www.postgresql.org/docs/current/manage-ag-templatedbs.html)
to be an interesting option in combination with a Jest global initializer.
### Changes on how we use DBs for testing
Previously, we were relying on a single DB with multiple schemas to
isolate tests, but each schema was empty and required migrations or
custom DB initialization scripts.
With this method, we don't need to use different schema names
(apparently there's no templating for schemas), and we can use new
databases. We can also eliminate custom initialization code.
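A minimal sketch of the idea, assuming a Jest global setup and hypothetical names (`unleash_template`, `createTestDb`, and the env variable are not the PR's actual code):

```ts
import { Client } from 'pg';

// jest global setup: migrate one template database exactly once.
export default async function globalSetup(): Promise<void> {
    const client = new Client({ connectionString: process.env.TEST_DATABASE_URL });
    await client.connect();
    await client.query('CREATE DATABASE unleash_template');
    // ...run all migrations against unleash_template here, once...
    await client.end();
}

// Each test suite can then clone a fully migrated database almost instantly.
export async function createTestDb(client: Client, name: string): Promise<void> {
    await client.query(`CREATE DATABASE ${name} TEMPLATE unleash_template`);
}
```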
### Legacy tests
This method also highlighted some wrong assumptions in existing tests.
One example is the `default` environment: because it's deprecated, it's
no longer available, but because tests create the expected db state
manually, they were not updated to match the actual db state.
To keep tests running green, I've added a configuration to use the
`legacy` test setup (24 tests). By migrating these, we'll speed up
tests, but the code of these tests has to be modified, so I leave this
for another PR.
## Downsides
1. The template db initialization happens at the beginning of any test
run, so local development may suffer from slower unit tests. As a
workaround, we could define an environment variable to disable the db
migration.
2. Proliferation of test dbs. In ephemeral environments, this is not a
problem, but for local development we should clean up from time to time.
There's the possibility of cleaning up test dbs using the db name as a
pattern:
2ed2e1c274/scripts/jest-setup.ts (L13-L18)
but I didn't want to add this code yet. Opinions?
## Benefits
1. It allows us to migrate only once and still get the benefits of having
a well-known state for tests.
2. It removes some of the custom setup for tests (which in some cases
ends up testing something not realistic).
3. It removes the need to test migrations:
https://github.com/Unleash/unleash/blob/main/src/test/e2e/migrator.e2e.test.ts
as migrations are run at the start.
4. It forces us to keep old tests up to date when we modify our database.
Adds a kill switch called "filterExistingFlagNames". When enabled, it
will filter reported SDK metrics and remove all reported metrics for
names that do not match an existing feature flag in Unleash.
This has proven critical in the rare case of an SDK that starts sending
random flag names back to Unleash, thus filling up the database. At
some point the database will start slowing down due to the noisy data.
In order to not resolve the flag names all the time, we have added a
small cache (10s) for feature flag names. This gives a small delay (10s)
from when a flag is created until we start allowing metrics for it when
the kill switch is enabled. We should probably listen to the event stream
and use that to invalidate the cache when a flag is created.
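A minimal sketch of the idea (assuming the `memoizee` package; `fetchFlagNames` stands in for the real store lookup):

```ts
import memoizee from 'memoizee';

// Cache the set of known flag names for 10 seconds.
const makeCachedFeatureNames = (fetchFlagNames: () => Promise<string[]>) =>
    memoizee(async () => new Set(await fetchFlagNames()), {
        promise: true,
        maxAge: 10_000, // refresh at most every 10 seconds
    });

// Drop metrics whose flag name is unknown while the kill switch is enabled.
const filterExistingFlagNames = async <T extends { featureName: string }>(
    metrics: T[],
    cachedFeatureNames: () => Promise<Set<string>>,
): Promise<T[]> => {
    const known = await cachedFeatureNames();
    return metrics.filter((metric) => known.has(metric.featureName));
};
```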
We now have customers that exceed INT capacity, so we need to change
this to BIGINT in client_metrics_env_variants_daily as well.
Even heavy users only have about 10,000 rows here, so it should be a
quick enough operation.
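A rough sketch of what such a migration could look like, assuming db-migrate-style migrations and a counter column named `count` (both assumptions, not the PR's actual migration):

```ts
exports.up = (db, callback) => {
    db.runSql(
        `ALTER TABLE client_metrics_env_variants_daily
             ALTER COLUMN count TYPE BIGINT;`,
        callback,
    );
};

exports.down = (db, callback) => {
    db.runSql(
        `ALTER TABLE client_metrics_env_variants_daily
             ALTER COLUMN count TYPE INTEGER;`,
        callback,
    );
};
```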
## About the changes
When storing last seen metrics we no longer validate at insert time that
the feature exists. Instead, there's a job cleaning up on a regular
interval.
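A minimal sketch of such a cleanup (the table and column names are assumptions, and a real implementation would use Unleash's scheduler rather than a bare `setInterval`):

```ts
import type { Knex } from 'knex';

// Periodically remove last-seen rows whose feature no longer exists.
const startLastSeenCleanup = (db: Knex, intervalMs = 60 * 60 * 1000) =>
    setInterval(async () => {
        await db.raw(`
            DELETE FROM last_seen_at_metrics
            WHERE feature_name NOT IN (SELECT name FROM features)
        `);
    }, intervalMs);
```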
Metrics for features with names longer than 255 characters make the
whole batch fail, resulting in metrics being lost.
This PR helps mitigate the issue while also logging long feature names.
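A minimal sketch of the mitigation (hypothetical helper, not the PR's exact code):

```ts
const MAX_FEATURE_NAME_LENGTH = 255;

// Drop metrics whose feature name would overflow the 255-character column,
// and log what was dropped instead of failing the whole batch.
function filterLongFeatureNames<T extends { featureName: string }>(
    metrics: T[],
    logger: { warn: (msg: string) => void },
): T[] {
    const tooLong = metrics.filter(
        (m) => m.featureName.length > MAX_FEATURE_NAME_LENGTH,
    );
    if (tooLong.length > 0) {
        logger.warn(
            `Dropping metrics for ${tooLong.length} feature name(s) longer than ${MAX_FEATURE_NAME_LENGTH} characters`,
        );
    }
    return metrics.filter(
        (m) => m.featureName.length <= MAX_FEATURE_NAME_LENGTH,
    );
}
```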