This test was flaky once an hour because subminutes 3 made it fall into
the wrong bucket when tests were run exactly or minutes after the our
had passed.
Also, the databases created were created with the system clock. I
altered it to be explicitly UTC.
As part of preparation for ESM and node/TSC updates, this PR will make
Unleash build with strictNullChecks set to true, since that's what's in
our tsconfig file. Hence, this PR also removes the `--strictNullChecks
false` flag in our compile tasks in package.json.
TL;DR - Clean up your code rather than turning off compiler security
features :)
This fixes a problem where the yggdrasil-engine does not send the
correct value for bucket.start. In practice clients sends metrics every
60s and it does not matter if we use start or the stop timestamp to
resolve the nearest full hour.
## About the changes
Based on the first hypothesis from
https://github.com/Unleash/unleash/pull/9264, I decided to find an
alternative way of initializing the DB, mainly trying to run migrations
only once and removing that from the actual test run.
I found in [Postgres template
databases](https://www.postgresql.org/docs/current/manage-ag-templatedbs.html)
an interesting option in combination with jest global initializer.
### Changes on how we use DBs for testing
Previously, we were relying on a single DB with multiple schemas to
isolate tests, but each schema was empty and required migrations or
custom DB initialization scripts.
With this method, we don't need to use different schema names
(apparently there's no templating for schemas), and we can use new
databases. We can also eliminate custom initialization code.
### Legacy tests
This method also highlighted some wrong assumptions in existing tests.
One example is the existence of `default` environment, that because of
being deprecated is no longer available, but because tests are creating
the expected db state manually, they were not updated to match the
existing db state.
To keep tests running green, I've added a configuration to use the
`legacy` test setup (24 tests). By migrating these, we'll speed up
tests, but the code of these tests has to be modified, so I leave this
for another PR.
## Downsides
1. The template db initialization happens at the beginning of any test,
so local development may suffer from slower unit tests. As a workaround
we could define an environment variable to disable the db migration
2. Proliferation of test dbs. In ephemeral environments, this is not a
problem, but for local development we should clean up from time to time.
There's the possibility of cleaning up test dbs using the db name as a
pattern:
2ed2e1c274/scripts/jest-setup.ts (L13-L18)
but I didn't want to add this code yet. Opinions?
## Benefits
1. It allows us migrate only once and still get the benefits of having a
well known state for tests.
3. It removes some of the custom setup for tests (which in some cases
ends up testing something not realistic)
4. It removes the need of testing migrations:
https://github.com/Unleash/unleash/blob/main/src/test/e2e/migrator.e2e.test.ts
as migrations are run at the start
5. Forces us to keep old tests up to date when we modify our database
Adds a killswitch called "filterExistingFlagNames". When enabled it will
filter out reported SDK metrics and remove all reported metrics for
names that does not match an exiting feature flag in Unleash.
This have proven critical in the rare case of an SDK that start sending
random flag-names back to unleash, and thus filling up the database. At
some point the database will start slowing down due to the noisy data.
In order to not resolve the flagNames all the time we have added a small
cache (10s) for feature flag names. This gives a small delay (10s) from
flag is created until we start allow metrics for the flag when
kill-switch is enabled. We should probably listen to the event-stream
and use that invalidate the cache when a flag is created.
We now have customers that exceed INT capacity, so we need to change
this to BIGINT in client_metrics_env_variants_daily as well.
Even heavy users only have about 10000 rows here, so should be a quick
enough operation.
## About the changes
When storing last seen metrics we no longer validate at insert time that
the feature exists. Instead, there's a job cleaning up on a regular
interval.
Metrics for features with more than 255 characters, makes the whole
batch to fail, resulting in metrics being lost.
This PR helps mitigate the issue while also logs long name feature names
This adds an extended metrics format to the metrics ingested by Unleash
and sent by running SDKs in the wild. Notably, we don't store this
information anywhere new in this PR, this is just streamed out to
Victoria metrics - the point of this project is insight, not analysis.
Two things to look out for in this PR:
- I've chosen to take extend the registration event and also send that
when we receive metrics. This means that the new data is received on
startup and on heartbeat. This takes us in the direction of collapsing
these two calls into one at a later point
- I've wrapped the existing metrics events in some "type safety", it
ain't much because we have 0 type safety on the event emitter so this
also has some if checks that look funny in TS that actually check if the
data shape is correct. Existing tests that check this are more or less
preserved
## About the changes
This aligns us with the requirement of having ip in all events. After
tackling the enterprise part we will be able to make the ip field
mandatory here:
2c66a4ace4/src/lib/types/events.ts (L362)
I've tried to use/add the audit info to all events I could see/find.
This makes this PR necessarily huge, because we do store quite a few
events.
I realise it might not be complete yet, but tests
run green, and I think we now have a pattern to follow for other events.
Previously, we were extracting the project from the token, but now we
will retrieve it from the session, which contains the full list of
projects.
This change also resolves an issue we encountered when the token was a
multi-project token, formatted as []:dev:token. Previously, it was
unable to display the exact list of projects. Now, it will show the
exact project names.
<details>
<summary>Feature Flag Cleanup</summary>
| Stale Flag | Value |
| ---------- | ------- |
| stripClientHeadersOn304 | true |
</details>
<details>
<summary>Trigger</summary>
https://github.com/Unleash/unleash/issues/6559#issuecomment-2058848984
</details>
<details>
<summary>Bot Commands</summary>
`@gitar-bot cleanup stale_flag=value` will cleanup a stale feature flag.
Replace `stale_flag` with the name of the stale feature flag and `value`
with either `true` or `false`.
</details>
---------
Co-authored-by: Gitar Bot <noreply@gitar.co>
## About the changes
There seems to be a typo in the authorization header. We're keeping the
old typo as preferred just in case, but if not present we'll default to
the authorization header (not authorisation).
Not sure about the impact of this bug, as all registrations might be
using default project.
In order to prevent users from being able to assign roles/permissions
they don't have, this PR adds a check that the user performing the
action either is Admin, Project owner or has the same role they are
trying to grant/add.
This addAccess method is only used from Enterprise, so there will be a
separate PR there, updating how we return the roles list for a user, so
that our frontend can only present the roles a user is actually allowed
to grant.
This adds the validation to the backend to ensure that even if the
frontend thinks we're allowed to add any role to any user here, the
backend can be smart enough to stop it.
We should still update frontend as well, so that it doesn't look like we
can add roles we won't be allowed to.
Only triggers if there is any rows in client instances that have
sdk_version: unleash-edge with version < 17.0.0
The function that checks this memoizes the check for 10 minutes to avoid
scanning the client instances table too often.