When deleting stale sessions, we sort them by createdAt. If both
sessions are created with the same createdAt, there's a chance we get a
different sort order and we end up with the wrong order:
https://github.com/Unleash/unleash/actions/runs/16438565746/job/46453700977
I think adding 10ms between inserts should be enough (1ms should do,
but this gives me more confidence and doesn't hurt that much)
---------
Co-authored-by: Thomas Heartman <thomas@getunleash.io>
https://linear.app/unleash/issue/2-3696/report-unknown-flags-when-sent-to-the-bulk-metrics-endpoint
Unifies metrics sifting logic across both metrics endpoints:
- `/metrics`
- `/metrics/bulk`
This PR improves consistency between the `/metrics` and `/metrics/bulk`
endpoints by introducing a shared `siftMetrics` method, now used within
`registerBulkMetrics`. Both endpoints already call this method at the
end of their respective logic flows, ensuring that metrics are sifted in
the same way regardless of the path taken.
While the primary goal was to enable reporting of unknown flags via the
`/metrics/bulk` endpoint, this change also improves bulk processing by
consistently dropping invalid or unknown flags before insertion, just
like in the regular `/metrics` endpoint.
Fixes a bug in the instance store where insert and bulkUpsert would
overwrite existing properties if there was a row there already. Now
it'll ignore any properties that are undefined.
The implementation is lifted directly from
`src/lib/db/client-applications-store.ts` (line 107 atm).
Additionally, I've renamed the `insert` method to `upsert` to make it
clearer what it does (and because we already have `bulkUpsert`). The
method seems to only be used in tests, anyway. I do not anticipate any
changes to be required in enterprise (I've checked).
## Discussion points:
This implementation uses `delete` to remove properties from the object.
Why didn't I do it some other way? Two main reasons:
1. We've had this implementation for 4 years in the client applications
store. If there were serious issues with it, we'd probably know by know.
(Probably.)
2. The only way I can think of without deleting, would be to use
`Object.fromEntries` and `Object.toEntries` and either map or reduce.
That'll double the amount of property iterations we'll need to do.
So naively, this strikes me as being more efficient. If you know better
solutions, I will of course be happy to take them. If not, I'd like to
leave this as is and then change it if we see that it's causing issues.
Fixes a bug where `registerInstance` and
`register{Frontend|Backend}Client` would overwrite each other's data in
the instance service, leading to the bulk update being made with partial
data, often missing SDK version. There's a different issue in the actual
store that causes sdk version and type to be overwritten when it's
updated (because we don't use `setLastSeen` anymore), but I'll handle
that in a different PR.
This PR adds tests for the changes I've made. Additionally, I've made
these semi-related bonus changes:
- In registerInstance, don't expect a partial `IClientApp`. We used to
validate that it was actual a metrics object instead. Instead, update
the signature to expect the actual properties we need from the cilent
metrics schema and set a default for instanceId the way Joi did.
- In `metrics.ts`, use the `ClientMetricsSchema` type in the function
signature, so that the request body is correctly typed in the function
(instead of being `any`).
- Delete two unused properties from the`createApplicationSchema`. They
would get ignored and were never used as far as I can tell. (`appName`
is taken from the URL, and applications don't store `sdkVersion`
information).
- Add `sdkVersion` to `IClientApp` because it's used in instance
service.
I've been very confused about all the weird type shenanigans we do in
the instance service (expecting `IClientApp`, then validating with a
different Joi schema etc). I think this makes it a little bit better and
updates the bits I'm touching, but I'm happy to take input if you
disagree.
This type wasn't available in enterprise, so I'm adding it to serer-impl
to make it available.
I was a little unsure whether this would be an implementation detail
that we shouldn't expose in server-impl, but comparing it with the other
things we export (`applyGenericQueryParams`, `flattenPayload`,
`basePaginationParameters`), that seems to be fine. I'm guessing this
isn't the main public export? Or it is, and we just don't care 🤷
https://linear.app/unleash/issue/2-3695/allow-empty-flag-names-to-be-reported-in-bulk-metrics
Accepts metrics with empty flag names in the `/api/client/metrics/bulk`
endpoint.
When testing unknown flags through Edge, which uses the `/bulk`
endpoint, we noticed that there's a slight difference in validation
behavior compared to the regular metrics endpoint. While the regular
endpoint allows empty flag names, this one does not.
We can argue that we don't care about empty flag names in the first
place, which is true, but this inconsistency between the metric
endpoints can be confusing, and it also means that a single empty flag
name evaluation would break metrics being reported for that entire Edge
instance, for example.
This way we still accept it, just like we currently do if we point to
Unleash directly instead of going through Edge.
**Note**: We noticed that, due to the slightly different logic branch,
the bulk metrics endpoint does not report unknown flags. We'll take a
look at this at a later point.
This is primarily to facilitate reading and processing these events in
the payg cloud section of Unleash. We only emit these in one place, so I
added the types in there.
I found this method when running through the environment store that has
0 references. I also can't find any references to it in enterprise and
it's not in the interface. I think it's safe to remove.
## About the changes
Users could have been created in Unleash without a corresponding event
(a.k.a. audit log), due to a non transactional user insert
([fix](https://github.com/Unleash/unleash/pull/10327)). This could have
happened because of providing the wrong role id or some other causes
we're not aware of.
This amends the situation by inserting an event for each user that
exists in the instance (not deleted) and doesn't have it's corresponding
user-created event.
The event is inserted as already announced because this happened in the
past.
The event log will look like this (simulated the situation in local
dev):
```json
{
"id": 11,
"type": "user-created",
"createdBy": "unleash_system_user",
"createdAt": "2025-07-08T16:06:17.428Z",
"createdByUserId": null,
"data": {
"id": "6",
"email": "xyz@three.com"
},
"preData": null,
"tags": [],
"featureName": null,
"project": null,
"environment": null,
"label": "User created",
"summary": "**unleash_system_user** created user ****"
}
```
The main problem is we can't create the event in the past, so this will
have to do it
## About the changes
When inserting a user with an invalid role id, the user creation will
succeed but there will be no record in the audit log.
The API call returns a 400 misleading you to believe the user was not
created, but it actually was.
This makes the whole user creation transactional, so if something fails,
data will be in the right state.
## Testing
The e2e test was split in 2 scenarios, one with smtp and another one
without.
This test was added, and it was failing before adding the transaction,
because when fetching the users, the user was there, despite having
returned a 400 error in the API call:
80a2e65b6f/src/test/e2e/api/admin/user-admin.e2e.test.ts (L181-L204)
I noticed event search, as it is doing `ILIKE` search, is slow
sometimes. Lets get some statistics about it.
Meanwhile added timers for other interesting queries.