## About the changes
We've identified that the Bearer token middleware is not working for
the /enterprise instance.
Looking at a few lines below:
88e3b1b79e/src/lib/app.ts (L81-L84)
we can see that we were missing the basePath in the use definition.
## About the changes
Currently, when the server returns the HTML entry point, no headers are
attached. We encountered an issue with a deployment where this had an
impact for us.
A brief description:
1. We deployed the most recent version. Noticed an unrelated issue.
2. Users tried to use the most recent version and due to their client
cache, requested assets that did not exist in the newest version.
3. Our cache layer cached the assets that were not there with the HTML
response. It had to infer the type based on the filename because there
was no attached `Content-Type` header. This cache was very sticky.
4. After rolling back we saw the HTML response (from the cache) instead
of the appropriate response from the upstream Unleash application.
This PR does a few things.
1. When responding with the HTML entry point, it adds a header
(`Content-Type: text/html`).
2. When the client is requesting an asset (a path that ends with an
extension), it also instructs the resource not to be cached
(`Cache-Control: no-cache`) and returns a 404. This will prevent misses
from getting cached.
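A minimal sketch of the two rules above, assuming the function is only reached after the static file lookup has already failed. The names `SpaReply`, `looksLikeAsset` and `buildSpaReply` are hypothetical and only illustrate the idea; they are not the actual implementation.

```ts
// Hypothetical reply shape, used only for this illustration.
interface SpaReply {
    status: number;
    headers: Record<string, string>;
    body: string;
}

// Treat a path as an asset request when it ends with a file extension
// (ignoring any query string).
const looksLikeAsset = (requestPath: string): boolean =>
    /\.[a-zA-Z0-9]+$/.test(requestPath.replace(/\?.*$/, ''));

const buildSpaReply = (requestPath: string, indexHtml: string): SpaReply => {
    if (looksLikeAsset(requestPath)) {
        // Missing asset: return a 404 and ask caches not to store the miss.
        return {
            status: 404,
            headers: { 'Cache-Control': 'no-cache' },
            body: '',
        };
    }
    // Any other path gets the HTML entry point with an explicit content type.
    return {
        status: 200,
        headers: { 'Content-Type': 'text/html' },
        body: indexHtml,
    };
};
```

With this shape, a request for a stale hashed asset no longer receives the HTML entry point, so a cache layer has nothing to infer a type from.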
## Discussion points
To me, there doesn't seem to be a lot of test infra on serving the SPA
application. If that is an error, please indicate where that is and an
appropriate test can be added.
Adds a postgres_version gauge to allow us to see postgres_version in
prometheus and to post it upstream when version checking. Depends on
https://github.com/bricks-software/version-function/pull/20 to be merged
first to ensure our version-function doesn't crash when given the
postgres-version data.
1. Added new schema and tests
2. Controller also accepts the data
3. Also sending fake data from frontend currently
Next steps: implement the service/store layer and frontend.
This fixes the case when a customer has thousands of strategies causing
the react UI to crash. We still consider it incorrect to use that amount
of strategies and this is more a workaround to help the customer out of
a crashing state.
We put it behind a flag called `manyStrategiesPagination` and plan to
only enable it for the customer in trouble.
Now we are also sending the project id to Prometheus, also querying it
from the database. This sets us up for a Grafana dashboard.
Also put the metrics behind a flag, just in case it causes cpu/memory
issues.
This PR updates the project service to automatically create a project id
if it is not provided. The feature is behind a flag. If an ID is
provided, it will still attempt to use that ID instead.
This PR adds a function to automatically generate a project ID on
creation. Using this when the id is missing will be handled in following
PRs.
The function uses the existing `slug` package to create a slug, and then
takes the 12 characters of a uuidv4 string to generate an ID.
The included tests check that the 12 character hash is added and that
the resulting string is url friendly (by checking that
`encodeURIComponent` doesn't change it).
We could also test a lot of edge cases (such as dealing with double
spaces, trimming the string, etc), but I think that's better handled by
the library itself (but you can check out what I removed in
2d9bcb6390
for an idea).
The function doesn't really need to be in the service; it could be moved to a util. But for proximity, I'll create it here first.
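As a rough sketch of the ID generation described above: the `toSlug` helper below is a hand-rolled stand-in for the `slug` package, and taking the last 12 characters of the UUID is an assumption (the description only says "12 characters"). The UUID is passed in as an argument to keep the example deterministic.

```ts
// Hand-rolled stand-in for the `slug` package, for illustration only.
const toSlug = (input: string): string =>
    input
        .toLowerCase()
        .trim()
        .replace(/[^a-z0-9]+/g, '-') // collapse anything non-alphanumeric
        .replace(/^-+|-+$/g, ''); // drop leading and trailing dashes

// Assumption: the 12 characters are the final group of the UUID string.
const buildProjectId = (projectName: string, uuid: string): string =>
    `${toSlug(projectName)}-${uuid.slice(-12)}`;

const exampleId = buildProjectId(
    'My New Project!',
    '123e4567-e89b-42d3-a456-426614174000',
);
// URL-friendliness check, mirroring the test idea described above.
const isUrlFriendly = encodeURIComponent(exampleId) === exampleId;
```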
Regarding ticket #6892:
I would like to enable the use of a CA certificate without requiring
other certificates. This would be useful for AWS Helm, as AWS only
provides a single PEM file for DB connections.
We are getting questions from engineers asking why they do not see the
lifecycle component. The same will happen with our customers. Now
customers will see the lifecycle component unified across features.
Final rank has always been ordered correctly by default. But after 5.12
I have seen cases where it is sometimes not ordered. Just to be extra
sure, I am in favor of ordering it explicitly.
Add a flag to enable/disable the new UI for project creation.
This flag is separate from the impl on the back end so that we can
enable one without the other (but uses flag dependencies in Unleash, so
that we can never enable the new UI without the new back end).
I have not set the flag to `true` in server startup because the form
doesn't work yet, so it's a manual step for now.
This PR removes the workaround introduced in
https://github.com/Unleash/unleash/pull/6931. After
https://github.com/ivarconr/unleash-enterprise/pull/1268 has been
merged, this should be safe to apply.
Notably, this PR:
- tightens up the type for the enable change request function, so we can
use that to inform the code
- skips trying to do anything with an empty array
The last point is less important than it might seem because both the env
validation and the current implementation of the callback are essentially
a no-op when there are no envs. However, that's hard to enforce. If we
just exit out early, then at least we know nothing happens.
Optionally, we could do something like this instead, but I'm not sure
it's better or worse. Happy to take input.
```ts
const crEnvs = newProject.changeRequestEnvironments ?? [];
await this.validateEnvironmentsExist(crEnvs.map((env) => env.name));
const changeRequestEnvironments =
    await enableChangeRequestsForSpecifiedEnvironments(crEnvs);
data.changeRequestEnvironments = changeRequestEnvironments;
```
This PR improves the handling of change request enables on project
creation in two ways:
1. We now verify that the envs you try to enable CRs for exist before
passing them on to the enterprise functionality.
2. We include data about environments and change request environments in
the project created events.
Due to how we handle redirects of embedded proxy, we ended up counting
the same request twice. This PR adds a boolean to res.locals which we
then check if set to avoid double counting.
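A small sketch of the guard, assuming a hypothetical `alreadyCounted` field on the locals object (the actual property name is not stated here):

```ts
// Hypothetical shape of the per-response locals object.
interface CountingLocals {
    alreadyCounted?: boolean;
}

// Runs `increment` at most once per locals object, even if the same
// request passes through the counting code twice after a redirect.
const countOnce = (locals: CountingLocals, increment: () => void): void => {
    if (locals.alreadyCounted) {
        return;
    }
    locals.alreadyCounted = true;
    increment();
};
```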
## About the changes
What's going on is the following:
1. When a token is not found in the token's cache we try to find it in
the db
2. To prevent a denial of service attack using invalid tokens, we cache
the invalid tokens so we don't hit the db.
3. The issue is that we stored this token in the cache regardless of
whether we found it or not. And if the token was valid the first time,
we'd add a timestamp to avoid querying this token again the next time.
4. The next iteration the token should be in the cache:
54383a6578/src/lib/services/api-token-service.ts (L162)
but for some reason it is not and therefore we have to make a query. But
this is where the query prevention mechanism kicks in because it finds
the token in the cache and kicks us out. This PR fixes this by only
storing in the cache for misses if not found:
54383a6578/src/lib/services/api-token-service.ts (L164-L165)
The token was added to the cache because we were not checking if it had
expired. Now we added a check and we also have a log for expired tokens.
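To illustrate the intended behavior, here is a simplified sketch. Everything in it is an assumption for illustration: the lookup is synchronous, the miss cache is a plain set without the timestamp mentioned above, and the names `TokenRecord` and `TokenLookupSketch` are made up.

```ts
// Hypothetical token shape; not the real API token type.
interface TokenRecord {
    secret: string;
    expiresAt?: Date;
}

class TokenLookupSketch {
    private readonly activeTokens = new Map<string, TokenRecord>();
    private readonly knownMisses = new Set<string>();
    private readonly findInDb: (secret: string) => TokenRecord | undefined;
    dbQueries = 0;

    constructor(findInDb: (secret: string) => TokenRecord | undefined) {
        this.findInDb = findInDb;
    }

    find(secret: string, now: Date = new Date()): TokenRecord | undefined {
        const cached = this.activeTokens.get(secret);
        if (cached) return cached;
        // A previously recorded miss short-circuits the DB query.
        if (this.knownMisses.has(secret)) return undefined;

        this.dbQueries += 1;
        const fromDb = this.findInDb(secret);
        if (!fromDb) {
            // Only tokens that were NOT found are recorded as misses.
            this.knownMisses.add(secret);
            return undefined;
        }
        if (fromDb.expiresAt && fromDb.expiresAt.getTime() <= now.getTime()) {
            // Expired tokens are rejected instead of being cached as active.
            return undefined;
        }
        this.activeTokens.set(secret, fromDb);
        return fromDb;
    }
}
```

The key point is that a token found in the DB never ends up in the miss cache, so a later lookup cannot be rejected by the query prevention mechanism.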
Some improvement opportunities:
- I don't think we display that a token has expired in the UI, which
probably led to this issue
- When a token has expired, we don't display a specific error message or
error response saying so, which is not very helpful for users
This PR introduces a configuration option (`authentication.demoAllowAdminLogin`) that allows you to log in as admin when using demo authentication. To do this, use the username `admin`.
## About the changes
The `admin` user currently cannot be accessed in `demo` authentication
mode, as the auth mode requires only an email to log in, and the admin
user is not created with an email. This change allows for logging in as
the admin user only if an `AUTH_DEMO_ALLOW_ADMIN_LOGIN` is set to `true`
(or the corresponding `authDemoAllowAdminLogin` config is enabled).
<!-- Does it close an issue? Multiple? -->
Closes #6398
### Important files
[demo-authentication.ts](https://github.com/Unleash/unleash/compare/main...00Chaotic:unleash:feat/allow_admin_login_using_demo_auth?expand=1#diff-c166f00f0a8ca4425236b3bcba40a8a3bd07a98d067495a0a092eec26866c9f1R25)
## Discussion points
Can continue discussion of [this
comment](https://github.com/Unleash/unleash/pull/6447#issuecomment-2042405647)
in this PR.
---------
Co-authored-by: Thomas Heartman <thomasheartman+github@gmail.com>
This commit adds an `environments` property to the project created
payload. The list contains only the environments that the project has
enabled.
The point of adding it is that it gives you a better overview over
what you have created.
This PR adds the `projectListNewCards` flag to the constant defined in
`experimental.ts`. This should allow the API to pass that value to the
front end.
## About the changes
Add time metrics to relevant queries:
- get
- getAll
- bulkInsert
- count
- exists
- get
Ignored because might not be that relevant:
- insert
- delete
- deleteAll
- update
## About the changes
This PR removes the feature flag `queryMissingTokens` that was fully
rolled out.
It introduces a new way of checking edgeValidTokens controlled by the
flag `checkEdgeValidTokensFromCache` that relies on the cached data but
hits the DB if needed.
The assumption is that most of the time edge will find tokens in the
cache, except for a few cases in which a new token is queried. From all
tokens we expect at most one to hit the DB and in this case querying a
single token should be better than querying all the tokens.
This makes it configurable either through a single JSON file with all
three certificates as separate keys or via separate files per
ca/cert/key key.
fixes #6718
I've tried to use/add the audit info to all events I could see/find.
This makes this PR necessarily huge, because we do store quite a few
events.
I realise it might not be complete yet, but tests
run green, and I think we now have a pattern to follow for other events.
This PR adds an optional function parameter to the `createProject`
function that is intended to enable change requests for the newly
created project.
The assumption is that all the logic within will be decided in the
enterprise impl. The only thing we want to verify here is that it is
called after the project has been created.
This PR adds functionality to the `createProject` function to choose
which environments should be enabled when you create a new project. The
new `environments` property is optional and omitting it will make it
work exactly as it does today.
The current implementation is fairly strict. We have some potential
ideas to make it easier to work with, but we haven't agreed on any yet.
Making it this strict means that we can always relax the rules later.
The rules are (codified in tests):
- If `environments` is not provided, all non-deprecated environments are
enabled
- If `environments` is provided, only the environments listed are
enabled, regardless of whether they're deprecated or not
- If `environments` is provided and is an empty array, the service
throws an error. The API should disallow that via the schema anyway,
but this catches it in case it sneaks in some other way.
- If `environments` is provided and contains one or more environments
that don't exist, the service throws an error. While we could ignore
them, that would lead to more complexity because we'd have to also check
that at least one of the environments is valid. It also leads to
silent ignoring of errors, which may or may not be good for the user
experience.
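The rules above can be sketched as a pure function. The `EnvironmentInfo` shape and the name `selectProjectEnvironments` are hypothetical; the real service works against the database and throws its own error types.

```ts
// Hypothetical environment shape, for illustration only.
interface EnvironmentInfo {
    name: string;
    deprecated: boolean;
}

const selectProjectEnvironments = (
    available: EnvironmentInfo[],
    requested?: string[],
): string[] => {
    // Rule 1: nothing provided, so enable all non-deprecated environments.
    if (requested === undefined) {
        return available
            .filter((env) => !env.deprecated)
            .map((env) => env.name);
    }
    // Rule 3: an empty list is an error.
    if (requested.length === 0) {
        throw new Error('At least one environment must be provided');
    }
    // Rule 4: every listed environment must exist.
    const known = new Set(available.map((env) => env.name));
    const unknown = requested.filter((name) => !known.has(name));
    if (unknown.length > 0) {
        throw new Error(`Unknown environments: ${unknown.join(', ')}`);
    }
    // Rule 2: only the listed environments, deprecated or not.
    return requested;
};
```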
The API endpoint for this sits in enterprise, so no customer-facing
changes are part of this.
We encountered an issue with a customer because this query was returning
3 million rows. The problem arose from each instance reporting
approximately 100 features, with a total of 30,000 instances. The query
was joining these, thus multiplying the data. This approach was fine for
a reasonable number of instances, but in this extreme case, it did not
perform well.
This PR modifies the logic; instead of performing outright joins, we are
now grouping features by environment into an array, resulting in just
one row returned per instance.
I tested locally with the same dataset. Previously, loading this large
instance took about 21 seconds; now it has reduced to 2 seconds.
Although this is still significant, the dataset is extensive.
Previously, we were not validating that the ID was a number, which
sometimes resulted in returning our database queries (source code) to
the frontend. Now, we have validation middleware.
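The check itself is small. A sketch of the kind of validation such a middleware could perform (the helper name `parseNumericId` is made up for this example):

```ts
// Returns the ID as a number, or undefined when the raw path parameter is
// not a plain non-negative integer. A middleware could respond with a 400
// in the undefined case instead of passing the value on to a query.
const parseNumericId = (raw: string): number | undefined =>
    /^\d+$/.test(raw) ? Number(raw) : undefined;
```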
Previously, we were extracting the project from the token, but now we
will retrieve it from the session, which contains the full list of
projects.
This change also resolves an issue we encountered when the token was a
multi-project token, formatted as []:dev:token. Previously, it was
unable to display the exact list of projects. Now, it will show the
exact project names.
<details>
<summary>Feature Flag Cleanup</summary>
| Stale Flag | Value |
| ---------- | ------- |
| stripClientHeadersOn304 | true |
</details>
<details>
<summary>Trigger</summary>
https://github.com/Unleash/unleash/issues/6559#issuecomment-2058848984
</details>
<details>
<summary>Bot Commands</summary>
`@gitar-bot cleanup stale_flag=value` will cleanup a stale feature flag.
Replace `stale_flag` with the name of the stale feature flag and `value`
with either `true` or `false`.
</details>
---------
Co-authored-by: Gitar Bot <noreply@gitar.co>
## About the changes
- Removes the feature flag for the created_by migrations.
- Adds a configuration option in IServerOption for
`ENABLE_SCHEDULED_CREATED_BY_MIGRATION` that defaults to `false`
- When set on startup, the new configuration option enables scheduling of
the two created_by migration services (features+events)
- Removes the dependency on flag provider in EventStore as it's no
longer needed
- Adds a brief description of the new configuration option in
`configuring-unleash.md`
- Sets the events created_by migration interval to 15 minutes, up from
2.
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
## About the changes
This PR provides a service that allows a scheduled function to run in a
single instance. It's currently not in use but tests show how to wrap a
function to make it single-instance:
65b7080e05/src/lib/features/scheduler/job-service.test.ts (L26-L32)
The key `'test'` is used to identify the group and most likely should
have the same name as the scheduled job.
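For readers without the test file at hand, here is a conceptual sketch of the idea only. It is not the service's API: the shared `claims` set stands in for whatever shared store the instances coordinate through, and the five-minute time bucket is an assumption made for this example.

```ts
// Wraps `job` so that, per group and time bucket, only the first caller
// that claims the bucket actually runs it. Returns true when the job ran.
const singleInstanceSketch = (
    claims: Set<string>, // stand-in for a store shared by all instances
    group: string,
    job: () => void,
    bucketMs: number = 5 * 60 * 1000, // assumed bucket size
): ((nowMs: number) => boolean) => {
    return (nowMs: number): boolean => {
        const bucket = Math.floor(nowMs / bucketMs);
        const key = `${group}:${bucket}`;
        if (claims.has(key)) {
            return false; // another instance already ran this bucket
        }
        claims.add(key);
        job();
        return true;
    };
};
```

Two instances wrapping the same job under the key `'test'` would share the claim, so only one of them runs the job in a given bucket.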
---------
Co-authored-by: Christopher Kolstad <chriswk@getunleash.io>
This PR adds a counter in Prometheus for counting the number of
"environment disabled" events we get per project. The purpose of this is
to establish a baseline for one of the "project management UI" project's
key results.
## On gauges vs counters
This PR uses a counter. Using a gauge would give you the total number of
envs disabled, not the number of disable events. The difference is
subtle, but important.
For projects that were created before the new feature, the gauge might
be appropriate. Because each disabled env would require at least one
disabled event, we can get a floor of how many events were triggered for
each project.
However, for projects created after we introduce the planned change,
we're not interested in the total envs anymore, because you can disable
a hundred envs on creation with a single action. In this case, a gauge
showing 100 disabled envs would be misleading, because it didn't take
100 events to disable them.
So the interesting metric here is how many times did you specifically
disable an environment in project settings, hence the counter.
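To make the difference concrete, here is a toy in-memory model (plain classes, not a Prometheus client; all names are made up):

```ts
// Tracks both views of the same activity for a single project.
class DisableMetricsSketch {
    // Counter view: number of disable events; it only ever goes up.
    disableEvents = 0;
    private readonly disabledEnvs = new Set<string>();

    disable(env: string): void {
        this.disableEvents += 1;
        this.disabledEnvs.add(env);
    }

    enable(env: string): void {
        this.disabledEnvs.delete(env);
    }

    // Gauge view: how many environments are disabled right now.
    get currentlyDisabled(): number {
        return this.disabledEnvs.size;
    }
}

const metrics = new DisableMetricsSketch();
metrics.disable('development');
metrics.enable('development');
metrics.disable('development');
// The gauge view now reports 1 disabled env; the counter reports 2 events.
```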
## Assumptions and future plans
To make this easier on ourselves, we make the following assumption: people
primarily disable envs **when creating a project**.
This means that there might be a few lagging indicators granting some
projects a smaller number of events than expected, but we may be able to
filter those out.
Further, if we had a metric for each project and its creation date, we
could correlate that with the metrics to answer the question "how many
envs do people disable in the first week? Two weeks? A month?". Or
worded differently: after creating a project, how long does it take for
people to configure environments?
Similarly, if we gather that data, it will also make filtering out the
number of events for projects created **after** the new changes have
been released much easier.
The good news: Because the project creation metric with dates is a
static aggregate, it can be applied at any time, even retroactively, to
see the effects.
This PR expands upon #6773 by returning the list of removed properties
in the API response. To achieve this, I added a new top-level `warnings`
key to the API response and added an `invalidContextProperties` property
under it. This is a list with the keys that were removed.
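Roughly, the shape of the change looks like the sketch below. The function name and types are hypothetical, and the real `clean-context` function may differ in detail; only the `warnings.invalidContextProperties` key comes from this PR.

```ts
type RawContext = Record<string, unknown>;

interface CleanedContext {
    context: RawContext;
    removedProperties: string[];
}

// Keeps the `properties` object and all string-valued top-level fields,
// and reports the keys of everything else.
const cleanContextSketch = (raw: RawContext): CleanedContext => {
    const context: RawContext = {};
    const removedProperties: string[] = [];
    for (const [key, value] of Object.entries(raw)) {
        if (key === 'properties' || typeof value === 'string') {
            context[key] = value;
        } else {
            removedProperties.push(key);
        }
    }
    return { context, removedProperties };
};

const cleaned = cleanContextSketch({
    appName: 'my-app',
    nested: { object: 'not a string' },
    properties: { region: 'eu' },
});
// The removed keys are surfaced under the new top-level `warnings` key.
const responseWarnings = {
    warnings: { invalidContextProperties: cleaned.removedProperties },
};
```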
## Discussion points
**Should we return the type of each removed key's value?** We could
expand upon this by also returning the type that was considered invalid
for the property, e.g. `invalidProp: 'object'`. This would give us more
information that we could display to the user. However, I'm not sure
it's useful? We already return the input as-is, so you can always
cross-check. And the only type we allow for non-`properties` top-level
properties is `string`. Does it give any useful info? I think if we want
to display this in the UI, we might be better off cross-referencing with
the input?
**Can properties be invalid for any other reason?** As far as I can
tell, that's the only reason properties can be invalid for the context.
OpenAPI will prevent you from using a type other than string for the
context fields we have defined and does not let you add non-string
properties to the `properties` object. So all we have to deal with are
top-level properties. And as long as they are strings, then they should
be valid.
**Should we instead infer the diff when creating the model?** In this
first approach, I've amended the `clean-context` function to also return
the list of context fields it has removed. The downside to this approach
is that we need to thread it through a few more hoops. Another approach
would be to compare the input context with the context used to evaluate
one of the features when we create the view model and derive the missing
keys from that. This would probably work in 98 percent of cases.
However, if your result contains no flags, then we can't calculate the
diff. But maybe that's alright? It would likely be fewer lines of code
(but might require additional testing), although picking an environment
from feels hacky.
Don't include invalid context properties in the contexts that we
evaluate.
This PR removes any non-`properties` fields that have a non-string
value.
This prevents the front end from crashing when trying to render an
object.
Expect follow-up PRs to include more warnings/diagnostics we can show to
the end user to inform them of what fields have been removed and why.
## About the changes
This PR establishes a simple yet effective mechanism to avoid DDoS
against our DB while also protecting against memory leaks.
This will enable us to release the flag `queryMissingTokens` to make our
token validation consistent across different nodes
---------
Co-authored-by: Nuno Góis <github@nunogois.com>
Converts `newContextFieldUI` release flag to
`disableShowContextFieldSelectionValues` kill switch.
The kill switch controls whether we show the value selection above the
search field when there are more than 100 values
---------
Signed-off-by: andreas-unleash <andreas@getunleash.ai>
Adds a bearer token middleware that adds support for tokens prefixed
with "Bearer" scheme. Prefixing with "Bearer" is optional and the old
way of authenticating still works, so we now support both ways.
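The core of the idea can be sketched as a single function. The case-insensitive match is an assumption made for this example, and the helper name is made up:

```ts
// Strips an optional "Bearer " scheme prefix so that downstream code sees
// the same raw token either way.
const stripBearerPrefix = (authorization: string): string =>
    authorization.replace(/^Bearer\s+/i, '');
```

A header sent the old way, without the prefix, passes through unchanged.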
Also added to our OpenAPI spec, which now displays authorization as
follows:
![image](https://github.com/Unleash/unleash/assets/455064/77b17342-2315-4c08-bf34-4655e12a1cc3)
Related to #4630. Doesn't fully close the issue as we're still using
some invalid characters for the RFC, in particular `*` and `[]`
For safety reasons this is behind a feature flag
---------
Co-authored-by: Gastón Fournier <gaston@getunleash.io>
This change fixes the OpenAPI schema to disallow non-string properties
on the top level of the context (except, of course, the `properties`
object).
This means that we'll no longer be seeing issues with rendering
invalid contexts, because we don't accept them in the first place.
This solution comes with some tradeoffs discussed in the [PR](https://github.com/Unleash/unleash/pull/6676). Following on from that, this solution isn't optimal, but it's a good stop gap. A better solution (proposed in the PR discussion) has been added as an idea for future projects.
The bulk of the discussion around the solution is included here for reference:
@kwasniew:
Was it possible to pass non string properties with our UI before?
Is there a chance that something will break after this change?
@thomasheartman:
Good question and good looking out 😄
You **could** pass non-string, top-level properties into the API before. In other words, this would be allowed:
```js
{
appName: "my-app",
nested: { object: "accepted" }
}
```
But notably, non-string values under `properties` would **not** be accepted:
```js
{
appName: "my-app",
properties: {
nested: { object: "not accepted" }
}
}
```
**However**, the values would not contribute to the evaluation of any constraints (because their type is invalid), so they would effectively be ignored.
Now, however, you'll instead get a 400 saying that the "nested" value must be a string.
I would consider this a bug fix because:
- if you sent a nested object before, it was most likely an oversight
- if you sent the nested object on purpose, expecting it to work, you would be perplexed as to why it didn't work, as the API accepted it happily
Furthermore, the UI will also tell you that the property must be a string now if you try to do it from the UI.
On the other hand, this does mean that while you could send absolute garbage in before and we would just ignore it, we don't do that anymore. This does go against how we allow you to send anything for pretty much all other objects in our API.
However, the SDK context is special. Arbitrary keys aren't ignored, they're actually part of the context itself and as such should have a valid value.
So if anything breaks, I think it breaks in a way that tells you why something wasn't working before. However, I'd love to hear your take on it and we can re-evaluate whether this is the right fix, if you think it isn't.
@kwasniew:
Coming from the https://en.wikipedia.org/wiki/Robustness_principle mindset I'm thinking if ignoring the fields that are incorrect wouldn't be a better option. So we'd accept incorrect value and drop it instead of:
* failing with client error (as this PR) or
* saving incorrect value (as previous code we had)
@thomasheartman:
Yeah, I considered that too. In fact, that was my initial idea (for the reason you stated). However, there's a couple tradeoffs here (as always):
1. If we just ignore those values, the end user doesn't know what's happened unless they go and dig through the responses. And even then, they don't necessarily know why the value is gone.
2. As mentioned, for the context, arbitrary keys can't be ignored, because we use them to build the context. In other words, they're actually invalid input.
Now, I agree that you should be liberal in what you accept and try to handle things gracefully, but that means you need to have a sensible default to fall back to. Or, to quote the Wikipedia article (selectively; with added emphasis):
> programs that receive messages should accept non-conformant input **as long as the meaning is clear**.
In this case, the meaning isn't clear when you send extra context values that aren't strings.
For instance, what's the meaning here:
```js
{
appName: "my-app",
nested: { object: "accepted", more: { further: "nesting" } }
}
```
If you were trying to use the `nested` value as an object, then that won't work. Ideally, you should be alerted.
Should we "unwind" the object and add all string keys as context values? That doesn't sound very feasible **or** necessarily like the right thing.
Did you just intend to use the `appName` and for the `nested` object to be ignored?
And it's because of this caveat that I'm not convinced just ignoring the keys is the right thing to do. Because if you do, the user never knows they were ignored or why.
----
**However**, I'd be in favor of ignoring the keys if we could **also** give the users warnings at the same time. (Something like what we do in the CR api, right? Success with warnings?)
If we can tell the user that "we ignored the `a`, `b`, and `c` keys in the context you sent because they are invalid values. Here is the result of the evaluation without taking those keys into account: [...]", then I think that's the ideal solution.
But of course, the tradeoff is that that increases the complexity of the API and the complexity of the task. It also requires UI adjustments etc. This means that it's not a simple fix anymore, but more of a mini-project.
But, in the spirit of the playground, I think it would be a worthwhile thing to do because it helps people learn and understand how Unleash works.
Via the API you can currently create gradualRollout strategies without
any parameters set. When visiting the UI afterwards, you cannot edit
such a strategy, because the UI reads the parameter list from the
database, sees that some parameters are required, and refuses to accept
the data. This PR adds defaults for gradualRollout strategies created
from the API, making sure gradual rollout strategies always have
`rollout`, `groupId` and `stickiness` set.
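A minimal sketch of the defaulting logic. The default values used here (`'100'`, `'default'`, and the feature name as `groupId`) are assumptions for illustration; this description does not state which values the PR uses.

```ts
type StrategyParameters = Record<string, string>;

// Fills in the three required parameters when they are missing.
// Explicitly provided values always win over the defaults.
const applyGradualRolloutDefaults = (
    featureName: string,
    parameters: StrategyParameters = {},
): StrategyParameters => ({
    rollout: '100', // assumed default
    stickiness: 'default', // assumed default
    groupId: featureName, // assumed default
    ...parameters,
});
```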
Provides a store method for retrieving traffic usage data based on a
period parameter, and UI + UI hook with the new chart for displaying
traffic usage data spread out over a selectable month.
![Skjermbilde 2024-03-21 kl 12 40
38](https://github.com/Unleash/unleash/assets/707867/539c6c98-b6f6-488a-97fb-baf4fccec687)
In this PR we copied and adapted a plugin written by DX for highlighting
a column in the chart:
![image](https://github.com/Unleash/unleash/assets/707867/70532b22-44ed-44c0-a9b4-75f65ed6a63d)
There are some minor improvements planned which will come in a separate
PR: reversing the order in the legend and tooltip so the colors go from
light to dark, and adding a month sum below the legend
## Discussion points
- Should any of this be extracted as a separate reusable component?
---------
Co-authored-by: Nuno Góis <github@nunogois.com>
## About the changes
There seems to be a typo in the authorization header. We're keeping the
old typo as preferred just in case, but if not present we'll default to
the authorization header (not authorisation).
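In other words, the lookup order is roughly as follows (a sketch with a made-up helper name and a simplified header type):

```ts
type HeaderBag = Record<string, string | undefined>;

// Prefer the legacy misspelled header if present, otherwise fall back to
// the correctly spelled one.
const readAuthorizationHeader = (headers: HeaderBag): string | undefined =>
    headers['authorisation'] ?? headers['authorization'];
```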
Not sure about the impact of this bug, as all registrations might be
using default project.
The changes to arbitraries here are to make TypeScript agree with our
schema types. It seems like somewhere between 4.8.4 and 5.4.2, TypeScript
got stricter.
## About the changes
We see some logs with: `Failed to store events: Error: The query is
empty` which suggests we're not sending events to batchStore. This will
help us confirm that and will give us better insights
An entrypoint for enterprise to register a hook which will be called
before the database and scheduler services are torn down. That way
enterprise can also perform graceful shutdown.
There was no need to join the entire metrics table, as it is a huge
table. We only needed all combinations of app_name, environment, and
feature_name. The new query retrieves all this data, which will then be
joined into the main query.
To check that users do indeed have permissions to update the roles from
project-service, we've been depending on req.user.id.
We had one error on Friday March 8th, where we managed to send
undefined/null to a method that requires a number. This PR assumes that
if we have an API token, and we have admin permissions and userId is not
set we're a legacy admin token.
It uses the util method for extractUserId(req: IAuthRequest | IApiRequest), so if we've passed through the apiTokenMiddleware first, we'll have userId -42, if we haven't, we'll get -1337.
Now frontend API requests will be counted separately under
getAllByfrontend. We are already tracking new FE db calls, so we can
build a Grafana dashboard.
https://linear.app/unleash/issue/2-2022/improve-actions-validation
Improves our current actions form validation.
Empty actions are now ignored on the payload and we get errors in
actions where any of the required fields are empty.
Also refactored our current actions into a constant map that can be
shared across frontend and backend.