1
0
mirror of https://github.com/Unleash/unleash.git synced 2025-01-31 00:16:47 +01:00
Commit Graph

31 Commits

Author SHA1 Message Date
Thomas Heartman
f15bcdc2a6
chore: send prometheus metrics when someone tries to exceed resource limits (#7617)
This PR adds prometheus metrics for when users attempt to exceed the
limits for a given resource.

The implementation sets up a second function exported from the
ExceedsLimitError file that records metrics and then throws the error.
This could also be a static method on the class, but I'm not sure that'd
be better.
2024-07-18 13:35:45 +02:00
Simon Hornby
2e205fc14e
chore: make sdk metrics snake case (#7547) 2024-07-05 12:29:00 +02:00
Simon Hornby
30073d527a
feat: extended SDK metrics (#7527)
This adds an extended metrics format to the metrics ingested by Unleash
and sent by running SDKs in the wild. Notably, we don't store this
information anywhere new in this PR, this is just streamed out to
Victoria metrics - the point of this project is insight, not analysis.

Two things to look out for in this PR:

- I've chosen to take extend the registration event and also send that
when we receive metrics. This means that the new data is received on
startup and on heartbeat. This takes us in the direction of collapsing
these two calls into one at a later point
- I've wrapped the existing metrics events in some "type safety", it
ain't much because we have 0 type safety on the event emitter so this
also has some if checks that look funny in TS that actually check if the
data shape is correct. Existing tests that check this are more or less
preserved
2024-07-04 08:51:27 +02:00
Mateusz Kwasniewski
3a3b6a29ff
feat: lifecycle stage entered counter (#7449) 2024-06-25 14:40:16 +02:00
Mateusz Kwasniewski
c3fa468a9d
refactor: lifecycle stage duration outside instance stats (#7442) 2024-06-25 11:22:26 +02:00
Jaanus Sellin
d17ae37800
feat: now CLIENT_METRICS event will be emitted with new structure (#7210)
1. CLIENT_METRICS event will be emitted with new structure
2. CLIENT_METRICS event will be emitted from bulkMetrics endpoint
2024-05-31 12:40:46 +03:00
Jaanus Sellin
2fb95339ef
chore: change toggle to flag #3 (#7101) 2024-05-22 09:58:53 +03:00
Jaanus Sellin
8a2b977ac0
fix: fix prometheus metrics for lifecycle (#7030)
getAll was not properly tested, added test and fixed query. Now metrics
should come up.
2024-05-10 11:50:47 +03:00
Jaanus Sellin
cd49ae2a26
feat: add project id to prometheus and feature flag (#7008)
Now we are also sending project id to prometheus, also querying from
database. This sets us up for grafana dashboard.
Also put the metrics behind flag, just incase it causes cpu/memory
issues.
2024-05-08 15:19:23 +03:00
Jaanus Sellin
02440dfed2
feat: duration in stage, add feature lifecycle prometheus metrics (#6973)
Introduce a new concept. Duration in stage.
Also add it into prometheus metric.
2024-05-08 11:33:51 +03:00
Thomas Heartman
cfd9e4894a
chore: Establish a baseline for the number of envs disabled per project (#6807)
This PR adds a counter in Prometheus for counting the number of
"environment disabled" events we get per project. The purpose of this is
to establish a baseline for one of the "project management UI" project's
key results.

## On gauges vs counters

This PR uses a counter. Using a gauge would give you the total number of
envs disabled, not the number of disable events. The difference is
subtle, but important.

For projects that were created before the new feature, the gauge might
be appropriate. Because each disabled env would require at least one
disabled event, we can get a floor of how many events were triggered for
each project.

However, for projects created after we introduce the planned change,
we're not interested in the total envs anymore, because you can disable
a hundred envs on creation with a single action. In this case, a gauge
showing 100 disabled envs would be misleading, because it didn't take
100 events to disable them.

So the interesting metric here is how many times did you specifically
disable an environment in project settings, hence the counter.

## Assumptions and future plans

To make this easier on ourselves, we make the follow assumption: people
primarily disable envs **when creating a project**.

This means that there might be a few lagging indicators granting some
projects a smaller number of events than expected, but we may be able to
filter those out.

Further, if we had a metric for each project and its creation date, we
could correlate that with the metrics to answer the question "how many
envs do people disable in the first week? Two weeks? A month?". Or
worded differently: after creating a project, how long does it take for
people to configure environments?

Similarly, if we gather that data, it will also make filtering out the
number of events for projects created **after** the new changes have
been released much easier.

The good news: Because the project creation metric with dates is a
static aggregate, it can be applied at any time, even retroactively, to
see the effects.
2024-04-10 08:49:15 +02:00
Jaanus Sellin
d3847fd8ee
feat: collect prometheus data about archived features (#6728) 2024-03-28 13:40:30 +02:00
Christopher Kolstad
53354224fc
chore: Bump biome and configure husky (#6589)
Upgrades biome to 1.6.1, and updates husky pre-commit hook.

Most changes here are making type imports explicit.
2024-03-18 13:58:05 +01:00
Jaanus Sellin
2a57acca41
feat: start monitoring total time to update cache (#6517) 2024-03-12 14:27:04 +02:00
Jaanus Sellin
b7915171ff
feat: start tracking operation duration (#6514) 2024-03-12 12:30:30 +02:00
Gastón Fournier
fa3352786a
chore: reimplementation of app stats (#6155)
## About the changes
App stats is mainly used to cap the number of applications reported to
Unleash based on the last 7 days information:
cc2ccb1134/src/lib/middleware/response-time-metrics.ts (L24-L28)

Instead of getting all stats, just calculate appCount statistics

Use scheduler service instead of setInterval
2024-02-08 17:15:42 +01:00
Christopher Kolstad
5a3bb1ffc3
Biome1.5.1 (#5867)
Lots of work here, mostly because I didn't want to turn off the
`noImplicitAnyLet` lint. This PR tries its best to type all the untyped
lets biome complained about (Don't ask me how many hours that took or
how many lints that was >200...), which in the future will force test
authors to actually type their global variables setup in `beforeAll`.

---------

Co-authored-by: Gastón Fournier <gaston@getunleash.io>
2024-01-12 09:25:59 +00:00
Gard Rimestad
24b202ef0b
feat: include environment type label in feature_toggle_update metrics (#5809)
This is needed in order to identify what type of an environment a toggle
is updated in. This can be test, development, pre-production or
production.
2024-01-09 16:33:00 +01:00
Christopher Kolstad
1edd73db45
feat: feature changes counted in new table (#4958)
As part of more telemetry on the usage of Unleash. 

This PR adds a new `stat_` prefixed table as well as a trigger on the
events table trigger on each insert to increment a counter per
environment per day.

The trigger will trigger on every insert into the events base, but will
filter and only increment the counter for events that actually have the
environment set. (there are events, like user-created, that does not
relate to a specific environment).

Bit wary on this, but since we truncate down to row per (day,
environment) combo, finding conflict and incrementing shouldn't take too
long here.

@ivarconr was it something like this you were considering?
2023-10-10 12:32:23 +02:00
Tymoteusz Czech
2c826bdbba
feat: Add active users statistics to metrics (#4674)
## About the changes
- `getActiveUsers` is using multiple stores, so it is refactored into
read-model
- Refactored Instance stats service into `features` to co-locate related
code

Closes https://linear.app/unleash/issue/UNL-230/active-users-prometheus

### Important files
`src/lib/features/instance-stats/getActiveUsers.ts`


## Discussion points
`getActiveUsers` is coded less _class-based_ then previous similar
read-models. In one file instead of 3 (read-model interface, fake read
model, sql read model). I find types and functions way more readable,
but I'm ready to refactor it to interfaces and classes if consistency is
more important.
2023-09-18 15:05:17 +02:00
Nuno Góis
555b27a653
feat: add prom metric for total custom root roles (#4435)
https://linear.app/unleash/issue/2-1293/label-our-metrics-about-roles-to-include-also-if-the-role-is-a-root

Adds a Prometheus metric for total custom root roles. Also adds it to
the instance telemetry collection.

Q: Should we use a `labeledRoles` kind of metric instead, similar to
what we're doing for `clientApps` and their ranges?
2023-08-07 14:59:29 +01:00
Gastón Fournier
ea9bf7f447
chore: add linter rules for regexp (#3500)
## About the changes
Add linter rules for regexp security vulnerabilities

Commit 1c5d54c76e [fails due to
regexp/no-super-linear-backtracking](https://github.com/Unleash/unleash/actions/runs/4668430535/jobs/8265506170#step:5:37)
as reported here:
https://github.com/Unleash/unleash/security/code-scanning/1


[0127d1a](0127d1a746)
fixes the issues and warnings by running `yarn lint --fix`
2023-04-17 07:11:22 +00:00
Gastón Fournier
2979f21631
feat: expose number of registered applications metric (#2692)
## About the changes
This metric will expose an aggregated view of how many client
applications are registered in Unleash. Since applications are ephemeral
we are exposing this metric in different time windows based on when the
application was last seen.

The caveat is that we issue a database query for each new range we want
to add. Hopefully, this should not be a problem because:
a) the amount of ranges we'd expose is small and unlikely to grow
b) this is currently updated at startup time and even if we update it on
a scheduled basis the refresh rate will be rather sparse

## Sample data
This is how metrics will look like
```
# HELP client_apps_total Number of registered client apps aggregated by range by last seen
# TYPE client_apps_total gauge
client_apps_total{range="allTime"} 3
client_apps_total{range="30d"} 3
client_apps_total{range="7d"} 2
```
2022-12-16 11:16:51 +00:00
Ivar Conradi Østhus
cf4fc2303b
Feat/stats service (#2211)
Introduces an instance stats service exposing usage metrics of the Unleash installation.
2022-10-25 13:10:27 +02:00
Ivar Conradi Østhus
5141e77bce
fix: add appName to http response time metrics (#2117) 2022-09-30 15:28:50 +02:00
Gard Rimestad
9aa1c7aeb0
fix: client registration events are on eventStore (#2093)
Client registration events are on eventStore and not on eventBus. This
change makes us have sdk name and version metrics in unleash.
2022-09-27 11:06:06 +02:00
Ivar Conradi Østhus
a7ed7557ec
fix: add env and project labels to feature updated metrics. (#2043) 2022-09-08 11:01:27 +02:00
Christopher Kolstad
5bacc7ba36
task: add sdk version metric (#1828)
* task: add sdk version metric
2022-07-22 09:00:22 +00:00
Ivar Conradi Østhus
4a9939ccb1 feat: remove old metrics service 2021-12-10 09:31:54 +01:00
Ivar Conradi Østhus
d8478dd928
feat: clean up events (#1089)
Co-authored-by: Christopher Kolstad <chriswk@getunleash.ai>
2021-11-12 13:15:51 +01:00
Christopher Kolstad
ff7be7696c
fix: Stores as typescript and with interfaces. (#902)
Co-authored-by: Ivar Conradi Østhus <ivarconr@gmail.com>
2021-08-12 15:04:37 +02:00