feat: add middleware backend analytics by alanpeixinho · Pull Request #1933 · kernelci/dashboard

alanpeixinho · 2026-06-09T19:58:40Z

What it is

Adds a middleware with custom metrics that log anonymous visitor statistics for endpoint access.

How to test

Start a docker compose dev stack.
Check the Prometheus metrics port (default 8081) to confirm that the metrics are being stored (mainly unique visitors and endpoints_by_client).
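
For the second step, a small helper can pull out just the new samples instead of reading the whole payload. This is only a sketch: the host (`localhost`) is an assumption, the port and the `/metrics/` path come from this PR, and `select_samples` and `fetch_metrics` are hypothetical names, not part of the PR.

```python
from urllib.request import urlopen

# Assumption: the dev stack exposes metrics on localhost at the default port.
METRICS_URL = "http://localhost:8081/metrics/"


def select_samples(exposition_text: str, name_prefixes: tuple[str, ...]) -> list[str]:
    """Return sample lines whose metric name starts with one of the prefixes.

    Lines starting with '#' (HELP/TYPE comments) and blank lines are skipped.
    """
    return [
        line
        for line in exposition_text.splitlines()
        if line and not line.startswith("#") and line.startswith(name_prefixes)
    ]


def fetch_metrics(url: str = METRICS_URL) -> str:
    """Download the Prometheus text exposition from the metrics endpoint."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


# Usage (requires the dev stack to be running):
#   for sample in select_samples(fetch_metrics(), ("dashboard_unique_visitors",)):
#       print(sample)
```

An empty result after a few requests to the backend would suggest the middleware is not registered or the counters are not being incremented.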

Closes #1928 #1929

bhcopeland · 2026-06-10T06:40:45Z

Not familiar with GoatCounter. Looking at Google and screenshots, it seems to do the job, but my worry is that it's another application layer we have to host. Would it make sense to export this to Prometheus instead? Or look into a module like GoatCounter that has a Prometheus endpoint we can scrape into a dashboard. A good use case for Prometheus is that it's a single endpoint for our graphs.

alanpeixinho · 2026-06-10T13:57:50Z

Hi @bhcopeland, I changed it to a draft because it is more of a discussion starter than a production-ready PR.
But yes, we are discussing the pros/cons of just trying to track this ourselves and send to Prometheus. The biggest problem we see on this is to make sure we are keeping .
But you made a good point: we might be able to capture analytics with a privacy-compliant library and store them in Prometheus, getting the best of both worlds. I will take a look at this.

alanpeixinho · 2026-06-10T20:56:58Z

I can see two paths we could take here:

We could implement an exporter that sends the analytics SQLite metrics to Prometheus.
Or, since we already have a Grafana dashboard, we could limit our analytics to the information available in the nginx logs (this might give us at least unique users, which we could derive in a similar fashion to GC, plus pages, browsers, OS and origin).

What do you think, @bhcopeland?

alanpeixinho · 2026-06-11T21:10:47Z

I changed this to a simple custom metric on our existing Prometheus logger. We get less rich information than with a proper analytics tool, but it may be enough for our needs, and it is properly integrated with our Grafana metrics.

alanpeixinho · 2026-06-12T18:57:36Z

A sample screenshot of the dashboard for the new metrics (extending the current Grafana dashboard).
@bhcopeland @tales-aparecida do you have any suggestions here?

tales-aparecida · 2026-06-12T19:25:35Z

I think this is doing exactly what I had in mind! Looking great

mentonin

Overall this looks pretty good and fairly comprehensive for backend visitor metrics; my comments are mostly nits or suggestions.

mentonin · 2026-06-12T17:32:50Z

    "kernelCI_app.middleware.logServerErrorMiddleware.LogServerErrorMiddleware",
+    "kernelCI_app.middleware.backendRequestMetricsMiddleware.BackendRequestMetricsMiddleware",


We should probably use a module and re-export the middleware classes to improve readability.

I prefer explicitly naming the class; module re-exporting in Python can lead to far more confusion.

I don't see how kernelCI_app.middleware.backendRequestMetricsMiddleware.BackendRequestMetricsMiddleware is better than kernelCI_app.middleware.BackendRequestMetricsMiddleware, but it's not a blocker.

Because it is explicit. Changing module exports at runtime introduces quite a lot of noise for a simple string. Or do you think differently?

mentonin · 2026-06-12T17:34:06Z

+UNIQUE_VISITOR_TTL_SECONDS = 48 * 60 * 60
+UNIQUE_VISITOR_SALT_BYTES = 32


Why 2 days?

I mostly overshot the necessary time (maybe by too much), just to be sure we don't lose cached values.

mentonin · 2026-06-12T17:59:00Z

+    [
+        "endpoint",
+        "method",
+        "status_class",
+        "browser",
+        "os",
+        "device",
+        "referrer_domain",
+    ],


This may result in very high cardinality

Do you have any suggestions?

I think we can test it and deal with it later when/if issues arise.
For sanity checks, we could look into the cardinality and relevance of each item and see if we should drop anything:

endpoint: around 30?, grows with API surface, very relevant, could be filtered or processed into smaller bins if needed

method: 9, I think only one value is valid for most endpoints, which endpoints need this? Are invalid requests filtered out before or after this point? Could bloat cardinality unnecessarily if we track invalid requests.

status_class: 5, relevant for tracking availability and server errors

browser: 8. Relevant for tracking build targets and separating browsers, bots and non-browser consumers (kci-dev). We could reduce browser granularity, or track specific tools (kci-dev sets a specific user-agent).

os: 7, I don't think it is very relevant besides device type

device: 3 (no "unknown"), very relevant

referrer_domain: virtually infinite, a handful in practice. I think we could reduce to internal, direct, external and maybe add specific tracking of other sources later?

Potential cardinality of over 2 million with 10 referrers, 75600 if we only have one method per endpoint and 3 referrers

I think the soundest thing to do here is to start collecting some information. That way we get a proper idea of which columns might show problems, and can then act on them.
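
The two totals quoted in this thread are simply the product of the per-label estimates. The short check below reproduces them; the counts are the reviewer's rough estimates, not measured values.

```python
from math import prod

# Per-label value counts as estimated in the review comment.
label_values = {
    "endpoint": 30,
    "method": 9,
    "status_class": 5,
    "browser": 8,
    "os": 7,
    "device": 3,
    "referrer_domain": 10,
}

# Upper bound: every combination of label values occurs.
worst_case = prod(label_values.values())

# Reduced case: one method per endpoint and three referrer buckets.
reduced = prod({**label_values, "method": 1, "referrer_domain": 3}.values())

print(worst_case)  # 2268000
print(reduced)     # 75600
```

These are upper bounds. A Prometheus counter only creates a time series for a label combination that is actually observed, so the real series count should be much lower, which is what collecting data first would show.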

mentonin · 2026-06-12T18:01:22Z

+DASHBOARD_UNIQUE_VISITORS_TOTAL = Counter(
+    "dashboard_unique_visitors_total",
+    "Daily unique backend visitors",
+)
+
+DASHBOARD_UNIQUE_VISITORS_BY_ENDPOINT_TOTAL = Counter(
+    "dashboard_unique_visitors_by_endpoint_total",
+    "Daily unique backend visitors deduplicated per endpoint by rotated Redis salt",
+    ["endpoint"],
+)
+


I think the name might be confusing. I associate "visitors" with website access, while this tracks api usage. I would rather have a name using "backend" or "api", or maybe replacing "visitors" with something like "consumers"

@tales-aparecida do you have any thoughts here? Do you think using "consumers" to address what we have been calling "visitors" would be clearer for the intended audience?

I'll say: copy the names from well-established frameworks

From what I see, "visitors" is by far the most used term here, even for backend-only tracking.
"Client" is sometimes used as well, and could be an alternative.

mentonin · 2026-06-12T18:05:06Z

+def record_client(**kwargs) -> None:
+    DASHBOARD_BACKEND_REQUESTS_BY_CLIENT.labels(**kwargs).inc()


I don't like kwargs here: the dict keys are known and must match the DASHBOARD_BACKEND_REQUESTS_BY_CLIENT labels.
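
One possible shape for this, sketched with keyword-only parameters so that a missing or misspelled label fails at the call site. The label set is assumed to be the list quoted earlier in this review, and `StubCounter` is a hypothetical stand-in for the real Prometheus counter so the sketch runs without dependencies.

```python
from collections import defaultdict

# Assumption: same labels as the list quoted earlier in this review.
CLIENT_LABELS = (
    "endpoint", "method", "status_class", "browser", "os", "device", "referrer_domain",
)


class _Child:
    """One labelled series of the stub counter."""

    def __init__(self, values, key):
        self._values = values
        self._key = key

    def inc(self, amount=1):
        self._values[self._key] += amount


class StubCounter:
    """Hypothetical stand-in for a labelled Prometheus counter."""

    def __init__(self, labelnames):
        self.labelnames = tuple(labelnames)
        self.values = defaultdict(int)

    def labels(self, **labels):
        # Reject missing or unexpected label names.
        if set(labels) != set(self.labelnames):
            raise ValueError(f"expected labels {self.labelnames}, got {tuple(labels)}")
        return _Child(self.values, tuple(labels[name] for name in self.labelnames))


DASHBOARD_BACKEND_REQUESTS_BY_CLIENT = StubCounter(CLIENT_LABELS)


def record_client(
    *,
    endpoint: str,
    method: str,
    status_class: str,
    browser: str,
    os: str,
    device: str,
    referrer_domain: str,
) -> None:
    """Increment the per-client request counter with an explicit label set."""
    DASHBOARD_BACKEND_REQUESTS_BY_CLIENT.labels(
        endpoint=endpoint,
        method=method,
        status_class=status_class,
        browser=browser,
        os=os,
        device=device,
        referrer_domain=referrer_domain,
    ).inc()
```

With this signature, type checkers and IDEs can see the expected labels, and a call with a wrong keyword raises `TypeError` before the counter is touched.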

mentonin · 2026-06-12T18:49:56Z

+    if "mobile" in normalized_user_agent or "iphone" in normalized_user_agent:
+        return "mobile"
+    if "android" in normalized_user_agent:
+        return "mobile"


Suggested change

-    if "mobile" in normalized_user_agent or "iphone" in normalized_user_agent:
-        return "mobile"
-    if "android" in normalized_user_agent:
-        return "mobile"
+    if any(s in normalized_user_agent for s in ["mobile", "iphone", "android"]):
+        return "mobile"

This is clearer in my opinion, and keeps each specific return value easily traceable

I will disagree on this one. Although I like keeping the code more declarative, a nested loop might be less readable.

I would still prefer a single return "mobile" exit point
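
A sketch of what the single-exit version could look like. Moving the markers into a named constant keeps the condition short, which may address the readability concern. The `"desktop"` fallback and the `MOBILE_MARKERS` name are assumptions for illustration; the PR's actual return values for other devices are not shown in this thread.

```python
# Substrings that mark a user agent as mobile (taken from the diff above).
MOBILE_MARKERS = ("mobile", "iphone", "android")


def get_device(user_agent: str) -> str:
    """Classify a user agent string with one return statement per device type."""
    normalized_user_agent = user_agent.lower()
    if any(marker in normalized_user_agent for marker in MOBILE_MARKERS):
        return "mobile"
    # Assumption: everything else is treated as desktop in this sketch.
    return "desktop"
```

Adding a new mobile marker then means extending the tuple rather than adding another branch.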

mentonin · 2026-06-12T18:54:55Z

+    if "curl/" in normalized_user_agent or "wget/" in normalized_user_agent:
+        return "HTTP Client"
+    if "python-requests/" in normalized_user_agent:
+        return "HTTP Client"


See the comment in get_device.

mentonin · 2026-06-12T19:32:21Z

        proxy_send_timeout 240s;
        send_timeout 240s;
    }
+


Formatting change for an otherwise unchanged file

mentonin · 2026-06-15T18:18:51Z

 - **Metrics Path**: `/metrics/`
 - **Scrape Interval**: 15 seconds
+
+## Client Analytics & Privacy


I believe we must move this into a proper Privacy Policy file and link to it from the frontend for full compliance. I suggest going through with these changes and tracking the Privacy Policy as a new issue.

Expanding on this, we might need to include contact information in the PRIVACY.md doc. Which contact should we include here, @bhcopeland?

There should be a tracker for this already. I vaguely recall that we had planned to use the same framework the Linux Foundation website is using.

alanpeixinho marked this pull request as draft June 10, 2026 13:47

alanpeixinho force-pushed the feat/user-metrics branch from a1021c6 to 3f808e6 Compare June 11, 2026 21:06

alanpeixinho changed the title ~~feat: add self-hosted GoatCounter analytics~~ feat: add middleware backend analytics Jun 11, 2026

mentonin reviewed Jun 12, 2026

alanpeixinho force-pushed the feat/user-metrics branch 2 times, most recently from 70a06a7 to ee0f4ce Compare June 15, 2026 15:35

alanpeixinho requested a review from mentonin June 15, 2026 15:40

alanpeixinho marked this pull request as ready for review June 15, 2026 17:22

mentonin reviewed Jun 15, 2026

feat: add backend middleware analytics for prometheus

477a759

* Add custom metrics to estimate unique visitors.
* Add custom metrics to estimate useful client information (browser, os, device)
* Add PRIVACY.md file

Signed-off-by: Alan Peixinho <alan.peixinho@profusion.mobi>

alanpeixinho force-pushed the feat/user-metrics branch from ee0f4ce to 477a759 Compare June 15, 2026 21:13

alanpeixinho added the Backend and Metrics labels Jun 16, 2026

feat: add backend middleware analytics for prometheus

124516d

* Add custom metrics to estimate unique visitors.
* Add custom metrics to estimate useful client information (browser, os, device)

Signed-off-by: Alan Peixinho <alan.peixinho@profusion.mobi>