Commit ca39956
feat(webapp): add per-worker Node.js heap metrics (#3437)
## Summary
Adds direct V8 heap and process-memory gauges to the webapp's
OpenTelemetry meter. The webapp already exports per-cluster-worker
Node.js runtime metrics (event-loop lag / utilization, active handles,
active requests, libuv threadpool size) via a custom meter under the
`trigger.dev` scope. Heap and memory were missing; this PR adds them
alongside, in the same observable-batch pattern.
## New gauges
| Metric | Source | Unit |
| --- | --- | --- |
| `nodejs.memory.heap.used` | `process.memoryUsage().heapUsed` | bytes |
| `nodejs.memory.heap.total` | `process.memoryUsage().heapTotal` | bytes |
| `nodejs.memory.heap.limit` | `v8.getHeapStatistics().heap_size_limit` | bytes |
| `nodejs.memory.external` | `process.memoryUsage().external` | bytes |
| `nodejs.memory.array_buffers` | `process.memoryUsage().arrayBuffers` | bytes |
| `nodejs.memory.rss` | `process.memoryUsage().rss` | bytes |

Gated by the existing `INTERNAL_OTEL_NODEJS_METRICS_ENABLED` flag, same
as the adjacent event-loop / handle gauges. Zero overhead when disabled.
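
A minimal sketch of how the six gauges could be read and registered in one batch callback. This is not the code from the diff: `MeterLike`, `BatchResultLike`, `readMemorySample`, and `registerMemoryGauges` are hypothetical names, and the interfaces are local stand-ins shaped after the OpenTelemetry `createObservableGauge` / `addBatchObservableCallback` calls so the example runs without extra packages. The `enabled` argument stands for the already-parsed `INTERNAL_OTEL_NODEJS_METRICS_ENABLED` flag; how the webapp parses it is not shown here.

```typescript
import * as v8 from "node:v8";

// Local stand-ins for the slice of the OpenTelemetry Meter API used below
// (assumption: the real meter is passed in by the caller).
type GaugeLike = object;
interface BatchResultLike {
  observe(gauge: GaugeLike, value: number): void;
}
interface MeterLike {
  createObservableGauge(name: string, options?: { unit?: string }): GaugeLike;
  addBatchObservableCallback(
    callback: (result: BatchResultLike) => void,
    gauges: GaugeLike[],
  ): void;
}

// Metric names as listed in the "New gauges" table.
const GAUGE_NAMES = [
  "nodejs.memory.heap.used",
  "nodejs.memory.heap.total",
  "nodejs.memory.heap.limit",
  "nodejs.memory.external",
  "nodejs.memory.array_buffers",
  "nodejs.memory.rss",
] as const;
type GaugeName = (typeof GAUGE_NAMES)[number];

// Read every value once per sample: one memoryUsage() call, one V8 call.
function readMemorySample(): Record<GaugeName, number> {
  const mem = process.memoryUsage();
  const heap = v8.getHeapStatistics();
  return {
    "nodejs.memory.heap.used": mem.heapUsed,
    "nodejs.memory.heap.total": mem.heapTotal,
    "nodejs.memory.heap.limit": heap.heap_size_limit,
    "nodejs.memory.external": mem.external,
    "nodejs.memory.array_buffers": mem.arrayBuffers,
    "nodejs.memory.rss": mem.rss,
  };
}

function registerMemoryGauges(meter: MeterLike, enabled: boolean): void {
  // When the flag is off nothing is registered, so nothing is ever sampled.
  if (!enabled) return;

  const gauges = GAUGE_NAMES.map((name) => ({
    name,
    gauge: meter.createObservableGauge(name, { unit: "bytes" }),
  }));

  meter.addBatchObservableCallback(
    (result) => {
      const sample = readMemorySample();
      for (const { name, gauge } of gauges) {
        result.observe(gauge, sample[name]);
      }
    },
    gauges.map((entry) => entry.gauge),
  );
}
```

Reading all values inside a single batch callback keeps the six gauges consistent with each other within one export interval, since they come from the same pair of calls.
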
## Why
`@opentelemetry/host-metrics` publishes `process.memory.usage`, which is
RSS only. RSS is the sum of V8 heap, external memory (Buffers, etc.),
native code, and thread stacks. Without a direct heap metric it is not
possible to size the V8 heap cap (`--max-old-space-size`) from metrics
alone, because RSS overstates heap by the external + native footprint. A
worker can have a 4 GB RSS made up of a 2.5 GB heap and 1.5 GB of
buffers; only the heap is bounded by `--max-old-space-size`, the buffers
are not.
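
The arithmetic behind that example, as a small sketch. `splitRss` and `MemorySplit` are hypothetical names, the figures are the illustrative ones from the paragraph above (treated as GiB for round numbers), and everything that is not heap is lumped into one "non-heap" bucket.

```typescript
const GIB = 1024 ** 3;

interface MemorySplit {
  nonHeapBytes: number; // buffers, native code, stacks: not bounded by the heap cap
  heapShareOfRss: number; // fraction of RSS that the V8 heap accounts for
}

// Separate the part of RSS that the V8 heap cap can bound from the rest.
function splitRss(rssBytes: number, heapBytes: number): MemorySplit {
  return {
    nonHeapBytes: rssBytes - heapBytes,
    heapShareOfRss: heapBytes / rssBytes,
  };
}

const split = splitRss(4 * GIB, 2.5 * GIB);
console.log(split.nonHeapBytes / GIB); // 1.5
console.log(split.heapShareOfRss); // 0.625
```

Sizing the heap cap from RSS alone would budget for the full 4 GB, although only 62.5% of it is heap in this example.
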
`nodejs.memory.heap.limit` also surfaces the configured
`--max-old-space-size` (read from
`v8.getHeapStatistics().heap_size_limit`), so operators can see the
current limit in the same dashboard as actual usage rather than
cross-referencing container environment variables.
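
With usage and limit exported side by side, a dashboard can derive heap utilization directly. A rough sketch of that ratio, computed locally; `heapUtilization` is a hypothetical helper, not something the PR adds. Note that `heap_size_limit` is V8's overall heap limit, so the reported value may not equal the `--max-old-space-size` flag converted to bytes exactly.

```typescript
import * as v8 from "node:v8";

// Fraction of the V8 heap limit currently in use (0 when the limit is unknown).
function heapUtilization(usedBytes: number, limitBytes: number): number {
  return limitBytes > 0 ? usedBytes / limitBytes : 0;
}

const stats = v8.getHeapStatistics();
const limitMiB = stats.heap_size_limit / (1024 * 1024);
const utilization = heapUtilization(
  process.memoryUsage().heapUsed,
  stats.heap_size_limit,
);

console.log(
  `heap limit: ${limitMiB.toFixed(0)} MiB, utilization: ${(utilization * 100).toFixed(1)}%`,
);
```
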
## Risk
Minimal. Observable gauges are sampled at the configured metric-export
interval. `v8.getHeapStatistics()` and `process.memoryUsage()` are each
microsecond-level calls, and six gauges are added to the same batch
callback that already reads ~20 other Node.js runtime values per sample.
Same registration pattern as the existing event-loop metrics in the
file.
## Test plan
- [ ] Deploy and confirm the six new gauges appear at the configured
exporter
- [ ] In cluster mode, confirm per-worker granularity (one series per
cluster worker, tagged by `process.executable.name` /
`service.instance.id`)
- [ ] Confirm `nodejs.memory.heap.limit` reports the configured
`--max-old-space-size` value in bytes

1 parent 41434b5 commit ca39956
2 files changed
Lines changed: 69 additions & 0 deletions