RuntimeMetrics in .NET 9

程序员有二十年 2024-09-16 12:58:49
Intro

.NET 9 introduces RuntimeMetrics. It is built on System.Diagnostics.Metrics.Meter, the metrics API in .NET, and produces metrics data covering CPU, memory, GC, JIT, threads and more.
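Because the data is published through an ordinary Meter, it can also be read without any exporter. The following is a minimal sketch, assuming the app runs on .NET 9 or later (on earlier runtimes no "System.Runtime" instruments are published and nothing is printed). The variable `measurements` and the helper `Describe` are illustrative names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;
using System.Threading;

int measurements = 0;

using var listener = new MeterListener();

// Subscribe only to instruments that belong to the built-in runtime meter.
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "System.Runtime")
    {
        l.EnableMeasurementEvents(instrument);
    }
};

// The runtime instruments report long or double values, so register both callbacks.
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
{
    Interlocked.Increment(ref measurements);
    Console.WriteLine($"{instrument.Name} {Describe(tags)}= {value} {instrument.Unit}");
});
listener.SetMeasurementEventCallback<double>((instrument, value, tags, state) =>
{
    Interlocked.Increment(ref measurements);
    Console.WriteLine($"{instrument.Name} {Describe(tags)}= {value} {instrument.Unit}");
});

listener.Start();

// Observable instruments only report when they are polled.
listener.RecordObservableInstruments();

Console.WriteLine($"received {measurements} measurements");

// Formats the attributes (for example gc.heap.generation or cpu.mode) of one measurement.
static string Describe(ReadOnlySpan<KeyValuePair<string, object?>> tags)
{
    var text = "";
    foreach (var tag in tags)
    {
        text += $"[{tag.Key}={tag.Value}] ";
    }
    return text;
}
```

An exporter such as the OpenTelemetry one used in the sample does essentially the same polling on a timer and adds aggregation on top.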

Sample

Let's look at a simple example together with OpenTelemetry. The sample references OpenTelemetry.Exporter.Console, which exports the metrics data straight to the console.

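using OpenTelemetry;
using OpenTelemetry.Metrics;

// Subscribe to the built-in "System.Runtime" meter and print its metrics to the console.
using var _ = Sdk.CreateMeterProviderBuilder()
    .AddMeter("System.Runtime")
    .AddConsoleExporter()
    .Build();

// Force a GC every 10 seconds so that the GC metrics change between exports.
while (true)
{
    await Task.Delay(TimeSpan.FromSeconds(10));
    GC.Collect();
}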

The output looks like this:

Metric Name: dotnet.gc.collections, The number of garbage collections that have occurred since the process has started., Unit: {collection}, Meter: System.Runtime
(2024-09-15T11:24:26.3765112Z, 2024-09-15T11:24:36.3639680Z] gc.heap.generation: gen2 LongSum
Value: 0
(2024-09-15T11:24:26.3765112Z, 2024-09-15T11:24:36.3639680Z] gc.heap.generation: gen1 LongSum
Value: 0
(2024-09-15T11:24:26.3765112Z, 2024-09-15T11:24:36.3639680Z] gc.heap.generation: gen0 LongSum
Value: 0

Metric Name: dotnet.process.memory.working_set, The number of bytes of physical memory mapped to the process context., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3779803Z, 2024-09-15T11:24:36.3640471Z] LongSumNonMonotonic
Value: 29241344

Metric Name: dotnet.gc.heap.total_allocated, The approximate number of bytes allocated on the managed GC heap since the process has started. The returned value does not include any native allocations., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3779975Z, 2024-09-15T11:24:36.3640481Z] LongSum
Value: 3427736

Metric Name: dotnet.gc.pause.time, The total amount of time paused in GC since the process has started., Unit: s, Meter: System.Runtime
(2024-09-15T11:24:26.3793018Z, 2024-09-15T11:24:36.3640951Z] DoubleSum
Value: 0

Metric Name: dotnet.jit.compiled_il.size, Count of bytes of intermediate language that have been compiled since the process has started., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3793233Z, 2024-09-15T11:24:36.3640965Z] LongSum
Value: 67761

Metric Name: dotnet.jit.compiled_methods, The number of times the JIT compiler (re)compiled methods since the process has started., Unit: {method}, Meter: System.Runtime
(2024-09-15T11:24:26.3793381Z, 2024-09-15T11:24:36.3640975Z] LongSum
Value: 829

Metric Name: dotnet.jit.compilation.time, The number of times the JIT compiler (re)compiled methods since the process has started., Unit: s, Meter: System.Runtime
(2024-09-15T11:24:26.3793520Z, 2024-09-15T11:24:36.3640985Z] DoubleSum
Value: 1.3528453

Metric Name: dotnet.monitor.lock_contentions, The number of times there was contention when trying to acquire a monitor lock since the process has started., Unit: {contention}, Meter: System.Runtime
(2024-09-15T11:24:26.3793612Z, 2024-09-15T11:24:36.3641010Z] LongSum
Value: 0

Metric Name: dotnet.thread_pool.thread.count, The number of thread pool threads that currently exist., Unit: {thread}, Meter: System.Runtime
(2024-09-15T11:24:26.3793740Z, 2024-09-15T11:24:36.3641018Z] LongSum
Value: 1

Metric Name: dotnet.thread_pool.work_item.count, The number of work items that the thread pool has completed since the process has started., Unit: {work_item}, Meter: System.Runtime
(2024-09-15T11:24:26.3793850Z, 2024-09-15T11:24:36.3641026Z] LongSum
Value: 2

Metric Name: dotnet.thread_pool.queue.length, The number of work items that are currently queued to be processed by the thread pool., Unit: {work_item}, Meter: System.Runtime
(2024-09-15T11:24:26.3793936Z, 2024-09-15T11:24:36.3641034Z] LongSum
Value: 0

Metric Name: dotnet.timer.count, The number of timer instances that are currently active. An active timer is registered to tick at some point in the future and has not yet been canceled., Unit: {timer}, Meter: System.Runtime
(2024-09-15T11:24:26.3794100Z, 2024-09-15T11:24:36.3641060Z] LongSumNonMonotonic
Value: 2

Metric Name: dotnet.assembly.count, The number of .NET assemblies that are currently loaded., Unit: {assembly}, Meter: System.Runtime
(2024-09-15T11:24:26.3794199Z, 2024-09-15T11:24:36.3641072Z] LongSumNonMonotonic
Value: 33

Metric Name: dotnet.process.cpu.count, The number of processors available to the process., Unit: {cpu}, Meter: System.Runtime
(2024-09-15T11:24:26.3794408Z, 2024-09-15T11:24:36.3641099Z] LongSumNonMonotonic
Value: 22

Metric Name: dotnet.process.cpu.time, CPU time used by the process., Unit: s, Meter: System.Runtime
(2024-09-15T11:24:26.3794527Z, 2024-09-15T11:24:36.3641105Z] cpu.mode: user DoubleSum
Value: 0.15625
(2024-09-15T11:24:26.3794527Z, 2024-09-15T11:24:36.3641105Z] cpu.mode: system DoubleSum
Value: 0

As the loop keeps calling GC.Collect, the GC-related metrics change as well:

Metric Name: dotnet.gc.collections, The number of garbage collections that have occurred since the process has started., Unit: {collection}, Meter: System.Runtime
(2024-09-15T11:24:26.3765112Z, 2024-09-15T11:25:06.3409748Z] gc.heap.generation: gen2 LongSum
Value: 3
(2024-09-15T11:24:26.3765112Z, 2024-09-15T11:25:06.3409748Z] gc.heap.generation: gen1 LongSum
Value: 0
(2024-09-15T11:24:26.3765112Z, 2024-09-15T11:25:06.3409748Z] gc.heap.generation: gen0 LongSum
Value: 0

Metric Name: dotnet.process.memory.working_set, The number of bytes of physical memory mapped to the process context., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3779803Z, 2024-09-15T11:25:06.3409766Z] LongSumNonMonotonic
Value: 36999168

Metric Name: dotnet.gc.heap.total_allocated, The approximate number of bytes allocated on the managed GC heap since the process has started. The returned value does not include any native allocations., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3779975Z, 2024-09-15T11:25:06.3409774Z] LongSum
Value: 3602608

Metric Name: dotnet.gc.last_collection.memory.committed_size, The amount of committed virtual memory in use by the .NET GC, as observed during the latest garbage collection., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3780104Z, 2024-09-15T11:25:06.3409782Z] LongSumNonMonotonic
Value: 4530176

Metric Name: dotnet.gc.last_collection.heap.size, The managed GC heap size (including fragmentation), as observed during the latest garbage collection., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3780292Z, 2024-09-15T11:25:06.3409791Z] gc.heap.generation: gen0 LongSumNonMonotonic
Value: 0
(2024-09-15T11:24:26.3780292Z, 2024-09-15T11:25:06.3409791Z] gc.heap.generation: gen1 LongSumNonMonotonic
Value: 560
(2024-09-15T11:24:26.3780292Z, 2024-09-15T11:25:06.3409791Z] gc.heap.generation: gen2 LongSumNonMonotonic
Value: 466256
(2024-09-15T11:24:26.3780292Z, 2024-09-15T11:25:06.3409791Z] gc.heap.generation: loh LongSumNonMonotonic
Value: 2739800
(2024-09-15T11:24:26.3780292Z, 2024-09-15T11:25:06.3409791Z] gc.heap.generation: poh LongSumNonMonotonic
Value: 9232

Metric Name: dotnet.gc.last_collection.heap.fragmentation.size, The heap fragmentation, as observed during the latest garbage collection., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3789008Z, 2024-09-15T11:25:06.3409804Z] gc.heap.generation: gen0 LongSumNonMonotonic
Value: 0
(2024-09-15T11:24:26.3789008Z, 2024-09-15T11:25:06.3409804Z] gc.heap.generation: gen1 LongSumNonMonotonic
Value: 80
(2024-09-15T11:24:26.3789008Z, 2024-09-15T11:25:06.3409804Z] gc.heap.generation: gen2 LongSumNonMonotonic
Value: 24
(2024-09-15T11:24:26.3789008Z, 2024-09-15T11:25:06.3409804Z] gc.heap.generation: loh LongSumNonMonotonic
Value: 608
(2024-09-15T11:24:26.3789008Z, 2024-09-15T11:25:06.3409804Z] gc.heap.generation: poh LongSumNonMonotonic
Value: 0

Metric Name: dotnet.gc.pause.time, The total amount of time paused in GC since the process has started., Unit: s, Meter: System.Runtime
(2024-09-15T11:24:26.3793018Z, 2024-09-15T11:25:06.3409815Z] DoubleSum
Value: 0.007868

Metric Name: dotnet.jit.compiled_il.size, Count of bytes of intermediate language that have been compiled since the process has started., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3793233Z, 2024-09-15T11:25:06.3409821Z] LongSum
Value: 102621

Metric Name: dotnet.jit.compiled_methods, The number of times the JIT compiler (re)compiled methods since the process has started., Unit: {method}, Meter: System.Runtime
(2024-09-15T11:24:26.3793381Z, 2024-09-15T11:25:06.3409826Z] LongSum
Value: 1196

Metric Name: dotnet.jit.compilation.time, The number of times the JIT compiler (re)compiled methods since the process has started., Unit: s, Meter: System.Runtime
(2024-09-15T11:24:26.3793520Z, 2024-09-15T11:25:06.3409832Z] DoubleSum
Value: 1.7024187

Metric Name: dotnet.monitor.lock_contentions, The number of times there was contention when trying to acquire a monitor lock since the process has started., Unit: {contention}, Meter: System.Runtime
(2024-09-15T11:24:26.3793612Z, 2024-09-15T11:25:06.3409859Z] LongSum
Value: 0

Metric Name: dotnet.thread_pool.thread.count, The number of thread pool threads that currently exist., Unit: {thread}, Meter: System.Runtime
(2024-09-15T11:24:26.3793740Z, 2024-09-15T11:25:06.3409867Z] LongSum
Value: 1

Metric Name: dotnet.thread_pool.work_item.count, The number of work items that the thread pool has completed since the process has started., Unit: {work_item}, Meter: System.Runtime
(2024-09-15T11:24:26.3793850Z, 2024-09-15T11:25:06.3409875Z] LongSum
Value: 8

Metric Name: dotnet.thread_pool.queue.length, The number of work items that are currently queued to be processed by the thread pool., Unit: {work_item}, Meter: System.Runtime
(2024-09-15T11:24:26.3793936Z, 2024-09-15T11:25:06.3409883Z] LongSum
Value: 0

Metric Name: dotnet.timer.count, The number of timer instances that are currently active. An active timer is registered to tick at some point in the future and has not yet been canceled., Unit: {timer}, Meter: System.Runtime
(2024-09-15T11:24:26.3794100Z, 2024-09-15T11:25:06.3409890Z] LongSumNonMonotonic
Value: 2

Metric Name: dotnet.assembly.count, The number of .NET assemblies that are currently loaded., Unit: {assembly}, Meter: System.Runtime
(2024-09-15T11:24:26.3794199Z, 2024-09-15T11:25:06.3409895Z] LongSumNonMonotonic
Value: 36

Metric Name: dotnet.process.cpu.count, The number of processors available to the process., Unit: {cpu}, Meter: System.Runtime
(2024-09-15T11:24:26.3794408Z, 2024-09-15T11:25:06.3409908Z] LongSumNonMonotonic
Value: 22

Metric Name: dotnet.process.cpu.time, CPU time used by the process., Unit: s, Meter: System.Runtime
(2024-09-15T11:24:26.3794527Z, 2024-09-15T11:25:06.3409913Z] cpu.mode: user DoubleSum
Value: 0.203125
(2024-09-15T11:24:26.3794527Z, 2024-09-15T11:25:06.3409913Z] cpu.mode: system DoubleSum
Value: 0

The metrics in detail:

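dotnet.process.cpu.time: CPU time used by the process (Counter)

cpu.mode attribute: user or system, the time spent in user mode and in kernel mode, corresponding to the process's UserProcessorTime and PrivilegedProcessorTime respectively

dotnet.process.memory.working_set: number of bytes of physical memory mapped to the process context (UpDownCounter); corresponds to Environment.WorkingSet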
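The properties named above can be read directly, which is handy for cross-checking the exported values. This is a sketch of that mapping, not the runtime's own implementation, and the values will not match an export exactly because they are sampled at a different moment.

```csharp
using System;
using System.Diagnostics;

using var process = Process.GetCurrentProcess();

// dotnet.process.cpu.time, split by the cpu.mode attribute
Console.WriteLine($"cpu.mode=user   : {process.UserProcessorTime.TotalSeconds} s");
Console.WriteLine($"cpu.mode=system : {process.PrivilegedProcessorTime.TotalSeconds} s");

// dotnet.process.memory.working_set
Console.WriteLine($"working_set     : {Environment.WorkingSet} By");
```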

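dotnet.gc.collections: number of garbage collections (Counter)

gc.heap.generation attribute: the highest heap generation covered by the collection, e.g. gen0/gen1/gen2

dotnet.gc.heap.total_allocated: approximate total number of bytes allocated on the GC heap (Counter); corresponds to GC.GetTotalAllocatedBytes()

dotnet.gc.last_collection.memory.committed_size: committed memory in use by the GC as of the latest garbage collection (UpDownCounter); corresponds to GCMemoryInfo.TotalCommittedBytes

dotnet.gc.last_collection.heap.size: managed GC heap size (including fragmentation) observed during the latest garbage collection (UpDownCounter)

gc.heap.generation attribute: name of the managed heap generation (gen0/gen1/gen2/loh/poh)

dotnet.gc.last_collection.heap.fragmentation.size: heap fragmentation observed during the latest garbage collection (UpDownCounter); corresponds to GCGenerationInfo.FragmentationAfterBytes

dotnet.gc.pause.time: total time paused in GC (Counter); corresponds to GC.GetTotalPauseDuration()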
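The GC values can be approximated with the public GC APIs, as sketched below. Two points are my assumptions rather than statements from this article: that dotnet.gc.last_collection.heap.size maps to GCGenerationInfo.SizeAfterBytes, and that the per-generation collection count is derived by subtracting the next higher generation's count from GC.CollectionCount (that method also counts collections of higher generations). The second assumption fits the sample output, where three forced full collections show up as gen2 = 3 with gen1 and gen0 at 0. `GcSnapshot` is an illustrative helper name.

```csharp
using System;

// Force one full collection so the "last collection" values are populated.
GC.Collect();

string[] generationNames = { "gen0", "gen1", "gen2", "loh", "poh" };

long[] collections = GcSnapshot.PerGenerationCollections();
for (int gen = 0; gen < collections.Length; gen++)
{
    Console.WriteLine($"dotnet.gc.collections [{generationNames[gen]}] = {collections[gen]}");
}

Console.WriteLine($"dotnet.gc.heap.total_allocated = {GC.GetTotalAllocatedBytes()} By");
Console.WriteLine($"dotnet.gc.pause.time = {GC.GetTotalPauseDuration().TotalSeconds} s");

GCMemoryInfo info = GC.GetGCMemoryInfo();
Console.WriteLine($"dotnet.gc.last_collection.memory.committed_size = {info.TotalCommittedBytes} By");

// GenerationInfo holds one entry per heap generation: gen0, gen1, gen2, LOH, POH.
ReadOnlySpan<GCGenerationInfo> generations = info.GenerationInfo;
for (int i = 0; i < generations.Length && i < generationNames.Length; i++)
{
    Console.WriteLine($"heap.size [{generationNames[i]}] = {generations[i].SizeAfterBytes} By");
    Console.WriteLine($"fragmentation.size [{generationNames[i]}] = {generations[i].FragmentationAfterBytes} By");
}

public static class GcSnapshot
{
    // Returns collection counts where index n only counts collections whose
    // highest generation was n (assumed to mirror the gc.heap.generation attribute).
    public static long[] PerGenerationCollections()
    {
        var result = new long[GC.MaxGeneration + 1];
        long higher = 0;
        for (int gen = GC.MaxGeneration; gen >= 0; gen--)
        {
            long cumulative = GC.CollectionCount(gen);
            result[gen] = cumulative - higher;
            higher = cumulative;
        }
        return result;
    }
}
```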

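dotnet.jit.compiled_il.size: number of bytes of intermediate language that have been compiled (Counter); corresponds to JitInfo.GetCompiledILBytes

dotnet.jit.compiled_methods: number of times the JIT compiler (re)compiled methods (Counter); corresponds to JitInfo.GetCompiledMethodCount

dotnet.jit.compilation.time: time the JIT compiler has spent compiling methods (Counter); corresponds to JitInfo.GetCompilationTime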
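The three JitInfo methods live in the System.Runtime namespace and can be called directly. A minimal sketch that prints the process-wide values (each method also takes an optional currentThread flag, left at its default here):

```csharp
using System;
using System.Runtime;

// dotnet.jit.compiled_il.size
Console.WriteLine($"compiled IL : {JitInfo.GetCompiledILBytes()} By");

// dotnet.jit.compiled_methods
Console.WriteLine($"methods     : {JitInfo.GetCompiledMethodCount()}");

// dotnet.jit.compilation.time
Console.WriteLine($"JIT time    : {JitInfo.GetCompilationTime().TotalSeconds} s");
```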

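dotnet.thread_pool.thread.count: number of thread pool threads that currently exist (UpDownCounter); corresponds to ThreadPool.ThreadCount

dotnet.thread_pool.work_item.count: number of work items the thread pool has completed (Counter); corresponds to ThreadPool.CompletedWorkItemCount

dotnet.thread_pool.queue.length: number of work items currently queued for the thread pool (UpDownCounter); corresponds to ThreadPool.PendingWorkItemCount

dotnet.monitor.lock_contentions: number of times there was contention when trying to acquire a Monitor lock (Counter); corresponds to Monitor.LockContentionCount

dotnet.timer.count: number of Timer instances that are currently active (UpDownCounter); corresponds to Timer.ActiveCount

dotnet.assembly.count: number of .NET assemblies currently loaded (UpDownCounter); corresponds to the number of assemblies returned by AppDomain.GetAssemblies()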
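The threading, timer and assembly values all come from static properties, so a quick in-process check is short. A sketch that prints a single snapshot (the values naturally differ from run to run):

```csharp
using System;
using System.Threading;

Console.WriteLine($"dotnet.thread_pool.thread.count    = {ThreadPool.ThreadCount}");
Console.WriteLine($"dotnet.thread_pool.work_item.count = {ThreadPool.CompletedWorkItemCount}");
Console.WriteLine($"dotnet.thread_pool.queue.length    = {ThreadPool.PendingWorkItemCount}");
Console.WriteLine($"dotnet.monitor.lock_contentions    = {Monitor.LockContentionCount}");
Console.WriteLine($"dotnet.timer.count                 = {Timer.ActiveCount}");
Console.WriteLine($"dotnet.assembly.count              = {AppDomain.CurrentDomain.GetAssemblies().Length}");
```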

dotnet.exceptions: number of exceptions thrown in managed code (Counter); corresponds to the number of times the AppDomain.FirstChanceException event fires

error.type attribute: the exception type, for example System.OperationCanceledException or Contoso.MyException

More

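Taken together, these metrics make it fairly easy to see the current state of CPU, memory, GC, threads and so on, which helps a lot when judging whether an application is healthy. Besides exporting the metrics in-process with OpenTelemetry, you can also observe them from outside the process with diagnostic tools such as dotnet-counters.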

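Watch the CPU data to see whether CPU usage is too high

Watch the memory and GC data to spot overly frequent garbage collection, memory leaks, heap fragmentation and similar problems

Watch the thread pool queue length and thread count to check for thread pool starvation

Watch lock contention to spot deadlocks and unreasonable lock usage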
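As an illustration of the thread pool check, the sketch below samples the three thread pool values a few times and applies a rule of thumb. The rule (queue keeps growing while the pool is not shrinking), the sample count, the 200 ms interval, and the names `PoolSample` and `StarvationHeuristic` are all hypothetical choices of mine, not something defined by .NET or by this article. Treat a positive result as a hint to investigate, not as proof.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Take a few samples of the thread pool counters (count and interval chosen arbitrarily).
var samples = new List<PoolSample>();
for (int i = 0; i < 3; i++)
{
    samples.Add(PoolSample.Capture());
    Thread.Sleep(200);
}
Console.WriteLine($"possible starvation: {StarvationHeuristic.LooksStarved(samples)}");

// One observation of the three thread pool metrics.
public readonly record struct PoolSample(long QueueLength, int ThreadCount, long Completed)
{
    public static PoolSample Capture() =>
        new(ThreadPool.PendingWorkItemCount, ThreadPool.ThreadCount, ThreadPool.CompletedWorkItemCount);
}

public static class StarvationHeuristic
{
    // Hypothetical rule of thumb: between every pair of consecutive samples the
    // queue grows while the number of pool threads does not shrink.
    public static bool LooksStarved(IReadOnlyList<PoolSample> samples)
    {
        if (samples.Count < 2)
        {
            return false;
        }
        for (int i = 1; i < samples.Count; i++)
        {
            bool queueGrowing = samples[i].QueueLength > samples[i - 1].QueueLength;
            bool threadsNotShrinking = samples[i].ThreadCount >= samples[i - 1].ThreadCount;
            if (!queueGrowing || !threadsNotShrinking)
            {
                return false;
            }
        }
        return true;
    }
}
```

In a real service the same comparison would more likely run in the monitoring backend over the exported dotnet.thread_pool.queue.length and dotnet.thread_pool.thread.count series than inside the process.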

In short, runtime metrics make our applications more observable and make it much easier to get at an application's current state.

References

https://github.com/dotnet/runtime/issues/85372
https://github.com/dotnet/runtime/pull/104680
https://github.com/dotnet/runtime/issues/105845
https://github.com/dotnet/runtime/pull/106014
https://github.com/open-telemetry/opentelemetry-dotnet-contrib/tree/main/src/OpenTelemetry.Instrumentation.Runtime
https://learn.microsoft.com/en-us/dotnet/core/diagnostics/built-in-metrics-runtime
https://github.com/WeihanLi/SamplesInPractice/blob/main/net9sample/Net9Samples/RuntimeMetricsSample.cs
