Skip to content

Metrics

Emit emits metrics through six OpenTelemetry meters, each covering a distinct concern: the core pipeline, the outbox, distributed locking, the mediator, Kafka client operations, and low-level broker statistics. All meters are registered via a single call to AddEmitInstrumentation.

Enrichment tags

emit.node.id is automatically included as a static tag on all metrics. It identifies the node that emitted the metric and is set once at startup from INodeIdentity.

Setup

builder.Services.AddOpenTelemetry()
.WithMetrics(metrics =>
{
metrics.AddEmitInstrumentation();
metrics.AddOtlpExporter();
});

Selectively enable or disable individual meters:

metrics.AddEmitInstrumentation(options =>
{
options.EnableKafkaBrokerMeter = false; // skip librdkafka statistics
});

All options default to true:

OptionMeter controlled
EnableEmitMeterEmit
EnableOutboxMeterEmit.Outbox
EnableLockMeterEmit.Lock
EnableMediatorMeterEmit.Mediator
EnableKafkaMeterEmit.Kafka
EnableKafkaBrokerMeterEmit.Kafka.Broker

AddEmitInstrumentation also registers histogram views with second-scale bucket boundaries for all *duration instruments and byte-scale boundaries for emit.kafka.produce.message_size. The OTel SDK default boundaries ([0, 5, 10, 25, ...]) assume milliseconds; without this override, every sub-second observation lands in the le=5 bucket and histogram_quantile returns nonsense values.

Meter: Emit

Pipeline and consumer middleware metrics.

InstrumentTypeTagsDescription
emit.pipeline.produce.durationHistogram (s)provider, resultTime from ProduceAsync call to outbox write or direct delivery
emit.pipeline.produce.completedCounterprovider, resultProduce operations completed
emit.pipeline.consume.durationHistogram (s)provider, result, consumerTime from message receipt to handler return
emit.pipeline.consume.completedCounterprovider, result, consumerConsume operations completed
emit.consumer.error.actionsCounteractionError actions taken (retry, dead_letter, discard)
emit.consumer.retry.attemptsHistogramresultRetry counts per sequence (success, exhausted)
emit.consumer.retry.durationHistogram (s)resultTotal time spent in retry loops
emit.consumer.discardsCounterMessages discarded by error policy
emit.consumer.validation.completedCounterresult, actionValidation results
emit.consumer.validation.durationHistogram (s)Time spent in validation
emit.consumer.circuit_breaker.state_transitionsCounterto_stateCircuit state changes
emit.consumer.circuit_breaker.open_durationHistogram (s)How long the circuit stayed open
emit.consumer.circuit_breaker.stateGaugeCurrent circuit state: 0=closed, 1=open, 2=half-open
emit.consumer.rate_limit.wait_durationHistogram (s)Time waiting for a rate limit permit
emit.consumer.dlq.producedCounterreasonMessages forwarded to DLQ
emit.consumer.dlq.produce_errorsCounterreasonFailures while forwarding to DLQ

Meter: Emit.Outbox

Outbox entry lifecycle and worker health.

InstrumentTypeTagsDescription
emit.outbox.enqueuedCountersystemEntries written to the outbox
emit.outbox.processing.durationHistogram (s)system, resultTime to deliver one entry
emit.outbox.processing.completedCountersystem, resultEntries processed
emit.outbox.critical_timeHistogram (s)systemEnd-to-end latency from enqueue to delivery
emit.outbox.worker.poll_cyclesCounterhas_entriesWorker poll cycles
emit.outbox.worker.batch_entriesHistogramEntries per poll cycle
emit.outbox.worker.errorsCounterWorker-level errors

emit.outbox.critical_time is particularly useful for alerting: it measures the total delay from when a message was produced to when it was delivered to the broker.

Meter: Emit.Lock

Distributed lock acquisition and hold duration.

InstrumentTypeTagsDescription
emit.lock.acquire.durationHistogram (s)resultTime to acquire a lock (including retries)
emit.lock.acquire.retriesHistogramRetry count before a lock was acquired
emit.lock.renewal.completedCounterresultLock renewal attempts
emit.lock.held.durationHistogram (s)How long a lock was held before release

Meter: Emit.Mediator

Mediator request dispatch.

InstrumentTypeTagsDescription
emit.mediator.send.durationHistogram (s)request_type, resultTime for SendAsync to complete
emit.mediator.send.completedCounterrequest_type, resultMediator dispatches completed
emit.mediator.send.activeUpDownCounterIn-flight mediator requests

Meter: Emit.Kafka

Kafka producer and consumer client metrics.

Producer

InstrumentTypeTagsDescription
emit.kafka.produce.durationHistogram (s)topic, resultBroker round-trip time for produce operations
emit.kafka.produce.messagesCountertopic, resultKafka produce operations
emit.kafka.produce.message_sizeHistogram (By)topicMessage size distribution (key + value bytes)

Consumer

InstrumentTypeTagsDescription
emit.kafka.consume.messagesCounterconsumer_group, topic, partitionMessages read from Kafka
emit.kafka.consume.deserialization.errorsCounterconsumer_group, topic, component, actionDeserialization exceptions
emit.kafka.consume.deserialization.durationHistogram (s)consumer_group, componentDeserializer execution time
emit.kafka.consume.partition.eventsCounterconsumer_group, topic, typePartition rebalance events
emit.kafka.consume.offset.commitsCounterconsumer_group, resultOffset commit operations
emit.kafka.consume.worker.faultsCounterconsumer_groupWorker fault events
emit.kafka.consume.dlq.producedCounterconsumer_group, source_topicDeserialization errors forwarded to DLQ
emit.kafka.consume.worker.channel_depthGaugeconsumer_group, worker_indexMessages queued in worker channels
emit.kafka.consume.groups.activeGaugeActive consumer groups
emit.kafka.consume.worker_pool.sizeGaugeconsumer_groupConfigured worker count per group

Meter: Emit.Kafka.Broker

Low-level librdkafka broker statistics. These come directly from Confluent’s statistics callback and reflect the health of the underlying native client.

Prerequisite: these instruments are only populated when KafkaClientConfig.StatisticsInterval is set to a non-zero value (e.g. TimeSpan.FromSeconds(5)). Without it, the callback never fires and all broker meter values remain zero.

Broker health

InstrumentTypeTagsDescription
emit.kafka.broker.request.rttGauge (s)broker_id, percentileBroker round-trip time (p50/p95/p99)
emit.kafka.broker.throttle.durationGauge (s)broker_idAverage broker throttle time
emit.kafka.broker.outbuf.messagesGaugebroker_idMessages in outbound buffer
emit.kafka.broker.waitresp.messagesGaugebroker_idMessages awaiting broker response
emit.kafka.broker.request.timeoutsCounterbroker_idCumulative timed-out requests
emit.kafka.broker.tx.errorsCounterbroker_idCumulative transmission errors
emit.kafka.broker.connection.stateGaugebroker_id, stateConnection state (1=UP, 0=DOWN, 2=INIT, 3=CONNECTING, 4=AUTH)

Broker performance

InstrumentTypeTagsDescription
emit.kafka.broker.client.messages_queuedGaugeTotal messages across all producer queues
emit.kafka.broker.client.bytes_queuedGauge (By)Total size of all queued messages
emit.kafka.broker.tx.bytesCounter (By)broker_idCumulative bytes transmitted
emit.kafka.broker.tx.retriesCounterbroker_idCumulative transmission retries
emit.kafka.broker.internal_latencyGauge (μs)broker_id, percentileInternal queue latency before transmission (p50/p95/p99)

Consumer lag and group health

InstrumentTypeTagsDescription
emit.kafka.broker.consumer.lagGaugetopic, partitionConsumer lag per topic-partition
emit.kafka.broker.consumer_group.stateGaugestateConsumer group state
emit.kafka.broker.consumer_group.rebalancesCounterCumulative rebalance count

Broker connectivity

InstrumentTypeTagsDescription
emit.kafka.broker.connectsCounterbroker_idCumulative connection attempts
emit.kafka.broker.disconnectsCounterbroker_idCumulative disconnections

These instruments are useful for diagnosing network-level issues but rarely needed for day-to-day monitoring. Disable them with options.EnableKafkaBrokerMeter = false if your metrics pipeline does not need this level of detail.