UDP vs stdout for metrics

I was thinking about using the Librato Heroku integration to generate metrics from log statements, but I was curious about the performance cost compared to UDP-based metrics. I know UDP is pretty fast because there isn’t a connection to set up, but how does that compare to logging to stdout?

Comparison

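Here are two scripts that compare the cost of emitting a metric through Python’s logging module with the cost of sending the same message over UDP. Each script runs 100,000 iterations. Note that logging.StreamHandler() writes to stderr by default, so we run stdout_test.py with stderr redirected to /dev/null. That keeps terminal rendering out of the measurement and gives the logging example its best-case performance.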

# stdout_test.py
# python3 stdout_test.py 2> /dev/null

import timeit
import logging

log = logging.getLogger("test")
log.addHandler(logging.StreamHandler())
log.setLevel(logging.DEBUG)

total_time = timeit.timeit(
    stmt="log.debug(b'web.api.blah.test:1|c')", number=100_000, globals=dict(log=log)
)
print(total_time)

# socket_test.py
# python3 socket_test.py

import timeit
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
total_time = timeit.timeit(
    stmt="sock.sendto(b'web.api.blah.test:1|c',('127.0.0.1',5005))",
    number=100_000,
    globals=dict(sock=sock),
)
print(total_time)

Results

The following table shows the durations for the programs above. The trial durations are for 100,000 iterations, as configured in the code.

As we can see, UDP-based metrics are about five times faster than logging (roughly 2.6 μs versus 12.9 μs per call). In my opinion that difference of about 10 μs per metric is negligible for a web application, where I wouldn’t expect a huge number of metrics to be generated on each request.

metrics type	trial 1 (seconds)	trial 2 (seconds)	trial 3 (seconds)	trial 4 (seconds)	trial 5 (seconds)	avg (seconds)	avg/loop (μs/loop)
udp	0.264	0.264	0.264	0.264	0.264	0.264	2.639
stdout	1.286	1.303	1.251	1.301	1.289	1.286	12.858
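
To make the arithmetic behind the last two columns explicit, here is a small sketch that recomputes the averages, the per-call cost, and the ratio from the trial values in the table. The figure of 10 metrics per request is a hypothetical number picked for illustration, not something that was measured. Because the trial values are rounded to three decimals, the results can differ from the table in the last digit.

```python
from statistics import mean

ITERATIONS = 100_000  # loop count used by both scripts

# Trial durations in seconds, copied from the table above.
trials = {
    "udp": [0.264, 0.264, 0.264, 0.264, 0.264],
    "stdout": [1.286, 1.303, 1.251, 1.301, 1.289],
}


def per_loop_us(durations, iterations=ITERATIONS):
    """Average cost of a single call, in microseconds."""
    return mean(durations) / iterations * 1_000_000


udp_us = per_loop_us(trials["udp"])
stdout_us = per_loop_us(trials["stdout"])

ratio = stdout_us / udp_us              # how many times slower logging is
extra_per_call_us = stdout_us - udp_us  # absolute cost difference per metric

# Assumption for illustration only: a request that emits 10 metrics.
METRICS_PER_REQUEST = 10
extra_per_request_us = extra_per_call_us * METRICS_PER_REQUEST

print(f"udp:    {udp_us:.2f} us/call")
print(f"stdout: {stdout_us:.2f} us/call")
print(f"ratio:  {ratio:.2f}x")
print(f"extra per request: {extra_per_request_us:.0f} us")
```

Under that assumption the extra cost per request is on the order of a tenth of a millisecond, which is why the ratio looks large while the absolute difference stays small.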

Future work

If we were logging to a disk with high latency, I would expect a more significant difference between UDP and logging. That problem could probably be mitigated by handing log records to a separate thread that does the disk writes. Python’s logging.handlers module provides QueueHandler (paired with QueueListener) for this case.
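
As a rough illustration of that idea, the sketch below wires a QueueHandler to a QueueListener so that the calling thread only puts the record on a queue, while a background thread performs the file write. The logger name and the log file path are hypothetical choices for this example, and this setup was not part of the benchmark above.

```python
import logging
import logging.handlers
import os
import queue
import tempfile

# Hypothetical log destination, chosen only for this sketch.
log_path = os.path.join(tempfile.gettempdir(), "metrics_sketch.log")

# Unbounded queue; records wait here until the listener thread picks them up.
log_queue = queue.Queue()

# The potentially slow handler runs on the listener's background thread.
file_handler = logging.FileHandler(log_path, mode="w")
listener = logging.handlers.QueueListener(log_queue, file_handler)

log = logging.getLogger("metrics_sketch")  # hypothetical logger name
log.setLevel(logging.DEBUG)
log.propagate = False  # keep the example isolated from root handlers
log.addHandler(logging.handlers.QueueHandler(log_queue))

listener.start()
try:
    # The caller only pays for enqueueing the record, not for the disk write.
    log.debug("web.api.blah.test:1|c")
finally:
    listener.stop()  # processes the remaining records, then joins the thread
    file_handler.close()
```

Whether this actually helps would still need measuring: the queue moves the write off the request path, but the records are still formatted and enqueued by the calling thread.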

Logging is probably fine for metrics. If you are concerned about the performance implications, though, it is worth profiling each solution within your own application to measure the actual impact.
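
One lightweight way to start measuring inside an application, before reaching for a full profiler, is to accumulate the time spent in metric calls. The sketch below is only an illustration: MetricTimer and emit_metric are hypothetical names, and emit_metric stands in for whichever transport (logging or UDP) is being evaluated.

```python
import time
from contextlib import contextmanager


class MetricTimer:
    """Accumulates wall-clock time spent inside measured blocks."""

    def __init__(self):
        self.total_seconds = 0.0
        self.calls = 0

    @contextmanager
    def measure(self):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.total_seconds += time.perf_counter() - start
            self.calls += 1

    def average_us(self):
        """Average time per measured block in microseconds (0.0 if unused)."""
        if self.calls == 0:
            return 0.0
        return self.total_seconds / self.calls * 1_000_000


def emit_metric(message):
    """Hypothetical placeholder for the real logging or UDP call."""
    return len(message)


timer = MetricTimer()
for _ in range(3):
    with timer.measure():
        emit_metric("web.api.blah.test:1|c")

print(f"{timer.calls} calls, {timer.average_us():.2f} us on average")
```

The timer adds overhead of its own, so it is better suited to comparing the two approaches against each other than to reporting absolute numbers.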