Bug 33524

Summary: benchmark how much validation in the dbus-daemon costs us
Product: dbus
Component: core
Version: unspecified
Hardware: Other
OS: All
Status: RESOLVED FIXED
Severity: enhancement
Priority: medium
Reporter: Simon McVittie <smcv>
Assignee: Simon McVittie <smcv>
QA Contact: John (J5) Palmieri <johnp>
CC: cosimo.alfarano, hp, me, robin.bateboerop
Whiteboard:

Description Simon McVittie 2011-01-26 03:51:56 UTC
(I'm going to start a mailing list thread for discussion of this; this bug mainly exists to keep it on my to-do list.)

People keep proposing that we throw away D-Bus and reinvent IPC to get rid of perceived performance problems. Before throwing away the baby with the bathwater, let's benchmark it and see what's actually wrong with it...

One obvious target is message validation. It always seems slightly perverse to me that in the typical case for a single D-Bus message:

    :1.2 -----------> dbus-daemon ----------> :1.3

we have several sets of validation going on:

* :1.2 composes a message (ideally with a g_return_if_fail() or equivalent to check that strings are UTF-8 and so on, which can be turned off in a "production" build)

* dbus-daemon receives the message and validates the message body

* :1.3 receives the message; a high-quality client library implementation will validate the message again, in a way that can't be turned off

An obvious thing that would be interesting to benchmark: how does performance compare if we turn off validation entirely, and have all processes blindly trust other processes?

That's the best performance increase we could possibly achieve by not validating. Obviously, we don't actually want all processes to trust all other processes, particularly on the system bus, so that's more of an unattainable goal than anything else.

However, what if we make dbus-daemon never validate message bodies? Things that'd be required for that:

* Audit client libraries to make sure they deal with bad message bodies gracefully; this includes at least libdbus (on behalf of dbus-glib, QtDBus and dbus-python), GDBus, ndesk-dbus and dbus-java.

* Keep DBusConnection validating message bodies by default (because clients still want to do that), but have a way to turn that off for dbus-daemon's benefit

* If the message body is bad, clients can cope; they should log an error or something (synthesize a signal from org.freedesktop.DBus.Local with the bad message as a byte-array?), but can skip the rest of the message, because we know the length of the message (how many bytes to skip) from the header.

* If the variable-length part of the message header is bad, this is serious (dbus-daemon is required to have checked it so that it can set a trusted sender), so we should disconnect

* If the 16-byte prefix of the message header (which contains the actual length) is bad, this is absolutely fatal, because we can't know how many bytes to skip; we must disconnect

* If we get the length wrong somehow, we'll skip the wrong number of bytes; people have expressed worries in the past that becoming de-synchronized with message boundaries like this will result in skipping every message, forever, silently. However, it's unlikely that the incorrect message boundary will leave us at a point that's syntactically valid as the 16-byte prefix[1], so we can detect this situation with reasonable certainty.
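As a rough illustration of the recovery rules in the list above, the Python sketch below computes how many bytes a receiver would skip for a message with a bad body, using only the 16-byte prefix. The field layout follows the D-Bus wire format; the names `total_message_length` and `RECOVERY` are hypothetical and are not part of libdbus or any other client library.

```python
import struct

FIXED_PREFIX_LEN = 16  # endianness, type, flags, version, body length, serial, header-fields array length

# Hypothetical policy table mirroring the bullets above.
RECOVERY = {
    "body": "skip",                 # log it, then skip to the next message
    "header_fields": "disconnect",  # dbus-daemon should already have checked these
    "prefix": "disconnect",         # length is untrustworthy, so we cannot skip
}


def total_message_length(prefix: bytes) -> int:
    """Return the size in bytes of the whole message described by a 16-byte prefix."""
    if len(prefix) < FIXED_PREFIX_LEN:
        raise ValueError("need the full 16-byte prefix")
    if prefix[0:1] == b"l":
        order = "<"  # little-endian
    elif prefix[0:1] == b"B":
        order = ">"  # big-endian
    else:
        raise ValueError("bad endianness byte")
    # Bytes 4-7: body length, 8-11: serial, 12-15: header-fields array length.
    body_len, _serial, fields_len = struct.unpack_from(order + "III", prefix, 4)
    header_len = FIXED_PREFIX_LEN + fields_len
    header_len += -header_len % 8  # the header is padded to an 8-byte boundary
    return header_len + body_len
```

A receiver that has already consumed the prefix would skip `total_message_length(prefix) - 16` further bytes to reach the next message boundary.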

[1] The somewhat-wasteful header format becomes helpful! Byte 0 (endianness) must be 'l' or 'B', byte 3 must be 0x01, bytes 4-7 must be a sensible message length, and bytes 12-15 must be an array length that doesn't exceed the message length
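The checks in this footnote can be sketched in a few lines of Python. This is only an illustration: `prefix_looks_plausible` is a hypothetical name, and reading "sensible" as "within the D-Bus specification's 128 MiB message limit and 64 MiB array limit" is an assumption of the sketch rather than something stated above.

```python
import struct

MAX_MESSAGE_LEN = 1 << 27  # 128 MiB, maximum message length in the D-Bus specification
MAX_ARRAY_LEN = 1 << 26    # 64 MiB, maximum array length in the D-Bus specification


def prefix_looks_plausible(prefix: bytes) -> bool:
    """Return True if 16 bytes could be the start of a D-Bus message."""
    if len(prefix) != 16:
        return False
    # Byte 0: endianness marker, 'l' or 'B'.
    order = {ord("l"): "<", ord("B"): ">"}.get(prefix[0])
    if order is None:
        return False
    # Byte 3: protocol version, which must be 1.
    if prefix[3] != 1:
        return False
    # Bytes 4-7: body length; bytes 12-15: header-fields array length.
    body_len, _serial, fields_len = struct.unpack_from(order + "III", prefix, 4)
    if fields_len > MAX_ARRAY_LEN:
        return False
    header_len = 16 + fields_len
    header_len += -header_len % 8  # header padding to an 8-byte boundary
    return header_len + body_len <= MAX_MESSAGE_LEN
```

A receiver that suspects it has lost synchronization could run a check like this on the next 16 bytes and disconnect if it fails, instead of silently skipping messages forever.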
Comment 1 Simon McVittie 2011-01-26 04:10:41 UTC
<http://lists.freedesktop.org/archives/dbus/2011-January/014037.html> is a somewhat edited version of Comment #0. In particular, if you have a favourite automated D-Bus benchmark that's not mentioned in that mail, please let me know.

I'll summarize mailing list discussion here later.
Comment 2 Simon McVittie 2011-01-26 05:22:58 UTC
(In reply to comment #0)
> An obvious thing that would be interesting to benchmark: how does performance
> compare if we turn off validation entirely, and have all processes blindly
> trust other processes?

A few months ago, Alban reported in <http://lists.freedesktop.org/archives/dbus/2010-September/013493.html>:

> First, results without kdbus:
> - Validation enabled:        19.2s
> - Validation in client only: 18.4s
> - Validation in server only: 18.3s
> - Validation disabled:       17.7s (-8%)

So, turning off validation in dbus-daemon makes us roughly 4% faster; not having validation at all gets us another 4%, but is unacceptable in general; and we could potentially get some of that speedup by making the validation faster.
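For reference, the percentages follow directly from Alban's timings. The short Python calculation below reproduces them; the dictionary keys and the helper name `speedup_percent` are only illustrative.

```python
# Alban's timings in seconds, copied from the quoted results.
timings = {
    "validation enabled": 19.2,
    "client only": 18.4,   # i.e. no validation in dbus-daemon
    "server only": 18.3,
    "disabled": 17.7,
}

baseline = timings["validation enabled"]


def speedup_percent(seconds: float) -> float:
    """Time saved relative to the fully validating baseline, in percent."""
    return (baseline - seconds) / baseline * 100


for name, seconds in timings.items():
    print(f"{name}: {speedup_percent(seconds):.1f}% faster")
```

Dropping validation in dbus-daemon alone saves about 4.2%, and dropping it everywhere saves about 7.8% in total, which matches the "-8%" in the quoted results.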
Comment 3 Simon McVittie 2013-08-27 15:28:25 UTC
I think Alban's benchmark figures are enough to call this fixed, given that nobody has come forward with a different benchmark in 2.5 years.
