There is a documentation bug in the dbus library that specifies that several functions return FALSE to mean that an out-of-memory condition occurred, even though in reality there are many other reasons why FALSE might be returned. The dbus-glib library faithfully respects this (erroneous) documentation, and calls g_error("Out of memory") when certain dbus calls return FALSE. It's extremely troublesome to have a process abort with the message "Out of memory" when really what happened was, e.g., dbus_connection_send_with_reply() was given an invalid timeout parameter (which causes it to return FALSE). I have already filed a dbus case to get the documentation fixed (https://bugs.freedesktop.org/show_bug.cgi?id=30169). I'm filing this case to inform you guys of the fact that dbus-glib needs to be fixed in light of the fact that the dbus documentation on which its implementation is based is incorrect. I consider this to be a "blocker" because dbus-glib calls abort() instead of recovering from errors, meaning that client apps end up mysteriously failing as "out of memory" for no good reason.
Yeah, dbus-glib should be checking all the same preconditions that dbus does.
Downgrading this a bit; it's clearly not a release blocker (dbus-glib is well-known to be a minefield, prefer to use GDBus in new code) but is still quite important to fix. For some programming errors (notably strings being non-UTF-8) it's problematic or inefficient for dbus-glib to duplicate libdbus' checks, but we can demote the failure mode from g_error to g_critical in these cases, and mention the possible cause other than OOM. This branch ended up being fairly long due to the complexity of D-Bus marshalling. I've fixed these: * use of invalid object paths is diagnosed as a programming error * marshalling arguments to DBusMessage (at a low level) doesn't falsely report OOM * passing bad (D-Bus-level) arguments to DBusGProxy methods is diagnosed as a programming error * negative timeouts other than -1 are diagnosed as a programming error I haven't yet fixed these: * If you pass invalid arguments in DBusGObject (service-side) API, it might crash with "OOM" rather than diagnosing a programming error * Syntactically invalid method/signal names, interfaces, error names, bus names aren't diagnosed as a programming error and might possibly be reported as "OOM" For those, a policy question for Colin and the rest of the cabal: we depend on a new enough GLib (2.26) that it's guaranteed to have GIO and at least basic D-Bus functionality. Would it be considered OK to link libdbus-glib-1 against libgio-2.0, so it can use g_dbus_is_unique_name() (etc.) in assertion checks? Otherwise, we need to either add API to libdbus (and wait for it to be widespread), or copypaste in Yet Another implementation of these syntactic checks (and having written one in telepathy-glib and one in dbus-python, I'm getting a bit tired of writing the same code).
Created attachment 44943 [details] [review] Depend on dbus 1.2.16 for dbus_message_iter_abandon_container
Created attachment 44944 [details] [review] When given an object path, use GVariant to check syntactic validity We already depend on a new enough GLib that it has these functions.
Created attachment 44945 [details] [review] dbus-gvalue: replace trivial cases of OOM with a call to a noreturn function In these cases there's no invalid value we could be supplying to libdbus, so it's definitely OOM, not a programming error (e.g. non-UTF-8) masquerading as OOM.
Created attachment 44946 [details] [review] marshal_basic: if a boolean has a non-boolean value, diagnose the error
Created attachment 44947 [details] [review] marshal_basic: if marshalling a string fails, critical and return FALSE This could be OOM, but it could also be non-UTF-8.
Created attachment 44948 [details] [review] marshal_valuearray: abandon the container if we fail
Created attachment 44949 [details] [review] marshal_valuearray: simplify teardown If close_container fails, it might be an assertion failure in libdbus, so don't assume it's OOM. We have to handle programming errors somewhat gracefully during marshalling anyway, so there's no point in asserting.
Created attachment 44950 [details] [review] marshal_proxy: check that it's really a proxy and its path is valid This means that dbus_message_iter_append_basic can't fail other than by OOM, so use the shared OOM handler.
Created attachment 44951 [details] [review] marshal_object_path: check that it's really a valid object path Again, this means failure can only be OOM, so use the shared handler. Also remove an unnecessary cast.
Created attachment 44952 [details] [review] marshal_object: check that it's really an object and its path is valid Again, we can now assume that failure is OOM.
Created attachment 44953 [details] [review] marshal_signature: check validity of the signature Also remove a spurious cast, and comment why we're *not* assuming that failure here is necessarily OOM.
Created attachment 44954 [details] [review] marshal_map: on error, abandon the container instead of closing it libdbus considers it to be a programming error if you close a container gracefully when its contents are incomplete.
Created attachment 44955 [details] [review] marshal_struct: simplify teardown, and abandon broken containers
Created attachment 44956 [details] [review] marshal_variant: abandon broken containers rather than closing them
Created attachment 44957 [details] [review] marshal_collection_ptrarray: assert that the signature is valid This means that dbus_message_iter_open_container can't fail except by OOM, so use the shared handler.
Created attachment 44958 [details] [review] marshal_collection_ptrarray: abandon broken containers, simplify exit path
Created attachment 44959 [details] [review] marshal_collection_array: fail early if the array is NULL
Created attachment 44960 [details] [review] marshal_collection_array: assert validity of signature This means we can recognise OOM reliably; do so.
Created attachment 44961 [details] [review] marshal_collection_array: if appending fails, don't assume OOM Appending can also fail by appending ((gboolean) 23) or something; admittedly, that's a programming error.
Created attachment 44962 [details] [review] marshal_collection_array: simplify exit path Also don't assume that failure of dbus_message_iter_close_container is always OOM.
Created attachment 44963 [details] [review] dbus_g_proxy_marshal_args_to_message: diagnose inability to marshal args Failure to marshal arguments is entirely reachable, albeit only via programming error.
Created attachment 44964 [details] [review] dbus_g_proxy_begin_call_internal: don't assume NULL message means OOM It might either be OOM or bad arguments, probably the latter. That's programming error, but we already raised a critical warning, so be silent.
Created attachment 44965 [details] [review] dbus_g_proxy_call_no_reply: don't assume NULL message means OOM It might either be OOM or bad arguments, probably the latter. That's programming error, but we already raised a critical warning, so be silent.
Created attachment 44966 [details] [review] DBusGProxy: share OOM handler
Created attachment 44967 [details] [review] Don't allow proxy timeouts to be set negative, except for -1 libdbus checks this as a precondition, but to avoid astonishing error behaviour (interpreted as OOM), so should we.
Created attachment 44968 [details] [review] improve documentation of proxy timeouts to mention that -1 is special
That's all for now...
Review of attachment 44957 [details] [review]: ::: dbus/dbus-gvalue.c @@ +1909,3 @@ return FALSE; } + g_assert (g_variant_is_signature (elt_sig)); Why an assertion and not a critical here?
Apart from a couple of places where you've used assertions rather than criticals (which may be defensible, I just couldn't see what was different in those cases), looks fine. (In reply to comment #2) > For those, a policy question for Colin and the rest of the cabal: we depend on > a new enough GLib (2.26) that it's guaranteed to have GIO and at least basic > D-Bus functionality. Would it be considered OK to link libdbus-glib-1 against > libgio-2.0, so it can use g_dbus_is_unique_name() (etc.) in assertion checks? I think this is fine too. Points for style!
(In reply to comment #30) > @@ +1909,3 @@ > return FALSE; > } > + g_assert (g_variant_is_signature (elt_sig)); > > Why an assertion and not a critical here? It was returned about 4 lines ago (sadly just outside the range of your diff context) by an internal dbus-glib function. If internal dbus-glib functions are returning non-NULL strings that aren't valid signatures, then we deserve SIGABRT :-) (And the same for the other use of g_assert.)
(In reply to comment #31) > Apart from a couple of places where you've used assertions rather than > criticals (which may be defensible, I just couldn't see what was different in > those cases), looks fine. Are you sufficiently convinced by Comment #32 to let me merge this? This is the first in a stack of branches which fix this bug in general.
For your convenience, here is some additional context for the added assertions: Attachment #44957 [details]: http://git.collabora.co.uk/?p=user/smcv/dbus-glib-smcv.git;a=blob;f=dbus/dbus-gvalue.c;h=6493bc5a434bbea5d0625a4fb898288daaaadf38;hb=8de6c965adcd3a992a93ea09951f09421cd0b5fc#l1900 elt_sig is produced by _dbus_gtype_to_signature, which should never return a non-NULL string that isn't a signature. Attachment #44960 [details]: http://git.collabora.co.uk/?p=user/smcv/dbus-glib-smcv.git;a=blob;f=dbus/dbus-gvalue.c;h=ab5580b7b92aec4897bb8e0eb131010215bcc2b0;hb=f384985d4d8aa1e45ea03b8a479fb967fd83d9c9#l1951 Same explanation applies, except the string is called subsignature_str.
Sure, I'm convinced. :)
Fixed in git for 0.94, thanks!
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.