Bug 90314

Summary: stop pretending that Windows has poll() / that DBusPollFD is an abstraction
Product: dbus Reporter: Simon McVittie <smcv>
Component: coreAssignee: D-Bus Maintainers <dbus>
Status: RESOLVED FIXED QA Contact: D-Bus Maintainers <dbus>
Severity: enhancement    
Priority: medium CC: rolland
Version: git master   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Bug Depends on: 89444    
Bug Blocks:    
Attachments: when HAVE_POLL, use the system definition of struct pollfd
pdated patch including _dbus_poll()

Description Simon McVittie 2015-05-05 11:12:31 UTC
_dbus_poll() pretends to be a portable abstraction for polling pollable handles (Unix: sockets, pipes, misc fds; Windows: sockets only), but it isn't really:

* on a modern Unix like Linux or FreeBSD, the best way to poll a large
  number of fds is something like Linux epoll or FreeBSD kqueue,
  which is not even "the same shape" as poll(); DBusSocketSet is a
  replacement abstraction designed to be "the right shape"

* on Unix, our current implementation requires that DBusPollFD
  has the same memory layout as struct pollfd: that isn't an
  abstraction, that's just camouflage

I think we should admit that Windows and Unix are just too different to have an abstraction at that level, and use better abstractions instead.

On Bug #89444 I outlined a possible road-map for making this make more sense:

* add a Windows-specific dbus-socket-set-winsock.c which implements
  the DBusSocketSet interface using select() directly, just like -epoll.c
  uses Linux epoll directly; stop building any other DBusSocketSet backends
  on Windows, which lets us treat -poll.c as Unix-specific

* add _dbus_socket_poll() which polls exactly one socket
  (using _dbus_poll() on Unix and select() on Windows), and use that in
  dbus-transport-socket.c, something like this:

      DBusWatchFlags _dbus_socket_poll (DBusSocket sock,
          int timeout_milliseconds,
          DBusWatchFlags events);

  (or alternatively move the DBusSocketSet stuff into the shared library
  and use that, but I suspect that _dbus_socket_poll() would be really
  simple on both Windows and Unix)

* make _dbus_poll() and DBusPollFD Unix-specific, or just use poll()
  and struct pollfd directly (we do not currently support any non-Windows
  platforms without poll() and struct pollfd), in dbus-socket-set-poll.c,
  _dbus_socket_poll_one(), and the other remaining user of _dbus_poll(),
  which is the Unix-specific dbus-spawn.c

and an optional final step:

* ... that means we can change DBusPollable from int to struct { int }
  on Unix, if desired
Comment 1 Ralf Habacker 2015-05-05 11:43:15 UTC
I worked a few in that direction, not sure if it matches your needs, see 
 http://cgit.freedesktop.org/~rhabacker/dbus/log/?h=89444-comment-84-dbus-socket-set-winsock. It is based on the patches of bug 89444
Comment 2 Simon McVittie 2015-05-05 15:46:46 UTC
That looks like a good start, thanks! I'll try to check the design soon.
Comment 3 Simon McVittie 2015-05-12 18:29:59 UTC
> +/* dbus-socket-set-winsock.c - a socket set implemented via winsock

Did this start life as a copy of dbus-socket-set-poll.c? A lot of it looks extremely familiar. It might have a clearer git history if that was represented in git: one commit that just copies dbus-socket-set-poll.c, and a subsequent commit that renames things and hooks it up to the build system.

I was hoping you wouldn't have to rebuild the fd_set every time, but I was misremembering how select() works - it is indeed destructive (edits the fd_set), so what you're doing is the best we can do, unless it's considered valid in Winsock to say

    typedef struct {
      int n;
      fd_set read;
      fd_set write;
      fd_set exc;
    } DBusSocketSetWinsock;

    ... initialize those ...

    while (1)
      {
        fd_set r = set->read;             /* <- is this considered valid? */
        fd_set w = set->write;
        fd_set e = set->exc;

        ... = select (n, &r, &w, &e, timeout);
        ...
      }

It has copyright holders other than you, for the stuff copied from the poll implementation (which I wrote for Nokia, as you can see from its copyright boilerplate).

We could have a lot less diffstat if we left _dbus_poll() in dbus-sysdeps-win.c, and later in the patch series (when -poll becomes Unix-only), moved its declaration (along with that of DBusPollFD) to dbus-sysdeps-{win,unix}.h?

> +int
> +_dbus_socket_poll (DBusSocket s, int timeout)
> +{
> + DBusSocketEvent ready_fds;
> + DBusSocketSet *set = _dbus_socket_set_new (0);
> + ready_fds.fd = s;
> + int n_ready = _dbus_socket_set_poll (set, &ready_fds, 1, timeout);
> + _dbus_socket_set_free (set);
> + return n_ready;
> +}

This is going to need a way to get the "events" in and the "revents" out. I originally said it could return the "revents", but thinking about it, I suspect that might not actually be OK, because we also need to be able to distinguish between "event", "timed out", and "some error".

Creating a new DBusSocketSet is pretty inefficient for a one-shot operation: it does a significant amount of malloc()ing. My intention had been to just implement this in terms of native poll() (Unix) or select() (Windows) directly - the trivial case for a single socket is presumably much, much simpler than the "n sockets" case.

I'd like to use DBusWatchFlags for the events/revents, either immediately or after some refactoring. _DBUS_POLLWHATEVER is part of the same leaky abstraction as _dbus_poll(). We should never have to write code like this:

#ifdef _AIX
...
#define _DBUS_POLLIN      0x0001
...
#elif defined(__QNX__)
...
#define _DBUS_POLLIN      0x0005
...

and if we do, we certainly shouldn't try to claim that it's portable or an abstraction :-)

> -_dbus_socket_poll (DBusSocket s, int timeout)
> +_dbus_pollable_poll (DBusPollable p, int timeout)

Interesting idea, but my plan had been to only use _dbus_socket_poll() for DBusTransportSocket, and just use poll() directly for its Unix-specific uses (which are the only places where it's valid to poll a non-socket).

>  set (DBUS_SHARED_SOURCES ${DBUS_SHARED_SOURCES}
> ${DBUS_DIR}/dbus-file-win.c
> ${DBUS_DIR}/dbus-init-win.cpp
> + ${DBUS_DIR}/dbus-socket-set-winsock.c

I'd rather keep DBusSocketSet out of libdbus-1 if we can help it - it is explicitly not thread-safe, for a start.

If _dbus_socket_poll() just uses poll() or select() directly, then that shouldn't be too hard.
Comment 4 Simon McVittie 2015-05-12 18:32:03 UTC
(In reply to Simon McVittie from comment #0)
> * make _dbus_poll() and DBusPollFD Unix-specific, or just use poll()
>   and struct pollfd directly (we do not currently support any non-Windows
>   platforms without poll() and struct pollfd)

Oh, ugh, this is not actually true... _dbus_poll() falls back to select() :-(
Comment 5 Simon McVittie 2015-05-12 19:00:47 UTC
OK, I have a simpler plan for a starting point. I'll attach a patch shortly.
Comment 6 Simon McVittie 2015-05-12 19:54:21 UTC
(In reply to Simon McVittie from comment #5)
> OK, I have a simpler plan for a starting point. I'll attach a patch shortly.

Or not. Sorry, still thinking about the best way to attack this, given that there are two annoyingly different code paths on Unix, and we still need something essentially equivalent to an array of DBusPollFD on Windows.
Comment 7 Simon McVittie 2015-07-03 10:30:37 UTC
11:27 < smcv> the point of DBusPollFD is to try to avoid including OS-dependent 
              bits in otherwise OS-independent APIs
11:27 < smcv> but now I think about it
11:27 < smcv> we already do
11:27 < smcv> #ifdef HAVE_POLL
11:27 < smcv> #include <sys/poll.h>
11:27 < smcv> #endif
11:27 < vapula> exactly
11:27 < smcv> so... we could maybe do
11:27 < smcv> #ifdef HAVE_POLL
11:28 < smcv> typedef struct pollfd DBusPollFD;
11:28 < vapula> yup
11:28 < smcv> #define _DBUS_POLLIN POLLIN /* etc. */
11:28 < smcv> #else
11:28 < smcv> ... what we do now
11:28 < smcv> that would be fractionally less awful
11:28 < smcv> it still isn't an abstraction
11:28 < vapula> benefit is that you can also remove the runtime check in 
                _dbus_poll
11:29 < smcv> right
11:29 < smcv> oh, it should be #if defined(HAVE_POLL) && !defined(BROKEN_POLL)
11:29 < smcv> because apparently we support platforms where poll() exists but 
              is wrong, or something
11:29 < smcv> portability. isn't it wonderful
Comment 8 Simon McVittie 2015-07-03 10:33:35 UTC
(this was in the context of vapula wanting to port libdbus to a the Integrity RTOS, where struct pollfd has additional members beyond its traditional "shape")
Comment 9 Rolland Dudemaine 2015-07-03 13:56:44 UTC
Created attachment 116919 [details] [review]
when HAVE_POLL, use the system definition of struct pollfd

Patch to map system definitions of struct pollfd and POLL* macros to d-bus ones when available.
Of note, the current portable DBusPollFD is not fully portable, since http://pubs.opengroup.org/onlinepubs/009695399/basedefs/poll.h.html mentions that there are some members _at least_ there, but there may be more.
Comment 10 Rolland Dudemaine 2015-07-03 13:57:42 UTC
Forgot to mention that was created against 1.9.16
Comment 11 Simon McVittie 2015-07-03 15:06:30 UTC
Comment on attachment 116919 [details] [review]
when HAVE_POLL, use the system definition of struct pollfd

Review of attachment 116919 [details] [review]:
-----------------------------------------------------------------

::: dbus-1.9.16.linux/dbus/dbus-sysdeps.h
@@ +341,4 @@
>  #define _DBUS_POLLHUP     0x0040
>  /** Invalid request: fd not open */
>  #define _DBUS_POLLNVAL    0x1000
> +#elif HAVE_POLL

This should be #if defined(HAVE_POLL) && !defined(BROKEN_POLL), i.e. exactly the circumstances in which we call the real poll() at all.

The OS-specific bits above here (for QNX, Haiku, AIX) can go away, because we should now be using their genuine POLLIN, POLLPRI etc. constants in any case.

The #else should have a comment "we fake poll() in terms of select()" or something.

@@ +371,5 @@
>  
> +/**
> + * A portable struct pollfd wrapper. 
> + */
> +#if HAVE_POLL

#if defined(HAVE_POLL) && !defined(BROKEN_POLL)

(Be careful with the difference between #if X and #if defined(X) - they are not the same, although the latter is equivalent to #ifdef X.)
Comment 12 Simon McVittie 2015-07-03 15:08:27 UTC
(In reply to Rolland Dudemaine from comment #9)
> Of note, the current portable DBusPollFD is not fully portable, since
> http://pubs.opengroup.org/onlinepubs/009695399/basedefs/poll.h.html mentions
> that there are some members _at least_ there, but there may be more.

If you delete the AIX, Haiku, QNX special cases as I suggested, then we will be defining our own structure if and only if we are not actually going to call poll() anyway.

See dbus/dbus-sysdeps-unix.c for the actual implementation; as you noted on IRC, the runtime checks can go away too.

In other words, the changes I'm asking you for are really just an exercise in deleting code :-)
Comment 13 Simon McVittie 2015-07-03 15:10:30 UTC
It would be ideal if you could propose a patch in "git format-patch" format - that's the easiest way to maintain attribution information when I merge stuff.
Comment 14 Rolland Dudemaine 2015-07-04 14:00:39 UTC
Created attachment 116941 [details] [review]
pdated patch including _dbus_poll()

Here is a git patch against HEAD with both the header and _dbus_poll() implementation.
Better ?
Comment 15 Simon McVittie 2015-07-13 09:59:29 UTC
Comment on attachment 116941 [details] [review]
pdated patch including _dbus_poll()

Review of attachment 116941 [details] [review]:
-----------------------------------------------------------------

Yes, this looks good. I'll adjust the comments to make it a little clearer.

::: dbus/dbus-sysdeps-unix.c
@@ +2758,3 @@
>      }
> +
> +  return poll ((struct pollfd*) fds,

This cast isn't actually needed any more, which is a plus.
Comment 16 Simon McVittie 2015-08-06 13:54:49 UTC
Fixed in git for 1.9.20, with some adjustments.

I think this is the best abstraction we're likely to get, so I'm closing this.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.