URCU background threads vs signalfd
Commit Message
On 2022-09-23 13:55, Eric Wong wrote:
> Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
>> On 2022-09-22 05:15, Eric Wong via lttng-dev wrote:
>>> Hello, I'm using urcu-bp + rculfhash + call_rcu to implement
>>> malloc instrumentation (via LD_PRELOAD) on an existing
>>> single-threaded Perl codebase which uses Linux signalfd.
>>>
>>> signalfd depends on signals being blocked in all threads
>>> of the process, otherwise threads with unblocked signals
>>> can receive them and starve the signalfd.
>>>
>>> While some threads in URCU do block signals (e.g. workqueue
>>> worker for rculfhash), the call_rcu thread and rculfhash
>>> partition_resize_helper threads do not...
>>>
>>> Should all threads URCU creates block signals (aside from SIGRCU)?
>>
>> Yes, I think you are right. The SIGRCU signal is only needed for the
>> urcu-signal flavor though.
>>
>> Would you like to submit a patch ?
>
> Sure.
>
> Is there a way to detect at runtime when urcu-signal is in use
> so SIGRCU (SIGUSR1) doesn't get unblocked when using other flavors?
>
> I actually use SIGUSR1 in my signalfd-using codebase.
>
> I also want to remove cds_lfht_worker_init entirely since it's racy.
> Signal blocking needs to be done in the parent before pthread_create
> to avoid a window where the child has unblocked signals.
>
> Thanks. Anyways, this is my work-in-progress:
>
Perhaps with this on top of your wip patch ? The idea is to always block
all signals before creating threads, and only unblock SIGRCU when
registering a urcu-signal thread. (compile-tested only)
Comments
Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
> On 2022-09-23 13:55, Eric Wong wrote:
> > Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
> > > On 2022-09-22 05:15, Eric Wong via lttng-dev wrote:
> > > > Hello, I'm using urcu-bp + rculfhash + call_rcu to implement
> > > > malloc instrumentation (via LD_PRELOAD) on an existing
> > > > single-threaded Perl codebase which uses Linux signalfd.
> > > >
> > > > signalfd depends on signals being blocked in all threads
> > > > of the process, otherwise threads with unblocked signals
> > > > can receive them and starve the signalfd.
> > > >
> > > > While some threads in URCU do block signals (e.g. workqueue
> > > > worker for rculfhash), the call_rcu thread and rculfhash
> > > > partition_resize_helper threads do not...
> > > >
> > > > Should all threads URCU creates block signals (aside from SIGRCU)?
> > >
> > > Yes, I think you are right. The SIGRCU signal is only needed for the
> > > urcu-signal flavor though.
> > >
> > > Would you like to submit a patch ?
> >
> > Sure.
> >
> > Is there a way to detect at runtime when urcu-signal is in use
> > so SIGRCU (SIGUSR1) doesn't get unblocked when using other flavors?
> >
> > I actually use SIGUSR1 in my signalfd-using codebase.
> >
> > I also want to remove cds_lfht_worker_init entirely since it's racy.
> > Signal blocking needs to be done in the parent before pthread_create
> > to avoid a window where the child has unblocked signals.
> >
> > Thanks. Anyways, this is my work-in-progress:
> >
>
> Perhaps with this on top of your wip patch ? The idea is to always block all
> signals before creating threads, and only unblock SIGRCU when registering a
> urcu-signal thread. (compile-tested only)
Thanks, that makes sense. It passes: make check short_bench
My original signalfd + urcu-bp case works well, too, with my
constructor workarounds reverted. (I ported our patches ported to
to 0.10.2 for Debian buster (oldstable)).
I don't know if the existing test coverage is sufficient,
though. Waiting on regtest...
On 2022-09-23 18:05, Eric Wong wrote:
> Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
>> On 2022-09-23 13:55, Eric Wong wrote:
>>> Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
>>>> On 2022-09-22 05:15, Eric Wong via lttng-dev wrote:
>>>>> Hello, I'm using urcu-bp + rculfhash + call_rcu to implement
>>>>> malloc instrumentation (via LD_PRELOAD) on an existing
>>>>> single-threaded Perl codebase which uses Linux signalfd.
>>>>>
>>>>> signalfd depends on signals being blocked in all threads
>>>>> of the process, otherwise threads with unblocked signals
>>>>> can receive them and starve the signalfd.
>>>>>
>>>>> While some threads in URCU do block signals (e.g. workqueue
>>>>> worker for rculfhash), the call_rcu thread and rculfhash
>>>>> partition_resize_helper threads do not...
>>>>>
>>>>> Should all threads URCU creates block signals (aside from SIGRCU)?
>>>>
>>>> Yes, I think you are right. The SIGRCU signal is only needed for the
>>>> urcu-signal flavor though.
>>>>
>>>> Would you like to submit a patch ?
>>>
>>> Sure.
>>>
>>> Is there a way to detect at runtime when urcu-signal is in use
>>> so SIGRCU (SIGUSR1) doesn't get unblocked when using other flavors?
>>>
>>> I actually use SIGUSR1 in my signalfd-using codebase.
>>>
>>> I also want to remove cds_lfht_worker_init entirely since it's racy.
>>> Signal blocking needs to be done in the parent before pthread_create
>>> to avoid a window where the child has unblocked signals.
>>>
>>> Thanks. Anyways, this is my work-in-progress:
>>>
>>
>> Perhaps with this on top of your wip patch ? The idea is to always block all
>> signals before creating threads, and only unblock SIGRCU when registering a
>> urcu-signal thread. (compile-tested only)
>
> Thanks, that makes sense. It passes: make check short_bench
>
> My original signalfd + urcu-bp case works well, too, with my
> constructor workarounds reverted. (I ported our patches ported to
> to 0.10.2 for Debian buster (oldstable)).
>
> I don't know if the existing test coverage is sufficient,
> though. Waiting on regtest...
Do you mind if I fold our 2 patches together with a Co-developed-by tag
and use your Signed-off-by ?
Then the question that will arise is whether this change is sufficiently
self-contained that we can push it to stable branches, or if it's master
branch material only.
Thoughts ?
Thanks,
Mathieu
Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
> Do you mind if I fold our 2 patches together with a Co-developed-by tag and
> use your Signed-off-by ?
Looks good to me, thanks.
> Then the question that will arise is whether this change is sufficiently
> self-contained that we can push it to stable branches, or if it's master
> branch material only.
It seems self-contained enough and fixes a real problem
encountered in real-world usage, so I think it's suitable for stable.
Thanks.
On 2022-09-26 15:58, Eric Wong wrote:
> Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
>> Do you mind if I fold our 2 patches together with a Co-developed-by tag and
>> use your Signed-off-by ?
>
> Looks good to me, thanks.
>
>> Then the question that will arise is whether this change is sufficiently
>> self-contained that we can push it to stable branches, or if it's master
>> branch material only.
>
> It seems self-contained enough and fixes a real problem
> encountered in real-world usage, so I think it's suitable for stable.
That's tricky. I try very hard not to introduce any user-observable
regressions in stable branches. I can very well imagine an application
which, for its own peculiar reasons, would expect signals to be
unblocked in call_rcu worker thread callbacks.
On the other hand, those worker threads have always had the behavior of
having signals unblocked so far, therefore an application that wants to
use signalfd() today with those liburcu worker threads already needs to
take some care about how those threads get created.
Therefore, it's not a small self-contained change: it has effects on the
blocked signal masks that are observable from the application call_rcu
callbacks. So far, applications which expect to use signalfd() in a
non-broken way need to ensure they either have the work-around in place
around thread creation, or will need to add a dependency on an upcoming
liburcu version (e.g. 0.14).
For all those reasons, I think it would be a good time to do a liburcu
0.14 release soon, and I don't think backporting this patch makes sense
for stable branches of liburcu.
Thanks,
Mathieu
@@ -2174,29 +2174,6 @@ static struct urcu_atfork cds_lfht_atfork = {
.after_fork_child = cds_lfht_after_fork_child,
};
-/*
- * Block all signals for the workqueue worker thread to ensure we don't
- * disturb the application. The SIGRCU signal needs to be unblocked for
- * the urcu-signal flavor.
- */
-static void cds_lfht_worker_init(
- struct urcu_workqueue *workqueue __attribute__((unused)),
- void *priv __attribute__((unused)))
-{
- int ret;
- sigset_t mask;
-
- ret = sigfillset(&mask);
- if (ret)
- urcu_die(errno);
- ret = sigdelset(&mask, SIGRCU);
- if (ret)
- urcu_die(errno);
- ret = pthread_sigmask(SIG_SETMASK, &mask, NULL);
- if (ret)
- urcu_die(ret);
-}
-
static void cds_lfht_init_worker(const struct rcu_flavor_struct *flavor)
{
flavor->register_rculfhash_atfork(&cds_lfht_atfork);
@@ -2205,7 +2182,7 @@ static void cds_lfht_init_worker(const struct
rcu_flavor_struct *flavor)
if (cds_lfht_workqueue_user_count++)
goto end;
cds_lfht_workqueue = urcu_workqueue_create(0, -1, NULL,
- NULL, cds_lfht_worker_init, NULL, NULL, NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL, NULL, NULL, NULL);
end:
mutex_unlock(&cds_lfht_fork_mutex);
}
@@ -110,6 +110,8 @@ static int init_done;
void __attribute__((constructor)) rcu_init(void);
void __attribute__((destructor)) rcu_exit(void);
+
+static DEFINE_URCU_TLS(int, rcu_signal_was_blocked);
#endif
/*
@@ -537,8 +539,52 @@ int rcu_read_ongoing(void)
return _rcu_read_ongoing();
}
+#ifdef RCU_SIGNAL
+/*
+ * Make sure the signal used by the urcu-signal flavor is unblocked
+ * while the thread is registered.
+ */
+static
+void urcu_signal_unblock(void)
+{
+ sigset_t mask, oldmask;
+ int ret;
+
+ ret = sigemptyset(&mask);
+ urcu_posix_assert(!ret);
+ ret = sigaddset(&mask, SIGRCU);
+ urcu_posix_assert(!ret);
+ ret = pthread_sigmask(SIG_UNBLOCK, &mask, &oldmask);
+ urcu_posix_assert(!ret);
+ URCU_TLS(rcu_signal_was_blocked) = sigismember(&oldmask, SIGRCU);
+}
+
+static
+void urcu_signal_restore(void)
+{
+ sigset_t mask;
+ int ret;
+
+ if (!URCU_TLS(rcu_signal_was_blocked))
+ return;
+ ret = sigemptyset(&mask);
+ urcu_posix_assert(!ret);
+ ret = sigaddset(&mask, SIGRCU);
+ urcu_posix_assert(!ret);
+ ret = pthread_sigmask(SIG_BLOCK, &mask, NULL);
+ urcu_posix_assert(!ret);
+}
+#else
+static
+void urcu_signal_unblock(void) { }
+static
+void urcu_signal_restore(void) { }
+#endif
+
void rcu_register_thread(void)
{
+ urcu_signal_unblock();
+
URCU_TLS(rcu_reader).tid = pthread_self();
urcu_posix_assert(URCU_TLS(rcu_reader).need_mb == 0);
urcu_posix_assert(!(URCU_TLS(rcu_reader).ctr & URCU_GP_CTR_NEST_MASK));
@@ -558,6 +604,8 @@ void rcu_unregister_thread(void)
URCU_TLS(rcu_reader).registered = 0;
cds_list_del(&URCU_TLS(rcu_reader).node);
mutex_unlock(&rcu_registry_lock);
+
+ urcu_signal_restore();
}
#ifdef RCU_MEMBARRIER