smp_call_function_many: handle concurrent clearing of mask
commit 723aae25d5cdb09962901d36d526b44d4be1051c upstream.

Mike Galbraith reported a lockup ("perma-spin bug") in which the
cpumask passed to smp_call_function_many was cleared by other cpu(s)
while a cpu was preparing its call_data block, leaving no cpu to
clear the last ref and unlock the block.
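To make the interleaving concrete, here is a minimal userspace sketch in
plain C with pthreads (not kernel code; the mask value, thread roles, and
popcount-based weight are illustrative assumptions): one thread plays the
initiating cpu that commits to the slow path and then derives its refcount
from the mask, while another plays the target cpus clearing their own bits
in between.  Build with cc -pthread.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_ulong mask = 0x6;	/* caller's mask: "cpus" 1 and 2 set */

/* Plays the target cpus clearing their own bits asynchronously. */
static void *clear_bits(void *unused)
{
	atomic_fetch_and(&mask, ~0x6UL);
	return NULL;
}

int main(void)
{
	pthread_t t;

	/* Initiator sees a non-empty mask and commits to the slow path. */
	printf("initiator saw mask %#lx\n", atomic_load(&mask));

	/* The targets clear their bits; the join forces the racy
	 * interleaving to happen deterministically for the demo. */
	pthread_create(&t, NULL, clear_bits, NULL);
	pthread_join(&t, NULL);

	/* Initiator now derives the refcount from the emptied mask. */
	int refs = __builtin_popcountl(atomic_load(&mask));
	printf("refs = %d -> %s\n", refs,
	       refs ? "ok" : "no cpu left to drop the last ref (perma-spin)");
	return 0;
}

With refs computed as 0, the real code would publish a call_data block that
no cpu will ever decrement, so the initiator spins on it forever.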

Having cpus clear their bit asynchronously could be useful on a mask of
cpus that might have a translation context, or cpus that need a push to
complete an rcu window.

Instead of adding a BUG_ON and requiring yet another cpumask copy, just
detect the race and handle it.
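The shape of that handling, as a hedged standalone sketch (the refs check
mirrors the kernel/smp.c hunk below; csd_locked and publish_call are
hypothetical stand-ins, not kernel API): compute the weight once into refs,
and if the mask was emptied under us, release the call data and return
rather than publishing a block nobody will finish.

#include <stdio.h>

static int csd_locked = 1;	/* stand-in for data->csd's lock flag */

static void csd_unlock(void)	/* hypothetical stand-in */
{
	csd_locked = 0;
}

/* Returns the refcount the block would be published with, 0 if we bail. */
static int publish_call(unsigned long mask_snapshot)
{
	int refs = __builtin_popcountl(mask_snapshot);

	/* Some callers race with other cpus changing the passed mask */
	if (refs == 0) {
		csd_unlock();	/* no target cpu will ever unlock it */
		return 0;
	}
	/* ... list insertion and the atomic_set of refs would go here ... */
	return refs;
}

int main(void)
{
	printf("refs=%d csd_locked=%d\n", publish_call(0x0UL), csd_locked);
	return 0;
}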

Note: arch_send_call_function_ipi_mask must still handle an empty
cpumask because the data block is globally visible before that arch
callback is made.  And (obviously) there are no guarantees as to which
cpus are notified if the mask is changed during the call; only cpus that
were online and had their mask bit set during the whole call are
guaranteed to be called.
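In the same userspace style, a hedged illustration of that last guarantee
(live_mask and the snapshot discipline are assumptions, not kernel API): a
caller that needs a deterministic notification set should work from a
private copy taken up front, since only bits that stay set for the whole
call are honored.

#include <stdatomic.h>
#include <stdio.h>

static atomic_ulong live_mask = 0x6;	/* bits other threads may clear */

int main(void)
{
	/* A private copy taken up front is immune to later clears of
	 * live_mask, so the target set stays deterministic. */
	unsigned long stable = atomic_load(&live_mask);

	atomic_fetch_and(&live_mask, 0UL);	/* concurrent clearing */

	printf("guaranteed targets %#lx, live mask now %#lx\n",
	       stable, atomic_load(&live_mask));
	return 0;
}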

Reported-by: Mike Galbraith <[email protected]>
Reported-by: Jan Beulich <[email protected]>
Acked-by: Jan Beulich <[email protected]>
Signed-off-by: Milton Miller <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Andi Kleen <[email protected]>
Milton Miller authored and Kali- committed May 12, 2011
1 parent c046d7e commit 8798f71
Showing 1 changed file with 10 additions and 3 deletions.
13 changes: 10 additions & 3 deletions kernel/smp.c
@@ -427,7 +427,7 @@ void smp_call_function_many(const struct cpumask *mask,
 {
 	struct call_function_data *data;
 	unsigned long flags;
-	int cpu, next_cpu, this_cpu = smp_processor_id();
+	int refs, cpu, next_cpu, this_cpu = smp_processor_id();
 
 	/*
 	 * Can deadlock when called with interrupts disabled.
@@ -438,7 +438,7 @@ void smp_call_function_many(const struct cpumask *mask,
 	WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
 		     && !oops_in_progress);
 
-	/* So, what's a CPU they want? Ignoring this one. */
+	/* Try to fastpath. So, what's a CPU they want? Ignoring this one. */
 	cpu = cpumask_first_and(mask, cpu_online_mask);
 	if (cpu == this_cpu)
 		cpu = cpumask_next_and(cpu, mask, cpu_online_mask);
@@ -496,6 +496,13 @@ void smp_call_function_many(const struct cpumask *mask,
 	/* We rely on the "and" being processed before the store */
 	cpumask_and(data->cpumask, mask, cpu_online_mask);
 	cpumask_clear_cpu(this_cpu, data->cpumask);
+	refs = cpumask_weight(data->cpumask);
+
+	/* Some callers race with other cpus changing the passed mask */
+	if (unlikely(!refs)) {
+		csd_unlock(&data->csd);
+		return;
+	}
 
 	raw_spin_lock_irqsave(&call_function.lock, flags);
 	/*
@@ -509,7 +516,7 @@ void smp_call_function_many(const struct cpumask *mask,
 	 * to the cpumask before this write to refs, which indicates
 	 * data is on the list and is ready to be processed.
 	 */
-	atomic_set(&data->refs, cpumask_weight(data->cpumask));
+	atomic_set(&data->refs, refs);
 	raw_spin_unlock_irqrestore(&call_function.lock, flags);
 
 	/*
