Naive lock free work stealing queue

Recently, I read a article about work stealing queue Job System 2.0: Lock-Free Work Stealing – Part 3: Going lock-free, and this is my c++11 naive implementation based on my understand of c++11 memory order model. This works on x86 , as x86 has a strong memory model, is there any issue running on other architecture (weak memory order)?

#include <atomic>

class Job;

struct WorkStealingQueue
{
    // only called by owner work thread
    void Push(Job* job) noexcept
    {
        // m_bottom -> stealing thread read, owner thread read, write.
        auto bottom = m_bottom.load(std::memory_order_relaxed);

        m_jobs[bottom & MASK] = job;

        // need  let stealing thread see the new job.
        m_bottom.fetch_add(1, std::memory_order_release);
    }

    // only called by owner worker thread 
    Job* Pop(void) noexcept
    {
        auto bottom = m_bottom.load(std::memory_order_relaxed);

        auto top = m_top.load(std::memory_order_acquire);

        if (bottom > top)
        {
            auto job = m_jobs[(bottom - 1) & MASK];

            if(top < (bottom - 1)) 
            {
                m_bottom.fetch_sub(1, std::memory_order_release);
                return job;   
            }
            if (m_top.compare_exchange_weak(
                top, top + 1,
                std::memory_order_release,
                std::memory_order_relaxed))
            {
                return job;
            } 
        }

        return nullptr;
    }

    // called by stealing thread, not owner thread
    Job* Steal(void) noexcept
    {
        auto top = m_top.load(std::memory_order_acquire);

        // Release-Acquire ordering, so the stealing thread see new job
        auto bottom = m_bottom.load(std::memory_order_acquire);

        if (bottom > top)
        {
            auto job = m_jobs[top & MASK];
            // check if other stealing thread stealing this work 
            // or owner thread pop this job.
            // no data should do sync, so use relaxed oreder
            if (m_top.compare_exchange_weak(
                top, top + 1,
                std::memory_order_release,
                std::memory_order_relaxed))
            {
                return job;
            }
        }

        return nullptr;
    }

private:

    static constexpr auto MAX_COUNT = 4096u;
    static constexpr auto MASK = MAX_COUNT - 1u;
    static_assert((MAX_COUNT & MASK) == 0, "the max number of job must be power of two.");

    Job* m_jobs[MAX_COUNT];
    std::atomic<unsigned int> m_bottom{ 0 }, m_top{ 0 };
};

Answer

Pop() has problem

Consider what happens if your queue is in the following state with two jobs:

m_top    = 0
m_bottom = 2

The owner thread calls pop(), and loads in to local variables:

top    = 0
bottom = 2

Now, before the owner thread does anything else, two stealer threads call steal(), leaving the state of the queue like this:

m_top    = 2
m_bottom = 2

When the owner thread continues inside of pop(), it will go into this case and return a job that no longer exists:

    if(top < (bottom - 1)) 
    {
        m_bottom.fetch_sub(1, std::memory_order_release);
        return job;   
    }

Your code deviated from the code you linked. In the linked code, the first thing that pop() did was to decrement m_bottom so that the stealer threads could not steal the bottom job.

Attribution
Source : Link , Question Author : benlong , Answer Author : Community

Leave a Comment