Commit eddb0860 authored by fpizlo@apple.com's avatar fpizlo@apple.com

Memory barrier support should also ensure that we always do a compiler fence

https://bugs.webkit.org/show_bug.cgi?id=114934

Reviewed by Michael Saboff.
        
This is a cherry-pick merge of the WTF part of r148836 from the dfgFourthTier
branch. It fixes a memory ordering bug that is likely asymptomatic, but
nonetheless real: TCSpinLock expects that using a memoryBarrierBeforeUnlock()
prior to setting lockword_ to 0 will ensure that the assignment to lockword_
won't get floated above any of the stores in the critical section. While that
memory barrier does indeed do the right thing on ARM, it doesn't do the right
thing on other architectures: it turns into empty code that the compiler blows
away, which is fine for the hardware since X86 won't reorder that store - but
it's not fine for the compiler, which may still do its own reorderings.
        
The WTF part of r148836 fixes this by using a compiler fence: an empty asm
volatile block that is marked as clobbering memory.
        
Instead of doing a separate surgical fix in trunk, I decided to merge the
whole WTF change over, to make merging easier in the future.
        
Performance testing of this change in dfgFourthTier showed no regression.

* wtf/Atomics.h:
(WTF::compilerFence):
(WTF::armV7_dmb):
(WTF::armV7_dmb_st):
(WTF::loadLoadFence):
(WTF::loadStoreFence):
(WTF::storeLoadFence):
(WTF::storeStoreFence):
(WTF::memoryBarrierAfterLock):
(WTF::memoryBarrierBeforeUnlock):
(WTF::x86_mfence):



git-svn-id: http://svn.webkit.org/repository/webkit/trunk@148888 268f45cc-cd09-0410-ab3c-d52691b4dbfc
parent 340e4960
2013-04-21 Filip Pizlo <fpizlo@apple.com>
Memory barrier support should also ensure that we always do a compiler fence
https://bugs.webkit.org/show_bug.cgi?id=114934
Reviewed by Michael Saboff.
This is a cherry-pick merge of the WTF part of r148836 from the dfgFourthTier
branch. It fixes a memory ordering bug that is likely asymptomatic, but
nonetheless real: TCSpinLock expects that using a memoryBarrierBeforeUnlock()
prior to setting lockword_ to 0 will ensure that the assignment to lockword_
won't get floated above any of the stores in the critical section. While that
memory barrier does indeed do the right thing on ARM, it doesn't do the right
thing on other architectures: it turns into empty code that the compiler blows
away, which is fine for the hardware since X86 won't reorder that store - but
it's not fine for the compiler, which may still do its own reorderings.
The WTF part of r148836 fixes this by using a compiler fence: an empty asm
volatile block that is marked as clobbering memory.
Instead of doing a separate surgical fix in trunk, I decided to merge the
whole WTF change over, to make merging easier in the future.
Performance testing of this change in dfgFourthTier showed no regression.
* wtf/Atomics.h:
(WTF::compilerFence):
(WTF::armV7_dmb):
(WTF::armV7_dmb_st):
(WTF::loadLoadFence):
(WTF::loadStoreFence):
(WTF::storeLoadFence):
(WTF::storeStoreFence):
(WTF::memoryBarrierAfterLock):
(WTF::memoryBarrierBeforeUnlock):
(WTF::x86_mfence):
2013-04-22 David Kilzer <ddkilzer@apple.com>
WTF::AtomicString::find() should take unsigned 'start' argument
......
/*
* Copyright (C) 2007, 2008, 2010, 2012 Apple Inc. All rights reserved.
* Copyright (C) 2007, 2008, 2010, 2012, 2013 Apple Inc. All rights reserved.
* Copyright (C) 2007 Justin Haygood (jhaygood@reaktix.com)
*
* Redistribution and use in source and binary forms, with or without
......@@ -64,6 +64,7 @@
#include <wtf/UnusedParam.h>
#if OS(WINDOWS)
#include <intrin.h>
#include <windows.h>
#elif OS(QNX)
#include <atomic.h>
......@@ -198,22 +199,70 @@ inline bool weakCompareAndSwapUIntPtr(volatile uintptr_t* location, uintptr_t ex
return weakCompareAndSwap(reinterpret_cast<void*volatile*>(location), reinterpret_cast<void*>(expected), reinterpret_cast<void*>(newValue));
}
// Just a compiler fence. Has no effect on the hardware, but tells the compiler
// not to move things around this call. Should not affect the compiler's ability
// to do things like register allocation and code motion over pure operations.
inline void compilerFence()
{
#if OS(WINDOWS)
_ReadWriteBarrier();
#else
asm volatile("" ::: "memory");
#endif
}
#if CPU(ARM_THUMB2)
inline void memoryBarrierAfterLock()
// Full memory fence. No accesses will float above this, and no accesses will sink
// below it.
inline void armV7_dmb()
{
asm volatile("dmb" ::: "memory");
}
inline void memoryBarrierBeforeUnlock()
// Like the above, but only affects stores.
inline void armV7_dmb_st()
{
asm volatile("dmb" ::: "memory");
asm volatile("dmb st" ::: "memory");
}
inline void loadLoadFence() { armV7_dmb(); }
inline void loadStoreFence() { armV7_dmb(); }
inline void storeLoadFence() { armV7_dmb(); }
inline void storeStoreFence() { armV7_dmb_st(); }
inline void memoryBarrierAfterLock() { armV7_dmb(); }
inline void memoryBarrierBeforeUnlock() { armV7_dmb(); }
#elif CPU(X86) || CPU(X86_64)
inline void x86_mfence()
{
#if OS(WINDOWS)
// I think that this does the equivalent of a dummy interlocked instruction,
// instead of using the 'mfence' instruction, at least according to MSDN. I
// know that it is equivalent for our purposes, but it would be good to
// investigate if that is actually better.
MemoryBarrier();
#else
asm volatile("mfence" ::: "memory");
#endif
}
inline void loadLoadFence() { compilerFence(); }
inline void loadStoreFence() { compilerFence(); }
inline void storeLoadFence() { x86_mfence(); }
inline void storeStoreFence() { compilerFence(); }
inline void memoryBarrierAfterLock() { compilerFence(); }
inline void memoryBarrierBeforeUnlock() { compilerFence(); }
#else
inline void memoryBarrierAfterLock() { }
inline void memoryBarrierBeforeUnlock() { }
inline void loadLoadFence() { compilerFence(); }
inline void loadStoreFence() { compilerFence(); }
inline void storeLoadFence() { compilerFence(); }
inline void storeStoreFence() { compilerFence(); }
inline void memoryBarrierAfterLock() { compilerFence(); }
inline void memoryBarrierBeforeUnlock() { compilerFence(); }
#endif
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment