    Switch FTL GetById/PutById IC's over to using AnyRegCC · d2ceb399
    fpizlo@apple.com authored
    https://bugs.webkit.org/show_bug.cgi?id=124094
    
    Source/JavaScriptCore: 
    
    Reviewed by Sam Weinig.
            
    This closes the loop on inline caches (IC's) in the FTL. The goal is to have IC's
    in LLVM-generated code that are just as efficient as (if not more efficient than)
    what a custom JIT could do. As in zero sources of overhead. Not a single extra instruction
    or even register allocation pathology. We accomplish this by having two thingies in
    LLVM. First is the llvm.experimental.patchpoint intrinsic, which is sort of an
    inline machine code snippet that we can fill in with whatever we want and then
    modify subsequently. But you have only two choices of how to pass values to a
    patchpoint: (1) via the calling convention or (2) via the stackmap. Neither is good
    for operands to an IC (like the base pointer for a GetById, for example). (1) is bad
    because it results in things being pinned to certain registers a priori; a custom
    JIT (like the DFG) will not pin IC operands to any registers a priori but will allow
    the register allocator to do whatever it wants. (2) is bad because the operands may
    be spilled or may be represented in other crazy ways. You generally want an IC to
    have its operands in registers. Also, patchpoints only return values using the
    calling convention, which is unfortunate since it pins the return value to a
    register a priori. This is where the second thingy comes in: the AnyRegCC. This is
    a special calling convention only for use with patchpoints. It means that arguments
    passed "by CC" in the patchpoint can be placed in any register, and the register
    that gets used is reported as part of the stackmap. It also means that the return
    value (if there is one) can be placed in any register, and the stackmap will tell
    you which one it was. Thus, patchpoints combined with AnyRegCC mean that you not
    only get the kind of self-modifying code that you want for IC's, but you also get
    all of the register allocation goodness that a custom JIT would have given you.
    Except that you're getting it from LLVM and not a custom JIT. Awesome.
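
    A minimal sketch of what "the stackmap will tell you which one it was" means for
    the IC generator. Everything here is hypothetical: StackmapLocation, IcRegisters
    and resolveGetByIdRegisters are illustration-only names, not LLVM or JSC types,
    and the layout (return value first, then arguments in order) is an assumption
    about how an AnyRegCC patchpoint's locations are recorded.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified view of one stackmap location entry.
enum class LocationKind { Register, Indirect, Constant };

struct StackmapLocation {
    LocationKind kind;
    int dwarfRegister; // register number; meaningful for Register and Indirect
    int offset;        // stack offset; meaningful for Indirect only
};

// The registers a GetById IC would be generated against.
struct IcRegisters {
    int resultRegister;
    int baseRegister;
};

// Assumption: location 0 is the return value and location 1 is the first
// "by CC" argument (the base pointer). Returns false if either operand did
// not end up directly in a register, which AnyRegCC is meant to rule out.
bool resolveGetByIdRegisters(const std::vector<StackmapLocation>& locations, IcRegisters& out)
{
    if (locations.size() < 2)
        return false;
    const StackmapLocation& result = locations[0];
    const StackmapLocation& base = locations[1];
    if (result.kind != LocationKind::Register || base.kind != LocationKind::Register)
        return false;
    out.resultRegister = result.dwarfRegister;
    out.baseRegister = base.dwarfRegister;
    return true;
}
```

    The point of the sketch is only that the IC code is emitted against whatever
    registers the record names, rather than against fixed C CC argument registers.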
            
    Even though all of the fun stuff is on the LLVM side, this patch was harder than
    you'd expect.
            
    First the obvious bits:
            
    - IC patchpoints now use AnyRegCC instead of the C CC. (CC = calling convention.)
            
    - FTL::fixFunctionBasedOnStackMaps() now correctly figures out which registers the
      IC is supposed to use instead of assuming C CC argument registers.
            
    And then all of the stuff that broke and that this patch fixes:
            
    - IC sizing based on generating a dummy IC (what FTLInlineCacheSize did) doesn't
      work on x86-64, where the size of the generated code depends on which registers
      get used: various register permutations lead to bizarre header bytes and eclectic
      SIB encodings. I changed that to use magic constants, for now.
            
    - Slow path calls didn't preserve the CC return register.
            
    - Repatch's scratch register allocation would get totally confused if the operand
      registers weren't one of the DFG-style "temp" registers. And by "totally confused"
      I mean that it would crash.
            
    - We assumed that r10 is callee-saved. It's not. That one dude's PPT about x86-64
      cdecl that I found on the intertubes was not a trustworthy source of information,
      apparently.
            
    - Call repatching didn't know that the FTL does its IC slow calls via specially
      generated thunks. This was particularly fun to fix: basically, now when we relink
      an IC call in the FTL, we use the old call target to find the SlowPathCallKey,
      which tells us everything we need to know to generate (or look up) a new thunk for
      the new function we want to call.
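
    The relinking scheme in the last item can be sketched roughly as follows. This
    is a simplified model, not the JSC code: SlowPathKey and ThunkTable are
    hypothetical stand-ins, addresses are plain integers, and the "thunk" is just a
    freshly numbered address. The method names echo the ones touched by this patch
    but their signatures here are assumptions.

```cpp
#include <cstdint>
#include <map>
#include <tuple>

using CodeAddress = std::uintptr_t;

// Hypothetical key: what a slow path thunk needs to know about its call site.
struct SlowPathKey {
    std::uint32_t usedRegisters; // bitmask of registers the thunk must preserve
    CodeAddress callTarget;      // function the thunk ends up calling

    SlowPathKey withCallTarget(CodeAddress newTarget) const
    {
        return { usedRegisters, newTarget };
    }

    bool operator<(const SlowPathKey& other) const
    {
        return std::tie(usedRegisters, callTarget)
            < std::tie(other.usedRegisters, other.callTarget);
    }
};

class ThunkTable {
public:
    // Look up the thunk for this key, creating one if none exists yet.
    CodeAddress generateIfNecessary(const SlowPathKey& key)
    {
        auto it = m_thunkForKey.find(key);
        if (it != m_thunkForKey.end())
            return it->second;
        CodeAddress thunk = m_nextThunk;
        m_nextThunk += 0x100; // stand-in for emitting real machine code
        m_thunkForKey.emplace(key, thunk);
        m_keyForThunk.emplace(thunk, key);
        return thunk;
    }

    // Reverse mapping: which key produced this thunk? nullptr if not a thunk.
    const SlowPathKey* keyForThunk(CodeAddress thunk) const
    {
        auto it = m_keyForThunk.find(thunk);
        return it == m_keyForThunk.end() ? nullptr : &it->second;
    }

    // Relink a call that currently goes to oldTarget so that it reaches
    // newFunction. If oldTarget is a thunk, keep its register-preservation
    // info and swap only the call target; otherwise call newFunction directly.
    CodeAddress relink(CodeAddress oldTarget, CodeAddress newFunction)
    {
        const SlowPathKey* key = keyForThunk(oldTarget);
        if (!key)
            return newFunction;
        return generateIfNecessary(key->withCallTarget(newFunction));
    }

private:
    std::map<SlowPathKey, CodeAddress> m_thunkForKey;
    std::map<CodeAddress, SlowPathKey> m_keyForThunk;
    CodeAddress m_nextThunk = 0x1000;
};
```

    Keying the reverse lookup on the old call target is what lets the repatching
    code stay ignorant of how the thunk was originally built.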
            
    * assembler/MacroAssemblerCodeRef.h:
    (JSC::MacroAssemblerCodePtr::MacroAssemblerCodePtr):
    (JSC::MacroAssemblerCodePtr::isEmptyValue):
    (JSC::MacroAssemblerCodePtr::isDeletedValue):
    (JSC::MacroAssemblerCodePtr::hash):
    (JSC::MacroAssemblerCodePtr::emptyValue):
    (JSC::MacroAssemblerCodePtr::deletedValue):
    (JSC::MacroAssemblerCodePtrHash::hash):
    (JSC::MacroAssemblerCodePtrHash::equal):
    * assembler/MacroAssemblerX86Common.h:
    * assembler/RepatchBuffer.h:
    (JSC::RepatchBuffer::RepatchBuffer):
    (JSC::RepatchBuffer::codeBlock):
    * ftl/FTLAbbreviations.h:
    (JSC::FTL::setInstructionCallingConvention):
    * ftl/FTLCompile.cpp:
    (JSC::FTL::fixFunctionBasedOnStackMaps):
    * ftl/FTLInlineCacheSize.cpp:
    (JSC::FTL::sizeOfGetById):
    (JSC::FTL::sizeOfPutById):
    * ftl/FTLJITFinalizer.cpp:
    (JSC::FTL::JITFinalizer::finalizeFunction):
    * ftl/FTLLocation.cpp:
    (JSC::FTL::Location::forStackmaps):
    * ftl/FTLLocation.h:
    * ftl/FTLLowerDFGToLLVM.cpp:
    (JSC::FTL::LowerDFGToLLVM::compileGetById):
    (JSC::FTL::LowerDFGToLLVM::compilePutById):
    * ftl/FTLOSRExitCompiler.cpp:
    (JSC::FTL::compileStub):
    * ftl/FTLSlowPathCall.cpp:
    * ftl/FTLSlowPathCallKey.h:
    (JSC::FTL::SlowPathCallKey::withCallTarget):
    * ftl/FTLStackMaps.cpp:
    (JSC::FTL::StackMaps::Location::directGPR):
    (JSC::FTL::StackMaps::Location::restoreInto):
    * ftl/FTLStackMaps.h:
    * ftl/FTLThunks.h:
    (JSC::FTL::generateIfNecessary):
    (JSC::FTL::keyForThunk):
    (JSC::FTL::Thunks::keyForSlowPathCallThunk):
    * jit/FPRInfo.h:
    (JSC::FPRInfo::toIndex):
    * jit/GPRInfo.h:
    (JSC::GPRInfo::toIndex):
    (JSC::GPRInfo::debugName):
    * jit/RegisterSet.cpp:
    (JSC::RegisterSet::calleeSaveRegisters):
    * jit/RegisterSet.h:
    (JSC::RegisterSet::filter):
    * jit/Repatch.cpp:
    (JSC::readCallTarget):
    (JSC::repatchCall):
    (JSC::repatchByIdSelfAccess):
    (JSC::tryCacheGetByID):
    (JSC::tryCachePutByID):
    (JSC::tryBuildPutByIdList):
    (JSC::resetGetByID):
    (JSC::resetPutByID):
    * jit/ScratchRegisterAllocator.h:
    (JSC::ScratchRegisterAllocator::lock):
    
    Source/WTF: 
    
    Reviewed by Sam Weinig.
            
    I needed to add another set operation, namely filter(), which is an in-place set
    intersection.
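
    For illustration, an in-place set intersection over a word-based bit set could
    look like the sketch below. SimpleBitSet is a hypothetical stand-in written for
    this note; it is not WTF::BitVector and does not reflect its inline/out-of-line
    storage split.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical growable bit set backed by 64-bit words.
class SimpleBitSet {
public:
    void set(std::size_t bit)
    {
        std::size_t word = bit / 64;
        if (word >= m_words.size())
            m_words.resize(word + 1, 0);
        m_words[word] |= std::uint64_t(1) << (bit % 64);
    }

    bool get(std::size_t bit) const
    {
        std::size_t word = bit / 64;
        return word < m_words.size() && ((m_words[word] >> (bit % 64)) & 1);
    }

    // In-place intersection: after this call, a bit is set only if it was
    // set in both this set and other. Words that other lacks count as zero.
    void filter(const SimpleBitSet& other)
    {
        for (std::size_t i = 0; i < m_words.size(); ++i) {
            std::uint64_t otherWord = i < other.m_words.size() ? other.m_words[i] : 0;
            m_words[i] &= otherWord;
        }
    }

private:
    std::vector<std::uint64_t> m_words;
};
```

    A register set built on such a bit set could use filter() to narrow a set of
    candidate registers down to those that also appear in some allowed set.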
    
    * wtf/BitVector.cpp:
    (WTF::BitVector::filterSlow):
    * wtf/BitVector.h:
    (WTF::BitVector::filter):
    
    
    
    git-svn-id: http://svn.webkit.org/repository/webkit/trunk@159039 268f45cc-cd09-0410-ab3c-d52691b4dbfc