ggaren@apple.com authored
https://bugs.webkit.org/show_bug.cgi?id=72221

Reviewed by Oliver Hunt.

This patch standardizes the calling convention so that the caller always sets up the callee's CallFrame. Adjustments for call type, callee type, argument count, etc. now always take place after that initial setup.

This is a step toward reversing the argument order, but also has these immediate benefits (measured on x64):
(1) 1% benchmark speedup across the board.
(2) 50% code size reduction in baseline JIT function calls.
(3) 1.5x speedup for single-dispatch .apply forwarding.
(4) 1.1x speedup for multi-dispatch .apply forwarding.

This change affected the baseline JIT most, since the baseline JIT had lots of ad hoc calling conventions for different caller / callee types.

* assembler/MacroAssemblerX86_64.h:
(JSC::MacroAssemblerX86_64::branchPtr):
(JSC::MacroAssemblerX86_64::branchAddPtr): Optimize compare to 0 into a test, like other assemblers do. (I added some compares to 0, and didn't want them to be slow.)
* bytecode/CodeBlock.cpp:
(JSC::CodeBlock::dump): Merged op_load_varargs into op_call_varargs so op_call_varargs could share code generation with other forms of op_call. This is also a small optimization, since op_*varargs no longer have to pass arguments to each other through the register file.
(JSC::CallLinkInfo::unlink):
* bytecode/CodeBlock.h: Added a new call type: CallVarargs. This allows us to link functions called through .apply syntax. We need to distinguish CallVarargs from Call because CallVarargs changes its argument count on each invocation, so we must always link to the argument count checking version of the callee.
* bytecode/Opcode.h:
* bytecompiler/BytecodeGenerator.cpp:
(JSC::BytecodeGenerator::emitCallVarargs):
* bytecompiler/BytecodeGenerator.h: Merged op_load_varargs into op_call_varargs.
* bytecompiler/NodesCodegen.cpp:
(JSC::ApplyFunctionCallDotNode::emitBytecode): Ditto. Also, simplified some of this bytecode generation to remove redundant copies.
* dfg/DFGJITCodeGenerator32_64.cpp:
(JSC::DFG::JITCodeGenerator::emitCall):
* dfg/DFGJITCodeGenerator64.cpp:
(JSC::DFG::JITCodeGenerator::emitCall): Added a new call type: CallVarargs. DFG doesn't support this type, but its code needs to change slightly to accommodate a 3-state variable. Stopped passing the argument count in regT1 because this is non-standard. (The argument count goes in the CallFrame. This trades speed on the slow path for speed and code size on the fast path, and simplicity on all paths. A good trade, in my opinion.)
* dfg/DFGJITCompiler.cpp:
(JSC::DFG::JITCompiler::compileEntry):
(JSC::DFG::JITCompiler::link):
(JSC::DFG::JITCompiler::compile):
(JSC::DFG::JITCompiler::compileFunction): Tweaked code to make CallFrame setup more obvious when single-stepping. Also, updated for argument count not being in regT1.
* dfg/DFGJITCompiler.h:
(JSC::DFG::JITCompiler::addJSCall):
(JSC::DFG::JITCompiler::JSCallRecord::JSCallRecord): Added a new call type: CallVarargs.
* dfg/DFGOperations.cpp: Finish CallFrame setup in one place before doing anything else. Don't check for stack overflow because we have no callee registers, and our caller has already checked for its own registers.
* dfg/DFGRepatch.cpp:
(JSC::DFG::dfgLinkFor): We can link to our callee even if our argument count doesn't match -- we just need to link to the argument count checking version.
* interpreter/CallFrameClosure.h:
(JSC::CallFrameClosure::setArgument): BUG FIX: When supplying too many arguments from C++, we need to supply a full copy of the arguments prior to the subset copy that matches our callee's argument count. (That is what the standard calling convention would have produced in JS.) I would have split this into its own patch, but I couldn't find a way to get the JIT to fail a regression test in this area without my patch applied.
* interpreter/Interpreter.cpp: Let the true code bomb begin!
(JSC::eval): Fixed up this helper function to operate on eval()'s CallFrame, and not eval()'s caller frame. We no longer leave the CallFrame pointing to eval()'s caller during a call to eval(), since that is not standard.
(JSC::loadVarargs): Factored out a shared helper function for use by JIT and interpreter because half the code means one quarter the bugs -- in my programming, at least.
(JSC::Interpreter::execute): Removed a now-unused way to invoke eval.
(JSC::Interpreter::privateExecute): Removed an invalid ASSERT following putDirect, because it got in the way of my testing. (When putting a function, the cached base of a PutPropertySlot can be 0 to signify "do not optimize".)
op_call_eval: Updated for new, standard eval calling convention.
op_load_varargs: Merged op_load_varargs into op_call_varargs.
op_call_varargs: Updated for new, standard eval calling convention. Don't check for stack overflow because the loadVarargs helper function already checked.
* interpreter/Interpreter.h:
(JSC::Interpreter::execute): Headers are fun and educational!
* interpreter/RegisterFile.cpp:
(JSC::RegisterFile::growSlowCase):
* interpreter/RegisterFile.h:
(JSC::RegisterFile::grow): Factored the slow case out into its own function (growSlowCase) because it was cramping the style of my fast case.
* jit/JIT.cpp:
(JSC::JIT::privateCompile): Moved initialization of RegisterFile::CodeBlock to make it more obvious when debugging. Removed assumption that argument count is in regT1, as above. Removed call to restoreArgumentReference() because the JITStubCall abstraction does this for us.
(JSC::JIT::linkFor): Link even if we miss on argument count, as above.
* jit/JIT.h:
* jit/JITCall32_64.cpp:
(JSC::JIT::emitSlow_op_call):
(JSC::JIT::emitSlow_op_call_eval):
(JSC::JIT::emitSlow_op_call_varargs):
(JSC::JIT::emitSlow_op_construct):
(JSC::JIT::emit_op_call_eval):
(JSC::JIT::emit_op_call_varargs): Share all function call code generation.
Don't count call_eval when accounting for linkable function calls because eval doesn't link. (Its fast path is to perform the eval.)
(JSC::JIT::compileLoadVarargs): Ported this inline copying optimization to our new calling convention. The key to this optimization is the observation that, in a function that declares no arguments, if any arguments are passed, they all end up right behind 'this'.
(JSC::JIT::compileCallEval):
(JSC::JIT::compileCallEvalSlowCase): Factored out eval for a little clarity.
(JSC::JIT::compileOpCall):
(JSC::JIT::compileOpCallSlowCase): If you are still with me, dear reader, this is the whole point of my patch. The caller now unconditionally moves the CallFrame forward and fills in the data it knows before taking any branches to deal with weird caller/callee pairs. This also means that there is almost no slow path for calls -- it all gets folded into the shared virtual call stub. The only things remaining in the slow path are the rare case counter and a call to the stub.
* jit/JITOpcodes32_64.cpp:
(JSC::JIT::privateCompileCTIMachineTrampolines):
(JSC::JIT::privateCompileCTINativeCall): Updated for values being in different registers or in memory, based on our new standard calling convention. Added a shared path for calling out to CTI helper functions for non-JS calls.
* jit/JITPropertyAccess32_64.cpp:
(JSC::JIT::emit_op_method_check): method_check emits its own code and the following get_by_id's code, so it needs to add both when informing result chaining of its result. This is important because the standard calling convention can now take advantage of this chaining.
* jit/JITCall.cpp:
(JSC::JIT::compileLoadVarargs):
(JSC::JIT::compileCallEval):
(JSC::JIT::compileCallEvalSlowCase):
(JSC::JIT::compileOpCall):
(JSC::JIT::compileOpCallSlowCase):
* jit/JITOpcodes.cpp:
(JSC::JIT::privateCompileCTIMachineTrampolines):
(JSC::JIT::emit_op_call_eval):
(JSC::JIT::emit_op_call_varargs):
(JSC::JIT::emitSlow_op_call):
(JSC::JIT::emitSlow_op_call_eval):
(JSC::JIT::emitSlow_op_call_varargs):
(JSC::JIT::emitSlow_op_construct): Observe, as I write all of my code a second time, now with 64 bits.
* jit/JITStubs.cpp:
(JSC::throwExceptionFromOpCall):
(JSC::jitCompileFor):
(JSC::arityCheckFor):
(JSC::lazyLinkFor): A lot of mechanical changes here for one purpose: Exceptions thrown in the middle of a function call now use a shared helper function (throwExceptionFromOpCall). This function understands that the CallFrame currently points to the callEE, and the exception must be thrown by the callER. (The old calling convention would often still have the CallFrame pointing at the callER at the point of an exception. That is not the way of our new, standard calling convention.)
(JSC::op_call_eval): Finish standard CallFrame setup before calling our eval helper function, which now depends on that setup.
* runtime/Arguments.h:
(JSC::Arguments::length): Renamed numProvidedArguments() to length() because that's what other objects call it, and the difference made our new loadVarargs helper function hard to read.
* runtime/Executable.cpp:
(JSC::FunctionExecutable::compileForCallInternal):
(JSC::FunctionExecutable::compileForConstructInternal): Interpreter build fixes.
* runtime/FunctionPrototype.cpp:
(JSC::functionProtoFuncApply): Honor Arguments::MaxArguments even when the .apply call_varargs optimization fails. (This bug appears on layout tests when you disable the optimization.)

git-svn-id: http://svn.webkit.org/repository/webkit/trunk@100165 268f45cc-cd09-0410-ab3c-d52691b4dbfc
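
Illustration (not part of the patch): a minimal toy sketch of the convention described in this message, where the caller fills in the callee's frame header first and every adjustment happens afterward. All names here (ToyRegisterFile, setUpCalleeFrame, argumentCountMismatch, frameToThrowFrom, and the slot layout) are hypothetical and do not mirror JSC's real CallFrame layout or API. Treating any argument count difference as a mismatch is also an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical header layout for the toy frame; not JSC's actual layout.
enum HeaderSlot { ArgumentCount = 0, Callee = 1, CallerFrame = 2, HeaderSize = 3 };

struct ToyFunction {
    int declaredParameters; // number of arguments the callee declares
};

struct ToyRegisterFile {
    std::vector<long> slots;
    explicit ToyRegisterFile(std::size_t size) : slots(size, 0) {}
};

// Step 1: the caller unconditionally writes what it knows into the callee's
// frame header. No branch on call type or callee type happens here.
std::size_t setUpCalleeFrame(ToyRegisterFile& rf, std::size_t callerFrame,
                             std::size_t newFrame, int calleeId, int argumentCount)
{
    assert(newFrame + HeaderSize <= rf.slots.size());
    rf.slots[newFrame + ArgumentCount] = argumentCount;
    rf.slots[newFrame + Callee] = calleeId;
    rf.slots[newFrame + CallerFrame] = static_cast<long>(callerFrame);
    return newFrame; // the frame pointer now refers to the callee
}

// Step 2: adjustments come after setup. The argument count is read back from
// the frame rather than from a dedicated register.
bool argumentCountMismatch(const ToyRegisterFile& rf, std::size_t frame,
                           const ToyFunction& callee)
{
    return rf.slots[frame + ArgumentCount] != callee.declaredParameters;
}

// If something fails mid-call, the frame already points at the callee, so the
// frame to throw from is recovered through the saved caller link.
std::size_t frameToThrowFrom(const ToyRegisterFile& rf, std::size_t calleeFrame)
{
    return static_cast<std::size_t>(rf.slots[calleeFrame + CallerFrame]);
}
```

In this sketch a call site would invoke setUpCalleeFrame first and only then consult argumentCountMismatch to decide whether to enter an argument count checking path, which is the ordering the patch makes uniform.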
539d1bba