Fri, 11 Jan 2013 10:40:51 +0100
8005976: Break out AccessSpecializer into one pass before CodeGenerator instead of iterative applications from CodeGenerator
Summary: Now scope and slot information is guaranteed to be fixed AND NOT CHANGE before CodeGeneration. We want to keep it that way to build future type specializations and bring all type work out of CodeGenerator.
Reviewed-by: attila, hannesw

This document describes system properties that are used for internal
debugging and instrumentation purposes, along with the system loggers,
which are used for the same thing.

This document is intended as a developer resource, and it is not
needed as Nashorn documentation for normal usage. Flags and system
properties described herein are subject to change without notice.

=====================================
1. System properties used internally
=====================================

This documentation of the system property flags assumes that the
default value of the flag is false, unless otherwise specified.

SYSTEM PROPERTY: -Dnashorn.unstable.relink.threshold=x

This property controls how many call site misses are allowed before a
callsite is relinked with "apply" semantics to never change again.
In the case of megamorphic callsites, this is necessary, or the
program would spend all its time swapping out callsite targets. Dynalink
has a default value (currently 8 relinks) for this property if it
is not explicitly set.
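
The effect of such a threshold can be sketched with a toy model. This
is only an illustration, not how Dynalink is implemented: the class
RelinkSketch and its dispatch method are hypothetical, and the only
details taken from the text are the property name and the default
value of 8.

```java
// Hypothetical toy model of a call site that stops relinking after a threshold.
final class RelinkSketch {
    private final int threshold;   // number of relinks allowed
    private Class<?> linkedType;   // receiver type the site is currently linked for
    private int relinks;           // relinks performed so far
    private boolean megamorphic;   // once true, the site never relinks again

    RelinkSketch(int threshold) {
        this.threshold = threshold;
    }

    /** Returns "hit", "relink" or "generic" depending on how the call is dispatched. */
    String dispatch(Object receiver) {
        if (megamorphic) {
            return "generic";
        }
        if (receiver.getClass() == linkedType) {
            return "hit";
        }
        if (relinks >= threshold) {
            megamorphic = true;    // give up and stay generic from now on
            return "generic";
        }
        relinks++;
        linkedType = receiver.getClass();
        return "relink";
    }

    boolean isMegamorphic() {
        return megamorphic;
    }

    public static void main(String[] args) {
        // Assumption: read the property as an int, falling back to the default of 8.
        int configured = Integer.getInteger("nashorn.unstable.relink.threshold", 8);
        System.out.println("configured threshold: " + configured);

        RelinkSketch site = new RelinkSketch(2);
        Object[] receivers = { 1, "a", 2.0, 'c', 1 };
        for (Object receiver : receivers) {
            System.out.println(site.dispatch(receiver));
        }
    }
}
```

In this model the first link also counts as a relink. Whether the real
counter behaves that way is not stated here.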


SYSTEM PROPERTY: -Dnashorn.compiler.split.threshold=x

This will change the node weight that requires a subgraph of the IR to
be split into several classes in order not to run out of bytecode space.
The default value is 0x8000 (32768).


SYSTEM PROPERTY: -Dnashorn.callsiteaccess.debug

See the description of the access logger below. This flag is
equivalent to enabling the access logger with "info" level.


SYSTEM PROPERTY: -Dnashorn.compiler.ints.disable

This flag prevents ints and longs (non double values) from being used
for any primitive representation in the lowered IR. The default is
false, i.e. Lower will attempt to use integer variables for as long as
it can. For example, var x = 17 would try to use x as an integer, unless
other operations occur later that require coercion to a wider type, for
example x *= 17.1;


SYSTEM PROPERTY: -Dnashorn.compiler.intarithmetic

Arithmetic operations in Nashorn (except bitwise ones) typically
coerce the operands to doubles (as per the JavaScript spec). To switch
this off and remain in integer mode, for example for "var x = a&b; var
y = c&d; var z = x*y;", use this flag. This will force the
multiplication of variables that are ints to be done with the IMUL
bytecode and the result "z" to become an int.

WARNING: Note that this is experimental only, to ensure that type
support exists for all primitive types. The generated code is unsound.
This will be the case until we do optimizations based on it. There is a
CR in Nashorn to do better range analysis, and ensure that this is only
done where the operation can't overflow into a wider type. Currently
no overflow checking is done, so at the moment, until range analysis
has been completed, this option is turned off.

We've experimented with using int arithmetic for everything and putting
overflow checks afterwards, which would recompute the operation with
the correct precision, but have yet to find a configuration where this
is faster than just using doubles directly, even if the int operation
does not overflow. Getting access to a JVM intrinsic that does branch
on overflow would probably alleviate this.
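
The "int arithmetic followed by an overflow check" idea can be
sketched in plain Java. This is not the code Nashorn generates: the
class and method names are hypothetical, and Math.multiplyExact (a
Java 8 library method) is used only as a stand-in for the overflow
check.

```java
// Hypothetical sketch: multiply as ints, recompute as doubles on overflow.
final class OverflowCheckSketch {
    /** Returns an Integer when the product fits in an int, otherwise a Double. */
    static Number mul(int a, int b) {
        try {
            return Math.multiplyExact(a, b);    // throws ArithmeticException on overflow
        } catch (ArithmeticException overflow) {
            return (double) a * (double) b;     // recompute with double precision
        }
    }

    public static void main(String[] args) {
        System.out.println(mul(6, 7));                  // stays an int
        System.out.println(mul(Integer.MAX_VALUE, 2));  // overflows, falls back to double
    }
}
```

Note that the result type differs between the two paths (Integer
versus Double).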

There is also a problem with this optimistic approach if the symbol
happens to reside in a local variable slot in the bytecode, as those
are strongly typed. Then we would need to split large sections of
control flow, so this is probably not the right way to go, while range
analysis is. There is a large difference between integer bytecode
without overflow checks and double bytecode. The former is
significantly faster.


SYSTEM PROPERTY: -Dnashorn.codegen.debug, -Dnashorn.codegen.debug.trace=<x>

See the description of the codegen logger below.


SYSTEM PROPERTY: -Dnashorn.fields.debug

See the description of the fields logger below.


SYSTEM PROPERTY: -Dnashorn.fields.dual

When this property is true, Nashorn will attempt to use primitive
fields for AccessorProperties (currently just AccessorProperties, not
spill properties). Memory footprint for script objects will increase,
as we need to maintain both a primitive field (a long) as well as an
Object field for the property value. Ints are represented as the 32
low bits of the long fields. Doubles are represented as the
doubleToLongBits of their value. This way a single field can be used
for all primitive types. Packing and unpacking doubles to their bit
representation is intrinsified by the JVM and extremely fast.
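
The packing scheme described above can be sketched as follows. The
class and method names are hypothetical. Zeroing the upper 32 bits
when packing an int is an assumption; the text only says that ints
live in the low 32 bits.

```java
// Hypothetical sketch of storing ints and doubles in a single long field.
final class DualFieldPacking {
    /** Stores an int in the low 32 bits (upper bits zeroed: an assumption). */
    static long packInt(int value) {
        return value & 0xFFFFFFFFL;
    }

    /** Reads the low 32 bits back as an int. */
    static int unpackInt(long bits) {
        return (int) bits;
    }

    /** Stores a double as its IEEE 754 bit pattern. */
    static long packDouble(double value) {
        return Double.doubleToLongBits(value);
    }

    /** Reinterprets the bit pattern as a double. */
    static double unpackDouble(long bits) {
        return Double.longBitsToDouble(bits);
    }

    public static void main(String[] args) {
        System.out.println(unpackInt(packInt(-17)));
        System.out.println(unpackDouble(packDouble(17.1)));
    }
}
```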

While dual fields in theory run significantly faster than Object
fields due to the reduction of boxing and memory allocation overhead,
there is still work to be done to make this a general purpose
solution. Research is ongoing.

In the future, this might complement or be replaced by the experimental
feature sun.misc.TaggedArray, which has been discussed on the mlvm
mailing list. TaggedArrays are basically a way to share data space
between primitives and references, and have the GC understand this.

As long as only primitive values are written to the fields and enough
type information exists to make sure that any reads don't have to be
uselessly boxed and unboxed, this is significantly faster than the
standard "Objects only" approach that currently is the default. See
test/examples/dual-fields-micro.js for an example that runs twice as
fast with dual fields as without them. Here, the compiler can
determine that we are dealing with numbers only throughout the entire
life span of the properties involved.

If a "real" object (not a boxed primitive) is written to a field that
has a primitive representation, its callsite is relinked and an Object
field is used forevermore for that particular field in that
PropertyMap and its children, even if primitives are later assigned to
it.

As the amount of compile time type information is very small in a
dynamic language like JavaScript, it is frequently the case that
something has to be treated as an object, because we don't know any
better. In reality though, it is often a boxed primitive that is stored
to an AccessorProperty. The fastest way to handle this soundly is to use
a callsite typecheck and avoid blowing the field up to an Object. We
never revert object fields to primitives. Ping-ponging back and forth
between primitive representation and Object representation would cause
fatal performance overhead, so this is not an option.
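
A toy model of this one-way transition is sketched below. It is not
the Nashorn field implementation: the class DualFieldSketch is
hypothetical, it handles only ints, and the instanceof test stands in
for the callsite typecheck mentioned above.

```java
// Hypothetical toy model: a field that is primitive until a real object is written.
final class DualFieldSketch {
    private long primitive;     // int value in the low 32 bits while primitive
    private Object object;      // used once the field has been widened
    private boolean isObject;   // sticky: never reset to false

    void set(Object value) {
        // Typecheck: a boxed int can stay in the primitive field.
        if (!isObject && value instanceof Integer) {
            primitive = ((Integer) value) & 0xFFFFFFFFL;
            return;
        }
        // Anything else widens the field to Object, and it stays that way.
        isObject = true;
        object = value;
    }

    Object get() {
        return isObject ? object : Integer.valueOf((int) primitive);
    }

    boolean isPrimitive() {
        return !isObject;
    }

    public static void main(String[] args) {
        DualFieldSketch field = new DualFieldSketch();
        field.set(17);
        System.out.println(field.isPrimitive());   // still primitive
        field.set("a string");
        field.set(3);
        System.out.println(field.isPrimitive());   // widened, does not revert
    }
}
```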

For a general application the dual fields approach is still slower
than objects only fields in some places, about the same in most cases,
and significantly faster in very few. This is due to the program using
primitives in places where we still can't prove it. For example
"local_var a = call(); field = a;" may very well write a double to the
field, but the compiler dare not guess a double type if field is a local
variable, due to bytecode variables being strongly typed and later non
interchangeable. To get around this, the entire method would have to
be replaced and a continuation retained to restart from. We believe
that the next steps we should go through are instead:

1) Implement method specialization based on callsite, as it's quite
frequently the case that numbers are passed around, but currently our
function nodes just have object types visible to the compiler. For
example "var b = 17; func(a,b,17)" is an example where two parameters
can be specialized, but the main version of func might also be called
from another callsite with func(x,y,"string").

2) This requires lazy jitting as the functions have to be specialized
per callsite.

Even though "function square(x) { return x*x }" might look like a
trivial function that can always only take doubles, this is not
true. Someone might have overridden the valueOf for x so that the
toNumber coercion has side effects. To fulfil JavaScript semantics,
the coercion has to run twice, once for each term of the multiplication,
even if they are the same object. This means that call site
specialization is necessary, not parameter specialization of the form
"function square(x) { var xd = (double)x; return xd*xd; }", as one
might first think.

Generating a method specialization for any variant of a function that
we can determine by types at compile time is a combinatorial explosion
of byte code (try it e.g. on all the variants of am3 in the Octane
benchmark crypto.js). Thus, this needs to be lazy.

3) Possibly optimistic callsite writes, something of the form

x = y; //x is a field known to be a primitive. y is only an object as
       //far as we can tell

turns into

try {
  x = (int)y;
} catch (X is not an integer field right now | ClassCastException e) {
  x = y;
}

Mini POC shows that this is the key to a lot of dual field performance
in seemingly trivial micros where one unknown object, in reality
actually a primitive, foils it for us. Very common pattern. Once we
are "all primitives", dual fields run a lot faster than Object fields
only.

We still have to deal with objects vs primitives for local bytecode
slots, possibly through code copying and versioning.
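
The remark about "function square(x) { return x*x }" in step 2 above
can be made concrete with a small Java model. Everything here is
hypothetical: JsValue stands in for a script object with an overridden
valueOf, and toNumber for the JavaScript coercion.

```java
// Hypothetical model of a value whose valueOf has a visible side effect.
interface JsValue {
    double valueOf();
}

final class Counter implements JsValue {
    private double next = 2;

    @Override
    public double valueOf() {
        return next++;      // side effect: each coercion returns a new number
    }
}

final class CoercionSketch {
    static double toNumber(JsValue x) {
        return x.valueOf();
    }

    /** Coerces once per operand, as x*x requires. */
    static double square(JsValue x) {
        return toNumber(x) * toNumber(x);
    }

    /** Coerces once and reuses the result: changes observable behavior. */
    static double squareCached(JsValue x) {
        double xd = toNumber(x);
        return xd * xd;
    }

    public static void main(String[] args) {
        System.out.println(square(new Counter()));         // 2 * 3
        System.out.println(squareCached(new Counter()));   // 2 * 2
    }
}
```

The two variants return different results for the same kind of
argument, which is why caching the coerced parameter is not a valid
specialization in general.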


SYSTEM PROPERTY: -Dnashorn.compiler.symbol.trace=<x>

When this property is set, creation and manipulation of any symbol
named "x" will show information about when the compiler changes its
type assumption, bytecode local variable slot assignment and other
data. This is useful if, for example, a symbol shows up as an Object,
when you believe it should be a primitive. Usually there is an
explanation for this, for example that it exists in the global scope
and type analysis has to be more conservative. In that case, the stack
trace upon type change to object will usually tell us why.


SYSTEM PROPERTY: -Dnashorn.lexer.xmlliterals

If this property is set, it means that the Lexer should attempt to
parse XML literals, which would otherwise generate syntax
errors. Warning: there are currently no unit tests for this
functionality.

XML literals, when this is enabled, end up as standard LiteralNodes in
the IR.


SYSTEM PROPERTY: -Dnashorn.debug

If this property is set to true, Nashorn runs in Debug mode. Debug
mode is slightly slower, as for example statistics counters are enabled
during the run. Debug mode makes available a NativeDebug instance
called "Debug" in the global space that can be used to print property
maps and layout for script objects, as well as a "dumpCounters" method
that will print the current values of the previously mentioned stats
counters.

These functions currently exist for Debug:

"map" - print(Debug.map(x)) will dump the PropertyMap for object x to
stdout. (Currently there also exist functions called "embedX", where X
is a value from 0 to 3, that will dump the contents of the embed pool
for the first spill properties in any script object, and "spill", that
will dump the contents of the growing spill pool of spill properties
in any script object. This is of course subject to change without
notice, should we change the script object layout.)

"methodHandle" - this method returns the method handle that is used
for invoking a particular script function.

"identical" - this method compares two script objects for reference
equality. It is a == Java comparison.

"dumpCounters" - will dump the debug counters' current values to
stdout.

Currently we count the number of ScriptObjects in the system, the
number of Scope objects in the system, and the number of ScriptObject
listeners added, removed and dead (without references).

We also count the number of ScriptFunctions, ScriptFunction invocations
and ScriptFunction allocations.

Furthermore we count PropertyMap statistics: how many property maps
exist, how many times property maps were cloned, how many times the
property map history cache hit, preventing new allocations, how many
prototype invalidations were done, and how many times the property map
proto cache hit.

Finally we count callsite misses on a per callsite basis, which occur
when a callsite has to be relinked, due to a previous assumption of
object layout being invalidated.


SYSTEM PROPERTY: -Dnashorn.methodhandles.debug,
-Dnashorn.methodhandles.debug=create

If this property is enabled, each MethodHandle related call that uses
the java.lang.invoke package gets its MethodHandle intercepted and an
instrumentation printout of arguments and return value appended to
it. This shows exactly which method handles are executed and from
where. (Also MethodTypes and SwitchPoints). This can be augmented with
more information, for example, instance count, by subclassing or
further extending the TraceMethodHandleFactory implementation in
MethodHandleFactory.java.
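
The general technique of appending a printout to a method handle can
be sketched with the public java.lang.invoke API. This is not the
TraceMethodHandleFactory code: the class TraceHandleSketch and its
methods are hypothetical, and only the return value is traced here,
not the arguments.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical sketch: wrap a method handle so its return value is printed.
final class TraceHandleSketch {
    static int traced;   // number of return values that passed through the tracer

    /** Prints the value and passes it through unchanged. */
    static int traceReturn(int value) {
        traced++;
        System.err.println("[trace] return=" + value);
        return value;
    }

    /** Appends the tracer to a handle whose return type is int. */
    static MethodHandle instrument(MethodHandle target) throws ReflectiveOperationException {
        MethodHandle tracer = MethodHandles.lookup().findStatic(
            TraceHandleSketch.class, "traceReturn",
            MethodType.methodType(int.class, int.class));
        return MethodHandles.filterReturnValue(target, tracer);
    }

    /** Calls Math.max(int, int) through an instrumented handle. */
    static int callMax(int a, int b) {
        try {
            MethodHandle max = MethodHandles.lookup().findStatic(
                Math.class, "max",
                MethodType.methodType(int.class, int.class, int.class));
            return (int) instrument(max).invokeExact(a, b);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callMax(3, 7));
    }
}
```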

If the property is specialized with "=create" as its option,
instrumentation will be shown for method handles upon creation time
rather than at runtime usage.


SYSTEM PROPERTY: -Dnashorn.methodhandles.debug.stacktrace

This does the same as nashorn.methodhandles.debug, but when enabled
also dumps the stack trace for every instrumented method handle
operation. Warning: This is enormously verbose, but provides a pretty
decent "grep:able" picture of where the calls are coming from.

See the description of the codegen logger below for a more verbose
description of this option.


SYSTEM PROPERTY: -Dnashorn.scriptfunction.specialization.disable

There are several "fast path" implementations of constructors and
functions in the NativeObject classes that, in their original form,
take a variable number of arguments. Said functions are also declared
to take Object parameters in their original form, as this is what the
JavaScript specification mandates.

However, we often know quite a lot more at a callsite of one of these
functions. For example, Math.min is called with a fixed number (2) of
integer arguments. The overhead of boxing these ints to Objects and
folding them into an Object array for the generic varargs Math.min
function is an order of magnitude slower than calling a specialized
implementation of Math.min that takes two integers. Specialized
functions and constructors are identified by the tags
@SpecializedFunction and @SpecializedConstructor in the Nashorn
code. The linker will link in the most appropriate (narrowest types,
right number of types and least number of arguments) specialization if
specializations are available.
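
The difference between the generic and the specialized form can be
sketched with ordinary Java overloads. This is an analogy only: the
class below is hypothetical, it does not use the Nashorn annotations,
and it ignores JavaScript details such as NaN handling. Here the Java
compiler, rather than the linker, picks the narrowest applicable
overload.

```java
// Hypothetical sketch: generic varargs min next to a two-int specialization.
final class MinSpecializationSketch {
    /** Generic form: boxed arguments folded into an Object array. */
    static Object min(Object... args) {
        double result = Double.POSITIVE_INFINITY;
        for (Object arg : args) {
            result = Math.min(result, ((Number) arg).doubleValue());
        }
        return result;
    }

    /** Specialized form: two ints, no boxing and no array allocation. */
    static int min(int a, int b) {
        return Math.min(a, b);
    }

    public static void main(String[] args) {
        System.out.println(min(3, 5));       // resolves to the (int, int) overload
        System.out.println(min(3, 5, 1));    // falls back to the varargs overload
    }
}
```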

Every ScriptFunction may carry specializations that the linker can
choose from. This framework will likely be extended for user defined
functions. The compiler can often infer enough parameter type info
from callsites in order to generate simpler versions with less
generic Object types. This feature depends on future lazy jitting, as
there tend to be many calls to user defined functions, some where the
callsite can be specialized, some where we mostly see object
parameters even at the callsite.

If this system property is set to true, the linker will not attempt to
use any specialized function or constructor for native objects, but
just call the generic one.


SYSTEM PROPERTY: -Dnashorn.tcs.miss.samplePercent=<x>

When running with the trace callsite option (-tcs), Nashorn will count
and instrument any callsite misses that require relinking. As the
number of relinks is large and usually produces a lot of output, this
system property can be used to constrain the percentage of misses that
should be logged. Typically this is set to 1 or 5 (percent). 1% is the
default value.


SYSTEM PROPERTY: -Dnashorn.profilefile=<filename>

When running with the profile callsite option (-pcs), Nashorn will
dump profiling data for all callsites to stderr as a shutdown hook. To
instead redirect this to a file, specify the path to the file using
this system property.


===============
2. The loggers.
===============

The Nashorn loggers can be used to print per-module or per-subsystem
debug information with different levels of verbosity. The loggers for
a given subsystem are enabled by using

--log=<systemname>[:<level>]

on the command line.

Here <systemname> identifies the name of the subsystem to be logged
and the optional colon and level argument is a standard
java.util.logging.Level name (severe, warning, info, config, fine,
finer, finest). If the level is left out for a particular subsystem,
it defaults to "info". Any message logged at that level, or at a level
that is more important, will be output to stderr by the logger.

Several loggers can be enabled by a single command line option, by
putting a comma after each subsystem/level tuple (or each subsystem if
level is unspecified). The --log option can also be given multiple
times on the same command line, with the same effect.

For example: --log=codegen,fields:finest is equivalent to
--log=codegen:info --log=fields:finest
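
The option syntax described above can be sketched as a small parser.
This is not the Nashorn option parser: the class LogOptionSketch and
its parse method are hypothetical. Level names are upper-cased before
being handed to java.util.logging.Level.parse.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.logging.Level;

// Hypothetical sketch of parsing the value of a --log option.
final class LogOptionSketch {
    /** Parses e.g. "codegen,fields:finest" into subsystem -> level. */
    static Map<String, Level> parse(String value) {
        Map<String, Level> levels = new LinkedHashMap<>();
        for (String entry : value.split(",")) {
            String[] parts = entry.split(":", 2);
            Level level = parts.length == 2
                ? Level.parse(parts[1].toUpperCase(Locale.ROOT))
                : Level.INFO;   // default when no level is given
            levels.put(parts[0], level);
        }
        return levels;
    }

    public static void main(String[] args) {
        System.out.println(parse("codegen,fields:finest"));
    }
}
```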

The subsystems that currently support logging are:


* compiler

The compiler is in charge of turning source code and function nodes
into byte code, and installs the classes into a class loader
controlled from the Context. Log messages are, for example, about
things like new compile units being allocated. The compiler has global
settings that all the tiers of codegen (e.g. Lower and CodeGenerator)
use.


* codegen

The code generator is the emitter stage of the code pipeline, and
turns the lowest tier of a FunctionNode into bytecode. Codegen logging
shows byte codes as they are being emitted, line number information
and jumps. It also shows the contents of the bytecode stack prior to
each instruction being emitted. This is a good debugging aid. For
example:

[codegen] #41 line:2 (f)_afc824e
[codegen] #42 load symbol x slot=2
[codegen] #43 {1:O} load int 0
[codegen] #44 {2:I O} dynamic_runtime_call GT:ZOI_I args=2 returnType=boolean
[codegen] #45 signature (Ljava/lang/Object;I)Z
[codegen] #46 {1:Z} ifeq ternary_false_5402fe28
[codegen] #47 load symbol x slot=2
[codegen] #48 {1:O} goto ternary_exit_107c1f2f
[codegen] #49 ternary_false_5402fe28
[codegen] #50 load symbol x slot=2
[codegen] #51 {1:O} convert object -> double
[codegen] #52 {1:D} neg
[codegen] #53 {1:D} convert double -> object
[codegen] #54 {1:O} ternary_exit_107c1f2f
[codegen] #55 {1:O} return object

shows a ternary node being generated for the sequence "return x > 0 ?
x : -x".

The first number on the log line is a unique monotonically increasing
emission id per bytecode. There is no guarantee that this is the same
id between runs, depending on non deterministic code
execution/compilation, but for small applications it usually is. If
the system variable -Dnashorn.codegen.debug.trace=<x> is set, where x
is a bytecode emission id, a stack trace will be shown as the
particular bytecode is about to be emitted. This can be a quick way to
determine where it comes from without attaching the debugger. "Who
generated that neg?"

The --log=codegen option is equivalent to setting the system variable
"nashorn.codegen.debug" to true.


* lower

The lowering annotates a FunctionNode with symbols for each identifier
and transforms high level constructs into lower level ones that the
CodeGenerator consumes.

Lower logging typically outputs things like post pass actions,
insertions of casts because symbol types have been changed and type
specialization information. Currently very little info is generated by
this logger. This will probably change.


* access

The --log=access option is equivalent to setting the system variable
"nashorn.callsiteaccess.debug" to true. There are several levels of
the access logger. Usually the default level "info" is enough.

It is very simple to create your own logger. Use the DebugLogger class
and give the subsystem name as a constructor argument.


* fields

The --log=fields option (at info level) is equivalent to setting the
system variable "nashorn.fields.debug" to true. At the info level it
will only show info about type assumptions that were invalidated. If
the level is set to finest, it will also trace every AccessorProperty
getter and setter in the program, show arguments, return values
etc. It will also show the internal representation of the respective
field (Object in the normal case, unless running with the dual field
representation).