Tue, 11 Jun 2013 13:09:43 +0530
8015357: a = []; a[0x7fffffff]=1; a.sort()[0] should evaluate to 1 instead of undefined
Reviewed-by: hannesw, lagergren
jlaskey@3 | 1 | This document describes system properties that are used for internal |
jlaskey@3 | 2 | debugging and instrumentation purposes, along with the system loggers, |
jlaskey@3 | 3 | which are used for the same thing. |
jlaskey@3 | 4 | |
jlaskey@3 | 5 | This document is intended as a developer resource, and it is not |
jlaskey@3 | 6 | needed as Nashorn documentation for normal usage. Flags and system |
jlaskey@3 | 7 | properties described herein are subject to change without notice. |
jlaskey@3 | 8 | |
jlaskey@3 | 9 | ===================================== |
jlaskey@3 | 10 | 1. System properties used internally |
jlaskey@3 | 11 | ===================================== |
jlaskey@3 | 12 | |
jlaskey@3 | 13 | This documentation of the system property flags assumes that the |
jlaskey@3 | 14 | default value of the flag is false, unless otherwise specified. |
jlaskey@3 | 15 | |
lagergren@147 | 16 | SYSTEM PROPERTY: -Dnashorn.args=<string> |
lagergren@147 | 17 | |
lagergren@147 | 18 | This property takes as its value a space separated list of Nashorn |
lagergren@147 | 19 | command line options that should be passed to Nashorn. This might be useful |
lagergren@147 | 20 | in environments where it is hard to tell how a nashorn.jar is launched. |
lagergren@147 | 21 | |
lagergren@147 | 22 | Example: |
lagergren@147 | 23 | |
lagergren@147 | 24 | > java -Dnashorn.args="--lazy-compilation --log=compiler" -jar large-java-app-with-nashorn.jar |
lagergren@147 | 25 | > ant -Dnashorn.args="--log=codegen" antjob |
lagergren@147 | 26 | |
lagergren@8 | 27 | SYSTEM PROPERTY: -Dnashorn.unstable.relink.threshold=x |
lagergren@8 | 28 | |
lagergren@8 | 29 | This property controls how many call site misses are allowed before a |
lagergren@8 | 30 | callsite is relinked with "apply" semantics to never change again. |
lagergren@8 | 31 | In the case of megamorphic callsites, this is necessary, or the |
lagergren@8 | 32 | program would spend all its time swapping out callsite targets. Dynalink |
lagergren@8 | 33 | has a default value (currently 8 relinks) for this property if it |
lagergren@8 | 34 | is not explicitly set. |
lagergren@8 | 35 | |
jlaskey@3 | 36 | |
hannesw@83 | 37 | SYSTEM PROPERTY: -Dnashorn.compiler.splitter.threshold=x |
lagergren@24 | 38 | |
lagergren@24 | 39 | This will change the node weight that requires a subgraph of the IR to |
lagergren@24 | 40 | be split into several classes in order not to run out of bytecode space. |
lagergren@24 | 41 | The default value is 0x8000 (32768). |
lagergren@24 | 42 | |
lagergren@24 | 43 | |
hannesw@83 | 44 | SYSTEM PROPERTY: -Dnashorn.compiler.intarithmetic |
jlaskey@3 | 45 | |
jlaskey@3 | 46 | Arithmetic operations in Nashorn (except bitwise ones) typically |
jlaskey@3 | 47 | coerce the operands to doubles (as per the JavaScript spec). To switch |
jlaskey@3 | 48 | this off and remain in integer mode, for example for "var x = a&b; var |
jlaskey@3 | 49 | y = c&d; var z = x*y;", use this flag. This will force the |
jlaskey@3 | 50 | multiplication of variables that are ints to be done with the IMUL |
jlaskey@3 | 51 | bytecode and the result "z" to become an int. |
jlaskey@3 | 52 | |
jlaskey@3 | 53 | WARNING: Note that this is experimental only to ensure that type support |
jlaskey@3 | 54 | exists for all primitive types. The generated code is unsound. This |
jlaskey@3 | 55 | will be the case until we do optimizations based on it. There is a CR |
jlaskey@3 | 56 | in Nashorn to do better range analysis, and ensure that this is only |
jlaskey@3 | 57 | done where the operation can't overflow into a wider type. Currently |
jlaskey@3 | 58 | no overflow checking is done, so at the moment, until range analysis |
jlaskey@3 | 59 | has been completed, this option is turned off. |
jlaskey@3 | 60 | |
jlaskey@3 | 61 | We've experimented by using int arithmetic for everything and putting |
jlaskey@3 | 62 | overflow checks afterwards, which would recompute the operation with |
jlaskey@3 | 63 | the correct precision, but have yet to find a configuration where this |
jlaskey@3 | 64 | is faster than just using doubles directly, even if the int operation |
jlaskey@3 | 65 | does not overflow. Getting access to a JVM intrinsic that does branch |
jlaskey@3 | 66 | on overflow would probably alleviate this. |
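The "int arithmetic first, overflow check afterwards" idea described
above can be sketched in plain Java as follows. This is only an
illustration: the class and method names are hypothetical and are not
part of Nashorn. Math.multiplyExact is the standard JDK method that
throws ArithmeticException when an int multiplication overflows.

```java
// Hypothetical illustration; not Nashorn code.
final class OptimisticIntMul {
    /** Plain IMUL semantics: silently wraps on overflow, which is unsound for JavaScript. */
    static int uncheckedMul(int a, int b) {
        return a * b;
    }

    /**
     * Try the int multiplication first and recompute with double
     * precision only if the int result overflowed.
     */
    static Object checkedMul(int a, int b) {
        try {
            return Math.multiplyExact(a, b);   // result stays an Integer
        } catch (ArithmeticException overflow) {
            return (double) a * (double) b;    // recompute as a Double
        }
    }
}
```

With these definitions, uncheckedMul(0x10000, 0x10000) wraps around to
0, while checkedMul falls back to the double result for the same
operands.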
jlaskey@3 | 67 | |
jlaskey@3 | 68 | There is also a problem with this optimistic approach if the symbol |
jlaskey@3 | 69 | happens to reside in a local variable slot in the bytecode, as those |
jlaskey@3 | 70 | are strongly typed. Then we would need to split large sections of |
jlaskey@3 | 71 | control flow, so this is probably not the right way to go, while range |
jlaskey@3 | 72 | analysis is. There is a large difference between integer bytecode |
jlaskey@3 | 73 | without overflow checks and double bytecode. The former is |
jlaskey@3 | 74 | significantly faster. |
jlaskey@3 | 75 | |
jlaskey@3 | 76 | |
jlaskey@3 | 77 | SYSTEM PROPERTY: -Dnashorn.codegen.debug, -Dnashorn.codegen.debug.trace=<x> |
jlaskey@3 | 78 | |
jlaskey@3 | 79 | See the description of the codegen logger below. |
jlaskey@3 | 80 | |
jlaskey@3 | 81 | |
jlaskey@3 | 82 | SYSTEM PROPERTY: -Dnashorn.fields.debug |
jlaskey@3 | 83 | |
jlaskey@3 | 84 | See the description of the fields logger below. |
jlaskey@3 | 85 | |
jlaskey@3 | 86 | |
jlaskey@3 | 87 | SYSTEM PROPERTY: -Dnashorn.fields.dual |
jlaskey@3 | 88 | |
jlaskey@3 | 89 | When this property is true, Nashorn will attempt to use primitive |
jlaskey@3 | 90 | fields for AccessorProperties (currently just AccessorProperties, not |
jlaskey@3 | 91 | spill properties). Memory footprint for script objects will increase, |
jlaskey@3 | 92 | as we need to maintain both a primitive field (a long) as well as an |
jlaskey@3 | 93 | Object field for the property value. Ints are represented as the 32 |
jlaskey@3 | 94 | low bits of the long fields. Doubles are represented as the |
jlaskey@3 | 95 | doubleToLongBits of their value. This way a single field can be used |
jlaskey@3 | 96 | for all primitive types. Packing and unpacking doubles to their bit |
jlaskey@3 | 97 | representation is intrinsified by the JVM and extremely fast. |
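A minimal Java sketch of the encoding described above, where one long
field carries either an int or a double. The class and method names
are hypothetical, and zeroing the upper 32 bits for ints is an
assumption made for this example; the text only states that ints live
in the low 32 bits.

```java
// Hypothetical illustration of the encoding; not Nashorn code.
final class DualFieldPacking {
    /** An int occupies the low 32 bits of the long field (upper bits zeroed here by assumption). */
    static long packInt(int value) {
        return value & 0xFFFFFFFFL;
    }

    /** Narrowing a long to an int keeps exactly the low 32 bits. */
    static int unpackInt(long field) {
        return (int) field;
    }

    /** A double is stored as its IEEE 754 bit pattern. */
    static long packDouble(double value) {
        return Double.doubleToLongBits(value);
    }

    static double unpackDouble(long field) {
        return Double.longBitsToDouble(field);
    }
}
```

Both round trips are lossless, so the same long field can serve every
primitive type as long as the reader knows which type was written.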
jlaskey@3 | 98 | |
jlaskey@3 | 99 | While dual fields in theory run significantly faster than Object |
jlaskey@3 | 100 | fields due to reduction of boxing and memory allocation overhead, |
jlaskey@3 | 101 | there is still work to be done to make this a general purpose |
jlaskey@3 | 102 | solution. Research is ongoing. |
jlaskey@3 | 103 | |
jlaskey@3 | 104 | In the future, this might complement or be replaced by experimental |
jlaskey@3 | 105 | feature sun.misc.TaggedArray, which has been discussed on the mlvm |
jlaskey@3 | 106 | mailing list. TaggedArrays are basically a way to share data space |
jlaskey@3 | 107 | between primitives and references, and have the GC understand this. |
jlaskey@3 | 108 | |
jlaskey@3 | 109 | As long as only primitive values are written to the fields and enough |
jlaskey@3 | 110 | type information exists to make sure that any reads don't have to be |
jlaskey@3 | 111 | uselessly boxed and unboxed, this is significantly faster than the |
jlaskey@3 | 112 | standard "Objects only" approach that currently is the default. See |
jlaskey@3 | 113 | test/examples/dual-fields-micro.js for an example that runs twice as |
jlaskey@3 | 114 | fast with dual fields as without them. Here, the compiler can |
jlaskey@3 | 115 | determine that we are dealing with numbers only throughout the entire |
jlaskey@3 | 116 | property life span of the properties involved. |
jlaskey@3 | 117 | |
jlaskey@3 | 118 | If a "real" object (not a boxed primitive) is written to a field that |
jlaskey@3 | 119 | has a primitive representation, its callsite is relinked and an Object |
jlaskey@3 | 120 | field is used forevermore for that particular field in that |
jlaskey@3 | 121 | PropertyMap and its children, even if primitives are later assigned to |
jlaskey@3 | 122 | it. |
jlaskey@3 | 123 | |
jlaskey@3 | 124 | As the amount of compile time type information is very small in a |
jlaskey@3 | 125 | dynamic language like JavaScript, it is frequently the case that |
jlaskey@3 | 126 | something has to be treated as an object, because we don't know any |
jlaskey@3 | 127 | better. In reality though, it is often a boxed primitive that is stored to |
jlaskey@3 | 128 | an AccessorProperty. The fastest way to handle this soundly is to use |
jlaskey@3 | 129 | a callsite typecheck and avoid blowing the field up to an Object. We |
jlaskey@3 | 130 | never revert object fields to primitives. Ping-pong:ing back and forth |
jlaskey@3 | 131 | between primitive representation and Object representation would cause |
jlaskey@3 | 132 | fatal performance overhead, so this is not an option. |
jlaskey@3 | 133 | |
jlaskey@3 | 134 | For a general application the dual fields approach is still slower |
jlaskey@3 | 135 | than objects only fields in some places, about the same in most cases, |
jlaskey@3 | 136 | and significantly faster in very few. This is due to the program using |
jlaskey@3 | 137 | primitives, but we still can't prove it. For example "local_var a = |
jlaskey@3 | 138 | call(); field = a;" may very well write a double to the field, but the |
jlaskey@3 | 139 | compiler dare not guess a double type if field is a local variable, |
jlaskey@3 | 140 | due to bytecode variables being strongly typed and later non |
jlaskey@3 | 141 | interchangeable. To get around this, the entire method would have to |
jlaskey@3 | 142 | be replaced and a continuation retained to restart from. We believe |
jlaskey@3 | 143 | that the next steps we should go through are instead: |
jlaskey@3 | 144 | |
jlaskey@3 | 145 | 1) Implement method specialization based on callsite, as it's quite |
jlaskey@3 | 146 | frequently the case that numbers are passed around, but currently our |
jlaskey@3 | 147 | function nodes just have object types visible to the compiler. For |
jlaskey@3 | 148 | example "var b = 17; func(a,b,17)" is an example where two parameters |
jlaskey@3 | 149 | can be specialized, but the main version of func might also be called |
jlaskey@3 | 150 | from another callsite with func(x,y,"string"). |
jlaskey@3 | 151 | |
jlaskey@3 | 152 | 2) This requires lazy jitting as the functions have to be specialized |
jlaskey@3 | 153 | per callsite. |
jlaskey@3 | 154 | |
jlaskey@3 | 155 | Even though "function square(x) { return x*x }" might look like a |
jlaskey@3 | 156 | trivial function that can always only take doubles, this is not |
jlaskey@3 | 157 | true. Someone might have overridden the valueOf for x so that the |
jlaskey@3 | 158 | toNumber coercion has side effects. To fulfil JavaScript semantics, |
jlaskey@3 | 159 | the coercion has to run twice for both terms of the multiplication |
jlaskey@3 | 160 | even if they are the same object. This means that call site |
jlaskey@3 | 161 | specialization is necessary, not parameter specialization of the form |
jlaskey@3 | 162 | "function square(x) { var xd = (double)x; return xd*xd; }", as one |
jlaskey@3 | 163 | might first think. |
jlaskey@3 | 164 | |
jlaskey@3 | 165 | Generating a method specialization for any variant of a function that |
jlaskey@3 | 166 | we can determine by types at compile time is a combinatorial explosion |
jlaskey@3 | 167 | of byte code (try it e.g. on all the variants of am3 in the Octane |
jlaskey@3 | 168 | benchmark crypto.js). Thus, this needs to be lazy. |
jlaskey@3 | 169 | |
jlaskey@3 | 170 | 3) Possibly optimistic callsite writes, something of the form |
jlaskey@3 | 171 | |
jlaskey@3 | 172 | x = y; //x is a field known to be a primitive. y is only an object as |
jlaskey@3 | 173 | far as we can tell |
jlaskey@3 | 174 | |
jlaskey@3 | 175 | turns into |
jlaskey@3 | 176 | |
jlaskey@3 | 177 | try { |
jlaskey@3 | 178 | x = (int)y; |
jlaskey@3 | 179 | } catch (X is not an integer field right now | ClassCastException e) { |
jlaskey@3 | 180 | x = y; |
jlaskey@3 | 181 | } |
jlaskey@3 | 182 | |
jlaskey@3 | 183 | Mini POC shows that this is the key to a lot of dual field performance |
jlaskey@3 | 184 | in seemingly trivial micros where one unknown object, in reality |
jlaskey@3 | 185 | actually a primitive, foils it for us. Very common pattern. Once we |
jlaskey@3 | 186 | are "all primitives", dual fields runs a lot faster than Object fields |
jlaskey@3 | 187 | only. |
jlaskey@3 | 188 | |
jlaskey@3 | 189 | We still have to deal with objects vs primitives for local bytecode |
jlaskey@3 | 190 | slots, possibly through code copying and versioning. |
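The optimistic write from step 3, together with the rule that a field
never reverts from Object to primitive, can be modelled in plain Java
roughly as follows. The DualField class is hypothetical and only
models a single field; it says nothing about how Nashorn relinks
callsites.

```java
// Hypothetical model of one dual field; not Nashorn code.
final class DualField {
    private long primitive;             // used while only ints have been written
    private Object object;              // used once any non-int value has been written
    private boolean isPrimitive = true;

    /** Optimistic write: try the int representation first, fall back to Object. */
    void set(Object y) {
        if (isPrimitive) {
            try {
                primitive = (Integer) y;   // fails if y is not a (non-null) Integer
                return;
            } catch (ClassCastException | NullPointerException e) {
                isPrimitive = false;       // never reverted to primitive afterwards
            }
        }
        object = y;
    }

    Object get() {
        if (isPrimitive) {
            return Integer.valueOf((int) primitive);
        }
        return object;
    }

    boolean isPrimitive() {
        return isPrimitive;
    }
}
```

After a single non-int write the field stays in Object mode, even if
ints are written to it later, which avoids switching back and forth
between the two representations.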
jlaskey@3 | 191 | |
jlaskey@3 | 192 | |
lagergren@57 | 193 | SYSTEM PROPERTY: -Dnashorn.compiler.symbol.trace=[<x>[,*]], |
lagergren@57 | 194 | -Dnashorn.compiler.symbol.stacktrace=[<x>[,*]] |
jlaskey@3 | 195 | |
jlaskey@3 | 196 | When this property is set, creation and manipulation of any symbol |
jlaskey@3 | 197 | named "x" will show information about when the compiler changes its |
jlaskey@3 | 198 | type assumption, bytecode local variable slot assignment and other |
jlaskey@3 | 199 | data. This is useful if, for example, a symbol shows up as an Object, |
jlaskey@3 | 200 | when you believe it should be a primitive. Usually there is an |
jlaskey@3 | 201 | explanation for this, for example that it exists in the global scope |
lagergren@57 | 202 | and type analysis has to be more conservative. |
lagergren@57 | 203 | |
lagergren@57 | 204 | Several symbol names to watch can be specified, separated by commas. |
lagergren@57 | 205 | |
lagergren@57 | 206 | If no variable name is specified (and no equals sign), all symbols |
lagergren@57 | 207 | will be watched. |
lagergren@57 | 208 | |
lagergren@57 | 209 | By using "stacktrace" instead of or together with "trace", stack |
lagergren@57 | 210 | traces will be displayed upon symbol changes according to the same |
lagergren@57 | 211 | semantics. |
jlaskey@3 | 212 | |
jlaskey@3 | 213 | |
jlaskey@3 | 214 | SYSTEM PROPERTY: nashorn.lexer.xmlliterals |
jlaskey@3 | 215 | |
jlaskey@3 | 216 | If this property is set, it means that the Lexer should attempt to |
jlaskey@3 | 217 | parse XML literals, which would otherwise generate syntax |
jlaskey@3 | 218 | errors. Warning: there are currently no unit tests for this |
jlaskey@3 | 219 | functionality. |
jlaskey@3 | 220 | |
jlaskey@3 | 221 | XML literals, when this is enabled, end up as standard LiteralNodes in |
jlaskey@3 | 222 | the IR. |
jlaskey@3 | 223 | |
jlaskey@3 | 224 | |
jlaskey@3 | 225 | SYSTEM PROPERTY: nashorn.debug |
jlaskey@3 | 226 | |
jlaskey@3 | 227 | If this property is set to true, Nashorn runs in Debug mode. Debug |
jlaskey@3 | 228 | mode is slightly slower, as for example statistics counters are enabled |
jlaskey@3 | 229 | during the run. Debug mode makes available a NativeDebug instance |
jlaskey@3 | 230 | called "Debug" in the global space that can be used to print property |
jlaskey@3 | 231 | maps and layout for script objects, as well as a "dumpCounters" method |
jlaskey@3 | 232 | that will print the current values of the previously mentioned stats |
jlaskey@3 | 233 | counters. |
jlaskey@3 | 234 | |
jlaskey@3 | 235 | These functions currently exist for Debug: |
jlaskey@3 | 236 | |
jlaskey@3 | 237 | "map" - print(Debug.map(x)) will dump the PropertyMap for object x to |
jlaskey@3 | 238 | stdout (currently there also exist functions called "embedX", where X |
jlaskey@3 | 239 | is a value from 0 to 3, that will dump the contents of the embed pool |
jlaskey@3 | 240 | for the first spill properties in any script object and "spill", that |
jlaskey@3 | 241 | will dump the contents of the growing spill pool of spill properties |
jlaskey@3 | 242 | in any script object). This is of course subject to change without |
jlaskey@3 | 243 | notice, should we change the script object layout. |
jlaskey@3 | 244 | |
jlaskey@3 | 245 | "methodHandle" - this method returns the method handle that is used |
jlaskey@3 | 246 | for invoking a particular script function. |
jlaskey@3 | 247 | |
jlaskey@3 | 248 | "identical" - this method compares two script objects for reference |
jlaskey@3 | 249 | equality. It is a Java == comparison. |
jlaskey@3 | 250 | |
jlaskey@3 | 251 | "dumpCounters" - will dump the debug counters' current values to |
jlaskey@3 | 252 | stdout. |
jlaskey@3 | 253 | |
jlaskey@3 | 254 | Currently we count number of ScriptObjects in the system, number of |
jlaskey@3 | 255 | Scope objects in the system, number of ScriptObject listeners added, |
jlaskey@3 | 256 | removed and dead (without references). |
jlaskey@3 | 257 | |
jlaskey@3 | 258 | We also count number of ScriptFunctions, ScriptFunction invocations |
jlaskey@3 | 259 | and ScriptFunction allocations. |
jlaskey@3 | 260 | |
jlaskey@3 | 261 | Furthermore we count PropertyMap statistics: how many property maps |
jlaskey@3 | 262 | exist, how many times property maps were cloned, how many times |
jlaskey@3 | 263 | the property map history cache hit (preventing new allocations), how many |
jlaskey@3 | 264 | prototype invalidations were done, and how many times the property map |
jlaskey@3 | 265 | proto cache hit. |
jlaskey@3 | 266 | |
jlaskey@3 | 267 | Finally we count callsite misses on a per callsite basis, which occur |
jlaskey@3 | 268 | when a callsite has to be relinked, due to a previous assumption of |
jlaskey@3 | 269 | object layout being invalidated. |
jlaskey@3 | 270 | |
jlaskey@3 | 271 | |
jlaskey@3 | 272 | SYSTEM PROPERTY: nashorn.methodhandles.debug, |
jlaskey@3 | 273 | nashorn.methodhandles.debug=create |
jlaskey@3 | 274 | |
jlaskey@3 | 275 | If this property is enabled, each MethodHandle related call that uses |
jlaskey@3 | 276 | the java.lang.invoke package gets its MethodHandle intercepted and an |
jlaskey@3 | 277 | instrumentation printout of arguments and return value appended to |
jlaskey@3 | 278 | it. This shows exactly which method handles are executed and from |
jlaskey@3 | 279 | where. (Also MethodTypes and SwitchPoints). This can be augmented with |
jlaskey@3 | 280 | more information, for example, instance count, by subclassing or |
jlaskey@3 | 281 | further extending the TraceMethodHandleFactory implementation in |
jlaskey@3 | 282 | MethodHandleFactory.java. |
jlaskey@3 | 283 | |
jlaskey@3 | 284 | If the property is specialized with "=create" as its option, |
jlaskey@3 | 285 | instrumentation will be shown for method handles upon creation time |
jlaskey@3 | 286 | rather than at runtime usage. |
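The general technique of appending an instrumentation printout to a
MethodHandle can be sketched with the standard java.lang.invoke API.
This is not the TraceMethodHandleFactory implementation: the
TracingHandles class is hypothetical, it only traces return values
(not arguments), it records into a list instead of printing, and it
assumes the target does not return void.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not Nashorn's MethodHandleFactory.
final class TracingHandles {
    static final List<String> LOG = new ArrayList<>();

    /** Records the return value and passes it through unchanged. */
    static Object traceReturn(String name, Object value) {
        LOG.add(name + " returned " + value);
        return value;
    }

    /** Wraps target so that every invocation also records its return value. */
    static MethodHandle withReturnTrace(String name, MethodHandle target)
            throws ReflectiveOperationException {
        MethodHandle tracer = MethodHandles.lookup().findStatic(
            TracingHandles.class, "traceReturn",
            MethodType.methodType(Object.class, String.class, Object.class));
        tracer = MethodHandles.insertArguments(tracer, 0, name);   // now (Object)Object
        Class<?> rtype = target.type().returnType();
        tracer = tracer.asType(MethodType.methodType(rtype, rtype)); // adapt to the target's return type
        return MethodHandles.filterReturnValue(target, tracer);
    }

    /** Usage example: trace a call to Math.max(int, int). */
    static int demo() {
        try {
            MethodHandle max = MethodHandles.lookup().findStatic(
                Math.class, "max",
                MethodType.methodType(int.class, int.class, int.class));
            return (int) withReturnTrace("Math.max", max).invokeExact(3, 7);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

The wrapped handle has the same type as the original, so it can be
substituted wherever the original was used.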
jlaskey@3 | 287 | |
jlaskey@3 | 288 | |
jlaskey@3 | 289 | SYSTEM PROPERTY: nashorn.methodhandles.debug.stacktrace |
jlaskey@3 | 290 | |
jlaskey@3 | 291 | This does the same as nashorn.methodhandles.debug, but when enabled |
jlaskey@3 | 292 | also dumps the stack trace for every instrumented method handle |
jlaskey@3 | 293 | operation. Warning: This is enormously verbose, but provides a pretty |
jlaskey@3 | 294 | decent "grep:able" picture of where the calls are coming from. |
jlaskey@3 | 295 | |
jlaskey@3 | 296 | See the description of the codegen logger below for a more verbose |
jlaskey@3 | 297 | description of this option. |
jlaskey@3 | 298 | |
jlaskey@3 | 299 | |
jlaskey@3 | 300 | SYSTEM PROPERTY: nashorn.scriptfunction.specialization.disable |
jlaskey@3 | 301 | |
jlaskey@3 | 302 | There are several "fast path" implementations of constructors and |
jlaskey@3 | 303 | functions in the NativeObject classes that, in their original form, |
jlaskey@3 | 304 | take a variable amount of arguments. Said functions are also declared |
jlaskey@3 | 305 | to take Object parameters in their original form, as this is what the |
jlaskey@3 | 306 | JavaScript specification mandates. |
jlaskey@3 | 307 | |
jlaskey@3 | 308 | However, we often know quite a lot more at a callsite of one of these |
jlaskey@3 | 309 | functions. For example, Math.min is called with a fixed number (2) of |
jlaskey@3 | 310 | integer arguments. The overhead of boxing these ints to Objects and |
jlaskey@3 | 311 | folding them into an Object array for the generic varargs Math.min |
jlaskey@3 | 312 | function is an order of magnitude slower than calling a specialized |
jlaskey@3 | 313 | implementation of Math.min that takes two integers. Specialized |
jlaskey@3 | 314 | functions and constructors are identified by the tag |
jlaskey@3 | 315 | @SpecializedFunction and @SpecializedConstructor in the Nashorn |
jlaskey@3 | 316 | code. The linker will link in the most appropriate (narrowest types, |
jlaskey@3 | 317 | right number of types and least number of arguments) specialization if |
jlaskey@3 | 318 | specializations are available. |
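As a loose analogy in plain Java, the difference between the generic
varargs form and a two-int specialization looks like this. The
MinSpecialization class is hypothetical, and the selection is done
here by javac's static overload resolution, whereas Nashorn's linker
chooses among specializations dynamically at the callsite.

```java
// Hypothetical analogy; not Nashorn's Math.min implementation.
final class MinSpecialization {
    /** Generic form: boxed operands folded into an Object array. */
    static Object min(Object... args) {
        double result = Double.POSITIVE_INFINITY;
        for (Object arg : args) {
            result = Math.min(result, ((Number) arg).doubleValue());
        }
        return result;
    }

    /** Specialized form for exactly two int arguments: no boxing, no array. */
    static int min(int a, int b) {
        return Math.min(a, b);
    }
}
```

A call such as min(3, 7) binds to the two-int version, while
min(3, 7.5) or min(1, 2, 3) can only use the generic one.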
jlaskey@3 | 319 | |
jlaskey@3 | 320 | Every ScriptFunction may carry specializations that the linker can |
jlaskey@3 | 321 | choose from. This framework will likely be extended for user defined |
jlaskey@3 | 322 | functions. The compiler can often infer enough parameter type info |
jlaskey@3 | 323 | from callsites in order to generate simpler versions with less |
jlaskey@3 | 324 | generic Object types. This feature depends on future lazy jitting, as |
jlaskey@3 | 325 | there tend to be many calls to user defined functions, some where the |
jlaskey@3 | 326 | callsite can be specialized, some where we mostly see object |
jlaskey@3 | 327 | parameters even at the callsite. |
jlaskey@3 | 328 | |
jlaskey@3 | 329 | If this system property is set to true, the linker will not attempt to |
jlaskey@3 | 330 | use any specialized function or constructor for native objects, but |
jlaskey@3 | 331 | just call the generic one. |
jlaskey@3 | 332 | |
jlaskey@3 | 333 | |
jlaskey@3 | 334 | SYSTEM PROPERTY: nashorn.tcs.miss.samplePercent=<x> |
jlaskey@3 | 335 | |
jlaskey@3 | 336 | When running with the trace callsite option (-tcs), Nashorn will count |
jlaskey@3 | 337 | and instrument any callsite misses that require relinking. As the |
jlaskey@3 | 338 | number of relinks is large and usually produces a lot of output, this |
jlaskey@3 | 339 | system property can be used to constrain the percentage of misses that |
jlaskey@3 | 340 | should be logged. Typically this is set to 1 or 5 (percent). 1% is the |
jlaskey@3 | 341 | default value. |
jlaskey@3 | 342 | |
jlaskey@3 | 343 | |
jlaskey@3 | 344 | SYSTEM PROPERTY: nashorn.profilefile=<filename> |
jlaskey@3 | 345 | |
jlaskey@3 | 346 | When running with the profile callsite options (-pcs), Nashorn will |
jlaskey@3 | 347 | dump profiling data for all callsites to stderr as a shutdown hook. To |
jlaskey@3 | 348 | instead redirect this to a file, specify the path to the file using |
jlaskey@3 | 349 | this system property. |
jlaskey@3 | 350 | |
jlaskey@3 | 351 | |
hannesw@115 | 352 | SYSTEM PROPERTY: nashorn.regexp.impl=[jdk|joni] |
hannesw@115 | 353 | |
hannesw@115 | 354 | This property defines the regular expression engine to be used by |
hannesw@115 | 355 | Nashorn. The default implementation is "jdk" which is based on the |
hannesw@115 | 356 | JDK's java.util.regex package. Set this property to "joni" to install |
hannesw@115 | 357 | an implementation based on Joni, the regular expression engine used by |
hannesw@115 | 358 | the JRuby project. |
hannesw@115 | 359 | |
hannesw@115 | 360 | |
jlaskey@3 | 361 | =============== |
jlaskey@3 | 362 | 2. The loggers. |
jlaskey@3 | 363 | =============== |
jlaskey@3 | 364 | |
lagergren@57 | 365 | It is very simple to create your own logger. Use the DebugLogger class |
lagergren@57 | 366 | and give the subsystem name as a constructor argument. |
lagergren@57 | 367 | |
jlaskey@3 | 368 | The Nashorn loggers can be used to print per-module or per-subsystem |
jlaskey@3 | 369 | debug information with different levels of verbosity. The loggers for |
jlaskey@3 | 370 | a given subsystem are enabled by using |
jlaskey@3 | 371 | |
jlaskey@3 | 372 | --log=<systemname>[:<level>] |
jlaskey@3 | 373 | |
jlaskey@3 | 374 | on the command line. |
jlaskey@3 | 375 | |
jlaskey@3 | 376 | Here <systemname> identifies the name of the subsystem to be logged |
jlaskey@3 | 377 | and the optional colon and level argument is a standard |
jlaskey@3 | 378 | java.util.logging.Level name (severe, warning, info, config, fine, |
jlaskey@3 | 379 | finer, finest). If the level is left out for a particular subsystem, |
jlaskey@3 | 380 | it defaults to "info". Any log message logged as the level or a level |
jlaskey@3 | 381 | that is more important will be output to stderr by the logger. |
jlaskey@3 | 382 | |
jlaskey@3 | 383 | Several loggers can be enabled by a single command line option, by |
jlaskey@3 | 384 | putting a comma after each subsystem/level tuple (or each subsystem if |
jlaskey@3 | 385 | level is unspecified). The --log option can also be given multiple |
jlaskey@3 | 386 | times on the same command line, with the same effect. |
jlaskey@3 | 387 | |
jlaskey@3 | 388 | For example: --log=codegen,fields:finest is equivalent to |
jlaskey@3 | 389 | --log=codegen:info --log=fields:finest |
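The option syntax and the level rule above can be sketched in Java as
follows. LogOptionSketch is a hypothetical helper written for this
example; Nashorn's own option handling may differ. Level.parse and
Level.intValue are standard java.util.logging methods.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.logging.Level;

// Hypothetical parser for illustration; not Nashorn's option handling.
final class LogOptionSketch {
    /** Parses the value part of --log=<systemname>[:<level>][,...]. */
    static Map<String, Level> parse(String value) {
        Map<String, Level> levels = new LinkedHashMap<>();
        for (String entry : value.split(",")) {
            String[] parts = entry.split(":", 2);
            Level level = parts.length == 2
                ? Level.parse(parts[1].toUpperCase(Locale.ROOT)) // Level names are upper case
                : Level.INFO;                                    // documented default
            levels.put(parts[0], level);
        }
        return levels;
    }

    /** A message is output if it is at the configured level or a more important one. */
    static boolean isLoggable(Level configured, Level message) {
        return message.intValue() >= configured.intValue();
    }
}
```

Here parse("codegen,fields:finest") maps codegen to INFO and fields to
FINEST, matching the equivalence stated above.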
jlaskey@3 | 390 | |
jlaskey@3 | 391 | The subsystems that currently support logging are: |
jlaskey@3 | 392 | |
jlaskey@3 | 393 | |
jlaskey@3 | 394 | * compiler |
jlaskey@3 | 395 | |
jlaskey@3 | 396 | The compiler is in charge of turning source code and function nodes |
jlaskey@3 | 397 | into byte code, and installs the classes into a class loader |
jlaskey@3 | 398 | controlled from the Context. Log messages are, for example, about |
jlaskey@3 | 399 | things like new compile units being allocated. The compiler has global |
jlaskey@3 | 400 | settings that all the tiers of codegen (e.g. Lower and CodeGenerator) |
lagergren@57 | 401 | use. |
jlaskey@3 | 402 | |
jlaskey@3 | 403 | |
jlaskey@3 | 404 | * codegen |
jlaskey@3 | 405 | |
jlaskey@3 | 406 | The code generator is the emitter stage of the code pipeline, and |
jlaskey@3 | 407 | turns the lowest tier of a FunctionNode into bytecode. Codegen logging |
jlaskey@3 | 408 | shows byte codes as they are being emitted, line number information |
jlaskey@3 | 409 | and jumps. It also shows the contents of the bytecode stack prior to |
jlaskey@3 | 410 | each instruction being emitted. This is a good debugging aid. For |
jlaskey@3 | 411 | example: |
jlaskey@3 | 412 | |
jlaskey@3 | 413 | [codegen] #41 line:2 (f)_afc824e |
jlaskey@3 | 414 | [codegen] #42 load symbol x slot=2 |
jlaskey@3 | 415 | [codegen] #43 {1:O} load int 0 |
jlaskey@3 | 416 | [codegen] #44 {2:I O} dynamic_runtime_call GT:ZOI_I args=2 returnType=boolean |
jlaskey@3 | 417 | [codegen] #45 signature (Ljava/lang/Object;I)Z |
jlaskey@3 | 418 | [codegen] #46 {1:Z} ifeq ternary_false_5402fe28 |
jlaskey@3 | 419 | [codegen] #47 load symbol x slot=2 |
jlaskey@3 | 420 | [codegen] #48 {1:O} goto ternary_exit_107c1f2f |
jlaskey@3 | 421 | [codegen] #49 ternary_false_5402fe28 |
jlaskey@3 | 422 | [codegen] #50 load symbol x slot=2 |
jlaskey@3 | 423 | [codegen] #51 {1:O} convert object -> double |
jlaskey@3 | 424 | [codegen] #52 {1:D} neg |
jlaskey@3 | 425 | [codegen] #53 {1:D} convert double -> object |
jlaskey@3 | 426 | [codegen] #54 {1:O} ternary_exit_107c1f2f |
jlaskey@3 | 427 | [codegen] #55 {1:O} return object |
jlaskey@3 | 428 | |
jlaskey@3 | 429 | shows a ternary node being generated for the sequence "return x > 0 ? |
jlaskey@3 | 430 | x : -x" |
jlaskey@3 | 431 | |
jlaskey@3 | 432 | The first number on the log line is a unique monotonically increasing |
jlaskey@3 | 433 | emission id per bytecode. There is no guarantee this is the same id |
jlaskey@3 | 434 | between runs, depending on non deterministic code |
jlaskey@3 | 435 | execution/compilation, but for small applications it usually is. If |
jlaskey@3 | 436 | the system variable -Dnashorn.codegen.debug.trace=<x> is set, where x |
jlaskey@3 | 437 | is a bytecode emission id, a stack trace will be shown as the |
jlaskey@3 | 438 | particular bytecode is about to be emitted. This can be a quick way to |
jlaskey@3 | 439 | determine where it comes from without attaching the debugger. "Who |
jlaskey@3 | 440 | generated that neg?" |
jlaskey@3 | 441 | |
jlaskey@3 | 442 | The --log=codegen option is equivalent to setting the system variable |
jlaskey@3 | 443 | "nashorn.codegen.debug" to true. |
jlaskey@3 | 444 | |
jlaskey@3 | 445 | |
jlaskey@3 | 446 | * lower |
jlaskey@3 | 447 | |
lagergren@57 | 448 | This is the first lowering pass. |
lagergren@57 | 449 | |
lagergren@57 | 450 | Lower is a code generation pass that turns high level IR nodes into |
lagergren@57 | 451 | lower level ones, for example substituting comparisons to RuntimeNodes |
lagergren@57 | 452 | and inlining finally blocks. |
lagergren@57 | 453 | |
lagergren@57 | 454 | Lower is also responsible for determining control flow information |
lagergren@57 | 455 | like end points. |
lagergren@57 | 456 | |
lagergren@57 | 457 | |
lagergren@57 | 458 | * attr |
lagergren@57 | 459 | |
jlaskey@3 | 460 | The lowering annotates a FunctionNode with symbols for each identifier |
jlaskey@3 | 461 | and transforms high level constructs into lower level ones, that the |
jlaskey@3 | 462 | CodeGenerator consumes. |
jlaskey@3 | 463 | |
jlaskey@3 | 464 | Lower logging typically outputs things like post pass actions, |
jlaskey@3 | 465 | insertions of casts because symbol types have been changed and type |
jlaskey@3 | 466 | specialization information. Currently very little info is generated by |
jlaskey@3 | 467 | this logger. This will probably change. |
jlaskey@3 | 468 | |
jlaskey@3 | 469 | |
lagergren@57 | 470 | * finalize |
jlaskey@3 | 471 | |
lagergren@57 | 472 | This --log=finalize log option outputs information for type finalization, |
lagergren@57 | 473 | the third tier of the compiler. This means things like placement of |
lagergren@57 | 474 | specialized scope nodes or explicit conversions. |
jlaskey@3 | 475 | |
jlaskey@3 | 476 | |
jlaskey@3 | 477 | * fields |
jlaskey@3 | 478 | |
jlaskey@3 | 479 | The --log=fields option (at info level) is equivalent to setting the |
jlaskey@3 | 480 | system variable "nashorn.fields.debug" to true. At the info level it |
jlaskey@3 | 481 | will only show info about type assumptions that were invalidated. If |
jlaskey@3 | 482 | the level is set to finest, it will also trace every AccessorProperty |
jlaskey@3 | 483 | getter and setter in the program, show arguments, return values |
jlaskey@3 | 484 | etc. It will also show the internal representation of the respective field |
jlaskey@3 | 485 | (Object in the normal case, unless running with the dual field |
lagergren@24 | 486 | representation). |