README

author: aoqi
date: Mon, 30 May 2016 02:01:38 -0400
changeset: 13 (bc227c49eaae)
parent: 0 (f90c822e73f8)
permissions: -rw-r--r--

[C2] Rewrite generate_disjoint_short_copy.
Eliminated unaligned accesses and optimized the copy algorithm.
xml.transform improved by 50%; total GEO improved by 13%.

Copy Algorithm:
Generate a stub for disjoint short copy. If "aligned" is true, the
"from" and "to" addresses are assumed to be heapword aligned.

Arguments for generated stub:
from: A0
to: A1
element count: A2, treated as signed
one element: 2 bytes

Strategy for aligned==true:

If length <= 9:
1. copy 1 element at a time (l_5)

If length > 9:
1. copy 4 elements at a time until fewer than 4 elements are left (l_7)
2. copy 2 elements at a time until fewer than 2 elements are left (l_6)
3. copy the last element if one was left in step 2 (l_1)
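
The aligned strategy above can be sketched in portable C++. This is only an illustration of the control flow, not the generated stub: the names disjoint_short_copy_aligned and copy_chunk are hypothetical, and each fixed-size memcpy stands in for one load/store pair that the stub would emit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies N 2-byte elements with one fixed-size memcpy
// and advances both cursors. It stands in for a single load/store pair.
template <int N>
static inline void copy_chunk(const uint16_t*& from, uint16_t*& to, long& count) {
    std::memcpy(to, from, N * sizeof(uint16_t));
    from += N;
    to += N;
    count -= N;
}

// Sketch of the aligned==true strategy. count is signed, so a zero or
// negative count copies nothing.
void disjoint_short_copy_aligned(const uint16_t* from, uint16_t* to, long count) {
    if (count <= 9) {
        while (count > 0) copy_chunk<1>(from, to, count);  // l_5
        return;
    }
    while (count >= 4) copy_chunk<4>(from, to, count);     // l_7
    while (count >= 2) copy_chunk<2>(from, to, count);     // l_6
    if (count == 1) copy_chunk<1>(from, to, count);        // l_1
    assert(count == 0);  // every element has been copied
}
```

With 23 elements, for example, the 4-element loop runs five times, the 2-element loop once, and the final element is handled by the l_1 step.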


Strategy for aligned==false:

If length <= 9: same as aligned==true case

If length > 9:
1. continue with step 7 if from and to have different alignment
   mod 4.
2. align from and to to 4 bytes by copying 1 element if necessary.
3. at l_2, from and to are 4-byte aligned; continue with step 6 if
   they cannot be aligned to 8 bytes because their alignment mod 8
   differs.
4. at this point both from and to have the same alignment mod 8;
   copy one element if necessary to get 8-byte alignment of from
   and to.
5. copy 4 elements at a time until fewer than 4 elements are left;
   depending on step 3, all loads/stores are aligned.
6. copy 2 elements at a time until fewer than 2 elements are left (l_6).
7. copy 1 element at a time (l_5).
8. copy the last element if one was left in step 6 (l_1).
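
The unaligned strategy can be sketched in portable C++ as follows. The names disjoint_short_copy_unaligned, copy_chunk and addr are hypothetical, and memcpy stands in for the loads and stores of the stub, so the sketch copies correctly for any alignment; the alignment tests only decide which chunk sizes are used. One interpretation is assumed for step 4: the sketch moves 4 bytes (two 2-byte elements) there, because a single 2-byte element cannot take a 4-byte-aligned address to an 8-byte boundary. The actual stub may handle this step differently.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies N 2-byte elements and advances both cursors.
template <int N>
static inline void copy_chunk(const uint16_t*& from, uint16_t*& to, long& count) {
    std::memcpy(to, from, N * sizeof(uint16_t));
    from += N;
    to += N;
    count -= N;
}

// Hypothetical helper: numeric value of a pointer, for alignment tests.
static inline std::uintptr_t addr(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// Sketch of the aligned==false strategy for 2-byte elements.
void disjoint_short_copy_unaligned(const uint16_t* from, uint16_t* to, long count) {
    if (count > 9) {
        // Step 1: different alignment mod 4 means only step 7 applies.
        if (((addr(from) ^ addr(to)) & 3) == 0) {
            // Step 2: copy one element if needed to reach 4-byte alignment.
            if (addr(from) & 3) copy_chunk<1>(from, to, count);
            // Step 3 (l_2): skip to step 6 if alignment mod 8 differs.
            if (((addr(from) ^ addr(to)) & 7) == 0) {
                // Step 4 (assumed interpretation): move 4 bytes if needed
                // to reach 8-byte alignment.
                if (addr(from) & 7) copy_chunk<2>(from, to, count);
                // Step 5: 4 elements (8 bytes) at a time.
                while (count >= 4) copy_chunk<4>(from, to, count);
            }
            // Step 6 (l_6): 2 elements (4 bytes) at a time.
            while (count >= 2) copy_chunk<2>(from, to, count);
            assert(count == 0 || count == 1);
        }
    }
    // Step 7 (l_5) for short or mismatched inputs; this loop also covers
    // step 8 (l_1), the single element that step 6 may leave behind.
    while (count > 0) copy_chunk<1>(from, to, count);
}
```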

TODO:

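1. use Loongson 128-bit load/store instructions
2. use loop unrolling when len is big enough, for example if
   len > 0x2000:
    __ bind(l_x);
    // load 16 bytes from tmp1, store them to tmp2
    __ ld(AT, tmp1, 0);
    __ ld(tmp, tmp1, 8);
    __ sd(AT, tmp2, 0);
    __ sd(tmp, tmp2, 8);
    // same for the next 16 bytes
    __ ld(AT, tmp1, 16);
    __ ld(tmp, tmp1, 24);
    __ sd(AT, tmp2, 16);
    __ sd(tmp, tmp2, 24);
    // advance both pointers by 32 bytes
    __ daddi(tmp1, tmp1, 32);
    __ daddi(tmp2, tmp2, 32);
    // tmp3 -= 16 (32 bytes = 16 two-byte elements)
    __ daddi(tmp3, tmp3, -16);
    // AT = tmp3 - 16; loop again while AT >= 0
    __ daddi(AT, tmp3, -16);
    __ bgez(AT, l_x);
    __ delayed()->nop();  // fill the branch delay slot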
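
For comparison, the unrolled loop from TODO item 2 might look like this in portable C++. It is a sketch only: unrolled_short_copy and move8 are hypothetical names, each move8 stands in for one ld/sd pair, and the len > 0x2000 threshold check is left to the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: one 8-byte load followed by one 8-byte store
// (four 2-byte elements).
static inline void move8(const uint16_t* from, uint16_t* to) {
    uint64_t v;
    std::memcpy(&v, from, sizeof v);
    std::memcpy(to, &v, sizeof v);
}

// Copies 16 elements (32 bytes) per iteration, then finishes the tail
// one element at a time.
void unrolled_short_copy(const uint16_t* from, uint16_t* to, long count) {
    while (count >= 16) {
        move8(from,      to);
        move8(from + 4,  to + 4);
        move8(from + 8,  to + 8);
        move8(from + 12, to + 12);
        from += 16;
        to += 16;
        count -= 16;
    }
    assert(count < 16);  // fewer than one full block remains
    while (count > 0) {
        *to++ = *from++;
        --count;
    }
}
```

Unlike the assembler loop, which tests the count at the bottom and therefore expects at least 16 elements on entry, this version tests at the top so that it is safe for any count.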

README:
  This file should be located at the top of the hotspot Mercurial repository.

  See http://openjdk.java.net/ for more information about the OpenJDK.

  See ../README-builds.html for complete details on build machine requirements.

Simple Build Instructions:

    cd make && gnumake

  The files that will be imported into the jdk build will be in the "build"
  directory.
