Turbocharging JavaScript:
V8
Rahul Dé
Compiler Junkie @ThoughtWorks
Problem
Find the 25,000th Prime number
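A straightforward baseline for this problem could look like the sketch below. The function name `nthPrime` and the trial-division approach are assumptions made for illustration; the slides do not show the actual code used in the talk.

```javascript
// Hypothetical baseline: returns the n-th prime (1-based) by trial division
// against the primes found so far.
function nthPrime(n) {
  const primes = [];
  let candidate = 2;
  while (primes.length < n) {
    let isPrime = true;
    for (let i = 0; i < primes.length; i++) {
      const p = primes[i];
      if (p * p > candidate) break; // no divisor can exist beyond sqrt(candidate)
      if (candidate % p === 0) {
        isPrime = false;
        break;
      }
    }
    if (isPrime) primes.push(candidate);
    candidate++;
  }
  return primes[n - 1];
}

console.log(nthPrime(25000));
```

The array only ever holds integers and is never read out of bounds, which is in line with the performance tips later in the deck.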
JavaScript
The Language
Invented by Brendan Eich at Netscape.
Defined by the ECMA 262 Standard - ECMAScript 6.
Dynamically Typed.
Duck Typing.
Prototype and Object based.
Asynchrony is a language feature.
Functions are first class citizens.
Has its fair share of the Good, the Bad and the Ugly.
Implementing the Good,
Bad and the Ugly.
Major JavaScript
Implementations
SpiderMonkey, IonMonkey and JägerMonkey JIT
compilers by Mozilla powering Firefox.
V8 by Google powering the Chromium family of
Browsers, Opera, Node.js, MongoDB etc.
JSCore by Apple powering Safari and WebKit based
browsers.
Rhino by Mozilla and Nashorn by Oracle
implement JavaScript on the JVM.
Fully
Compiled
Languages
C, C++, Pascal, Fortran
etc…
Tokenizer -> (Tokens) -> Parser -> (AST) -> Optimizer -> (IR) -> Code Gen -> Assembly Code
The Grammar drives the Parser.
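The first stage of the pipeline above can be sketched as a toy tokenizer. This is only an illustration: the `tokenize` function, its token type names and the tiny character set it accepts are assumptions, not part of any real compiler.

```javascript
// Toy tokenizer (illustrative only): splits source text into typed tokens.
// Characters outside the pattern, such as whitespace, are simply skipped.
function tokenize(source) {
  const pattern = /\d+|[A-Za-z_]\w*|[=+;]/g;
  const tokens = [];
  for (const match of source.matchAll(pattern)) {
    const text = match[0];
    let type;
    if (/^\d/.test(text)) type = "number";
    else if (/^[A-Za-z_]/.test(text)) type = "identifier";
    else type = "punctuator";
    tokens.push({ type, value: text });
  }
  return tokens;
}

const tokens = tokenize("int c = a + b;");
console.log(tokens.map((t) => t.value).join(" ")); // int c = a + b ;
```

A parser would then consume this token list and, guided by the grammar, build the AST.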
Abstract Syntax Tree
http://jointjs.com/demos/javascript-ast
Intermediate
Representation
int a = 5;
int b = 10;
int c = a + b;
store a, 5
mov r1, a
store b, 10
mov r2, b
add r1, r2
store c, r1
Assembly Language
mov rax, 05H
mov rbx, 0AH
add rax, rbx
mov [1000], rax
Partly
Compiled
Languages
Java, Python, Ruby
etc.
Tokenizer -> (Tokens) -> Parser -> (AST) -> Optimizer -> Code Gen -> Byte Code -> VM
The Grammar drives the Parser.
Byte Codes
a = 5;
b = 10;
c = a + b;
iconst 5
set a
iconst 10
set b
push a
push b
add
set c
Virtual Machines
Stack VM - JVM,
Python VM, Ruby
VM(YARV) etc.
Register VM - Dalvik
VM, Lua VM, LLVM etc.
Running Byte codes
iconst 5
set a
iconst 10
set b
push a
push b
add
set c
Stack: [5] -> pop -> [10] -> pop -> [a] -> [a, b] -> [a + b] -> pop
Variables: a = 5, b = 10, c = 15
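The byte code run above can be reproduced with a toy stack VM. This is a minimal sketch: the `run` function and the array-based instruction encoding are assumptions for illustration, while the four opcodes are the ones shown on the slide.

```javascript
// Toy stack VM: executes the byte codes from the slide, one switch case per opcode.
function run(program) {
  const stack = [];
  const vars = {};
  for (const [op, arg] of program) {
    switch (op) {
      case "iconst": // push a constant
        stack.push(arg);
        break;
      case "set": // pop the top of the stack into a variable
        vars[arg] = stack.pop();
        break;
      case "push": // push a variable's value
        stack.push(vars[arg]);
        break;
      case "add": {
        const right = stack.pop();
        const left = stack.pop();
        stack.push(left + right);
        break;
      }
      default:
        throw new Error(`unknown opcode: ${op}`);
    }
  }
  return vars;
}

const program = [
  ["iconst", 5], ["set", "a"],
  ["iconst", 10], ["set", "b"],
  ["push", "a"], ["push", "b"],
  ["add"], ["set", "c"],
];
const result = run(program);
console.log(result); // { a: 5, b: 10, c: 15 }
```

Every instruction goes through the loop and the switch, which is the dispatch overhead the next slide complains about.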
Problems??
A full program (the VM) executes them
Lots of context switching
A big switch case to dispatch the byte codes
Big libraries to handle the ops
But necessary, as it's a dynamic language, and it makes
the code cross-platform
JIT VM
JVM, CLR, V8
Tokenizer -> (Tokens) -> Parser -> (AST) -> Optimizer -> Code Gen -> Byte Code -> VM + JIT
The Grammar drives the Parser.
JIT
iconst 5
set a
iconst 10
set b
push a
push b
add
set c
mov rax, 05H
mov rbx, 0AH
add rax, rbx
mov [1000], rax
At Runtime
Why
Code executes directly on the CPU, hence MUCH faster
No need for complex, slow code to dispatch byte codes
More efficient execution using fewer machine
resources
Enables many more runtime optimizations
But it is quite difficult to implement
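The idea of translating byte codes once instead of dispatching them one by one can be imitated in JavaScript itself. The sketch below is only an analogy: a real JIT emits machine code, whereas this hypothetical `compile` function emits JavaScript source and hands it to `new Function`, so the host engine plays the role of the CPU.

```javascript
// Toy "JIT" analogy: translate the byte codes into straight-line JavaScript once,
// then run the generated function with no per-instruction dispatch.
function compile(byteCodes) {
  const lines = ["const vars = {};", "const stack = [];"];
  for (const [op, arg] of byteCodes) {
    switch (op) {
      case "iconst":
        lines.push(`stack.push(${Number(arg)});`);
        break;
      case "set":
        lines.push(`vars[${JSON.stringify(arg)}] = stack.pop();`);
        break;
      case "push":
        lines.push(`stack.push(vars[${JSON.stringify(arg)}]);`);
        break;
      case "add":
        lines.push("{ const r = stack.pop(); const l = stack.pop(); stack.push(l + r); }");
        break;
      default:
        throw new Error(`unknown opcode: ${op}`);
    }
  }
  lines.push("return vars;");
  return new Function(lines.join("\n"));
}

const byteCodes = [
  ["iconst", 5], ["set", "a"],
  ["iconst", 10], ["set", "b"],
  ["push", "a"], ["push", "b"],
  ["add"], ["set", "c"],
];
const compiled = compile(byteCodes); // the switch runs once, at compile time
console.log(compiled().c); // 15
```

The switch still exists, but it is paid for once per program rather than once per executed instruction.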
Can JavaScript ever be faster
than C++?
Features
NO BYTECODES!
Full Register JIT
Advanced
runtime/compile time
optimizations
Highly embeddable
Implements the full ES6
spec
Architecture and Design
Compiler -> Full Code Gen -> Crankshaft Compiler
Crankshaft Compiler: Hydrogen DFG -> Lithium IR -> Optimized Code Gen
Inline Caches
Compiler
Reads the JavaScript
source.
Tokenizes, Parses and
checks for syntax
errors.
Produces a persistent
AST or pAST.
Very VERY fast.
Full Code Gen
Reads the AST
Blindly generates platform
specific assembly code.
Generates some extra code
to infer types called inline
caches (IC).
Dispatches whole code to
CPU directly.
Very VERY VERY fast.
Inline Caches (IC)
function add(a, b) {
  return a + b;
}
Call Sites
add(5, 6)            int, int -> int   asm1
add("5", 6)          str, int -> str   asm2
add("Rahul", "Dé")   str, str -> str   asm3
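The call-site behaviour above can be modelled with a small cache keyed on operand types. This is a toy model under stated assumptions, not how V8 implements inline caches: `makeCallSite`, the `typeof`-based signature and the hit/miss counters are all hypothetical names chosen for illustration.

```javascript
// Toy model of an inline cache at one call site.
function makeCallSite(genericOp) {
  const cache = new Map(); // type signature -> specialized stub
  const stats = { hits: 0, misses: 0 };

  function call(a, b) {
    const signature = `${typeof a},${typeof b}`;
    let stub = cache.get(signature);
    if (stub === undefined) {
      stats.misses++;
      // A real engine would emit specialized machine code here (asm1, asm2, asm3).
      stub = (x, y) => genericOp(x, y);
      cache.set(signature, stub);
    } else {
      stats.hits++;
    }
    return stub(a, b);
  }

  return { call, stats };
}

const site = makeCallSite((a, b) => a + b);
site.call(5, 6);          // miss: number,number
site.call(7, 8);          // hit: same signature as before
site.call("5", 6);        // miss: string,number
site.call("Rahul", "Dé"); // miss: string,string
console.log(site.stats);  // { hits: 1, misses: 3 }
```

Each new type combination costs a miss and a new stub, which is why call sites that always see the same types are the cheapest.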
Crankshaft Compiler
Profiles the CPU performance via a thread.
Gathers type info from the ICs.
Generates the Hydrogen representation for
advanced optimizations
From that generates platform specific
Lithium IR for register allocation with more
optimizations.
Finally super fast, optimized assembly is
generated and is set free on the CPU.
De-optimizes wrongly optimized code back to the
full code gen level.
Crankshaft is slower than the other stages as it does a
lot of analysis.
Hydrogen DFG
Derived from type info from the
ICs.
Is an acyclic data flow
graph representation of
the pAST.
Enables advanced
optimizations like hotspot detection,
unreachable code elimination, peephole optimization,
loop-invariant code motion
etc.
Used to generate the platform
specific Lithium code.
Lithium IR
Is a three-address register
representation of the Hydrogen
DFG
Enables low-level optimizations
like register reduction
Maps exactly to the underlying
architecture to speed up
code gen.
Used to generate the fast
optimized native assembly
code.
Possible Future of V8
Ignition Compiler
Much faster, and produces streamlined,
compact byte codes, which save a lot of
memory with on-par performance.
Turbofan Engine
Reads the Ignition byte codes and JITs
them directly at blazing fast speed while
reducing the memory footprint.
Replace the pAST, full code gen and
crankshaft with the ignition and turbofan
compilers.
Already in pre Alpha builds.
Can I help V8 to execute
my code better??
Of course. By now you know a lot
about V8. And definitely you can now
write much better code.
Writing V8 friendly
JavaScript.
Hidden Classes
function Point(x, y) {
this.x = x;
this.y = y;
}
var p1 = new Point(5, 6);
var p2 = new Point(7, 8);
Hidden class (shared by p1 and p2): x, y
p1: x = 5, y = 6
p2: x = 7, y = 8
Hidden Classes
var p1 = new Point(5, 6);
p1.z = 4;
delete p1.z;
Hidden class before: x, y
Hidden class after p1.z = 4: x, y, z
p1: x = 5, y = 6, z = 4
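One way to avoid the extra hidden class transition is to declare every property in the constructor and never delete any. The sketch below assumes a hypothetical `Point3D` variant of the slide's `Point`. Hidden classes cannot be observed from JavaScript, so the key comparison at the end is only a rough proxy for "same shape".

```javascript
// Assumption: Point3D is a hypothetical variant of the slide's Point.
function Point3D(x, y, z) {
  this.x = x;
  this.y = y;
  this.z = z; // declared up front, in the same order for every instance
}

const p1 = new Point3D(5, 6, 4);
const p2 = new Point3D(7, 8, 0);

// Instead of `delete p1.z`, overwrite the value so the property layout is unchanged.
p1.z = undefined;

const sameLayout = Object.keys(p1).join() === Object.keys(p2).join();
console.log(sameLayout); // true
```

Both objects keep the properties x, y and z in the same order, so the engine has no reason to give them different hidden classes.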
Performance Tips
Try to use homogeneous data types
in arrays.
Preallocate arrays if non-homogeneous
data is needed.
Avoid pushing non-homogeneous
types like bools into int arrays.
Initialize all properties in the constructor
function instead of adding them later on.
Catch blocks are not Crankshaft-optimizable;
write the catch code in a
function and call it from the catch
block.
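Two of these tips can be sketched in a few lines. The helper names `handleParseError` and `safeParse` are assumptions for illustration, and the advice about catch blocks reflects the Crankshaft-era claim made on the slide.

```javascript
// Keep array elements of a single type (here: integers only).
const squares = [];
for (let i = 0; i < 5; i++) {
  squares.push(i * i); // never mix in booleans or strings
}

// Hypothetical helper: the error handling lives in its own function...
function handleParseError(error, fallback) {
  console.error(`parse failed: ${error.message}`);
  return fallback;
}

// ...so the catch block is reduced to a single call.
function safeParse(text, fallback) {
  try {
    return JSON.parse(text);
  } catch (error) {
    return handleParseError(error, fallback);
  }
}

console.log(squares);
console.log(safeParse('{"ok":true}', null));
console.log(safeParse("not json", null));
```

The function containing the `try` stays tiny, and the real work sits in ordinary functions that the optimizer can treat like any other code.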
Turbocharging JavaScript
Fixing the Problem
Find the 25,000th Prime number
Thank you.
rahul080327@gmail.com
https://github.com/rahul080327
