Next-Generation Compiler
The Future of the VM
Apple's Open-Source Project: LLVM

twitter.com/@godrm
facebook.com/godrm
JCO
Mac OS X
Mac OS X: the father of iOS
Father of OS X
Java vs. Objective-C

“ [quotation not captured in the extracted text] ”
- Patrick Naughton
JVM vs Objective-C Runtime
[Diagram layers, top to bottom: Java App., Java App., Obj-C App., Obj-C App.; JVM and Objective-C Runtime; operating system; hardware]
OS X Open-sources
http://opensource.apple.com/
http://www.apple.com/opensource/
https://developer.apple.com/opensource/
LLVM
Where the idea began…
• JVMs do all optimizations online at JIT time:
- Hugely redundant across runs
- Applications launch slowly
- What if we could do the heavy lifting at install time?
• Problem: Java bytecode is too limiting!
- Memory safety prevents some optimizations (e.g. bounds checks)
- The JVM type system doesn't lend itself to machine-level optimizations

“With some sort of low-level virtual machine, we could optimize better and a JIT compiler would have to do less work online!”
Introduction
• LLVM
- Low-Level Virtual Machine
• An Infrastructure for Multi-stage Optimization
- by Chris Arthur Lattner, 2002
• Design and implementation of a compiler infrastructure
- supports a unique multi-stage optimization system
- supports inter-procedural and profile-driven optimizations
• LLVM virtual instruction set (IR)
- with high-level type information
• Sponsored by Apple
LLVM Vision and Approach
• Primary mission: Build a set of modular compiler components:
- Reduces the time & cost to construct a particular compiler
- Components are shared across different compilers
- Allows choice of the right component for the job
• Secondary mission: Build compilers out of these components
[Diagram of components: X86, PPC, CBE, clang, GCC, LTO, DWARF, Code gen, Target, JIT, Optzn, linker, IPO, BC IO, LL IO, System, Core, Support, xforms, analysis, GC, ...]
Authors
• Vikram Adve
- University of Illinois
• Chris Lattner
- LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation (2004)
- Macroscopic Data Structure Analysis and Optimization (Ph.D. thesis, 2005)
- has worked at Apple since 2005
• Related Publications
- 15 < @2007, 30 ~ @2008, 50 ~ @2009, 30 ~ @2010
LLVM Timeline
LLVM 1.0 @Oct. 2003 - LTO/LTA
LLVM 1.4 @Dec. 2004
Clang C @Oct. 2006
LLVM 2.0 @May 2007
Open @July 2007
WWDC @June 2008
LLVM 2.4 @Nov. 2008
LLVM 2.6 @Oct. 2009
LLVM 2.8 @Oct. 2010
LLVM 3.0 @Dec. 2011
LLVM 3.2 @Dec. 2012
LLVM 3.4 @Jan. 2014

osxdev.org
Compiler Architectures
GCC 4.2: C / C++ / Objective-C ➔ GCC 4.2 front end ➔ Tree-SSA optimizer ➔ code generator ➔ executable
LLVM-GCC 4.2: C / C++ / Objective-C ➔ GCC 4.2 front end ➔ LLVM optimizer ➔ LLVM code generator ➔ executable
LLVM: C / C++ / Objective-C ➔ Clang front end ➔ LLVM optimizer ➔ LLVM code generator ➔ executable
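The Architecture
[Diagram: compiler front ends 1..N emit LLVM .o files at compile time; at link time the linker combines them with libraries and runs IPO / IPA; native codegen produces an executable that carries both native code and LLVM; at run time a runtime optimizer collects profile and trace info, an offline reoptimizer consumes the profile info, and a JIT can execute the LLVM form.]

LLVM Run-Time
[Diagram layers, top to bottom: executables in LLVM bytecode; LLVM JIT; operating system; hardware]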
Virtual Instruction Set
for (i = 0; i < N; ++i)
  sum(&A[i], &P);

loop:
  %i.1 = phi int [ 0, %bb0 ], [ %i.2, %loop ]
  %AiAddr = getelementptr float* %A, int %i.1
  call void %sum(float* %AiAddr, %pair* %P)
  %i.2 = add int %i.1, 1
  %tmp.4 = setlt int %i.2, %N
  br bool %tmp.4, label %loop, label %outloop

http://llvm.org/docs/LangRef.html
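To make the mapping from source to IR concrete, here is a minimal C sketch that writes the same loop twice: once as the `for` loop and once lowered by hand into a single block with an explicit address computation, increment, compare, and back-edge. The `struct pair` layout and the body of `sum` are assumptions for illustration only; the slide shows just how they are used. The guard on `N` is also an addition, so the hand-lowered form skips the body when there is nothing to do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pair type and callee; not taken from the slide. */
struct pair { float total; int count; };

static void sum(const float *elem, struct pair *p) {
    p->total += *elem;
    p->count += 1;
}

/* Source-level form. */
static void sum_all_source(const float *A, int N, struct pair *P) {
    for (int i = 0; i < N; ++i)
        sum(&A[i], P);
}

/* Hand-lowered form: each statement corresponds to one IR instruction. */
static void sum_all_lowered(const float *A, int N, struct pair *P) {
    assert(P != NULL);
    int i1 = 0;                        /* phi: 0 on entry, i2 on the back-edge */
    if (N <= 0) return;                /* added guard: skip the loop entirely */
loop:
    {
        const float *AiAddr = A + i1;  /* getelementptr: address of A[i1] */
        sum(AiAddr, P);                /* call with the element address */
        int i2 = i1 + 1;               /* add */
        int tmp4 = i2 < N;             /* signed less-than on the incremented index */
        i1 = i2;                       /* value the phi picks up next time around */
        if (tmp4) goto loop;           /* conditional branch back to the loop head */
    }
}
```

Both functions visit exactly the elements `A[0]` through `A[N-1]`, which is the property an SSA-form loop has to preserve.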
LLVM Demo
#include <stdio.h>
#include <stdlib.h>

int factorial(int X) {
  if (X == 0) return 1;
  return X*factorial(X-1);
}

int main(int argc, char **argv) {
  if (argc < 2) return 1;  /* expects one integer argument */
  printf("%d\n", factorial(atoi(argv[1])));
  return 0;
}

http://llvm.org/demo
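One reason this function makes a good demo is that an optimizer can turn the recursion into a loop by carrying the pending multiplications in an accumulator (LLVM's tail call elimination pass can do this kind of rewrite). The sketch below is a hand-written illustration of that idea, not output captured from the demo page; `factorial_loop` is a name chosen here.

```c
#include <assert.h>

/* Recursive form, as on the slide. */
int factorial(int X) {
    if (X == 0) return 1;
    return X * factorial(X - 1);
}

/* Loop form: the accumulator replaces the chain of pending multiplications. */
int factorial_loop(int X) {
    assert(X >= 0);   /* a negative input would never reach zero */
    int acc = 1;
    while (X != 0) {
        acc *= X;
        X -= 1;
    }
    return acc;
}
```

The two versions agree for every non-negative input that does not overflow `int`; the loop form simply needs no stack frame per step.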
Effective LLVM Projects
[Diagram labels: Xcode, Dynamic Prog., Clang, LLDB, LLVM, Optimize (size), Optimize (speed), OpenGL, OpenCL]
Summary of the optimization
• Analysis Passes (~50)
- Basic Alias Analysis
- Basic CallGraph construction
- Count alias analysis query response
- Dominator Tree Construction
- Counts the various types of Instructions
- Loop Dependence Analysis
- Track pointer bounds
• Transform Passes (~70)
- Remove redundant conditional branches
- Aggressive Dead Code Elimination
- Dead Code Elimination
- Deduce function attributes
- Unroll loops
- Optimize use of memcpy and friends
- Strip debug info for unused symbols
• Utility Passes (~10)
- Dead Argument Hacking
- View CFG of function

http://llvm.org/docs/Passes.html
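A few of the transform passes listed above are easy to picture at the source level. The following C sketch shows a function before and after dead code elimination, removal of a branch that can never be taken, and full unrolling of a fixed-trip loop. The "after" version is written by hand for illustration; the function names are made up here and real pass output is IR, not C.

```c
#include <assert.h>

/* Before: a dead computation, a constant-false branch, and a 4-trip loop. */
int dot4_before(const int a[4], const int b[4]) {
    assert(a && b);
    int scratch = a[0] * 42;       /* dead code: the value is never used */
    int acc = 0;
    for (int i = 0; i < 4; ++i)    /* trip count known at compile time */
        acc += a[i] * b[i];
    if (0)                         /* branch that can never be taken */
        acc = -1;
    return acc;
}

/* After (by hand): dead code removed, branch folded, loop unrolled. */
int dot4_after(const int a[4], const int b[4]) {
    assert(a && b);
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}
```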
Optimization Example - IPO
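The slide's own example was an image and did not survive extraction, so here is a small stand-in, assuming nothing beyond the general idea of interprocedural optimization: once the optimizer can see across function boundaries, it can inline small callees and propagate a callee's constant return value into the caller. All names are hypothetical and the "after" function is written by hand.

```c
#include <assert.h>

/* Small helpers that an interprocedural optimizer can see through. */
static int scale(int x, int factor) { return x * factor; }
static int is_debug(void) { return 0; }   /* always returns a constant */

/* Before: three calls and a branch on a value that is constant in practice. */
int area_before(int w, int h) {
    int a = scale(w, h);
    if (is_debug())
        a = scale(a, 100);
    return scale(a, 2);
}

/* After (by hand): callees inlined, constant propagated, dead branch removed. */
int area_after(int w, int h) {
    return w * h * 2;
}
```

Within a single translation unit a conventional compiler can already do this; the point of link-time IPO is that the same reasoning applies when `scale` and `is_debug` live in other files.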
Compile Time Performance
• Problem: System headers are huge!
- Carbon.h contains:
✴ 558 files
✴ 12.3 megabytes
✴ 10,000 function declarations
✴ 2000 structure definitions, 8000 fields
✴ 3000 enum definitions, 20000 enum constants
✴ 5000 typedefs
✴ 2000 file scoped variables
✴ 6000 macros
AppKit Framework Compile Times
Compile time (minutes)
Configuration   GCC 4.2   LLVM-GCC   Result
-Os             24.5      16.0       35% faster
-O0 -g          4.2       3.9        5% faster
Full Build Speedup (Debug Config)
[Bar chart: full-build speedup per project, on a 0x to 3x axis, with bars ranging from 1.4x to 2.5x. Projects: BlastApp, Growl, Sketch, Adium, Xcode, iCal, AppleScript Edit, Preview, Message (Mail), Interface Builder, Dashcode, Automator, AddressBook]
Full Build is 2x Faster!
Clang Front-end
• C, Objective-C and C++ front-end
• Aggressive project with many goals...
- Compatibility with GCC
- Fast compilation
- Expressive error messages (static analysis)

t.c:6:49: error: invalid operands to binary expression ('int' and 'struct A')
  return intArg + func(intArg ? ((someA.X+40) + someA) / 42 : someA.X);
                                 ~~~~~~~~~~~~ ^ ~~~~~

• Host for a broad range of source-level tools
http://llvm.org/devmtg/2008-08/Kremenek_StaticAnalyzer.pdf
Clang Front-end
Analyzer Xcode Integration
Improving Your Application with the Xcode Static Analyzer

North Beach
Tuesday 5:00PM

http://llvm.org/devmtg/2008-08/Kremenek_StaticAnalyzer.pdf
Better Diagnosis of the Problem
GCC 4.2

Clang

test.m:4:1: error: unknown type name 'NSString'
NSString *P = @"good stuff";
^
libclang (NEW)
• Clang is not just a great compiler...
- also a library for processing source code
✴ Translates text into AST
✴ Resolves identifiers and symbols
✴ Expands macros
✴ Makes implicit information explicit
• Features
- Parsing
- Indexing and cross-referencing
- Syntax highlighting
- Code completion
ARC (NEW)
• Automatic Reference Counting
- Automatic memory management of Objective-C objects
- Done at compile time, not at run time
- Not a garbage collector
• Migration tool in Xcode 4.2
- with LLVM 3.0
- build setting: -fobjc-arc (cf. -fno-objc-arc)
• New rules
- Don't call dealloc, retain, release, or autorelease explicitly
✴ can still use CFRetain / CFRelease on Core Foundation objects
- Can't use NSAllocateObject / NSDeallocateObject
- Can't use object pointers in C structures
- No casual casting between id and void*
- Can't use NSAutoreleasePool; use @autoreleasepool instead
- Can't use memory zones (NSZone)
- Can't give a property a name that begins with new
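The "compile time, not run time" point can be pictured with a plain C analogy: reference counting itself still happens while the program runs, but the calls that adjust the count are placed by the compiler rather than typed by the programmer. Everything below is a hypothetical sketch in C (`Object`, `object_retain`, `object_release` are invented names, not the Objective-C runtime API); the comments mark the calls that a compiler doing ARC-style insertion would add.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Object { int refcount; int payload; } Object;

static int live_objects = 0;   /* bookkeeping so leaks are observable */

Object *object_new(int payload) {
    Object *o = malloc(sizeof *o);
    assert(o != NULL);
    o->refcount = 1;           /* the creator owns the first reference */
    o->payload = payload;
    live_objects++;
    return o;
}

Object *object_retain(Object *o) {
    if (o) o->refcount++;
    return o;
}

void object_release(Object *o) {
    if (o && --o->refcount == 0) {   /* last reference gone: free now */
        live_objects--;
        free(o);
    }
}

int live_object_count(void) { return live_objects; }

/* The programmer writes only the middle line; the retain and release
   around it stand for what the compiler would insert automatically. */
int use_object(Object *shared) {
    Object *local = object_retain(shared);   /* inserted: strong reference begins */
    int result = local->payload + 1;
    object_release(local);                   /* inserted: reference leaves scope */
    return result;
}
```

Unlike a garbage collector, nothing scans memory later: the object is freed deterministically at the release that drops the count to zero.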
libc++ (NEW)
• Another C++ standard library?
- http://libcxx.llvm.org
• The C++0x spec introduces several fundamentally new ideas
- Move semantics
- Perfect forwarding
- Variadic templates
• New language features
- Implemented for C++0x from the beginning
- which has driven several low-level design decisions
DragonEgg (NEW)
• gcc plugin
- replaces gcc's optimizers and code generators with LLVM's
- a reimplementation of llvm-gcc that works with gcc-4.5 or later
• Current Status (v2.8)
- C works well, C++ works fairly well
- can compile a small amount of Obj-C/C++
- Limited debug info
- Requires patching gcc
- Only supports x86-32/64
- Only supports linux/darwin

ex) gcc hello.c -S -O1 -o - -fplugin=./dragonegg.so
LLDB (NEW)
• Next-generation, high-performance debugger
• A set of reusable components in LLVM
• Clang expression parser
• LLVM disassembler
• C/C++, Objective-C
• Efficient multi-threading, symbol manager
• Extension: Python scripting
• Supports remote protocol / debug server
Reusability and Extensibility
[Diagram: the lldb command-line tool, Xcode 4, and Python all sit on LLDB.framework, which wraps LLDB Core. Plug-in categories under the core: Process, Dynamic Loader, Object Files, Object Containers, Symbols, Disassembly. Plug-ins shown: Mac OS X, GDB Remote, Darwin, Mach-O, Universal, DWARF, ELF, BSD Archive, ObjectFile, LLVM]
Introduction
GDB
% gdb a.out
(gdb) break main
Breakpoint 1 at 0x100000f33: file main.c, line 4.
(gdb) run

LLDB
% lldb a.out
(lldb) breakpoint set --name main
Breakpoint created: 1: name = 'main', locations = 1
(lldb) process launch
Introduction
GDB
(gdb) info args
argc = 1
argv = (const char **) 0x7fff5fbff550
(gdb) info locals
i = 32767

LLDB
(lldb) frame variable
argc = 1
argv = 0x00007fff5fbfff68
i=0
Expression in LLDB
LLDB
(lldb) expression x+y->getCount()
(int) $0 = 2
(lldb) expression pt
(struct point_tag) $1 = {
(int) x = 2
(int) y = 3
}
(lldb) expression $1.x
(int) $2 = 2
LLDB Command Syntax
<noun> <verb> [-options [option-value]] [argument [argument...]]
Uses standard getopt_long() for predictable behavior
(lldb) process launch a.out --stop-at-entry
(lldb) process launch a.out -- --arg0 --arg1
(lldb) process launch a.out -st
Options know which other options they are compatible with
(lldb) process attach --pid 123 --name a.out
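As a rough picture of what "uses getopt_long()" and "options know which other options they are compatible with" can mean in practice, here is a minimal C sketch of an option parser for an attach-style command. It is not LLDB's implementation: `struct attach_opts` and `parse_opts` are hypothetical names, and treating `--pid` and `--name` as mutually exclusive is an assumption read off the slide's last example.

```c
#include <assert.h>
#include <getopt.h>
#include <stdlib.h>

/* Hypothetical result record for an attach-style command. */
struct attach_opts {
    int pid;              /* -1 when --pid was not given */
    const char *name;     /* NULL when --name was not given */
    int stop_at_entry;
};

/* Returns 0 on success, -1 on an unknown option or an incompatible pair. */
int parse_opts(int argc, char **argv, struct attach_opts *out) {
    static const struct option longopts[] = {
        {"pid",           required_argument, NULL, 'p'},
        {"name",          required_argument, NULL, 'n'},
        {"stop-at-entry", no_argument,       NULL, 's'},
        {NULL, 0, NULL, 0}
    };
    assert(out != NULL);
    out->pid = -1;
    out->name = NULL;
    out->stop_at_entry = 0;
    optind = 1;           /* restart scanning from the first argument */
    opterr = 0;           /* report errors through the return value only */

    int c;
    while ((c = getopt_long(argc, argv, "p:n:s", longopts, NULL)) != -1) {
        switch (c) {
        case 'p': out->pid = atoi(optarg); break;
        case 'n': out->name = optarg; break;
        case 's': out->stop_at_entry = 1; break;
        default:  return -1;            /* unknown option */
        }
    }
    /* Assumed compatibility rule: attach by pid or by name, not both. */
    if (out->pid != -1 && out->name != NULL)
        return -1;
    return 0;
}
```

Because the short and long spellings are declared in one table, `--pid 123` and `-p 123` parse identically, which is the predictability the slide is pointing at.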
Common Commands
GDB
(gdb) ^C
(gdb) signal 2
(gdb) info break
(gdb) continue
(gdb) step
(gdb) stepi
(gdb) next
(gdb) nexti
(gdb) finish
(gdb) info threads
(gdb) backtrace

LLDB
(lldb) process interrupt
(lldb) process signal SIGINT
(lldb) breakpoint list
(lldb) process continue
(lldb) thread step-in
(lldb) thread step-inst
(lldb) thread step-over
(lldb) thread step-over-inst
(lldb) thread step-out
(lldb) thread list
(lldb) thread backtrace
Common Commands (abbreviated)
GDB                  LLDB
(gdb) ^C             (lldb) pro int
(gdb) signal 2       (lldb) pro s SIGINT
(gdb) in br          (lldb) br l
(gdb) c              (lldb) c
(gdb) s              (lldb) s
(gdb) si             (lldb) si
(gdb) n              (lldb) n
(gdb) ni             (lldb) ni
(gdb) fin            (lldb) finish
(gdb) info threads   (lldb) th l
(gdb) bt             (lldb) bt
Java LLVM
Call for help!
• OSS community needs to unite work on various scripting languages
– Common module to represent/type infer an arbitrary dynamic language
– Who will provide this? pypy? parrot? llvm itself someday (“hlvm”)?
[Diagram: Ruby, Python, Perl, JavaScript, ... pass through a "Common Dynamic Language Representation + Type Inference" layer; C, C++, Ada, ... pass through llvm-gcc; GLSL, ARB VP, ... pass through OpenGL; a "What Next?" box marks future inputs. All feed LLVM, whose boxes read: LTO, JIT, Install Time, Cross Lang, Debugger, Codegen, Optzn, Support, IPO]

http://llvm.org/
LLVM sub-projects
• Clang & LLDB
• OpenMP
• Compiler-rt
- provides highly tuned implementations of the low-level code generator support routines
- cf. libgcc
• VMKit
- is an implementation of the Java and .NET virtual machines
• KLEE
- implements a "symbolic virtual machine"
- which uses a theorem prover to try to evaluate all dynamic paths through a program
LLVM sub-projects
• Polly
- implements a suite of cache-locality optimizations as well as auto-parallelism and vectorization using a polyhedral model
• libclc
- implements the OpenCL standard library
• SAFECode
- is a memory safety compiler for C/C++ programs
- It instruments code with run-time checks to detect memory safety errors (e.g. buffer overflows) at run time
• LLD (Linker)
• ... to be continued ...
VMKit: a substrate for virtual machines
[Diagram layers, top to bottom: MMTk, Classpath, Mono, Pnetlib; LLVM JIT, POSIX; operating system; hardware]

http://vmkit.llvm.org/tuto/VMKit_pres_eng.pdf
LLJVM
[Diagram layers, top to bottom: LLVM bytecode; LLJVM Runtime Library; JVM; operating system; hardware]

http://da.vidr.cc/projects/lljvm/
LLVM
LLVM use in Open Source OSes
• Minix moved to Clang as default compiler
• FreeBSD is working on ClangBSD
• LLVM a hard dependency for Gallium3D
• Building Debian with Clang
• Unsupported GCC Flags / C Extensions

Source: http://llvm.org/devmtg/2012-04-12/Slides/Mark_Charlebois.pdf
Use-case #1 - New Compiler
• Cling - CERN
• CtoVerilog - Haifa University
• OpenCL - AMD
• Click - Ericsson
• EDGtoLLVM - ARM
• Delphi XE- Embarcadero
• Jaguar - Cray
OpenCL
OpenCL
OpenGL
Before LLVM…
• Custom JIT:
– Very fragile, hard to understand and change (hex opcodes)
• OpenGL Interpreter:
– JIT didn’t support all OpenGL features: fallback to interpreter
– Interpreter was very slow, 100x or worse than JIT
[Diagram: GLSL text ➔ OpenGL parser ➔ OpenGL AST ➔ custom JIT or interpreter ➔ GFX card]
OpenGL
1. Translate OpenGL AST into LLVM call instructions: one per operation
2. Use the LLVM inliner to inline opcodes from precompiled bytecode
3. Optimize/codegen as before
[Diagram: GLSL text ➔ OpenGL parser ➔ OpenGL AST ➔ OpenGL to LLVM ➔ LLVM IR ➔ LLVM inliner ➔ optimize, codegen ➔ PPC / X86]

GLSL text:
...
vec3 viewVec = normalize(-ecPosition);
float diffuse = max(dot(lightVec, tnorm), 0.0);
...

LLVM IR after translation (one call per operation):
...
%tmp1 = call opengl_negate(%ecPosition)
%viewVec = call opengl_normalize(%tmp1);
%tmp2 = call opengl_dot(%lightVec, %tnorm)
%diffuse = call opengl_max(%tmp2, 0.0);
...

LLVM IR after the inliner:
...
%tmp1 = sub <4 x float> <0,0,0,0>, %ecPosition
%tmp3 = shuffle <4 x float> %tmp1, ...;
%tmp4 = mul <4 x float> %tmp3, %tmp3
...

http://llvm.org/
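The three steps can be mimicked in plain C to show why they pay off. In the sketch below each shader operation is a tiny function (step 1 emits one call per operation), and the second version shows by hand what remains once those bodies have been inlined and simplified (steps 2 and 3). The `vec4` type and the function bodies are assumptions for illustration; only the `opengl_dot` and `opengl_max` names echo the slide.

```c
#include <assert.h>

typedef struct { float x, y, z, w; } vec4;

/* "Precompiled opcodes": one small function per shader operation. */
static float opengl_dot(vec4 a, vec4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

static float opengl_max(float a, float b) {
    return a > b ? a : b;
}

/* Step 1: the translator emits one call per operation in the AST. */
float diffuse_calls(vec4 lightVec, vec4 tnorm) {
    return opengl_max(opengl_dot(lightVec, tnorm), 0.0f);
}

/* Steps 2 and 3, by hand: bodies inlined, then optimized as ordinary code. */
float diffuse_inlined(vec4 l, vec4 n) {
    float d = l.x * n.x + l.y * n.y + l.z * n.z + l.w * n.w;
    return d > 0.0f ? d : 0.0f;
}
```

The translator stays trivial because it only emits calls, while all target-specific cleverness comes from the general-purpose inliner and optimizer.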
Use-case #2 - Optimization
ActionScript3 + LLVM? (Alchemy Project)
Flash ActionScript (.as3) ➔ ActionScript Bytecode (.abc) ➔ LLVM Bytecode (.bc) ➔ .abc ➔ .swf

Simple AS3 function
function CopyMatrix(B:Array, A:Array):void
{
  var M:uint = A.length;
  var N:uint = A[0].length;
  var remainder:uint = N & 3;  // N mod 4
  for (var i:uint=0; i<M; i++)
  {
    var Bi:Array = B[i];
    var Ai:Array = A[i];
    for (var j:uint=0; j<remainder; j++)
      Bi[j] = Ai[j];
    for (j=remainder; j<N; j+=4)
    {
      Bi[j] = Ai[j];
      Bi[j+1] = Ai[j+1];
      Bi[j+2] = Ai[j+2];
      Bi[j+3] = Ai[j+3];
    }
  }
}
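For readers more at home in C, the same manual unrolling looks like this. It is a direct transliteration under two assumptions that the ActionScript version does not need: the matrix is stored row-major in one flat array, and the element type is `double`. The first inner loop copies the `N mod 4` leftover elements so that the second loop can always move four at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Copies an M x N row-major matrix A into B, unrolling the inner loop by 4. */
void copy_matrix(double *B, const double *A, size_t M, size_t N) {
    assert(A != NULL && B != NULL);
    size_t remainder = N & 3;                 /* N mod 4 */
    for (size_t i = 0; i < M; i++) {
        double *Bi = B + i * N;               /* row i of the destination */
        const double *Ai = A + i * N;         /* row i of the source */
        size_t j;
        for (j = 0; j < remainder; j++)       /* leftover elements first */
            Bi[j] = Ai[j];
        for (j = remainder; j < N; j += 4) {  /* then four per iteration */
            Bi[j]     = Ai[j];
            Bi[j + 1] = Ai[j + 1];
            Bi[j + 2] = Ai[j + 2];
            Bi[j + 3] = Ai[j + 3];
        }
    }
}
```

Since `N - remainder` is a multiple of four, the unrolled loop never reads or writes past the end of a row.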
As ABC
function CopyMatrix(Array,Array):void
/* disp_id=45 method_id=0 */
{
  // local_count=10 max_scope=1 max_stack=5 code_len=210
  0   getlocal0
  1   pushscope
  2   pushnull
  3   coerce       Array
  5   setlocal     7
  7   pushnull
  8   coerce       Array
  10  setlocal     8
  12  pushbyte     0
  14  convert_u
  15  setlocal     9
  17  getlocal2
  18  getproperty  length
  20  convert_u
  21  setlocal3
  22  getlocal2
  23  pushbyte     0
  25  getproperty  null
  27  getproperty  length
  29  convert_u
  30  setlocal     4
  32  getlocal     4
  34  pushbyte     3
  36  bitand
  37  convert_u
  38  setlocal     5
  40  pushbyte     0
  42  convert_u
  43  setlocal     6
  45  jump         L1
  ...
}
ActionScript3 + LLVM?
As BC
; ModuleID = 'SparseCompRow'
declare avm2val @avm2_getproperty(...) readonly
declare void @avm2_setproperty(...)
declare avm2val @avm2_coerce(...) readnone
define avm2val @GO_m6_CopyMatrix(avm2val, avm2val, avm2val) {
bb_m6_b0_0_:
%i = call avm2val (...)* @avm2_getproperty( avm2val %2, avm2ref bitcast (i32 24 to avm2ref) )
; <avm2val> [#uses=1]
%i41 = add i32 0, 0
; <i32> [#uses=1]
%i42 = call avm2val @avm2box_i32( i32 %i41 )
; <avm2val> [#uses=1]
%i1 = call avm2val (...)* @avm2_pushbyte( i32 0 )
; <avm2val> [#uses=0]
%i2 = call avm2val (...)* @avm2_getproperty( avm2val %2, avm2val %i42, avm2ref bitcast (i32 5 to avm2ref) )
; <avm2val> [#uses=1]
%i3 = call avm2val (...)* @avm2_getproperty( avm2val %i2, avm2ref bitcast (i32 58 to avm2ref) )
; <avm2val> [#uses=1]
%i4 = call avm2val (...)* @avm2_convert_u( avm2val %i3 )
; <avm2val> [#uses=3]
%i43 = add i32 3, 0
; <i32> [#uses=2]
%i44 = call avm2val @avm2box_i32( i32 %i43 )
; <avm2val> [#uses=6]
%i88 = call double @avm2unbox_double( avm2val %i44 )
; <double> [#uses=1]
%i84 = call double @avm2unbox_double( avm2val %i44 )
; <double> [#uses=1]
%i5 = call avm2val (...)* @avm2_pushbyte( i32 3 )
; <avm2val> [#uses=0]
%i45 = call i32 @avm2unbox_i32( avm2val %i4 )
; <i32> [#uses=1]
%i46 = call i32 @avm2unbox_i32( avm2val %i44 )
; <i32> [#uses=0]
%i47 = and i32 %i45, %i43
; <i32> [#uses=1]
%i48 = call avm2val @avm2box_i32( i32 %i47 )
; <avm2val> [#uses=1]
%i6 = call avm2val (...)* @avm2_bitand( avm2val %i4, avm2val %i44 )
; <avm2val> [#uses=0]
%i7 = call avm2val (...)* @avm2_convert_u( avm2val %i48 )
; <avm2val> [#uses=3]
%i63 = call i32 @avm2unbox_i32( avm2val %i7 )
; <i32> [#uses=1]
%i8 = call avm2val (...)* @avm2_pushuint( i32 0 )
; <avm2val> [#uses=4]
%i53 = call i32 @avm2unbox_i32( avm2val %i8 )
; <i32> [#uses=1]
%i49 = call i32 @avm2unbox_i32( avm2val %i8 )
; <i32> [#uses=1]
br label %bb_m6_b1_0_
; ...

Use-case #3 - Cross Language
Crack Scripting Language
➔ C/C++/Java-like Scripting Language
➔ Speed of a compiled language,
ease of use of a scripting language
➔ Unladen Sparrow(Python), Rubinius(Ruby), V8(JS)

cf. PNaCl (Portable Native Client)
Use-case #3 - Emscripten
compiles LLVM bitcode into JavaScript
➔ C/C++ to JavaScript
➔ can be run on the web
➔ Python, the Bullet physics engine, eSpeak (TTS)

Everything compiles into LLVM bitcode
The web is everywhere, and runs JavaScript
Compiling LLVM bitcode to JavaScript lets us run ~ everything, everywhere

https://github.com/kripken/emscripten/wiki
asm.js

https://www.youtube.com/watch?v=XsyogXtyU9o
asmjs.org
References
• Official Site
- http://llvm.org
- Online Demo: http://llvm.org/demo
• Developer Meetings
- Sponsored by Apple, Google, Adobe, Qualcomm, AMD, CERN, Sony Pictures, Intel
• LLVM Users
- Adobe, Apple, Cray, Google, Electronic Arts, NVIDIA, Siemens, Sun Microsystems, XMOS ...
- Objective Modula-2, IcedTea, PyPy, iPhone tool chain, IOQuake3, LLVM D, Mono, MacRuby, Pure, LLVM-Lua
- CMU, ETH Zurich, NYU, Stanford ...
Surprising facts
• First production JIT compiler for C-based languages
• Clang/LLVM have fully replaced GCC in Xcode 5
• Used on both major mobile platforms: iOS and Android
• Most GPU compute languages (OpenCL, CUDA, Renderscript) use LLVM
• First complete C++11 implementation: language + library
• First ARM64 compiler in production (iPhone 5s)
• 2012 ACM Software System Award
Hazard in LLVM
[Bar chart: crash errors (y-axis, 0 to 25) by LLVM version (x-axis, 1.9 to 2.8)]
Bugs fixed…
From March 2008 to present:
170 bugs fixed + 3 reported but not yet fixed
(57 wrong-code bugs, 116 crash bugs)

• 2.4 revision simplifies “(a > 13) & (a == 15)” into “(a > 13)”
• 2.8 revision folds “((x == x) | (y & z)) == 1” into 0
• 2.8 revision reduces this loop into “i = 1”
for (i = 0; i < 5; i++)
{
  if (i) continue;
  if (i) break;
}
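Each of the three wrong-code bugs can be checked against what the C language actually requires. The sketch below evaluates the original expressions and the loop directly; the function names are invented for this illustration. A correct compiler must give 0 for the first expression at a = 14 (the buggy simplification gives 1), 1 for the second when y & z is 0 (the buggy fold gives 0), and i equal to 5 after the loop (the buggy reduction gives 1).

```c
#include <assert.h>

/* Original expression from the first bullet. */
int expr_a_original(int a) { return (a > 13) & (a == 15); }

/* What the buggy simplification computes instead. */
int expr_a_simplified(int a) { return (a > 13); }

/* Original expression from the second bullet; x == x is always 1. */
int expr_b_original(int x, int y, int z) {
    return ((x == x) | (y & z)) == 1;
}

/* The loop from the third bullet; returns the final value of i. */
int loop_final_i(void) {
    int i;
    for (i = 0; i < 5; i++) {
        if (i) continue;   /* taken for i = 1..4, so the loop keeps running */
        if (i) break;      /* never taken: i is 0 whenever this is reached */
    }
    return i;
}
```

Small self-checking programs like this are also how such bugs get reported: the expected value follows from the language rules, so any other result is a compiler error.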
More to come…
• More complete Windows support
• More effective profile-guided optimization
• Improved usability, parallelization for LTO
• Improved auto-vectorization
• Improved debugging support
• State-of-the-art pointer analysis
Conclusion
• LLVM is
- still evolving...
- not omnipotent
- maybe a shortcut to a new compiler
- a new strategy for mobile
- an alternative solution for HW emulation or VMs
- able to mix with other languages
- in synergy with OpenCL and OpenGL
- a new opportunity through the LLVM projects


차세대컴파일러, VM의미래: 애플 오픈소스 LLVM

  • 1.
    차세대 컴파일러 VM의 미래 애플오픈소스 프로젝트 LLVM twitter.com/@godrm facebook.com/godrm
  • 2.
  • 3.
    Mac OS X MacOSX iOS의 아버지
  • 4.
  • 5.
  • 6.
    JVM vs Objective-CRuntime Java Java Obj-C Obj-C App. App. App. App. Runtime JVM 운영체제 하드웨어
  • 8.
  • 9.
  • 10.
    아이디어의 시작은… • JVMs doall optimizations online at JIT time: - Hugely redundant across runs - Applications launch slowly - What if we could do heavy lifting at install time? ! • Problem: Java bytecode is too limiting! - Memory safety prevents some optzns (eg. bounds checks) - JVM type system doesn’t lend itself to machine optzns
  • 11.
    “ With some sortof low level virtual machine, we could optimize better and a JIT compiler would have to do less work online! ”
  • 12.
    Introduction • LLVM - Low-Level VirtualMachine • An Infrastructure for Multi-stage Optimization - by Chris Arthur Lattner @2002 • Design and Implementation of a compiler infrastructure - support a unique multi-stage optimization system - support inter-procedural and profile-driven optimizations • LLVM virtual instruction (IR) - with high-level type information • Sponsored by APPLE
  • 13.
    LLVM Vision andApproach • Primary mission: Build a set of modular compiler components: - Reduces the time & cost to construct a particular compiler - Components are shared across different compilers - Allows choice of the right component for the job • Secondary mission: Build compilers out of these components X86 PPC CBE clang GCC LTO DWARF Code Target gen JIT Optzn linker IPO BC IO LL IO System Core Support xforms analysis GC ...
  • 14.
    Authors • Vikram Adve - AtUniversity of Illinois • Chris Lattner - LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation @2004 - Macroscopic Data Structure Analysis and Optimization @2005 Ph.D Thesis - work for Apple from 2007 • Related Publications - 15 < @2007, 30 ~ @2008, 50 ~ @2009, 30 ~ @2010
  • 15.
    LLVM Timeline LLVM 1.4@Dec. 2004 WWDC @June. 2008 Clang C @Oct. 2006 LLVM 2.6 @Oct. 2009 LLVM 3.2 @Dec. 2012 LLVM 3.0 @Dec. 2011 LLVM 2.8 @Oct. 2010 LLVM 2.0 @May. 2007 LLVM 2.4 @Nov. 2008 LLVM 3.4 @Jan. 2014 LLVM 1.0 @Oct. 2003 - LTO/LTA Open @July. 2007 osxdev.org
  • 16.
    Compiler Architectures C" C++" Objec)ve+C" GCC#4.2# 프론트엔드# Tree*SSA# 최적화기# 코드 생성기# 실행파일" LLVM# 코드생성기# 실행파일" LLVM# 코드 생성기# 실행파일" GCC"4.2" C" C++" Objec)ve+C" GCC#4.2# 프론트엔드# LLVM# 최적화기# LLVM+GCC"4.2" C" C++" Objec)ve+C" Clang# 프론트엔드# LLVM# 최적화기# LLVM" osxdev.org
  • 17.
    The Architecture Libraries Compiler FE1 . . Compiler FE N LLVM .o files LLVM LLVM Native CodeGen Offline Reoptimizer exe & LLVM exe Linker IPO / IPA LLVM exe LLVM CPU exe Link Time Runtime Optimizer LLVM JIT Compile Time Profile & Trace Info Profile Info LLVM Run-Time
  • 18.
  • 19.
    Virtual Instruction Set for( i=0; i<N; ++i) sum(&A[i], &P); loop: %i.1 = phi int [ 0, %bb0 ], [ %i.2, %loop ] %AiAddr = getelementptr float* %A, int %i.1 call void %sum(float %AiAddr, %pair* %P) %i.2 = add int %i.1, 1 %tmp.4 = setlt int %i.1, %N br bool %tmp.4, label %loop, label %outloop http://llvm.org/docs/LangRef.html
  • 20.
    LLVM Demo #include <stdio.h> #include<stdlib.h> ! int factorial(int X) { if (X == 0) return 1; return X*factorial(X-1); } ! int main(int argc, char **argv) { printf("%dn", factorial(atoi(argv[1]))); } http://llvm.org/demo
  • 21.
  • 22.
    Summary of theoptimization • Analysis Passes (~50) - Basic Alias Analysis - Basic CallGraph construction - Count alias analysis query response - Dominator Tree Construction - Counts the various types of Instructions - Loop Dependence Analysis - Track pointer bounds • Transform Passes (~70) - Remove redundant conditional branches - Aggressive Dead Code Elimination - Dead Code Elimination - Deduce function attributes - Unroll loops - Optimize use of memcpy and friends - Strip debug info for unused symbols • Utility Passes (~10) - Dead Argument Hacking - View CFG of function http://llvm.org/docs/Passes.html
  • 23.
  • 24.
  • 25.
    Compile Time Performance • Problem:System headers are huge! - Carbon.h contains: ✴ 558 files ✴ 12.3 megabytes ✴ 10,000 function declarations ✴ 2000 structure definitions, 8000 fields ✴ 3000 enum definitions, 20000 enum constants ✴ 5000 typedefs ✴ 2000 file scoped variables ✴ 6000 macros
  • 26.
    AppKit Framework Compile AppKitFramework Compile Times Time Compile Time (minutes) GCC 4.2 LLVM-GCC 24.5 16.0 35% Faster 5% Faster 4.2 -Os 3.9 -O0 -g
  • 27.
    Full Build Speedup(Debug Option) Full Build Speedup (Debug Config) 3x 2.5x 2.3x 2.2x 2.2x 1.9x 1.8x 2.0x 1.8x 1.9x BlastApp 2x 2.1x Growl 2.1x 1.6x 1.4x 1x Sketch Adium Xcode iCal AppleScript Edit Preview Message (Mail) Interface Builder Dashcode Automator AddressBook 0x Full Build is 2x Faster!
  • 28.
    Clang Front-end C, Objective-Cand C++ front-end • Aggressive project with many goals... • - Compatibility with GCC - Fast compilation - Expressive error messages (static analysis) ! t.c:6:49: error: invalid operands to binary expression (‘int’ and ‘struct A’)! ! return intArg + func(intArg ? ((someA.X+40) + someA) / 42 : someA.X));! ~~~~~~~~~~~~~ ^ ~~~~~ ! • Host for a broad range of source-level tools http://llvm.org/devmtg/2008-08/Kremenek_StaticAnalyzer.pdf
  • 29.
    Clang Front-end Analyzer XcodeIntegration C, Objective-C and C++ front-end • Aggressive project with many goals... • - Compatibility with GCC - Fast compilation - Expressive error messages (static analysis) ! t.c:6:49: error: invalid operands to binary expression (‘int’ and ‘struct A’)! ! return intArg + func(intArg ? ((someA.X+40) + someA) / 42 : someA.X));! ~~~~~~~~~~~~~ ^ ~~~~~ ! • Host for a broad range of source-level tools Improving Your Application with the Xcode Static Analyzer North Beach Tuesday 5:00PM http://llvm.org/devmtg/2008-08/Kremenek_StaticAnalyzer.pdf
  • 30.
    Better Diagnosis ofthe Problem Better Diagnosis of the Problem GCC 4.2 Clang test.m:4:1: error: unknown type name 'NSString' NSString *P = @"good stuff"; ^
  • 31.
    libclang • Clang is notjust a great compiler... - also a library for processing source code ✴ ✴ Resolves identifiers and symbols ✴ Expands macros ✴ • Translates text into AST Makes implicit information explicit Features - Parsing - Indexing and cross-referencing - Syntax highlighting - Code completion NEW
  • 32.
    libclang • Clang is notjust a great compiler... - also a library for processing source code ✴ ✴ Resolves identifiers and symbols ✴ Expands macros ✴ • Translates text into AST Makes implicit information explicit Features - Parsing - Indexing and cross-referencing - Syntax highlighting - Code completion NEW
  • 33.
    ARC • NEW Automatic Reference Coun AutomaticReference Counting - Automatic memory management of Objective-C objects - Just Compile-time, Not Run-time Not Garbage-Collector • Migration Tool in Xcode 4.2 • - with LLVM 3.0 - build-settings : -fobjc-arc (cf. -fno-objc-arc) • New Rules - remove dealloc, retain/release/autorelease ✴ can still use CFRetain / CFRelease in CF - Can’t use NSAllocateObject / NSDeallocateObject - Can’t use object pointer in C Structures - no casual casting id -> void* - Can’t use NSAutoreleasePool -> @autoreleasepool - Can’t use memory zone (NSZone) - Can’t give a property name with new- osxdev.org
  • 34.
    libc++ • Another C++ standardlibrary? - http://libcxx.llvm.org • The C++0x spec introduces several fundamentally new ideas - Move semantics - Perfect forwarding - Variadic templates • New language features - C++03 implementation from the beginning - driven several low-level design decisions. NEW
  • 35.
    DragonEgg • gcc plugin - replacesgcc’s optimizers and code generators - reimplementation of llvm-gcc. with gcc-4.5 or later • Current Status (v2.8) - C works well, C++ works fairly well - can compile a small amount of Obj-C/C++ - Limited debug info - Requires patching gcc - Only supports x86-32/64 - Only supports linux/darwin ex) gcc hello.c -S -O1 -o -fplugin=./dragonegg.so NEW
  • 37.
    LLDB NEW • Next-generation! • &High-performance Debugger! • a set of reusable components in LLVM! • Clang expression parser! • LLVM disassembler! • C/C++, Objective-C! • Efficient Multi-threading, symbol manager! • Extension - Python script! • Support Remote protocol/debug server
  • 38.
    Reusability Reusability Extensibility Reusability Extensibility Reusability Extensibility Reusability Extensibility lldb lldblldb lldb Xcode Xcode 44 44 Xcode Xcode Python Python Python Python LLDB.framework LLDB.framework LLDB.framework LLDB.framework LLDB Core LLDB Core LLDB Core LLDB Core Process Dynamic Dynamic Object FIles Object Symbols Disassembly Process Process Dynamic Object FIles Object Symbols Disassembly Process Dynamic Object FIles FIlesObject Dynamic Object FIles Object Object Symbols Symbols Disassembly Disassembly Process Object Symbols Disassembly Loader Containers Loader Loader Containers Containers Loader Containers Loader Containers Mac OS X GDB Remote Darwin Mach-O Universal DWARF ELF BSD Archive ObjectFile LLVM
  • 39.
    Introduction GDB % gdb a.out (gdb)break main Breakpoint 1 at 0x100000f33:file main.c line4 (gdb) run LLDB % lldb a.out (lldb) breakpoint set --name main Breakpoint created:1:name=‘main’, locations=1 (lldb) process launch
  • 40.
    Introduction GDB (gdb) info args argc= 1 argv = (const char **) 0x7fff5fbff550 (gdb) info locals i = 32767 LLDB (lldb) frame variable argc = 1 argv = 0x00007fff5fbfff68 i=0
  • 41.
    Expression in LLDB LLDB (lldb)expression x+y->getCount() (int) $0 = 2 (lldb) expression pt (struct point_tag) $1 = { (int) x = 2 (int) y = 3 } (lldb) expression $1.x (int) $2 = 2
  • 42.
    LLDB Command Syntax CommandSyntax <noun> <verb> [-options [option-value]] [argument [argument...]] Uses standard getopt_long() for predicate behavior (lldb) process launch a.out --stop-at-entry (lldb) process launch a.out -- --arg0 --arg1 (lldb) process launch a.out -st Options know which other options they are compatible with (lldb) process attach --pid 123 --name a.out
  • 43.
    Common Commands GDB (gdb) ^C (gdb)signal 2 (gdb) info break (gdb) continue (gdb) step (gdb) stepi (gdb) next (gdb) nexti (gdb) finish (gdb) info threads (gdb) backtrace LLDB (lldb) process interrupt (lldb) process signal SIGINT (lldb) breakpoint list (lldb) process continue (lldb) thread step-in (lldb) thread step-inst (lldb) thread step-over (lldb) thread step-over-inst (lldb) thread step-out (lldb) thread list (lldb) thread backtrace
  • 44.
    Common Commands GDB LLDB (gdb) ^C (gdb)signal 2 (gdb) in br (gdb) c (gdb) s (gdb) si (gdb) n (gdb) ni (gdb) f (gdb) info threads (gdb) bt (lldb) pro int (lldb) pro s SIGINT (lldb) br l (lldb) c (lldb) s (lldb) si (lldb) n (lldb) ni (lldb) f (lldb) th l (lldb) bt
  • 45.
  • 46.
    Call for help! Callfor help OSS community needs to unite work on various scripting languages – Common module to represent/type infer an arbitrary dynamic language – Who will provide this? pypy? parrot? llvm itself someday (“hlvm”)? – Ruby Python Perl Javascript ... Common Dynamic Language Representation + Type Inference C, C++, Ada, ... GLSL, ARB VP, ... llvm-gcc OpenGL What Next? LLVM LTO JIT Install Time Cross Lang Debugger Codegen Optzn Support IPO http://llvm.org/
  • 47.
    LLVM sub-projects Clang &LLDB • OpenMP • Compiler-rt • - provides highly tuned implementations of the low-level code generator. • cf. libgcc VMKit - is an implementation of the Java and .NET Virtual Machines • KLEE - implements a "symbolic virtual machine" - which uses a theorem prover to try to evaluate all dynamic paths through a program
  • 48.
    LLVM sub-projects • Polly - implementsa suite of cache-locality optimizations as well as auto-parallelism and vectorization using a polyhedral model. • libclc - implement the OpenCL standard library. • SAFECode - is a memory safety compiler for C/C++ programs. - It instruments code with run-time checks to detect memory safety errors (ex. buffer overflows) at run-time. LLD (Linker) • ... to be continue ... •
  • 49.
    VMKit: a substrate for virtual machines
    [Figure: layered stack with MMTk, Classpath, Mono, Pnetlib, LLVM JIT, and POSIX above the operating system and hardware]
    http://vmkit.llvm.org/tuto/VMKit_pres_eng.pdf
  • 50.
  • 51.
  • 52.
    LLVM use in Open Source OSes
    • Minix moved to Clang as default compiler
    • FreeBSD is working on ClangBSD
    • LLVM is a hard dependency for Gallium3D
    • Building Debian with Clang
    • Unsupported GCC Flags / C Extensions
    Source: http://llvm.org/devmtg/2012-04-12/Slides/Mark_Charlebois.pdf
  • 53.
    Use-case #1 - New Compiler
    • Cling - CERN
    • CtoVerilog - Haifa University
    • OpenCL - AMD
    • Click - Ericsson
    • EDGtoLLVM - ARM
    • Delphi XE - Embarcadero
    • Jaguar - Cray
  • 54.
  • 55.
  • 56.
    OpenGL: before LLVM…
      - Very fragile, hard to understand and change (hex opcodes)
    • OpenGL Interpreter:
      - JIT didn't support all OpenGL features: fallback to interpreter
      - Interpreter was very slow, 100x or worse than JIT
    [Figure: boxes labeled GLSL Text, OpenGL Parser, OpenGL AST, Custom JIT, Interpreter, GFX Card]
  • 57.
    OpenGL
    1. Translate OpenGL AST into LLVM call instructions: one per operation
    2. Use the LLVM inliner to inline opcodes from precompiled bytecode
    3. Optimize/codegen as before
    GLSL Text:
        ...
        vec3 viewVec = normalize(-ecPosition);
        float diffuse = max(dot(lightVec, tnorm), 0.0);
        ...
    LLVM IR (after OpenGL to LLVM):
        ...
        %tmp1 = call opengl_negate(ecPosition)
        %viewVec = call opengl_normalize(%tmp1);
        %tmp2 = call opengl_dot(%lightVec, %tnorm)
        %diffuse = call opengl_max(%tmp2, 0.0);
        ...
    LLVM IR (after LLVM Inliner):
        ...
        %tmp1 = sub <4 x float> <0,0,0,0>, %ecPosition
        %tmp3 = shuffle <4 x float> %tmp1, ...;
        %tmp4 = mul <4 x float> %tmp3, %tmp3
        ...
    [Figure: boxes labeled GLSL Text, OpenGL Parser, OpenGL AST, OpenGL to LLVM, LLVM IR, LLVM Inliner, LLVM Optimizer, LLVM JIT, Optimize/Codegen (PPC, X86), GFX Card]
    http://llvm.org/
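    The first two steps (one call instruction per AST operation, then inlining of known opcode bodies) can be sketched with a toy model. This is only an illustration: instructions are plain strings rather than real LLVM IR, the tuple-based AST and the BODIES table are hypothetical, and only the negate opcode has an inlinable body here.

```python
from itertools import count


def lower(expr, calls, fresh):
    """Step 1: emit one call per AST operation; return the result name."""
    if isinstance(expr, str):  # a leaf such as "ecPosition"
        return expr
    op, *operands = expr
    args = [lower(operand, calls, fresh) for operand in operands]
    dest = f"%tmp{next(fresh)}"
    calls.append((dest, f"opengl_{op}", args))
    return dest


# Hypothetical stand-in for "precompiled bytecode": callee -> inlined body.
BODIES = {
    "opengl_negate": lambda dest, args: [f"{dest} = sub <0,0,0,0>, {args[0]}"],
}


def inline(calls):
    """Step 2: replace calls that have a known body; keep the rest as calls."""
    result = []
    for dest, callee, args in calls:
        if callee in BODIES:
            result.extend(BODIES[callee](dest, args))
        else:
            result.append(f"{dest} = call {callee}({', '.join(args)})")
    return result


def compile_expr(expr):
    calls = []
    lower(expr, calls, count(1))
    return inline(calls)


if __name__ == "__main__":
    # normalize(-ecPosition), written as a nested tuple AST
    for line in compile_expr(("normalize", ("negate", "ecPosition"))):
        print(line)
```

    After inlining, a general-purpose optimizer sees ordinary arithmetic instead of opaque calls, which is what makes step 3 ("optimize/codegen as before") effective.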
  • 58.
    Use-case #2 - Optimization
    Flash ActionScript (.as3) ➔ ActionScript Bytecode (.abc) ➔ LLVM Bytecode (.bc) ➔ .abc ➔ .swf
  • 59.
    ActionScript 3 + LLVM? Alchemy Project
    Simple AS3 function
        function CopyMatrix(B:Array, A:Array):void {
          var M:uint = A.length;
          var N:uint = A[0].length;
          var remainder:uint = N & 3;  // N mod 4;
          for (var i:uint=0; i<M; i++) {
            var Bi:Array = B[i];
            var Ai:Array = A[i];
            for (var j:uint=0; j<remainder; j++)
              Bi[j] = Ai[j];
            for (j=remainder; j<N; j+=4) {
              Bi[j] = Ai[j];
              Bi[j+1] = Ai[j+1];
              Bi[j+2] = Ai[j+2];
              Bi[j+3] = Ai[j+3];
            }
          }
        }
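    The function above copies each row with a loop unrolled by four: the first N & 3 elements are copied one at a time, and the remaining count is then a multiple of four. A Python transliteration (an illustrative port, not part of the original slides) makes it easy to check that the unrolled version copies every element.

```python
def copy_matrix(B, A):
    """Copy A into B row by row, with the inner loop unrolled by four."""
    M = len(A)
    N = len(A[0])
    remainder = N & 3  # N mod 4
    for i in range(M):
        Bi, Ai = B[i], A[i]
        # Copy the leftover elements first, one at a time.
        for j in range(remainder):
            Bi[j] = Ai[j]
        # N - remainder is a multiple of 4, so j+3 never runs past the row.
        for j in range(remainder, N, 4):
            Bi[j] = Ai[j]
            Bi[j + 1] = Ai[j + 1]
            Bi[j + 2] = Ai[j + 2]
            Bi[j + 3] = Ai[j + 3]


if __name__ == "__main__":
    A = [[row * 10 + col for col in range(7)] for row in range(3)]
    B = [[0] * 7 for _ in range(3)]
    copy_matrix(B, A)
    print(B == A)
```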
  • 60.
    As ABC
        function CopyMatrix(Array,Array):void /*disp_id=45 method_id=0 */
        {
          // local_count=10 max_scope=1 max_stack=5 code_len=210
          0   getlocal0
          1   pushscope
          2   pushnull
          3   coerce Array
          5   setlocal 7
          7   pushnull
          8   coerce Array
          10  setlocal 8
          12  pushbyte 0
          14  convert_u
          15  setlocal 9
          17  getlocal2
          18  getproperty length
          20  convert_u
          21  setlocal3
          22  getlocal2
          23  pushbyte 0
          25  getproperty null
          27  getproperty length
          29  convert_u
          30  setlocal 4
          32  getlocal 4
          34  pushbyte 3
          36  bitand
          37  convert_u
          38  setlocal 5
          40  pushbyte 0
          42  convert_u
          43  setlocal 6
          45  jump L1
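    The instructions at offsets 32 to 38 (getlocal 4, pushbyte 3, bitand, convert_u, setlocal 5) are the stack-machine form of "remainder = N & 3". A tiny interpreter for just those five opcodes shows how the operand stack is used. The semantics here are a simplified assumption for illustration, not the AVM2 specification; convert_u is approximated by masking to 32 bits.

```python
def run(program, local_vars):
    """Execute a list of (opcode, operand...) tuples on an operand stack."""
    stack = []
    for op, *operand in program:
        if op == "getlocal":
            stack.append(local_vars[operand[0]])
        elif op == "pushbyte":
            stack.append(operand[0])
        elif op == "bitand":
            right = stack.pop()
            left = stack.pop()
            stack.append(left & right)
        elif op == "convert_u":
            # Simplified: treat the value as an unsigned 32-bit integer.
            stack.append(stack.pop() & 0xFFFFFFFF)
        elif op == "setlocal":
            local_vars[operand[0]] = stack.pop()
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return local_vars


# Offsets 32..38 of the listing: local 4 holds N, local 5 receives N & 3.
REMAINDER_SNIPPET = [
    ("getlocal", 4),
    ("pushbyte", 3),
    ("bitand",),
    ("convert_u",),
    ("setlocal", 5),
]

if __name__ == "__main__":
    print(run(REMAINDER_SNIPPET, {4: 7}))
```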
  • 61.
    ActionScript 3 + LLVM?
    As BC
        ; ModuleID = 'SparseCompRow'
        declare avm2val @avm2_getproperty(...) readonly
        declare void @avm2_setproperty(...)
        declare avm2val @avm2_coerce(...) readnone
        define avm2val @GO_m6_CopyMatrix(avm2val, avm2val, avm2val) {
        bb_m6_b0_0_:
          %i = call avm2val (...)* @avm2_getproperty( avm2val %2, avm2ref bitcast (i32 24 to avm2ref) ) ; <avm2val> [#uses=1]
          %i41 = add i32 0, 0 ; <i32> [#uses=1]
          %i42 = call avm2val @avm2box_i32( i32 %i41 ) ; <avm2val> [#uses=1]
          %i1 = call avm2val (...)* @avm2_pushbyte( i32 0 ) ; <avm2val> [#uses=0]
          %i2 = call avm2val (...)* @avm2_getproperty( avm2val %2, avm2val %i42, avm2ref bitcast (i32 5 to avm2ref) ) ; <avm2val> [#uses=1]
          %i3 = call avm2val (...)* @avm2_getproperty( avm2val %i2, avm2ref bitcast (i32 58 to avm2ref) ) ; <avm2val> [#uses=1]
          %i4 = call avm2val (...)* @avm2_convert_u( avm2val %i3 ) ; <avm2val> [#uses=3]
          %i43 = add i32 3, 0 ; <i32> [#uses=2]
          %i44 = call avm2val @avm2box_i32( i32 %i43 ) ; <avm2val> [#uses=6]
          %i88 = call double @avm2unbox_double( avm2val %i44 ) ; <double> [#uses=1]
          %i84 = call double @avm2unbox_double( avm2val %i44 ) ; <double> [#uses=1]
          %i5 = call avm2val (...)* @avm2_pushbyte( i32 3 ) ; <avm2val> [#uses=0]
          %i45 = call i32 @avm2unbox_i32( avm2val %i4 ) ; <i32> [#uses=1]
          %i46 = call i32 @avm2unbox_i32( avm2val %i44 ) ; <i32> [#uses=0]
          %i47 = and i32 %i45, %i43 ; <i32> [#uses=1]
          %i48 = call avm2val @avm2box_i32( i32 %i47 ) ; <avm2val> [#uses=1]
          %i6 = call avm2val (...)* @avm2_bitand( avm2val %i4, avm2val %i44 ) ; <avm2val> [#uses=0]
          %i7 = call avm2val (...)* @avm2_convert_u( avm2val %i48 ) ; <avm2val> [#uses=3]
          %i63 = call i32 @avm2unbox_i32( avm2val %i7 ) ; <i32> [#uses=1]
          %i8 = call avm2val (...)* @avm2_pushuint( i32 0 ) ; <avm2val> [#uses=4]
          %i53 = call i32 @avm2unbox_i32( avm2val %i8 ) ; <i32> [#uses=1]
          %i49 = call i32 @avm2unbox_i32( avm2val %i8 ) ; <i32> [#uses=1]
          br label %bb_m6_b1_0_
          ; ...
  • 62.
    Use-case #3 - Cross Language
    • Crack Scripting Language
      ➔ C/C++/Java-like Scripting Language
      ➔ Speed of a compiled language, ease of use of a scripting language
      ➔ Unladen Sparrow (Python), Rubinius (Ruby), V8 (JS)
    cf. PNaCl (Portable Native Client)
  • 63.
    Use-case #3 - Emscripten
    • compiles LLVM bytecode into JavaScript
      ➔ C/C++ to JavaScript
      ➔ can be run on the web
      ➔ Python, the Bullet physics engine, eSpeak (TTS)
    https://github.com/kripken/emscripten/wiki
  • 64.
    Use-case #3 - Emscripten
    • Everything compiles into LLVM bitcode
    • The web is everywhere, and runs JavaScript
    • Compiling LLVM bitcode to JavaScript lets us run ~ everything, everywhere
    https://github.com/kripken/emscripten/wiki
  • 65.
  • 66.
  • 67.
  • 69.
    References
    • Official Site
      - http://llvm.org
      - Online Demo : http://llvm.org/demo
    • Developer Meetings
      - Sponsored by Apple, Google, Adobe, Qualcomm, AMD, CERN, Sony Pictures, Intel
    • LLVM Users
      - Adobe, Apple, Cray, Google, Electronic Arts, NVIDIA, Siemens, Sun Microsystems, XMOS ...
      - Objective Modula-2, IcedTea, PyPy, iPhone tool chain, IOQuake3, LLVM D, Mono, MacRuby, Pure, LLVM-Lua
      - CMU, ETH Zurich, NYU, Stanford ...
  • 70.
    Surprising facts
    • First production JIT compiler for C-based languages
    • Clang/LLVM have fully replaced GCC in Xcode 5
    • Used on both major mobile platforms: iOS and Android
    • Most GPU compute languages (OpenCL, CUDA, Renderscript) use LLVM
    • First complete C++11 implementation: language + library
    • First ARM64 compiler in production (iPhone 5s)
    • 2012 ACM Software System Award
  • 71.
    Hazard in LLVM
    [Chart: crash errors (y-axis, 0 to 25) per LLVM version (x-axis, 1.9 to 2.8)]
  • 72.
    Bugs fixed…
    • From March 2008 to present: 170 bugs fixed + 3 reported but not yet fixed (57 wrong code bugs, 116 crash bugs)
    • 2.4 revision simplifies "(a>13) & (a==15)" into "(a>13)"
    • 2.8 revision folds "((x==x) | (y&z))==1" into 0
    • 2.8 revision reduces this loop into "i=1"
        for (i=0; i<5; i++) {
          if (i) continue;
          if (i) break;
        }
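    Each of the three transformations above changes program behavior, which is why they count as wrong-code bugs. The following sketch checks them by emulating the C expressions with Python integers; it is an illustrative check written for this transcript, not the test harness that found the bugs.

```python
def original_1(a):
    return int(a > 13) & int(a == 15)


def miscompiled_1(a):
    return int(a > 13)


def original_2(x, y, z):
    return int((int(x == x) | (y & z)) == 1)


def original_loop():
    """Emulates: for (i=0; i<5; i++) { if (i) continue; if (i) break; }"""
    i = 0
    while i < 5:
        if i:
            i += 1  # 'continue' in a C for-loop still runs the i++ step
            continue
        if i:
            break
        i += 1
    return i


if __name__ == "__main__":
    # Bug 1: the two forms differ for a = 14 (0 versus 1).
    print(original_1(14), miscompiled_1(14))
    # Bug 2: the expression is 1 for y = z = 0, so it cannot fold to 0.
    print(original_2(7, 0, 0))
    # Bug 3: the loop runs to completion, so i ends at 5, not 1.
    print(original_loop())
```

    For the first case the correct simplification would be "(a==15)", since any a equal to 15 is already greater than 13.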
  • 73.
    More to come…
    • More complete Windows support
    • More effective profile-guided optimization
    • Improved usability, parallelization for LTO
    • Improved auto-vectorization
    • Improved debugging support
    • State-of-the-art pointer analysis
  • 74.
    Conclusion
    • LLVM is
      - still evolving...
      - not omnipotent
      - possibly a shortcut to building a new compiler
      - a new strategy for mobile
      - an alternative solution for HW emulation or VM
      - able to mix with other languages
      - in synergy with OpenCL, OpenGL
      - a new opportunity through LLVM projects
  • 75.