Create Your Own Language
How to implement a language on top of
Erlang Virtual Machine (BEAM)
Hamidreza Soleimani
Backend Developer / Architect @ BisPhone
Tehran Linux User Group
August 6, 2015
Why should we create
a new language?
Lisp
JavaScript
XML
PHP
Python
Haskell
Erlang
Go
Ruby
Java
C
Prolog
Scala
SQL
Elixir
C++
Rust
Perl
C#
Objective-C
There are lots of Languages!
By Category:
Programming, Query, Domain Specific, etc.
By Paradigm:
Imperative, Declarative (Logic, Functional), Structured (Object Oriented, Modular), etc.
By Implementation:
Compiled, Interpreted, Mixed, etc.
By Type System:
Static, Dynamic, Strong, Weak, etc.
So why should we create it?
Reason 1: Implementing a new idea
Reason 2: Solving a new problem
Reason 3: Mastering language concepts
Reason 4: Just for fun!
How can we create
a new language?
1. Designing
1.1. Identifying the problem
1.2. Planning the target platform/machine
1.3. Determining language category, paradigm, types, etc
2. Implementing Frontend
2.1. Lexical Scanning (Token Generating)
(examples: lex, flex, leex, etc.)
2.2. Syntax & Semantics Parsing (AST Generating)
(examples: yacc, bison, yecc)
2.3. Preprocessing
2.4. Lint Analyzing
2.5. Generating Intermediate Language
(examples: GCC IR, LLVM IR, BEAM Bytecode, Java Bytecode)
3. Implementing Backend
3.1. Analyzing Intermediate Language
3.2. Optimizing
3.3. Generating Machine Code
3.4. Evaluating and Executing
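The scan, parse, and evaluate stages above can be tried on Erlang expressions themselves with the standard library modules erl_scan, erl_parse, and erl_eval. This is only a minimal sketch; the module name pipeline_demo is made up for illustration.

```erlang
%% pipeline_demo.erl (hypothetical module name, for illustration only)
-module(pipeline_demo).
-export([run/1]).

%% Takes Erlang source text such as "1 + 2 * 3." and returns
%% the tokens, the AST, and the evaluated result.
run(Source) ->
    %% Frontend, step 1: lexical scanning (text -> tokens)
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    %% Frontend, step 2: parsing (tokens -> abstract syntax tree)
    {ok, Ast} = erl_parse:parse_exprs(Tokens),
    %% Backend: evaluating the AST directly, with no bindings
    {value, Result, _Bindings} = erl_eval:exprs(Ast, erl_eval:new_bindings()),
    {Tokens, Ast, Result}.
```

Calling pipeline_demo:run("1 + 2 * 3.") returns a tuple whose last element is 7.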
Let's look inside the Erlang
compiler
What is Erlang?
“Erlang is a general-purpose, functional, concurrent,
garbage-collected programming language and
runtime system, with eager evaluation, single
assignment, and dynamic typing. It was originally
designed by Ericsson to support distributed,
fault-tolerant, soft real-time, highly available, non-stop
applications. It supports hot swapping, so that code
can be changed without stopping a system.”
– Holy Wikipedia
Code Generation Steps
Erlang Source Code
Parse Transformed & Preprocessed Code (erlc -P)
Source Transformed Code (erlc -E)
Abstract Syntax Tree (erlc +dabstr)
Expanded Abstract Syntax Tree (erlc +dexp)
Core Erlang (erlc +to_core)
Assembler Code (erlc -S)
BEAM Bytecode (erlc)
Source Code
$ vim test.erl
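The original contents of test.erl are not part of this text. A small stand-in module like the one below is enough to follow the erlc steps; the function double/1 and the macro FACTOR are assumptions made for illustration. The macro is there so that the preprocessed output has something visibly expanded.

```erlang
%% test.erl (stand-in example, not the file used in the talk)
-module(test).
-export([double/1]).

%% A macro, so that preprocessing has something to expand.
-define(FACTOR, 2).

%% Multiplies its argument by ?FACTOR.
double(X) -> X * ?FACTOR.
```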
Parse Transformed Code
$ erlc -P test.erl
Source Transformed Code
$ erlc -E test.erl
AST Code
$ erlc +dabstr test.erl
Expanded AST Code
$ erlc +dexp test.erl
Core Erlang Code
$ erlc +to_core test.erl
Assembler Code
$ erlc -S test.erl
BEAM Bytecode
$ erlc test.erl
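A new language does not have to go through erlc on the command line. One possible route, sketched below, is to build Erlang abstract forms and hand them to compile:forms/1, then load the resulting BEAM binary with code:load_binary/3. The module names forms_demo and tiny_demo_mod are hypothetical, and the forms are produced here by scanning Erlang text only to keep the sketch short; a real frontend would emit them from its own AST.

```erlang
%% forms_demo.erl (hypothetical module name, for illustration only)
-module(forms_demo).
-export([run/0]).

run() ->
    %% One string per form; each must end with a dot.
    Source = ["-module(tiny_demo_mod).",
              "-export([double/1]).",
              "double(X) -> X * 2."],
    Forms = [parse_form(S) || S <- Source],
    %% Abstract forms -> BEAM bytecode, kept in memory as a binary
    {ok, tiny_demo_mod, Beam} = compile:forms(Forms),
    %% Load the bytecode into the running VM and call it
    {module, tiny_demo_mod} =
        code:load_binary(tiny_demo_mod, "tiny_demo_mod.erl", Beam),
    tiny_demo_mod:double(21).

%% Text -> tokens -> one abstract form
parse_form(String) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(String),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.
```

forms_demo:run() returns 42.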
Let's create a query language
for Tnesia
What is Tnesia?
“Tnesia is a time-series data storage which lets you
run time-based queries on a large amount of data,
without scanning the whole set of data, and in a
key-value manner. It can be used embedded inside
an Erlang application, or stand-alone with HTTP
interface to outside which talks in a simple query
language called TQL.”
– https://github.com/bisphone/Tnesia
The Problem
Its Erlang API looks strange to non-Erlang developers!
The Solution
Create a Query Language (TQL) that can be used over HTTP.
TQL Concepts
Syntax Types
Lexical Scanning
Using leex, an Erlang lexer generator
Spec file sections: Definitions, Rules
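A leex spec file might look like the sketch below. The token set (select, from, limit, names, integers) is a made-up mini query language for illustration, not the actual TQL syntax, and the file name mini_scanner.xrl is an assumption.

```erlang
%% mini_scanner.xrl (hypothetical spec, NOT the real TQL scanner)

Definitions.

D  = [0-9]
L  = [a-z_]
WS = [\s\t\n\r]

Rules.

%% Keywords come first: on equal-length matches the earlier rule wins.
select : {token, {select, TokenLine}}.
from   : {token, {from, TokenLine}}.
limit  : {token, {limit, TokenLine}}.
{D}+   : {token, {integer, TokenLine, list_to_integer(TokenChars)}}.
{L}+   : {token, {name, TokenLine, TokenChars}}.
{WS}+  : skip_token.

Erlang code.
```

Running leex:file("mini_scanner.xrl") generates mini_scanner.erl. After compiling it, mini_scanner:string("select from events limit 10") returns {ok, Tokens, EndLine}.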
Parsing
Using yecc, an Erlang parser generator
Grammar file sections: Nonterminals, Terminals, Rootsymbol, Rules
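A yecc grammar file might look like the sketch below. The grammar is a made-up mini query language (select from NAME, with an optional limit N), not the actual TQL grammar, and the file name mini_parser.yrl, the AST shape, and the helper value_of/1 are assumptions for illustration.

```erlang
%% mini_parser.yrl (hypothetical grammar, NOT the real TQL parser)

Nonterminals query clause.
Terminals select from limit name integer.
Rootsymbol query.

%% Each rule builds an AST node of the form {query, Name, Clauses}.
query -> select from name        : {query, value_of('$3'), []}.
query -> select from name clause : {query, value_of('$3'), ['$4']}.

clause -> limit integer : {limit, value_of('$2')}.

Erlang code.

%% Extracts the value from a {Category, Line, Value} token.
value_of({_Category, _Line, Value}) -> Value.
```

Running yecc:file("mini_parser.yrl") generates mini_parser.erl. After compiling it, mini_parser:parse(Tokens) takes a token list and returns {ok, Ast} on success.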
Evaluating
Evaluating AST in Erlang without any intermediate code generation
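Direct evaluation can be as simple as pattern matching on AST nodes. The sketch below assumes a hypothetical AST of the form {query, Name, Clauses} and an in-memory map standing in for the data store; neither is Tnesia's real AST or API.

```erlang
%% mini_eval.erl (hypothetical evaluator, NOT the real TQL evaluator)
-module(mini_eval).
-export([eval/2]).

%% Store is a map from a name (string) to a list of records.
%% Unknown names evaluate to an empty result.
eval({query, Name, Clauses}, Store) ->
    Records = maps:get(Name, Store, []),
    %% Apply each clause in order to narrow the result.
    lists:foldl(fun apply_clause/2, Records, Clauses).

%% {limit, N}: keep at most the first N records.
apply_clause({limit, N}, Records) ->
    lists:sublist(Records, N).
```

For example, mini_eval:eval({query, "events", [{limit, 2}]}, #{"events" => [a, b, c]}) returns [a, b].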
Question || Comment
- https://hamidreza-s.github.com
