What is a programming language?
A programming language is a notational system for
describing computation in a machine-readable and human-readable form.
To build programs, people use languages that are similar
to human language. The results are translated into machine code, which computers understand.
Classification of the languages
Programming languages fall into three broad categories:
Machine languages
Assembly languages
Higher-level languages
In the formal classification, these three categories are
divided into five generations of languages.
Generations of Programming Languages
1st GL: machine codes
2nd GL: symbolic assemblers
3rd GL: (machine-independent) imperative languages (FORTRAN, Pascal, C ...)
4th GL: domain specific application generators
5th GL: AI languages
Each generation is at a higher level of abstraction, i.e. the higher the generation, the less the programmer needs to be aware of how the program works internally at the hardware level.
A Brief Chronology
Early 1950s                                   order codes (primitive assemblers)
1957   FORTRAN                                the first high-level programming language
1958   ALGOL                                  the first modern, imperative language
1960   LISP, COBOL                            Interactive programming; business programming
1962   APL, SIMULA                            the birth of OOP (SIMULA)
1964   BASIC, PL/I
1966   ISWIM                                  first modern functional language (a proposal)
1970   Prolog                                 logic programming is born
1972   C                                      the systems programming language
1975   Pascal, Scheme                         two teaching languages
1978   CSP                                    Concurrency matures
1978   FP                                     Backus proposal
1983   Smalltalk-80, Ada                      OOP is reinvented
1984   Standard ML                            FP becomes mainstream (?)
1986   C++, Eiffel                            OOP is reinvented (again)
1988   CLOS, Oberon, Mathematica
1990   Haskell                                FP is reinvented
1990s  Perl, Python, Ruby, JavaScript         Scripting languages become mainstream
1995   Java                                   OOP is reinvented for the internet
2000   C#
What is Machine Code?
Machine code is the only form of program instructions that
the computer hardware can understand and execute directly. All other forms of computer language must be translated into machine code in order to be executed by the hardware. Machine code consists of many strings of binary digits that are easy for the computer to interpret, but tedious for human beings to interpret. Machine code is different for each type of computer.
A program in machine code for an Intel x86-based PC will not run
on an IBM mainframe computer, and vice versa.
Assembly Language
Assembly language is a symbolic representation of machine
code, which allows programmers to write programs in machine code without having to deal with the long binary strings. For example, the machine code for an instruction that adds two numbers might be 01101110, but in assembly language, this can be represented by the symbol ADD. A simple assembler program translates this symbolic language directly into machine code. Because machine code is specific to each type of computer hardware, assembly languages are also specific to each type of computer. However, all machine languages and assembly languages look very similar, even though they are not interchangeable.
Assembly code -> Assembler -> Object code
High Level Languages
High-level language is a language that is convenient for human
beings to understand. High-level programming languages must be translated into machine code for execution, and this process is called compilation. A program that carries out this translation is a compiler. High-level language may bear no resemblance at all to machine code. The compiler figures out how to generate machine language that does exactly what the high-level-language source program specifies. Languages like C++, Algol, COBOL, etc., are all compiled high-level languages. They usually work more or less the same across all computer types, which makes them much more portable than assembly language.
Higher-Level Languages: Third-Generation Languages
Third-generation languages (3GLs) are the first to use true English-
like phrasing, making them easier to use than previous languages.
3GLs are portable, meaning the source code written for one type of
system can be recompiled for use on a different type of system.
The following languages are 3GLs:
FORTRAN
COBOL
BASIC
Pascal
C
C++
Java
ActiveX
Example :
Higher-Level Languages: Fourth-Generation Languages
Fourth-generation languages (4GLs) are even easier to use
than 3GLs.
4GLs may use a text-based environment (like a 3GL) or may
allow the programmer to work in a visual environment, using graphical tools.
The following languages are 4GLs:
Visual Basic (VB)
VisualAge
Authoring environments
Higher-Level Languages: Fifth-Generation Languages
Fifth-generation languages (5GLs) are a matter of debate
in the programming community; some programmers do not agree that they even exist.
These high-level languages would use artificial
intelligence to create software, making 5GLs extremely difficult to develop.
5GLs solve problems using constraints rather than algorithms
and are used in artificial intelligence.
Example: Prolog
Why do we need assembly language?
Early computer systems were literally programmed by hand. Front panel switches were used to enter instructions and data. These switches represented the address, data and control lines of the computer system. To enter data into memory, the address switches were toggled to the correct address, the data switches were toggled next, and finally the write switch was toggled. This wrote the binary value on the front panel data switches to the address specified. Once all the data and instructions were entered, the run switch was toggled to run the program.
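The switch-by-switch procedure described above can be sketched in a few lines of Python. This is only a toy model: the `FrontPanel` class, its method names, the 256-byte memory and the three example bit patterns are all assumptions made for illustration, not a description of any real machine.

```python
class FrontPanel:
    """Toy model of a switch-programmed computer (hypothetical)."""

    def __init__(self, memory_size=256):
        self.memory = [0] * memory_size
        self.address_switches = 0
        self.data_switches = 0

    def set_address(self, bits):
        # bits is a string such as "00000001", one character per switch
        self.address_switches = int(bits, 2)

    def set_data(self, bits):
        self.data_switches = int(bits, 2)

    def toggle_write(self):
        # Copy the value on the data switches to the selected address
        self.memory[self.address_switches] = self.data_switches


def enter_program(panel, start_address, bit_patterns):
    """Repeat the address / data / write sequence once per byte."""
    for offset, pattern in enumerate(bit_patterns):
        panel.set_address(format(start_address + offset, "08b"))
        panel.set_data(pattern)
        panel.toggle_write()


panel = FrontPanel()
enter_program(panel, 0, ["01001111", "00110110", "01001101"])
print([format(byte, "02X") for byte in panel.memory[:3]])  # ['4F', '36', '4D']
```

Every byte costs three manual steps, and a single wrong switch silently stores a wrong value, which is why this method was slow and error prone.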
The programmer also needed to know the instruction set
of the processor.
Each instruction needed to be manually converted into bit
patterns by the programmer so the front panel switches could be set correctly.
This led to errors in translation as the programmer could
easily misread 8 as the value B.
It became obvious that such methods were slow and
error prone.
With the advent of better hardware which could address
larger memory, and the increase in memory size (due to better production techniques and lower cost), programs were written to perform some of this manual entry.
Small monitor programs became popular, which allowed
entry of instructions and data via hex keypads or terminals.
Additional devices such as paper tape and punched cards
became popular as storage methods for programs.
Programs were still hand-coded, in that the conversion
from mnemonics to instructions was still performed manually. A major breakthrough for programmer productivity was the idea of writing a program to translate another: run by the computer, it would convert the mnemonics into actual instructions. The benefits of such a program would be:
reduced errors
faster translation times
changes could be made more easily and quickly
Assembly language programming is writing machine
instructions in mnemonic form, using an assembler to convert these mnemonics into actual processor instructions and associated data.
So, in essence, the assembler is a translator from the
human world to the machine world.
Mnemonics -> Assembler -> Machine Code
Disadvantages of assembly language programming
the programmer requires knowledge of the processor architecture and instruction set
many instructions are required to achieve small tasks
source programs tend to be large and difficult to follow
programs are machine dependent, requiring complete rewrites if the hardware is changed
Software development process
The Real World Problem -> Logical Solution (On paper) -> Selecting tools (programming language/method) -> Coding -> Testing/Debugging -> Release as final solution
The program translation sequence
Developing a software program to accomplish a particular task involves the following steps. The implementer:
chooses an appropriate language,
develops the algorithm (a sequence of steps which, when carried out in the order prescribed, achieve the desired result),
implements this algorithm in the chosen language (coding),
then tests and debugs the final result.
Software execution process
Machine code or the executable code which the machine understands
Assembly language programming
An assembler provides the following features:
It allows the programmer to use mnemonics when writing source code programs.
Variables are represented by symbolic names, not as memory locations.
Symbolic code is easier to read and follow.
Error checking is provided.
Changes can be quickly and easily incorporated with a re-assembly.
Programming aids are included for relocation and expression evaluation.
In writing assembly language programs for microcomputers, it is essential that a standardized format be followed. Most manufacturers provide assemblers, which are programs used to generate machine code instructions for the actual processor to execute.
The assembler converts the written assembly
language source program into a format which runs on the processor. In the source program, each machine code instruction (the binary or hex value) is represented by a mnemonic. A mnemonic is an abbreviation which represents the actual instruction.
Binary      Hex   Mnemonic   Description
01001111    4F    CLRA       Clears the A accumulator
00110110    36    PSHA       Saves A acc on stack
01001101    4D    TSTA       Tests A acc for 0
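The mapping between mnemonics and opcodes is essentially a lookup table. A minimal Python sketch, using only the three instructions listed above, might look as follows. The function names `assemble` and `disassemble` are illustrative, and a real assembler would also handle operands, labels and error reporting.

```python
# Opcode table taken from the three instructions listed above
OPCODES = {"CLRA": 0x4F, "PSHA": 0x36, "TSTA": 0x4D}
MNEMONICS = {code: name for name, code in OPCODES.items()}


def assemble(mnemonics):
    """Translate a list of mnemonics into machine code bytes."""
    return bytes(OPCODES[m] for m in mnemonics)


def disassemble(code):
    """Translate machine code bytes back into mnemonics."""
    return [MNEMONICS[b] for b in code]


machine_code = assemble(["CLRA", "PSHA", "TSTA"])
print(machine_code.hex().upper())                 # 4F364D
print([format(b, "08b") for b in machine_code])   # the same bytes in binary
print(disassemble(machine_code))                  # ['CLRA', 'PSHA', 'TSTA']
```

An unknown mnemonic raises a `KeyError` here, which is a crude form of the error checking that an assembler provides.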
Mnemonics are used because they:
are more meaningful than hex or binary values
reduce the chances of making an error
are easier to remember than bit values
Assemblers also accept certain characters as representing
number bases and addressing modes.
Notation               Meaning                 Example
$ prefix or h suffix   hexadecimal number      $24 or 24h
D                      decimal number          24D or 67
B                      binary number           0101111B
O or Q                 octal number            377O or 232Q
#                      immediate addressing    LDAA #$34
X                      indexed addressing      LDAA 01,X
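The notation above can be decoded mechanically. The following Python sketch shows one possible way to do it. The function names are hypothetical, and the "other" category for operands that are neither immediate nor indexed is an assumption, since the text only names those two addressing modes.

```python
SUFFIX_BASES = {"H": 16, "B": 2, "O": 8, "Q": 8, "D": 10}


def parse_number(token):
    """Convert a literal such as $24, 24h, 24D, 0101111B or 377O to an int."""
    token = token.strip().upper()
    if not token:
        raise ValueError("empty number")
    if token.startswith("$"):
        # $ prefix: the rest is hexadecimal, even if it ends in B or D
        return int(token[1:], 16)
    if token[-1] in SUFFIX_BASES:
        return int(token[:-1], SUFFIX_BASES[token[-1]])
    return int(token, 10)  # no prefix or suffix: plain decimal


def parse_operand(operand):
    """Classify an operand by addressing mode and decode its value."""
    operand = operand.strip().upper()
    if operand.startswith("#"):
        return ("immediate", parse_number(operand[1:]))
    if operand.endswith(",X"):
        return ("indexed", parse_number(operand[:-2]))
    return ("other", parse_number(operand))  # assumption: not covered above


print(parse_number("$24"), parse_number("24h"))  # 36 36
print(parse_number("0101111B"))                  # 47
print(parse_operand("#$34"))                     # ('immediate', 52)
print(parse_operand("01,X"))                     # ('indexed', 1)
```

The `$` prefix is checked first because B and D are also valid hexadecimal digits, so `$2B` must not be mistaken for a binary literal.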
Assembly language statements are written one per line. An assembly language program thus consists of a sequence of
statements, where each statement contains a mnemonic. Each line of an assembly language program is split into four fields, as shown below.
LABEL OPCODE OPERAND COMMENTS
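Splitting a statement into these four fields is usually the first thing an assembler does. The Python sketch below is one possible approach under stated assumptions: fields are separated by whitespace, a label is recognised by its colon suffix, and a semicolon starts the comment. The semicolon is an assumption, as the text does not say how comments are marked, and `split_statement` is a hypothetical name.

```python
def split_statement(line):
    """Split one source line into label, opcode, operand and comment."""
    code, _, comment = line.partition(";")  # assumed comment marker
    tokens = code.split()
    label = None
    if tokens and tokens[0].endswith(":"):
        label = tokens.pop(0)[:-1]          # drop the colon suffix
    opcode = tokens[0] if tokens else None
    operand = " ".join(tokens[1:]) or None
    return {
        "label": label,
        "opcode": opcode,
        "operand": operand,
        "comment": comment.strip() or None,
    }


print(split_statement("START: LDAA #24H ; load the accumulator"))
print(split_statement("JMP START"))
```

Fields that are absent come back as `None`, which reflects the fact that the label and comment fields are optional.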
Label
The label field is optional. A label is an identifier (or text string symbol). Labels are used extensively in programs to reduce reliance upon programmers remembering where data or code is located. A label can be used to refer to:
a memory location
the value of a piece of data
the address of a program, sub-routine, code portion, etc.
The maximum length of a label differs between assemblers. Some accept labels up to 32 characters long, others only four characters.
A label, when declared, is suffixed by a colon, and begins with a valid character (A..Z).
Example
START: LDAA #24H
Here, the label START is equal to the address of the instruction LDAA #24H.
The label is used in the program as a reference, e.g.
JMP START
This would result in the processor jumping to the location (address) associated with the label START, thus executing the instruction LDAA #24H immediately after the JMP instruction. When a label is referenced later on in the program, it is done so without the colon suffix.
An advantage of using labels is that inserting or re-arranging code statements does not necessitate re-working actual machine instructions.
A simple re-assembly is all that is required. In hand-coding, such changes can take hours to perform.
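The reason a re-assembly is enough can be illustrated with a small two-pass sketch in Python. The first pass records the address of every label and the second pass substitutes those addresses for the label references. The fixed `INSTRUCTION_SIZE` of two bytes per statement is a simplifying assumption (real instructions vary in length), and the function names are hypothetical.

```python
INSTRUCTION_SIZE = 2  # hypothetical: every statement occupies two bytes


def strip_label(line):
    """Return (label or None, remaining tokens) for one source line."""
    tokens = line.split()
    if tokens and tokens[0].endswith(":"):
        return tokens[0][:-1], tokens[1:]
    return None, tokens


def build_symbol_table(lines, origin=0):
    """Pass 1: record the address of every label declared with a colon."""
    symbols = {}
    address = origin
    for line in lines:
        label, tokens = strip_label(line)
        if label is not None:
            symbols[label] = address
        if tokens:
            address += INSTRUCTION_SIZE
    return symbols


def resolve(lines, origin=0):
    """Pass 2: replace label references with the addresses from pass 1."""
    symbols = build_symbol_table(lines, origin)
    resolved = []
    for line in lines:
        _, tokens = strip_label(line)
        if not tokens:
            continue
        words = []
        for token in tokens:
            if token in symbols:
                words.append(format(symbols[token], "04X") + "H")
            else:
                words.append(token)
        resolved.append(" ".join(words))
    return resolved


program = ["START: LDAA #24H", "JMP START"]
print(resolve(program))   # ['LDAA #24H', 'JMP 0000H']

# Insert one statement at the top and simply re-assemble
edited = ["CLRA"] + program
print(resolve(edited))    # ['CLRA', 'LDAA #24H', 'JMP 0002H']
```

The source line `JMP START` is unchanged in both versions; only the address computed for START differs, which is exactly the bookkeeping that had to be redone by hand before assemblers existed.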