INTRODUCTION TO BLOCKCHAIN.
This video has one goal: to explain what blockchain is in a way that is not overly technical and
that I find useful in highlighting its key features. If you're like me, you probably first heard of
blockchain technology through Bitcoin. Then you might have learned something more about Ethereum,
Ripple, and the large number of other cryptocurrencies currently trading on the various exchanges,
all of them essentially implemented using different parameters.

A blockchain is, for all intents and purposes, a decentralised database. As a user, you can upload
some data to it, and just like a database it will maintain the data and share it with whoever you
please, from the entire Internet to a single select address. This database has two unique features
that set it apart from, say, a typical centralised SQL-based system like a web file server. First,
once the data is uploaded, it is stored and arranged in a rather unique way. Second, because the
system is often decentralised, some type of consensus algorithm has to be employed to make sure that
at the end of the day everyone in the system receives the same copy of the data and there's no
conflict.

There could be a coin or token floating on the blockchain, like Bitcoin, to facilitate value
transfers, but that's not always necessary, and the token is just one type of data that you can
upload to the blockchain "database". We'll go over all these features in subsequent videos, and
you'll see that the key distinguishing features of different blockchain systems, like Bitcoin versus
Ethereum, are, number one, the type of data that you can "upload" to the blockchain and, number two,
how consensus is reached between parties in the network.

The stated goal
of most blockchain innovations is usually to achieve some level of trust and system integrity in
some type of financial or physical transaction without involving a central intermediary like a bank.
Instead, the task of the bank is spread out over the network of nodes comprising the blockchain.
Usually, in the marketing pitches, you'll hear several key buzzwords like fast transaction
verification, ownership tracking, data immutability, tamper resistance, and so on. As you'll see
later in this module, some of these claims are true and some are not.

So let's have a high-level illustration of how blockchain works graphically. I'm sure that most of
you have used a cloud-based document storage system like Dropbox or Google Drive, which are
essentially centralised databases maintained by these companies. Suppose you write a Word document
in Google Docs that you want to share with some colleagues. The document is just a piece of text
data that you generate using a client application, say a word processor on your computer or phone.
Once you write it and upload it, it goes into, say, Google's file server, which receives it and
processes it by converting it into a file format that's compatible with the database. Once the
processing is done, the data is then stored in the database using some particular format, say SQL or
Hadoop, which use particular file types designed to facilitate fast searching and fast data
retrieval.

A blockchain does a very similar thing using a different process. Let's
go back to the beginning. So you want to upload some data on the blockchain instead. As usual, you
use your client application to generate the data, and just like your word processor, the client app
generating these data, like a Bitcoin wallet, is usually not part of the blockchain. The next step
is where the key differences begin. Instead of uploading it to Google's server, on a blockchain
you're essentially broadcasting your data to an entire peer-to-peer network of connected computers,
which are called nodes, and collectively these nodes take on the role of the single database server.
They first receive the data in a peculiar way, then collectively "process" the data using the
consensus algorithm, and once processed the data will again be stored on the blockchain "database"
using the unique chained data structure that we mentioned earlier.

We'll use separate videos to look at each of these four steps in detail, in order of conceptual
difficulty. We'll first look at the client side, that is, the client identities on the blockchain.
Then we'll talk about the unique chained data structure, which again makes data searching easier.
Third, we'll look at how the network receives the data, and finally we'll look at how the network
processes the data using consensus, which is often the most difficult part.
CLASSIFYING BLOCKCHAIN TECHNOLOGY.
In the previous video we viewed a blockchain essentially as a decentralised database with some
unique features. Just like a database, you can upload all kinds of data on the blockchain, and this
is one way that blockchains could be classified: by the type of data that they can take.

The first type is the simplest and probably the one that you're most familiar with: the
cryptocurrencies like Bitcoin, Litecoin, Dogecoin, et cetera. These are essentially ledgers, in that
the only type of data that you can put on them are records of who paid whom and possibly who owns
what. That's why they're called currencies, because these ledgers can be used to record payment
activities between different parties. Unfortunately, most of them didn't pan out as intended
because, as we'll see shortly, they are incredibly inefficient, and therefore few people actually
use them as a payment mechanism. Instead, the majority of people use them for speculative purposes,
just like a digital form of gold: an asset that doesn't have much intrinsic value but can be held as
a store of value or investment due to other people's perceptions. Therefore we can call these
blockchains gold 2.0.

The second type of
blockchain, spearheaded by Ethereum but also including competitors like Tezos or EOS, further
expands the type of data that it can take. In addition to transaction records, you can also upload
programmes or code that you write to the blockchain, and the network of nodes will then execute
these programmes. This essentially makes the blockchain "programmable", and theoretically you can
use it as a computer to execute any programme you write. The programmes are often called smart
contracts, and as you can see this dramatically expands the usability of the blockchain from a
simple ledger to a decentralised cloud computing platform that you can build apps on. In reality,
however, the possibility is much more limited, as the inefficiency often prevents you from
uploading super complex programmes.

The third type sort of combines elements from these two to have a programmable ledger that, with
some tweaks like more centralisation, becomes more efficient and more usable for enterprise use.
We'll look at Ripple as an example of this class.

The next two
types are examples of the apps that can be built on a programmable blockchain. One type, called
utility tokens, represents access rights to some blockchain apps. These tokens are the ones that you
usually see in initial coin offerings, or ICOs. We'll cover these in a dedicated course. Another
example of these apps is stablecoins, which are like the asset-backed securities of the blockchain
world. The goal, as their name suggests, is to use the programmability of the blockchain to
essentially peg the value of these tokens to a fiat currency like the dollar, or peg it to the value
of some real assets like gold or real estate. We'll cover this type when we discuss the
cryptocurrency market.

The first three types tend to be called "coins", which come with their own blockchains. The next
types tend to be called tokens, which are apps that piggyback on other blockchains.

Another way that blockchain
technologies are usually classified is by how accessible they are, that is, how visible the data
stored on them are to the public. On one extreme, for most publicly traded cryptocurrencies like
Bitcoin, data on the blockchain is visible to the entire Internet. Everyone can access these data by
either joining the network or simply going to a third-party website like Block Explorer or
Etherscan. Because it's public, anyone can join and become a node. Note that this is different from
using the blockchain as a user: as a node, you're a processor and are there to provide the
intermediation services, like transaction verification, to the client users. In some settings a node
is called a miner; in other settings it's called a validator. We'll explain these terms shortly.

More importantly, because everything is public, these systems usually aim to maintain a high level
of anonymity. In practice, this means hiding the nodes and the clients behind some pseudo-anonymous
identity generated using cryptography, such that as a public viewer you can see all the transactions
but cannot guess who made those transactions. And because everyone can participate, this system
could have a lot of nodes and is the most decentralised version of the blockchain. Therefore they
need some complex consensus process to make sure that at the end of the day all the nodes have the
same copy of the data.

On the other extreme is a private
blockchain, which is very similar to a shared database. Instead of being open to the public, these
blockchains are just set up as data-sharing tools between some already trusted parties, say between
a manufacturer and its suppliers. Because of this, all the members, the nodes, are completely known
to each other, and others cannot join without permission. And because the parties are already known
to each other, oftentimes there's no need for any consensus mechanism; whoever uploads the data
could, for instance, just share it with everybody. There's no need to use a token to incentivise
other parties to participate either. Not surprisingly, this type of blockchain is the closest
analogue to a shared Google Doc. It is consequently used primarily for either internal purposes or
B2B applications between already trusted business parties.

In the middle, taking features from both public and private
blockchains, is a permissioned blockchain. Here, just like a public blockchain, the data could be
made publicly available in anonymised form, but like a private blockchain, not just anyone can sign
up to become a node. Access to this system is tightly controlled, and nodes, miners, or validators
have to receive permission to be able to participate. Because of this, just like a private
blockchain, the consensus process is usually much simpler. At the same time, if the data is made
public, this could also serve as a monitoring device to ensure that the permissioned nodes are not
doing anything bad. This would theoretically enhance the processing efficiency, making it more
suitable as an enterprise-level product that could be used in high-volume, high-frequency
applications such as cross-border business payments. Ripple and Hyperledger Fabric are examples of
permissioned blockchains.
IDENTITIES ON THE BLOCKCHAIN
In a previous video we had an overview of the blockchain as a technology with a lot of similarities
to a shared database. It has four components: the user application, which maintains the identity and
generates the data; the network, to which the data is broadcast and propagated; the structure in
which the data is stored on the blockchain; and the data processing mechanism, that is, the
consensus.

Most implementations of blockchain,
particularly public blockchains like Bitcoin or Ethereum, use public key cryptography to maintain a
certain level of anonymity. In this setting, an identity on a blockchain is simply a pair of public
and private keys. This is how it works. Suppose you want to use the Bitcoin network to send and
receive payments. Here, just like PayPal, where you have to register an account, you need to
generate an identity, or "username", for yourself. Unlike PayPal, however, in Bitcoin this is a much
simpler process that takes a fraction of a second. You simply use an app, or do it by yourself, to
generate a pair of encryption keys. These keys are essentially very large random numbers that are
mathematically linked, such that other people cannot easily guess one from the other. Both keys can
be used to encrypt data that you generate, and you get to call one of them the public key and one of
them the private key; it doesn't matter which one you pick. The important part of this process is
the mathematical link between the keys: they're set up such that data encrypted with your public
key can only be decrypted with your private key.

Now you can see why it's useful. You're going to publish one of the keys, your public key, as much
as possible: put it on your blog, put it in your e-mail signature, et cetera, because the public key
serves as your receiving address for data that other people send to you. That's what's underneath
the Bitcoin wallet address that you see everywhere online. Suppose your friend wants to send some
Bitcoin to you. What he'll do is simply generate the transaction data and designate your public key
as the receiver. Then only you can "open" it by decrypting it with the private key, which only you
have and will try to keep as secret as possible.

On the other hand, when
you want to send money to someone else, your private key serves as your signature. You're going to
generate the transaction data addressed to someone else's public key, and you're going to encrypt it
with your private key. This might seem trivial and pointless, because anyone can decrypt it with
your public key, but that's precisely the point: any information that can be decrypted by your
public key must have been encrypted with your private key, which only you have and no one else has.
Therefore other people, such as the nodes or the miners, can verify that the transaction indeed came
from you by decrypting it with your public key. This is called checking the signature. If they
cannot decrypt it with the public key, then the transaction was not signed by your private key and
therefore did not come from you. This setting ensures that a transaction could only be initiated by
a person possessing the private key and no one else.

Immediately, there are several caveats that we need to know. First, as you can see,
there is no inherent KYC rule, or know-your-customer rule, on the blockchain. Your only identity
there is the public-private key pair, and because these are essentially random numbers, you can
create as many as you want, just like a username or e-mail address. For example, you can create one
key pair to receive some Bitcoin, then spread it to 1,000 other key pairs and use each key pair for
one transaction only. This gives you a certain level of anonymity, because on the blockchain the
public only sees the transactions between the public keys; we cannot tell the link between the keys
and the actual people. At the same time, this setting is only pseudo-anonymous, or pseudonymous,
because we still see all the transactions, so we can uncover a certain amount of information by
running sophisticated machine learning algorithms on these transaction data. Also, if you for
example acquire Bitcoin using legitimate means, such as buying it on an exchange, the exchange would
maintain the link between your identity, your actual ID, and the keys you use. Depending on which
jurisdiction they are located in, such as the EU or the US, this KYC rule is required.

Finally, you need to realise that the public key encryption
setting is a significantly weaker form of security than, say, the two-factor password authentication
used by banks. A lot of blockchain solution providers use marketing pitches like "this is a
network-based and more secure form of transaction", and that is simply not correct, because here
your private key is your only form of authentication and you are 100% responsible for keeping it
secure. If someone steals your private key, it's game over: they can use it to sign away all your
transactions, and there's absolutely no recourse. This is a very heavy security requirement and one
that's hard to meet. We'll talk about hot and cold key storage in the cryptocurrency module, but
people have gone as far as printing their keys out and storing those in nuclear-hardened bunkers,
but even that suppose you
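
The sign-then-check flow described in this video can be sketched in a few lines of Python. This is
only a toy, assuming textbook RSA with deliberately tiny primes so the numbers stay readable; the
values 61, 53, and 17 and the helper names digest, sign, and verify are made up for illustration.
Bitcoin itself uses elliptic-curve signatures (ECDSA) rather than RSA, so treat this as a picture of
the idea, not of Bitcoin's actual code. It needs Python 3.8 or later.

```python
# Toy "textbook RSA" signature: illustration only, never use for real security.
import hashlib

# Key generation with hypothetical tiny primes (real keys are hundreds of digits long).
p, q = 61, 53
n = p * q                      # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent: modular inverse of e (Python 3.8+)

public_key = (e, n)            # publish this
private_key = (d, n)           # keep this secret


def digest(message: bytes) -> int:
    """Hash the message and reduce it so it fits below the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes, key) -> int:
    """'Encrypt' the message digest with the private key."""
    exponent, modulus = key
    return pow(digest(message), exponent, modulus)


def verify(message: bytes, signature: int, key) -> bool:
    """'Decrypt' the signature with the public key and compare with the digest."""
    exponent, modulus = key
    return pow(signature, exponent, modulus) == digest(message)


tx = b"send 1 coin to Bob"
sig = sign(tx, private_key)
print(verify(tx, sig, public_key))                # True: signed by the matching private key
print(verify(tx, (sig + 1) % n, public_key))      # False: a forged signature fails the check
```

Anyone holding only the public key can run verify, but producing a signature that passes requires
the private exponent, which is the point made above about the private key being your signature.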
BLOCKCHAIN DATA
In previous videos we had an overview of the blockchain as a technology similar to a shared
database. In this video we'll look at the structure with which data is stored on the blockchain
"database" and why the structure might be useful in applications such as cryptocurrencies.

Blockchain, as its name suggests, stores data in a linked setting, where each block of data is
linked to one or more data blocks created previously, therefore making searching through the
database backward in time efficient. This linkage between the data pieces is done by applying a
concept that's well used in computer file searching to create what are called hash pointers. Let's
take a look at this hashing concept and see why it's useful.

Think about a friend of yours among a room of strangers. How do you identify your friend? You might
have a lot to say about them: their looks, their personality, a lot of data that's unique about
your friend and different from the other people. However, computationally you don't need all that
data. All you need to tell your friend unequivocally apart from the others is a fingerprint, and
this is exactly what hashing does. It takes a piece of data of any length and creates an almost
unique fingerprint of that data of a small, fixed length, such that you can embed it in another
piece of data, linking them together.

There are many hashing algorithms, and one example is the SHA-256 cryptographic hashing algorithm
used in the Bitcoin blockchain. Here's how it works. Take a piece of data that you want to create
the fingerprint from. It could be anything: text, image, video. But stored on a computer these are
always essentially ones and zeros. The hashing function takes these ones and zeros, splits them
into chunks, and runs many, many rounds of simple logical operations on them, such as AND, OR, XOR,
et cetera, essentially shuffling these ones and zeros and condensing them into fewer ones and zeros
until eventually there are only 256 ones and zeros left. That's 256 bits, 32 bytes, or 64 digits
expressed in hexadecimal format. So essentially what hashing does is the fingerprinting process: it
takes some input data of any arbitrary length and "compresses" it to create an almost unique
representation of that data in 128, 256, or another small number of ones and zeros. By the way,
this is how searching on your computer is done: each file is indexed, that is, has a hash created
from it, and searches are done on the collection of hashes, which are much smaller than the full
data, making the process much faster.

In order to be used on a blockchain, particularly on public
blockchains distributed to a large number of parties, a good cryptographic hashing function like
SHA-256 should have some good concealment and efficiency features. First, it should be very quick
to compute. On decent computers, SHA functions usually take a fraction of a second regardless of
the input data size. Second, it should be very difficult to reverse engineer; that is, given the
hash, it should be infeasible to back out the original input. Third, there needs to be what's
called a high avalanche effect, which means that if you change the input by just a little bit, say
adding an additional zero, then take a hash of it, the new hash should be dramatically different
from the original hash. This ensures that you can't simply guess the input by pattern analysis. In
fact, the only way to guess the input is to try all the possible inputs, hashing them one at a
time, and find the input that produces that hash.

Taken together, these features ensure that a cryptographic hashing function is essentially a
one-way function that is easy to compute but very hard to reverse. This property is very important
and is the foundation of a lot of applications where it is important to check the authenticity of
some data but you don't necessarily want to reveal that data. One example is password storage. The
password associated with your online account is stored on the server in hashed form, so even if
someone breaches the site they still can't easily reverse it unless they brute force it. But it's
easy to check, because when you enter the password it is hashed and the hash is compared with the
stored copy, granting you access if these match. The other example is the Bitcoin mining puzzle,
which we'll talk about in the next module. Here, in a generic blockchain setting, hashing is
primarily used to create the data fingerprint, allowing us to link the data together.
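
The fingerprinting properties just described can be tried out with Python's standard hashlib
module. This is a minimal sketch: the helper names fingerprint and check_password and the sample
strings are made up for illustration, and the password check is simplified (real systems also add
a salt and use deliberately slow hash functions).

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hash of the data as 64 hexadecimal digits."""
    return hashlib.sha256(data).hexdigest()


# Fixed length: one byte or a million bytes both give 64 hex digits (256 bits).
short_hash = fingerprint(b"a")
long_hash = fingerprint(b"a" * 1_000_000)
print(len(short_hash), len(long_hash))

# Avalanche effect: changing one character changes the hash dramatically.
h1 = fingerprint(b"pay Alice 10")
h2 = fingerprint(b"pay Alice 11")
differing = sum(c1 != c2 for c1, c2 in zip(h1, h2))
print(h1)
print(h2)
print(differing, "of 64 hex digits differ")

# Password storage: keep only the hash, and compare hashes at login.
stored = fingerprint(b"correct horse")


def check_password(attempt: bytes) -> bool:
    """Hash the attempt and compare it with the stored hash."""
    return fingerprint(attempt) == stored


print(check_password(b"correct horse"), check_password(b"wrong guess"))
```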

Here's how it works. In the beginning, some data is generated and stored on the blockchain. This
data block is called the genesis block, as it marks the start of the chain. When the next batch of
data is created and stored, however, it'll include a hash of the previous block. This unique
fingerprint essentially creates a pointer pointing toward the previous block. It is the same
process for future blocks as well: the next block will contain a hash of the previous block, which
itself contains the hash of the block before it, and so on and so forth. Note that we can use these
hash pointers anywhere, not just on the blocks themselves but on individual data pieces within each
block as well. For instance, the Bitcoin blockchain stores individual transaction data, and each
transaction could contain the hash of one or more previous transactions, showing the previous
receipt of the coins. This is the chain concept in blockchain: hash pointers allow data blocks and
individual data pieces to be chained to previous histories.

The reason why this particular structure is needed is twofold. First, just like any
file search, hashing makes searching through the blockchain easier. To locate a particular piece
of data, we just need to take a hash of it and search through all the hashes stored on the
blockchain, which are much smaller in size, and this is particularly important in a distributed
blockchain like Bitcoin, where searches are done over tens of thousands of nodes.

In addition, hash pointers create a so-called tamper-evident data log on the blockchain, which
again in a public setting serves to enhance data integrity. To see this, suppose someone tried to
alter the historical data in, say, block one, maybe changing the ownership of some coins. This will
be immediately evident to all blockchain nodes, because the altered block, when hashed, won't match
the stored historical hash of that block in block two, which again, when hashed, won't match the
stored hash in block three, and so on and so forth. Therefore attempts to manipulate the data on a
well-distributed blockchain will be quite obvious, and the proof of any action is evident using
hashes.

Notice that I label this as tamper-evident, not tamper-proof or data immutability, and it's
important to distinguish these claims because they are often used interchangeably in marketing
pitches. Here the hash pointers make tampering attempts evident to all nodes of the network, but
tampering is still very possible, even likely, if the tampering party happens to control a large
number of nodes, using, say, a 51% attack. We'll look at this distinction in more detail in the
video on consensus algorithms.
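
The chaining and the tamper-evidence described above can be sketched as follows. This is a toy
model, not Bitcoin's actual block format: the field names index, data, and prev_hash, the all-zero
pointer for the genesis block, and the function names are assumptions made for illustration.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Fingerprint a block by hashing its JSON form with sorted keys."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def build_chain(batches):
    """Chain data batches; each block stores the hash of the block before it."""
    chain = []
    prev = "0" * 64  # placeholder pointer for the genesis block (an assumption)
    for index, data in enumerate(batches):
        block = {"index": index, "data": data, "prev_hash": prev}
        chain.append(block)
        prev = block_hash(block)
    return chain


def first_broken_link(chain):
    """Return the index of the first block whose pointer no longer matches, or None."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None


chain = build_chain([["A pays B 5"], ["B pays C 2"], ["C pays D 1"]])
print(first_broken_link(chain))      # None: every hash pointer matches

chain[0]["data"] = ["A pays E 5"]    # tamper with the genesis block
print(first_broken_link(chain))      # 1: the hash stored in block 1 no longer matches
```

Note that a party who rewrote every later pointer as well would pass this check on their own copy.
It is the copies held by the other nodes that expose the change, which is why the structure is
tamper-evident rather than tamper-proof.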
BLOCKCHAIN NETWORK AND DATA PROCESSING.
Let's take a look at the third component of the blockchain database, that is, how data are sent and
received over the network. Say you use a client application like a Bitcoin wallet to generate some
transaction data. If you are using a traditional database to record it, this process is easy: you
just send it over to the web address of the database server and it will be quickly received and
processed. If you're using a relatively centralised blockchain with a limited number of nodes, like
Ripple, this process is also easy: you just broadcast the data to all the nodes simultaneously and
they'll receive and process the data in a relatively short time frame. Things get a little bit more
complicated if you're using a distributed blockchain like Bitcoin. Instead of a few central
parties, you get a network with hundreds of thousands to millions of nodes. Because of network
latency, if you're trying to get the transaction uploaded to all the nodes, it would take a really
long time. Therefore in a distributed blockchain some compromise has to be implemented to make this
process more time-efficient. The setting will look familiar to you if you have used peer-to-peer
file-sharing apps like BitTorrent. If not, don't worry; things will be quite clear after this
video.

So you
have a network of nodes carrying the blockchain. Let's use Bitcoin as an example, where the nodes
in the network are also called miners. Suppose you want to send some Bitcoin that you have to
another person, and this means getting the transaction record uploaded onto the Bitcoin blockchain.
So you use your wallet app to generate the transaction data, which as we'll see shortly is just a
programming script. But instead of sending it to the entire network, you're going to broadcast the
transaction data to a few miners that are closest to you in terms of network latency, and the
miners will propagate the data on the network using what's called a gossip protocol. As its name
suggests, the protocol is similar to how gossip is spread among a network of friends. You wouldn't
shout the gossip aloud to everyone; instead, you whisper it to the people sitting next to you. Same
process here: the miners will propagate the transaction to the miners that are closest to them in
terms of network latency, and this process goes on and on until the entire network has received the
data. Graph theory suggests that this form of information propagation is more efficient than the
setting where you broadcast information to everybody.

In practice, though,
this has two limitations. The obvious one is that, despite being more efficient, it will still take
some time for the information to go through. Particularly if the network is large or the data is
large in size, it could still take minutes or even hours. Therefore most blockchain variations
implement some sort of time cut-off. For instance, in Bitcoin a block is cut roughly every 10
minutes. If your data didn't make it into the block, you have to wait for it to get onto the next
block 10 minutes later. This process, however, introduces a second problem. Sometimes the 10-minute
cut-off is not enough time for the data to be propagated across the entire network, so at the end
of every interval some nodes would have received it and some nodes wouldn't. They would also have a
different set of transaction information received from other wallet apps closest to them. This
conflict has to be resolved, and this is the reason for the fourth blockchain component, the
consensus mechanism, which we'll discuss in the next video.

Now that we know how the data is propagated across the network,
let's take a closer look at the data themselves and see how they are generated and received. Again,
as I said before, most blockchains use a scripting language, so the data broadcast and stored on
the blockchain are essentially some programming code and the associated inputs and outputs. This is
an example of the Bitcoin transaction data that any wallet could generate, and now, with knowledge
from the previous couple of videos, we can almost make sense of it. It is a simple script with some
inputs and outputs. The input has a hash pointer containing the hash of the previous transaction,
proving that you have received the Bitcoin. Next, scriptSig is just a subscript that uses your
private key to sign the transaction. As an output, you're going to put the amount that you want to
send, the receiver's "shipping address", which is just a hash of their public key, and finally a
couple of scripts for the miners to execute to check the validity of the transaction, including
checking the signature to make sure that it matches your public key and also checking the
transaction amount to make sure that you're not spending more than you have.

And that's essentially how your data is processed in a distributed blockchain like Bitcoin. Let's
take another
30,000-foot view. The transaction data is generated as a script using the client app and signed
using your private key. When the miners receive the scripts, they execute them to check that the
transaction has a valid signature and amount, then propagate them to the other miners. Once it
officially makes it onto the blockchain after consensus is reached, the receiver can then repeat
the process and send the coin that they just received elsewhere.

The final piece of the blockchain data processing puzzle is the notion of blocks. As we saw a
couple of minutes ago, blocks are there simply to serve as a "batch processing" mechanism to
enhance efficiency and enforce a time cut-off for data to be broadcast across the network. In this
feature, the blockchain network is similar to the ACH network that we talked about before, rather
than the credit card network. Sure, you can use each transaction as a block and process them in
real time, just like a credit card transaction, but usually it is more efficient to do at least
some batch processing, like an ACH transaction. Different blockchains use different parameters for
the block size and time intervals, and there's a lot of flexibility there, from around 10 minutes
in Bitcoin to between 10 and 20 seconds for Ethereum.
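
The batching idea can be sketched as grouping transactions by the interval in which they arrived.
This is a simplification: the function name batch_into_blocks and the sample arrival times are made
up, and the sketch assumes a fixed clock, whereas in Bitcoin the 10 minutes is an average rather
than a fixed schedule.

```python
from collections import defaultdict

BLOCK_INTERVAL = 600  # seconds; roughly the 10-minute Bitcoin interval mentioned above


def batch_into_blocks(transactions, interval=BLOCK_INTERVAL):
    """Group (arrival_time_in_seconds, tx) pairs by the interval they arrived in."""
    blocks = defaultdict(list)
    for arrival, tx in transactions:
        blocks[arrival // interval].append(tx)
    # Return the batches in time order, skipping intervals with no transactions.
    return [blocks[slot] for slot in sorted(blocks)]


arrivals = [(5, "tx1"), (320, "tx2"), (599, "tx3"), (601, "tx4"), (1300, "tx5")]
print(batch_into_blocks(arrivals))
# [['tx1', 'tx2', 'tx3'], ['tx4'], ['tx5']]
```

Passing a smaller interval, for example interval=15, mimics a chain with a much shorter block time:
the same transactions are then spread over many more, smaller blocks.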

During this interval, the nodes receive the data as usual, and instead of sending each one through
in real time, they're going to group them into a pending block. Again, because of network latency
or even attack attempts, this pending block could be different for each node, as each pending block
will contain a different set of transactions that the nodes have received so far. At the end of
each interval, the nodes are going to reconcile their pending blocks using some consensus
algorithm, so that in the end only one block makes it to the chain and is downloaded by all the
nodes, and this process then repeats.

Within each block, the data is usually organised by the nodes themselves using, for instance,
Merkle trees. Here the important part is that, because of the decentralised nature of the network,
there is no hard and fast rule that requires the nodes to organise the data strictly on a
first-come, first-served basis. In fact, the nodes have complete discretion on whether to receive
the data and how to organise it within the block. In many cases you might have to pay to get your
data received and broadcast, and as we'll see in the module on cryptocurrencies, this could serve
as an important incentive for the nodes to participate in the network.
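
To make the point about node discretion concrete, here is a hypothetical sketch of one policy a
node might follow when filling its pending block: take the transactions that pay the most and
ignore arrival order. The function name build_pending_block, the fields id, fee, and arrived, and
the sample numbers are all assumptions for illustration; actual nodes are free to use other rules.

```python
def build_pending_block(pending, max_txs):
    """Pick the highest-fee transactions first; arrival order is ignored."""
    ranked = sorted(pending, key=lambda tx: tx["fee"], reverse=True)
    return [tx["id"] for tx in ranked[:max_txs]]


pending = [
    {"id": "tx1", "fee": 1, "arrived": 0},
    {"id": "tx2", "fee": 5, "arrived": 1},
    {"id": "tx3", "fee": 3, "arrived": 2},
    {"id": "tx4", "fee": 0, "arrived": 3},
]
print(build_pending_block(pending, max_txs=2))   # ['tx2', 'tx3']
```

Here tx1 arrived first but is left out of the block, while tx4, which pays nothing, would be the
last to be included under this policy.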