An introduction to
Kolmogorov complexity
Liliana Salvador and Gustavo Lacerda
Complex Systems Summer School 2009
Outline
Symbols, strings and languages
Turing machines
Kolmogorov complexity
Incompressibility
Kolmogorov Complexity and Shannon entropy
Symmetry of Information
Time-bounded Kolmogorov complexity
Learning as compression
Universal learning
Universal measures of similarity
Clustering by compression
Symbols, Strings and
Languages
Alphabet A = {0,1} : finite set of symbols
A string over an alphabet A is a finite ordered sequence of symbols
from A.
The empty string, denoted by ε, is the (unique) string of length zero.
Given an alphabet A, we define
A^0 = {ε}
A^(n+1) = A A^n
A^* = A^0 ∪ A^1 ∪ A^2 ∪ …, the set of all finite strings over A.
A language L over an alphabet A is a subset of A^*. That is, L ⊂ A^*.
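A small Python sketch (illustrative only, not from the slides) transcribing the recursive definition above:

    # A direct transcription of A^0 = {ε}, A^(n+1) = A A^n for A = {0, 1}.
    A = {"0", "1"}

    def A_power(n):
        """The set A^n of all strings over A of length n."""
        if n == 0:
            return {""}                                    # A^0 = {ε}
        return {a + w for a in A for w in A_power(n - 1)}  # A^(n+1) = A A^n

    print(A_power(2))   # {'00', '01', '10', '11'}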
The Roots…
Probability theory, information theory, philosophical
notions of randomness, theory of algorithms
Regular sequence Pr(00000000000000000000) = 1/2^20
Regular sequence Pr(01100110011001100110) = 1/2^20
Random sequence Pr(01000110101111100001) = 1/2^20
Classical probability theory cannot express the notion of
randomness of an individual sequence. It can only express
expectations of properties of the total set of sequences under
some distribution.
How to measure information
of a single object?
It has to be an attribute of the object itself
Independent of the description method
Is defined as the length of the shortest binary program from which
the object can be effectively reconstructed.
Example:
Consider the ensemble of all binary strings of length
9999999999999999.
By Shannon’s measure, we require 9999999999999999 bits on
average to encode a string in such an ensemble. However, a regular
string in this ensemble, say the one consisting of 9999999999999999 1’s,
can be encoded in about 55 bits: express this number in binary and add
“repeat the pattern 1”. The number itself can be written even more
compactly as 3^2 × 1111111111111111 (a factor consisting of 2^4 = 16 1’s).
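A quick sanity check of these numbers (a sketch using nothing beyond Python's built-in integers):

    n = 9999999999999999
    print(n.bit_length())             # 54: about 55 bits suffice to write n in binary
    print(9 * 1111111111111111 == n)  # True: n = 3^2 * 1111111111111111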
Turing Machine
A Turing machine
T = (S, A, δ, s0, b, F)
consists of:
S, a finite set of states
A, an alphabet
δ : S × (A ∪ {b}) → S × (A ∪ {b}) × {L, R}, the transition function
s0 ∈ S, the start state
b, the blank symbol marking unused tape cells
F ⊂ S, the halting and/or accepting states.
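To make the definition concrete, here is a minimal simulator for machines in this form; the example machine (my own illustration, not from the slides) flips every bit of its input and halts at the first blank:

    from collections import defaultdict

    BLANK = "b"

    # delta: (state, read symbol) -> (new state, written symbol, move L/R)
    delta = {
        ("s0", "0"):   ("s0", "1", "R"),
        ("s0", "1"):   ("s0", "0", "R"),
        ("s0", BLANK): ("halt", BLANK, "R"),
    }

    def run(input_string, start="s0", halting=("halt",), max_steps=10_000):
        tape = defaultdict(lambda: BLANK, enumerate(input_string))
        state, head = start, 0
        for _ in range(max_steps):
            if state in halting:
                break
            state, tape[head], move = delta[(state, tape[head])]
            head += 1 if move == "R" else -1
        return "".join(tape[i] for i in sorted(tape)).strip(BLANK)

    print(run("0110"))   # -> 1001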
Happy Birthday Alan!
(23 June 1912 – 7 June 1954) was a British
mathematician, logician, cryptanalyst and computer
scientist.
Considered to be the father of modern
computer science.
Provided an influential formalization of the
concept of the algorithm and computation with
the Turing machine.
Turing was homosexual, living in an era when
homosexuality was considered a mental illness
and homosexual acts were illegal. He was
criminally prosecuted, which essentially ended
his career.
He died not long after from what was officially
declared to be self-induced cyanide poisoning,
although the circumstances of his death remain
ambiguous.
Halting Problem
Given a program and an input to the program, decide
whether the program will eventually halt when run with
that input.
The halting problem is famous because it was one of
the first problems proven algorithmically undecidable
(not computable).
This means there is no algorithm which can be applied
to any arbitrary program and input to decide whether
the program stops when run with that input.
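A hedged sketch of the standard diagonal argument behind this, written as Python-flavoured pseudocode (halts() is hypothetical; no such function can actually be written):

    def halts(program, data):
        """Hypothetical halting decider -- assumed here, not implementable."""
        raise NotImplementedError

    def diagonal(program):
        if halts(program, program):   # if the decider says "halts"...
            while True:               # ...loop forever,
                pass
        else:
            return                    # ...otherwise halt immediately.

    # Running diagonal on its own source text would halt if and only if it
    # does not halt -- so no correct halts() can exist.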
Occam’s Razor
William of Ockham (1290--1349)
“Entities should not be multiplied beyond necessity.”
Commonly explained as: when you have a choice, choose
the simplest theory.
Bertrand Russell: "It is vain to do with more what
can be done with fewer."
Newton (Principia): "Natura enim simplex est, et
rerum causis superfluis non luxuriat" ("Nature is simple,
and does not luxuriate in superfluous causes").
Ray Solomonoff
(born 1926, Cleveland, Ohio)
Algorithmic probability
Theory of inductive inference
Attended the first meeting where AI became a field
(Dartmouth, 1956)
…..
Andrey Nikolaevich Kolmogorov
(1903, Tambov, Russia—1987 Moscow)
Measure Theory
Probability
Analysis
Intuitionistic Logic
Cohomology
Dynamical Systems
Hydrodynamics
Kolmogorov complexity
Gregory Chaitin
(born 1947; an Argentine-American mathematician and computer scientist)
Algorithmic information theory
Chaitin’s constant Ω
Biology
Neuroscience
Philosophy
……
Algorithmic information
(Kolmogorov complexity)
Proof of the Invariance theorem
Fix an effective enumeration of all Turing machines (TMs): T1, T2, …
Define K_T(x) = min { |p| : T(p) = x }.
U is an optimal universal TM such that U(1^n 0 p) = T_n(p) (p produces x).
Then for all x:
K_U(x) ≤ K_Tn(x) + n + 1, and |K_U(x) – K_U'(x)| ≤ c
Fixing U, we write K(x) instead of K_U(x).
[Paul Vitanyi slide]
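A tiny illustration (a sketch of my own) of where the n + 1 in the bound comes from, under the encoding U(1^n 0 p) = T_n(p) above; the program bits are arbitrary placeholders:

    def encode_for_U(n, p):
        """Self-delimiting input 1^n 0 p telling U to simulate T_n on p."""
        return "1" * n + "0" + p

    p = "0110100"             # placeholder program bits for T_3
    q = encode_for_U(3, p)
    print(q)                  # 11100110100
    print(len(q) - len(p))    # 4 = n + 1 extra bits, hence K_U(x) <= K_T3(x) + n + 1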
Properties and examples
Intuitively: K(x)= length of shortest description of x
K(x) ≤ |x|+O(1)
K(x|y)=length of shortest description of x given y.
K(x|y) ≤ K(x)+O(1)
For all x,
K(x|x) = O(1)
K(x|ε) = K(x); K(ε|x) = O(1)
K(x,x) = K(x) + O(1)
K(x,y) ≤ K(x) + K(y) + O(log(min{K(x), K(y)}))
Randomness
Randomness of strings means that they do not contain
regularities.
If the regularities are not effective, then we cannot use them.
Hence, we consider randomness of strings as the lack of
effective regularities (that can be exploited).
For example: a random string cannot be compressed by any
known or unknown real-world compressor.
[Paul Vitanyi slide]
Intuition:
Randomness = Incompressibility
For a constant c > 0, a string x ∈ {0,1}* is c-incompressible if
K(x) ≥ |x| − c.
We often simply say that x is incompressible.
(We will call incompressible strings random strings.)
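A rough demonstration with a real compressor standing in for K (zlib here; any off-the-shelf compressor only gives an upper bound on K):

    import os, zlib

    regular = b"01" * 5_000                 # 10,000 bytes with an obvious pattern
    random_ = os.urandom(10_000)            # 10,000 bytes with no usable pattern

    print(len(zlib.compress(regular, 9)))   # a few dozen bytes
    print(len(zlib.compress(random_, 9)))   # slightly more than 10,000 bytes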
Shannon Entropy and Kolmogorov
complexity
Shannon entropy of a random variable X over sample space S
H(X) = ∑ P(X=x) log 1/P(X=x)
H(X) bits are necessary on P-average to describe the outcome x.
Example. For P uniform over finite S, we have
H(X)=∑ (1/|S|)log |S| = log |S|.
Kolmogorov complexity is the minimum description (smallest program) for
one fixed x, while H(X) is an average over the whole ensemble. Still, the
entropy and the P-expectation of K(x) converge to the same thing:
H(P) = − ∑ P(x) log P(x) is asymptotically equal to the expected complexity
∑_x P(x) K(x); the two differ by at most an additive term that depends only on P.
[Paul Vitanyi slide]
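A rough numerical check of this correspondence (an illustrative sketch; zlib again stands in for K, so the per-symbol figure upper-bounds and only roughly tracks H):

    import math, random, zlib

    def zlib_bits_per_symbol(p, n=200_000):
        """Pack n Bernoulli(p) bits into bytes and compress them with zlib."""
        bits = [1 if random.random() < p else 0 for _ in range(n)]
        packed = bytes(sum(b << i for i, b in enumerate(bits[j:j + 8]))
                       for j in range(0, n, 8))
        return 8 * len(zlib.compress(packed, 9)) / n

    for p in (0.5, 0.1, 0.01):
        H = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
        print(f"p={p}: H={H:.3f} bits/symbol, zlib ~ {zlib_bits_per_symbol(p):.3f}")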
Symmetry of information
I(x;y) = K(y) − K(y|x)
= K(x) − K(x|y)
= I(y;x) (up to an additive log term)
(I(x;y) is read as "the information x knows about y")
Entropy or KC?
“It has been shown that although in practice we
can’t be guaranteed to get the right answer to
either the entropy or computational complexity
values, we can be sure that they are (essentially)
equal to each other, so both methods can be
useful, depending on what we know about the
system, and what our local goals are.”
• Tom Carter’s notes
Resource bounded KC
Kolmogorov complexity assumes unlimited computational
resources. Kolmogorov himself first observed that we can
put resource bounds on such computations. This was
subsequently studied by Barzdins, Loveland, Daley, Levin,
and Adleman.
In the 1960’s, two parallel theories were developed:
Computational complexity – Hartmanis and Stearns,
Rabin, Blum, measuring time/space complexity
Kolmogorov complexity, measuring information.
Resource bounded KC links the two theories.
[Paul Vitanyi slide]
Theory
C^{t,s}(x|y) is the t-time, s-space bounded Kolmogorov complexity
of x conditional on y, i.e. the length of the shortest program that,
with input y, produces x in time t(n) and space s(n).
In standard K complexity it does not matter whether we say
"produces x" or "accepts x"; they are the same. But in the
resource-bounded case they are likely to differ, so Sipser
defined CD^{t,s}(x|y) to be the length of the shortest
program that, with input y, accepts x in time t(n) and space
s(n).
When we use just one parameter, such as time, we simply
write C^t or CD^t.
[Paul Vitanyi slide]
Learning as (lossless)
compression:
Q: how do you compress well?
A: by finding patterns in the data, i.e. by learning to
predict it
compression is learning: in the long run, it's impossible
to compress a source without learning the patterns in it
learning is compression: the output of learning from
data is a more compact representation of the data
Universal learning
KC suggests a universal learning algorithm: given data
from a source, search for a TM that outputs the same
distribution.
Of course, there are infinitely many such programs.
One way to encode Occam's razor is to select the
smallest such program. This is roughly the idea behind
MDL learning (in reality, MDL uses a restricted class of
programs)
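A toy sketch of this two-part idea (my own illustrative setup, not the slides'): the "restricted class of programs" is the class of order-k Markov models of a bit string, and we keep the order k minimizing model bits plus data bits; the (#parameters / 2) * log2 n model cost is the usual asymptotic approximation.

    import math
    from collections import Counter

    def two_part_codelength(bits, k):
        """Model bits + data bits for an order-k Markov model of the string."""
        n = len(bits)
        ctx_counts, sym_counts = Counter(), Counter()
        for i in range(k, n):
            ctx = bits[i - k:i]
            ctx_counts[ctx] += 1
            sym_counts[(ctx, bits[i])] += 1
        data_bits = sum(-c * math.log2(c / ctx_counts[ctx])
                        for (ctx, _), c in sym_counts.items())
        model_bits = (2 ** k) / 2 * math.log2(n)   # ~ (#parameters / 2) * log2 n
        return model_bits + data_bits

    bits = "01" * 500                              # a very regular source
    for k in range(4):
        print(k, round(two_part_codelength(bits, k)))   # minimized at k = 1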
Universal Measures of
Similarity
Normalized Compression Distance:
NCD(x,y) = ( C(xy) − min{C(x), C(y)} ) / max{C(x), C(y)}, where C(·)
denotes compressed length.
Since KC is uncomputable, we estimate it using gzip (this gives an
overestimate; a short code sketch follows the numbers below).
Grabbing random Wikipedia articles in 4 languages (Portuguese,
Spanish, Dutch, German), we compute NCD and find the
following distances:
NL1-NL2 0.9062
PT-ES .9774, .9698
NL-DE .9801, .9812
PT-NL .9872, .9871
PT-DE .9965, .9957
ES-NL .9917, .9961
ES-DE .9975, 1.000
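The sketch promised above: a minimal NCD computation with Python's gzip module (the file names are hypothetical placeholders, not the actual articles used here):

    import gzip

    def C(s: bytes) -> int:
        """Compressed size of s in bytes (gzip as the real-world compressor)."""
        return len(gzip.compress(s, 9))

    def ncd(x: bytes, y: bytes) -> float:
        cx, cy = C(x), C(y)
        return (C(x + y) - min(cx, cy)) / max(cx, cy)

    # Hypothetical file names -- substitute any two texts you want to compare.
    pt = open("pt_article.txt", "rb").read()
    es = open("es_article.txt", "rb").read()
    print(ncd(pt, es))   # closer to 0 = more similar, closer to 1 = unrelated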
Clustering by Compression
Clustering music
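A hedged sketch of the clustering step (assumes scipy and matplotlib are available; average-linkage hierarchical clustering is one reasonable choice and not necessarily what the original clustering-by-compression work used):

    import gzip
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.cluster.hierarchy import dendrogram, linkage
    from scipy.spatial.distance import squareform

    def ncd(x, y):
        c = lambda s: len(gzip.compress(s, 9))
        return (c(x + y) - min(c(x), c(y))) / max(c(x), c(y))

    def cluster_by_compression(docs, labels):
        # Pairwise NCD matrix (symmetric, zero diagonal).
        n = len(docs)
        D = np.zeros((n, n))
        for i in range(n):
            for j in range(i + 1, n):
                D[i, j] = D[j, i] = ncd(docs[i], docs[j])
        # Standard average-linkage hierarchical clustering on those distances.
        Z = linkage(squareform(D), method="average")
        dendrogram(Z, labels=labels)
        plt.show()

    # e.g. cluster_by_compression([open(f, "rb").read() for f in files], files)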
Warnings!
Warning: gzip uses superficial features! It won't capture
deeper similarities, since that would require
lots of data and computing time.
This is universal learning: an ideal compression
algorithm will find/exploit any pattern. Of course, real
compressors are nowhere near ideal, so this method can
really suck in practice!