Stoll - Introduction To Real Analysis

The document is the third edition of 'Introduction to Real Analysis' by Manfred Stoll, published in 2021. The book covers fundamental topics in real analysis, including real numbers, topology, sequences, limits, differentiation, and integration. It is part of the Textbooks in Mathematics series and includes exercises and supplemental readings.
Introduction to

Real Analysis
Textbooks in Mathematics
Series editors:
Al Boggess, Kenneth H. Rosen
Ordinary Differential Equations
An Introduction to the Fundamentals, Second Edition
Kenneth B. Howell
Differential Geometry of Manifolds, Second Edition
Stephen Lovett
The Shape of Space, Third Edition
Jeffrey R. Weeks
Differential Equations
A Modern Approach with Wavelets
Steven Krantz
Advanced Calculus
Theory and Practice, Second Edition
John Srdjan Petrovic
Advanced Problem Solving Using Maple
Applied Mathematics, Operations Research, Business Analytics, and Decision Analysis
William P Fox, William Bauldry
Nonlinear Optimization
Models and Applications
William P. Fox
Linear Algebra
James R. Kirkwood, Bessie H. Kirkwood
Train Your Brain
Challenging Yet Elementary Mathematics
Bogumil Kaminski, Pawel Pralat
Real Analysis
With Proof Strategies
Daniel W. Cunningham
Introduction to Real Analysis, 3rd Edition
Manfred Stoll

https://www.routledge.com/Textbooks-in-Mathematics/book-series/CANDHTEXBOOMTH
Introduction to
Real Analysis
Third edition

Manfred Stoll
Third edition published 2021
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742

and by CRC Press


2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

© 2021 Taylor & Francis Group, LLC


First edition published by Pearson 1997
Second edition published by Pearson 2001

Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences of
their use. The authors and publishers have attempted to trace the copyright holders of all material
reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged, please write and
let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access www.copyright.com
or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA
01923, 978-750-8400. For works that are not available on CCC please contact
mpkbookspermissions@tandf.co.uk

Trademark notice: Product or corporate names may be trademarks or registered trademarks and are
used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Library of Congress Control Number: 2020943349

ISBN: 978-0-367-48688-4 (hbk)


ISBN: 978-1-003-13735-1 (ebk)

Typeset in CMR10
by KnowledgeWorks Global Ltd.
Contents

Preface to the Third Edition
Preface to the First Edition
To the Student
Acknowledgments

1 The Real Numbers
1.1 Sets and Operations on Sets
1.2 Functions
1.3 Mathematical Induction
1.4 The Least Upper Bound Property
1.5 Consequences of the Least Upper Bound Property
1.6 Binary and Ternary Expansions
1.7 Countable and Uncountable Sets
Notes
Miscellaneous Exercises
Supplemental Reading

2 Topology of the Real Line
2.1 Metric Spaces
2.2 Open and Closed Sets
2.3 Compact Sets
2.4 Compact Subsets of R
2.5 The Cantor Set
Notes
Miscellaneous Exercises
Supplemental Reading

3 Sequences of Real Numbers
3.1 Convergent Sequences
3.2 Sequences of Real Numbers
3.3 Monotone Sequences
3.4 Subsequences and the Bolzano-Weierstrass Theorem
3.5 Limit Superior and Inferior of a Sequence
3.6 Cauchy Sequences
3.7 Series of Real Numbers
Notes
Miscellaneous Exercises
Supplemental Reading

4 Limits and Continuity
4.1 Limit of a Function
4.2 Continuous Functions
4.3 Uniform Continuity
4.4 Monotone Functions and Discontinuities
Notes
Miscellaneous Exercises
Supplemental Reading

5 Differentiation
5.1 The Derivative
5.2 The Mean Value Theorem
5.3 L’Hospital’s Rule
5.4 Newton’s Method
Notes
Miscellaneous Exercises
Supplemental Reading

6 Integration
6.1 The Riemann Integral
6.2 Properties of the Riemann Integral
6.3 Fundamental Theorem of Calculus
6.4 Improper Riemann Integrals
6.5 The Riemann-Stieltjes Integral
6.6 Numerical Methods
6.7 Proof of Lebesgue’s Theorem
Notes
Miscellaneous Exercises
Supplemental Reading

7 Series of Real Numbers
7.1 Convergence Tests
7.2 The Dirichlet Test
7.3 Absolute and Conditional Convergence
7.4 Square Summable Sequences
Notes
Miscellaneous Exercises
Supplemental Reading

8 Sequences and Series of Functions
8.1 Pointwise Convergence and Interchange of Limits
8.2 Uniform Convergence
8.3 Uniform Convergence and Continuity
8.4 Uniform Convergence and Integration
8.5 Uniform Convergence and Differentiation
8.6 The Weierstrass Approximation Theorem
8.7 Power Series Expansions
8.8 The Gamma Function
Notes
Miscellaneous Exercises
Supplemental Reading

9 Fourier Series
9.1 Orthogonal Functions
9.2 Completeness and Parseval’s Equality
9.3 Trigonometric and Fourier Series
9.4 Convergence in the Mean of Fourier Series
9.5 Pointwise Convergence of Fourier Series
Notes
Miscellaneous Exercises
Supplemental Reading

10 Lebesgue Measure and Integration
10.1 Introduction to Measure
10.2 Measure of Open Sets: Compact Sets
10.3 Inner and Outer Measure: Measurable Sets
10.4 Properties of Measurable Sets
10.5 Measurable Functions
10.6 Lebesgue Integral of a Bounded Function
10.7 The General Lebesgue Integral
10.8 Square Integrable Functions
Notes
Miscellaneous Exercises
Supplemental Reading

Bibliography
Hints and Solutions
Index

Preface to the Third Edition

The major changes to the third edition involve the exposition of the topological
concepts required in the study of analysis. In the first two editions metric
spaces were scattered throughout the text. For example, the Notes section of
Chapter 3 contained the definitions of a metric space, ε-neighborhoods, as
well as open and compact sets. The Miscellaneous Exercises of this chapter
also contained several relevant exercises. The concept of a metric and norm,
as well as ε-neighborhoods, were also included in the text in Section 7.4.
In this third edition, I have decided to include the relevant topological
concepts in Chapter 2, beginning with a discussion of metric spaces in Section
2.1. The usual concepts of open and closed sets, as well as limit points of
a set are included in Section 2.2. While the emphasis is on the general, the
examples emphasize these topological concepts on the real line. Section 2.3
contains a brief introduction to compact sets and their properties that require
only the definition of compactness. Finally, the characterization of compact
subsets of R is included in Section 2.4. Chapter 3 now contains all the topics
previously included in Chapter 2, with the exception that the convergence
of a sequence is defined in metric spaces. Likewise, limits of functions and
continuity in Chapter 4 are also defined in metric spaces. Norms and normed
linear spaces are still introduced in Chapter 7, with examples of the usual
normed linear spaces in the remaining chapters.
Even with these changes, the emphasis is still on sequences of real numbers,
the compact subsets of R, as well as real-valued functions. The major theorems
remain unchanged. The main advantage of the revision is that it unifies the
subject matter, provides students with an introduction to metric spaces and
abstract topological concepts, and provides a better preparation for advanced
studies in analysis.
The third edition also includes additional exercises and expanded hints
and solutions. I have also attempted to correct some of the errors that were
present in the earlier editions. The Supplemental Readings sections have also
been updated. The one item that has been deleted is the appendix on Logic
and Proofs previously included in the second edition. In the author’s opinion
these are topics that are best covered in greater detail in a separate course.

Manfred Stoll

Preface to the First Edition

The subject of real analysis is one of the fundamental areas of mathematics,
and is the foundation for the study of many advanced topics, not only in
mathematics, but also in engineering and the physical sciences. A thorough
understanding of the concepts of real analysis has also become increasingly im-
portant for the study of advanced topics in economics and the social sciences.
Topics such as Fourier series, measure theory and integration, are fundamen-
tal in mathematics and physics as well as engineering, economics, and many
other areas.
Due to the increased importance of real analysis in many diverse subject
areas, the typical first semester course on this subject has a varied student
enrollment in terms of both ability and motivation. From my own experience,
the audience typically includes mathematics majors, for whom this course
represents the only rigorous treatment of analysis in their collegiate career,
and students who plan to pursue graduate study in mathematics. In addition,
there are mathematics education majors who need a strong background in
analysis in preparation for teaching high school calculus. Occasionally, the
enrollment includes graduate students in economics, engineering, physics, and
other areas, who need a thorough treatment of analysis in preparation for
additional graduate study either in mathematics or their own subject area.
In an ideal situation it would be desirable to offer separate courses for each
of these categories of students. Unfortunately, staffing and enrollment usually
make such choices impossible.
In the preparation of the text there were several goals I had in mind. The
first was to write a text suitable for a one-year sequence in real analysis at the
junior or senior level, providing a rigorous and comprehensive treatment of the
theoretical concepts of analysis. The topics chosen for inclusion are based on
my experience in teaching graduate courses in mathematics, and reflect what
I feel are minimal requirements for successful graduate study. I get to the least
upper bound property as quickly as possible, and emphasize this important
property in the text. For this reason, the algebraic properties of the rational
and real number systems are treated very informally, and the construction
of the real number system from the rational numbers is included only as
a miscellaneous exercise. I have attempted to keep the proofs as concise as
possible, and to let the subject matter progress in a natural manner. Topics or
sections that are not specifically required in subsequent chapters are indicated
by a footnote.


My second goal was to make the text understandable to the typical stu-
dent enrolled in the course, taking into consideration the variations in abilities,
background, and motivation. For this reason, chapters one through six have
been written with the intent to be accessible to the average student, while at
the same time challenging the more talented student through the exercises.
The basic topological concepts of open, closed, and compact sets, as well as
limits of sequences and functions are introduced for the real line only. However,
the proofs of many of the theorems, especially those involving topological
concepts, are presented in a manner that permits easy extensions to more abstract
settings. These chapters also include a large number of examples and more
routine and computational exercises. Chapters seven through ten assume that
the students have achieved some level of expertise in the subject. In these
chapters, function spaces are introduced and studied in greater detail. The
theorems, examples, and exercises require greater sophistication and mathe-
matical maturity for full understanding. From my own experiences, these are
not unrealistic expectations.
The book contains most of the standard topics one would expect to find
in an introductory text on real analysis—limits of sequences, limits of func-
tions, continuity, differentiation, integration, series, sequences and series of
functions, and power series. These topics are basic to the study of real analy-
sis and are included in most texts at this level. In addition I have also included
a number of topics that are not always included in comparable texts. For in-
stance, Chapter 6 contains a section on the Riemann-Stieltjes integral, and
a section on numerical methods. Chapter 7 also includes a section on square
summable sequences and a brief introduction to normed linear spaces. Both
of these concepts appear again in later chapters of the text.
In Chapter 8, to prove the Weierstrass approximation theorem, I use the
method of approximate identities. This exposes the student to a very impor-
tant technique in analysis that is used again in the chapter on Fourier series.
The study of Fourier series, and the representation of functions in terms of
series of orthogonal functions, has become increasingly important in many
diverse areas. The inclusion of Fourier series in the text allows the student
to gain some exposure to this important subject, without the necessity of
taking a full semester course on partial differential equations. In the final
chapter I have also included a detailed treatment of Lebesgue measure and
the Lebesgue integral. The approach to measure theory follows the original
method of Lebesgue, using inner and outer measure. This provides an intuitive
and leisurely approach to this very important topic.
The exercises at the end of each section are intended to reinforce the
concepts of the section and to help the students gain experience in developing
their own proofs. Although the text contains some routine and computational
problems, many of the exercises are designed to make the students think about
the basic concepts of analysis and to challenge their creativity and logical
thinking. Solutions and hints to selected exercises are included at the end of
the text. These problems are marked by an asterisk (*).
At the end of each chapter I have also included a section of notes on the
chapter, miscellaneous exercises, and a supplemental reading list. The notes in
many cases provide historical comments on the development of the subject, or
discuss topics not included in the chapter. The miscellaneous exercises are intended
to extend the subject matter of the text or to cover topics that, although
important, are not covered in the chapter itself. The supplemental reading list
provides references to topics that relate to the subject under discussion. Some
of the references provide historical information; others provide alternate solu-
tions of results or interesting related problems. Most of the articles appear in
the American Mathematical Monthly or Mathematics Magazine, and should
easily be accessible for students’ reference.
To cover all the chapters in a one-year sequence is perhaps overly ambi-
tious. However, from my own experience in teaching the course, with a judi-
cious choice of topics it is possible to cover most of the text in two semesters.
A one-semester course should at a minimum include all or most of the first
five chapters, and part or all of Chapter 6 or Chapter 7. The latter chapter
can be taught independently of Chapter 6; the only dependence on Chapter
6 is the integral test, and this can be covered without a theoretical treatment
of Riemann integration. The remaining topics should be more than sufficient
for a full second semester. The only formal prerequisite for reading the text
is a standard three- or four-semester sequence in calculus. Even though an
occasional talented student has completed one semester of this course during
their sophomore year, some mathematical maturity is expected and the aver-
age student might be advised to take the course during their junior or senior
year.

Manfred Stoll
To the Student

The difference between a course on calculus and a course on real analysis
is analogous to the difference in the approach to the subject prior to the
nineteenth century and since that time. Most of the topics in calculus were
developed in the late seventeenth and eighteenth centuries by such promi-
nent mathematicians as Newton, Leibniz, Bernoulli, Euler, and many others.
Newton and Leibniz developed the differential and integral calculus; their suc-
cessors extended and applied the theory to many problems in mathematics
and the physical sciences. They had phenomenal insight into the problems,
and were extremely proficient and ingenious in deriving complex formulas.
What they lacked, however, were the tools to place the subject on a rigor-
ous mathematical foundation. This did not occur until the nineteenth century
with the contributions of Cauchy, Bolzano, Weierstrass, Cantor, and many
others.
In calculus the emphasis is primarily on developing expertise in compu-
tational techniques and applications. In real analysis, you will be expected
to understand the concepts and to develop the ability to prove results using
the definitions and previous theorems. Understanding the concept of a limit,
and proving results about limits, will be significantly more important than
computing limits. To accomplish this, it is essential that all definitions and
statements of theorems be learned precisely. Most of the proofs of the theo-
rems and solutions of the problems are logical consequences of the definitions
and previous results; some however do require ingenuity and creativity.
The text contains numerous examples and counter-examples to illustrate
the particular topics under discussion. These are included to show why certain
hypotheses are required, and to help develop a more thorough understanding
of the subject. It is crucial that you not only learn what is true, but that you
also have sufficient counter-examples at your disposal. I have included hints
and answers to selected exercises at the end of the text; these are indicated
by an asterisk (*). For some of the problems I have provided complete details;
for others I have provided brief hints, leaving the details to you. As always,
you are encouraged to first attempt the exercises and to look at the hints or
solutions only after repeated attempts have been unsuccessful.
At the end of each chapter I have included a supplemental reading list.
The journal articles or books are all related to the topics in the chapter.
Some provide historical information or extensions of the topics to more general
settings; others provide alternate solutions of results in the text, or solutions
of interesting related problems. All of the articles should be accessible in your
library. They are included to encourage you to develop the habit of looking
into the mathematical literature.
On reading the text you will inevitably encounter topics, formulas, or ex-
amples that may appear too technical and difficult to comprehend. Skip them
for the moment; there will be plenty for you to understand in what follows.
Upon later reading the section, you may be surprised that it is not nearly as
difficult as previously imagined. Concepts that initially appear difficult be-
come clearer once you develop a greater understanding of the subject. It is
important to keep in mind that many of the examples and topics that appear
difficult to you were most likely just as difficult to the mathematicians of the
era in which they first appeared.
The material in the text is self-contained and independent of the calculus.
I do not use any results from calculus in the definitions and development of
the subject matter. Occasionally, however, in the examples and exercises, I do
assume knowledge of the elementary functions and of notation and concepts
that should have been encountered elsewhere. These concepts will be defined
carefully at the appropriate place in the text.

Manfred Stoll
Acknowledgments

I would like to thank the students at the University of South Carolina who have
learned this material from me, or my colleagues, from preliminary versions of
this text. Your criticisms, comments, and suggestions were appreciated. I am
also indebted to those colleagues, especially the late Jeong Yang, who agreed
to use the manuscript in their courses.
Special thanks are also due to the many reviewers who examined the
manuscripts for the first and second editions and provided constructive criti-
cisms and suggestions for improvements. I would also like to thank the many
readers who over the years have informed me of errors in the text. Hopefully
all of the errors of the first and second editions have been corrected.
Finally, I would like to thank the staff at Addison-Wesley for their assis-
tance in the publication of the first two editions. I would also like to thank
Bob Ross of CRC Press for encouraging me to prepare a third edition, and
his staff for their assistance in the preparation of the third edition.

Manfred Stoll

1
The Real Numbers

The key to understanding many of the fundamental concepts of calculus, such
as limits, continuity, and the integral, is the least upper bound property of the
real number system R. As we all know, the rational number system contains
gaps. For example, there does not exist a rational number r such that $r^2 = 2$,
i.e., $\sqrt{2}$ is irrational. The fact that the rational numbers do contain gaps makes
them inadequate for any meaningful discussion of the above concepts.
The standard argument used in proving that the equation $r^2 = 2$ does not
have a solution in the rational numbers goes as follows: Suppose that there
exists a rational number r such that $r^2 = 2$. Write $r = m/n$ where m, n are
integers which are not both even. Thus $m^2 = 2n^2$. Therefore $m^2$ is even,
and hence m itself must be even. But then $m^2$, and hence also $2n^2$, are both
divisible by 4. Therefore $n^2$ is even, and as a consequence n is also even. This
however contradicts our assumption that not both m and n are even. The
method of proof used in this example is proof by contradiction; namely, we
assume the negation of the conclusion and arrive at a logical contradiction.
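The chain of deductions in this argument can be written out step by step. The following is a sketch in LaTeX; the integer $k$ is a name introduced here for illustration and does not appear in the text.

```latex
\begin{align*}
r = \frac{m}{n},\quad r^2 = 2
  &\implies m^2 = 2n^2 && \text{so $m^2$ is even, hence $m$ is even} \\
m = 2k
  &\implies 4k^2 = 2n^2 && \text{substituting into $m^2 = 2n^2$} \\
  &\implies n^2 = 2k^2 && \text{so $n^2$ is even, hence $n$ is even}
\end{align*}
```

Both m and n turn out to be even, which contradicts the choice of m and n as not both even.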
The above argument shows that there does not exist a rational number
r such that $r^2 = 2$. This argument was known to Pythagoras (around 500
B.C.), and even the Greek mathematicians of this era noted that the straight
line contains many more points than the rational numbers. It was not un-
til the nineteenth century, however, when mathematicians became concerned
with putting calculus on a firm mathematical footing, that the development
of the real number system was accomplished. The construction of the real
number system is attributed to Richard Dedekind (1831–1916) and Georg
Cantor (1845–1918), both of whom published their results independently in
1872. Dedekind’s aim was the construction of a number system, with the same
completeness as the real line, using only the basic postulates of the integers
and the principles of set theory. Instead of constructing the real numbers, we
will assume their existence and examine the least upper bound property. As
we will see, this property is the key to many basic facts about the real numbers
which are usually taken for granted in the study of calculus.
In Chapter 1 we will assume a basic understanding of the concept of a set
and also of both the rational and real number systems. In Section 4 we will
briefly review the algebraic and order properties of both the rational and real
number systems and discuss the least upper bound property. By example we
will show that this property fails for the rational numbers. In the subsequent
two sections we will prove several elementary consequences of the least upper
bound property. In Section 7 we define the notion of a countable set, and
consider some of the basic properties of countable sets. Among the key results
of this section are that the rational numbers are countable, whereas the real
numbers are not.

1.1 Sets and Operations on Sets


Sets are constantly encountered in mathematics. One speaks of sets of points,
collections of real numbers, and families of functions. A set is conceived simply
as a collection of definable objects. The words set, collection, and family
are all synonymous. The notation x ∈ A means that x is an element of the
set A; the notation x ∉ A means that x is not an element of the set A. The
set containing no elements is called the empty set and will be denoted by ∅.
A set can be described by listing its elements, usually within braces { }.
For example,
A = {−1, 2, 5, 4}
describes the set consisting of the numbers −1, 2, 4, and 5. More generally,
a set A may be defined as the collection of all elements x in some larger
collection satisfying a given property. Thus the notation
A = {x : P (x)}
defines A to be the set of all objects x having the property P (x). This is usually
read as “A equals the set of all elements x such that P (x).” For example, if x
ranges over all real numbers, the set A defined by
A = {x : 1 < x < 5}
is the set of all real numbers which lie between 1 and 5. For this example,
3.75 ∈ A whereas 5 ∉ A. We will also use the notation A = {x ∈ X : P (x)}
to indicate that only those x which are elements of X are being considered.
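Set-builder notation has a close analogue in programming. The following is a minimal Python sketch, assuming a finite set X of sample points as a stand-in for the real numbers (an assumption made only so that the example can run); the function name P mirrors the property P(x) of the text.

```python
# Set-builder notation A = {x in X : P(x)} written as a set comprehension.
# X is a finite stand-in for the real numbers (an assumption for illustration).
X = {k / 4 for k in range(0, 41)}  # the points 0, 0.25, 0.5, ..., 10.0


def P(x):
    """The property 1 < x < 5 from the example in the text."""
    return 1 < x < 5


A = {x for x in X if P(x)}

print(3.75 in A)  # membership test: is 3.75 an element of A?
print(5 in A)     # 5 fails the strict inequality x < 5
```

The comprehension only considers elements of X, which corresponds to the restricted notation A = {x ∈ X : P(x)}.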
Some basic sets that we will encounter throughout the text are the follow-
ing:
N = the set of natural numbers or positive integers = {1, 2, 3, ...},
Z = the set of all integers = {..., −2, −1, 0, 1, 2, ...},
Q = the set of rational numbers = {p/q : p, q ∈ Z, q ≠ 0}, and
R = the set of real numbers.
In addition we will occasionally also encounter the set {0, 1, 2, 3, ...} of non-
negative integers.
Real numbers which are not rational numbers are called irrational numbers.
Since many fractions can represent the same rational number, two rational
numbers $r_1 = p_1/q_1$ and $r_2 = p_2/q_2$ are equal if and only if $q_1 p_2 = p_1 q_2$.
Set-theoretically the rational numbers can be defined as sets of ordered pairs
of integers (m, n), n ≠ 0, where two ordered pairs $(p_1, q_1)$ and $(p_2, q_2)$ are
said to be equivalent (represent the same rational number) if $p_1 q_2 = p_2 q_1$. We
assume that the reader is familiar with the algebraic operations of addition
and multiplication of rational numbers.
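The cross-multiplication criterion for equality of rational numbers can be checked mechanically. The sketch below is illustrative only: the function name same_rational is hypothetical, and the comparison with Python's standard Fraction type is included as an independent check.

```python
from fractions import Fraction


def same_rational(p1, q1, p2, q2):
    """Return True if p1/q1 and p2/q2 represent the same rational number.

    Uses the cross-multiplication criterion p1*q2 == p2*q1.
    The function name is hypothetical, chosen for this sketch.
    """
    if q1 == 0 or q2 == 0:
        raise ValueError("denominators must be nonzero")
    return p1 * q2 == p2 * q1


print(same_rational(1, 2, 3, 6))  # compares 1/2 with 3/6
print(same_rational(1, 2, 2, 3))  # compares 1/2 with 2/3

# Fraction stores each value in lowest terms, so equivalent pairs compare equal.
print(Fraction(1, 2) == Fraction(3, 6))
```

Working with integer products avoids division entirely, so no rounding is involved.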
A set A is a subset of a set B or is contained in B, denoted A ⊂ B, if
every element of A is an element of B. The set A is a proper subset of B,
denoted A ⊊ B, if A is a subset of B but there is an element of B which is not
an element of A. Two sets A and B are equal, denoted A = B, if A ⊂ B and
B ⊂ A. By definition, the empty set ∅ is a subset of every set.

Set Operations
There are a number of elementary operations which may be performed on
sets. If A and B are sets, the union of A and B, denoted A ∪ B, is the set of
all elements that belong either to A or to B or to both A and B. Symbolically,
A ∪ B = { x : x ∈ A or x ∈ B }.
The intersection of A and B, denoted A ∩ B, is the set of elements that
belong to both A and B; that is
A ∩ B = {x : x ∈ A and x ∈ B}.
Two sets A and B are disjoint if A ∩ B = ∅. The relative complement
B \ A, is the set of all elements which are in B but not in A. In set notation,
B \ A = {x : x ∈ B and x ∉ A}.
If the set A is a subset of some fixed set X, then X \ A is usually referred to
as the complement of A and is denoted by A^c. These basic set operations
are illustrated in Figure 1.1 with the shaded areas representing A ∪ B, A ∩ B,
and B \ A, respectively.

FIGURE 1.1
A ∪ B, A ∩ B, B \ A
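
For finite sets, the operations just defined correspond to the operators on Python's built-in set type. In this sketch the particular sets B and X are assumptions chosen for illustration; only A is taken from the earlier example.

```python
A = {-1, 2, 5, 4}
B = {2, 4, 6}               # illustrative choice, not from the text
X = set(range(-2, 8))       # a fixed set containing both A and B

union = A | B                   # A ∪ B
intersection = A & B            # A ∩ B
relative_complement = B - A     # B \ A
complement_of_A = X - A         # A^c relative to X

print(union, intersection, relative_complement, complement_of_A)
print(A.isdisjoint({0, 1}))     # disjoint means the intersection is empty
```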

There are several elementary set theoretic identities which will be encoun-
tered on numerous instances throughout the text. We state some of these in
the following theorem; others are given in the exercises.

THEOREM 1.1.1 If A, B, and C are sets, then


(a) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C),
(b) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C),
(c) C \ (A ∪ B) = (C \ A) ∩ (C \ B),
(d) C \ (A ∩ B) = (C \ A) ∪ (C \ B).

The identities (a) and (b) are referred to as the distributive laws,
whereas (c) and (d) are De Morgan’s laws. If A and B are subsets of a
set X, then De Morgan’s laws can also be expressed as

(A ∪ B)^c = A^c ∩ B^c,    (A ∩ B)^c = A^c ∪ B^c.

A more general version of both the distributive laws and De Morgan’s laws
will be stated in Theorems 1.7.12 and 1.7.13.
Proof. We will provide the proof of (a) to illustrate the method used in
proving these results. The proofs of (b) – (d) are relegated to the exercises
(Exercise 7). Suppose x ∈ A ∩ (B ∪ C). Then x ∈ A and x ∈ B ∪ C. Since
x ∈ B ∪ C, x ∈ B or x ∈ C. If x ∈ B, then x ∈ A ∩ B and therefore
x ∈ (A ∩ B) ∪ (A ∩ C). Similarly, if x ∈ C then x ∈ A ∩ C and thus again
x ∈ (A ∩ B) ∪ (A ∩ C). This then proves that

A ∩ (B ∪ C) ⊂ (A ∩ B) ∪ (A ∩ C).

To complete the proof, it still has to be shown that (A ∩ B) ∪ (A ∩ C) ⊂
A ∩ (B ∪ C), thereby proving equality. If x ∈ (A ∩ B) ∪ (A ∩ C), then by
definition x ∈ A ∩ B or x ∈ A ∩ C, or both. But if x ∈ A ∩ B then x ∈ A
and x ∈ B. Since x ∈ B we also have x ∈ B ∪ C. Therefore x ∈ A ∩ (B ∪ C).
Similarly, if x ∈ A ∩ C then x ∈ A ∩ (B ∪ C). 
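
A brute-force check is no substitute for the proof, but it can make the identities of Theorem 1.1.1 concrete. The sketch below tests all four identities for every choice of A, B, C among the subsets of a three-element set; the universe U and the helper names are assumptions made for illustration.

```python
from itertools import combinations


def subsets(universe):
    """Yield every subset of a finite set."""
    items = list(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield set(combo)


def theorem_1_1_1_holds(A, B, C):
    """Check identities (a)-(d) of Theorem 1.1.1 for one triple of sets."""
    return (
        A & (B | C) == (A & B) | (A & C)        # (a) distributive law
        and A | (B & C) == (A | B) & (A | C)    # (b) distributive law
        and C - (A | B) == (C - A) & (C - B)    # (c) De Morgan
        and C - (A & B) == (C - A) | (C - B)    # (d) De Morgan
    )


U = {1, 2, 3}
all_hold = all(
    theorem_1_1_1_holds(A, B, C)
    for A in subsets(U)
    for B in subsets(U)
    for C in subsets(U)
)
print(all_hold)
```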
If A is any set, the set of all subsets of A is denoted by P(A). The set P(A)
is sometimes referred to as the power set of A. For example, if A = {1, 2},
then
P(A) = {∅, {1}, {2}, {1, 2}}.
In this example, the set A has 2 elements and P(A) has 4 or 2^2 elements,
the elements in this instance being the subsets of A. If we take a set with
3 elements, then by listing the subsets of A it is easily seen that there are
exactly 2^3 subsets of A (Exercise 8). On the basis of these two examples we
are inclined to conjecture that if A contains n elements, then P(A) contains
2^n elements.
We now prove that this is indeed the case. We form subsets B of A by
deciding for each element of A whether to include it in B, or to leave it out.
Thus for each element of A there are exactly two possible choices. Since A has
n elements, there are exactly 2^n possible decisions, each decision corresponding
to a subset of A.
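
The counting argument can be mirrored in code. The sketch below builds P(A) in two ways: by enumerating subsets of each size, and by the include-or-exclude decision described in the paragraph above. Both function names are hypothetical.

```python
from itertools import chain, combinations


def power_set(A):
    """Return all subsets of the finite set A, grouped by size."""
    items = list(A)
    return [
        set(c)
        for c in chain.from_iterable(
            combinations(items, r) for r in range(len(items) + 1)
        )
    ]


def power_set_by_choices(A):
    """Build subsets by deciding, for each element, to leave it out or include it."""
    result = [set()]
    for a in A:
        # Every existing subset is kept (a left out) and copied with a added.
        result = result + [s | {a} for s in result]
    return result


for n in range(5):
    print(n, len(power_set(set(range(n)))), 2 ** n)
```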
Finally, if A and B are two sets, the Cartesian product of A and B,
denoted A × B, is defined as the set of all ordered pairs¹ (a, b), where the
first component a is from A and the second component b is from B, i.e.,

A × B = {(a, b) : a ∈ A, b ∈ B}.

For example, if A = {1, 2} and B = {−1, 2, 4} then

A × B = {(1, −1), (1, 2), (1, 4), (2, −1), (2, 2), (2, 4)}.

The Cartesian product of R with R is usually denoted by R^2 and is referred
to as the Euclidean plane. If A and B are subsets of R, then A × B is a subset
of R^2. The case where A and B are intervals is illustrated in Figure 1.2.

FIGURE 1.2
The Cartesian product A × B
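
For finite sets the Cartesian product can be computed with itertools.product from the Python standard library. This sketch reproduces the example A = {1, 2}, B = {−1, 2, 4} from the text, with ordered pairs represented as tuples.

```python
from itertools import product

A = {1, 2}
B = {-1, 2, 4}

# Each ordered pair (a, b) is a tuple: first component from A, second from B.
AxB = set(product(A, B))

print(sorted(AxB))
print(len(AxB) == len(A) * len(B))
```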

Exercises 1.1
1. Let A = {−1, 0, 1, 2}, B = {−2, 3}, and C = {−2, 0, 1, 5}.
a. Find each of the following: (A ∪ B), (B ∪ C), (A ∩ B), (B ∩ C),
A ∩ (B ∪ C), A \ B, C \ B, A \ (B ∪ C).
b. Find each of the following: (A × B), (C × B), (A × B) ∩ (C × B),
(A ∩ C) × B.
c. On the basis of your answer in (b), what might you conjecture about
(A ∩ C) × B for arbitrary sets A, B, C?

¹ A set theoretic definition of ordered pair can be given as follows: (a, b) = {{a}, {a, b}}.
With this definition two ordered pairs (a, b) and (c, d) are equal if and only if a = c and
b = d (Miscellaneous Exercise 1).

2. Let A = {x ∈ R : −1 ≤ x ≤ 5}, B = {x ∈ R : 0 ≤ x ≤ 3},
C = {x ∈ R : 2 ≤ x ≤ 4}.
*a. Find each of the following: A ∩ B, A ∩ Z, B ∩ C, A ∪ B, B ∪ C.
*b. Find each of the following: A × B, A × C, (A × B) ∪ (A × C).
c. Sketch the sets (A × B) ∪ (A × C) and A × (B ∪ C).
3. If A, B, and C are sets, prove that
a. A ∩ ∅ = ∅, A ∪ ∅ = A, b. A ∩ A = A, A ∪ A = A,
c. A ∩ B = B ∩ A, A ∪ B = B ∪ A.
4. Prove the following associative laws for the set operations ∪ and ∩:
*a. A ∩ (B ∩ C) = (A ∩ B) ∩ C, b. A ∪ (B ∪ C) = (A ∪ B) ∪ C.
5. If A ⊂ B, prove that
a. A ∩ B = A, b. A ∪ B = B.
6. If A is a subset of X, prove that
a. A ∪ A^c = X, b. A ∩ A^c = ∅, c. (A^c)^c = A.
7. If A, B, and C are sets, prove that
*a. A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C), b. C \ (A ∪ B) = (C \ A) ∩ (C \ B),
*c. C \ (A ∩ B) = (C \ A) ∪ (C \ B), d. (A ∪ B) \ C = (A \ C) ∪ (B \ C).
8. *Verify that a set with three elements has eight subsets.
9. If A and B are subsets of a set X prove that
a. A \ B = A ∩ B^c,
b. A ∩ B and A \ B are disjoint and that A = (A ∩ B) ∪ (A \ B).
10. True or False. Either prove true for all sets A, B and C, or provide an
example to show that the result is false.
a. (A ∪ B) \ A = B.
b. (A ∪ B) \ (A ∩ B) = (A \ B) ∪ (B \ A).
c. (A ∩ B) ∪ (B ∩ C) ∪ (A ∩ C) = A ∩ B ∩ C.
d. (A ∩ B) \ C = A ∩ (B \ C).
11. *Prove that A × (B1 ∪ B2 ) = (A × B1 ) ∪ (A × B2 ).
12. Suppose A, C are subsets of X and B, D are subsets of Y . Prove that
(A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D).

1.2 Functions
We begin this section with the fundamental concept of a function. In many
texts a function or a mapping f from a set A to a set B is described as a rule
that assigns to each element x ∈ A a unique element y ∈ B. This is generally
expressed by writing y = f(x) to denote the value of the function f at x.
The difficulty with this “definition” is that the terms “rule” and “assigns” are
vague and difficult to define. Consequently we will define “function” strictly
in terms of sets, using the notation and concepts introduced in the preceding
section.
The motivation for the following definition is to think of the graph of a
function; namely the set of ordered pairs (x, y) where y is given by the “rule”
that defines the function.

FIGURE 1.3
A function as a graph

DEFINITION 1.2.1 Let A and B be any two sets. A function f from A
into B is a subset of A × B with the property that each x ∈ A is the first
component of precisely one ordered pair (x, y) ∈ f; that is, for every x ∈ A
there exists y ∈ B such that (x, y) ∈ f, and if (x, y) and (x, y′) are elements
of f, then y = y′. The set A is called the domain of f, denoted Dom f. The
range of f , denoted Range f , is defined by

Range f = {y ∈ B : (x, y) ∈ f for some x ∈ A}.

If Range f = B, then the function f is said to be onto B. (See Figure 1.3)

If f is a function from A to B and (x, y) ∈ f, then the element y is called
the value of the function f at x and we write

y = f(x) or f : x → y.

FIGURE 1.4
A function as a mapping

FIGURE 1.5
The function of Example 1.2.2(a)

We also use the notation f : A → B to indicate that f is a function from (or
on) A into (or to) B, and sometimes say that f maps A to B or is a mapping
of A to B. If we think of a function f : A → B as mapping an element x ∈ A
to an element y = f (x) in B, then this is often represented by a diagram as
in Figure 1.4. If f : A → R, then f is said to be a real-valued function on
A.
A function f from A to B is not just any subset of A × B. The key phrase
in Definition 1.2.1 is that each x ∈ A is the first component of precisely
one ordered pair (x, y) ∈ f . To better understand the notion of function we
consider several examples.
EXAMPLES 1.2.2 (a) Let A = {−3, −2, −1, 0, 1} and B = Z. Consider
the subset f of A × B given by
f = {(−3, 2), (−2, −2), (−1, 4), (0, −6), (1, 4)}.
Since each x ∈ A belongs to precisely one ordered pair (x, y) ∈ f , f is a
function from A into B with Range f = {−6, −2, 2, 4}. Figure 1.5 indicates
what f does to each element of A. Even though the element 4 in B is the
second component of the two distinct ordered pairs (−1, 4) and (1, 4), this
does not contradict the definition of function.
(b) Let A and B be as in (a) and consider g defined by
g = {(−3, 2), (−2, 4), (−2, 1), (−1, 4), (0, 5), (1, 1)}.
Since both (−2, 4) and (−2, 1) are two elements of g with the same first com-
ponent, g is not a function from A to B.
(c) In this example, we let A = B = R, and let h be defined by

h = {(x, y) ∈ R × R : y = x^2 + 2}.

This function is described by the equation y = x^2 + 2. The standard way of
expressing this function is as

h(x) = x^2 + 2, Dom h = R.

This specifies both the equation defining the function and the domain of the
function. For this example, Range h = {y ∈ R : y ≥ 2}².
(d) Let A be any nonempty set and let

i = {(x, x) : x ∈ A}.

Then i is a function from A onto A whose value at each x ∈ A is x; i.e.,
i(x) = x. The function i is called the identity function on A.
(e) Let A and B be two nonempty sets and consider the projection
function p from A × B to A defined by

p = {((a, b), a) : (a, b) ∈ A × B}.

In this example, Dom p = A × B. Since p : (a, b) → a, we denote this simply
by p(a, b) = a. For example, if A = {1, 2, 3} and B = {−1, 1}, then A ×
B = {(1, −1), (1, 1), (2, −1), (2, 1), (3, −1), (3, 1)} and p(1, −1) = 1, p(1, 1) =
1, p(2, −1) = 2, etc. 
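
Definition 1.2.1 can be checked mechanically when the domain is finite. The following is a minimal sketch, not a general-purpose tool: the helper names is_function and range_of are hypothetical, and membership in B = Z is modelled by the predicate is_integer, an assumption made because Z cannot be stored as a finite set. The sets f and g are those of Examples 1.2.2(a) and (b).

```python
def is_function(pairs, A, in_B):
    """Check Definition 1.2.1 for a finite domain A.

    pairs is a set of ordered pairs; in_B(y) tests membership of y in B.
    """
    pairs = set(pairs)
    # First, pairs must be a subset of A × B.
    if any(x not in A or not in_B(y) for x, y in pairs):
        return False
    # Each x in A must be the first component of precisely one pair.
    firsts = [x for x, _ in pairs]
    return all(firsts.count(x) == 1 for x in A)


def range_of(pairs):
    """Return the set of second components."""
    return {y for _, y in pairs}


def is_integer(y):
    return isinstance(y, int)


A = {-3, -2, -1, 0, 1}
f = {(-3, 2), (-2, -2), (-1, 4), (0, -6), (1, 4)}
g = {(-3, 2), (-2, 4), (-2, 1), (-1, 4), (0, 5), (1, 1)}

print(is_function(f, A, is_integer), range_of(f))
print(is_function(g, A, is_integer))   # -2 appears as a first component twice
```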

As was indicated in (c) above, if our function h is given by an equation
such as y = x^2 + 2 we will simply write h(x) = x^2 + 2, Dom h = R, to denote
the function h. It should be emphasized however that an equation such as
h(x) = x^2 + 2 by itself does not define a function; the domain of h must also
be specified. Thus

h(x) = x^2 + 2, Dom h = R,

and

g(x) = x^2 + 2, Dom g = {x ∈ R : −1 ≤ x ≤ 2},

define two different functions.

² Since x^2 ≥ 0 for all x ∈ R, the range of h is a subset of {y ∈ R : y ≥ 2}. To obtain
equality, we require that for every y > 2 there exists an x ∈ R such that x^2 + 2 = y. The
existence of such an x will follow as a consequence of Example 1.4.6.

Image and Inverse Image

DEFINITION 1.2.3 Let f be a function from A into B. If E ⊂ A, then
f(E), the image of E under f, is defined by

f(E) = {f(x) : x ∈ E}.

If H ⊂ B, the inverse image of H, denoted f^{-1}(H), is defined by

f^{-1}(H) = {x ∈ A : f(x) ∈ H}.

If H contains a single element of B, i.e., H = {y}, we will write f^{-1}(y)
instead of f^{-1}({y}). Thus for y ∈ B,

f^{-1}(y) = {x ∈ A : f(x) = y}.

It is important to keep in mind that for E ⊂ A, f(E) denotes a subset of
B, while for H ⊂ B, f^{-1}(H) describes a subset of the domain A. It should
be clear that f (A) = Range f , and that f is onto B if and only if f (A) = B.
To illustrate the notions of image and inverse image of a set we consider the
following examples.

EXAMPLES 1.2.4 (a) As in Example 1.2.2 let A = {−3, −2, −1, 0, 1},
B = Z, and f : A → Z the function given by

f = {(−3, 2), (−2, −2), (−1, 4), (0, −6), (1, 4)}.

Consider the subset E = {−1, 0, 1} of A. Then

f (E) = {f (−1), f (0), f (1)} = {−6, 4}.

If H = {0, 1, 2, 3, 4}, then

f^{-1}(H) = {x ∈ A : f(x) ∈ H} = {−3, −1, 1}.

Since both f(−1) and f(1) are equal to 4, f^{-1}(4) = {−1, 1}. On the other
hand, since (x, 0) ∉ f for any x ∈ A, f^{-1}(0) = ∅.
(b) Consider the function g : Z → Z given by g(x) = x^2, and let E =
{−1, −2, −3, ...}. Then

g(E) = {(−n)^2 : n ∈ N} = {1, 4, 9, ...}.

On the other hand,

g^{-1}(g(E)) = Z \ {0}.

For this example E ⊊ g^{-1}(g(E)).
(c) Let h be the function defined by h(x) = 2x + 3, Dom h = R. If E =
{x ∈ R : −1 ≤ x ≤ 2}, then

h(E) = {2x + 3 : −1 ≤ x ≤ 2} = {y ∈ R : 1 ≤ y ≤ 7}.


For the set E we also have

h^{-1}(E) = {x ∈ R : 2x + 3 ∈ E} = {x : −2 ≤ x ≤ −1/2}.

For each y ∈ R, x ∈ h^{-1}(y) if and only if 2x + 3 = y, which upon solving
for x gives x = (1/2)(y − 3). Thus for each y ∈ R, h^{-1}(y) = {(1/2)(y − 3)}. Since
h^{-1}(y) ≠ ∅ for each y ∈ R, the function h maps R onto R.
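
The image and inverse image of a set are easy to compute when the function has a finite domain. In the sketch below a function is stored as a Python dictionary mapping x to f(x); the helper names image and preimage are hypothetical, and the function g(x) = x^2 is restricted to a finite window of Z, which is an assumption made so that the sets can be listed.

```python
def image(f, E):
    """f(E) = {f(x) : x in E} for a function stored as a dict."""
    return {f[x] for x in E}


def preimage(f, H):
    """f^{-1}(H) = {x in Dom f : f(x) in H}."""
    return {x for x in f if f[x] in H}


# The function f of Example 1.2.4(a).
f = {-3: 2, -2: -2, -1: 4, 0: -6, 1: 4}

print(image(f, {-1, 0, 1}))
print(preimage(f, {0, 1, 2, 3, 4}))
print(preimage(f, {4}), preimage(f, {0}))

# g(x) = x^2 on the window {-5, ..., 5} (a finite stand-in for Z).
g = {x: x * x for x in range(-5, 6)}
E = {-1, -2, -3, -4, -5}
print(E < preimage(g, image(g, E)))   # E is a proper subset of g^{-1}(g(E))
```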

The operations of finding the image or inverse image of a set usually pre-
serve the basic set operations of union and intersection. There is one important
exception which is presented in part (b) of the next theorem.

THEOREM 1.2.5 Let f be a function from A into B. If A1 and A2 are
subsets of A, then
(a) f (A1 ∪ A2 ) = f (A1 ) ∪ f (A2 ),
(b) f (A1 ∩ A2 ) ⊂ f (A1 ) ∩ f (A2 ).

Proof. To prove (a), let y be an element of f (A1 ∪A2 ). Then y = f (x) for some
x in A1 ∪A2 . Thus x ∈ A1 or x ∈ A2 . Suppose x ∈ A1 . Then y = f (x) ∈ f (A1 ).
Similarly, if x ∈ A2 , y ∈ f (A2 ). Therefore y ∈ f (A1 ) ∪ f (A2 ). Thus

f (A1 ∪ A2 ) ⊂ f (A1 ) ∪ f (A2 ).

Since it is clear that f (A1 ) and f (A2 ) are subsets of f (A1 ∪ A2 ), the reverse
inclusion also holds, thereby proving equality.
Since f (A1 ∩ A2 ) is a subset of both f (A1 ) and f (A2 ), the relation stated
in (b) is also true. 
To see that equality need not hold in (b), consider the function g(x) =
x^2, Dom g = Z, of Example 1.2.4(b). If A1 = {−1, −2, −3, ...} and A2 =
{1, 2, 3, ...}, then g(A1) = g(A2) = {1, 4, 9, ...}, but A1 ∩ A2 = ∅. Thus

g(A1 ∩ A2) = g(∅) = ∅ ≠ g(A1) ∩ g(A2) = {1, 4, 9, ...}.

THEOREM 1.2.6 Let f be a function from A to B. If B1 and B2 are subsets
of B, then
(a) f^{-1}(B1 ∪ B2) = f^{-1}(B1) ∪ f^{-1}(B2),
(b) f^{-1}(B1 ∩ B2) = f^{-1}(B1) ∩ f^{-1}(B2),
(c) f^{-1}(B \ B1) = A \ f^{-1}(B1).

Proof. The proof of Theorem 1.2.6 is left to the exercises (Exercise 8). 
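
The contrast between Theorems 1.2.5 and 1.2.6 can be seen on a small instance. This sketch is a spot check on one example, not a proof: it uses g(x) = x^2 on a finite window of Z, and the sets A1, A2, B1, B2 and the codomain B = {0, ..., 25} are assumptions chosen for illustration.

```python
def image(f, E):
    return {f[x] for x in E}


def preimage(f, H):
    return {x for x in f if f[x] in H}


g = {x: x * x for x in range(-5, 6)}   # g(x) = x^2 on {-5, ..., 5}
A = set(g)                             # the domain
B = set(range(0, 26))                  # a codomain containing every value

A1, A2 = {-1, -2, -3}, {1, 2, 3}
B1, B2 = {1, 4}, {4, 9}

# Theorem 1.2.5: equality for unions, only inclusion for intersections.
union_equal = image(g, A1 | A2) == image(g, A1) | image(g, A2)
intersection_proper = image(g, A1 & A2) < image(g, A1) & image(g, A2)

# Theorem 1.2.6: inverse images preserve all three operations.
pre_union = preimage(g, B1 | B2) == preimage(g, B1) | preimage(g, B2)
pre_intersection = preimage(g, B1 & B2) == preimage(g, B1) & preimage(g, B2)
pre_complement = preimage(g, B - B1) == A - preimage(g, B1)

print(union_equal, intersection_proper)
print(pre_union, pre_intersection, pre_complement)
```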

Inverse Function

DEFINITION 1.2.7 A function f from A into B is said to be one-to-one
if whenever x1 ≠ x2, then f(x1) ≠ f(x2).

Alternately, a function f is one-to-one if whenever (x1, y) and (x2, y) are
elements of f then x1 = x2. From the definition it follows that f is one-to-one
if and only if f^{-1}(y) consists of at most one element of A for every y ∈ B.
If f is onto B, then f^{-1}(y) ≠ ∅ for every y ∈ B. Thus if f is one-to-one and
onto B, then f^{-1}(y) consists of exactly one element x ∈ A and

g = {(y, x) ∈ B × A : f (x) = y}

defines a function from B to A. This leads to the following definition.

DEFINITION 1.2.8 If f is a one-to-one function from A onto B, let

f^{-1} = {(y, x) ∈ B × A : f(x) = y}.

The function f^{-1} from B onto A is called the inverse function of f.
Furthermore, for each y ∈ B,

x = f^{-1}(y) if and only if f(x) = y.

There is a subtle point that needs to be clarified. If f is any function from
A to B, then f^{-1}(y) (technically f^{-1}({y})) is defined for any y ∈ B as the set
of points x in A such that f(x) = y. However, if f is a one-to-one function of
A onto B, then f^{-1}(y) denotes the value of the inverse function f^{-1} at y ∈ B.
Thus it makes sense to write f^{-1}(y) = x whenever (y, x) ∈ f^{-1}. Also, if f is
a one-to-one function of A into B, then f^{-1} defined by

f^{-1} = {(y, x) : y ∈ Range f and f(x) = y}

is a function from Range f onto A.

FIGURE 1.6
The inverse function

EXAMPLES 1.2.9 (a) Let h be the function of Example 1.2.4(c); that is,
h(x) = 2x + 3, Dom h = R. The function h is clearly one-to-one and onto R
with
x = h^{-1}(y) = (1/2)(y − 3), Dom h^{-1} = R.

(b) Consider the function f defined by the equation y = x^2. If we take for
the domain of f all of R, then f is not a one-to-one function. However, if we
let
Dom f = A = {x ∈ R : x ≥ 0},
then f becomes a one-to-one mapping of A into A. To see that f is one-to-one,
let x1, x2 ∈ A with x1 ≠ x2. Suppose x1 < x2. Then (x1)^2 < (x2)^2, that is,
f(x1) ≠ f(x2). Therefore f is one-to-one. To show that f is onto A, we need
to show that for each y ∈ A, y > 0, there exists a positive real number x such
that
x^2 = y.
Intuitively we know that such an x exists; namely the square root of y. However,
a rigorous proof of the existence of such an x will require the least upper
bound property of the real numbers. In Example 1.4.6 we will prove that for
each y > 0 there exists a unique positive real number x such that x^2 = y.
The number x is called the square root of y and is denoted by √y. Thus the
inverse function of f is given by

f^{-1}(y) = √y, Dom f^{-1} = {y ∈ R : y ≥ 0}.
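
For a function with finite domain, the inverse function of Definition 1.2.8 is obtained by swapping the components of each ordered pair. The sketch below stores functions as dictionaries and restricts h(x) = 2x + 3 and the squaring function to a finite window of integers; that restriction and the helper names is_one_to_one and inverse are assumptions made for illustration.

```python
def is_one_to_one(f):
    """True when distinct inputs always give distinct values."""
    values = list(f.values())
    return len(values) == len(set(values))


def inverse(f):
    """Return f^{-1} as a dict from Range f onto Dom f."""
    if not is_one_to_one(f):
        raise ValueError("f is not one-to-one, so it has no inverse function")
    return {y: x for x, y in f.items()}


h = {x: 2 * x + 3 for x in range(-3, 4)}        # h(x) = 2x + 3
h_inv = inverse(h)
print(all(h_inv[y] == (y - 3) / 2 for y in h_inv))

square = {x: x * x for x in range(-3, 4)}       # not one-to-one on this domain
square_on_nonneg = {x: x * x for x in range(0, 4)}
print(is_one_to_one(square), is_one_to_one(square_on_nonneg))
```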

Composition of Functions
Suppose f is a function from A to B and g is a function from B to C. If
x ∈ A, then f (x) is an element of B, the domain of g. Consequently we can
apply the function g to f (x) to obtain the element g(f (x)) in C. This process,
illustrated in Figure 1.7, gives a new function h which maps x ∈ A to g(f (x))
in C.

DEFINITION 1.2.10 If f is a function from A to B and g is a function
from B to C, then the function g ◦ f : A → C defined by

g ◦ f = {(x, z) ∈ A × C : z = g(f (x))}

is called the composition of g with f .

If f is a one-to-one function from A into B, then it can be shown that
(f^{-1} ◦ f)(x) = x for all x ∈ A and that (f ◦ f^{-1})(y) = y for all y ∈ Range f
(Exercise 10). This is illustrated in (b) of the following example.

FIGURE 1.7
Composition of g with f


EXAMPLES 1.2.11 (a) If f(x) = √(1 + x) with Dom f = {x ∈ R : x ≥ −1}
and g(x) = x^2, Dom g = R, then

(g ◦ f)(x) = g(f(x)) = (√(1 + x))^2 = 1 + x, Dom(g ◦ f) = {x ∈ R : x ≥ −1}.

Even though the equation (g ◦ f)(x) = 1 + x is defined for all real numbers x, the
domain of the composite function g ◦ f is still only the set {x ∈ R : x ≥ −1}.
For this example, since Range g ⊂ Dom f, we can also find f ◦ g; namely,

(f ◦ g)(x) = f(g(x)) = √(1 + x^2), Dom(f ◦ g) = R.

(b) For the function f in (a), the inverse function f^{-1} is given by

f^{-1}(y) = y^2 − 1, Dom f^{-1} = Range f = {y ∈ R : y ≥ 0}.

Thus for x ∈ Dom f,

(f^{-1} ◦ f)(x) = f^{-1}(f(x)) = (f(x))^2 − 1 = (√(x + 1))^2 − 1 = x,

and for y ≥ 0,

(f ◦ f^{-1})(y) = f(f^{-1}(y)) = √((y^2 − 1) + 1) = y.
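
The role of the domain in a composition can be made explicit in code. In this sketch the functions f, g, and f_inv follow Examples 1.2.11, with domain restrictions enforced by raising an error; the helper compose and the floating-point comparison through math.isclose are assumptions of the illustration, since square roots are only computed approximately.

```python
import math


def compose(g, f):
    """Return g ∘ f, the function x -> g(f(x))."""
    return lambda x: g(f(x))


def f(x):
    if x < -1:
        raise ValueError("x is outside Dom f = {x : x >= -1}")
    return math.sqrt(1 + x)


def g(x):
    return x * x


def f_inv(y):
    if y < 0:
        raise ValueError("y is outside Dom f^{-1} = {y : y >= 0}")
    return y * y - 1


g_after_f = compose(g, f)
f_after_g = compose(f, g)

print(math.isclose(g_after_f(3), 1 + 3))
print(math.isclose(f_after_g(-2), math.sqrt(1 + (-2) ** 2)))
print(math.isclose(compose(f_inv, f)(3), 3))
print(math.isclose(compose(f, f_inv)(2.5), 2.5))
```

Calling g_after_f(-2) raises ValueError even though the expression 1 + x makes sense at x = −2, which matches the remark that the domain of g ◦ f is still {x ∈ R : x ≥ −1}.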

Exercises 1.2
1. Let A = {−1, 0, 1, 2} and B = N. Which of the following subsets of A × B
is a function from A into B?
a. f = {(−1, 2), (0, 3), (2, 5)}
*b. g = {(−1, 2), (0, 7), (1, −1), (1, 3), (2, 7)}
c. h = {(−1, 2), (0, 2), (1, 2), (2, −1)}
*d. k = {(x, y) : y = 2x + 3, x ∈ A}
2. *a. Let A = {(x, y) ∈ R × R : x^2 + y^2 = 1}. Is A a function? Explain
your answer.
b. Let B = {(x, y) ∈ R × R : x^2 + y^2 = 1, y ≥ 0}. Is B a function?
Explain your answer.
3. Let f : N → N be the function defined by f (n) = 2n − 1. Find f (E) and
f^{-1}(E) for each of the following subsets E of N.
*a. {1, 2, 3, 4} b. {1, 3, 5, 7} c. N
4. Let f = {(x, y) : x ∈ R, y = x^3 + 1}.
*a. Let A = {x : −1 ≤ x ≤ 2}. Find f(A) and f^{-1}(A).
b. Show that f is a one-to-one function of R onto R.
*c. Find the inverse function f^{-1}.
5. Let f, g mapping Z into Z be given by f (x) = x + 3 and g(x) = 2x.
a. Find (f ◦ g)(x) and (g ◦ f )(x).
*b. Find (f ◦ g)(N) and (g ◦ f )(N).
6. For each of the following real-valued functions, find the range of the
function f and determine whether f is one-to-one. If f is one-to-one, find
the inverse function f^{-1} and specify the domain of f^{-1}.
a. f(x) = 5x + 4, Dom f = R.
*b. f(x) = 3x − 2, Dom f = R.
c. f(x) = x/(x − 1), Dom f = {x ∈ R : 0 ≤ x < 1}.
d. f(x) = sin x, Dom f = {x ∈ R : 0 ≤ x ≤ π}.
*e. f(x, y) = x, Dom f = R^2.
*f. f(x) = 1/(x^2 + 1), Dom f = {x ∈ R : −1 ≤ x ≤ 1}.
g. f = {(x, x^2) : 0 ≤ x ≤ 1}.
7. Let A = {t ∈ R : 0 ≤ t < 2π} and B = R^2, and let f : A → B be defined
by f(t) = (cos t, sin t).
*a. What is the range of f?
*b. Find f^{-1} of each of the following points: (1, 0), (0, −1), (√2/2, √2/2), (0, 1).
c. Is the function f one-to-one?
8. Prove Theorem 1.2.6.
9. Let f : A → B and let F ⊂ A.
a. Prove that f (A) \ f (F ) ⊂ f (A \ F ).
b. Give an example for which f(A) \ f(F) ≠ f(A \ F).
10. a. If f is a one-to-one function from A into B, show that (f^{-1} ◦ f)(x) = x
for all x ∈ A and that (f ◦ f^{-1})(y) = y for all y ∈ Range f.
b. If g is a function from C into A and h = f ◦ g, show that g = f^{-1} ◦ h.
11. *Let f : A → B and g : B → A be functions satisfying (g ◦ f )(x) = x for
all x ∈ A. Show that f is a one-to-one function. Must f be onto B?
12. If f : A → B and g : B → C are one-to-one functions, show that
(g ◦ f)^{-1} = f^{-1} ◦ g^{-1} on Range (g ◦ f).

1.3 Mathematical Induction


Throughout the text we will on occasion need to prove a statement, identity,
or inequality involving the positive integer n. As an example, consider the
following identity. For each n ∈ N,

r + r^2 + ··· + r^n = (r − r^{n+1})/(1 − r), r ≠ 1.

Mathematical induction is a very useful tool in establishing that such an
identity is valid for all positive integers n.

THEOREM 1.3.1 (Principle of Mathematical Induction) For each
n ∈ N, let P(n) be a statement about the positive integer n. If
(a) P (1) is true, and
(b) P (k + 1) is true whenever P (k) is true,
then P (n) is true for all n ∈ N.

The proof of this theorem depends on the fact that the positive integers
are well-ordered; namely, every nonempty subset of N has a smallest ele-
ment. This statement is usually taken as a postulate or axiom for the positive
integers: we do so in this text. Since it will be used on several other occasions,
we state it both for completeness and emphasis.
WELL-ORDERING PRINCIPLE Every nonempty subset of N has a
smallest element.
The well-ordering principle can be restated as follows: If A ⊂ N, A ≠ ∅,
then there exists n ∈ A such that n ≤ k for all k ∈ A.
To prove Theorem 1.3.1 we will use the method of proof by contradiction.
Most theorems involve showing that a statement P implies the statement Q;
namely, if P is true, then Q is true. In a proof by contradiction one assumes
that P is true and Q is false, and then shows that these two assumptions lead
to a logical contradiction; namely show that some statement R is both true
and false.
Proof of Theorem 1.3.1 Assume that the hypotheses of Theorem 1.3.1 are
true, but that the conclusion is false; that is, there exists a positive integer n
such that the statement P(n) is false. Let

A = {k ∈ N : P(k) is false}.

By our assumption the set A is nonempty. Thus by the well-ordering principle
A has a smallest element k_o. Since P(1) is true, k_o > 1. Also, since k_o is the
smallest element of A, P(k_o − 1) is true. But then by hypothesis (b), P(k_o)
is also true, which is a contradiction. Consequently, P(n) must be true for all
n ∈ N.

EXAMPLES 1.3.2 We now provide two examples to illustrate the method
of proof by mathematical induction. The first example provides a proof of the
identity in the introduction to the section. An alternate method of proof will
be requested in the exercises (Exercise 7).
(a) To use mathematical induction, we let our statement P(n), n ∈ N, be
as follows:

r + ··· + r^n = (r − r^{n+1})/(1 − r), r ≠ 1.

When n = 1 we have

r = r(1 − r)/(1 − r) = (r − r^2)/(1 − r), provided r ≠ 1.

Thus the identity is valid for n = 1. Assume P(k) is true for k ≥ 1, i.e.,

r + ··· + r^k = (r − r^{k+1})/(1 − r), r ≠ 1.

We must now show that the statement P(k + 1) is true; that is

r + ··· + r^{k+1} = (r − r^{(k+1)+1})/(1 − r), r ≠ 1.

But

r + ··· + r^{k+1} = r + ··· + r^k + r^{k+1}

which by the induction hypothesis

= (r − r^{k+1})/(1 − r) + r^{k+1} = (r − r^{k+1} + (1 − r) r^{k+1})/(1 − r)
= (r − r^{k+2})/(1 − r), r ≠ 1.

Thus the identity is valid for k + 1, and hence by the principle of mathematical
induction for all n ∈ N.
(b) For our second example we use mathematical induction to prove
Bernoulli’s inequality. If h > −1, then

(1 + h)^n ≥ 1 + nh for all n ∈ N.

When n = 1, (1 + h)^1 = 1 + h. Thus since equality holds, the inequality is
certainly valid. Assume that the inequality is true when n = k, k ≥ 1. Then
for n = k + 1,

(1 + h)^{k+1} = (1 + h)^k (1 + h),

which by the induction hypothesis and the fact that (1 + h) > 0

≥ (1 + kh)(1 + h) = 1 + (k + 1)h + kh^2
≥ 1 + (k + 1)h.

Therefore the inequality holds for n = k + 1, and thus by the principle of
mathematical induction for all n ∈ N.
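
A numerical spot check does not replace the induction proofs, but it is a quick way to catch a misremembered formula. The sketch below tests the geometric sum identity and Bernoulli's inequality with exact rational arithmetic; the sample values of r and h, the range of n, and the function names are all assumptions chosen for illustration.

```python
from fractions import Fraction


def geometric_sum(r, n):
    """Left side of the identity: r + r^2 + ... + r^n."""
    return sum(r ** k for k in range(1, n + 1))


def closed_form(r, n):
    """Right side of the identity: (r - r^(n+1)) / (1 - r), for r != 1."""
    return (r - r ** (n + 1)) / (1 - r)


def bernoulli_holds(h, n):
    """Test (1 + h)^n >= 1 + n h for one pair (h, n)."""
    return (1 + h) ** n >= 1 + n * h


rs = [Fraction(1, 2), Fraction(-3, 4), Fraction(5, 3), Fraction(2)]
identity_ok = all(
    geometric_sum(r, n) == closed_form(r, n) for r in rs for n in range(1, 13)
)

hs = [Fraction(-9, 10), Fraction(-1, 2), Fraction(0), Fraction(1, 3), Fraction(4)]
bernoulli_ok = all(bernoulli_holds(h, n) for h in hs for n in range(1, 13))

print(identity_ok, bernoulli_ok)
print(bernoulli_holds(Fraction(-4), 3))   # h = -4 violates the hypothesis h > -1
```

The last line shows that some restriction on h is needed: for h = −4 and n = 3 the left side is −27 while the right side is −11.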

Although the statement of Theorem 1.3.1 starts with n = 1, the result is
still true if we start with any integer n_o ∈ Z. The modified principle of
mathematical induction is as follows: If for each n ∈ Z, n ≥ n_o, P(n) is a
statement about the integer n satisfying
(a’) P(n_o) is true, and
(b’) P(k + 1) is true whenever P(k) is true, k ≥ n_o,
then P(n) is true for all n ∈ Z, n ≥ n_o.
The proof of this follows from Theorem 1.3.1 by simply setting

Q(n) = P(n + n_o − 1), n ∈ N,

which is now a statement about the positive integer n.


Remark. In the principle of mathematical induction, the hypothesis that
P (1) be true is essential. For example, consider the statement P (n):

n + 1 = n, n ∈ N.

This is clearly false! However, if we assume that P (k) is true, then we also
obtain that P(k + 1) is true. Thus it is absolutely essential that P(n_o) be true
for at least one fixed value of n_o.
There is a second version of the principle of mathematical induction which
is also quite useful.

THEOREM 1.3.3 (Second Principle of Mathematical Induction)
For each n ∈ N, let P(n) be a statement about the positive integer n. If
(a) P(1) is true, and
(b) for k > 1, P(k) is true whenever P(j) is true for all positive integers
j < k,
then P(n) is true for all n ∈ N.

Proof. (Exercise 3).


Mathematical induction is also used in the recursive definition of func-
tions defined for the positive integers. In this procedure, we give an initial
value of the function f at n = 1, and then assuming that f has been defined
for all integers k = 1, ..., n, the value of f at n + 1 is given in terms of the
values of f at k, k ≤ n. This is illustrated by the following examples.
EXAMPLES 1.3.4 (a) As an example, consider the function f : N → N
defined by f (1) = 1 and f (n + 1) = nf (n), n ∈ N. The values of f for
n = 1, 2, 3, 4 are given as follows:
f (1) = 1, f (2) = 1, f (3) = 2f (2) = 2 · 1, f (4) = 3f (3) = 3 · 2 · 1.
Thus we conjecture that f (n) = (n − 1)!, where 0! is defined to be equal to
one, and for n ∈ N, n! (read n factorial) is defined as
n! = n · (n − 1) · · · 2 · 1.
This conjecture is certainly true when n = 1. Thus assume that it is true for
n = k, k ≥ 1, that is f (k) = (k − 1)!. Then for n = k + 1,
f (k + 1) = kf (k)

which by the induction hypothesis

= k · (k − 1)! = k!.
Therefore the identity holds for n = (k + 1), and thus by the principle of
mathematical induction for all n ∈ N.
(b) For our second example, consider the function f : N → R defined by
f(1) = 0, f(2) = 1/3, and for n ≥ 3 by f(n) = ((n − 1)/(n + 1)) f(n − 2). Computing the
values of f for n = 3, 4, 5, and 6, we have

f(3) = 0, f(4) = 1/5, f(5) = 0, f(6) = 1/7.

From these values we conjecture that

f(n) = 0, if n is odd,
f(n) = 1/(n + 1), if n is even.

To prove our conjecture we will use the second principle of mathematical
induction. Our conjecture is certainly true for n = 1, 2. Suppose n > 2, and
suppose our conjecture holds for all k < n. If n is odd, then so is (n − 2),
and thus by the induction hypothesis f(n − 2) = 0. Therefore f(n) = 0. On
the other hand if n is even, so is (n − 2). Thus by the induction hypothesis
f(n − 2) = 1/(n − 1). Therefore

f(n) = ((n − 1)/(n + 1)) f(n − 2) = ((n − 1)/(n + 1)) · 1/(n − 1) = 1/(n + 1).
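
Recursive definitions translate directly into code, which makes it easy to tabulate values and test a conjectured formula before attempting an induction proof. The sketch below implements the two recursions of Examples 1.3.4; the names f_a, f_b, and conjecture_b are hypothetical, and agreement on finitely many values is evidence for the conjecture, not a proof.

```python
from fractions import Fraction
from math import factorial


def f_a(n):
    """Example (a): f(1) = 1 and f(n + 1) = n f(n)."""
    value = 1
    for k in range(1, n):
        value = k * value   # passes from f(k) to f(k + 1)
    return value


def f_b(n):
    """Example (b): f(1) = 0, f(2) = 1/3, f(n) = ((n - 1)/(n + 1)) f(n - 2)."""
    if n == 1:
        return Fraction(0)
    if n == 2:
        return Fraction(1, 3)
    return Fraction(n - 1, n + 1) * f_b(n - 2)


def conjecture_b(n):
    """Conjectured closed form: 0 for odd n, 1/(n + 1) for even n."""
    return Fraction(0) if n % 2 == 1 else Fraction(1, n + 1)


print([f_a(n) for n in range(1, 7)])
print([f_b(n) for n in range(1, 7)])
print(all(f_a(n) == factorial(n - 1) for n in range(1, 15)))
print(all(f_b(n) == conjecture_b(n) for n in range(1, 40)))
```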

Exercises 1.3
1. Use mathematical induction to prove that each of the following identities
is valid for all n ∈ N.
a. 1 + 2 + 3 + ··· + n = n(n + 1)/2.
*b. 1 + 3 + 5 + ··· + (2n − 1) = n^2.
c. 1^2 + 2^2 + ··· + n^2 = n(n + 1)(2n + 1)/6.
*d. 1^3 + 2^3 + ··· + n^3 = [(1/2) n(n + 1)]^2.
e. 2 + 2^2 + 2^3 + ··· + 2^n = 2(2^n − 1).
*f. For x, y ∈ R, x^{n+1} − y^{n+1} = (x − y)(x^n + x^{n−1} y + ··· + y^n).
g. 1/(1·2) + 1/(2·3) + ··· + 1/(n(n + 1)) = n/(n + 1).
2. Use mathematical induction to establish the following inequalities for
n ∈ N.
*a. 2^n > n for all n ∈ N. b. 2^n > n^2 for all n ≥ 5.
*c. n! > 2^n for all n ≥ 4. *d. 1^3 + 2^3 + ··· + n^3 < (1/2) n^4 for all n ≥ 3.
3. Prove Theorem 1.3.3.
4. *Let f : N → N be defined by f (1) = 5, f (2) = 13, and for n ≥ 3,
f(n) = 2f(n − 2) + f(n − 1). Prove that f(n) = 3 · 2^n + (−1)^n for all
n ∈ N.
5. For each of the following functions with domain N, determine a formula
for f (n) and use mathematical induction to prove your conclusion.
a. f(1) = 1/2, and for n > 1, f(n) = (n − 1) f(n − 1) − 1/(n + 1).
*b. f(1) = 1, f(2) = 4, and for n > 2, f(n) = 2f(n − 1) − f(n − 2) + 2.
c. f(1) = 1, and for n > 1, f(n) = ((n + 1)/(3n)) f(n − 1).
*d. f(1) = 1, f(2) = 0, and for n > 2, f(n) = −f(n − 2)/(n(n − 1)).
e. For a1, a2 ∈ R arbitrary, let f(1) = a1, f(2) = a2, and for n > 2,
f(n) = −f(n − 2)/(n(n − 1)).
*f. For a1, a2 ∈ R arbitrary, let f(1) = a1, f(2) = a2, and for n > 2,
f(n) = ((n − 1)/(n + 1)) f(n − 2).
6. Let f : N → R be defined by f(1) = 1, f(2) = 2, and
f(n + 2) = (1/2)(f(n + 1) + f(n)).
Use Theorem 1.3.3 to prove that 1 ≤ f(n) ≤ 2 for all n ∈ N.
7. *Prove that r + r^2 + ··· + r^n = (r − r^{n+1})/(1 − r), r ≠ 1, n ∈ N, without using
mathematical induction.
8. *Use mathematical induction to prove the arithmetic-geometric
mean inequality: If a1, a2, ..., an, n ∈ N, are nonnegative real numbers,
then

a1 a2 ··· an ≤ ((a1 + a2 + ··· + an)/n)^n,

with equality if and only if a1 = a2 = ··· = an.

1.4 The Least Upper Bound Property


In this section, we will consider the concept of the least upper bound of a
set and introduce the least upper bound or supremum property of the real
numbers R. Prior to introducing these new ideas we briefly review the algebraic
and order properties of Q and R.
Both the rational numbers Q and the real numbers R are algebraic systems
known as fields. The key fact about a field which we need to know is that
it is a set F with two operations, addition (+) and multiplication (·), which
satisfy the following axioms:
1. If a, b ∈ F, then a + b ∈ F and a · b ∈ F.
2. The operations are commutative; that is, for all a, b ∈ F

a+b=b+a and a · b = b · a.

3. The operations are associative; that is, for all a, b, c ∈ F,

a + (b + c) = (a + b) + c and a · (b · c) = (a · b) · c.

4. There exists an element 0 ∈ F such that a + 0 = a for every a ∈ F.


5. Every a ∈ F has an additive inverse; that is, there exists an element
−a in F such that
a + (−a) = 0.

6. There exists an element 1 ∈ F with 1 ≠ 0 such that a · 1 = a for all
a ∈ F.
7. Every a ∈ F with a ≠ 0 has a multiplicative inverse; that is, there
exists an element a^{-1} in F such that

a · a^{-1} = 1.

8. The operation of multiplication is distributive over addition; that
is, for all a, b, c ∈ F,

a · (b + c) = a · b + a · c.

The element 0 is called the zero of F and the element 1 is called the unit
of F. For a ≠ 0, the element a^{-1} is customarily written as 1/a. Similarly
we write a − b instead of a + (−b), ab instead of a · b, and a/b instead of
a · b^{-1}.
The real numbers R contain a subset P known as the positive real num-
bers satisfying the following:
(O1) If a, b ∈ P, then a + b ∈ P and a · b ∈ P.
(O2) If a ∈ R then one and only one of the following holds:
a ∈ P, −a ∈ P, a = 0.
Properties (O1) and (O2) are called the order properties of R. Any field F
with a nonempty subset satisfying (O1) and (O2) above is called an ordered
field. For the real numbers we assume the existence of a positive set P. For
the rational numbers Q, the set of positive rational numbers is given by
P ∩ Q which can be proved to be equal to { p/q : p, q ∈ Z, q ≠ 0, pq ∈ N}.
Let a, b be elements of R. If a − b is positive, i.e., a − b ∈ P, then we write
a > b or b < a. In particular, the notation a > 0 (or 0 < a) means that a is a
positive element. Also, a ≤ b (or b ≥ a) if a < b or a = b.
The following useful results are immediate consequences of the order prop-
erties and the axioms for addition and multiplication. Let a, b, c be elements
of R.
(a) If a > b, then a + c > b + c.
(b) If a > b and c > 0, then ac > bc.
(c) If a > b and c < 0, then ac < bc.
(d) If a ≠ 0, then $a^2 > 0$.
(e) If a > 0, then 1/a > 0; if a < 0, then 1/a < 0.
To illustrate the method of proof, we provide the proof of (b). Suppose a > b;
i.e., a − b is positive. If c is positive, then by (O1) (a − b)c is positive. By the
distributive law,
(a − b)c = ac − bc.
Therefore ac − bc is positive; that is, ac > bc. The proofs of the other results
are left as exercises.

Upper Bound of a Set


We now turn our attention to the most important topic of this chapter; namely,
the least upper bound or supremum property of R. In Example 1.4.2(c) we
will show that this property fails for the rational numbers Q. First however,
we define the concept of an upper bound of a set.

DEFINITION 1.4.1 A subset E of R is bounded above if there exists


β ∈ R such that x ≤ β for every x ∈ E. Such a β is called an upper bound
of E.

The concepts bounded below and lower bound are defined similarly.
A set E is bounded if E is bounded both above and below. We now consider
several examples to illustrate these concepts.

EXAMPLES 1.4.2 (a) Let $A = \{0, \frac12, \frac23, \frac34, \ldots\} = \{1 - \frac1n : n = 1, 2, 3, \ldots\}$.


Clearly A is bounded below by any real number r ≤ 0 and also above by any
real number s ≥ 1.
(b) N = {1, 2, 3, ...}. This set is bounded below; e.g., 1 is a lower bound.
Our intuition tells us that N is not bounded above. It is obvious that there
is no positive integer n such that j ≤ n for all j ∈ N. However, what is not
so obvious is that there is no real number β such that j ≤ β for all j ∈ N. In
fact, given β ∈ R, the proof of the existence of a positive integer n > β will
require the least upper bound property of R (Theorem 1.5.1).
(c) B = {r ∈ Q : r > 0 and $r^2 < 2$}. Again it is clear that 0 is a lower
bound for B, and that B is bounded above; e.g., 2 is an upper bound for B.
What is not so obvious however is that B has no maximum. By the maximum
or largest element of B we mean an element α ∈ B such that p ≤ α for all
p ∈ B. Suppose p ∈ B. Define the rational number q by
\[ q = p + \frac{2 - p^2}{p + 2} = \frac{2p + 2}{p + 2}. \]
With q as defined, a simple computation gives
\[ q^2 - 2 = \frac{2(p^2 - 2)}{(p + 2)^2}. \]

Since $p^2 < 2$, q > p and $q^2 < 2$. Thus B has no largest element. Similarly, the
set
{p ∈ Q : p > 0, $p^2 > 2$}
has no minimum or smallest element. Intuitively, the largest element of B
would satisfy $p^2 = 2$. However, as was shown in the introduction, there is no
rational number p for which $p^2 = 2$.
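
The computation in part (c) can be checked with exact rational arithmetic. The following is a minimal Python sketch, not part of the text; the function name next_in_B is our own, and Fraction is used so that no rounding occurs. Starting from p = 1, each step produces a larger element of B.

```python
from fractions import Fraction

def next_in_B(p):
    """Given a positive rational p with p*p < 2, return q = (2p + 2)/(p + 2)."""
    return (2 * p + 2) / (p + 2)

p = Fraction(1)
for _ in range(5):
    q = next_in_B(p)
    # q is again in B and is strictly larger than p
    assert q > p and q * q < 2
    # the identity q^2 - 2 = 2(p^2 - 2)/(p + 2)^2 holds exactly
    assert q * q - 2 == 2 * (p * p - 2) / (p + 2) ** 2
    p = q
```

The loop only tests finitely many values of p; the general statement is the algebraic identity displayed above.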

Least Upper Bound of a Set

DEFINITION 1.4.3 Let E be a nonempty subset of R that is bounded above.


An element α ∈ R is called the least upper bound or supremum of E if
(i) α is an upper bound of E, and
(ii) if β ∈ R satisfies β < α, then β is not an upper bound of E.

Condition (ii) is equivalent to α ≤ β for all upper bounds β of E. Also by


(ii), the least upper bound of a set is unique. If the set E has a least upper
bound, we write
α = sup E

to denote that α is the supremum or least upper bound of E. The greatest


lower bound or infimum of a nonempty set E is defined similarly, and if it
exists, is denoted by inf E.
There is one important fact about the supremum of a set which will be
used repeatedly throughout the text. Due to its importance we state it as a
theorem.

THEOREM 1.4.4 Let A be a nonempty subset of R that is bounded above.


An upper bound α of A is the supremum of A if and only if for every β < α,
there exists an element x ∈ A such that

β < x ≤ α.

Proof. Suppose α = sup A. If β < α, then β is not an upper bound of A.


Thus there exists an element x in A such that x > β. On the other hand, since
α is an upper bound of A, x ≤ α.
Conversely, if α is an upper bound of A satisfying the stated condition,
then every β < α is not an upper bound of A. Thus α = sup A. 

EXAMPLES 1.4.5 In the following examples, let’s consider again the three
sets of the previous examples.
(a) As in Example 1.4.2(a), let $A = \{0, \frac12, \frac23, \frac34, \ldots\}$. Since 0 is a lower
bound of A and 0 ∈ A, inf A = 0. We now prove that sup A = 1. Since
$1 - \frac1n < 1$ for all n = 1, 2, . . . , 1 is an upper bound. To show that 1 = sup A
we need to show that if β ∈ R with β < 1, then β is not an upper bound of
A. Clearly if β ≤ 0, then β is not an upper bound of A. Suppose as in Figure
1.8, 0 < β < 1. Then our intuition tells us that there exists an integer $n_o$ such
that
\[ n_o > \frac{1}{1 - \beta}, \quad \text{or} \quad \beta < 1 - \frac{1}{n_o}. \]

FIGURE 1.8
Proof that sup A = 1 in Example 1.4.5(a)

But $1 - \frac{1}{n_o}$ ∈ A, and thus β is not an upper bound. Therefore sup A = 1.
The existence of such an integer $n_o$ will follow from Theorem 1.5.1. In this
example, inf A ∈ A but sup A ∉ A.
(b) For the set N, inf N = 1. Since N is not bounded above, N does not
have an upper bound in R.
(c) In this example, we prove that the supremum of the set

B = {r ∈ Q : r > 0 and $r^2 < 2$},

if it exists, is not an element of Q. Suppose α = sup B exists and is in Q.
Since α is rational, $\alpha^2 \ne 2$. Thus $\alpha^2 < 2$ or $\alpha^2 > 2$. But if $\alpha^2 < 2$, then α ∈ B, and
since B contains no largest element, there exists q ∈ B such that q > α. This
contradicts that α is an upper bound of B. Similarly, if $\alpha^2 > 2$, then there
exists a positive rational q < α such that $q^2 > 2$. But then q is an upper bound of B, which is
a contradiction of property (ii) of Definition 1.4.3. The least upper bound of
B in R is $\sqrt{2}$ (Section 1.5, Exercise 9), which we know is not rational.
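
The argument in part (a) can be illustrated numerically. The sketch below is not from the text: the function name witness_index is our own, β is assumed to be a rational number with 0 < β < 1, and the built-in floor function stands in for the integer $n_o$ whose existence the text defers to Theorem 1.5.1.

```python
from fractions import Fraction
import math

def witness_index(beta):
    """Return an integer n with n > 1/(1 - beta), for a rational 0 < beta < 1."""
    return math.floor(1 / (1 - beta)) + 1

for beta in (Fraction(1, 2), Fraction(9, 10), Fraction(999, 1000)):
    n = witness_index(beta)
    element = 1 - Fraction(1, n)   # an element of A
    # beta fails to be an upper bound of A, while 1 remains one
    assert beta < element < 1
```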

Least Upper Bound Property of R


The following property, also referred to as the completeness property of
R, distinguishes the real numbers from the rational numbers and forms the
foundation for many of the results in real analysis.
SUPREMUM OR LEAST UPPER BOUND PROPERTY OF R
Every nonempty subset of R that is bounded above has a supremum in R.
For our later convenience we restate the supremum property of R as the
infimum property of R.
INFIMUM OR GREATEST LOWER BOUND PROPERTY OF R.
Every nonempty subset of R that is bounded below has an infimum in R.
Although stated here as a property, which we will assume as a basic axiom
about R, the least upper bound property of R is really a theorem due to both
Cantor and Dedekind, both of whom published their results independently in
1872. Dedekind, in the paper “Stetigkeit und irrationale Zahlen” (Continuity
and irrational numbers), used algebraic techniques now known as the method
of Dedekind cuts to construct the real number system R from the rational
numbers Q. He proved that the system R contained a natural subset of positive
elements satisfying the order axioms (O1) and (O2), and furthermore, that
R also satisfied the least upper bound property. The books by Burrill and
by Spooner and Mentzger cited in the Supplemental Readings are devoted
to number systems. Both texts contain Dedekind’s construction of R. Cantor
on the other hand constructed R from Q using Cauchy sequences. In the
miscellaneous exercises of Chapter 3 we will provide some of the key steps of
this construction.

EXAMPLE 1.4.6 In this example, we show that for every positive real number
y > 0, there exists a unique positive real number α such that $\alpha^2 = y$; i.e.,
$\alpha = \sqrt{y}$. The uniqueness of α was established in Example 1.2.9(b).
We only prove the result for y > 1, leaving the case 0 < y ≤ 1 to the
exercises (Exercise 6). Let

C = {x ∈ R : x > 0 and $x^2 < y$}.

With y = 2, this set is similar to the set B of Example 1.4.5(c), except
that here we consider all positive real numbers x for which $x^2 < y$. Since
y > 1, 1 ∈ C and thus C is nonempty. Also since y > 1, $y^2 > y$, and thus y
is an upper bound of C. Hence by the least upper bound property, C has a
supremum in R. Let α = sup C. We now prove that $\alpha^2 = y$. To accomplish
this we show that the assumptions $\alpha^2 < y$ and $\alpha^2 > y$ lead to contradictions.
Thus $\alpha^2 = y$.
Define the real number β by

\[ \beta = \alpha + \frac{y - \alpha^2}{\alpha + y} = \frac{y(\alpha + 1)}{\alpha + y}. \tag{1} \]

Then
\[ \beta^2 - y = \frac{y(y - 1)(\alpha^2 - y)}{(\alpha + y)^2}. \tag{2} \]
If $\alpha^2 < y$, then by (1) β > α, and by (2) $\beta^2 < y$. This contradicts that α is an
upper bound for C. On the other hand, if $\alpha^2 > y$, then by (1) β < α and by
(2), $\beta^2 > y$. Thus if x ∈ R with x ≥ β, then $x^2 > y$. Therefore β is an upper
bound of C. This contradicts that α is the least upper bound of C. Since β
defined by (1) may not be rational, the same proof will not work for the set B
of Example 1.4.5(c). However, using Theorem 1.5.2 of the following section, it
is possible to also prove that sup B = $\sqrt{2}$.
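
Equations (1) and (2) are purely algebraic and can be verified exactly for sample values. The following Python sketch is an illustration only; the function name beta_of is our own, and rational values of α and y are assumed so that Fraction arithmetic is exact.

```python
from fractions import Fraction

def beta_of(alpha, y):
    """The number beta defined by equation (1)."""
    return y * (alpha + 1) / (alpha + y)

y = Fraction(2)
alpha = Fraction(1)            # alpha^2 < y, so alpha is too small
for _ in range(6):
    beta = beta_of(alpha, y)
    # equation (1): both expressions for beta agree
    assert beta == alpha + (y - alpha**2) / (alpha + y)
    # equation (2), checked exactly
    assert beta**2 - y == y * (y - 1) * (alpha**2 - y) / (alpha + y) ** 2
    # alpha^2 < y forces beta > alpha and beta^2 < y
    assert beta > alpha and beta**2 < y
    alpha = beta
```

Repeating the step produces an increasing sequence of elements of C, which is the mechanism behind the first contradiction in the example.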

For convenience, we extend the definition of supremum and infimum of a


subset E of R to include the case where E is not necessarily bounded above
or below.

DEFINITION 1.4.7 If E is a nonempty subset of R, we set

sup E = ∞ if E is not bounded above, and


inf E = −∞ if E is not bounded below.

For the empty set ∅, every element of R is an upper bound of ∅ . For


this reason the supremum of the empty set ∅ is taken to be −∞. Similarly,
inf ∅ = ∞. Also, for the symbols −∞ and ∞ we adopt the convention that
−∞ < x < ∞ for every x ∈ R.
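
For finite sets of real numbers these conventions are easy to mirror in code. The sketch below is ours, not the text's: the names ext_sup and ext_inf are hypothetical, the functions handle finite sets only (where the supremum is simply the maximum), and the floating-point values math.inf and -math.inf play the role of the symbols ∞ and −∞.

```python
import math

def ext_sup(E):
    """Supremum of a finite set of reals, with sup of the empty set = -infinity."""
    return max(E, default=-math.inf)

def ext_inf(E):
    """Infimum of a finite set of reals, with inf of the empty set = +infinity."""
    return min(E, default=math.inf)
```

With these conventions, ext_sup(set()) is smaller than ext_inf(set()), which is the only case in which the supremum of a set lies below its infimum.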

Intervals
Using the order properties of R, we can define certain subsets of R known as
intervals.

DEFINITION 1.4.8 For a, b ∈ R, a ≤ b, the open interval (a, b) is defined as
(a, b) = {x ∈ R : a < x < b},
whereas the closed interval [a, b] is defined as

[a, b] = {x ∈ R : a ≤ x ≤ b}.

In addition, we also have the half-open (half-closed) intervals

[a, b) = {x ∈ R : a ≤ x < b},


(a, b] = {x ∈ R : a < x ≤ b},

and the infinite intervals

(a, ∞) = {x ∈ R : a < x < ∞},


[a, ∞) = {x ∈ R : a ≤ x < ∞},

with analogous definitions for (−∞, b) and (−∞, b]. The intervals (a, ∞), (−∞, b)
and (−∞, ∞) = R are also referred to as open intervals, whereas the intervals
[a, ∞) and (−∞, b] are called closed intervals.

In the above, when b = a, (a, a) = ∅ and [a, a] = {a}. Although the empty
set ∅ and the singleton {a} do not fit our intuitive definition of an interval,
we will include them as the degenerate case of open and closed intervals re-
spectively. It should be noted that the intervals of the form (a, b), (a, b], [a, b),
and [a, b] with a, b ∈ R, a ≤ b, are all bounded subsets of R.
An alternate way of defining intervals without use of the adjectives open
and closed is as follows:

DEFINITION 1.4.9 A subset J of R is an interval if whenever x, y ∈ J


with x < y, then every t satisfying x < t < y is in J.

This definition also allows the possibility that J is empty or a singleton.


One can show that every set J satisfying the above is one of the intervals
defined in Definition 1.4.8 (Exercise 21).

Exercises 1.4
1. Use the axioms for addition and multiplication to prove the following: if
a ∈ R, then
a. a · 0 = 0. b. (−1) · a = −a. c. −(−a) = a.
2. Let a, b ∈ R. Prove the following:
a. If a ≠ 0, then $\frac{1}{a}$ ≠ 0. b. If a · b = 0, then either a = 0 or b = 0.
3. Let a, b, c ∈ R. Prove the following:
a. If a > b, then a + c > b + c. b. If a ≠ 0, then $a^2 > 0$.
c. If a > b and c > 0, then ac > bc.
d. If a > 0 then 1/a > 0, and if a < 0 then 1/a < 0.
4. *If a, b ∈ R, prove that $ab \le \frac12 (a^2 + b^2)$.
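5. Find the supremum and the infimum of each of the following sets:
*a. $A = \{1, \frac12, \frac14, \frac18, \ldots\} = \left\{ \frac{1}{2^{n-1}} : n \in \mathbb{N} \right\}$
b. $B = \{\cos \frac{n\pi}{4} : n \in \mathbb{N}\}$
*c. $C = \{(1 - (-1)^n)n : n \in \mathbb{N}\}$
d. $D = \{\sin \frac{n\pi}{2} : n \in \mathbb{N}\}$
e. $E = \{n \cos n\pi : n \in \mathbb{N}\}$
*f. $F = \left\{ \frac{2+n}{n} : n \in \mathbb{N} \right\}$
g. $G = \{(-1)^n - \frac1n : n \in \mathbb{N}\}$
*h. $H = \{x \in \mathbb{R} : x^2 < 4\}$
i. $I = \{x^2 : -2 < x < 2\}$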
6. If 0 < y ≤ 1, prove that there exists a unique positive real number x such
that $x^2 = y$.
7. Prove that there exists a positive real number x such that $x^2 = 2$.
8. *Let A be a nonempty subset of R. If α = sup A is finite, show that for
each ε > 0, there is an a ∈ A such that α − ε < a ≤ α.
9. Let E be a nonempty subset of R that is bounded above, and set U =
{β ∈ R : β is an upper bound of E}. Prove that sup E = inf U .
10. Let A be a nonempty subset of R and let −A = {−x : x ∈ A}. Prove
that inf A = − sup(−A).
11. If A and B are nonempty subsets of R with A ⊂ B, prove that
inf B ≤ inf A ≤ sup A ≤ sup B.
12. Suppose that A and B are bounded subsets of R. Prove that A ∪ B is
bounded and that sup(A ∪ B) = sup{sup A, sup B}.
13. Use the least upper bound property of R to prove that every nonempty
subset of R that is bounded below has an infimum.

14. For A, B subsets of R, define


A + B = {a + b : a ∈ A, b ∈ B} and A · B = {ab : a ∈ A, b ∈ B}.
a. For A = {−1, 2, 4, 7} and B = {−2, −1, 1}, find A + B and A · B.
*b. If A and B are nonempty and bounded above, prove that
sup(A + B) = sup A + sup B.
c. If A and B are nonempty subsets of the positive real numbers that
are bounded above, prove that sup(A · B) = (sup A)(sup B).
d. Give an example of two nonempty bounded sets A and B for which
sup(A · B) ≠ (sup A)(sup B).
15. Let f, g be real-valued functions defined on a nonempty set X satisfy-
ing Range f and Range g are bounded subsets of R. Prove each of the
following:
*a. sup{f (x) + g(x) : x ∈ X} ≤ sup{f (x) : x ∈ X} + sup{g(x) : x ∈ X}.
b. Provide an example for which equality does not hold in (a).
c. inf{f (x) : x ∈ X} + inf{g(x) : x ∈ X} ≤ inf{f (x) + g(x) : x ∈ X}.
*d. If f (x) ≤ g(x) for all x ∈ X, then sup{f (x) : x ∈ X} ≤ sup{g(x) :
x ∈ X}.
e. For each x ∈ X, let h(x) = max{f (x), g(x)}. Prove that
sup{h(x) : x ∈ X} = max{sup{f (x) : x ∈ X}, sup{g(x) : x ∈ X}}.
16. Let X = Y = [0, 1] and let f : X ×Y → R be defined by f (x, y) = 3x+2y.
*a. For each x ∈ X, find F (x) = sup{f (x, y) : y ∈ Y }; then find
sup{F (x) : x ∈ X}.
b. For each y ∈ Y , find G(y) = sup{f (x, y) : x ∈ X}; then find
sup{G(y) : y ∈ Y }.
*c. Find sup{f (x, y) : (x, y) ∈ X × Y }. Compare your answer with your
answer in parts (a) and (b).
17. Perform the computations of Exercise 16 with X = [−1, 1] , Y = [0, 2],
and f (x, y) = 3x − 2y.
18. Let X, Y be nonempty sets, and let f be a nonnegative real-valued func-
tion defined on X × Y . For each x ∈ X and y ∈ Y , define
F (x) = sup{f (x, y) : y ∈ Y }, G(y) = sup{f (x, y) : x ∈ X}.
Prove that
sup{F (x) : x ∈ X} = sup{G(y) : y ∈ Y } = sup{f (x, y) : (x, y) ∈ X × Y }.
19. Let X, Y be nonempty sets and let f : X × Y → R be a function with
bounded range. Let
F (x) = sup{f (x, y) : y ∈ Y } and H(y) = inf{f (x, y) : x ∈ X}.
Prove that
sup{H(y) : y ∈ Y } ≤ inf{F (x) : x ∈ X}.
20. Let X = Y = [0, 1]. Perform the computations of Exercise 19 for each of
the following functions f (x, y).
*a. f(x, y) = 3x + 2y     b. $f(x, y) = \begin{cases} 1, & x = y, \\ 0, & x \ne y. \end{cases}$

21. Let J be a subset of R that has the following property: if x, y ∈ J with


x < y, then t ∈ J for every t satisfying x < t < y. Prove that J is an
interval as defined in Definition 1.4.8.

1.5 Consequences of the Least Upper Bound Property

In this section, we look at a number of elementary properties of the real numbers
which in more elementary courses are usually taken for granted.
As we will see however, these are all actually consequences of the least upper
bound property of the real numbers.

THEOREM 1.5.1 (Archimedian Property) If x, y ∈ R and x > 0, then


there exists a positive integer n such that

n x > y.

Proof. If y ≤ 0, then the result is true for all n. Thus assume that y > 0. We
will again use the method of proof by contradiction. Let

A = { nx : n ∈ N }.

If the result is false, that is, there does not exist an n ∈ N such that nx > y,
then nx ≤ y for all n ∈ N. Thus y is an upper bound for A. Thus since A ≠ ∅,
A has a least upper bound in R. Let α = sup A. Since x > 0, α − x < α.
Therefore α − x is not an upper bound and thus there exists an element of A,
say mx such that
α − x < mx.
But then α < (m + 1)x, which contradicts the fact that α is an upper bound
of A. Therefore, there exists a positive integer n such that nx > y. 
Remark. One way in which the previous result is often used is as follows: given
ε > 0, there exists a positive integer $n_o$ such that $n_o \varepsilon > 1$. As a consequence,
\[ \frac{1}{n} < \varepsilon \]
for all integers n, n ≥ $n_o$.
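
For rational x and y the integer n of Theorem 1.5.1 can be written down explicitly. The sketch below is an illustration under that assumption and not a substitute for the proof: the name archimedean_n is our own, and the floor function used here itself relies on the fact that the integers are unbounded.

```python
from fractions import Fraction
import math

def archimedean_n(x, y):
    """Return a positive integer n with n*x > y, for rationals x > 0 and y."""
    if y <= 0:
        return 1                      # any n works when y <= 0
    return math.floor(y / x) + 1      # the least integer exceeding y/x

n = archimedean_n(Fraction(1, 1000), Fraction(7))
assert n * Fraction(1, 1000) > 7

# The remark: given eps > 0, find n_o with n_o * eps > 1, so 1/n < eps for n >= n_o.
eps = Fraction(3, 100)
n_o = archimedean_n(eps, Fraction(1))
assert all(Fraction(1, n) < eps for n in range(n_o, n_o + 50))
```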

THEOREM 1.5.2 If x, y ∈ R and x < y, then there exists r ∈ Q such that

x < r < y.

Proof. Assume first that x ≥ 0. Since y − x > 0, by Theorem 1.5.1 there


exists an integer n > 0 so that

n(y − x) > 1 or ny > 1 + nx.

Again by Theorem 1.5.1, {k ∈ N : k > nx} is nonempty. Thus by the well


ordering principle there exists m ∈ N such that

m − 1 ≤ nx < m.

Therefore
nx < m ≤ 1 + nx < ny,
or dividing by n,
\[ x < \frac{m}{n} < y. \]
If x < 0 and y > 0, then the result is obvious. Finally, if x < y < 0, then by
the above there exists r ∈ Q such that −y < r < −x, i.e., x < −r < y. 
The conclusion of Theorem 1.5.2 is often expressed by the statement that
the rational numbers are dense in the real numbers, that is, between any two
real numbers there exists a rational number. A precise definition of “dense”
is given in Definition 2.2.19.
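
The proof of Theorem 1.5.2 is constructive and can be followed step by step. The following Python sketch mirrors it for rational endpoints; the function name rational_between is our own, x and y are assumed to be Fraction values with x < y, and the case y = 0 is handled by the same reflection used in the proof for x < y < 0.

```python
from fractions import Fraction
import math

def rational_between(x, y):
    """Return a rational r with x < r < y, following the proof of Theorem 1.5.2.

    x and y are assumed to be Fractions with x < y.
    """
    if x < 0 < y:
        return Fraction(0)
    if y <= 0:
        # reflect: find r with -y < r < -x, then return -r
        return -rational_between(-y, -x)
    n = math.floor(1 / (y - x)) + 1      # n(y - x) > 1
    m = math.floor(n * x) + 1            # m - 1 <= nx < m
    return Fraction(m, n)
```

For example, with x = 1/3 and y = 1/2 the proof gives n = 7 and m = 3, so that r = 3/7.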
Another consequence of the least upper bound property is the following
theorem concerning the existence of nth roots.

THEOREM 1.5.3 For every real number x > 0 and every positive integer
n, there exists a unique positive real number y so that $y^n = x$.
The number y is written as $\sqrt[n]{x}$ or $x^{1/n}$ and is called the nth root of x.
The uniqueness of y is obvious. Since the existence of y will be an immediate
consequence of the intermediate value theorem (Theorem 4.2.11), we omit the
details of the proof in the text. A sketch of the proof of Theorem 1.5.3 using
the least upper bound property is included in the miscellaneous exercises. It
should be emphasized that the proof of Theorem 4.2.11 also depends on the
least upper bound property.

COROLLARY 1.5.4 If a, b are positive real numbers and n is a positive
integer, then
\[ (ab)^{1/n} = a^{1/n} b^{1/n}. \]

Proof. Set $\alpha = a^{1/n}$ and $\beta = b^{1/n}$. Then

\[ ab = \alpha^n \beta^n = (\alpha\beta)^n. \]

Thus by uniqueness, $\alpha\beta = (ab)^{1/n}$.



Exercises 1.5
1. *If r and s are positive rational numbers, prove directly (without using
the supremum property) that there exists an n ∈ N such that nr > s.
2. Given any x ∈ R, prove that there exists a unique n ∈ Z such that
n − 1 ≤ x < n.
3. If r ≠ 0 is a rational number and x is an irrational number, prove that
r + x and rx are irrational.
4. *Prove directly (without using Theorem 1.5.2) that between any two
rational numbers there exists a rational number.
5. If x, y ∈ R with x < y, show that x < tx+(1−t)y < y for all t, 0 < t < 1.
6. *a. Prove that between any two rational numbers there exists an irra-
tional number.
*b. Prove that between any two real numbers there exists an irrational
number.
7. If x > 0, show that there exists n ∈ N such that $1/2^n < x$.


8. *Let x, y ∈ R with x < y. If u ∈ R with u > 0, show that there exists a


rational number r such that x < ru < y.
9. Let B = {r ∈ Q : r > 0 and $r^2 < 2$} and α = sup B. Prove that $\alpha^2 = 2$.

1.6 Binary and Ternary Expansions3


In our standard base 10 number system we use the integers {0, 1, ..., 9} to
represent real numbers. Base 10 however is not the only possible base. In
base 3 (ternary) we use only the integers {0, 1, 2}, and in base 2 (binary)
we use only {0, 1}. In this section, we will show how the least upper bound
property may be used to prove the existence of the expansion of a real number
x, 0 < x ≤ 1, for a given base. For purposes of illustration, and for later use,
we will use base 2 and 3, which are commonly referred to as the binary and
ternary expansions respectively.
In base ten, given the decimal .1021, what we really mean is the real
number given by
\[ \frac{1}{10} + \frac{0}{10^2} + \frac{2}{10^3} + \frac{1}{10^4}. \]
However, in base 3, the expansion .1021 represents
\[ \frac{1}{3} + \frac{0}{3^2} + \frac{2}{3^3} + \frac{1}{3^4}, \]
which is the ternary expansion of 34/81.

3 This section can be omitted on first reading. The ternary expansion of a real number
is only required in Section 2.5 and can be covered at that point.


For a given x, 0 < x ≤ 1, the ternary or base 3 expansion of x is defined
inductively as follows:

DEFINITION 1.6.1 Let $n_1 \in \{0, 1, 2\}$ be the largest integer such that
\[ \frac{n_1}{3} < x. \]
Having chosen $n_1, \ldots, n_k$, let $n_{k+1} \in \{0, 1, 2\}$ be the largest integer such that
\[ \frac{n_1}{3} + \frac{n_2}{3^2} + \cdots + \frac{n_k}{3^k} + \frac{n_{k+1}}{3^{k+1}} < x. \]
The expression $.n_1 n_2 n_3 \ldots$ is called the ternary expansion of x.

If we set
\[ E = \left\{ \frac{n_1}{3} + \cdots + \frac{n_k}{3^k} : k = 1, 2, \ldots \right\}, \]
then E ≠ ∅ and E is bounded above by x. As we will shortly see, sup E = x.
In terms of series, which will be covered in detail later, we have
\[ x = \sum_{k=1}^{\infty} \frac{n_k}{3^k}. \]

The binary expansion of x, or the expansion of x to any other base, is
defined similarly. For the binary expansion, the integer $n_k$ at each step is
chosen as the largest integer in {0, 1}.
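
The inductive rule of Definition 1.6.1 translates directly into a short program. The following is a minimal sketch, assuming x is given as an exact rational number (Fraction); the function name expansion_digits and its parameters base and count are our own choices. Note the strict inequality, which is what makes the expansion of 1/3 come out as .0222... rather than .1000...

```python
from fractions import Fraction

def expansion_digits(x, base=3, count=8):
    """First `count` digits of the base-`base` expansion of x, 0 < x <= 1,
    chosen by the rule of Definition 1.6.1 (strict inequality)."""
    digits = []
    partial = Fraction(0)
    for k in range(1, count + 1):
        # largest digit d with partial + d/base^k < x; d = 0 always qualifies
        d = max(d for d in range(base) if partial + Fraction(d, base**k) < x)
        digits.append(d)
        partial += Fraction(d, base**k)
    return digits
```

For instance, expansion_digits(Fraction(1, 2)) consists entirely of 1s, and expansion_digits(Fraction(1, 3), base=2) alternates between 0 and 1.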

EXAMPLES 1.6.2 (a) We now use the above definition to obtain the
ternary expansion of $\frac13$. At the first step, we must choose $n_1$ as the largest
integer in {0, 1, 2} such that
\[ \frac{n_1}{3} < \frac{1}{3}. \]
This inequality fails for $n_1 = 1, 2$. Thus $n_1 = 0$. To find $n_2$, we choose the
largest integer $n_2 \in \{0, 1, 2\}$ such that
\[ \frac{0}{3} + \frac{n_2}{3^2} < \frac{1}{3}, \]
which is satisfied by $n_2 = 2$. To find $n_3$ we must have
\[ \frac{0}{3} + \frac{2}{3^2} + \frac{n_3}{3^3} < \frac{1}{3}. \]
It is left as an exercise to show that this is satisfied for $n_3 = 0, 1, 2$. Thus we
take $n_3 = 2$. At this stage we conjecture that $n_k = 2$ for all k ≥ 2, and that
\[ \frac{1}{3} = .02222\ldots \quad \text{(base 3)}. \]

To see that this indeed is the case we use the fact that for the geometric series
\[ \sum_{k=0}^{\infty} r^k = \frac{1}{1 - r}, \quad 0 < r < 1. \]
Thus
\[ .0222\ldots = \sum_{k=2}^{\infty} \frac{2}{3^k} = \frac{2}{9} \sum_{k=0}^{\infty} \left( \frac{1}{3} \right)^k = \frac{2}{9} \cdot \frac{1}{1 - \frac{1}{3}} = \frac{1}{3}. \]

In (c) we will illustrate how mathematical induction may also be used to prove
such a result. The above ternary expansion is not unique. The number $\frac13$ also
has a finite expansion
\[ \frac{1}{3} = .1000\ldots \quad \text{(base 3)}. \]
We will discuss this in more detail at the end of this section.
(b) The binary expansion of $\frac13$ is given by
\[ \frac{1}{3} = .010101\ldots \quad \text{(base 2)}. \]
This expansion can be obtained using the definition and induction (Exercise
4). Alternately, using the geometric series we have
\[ .010101\ldots = \sum_{k=1}^{\infty} \frac{1}{2^{2k}} = \frac{1}{4} \sum_{k=0}^{\infty} \left( \frac{1}{4} \right)^k = \frac{1}{3}. \]

(c) The ternary expansion of $\frac12$ is given by
\[ \frac{1}{2} = .1111\ldots \quad \text{(base 3)}. \]
We now show in detail how this expansion is derived. Since $\frac13 < \frac12$ and $\frac23 > \frac12$,
$n_1 = 1$. We will use the second principle of mathematical induction to prove
that $n_k = 1$ for all k ∈ N. By the above, the result is true for k = 1. Let k > 1
and assume that $n_j = 1$ for all j < k. By definition $n_k$ is the largest integer
in {0, 1, 2} such that
\[ \frac{1}{3} + \cdots + \frac{1}{3^{k-1}} + \frac{n_k}{3^k} < \frac{1}{2}. \tag{3} \]
Using the identity $r + r^2 + \cdots + r^n = \frac{r - r^{n+1}}{1 - r}$, r ≠ 1, (Example 1.3.2(a))
with n = k − 1, we obtain
\[ \frac{1}{3} + \cdots + \frac{1}{3^{k-1}} = \frac{\frac{1}{3} - \frac{1}{3^k}}{\frac{2}{3}} = \frac{1}{2} \left( 1 - \frac{1}{3^{k-1}} \right). \]
Substituting into equation (3) and multiplying by 2 gives
\[ 1 - \frac{3}{3^k} + \frac{2 n_k}{3^k} < 1. \]
This inequality is true if $n_k = 0$ or 1 and is false if $n_k = 2$. Since $n_k$ is to be
chosen as the largest integer of {0, 1, 2} for which (3) holds, $n_k = 1$. Thus by
Theorem 1.3.3 $n_k = 1$ for all k ∈ N.

THEOREM 1.6.3 Let x ∈ R with 0 < x ≤ 1, and with $\{n_k\}$ as defined in
Definition 1.6.1 let
\[ E = \left\{ \frac{n_1}{3} + \cdots + \frac{n_k}{3^k} : k = 1, 2, \ldots \right\}. \]
Then sup E = x.

Proof. Let α = sup E. Since x is an upper bound, α ≤ x. Suppose α < x.
Let k be the smallest positive integer such that
\[ \frac{1}{3^k} < x - \alpha \quad \text{or} \quad \alpha + \frac{1}{3^k} < x. \]
Since α = sup E,
\[ \frac{n_1}{3} + \cdots + \frac{n_k}{3^k} \le \alpha. \]
But
\[ \frac{n_1}{3} + \cdots + \frac{n_k}{3^k} + \frac{1}{3^k} = \frac{n_1}{3} + \cdots + \frac{n_{k-1}}{3^{k-1}} + \frac{n_k + 1}{3^k} < x. \]
If $n_k = 0$ or 1, then this contradicts the choice of $n_k$. If $n_k = 2$, then we have
\[ \frac{n_1}{3} + \cdots + \frac{n_{k-1} + 1}{3^{k-1}} < x. \]
If any $n_j$, 1 ≤ j ≤ k − 1 is 0 or 1, we have a contradiction to the choice of $n_j$.
If all the $n_j = 2$, then we obtain
\[ \frac{2}{3} + \frac{1}{3} < x \le 1, \]
which is also a contradiction.
The expansion
\[ x = .n_1 n_2 n_3 \ldots \]
is finite or terminating if there exists an integer m ∈ N such that $n_k = 0$
for all k > m. Otherwise, the expansion is infinite or nonterminating. The
expansion of a real number in a given base is not always unique; when x has
a finite expansion, it also has an infinite expansion. For the ternary
expansion, when
\[ x = \frac{a}{3^m}, \quad a \in \mathbb{N} \text{ with } 0 < a < 3^m, \]
and 3 does not divide a, then x has a finite expansion of the form
\[ x = \frac{a_1}{3} + \cdots + \frac{a_m}{3^m}, \quad a_m \in \{1, 2\}, \]
or an infinite expansion of the form
\[ x = \frac{a_1}{3} + \cdots + \frac{0}{3^m} + \sum_{k=m+1}^{\infty} \frac{2}{3^k} \quad \text{when } a_m = 1, \text{ or} \]
\[ x = \frac{a_1}{3} + \cdots + \frac{1}{3^m} + \sum_{k=m+1}^{\infty} \frac{2}{3^k}, \quad \text{when } a_m = 2. \]
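
The passage from the finite to the infinite ternary expansion can be checked exactly on an example. The sketch below is an illustration only; the names value and infinite_prefix are our own, and the tail of 2s is not stored but accounted for through the geometric series, whose sum over k > m equals $1/3^m$.

```python
from fractions import Fraction

def value(digits, base=3):
    """Value of the finite expansion .d1 d2 ... dm in the given base."""
    return sum(Fraction(d, base**k) for k, d in enumerate(digits, start=1))

def infinite_prefix(digits):
    """Given a finite expansion whose last digit a_m is nonzero, return the first m
    digits of the infinite expansion: a_m is lowered by 1, and every later
    digit (not stored) equals base - 1."""
    return digits[:-1] + [digits[-1] - 1]

finite = [1, 0, 2, 1]                  # .1021 (base 3) = 34/81
m = len(finite)
tail = Fraction(1, 3**m)               # sum of 2/3^k over k > m
assert value(finite) == Fraction(34, 81)
assert value(infinite_prefix(finite)) + tail == value(finite)
```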

Exercises 1.6
1. Find the ternary expansion of each of the following.
*a. $\frac{1}{4}$     b. $\frac{1}{13}$
2. Find the real number determined by each of the following finite or infinite
binary expansions. In the case of an infinite expansion use the geometric
series as in Example 1.6.2 (c) to determine your answer.
a. .1010 b. .101010 · · · *c. .0101
*d. .010101 · · · e. .001001 *f. .001001 · · ·
3. Find the real number determined by each of the following finite or infinite
ternary expansions.
*a. .0022 b. .00222 · · · c. .1010
*d. .101010 · · · e. .001001001 · · · *f. .121212 · · ·
4. *Find the binary expansion of $\frac{1}{3}$. Use induction to prove the result.
5. *Find both the finite and infinite binary expansions of $\frac{3}{16}$.
6. If 0 < x ≤ 1, prove that the infinite ternary expansion of x is unique.
7. If $x = \frac{a}{2^m}$, with a ∈ N odd, and $0 < a < 2^m$, show that x has a finite
binary expansion of the form
\[ x = \frac{a_1}{2} + \cdots + \frac{a_m}{2^m}, \quad a_m = 1, \]
and an infinite expansion
\[ x = \frac{a_1}{2} + \cdots + \frac{0}{2^m} + \sum_{k=m+1}^{\infty} \frac{1}{2^k}. \]

1.7 Countable and Uncountable Sets


In discussing sets, we all have an intuitive understanding of what it means
for a set to be finite or infinite, and what it means for two finite sets to be
of the same size; that is, to have the same number of elements. For example,

the sets A = {2, 7, 11, 21} and B = {7, 3, 19, 32} both have the same number
of elements; namely, four. We accomplish this by counting the number of
elements in each of the two sets. Alternately, the same can be accomplished,
without counting, by simply pairing up the elements; i.e.,

7↔2
3↔7
19 ↔ 11
32 ↔ 21

For infinite sets, the concept of two sets being of the same size or having
the same number of elements is vague. For example, let S denote the squares
of the positive integers; namely,

$S = \{1^2, 2^2, 3^2, \ldots\}$.

Then on one hand, S is a proper subset of the positive integers N, yet as


Galileo (1564–1642) observed, the sets N and S can be placed into a one-to-
one correspondence as follows:

1 ←→ $1^2$
2 ←→ $2^2$
3 ←→ $3^2$
⋮

This example caused Galileo, and many subsequent mathematicians, to con-


clude that the standard notion of “size of a set” did not apply to infinite sets.
Cantor on the other hand realized that the concept of one-to-one correspon-
dence raised many interesting questions about the theory of infinite sets. In
this section, we take a closer look at infinite sets and what it means for an
infinite set to be countable. We begin by defining the concept of equivalence
of sets.

DEFINITION 1.7.1 Two sets A and B are said to be equivalent (or to


have the same cardinality), denoted A ∼ B, if there exists a one-to-one
function of A onto B.

The notion of equivalence of sets satisfies the following:


(i) A ∼ A. (reflexive)
(ii) If A ∼ B, then B ∼ A. (symmetric)
(iii) If A ∼ B and B ∼ C, then A ∼ C. (transitive)

DEFINITION 1.7.2 For each positive integer n, let $\mathbb{N}_n = \{1, 2, \ldots, n\}$. As
in Section 1, N denotes the set of all positive integers. If A is a set, we say:
(a) A is finite if A ∼ $\mathbb{N}_n$ for some n, or if A = ∅.
(b) A is infinite if A is not finite.
(c) A is countable if A ∼ N.
(d) A is uncountable if A is neither finite nor countable.
(e) A is at most countable if A is finite or countable.

The reader might want to ponder Exercise 20 at this point. Countable sets
are often called denumerable or enumerable sets. It should be pointed out
that some textbooks, when using the term countable, include the possibility
that the set is finite.

EXAMPLES 1.7.3 (a) As above, let $S = \{1^2, 2^2, 3^2, \ldots\}$. Then the function
$g(n) = n^2$ is a one-to-one mapping of N onto S. Thus S ∼ N and S is countable.

(b) For our second example, we show that Z ∼ N. To see this, consider
the function f : N → Z defined by
\[ f(n) = \begin{cases} \dfrac{n}{2}, & (n \text{ even}), \\[1ex] -\dfrac{(n - 1)}{2}, & (n \text{ odd}). \end{cases} \]
The following diagram illustrates what the mapping f does to the first few
integers.

1 −→ 0
2 −→ 1
3 −→ −1
4 −→ 2
5 −→ −2
..
.

It is left as an exercise (Exercise 1) to show that this function is a one-to-one


mapping of N onto Z. Thus Z ∼ N, and the set Z is also countable. 
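
A finite check of the mapping in part (b) is easy to carry out and may help with Exercise 1. The Python sketch below is ours: f is the function of the example, while f_inverse is a candidate inverse that we introduce for illustration; checking finitely many values is of course not a proof.

```python
def f(n):
    """The map N -> Z of Example 1.7.3(b)."""
    if n % 2 == 0:
        return n // 2
    return -((n - 1) // 2)

def f_inverse(z):
    """A candidate inverse Z -> N: positive z comes from 2z, all other z from 1 - 2z."""
    return 2 * z if z > 0 else 1 - 2 * z

first_values = [f(n) for n in range(1, 6)]   # images of 1, 2, 3, 4, 5
```

The list first_values reproduces the diagram above, and composing f with f_inverse in either order returns the starting value on any range one cares to test.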

As another illustration of countable sets consider the following theorem.

THEOREM 1.7.4 N × N is countable.

Proof. For this example, it is easier to construct a one-to-one mapping of


N × N onto N. Such a function is given by

\[ f(m, n) = 2^{m-1}(2n - 1). \]



It is left as an exercise (Exercise 3) to show that f as defined is a one-to-one


mapping of N × N onto N. 
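
The reason f is a bijection is that every positive integer factors uniquely as a power of 2 times an odd number. The following sketch makes this explicit; the names pair and unpair are our own, and unpair is an inverse we supply for illustration rather than something stated in the text.

```python
def pair(m, n):
    """f(m, n) = 2^(m-1) * (2n - 1), mapping N x N into N."""
    return 2 ** (m - 1) * (2 * n - 1)

def unpair(k):
    """Recover (m, n) from k by factoring out the powers of 2."""
    m = 1
    while k % 2 == 0:
        k //= 2
        m += 1
    # k is now odd, so k = 2n - 1 for n = (k + 1) / 2
    return m, (k + 1) // 2
```

For example, 12 = 2² · 3 corresponds to m = 3 and n = 2.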
One of our goals in this section will be to show that the set Q of rational
numbers is countable.

Sequences

DEFINITION 1.7.5 If A is a set, by a sequence in A we mean a function


f from N into A. For each n ∈ N, let $x_n = f(n)$. Then $x_n$ is called the nth
term of the sequence f.

For notational convenience, sequences are denoted by $\{x_n\}_{n=1}^{\infty}$ or just
$\{x_n\}$, rather than the function f. Note however the distinction between
$\{x_n\}_{n=1}^{\infty}$, which denotes the sequence, and $\{x_n : n = 1, 2, \ldots\}$ which denotes
the range of the sequence. For example, $\{1 - (-1)^n\}$ denotes the sequence f
where
\[ f(n) = x_n = 1 - (-1)^n. \]
On the other hand, $\{x_n : n = 1, 2, \ldots\} = \{0, 2\}$.
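
The distinction between a sequence and its range has a direct analogue in code: a list keeps order and repetitions, while a set does not. A minimal sketch, with names of our own choosing:

```python
def x(n):
    """n-th term of the sequence {1 - (-1)^n}."""
    return 1 - (-1) ** n

terms = [x(n) for n in range(1, 9)]      # the first eight terms, in order
values = set(terms)                      # the range of those terms, as a set
```

Here terms alternates between 2 and 0, whereas values contains only the two numbers 0 and 2.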
By definition, if A is a countable set, then there exists a one-to-one function
f from N onto A. Thus

A = Range f = $\{x_n : n = 1, 2, \ldots\}$.

The sequence f is called an enumeration of A, i.e., A = $\{x_n : n = 1, 2, \ldots\}$
with $x_n \ne x_m$ whenever n ≠ m. This ability to enumerate elements of a
countable set plays a key role in the proofs of some of the following results.

THEOREM 1.7.6 Every infinite subset of a countable set is countable.

Proof. Let A be a countable set and let $\{x_n : n = 1, 2, \ldots\}$ be an enumeration
of A. Suppose E is an infinite subset of A. Then each x ∈ E is of the form
$x_k$ for some k ∈ N. We inductively construct a function f : N → E as follows:
let $n_1$ be the smallest positive integer such that $x_{n_1} \in E$. Such an integer
exists by the well ordering principle. Having chosen $n_1, \ldots, n_{k-1}$, let $n_k$ be the
smallest integer greater than $n_{k-1}$ such that $x_{n_k} \in E \setminus \{x_{n_1}, \ldots, x_{n_{k-1}}\}$. Set
$f(k) = x_{n_k}$. Since E is infinite, f is defined on N.
If m > k, then $n_m > n_k$ and thus $x_{n_m} \ne x_{n_k}$. Therefore f is one-to-one.
The function f is onto since if x ∈ E, then $x = x_j$ for some j. By construction,
$n_k = j$ for some k, and thus f(k) = x.
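
The inductive construction in this proof is, in effect, a filtering procedure: run through the enumeration of A in order and keep the terms that lie in E. The sketch below illustrates this with a generator; the function enumerate_subset and its arguments x (an enumeration of A) and in_E (a membership test for E) are hypothetical names introduced here. If E were finite, the generator would eventually search forever, which corresponds to the step of the proof where the infinitude of E is used.

```python
from itertools import count, islice

def enumerate_subset(x, in_E):
    """Yield f(1), f(2), ... for the subset E = {x(n) : in_E(x(n))} of A.

    `x` is an enumeration n -> x_n of A (one-to-one), and `in_E` tests
    membership in E. Both are hypothetical inputs for this sketch.
    """
    for n in count(1):          # n runs through 1, 2, 3, ... in increasing order
        if in_E(x(n)):
            yield x(n)          # the next index n_k with x_{n_k} in E

# A = squares {1, 4, 9, ...}; E = the even squares.
first_five = list(islice(enumerate_subset(lambda n: n * n, lambda a: a % 2 == 0), 5))
```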

THEOREM 1.7.7 If f maps N onto A, then A is at most countable.

Proof. If A is finite, the result is certainly true. Suppose A is infinite. Since



f maps N onto A, each a ∈ A is of the form f (n) for some n ∈ N. For each
a ∈ A, by the well ordering principle

$f^{-1}(\{a\})$ = {n ∈ N : f(n) = a}

has a smallest integer, which we denote by $n_a$. Consider the mapping $a \to n_a$
of A into N. If a ≠ b, then since f is a function, $n_a \ne n_b$. Also, since A is
infinite, $\{n_a : a \in A\}$ is an infinite subset of N. Thus the mapping $a \to n_a$ is a
one-to-one mapping of A onto an infinite subset of N. Therefore by Theorem
1.7.6 A is countable.

Indexed Families of Sets


In Section 1 we defined the union and intersection of two sets. We now extend
these definitions to larger collections of sets. Recall that if X is a set, P(X)
denotes the set of all subsets of X.

DEFINITION 1.7.8 Let A and X be nonempty sets. An indexed family
of subsets of X with index set A is a function from A into P(X).

If f : A → P(X), then for each α ∈ A, we let E_α = f(α). As for sequences,
we denote this function by {E_α}_{α∈A}. If A = N, then {E_n}_{n∈N} is called a
sequence of subsets of X. In this instance, we adopt the more conventional
notation {E_n}_{n=1}^∞ to denote {E_n}_{n∈N}.

EXAMPLES 1.7.9 The following are all examples of indexed families of
sets.
(a) The sequence {N_n}_{n=1}^∞, where N_n = {1, 2, ..., n}, is a sequence of
subsets of N.
(b) For each n ∈ N, set I_n = {x ∈ R : 0 < x < 1/n}. Then {I_n}_{n=1}^∞ is a
sequence of subsets of R.
(c) For each x, 0 < x < 1, let

E_x = {r ∈ Q : 0 ≤ r < x}.

Then {E_x}_{x∈(0,1)} is an indexed family of subsets of Q. In this example, the
open interval (0, 1) is our index set. □

DEFINITION 1.7.10 Suppose {E_α}_{α∈A} is an indexed family of subsets of
X. The union of the family of sets {E_α}_{α∈A} is defined to be

\[ \bigcup_{\alpha \in A} E_\alpha = \{x \in X : x \in E_\alpha \text{ for some } \alpha \in A\}. \]

The intersection of the family of sets {E_α}_{α∈A} is defined as

\[ \bigcap_{\alpha \in A} E_\alpha = \{x \in X : x \in E_\alpha \text{ for all } \alpha \in A\}. \]

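If A = N we use the notation

\[ \bigcup_{n=1}^{\infty} E_n \ \text{ and } \ \bigcap_{n=1}^{\infty} E_n \quad \text{instead of} \quad \bigcup_{n \in \mathbb{N}} E_n \ \text{ and } \ \bigcap_{n \in \mathbb{N}} E_n, \]

respectively. Also, if A = N_k, then

\[ \bigcup_{n \in \mathbb{N}_k} E_n \quad \text{is denoted by} \quad \bigcup_{n=1}^{k} E_n, \]

with an analogous definition for the intersection. Occasionally, when the index
set A is fixed in the discussion, we will use the shorthand notation ⋃_α E_α or
⋂_α E_α rather than ⋃_{α∈A} E_α or ⋂_{α∈A} E_α.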

EXAMPLES 1.7.11 We now consider the union and intersection of the
families of sets given in the previous example.
(a) With N_n = {1, 2, ..., n}, we have

\[ \bigcap_{n=1}^{\infty} N_n = \{1\}, \quad \text{and} \quad \bigcup_{n=1}^{\infty} N_n = \mathbb{N}. \]

Since 1 is the only element which is in N_n for all n, ⋂_n N_n = {1}. For the
union, since N_n ⊂ N for all n, ⋃_n N_n ⊂ N. On the other hand, if n ∈ N, then
n ∈ N_n, and as a consequence, N ⊂ ⋃_n N_n, which proves equality.
(b) As in the previous example, for n ∈ N let I_n = {x ∈ R : 0 < x < 1/n}.
We first show that

\[ \bigcap_{n=1}^{\infty} I_n = \emptyset. \]

Suppose not. Then there exists x ∈ R such that x ∈ I_n for all n, i.e.,

\[ 0 < x < \frac{1}{n} \quad \text{for all } n \in \mathbb{N}. \]

This however contradicts Theorem 1.5.1, which guarantees the existence of a
positive integer n such that nx > 1. For the union, since I_n ⊂ I_1 for all n ≥ 1,

\[ \bigcup_{n=1}^{\infty} I_n = I_1 = \{x \in \mathbb{R} : 0 < x < 1\}. \]

(c) We leave it as an exercise (Exercise 9) to show that if E_x is defined as
in Example 1.7.9(c), i.e., E_x = {r ∈ Q : 0 ≤ r < x}, then

\[ \bigcap_{x \in (0,1)} E_x = \{0\}, \quad \text{and} \quad \bigcup_{x \in (0,1)} E_x = \{r \in \mathbb{Q} : 0 \le r < 1\}. \]
□
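
Parts (a) and (b) can be explored numerically, with the caveat that a program only sees finitely many sets of each family. In the Python sketch below, the names initial_segment and excluding_index are hypothetical, only N_1, ..., N_10 are formed, and exact rational arithmetic is assumed for the points x tested.

```python
from fractions import Fraction
from functools import reduce

def initial_segment(n):
    """The set N_n = {1, 2, ..., n}."""
    return set(range(1, n + 1))

family = [initial_segment(n) for n in range(1, 11)]  # N_1, ..., N_10 only
union = reduce(set.union, family)
intersection = reduce(set.intersection, family)
print(union == initial_segment(10), intersection)  # True {1}

def excluding_index(x):
    """For x > 0, return the least n with n*x >= 1, so that x is not in I_n = (0, 1/n)."""
    n = 1
    while n * x < 1:
        n += 1
    return n

print(excluding_index(Fraction(3, 100)))  # 34
```

The loop in excluding_index terminates for every x > 0 precisely because of the property quoted from Theorem 1.5.1; this is why no positive x can lie in every I_n.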

As for a finite number of sets, we also have analogs of the distributive laws
and De Morgan’s laws for arbitrary unions and intersections.

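THEOREM 1.7.12 (Distributive Laws) If E_α, α ∈ A, and E are subsets
of a set X, then

\[ \text{(a)} \quad E \cap \Bigl( \bigcup_{\alpha \in A} E_\alpha \Bigr) = \bigcup_{\alpha \in A} (E \cap E_\alpha), \]

\[ \text{(b)} \quad E \cup \Bigl( \bigcap_{\alpha \in A} E_\alpha \Bigr) = \bigcap_{\alpha \in A} (E \cup E_\alpha). \]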

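THEOREM 1.7.13 (De Morgan’s Laws) If {E_α}_{α∈A} is a family of
subsets of X, then

\[ \text{(a)} \quad \Bigl( \bigcup_{\alpha \in A} E_\alpha \Bigr)^{c} = \bigcap_{\alpha \in A} E_\alpha^{c}, \]

\[ \text{(b)} \quad \Bigl( \bigcap_{\alpha \in A} E_\alpha \Bigr)^{c} = \bigcup_{\alpha \in A} E_\alpha^{c}. \]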

The proofs of both of these theorems, as well as the following analogue of
Theorems 1.2.5 and 1.2.6, are left to the exercises.

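THEOREM 1.7.14 Let f be a function from X into Y, and let A be a
nonempty set.
(a) If {E_α}_{α∈A} is a family of subsets of X, then

\[ f\Bigl( \bigcup_{\alpha \in A} E_\alpha \Bigr) = \bigcup_{\alpha \in A} f(E_\alpha), \]

\[ f\Bigl( \bigcap_{\alpha \in A} E_\alpha \Bigr) \subset \bigcap_{\alpha \in A} f(E_\alpha). \]

(b) If {B_α}_{α∈A} is a family of subsets of Y, then

\[ f^{-1}\Bigl( \bigcup_{\alpha \in A} B_\alpha \Bigr) = \bigcup_{\alpha \in A} f^{-1}(B_\alpha), \]

\[ f^{-1}\Bigl( \bigcap_{\alpha \in A} B_\alpha \Bigr) = \bigcap_{\alpha \in A} f^{-1}(B_\alpha). \]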
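
The inclusion for images of intersections in part (a) can be proper, while the other three relations are equalities. The following Python sketch checks this on a small finite example; the helper names image and preimage, the map f(x) = x², and the particular sets are all assumptions chosen for illustration.

```python
def image(f, E):
    """f(E) = {f(x) : x in E}."""
    return {f(x) for x in E}

def preimage(f, B, domain):
    """f^{-1}(B) = {x in domain : f(x) in B}."""
    return {x for x in domain if f(x) in B}

f = lambda x: x * x
X = set(range(-3, 4))            # assumed finite domain {-3, ..., 3}
E1, E2 = {-1, 0}, {0, 1}         # subsets of X
B1, B2 = {0, 1, 4}, {1, 4, 9}    # subsets of the codomain

print(image(f, E1 & E2), image(f, E1) & image(f, E2))  # {0} {0, 1}
```

Here f(E1 ∩ E2) = {0} is a proper subset of f(E1) ∩ f(E2) = {0, 1}, because f is not one-to-one.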

The Countability of Q
The countability of the set of rational numbers Q will follow as a corollary of
the following theorem.

THEOREM 1.7.15 If {E_n}_{n=1}^∞ is a sequence of countable sets and

\[ S = \bigcup_{n=1}^{\infty} E_n, \]

then S is countable.

Proof. Since E_n is countable for each n ∈ N, we can write

E_n = {x_{n,k} : k = 1, 2, ...}.

Since E_1 is an infinite subset of S, the set S itself is infinite. Define the
function h : N × N → S by
h(n, k) = x_{n,k}.
The function h, although not necessarily one-to-one, is a mapping of N × N
onto S. Thus since N × N ∼ N, there exists a mapping of N onto S. Hence by
Theorem 1.7.7 the set S is at most countable, and since S is infinite, it is countable. □

COROLLARY 1.7.16 Q is countable.

Proof. For each m ∈ N, let

\[ E_m = \Bigl\{ \frac{n}{m} : n \in \mathbb{Z} \Bigr\}. \]

Then E_m is countable, and since Q = ⋃_{m=1}^∞ E_m, by Theorem 1.7.15 the set
Q is countable. □
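
The two proofs above combine into an explicit enumeration of Q. The Python sketch below is one possible realization, not the construction of the text: the names z and rationals are hypothetical, Z is enumerated as 0, 1, −1, 2, −2, ..., the pairs (m, k) of N × N are visited along the diagonals m + k = constant, and repeated values are skipped (which corresponds to keeping the smallest preimage in Theorem 1.7.7).

```python
from fractions import Fraction
from itertools import count, islice

def z(k):
    """Enumerate Z as 0, 1, -1, 2, -2, ... for k = 1, 2, 3, ..."""
    return k // 2 if k % 2 == 0 else -(k // 2)

def rationals():
    """Yield every rational number exactly once."""
    seen = set()
    for s in count(2):                    # s = m + k labels a diagonal of N x N
        for m in range(1, s):
            q = Fraction(z(s - m), m)     # the k-th element of E_m, with k = s - m
            if q not in seen:             # h(m, k) is onto but not one-to-one
                seen.add(q)
                yield q

first_eight = list(islice(rationals(), 8))
print([str(q) for q in first_eight])
# ['0', '1', '-1', '1/2', '2', '-1/2', '1/3', '-2']
```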

The Uncountability of R
In November 1873, in a letter to Dedekind, Cantor asked whether the set R
itself was countable. A month later he answered his own question by proving
that R was not countable. We now prove, using Cantor’s elegant “diagonal”
argument, that the closed interval [0, 1] is uncountable, and thus R
itself is uncountable. For the proof we will use the fact that, as in Section
6, every x ∈ [0, 1] has a decimal expansion of the form x = .n_1 n_2 ··· with
n_i ∈ {0, 1, 2, ..., 9}. As for the binary and ternary expansions, the decimal
expansion is not necessarily unique. Certain numbers such as 1/10 have two
expansions; namely

1/10 = .100 ··· and 1/10 = .0999 ··· .

This however will not be crucial in the proof of the following theorem.

THEOREM 1.7.17 The closed interval [0, 1] is uncountable.

Proof. Since there are infinitely many rational numbers in [0, 1], the set is
not finite. To prove that it is uncountable, we only need to show that it is not
countable. To accomplish this, we will prove that every countable subset of
[0, 1] is a proper subset of [0, 1]. Thus [0, 1] cannot be countable.
Let E = {x_n : n = 1, 2, ...} be a countable subset of [0, 1]. Then each x_n has
a decimal expansion
x_n = .x_{n,1} x_{n,2} x_{n,3} ···
where for each k ∈ N, x_{n,k} ∈ {0, 1, ..., 9}. We now define a new number

y = .y_1 y_2 y_3 ···

as follows: if x_{n,n} ≤ 5, define y_n = 6; if x_{n,n} ≥ 6, define y_n = 3. Then
y ∈ [0, 1], and since no digit y_n equals 0 or 9, y is not one of the real numbers with two
decimal expansions. Also, since y_n ≠ x_{n,n} for each n ∈ N, we have y ≠ x_n for
any n. Therefore y ∉ E; i.e., E is a proper subset of [0, 1]. □
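
The digit rule in the proof can be tried on a finite list of truncated expansions. The Python sketch below is illustrative only: the function name diagonal_digits and the four sample digit strings are assumptions, and a finite table can of course only produce finitely many digits of y.

```python
def diagonal_digits(expansions):
    """expansions[n] holds the leading digits of x_{n+1}; return the matching digits of y.

    Rule from the proof: a diagonal digit <= 5 is replaced by 6, a digit >= 6 by 3.
    """
    return "".join("6" if int(row[n]) <= 5 else "3" for n, row in enumerate(expansions))

rows = ["1000", "0999", "3333", "1415"]   # assumed sample digits of x_1, ..., x_4
y = diagonal_digits(rows)
print(y)  # 6366
```

Each digit of y differs from the corresponding diagonal digit, so y disagrees with x_n in the nth decimal place.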
Another example of an uncountable set is given in the following theorem.

THEOREM 1.7.18 If A is the set of all sequences whose elements are 0 or


1, then A is uncountable.

Remark. The set A is the set of all functions f from N into {0, 1}. Thus a
sequence f ∈ A if and only if f (n) = 0 or 1 for all n.
Proof. As in the previous theorem, we will also prove that every countable
subset of A is a proper subset of A, and thus A cannot be countable.
Let E be a countable subset of A and let {sn : n = 1, 2, ... } be an enumer-
ation of the set E. For each n, sn is a sequence of 0’s and 1’s. We construct a
new sequence s as follows: for each k ∈ N, let

s(k) = 1 − sk (k).

Thus if s_k(k) = 0, s(k) = 1, and if s_k(k) = 1, s(k) = 0. Thus s ∈ A. Since
s(k) ≠ s_k(k) for all k ∈ N, we have s ≠ s_n for any n ∈ N. Therefore s ∉ E, i.e.,
E ⊊ A. □
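
Since sequences of 0’s and 1’s are functions on N, the construction of s translates directly into code. In the Python sketch below the names diagonal and family are hypothetical, and the sample enumeration s_n(k) = (k-th binary digit of n) is an assumption made only to have something concrete to diagonalize against.

```python
def diagonal(family):
    """Given family(n) = s_n, return the sequence s with s(k) = 1 - s_k(k)."""
    return lambda k: 1 - family(k)(k)

def family(n):
    """Assumed sample enumeration: s_n(k) is the k-th binary digit of n (least significant first)."""
    return lambda k: (n >> (k - 1)) & 1

s = diagonal(family)
print([s(k) for k in range(1, 6)])  # [0, 0, 1, 1, 1]
```

By construction s(n) ≠ s_n(n) for every n, so s is not among the listed sequences.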
In the previous two theorems we proved that the closed interval [0, 1] and
the set of all sequences of 0’s and 1’s are both uncountable sets. A natural
question is whether these two sets are equivalent. Considering the fact that every real
number x ∈ [0, 1] has a binary expansion, which is really a sequence of 0’s
and 1’s, one would expect that the answer is yes. This indeed is the case
(Miscellaneous Exercise 5).

Exercises 1.7
1. a. Prove that the function f of Example 1.7.3(b) is a one-to-one function
of N onto Z.
b. Find a one-to-one function of Z onto N.
*c. Find a one-to-one function from N onto O, the set of all odd positive
integers.
2. For each of the following determine whether the set is finite, infinite,
countable, or uncountable.
a. {1 − (−1)^n : n ∈ N} b. {n cos nπ : n ∈ N}
c. {2n : n ∈ Z} d. 2mn : m, n ∈ N, m odd
e. [0, 1] \ Q f. {x ∈ [0, 1] : x has an infinite ternary expansion}
3. Prove that the function f of Theorem 1.7.4 is a one-to-one function of
N × N onto N.
4. *a. If a, b ∈ R with a < b, prove that (a, b) ∼ (0, 1).
b. Prove that (0, 1) ∼ (0, ∞).
5. Suppose X, Y, Z are sets. If X ∼ Y and Y ∼ Z, prove that X ∼ Z.
6. *a. If A ∼ X and B ∼ Y , prove that (A × B) ∼ (X × Y ).
b. If A and B are countable sets, prove that A × B is countable.
7. If X ∼ Y , prove that P(X) ∼ P(Y ).

8. Find ⋃_{n=1}^∞ A_n and ⋂_{n=1}^∞ A_n for each of the following sequences of sets {A_n}.
*a. A_n = {x ∈ R : −n < x < n}, n ∈ N
b. A_n = {x ∈ R : −1/n < x < 1}, n ∈ N
*c. A_n = {x ∈ R : −1/n < x < 1 + 1/n}, n ∈ N
d. A_n = {x ∈ R : 0 ≤ x ≤ 1 − 1/n}, n ∈ N
*e. An = n1 , 1 − n1 , n ∈ N, n ≥ 2
f. An = {x ∈ R : n ≤ x < ∞}
9. For each x ∈ (0, 1), let E_x = {r ∈ Q : 0 ≤ r < x}. Prove that

\[ \bigcap_{x \in (0,1)} E_x = \{0\} \quad \text{and} \quad \bigcup_{x \in (0,1)} E_x = \{r \in \mathbb{Q} : 0 \le r < 1\}. \]

10. Prove Theorem 1.7.12.


11. Prove Theorem 1.7.13.
12. Let f be a function from X into Y .
*a. If {E_α}_{α∈A} is a family of subsets of X, prove that
f(⋃_α E_α) = ⋃_α f(E_α).
b. If {B_α}_{α∈A} is a family of subsets of Y, prove that
f⁻¹(⋂_α B_α) = ⋂_α f⁻¹(B_α).
13. a. If A is a countable subset of an uncountable set X, prove that X \ A
is uncountable.
*b. Prove that the set of irrational numbers is uncountable.
14. Suppose f is a function from X into Y. If the range of f is uncountable,
prove that X is uncountable.
15. *a. For each n ∈ N, prove that the collection of all polynomials in x of
degree less than or equal to n with rational coefficients is countable.
*b. Prove that the set of all polynomials in x with rational coefficients
is countable.
16. Prove that the set of all intervals with rational endpoints is countable.
17. Let A be a nonempty subset of R that is bounded above and let α =
sup A. If α ∉ A, prove that for every ε > 0, the interval (α − ε, α)
contains infinitely many points of A.
18. *a. Prove that (0, 1) ∼ (0, 1]. (This problem is not easy!)
b. Prove that (0, 1) ∼ [0, 1].
19. *A real number a is algebraic if there exists a polynomial p(x) with
integer coefficients such that p(a) = 0. Prove that the set of algebraic
numbers is countable.
20. Prove that any infinite set contains a countable subset.
21. Prove that a set is infinite if and only if it is equivalent to a proper subset
of itself.
22. *Prove that any function from a set A to the set P(A) of all subsets of
A is not onto.
23. *Prove that [0, 1] × [0, 1] ∼ [0, 1].

Notes
The most important concept of this chapter is the least upper bound property of the
real numbers. This property will be fundamental in the development of the underlying
theory of calculus. In the present chapter we have already seen its application in
proving the Archimedean property (Theorem 1.5.1). For the rational number system
this property can be proved directly. For the real number system it was originally
assumed as an axiom by Archimedes (287–212 B.C.). Cantor however proved that
the Archimedean property was no axiom, but a proposition derivable from the least
upper bound property.
In subsequent chapters the least upper bound property will occur in proofs of
theorems, either directly or indirectly, with regular frequency. It will play a crucial
role in the characterization of the compact subsets of R and in the study of sequences
of real numbers. It will also be required in the proof of the intermediate value theorem
for continuous functions. One of the corollaries of this theorem is Theorem 1.5.3 on
the existence of nth roots of positive real numbers. Many other results in the text
will depend on previous theorems which required the least upper bound property in
their proofs.

The emphasis on the least upper bound property is not meant to overshadow
the importance of the concepts of countable and uncountable sets. The fact that
the rational numbers are countable, and thus can be enumerated, will be used on
several occasions in the construction of examples. In all the examples and exercises,
every infinite subset of R turns out to be either countable or equivalent to [0, 1].
Cantor also made this observation and it led him to ask whether this result was true
for every infinite subset of R. Cantor was never able to answer this question; nor
has anyone else. The assertion that every infinite subset of R is either countable or
equivalent to [0, 1] is known as the continuum hypothesis. In 1938 Kurt Gödel
proved that the continuum hypothesis is consistent with the standard axioms of set
theory; that is, Gödel showed that the continuum hypothesis cannot be disproved on the
basis of the standard axioms of set theory. On the other hand, Paul Cohen in 1963
showed that the continuum hypothesis also cannot be proved from these axioms, so it is
undecidable on the basis of the current axioms of set theory.
Cantor’s creation of the theory of infinite sets was motivated to a great extent by
problems arising in the study of convergence of Fourier series. We will discuss some
of these problems in greater detail in Chapters 9 and 10. Cantor’s original work on
the theory of infinite sets can be found in his monograph listed in the Supplemental
Readings.

Miscellaneous Exercises
1. Let A and B be nonempty sets. For a ∈ A, b ∈ B, define the ordered pair
(a, b) by
(a, b) = {{a}, {a, b}}.
Prove that two ordered pairs (a, b) and (c, d) are equal if and only if a = c
and b = d.
The following two exercises are detailed and lengthy. The first exercise is
a sketch of the proof of Theorem 1.5.3. The second exercise shows how the
least upper bound property may be used to define the exponential function
b^x, b > 1.
2. Let
E = {t ∈ R : t > 0 and t^n < x}.
a. Show that E ≠ ∅ by showing that x/(x + 1) ∈ E.
b. Show that 1 + x is an upper bound of E.
Let
y = sup E.
The remaining parts of the exercise are to show that y^n = x. This will be
accomplished by showing that y^n < x and y^n > x lead to contradictions,
leaving y^n = x. To accomplish this, the following inequality will prove
useful. Suppose 0 < a < b, then
b^n − a^n = (b − a)(b^{n−1} + a b^{n−2} + ··· + a^{n−1}) < n(b − a) b^{n−1}. (*)

c. Show that the assumption y^n < x contradicts y = sup E as follows:
Choose 0 < h < 1 such that

\[ h < \frac{x - y^n}{n (y + 1)^{n-1}}. \]

Use (*) to show that y + h ∈ E.
d. Show that the assumption y^n > x also leads to a contradiction of the
definition of y as follows: Set

\[ k = \frac{y^n - x}{n y^{n-1}}. \]

Show that if t ≥ y − k, then t ∉ E.
3. Fix b > 1.
a. Suppose m, n, p, q are integers with n > 0 and q > 0. If m/n = p/q,
prove that
(b^m)^{1/n} = (b^p)^{1/q}.
Thus if r is rational, b^r is well defined.
b. If r, s are rational, prove that b^{r+s} = b^r b^s.
c. If x ∈ R, let B(x) = {b^t : t ∈ Q, t ≤ x}. Prove that b^r = sup B(r) when
r ∈ Q. Thus it now makes sense to define b^x = sup B(x) when x ∈ R.
d. Prove that b^{x+y} = b^x b^y for all real numbers x, y.

The following result, known as the Schröder-Bernstein theorem, is nontrivial,
but very important. It is included as an exercise to provide motivation
for further thought and additional studies. A proof of the result
can be found in the text by Halmos listed in the Supplemental Reading.

4. Let X and Y be infinite sets. If X is equivalent to a subset of Y, and Y
is equivalent to a subset of X, prove that X is equivalent to Y.
5. As in Theorem 1.7.18, let A denote the set of all sequences of 0’s and 1’s.
Use the previous result to prove that A ∼ [0, 1].
6. DEFINITION. A complex number is an ordered pair (a, b) of real
numbers. If z = (a, b) and w = (c, d), we write z = w if and only if
a = c and b = d. For complex numbers z and w we define addition and
multiplication as follows:

z + w = (a + c, b + d)
z · w = (ac − bd, ad + bc).

The set of ordered pairs (a, b) of real numbers with the above operations
of addition and multiplication is denoted by C.
a. Prove that (C, +, ·) with zero 0 = (0, 0) and unit 1 = (1, 0) is a field.
b. Set i = (0, 1). Show that i^2 = −1.
c. Prove that C is not an ordered field.

Supplemental Reading

Buck, R. C., “Mathematical induction and recursive definition,” Amer. Math. Monthly 70 (1963), 128–135.
Burrill, Claude W., Foundations of Real Numbers, McGraw-Hill, Inc., New York, 1967.
Cantor, Georg, Contributions to the Founding of the Theory of Transfinite Numbers, (translated by Philip E.B. Jourdain), Open Court Publ. Co., Chicago and London, 1915.
Dauben, Joseph W., Georg Cantor; his Mathematics and Philosophy of the Infinite, Princeton University Press, Princeton, N.J., 1979.
Gascón, J., “Another proof that the real numbers are uncountable,” Amer. Math. Monthly 122 (2015), 596–597.
Gödel, Kurt, “What is Cantor’s continuum problem,” Amer. Math. Monthly 54 (1947), 515–525.
Halmos, Paul, Naive Set Theory, Springer-Verlag, New York, Heidelberg, Berlin, 1974.
Nimbran, A. S., “One more proof of the irrationality of √2,” Amer. Math. Monthly 121 (2014), 964.
Shrader-Frechette, M., “Complementary rational numbers,” Math. Mag. 51 (1978), 90–98.
Spooner, George and Mentzer, Richard, Introduction to Number Systems, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1968.
Tripathi, A., “An alternate method to compute the decimal expansion of rational numbers,” Math. Mag. 90 (2017), 108–113.
2
Topology of the Real Line

In this chapter, we introduce some of the basic concepts fundamental to the


study of limits and continuity, and study the structure of point sets in R. The
branch of mathematics concerned with the study of these topics–not only for
the real numbers but also for more general sets–is known as topology. Modern
point set topology dates back to the early part of the 20th century; its roots,
however, date back to the 1850s and 1860s and the studies of Bolzano, Can-
tor, and Weierstrass on sets of real numbers. Many important mathematical
concepts depend on the concept of a limit point of a set and the limit process,
and one of the primary goals of topology is to provide an appropriate setting
for the study of these concepts.
One of the first topics encountered in the study of calculus is the concept
of limit, which requires the notion of closeness, or the distance between points
becoming small. On the real line or in the euclidean plane, the distance be-
tween points is usually measured as the length of the straight line segment
joining the points. However, in many instances in subsequent chapters our
points will not be points on the line or in the plane, but rather functions de-
fined on some set. For this reason we introduce the concept of distance on an
arbitrary set and study metric spaces in general. In many instances, the proofs
of the results are such that they are valid in any metric space, and will be
stated as such. These more general results about arbitrary metric spaces will
prove useful not only in subsequent chapters but also in other courses.
Even though we introduce abstract metric spaces, our primary emphasis in
this chapter will be on the topology of the real line. A thorough understand-
ing of the topics on the real line and the plane will prove invaluable when
they are encountered again in more abstract settings. On first reading, the
concepts introduced in this chapter may seem difficult and challenging. With
perseverance, however, understanding will follow.

2.1 Metric Spaces


We begin our study of metric spaces with a review of distance between points
in the real numbers or its geometric interpretation as the real line.


DEFINITION 2.1.1 For a real number x the absolute value of x, denoted
|x|, is defined by

\[ |x| = \begin{cases} x, & \text{if } x > 0, \\ -x, & \text{if } x \le 0. \end{cases} \]

For example, |4| = 4 and |−5| = 5. From the definition, |x| ≥ 0 for all
x ∈ R and |x| = 0 if and only if x = 0. This last statement follows from the fact
that if x ≠ 0, then −x ≠ 0 and thus |x| > 0. The following theorem, the proof
of which is left to the exercises, summarizes several well known properties of
absolute value.

THEOREM 2.1.2 (a) |−x| = |x| for all x ∈ R.
(b) |xy| = |x||y| for all x, y ∈ R.
(c) |x| = √(x^2) for all x ∈ R.
(d) If r > 0, then |x| < r if and only if −r < x < r.
(e) −|x| ≤ x ≤ |x| for all x ∈ R.

The following inequality is very important and will be used frequently


throughout the text.

THEOREM 2.1.3 (Triangle Inequality) For all x, y ∈ R, we have

|x + y| ≤ |x| + |y|.

Proof. The triangle inequality is easily proved as follows: For x, y ∈ R,

\[ 0 \le (x + y)^2 = x^2 + 2xy + y^2 \le |x|^2 + 2|x||y| + |y|^2 = (|x| + |y|)^2. \]

Thus by Theorem 2.1.2(c),

\[ |x + y| = \sqrt{(x + y)^2} \le \sqrt{(|x| + |y|)^2} = |x| + |y|. \]
□

As a consequence of the triangle inequality we obtain the following two


useful inequalities, the proofs of which are left to the exercises.

COROLLARY 2.1.4 For all x, y, z ∈ R we have


(a) |x − y| ≤ |x − z| + |z − y|, and
(b) ||x| − |y|| ≤ |x − y|.

In the following example we illustrate how properties of absolute value can


be used to solve inequalities.

EXAMPLE 2.1.5 Determine the set of all real numbers x that satisfy the
inequality |2x + 4| < 8. By Theorem 2.1.2(d), |2x + 4| < 8 if and only if
−8 < 2x + 4 < 8, or equivalently, −12 < 2x < 4. Thus the given inequality is
satisfied if and only if −6 < x < 2.
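
A solution set obtained this way can be spot-checked numerically. The short Python sketch below compares the original inequality with the claimed interval on a grid of rational points; the function names and the grid are assumptions for illustration, and agreement on a grid is a sanity check, not a proof.

```python
from fractions import Fraction

def satisfies(x):
    """The original inequality |2x + 4| < 8."""
    return abs(2 * x + 4) < 8

def in_claimed_solution_set(x):
    """The interval -6 < x < 2 obtained in the example."""
    return -6 < x < 2

grid = [Fraction(k, 4) for k in range(-40, 41)]   # -10, -9.75, ..., 10
agree = all(satisfies(x) == in_claimed_solution_set(x) for x in grid)
print(agree)  # True
```

The grid includes the endpoints −6 and 2, where both conditions fail, which confirms that the inequalities are strict.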

Geometrically, |x| represents the distance from x to the origin 0. More


generally, for x, y ∈ R, the euclidean distance d(x, y) is defined by

d(x, y) = |x − y|.

For example, d(−1, 3) = |−1 − 3| = 4 and d(5, −2) = |5 − (−2)| = 7. The
distance d may be regarded as a function on R × R which satisfies the following
properties: d(x, y) ≥ 0, d(x, y) = 0 if and only if x = y, d(x, y) = d(y, x), and

d(x, y) ≤ d(x, z) + d(z, y)

for all x, y, z ∈ R. We now extend the notion of distance to sets other than R.

DEFINITION 2.1.6 Let X be a nonempty set. A real valued function d


defined on X × X satisfying
(1) d(x, y) ≥ 0 for all x, y ∈ X,
(2) d(x, y) = 0 if and only if x = y,
(3) d(x, y) = d(y, x),
(4) d(x, y) ≤ d(x, z) + d(z, y) for all x, y, z ∈ X,
is called a metric or distance function on X. The set X with metric d is
called a metric space, and is denoted by (X, d).

Before we continue, let us reflect on properties (1) to (4) in the definition of


a metric. The first two combined simply state that distance is nonnegative and
that the distance between two points is zero if and only if they are the same.
Property (3) is symmetry; namely the distance between x and y is the same
as the distance between y and x. Property (4) is again called the triangle
inequality for the metric d. This inequality simply states that the distance
between x and y is less than or equal to the sum of the distances between x
and z and between z and y, for any other point z in the space X. All of these properties are what
we intuitively expect a distance function to satisfy.
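
Properties (1) to (4) can be tested mechanically on a finite sample of points. The Python sketch below is an illustration only: the function name violates_metric_axioms and the second candidate function are hypothetical, and passing the test on a sample does not prove that a function is a metric, whereas a reported violation does show that it is not.

```python
from itertools import product

def violates_metric_axioms(d, points):
    """Return the number (1-4) of the first property that fails on the sample, or None."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:
            return 1
        if (d(x, y) == 0) != (x == y):
            return 2
        if d(x, y) != d(y, x):
            return 3
    for x, y, z in product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y):
            return 4
    return None

sample = range(-3, 4)
usual = lambda x, y: abs(x - y)                 # the euclidean distance on R
candidate = lambda x, y: abs(x * x - y * y)     # assumed non-example

print(violates_metric_axioms(usual, sample))      # None
print(violates_metric_axioms(candidate, sample))  # 2
```

The second function fails property (2) because it assigns distance 0 to the distinct points −3 and 3.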

EXAMPLES 2.1.7 We now provide several examples of metrics and metric


spaces, including the standard metrics on the euclidean plane R2 . For some of
the examples we prove that the functions as defined are indeed metrics, the
others are left as exercises.
(a) The real numbers R with the euclidean metric d(x, y) = |x − y| is
certainly a metric space. The metric is referred to as the usual metric on R.

(b) Let X be any nonempty subset of R. For x, y ∈ X define

d(x, y) = |x − y|.

The distance between the points x and y is just the usual euclidean distance
between x and y as points in R. However, it is important to remember that
our space is the set X, and not R.

(c) Let X be any nonempty set with

\[ d(p, q) = \begin{cases} 1, & \text{if } p \ne q, \\ 0, & \text{if } p = q. \end{cases} \]

This metric is usually referred to as the trivial metric on X.


(d) In this example, we consider euclidean space

R^2 = R × R = {(x_1, x_2) : x_1, x_2 ∈ R}.

For convenience we denote a point (p_1, p_2) ∈ R^2 by p. For such a point p
define

\[ \|p\|_2 = \sqrt{p_1^2 + p_2^2}. \]

The quantity ‖p‖_2 is called the euclidean norm of p. Given two points
p = (p_1, p_2) and q = (q_1, q_2) set

\[ d_2(p, q) = \|p - q\|_2 = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2}. \]

By the Pythagorean theorem d_2(p, q) is the euclidean distance between the
points p and q.
Since the square root is nonnegative, d_2(p, q) ≥ 0 for all p, q ∈ R^2. Also,
from the definition it follows immediately that d_2(p, q) = 0 if and only if
p = q. Clearly, d_2(p, q) = d_2(q, p). Hence d_2 satisfies properties (1)–(3) of a
metric. For the proof of property (4) we need that ‖ ‖_2 satisfies the triangle
inequality, namely
‖p + q‖_2 ≤ ‖p‖_2 + ‖q‖_2.
Assuming that ‖ ‖_2 satisfies the triangle inequality, given any three points
p, q, r ∈ R^2, we have

\[ d_2(p, q) = \|p - q\|_2 = \|(p - r) + (r - q)\|_2 \le \|p - r\|_2 + \|r - q\|_2 = d_2(p, r) + d_2(r, q). \]

A geometric proof of the triangle inequality follows from the simple fact
that in a triangle, the length of any one side does not exceed the sum of the
lengths of the other two sides. This inequality can also be proved algebraically
(Exercise 9).
(e) Again we let X = R2 . In the previous example the distance between
points was measured as the length of the straight line segment joining the
two points. This metric however is of little use if one is in a city, such as New
York, where the streets are laid out in a rectangular pattern. In such a setting
a more appropriate way to measure distance is along the actual path one needs
to traverse to get from one point to another. Specifically, for p, q ∈ R2 set

d1 (p, q) = |p1 − q1 | + |p2 − q2 |.


Topology of the Real Line 55

Another metric often encountered in R2 is the metric d∞ which for p, q ∈


2
R is defined as

d∞ (p, q) = max{|p1 − q1 |, |p2 − q2 |}.

That d1 and d∞ are indeed metrics on R2 is left to the exercises (Exercise 10).

(f) In the previous two examples we restricted our discussion to R^2 primarily
for purposes of illustration. It is easier to visualize these concepts in
the plane as opposed to higher dimensional space. All three of these metrics
however have natural extensions to R^n, where for n ∈ N, n ≥ 2,

R^n = {(x_1, ..., x_n) : x_i ∈ R, i = 1, ..., n}.

Here (x_1, ..., x_n) denotes the ordered n-tuple of real numbers x_1, ..., x_n. As
for ordered pairs, two n-tuples (x_1, ..., x_n) and (y_1, ..., y_n) are equal if and
only if x_i = y_i for all i = 1, ..., n. The euclidean metric d_2 for R^n is defined
as follows: if p, q ∈ R^n with p = (p_1, ..., p_n) and q = (q_1, ..., q_n), then

\[ d_2(p, q) = \sqrt{(p_1 - q_1)^2 + \cdots + (p_n - q_n)^2}. \]

In the miscellaneous exercises we outline how to prove that d_2 is a metric on
R^n. An alternate proof will be provided in Chapter 7. The metrics d_1 and d_∞
also have analogous extensions to R^n.
(g) For this example we let X = {p ∈ R^2 : ‖p‖_2 ≤ 1}. For p, q ∈ X,
define

\[ d(p, q) = \begin{cases} \|p - q\|_2, & \text{if } p, q \text{ are collinear through } 0, \\ \|p\|_2 + \|q\|_2, & \text{otherwise.} \end{cases} \]

This metric space is sometimes referred to as the Washington D. C. space or
the French Railway space. To see the connection one has to be familiar with
how the main roads run in Washington or how the railroads run in France
with regard to the city of Paris. In Washington (excluding the beltway) all
the major roads run to the city center. Similarly in France, all the major rail
lines run through Paris. Thus if one travels between points p and q on the
same line, the distance is just the usual distance between the points. However,
if p and q are on different lines, then the distance between p and q is the
distance from p to Paris plus the distance from Paris to q. By computing
the distance between the points in Exercise 11 one can verify that this is
how distance is measured in this metric. Plotting the points in the plane will
further illustrate this metric.
(h) Let A be any nonempty set. A real-valued function f is bounded on
A if there exists a positive constant M such that |f (x)| ≤ M for all x ∈ A. Let
X be the set of all bounded real-valued functions on A. For f, g ∈ X define

d(f, g) = sup{|f (x) − g(x)| : x ∈ A}.



Clearly, since f and g are bounded, {|f (x) − g(x)| : x ∈ A} is bounded above,
and thus d(f, g) < ∞.
Since this is our first example of a space where the elements are functions
we proceed to show that d is a metric. Clearly d is nonnegative. Furthermore,
since |f (x)−g(x)| ≤ d(f, g) for all x ∈ A, d(f, g) = 0 if and only if f (x) = g(x)
for all x ∈ A. That d(f, g) = d(g, f ) follows from the fact that |f (x) − g(x)| =
|g(x)−f (x)|. It only remains to be shown that d satisfies the triangle inequality.
Let f, g, h be bounded functions on A. Then for x ∈ A

|f (x) − g(x)| = |f (x) − h(x) + h(x) − g(x)|


≤ |f (x) − h(x)| + |h(x) − g(x)| ≤ d(f, h) + d(h, g).

Therefore d(f, h)+d(h, g) is an upper bound for the set {|f (x)−g(x)| : x ∈ A},
and as a consequence,

d(f, g) ≤ d(f, h) + d(h, g).

This then proves that d is a metric on X. 
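
Several of the metrics in Examples 2.1.7 are easy to evaluate numerically. The Python sketch below is illustrative only: the function names d1, d2, d_inf, railway, and sup_distance are hypothetical, floating-point arithmetic is assumed (so the collinearity test in railway is exact only for inputs such as the dyadic values used here), and sup_distance replaces the supremum by a maximum over a finite grid, which gives a lower bound rather than the exact value.

```python
from math import hypot, sqrt

def d1(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d2(p, q):
    return hypot(p[0] - q[0], p[1] - q[1])

def d_inf(p, q):
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def railway(p, q):
    """The metric of example (g), for p, q in the closed unit disk."""
    if p[0] * q[1] - p[1] * q[0] == 0:    # p, q, and the origin lie on one line
        return d2(p, q)
    return hypot(*p) + hypot(*q)          # travel from p to the origin, then to q

def sup_distance(f, g, a=0.0, b=1.0, samples=10_001):
    """Grid approximation (a lower bound) of sup{|f(x) - g(x)| : x in [a, b]}."""
    step = (b - a) / (samples - 1)
    return max(abs(f(a + i * step) - g(a + i * step)) for i in range(samples))

p, q = (0, 0), (3, 4)
print(d1(p, q), d2(p, q), d_inf(p, q))        # 7 5.0 4
print(railway((0.5, 0.0), (-0.25, 0.0)))      # 0.75
print(railway((0.5, 0.0), (0.0, 0.5)))        # 1.0
approx = sup_distance(lambda x: x, lambda x: x ** 3)
print(abs(approx - 2 / (3 * sqrt(3))) < 1e-6)  # True
```

For the same pair of points the three planar distances satisfy d_∞ ≤ d_2 ≤ d_1 in this example. The last line compares the grid value for f(x) = x and g(x) = x³ on [0, 1] with 2/(3√3), the maximum of x − x³ obtained by elementary calculus.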

Exercises 2.1
1. Prove Theorem 2.1.2.
2. *Prove Corollary 2.1.4
3. Prove that for x1 , . . . , xn ∈ R, |x1 + · · · + xn | ≤ |x1 | + · · · + |xn |.

4. If a, b ∈ R, prove that |ab| ≤ (1/2)(a^2 + b^2).


5. Determine all x ∈ R that satisfy each of the following inequalities.
*a. |3x − 2| ≤ 11, b. |x^2 − 4| < 5,
*c. |x| + |x − 1| < 3, d. |x − 1| < |x + 1|.
6. Determine and sketch the set of ordered pairs (x, y) in R × R that satisfy
the following:
a. |x| = |y|, b. |x| ≤ |y|,
c. |xy| ≤ 2, d. |x| + |y| ≤ 1.
7. Determine which of the following are metrics on R.
a. d(x, y) = (x − y)^2.
p
b. d(x, y) = 3 |x − y|
*c. d(x, y) = ln(1 + |x − y|).
d. d(x, y) = |3x − y|.
p
e. d(x, y) = |x − y|3 .
8. If d is a metric on X, prove that |d(x, z) − d(z, y)| ≤ d(x, y).
9. For p = (p_1, p_2) ∈ R^2, let ‖p‖_2 be defined as in Example 2.1.7(d). Prove
algebraically that
‖p + q‖_2 ≤ ‖p‖_2 + ‖q‖_2 for all p, q ∈ R^2.
10. Prove that d_1 and d_∞ as defined in Example 2.1.7(e) are metrics on R^2.
11. Let d be the metric on R^2 defined in Example 2.1.7(g).
a. Compute the distance in this metric between each of the following pairs
of points: *i. (1/2, 1/4), (−1/2, −1/4) *ii. (1/2, 1/2), (0, 1) iii. (1/2, 2/3), (1/4, 1/3)
b. Prove that d is a metric on R^2.
12. Let A = [0, 1] and let X denote the bounded functions on A. Let d be
the metric on X defined in Example 2.1.7(h). Compute d(f, g) for each
of the following pairs of functions f and g.
a. f(x) = 1, g(x) = x. *b. f(x) = x, g(x) = x^2.

\[ \text{c.} \quad f(x) = x, \qquad g(x) = \begin{cases} 0, & 0 \le x < \tfrac{1}{2}, \\ \tfrac{1}{2}, & \tfrac{1}{2} \le x \le 1. \end{cases} \]
13. Let ρ and σ be metrics on a set X. Show that each of the following are
also metrics on X.
i. 2ρ ii. ρ + σ iii. max{ρ, σ}.
14. *Prove that d defined by

\[ d(x, y) = \frac{|x - y|}{1 + |x - y|}, \quad x, y \in \mathbb{R}, \]

is a metric on R.
15. Let X = (0, ∞). For x, y ∈ X, set

\[ \rho(x, y) = \Bigl| \frac{1}{x} - \frac{1}{y} \Bigr|. \]

Prove that ρ is a metric on X.

2.2 Open and Closed Sets


In Chapter 1 we have used the terms open and closed in describing intervals
in R. The purpose of this section is to give a precise meaning to the adjectives
open and closed, not only for intervals, but also for arbitrary subsets of R or
a metric space (X, d). Before defining what we mean by an open set we first
define the concept of an interior point of a set.

DEFINITION 2.2.1 Let (X, d) be a metric space and let p ∈ X. For ε > 0,
the set
N_ε(p) = {x ∈ X : d(p, x) < ε}
is called an ε-neighborhood of the point p.

Whenever we use the term neighborhood, we will always mean an
ε-neighborhood with ε > 0.

EXAMPLES 2.2.2 In the following examples we consider the ε-neighborhoods
of the metric spaces given in Examples 2.1.7.
(a) Consider R with the usual metric d(x, y) = |x − y|. Then for fixed
p ∈ R and ε > 0,
N_ε(p) = {x : p − ε < x < p + ε}.
In this case, N_ε(p) is the open interval (p − ε, p + ε) centered at p with radius
ε.
(b) Let X = [0, ∞) with d(x, y) = |x − y|. With ε = 1/2 and p = 1/4,

N_{1/2}(1/4) = {x ∈ [0, ∞) : |x − 1/4| < 1/2}
= {x ∈ [0, ∞) : −1/4 < x < 3/4} = [0, 3/4).

On the other hand, N_{1/4}(1/4) = (0, 1/2). If p = 0, then for any ε > 0,

N_ε(0) = {x ∈ [0, ∞) : |x| < ε} = [0, ε).

Thus an ε-neighborhood of 0 is the half-open interval [0, ε). It is important to
remember that in this example our space is the interval [0, ∞) and not R.
(c) Consider R^2 with the metric d_2 of Example 2.1.7(d). For fixed a =
(a_1, a_2) and ε > 0,

N_ε(a) = {x ∈ R^2 : ‖x − a‖_2 < ε}
= {(x_1, x_2) : (x_1 − a_1)^2 + (x_2 − a_2)^2 < ε^2}.

This is easily recognized as the interior of a circle with center a and radius ε
(Figure 2.1). Although ε-neighborhoods in the plane R^2 are typically drawn as
circular regions, this is only the case for the metric d_2. For other metrics this
need not be the case, as is illustrated in Exercise 5 of this section.

FIGURE 2.1
N_ε(a) for the metric d_2

(d) Let A = [a, b] and X the set of real-valued functions on A with the
metric as given in Example 2.1.7(h). For a fixed f ∈ X and ǫ > 0,

Nǫ (f ) = {g ∈ X : sup |f (x) − g(x)| < ǫ}.‘

Thus if g ∈ Nǫ (f ), then |g(x) − f (x)| < ǫ for all x ∈ A, or equivalently

f (x) − ǫ < g(x) < f (x) + ǫ. 
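
The role of the underlying space in Examples 2.2.2(a) and (b) can be checked
numerically. The following is only a minimal sketch: the names in_neighborhood,
usual, half_line, and real_line are illustrative choices and do not come from the
text, and the space X is modeled by a membership test.

```python
from fractions import Fraction


def in_neighborhood(x, p, eps, in_space, dist):
    """Return True if x lies in N_eps(p) = {x in X : dist(p, x) < eps}."""
    return in_space(x) and dist(p, x) < eps


def usual(x, y):
    """The usual metric d(x, y) = |x - y| on subsets of R."""
    return abs(x - y)


def half_line(x):
    """Membership test for X = [0, infinity), as in Example 2.2.2(b)."""
    return x >= 0


def real_line(x):
    """Membership test for X = R, as in Example 2.2.2(a)."""
    return True


p, eps = Fraction(1, 4), Fraction(1, 2)

# -1/8 satisfies |x - 1/4| < 1/2, but it is not a point of X = [0, infinity).
print(in_neighborhood(Fraction(-1, 8), p, eps, half_line, usual))  # False
print(in_neighborhood(Fraction(-1, 8), p, eps, real_line, usual))  # True

# The endpoint 0 belongs to [0, 3/4); the endpoint 3/4 does not.
print(in_neighborhood(Fraction(0), p, eps, half_line, usual))      # True
print(in_neighborhood(Fraction(3, 4), p, eps, half_line, usual))   # False
```

Exact rational arithmetic is used so that the comparison at the endpoint 3/4 is
not affected by rounding.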

In the following definitions we will assume that X is a nonempty set with a
metric d. If X = R, unless otherwise specified d will denote the usual euclidean
metric on R.

DEFINITION 2.2.3 Let E be a subset of X. A point p ∈ E is called an
interior point of E if there exists an ε > 0 such that Nε(p) ⊂ E. The set of
interior points of E is denoted by Int(E), and is called the interior of E.

EXAMPLES 2.2.4 (a) Let X = R and let E = (a, b] with a < b. Every p
satisfying a < p < b is an interior point of E. If ǫ is chosen such that

0 < ǫ < min{|p − a|, |b − p|},

then Nǫ (p) ⊂ E. The point b however is not an interior point. For every ǫ > 0,
Nǫ (b) = (b − ǫ, b + ǫ) contains points which are not in E. Any x satisfying
b < x < b + ǫ is not in E. This is illustrated in Figure 2.2. For this example,
Int(E) = (a, b).

FIGURE 2.2
Epsilon neighborhoods of p and b in Example 2.2.4(a)

(b) Let E denote the set of irrational real numbers, i.e., E = R \ Q. If
p ∈ E, then by Theorem 1.5.2, for every ε > 0 there exists r ∈ Q ∩ Nε(p).
Thus Nε(p) always contains a point of R not in E. Therefore no point of E
is an interior point of E, i.e., Int(E) = ∅. Using the fact that between any
two real numbers there exists an irrational number, a similar argument also
proves that Int(Q) = ∅.
(c) This example will illustrate that the space X itself is crucial in the
definition of interior point. Let X = [0, ∞) with d(x, y) = |x − y|, and let
E = [0, 1). We claim that every point of E is an interior point of E. Certainly
every p ∈ E, 0 < p < 1, is an interior point of E. The only point about which
there may be some doubt is p = 0. However, if 0 < ε < 1, then as in Example
2.2.2(b)

Nε(0) = {x ∈ [0, ∞) : |x| < ε} = [0, ε),

which is a subset of E. Therefore 0 is an interior point of E. Even though
this appears to contradict our intuition, it does not violate the definition of
interior point.
(d) Let X = R2 with metric d2 , and let

E = {(p1 , p2 ) : 1 < p1 < 3, 1 < p2 < 2}.

Given a point p = (p1 , p2 ) ∈ E, if we choose ǫ such that

0 < ǫ < min{|p1 − 1|, |p1 − 3|, |p2 − 1|, |p2 − 2|},

then Nǫ (p) ⊂ E. Therefore every point of E is an interior point of E. 
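
The choice of ε in Example 2.2.4(d) can be explored numerically. The sketch
below is an illustration only: the function names are hypothetical, ε is taken to
be half of the minimum (one admissible choice), and sampling finitely many points
of Nε(p) is a sanity check, not a proof that Nε(p) ⊂ E.

```python
import math


def in_E(x):
    """Membership test for E = {(p1, p2) : 1 < p1 < 3, 1 < p2 < 2}."""
    return 1 < x[0] < 3 and 1 < x[1] < 2


def safe_radius(p):
    """Half of min{|p1 - 1|, |p1 - 3|, |p2 - 1|, |p2 - 2|}, so the strict
    inequality required in Example 2.2.4(d) holds."""
    p1, p2 = p
    return 0.5 * min(abs(p1 - 1), abs(p1 - 3), abs(p2 - 1), abs(p2 - 2))


def sample_neighborhood(p, eps, n_angles=360, n_radii=20):
    """Yield sample points x with d2(p, x) < eps."""
    for k in range(n_radii):
        r = eps * k / n_radii                  # 0 <= r < eps
        for j in range(n_angles):
            t = 2 * math.pi * j / n_angles
            yield (p[0] + r * math.cos(t), p[1] + r * math.sin(t))


p = (2.9, 1.05)                                # a point of E close to two edges
eps = safe_radius(p)
print(eps > 0, all(in_E(x) for x in sample_neighborhood(p, eps)))
```

The closer p lies to the boundary of the rectangle, the smaller the admissible ε,
which is why ε must be allowed to depend on p.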

Open and Closed Sets


Using the notion of an interior point we now define what we mean by an open
set.

DEFINITION 2.2.5
(a) A subset O of R is open if every point of O is an interior point of O.
(b) A subset F of R is closed if F c = R \ F is open.

Remark. From the definition of an interior point it should be clear that a set
O ⊂ R is open if and only if for every p ∈ O there exists an ǫ > 0 (depending on
p) so that Nǫ (p) ⊂ O. Both the definition of interior point and open depend
on the metric of the given set. In situations where there is more than one
metric defined on a given set, we will use the phrase open with respect to d to
emphasize the metric.

EXAMPLES 2.2.6 (a) The entire set R is open. For any p ∈ R and ǫ > 0,
Nǫ (p) ⊂ R. Since R is open, by definition the empty set ∅ is closed. However,
the empty set is also open. Since ∅ contains no points at all, Definition 2.2.5(a)
is vacuously satisfied. Consequently R is also closed.
(b) Every ǫ-neighborhood is open. Suppose p ∈ R and ǫ > 0. If q ∈ Nǫ (p),
then |p − q| < ǫ. Choose δ so that 0 < δ ≤ ǫ − |p − q|. If x ∈ Nδ (q), then

|x − p| ≤ |p − q| + |x − q|
< |p − q| + δ ≤ |p − q| + ǫ − |p − q| = ǫ.

Therefore Nδ(q) ⊂ Nε(p) (see Figure 2.3). Thus q is an interior point of Nε(p).
Since q ∈ Nε(p) was arbitrary, Nε(p) is open.

FIGURE 2.3
A delta neighborhood of q in Example 2.2.6(b)

(c) Let E = (a, b], a < b, be as in Example 2.2.4(a). Since the point b ∈ E
is not an interior point of E, the set E is not open. The complement of E is
given by
E c = (−∞, a] ∪ (b, ∞).
An argument similar to the one given in Example 2.2.4(a) shows that a is not
an interior point of E c . Thus E c is not open and hence by definition E is not
closed. Hence E is neither open nor closed.
(d) Let F = [a, b], a < b. Then

F c = (−∞, a) ∪ (b, ∞),

and this set is open. This can be proved directly, but also follows as a conse-
quence of Theorem 2.2.9(a) below.
(e) Consider the set Q. Since no point of Q is an interior point of Q
(Example 2.2.4(b)), the set Q is not open. Also, Q is not closed. 

The use of the adjective open in describing the intervals (a, b), (a, ∞), (−∞, b)
and (−∞, ∞) is justified by the following theorem:

THEOREM 2.2.7 Every open interval in R is an open subset of R.

Proof. Exercise 1. 

THEOREM 2.2.8 Every ǫ-neighborhood is open.

Proof. The proof is identical to the proof given in Example 2.2.6(b). 

THEOREM 2.2.9 Let (X, d) be a metric space. Then
(a) for any collection {Oα}α∈A of open subsets of X, ⋃_{α∈A} Oα is open,
and
(b) for any finite collection {O1, ..., On} of open subsets of X,
⋂_{j=1}^{n} Oj is open.

Proof. The proof of (a) is left as an exercise (Exercise 2).
(b) If ⋂_{j=1}^{n} Oj = ∅, we are done. Otherwise, suppose

p ∈ O = ⋂_{j=1}^{n} Oj.

Then p ∈ Oi for all i = 1, ..., n. Since Oi is open, there exists an εi > 0 such
that
Nεi(p) ⊂ Oi.
Let ε = min{ε1, ..., εn}. Then ε > 0 and Nε(p) ⊂ Nεi(p) ⊂ Oi for all i.
Therefore Nε(p) ⊂ O, i.e., p is an interior point of O. Since p ∈ O was
arbitrary, O is open. □
For closed subsets we have the following analogue of the previous result.

THEOREM 2.2.10 Let (X, d) be a metric space. Then
(a) for any collection {Fα}α∈A of closed subsets of X, ⋂_{α∈A} Fα is closed,
and
(b) for any finite collection {F1, ..., Fn} of closed subsets of X,
⋃_{j=1}^{n} Fj is closed.

Proof. The proofs of (a) and (b) follow from the previous theorem and
De Morgan's laws:

(⋂_{α∈A} Fα)^c = ⋃_{α∈A} Fα^c,    (⋃_{j=1}^{n} Fj)^c = ⋂_{j=1}^{n} Fj^c. □

Remark. The fact that the intersection of a finite number of open sets is
open is due to the fact that the minimum of a finite number of positive num-
bers is positive. This guarantees the existence of an ǫ > 0 such that the
ǫ-neighborhood of p is contained in the intersection. For an infinite number
of open sets, the choice of a positive ǫ may no longer be possible. This is
illustrated by the following two examples.

EXAMPLES 2.2.11 We now provide two examples to show that part (b)
of Theorem 2.2.9 is in general false for a countable collection of open sets.
Likewise, part (b) of Theorem 2.2.10 is in general also false for an arbitrary
union of closed sets (Exercise 15).
(a) For each n = 1, 2, ..., let On = (−1/n, 1/n). Then each On is open, but

⋂_{n=1}^{∞} On = {0},

which is not open.
(b) Alternately, if we let Gn = (0, 1 + 1/n), n = 1, 2, ..., then again each Gn
is open, but

⋂_{n=1}^{∞} Gn = (0, 1],

which is neither open nor closed. □
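
The contrast between finite and infinite intersections in Example 2.2.11(a) can
be made concrete. The following sketch uses hypothetical helper names and exact
rational arithmetic; it shows that every finite intersection O1 ∩ ··· ∩ ON is
again an open interval, while each x ≠ 0 is excluded by some On.

```python
import math
from fractions import Fraction


def intersect_first(N):
    """Endpoints (lo, hi) of O_1 ∩ ... ∩ O_N, where O_n = (-1/n, 1/n)."""
    lo = max(Fraction(-1, n) for n in range(1, N + 1))
    hi = min(Fraction(1, n) for n in range(1, N + 1))
    return lo, hi


def excluding_index(x):
    """For x != 0, return an n with 1/n <= |x|, so that x is not in O_n."""
    return math.ceil(1 / abs(x))


for N in (1, 10, 1000):
    lo, hi = intersect_first(N)
    print(N, lo, hi)            # the finite intersection is (-1/N, 1/N)

x = Fraction(3, 10)
n = excluding_index(x)
print(n, Fraction(-1, n) < x < Fraction(1, n))   # x fails the test for O_n
```

The radius 1/N of the finite intersections is the minimum of finitely many
positive numbers and hence positive; no positive radius works for all n at once.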

Limit Points

DEFINITION 2.2.12 Let E be a subset of a metric space X.
(a) A point p ∈ X is a limit point of E if every ε-neighborhood Nε(p) of
p contains a point q ∈ E with q ≠ p.
(b) A point p ∈ E that is not a limit point of E is called an isolated point
of E.

Remark. In the definition of limit point it is not required that p is a point of
E. Also, a point p ∈ E is an isolated point of E if there exists an ε > 0 such
that Nε(p) ∩ E = {p}.

EXAMPLES 2.2.13 (a) E = (a, b), a < b. Every point p, a < p < b, is a
limit point of E. This follows from the fact that for any ǫ > 0 there exists a
point x ∈ (a, b) satisfying p < x < p + ǫ. These however are not the only limit
points. Both a and b are limit points of E, but they do not belong to E.
(b) E = {1/n : n = 1, 2, ...}. Each 1/n is an isolated point of E. If ε is
chosen so that

0 < ε < 1/(n(n + 1)) = 1/n − 1/(n + 1),

then Nε(1/n) ∩ E = {1/n}. Hence no point of E is a limit point of E. However,
0 is a limit point of E which does not belong to E. To see that 0 is a limit
point, given ε > 0 choose n ∈ N so that 1/n < ε. Such a choice of n is possible
by Theorem 1.5.1. Then 1/n ∈ Nε(0) ∩ E, and thus 0 is a limit point of E.
(c) Let E = Q ∩ [0, 1]. If p ∉ [0, 1] then p is not a limit point of E. For if
p > 1, then for ε = (1/2)(p − 1) we have Nε(p) ∩ E = ∅. Likewise when p < 0.
On the other hand, every p ∈ [0, 1] is a limit point of E. Let ε > 0 be given.
Suppose first that 0 ≤ p < 1. Then by Theorem 1.5.2 there exists r ∈ Q such
that p < r < min{p + ε, 1}. When p = 1, Theorem 1.5.2 also guarantees the
existence of an r ∈ Q ∩ [0, 1] with p − ε < r < p. Thus for every ε > 0, Nε(p)
contains a point r ∈ E with r ≠ p. The same argument also proves that every
point of R is a limit point of Q. □
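
The two claims of Example 2.2.13(b) lend themselves to a quick computation. The
sketch below is illustrative only: the helper names are not from the text, the
set E is truncated at a finite index k_max, and the radius is taken to be half
of the gap 1/n − 1/(n + 1), which is one admissible choice of ε.

```python
import math
from fractions import Fraction


def isolating_radius(n):
    """An eps with 0 < eps < 1/n - 1/(n + 1): half of that gap."""
    return Fraction(1, 2) * (Fraction(1, n) - Fraction(1, n + 1))


def points_of_E_within(n, eps, k_max=2000):
    """Points 1/k with k <= k_max that lie in N_eps(1/n)."""
    p = Fraction(1, n)
    return [Fraction(1, k) for k in range(1, k_max + 1)
            if abs(Fraction(1, k) - p) < eps]


def witness_index(eps):
    """An n in N with 1/n < eps, showing that N_eps(0) meets E."""
    return math.floor(1 / eps) + 1


print(points_of_E_within(5, isolating_radius(5)))   # only the point 1/5
print(witness_index(Fraction(1, 100)))              # 101, and 1/101 < 1/100
```

The nearest point of E to 1/n is 1/(n + 1), so any radius smaller than their
distance isolates 1/n, while no positive radius isolates 0.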

The following theorem provides a characterization of the closed subsets of
a metric space (X, d).

THEOREM 2.2.14 A subset F of a metric space X is closed if and only if
F contains all its limit points.

Proof. Suppose F is closed. Then by definition F c is open and thus for every
p ∈ F c there exists ǫ > 0 such that Nǫ (p) ⊂ F c , that is, Nǫ (p) ∩ F = ∅.
Consequently no point of F c is a limit point of F . Therefore F must contain
all its limit points.
Conversely, let F be a subset of X that contains all its limit points. To
show F is closed we must show F c is open. Let p ∈ F c . Since F contains all
its limit points, p is not a limit point of F . Thus there exists an ǫ > 0 such
that Nǫ (p) ∩ F = ∅. Hence Nǫ (p) ⊂ F c and p is an interior point of F c . Since
p ∈ F c was arbitrary, F c is open and therefore F is closed. 

THEOREM 2.2.15 Let E be a subset of a metric space X. If p is a limit
point of E, then every ε-neighborhood of p contains infinitely many points of
E.

Proof. Suppose there exists an ε-neighborhood Nε(p) of p that contains only
finitely many points of E. If Nε(p) contains no point of E other than p, then
p is not a limit point of E. Otherwise, let q1, ..., qn be the points of E in
Nε(p) with qi ≠ p, and let

δ = min{d(qi, p) : i = 1, ..., n}.

Then δ > 0 and Nδ(p) contains no point of E other than possibly p. Thus p is
not a limit point of E. □

COROLLARY 2.2.16 A finite set has no limit points.

Closure of a Set

DEFINITION 2.2.17 If E is a subset of a metric space X, let E′ denote
the set of limit points of E. The closure of E, denoted Ē, is defined as

Ē = E ∪ E′.

THEOREM 2.2.18 If E is a subset of a metric space X, then
(a) Ē is closed.
(b) E = Ē if and only if E is closed.
(c) Ē ⊂ F for every closed set F ⊂ X such that E ⊂ F.

Proof. (a) To show that Ē is closed, we must show that (Ē)^c is open. Let
p ∈ (Ē)^c. Then p ∉ E and p is not a limit point of E. Thus there exists an
ε > 0 such that
Nε(p) ∩ E = ∅.
We complete the proof by showing that Nε(p) ∩ E′ is also empty and thus
Nε(p) ∩ Ē = ∅. Therefore Nε(p) ⊂ (Ē)^c, i.e., p is an interior point of (Ē)^c.
Suppose Nε(p) ∩ E′ ≠ ∅. Let q ∈ Nε(p) ∩ E′, and choose δ > 0 such that
Nδ(q) ⊂ Nε(p). Since q ∈ E′, q is a limit point of E and thus Nδ(q) ∩ E ≠ ∅.
But this implies that Nε(p) ∩ E ≠ ∅, which is a contradiction. Therefore,
Nε(p) ∩ E′ = ∅, which proves the result.
(b) If E = Ē, then E is closed by (a). Conversely, if E is closed, then E′ ⊂ E
by Theorem 2.2.14, and thus Ē = E.
(c) If E ⊂ F and F is closed, then every limit point of E is a limit point of
F, so E′ ⊂ F by Theorem 2.2.14. Thus Ē ⊂ F. □

DEFINITION 2.2.19 A subset D of a metric space X is dense in X if
D̄ = X.

The rationals Q are dense in R. By Example 2.2.13(c), every point of R
is a limit point of Q. Hence Q̄ = R. This explains the comment following
Theorem 1.5.2. The rationals are not only dense; they are also countable. The
existence of countable dense subsets plays a very important role in analysis.
They allow us to approximate arbitrary elements in a set by elements chosen
from a countable subset of R. Since the rationals are dense in R, given any
p ∈ R and ε > 0, there exists r ∈ Q such that |p − r| < ε. Additional examples
of this will occur elsewhere in the text.
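
The approximation property just described can be sketched in a few lines. The
construction below is one standard way to produce such an r and is not taken
from the text: choose n ∈ N with 1/n < ε and let r = m/n, where m is the greatest
integer with m ≤ np. The function name is hypothetical, and p is represented by
a floating-point number, so the result is subject to rounding in p itself.

```python
import math
from fractions import Fraction


def rational_within(p, eps):
    """Return r in Q with |p - r| < eps (up to rounding in the float p).

    With n > 1/eps and m = floor(n * p) we have m <= n*p < m + 1, hence
    0 <= p - m/n < 1/n < eps.
    """
    n = math.floor(1 / eps) + 1
    return Fraction(math.floor(n * p), n)


p = math.sqrt(2)
for eps in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**6)):
    r = rational_within(p, eps)
    print(eps, r, abs(p - r) < eps)
```

Since only countably many fractions m/n exist, this is the sense in which a
countable set suffices to approximate every real number.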

Characterization of the Open Subsets of R [1]

If {In} is any finite or countable collection of open intervals, then by Theorem
2.2.9, U = ⋃_n In is an open subset of R. Conversely, every open subset of R
can be expressed as a finite or countable union of open intervals (see Exercise
22). However, a much stronger result is true. We now prove that every open
set can be expressed as a finite or countable union of pairwise disjoint open
intervals. A collection {In} of subsets of R is pairwise disjoint if In ∩ Im = ∅
whenever n ≠ m.

[1] This topic can be omitted upon first reading of the text. The structure of
open sets will only be required in Chapter 10 in defining the measure of an
open subset of R.

THEOREM 2.2.20 If U is an open subset of R, then there exists a finite
or countable collection {In} of pairwise disjoint open intervals such that

U = ⋃_n In.

Proof. Let x ∈ U. Since U is open, there exists an ε > 0 such that

(x − ε, x + ε) ⊂ U.

In particular (s, x] and [x, t) are subsets of U for some s < x and some t > x.
Define rx and lx as follows:

rx = sup{t : t > x and [x, t) ⊂ U}, and
lx = inf{s : s < x and (s, x] ⊂ U}.

Then x < rx ≤ ∞ and −∞ ≤ lx < x. For each x ∈ U, let Ix = (lx, rx). Then
(a) Ix ⊂ U, and
(b) if x, y ∈ U, then either Ix = Iy or Ix ∩ Iy = ∅.
The proofs of (a) and (b) are left as exercises (Exercise 21).
To complete the proof, we let ℐ = {Ix : x ∈ U}. For each interval I ∈ ℐ,
choose rI ∈ Q such that rI ∈ I. If I, J ∈ ℐ are distinct intervals, then by (b)
they are disjoint, and hence rI ≠ rJ. Therefore the mapping I → rI is a
one-to-one mapping of ℐ into Q. Thus the collection ℐ is at most countable and
therefore can be enumerated as {Ij}j∈A, where A is either a finite subset of N,
or A = N. Since x ∈ Ix ⊂ U for every x ∈ U,

U = ⋃_{j∈A} Ij,

and by (b), if n ≠ j, then In ∩ Ij = ∅. Thus the collection {Ij}j∈A is pairwise
disjoint. □
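
For an open set given as a union of finitely many open intervals, the disjoint
intervals of Theorem 2.2.20 can be computed directly. The following is a sketch
under that finiteness assumption, using a sort-and-merge procedure rather than
the supremum and infimum of the proof; the function name is hypothetical.

```python
def disjoint_components(intervals):
    """Given finitely many open intervals (a, b) with a < b, return pairwise
    disjoint open intervals, in increasing order, with the same union."""
    merged = []
    for a, b in sorted(intervals):
        # (a1, b1) and (a, b) with a1 <= a have a common point iff a < b1.
        if merged and a < merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged


print(disjoint_components([(0, 2), (1, 3), (3, 4), (5, 6), (5.5, 7)]))
# [(0, 3), (3, 4), (5, 7)]
```

Note that (0, 3) and (3, 4) are not merged: the point 3 does not belong to the
union, so the two intervals are distinct sets Ix in the sense of the proof.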

Relatively Open and Closed Sets


One of the reasons for studying topological concepts is to enable us to study
properties of continuous functions. In most instances, the domain of a function
is not all of R but rather a proper subset of R, as is the case with f(x) = √x,
for which Dom f = [0, ∞). When discussing a particular function we will always
restrict our attention to the domain of the function rather than all of R. With
this in mind we make the following definition.

DEFINITION 2.2.21 Let Y be a subset of a metric space X.
(a) A subset U of Y is open in (or open relative to) Y if for every p ∈ U,
there exists ε > 0 such that Nε(p) ∩ Y ⊂ U.
(b) A subset C of Y is closed in (or closed relative to) Y if Y \ C is open
in Y.

EXAMPLE 2.2.22 Let X = [0, ∞) and let U = [0, 1). Then U is not open
in R but is open in X. (Why?) 

The following theorem, the proof of which is left as an exercise (Exercise
24), provides a simple characterization of what it means for a set to be open
or closed in a subset of X.

THEOREM 2.2.23 Let Y be a subset of a metric space X.
(a) A subset U of Y is open in Y if and only if U = Y ∩ O for some open
subset O of X.
(b) A subset C of Y is closed in Y if and only if C = Y ∩ F for some
closed subset F of X.

Connected Sets [2]

Our final topic of this section involves the notion of a connected set. The idea
of connectedness is just one more of the many mathematical concepts which
have their roots in the studies of Cantor on the structure of subsets of R. When
we use the term connected subset of R, intuitively we are inclined to think of
an interval as opposed to sets such as the positive integers N or (0, 1) ∪ {2}.
We make this precise with the following definition.

[2] This concept, although important and used implicitly in several instances
in the text, will not be required specifically in subsequent chapters except in
a few exercises. Thus the topic of connectedness can be omitted upon first
reading of the text.

DEFINITION 2.2.24 A subset A of a metric space X is connected if
there do not exist two disjoint open sets U and V such that
(a) A ∩ U ≠ ∅ and A ∩ V ≠ ∅, and
(b) (A ∩ U) ∪ (A ∩ V) = A.

The definition for a connected set differs from most definitions in that it
defines connectedness by negation; i.e., defining what it means for a set not
to be connected. According to the definition, a set A is not connected if there
exist disjoint open sets U and V satisfying both (a) and (b). As an example of
a subset of R which is not connected, consider the set of positive integers N.
If we let U = (1/2, 3/2) and V = (3/2, ∞), then U and V are disjoint open
subsets of R with
U ∩ N = {1} and V ∩ N = {2, 3, ...}
that also satisfy (U ∩ N) ∪ (V ∩ N) = N. That the interval (a, b) is connected
is a consequence of the following theorem, the proof of which is left to the
exercises (Exercise 27).

THEOREM 2.2.25 A subset of R is connected if and only if it is an interval.

Exercises 2.2
1. Prove Theorem 2.2.7.
2. *Prove Theorem 2.2.9(a).
3. *a. Show that every finite subset of R is closed.
b. Show that the intervals (−∞, a] and [a, ∞) are closed subsets of R.
4. Let X be the metric space of Example 2.1.7(g). Let p = (1/2, 0). Describe
the ε-neighborhoods of p for each of the following values of ε: ε = 1/4,
ε = 1/2, ε = 3/4.

5. For j = 1, 2, ∞, let dj denote the metrics given in Examples 2.1.7 (d)
and (e). For each j and ε > 0, let N^j_ε(p) denote the ε-neighborhood of
the point p ∈ R² with respect to dj.
a. For p = (0, 0) and ε = 1, sketch the ε-neighborhoods N^j_ε(p) for
j = 1, ∞.
*b. Prove that for ε > 0,
N^1_ε(p) ⊂ N^2_ε(p) ⊂ N^∞_ε(p) ⊂ N^1_{2ε}(p).
c. Using the results of (b), prove that a set U is open with respect to
one of the metrics d1, d2, d∞ if and only if it is open with respect to the
other two.
6. If U and V are open subsets of R, prove that U × V is an open subset of
R².
7. Consider the metric space (X, d) of Example 2.1.7(c). Prove that every
subset of X is open.
8. For the following subsets E of R, find each of the following: Int(E), E′,
the isolated points of E, if any, and Ē. Determine whether the set E is open,
closed, or neither.
*a. (0, 1) ∪ {2} b. (a, b) c. (a, b] *d. {1/n : n ∈ N} e. Q ∩ [0, 1].
9. Let E ⊂ R. A point p ∈ R is a boundary point of E if for every ε > 0,
Nε(p) contains both points of E and points of E^c. Find the boundary
points of each of the following sets.
*a. (a, b) b. E = {1/n : n ∈ N} c. N d. Q
10. a. Prove that a set E ⊂ R is open if and only if E does not contain any
of its boundary points.
b. Prove that a set E ⊂ R is closed if and only if E contains all its
boundary points.
11. For each of the following subsets E of R², find Int(E) and Ē.
a. E = (1, 2) × [−1, 1] b. E = {(x, y) : −1 < x ≤ 2, y ∈ R}
c. E = {(x, y) : y = x} d. E = {(x, y) : y ≤ x + 1}
e. E = {p ∈ R² : 0 < d2(p, 0) < 1}
12. a. Construct a set with exactly two limit points.
b. Find an infinite subset of R with no limit points.
c. Construct a countable subset of R with countably many limit points.
d. Find a countable subset of R with uncountably many limit points.
13. Let X = (0, ∞). For each of the following subsets of X determine whether
the given set is open in X, closed in X, or neither.
*a. (0, 1] b. (0, 1) *c. (0, 1] ∪ (2, 3) d. (0, 1] ∪ {2}
14. For each of the following subsets of Q, determine whether the set is open
in Q, closed in Q, both open and closed in Q, or neither.
a. A = {p ∈ Q : 1 < p < 2}. b. B = {p ∈ Q : 2 < p² < 3}. c. N.
15. a. Prove Theorem 2.2.10(a).
b. Give an example of a countable collection {Fn}_{n=1}^{∞} of closed subsets
of R such that ⋃_{n=1}^{∞} Fn is not closed.
16. *Let A be a nonempty subset of R that is bounded above, and let α =
sup A. If α ∉ A, prove that α is a limit point of A.
17. Let (X, d) be a metric space, and E ⊂ X.
*a. Prove that Int(E) is open.
b. Prove that E is open if and only if E = Int(E).
c. If G ⊂ E and G is open, prove that G ⊂ Int (E).
18. Let A, B be subsets of R.
a. If A ⊂ B, show that Int(A) ⊂ Int(B).
b. Show that Int(A ∩ B) = Int(A) ∩ Int(B).
c. Is Int(A ∪ B) = Int(A) ∪ Int(B)?
d. Are the results of (a) and (b) still true if A and B are subsets of a
metric space X ?
19. Let A, B be subsets of a metric space X.
*a. Show that the closure of A ∪ B is equal to Ā ∪ B̄.
b. Show that the closure of A ∩ B is contained in Ā ∩ B̄.
c. Give an example for which the containment in (b) is proper.
20. Let D0 = {0, 1}, and for each n ∈ N, let

Dn = {a/2^n : a ∈ N, a is odd, 0 < a < 2^n}. Let D = ⋃_{n=0}^{∞} Dn.

Prove that D is a countable dense subset of [0, 1].

21. Prove statements (a) and (b) of Theorem 2.2.20.
22. *Prove that there exists a countable collection ℐ of open intervals such
that if U is an open subset of R, there exists a finite or countable collection
{In} ⊂ ℐ with U = ⋃_n In.
23. *If U is an open subset of R, prove that E ⊂ U is open in U if and only
if E is an open subset of R.
24. Prove Theorem 2.2.23.
25. For each of the following, use the definition to prove that the given set is
not connected.
*a. (0, 1) ∪ {2} b. {1/n : n = 1, 2, ...} c. {p ∈ Q : p > 0 and 1 < p² < 3}.
26. *If A is connected, prove that Ā is connected.
27. *Prove Theorem 2.2.25.

2.3 Compact Sets

In this section, we introduce the concept of a compact set in the setting of
metric spaces. A characterization of the compact subsets of R is provided in
Section 2.4. The notion of a compact set is very important in the study of
analysis, and many significant results in the text will depend on the fact that
every closed and bounded interval in R is compact. The modern definition of
a compact set given in Definition 2.3.3 dates back to the second half of the
nineteenth century and the studies of Heine and Borel on compact subsets of R.

DEFINITION 2.3.1 Let E be a subset of a metric space (X, d). A collection
{Oα}α∈A of open subsets of X is an open cover of E if

E ⊂ ⋃_{α∈A} Oα.

An alternate definition is as follows: The collection {Oα}α∈A of open sets
is an open cover of E if for each p ∈ E, there exists an α ∈ A such that
p ∈ Oα.

EXAMPLES 2.3.2 (a) Let E = (0, 1) and On = (0, 1 − 1/n), n = 2, 3, ....
Then {On}_{n=2}^{∞} is an open cover of E. To see this, suppose x ∈ E. Then
since x < 1, there exists an integer n ≥ 2 such that x < 1 − 1/n. Thus x ∈ On,
and as a consequence

E ⊂ ⋃_{n=2}^{∞} On,

which proves the assertion. In fact, since On ⊂ E for each n, we have
E = ⋃_{n=2}^{∞} On.
(b) Let F = [0, ∞) and for each n ∈ N let Un = (−1, n). Then {Un}n∈N
is an open cover of F. □

DEFINITION 2.3.3 A subset K of X is compact if every open cover of
K has a finite subcover of K; that is, if {Oα}α∈A is an open cover of K, then
there exist α1, ..., αn ∈ A such that

K ⊂ ⋃_{j=1}^{n} Oαj.

EXAMPLES 2.3.4 (a) Every finite set is compact. Suppose E =
{p1, ..., pn} is a finite subset of R and {Oα}α∈A is an open cover of E. Then
for each j, j = 1, ..., n, there exists αj ∈ A such that pj ∈ Oαj. But then
{Oαj}_{j=1}^{n} is a finite sub-collection which covers E.
(b) The open interval (0, 1) is not compact. For the open cover
{On}_{n=2}^{∞} of (0, 1) in Example 2.3.2(a), no finite sub-collection can cover
(0, 1). Suppose on the contrary that a finite number, say On1, ..., Onk, cover
(0, 1). Let N = max{n1, ..., nk}. Then

(0, 1) ⊂ ⋃_{j=1}^{k} Onj ⊂ ⋃_{n=2}^{N} On = (0, 1 − 1/N),

which is a contradiction.
(c) The closed set F = [0, ∞) is not compact. For the open cover U =
{(−1, n)}n∈N of F , no finite sub-collection can cover F . If there exist a finite
number of sets in U which cover F , then there exists N ∈ N such that F ⊂
(−1, N ). (Why?) This however is a contradiction. 
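
The argument of Example 2.3.4(b) is constructive: for any finite selection from
the cover {On} it names a point of (0, 1) that is missed. The sketch below
mirrors that argument with hypothetical function names; the witness chosen here
is the right endpoint 1 − 1/N of the largest selected set.

```python
from fractions import Fraction


def in_O(x, n):
    """Membership test for O_n = (0, 1 - 1/n), n >= 2."""
    return 0 < x < 1 - Fraction(1, n)


def uncovered_point(indices):
    """Given finitely many indices n >= 2, return a point of (0, 1) lying in
    none of the sets O_n: with N the largest index, take x = 1 - 1/N."""
    N = max(indices)
    return 1 - Fraction(1, N)


indices = [2, 5, 17]
x = uncovered_point(indices)
print(x, 0 < x < 1, any(in_O(x, n) for n in indices))   # 16/17 True False
```

Whatever finite list of indices is supplied, the returned point lies in (0, 1)
and outside every selected On, so no finite sub-collection covers (0, 1).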

Properties of Compact Sets


Before we provide a characterization of the compact subsets of R, we first
prove several properties of compact sets that require only the definition of
compactness. As a consequence, all three of the following theorems are true in
more general settings; e.g. in n−dimensional space Rn as well as in a general
metric space (X, d).

THEOREM 2.3.5
(a) Every compact subset of a metric space is closed.
(b) Every closed subset of a compact set is compact.

Proof. (a) Let K be a compact subset of a metric space X. To show that K is
closed, we need to show that K^c is open. Let p ∈ K^c be arbitrary. For each
q ∈ K, choose εq > 0 such that

Nεq(q) ∩ Nεq(p) = ∅.

Any εq satisfying 0 < εq < (1/2) d(p, q) will work. Then {Nεq(q)}q∈K is an open
cover of K. Since K is compact, there exist q1, ..., qn such that

K ⊂ ⋃_{j=1}^{n} Nεqj(qj).

Let ε = min{εqj : j = 1, ..., n}, which is positive. Then

Nε(p) ∩ Nεqj(qj) = ∅ for all j = 1, ..., n.

Thus Nε(p) ∩ K = ∅, i.e., Nε(p) ⊂ K^c. Therefore p is an interior point of K^c
and thus K^c is open.
(b) Let F be a closed subset of the compact set K and let {Oα}α∈A be an
open cover of F. Then
{Oα}α∈A ∪ {F^c}
is an open cover of K. Since K is compact, a finite number of these will cover
K, and hence also F. Since F^c contains no point of F, discarding F^c (if it was
selected) leaves a finite sub-collection of {Oα}α∈A that covers F. □

COROLLARY 2.3.6 If F is closed and K is compact, then F ∩ K is compact.

As a consequence of the previous theorem, the open interval (0, 1) is not
compact since it is not closed. However, being closed is not sufficient for a set
to be compact. The half-open interval [0, ∞) is closed in R, but as shown in
Example 2.3.4(c) is not compact. In Theorem 2.4.2 we will provide necessary
and sufficient conditions for a subset of R to be compact.
Remark. In proving that the compact set K was closed, compactness allowed
us to select a finite subcover from the constructed open cover of K. Finiteness
then assured that the ǫ as defined was positive. This method of first construct-
ing an open cover possessing certain properties and then using compactness
to assure the existence of a finite subcover will be used on other occasions in
the text.

THEOREM 2.3.7 If E is an infinite subset of a compact set K, then E has
a limit point in K.

Proof. If no point of K is a limit point of E, then for each q ∈ K, there exists
a neighborhood Nq of q so that Nq contains at most one point of E, namely
q if q ∈ E. Since E is infinite, no finite sub-collection of {Nq}q∈K can cover
E, and consequently no finite sub-collection of {Nq}q∈K can cover K, which
is a contradiction. □
Another useful consequence of compactness is the following analogue of
what is known as the nested intervals property (see Exercise 3 of Section 2.4).

THEOREM 2.3.8 If {Kn}_{n=1}^{∞} is a sequence of nonempty compact subsets
of X with Kn ⊃ Kn+1 for all n, then

K = ⋂_{n=1}^{∞} Kn

is nonempty and compact.

Proof. We first show that ⋂_{n=1}^{∞} Kn ≠ ∅. Let On = Kn^c. By Theorem 2.3.5
Kn is closed and thus On is open. Furthermore,

⋂_{n=1}^{∞} Kn = ∅ if and only if ⋃_{n=1}^{∞} On = X.

Thus if ⋂_{n=1}^{∞} Kn = ∅, then {On}_{n=1}^{∞} is an open cover of X, and thus
also of K1. But K1 is compact. Therefore there exist n1 < · · · < nk such that

K1 ⊂ ⋃_{j=1}^{k} Onj.

But then K1 ∩ Kn1 ∩ · · · ∩ Knk = ∅. This however is a contradiction, since
the intersection is equal to Knk, which by hypothesis is nonempty. Thus
K = ⋂_{n=1}^{∞} Kn ≠ ∅. By Theorem 2.2.10 K is closed, and since K is a closed
subset of the compact set K1, by Theorem 2.3.5(b), K is compact. □

Exercises 2.3
1. Let A = {1/n : n = 1, 2, ...}.
a. Show that the set A is not compact.
*b. Prove directly (using the definition) that K = A ∪ {0} is compact.
2. Show that (0, 1] is not compact by constructing an open cover of (0, 1]
that does not have a finite subcover.
3. Suppose A and B are compact subsets of a metric space X.
*a. Prove (using only the definition) that A ∪ B is compact.
b. Prove that A ∩ B is compact.
4. *Let K be a nonempty compact subset of R. Prove that sup K and inf K
exist and are in K.
5. Construct a compact subset K of R with an infinite number of isolated
points. Justify that your set K is compact.
6. Suppose K is an infinite compact subset of a metric space (X, d). Prove
that there exists a countable subset D of K such that D̄ = K.

2.4 Compact Subsets of R


We now turn to our goal of providing a characterization of the compact subsets
of the real line R. The first of the two main results is attributed to Eduard
Heine (1821–1881) and Émile Borel (1871–1956), whereas the second is due
to Bernhard Bolzano (1781–1848) and Karl Weierstrass (1815–1897). The two
theorems rank very high among the many important advances in the foundations
of analysis during the nineteenth century. The importance of these
results will become evident in later chapters. As is to be expected, the least
upper bound property of R will play a crucial role in the proofs of these
theorems.

THEOREM 2.4.1 (Heine-Borel) Every closed and bounded interval [a, b]
is compact.

Proof. Let U = {Uα }α∈A be an open cover of [a, b] and let

E = {r ∈ [a, b] : [a, r] is covered by a finite number of the sets Uα }.

The set E is bounded above by b, and since a ∈ Uα for some α ∈ A, we have
a ∈ E, so E is nonempty. Thus by the least upper bound property the supremum
of E exists in R. Let α = sup E. Since b is an upper bound of E, α ≤ b.
We first show that α ∈ E, i.e., [a, α] is covered by a finite number of sets
in U . Since α ∈ [a, b], α ∈ Uβ for some β ∈ A. Since Uβ is open, there exists
ǫ > 0 such that (α − ǫ, α + ǫ) ⊂ Uβ . Furthermore, since α − ǫ is not an upper
bound of E, there exists r ∈ E such that α − ǫ < r ≤ α. But then [a, r] is
covered by a finite number, say Uα1 , ..., Uαn , of sets in U . But then the finite
collection {Uα1 , ..., Uαn , Uβ } covers [a, α]. Therefore, α ∈ E.
To conclude the proof we show that α = b. Suppose α < b. If we choose
s < b such that α < s < α + ǫ, then the collection {Uα1 , ..., Uαn , Uβ } also
covers [a, s]. Thus s ∈ E which contradicts that α = sup E. Hence α must
equal b. 
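
When the open cover of [a, b] consists of finitely many open intervals to begin
with, the idea of the proof (push the covered portion [a, r] as far to the right
as possible) becomes a simple sweep. The sketch below is an illustration under
that finiteness assumption and is not the argument of the proof itself, which
must handle arbitrary covers; the function name is hypothetical.

```python
def finite_subcover(a, b, cover):
    """Select intervals from `cover`, a finite list of open intervals (l, r),
    whose union contains [a, b]. Return None if `cover` fails to cover [a, b]."""
    chosen = []
    reach = a          # every point of [a, reach) is covered by `chosen`
    while True:
        # Among the intervals containing `reach`, take the one that extends
        # furthest to the right.
        candidates = [(l, r) for (l, r) in cover if l < reach < r]
        if not candidates:
            return None            # the point `reach` is not covered
        best = max(candidates, key=lambda iv: iv[1])
        chosen.append(best)
        if best[1] > b:
            return chosen          # [a, b] is now covered
        reach = best[1]            # right endpoint is not in `best` itself


cover = [(-1, 0.3), (0.2, 0.5), (0.1, 0.6), (0.55, 0.9), (0.4, 0.8), (0.85, 1.2)]
print(finite_subcover(0, 1, cover))
# [(-1, 0.3), (0.1, 0.6), (0.55, 0.9), (0.85, 1.2)]
print(finite_subcover(0, 1, [(-1, 0.5), (0.5, 2)]))   # None: 0.5 is missed
```

Each pass strictly increases reach and selects a new interval, so the loop stops
after at most as many passes as there are intervals in the list.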
The statement of the Heine-Borel theorem was initially due to Heine, a
student of Weierstrass, who used the result implicitly in the 1870s in his
studies on continuous functions. The theorem was proved by Borel in 1894
for the case where the open cover was countable. For an arbitrary open cover
the result was finally proved in 1904 by Henri Lebesgue (1875–1941). Using
the Heine-Borel theorem we now prove the following characterization of the
compact subsets of R.

THEOREM 2.4.2 (Heine-Borel-Bolzano-Weierstrass) Let K be a subset
of R. Then the following are equivalent:
(a) K is closed and bounded.
(b) K is compact.
(c) Every infinite subset of K has a limit point in K.

Remark. A subset E of R is bounded if it is bounded both above and below,
i.e., there exists a constant M such that |x| ≤ M for all x in E.

Proof. (a) ⇒ (b). Since K is bounded, there exists a positive constant M so
that K ⊂ [−M, M]. Since [−M, M] is compact and K is closed, by Theorem
2.3.5(b) K is compact.
(b) ⇒ (c). This is Theorem 2.3.7.
(c) ⇒ (a). Suppose the set K is not bounded. Then for every n ∈ N there
exists pn ∈ K with pn ≠ pm for n ≠ m such that |pn| > n for all n. Then
{pn : n = 1, 2, ...} is an infinite subset of K with no limit point in R, and hence
none in K, which is a contradiction. Thus K is bounded.
Let p be a limit point of K. By definition of limit point, for each n ∈ N
there exists pn ∈ K with pn ≠ p such that

|pn − p| < 1/n.

Then S = {pn : n = 1, 2, ...} is an infinite subset of K, and p is a limit point
of S. To complete the proof we must show that p is the only limit point of S,
and hence by hypothesis must be in K, i.e., K is closed.
Suppose q ∈ R with q ≠ p. Let ε = (1/2)|p − q| and choose N ∈ N such that
1/N < ε. Then for all n ≥ N

|p − q| ≤ |pn − q| + |pn − p| < |pn − q| + 1/n < |pn − q| + ε.

Therefore, for all n ≥ N,

|pn − q| > |p − q| − ε = (1/2)|p − q|.

Thus Nε(q) contains at most finitely many pn, and as a consequence of Theorem
2.2.15, q cannot be a limit point of S. Therefore no q ∈ R with q ≠ p is a limit
point of S. □
Statement (c) of the previous theorem is basically what is referred to as
the Bolzano-Weierstrass theorem, which we state for completeness. The proof
of the result follows immediately from Theorems 2.3.7 and 2.4.2.

THEOREM 2.4.3 (Bolzano-Weierstrass) Every bounded infinite subset
of R has a limit point.

The theorem was originally proved by Bolzano, and modified slightly in
the 1860s by Weierstrass. This result can also be proved directly using the
nested interval property (Exercise 7). Although we restricted ourselves to R,
the analogous statement of Theorem 2.4.2 is also true in Rn . The proof of the
following theorem, for n = 2, is left as an exercise. The proof for n > 2 is
identical to the case n = 2.

THEOREM 2.4.4 A subset E of Rn is compact if and only if E is closed


and bounded.

As in the Remark following Theorem 2.4.2, a subset E of Rn is bounded


if and only if there exists a positive constant M such that d2 (p, 0) ≤ M for
all p ∈ E. This is equivalent to

|pi | ≤ M, i = 1, . . . , n

for all p = (p1 , . . . , pn ) ∈ E.

Exercises 2.4
1. *Find a countable collection \(\{K_n\}_{n=1}^{\infty}\) of compact subsets of R such that
\(\bigcup_{n=1}^{\infty} K_n\) is not compact.
2. a. Suppose I and J are closed and bounded intervals in R. Prove that
I × J is a compact subset of R2 .

b. Prove that a subset E of R2 is compact if and only if it is closed and


bounded.
3. *(Nested Intervals Property) Let {In} be a countable family of
nonempty closed and bounded intervals satisfying In ⊃ In+1 for all n.
Prove that there exists a ≤ b in R such that
\[ \bigcap_{n=1}^{\infty} I_n = [a, b]. \]

4. Let d denote the usual metric on R and let ρ be the metric on R given
by
\[ \rho(x, y) = \frac{|x - y|}{1 + |x - y|}. \]
a. Prove that a subset of R is open with respect to the metric d if and
only if it is open with respect to ρ.
b. Show that [0, ∞) is closed and bounded in the metric ρ but that [0, ∞)
is not a compact subset of the metric space (R, ρ).
5. Let X = Q with metric d(p, q) = |p − q|. Let E = {p ∈ Q : p ≥ 0, p² < 2}.
Show that E is closed and bounded in Q, but not compact.
6. This exercise outlines an alternate proof of the Heine-Borel Theorem.
Suppose [a, b] is not compact. Then there exists an open cover U =
{Uα}α∈A of [a, b] such that no finite sub-collection of U covers [a, b]. We
now proceed to show that this leads to a contradiction. Divide [a, b] into
two closed subintervals [a, (a+b)/2] and [(a+b)/2, b], each of length (b − a)/2. At
least one of these, call it I1, cannot be covered by a finite number of the
Uα. Repeating this process, we obtain a sequence {In} of closed and bounded
intervals satisfying (a) [a, b] ⊃ I1 ⊃ I2 ⊃ · · · ⊃ In ⊃ · · · , (b) length of
In = (b − a)/2^n, and (c) for each n, In is not covered by a finite number
of the Uα. Now use Exercise 3 to obtain a contradiction.
7. Prove the Bolzano-Weierstrass Theorem using the nested intervals prop-
erty.

2.5 The Cantor Set


In this section, we will construct a compact subset of [0, 1], known as the
Cantor set, that has a number of interesting properties. This set is constructed
by induction, the first two stages of which are illustrated in Figure 2.4.
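Let P0 = [0, 1]. From P0 remove the middle third open interval (1/3, 2/3). This
leaves two disjoint closed intervals
\[ J_{1,1} = \left[0, \tfrac{1}{3}\right], \qquad J_{1,2} = \left[\tfrac{2}{3}, 1\right]. \]

Set P1 = J1,1 ∪ J1,2.

From each of J1,1 and J1,2 remove the middle third open intervals
\[ \left(\tfrac{1}{3^2}, \tfrac{2}{3^2}\right) \quad\text{and}\quad \left(\tfrac{7}{3^2}, \tfrac{8}{3^2}\right) \]
of length 1/3². This leaves 2² disjoint closed intervals J2,1, J2,2, J2,3, J2,4 of
length 1/3²; namely
\[ \left[0, \tfrac{1}{3^2}\right], \quad \left[\tfrac{2}{3^2}, \tfrac{3}{3^2}\right], \quad \left[\tfrac{6}{3^2}, \tfrac{7}{3^2}\right], \quad \left[\tfrac{8}{3^2}, \tfrac{9}{3^2}\right]. \]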

Set P2 = J2,1 ∪ J2,2 ∪ J2,3 ∪ J2,4 . In Figure 2.4, the shaded intervals indicate
the open intervals that are removed at each stage of the construction.

FIGURE 2.4
Construction of the Cantor set

We continue this process inductively. At the nth step, each Pn is the union
of 2^n disjoint closed intervals each of length 1/3^n, i.e.,
\[ P_n = \bigcup_{j=1}^{2^n} J_{n,j}, \]
where for each j, Jn,j is a closed interval of the form
\[ J_{n,j} = \left[\frac{x_j}{3^n}, \frac{x_j + 1}{3^n}\right]. \]
Since each Pn is a finite union of closed intervals, Pn is closed and bounded,
hence compact. Furthermore, since P0 ⊃ P1 ⊃ P2 ⊃ · · · , by Theorem 2.3.8,
\[ P = \bigcap_{n=0}^{\infty} P_n \]
is a nonempty compact subset of [0, 1]. The set P is called the Cantor ternary
set.
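The inductive construction above can be sketched computationally. The following Python fragment is only an illustration and is not part of the text; the function name `cantor_stage` is a label chosen here. Assuming exact rational arithmetic, it builds the closed intervals whose union is Pn by repeatedly discarding open middle thirds.

```python
from fractions import Fraction

def cantor_stage(n):
    """Return the closed intervals J_{n,1}, ..., J_{n,2^n} whose union is P_n."""
    intervals = [(Fraction(0), Fraction(1))]  # P_0 = [0, 1]
    for _ in range(n):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            # Remove the open middle third, keeping the two outer closed thirds.
            next_intervals.append((left, left + third))
            next_intervals.append((right - third, right))
        intervals = next_intervals
    return intervals

# P_2 consists of [0, 1/9], [2/9, 1/3], [2/3, 7/9], [8/9, 1].
for left, right in cantor_stage(2):
    print(left, right)
```

Each call returns 2^n intervals of length 1/3^n, in agreement with the description of Pn above.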
We now consider some of the properties of the set P .
Property 1 P is compact and nonempty.
Property 2 P contains all the endpoints of the closed intervals {Jn,k},
n = 1, 2, ..., k = 1, 2, ..., 2^n.
Property 3 Every point of P is a limit point of P.
Proof. Let p ∈ P and let ε > 0 be given. Choose m ∈ N such that 1/3^m < ε.
Since p ∈ Pm, p ∈ Jm,k for some k, 1 ≤ k ≤ 2^m. But
\[ J_{m,k} = \left[\frac{x_k}{3^m}, \frac{x_k + 1}{3^m}\right]. \]
Since length of Jm,k = 1/3^m < ε, Jm,k ⊂ Nε(p). Thus both endpoints of Jm,k
are in P ∩ Nε(p), and at least one of these is distinct from p. □
Property 4 The sum of the lengths of the intervals removed is 1.
Proof. At step 1, we removed one interval of length 1/3. At the second step,
we removed two intervals of length 1/3². At the nth step, to obtain Pn, we
removed 2^(n−1) intervals of length 1/3^n. Thus we obtain that
\[ \text{Sum of the lengths of the intervals removed} = \frac{1}{3} + 2\,\frac{1}{3^2} + \cdots + 2^{n-1}\,\frac{1}{3^n} + \cdots \]
\[ = \sum_{n=1}^{\infty} \frac{2^{n-1}}{3^n} = \frac{1}{3} \sum_{n=0}^{\infty} \left(\frac{2}{3}\right)^n = \frac{1}{3} \cdot \frac{1}{1 - \frac{2}{3}} = 1. \quad □ \]
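As a numerical companion to the proof of Property 4, the partial sums of the series can be evaluated exactly. The sketch below is illustrative only; `removed_length` is a name introduced here, not notation from the text. It checks that the length removed through step n equals 1 − (2/3)^n, which tends to 1.

```python
from fractions import Fraction

def removed_length(n):
    """Total length of the open intervals removed in steps 1 through n."""
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, n + 1))

for n in (1, 2, 5, 20):
    # The partial sum equals 1 - (2/3)^n, i.e. 1 minus the total length of P_n.
    assert removed_length(n) == 1 - Fraction(2, 3) ** n
    print(n, float(removed_length(n)))
```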

As a consequence of Property 4,
Property 5 P contains no intervals.
For x ∈ [0, 1] let x = .n1 n2 n3 ... be the ternary expansion. As we indicated
in Section 1.6, this expansion is unique except when
\[ x = \frac{a}{3^m}, \qquad a \in \mathbb{N} \ \text{with}\ 0 < a < 3^m, \]
where 3 does not divide a. In this case x has two expansions: a finite expansion
\[ x = \frac{a_1}{3} + \cdots + \frac{a_m}{3^m}, \qquad a_m \in \{1, 2\}, \]
and an infinite expansion. If am = 2, we will use the finite expansion. If
am = 1, we will use the infinite expansion
\[ x = \frac{a_1}{3} + \cdots + \frac{0}{3^m} + \sum_{k=m+1}^{\infty} \frac{2}{3^k}. \]

With this convention we have


Property 6 If for each x ∈ [0, 1], x = .n1 n2 n3 .... is the ternary expansion
of x, then
x∈P if and only if nk ∈ {0, 2}.

Proof. Exercise 2. 
As a consequence of Property 6 and Theorem 1.7.18,
Property 7 P is uncountable.
For each n, the set Pn has only a finite number of endpoints. As a con-
sequence, the set of points of P which are endpoints of some open interval
removed in the construction is countable. Since P is uncountable, P contains
points other than endpoints. By Exercise 1 of Section 1.6, the ternary expansion
of 1/4 is
\[ \frac{1}{4} = .020202\ldots \]
Thus 1/4 ∈ P, but 1/4 is not an endpoint of any interval removed in the construction.
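Property 6 suggests a mechanical membership test for rational numbers. The following sketch is not from the text; the function `in_cantor_set` is a hypothetical helper. Assuming the convention adopted above (a final digit 1 is rewritten as 0222...), it generates ternary digits one at a time and stops as soon as the remainders repeat.

```python
from fractions import Fraction

def in_cantor_set(x):
    """Decide whether a rational x lies in the Cantor set via its ternary digits."""
    x = Fraction(x)
    if x < 0 or x > 1:
        return False
    if x == 1:
        return True  # 1 = .2222... uses only the digit 2
    seen = set()
    while x not in seen:  # the remainders of a rational eventually repeat
        seen.add(x)
        x *= 3
        digit = x.numerator // x.denominator  # next ternary digit: 0, 1, or 2
        x -= digit
        if digit == 1:
            # A final digit 1 may be rewritten as 0222..., so it is allowed
            # only when nothing follows it (remainder zero).
            return x == 0
    return True

print(in_cantor_set(Fraction(1, 4)))  # True
print(in_cantor_set(Fraction(1, 2)))  # False
```

The test for 1/4 reproduces the expansion .020202... noted above, while 1/2 = .1111... fails at the first digit.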
Remark. By Property 4, the sum of the lengths of the intervals removed is 1.
This seems to imply that P is in some sense very “small.” On the other hand,
by Property 7 P is uncountable, which seems to imply that P is “large.”

Exercises 2.5
1. Determine whether 1/13 is in the Cantor set.
2. Prove Property 6 of the Cantor set.
3. Let 0 < α < 1. Construct a closed subset F of [0, 1] in a manner similar
to the construction of the Cantor set such that the sum of the lengths of
all the intervals removed is α.
4. *Prove that the Cantor set P is equivalent to [0, 1].

Notes
Without a doubt, the most important concept of this chapter is compactness. The
fact that every open cover of a compact set has a finite subcover will be crucial in
the study of continuous functions, especially uniform continuity. As we will see in
many instances, the applications of compactness depend on the ability to choose a
finite subcover from a particular open cover. A good example of this is the proof of
Theorem 2.3.5. Other instances will occur later in the text.
Since compactness is the most important concept, Theorems 2.4.1 and 2.4.2
are the two most important results. In the Heine-Borel theorem we proved that
every closed and bounded interval is compact, whereas in the Heine-Borel-Bolzano-Weierstrass
theorem we characterized the compact subsets of R. In Theorem 2.3.7
we proved that if K is a compact set then every infinite subset of K has a limit point
in K. The converse of this result is also true, not only in R (Theorem 2.4.3), but
also in the more general setting of metric spaces. A proof of this important result is
outlined in the Miscellaneous Exercises.

Miscellaneous Exercises
The first two exercises involve the geometric and euclidean metric structure of R^n.
For n ≥ 2, R^n = {(x1, ..., xn) : xi ∈ R, i = 1, ..., n}. For p = (p1, ..., pn), q =
(q1, ..., qn) in R^n and c ∈ R, define

p + q = (p1 + q1, ..., pn + qn), and

cp = (cp1, ..., cpn).

Also, let 0 = (0, ..., 0). For p, q ∈ R^n, the inner product of p and q, denoted
⟨p, q⟩, is defined as
⟨p, q⟩ = p1 q1 + · · · + pn qn.

1. Prove each of the following: for p, q, r ∈ R^n,
a. ⟨p, p⟩ ≥ 0 with equality if and only if p = 0.
b. ⟨p, q⟩ = ⟨q, p⟩.
c. ⟨ap + bq, r⟩ = a⟨p, r⟩ + b⟨q, r⟩ for all a, b ∈ R.
d. \( |\langle p, q\rangle| \le \sqrt{\langle p, p\rangle}\,\sqrt{\langle q, q\rangle} \).
This last inequality is usually called the Cauchy-Schwarz inequality.
As a hint on how to prove (d), for λ ∈ R, expand ⟨p − λq, p − λq⟩ and
then choose λ appropriately. Note that by (a), ⟨p − λq, p − λq⟩ ≥ 0 for
all λ ∈ R.
2. For p = (p1, ..., pn) ∈ R^n, set
\[ \|p\|_2 = \sqrt{\langle p, p\rangle} = \sqrt{p_1^2 + \cdots + p_n^2}. \]
The quantity ‖p‖₂ is called the norm or the euclidean length of the vector
p.
a. Use the result of 1(d) to prove that ‖p + q‖₂ ≤ ‖p‖₂ + ‖q‖₂ for all
p, q ∈ R^n.
b. Using the result of (a), prove that d2(p, q) = ‖p − q‖₂ is a metric on
R^n.
3. If E is an uncountable subset of R, prove that some point of E is a limit
point of E.
The following exercise is designed to prove the converse of Theorem 2.3.7;
namely, if K is a subset of a metric space (X, d) having the property that
every infinite subset of K has a limit point in K, then K is compact.

4. Let K be a subset of a metric space (X, d) that has the property that
every infinite subset of K has a limit point in K.
a. Prove that there exists a countable subset D of K which is dense
in K. (Hint: Fix n ∈ N. Let p1 ∈ K be arbitrary. Choose p2 ∈ K, if
possible, such that d(p1, p2) ≥ 1/n. Suppose p1, ..., pj have been chosen.
Choose pj+1, if possible, such that d(pi, pj+1) ≥ 1/n for all i = 1, ..., j. Use
the assumption about K to prove that this process must terminate after
a finite number of steps. Let Pn denote this finite collection of points,
and let \( D = \bigcup_{n \in \mathbb{N}} P_n \). Prove that D is countable and dense in K.)
b. Let D be as in (a), and let U be an open subset of X such that U ∩ K ≠
∅. Prove that there exists p ∈ D and n ∈ N such that N1/n(p) ⊂ U.
c. Using the result of (b), prove that for every open cover U of K, there
exists a finite or countable collection {Un}n ⊂ U such that \( K \subset \bigcup_n U_n \).
d. Prove that every countable open cover of K has a finite subcover.
(Hint: If \(\{U_n\}_{n=1}^{\infty}\) is a countable open cover of K, for each n ∈ N let
\( W_n = \bigcup_{j=1}^{n} U_j \). Prove that K ⊂ Wn for some n ∈ N. Assume that the
result is false, and obtain an infinite subset of K with no limit point in
K, which is a contradiction.)

Supplemental Reading

Asic, M. D. and Adamovic, D. D., “Limit points of sequences in metric spaces,” Amer. Math. Monthly 77 (1970) 613–616.
Corazza, P., “Introduction to metric preserving functions,” Amer. Math. Monthly 106 (1999) 309–323.
Dubeau, F., “Cauchy-Bunyakowski-Schwarz inequality revisited,” Amer. Math. Monthly 99 (1990) 419–421.
Espelie, M. S. and Joseph, J. E., “Compact subsets of the Sorgenfrey line,” Math. Mag. 49 (1976) 250–251.
Fleron, Julian F., “A note on the history of the Cantor set and Cantor function,” Math. Mag. 67 (1994) 136–140.
Geissinger, L., “Pythagoras and the Cauchy Schwarz inequality,” Amer. Math. Monthly 83 (1976) 40–41.
Kaplansky, I., Set Theory and Metric Spaces, Chelsea Publ. Co., New York, 1977.
Labarre, Jr., A. E., “Structure theorem for open sets of real numbers,” Amer. Math. Monthly 72 (1965) 1114.
Nathanson, M. B., “Round metric spaces,” Amer. Math. Monthly 82 (1975) 738–741.
3
Sequences of Real Numbers

Now that we have covered the basic topological concepts required for the study
of analysis, we begin with limits of sequences. This topic will be our first serious
introduction to the limit process. The notion of convergence of a sequence
dates back to the early nineteenth century and the work of Bolzano (1817)
and Cauchy (1821). Some of the concepts and results included in this chapter
have undoubtedly been encountered previously in the study of calculus. Our
presentation however will be considerably more rigorous—emphasizing proofs
rather than computations.
Although our primary emphasis will be on sequences of real numbers,
these are not the only sequences which are typically encountered. It is not
at all unusual to talk about sequences of functions, sequences of vectors, etc.
For this reason we will begin our study of sequences in the general setting of
metric spaces. Most of the examples however will come from the real numbers.
A good understanding of sequences in R will prove helpful in providing insight
into properties of sequences in more general settings.
We begin the chapter by introducing the notion of convergence of a se-
quence in a metric space, and then by proving the standard limit theorems
for sequences of real numbers normally encountered in calculus. In Section
3.3 we will use the least upper bound property of R to prove that every
bounded monotone sequence of real numbers converges in R. The study of
subsequences and sub-sequential limits will be the topic of Section 3.4. In this
section, we also prove the well known result of Bolzano and Weierstrass that
every bounded sequence of real numbers has a convergent subsequence. This
result will then be used to provide a short proof of the fact that every Cauchy
sequence of real numbers converges. Although the study of series of real num-
bers is the main topic of Chapter 7, some knowledge of series will be required
in the construction of certain examples in Chapters 4 and 6. For this reason
we include a brief introduction to series as the last section of this chapter.

3.1 Convergent Sequences


We begin our study of sequences by first considering sequences in arbitrary
metric spaces. Throughout this section we let (X, d) be a metric space. When
X = R, unless otherwise specified d will denote the usual euclidean metric
on R. Recall that by a sequence in X we mean a function f : N → X. For
each n ∈ N, pn = f(n) is called the nth term of the sequence f, and for
convenience, the sequence f is denoted by \(\{p_n\}_{n=1}^{\infty}\), or simply {pn}.

DEFINITION 3.1.1 A sequence \(\{p_n\}_{n=1}^{\infty}\) in X is said to converge if there
exists a point p ∈ X such that for every ε > 0, there exists a positive integer
n_o = n_o(ε) such that pn ∈ Nε(p) for all n ≥ n_o. If this is the case, we say
that {pn} converges to p, or that p is the limit of the sequence {pn}, and
we write
\[ \lim_{n\to\infty} p_n = p \quad\text{or}\quad p_n \to p. \]
If {pn} does not converge, then {pn} is said to diverge.

In the definition, the statement pn ∈ Nε(p) for all n ≥ n_o is equivalent to

d(pn, p) < ε for all n ≥ n_o.

As a general rule, the integer n_o will depend on the given ε. This will be
illustrated in the following examples.

EXAMPLES 3.1.2 (a) For our first example we show that the sequence
\(\{1/n\}_{n=1}^{\infty}\) converges to 0 in R. The proof of this is the remark following
Theorem 1.5.1; namely, given ε > 0, there exists a positive integer n_o such that
n_o ε > 1. Thus for all n ≥ n_o,
\[ \left|\frac{1}{n} - 0\right| = \frac{1}{n} < \epsilon. \]
Therefore \(\lim_{n\to\infty} \frac{1}{n} = 0\). In this example, the integer n_o must be chosen so that
n_o > 1/ε.
(b) If p ∈ R, the sequence {pn} defined by pn = p for all n ∈ N is called the
constant sequence p. Since |pn − p| = 0 for all n ∈ N, we have \(\lim_{n\to\infty} p_n = p\).
(c) Consider the sequence \(\left\{\frac{2n+1}{3n+2}\right\}_{n=1}^{\infty}\). We will show that
\[ \lim_{n\to\infty} \frac{2n+1}{3n+2} = \frac{2}{3}. \]
Since
\[ \left|\frac{2n+1}{3n+2} - \frac{2}{3}\right| = \frac{1}{3(3n+2)} < \frac{1}{9n}, \]
given ε > 0, choose n_o ∈ N such that n_o > 1/(9ε). Then for all n ≥ n_o,
\[ \left|\frac{2n+1}{3n+2} - \frac{2}{3}\right| < \epsilon. \]
Thus the given sequence converges to 2/3.


(d) The sequence \(\{1 - (-1)^n\}_{n=1}^{\infty}\) diverges in R. To prove this, we first
note that for this sequence, |pn − pn+1| = 2 for all n. Suppose pn → p for some
p ∈ R. Let 0 < ε < 1. Then by the definition of convergence, there exists an
integer n_o such that |pn − p| < ε for all n ≥ n_o. But if n ≥ n_o, then

2 = |pn − pn+1| ≤ |pn − p| + |p − pn+1| < 2ε < 2,

which is a contradiction.
(e) Consider the sequence
\[ \mathbf{p}_n = \left(1 - \frac{1}{n},\ \frac{2n+1}{3n+2}\right) \]
in R². In Exercise 10 you will be asked to prove that a sequence pn = (pn, qn)
converges to p = (p, q) if and only if pn → p and qn → q. Thus the above
sequence converges to (1, 2/3).
(f) Let X denote the set of bounded functions on [0, 1] with metric d
given by d(f, g) = sup{|f(x) − g(x)| : x ∈ [0, 1]}. Consider the sequence {fn},
where for each n ∈ N, fn is the function given by fn(x) = x^n/n. If f = 0
denotes the zero function on [0, 1], i.e., f(x) = 0 for all x ∈ [0, 1], then for
each n ∈ N,
\[ d(f_n, f) = d(f_n, 0) = \sup\left\{\frac{x^n}{n} : x \in [0, 1]\right\} = \frac{1}{n}. \]
As a consequence the sequence {fn} converges to the zero function in X. □
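The dependence of n_o on ε in Example (c) can be made concrete. The sketch below is illustrative and not part of the text; the helper names `n_o` and `term` are chosen here. It picks the smallest integer exceeding 1/(9ε) and spot-checks the inequality on a finite range of indices, which is of course a check and not a proof.

```python
from fractions import Fraction
import math

def n_o(eps):
    """Smallest positive integer strictly greater than 1/(9*eps)."""
    return math.floor(1 / (9 * Fraction(eps))) + 1

def term(n):
    """The nth term (2n + 1)/(3n + 2), as an exact rational."""
    return Fraction(2 * n + 1, 3 * n + 2)

eps = Fraction(1, 1000)
start = n_o(eps)
# Spot check |p_n - 2/3| < eps for a finite range of indices n >= n_o.
assert all(abs(term(n) - Fraction(2, 3)) < eps for n in range(start, start + 10_000))
print(start)  # 112
```

A smaller ε forces a larger n_o, which is the point of writing n_o = n_o(ε) in Definition 3.1.1.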

DEFINITION 3.1.3 A sequence {pn } in X is said to be bounded if there


exists p ∈ X and a positive constant M such that d(p, pn ) ≤ M for all n ∈ N.

Remark. A sequence {pn } in R is bounded if there exists a positive constant


C such that |pn | ≤ C for all n ∈ N. To see this, suppose {pn } is bounded
according to Definition 3.1.3. Then there exists p ∈ R such that |pn − p| ≤ M
for all n. As a consequence

|pn | ≤ |pn − p| + |p| ≤ (M + |p|),

which proves the result. Thus with respect to the usual metric a sequence
{pn } is bounded if and only if the set {pn : n = 1, 2, . . . } is a bounded subset
of R. This however is not always the case for other metrics (see Example
3.1.5(c)).

THEOREM 3.1.4 Let (X, d) be a metric space.


(a) If a sequence {pn } in X converges, then its limit is unique.
(b) Every convergent sequence in X is bounded.


(c) If E ⊂ X and p is a limit point of E, then there exists a sequence {pn}
in E with pn ≠ p for all n such that
\[ \lim_{n\to\infty} p_n = p. \]

Proof. (a) Suppose the sequence {pn} converges to two distinct points p, q ∈
X. Let ε = (1/3)d(p, q). Since pn → p, there exists an integer n1 such that
d(pn, p) < ε for all n ≥ n1. Also, since pn → q, there exists an integer n2 such
that d(pn, q) < ε for all n ≥ n2. Thus if n ≥ max{n1, n2}, by the triangle
inequality
\[ d(p, q) \le d(p_n, p) + d(p_n, q) < 2\epsilon = \tfrac{2}{3}\, d(p, q), \]
which is a contradiction.
(b) Let {pn} be a convergent sequence in X that converges to p ∈ X. Take
ε = 1. For this ε, there exists an integer n_o such that d(pn, p) < 1 for all
n > n_o. Let
\[ M = \max\{d(p, p_1), \ldots, d(p, p_{n_o}), 1\}. \]
Then d(p, pn) ≤ M for all n ∈ N. Therefore the sequence is bounded.
(c) We construct the sequence {pn} in E as follows: Since p is a limit point
of E, for each positive integer n, by the definition of limit point, there exists
pn ∈ E with pn ≠ p, such that
\[ d(p_n, p) < \frac{1}{n}. \]
This sequence clearly satisfies pn → p. □

EXAMPLES 3.1.5 (a) According to the previous theorem, every convergent
sequence is bounded. The converse however is false. The sequence
\(\{1 - (-1)^n\}_{n=1}^{\infty}\) is bounded, but by Example 3.1.2(d), the sequence does not
converge. The sequence is bounded since |pn| = |1 − (−1)^n| ≤ 2 for all n ∈ N.
(b) The sequence {n(−1)^n} is not bounded in R, and thus cannot converge.

(c) Let X = R with the metric
\[ \rho(x, y) = \frac{|x - y|}{1 + |x - y|}. \]
With this metric every sequence {pn} in R satisfies ρ(pn, p) < 1 for any p ∈ R.
Thus every sequence {pn} in (R, ρ) is bounded according to Definition 3.1.3.
Obviously however there exist sequences in R for which the range is not a
bounded subset of R. It is interesting to note that a sequence {pn} converges
to p in the usual metric if and only if pn → p in the metric ρ (Exercise 11).
(d) In this example, we illustrate part (c) of the previous theorem. Since
√2 is a limit point of Q, the previous theorem guarantees the existence of
a sequence {rn} of rational numbers such that rn → √2. Note however that
this sequence need not be unique. If rn → √2, then the same is true for the
sequence {rn + 1/n}. □
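Example 3.1.5(c) can be illustrated numerically. The following is a minimal sketch, not part of the text, with `rho` as an assumed name for the metric ρ. It shows that the sequence pn = n, which is unbounded in the usual metric, never reaches distance 1 from 0 in the metric ρ.

```python
def rho(x, y):
    """The bounded metric rho(x, y) = |x - y| / (1 + |x - y|) on R."""
    d = abs(x - y)
    return d / (1 + d)

# The sequence p_n = n is unbounded in the usual metric, yet rho(p_n, 0) < 1.
values = [rho(n, 0) for n in range(1, 10_001)]
assert all(v < 1 for v in values)
print(max(values))
```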

Exercises 3.1
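1. For each of the following sequences, prove, using an ε, n_o argument, that
the sequence converges to the given limit p; that is, given ε > 0 determine
n_o such that |pn − p| < ε for all n ≥ n_o.
*a. \(\left\{\frac{3n+5}{2n+7}\right\}\), p = 3/2.   b. \(\left\{\frac{2n+5}{6n-3}\right\}\), p = 1/3.
*c. \(\left\{\frac{n^2+1}{2n^2}\right\}\), p = 1/2.   d. \(\left\{1 - \frac{(-1)^n}{n}\right\}\), p = 1.
e. \(\left\{\frac{(-1)^n n}{n^2+1}\right\}\), p = 0.   *f. \(\left\{\sqrt{n+1} - \sqrt{n}\right\}\), p = 0.
g. \(\left\{n\left(\sqrt{1 + \frac{1}{n}} - 1\right)\right\}\), p = 1/2.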
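2. Show that each of the following sequences diverges in R.
*a. \(\{n(1 + (-1)^n)\}\).   b. \(\left\{(-1)^n + \frac{1}{n}\right\}\).   *c. \(\left\{\sin\frac{n\pi}{2}\right\}\).
d. \(\left\{n \sin\frac{n\pi}{2}\right\}\).   e. \(\left\{\frac{(-1)^n n}{n+1}\right\}\).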
3. If b > 0, prove that \(\lim_{n\to\infty} \frac{1}{1 + nb} = 0\).

4. Prove each of the following.
*a. If b > 1, prove that \(\lim_{n\to\infty} \frac{1}{b^n} = 0\).
b. If 0 ≤ b < 1, prove that \(\lim_{n\to\infty} b^n = 0\).

5. *Let {an} be a sequence in R with \(\lim_{n\to\infty} a_n = a\). Prove that \(\lim_{n\to\infty} a_n^2 = a^2\).

6. *If an ≥ 0 for all n and \(\lim_{n\to\infty} a_n = a\), prove that \(\lim_{n\to\infty} \sqrt{a_n} = \sqrt{a}\).

7. Prove that if {an } converges to a, then {|an |} converges to |a|. Is the


converse true?
8. *Let {an } be a sequence in R with lim an = a. If a > 0, prove that
n→∞
there exists no ∈ N such that an > 0 for all n ≥ no .
9. Let {an } be a sequence in R satisfying |an − an+1 | ≥ c for some c > 0
and all n ∈ N. Prove that the sequence {an } diverges.
10. Consider R2 with the metric d2 as defined in Example 2.1.7(d). Suppose
pn = (an , bn ) and p = (a, b). Prove that
lim pn = p if and only if lim an = a and lim bn = b.
n→∞ n→∞ n→∞

11. Consider R with the usual metric, and also with the metric
\[ \rho(x, y) = \frac{|x - y|}{1 + |x - y|}. \]
Prove that a sequence {pn} in R converges to p ∈ R in the usual metric
if and only if pn → p in the metric ρ.
12. Let X be the set of bounded functions on [0, 1] with metric d as defined in
Example 3.1.2(f). Prove that each of the following sequences of functions
converges to the indicated functions f.
a. \(\left\{x + \frac{x}{n} \sin nx\right\}_{n=1}^{\infty}\), f(x) = x.   b. \(\left\{\frac{x}{b^n}\right\}_{n=1}^{\infty}\), (b > 1), f(x) = 0.
c. \(\left\{\frac{x}{n + x}\right\}_{n=1}^{\infty}\), f(x) = 0.
13. Let (X, d) be as in the previous exercise. Does the sequence {x^n} converge
in (X, d)?

3.2 Sequences of Real Numbers


In this section, we will emphasize some of the important properties of se-
quences of real numbers, and also investigate the limits of several basic se-
quences that are frequently encountered in the study of analysis. Our first
result involves algebraic operations on convergent sequences.

THEOREM 3.2.1 If {an} and {bn} are convergent sequences of real numbers with
\[ \lim_{n\to\infty} a_n = a \quad\text{and}\quad \lim_{n\to\infty} b_n = b, \]
then
(a) \(\lim_{n\to\infty} (a_n + b_n) = a + b\), and
(b) \(\lim_{n\to\infty} a_n b_n = a\,b\).
(c) Furthermore, if a ≠ 0, and an ≠ 0 for all n, then \(\lim_{n\to\infty} \frac{b_n}{a_n} = \frac{b}{a}\).
Proof. The proof of (a) is left to the exercises (Exercise 1). To prove (b), we
add and subtract the term an b to obtain

|an bn − ab| = |(an bn − an b) + (an b − ab)| ≤ |an| |bn − b| + |b| |an − a|.

Since {an} converges, by Theorem 3.1.4(b), {an} is bounded. Thus there exists
a constant M > 0 such that |an| ≤ M for all n. Therefore

|an bn − ab| ≤ M |bn − b| + |b| |an − a|.

Let ε > 0 be given. Since an → a, there exists a positive integer n1 such that
\[ |a_n - a| < \frac{\epsilon}{2(|b| + 1)} \]
for all n ≥ n1. Also, since bn → b, there exists a positive integer n2 such that
\[ |b_n - b| < \frac{\epsilon}{2M} \]
for all n ≥ n2. Thus if n ≥ max{n1, n2},
\[ |a_n b_n - ab| < M\left(\frac{\epsilon}{2M}\right) + |b|\left(\frac{\epsilon}{2(|b| + 1)}\right) < \epsilon. \]
Therefore \(\lim_{n\to\infty} a_n b_n = ab\).
To prove (c) it suffices to show that \(\lim_{n\to\infty} 1/a_n = 1/a\). The result (c) then
follows from (b). Since a ≠ 0 and an → a, there exists a positive integer n_o
such that
\[ |a_n - a| < \tfrac{1}{2}|a| \]
for all n ≥ n_o. Also, since
\[ |a| \le |a - a_n| + |a_n| < \tfrac{1}{2}|a| + |a_n| \]
for n ≥ n_o, we have
\[ |a_n| \ge \tfrac{1}{2}|a| \]
for all n ≥ n_o. Therefore, for all n ≥ n_o,
\[ \left|\frac{1}{a_n} - \frac{1}{a}\right| = \frac{|a - a_n|}{|a_n||a|} \le \frac{2}{|a|^2}\,|a_n - a|. \]
Let ε > 0 be given. Since an → a, we can choose an integer n1 ≥ n_o so that
\[ |a_n - a| < \frac{|a|^2}{2}\,\epsilon \]
for all n ≥ n1. Therefore
\[ \left|\frac{1}{a_n} - \frac{1}{a}\right| < \epsilon \]
for all n ≥ n1, and as a consequence
\[ \lim_{n\to\infty} \frac{1}{a_n} = \frac{1}{a}. \quad □ \]
COROLLARY 3.2.2 If {an} is a convergent sequence of real numbers with
\(\lim_{n\to\infty} a_n = a\), then for any c ∈ R,
(a) \(\lim_{n\to\infty} (a_n + c) = a + c\), and
(b) \(\lim_{n\to\infty} c\,a_n = c\,a\).

Proof. If we define the sequence {cn} by cn = c for all n ∈ N, then the
conclusions follow by (a) and (b) of the previous theorem. □

THEOREM 3.2.3 Let {an} and {bn} be sequences of real numbers. If {bn}
is bounded and \(\lim_{n\to\infty} a_n = 0\), then
\[ \lim_{n\to\infty} a_n b_n = 0. \]

Proof. Exercise 3. □
Remark. Since the sequence {bn} may not converge, Theorem 3.2.1(b) does
not apply. The fact that the sequence {bn} is bounded is crucial. For example,
consider the sequences {1/n} and {3n}.

THEOREM 3.2.4 Suppose {an}, {bn}, and {cn} are sequences of real numbers
for which there exists n_o ∈ N such that
an ≤ bn ≤ cn for all n ∈ N, n ≥ n_o,
and that \(\lim_{n\to\infty} a_n = \lim_{n\to\infty} c_n = L\). Then the sequence {bn} converges and
\[ \lim_{n\to\infty} b_n = L. \]

Proof. Exercise 4. □
The above result, commonly called the squeeze theorem, is very useful
in applications. Quite often to show that a given sequence {an} in R converges
to a, we will first prove that
|an − a| ≤ M bn
for some positive constant M and a nonnegative sequence {bn} with \(\lim_{n\to\infty} b_n = 0\).
Since the above inequality is equivalent to
−M bn ≤ an − a ≤ M bn,
by Theorem 3.2.4 the sequence {an − a} converges to 0, or equivalently
\(\lim_{n\to\infty} a_n = a\).

Some Special Sequences


We next consider some special sequences of real numbers that occur frequently
in the study of analysis. For the proof of Theorem 3.2.6 we require the following
result.

THEOREM 3.2.5 (Binomial Theorem) For a ∈ R, n ∈ N,
\[ (1 + a)^n = \sum_{k=0}^{n} \binom{n}{k} a^k = \binom{n}{0} + \binom{n}{1} a + \binom{n}{2} a^2 + \cdots + \binom{n}{n} a^n, \]
where
\[ \binom{n}{k} = \frac{n!}{k!\,(n - k)!} \]
is the binomial coefficient.

In the above, for n ∈ N, n! (read n factorial) is defined by

n! = n · (n − 1) · · · 2 · 1,

with the usual convention that 0! = 1. Since the binomial theorem is ancillary
to our main topic of discussion, we leave the proof, using mathematical in-
duction, to the exercises (Exercise 13). An alternate proof using Taylor series
will be provided in Section 8.7.
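Before attempting the inductive proof, the identity can be checked for particular values. The sketch below is illustrative only; the function name `binomial_expansion` is introduced here. It assumes Python 3.8 or later for `math.comb` and compares both sides of the binomial theorem exactly for a rational value of a.

```python
from fractions import Fraction
from math import comb

def binomial_expansion(a, n):
    """Right-hand side of the binomial theorem: sum of C(n, k) * a^k for k = 0..n."""
    return sum(comb(n, k) * a ** k for k in range(n + 1))

a = Fraction(-3, 7)
for n in range(1, 12):
    # Exact rational arithmetic, so equality is tested without rounding error.
    assert (1 + a) ** n == binomial_expansion(a, n)
```

Agreement for finitely many values is not a proof; it only guards against misreading the formula.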

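THEOREM 3.2.6
(a) If p > 0, then \(\lim_{n\to\infty} \frac{1}{n^p} = 0\).
(b) If p > 0, then \(\lim_{n\to\infty} \sqrt[n]{p} = 1\).
(c) \(\lim_{n\to\infty} \sqrt[n]{n} = 1\).
(d) If p > 1 and α is real, then \(\lim_{n\to\infty} \frac{n^{\alpha}}{p^n} = 0\).
(e) If |p| < 1, then \(\lim_{n\to\infty} p^n = 0\).
(f) For all p ∈ R, \(\lim_{n\to\infty} \frac{p^n}{n!} = 0\).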

Proof. The proofs of (a) and (b) are left to the exercises (Exercise 5). The
proof of (a) is straightforward and the proof of (b) (for p > 1) is similar to
the proof of (c). For the proof of (c), let \(x_n = \sqrt[n]{n} - 1\). Since xn is positive,
by the binomial theorem
\[ n = (1 + x_n)^n \ge \binom{n}{2} x_n^2 = \frac{n(n-1)}{2}\, x_n^2 \]
for all n ≥ 2. Therefore, \(x_n^2 \le 2/(n - 1)\) for all n ≥ 2, and as a consequence
\[ 0 \le x_n \le \sqrt{\frac{2}{n-1}}. \]
Thus by (a) and Theorem 3.2.4, \(\lim_{n\to\infty} x_n = 0\), from which the result follows.
(d) Let k be a positive integer so that k > α. Since p > 1, write p = (1 + q),
with q > 0. By the binomial theorem, for n > 2k,
\[ p^n = (1 + q)^n > \binom{n}{k} q^k = \frac{n(n-1)\cdots(n-k+1)}{k!}\, q^k. \]
Since k < ½n, n − k + 1 > ½n + 1 > ½n. Therefore,
\[ \frac{n(n-1)\cdots(n-k+1)}{k!} > \frac{n^k}{2^k\, k!}, \]
and as a consequence,
\[ 0 \le \frac{n^{\alpha}}{p^n} \le \left(\frac{2^k\, k!}{q^k}\right) \frac{1}{n^{k-\alpha}}. \]
The result now follows by part (a) and Theorem 3.2.4.
(e) Write p as p = ±1/q, where q > 1. Then
\[ |p^n| = |p|^n = \frac{1}{q^n}, \]
which by part (d) (with α = 0) converges to 0 as n → ∞.
(f) Fix k ∈ N such that k > |p|. For n > k,
\[ \left|\frac{p^n}{n!}\right| = \frac{|p|^n}{n!} < \frac{k^{(k-1)}}{(k-1)!} \left(\frac{|p|}{k}\right)^n. \]
Since |p|/k < 1, the result follows by (e). □
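The bound obtained in the proof of (c) and the decay asserted in (d) can be observed numerically. The following floating-point sketch is illustrative, not part of the text; the names `root_gap` and `ratio` are chosen here, and the values α = 3 and p = 1.5 are arbitrary sample choices.

```python
import math

def root_gap(n):
    """x_n = n^(1/n) - 1, the quantity estimated in the proof of part (c)."""
    return n ** (1.0 / n) - 1

def ratio(n, alpha, p):
    """n^alpha / p^n, the quantity in part (d)."""
    return n ** alpha / p ** n

# Check 0 <= x_n <= sqrt(2/(n-1)) for a range of n >= 2 (floating point).
for n in range(2, 5001):
    assert 0 <= root_gap(n) <= math.sqrt(2 / (n - 1))

# With alpha = 3 and p = 1.5 the ratio becomes small as n grows.
for n in (10, 50, 100, 200):
    print(n, ratio(n, 3, 1.5))
```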

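EXAMPLES 3.2.7 We now provide several examples to illustrate the previous
theorems.
(a) As in Example 3.1.2(c), consider the sequence \(\left\{\frac{2n+1}{3n+2}\right\}\). We write
\[ \frac{2n+1}{3n+2} = \frac{n\left(2 + \frac{1}{n}\right)}{n\left(3 + \frac{2}{n}\right)} = \frac{2 + \frac{1}{n}}{3 + \frac{2}{n}}. \]
Since \(\lim_{n\to\infty} \frac{1}{n} = \lim_{n\to\infty} \frac{2}{n} = 0\), by Corollary 3.2.2(a),
\[ \lim_{n\to\infty} \left(2 + \frac{1}{n}\right) = 2 \quad\text{and}\quad \lim_{n\to\infty} \left(3 + \frac{2}{n}\right) = 3. \]
Therefore by Theorem 3.2.1(b) and (c),
\[ \lim_{n\to\infty} \frac{2 + \frac{1}{n}}{3 + \frac{2}{n}} = \frac{\lim_{n\to\infty}\left(2 + \frac{1}{n}\right)}{\lim_{n\to\infty}\left(3 + \frac{2}{n}\right)} = 2 \cdot \frac{1}{3} = \frac{2}{3}. \]
(b) Consider the sequence \(\left\{\frac{(-1)^n}{2\sqrt{n} + 7}\right\}\). We first note that
\[ 0 \le \left|\frac{(-1)^n}{2\sqrt{n} + 7}\right| \le \frac{1}{2}\,\frac{1}{\sqrt{n}}. \]
Thus by Theorems 3.2.4 and 3.2.6(a) with p = 1/2,
\[ \lim_{n\to\infty} \frac{(-1)^n}{2\sqrt{n} + 7} = 0. \]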
(c) For our next example we consider the sequence \(\left\{\frac{2^n + n^3}{3^n + n^2}\right\}\). As in (a),
we first factor out the dominant power in both the numerator and denominator.
By Theorem 3.2.6(d), \(\lim_{n\to\infty} n^{\alpha}/p^n = 0\) for any α ∈ R and p > 1. This
simply states that p^n (p > 1) grows faster than any power of n. Therefore
the dominant terms in both the numerator and denominator are 2^n and 3^n,
respectively. Thus
\[ \frac{2^n + n^3}{3^n + n^2} = \frac{2^n\left(1 + \frac{n^3}{2^n}\right)}{3^n\left(1 + \frac{n^2}{3^n}\right)} = \left(\frac{2}{3}\right)^n \frac{1 + \frac{n^3}{2^n}}{1 + \frac{n^2}{3^n}}. \]
By Theorems 3.2.1 and 3.2.6(d),
\[ \lim_{n\to\infty} \frac{1 + \frac{n^3}{2^n}}{1 + \frac{n^2}{3^n}} = 1. \]
Finally, since \(\lim_{n\to\infty} \left(\frac{2}{3}\right)^n = 0\) (Theorem 3.2.6(e)), we have
\[ \lim_{n\to\infty} \frac{2^n + n^3}{3^n + n^2} = 0. \]

(d) As our final example we consider the sequence \(\{n((1 + \frac{1}{n})^{-2} - 1)\}\).
Before we can evaluate the limit of this sequence we must first simplify the
nth term of the sequence. This is accomplished as follows:
\[ x_n = n\left(\left(1 + \tfrac{1}{n}\right)^{-2} - 1\right) = n\left(\frac{1}{\left(1 + \frac{1}{n}\right)^2} - 1\right) = n\left(\frac{n^2}{(n+1)^2} - 1\right) \]
\[ = n\left(\frac{-2n - 1}{(n+1)^2}\right) = \frac{-2n^2 - n}{(n+1)^2}. \]
Now we can factor out an n² from both the numerator and the denominator.
This gives
\[ x_n = \frac{-2 - \frac{1}{n}}{\left(1 + \frac{1}{n}\right)^2}. \]
Using the limit theorems we now conclude that \(\lim_{n\to\infty} x_n = -2\). □
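The algebraic simplification in the final example above is easy to get wrong by a sign, so an exact check is worthwhile. The sketch below is an illustration only, with function names `x` and `simplified` introduced here. It compares the original and simplified forms of the nth term using rational arithmetic and prints a few values approaching −2.

```python
from fractions import Fraction

def x(n):
    """x_n = n((1 + 1/n)^(-2) - 1), computed exactly."""
    return n * ((1 + Fraction(1, n)) ** -2 - 1)

def simplified(n):
    """The simplified form (-2 - 1/n) / (1 + 1/n)^2."""
    return (-2 - Fraction(1, n)) / (1 + Fraction(1, n)) ** 2

for n in (1, 10, 1000, 10 ** 6):
    assert x(n) == simplified(n)
    print(n, float(x(n)))
```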

Exercises 3.2
1. Prove Theorem 3.2.1(a).
2. Let {an } and {bn } be sequences of real numbers.
a. If {an } and {an + bn } both converge, prove that the sequence {bn }
converges.
b. Suppose bn ≠ 0 for all n ∈ N. If {bn} and {an/bn} both converge,
prove that the sequence {an } also converges.
3. Prove Theorem 3.2.3.


4. Prove Theorem 3.2.4.
5. Prove each of the following.
a. If p > 0, prove that \(\lim_{n\to\infty} \frac{1}{n^p} = 0\).
*b. If p > 0, prove that \(\lim_{n\to\infty} \sqrt[n]{p} = 1\).

6. Find the limit of each of the following sequences.
3n2 + 2n + 1 n3

*a. 2
. b. 1 + n
.
 5n − 2n + 3∞ n=1  √ 3 ∞n=1
n 2 n
*c. −n . d. .
1 + n1 n=1
n + 1 n=1
 ∞
1
  p
*e. q −1 . f. { n2 + n − n}∞ n=1 .
 1+ 1 
n n=1
√ √ √  ∞ n o∞
*g. n n + a − n n=1 , a > 0. h. (2n + 3n )1/n .
n=1
7. For each of the following sequences, determine whether the given se-
quence converges or diverges. If the sequence converges, find its limit; if
it diverges, explain why.
∞ ∞
1 + (−1)n
 
1 nπ
*a. 1 + b. sin
n n 2 n=1
(  2 2 )n=1
∞  n
∞
1 n +1 3
*c. d.
n2 2n + 3 2n + n2 n=1
√3
 ∞
n=1  ∞
8n3 + 5 n cos nπ
*e. √ f.
9n2 − 4 n=1 2n + 3 n=1
8. *Prove that \(\lim_{n\to\infty} \frac{1}{n} \cos\frac{n\pi}{2} = 0\).
9. Let {xn} be a sequence in R with xn → 0, and xn ≠ 0 for all n. Prove
that \(\lim_{n\to\infty} x_n \sin\frac{1}{x_n} = 0\).
10. Let {an} be a sequence of positive real numbers such that
\[ \lim_{n\to\infty} \frac{a_{n+1}}{a_n} = L. \]

*a. If L < 1, prove that the sequence {an } converges to zero.


b. If L > 1, prove that the sequence {an } is unbounded.
c. Give an example of a convergent sequence {an } of positive real num-
bers for which L = 1.
d. Give an example of a divergent sequence {an } of positive real num-
bers for which L = 1.
11. Use the previous exercise to determine convergence or divergence of each
of the following sequences.
*a. \(\{n^2 a^n\}\), 0 < a < 1.   b. \(\left\{\frac{n^2}{a^n}\right\}\), 0 < a < 1.
c. \(\left\{\frac{a^n}{n!}\right\}\), 0 < a < 1.   d. \(\left\{\frac{n!}{n^n}\right\}\).
12. *Suppose \(\lim_{n\to\infty} (a_n - 1)/(a_n + 1) = 0\). Prove that \(\lim_{n\to\infty} a_n = 1\).

13. a. For n ∈ N, 1 ≤ k ≤ n, prove that
\[ \binom{n}{k} + \binom{n}{k-1} = \binom{n+1}{k}. \]
*b. Use mathematical induction to prove the binomial theorem (Theorem
3.2.5).
14. Let \(\{a_k\}_{k=1}^{\infty}\) be a sequence in R. For each n ∈ N, define
\[ s_n = \frac{a_1 + \cdots + a_n}{n}. \]
*a. Prove that if \(\lim_{k\to\infty} a_k = a\), then \(\lim_{n\to\infty} s_n = a\).
b. Give an example of a sequence {ak} which diverges, but for which
{sn} converges.

3.3 Monotone Sequences


In this section, we will briefly consider monotone sequences of real numbers.
As we will see, one of the advantages of such sequences is that they will either
converge in R, or diverge to +∞ or −∞.

DEFINITION 3.3.1 A sequence \(\{a_n\}_{n=1}^{\infty}\) of real numbers is said to be
(a) monotone increasing (or nondecreasing) if an ≤ an+1 for all
n ∈ N;
(b) monotone decreasing (or nonincreasing) if an ≥ an+1 for all
n ∈ N;
(c) monotone if it is either monotone increasing or monotone decreasing.

A sequence $\{a_n\}$ is strictly increasing if $a_n < a_{n+1}$ for all $n$. Strictly decreasing is defined similarly.
As a general rule, bounded sequences need not converge; e.g., $\{1 - (-1)^n\}$. For monotone sequences, however, we have the following convergence result.

THEOREM 3.3.2 If $\{a_n\}_{n=1}^{\infty}$ is monotone and bounded, then $\{a_n\}_{n=1}^{\infty}$ converges.

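Proof. Suppose $\{a_n\}$ is monotone increasing. Set

\[ E = \{ a_n : n = 1, 2, \dots \}. \]

Then $E \neq \emptyset$ and $E$ is bounded above. Let $a = \sup E$. We now show that

\[ \lim_{n\to\infty} a_n = a. \]

Let $\epsilon > 0$ be given. Since $a - \epsilon$ is not an upper bound of $E$, there exists a positive integer $n_o$ such that

\[ a - \epsilon < a_{n_o} \le a. \]

Since $\{a_n\}$ is monotone increasing,

\[ a - \epsilon < a_n \le a \quad \text{for all } n \ge n_o. \]

Thus $a_n \in N_\epsilon(a)$ for all $n \ge n_o$, and therefore $\lim_{n\to\infty} a_n = a$. The case of a monotone decreasing sequence is handled in the same way, with $a = \inf E$. □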

FIGURE 3.1
Nested intervals property

Nested Intervals Property


As an application of the previous theorem we prove the following result, usually referred to as the nested intervals property. The term nested comes from the fact that the sequence $\{I_n\}$ of intervals satisfies $I_n \supset I_{n+1}$ for all $n \in \mathbb{N}$ (see Figure 3.1).

COROLLARY 3.3.3 (Nested Intervals Property) If $\{I_n\}_{n=1}^{\infty}$ is a sequence of closed and bounded intervals with $I_n \supset I_{n+1}$ for all $n \in \mathbb{N}$, then

\[ \bigcap_{n=1}^{\infty} I_n \neq \emptyset. \]

Proof. Suppose $I_n = [a_n, b_n]$, $a_n, b_n \in \mathbb{R}$, $a_n \le b_n$. Since $I_n \supset I_{n+m}$ for all $m \ge 0$,
\[ a_n \le a_{n+m} \le b_{n+m} \le b_m \]
for all $n, m \in \mathbb{N}$. Thus the sequence $\{a_n\}$ is monotone increasing and bounded above by every $b_m$, $m \in \mathbb{N}$. Thus by the previous theorem, $a = \lim_{n\to\infty} a_n$ exists with $a \le b_m$ for all $m \in \mathbb{N}$. Since also $a_m \le a$ for all $m$, we have $a \in I_m$ for all $m \in \mathbb{N}$ and thus

\[ a \in \bigcap_{m=1}^{\infty} I_m, \]

which proves the result. □


Remark. Similarly we can show that if $b = \lim_{n\to\infty} b_n$, then $b \in I_n$ for all $n$, and thus

\[ [a, b] \subset \bigcap_{n=1}^{\infty} I_n. \]

In fact, one can show that equality holds. (See Exercise 3 of Section 2.4.)
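
The nested intervals property can be watched at work in a small computation. The sketch below is an illustration only, not part of the text: it assumes the intervals are produced by repeated bisection of $[1, 2]$, always keeping the half that contains $\sqrt{2}$. The function name `bisect_sqrt2` is hypothetical, and exact rational arithmetic is used so that the inclusions $I_n \supset I_{n+1}$ can be checked without rounding error.

```python
from fractions import Fraction

def bisect_sqrt2(steps):
    """Return nested closed intervals [a_n, b_n], each containing sqrt(2)."""
    a, b = Fraction(1), Fraction(2)      # I_1 = [1, 2], since 1^2 <= 2 <= 2^2
    intervals = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if mid * mid <= 2:
            a = mid                      # sqrt(2) lies in the right half
        else:
            b = mid                      # sqrt(2) lies in the left half
        intervals.append((a, b))
    return intervals

intervals = bisect_sqrt2(20)
for (a, b) in intervals[:4]:
    print(a, b)
```

The left endpoints form a monotone increasing sequence bounded above by every right endpoint, exactly as in the proof of Corollary 3.3.3, and the interval lengths are halved at each step.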

EXAMPLES 3.3.4 (a) Our first example shows that the conclusion of Corollary 3.3.3 is false if the intervals $I_n$ are not closed. As in Example 1.7.11(b), for each $n \in \mathbb{N}$ set $I_n = (0, \frac{1}{n})$. Then $I_n \supset I_{n+1}$ for all $n$, but

\[ \bigcap_{n=1}^{\infty} I_n = \emptyset. \]

The conclusion of Corollary 3.3.3 may also be false if the intervals In are
unbounded (Exercise 1).
(b) Consider the sequence $\{p^n\}$ with $0 < p < 1$. Even though Theorem 3.2.6(e) applies, we use the results of this section to prove that $\lim_{n\to\infty} p^n = 0$. For $n \in \mathbb{N}$ set $s_n = p^n$. Since $p > 0$, $s_n > 0$ for all $n \in \mathbb{N}$. Thus $\{s_n\}$ is bounded below. Also, since $0 < p < 1$,

\[ s_{n+1} = p^{n+1} = p\,s_n < s_n. \]

Thus the sequence $\{s_n\}$ is monotone decreasing, bounded below, and hence by Theorem 3.3.2 is convergent. Let $s = \lim_{n\to\infty} s_n$. Then

\[ s = \lim_{n\to\infty} s_{n+1} = p \lim_{n\to\infty} s_n = ps. \]

Therefore $s = ps$. Since $p \neq 1$, we must have $s = 0$.


(c) Let $a_1 = 1$ and for $n \ge 1$ set $a_{n+1} = \frac{1}{6}(2a_n + 5)$. The first three terms of the sequence $\{a_n\}$ are $a_1 = 1$, $a_2 = \frac{7}{6}$, and $a_3 = \frac{11}{9}$. Thus we suspect that the sequence $\{a_n\}$ is monotone increasing and bounded above by 2. Since $a_1 < 2$, if we assume that $a_n \le 2$, then
\[ a_{n+1} = \tfrac{1}{6}(2a_n + 5) \le \tfrac{1}{6}(4 + 5) = \tfrac{3}{2} < 2. \]
Thus by mathematical induction $a_n \le 2$ for all $n \in \mathbb{N}$. Likewise, since $a_1 < a_2$, the statement $a_n < a_{n+1}$ is true for $n = 1$, and if our induction hypothesis is $a_n < a_{n+1}$, then
\[ a_{n+1} = \tfrac{1}{6}(2a_n + 5) < \tfrac{1}{6}(2a_{n+1} + 5) = a_{n+2}. \]
Therefore the sequence $\{a_n\}$ is monotone increasing, bounded above, and thus converges. Let $a = \lim_{n\to\infty} a_n$. Then
\[ a = \lim_{n\to\infty} a_{n+1} = \lim_{n\to\infty} \tfrac{1}{6}(2a_n + 5) = \tfrac{1}{6}(2a + 5). \]
Solving the equation $a = \frac{1}{6}(2a + 5)$ for $a$ gives $a = \frac{5}{4}$.
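
As a quick numerical companion to part (c), the following sketch computes the first few terms of the recursion exactly. It is an illustration under the assumption that exact rational arithmetic is wanted; the function name `recursive_terms` is hypothetical. It shows the terms increasing, staying below 2, and approaching $\frac{5}{4}$.

```python
from fractions import Fraction

def recursive_terms(count):
    """First `count` terms of a_1 = 1, a_{n+1} = (2*a_n + 5)/6, as exact fractions."""
    a = Fraction(1)
    terms = [a]
    for _ in range(count - 1):
        a = (2 * a + 5) / 6
        terms.append(a)
    return terms

terms = recursive_terms(12)
for n, a in enumerate(terms, start=1):
    # distance to the limit 5/4 shrinks by a factor of 3 at each step
    print(n, a, Fraction(5, 4) - a)
```

The printed gaps illustrate why the convergence is fast: from the recursion, $\frac{5}{4} - a_{n+1} = \frac{1}{3}\left(\frac{5}{4} - a_n\right)$.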

(d) Let $a_1 = 1$, and for $n \ge 1$, set $a_{n+1} = \sqrt{2a_n}$. To investigate the convergence of the sequence $\{a_n\}$, we will establish by induction that

\[ 1 \le a_n < a_{n+1} < 2 \]

for all $n \in \mathbb{N}$. When $n = 1$, we have

\[ 1 = a_1 < \sqrt{2} = a_2 < 2. \]

Thus the statement is true for $n = 1$. Assume that it is true for $n = k$. Then

\[ 1 \le a_{k+1} = \sqrt{2a_k} < \sqrt{2a_{k+1}} = a_{k+2}, \]

and

\[ a_{k+2} = \sqrt{2a_{k+1}} < \sqrt{4} = 2. \]

Thus the sequence $\{a_n\}$ is monotone increasing, bounded above by 2, and hence by Theorem 3.3.2 is convergent. It is possible to prove directly that

\[ \sup\{a_n : n = 1, 2, \dots\} = 2. \]

However, if we let $a = \lim_{n\to\infty} a_n$, then

\[ a = \lim_{n\to\infty} a_{n+1} = \lim_{n\to\infty} \sqrt{2a_n} = \sqrt{2a}. \]

The last equality follows by Exercise 6 of Section 3.1. Therefore $a$ is a solution of $a^2 = 2a$, which, since $a \ge 1$, implies $a = 2$. □
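
The behaviour established in part (d) can also be observed numerically. The sketch below is only an illustration in floating-point arithmetic (so the printed values are approximations), and the function name `sqrt_iteration` is hypothetical.

```python
import math

def sqrt_iteration(count):
    """First `count` terms of a_1 = 1, a_{n+1} = sqrt(2 * a_n), in floating point."""
    a = 1.0
    terms = [a]
    for _ in range(count - 1):
        a = math.sqrt(2 * a)
        terms.append(a)
    return terms

terms = sqrt_iteration(25)
for n in (1, 2, 3, 5, 10, 25):
    print(n, terms[n - 1], 2 - terms[n - 1])   # the gap 2 - a_n shrinks toward 0
```

In this range the computed terms increase and stay in $[1, 2)$, consistent with the induction argument; the code is evidence, not a proof.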

Euler’s Number e
EXAMPLE 3.3.5 In this example, we consider in detail the very important sequence $\{t_n\}_{n=1}^{\infty}$, where for each $n \in \mathbb{N}$,
\[ t_n = \left(1 + \frac{1}{n}\right)^n. \]
We will show that the sequence $\{t_n\}$ is monotone increasing and bounded above, and thus has a limit. The standard notation for this limit is $e$ (in honor of Leonhard Euler); i.e.,
\[ e = \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n. \]
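By the binomial theorem,
\[ t_n = \left(1 + \frac{1}{n}\right)^n = 1 + n\cdot\frac{1}{n} + \frac{n(n-1)}{1\cdot 2}\cdot\frac{1}{n^2} + \cdots + \frac{n(n-1)\cdots 1}{1\cdot 2\cdots n}\cdot\frac{1}{n^n}. \tag{1} \]
For $k = 1, \dots, n$, the $(k+1)$-st term on the right side is
\[ \frac{n(n-1)\cdots(n-k+1)}{1\cdot 2\cdots k}\cdot\frac{1}{n^k}, \]
which is equal to
\[ \frac{1}{1\cdot 2\cdots k}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{k-1}{n}\right). \tag{2} \]
If we expand $t_{n+1}$ in the same way, we obtain $n + 2$ terms, and for $k = 1, 2, \dots, n$ the $(k+1)$-st term is
\[ \frac{1}{1\cdot 2\cdots k}\left(1 - \frac{1}{n+1}\right)\left(1 - \frac{2}{n+1}\right)\cdots\left(1 - \frac{k-1}{n+1}\right), \]
which is greater than or equal to the corresponding term in (2), with equality only when $k = 1$. Since the expansion of $t_{n+1}$ also contains one additional positive term, $t_n < t_{n+1}$ for all $n$.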
From (1) we also obtain
\[ t_n \le 1 + 1 + \frac{1}{1\cdot 2} + \frac{1}{1\cdot 2\cdot 3} + \cdots + \frac{1}{1\cdot 2\cdots n} \le 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^{n-1}}, \]
which by the identity $1 + r + \cdots + r^{n-1} = (1 - r^n)/(1 - r)$, $r \neq 1$,
\[ = 1 + \frac{1 - (\frac{1}{2})^n}{1 - \frac{1}{2}} < 1 + \frac{1}{1 - \frac{1}{2}} = 3. \]
Thus $\{t_n\}$ is bounded above by 3, and we can apply Theorem 3.3.2. Since $t_n \le 3$ for all $n \in \mathbb{N}$ we also have that $e \le 3$. To five decimal places, $e = 2.71828\ldots$. The number $e$ is the base of the natural logarithm function, which will be defined in Example 6.3.5 as a definite integral. □
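
The two facts proved in Example 3.3.5 (monotonicity and the bound 3) can be checked for the first few hundred terms with exact arithmetic. The sketch below is an illustration only; the helper name `t` mirrors the notation $t_n$, and the comparison with `math.e` uses the floating-point value of $e$ from the Python standard library.

```python
import math
from fractions import Fraction

def t(n):
    """Exact value of t_n = (1 + 1/n)^n as a fraction."""
    return Fraction(n + 1, n) ** n

terms = [t(n) for n in range(1, 201)]

for n in (1, 2, 10, 100, 200):
    # float() is used only for display; the comparisons in the text are exact
    print(n, float(terms[n - 1]), math.e - float(terms[n - 1]))
```

The convergence is slow: the printed gaps suggest that the error decreases roughly like a constant times $1/n$, which is why $t_n$ is rarely used to compute $e$ in practice.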

Infinite Limits
If a monotone increasing sequence $\{a_n\}$ is bounded above, then by Theorem 3.3.2 the sequence converges. If the sequence $\{a_n\}$ is not bounded above, then for each positive real number $M$ there exists $n_o \in \mathbb{N}$ such that $a_n \ge M$ for all $n \ge n_o$. Since the real number $M$ can be taken to be arbitrarily large, this is usually expressed by saying that the sequence $\{a_n\}$ diverges to $\infty$. With the following definition we make this concept precise, not only for monotone sequences, but for any sequence of real numbers.

DEFINITION 3.3.6 Let $\{a_n\}$ be a sequence of real numbers. We say that $\{a_n\}$ approaches infinity, or that $\{a_n\}$ diverges to $\infty$, denoted $a_n \to \infty$, if for every positive real number $M$, there exists an integer $n_o \in \mathbb{N}$ such that

\[ a_n > M \quad \text{for all } n \ge n_o. \]

We will also use the notation $\lim_{n\to\infty} a_n = \infty$ to denote that $a_n \to \infty$ as $n \to \infty$. The concept of $a_n \to -\infty$ is defined similarly.

THEOREM 3.3.7 If $\{a_n\}$ is monotone increasing and not bounded above, then $a_n \to \infty$ as $n \to \infty$.

As a consequence of Theorems 3.3.2 and 3.3.7, every monotone increasing sequence $\{a_n\}$ either converges to a real number (if the sequence is bounded above) or diverges to $\infty$. In either case,

\[ \lim_{n\to\infty} a_n = \sup\{a_n : n \in \mathbb{N}\}. \]

Remark. Although the definition of diverging to infinity is included in this


section on monotone sequences, this should not give the impression that Def-
inition 3.3.6 is applicable only to such sequences. In the following we give an
example of a sequence that diverges to infinity but which is not monotone.
Also, it is important to remember that when we say that a sequence converges,
we mean that it converges to a real number.

EXAMPLE 3.3.8 Consider the sequence $\{n(2 + (-1)^n)\}$. If $n$ is even, then $n(2 + (-1)^n) = 3n$; if $n$ is odd, then $n(2 + (-1)^n) = n$. In either case,

\[ n(2 + (-1)^n) \ge n, \]

and thus the sequence diverges to $\infty$. The sequence, however, is clearly not monotone. □
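
Definition 3.3.6 asks for an index $n_o$ for each $M$. For the sequence of Example 3.3.8 the inequality $n(2 + (-1)^n) \ge n$ suggests one admissible choice, $n_o = \lfloor M \rfloor + 1$. The sketch below illustrates this; the names `s` and `n0_for` are hypothetical helpers, and the check covers only a finite range of indices.

```python
import math

def s(n):
    """n-th term of the sequence n(2 + (-1)^n)."""
    return n * (2 + (-1) ** n)

def n0_for(M):
    """One admissible n_o for a given M > 0: since s(n) >= n, any n > M works."""
    return math.floor(M) + 1

values = [s(n) for n in range(1, 11)]
print(values)          # the terms alternate between n (odd n) and 3n (even n)

M = 1000.5
n0 = n0_for(M)
print(n0, all(s(n) > M for n in range(n0, n0 + 1000)))
```

The list of values makes the failure of monotonicity visible: each even-indexed term $3n$ is followed by the smaller odd-indexed term $n + 1$.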

Exercises 3.3
1. *Show by example that the conclusion of Corollary 3.3.3 is false if the
intervals In with In ⊃ In+1 are not bounded.
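2. Show that each of the following sequences is monotone. Find a lower or upper bound if it exists. Find the limit if you can.
*a. $\left\{\frac{\sqrt{n^2 + 1}}{n}\right\}$   b. $\left\{\left(a + \frac{1}{n}\right)\left(a - \frac{1}{n}\right)\right\}$, $a > 1$.
c. $\{a^n\}$, $a > 1$   d. $\{s_n\}$ where $s_n = \cos^3\frac{\pi}{2} + \cos^2\frac{2\pi}{2} + \cdots + \cos^2\frac{n\pi}{2}$
e. $\left\{\frac{n^n}{n!}\right\}$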
3. Define the sequence $\{a_n\}$ as follows: $a_1 = \sqrt{2}$, and $a_{n+1} = \sqrt{2 + a_n}$.
a. Show that $a_n \le 2$ for all $n$.
b. Show that the sequence $\{a_n\}$ is monotone increasing.
c. Find $\lim_{n\to\infty} a_n$.

4. *Let $a_1 > 1$, and for $n \in \mathbb{N}$, $n \ge 1$, define $a_{n+1} = 2 - 1/a_n$. Show that the sequence $\{a_n\}$ is monotone and bounded. Find $\lim_{n\to\infty} a_n$.

5. Let $0 < a < 1$. Set $t_1 = 2$, and for $n \in \mathbb{N}$, set $t_{n+1} = 2 - a/t_n$. Show that the sequence $\{t_n\}$ is monotone and bounded. Find $\lim_{n\to\infty} t_n$.

6. Let $\alpha > 0$. Choose $x_1 > \sqrt{\alpha}$. For $n = 1, 2, 3, \dots$, define
\[ x_{n+1} = \frac{1}{2}\left(x_n + \frac{\alpha}{x_n}\right). \]
*a. Show that the sequence $\{x_n\}$ is monotone and bounded.
b. Prove that $\lim_{n\to\infty} x_n = \sqrt{\alpha}$.
c. Prove that $0 \le x_n - \sqrt{\alpha} \le (x_n^2 - \alpha)/x_n$.
7. In Exercise 6, let $\alpha = 3$ and $x_1 = 2$. Use part (c) to find $x_n$ such that $|x_n - \sqrt{3}| < 10^{-5}$.
8. For each of the following, prove that the sequence $\{a_n\}$ converges and find the limit.
a. $a_{n+1} = \frac{1}{6}(2a_n + 5)$, $a_1 = 2$   b. $a_{n+1} = \sqrt{2a_n}$, $a_1 = 3$
*c. $a_{n+1} = \sqrt{2a_n + 3}$, $a_1 = 1$   d. $a_{n+1} = \sqrt{2a_n + 3}$, $a_1 = 4$
*e. $a_{n+1} = \sqrt{3a_n - 2}$, $a_1 = 4$   f. $a_{n+1} = \sqrt{3a_n - 2}$, $a_1 = \frac{3}{2}$
9. Let A be a nonempty subset of R that is bounded above and let α =
sup A. Show that there exists a monotone increasing sequence {an } in
A such that α = lim an . Can the sequence {an } be chosen to be strictly
increasing?
10. Use Example 3.3.5 to find the limit of each of the following sequences.
*a. $\left\{\left(1 + \frac{1}{n}\right)^{2n}\right\}$   b. $\left\{\left(1 + \frac{1}{n}\right)^{n+1}\right\}$
*c. $\left\{\left(1 + \frac{1}{2n}\right)^{3n}\right\}$   d. $\left\{\left(1 - \frac{1}{n}\right)^{n}\right\}$
11. *For each $n \in \mathbb{N}$, let $s_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n}$. Show that $\{s_n\}$ is monotone increasing but not bounded above.
12. For each $n \in \mathbb{N}$, let $s_n = 1 + \frac{1}{\sqrt{2}} + \cdots + \frac{1}{\sqrt{n}}$. Show that the sequence $\{s_n\}$ is monotone increasing but not bounded above.
13. *For each $n \in \mathbb{N}$, let $s_n = \frac{1}{1^2} + \frac{1}{2^2} + \cdots + \frac{1}{n^2}$. Show that the sequence $\{s_n\}$ is monotone increasing and bounded above by 2.
14. Let $0 < b < 1$. For each $n \in \mathbb{N}$, let $s_n = 1 + b + b^2 + \cdots + b^n$. Prove that the sequence $\{s_n\}$ is monotone increasing and bounded above. Find $\lim_{n\to\infty} s_n$.

15. Show that each of the following sequences diverges to $+\infty$.
*a. $\left\{\frac{a^n}{n^a}\right\}$, $a > 1$.   b. $\left\{\frac{n^2 + 1}{n}\right\}$.
*c. $\left\{n + \frac{(-1)^n}{n}\right\}$   d. $\{n + (-1)^n \sqrt{n}\}$
16. *Which of the sequences in the previous exercise are not monotone?
Explain your answer!
17. If an → ∞ and {bn } converges in R, prove that {an + bn } diverges to ∞.
18. If $a_n > 0$ for all $n \in \mathbb{N}$ and $\lim_{n\to\infty} a_n = 0$, prove that $\frac{1}{a_n} \to \infty$.
19. Suppose $a_1 > a_2 > 0$. For $n \ge 2$ set $a_{n+1} = \frac{1}{2}(a_n + a_{n-1})$. Prove that
a. $\{a_{2k+1}\}$ is monotone decreasing, b. $\{a_{2k}\}$ is monotone increasing, and c. $\{a_n\}$ converges.
20. Let {sn } be a bounded sequence of real numbers. For each n ∈ N let an
and bn be defined as follows; an = inf{sk : k ≥ n}, bn = sup{sk : k ≥ n}.
a. Prove that the sequences {an } and {bn } are monotone and bounded.
b. Prove that lim an = lim bn if and only if the sequence {sn } con-
n→∞ n→∞
verges.
21. *In Theorem 3.3.2 we used the supremum property of R to prove that
every bounded monotone sequence converges. Prove that the converse is
also true; namely, if every bounded monotone sequence in R converges,
then every nonempty subset of R that is bounded above has a supremum
in R.
22. *Use the nested intervals property to prove that [0, 1] is uncountable.

3.4 Subsequences and the Bolzano-Weierstrass Theorem


In this section, we will consider subsequences and subsequential limits of a
given sequence of real numbers. One of the key results of the section is that

every bounded sequence of real numbers has a convergent subsequence. This


result, also known as the sequential version of the Bolzano-Weierstrass theo-
rem, is one of the fundamental results of real analysis.

DEFINITION 3.4.1 Let $(X, d)$ be a metric space. Given a sequence $\{p_n\}$ in $X$, consider a sequence $\{n_k\}_{k=1}^{\infty}$ of positive integers such that $n_1 < n_2 < n_3 < \cdots$. Then the sequence $\{p_{n_k}\}_{k=1}^{\infty}$ is called a subsequence of the sequence $\{p_n\}$.

If the sequence $\{p_{n_k}\}$ converges, its limit is called a subsequential limit of the sequence $\{p_n\}$. Specifically, a point $p \in X$ is a subsequential limit of the sequence $\{p_n\}$ if there exists a subsequence $\{p_{n_k}\}$ of $\{p_n\}$ that converges to $p$. Also, given a sequence $\{p_n\}$ in $\mathbb{R}$, we say that $\infty$ is a subsequential limit of $\{p_n\}$ if there exists a subsequence $\{p_{n_k}\}$ so that $p_{n_k} \to \infty$ as $k \to \infty$. Similarly for $-\infty$.

EXAMPLES 3.4.2 (a) Consider the sequence $\{a_n\}$ with $a_n = 1 - (-1)^n$. If $n$ is even, then $a_n = 0$, and if $n$ is odd, then $a_n = 2$. Thus 0 and 2 are subsequential limits of the given sequence. That these are the only two subsequential limits is left to the exercises (Exercise 1).
(b) As our second example, consider the sequence $\{a_n\}$ with $a_n = (-1)^n + \frac{1}{n}$. Then both 1 and $-1$ are subsequential limits. If $n$ is even, i.e., $n = 2k$, then
\[ a_n = a_{2k} = 1 + \frac{1}{2k}, \]
which converges to 1. On the other hand, if $n$ is odd, i.e., $n = 2k + 1$, then
\[ a_n = a_{2k+1} = -1 + \frac{1}{2k+1}, \]
which converges to $-1$. This shows that $-1$ and 1 are subsequential limits. Suppose $\{a_{n_k}\}$ is any subsequence of $\{a_n\}$. If the sequence $\{n_k\}$ contains an infinite number of both odd and even integers, then the subsequence $\{a_{n_k}\}$ cannot converge. (Why?) On the other hand, if all but a finite number of the $n_k$ are even, then $\{a_{n_k}\}$ converges to 1. Similarly, if all but a finite number of the $n_k$ are odd, then $\{a_{n_k}\}$ converges to $-1$. Thus $-1$ and 1 are the only subsequential limits of $\{a_n\}$.
(c) Consider the sequence $\{n(1 + (-1)^n)\}$. If $n$ is even, then $n(1 + (-1)^n) = 2n$, whereas if $n$ is odd, $n(1 + (-1)^n) = 0$. Thus 0 and $\infty$ are two subsequential limits of the sequence. The same argument as in (b) proves that these are the only two subsequential limits. □
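
The even and odd subsequences of Example 3.4.2(b) can be tabulated directly. The sketch below is an illustration using exact fractions; the helper name `a` mirrors the notation $a_n = (-1)^n + \frac{1}{n}$, and the index choices $n_k = 2k$ and $n_k = 2k + 1$ are the ones used in the example.

```python
from fractions import Fraction

def a(n):
    """n-th term of the sequence (-1)^n + 1/n, as an exact fraction."""
    return (-1) ** n + Fraction(1, n)

even_sub = [a(2 * k) for k in range(1, 501)]        # subsequence with n_k = 2k
odd_sub = [a(2 * k + 1) for k in range(1, 501)]     # subsequence with n_k = 2k + 1

for k in (1, 10, 100, 500):
    print(k, float(even_sub[k - 1]), float(odd_sub[k - 1]))
```

The distances to the two subsequential limits are exactly $\frac{1}{2k}$ and $\frac{1}{2k+1}$, matching the formulas for $a_{2k}$ and $a_{2k+1}$.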

Our first result assures us that for convergent sequences, every subsequence
also converges to the same limit.

THEOREM 3.4.3 Let $(X, d)$ be a metric space and let $\{p_n\}$ be a sequence in $X$. If $\{p_n\}$ converges to $p$, then every subsequence of $\{p_n\}$ also converges to $p$.

Proof. Let $\{p_{n_k}\}$ be a subsequence of $\{p_n\}$, and let $\epsilon > 0$ be given. Since $p_n \to p$, there exists a positive integer $n_o$ such that $d(p_n, p) < \epsilon$ for all $n \ge n_o$. Since $\{n_k\}$ is strictly increasing, $n_k \ge k$ for all $k$, and hence $n_k \ge n_o$ for all $k \ge n_o$. Therefore,

\[ d(p_{n_k}, p) < \epsilon \]

for all $k \ge n_o$, i.e., $p_{n_k} \to p$. □

EXAMPLES 3.4.4 (a) In this example, we give an application of Theorem 3.4.3. Consider the sequence $\{p^n\}$ where $0 < p < 1$. Since

\[ 0 < p^{n+1} < p^n < 1 \]

for all $n$, the sequence $\{p^n\}$ is monotone decreasing, bounded below, and hence converges. Let
\[ a = \lim_{n\to\infty} p^n. \]
By Theorem 3.4.3 the subsequence $\{p^{2n}\}$ also converges to $a$. But $p^{2n} = (p^n)^2$, and thus
\[ a = \lim_{n\to\infty} p^{2n} = \lim_{n\to\infty} (p^n)^2 = a^2. \]
Thus $a^2 = a$. Since $0 \le a < 1$, we must have $a = 0$.
(b) In our second example we show how the previous theorem may be used to prove divergence of a sequence. Consider the sequence $\{\sin n\theta\pi\}$, where $\theta$ is a rational number with $0 < \theta < 1$. Write $\theta = a/b$, with $a, b \in \mathbb{N}$ and $b \ge 2$. When $n = kb$, $k \in \mathbb{N}$, then $\sin n\theta\pi = \sin ka\pi = 0$. Therefore 0 is a subsequential limit of the sequence. On the other hand, if $n = 2kb + 1$, $k \in \mathbb{N}$, then
\[ \sin n\theta\pi = \sin\left((2kb + 1)\frac{a}{b}\pi\right) = \sin\left(2ka\pi + \frac{a}{b}\pi\right) = \cos(2ka\pi)\sin\frac{a}{b}\pi = \sin\frac{a}{b}\pi. \]
Since $0 < a/b < 1$, $\sin\frac{a}{b}\pi \neq 0$. Thus $\sin\frac{a}{b}\pi$ is another distinct subsequential limit of $\{\sin n\theta\pi\}$. Hence as a consequence of Theorem 3.4.3 the sequence $\{\sin n\theta\pi\}$ diverges. The result is still true if $\theta$ is irrational. The proof, however, is much more difficult. □

The following result, which is a sequential version of Theorem 2.3.7, will


prove useful in subsequent results.

THEOREM 3.4.5 Let $K$ be a compact subset of a metric space $(X, d)$. Then every sequence in $K$ has a convergent subsequence which converges in $K$.

Proof. Let $\{p_n\}$ be a sequence in $K$, and let $E = \{p_n : n = 1, 2, \dots\}$. If $E$ is finite, then there exists a point $p \in E$ and a sequence $\{n_k\}$ with $n_1 < n_2 < \cdots$ such that
\[ p_{n_1} = p_{n_2} = \cdots = p. \]
The subsequence $\{p_{n_k}\}$ obviously converges to $p$, which is in $K$.
If $E$ is infinite, then by Theorem 2.3.7, $E$ has a limit point $p \in K$. Choose $n_1$ such that $d(p, p_{n_1}) < 1$. Having chosen $n_1, \dots, n_{k-1}$, choose an integer $n_k > n_{k-1}$ so that
\[ d(p, p_{n_k}) < \frac{1}{k}. \]
Such an integer $n_k$ exists since every neighborhood of $p$ contains infinitely many points of $E$. The sequence $\{p_{n_k}\}$ is a subsequence of $\{p_n\}$, which by construction converges to $p \in K$. □
We are now ready to state and prove the sequential version of the Bolzano-Weierstrass theorem.

COROLLARY 3.4.6 (Bolzano-Weierstrass) Every bounded sequence in $\mathbb{R}$ has a convergent subsequence.
Proof. Suppose $\{p_n\}$ is a bounded sequence in $\mathbb{R}$. Then there exists a positive integer $M$ such that $\{p_n\}$ is a sequence in the compact set $[-M, M]$. The result now follows by the previous theorem. □
Remark. The converse of Theorem 3.4.5 is also true. If K is a subset of
a metric space (X, d) having the property that every sequence in K has a
convergent subsequence, then K is compact. For metric spaces, the proof of
the converse is similar to Miscellaneous Exercise 4 of the previous chapter.
If K is a subset of R, then the hypothesis can be used to show that K is
closed and bounded, and thus by Theorem 2.4.2 is compact.
An argument similar to the one used in the previous Corollary may be
used to prove the following.
THEOREM 3.4.7 Let $\{p_n\}$ be a sequence in a metric space $(X, d)$. If $p$ is a limit point of $\{p_n : n \in \mathbb{N}\}$, then there exists a subsequence $\{p_{n_k}\}$ of $\{p_n\}$ such that $p_{n_k} \to p$ as $k \to \infty$.
Proof. Exercise 8. □
As an application of Theorem 3.4.7 we consider the following example.
EXAMPLE 3.4.8 Let $\{r_n\}_{n=1}^{\infty}$ be an enumeration of the rational numbers in $[0, 1]$. By Example 2.2.13(c), every $p \in [0, 1]$ is a limit point of $\{r_n : n = 1, 2, \dots\}$. Thus if $p \in [0, 1]$, there exists a subsequence $\{r_{n_k}\}$ of $\{r_n\}$ such that $r_{n_k} \to p$. The sequence $\{r_n\}$ has the property that every $p \in [0, 1]$ is a subsequential limit of the sequence. This sequence also provides an example of a sequence for which the set of subsequential limits of the sequence is uncountable. □
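
The construction behind Example 3.4.8 can be imitated on a computer. The sketch below is an illustration under two assumptions that are not in the text: a particular enumeration of the rationals in $[0, 1]$ (ordered by denominator), and a greedy choice of indices $n_1 < n_2 < \cdots$ with $|r_{n_k} - p| < \frac{1}{k}$, in the spirit of the proof of Theorem 3.4.5. The names `rationals_in_unit_interval` and `subsequence_toward` are hypothetical, and the target $p = \sqrt{2}/2$ is represented in floating point.

```python
import math
from fractions import Fraction

def rationals_in_unit_interval():
    """Yield every rational in [0, 1] exactly once: 0, 1, 1/2, 1/3, 2/3, 1/4, 3/4, ..."""
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for p in range(1, q):
            if math.gcd(p, q) == 1:      # skip fractions that are not in lowest terms
                yield Fraction(p, q)
        q += 1

def subsequence_toward(p, length):
    """Greedily pick (n_k, r_{n_k}) with n_1 < n_2 < ... and |r_{n_k} - p| < 1/k."""
    picked = []
    k = 1
    for n, r in enumerate(rationals_in_unit_interval(), start=1):
        if abs(float(r) - p) < 1.0 / k:
            picked.append((n, r))
            k += 1
            if k > length:
                break
    return picked

p = math.sqrt(2) / 2
picked = subsequence_toward(p, 8)
for k, (n, r) in enumerate(picked, start=1):
    print(k, n, r, abs(float(r) - p))
```

Because the rationals are dense in $[0, 1]$, the search for the next index always succeeds, so the same procedure could be continued indefinitely for any target $p \in [0, 1]$.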

Exercises 3.4
1. a. Prove that 0 and 2 are the only subsequential limits of the sequence $\{1 - (-1)^n\}$.
b. Prove that 0 and $\infty$ are the only subsequential limits of the sequence $\{n(1 + (-1)^n)\}$.
2. a. Construct a sequence {sn } for which the subsequential limits are
{−∞, −2, 1}.
b. Construct a sequence {sn } for which the set of subsequential limits is
countable.
3. Findnall the osubsequential limitsn of the following sequences.
nπ nπ o
*a. sin . b. n sin .
2 4
(−1)n
 
*c. 1 − d. {(1.5 + (−1)n )n }.
n n nπ o
*e. (−1)n + 2 sin nπ

2
f. n sin
4
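4. Use Example 3.3.5 to find the limit of each of the following sequences. Justify your answer.
*a. $\left\{\left(1 + \frac{1}{3n}\right)^{6n}\right\}$   b. $\left\{\left(1 + \frac{1}{2n}\right)^{n}\right\}$   c. $\left\{\left(1 + \frac{2}{n}\right)^{n}\right\}$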
5. Suppose $p > 1$. Use the method of Example 3.4.4 to show that $\lim_{n\to\infty} \sqrt[n]{p} = 1$.
6. For $n \in \mathbb{N}$ set $p_n = n^{1/n}$.
a. Show that $1 < p_{n+1} < p_n$ for all $n \ge 3$.
b. Let $p = \lim_{n\to\infty} p_n$. Use the fact that the subsequence $\{p_{2n}\}$ also converges to $p$ to conclude that $p = 1$.
7. Let {pn } be a bounded sequence of real numbers and let p ∈ R be such
that every convergent subsequence of {pn } converges to p. Prove that the
sequence {pn } converges to p.
8. Prove Theorem 3.4.7.
9. Prove that every sequence in R has a monotone subsequence.
10. *Prove that every bounded sequence in $\mathbb{R}^n$ has a convergent subsequence.
11. Use the Bolzano-Weierstrass theorem to prove the nested intervals prop-
erty (Corollary 3.3.3).
12. Prove that every uncountable subset of R has a limit point in R.

3.5 Limit Superior and Inferior of a Sequence


In this section, we define the limit superior and limit inferior of a sequence
of real numbers. These two limit operations are important because unlike the

limit of a sequence, the limit superior and limit inferior of a sequence always
exist. The concept of the limit superior and limit inferior will also be important
in our study of both series of real numbers and power series.
Let $\{s_n\}$ be a sequence in $\mathbb{R}$. For each $k \in \mathbb{N}$, we define $a_k$ and $b_k$ as follows:

\[ a_k = \inf\{ s_n : n \ge k \}, \qquad b_k = \sup\{ s_n : n \ge k \}. \]

Recall that for a nonempty subset $E$ of $\mathbb{R}$, $\sup E$ is the least upper bound of $E$ if $E$ is bounded above, and $\infty$ otherwise.
From the definition, $a_k \le b_k$ for all $k$. Furthermore, the sequences $\{a_k\}$ and $\{b_k\}$ satisfy the following:

\[ a_k \le a_{k+1} \quad \text{and} \quad b_k \ge b_{k+1} \tag{3} \]

for all $k$. To prove (3), let $E_k = \{s_n : n \ge k\}$. Then $E_{k+1} \subset E_k$. Therefore, if $b_k = \sup E_k$, $s_n \le b_k$ for all $n \ge k$. In particular

\[ s_n \le b_k \quad \text{for all } n \ge k + 1. \]

Therefore $b_{k+1} = \sup E_{k+1} \le b_k$. A similar argument will show that the sequence $\{a_k\}$ is nondecreasing.
As a consequence of (3) the sequence $\{a_k\}$ is monotone increasing and the sequence $\{b_k\}$ is monotone decreasing. Thus by Theorems 3.3.2 and 3.3.7, these two sequences always have limits in $\mathbb{R} \cup \{-\infty, \infty\}$.

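DEFINITION 3.5.1 Let $\{s_n\}$ be a sequence in $\mathbb{R}$. The limit superior of $\{s_n\}$, denoted $\overline{\lim}_{n\to\infty}\, s_n$ or $\overline{\lim}\, s_n$, is defined as

\[ \overline{\lim}_{n\to\infty}\, s_n = \lim_{k\to\infty} b_k = \inf_{k\in\mathbb{N}} \sup\{s_n : n \ge k\}. \]

The limit inferior of $\{s_n\}$, denoted $\underline{\lim}_{n\to\infty}\, s_n$ or $\underline{\lim}\, s_n$, is defined as

\[ \underline{\lim}_{n\to\infty}\, s_n = \lim_{k\to\infty} a_k = \sup_{k\in\mathbb{N}} \inf\{s_n : n \ge k\}. \]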

We now give several examples for which we will compute the limit inferior
and limit superior. As will be evident, these computations are very tedious.
An easier method will be given in Theorem 3.5.7.

EXAMPLES 3.5.2 (a) $\{1 + (-1)^n\}_{n=1}^{\infty}$. Let $s_n = 1 + (-1)^n$. Then $s_n = 2$ if $n$ is even, 0 otherwise. Thus $a_k = 0$ for all $k$ and $b_k = 2$ for all $k$. Therefore

\[ \overline{\lim}_{n\to\infty}\, s_n = 2 \quad \text{and} \quad \underline{\lim}_{n\to\infty}\, s_n = 0. \]

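(b) $\{n(1 + (-1)^n)\}_{n=1}^{\infty}$. In this example,
\[ s_n = n(1 + (-1)^n) = \begin{cases} 0, & \text{if } n \text{ is odd}, \\ 2n, & \text{if } n \text{ is even}. \end{cases} \]

Set $E_k = \{s_n : n \ge k\}$. Then

\[ E_k = \{0,\ 2(k + 1),\ 0,\ 2(k + 3),\ \dots\} \quad \text{if } k \text{ is odd}, \]
\[ E_k = \{2k,\ 0,\ 2(k + 2),\ 0,\ 2(k + 4),\ \dots\} \quad \text{if } k \text{ is even}. \]

Therefore $a_k = \inf E_k = 0$, and $b_k = \sup E_k = \infty$. Thus

\[ \underline{\lim}_{n\to\infty}\, s_n = 0 \quad \text{and} \quad \overline{\lim}_{n\to\infty}\, s_n = \infty. \]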

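(c) $\{(-1)^n + \frac{1}{n}\}_{n=1}^{\infty}$. Set $s_n = (-1)^n + 1/n$. Then
\[ s_n = \begin{cases} -1 + \frac{1}{n}, & n \text{ odd}, \\ 1 + \frac{1}{n}, & n \text{ even}. \end{cases} \]

To compute the limit superior and inferior of the sequence $\{s_n\}$, we set $E_k = \{s_n : n \ge k\}$. If $k$ is even, then
\[ E_k = \left\{ 1 + \tfrac{1}{k},\ -1 + \tfrac{1}{k+1},\ 1 + \tfrac{1}{k+2},\ \dots \right\}. \]

Therefore, for $k$ even,
\[ b_k = \sup E_k = 1 + \frac{1}{k} \quad \text{and} \quad a_k = \inf E_k = -1. \]
Similarly, for $k$ odd,
\[ b_k = \sup E_k = 1 + \frac{1}{k+1} \quad \text{and} \quad a_k = \inf E_k = -1. \]
As a consequence,
\[ a_k = -1 \text{ for all } k, \qquad b_k = \begin{cases} 1 + \frac{1}{k}, & k \text{ even}, \\ 1 + \frac{1}{k+1}, & k \text{ odd}. \end{cases} \]
Thus
\[ \overline{\lim}_{n\to\infty}\, s_n = 1 \quad \text{and} \quad \underline{\lim}_{n\to\infty}\, s_n = -1. \] □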
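
The tail quantities $a_k$ and $b_k$ of part (c) can be approximated by restricting each tail to a finite window $k \le n \le N$. This is only a stand-in for the true infimum and supremum over an infinite tail: for this particular sequence the supremum is attained at the first even index, so the window maximum equals $b_k$ exactly, while the window minimum stays slightly above $a_k = -1$. The names `s`, `tail_extremes`, and the horizon `N` are assumptions of this sketch.

```python
from fractions import Fraction

def s(n):
    """n-th term of the sequence (-1)^n + 1/n, as an exact fraction."""
    return (-1) ** n + Fraction(1, n)

def tail_extremes(k, horizon):
    """Minimum and maximum of s_n over the finite window k <= n <= horizon."""
    window = [s(n) for n in range(k, horizon + 1)]
    return min(window), max(window)

N = 2000
for k in (1, 2, 3, 10, 11):
    low, high = tail_extremes(k, N)
    print(k, float(low), high)   # high equals b_k; low only approximates a_k = -1
```

Enlarging the horizon pushes the window minimum closer to $-1$, which is the sense in which $\inf E_k = -1$ is approached but never attained.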

The following theorem provides an $(\epsilon, n_o)$ characterization of the limit superior. An analogous characterization for the limit inferior is given in Theorem 3.5.4.

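THEOREM 3.5.3 Let $\{s_n\}_{n=1}^{\infty}$ be a sequence in $\mathbb{R}$.
(a) Suppose $\overline{\lim}_{n\to\infty}\, s_n \in \mathbb{R}$. Then $\beta = \overline{\lim}_{n\to\infty}\, s_n$ if and only if for all $\epsilon > 0$
(i) there exists $n_o \in \mathbb{N}$ such that $s_n < \beta + \epsilon$ for all $n \ge n_o$, and
(ii) given $n \in \mathbb{N}$, there exists $k \in \mathbb{N}$ with $k \ge n$ such that $s_k > \beta - \epsilon$.
(b) $\overline{\lim}_{n\to\infty}\, s_n = \infty$ if and only if given $M$ and $n \in \mathbb{N}$, there exists $k \in \mathbb{N}$ with $k \ge n$ such that $s_k \ge M$.
(c) $\overline{\lim}_{n\to\infty}\, s_n = -\infty$ if and only if $s_n \to -\infty$ as $n \to \infty$.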

Remark. The statement "$s_n < \beta + \epsilon$ for all $n \ge n_o$" means that $s_n < \beta + \epsilon$ for all but finitely many $n$. On the other hand, the statement "given $n$, there exists $k \in \mathbb{N}$ with $k \ge n$ such that $s_k > \beta - \epsilon$" means that $s_n > \beta - \epsilon$ for infinitely many indices $n$.

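THEOREM 3.5.4 Let $\{s_n\}$ be a sequence in $\mathbb{R}$.
(a') Suppose $\underline{\lim}_{n\to\infty}\, s_n \in \mathbb{R}$. Then $\alpha = \underline{\lim}_{n\to\infty}\, s_n$ if and only if for all $\epsilon > 0$
(i) there exists $n_o \in \mathbb{N}$ such that $s_n > \alpha - \epsilon$ for all $n \ge n_o$, and
(ii) given $n \in \mathbb{N}$, there exists $k \in \mathbb{N}$ with $k \ge n$ such that $s_k < \alpha + \epsilon$.
(b') $\underline{\lim}_{n\to\infty}\, s_n = -\infty$ if and only if given $M$ and $n \in \mathbb{N}$, there exists $k \in \mathbb{N}$ with $k \ge n$ such that $s_k \le M$.
(c') $\underline{\lim}_{n\to\infty}\, s_n = \infty$ if and only if $s_n \to \infty$ as $n \to \infty$.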

Proof of Theorem 3.5.3. We will only prove (a). The proofs of (b) and (c) are left to the exercises (Exercise 5).
(a) Suppose $\beta = \overline{\lim}\, s_n = \lim_{k\to\infty} b_k$, where

\[ b_k = \sup\{s_n : n \ge k\}. \]

Let $\epsilon > 0$ be given. Since $\lim_{k\to\infty} b_k = \beta$, there exists a positive integer $n_o$ such that $b_k < \beta + \epsilon$ for all $k \ge n_o$. Since $s_n \le b_k$ for all $n \ge k$,

\[ s_n < \beta + \epsilon \quad \text{for all } n \ge n_o. \]

This proves (i). Suppose $n \in \mathbb{N}$ is given. Since $b_k \to \beta$ and $\{b_k\}$ is monotone decreasing, $b_k \ge \beta$ for all $k$. In particular, $b_n \ge \beta$. By the definition of $b_n$, however, given $\epsilon > 0$, there exists an integer $k \ge n$ such that

\[ s_k > b_n - \epsilon \ge \beta - \epsilon, \]

which proves (ii).

Conversely, assume that (i) and (ii) hold. Let $\epsilon > 0$ be given. By (i) there exists $n_o \in \mathbb{N}$ such that $s_n < \beta + \epsilon$ for all $n \ge n_o$. Therefore

\[ b_{n_o} = \sup\{s_n : n \ge n_o\} \le \beta + \epsilon. \]

Since the sequence $\{b_n\}$ is monotone decreasing, $b_n \le \beta + \epsilon$ for all $n \ge n_o$. Thus
\[ \overline{\lim}\, s_n = \lim_{n\to\infty} b_n \le \beta + \epsilon. \]
Since $\epsilon > 0$ was arbitrary, $\overline{\lim}\, s_n \le \beta$.
Suppose $\beta' = \overline{\lim}\, s_n < \beta$. Choose $\epsilon > 0$ such that $\beta' < \beta - 2\epsilon$. But then, by the first part of the proof applied to $\beta'$, there exists $n_o$ such that

\[ s_n < \beta' + \epsilon < \beta - \epsilon \quad \text{for all } n \ge n_o, \]

which contradicts (ii). Thus $\overline{\lim}\, s_n = \beta$. □


To illustrate the previous two theorems, consider the sequence

\[ s_n = (-1)^n + 1/n \]

of Example 3.5.2(c). For this sequence, $\overline{\lim}\, s_n = 1$ and $\underline{\lim}\, s_n = -1$. Given $\epsilon > 0$, then
\[ s_n < 1 + \epsilon \]
for all $n \in \mathbb{N}$ with $n > 1/\epsilon$. Since the odd terms get close to $-1$, we can never have the existence of an integer $n_o$ such that $s_n > 1 - \epsilon$ for all $n \ge n_o$. On the other hand, given any $n \in \mathbb{N}$, there exists an even integer $k \ge n$ such that $s_k > 1 - \epsilon$.
An immediate consequence of the previous two theorems is as follows:

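COROLLARY 3.5.5 $\overline{\lim}_{n\to\infty}\, s_n = \underline{\lim}_{n\to\infty}\, s_n$ if and only if $\lim_{n\to\infty} s_n$ exists in $\mathbb{R} \cup \{-\infty, \infty\}$.

Proof. Suppose $\overline{\lim}\, s_n = \underline{\lim}\, s_n = \alpha \in \mathbb{R}$. Let $\epsilon > 0$ be given. By (a) and (a') of the previous two theorems, there exist positive integers $n_1$ and $n_2$, such that

\[ s_n < \alpha + \epsilon \quad \text{for all } n \ge n_1, \quad \text{and} \]
\[ s_n > \alpha - \epsilon \quad \text{for all } n \ge n_2. \]

Thus if $n_o = \max\{n_1, n_2\}$,

\[ \alpha - \epsilon < s_n < \alpha + \epsilon \]

for all $n \ge n_o$; i.e., $\lim_{n\to\infty} s_n = \alpha$. The proofs of the cases $\alpha = \infty$ or $\alpha = -\infty$ are similar.
If $\lim_{n\to\infty} s_n = \alpha$, then it easily follows that both $\overline{\lim}\, s_n = \alpha$ and $\underline{\lim}\, s_n = \alpha$. □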

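THEOREM 3.5.6 Let $\{a_n\}$ and $\{b_n\}$ be bounded sequences in $\mathbb{R}$. Then

\[ \underline{\lim}_{n\to\infty}\, a_n + \underline{\lim}_{n\to\infty}\, b_n \le \underline{\lim}_{n\to\infty}\, (a_n + b_n) \le \underline{\lim}_{n\to\infty}\, a_n + \overline{\lim}_{n\to\infty}\, b_n \le \overline{\lim}_{n\to\infty}\, (a_n + b_n) \le \overline{\lim}_{n\to\infty}\, a_n + \overline{\lim}_{n\to\infty}\, b_n. \]

Proof. Exercise 6. □
The following theorem relates the limit superior and inferior of a sequence to the subsequential limits of the sequence, and is in fact very useful for finding $\overline{\lim}\, s_n$ and $\underline{\lim}\, s_n$ of a sequence $\{s_n\}$.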

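THEOREM 3.5.7 Let $\{s_n\}_{n=1}^{\infty}$ be a sequence in $\mathbb{R}$ and let

\[ E = \text{the set of subsequential limits of } \{s_n\} \text{ in } \mathbb{R} \cup \{-\infty, \infty\}. \]

Then $\overline{\lim}_{n\to\infty}\, s_n$ and $\underline{\lim}_{n\to\infty}\, s_n$ are in $E$ and
(a) $\overline{\lim}_{n\to\infty}\, s_n = \sup E$, and
(b) $\underline{\lim}_{n\to\infty}\, s_n = \inf E$.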

Proof. Let $s = \overline{\lim}\, s_n$. Suppose $s \in \mathbb{R}$. To show that $s \in E$, we show the existence of a subsequence $\{s_{n_k}\}$ of $\{s_n\}$ which converges to $s$. Take $\epsilon = 1$. Let $n_1$ be the smallest integer such that
\[ s - 1 < s_{n_1} < s + 1. \]
Such an integer exists by (i) and (ii) of Theorem 3.5.3(a). Suppose $n_1 < n_2 < \cdots < n_k$ have been chosen. Take $\epsilon = \frac{1}{k+1}$. Let $n_{k+1}$ be the smallest integer greater than $n_k$ such that
\[ s - \frac{1}{k+1} < s_{n_{k+1}} < s + \frac{1}{k+1}. \]

Again, such an integer exists by (i) and (ii) of Theorem 3.5.3(a). Then $\{s_{n_k}\}$ is a subsequence of $\{s_n\}$ which clearly converges to $s$. Therefore $s \in E$. The case $s = \infty$ is treated similarly. If $s = -\infty$, then by (c) of Theorem 3.5.3, $s_n \to -\infty$ as $n \to \infty$, so again $s \in E$.
Since $s \in E$, $s \le \sup E$. It remains to be shown that $\sup E = s$. If $s = \infty$ we are done. Otherwise, suppose $\sup E = \beta > s$. Suppose $\beta \neq \infty$. Then there exists $\alpha \in E$ such that
\[ s < \alpha \le \beta. \]
Since $\alpha \in \mathbb{R}$, we can choose $\epsilon > 0$ such that $s + \epsilon < \alpha - \epsilon$. For this $\epsilon$, there exists $n_o \in \mathbb{N}$ such that $s_n < s + \epsilon$ for all $n \ge n_o$. Hence there can exist only finitely many $k$ such that
\[ |s_k - \alpha| < \epsilon. \]
Consequently no subsequence of $\{s_n\}$ can converge to $\alpha$. This contradiction shows that $\sup E = s$. The case $\beta = \infty$ is treated similarly. □

EXAMPLES 3.5.8 In the following examples we use Theorem 3.5.7 to compute $\underline{\lim}\, s_n$ and $\overline{\lim}\, s_n$ for each of the given sequences $\{s_n\}$.
(a) Let $s_n = (-1)^n + 1/n$. By Example 3.4.2(b), the set of subsequential limits of $\{s_n\}$ is $\{-1, 1\}$. Thus by the previous theorem,

\[ \underline{\lim}\, s_n = -1 \quad \text{and} \quad \overline{\lim}\, s_n = 1. \]

(b) Let $s_n = n(1 + (-1)^n)$. By Example 3.4.2(c) the subsequential limits of $\{s_n\}$ are 0 and $\infty$. Therefore,

\[ \underline{\lim}\, s_n = 0 \quad \text{and} \quad \overline{\lim}\, s_n = \infty. \]

(c) Let $s_n = \sin\frac{n\pi}{2}$. If $n$ is even, i.e., $n = 2k$, then $s_{2k} = \sin k\pi = 0$. On the other hand, if $n$ is odd, i.e., $n = 2k + 1$, then $s_{2k+1} = \sin(2k + 1)\frac{\pi}{2} = (-1)^k$. Hence the set of subsequential limits of the sequence $\{s_n\}$ is $\{-1, 0, 1\}$. As a consequence,
\[ \underline{\lim}\, s_n = -1 \quad \text{and} \quad \overline{\lim}\, s_n = 1. \] □
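
For a periodic sequence such as $s_n = \sin\frac{n\pi}{2}$, every value that occurs at all occurs infinitely often, so the set of values is also the set $E$ of subsequential limits. The sketch below uses that observation (which is specific to periodic sequences and is an assumption of the sketch, not a general method) to recover the answer of Example 3.5.8(c). The helper `s` tabulates the exact values $0, 1, 0, -1$ by $n \bmod 4$ rather than calling a floating-point sine.

```python
import math

def s(n):
    """Exact value of sin(n*pi/2): the pattern 0, 1, 0, -1 repeats with period 4."""
    return (0, 1, 0, -1)[n % 4]

values = [s(n) for n in range(1, 41)]
E = sorted(set(values))              # for a periodic sequence: the subsequential limits

print(E, min(E), max(E))             # liminf = inf E, limsup = sup E (Theorem 3.5.7)

# sanity check of the table against the floating-point sine
print(max(abs(math.sin(n * math.pi / 2) - s(n)) for n in range(1, 41)))
```

For a non-periodic sequence this shortcut is not available, and the subsequential limits have to be identified by an argument like the one in Example 3.4.2(b).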

Exercises 3.5
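1. Find the limit inferior and limit superior of each of the following sequences.
*a. $\left\{n \sin\frac{n\pi}{4}\right\}$   b. $\left\{(1 + (-1)^n) \sin\frac{n\pi}{4}\right\}$
*c. $\left\{\frac{n + (-1)^n n^2}{n^2 + 1}\right\}$   d. $\{[1.5 + (-1)^n]^n\}$
e. $\left\{\frac{1}{n} + n(1 + \cos n\pi)\right\}$   *f. $\left\{\frac{1 - 2(-1)^n n}{3n + 2}\right\}$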
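2. Let $\{a_n\}$ be a sequence in $\mathbb{R}$. If $\overline{\lim}\, |a_n| = 0$, prove that $\lim_{n\to\infty} a_n = 0$.

3. *Let $\{r_n\}$ be an enumeration of the rationals in $(0, 1)$. Find $\overline{\lim}\, r_n$ and $\underline{\lim}\, r_n$.
4. Let $\{s_n\}$ be a sequence in $\mathbb{R}$. If $s \in \mathbb{R}$ satisfies that for every $\epsilon > 0$, there exists $n_o \in \mathbb{N}$ such that $s_n < s + \epsilon$ for all $n \ge n_o$, prove that $\overline{\lim}\, s_n \le s$.
5. a. Prove Theorem 3.5.3(b).
b. Prove Theorem 3.5.3(c).
6. *a. Let $\{a_n\}$ and $\{b_n\}$ be bounded sequences in $\mathbb{R}$. Prove that
\[ \underline{\lim}\, a_n + \underline{\lim}\, b_n \le \underline{\lim}\, (a_n + b_n) \le \underline{\lim}\, a_n + \overline{\lim}\, b_n. \]
b. Give an example to show that equality need not hold in (a).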
7. a. If $a_n$ and $b_n$ are positive for all $n$, prove that
\[ \overline{\lim}\, (a_n b_n) \le (\overline{\lim}\, a_n)(\overline{\lim}\, b_n), \]
provided the product on the right is not of the form $0 \cdot \infty$.
b. Need equality hold in (a)?
8. *Let $s_1 = 0$. For $n \in \mathbb{N}$, $n > 1$, let $s_n$ be defined by
\[ s_{2m} = \frac{s_{2m-1}}{2}, \qquad s_{2m+1} = \frac{1}{2} + s_{2m}. \]
Find $\overline{\lim}\, s_n$ and $\underline{\lim}\, s_n$.
9. Let $a_n > 0$ for all $n$. Prove that $\overline{\lim}\, \sqrt[n]{a_n} \le \overline{\lim}\, \frac{a_{n+1}}{a_n}$.
10. *Suppose $\{a_n\}$, $\{b_n\}$ are sequences of nonnegative real numbers with $\lim_{n\to\infty} b_n = b \neq 0$, and $\overline{\lim}_{n\to\infty}\, a_n = a$. Prove that $\overline{\lim}_{n\to\infty}\, a_n b_n = ab$.

3.6 Cauchy Sequences

In order to apply the definition to prove that a given sequence {pn } converges,
it is required that we know the limit of the sequence {pn }. For this reason,
theorems that provide sufficient conditions for convergence, such as Theorem
3.3.2, are particularly useful. The drawback to Theorem 3.3.2 is that it applies
only to monotone sequences of real numbers. In this section, we consider
another criterion that for sequences in R is sufficient to ensure convergence of
the sequence.

DEFINITION 3.6.1 Let $(X, d)$ be a metric space. A sequence $\{p_n\}_{n=1}^{\infty}$ in $X$ is a Cauchy sequence if for every $\epsilon > 0$, there exists a positive integer $n_o$ such that
\[ d(p_n, p_m) < \epsilon \]
for all integers $n, m \ge n_o$.

Remark. In the above definition, the criterion $d(p_n, p_m) < \epsilon$ for all integers $n, m \ge n_o$ is equivalent to
\[ d(p_{n+k}, p_n) < \epsilon \]
for all $n \ge n_o$ and all $k \in \mathbb{N}$. Thus if $\{p_n\}$ is a Cauchy sequence in $X$,

\[ \lim_{n\to\infty} d(p_{n+k}, p_n) = 0 \]

for every $k \in \mathbb{N}$. The converse, however, is false; namely, if $\{p_n\}$ is a sequence in $\mathbb{R}$ that satisfies $\lim_{n\to\infty} d(p_{n+k}, p_n) = 0$ for every $k \in \mathbb{N}$, this does not imply that the sequence $\{p_n\}$ is a Cauchy sequence (Exercise 4). The hypothesis only implies that for each $k \in \mathbb{N}$, given $\epsilon > 0$, there exists a positive integer $n_o$ such that $d(p_{n+k}, p_n) < \epsilon$ for all $n \ge n_o$.
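
The distinction drawn in the remark can be seen numerically. The sketch below uses one standard candidate, the partial sums $p_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n}$ of Exercise 11 in Section 3.3; choosing this sequence is an assumption of the sketch, not something the text prescribes. For each fixed $k$ the gap $p_{n+k} - p_n$ is at most $\frac{k}{n+1}$, yet the gap $p_{2n} - p_n$ never drops below $\frac{1}{2}$. The function name `harmonic` is hypothetical.

```python
from fractions import Fraction

def harmonic(n):
    """Exact partial sum 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, j) for j in range(1, n + 1))

k = 3
for n in (1, 10, 100, 1000):
    fixed_gap = harmonic(n + k) - harmonic(n)     # tends to 0 as n grows, k fixed
    doubling_gap = harmonic(2 * n) - harmonic(n)  # stays at least 1/2
    print(n, float(fixed_gap), float(doubling_gap))
```

The second column tends to 0 while the third does not, so no single $n_o$ can work for all $k$ at once, which is what the Cauchy condition would require.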

THEOREM 3.6.2 Let $(X, d)$ be a metric space.
(a) Every convergent sequence in $X$ is a Cauchy sequence.
(b) Every Cauchy sequence is bounded.

Proof. (a) Suppose that $\{p_n\}$ converges to $p \in X$. Let $\epsilon > 0$ be given. Then for the given $\epsilon$, there exists a positive integer $n_o$ such that

\[ d(p_n, p) < \tfrac{1}{2}\epsilon \]

for all $n \ge n_o$. Thus by the triangle inequality, if $n, m \ge n_o$,

\[ d(p_n, p_m) \le d(p_n, p) + d(p, p_m) < \tfrac{1}{2}\epsilon + \tfrac{1}{2}\epsilon = \epsilon. \]

(b) Take $\epsilon = 1$. By the definition of Cauchy sequence, there exists $n_o \in \mathbb{N}$ such that $d(p_n, p_m) < 1$ for all $n, m \ge n_o$. Let

\[ M = \max\{1, d(p_1, p_{n_o}), \dots, d(p_{n_o - 1}, p_{n_o})\}. \]

Then for all $n$, $d(p_n, p_{n_o}) \le M$. Thus $\{p_n\}$ is bounded. □

EXAMPLES 3.6.3 (a) Let $X = (0, 1)$ with $d(x, y) = |x - y|$. Consider the
sequence $\{1/n\}_{n=2}^{\infty}$ (the index starts at 2 so that every term lies
in $X$). Let $\epsilon > 0$ be given. Choose an integer $n_o$ such that
$1/n < \epsilon/2$ for all $n \ge n_o$. Then for all $n, m \ge n_o$,
\[ d\left(\tfrac{1}{n}, \tfrac{1}{m}\right) = \left|\frac{1}{n} - \frac{1}{m}\right| \le \frac{1}{n} + \frac{1}{m} < \epsilon. \]
Thus the sequence $\{1/n\}$ is Cauchy but does not converge in $X$. In
$\mathbb{R}$ the sequence converges to 0, but $0 \notin X$. Intuitively, a Cauchy
sequence that fails to converge does so because of the absence of a point or
element in the space to which it can converge.
(b) Let $X = \mathbb{Q}$ with $d(p, q) = |p - q|$. If $\{p_n\}$ is any sequence
of rational numbers that converges to an irrational number, then the sequence
$\{p_n\}$ is a Cauchy sequence in $(\mathbb{Q}, d)$, which however does not
converge in $\mathbb{Q}$. □
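
As a concrete instance of (b), the following sketch generates rational
approximations to $\sqrt{2}$ with the iteration $x \mapsto (x + 2/x)/2$ started
at 2. The choice of this particular iteration and the helper name
`newton_sqrt2` are assumptions made for illustration; any sequence of rationals
converging to an irrational number would serve equally well. Exact rational
arithmetic shows that every term lies in $\mathbb{Q}$ and that consecutive terms
cluster together, while $x_n^2 - 2$ approaches 0 without ever reaching it.

```python
from fractions import Fraction

def newton_sqrt2(terms):
    """First `terms` iterates of x -> (x + 2/x)/2 starting at 2, as exact rationals."""
    x = Fraction(2)
    out = [x]
    for _ in range(terms - 1):
        x = (x + 2 / x) / 2
        out.append(x)
    return out

xs = newton_sqrt2(6)
for a, b in zip(xs, xs[1:]):
    # Every term is rational; consecutive terms get closer and closer.
    print(b, float(abs(b - a)))

# x_n^2 - 2 shrinks toward 0 but is never 0: the limit sqrt(2) is not in Q.
print([float(x * x - 2) for x in xs])
```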

THEOREM 3.6.4 If $\{p_n\}$ is a Cauchy sequence in a metric space $X$ that has a
convergent subsequence, then the sequence $\{p_n\}$ converges.

Proof. Suppose $\{p_{n_k}\}$ is a convergent subsequence of $\{p_n\}$ with
$\lim_{k\to\infty} p_{n_k} = p$. Let $\epsilon > 0$ be given. Since $\{p_n\}$ is
Cauchy, there exists an integer $N_1$ such that
\[ d(p_n, p_m) < \tfrac{1}{2}\epsilon \quad \text{for all } n, m \ge N_1. \]
Since $p_{n_k} \to p$, for the given $\epsilon$, there exists an integer $k_1$
such that
\[ d(p_{n_k}, p) < \tfrac{1}{2}\epsilon \quad \text{for all } k \ge k_1. \]
Let $n_o = \max\{k_1, N_1\}$, and choose $n_k$ such that $k \ge n_o$. Then
$n_k \ge k \ge N_1$. Thus if $n \ge n_o$, by the triangle inequality
\[ d(p_n, p) \le d(p_n, p_{n_k}) + d(p_{n_k}, p) < \tfrac{1}{2}\epsilon + \tfrac{1}{2}\epsilon = \epsilon. \]
Therefore $\lim_{n\to\infty} p_n = p$, which proves the result. □

THEOREM 3.6.5 Every Cauchy sequence of real numbers converges.

Proof. Let $\{p_n\}$ be a Cauchy sequence in $\mathbb{R}$. By Theorem 3.6.2, the
sequence $\{p_n\}$ is bounded. Thus by Corollary 3.4.6, the sequence $\{p_n\}$
has a convergent subsequence. The result now follows by Theorem 3.6.4. □

DEFINITION 3.6.6 A metric space $(X, d)$ is said to be complete if every
Cauchy sequence in $X$ converges to a point in $X$.

As an example, $\mathbb{R}$ with the usual metric is complete. In the exercises
you will be asked to prove that $\mathbb{R}^2$ with the euclidean metric is also
complete.
Additional examples of complete metric spaces will be encountered in subse-
quent chapters. Since the proof of Theorem 3.6.5 used the Bolzano-Weierstrass
theorem, the completeness of R ultimately depends on the least upper bound
property of R. Conversely, if we assume completeness of R, then we can prove
that R satisfies the least upper bound property (Exercise 14). For this rea-
son the least upper bound or supremum property of R is often called the
completeness property of R.

EXAMPLES 3.6.7 (a) For our first example we consider the sequence $\{s_n\}$
where for $n \in \mathbb{N}$
\[ s_n = 1 + \frac{1}{2^2} + \cdots + \frac{1}{n^2}. \]
For $k \in \mathbb{N}$,
\begin{align*}
|s_{n+k} - s_n| &= \frac{1}{(n+1)^2} + \cdots + \frac{1}{(n+k)^2} \\
&\le \left(\frac{1}{n} - \frac{1}{n+1}\right) + \cdots + \left(\frac{1}{n+k-1} - \frac{1}{n+k}\right) \\
&= \frac{1}{n} - \frac{1}{n+k}.
\end{align*}
In the above we have used the inequality
\[ \frac{1}{(n+m)^2} \le \frac{1}{n+m-1} - \frac{1}{n+m} \]
valid for all $n, m \in \mathbb{N}$. Since the sequence $\{\frac{1}{n}\}$
converges, it is a Cauchy sequence. Thus given $\epsilon > 0$ there exists
$n_o \in \mathbb{N}$ such that $|\frac{1}{n} - \frac{1}{n+k}| < \epsilon$ for all
$n \ge n_o$ and all $k \in \mathbb{N}$. Therefore the sequence $\{s_n\}$ is a
Cauchy sequence and hence converges.
(b) In our second example we give an application to illustrate how the
concept of a Cauchy sequence may be used to prove convergence of a given
sequence. Additional applications will be given in the exercises. Let
$a_1, a_2$ be arbitrary real numbers with $a_1 \ne a_2$. For $n \ge 3$, define
$a_n$ inductively by
\[ a_n = \tfrac{1}{2}(a_{n-1} + a_{n-2}). \]
Our first goal is to show that the sequence $\{a_n\}$ is Cauchy. We first note
that
\[ a_{n+1} - a_n = -\tfrac{1}{2}(a_n - a_{n-1}). \]
As a consequence, for $n \ge 2$,
\[ a_{n+1} - a_n = \left(-\tfrac{1}{2}\right)^{n-1}(a_2 - a_1). \tag{4} \]

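This last statement is most easily verified by induction (Exercise 5). For
$m \ge 1$, consider $|a_{n+m} - a_n|$. By the triangle inequality,
\[ |a_{n+m} - a_n| = \left|\sum_{k=0}^{m-1} (a_{n+k+1} - a_{n+k})\right| \le \sum_{k=0}^{m-1} |a_{n+k+1} - a_{n+k}|, \]
which by the above
\[ \le \sum_{k=0}^{m-1} \frac{1}{2^{n+k-1}}\,|a_2 - a_1| = \frac{1}{2^{n-2}}\,|a_2 - a_1| \sum_{k=1}^{m} \frac{1}{2^k}. \]
By Example 1.3.2(a)
\[ \sum_{k=1}^{m} r^k = \frac{r - r^{m+1}}{1 - r}, \quad r \ne 1. \tag{5} \]
Thus with $r = \frac{1}{2}$,
\[ \sum_{k=1}^{m} \frac{1}{2^k} = \frac{\frac{1}{2} - (\frac{1}{2})^{m+1}}{1 - \frac{1}{2}} = 1 - \frac{1}{2^m} < 1. \]
Therefore,
\[ |a_{n+m} - a_n| \le \frac{1}{2^{n-2}}\,|a_2 - a_1| \]
for all $n \ge 2$ and $m \in \mathbb{N}$. Let $\epsilon > 0$ be given. Choose
$n_o$ such that $|a_2 - a_1|/2^{n-2} < \epsilon$ for all $n \ge n_o$. Then by the
above,
\[ |a_{n+m} - a_n| < \epsilon \]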

for all $m \in \mathbb{N}$, $n \ge n_o$. This however is just another way of
stating that
\[ |a_n - a_m| < \epsilon \quad \text{for all } m, n \ge n_o. \]
Therefore the sequence $\{a_n\}$ is a Cauchy sequence in $\mathbb{R}$, and thus
by Theorem 3.6.5,
\[ a = \lim_{n\to\infty} a_n \]
exists in $\mathbb{R}$.
Can we find the limit $a$ here? If we take the same approach as in Example
3.3.4(c) and take the limit of both sides of the defining recurrence
$a_n = \frac{1}{2}(a_{n-1} + a_{n-2})$, we only get $a = a$.
To find the value of $a$, let us observe that

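\begin{align*}
a_{n+1} - a_1 &= (a_{n+1} - a_n) + (a_n - a_{n-1}) + \cdots + (a_2 - a_1) \\
&= \sum_{k=1}^{n} (a_{k+1} - a_k),
\end{align*}
then use (4) to get
\begin{align*}
a_{n+1} - a_1 &= (a_2 - a_1) \sum_{k=1}^{n} \left(-\tfrac{1}{2}\right)^{k-1} \\
&= \tfrac{2}{3}(a_2 - a_1)\left[1 - \left(-\tfrac{1}{2}\right)^{n}\right].
\end{align*}
The last equality follows from formula (5). Since $a_{n+1} \to a$ and
$(-\frac{1}{2})^n \to 0$, upon taking the limit of both sides we obtain
\[ a - a_1 = \tfrac{2}{3}(a_2 - a_1) \quad \text{or} \quad a = a_1 + \tfrac{2}{3}(a_2 - a_1). \] □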
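
The conclusions of this example are easy to check by machine. The following
sketch, with the hypothetical helper `averaged_sequence` and the arbitrary
starting values $a_1 = 0$, $a_2 = 1$, builds the sequence in exact rational
arithmetic, verifies identity (4) term by term, and compares a late term with
the limit $a_1 + \frac{2}{3}(a_2 - a_1)$.

```python
from fractions import Fraction

def averaged_sequence(a1, a2, n_terms):
    """Return [a_1, ..., a_{n_terms}] with a_n = (a_{n-1} + a_{n-2}) / 2 for n >= 3."""
    a = [Fraction(a1), Fraction(a2)]
    while len(a) < n_terms:
        a.append((a[-1] + a[-2]) / 2)
    return a[:n_terms]

a1, a2 = 0, 1                      # arbitrary starting values with a1 != a2
seq = averaged_sequence(a1, a2, 30)
limit = Fraction(a1) + Fraction(2, 3) * (a2 - a1)

# Identity (4): a_{n+1} - a_n = (-1/2)^(n-1) (a2 - a1); list index i holds a_{i+1}.
assert all(seq[n] - seq[n - 1] == Fraction(-1, 2) ** (n - 1) * (a2 - a1)
           for n in range(1, 30))

print(float(seq[-1]), float(limit))
```

Because the arithmetic is exact, the assertion confirms (4) for the first 29
differences rather than merely approximating it.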

Contractive Sequences
One of the key properties of the sequence $\{a_n\}$ of the previous example was
that
\[ |a_{n+1} - a_n| \le \tfrac{1}{2}\,|a_n - a_{n-1}| \]
for all $n \ge 2$. This property was used to show that the sequence $\{a_n\}$
was a Cauchy sequence and thus converged. Sequences that satisfy a criterion
such as the above are commonly referred to as contractive sequences. We make
this precise in the following definition.

DEFINITION 3.6.8 A sequence $\{p_n\}$ in a metric space $(X, d)$ is
contractive if there exists a real number $b$, $0 < b < 1$, such that
\[ d(p_{n+1}, p_n) \le b\, d(p_n, p_{n-1}) \]
for all $n \in \mathbb{N}$, $n \ge 2$.

If $\{p_n\}$ is a contractive sequence, then an argument similar to the one used
in the previous example shows that
\[ d(p_{n+1}, p_n) \le b^{n-1} d(p_2, p_1) \]
for all $n \ge 1$, and that
\[ d(p_{n+m}, p_n) \le b^{n-1} d(p_2, p_1)(1 + b + \cdots + b^{m-1}) < \frac{b^{n-1}}{1-b}\, d(p_2, p_1) \]
for all n, m ∈ N. As a consequence, every contractive sequence is a Cauchy
sequence. Therefore, if (X, d) is a complete metric space, every contractive
sequence in X converges to a point in X. We summarize this in the following
theorem.
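
For the reader who wishes to see the estimates written out, the two
inequalities above follow from the same telescoping argument as in Example
3.6.7(b), with $\frac{1}{2}$ replaced by $b$. Repeated use of the contractive
condition gives
\[ d(p_{n+1}, p_n) \le b\, d(p_n, p_{n-1}) \le b^2 d(p_{n-1}, p_{n-2}) \le \cdots \le b^{n-1} d(p_2, p_1), \]
and then the triangle inequality together with the geometric sum yields
\[ d(p_{n+m}, p_n) \le \sum_{k=0}^{m-1} d(p_{n+k+1}, p_{n+k}) \le \sum_{k=0}^{m-1} b^{n+k-1} d(p_2, p_1) = b^{n-1} d(p_2, p_1)\, \frac{1 - b^m}{1 - b} \le \frac{b^{n-1}}{1-b}\, d(p_2, p_1), \]
where the last inequality is strict whenever $p_1 \ne p_2$. Since
$b^{n-1} \to 0$, the right side can be made smaller than any given
$\epsilon > 0$ by taking $n$ sufficiently large, independently of $m$.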

THEOREM 3.6.9 Let $(X, d)$ be a complete metric space. Then every
contractive sequence in $X$ converges in $X$. Furthermore, if the sequence
$\{p_n\}$ is contractive and $p = \lim p_n$, then

(a) $d(p, p_n) \le \dfrac{b^{n-1}}{1-b}\, d(p_2, p_1)$, and
(b) $d(p, p_n) \le \dfrac{b}{1-b}\, d(p_n, p_{n-1})$,

where $0 < b < 1$ is the constant in Definition 3.6.8.

Proof. We leave the details of the proof to the exercises (Exercise 9). □
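
The two estimates in Theorem 3.6.9 can be observed numerically. The sketch
below is only an illustration under stated assumptions: it uses the map
$T(x) = \frac{1}{2}\cos x$ (not taken from the text), for which the mean value
theorem gives $|T(x) - T(y)| \le \frac{1}{2}|x - y|$, so that the sequence
$p_{n+1} = T(p_n)$ is contractive with $b = \frac{1}{2}$. The limit is
approximated by the 200th iterate, and the actual error is printed next to the
bounds (a) and (b).

```python
import math

B = 0.5  # contraction constant: |T(x) - T(y)| <= B |x - y| because |sin| <= 1

def T(x):
    """Hypothetical map generating the sequence p_{n+1} = T(p_n)."""
    return B * math.cos(x)

def iterate(p1, n_terms):
    """Return [p_1, ..., p_{n_terms}]."""
    seq = [p1]
    while len(seq) < n_terms:
        seq.append(T(seq[-1]))
    return seq

seq = iterate(1.0, 200)
p = seq[-1]                      # numerical stand-in for the limit
d21 = abs(seq[1] - seq[0])       # d(p_2, p_1)

for n in range(2, 16):           # n is the 1-based index of p_n
    err = abs(p - seq[n - 1])
    a_priori = B ** (n - 1) / (1 - B) * d21                      # bound (a)
    a_posteriori = B / (1 - B) * abs(seq[n - 1] - seq[n - 2])    # bound (b)
    print(n, err, a_priori, a_posteriori)
```

Bound (a) can be evaluated before any iterate beyond $p_2$ is computed, whereas
bound (b) uses the two most recent iterates and is typically the sharper of the
two.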

Exercises 3.6
1. If {an } and {bn } are Cauchy sequences in R, prove (without using The-
orem 3.6.5) that {an + bn } and {an bn } are also Cauchy sequences.
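2. For each of the following determine whether the given sequence is a
Cauchy sequence.
*a. $\left\{\dfrac{n+1}{n}\right\}$   b. $\{(-1)^n\}$   c. $\left\{n + \dfrac{(-1)^n}{n}\right\}$
*d. $\left\{\dfrac{1 + (-1)^n n}{n^2 + 3}\right\}$   e. $\left\{\dfrac{1 + (-1)^n n^2}{2n^2 + 3}\right\}$   f. $\left\{\left(1 + \dfrac{1}{\sqrt{n}}\right)^{n}\right\}$
3. For $n \in \mathbb{N}$ let $s_n = 1 + \dfrac{1}{2!} + \dfrac{1}{3!} + \cdots + \dfrac{1}{n!}$.
Prove that $\{s_n\}$ is a Cauchy sequence.
4. Consider the sequence $\{s_n\}$ defined by $s_n = 1 + \dfrac{1}{2} + \cdots + \dfrac{1}{n}$.
*a. Show that $\{s_n\}$ is not a Cauchy sequence.
b. Even though $\{s_n\}$ is not a Cauchy sequence, show that
$\lim_{n\to\infty} |s_{n+k} - s_n| = 0$ for all $k \in \mathbb{N}$.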

5. Use mathematical induction to prove Identity (4).



6. If K is a compact subset of a metric space (X, d), prove that every Cauchy
sequence in K converges to a point in K.
7. Prove that $(\mathbb{R}^2, d_2)$ is complete.
8. Let $\{a_n\}$ be the sequence of Example 3.6.7(b).
a. Use mathematical induction to prove that
\[ a_{2k+1} = \frac{1}{2^{2k-1}}(a_1 + a_2) + \frac{1}{3}(a_1 + 2a_2)\left(1 - \frac{1}{4^{k-1}}\right). \]
b. Use the result of (a) to find $\lim a_n$.
9. Prove Theorem 3.6.9.
10. *Let $a_1 > 0$, and for $n \ge 2$, define $a_n = (2 + a_{n-1})^{-1}$. Prove
that $\{a_n\}$ is contractive, and find $\lim_{n\to\infty} a_n$.
11. Let $c_1 \in (0, 1)$ be arbitrary, and for $n \in \mathbb{N}$ set
$c_{n+1} = \frac{1}{5}(c_n^2 + 2)$.
a. Show that $\{c_n\}$ is contractive.
b. Let $c = \lim_{n\to\infty} c_n$. Show that $c$ is a solution of
$x^2 - 5x + 2 = 0$.
c. Let $c_1 = \frac{1}{2}$. Using the result of Theorem 3.6.9, determine the
value of $n$ such that $|c_n - c| < 10^{-3}$.
12. Consider the polynomial $p(x) = x^3 + 5x - 1$. It can be shown that $p(x)$
has exactly one root in the open interval $(0, 1)$. Let $a_1 \in (0, 1)$ be
arbitrary, and for $n \ge 1$, set $a_{n+1} = \frac{1}{5}(1 - a_n^3)$.
a. Prove that the sequence $\{a_n\}$ is contractive.
b. Show that if $a = \lim_{n\to\infty} a_n$, then $p(a) = 0$.
c. Let $a_1 = \frac{1}{2}$. Using the result of Theorem 3.6.9(b), determine the
value of $n$ such that $|a_n - a| < 10^{-4}$.
13. Let $a_1 \ne a_2$ be real numbers, and let $0 < b < 1$. For $n \ge 3$, set
$a_n = b\, a_{n-1} + (1 - b)\, a_{n-2}$.
a. Show that the sequence $\{a_n\}$ is contractive.
*b. Find $\lim_{n\to\infty} a_n$.

14. Prove that if every Cauchy sequence in R converges, then every nonempty
subset of R that is bounded above has a supremum.

3.7 Series of Real Numbers


In this section, we will give a brief introduction to series of real numbers.
Some knowledge of series, especially series with nonnegative terms, will be
required in Chapter 4. The topic of series in general, including various
convergence tests, alternating series, etc., will be treated in much greater
detail in Chapter 7. We begin with some preliminary notation. If
$\{a_n\}_{n=1}^{\infty}$ is a sequence in $\mathbb{R}$ and if
$p, q \in \mathbb{N}$ with $p \le q$, set
\[ \sum_{k=p}^{q} a_k = a_p + a_{p+1} + \cdots + a_q. \]

DEFINITION 3.7.1 Let $\{a_n\}_{n=1}^{\infty}$ be a sequence of real numbers.
Let $\{s_n\}_{n=1}^{\infty}$ be the sequence obtained from $\{a_n\}$, where for
each $n \in \mathbb{N}$, $s_n = \sum_{k=1}^{n} a_k$. The sequence $\{s_n\}$ is
called an infinite series, or series, and is denoted either as
\[ \sum_{k=1}^{\infty} a_k \quad \text{or as} \quad a_1 + a_2 + \cdots + a_n + \cdots. \]
For each $n \in \mathbb{N}$, $s_n$ is called the nth partial sum of the series
and $a_n$ is called the nth term of the series.

The series $\sum_{k=1}^{\infty} a_k$ converges if and only if the sequence
$\{s_n\}$ of nth partial sums converges in $\mathbb{R}$. If
$\lim_{n\to\infty} s_n = s$, then $s$ is called the sum of the series, and we
write
\[ s = \sum_{k=1}^{\infty} a_k. \]
If the sequence $\{s_n\}$ diverges, then the series $\sum_{k=1}^{\infty} a_k$ is
said to diverge.

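EXAMPLES 3.7.2 (a) For $|r| < 1$, consider the geometric series
\[ \sum_{k=1}^{\infty} r^k. \]
For $n \in \mathbb{N}$,
\[ s_n = \sum_{k=1}^{n} r^k = r + r^2 + \cdots + r^n. \]
Thus
\[ (1 - r)\, s_n = s_n - r\, s_n = r - r^{n+1}, \]
and as a consequence
\[ s_n = \frac{r - r^{n+1}}{1 - r}. \]
Since $|r| < 1$, by Theorem 3.2.6(e), $\lim_{n\to\infty} r^n = 0$. Therefore
$\lim_{n\to\infty} s_n = r/(1 - r)$, and thus
\[ \sum_{k=1}^{\infty} r^k = \frac{r}{1 - r}, \quad |r| < 1. \]
For $|r| \ge 1$ the series $\sum_{k=1}^{\infty} r^k$ diverges (Exercise 3).

(b) Consider the series $\sum_{k=1}^{\infty} a_k$, where for each
$k \in \mathbb{N}$, $a_k = \left(\dfrac{1}{k} - \dfrac{1}{k+1}\right)$. Then
\begin{align*}
s_n = \sum_{k=1}^{n} a_k &= \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \cdots + \left(\frac{1}{n} - \frac{1}{n+1}\right) \\
&= 1 - \frac{1}{n+1}.
\end{align*}
Thus $\lim_{n\to\infty} s_n = 1$ and hence $\sum_{k=1}^{\infty} a_k = 1$.

(c) Consider $\sum_{k=1}^{\infty} (-1)^k$. Then
\[ s_n = \sum_{k=1}^{n} (-1)^k = \begin{cases} 0, & \text{if } n \text{ is even}, \\ -1, & \text{if } n \text{ is odd}. \end{cases} \]
Thus since $\{s_n\}$ diverges, the series diverges. □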
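
The closed forms obtained in these three examples can be confirmed with a
short computation. The sketch below uses exact rational arithmetic; the function
names and the value $r = \frac{1}{3}$ are arbitrary choices made for
illustration.

```python
from fractions import Fraction

def geometric_partial_sum(r, n):
    """s_n = r + r^2 + ... + r^n, summed term by term."""
    return sum(r ** k for k in range(1, n + 1))

def telescoping_partial_sum(n):
    """s_n for a_k = 1/k - 1/(k+1)."""
    return sum(Fraction(1, k) - Fraction(1, k + 1) for k in range(1, n + 1))

r = Fraction(1, 3)               # any r with |r| < 1; 1/3 is an arbitrary choice
for n in (1, 5, 20):
    s_n = geometric_partial_sum(r, n)
    assert s_n == (r - r ** (n + 1)) / (1 - r)    # closed form from (a)
    print(n, float(s_n), float(r / (1 - r)))

assert telescoping_partial_sum(50) == 1 - Fraction(1, 51)   # closed form from (b)

# (c): the partial sums of (-1)^k alternate between -1 and 0 and never settle.
print([sum((-1) ** k for k in range(1, n + 1)) for n in range(1, 9)])
```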

The Cauchy Criterion


The following criterion, which provides necessary and sufficient conditions for
the convergence of a series, was formulated by Augustin-Louis Cauchy
(1789–1857) in 1821.

THEOREM 3.7.3 (Cauchy Criterion) The series $\sum_{k=1}^{\infty} a_k$
converges if and only if given $\epsilon > 0$, there exists a positive integer
$n_o$ such that
\[ \left|\sum_{k=n+1}^{m} a_k\right| < \epsilon \]
for all $m > n \ge n_o$.

Proof. Since
\[ \left|\sum_{k=n+1}^{m} a_k\right| = |s_m - s_n|, \]
the result is an immediate consequence of Theorems 3.6.2 and 3.6.5. □


Remark. The previous theorem simply states that the series $\sum a_k$ converges
if and only if the sequence $\{s_n\}$ of nth partial sums is a Cauchy sequence.

EXAMPLE 3.7.4 In this example, we show that the series
$\sum_{k=1}^{\infty} 1/k$ diverges. We accomplish this by showing that the
sequence $\{s_n\}$ of partial sums is not a Cauchy sequence. Consider
\[ s_{2n} - s_n = \frac{1}{n+1} + \cdots + \frac{1}{2n}, \quad n \in \mathbb{N}. \]
There are exactly $n$ terms in the sum on the right, and each term is greater
than or equal to $1/(2n)$. Therefore
\[ s_{2n} - s_n \ge n\left(\frac{1}{2n}\right) = \frac{1}{2}. \]
The sequence $\{s_n\}$ therefore fails to be a Cauchy sequence and thus the
series diverges. The divergence of this series appears to have been first
established by Nicole Oresme (1323?–1382) using a method of proof similar to
that suggested in the solution of Exercise 11 of Section 3.3. □
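
The key estimate $s_{2n} - s_n \ge \frac{1}{2}$ is easy to observe directly. The
following sketch, with the hypothetical helper `harmonic`, computes the block
$\frac{1}{n+1} + \cdots + \frac{1}{2n}$ exactly for a few values of $n$; the
printed values stay above $\frac{1}{2}$ no matter how large $n$ is taken.

```python
from fractions import Fraction

def harmonic(n):
    """n-th partial sum s_n = 1 + 1/2 + ... + 1/n, as an exact rational."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

for n in (1, 10, 100, 1000):
    block = harmonic(2 * n) - harmonic(n)     # 1/(n+1) + ... + 1/(2n)
    assert block >= Fraction(1, 2)
    print(n, float(block))
```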

COROLLARY 3.7.5 If $\sum_{k=1}^{\infty} a_k$ converges, then
$\lim_{k\to\infty} a_k = 0$.

Proof. Since $a_k = s_k - s_{k-1}$, this is an immediate consequence of the
Cauchy criterion. □

Remark. The condition $\lim_{k\to\infty} a_k = 0$ is not sufficient for the
convergence of $\sum a_k$. For example, the series $\sum \frac{1}{k}$ diverges,
yet $\lim_{k\to\infty} \frac{1}{k} = 0$.

THEOREM 3.7.6 Suppose $a_k \ge 0$ for all $k \in \mathbb{N}$. Then
$\sum_{k=1}^{\infty} a_k$ converges if and only if $\{s_n\}$ is bounded above.

Proof. Since $a_k \ge 0$ for all $k$, the sequence $\{s_n\}$ is monotone
increasing. Thus by Theorem 3.3.2, the sequence $\{s_n\}$ converges if and only
if it is bounded above. □
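
As an illustration of Theorem 3.7.6, consider the series with nonnegative terms
$a_k = 1/k^2$. The inequality $\frac{1}{k^2} \le \frac{1}{k-1} - \frac{1}{k}$
for $k \ge 2$ (used in Example 3.6.7(a)) telescopes to give
$s_n \le 2 - \frac{1}{n}$, so the partial sums are bounded above by 2. The
sketch below, which is only a numerical check and not a proof, verifies this
bound exactly for several values of $n$.

```python
from fractions import Fraction

def s(n):
    """Partial sum 1 + 1/2^2 + ... + 1/n^2 as an exact rational."""
    return sum(Fraction(1, k * k) for k in range(1, n + 1))

for n in (1, 2, 10, 100, 500):
    # Telescoping bound: 1/k^2 <= 1/(k-1) - 1/k for k >= 2 gives s_n <= 2 - 1/n.
    assert s(n) <= 2 - Fraction(1, n)
    print(n, float(s(n)))
```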

Exercises 3.7
1. *Using the inequality $\dfrac{1}{k^2} \le \dfrac{1}{k(k-1)} = \dfrac{1}{k-1} - \dfrac{1}{k}$
(valid for $k \ge 2$), prove that the series $\sum_{k=1}^{\infty} \dfrac{1}{k^2}$
converges.
2. Prove that the series $\sum_{k=1}^{\infty} \dfrac{1}{k^2 + k}$ converges.
3. If $|r| \ge 1$, show that the series $\sum_{k=1}^{\infty} r^k$ diverges.
4. Prove that the series $\sum_{k=1}^{\infty} \dfrac{1}{k!}$ converges. (See
Exercise 3 of Section 3.6.)
5. *Suppose $a_k \ge 0$ for all $k$. Prove that if $\sum_{k=1}^{\infty} a_k$
converges, then $\sum_{k=1}^{\infty} \dfrac{\sqrt{a_k}}{k}$ converges.
6. If $\sum_{k=1}^{\infty} a_k$ and $\sum_{k=1}^{\infty} b_k$ both converge, prove
each of the following:
a. $\sum_{k=1}^{\infty} c\,a_k$ converges for all $c \in \mathbb{R}$.
b. $\sum_{k=1}^{\infty} (a_k + b_k)$ converges.
7. If $\sum_{k=1}^{\infty} (a_k + b_k)$ converges, does this imply that the series
$a_1 + b_1 + a_2 + b_2 + \cdots$ converges?
8. Suppose $b_k \ge a_k \ge 0$ for all $k \in \mathbb{N}$.
a. If $\sum_{k=1}^{\infty} b_k$ converges, prove that $\sum_{k=1}^{\infty} a_k$
converges.
b. If $\sum_{k=1}^{\infty} a_k$ diverges, prove that $\sum_{k=1}^{\infty} b_k$
diverges.
9. Consider the series $\sum_{k=1}^{\infty} \dfrac{1}{k^p}$, $p \in \mathbb{R}$.
a. Prove that the series diverges for all $p \le 1$.
b. Prove that the series converges for all $p > 1$.

Notes
This chapter provided our first serious introduction to the limit process. In subse-
quent chapters we will encounter limits of functions, the derivative, and the integral,
all of which are further examples of the limit process. Of the many results proved
in this chapter, it is difficult to select one or two for special emphasis. They are all
important! Many of them will be encountered again—either directly or indirectly—
throughout the text.
Some of the concepts and results of this chapter have certainly been encountered
previously; others undoubtedly are new. Two concepts which may not have been
previously encountered are limit superior (inferior) of a sequence of real numbers and
complete metric spaces. The primary importance of the limit superior and inferior
of a sequence is that these two limit operations always exist in R ∪ {−∞, ∞}. As we
will see in Chapter 7, this will allow us to present the correct statements of the root
and ratio test for convergence of a series. The limit superior will also be required to
define the radius of convergence of a power series. There will be other instances in
the text where these two limit operations will be encountered.
In the chapter we have proved several important consequences of the least up-
per bound property of R. The least upper bound property was used to prove that
every bounded monotone sequence converges. This result was subsequently used to
prove the nested intervals property, which in turn can be used to provide a proof
of the Bolzano-Weierstrass theorem. The nested intervals property can also be used
to prove the supremum property of R (Exercise 21 of Section 3.3). Another prop-
erty of the real numbers that is equivalent to the least upper bound property is the
completeness property of R; namely, every Cauchy sequence of real numbers con-
verges. Other consequences of the least upper bound property will be encountered
in subsequent chapters.
Cauchy sequences were originally studied by Cantor in the middle of the nine-
teenth century. He referred to them as “fundamental sequences” and used them in
his construction of the real number system R (See Miscellaneous Exercises 4 – 11).
The main reason that these sequences are attributed to Cauchy, rather than Cantor,
is due to the fact that his 1821 criterion for convergence of a series (Theorem 3.7.3) is
equivalent to the statement that the sequence of partial sums is a Cauchy sequence.
The fact that Cauchy was a more prominent mathematician than Cantor may also
have been a factor.

Miscellaneous Exercises
The first three exercises involve the concept of an infinite product. Let
$\{a_k\}$ be a sequence of nonzero real numbers. For each $n = 1, 2, \ldots$,
define
\[ p_n = \prod_{k=1}^{n} a_k = a_1 \cdot a_2 \cdots a_n. \]
If $p = \lim_{n\to\infty} p_n$ exists, then $p$ is the infinite product of the
sequence $\{a_k\}_{k=1}^{\infty}$, and we write
\[ p = \prod_{k=1}^{\infty} a_k. \]
If the limit does not exist, then the infinite product is said to diverge.
Remark. Some authors require that $p \ne 0$. We will not make this requirement;
rather we will specify $p \ne 0$ if this hypothesis is required in a result.

1. Determine whether each of the following infinite products converges. If it
converges, find the infinite product.
a. $\prod_{k=1}^{\infty} (-1)^k$.   b. $\prod_{k=2}^{\infty} \left(1 - \dfrac{1}{k}\right)$.   c. $\prod_{k=2}^{\infty} \left(1 - \dfrac{1}{k^2}\right)$.
2. If $\prod_{k=1}^{\infty} a_k = p$ with $p \ne 0$, prove that
$\lim_{k\to\infty} a_k = 1$.
3. If $a_n \ge 0$ for all $n \in \mathbb{N}$, prove that
$\prod_{k=1}^{\infty} (1 + a_k)$ converges if and only if
$\sum_{k=1}^{\infty} a_k$ converges.
To prove the result, establish the following inequality:
\[ a_1 + \cdots + a_n \le (1 + a_1) \cdots (1 + a_n) \le e^{a_1 + \cdots + a_n}. \]
Construction of the Real Numbers
In the following exercises we outline the construction of the real num-
ber system from the rational number system using Cantor’s method of
Cauchy sequences. Let Q denote the set of rational numbers. A sequence
{an } in Q is Cauchy if for every r ∈ Q, r > 0, there exists a positive
integer no such that |an − am | < r for all n, m ≥ no . A sequence {an } in
Q is called a null sequence if for every r ∈ Q, r > 0, there exists a pos-
itive integer no such that |an | < r for all n ≥ no . Two Cauchy sequences
{an } and {bn } in Q are said to be equivalent, denoted {an } ∼ {bn },
provided {an − bn } is a null sequence.
4. Let {an }, {bn }, {cn }, and {dn } be Cauchy sequences in Q. Prove the
following:
a. {an } ∼ {an }.
b. If {an } ∼ {bn }, then {bn } ∼ {an }.
c. If {an } ∼ {bn } and {bn } ∼ {cn }, then {an } ∼ {cn }.
d. If {an } ∼ {bn }, then {−an } ∼ {−bn }.
e. If {an } ∼ {cn } and {bn } ∼ {dn }, then
{an + bn } ∼ {cn + dn } and {an bn } ∼ {cn dn }.

Given a Cauchy sequence {an } in Q, let [{an }] denote the set of all
Cauchy sequences in Q equivalent to {an } . The set [{an }] is called the
equivalence class determined by {an }.
5. Given two Cauchy sequences {an } and {bn } in Q, prove that [{an }] =
[{bn }] provided {an } ∼ {bn }, and [{an }] ∩ [{bn }] = ∅ otherwise.
Let $\mathcal{R}$ (a script letter, to distinguish it from the real numbers
$\mathbb{R}$) denote the set of equivalence classes of Cauchy sequences in
$\mathbb{Q}$. We denote the elements of $\mathcal{R}$ by lower case Greek
letters $\alpha, \beta, \gamma, \ldots$. Thus if $\alpha \in \mathcal{R}$,
$\alpha = [\{a_n\}]$ for some Cauchy sequence $\{a_n\}$ in $\mathbb{Q}$. The
sequence $\{a_n\}$ is called a representative of the equivalence class
$\alpha$. Suppose $\alpha = [\{a_n\}]$ and $\beta = [\{b_n\}]$. Define
$-\alpha$, $\alpha + \beta$, and $\alpha \cdot \beta$ as follows:
\begin{align*}
-\alpha &= [\{-a_n\}], \\
\alpha + \beta &= [\{a_n + b_n\}], \\
\alpha \cdot \beta &= [\{a_n b_n\}].
\end{align*}
One needs to show that these operations are well defined; that is,
independent of the representative of the equivalence class. For example, to
prove that $-\alpha$ is well defined, we suppose that $\{a_n\}$ and $\{b_n\}$
are two representatives of $\alpha$; i.e., $\{a_n\} \sim \{b_n\}$. But then by
4(d), $\{-a_n\} \sim \{-b_n\}$. Therefore, $[\{-a_n\}] = [\{-b_n\}]$. This shows
that $-\alpha$ is well defined.
6. Prove that the operations $+$ and $\cdot$ are well defined on the set
$\mathcal{R}$ of equivalence classes.

For each $p \in \mathbb{Q}$, let $\{p\}$ denote the sequence all of whose terms
are equal to $p$. If $p \in \mathbb{Q}$, let $\alpha_p = [\{p\}]$. Also, we set
\[ \theta = [\{0\}], \quad \iota = [\{1\}]. \]
As we will see, the element $\theta$ will be the zero of $\mathcal{R}$ and
$\iota$ will be the unit of $\mathcal{R}$. A Cauchy sequence $\{b_n\}$ in
$\mathbb{Q}$ belongs to $\theta$ if and only if $b_n \to 0$. Similarly,
$\{a_n\} \in \iota$ if and only if $(a_n - 1) \to 0$. The following problem
provides us with the multiplicative inverse of $\alpha \ne \theta$.
7. If $\alpha \ne \theta$, prove that there exists $\{a_n\} \in \alpha$ such that
$a_n \ne 0$ for all $n \in \mathbb{N}$, and that $\{\frac{1}{a_n}\}$ is a Cauchy
sequence. Define $\alpha^{-1} = [\{\frac{1}{a_n}\}]$.
8. Prove that $\mathcal{R}$ with operations $+$ and $\cdot$ is a field.

We now proceed to define an order relation on the set $\mathcal{R}$ of
equivalence classes. A Cauchy sequence $\{a_n\}$ in $\mathbb{Q}$ is positive if
there exists $r \in \mathbb{Q}$, $r > 0$, and $n_o \in \mathbb{N}$ such that
$a_n > r$ for all $n \ge n_o$. Let $P$ be defined by
\[ P = \{[\{a_n\}] : \{a_n\} \text{ is a positive Cauchy sequence}\}. \]
9. Prove that the set $P$ satisfies the order properties (O1) and (O2) of
Section 1.4.
10. Show that the mapping $p \to \alpha_p$ is a one-to-one mapping of
$\mathbb{Q}$ into $\mathcal{R}$ which satisfies
\begin{align*}
\alpha_p + \alpha_q &= \alpha_{p+q}, \\
\alpha_p \cdot \alpha_q &= \alpha_{pq}
\end{align*}
for all $p, q \in \mathbb{Q}$. Furthermore, if $p > 0$, then $\alpha_p \in P$.
11. Prove that every nonempty subset of $\mathcal{R}$ which is bounded above has
a least upper bound in $\mathcal{R}$.

The above exercises prove that $\mathcal{R}$ is an ordered field which satisfies
the least upper bound property. One can show that any two complete ordered
fields are in fact isomorphic, that is, there exists a one-to-one map of one
onto the other which preserves the operations of addition, multiplication, and
the order properties. Thus the field $\mathcal{R}$ constructed above is
isomorphic to the real numbers $\mathbb{R}$.
4 Limits and Continuity

The concept of limit dates back to the late seventeenth century and the work
of Isaac Newton (1642–1727) and Gottfried Leibniz (1646–1716). Both of
these mathematicians are given historical credit for inventing the differential
and integral calculus. Although the idea of “limit” occurs in Newton’s work
Philosophia Naturalis Principia Mathematica of 1687, he never expressed the
concept algebraically; rather he used the phrase “ultimate ratios of evanescent
quantities” to describe the limit process involved in computing the derivatives
of functions.
The subject of limits lacked mathematical rigor until 1821 when Augustin-
Louis Cauchy (1789–1857) published his Cours d’Analyse in which he offered
the following definition of limit: “If the successive values attributed to the
same variable approach indefinitely a fixed value, such that finally they differ
from it by as little as desired, this latter is called the limit of all the others.”
Even this statement does not resemble the modern delta-epsilon version of
limit given in Section 1. Although Cauchy gave a strictly verbal definition of
limit, he did use epsilons, deltas, and inequalities in his proofs. For this reason
Cauchy is credited for putting calculus on the rigorous basis with which we
are familiar today.
Based on the previous study of calculus, the student should have an in-
tuitive notion of what it means for a function to be continuous. This most
likely compares to how mathematicians of the eighteenth century perceived
a continuous function; namely one that can be expressed by a single formula
or equation involving a variable x. Mathematicians of this period certainly
accepted functions that failed to be continuous at a finite number of points.
However, even they might have difficulty envisaging a function that is contin-
uous at every irrational number and discontinuous at every rational number
in its domain. Such a function is given in Example 4.2.2(g). An example of an
increasing function having the same properties will also be given in Section 4
of this chapter.
In Section 1 we define the limit at a point of a real-valued function defined
on a subset of a metric space, and provide numerous examples to illustrate this
idea. In Sections 2 and 3 we consider the closely related theory of continuity
and investigate some of the consequences of this very important concept.


4.1 Limit of a Function


The basic idea underlying the concept of the limit of a function $f$ at a point
$p$ is to study the behavior of $f$ at points close to, but not equal to, $p$.
We illustrate this with the following simple examples. Suppose that the
velocity $v$ (ft/sec) of a falling object is given as a function $v = v(t)$ of
time $t$. If the object hits the ground at time $t = 2$, then $v(2) = 0$. Thus
to find the velocity at the time of impact, we investigate the behavior of
$v(t)$ as $t$ approaches 2, but is not equal to 2. Neglecting air resistance,
the function $v(t)$ is given as follows:
\[ v(t) = \begin{cases} -32t, & 0 \le t < 2, \\ 0, & t \ge 2. \end{cases} \]
Our intuition should convince us that $v(t)$ approaches $-64$ ft/sec as $t$
approaches 2 from below, and that this is the velocity upon impact.
As another example, consider the function $f(x) = x \sin \frac{1}{x}$,
$x \ne 0$. Here the function $f$ is not defined at $x = 0$. Thus to investigate
the behavior of $f$ at 0 we need to consider the values $f(x)$ for $x$ close
to, but not equal to 0. Since
\[ |f(x)| = \left|x \sin \tfrac{1}{x}\right| \le |x| \]
for all $x \ne 0$, our intuition again should tell us that $f(x)$ approaches 0
as $x$ approaches 0. This indeed is the case as will be shown in Example
4.1.10(c).
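
The bound $|f(x)| \le |x|$ can be watched in action with a few sample points.
The sketch below is only a numerical illustration (the sample points are
arbitrary) and is not a substitute for the proof referred to above.

```python
import math

def f(x):
    """f(x) = x sin(1/x), defined for x != 0."""
    return x * math.sin(1.0 / x)

for x in (0.1, -0.01, 1e-3, -1e-5, 1e-8):
    assert abs(f(x)) <= abs(x)    # the squeeze bound |f(x)| <= |x|
    print(x, f(x))
```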
We now make this idea of f (x) approaching a value L as x approaches a
point p precise. In order that the definition be meaningful, we must require
that the point p be a limit point of the domain of the function f .

DEFINITION 4.1.1 Let $(X, d)$ be a metric space, $E$ be a subset of $X$ and $f$
a real-valued function with domain $E$. Suppose that $p$ is a limit point of
$E$. The function $f$ has a limit at $p$ if there exists a number
$L \in \mathbb{R}$ such that given any $\epsilon > 0$, there exists a
$\delta > 0$ for which
\[ |f(x) - L| < \epsilon \]
for all points $x \in E$ satisfying $0 < d(x, p) < \delta$. If this is the case,
we write
\[ \lim_{x\to p} f(x) = L \quad \text{or} \quad f(x) \to L \text{ as } x \to p. \]

Although we restricted our consideration to the case where $f$ is a
real-valued function, we could just as easily have considered the case where
$f$ has values in a metric space $(Y, \rho)$. The extension to $f : E \to Y$ is
obtained by replacing $|f(x) - L|$ with $\rho(f(x), L)$, where in this case $L$
is an element of $Y$.
The definition of the limit of a function can also be stated in terms of
$\epsilon$ and $\delta$ neighborhoods as follows: If $E \subset X$,
$f : E \to \mathbb{R}$, and $p$ is a limit point of $E$, then
\[ \lim_{x\to p} f(x) = L \]
if and only if given $\epsilon > 0$, there exists a $\delta > 0$ such that
\[ f(x) \in N_\epsilon(L) \quad \text{for all } x \in E \cap (N_\delta(p) \setminus \{p\}). \]
This is illustrated graphically in Figure 4.1 for the case where $E$ is a
subset of $\mathbb{R}$.

FIGURE 4.1
$\lim_{x\to p} f(x) = L$
Remarks. (a) In the definition of limit, the choice of $\delta$ for a given
$\epsilon$ may depend not only on $\epsilon$ and the function, but also on the
point $p$. This will be illustrated in Example 4.1.2(f) below.
(b) If $p$ is not a limit point of $E$, then for $\delta$ sufficiently small,
there do not exist any $x \in E$ so that $0 < d(x, p) < \delta$. Thus if $p$ is
an isolated point of $E$, the concept of the limit of a function at $p$ has no
meaning.
(c) In the definition of limit, it is not required that $p \in E$, only that
$p$ is a limit point of $E$. Even if $p \in E$, and $f$ has a limit at $p$, we
may very well have that
\[ \lim_{x\to p} f(x) \ne f(p). \]
This will be the case in Example 4.1.2(b) below.
(d) Let $E \subset \mathbb{R}$ and $p$ a limit point of $E$. To show that a
given function $f$ does not have a limit at $p$, we must show that for every
$L \in \mathbb{R}$, there exists an $\epsilon > 0$, such that for every
$\delta > 0$, there exists an $x \in E$ with $0 < |x - p| < \delta$, for which
\[ |f(x) - L| \ge \epsilon. \]
We will illustrate this in Example 4.1.2(d).

EXAMPLES 4.1.2 (a) For $x \ne 2$, let $f(x)$ be defined by
\[ f(x) = \frac{x^2 - 4}{x - 2}, \quad x \ne 2. \]
The domain of $f$ is $E = (-\infty, 2) \cup (2, \infty)$, and 2 is clearly a
limit point of $E$. We now show that $\lim_{x\to 2} f(x) = 4$. For $x \ne 2$,
\[ |f(x) - 4| = \left|\frac{x^2 - 4}{x - 2} - 4\right| = |x + 2 - 4| = |x - 2|. \]
Thus given $\epsilon > 0$, the choice $\delta = \epsilon$ works in the
definition.
(b) Consider the following variation of (a). Let $g$ be defined on
$\mathbb{R}$ by
\[ g(x) = \begin{cases} \dfrac{x^2 - 4}{x - 2}, & x \ne 2, \\ 2, & x = 2. \end{cases} \]
For this example, 2 is a point in the domain of $g$, and it is still the case
that $\lim_{x\to 2} g(x) = 4$. However, the limit does not equal $g(2) = 2$. The
graph of $g$ is given in Figure 4.2.

FIGURE 4.2
Graph of g

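(c) Let $E = (-1, 0) \cup (1, \infty)$. For $x \in E$, let $h(x)$ be defined by
\[ h(x) = \frac{\sqrt{x+1} - 1}{x}, \quad x \ne 0. \]
We claim that $\lim_{x\to 0} h(x) = 1/2$. This result is obtained as follows:
For $x \ne 0$,
\[ \frac{\sqrt{x+1} - 1}{x} = \left(\frac{\sqrt{x+1} - 1}{x}\right)\left(\frac{\sqrt{x+1} + 1}{\sqrt{x+1} + 1}\right) = \frac{x}{x(\sqrt{x+1} + 1)} = \frac{1}{\sqrt{x+1} + 1}. \]
From this last term we now conjecture that $h(x) \to 1/2$ as $x \to 0$. By the
above,
\begin{align*}
\left|h(x) - \frac{1}{2}\right| &= \left|\frac{1}{\sqrt{x+1} + 1} - \frac{1}{2}\right| = \left|\frac{1 - \sqrt{x+1}}{2(\sqrt{x+1} + 1)}\right| \\
&= \left|\frac{(1 - \sqrt{x+1})(1 + \sqrt{x+1})}{2(\sqrt{x+1} + 1)^2}\right| = \left|\frac{-x}{2(\sqrt{x+1} + 1)^2}\right| \\
&= \frac{1}{2}\, \frac{|x|}{(\sqrt{x+1} + 1)^2}.
\end{align*}
For $x \in E$ we have $(\sqrt{x+1} + 1)^2 > 1$, and thus
\[ \left|h(x) - \frac{1}{2}\right| < \frac{|x|}{2}. \]
Given $\epsilon > 0$, let $\delta = \min\{1, \epsilon\}$. Then for
$0 < |x| < \delta$,
\[ \left|h(x) - \frac{1}{2}\right| < \frac{|x|}{2} < \frac{\delta}{2} < \epsilon, \]
and thus $\lim_{x\to 0} h(x) = 1/2$.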
(d) Let $f$ be defined on $\mathbb{R}$ as follows:
\[ f(x) = \begin{cases} 1, & x \in \mathbb{Q}, \\ 0, & x \notin \mathbb{Q}. \end{cases} \]
We will show that for this function, $\lim_{x\to p} f(x)$ fails to exist for
every $p \in \mathbb{R}$. Fix $p \in \mathbb{R}$. Let $L \in \mathbb{R}$ and let
\[ \epsilon = \max\{|L - 1|, |L|\}. \]
Suppose $\epsilon = |L - 1|$. By Theorem 1.5.2, for any $\delta > 0$, there
exists an $x \in \mathbb{Q}$ such that $0 < |p - x| < \delta$. For such an $x$,
\[ |f(x) - L| = |1 - L| = \epsilon. \]
If $\epsilon = |L|$, then by Exercise 6, Section 1.5, for any $\delta > 0$,
there exists an irrational number $x$ with $0 < |x - p| < \delta$. Again, for
such an $x$, $|f(x) - L| = \epsilon$. Thus with $\epsilon$ as defined, for any
$\delta > 0$, there exists an $x$ with $0 < |x - p| < \delta$ such that
$|f(x) - L| \ge \epsilon$. Since this works for every $L \in \mathbb{R}$,
$\lim_{x\to p} f(x)$ does not exist.

(e) Let $f : \mathbb{R} \to \mathbb{R}$ be defined by
\[ f(x) = \begin{cases} 0, & x \in \mathbb{Q}, \\ x, & x \notin \mathbb{Q}. \end{cases} \]
Then $\lim_{x\to 0} f(x) = 0$. Since $|f(x)| \le |x|$ for all $x$, given
$\epsilon > 0$, any $\delta$, $0 < \delta \le \epsilon$, will work in the
definition of the limit. A modification of the argument given in (d) shows that
for any $p \ne 0$, $\lim_{x\to p} f(x)$ does not exist. An alternate proof will
be provided in Example 4.2.2(b).
(f) This example shows dramatically how the choice of $\delta$ will in general
depend not only on $\epsilon$, but also on the point $p$. Let $E = (0, 1)$ and
let $f : E \to \mathbb{R}$ be defined by
\[ f(x) = \frac{1}{x}. \]
We will prove that for $p \in (0, 1]$,
\[ \lim_{x\to p} \frac{1}{x} = \frac{1}{p}. \]
If $x > p/2$, then
\[ \left|\frac{1}{x} - \frac{1}{p}\right| = \frac{|x - p|}{xp} < \frac{2}{p^2}\,|x - p|. \]
Therefore, given $\epsilon > 0$, let $\delta = \min\{p/2, p^2\epsilon/2\}$. Then
if $0 < |x - p| < \delta$, $x > p/2$, and
\[ \left|\frac{1}{x} - \frac{1}{p}\right| < \frac{2}{p^2}\,\delta \le \epsilon. \]
The $\delta$ as defined depends both on $p$ and $\epsilon$. This suggests that
any $\delta$ that works for a given $p$ and $\epsilon$ must depend on both $p$
and $\epsilon$. Suppose on the contrary that for a given $\epsilon > 0$ the
choice of $\delta$ is independent of $p \in (0, 1)$. Then with $\epsilon = 1$,
there exists a $\delta > 0$ such that
\[ \left|\frac{1}{x} - \frac{1}{p}\right| < 1 \]
for all $x, p \in (0, 1)$ with $0 < |x - p| < \delta$. Since any smaller
$\delta$ will also work, we can assume that $0 < \delta < \frac{1}{2}$. But now
if we take $p = \frac{1}{2}\delta$ and $x = \delta$, then
$0 < |x - p| < \delta$ and thus
\[ |f(x) - f(p)| = \left|\frac{1}{\delta} - \frac{2}{\delta}\right| = \frac{1}{\delta} > 1. \]
This contradiction proves that the choice of $\delta$ must depend on both $p$
and $\epsilon$.

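(g) For our final example we consider the function $f$ defined on
$\mathbb{R}^2 \setminus \{(0, 0)\}$ given by $f(x, y) = \dfrac{xy}{x^2 + y^2}$.
We will prove that $\lim_{(x,y)\to(1,2)} f(x, y) = \frac{2}{5}$. Consider
\begin{align*}
\left|f(x, y) - \frac{2}{5}\right| &= \left|\frac{5xy - 2x^2 - 2y^2}{5(x^2 + y^2)}\right| \\
&= \left|\frac{(x - 2y)(y - 2) + (4y - 2x)(x - 1)}{5(x^2 + y^2)}\right| \\
&\le \frac{(|x| + 2|y|)\,|y - 2| + (4|y| + 2|x|)\,|x - 1|}{5(x^2 + y^2)}.
\end{align*}
If $(x, y) \in N_{1/2}(1, 2)$, then $\frac{1}{2} < |x| < \frac{3}{2}$ and
$\frac{3}{2} < |y| < \frac{5}{2}$. (Verify!) Therefore for
$(x, y) \in N_{1/2}(1, 2)$ we have $|x| + 2|y| < \frac{13}{2}$,
$4|y| + 2|x| < \frac{26}{2}$, and $5(x^2 + y^2) > \frac{25}{2}$. Therefore
\[ \left|f(x, y) - \frac{2}{5}\right| < \frac{26}{25}\,(|y - 2| + |x - 1|). \]
Hence if $\delta$ is chosen so that $0 < \delta \le \frac{1}{2}$, then for
$(x, y) \in N_\delta(1, 2)$ we have
\[ \left|f(x, y) - \frac{2}{5}\right| < \frac{52}{25}\,\delta. \]
Thus given $\epsilon > 0$, if we choose $\delta$ such that
$0 < \delta < \min\{\frac{1}{2}, \frac{25}{52}\epsilon\}$, then
$(x, y) \in N_\delta(1, 2)$ implies that $|f(x, y) - \frac{2}{5}| < \epsilon$. □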
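
The choice of $\delta$ in example (g) can be tested empirically. The sketch
below assumes that $N_\delta(1, 2)$ is the open euclidean disc of radius
$\delta$ about $(1, 2)$, takes $\delta$ slightly smaller than
$\min\{\frac{1}{2}, \frac{25}{52}\epsilon\}$ (the factor 0.99 is an arbitrary
choice), samples random points of the disc, and confirms that
$|f(x, y) - \frac{2}{5}| < \epsilon$ at each of them. Random sampling of course
proves nothing; it merely illustrates the estimate.

```python
import math
import random

def f(x, y):
    """f(x, y) = xy / (x^2 + y^2), defined away from the origin."""
    return x * y / (x * x + y * y)

def delta_for(eps):
    """A delta strictly below min{1/2, 25*eps/52}, as in the example."""
    return 0.99 * min(0.5, 25.0 * eps / 52.0)

random.seed(0)
for eps in (1.0, 0.1, 1e-3):
    delta = delta_for(eps)
    worst = 0.0
    for _ in range(10_000):
        # Random point at euclidean distance < delta from (1, 2).
        rad = delta * math.sqrt(random.random())
        ang = 2 * math.pi * random.random()
        x, y = 1 + rad * math.cos(ang), 2 + rad * math.sin(ang)
        worst = max(worst, abs(f(x, y) - 0.4))
    assert worst < eps
    print(eps, delta, worst)
```

The observed worst case is much smaller than $\epsilon$, which reflects the
fact that the constant $\frac{52}{25}$ in the estimate is far from optimal.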

Sequential Criterion for Limits


Our first theorem allows us to reduce the question of the existence of the limit
of a function to one concerning the existence of limits of sequences. As we will
see, this result will be very useful in subsequent proofs, and also in showing
that a given function does not have a limit at a point p.

THEOREM 4.1.3 Let E be a subset of a metric space X, p a limit point of
E, and f a real-valued function defined on E. Then
\[ \lim_{x\to p} f(x) = L \quad\text{if and only if}\quad \lim_{n\to\infty} f(p_n) = L \]
for every sequence {p_n} in E, with p_n ≠ p for all n, and lim_{n→∞} p_n = p.

Remark. Since p is a limit point of E, Theorem 3.1.4 guarantees the existence
of a sequence {p_n} in E with p_n ≠ p for all n ∈ N and p_n → p.
Proof. Suppose lim_{x→p} f(x) = L. Let {p_n} be any sequence in E with p_n ≠ p
for all n and p_n → p. Let ε > 0 be given. Since lim_{x→p} f(x) = L, there exists a
δ > 0 such that
\[ |f(x) - L| < \epsilon \quad\text{for all } x \in E,\ 0 < |x - p| < \delta. \tag{1} \]

Since lim_{n→∞} p_n = p, for the above δ, there exists a positive integer n_o such that
\[ 0 < |p_n - p| < \delta \quad\text{for all } n \ge n_o. \]
Thus if n ≥ n_o, by (1), |f(p_n) − L| < ε. Therefore lim_{n→∞} f(p_n) = L.
Conversely, suppose f(p_n) → L for every sequence {p_n} in E with p_n ≠ p
for all n and p_n → p. Suppose lim_{x→p} f(x) ≠ L. Then there exists an ε > 0
such that for every δ > 0, there exists an x ∈ E with 0 < |x − p| < δ and
|f(x) − L| ≥ ε. For each n ∈ N, take δ = 1/n. Then for each n, there exists
p_n ∈ E such that
\[ 0 < |p_n - p| < \frac{1}{n} \quad\text{and}\quad |f(p_n) - L| \ge \epsilon. \]
Thus p_n → p, but {f(p_n)} does not converge to L. This contradiction proves
the result. □
An immediate consequence of the previous theorem is the following unique-
ness theorem.

COROLLARY 4.1.4 If f has a limit at p, then it is unique.

Theorem 4.1.3 is often applied to show that a limit does not exist. If one
can find a sequence {p_n} with p_n → p, such that {f(p_n)} does not converge,
then lim_{x→p} f(x) does not exist. Alternately, if one can find two sequences {p_n}
and {r_n} both converging to p, but for which
\[ \lim_{n\to\infty} f(p_n) \ne \lim_{n\to\infty} f(r_n), \]
then again lim_{x→p} f(x) does not exist. We illustrate this with the following two
examples.

EXAMPLES 4.1.5 (a) Let E = (0, ∞) and f(x) = sin(1/x), x ∈ E. We use
the previous theorem to show that
\[ \lim_{x\to 0} \sin\frac{1}{x} \]
does not exist. Let p_n = 2/((2n + 1)π). Then
\[ f(p_n) = \sin\left((2n+1)\frac{\pi}{2}\right) = (-1)^n. \]
Thus lim_{n→∞} f(p_n) does not exist, and consequently by Theorem 4.1.3, lim_{x→0} f(x)
also does not exist. The graph of f(x) = sin(1/x) is given in Figure 4.3.

FIGURE 4.3
Graph of f (x) = sin(1/x), x > 0
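The sequence used in Example 4.1.5(a) is easy to evaluate numerically. The following Python sketch (function names are illustrative assumptions) computes f(p_n) = sin(1/p_n) for p_n = 2/((2n + 1)π) and shows the values alternating between approximately −1 and 1 while p_n tends to 0.

```python
import math

def f(x):
    """f(x) = sin(1/x) on (0, infinity)."""
    return math.sin(1.0 / x)

def p(n):
    """The sequence p_n = 2 / ((2n + 1) pi), which converges to 0."""
    return 2.0 / ((2 * n + 1) * math.pi)

# f(p_n) agrees with (-1)^n up to floating-point rounding, so {f(p_n)} has no limit.
for n in range(1, 9):
    print(n, p(n), f(p(n)))
```

Since the values keep oscillating, the sequential criterion rules out the existence of the limit at 0.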

(b) As in Example 4.1.2(e) let
\[ f(x) = \begin{cases} 0, & x \in \mathbb{Q}, \\ x, & x \notin \mathbb{Q}. \end{cases} \]
Suppose p ∈ R, p ≠ 0. Since Q is dense in R, there exists a sequence {p_n} ⊂ Q
with p_n ≠ p for all n ∈ N such that p_n → p. Hence lim_{n→∞} f(p_n) = 0. On the
other hand, since R \ Q is also dense in R, there exists a sequence {q_n} of
irrational numbers with q_n → p. But then lim_{n→∞} f(q_n) = lim_{n→∞} q_n = p. Thus
since p ≠ 0, by Theorem 4.1.3 lim_{x→p} f(x) does not exist. □

Limit Theorems

THEOREM 4.1.6 Suppose E is a subset of a metric space X, f, g : E → R,
and p is a limit point of E. If
\[ \lim_{x\to p} f(x) = A \quad\text{and}\quad \lim_{x\to p} g(x) = B, \]
then
(a) lim_{x→p} [f(x) + g(x)] = A + B,
(b) lim_{x→p} f(x)g(x) = AB, and
(c) lim_{x→p} f(x)/g(x) = A/B, provided B ≠ 0.

Proof. For (a) and (b) apply Theorem 4.1.3 and Theorem 3.2.1, respectively.
We leave the details to the exercises (Exercise 11).
Proof of (c). By (b) it suffices to show that
\[ \lim_{x\to p} \frac{1}{g(x)} = \frac{1}{B}. \]
We first show that since B ≠ 0, g(x) ≠ 0 for all x sufficiently close to p, x ≠ p.
Take ε = |B|/2. Then by the definition of limit, there exists a δ_1 > 0 such
that
\[ |g(x) - B| < \frac{|B|}{2} \]
for all x ∈ E, 0 < |x − p| < δ_1. By Corollary 2.1.4 |g(x) − B| ≥ ||g(x)| − |B||.
Thus
\[ |g(x)| > |B| - \frac{|B|}{2} = \frac{|B|}{2} > 0 \]
for all x ∈ E, 0 < |x − p| < δ_1.
We can now apply Theorem 4.1.3 and the corresponding result for sequences
of Theorem 3.2.1. Let {p_n} be any sequence in E with p_n → p
and p_n ≠ p for all n. For the above δ_1, there exists an n_o ∈ N such that
0 < |p_n − p| < δ_1 for all n ≥ n_o. Thus g(p_n) ≠ 0 for all n ≥ n_o. Therefore by
Theorem 3.2.1(c),
\[ \lim_{n\to\infty} \frac{1}{g(p_n)} = \frac{1}{B}. \]
Since this holds for every sequence p_n → p, by Theorem 4.1.3,
\[ \lim_{x\to p} \frac{1}{g(x)} = \frac{1}{B}. \qquad \square \]

The proofs of the following two theorems are easy consequences of Theo-
rem 4.1.3 and the corresponding theorems for sequences (Theorems 3.2.3 and
3.2.4). First however we give the following definition.

DEFINITION 4.1.7 A real-valued function f defined on a set E is


bounded on E if there exists a constant M such that |f (x)| ≤ M for all
x ∈ E.

THEOREM 4.1.8 Suppose E is a subset of a metric space X, p is a limit
point of E, and f, g are real-valued functions on E. If g is bounded on E and
lim_{x→p} f(x) = 0, then
\[ \lim_{x\to p} f(x)g(x) = 0. \]

Proof. Exercise 12. 



THEOREM 4.1.9 Suppose E is a subset of a metric space X, p is a limit
point of E, and f, g, h are functions from E into R satisfying
\[ g(x) \le f(x) \le h(x) \quad\text{for all } x \in E. \]
If lim_{x→p} g(x) = lim_{x→p} h(x) = L, then lim_{x→p} f(x) = L.

Proof. Exercise 13. 


We now provide examples to illustrate the previous theorems.

EXAMPLES 4.1.10 (a) Using mathematical induction and Theorem
4.1.6(b), lim_{x→c} x^n = c^n for all n ∈ N. If p(x) = a_n x^n + · · · + a_1 x + a_0 is
a polynomial function of degree n, where n is a nonnegative integer and
a_0, a_1, . . . , a_n ∈ R with a_n ≠ 0, then a repeated application of Theorem 4.1.6(a)
gives lim_{x→c} p(x) = p(c).
(b) Consider
\[ \lim_{x\to -2} \frac{x^3 + 2x^2 - 2x - 4}{x^2 - 4}. \]
By part (a), lim_{x→−2} (x³ + 2x² − 2x − 4) = 0 and lim_{x→−2} (x² − 4) = 0. Since the
denominator has limit zero, Theorem 4.1.6(c) does not apply. In this example
however, for x ≠ −2
\[ \frac{x^3 + 2x^2 - 2x - 4}{x^2 - 4} = \frac{(x+2)(x^2-2)}{(x+2)(x-2)} = \frac{x^2-2}{x-2}. \]
Since lim_{x→−2} (x − 2) = −4 which is nonzero, we can now apply Theorem 4.1.6(c)
to conclude that
\[ \lim_{x\to -2} \frac{x^3 + 2x^2 - 2x - 4}{x^2 - 4} = \lim_{x\to -2} \frac{x^2-2}{x-2} = -\frac{1}{2}. \]

(c) Let E = R \ {0}, and let f : E → R be defined by
\[ f(x) = x \sin\frac{1}{x}. \]
Since |sin(1/x)| ≤ 1 for all x ∈ R, x ≠ 0 and lim_{x→0} x = 0, by Theorem 4.1.8
\[ \lim_{x\to 0} x \sin\frac{1}{x} = 0. \]
The graph of f(x) = x sin(1/x) is given in Figure 4.4.
(d) Let E = (0, ∞) and let f be defined on E by
\[ f(t) = \frac{\sin t}{t}. \]

FIGURE 4.4
Graph of f(x) = x sin(1/x), x ≠ 0

We now prove that
\[ \lim_{t\to 0} \frac{\sin t}{t} = 1. \]
As we will see in the next chapter, this limit will be crucial in computing the
derivative of the sine function. From Figure 4.5, we have

area(△OPQ) < area(sector OPR) < area(△ORS).

In terms of t, this gives
\[ \frac{1}{2} \sin t \cos t < \frac{1}{2}\, t < \frac{1}{2} \tan t. \]
Therefore
\[ \cos t < \frac{\sin t}{t} < \frac{1}{\cos t}. \]
Using the fact that lim_{t→0} cos t = 1 (Exercise 5), by Theorem 4.1.9 we obtain
\[ \lim_{t\to 0} \frac{\sin t}{t} = 1. \qquad \square \]
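The squeeze in Example 4.1.10(d) can be observed numerically. The Python sketch below is illustrative only (the helper name `squeeze` is an assumption): it evaluates the three quantities cos t, (sin t)/t, and 1/cos t for a few small positive t.

```python
import math

def squeeze(t):
    """Return (cos t, sin t / t, 1 / cos t) for 0 < t < pi/2."""
    return math.cos(t), math.sin(t) / t, 1.0 / math.cos(t)

# The middle value is trapped between two quantities that both tend to 1.
for t in (1.0, 0.5, 0.1, 0.01, 0.001):
    lower, middle, upper = squeeze(t)
    print(t, lower, middle, upper)
```

As t decreases, the outer values close in on 1 from both sides, which is the content of Theorem 4.1.9 in this example.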

FIGURE 4.5
Triangles and sector of Example 4.1.10(d)

Limits at Infinity
Up to this point we have only considered limits at points p ∈ R. We now
extend the definition to include limits at ∞ or −∞. The definition of the
limit at ∞ is very similar to lim_{n→∞} f(n) where f : N → R; that is, f is a
sequence in R.

DEFINITION 4.1.11 Let f be a real-valued function such that Dom f ∩
(a, ∞) ≠ ∅ for every a ∈ R. The function f has a limit at ∞ if there exists
a number L ∈ R such that given ε > 0, there exists a real number M for which
\[ |f(x) - L| < \epsilon \]
for all x ∈ Dom f ∩ (M, ∞). If this is the case, we write
\[ \lim_{x\to\infty} f(x) = L. \]
Similarly, if Dom f ∩ (−∞, b) ≠ ∅ for every b ∈ R, then
\[ \lim_{x\to -\infty} f(x) = L \]
if and only if given ε > 0, there exists a real number M such that
\[ |f(x) - L| < \epsilon \]
for all x ∈ Dom f ∩ (−∞, M).



The hypothesis that Dom f ∩ (a, ∞) ≠ ∅ for every a ∈ R is equivalent to
saying that the domain of the function f is not bounded above. If Dom f = N,
then the above definition gives the definition for the limit of a sequence. The
reader should convince themselves that all theorems up to this point involving
limits at a point p ∈ R are still valid if p is replaced by ∞ or −∞.

EXAMPLES 4.1.12 (a) As our first example, consider the function
f(x) = (sin x)/x defined on (0, ∞). Since |sin x| ≤ 1,
\[ |f(x)| \le \frac{1}{x} \]
for all x ∈ (0, ∞). Let ε > 0 be given. Then with M = 1/ε,
\[ |f(x)| < \epsilon \quad\text{for all } x > M. \]
Therefore, lim_{x→∞} (sin x)/x = 0.
(b) For our second example consider f(x) = x sin πx. If we set p_n = n + 1/2,
n ∈ N, then
\[ f(p_n) = \left(n + \tfrac{1}{2}\right) \sin\left(\left(n + \tfrac{1}{2}\right)\pi\right) = (-1)^n \left(n + \tfrac{1}{2}\right). \]
Thus the sequence {f(p_n)}, n = 1, 2, . . . , is unbounded, and as a consequence
lim_{x→∞} x sin πx does not exist. □
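Both parts of Examples 4.1.12 can be illustrated with a short computation. The Python sketch below is an illustration under stated assumptions: the helper names, the sample count, and the step size 0.37 are arbitrary choices, and checking finitely many points does not replace the proofs above.

```python
import math

def f(x):
    """Part (a): f(x) = sin(x) / x on (0, infinity)."""
    return math.sin(x) / x

def g(x):
    """Part (b): g(x) = x sin(pi x)."""
    return x * math.sin(math.pi * x)

def bounded_beyond(eps, count=1000, step=0.37):
    """Check |f(x)| < eps at sample points x > M, where M = 1/eps."""
    M = 1.0 / eps
    return all(abs(f(M + k * step)) < eps for k in range(1, count + 1))

print([bounded_beyond(eps) for eps in (0.5, 0.1, 0.001)])

# Along p_n = n + 1/2 the values of g alternate in sign and grow without bound.
print([round(g(n + 0.5), 6) for n in range(1, 7)])
```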

Exercises 4.1
1. Use the definition to establish each of the following limits.
*a. lim_{x→2} (2x − 7) = −3.
b. lim_{x→−2} (3x + 5) = −1.
*c. lim_{x→1} x/(1 + x) = 1/2.
d. lim_{x→−1} (2x² − 3x − 4) = 1.
*e. lim_{x→−1} (x³ + 1)/(x + 1) = 3.
f. lim_{x→2} (x³ − 2x − 4)/(x² − 4) = 5/2.
2. Use the definition to establish each of the following limits.
a. lim_{x→p} c = c.
b. lim_{x→p} x = p.
*c. lim_{x→p} x³ = p³.
d. lim_{x→p} x^n = p^n, n ∈ N.
*e. lim_{x→p} √x = √p, p > 0.
f. lim_{x→0} (√(x + p) − √p)/x = 1/(2√p), p > 0.
3. For each of the following, determine whether the indicated limit exists in
R. Justify your answer!
*a. lim_{x→0} x/|x|.
b. lim_{x→1} (x² − 1)/(x + 1).
*c. lim_{x→0} cos(1/x).
d. lim_{x→0} √|x| cos(1/x).
*e. lim_{x→0} ((x + 1)² − 1)/x.
f. lim_{x→1} (x⁴ − 2x² + 1)/(x³ − x² − x + 1).

4. *Define f : (−1, 1) → R by
f(x) = (x² − x − 2)/(x + 1).
Determine the limit L of f at −1 and prove, using ε and δ, that f has
limit L at −1.
5. *a. Using Figure 4.5, prove that |sin h| ≤ |h| for all h ∈ R.
b. Using the trigonometric identity 1 − cos h = 2 sin²(h/2), prove that
(i) lim_{h→0} cos h = 1.
(ii) lim_{h→0} (1 − cos h)/h = 0.
6. Let E be a subset of a metric space (X, d), p a limit point of E, and
f : E → R. Suppose there exists a constant M > 0 and L ∈ R such that
|f(x) − L| ≤ M d(x, p) for all x ∈ E. Prove that lim_{x→p} f(x) = L.
7. Suppose f : E → R, p is a limit point of E, and lim_{x→p} f(x) = L.
*a. Prove that lim_{x→p} |f(x)| = |L|.
b. If in addition f(x) ≥ 0 for all x ∈ E, prove that lim_{x→p} √(f(x)) = √L.
*c. Prove that lim_{x→p} (f(x))^n = L^n for each n ∈ N.
x→p

8. Use the limit theorems, examples, and previous exercises to find each of
the following limits. State which theorem, examples, or exercises are used
in each case.
*a. lim_{x→−1} (5x² + 3x − 2)/(x − 1).
b. lim_{x→−1} (x³ − x² + 2)/(x + 1).
*c. lim_{x→1} √((3x + 1)/(2x + 5)).
d. lim_{x→−2} |x + 2|^{3/2}/(x + 2).
*e. lim_{x→4} (√x − 2)/(x − 4).
f. lim_{x→0} (1/x)(1/√(x + p) − 1/√p), p > 0.
*g. lim_{x→0} (sin 2x)/x.
h. lim_{x→0} (|x − 2| − |x + 2|)/x.
9. *Suppose f : (a, b) → R, p ∈ [a, b], and lim_{x→p} f(x) > 0. Prove that there
exists a δ > 0 such that f(x) > 0 for all x ∈ (a, b) with 0 < |x − p| < δ.
10. Suppose E is a subset of a metric space (X, d), p is a limit point of E,
and f : E → R. Prove that if f has a limit at p, then there exists a
positive constant M and a δ > 0 such that |f (x)| ≤ M for all x ∈ E, 0 <
d(x, p) < δ.
11. a. Prove Theorem 4.1.6(a).
b. Prove Theorem 4.1.6(b).
12. *Prove Theorem 4.1.8.
13. Prove Theorem 4.1.9.
14. Let f, g be real-valued functions defined on E ⊂ R and let p be a limit
point of E.
*a. If lim_{x→p} f(x) and lim_{x→p} (f(x) + g(x)) exist, prove that lim_{x→p} g(x) exists.

b. If lim_{x→p} f(x) and lim_{x→p} (f(x)g(x)) exist, does it follow that lim_{x→p} g(x)
exists?
15. Let E be a subset of a metric space, p a limit point of E. Suppose f is
a bounded real-valued function on E having the property that lim_{x→p} f(x)
does not exist. Prove that there exist distinct sequences {p_n} and {q_n}
in E with p_n → p and q_n → p such that lim_{n→∞} f(p_n) and lim_{n→∞} f(q_n) exist,
but are not equal.
16. *Let f be a real-valued function defined on (a, ∞) for some a > 0. Define
g on (0, 1/a) by g(t) = f(1/t). Prove that
lim_{x→∞} f(x) = L if and only if lim_{t→0} g(t) = L.

17. Investigate the limits at ∞ of each of the following functions.


3x2 + 3x − 1 1
*a. f (x) = 2 +1
b. f (x) =
√ 2x 1 + x2
2
4x + 1 2x + 3
*c. f (x) = . d. f (x) = √
x √ x+
√ x − 2x
*e. f (x) = x2 + x − x. f. f (x) = √
2 x + 3x
*g. f(x) = x cos(1/x).
h. f(x) = x sin(1/x).
18. Let f : (a, ∞) → R be such that lim_{x→∞} x f(x) = L where L ∈ R. Prove
that lim_{x→∞} f(x) = 0.
19. Let f : R → R satisfy f(x + y) = f(x) + f(y) for all x, y ∈ R. If lim_{x→0} f(x)
exists, prove that
a. lim_{x→0} f(x) = 0, and
b. lim_{x→p} f(x) exists for every p ∈ R.

4.2 Continuous Functions

The notion of continuity dates back to Leonhard Euler (1707–1783). To Euler,


a continuous curve (function) was one that could be expressed by a single for-
mula or equation of the variable x. If the definition of the curve was made up
of several parts, it was called discontinuous. This definition was sufficient to
convey the concept of continuity if we keep in mind that in Euler’s time math-
ematicians were primarily only concerned with elementary functions; namely
functions built up from the trigonometric and exponential functions, and in-
verses of these functions, using algebraic operations and composition.
The more modern version of continuity is due to Bernhard Bolzano (1817)
and Augustin-Louis Cauchy (1821). Both men were motivated to provide a
clear and precise definition of continuity in order to be able to prove the
intermediate value theorem (Theorem 4.2.11). Cauchy’s definition of continu-
ity was as follows: “The function f (x) will be, between two assigned values

of the variable x, a continuous function of this variable if for each value of


x between these limits, the numerical [i.e. absolute value] of the difference
f (x + α) − f (x) decreases indefinitely with α”1 . Even this definition appears
strange in comparison with the more modern definition in use today. Both
Bolzano and Cauchy were concerned with continuity on an interval, rather
than continuity at a point.

DEFINITION 4.2.1 Let E be a subset of a metric space (X, d) and f a real-


valued function with domain E. The function f is continuous at a point
p ∈ E, if for every ǫ > 0, there exists a δ > 0 such that
|f (x) − f (p)| < ǫ
for all x ∈ E with d(x, p) < δ. The function f is continuous on E if and
only if f is continuous at every point p ∈ E.

The above definition can be rephrased as follows: A function f : E → R is


continuous at p ∈ E if and only if given ǫ > 0, there exists a δ > 0 such that
f (x) ∈ Nǫ (f (p)) for all x ∈ Nδ (p) ∩ E.
This is illustrated on the real line in Figure 4.6.

FIGURE 4.6
An illustration of Definition 4.2.1

Remarks. (a) If p ∈ E is a limit point of E, then f is continuous at p if and
only if
\[ \lim_{x\to p} f(x) = f(p). \]

1 Cauchy, Cours d’Analyse, p.43



Also, as a consequence of Theorem 4.1.3, f is continuous at p if and only if
\[ \lim_{n\to\infty} f(p_n) = f(p) \]
for every sequence {p_n} in E with p_n → p.


(b) If p ∈ E is an isolated point, then every function f on E is continuous
at p. This follows immediately from the fact that for an isolated point p of E,
there exists a δ > 0 such that Nδ (p) ∩ E = {p}.
We now consider several of the functions given in previous examples, and
also some additional new examples.

EXAMPLES 4.2.2 (a) Let g be defined as in Example 4.1.2(b), i.e.,
\[ g(x) = \begin{cases} \dfrac{x^2 - 4}{x - 2}, & x \ne 2, \\ 2, & x = 2. \end{cases} \]
At the point p = 2, lim_{x→2} g(x) = 4 ≠ g(2). Thus g is not continuous at p = 2.
However, if we redefine g at p = 2 so that g(2) = 4, then this function is now
continuous at p = 2.
(b) Let f be as defined in Example 4.1.2(e), i.e.,
\[ f(x) = \begin{cases} 0, & x \in \mathbb{Q}, \\ x, & x \notin \mathbb{Q}. \end{cases} \]
Since
\[ \lim_{x\to 0} f(x) = 0 = f(0), \]
f is continuous at p = 0. On the other hand, since lim_{x→p} f(x) fails to exist for
every p ≠ 0, f is discontinuous at every p ∈ R, p ≠ 0.
(c) The function f of Example 4.1.2(d) defined by
\[ f(x) = \begin{cases} 1, & x \in \mathbb{Q}, \\ 0, & x \notin \mathbb{Q}, \end{cases} \]
is discontinuous at every p ∈ R.
(d) As in Example 4.1.2(f), the function f (x) = 1/x is continuous at every
p ∈ (0, 1).
(e) Let f be defined by
\[ f(x) = \begin{cases} 0, & x = 0, \\ x \sin\dfrac{1}{x}, & x \ne 0. \end{cases} \]
By Example 4.1.10(c),
\[ \lim_{x\to 0} f(x) = 0 = f(0). \]
Thus f is continuous at x = 0.

(f) In this example, we show that f(x) = sin x is continuous on R. Let
x, y ∈ R. Then
\[
\begin{aligned}
|f(y) - f(x)| &= |\sin y - \sin x| \\
&= 2\left|\cos\tfrac{1}{2}(y + x)\,\sin\tfrac{1}{2}(y - x)\right| \\
&\le 2\left|\sin\tfrac{1}{2}(y - x)\right|.
\end{aligned}
\]
By Exercise 5 of Section 4.1 |sin h| ≤ |h|. Therefore
\[ |f(y) - f(x)| \le |y - x|, \]
from which it follows that f is continuous on R.


(g) We now consider a function on (0, 1) which is discontinuous at every
rational number in (0, 1) and continuous at every irrational number in (0, 1).
For x ∈ (0, 1) define
\[ f(x) = \begin{cases} 0, & \text{if } x \text{ is irrational}, \\ \dfrac{1}{n}, & \text{if } x \text{ is rational with } x = \dfrac{m}{n} \text{ in lowest terms}. \end{cases} \]
The graph of f , at least for a few rational numbers, is given in Figure 4.7.

FIGURE 4.7
Graph of the function of Example 4.2.2(g)

To establish our claim we will show that

lim f (x) = 0
x→p

for every p ∈ (0, 1). As a consequence, since f (p) = 0 for every irrational
number p ∈ (0, 1), f is continuous at every irrational number. Also, since

f(p) ≠ 0 when p ∈ Q ∩ (0, 1), f is discontinuous at every rational number in
(0, 1).
Fix p ∈ (0, 1) and let ε > 0 be given. To prove that lim_{x→p} f(x) = 0 we need
to show that there exists a δ > 0 such that
\[ |f(x)| < \epsilon \]
for all x ∈ N_δ(p) ∩ (0, 1), x ≠ p. This is certainly the case for any irrational
number x. On the other hand, if x is rational with x = m/n (in lowest terms),
then f(x) = 1/n. Choose n_o ∈ N such that 1/n_o < ε. There exist only a finite
number of rational numbers m/n (in lowest terms) in (0, 1) with denominator
less than n_o. Denote these by r_1, ..., r_m, and let
\[ \delta = \min\{|r_i - p| : i = 1, ..., m,\ r_i \ne p\}. \]
Note, since p may be a rational number and thus possibly equal to r_i for some
i = 1, ..., m, we take the minimum of {|r_i − p|} only for those i for which
r_i ≠ p. Thus δ > 0 and if r ∈ Q ∩ N_δ(p) ∩ (0, 1), r ≠ p, with r = m/n in lowest
terms, then n ≥ n_o. Therefore,
\[ |f(r)| = \frac{1}{n} < \epsilon. \]
Thus |f(x)| < ε for all x ∈ N_δ(p) ∩ (0, 1), x ≠ p. □
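The construction of δ in this argument is finite and can be carried out exactly with rational arithmetic. The following Python sketch is an illustration under stated assumptions: it handles only rational p (irrational p cannot be represented exactly), and the names `thomae` and `delta_for` are hypothetical. It lists the rationals in (0, 1) with denominator less than n_o and takes the smallest distance from p to any of them other than p itself.

```python
from fractions import Fraction
import math

def thomae(x: Fraction) -> Fraction:
    """f(x) = 1/n for a rational x = m/n; Fraction keeps x in lowest terms."""
    return Fraction(1, x.denominator)

def delta_for(p: Fraction, eps: Fraction) -> Fraction:
    """A delta > 0 with f(x) < eps for all rational x in (0, 1), x != p, |x - p| < delta."""
    n_o = math.floor(1 / eps) + 1        # smallest integer with 1/n_o < eps
    # Rationals in (0, 1) with denominator less than n_o, excluding p itself.
    small = {Fraction(m, n) for n in range(2, n_o) for m in range(1, n)}
    small.discard(p)
    if not small:                        # then every rational in (0, 1) already has f < eps
        return Fraction(1)
    return min(abs(r - p) for r in small)

p, eps = Fraction(1, 3), Fraction(1, 10)
delta = delta_for(p, eps)
print("delta =", delta)
```

Any rational within δ of p, other than p, then has denominator at least n_o, so its function value is below ε, exactly as in the proof.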

If f and g are real-valued functions defined on a set E, we define the sum


f + g, the difference f − g, and the product f g on E as follows: For x ∈ E,
(f + g)(x) = f (x) + g(x),
(f − g)(x) = f (x) − g(x),
(f g)(x) = f (x)g(x).

Furthermore, if g(x) ≠ 0 for all x ∈ E, we define the quotient f/g by
\[ \left(\frac{f}{g}\right)(x) = \frac{f(x)}{g(x)}. \]
More generally, if f and g are real-valued functions defined on a set E, the
quotient f/g can always be defined on E_1 = {x ∈ E : g(x) ≠ 0}.
As an application of Theorem 4.1.6 we prove that continuity is preserved
under the algebraic operations defined above.

THEOREM 4.2.3 If E is a subset of a metric space X and f, g : E → R
are continuous at p ∈ E, then
(a) f + g and f − g are continuous at p, and
(b) fg is continuous at p.
(c) If g(x) ≠ 0 for all x ∈ E, then f/g is continuous at p.

Proof. If p is an isolated point of E, then the result is true since every function
on E is continuous at p. If p is a limit point of E, then the conclusions follow
from Theorem 4.1.6. 

Composition of Continuous Functions


In the following theorem we prove that continuity is also preserved under
composition of functions.

THEOREM 4.2.4 Let A, B ⊂ R and let f : A → R and g : B → R


be functions such that Range f ⊂ B. If f is continuous at p ∈ A and g is
continuous at f (p), then h = g ◦ f is continuous at p.

Proof. Let ǫ > 0 be given. Since g is continuous at f (p), there exists a δ1 > 0
such that

|g(y) − g(f (p))| < ǫ for all y ∈ B ∩ Nδ1 (f (p)). (2)

Since f is continuous at p, for this δ1 , there exists a δ > 0 such that

|f (x) − f (p)| < δ1 for all x ∈ A ∩ Nδ (p).

Thus, if x ∈ A with |x − p| < δ, by (2)

|h(x) − h(p)| = |g(f (x)) − g(f (p))| < ǫ.

Therefore h is continuous at p. 

EXAMPLES 4.2.5 (a) If p is a polynomial function of degree n, that is
\[ p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \]
where n is a nonnegative integer, a_0, ..., a_n ∈ R with a_n ≠ 0, and Dom p = R,
then by Theorem 4.1.6 (a) and (b), p is continuous on R.
(b) Suppose p and q are polynomials on R and E = {x ∈ R : q(x) = 0}.
Then by Theorem 4.1.6 the rational function r defined on R \ E by

p(x)
r(x) = , x ∈ R \ E,
q(x)

is continuous on R \ E.
(c) By Example 4.2.2(f), f (x) = sin x is continuous on R. Hence if p is
a polynomial function on R, by Theorem 4.2.4 (f ◦ p)(x) = sin(p(x)) is also
continuous on R. 

Topological Characterization of Continuity


Before considering several consequences of continuity, we provide a strictly
topological characterization of a continuous function. In more abstract courses
this is often taken as the definition of continuity.

THEOREM 4.2.6 Let E be a subset of a metric space X and let f be a
real-valued function on E. Then f is continuous on E if and only if f⁻¹(V)
is open in E for every open subset V of R.

Proof. Recall (Definition 2.2.21) that a set U ⊂ E is open in E if for every
p ∈ U there exists a δ > 0 such that N_δ(p) ∩ E ⊂ U.
Suppose f is continuous on E and V is an open subset of R. If f⁻¹(V) = ∅
we are done. Suppose p ∈ f⁻¹(V). Then f(p) ∈ V. Since V is open, there exists
ε > 0 such that N_ε(f(p)) ⊂ V. Since f is continuous at p, there exists a δ > 0
such that f(x) ∈ N_ε(f(p)) for all x ∈ N_δ(p) ∩ E; i.e., N_δ(p) ∩ E ⊂ f⁻¹(V).
Since p ∈ f⁻¹(V) was arbitrary, f⁻¹(V) is open in E.
Conversely, suppose f⁻¹(V) is open in E for every open subset V of R.
Let p ∈ E and let ε > 0 be given. Then by hypothesis f⁻¹(N_ε(f(p))) is open
in E. Thus there exists a δ > 0 such that
\[ E \cap N_\delta(p) \subset f^{-1}\bigl(N_\epsilon(f(p))\bigr), \]
that is, f(x) ∈ N_ε(f(p)) for all x ∈ N_δ(p) ∩ E. Therefore f is continuous at p. □




EXAMPLES 4.2.7 (a) We illustrate the previous theorem for the function
f(x) = √x, Dom f = [0, ∞). Suppose first that V is an open interval (a, b)
with a < b. Then
\[ f^{-1}(V) = \begin{cases} \emptyset, & b \le 0, \\ [0, b^2), & a < 0 < b, \\ (a^2, b^2), & 0 \le a. \end{cases} \]
Clearly ∅ and (a², b²) are open subsets of R and hence also of [0, ∞). Although
[0, b²) is not open in R,
\[ [0, b^2) = (-b^2, b^2) \cap [0, \infty). \]
Thus by Theorem 2.2.23 [0, b²) is open in [0, ∞). If V is an arbitrary open
subset of R, then by Theorem 2.2.20 V = ⋃_n I_n, where {I_n} is a finite or
countable collection of open intervals. Since f⁻¹(V) = ⋃_n f⁻¹(I_n) (Theorem
1.7.14) and each f⁻¹(I_n) is open in [0, ∞), f⁻¹(V) is open in [0, ∞). Therefore
f(x) = √x is continuous on [0, ∞).
(b) In this example, we show that if f : E → R is continuous on E and

V ⊂ E is open in E, then f(V) is not necessarily open in Range f. Consider
the function f : R → R given by
\[ f(x) = \begin{cases} x^2, & x \le 2, \\ 6 - x, & x > 2. \end{cases} \]
Then f is continuous on R and Range f = R (Exercise 10). However,
f((−1, 1)) = [0, 1), and this set is not open in R. □

Continuity and Compactness


We now consider several consequences of continuity. In our first result we
prove that the continuous image of a compact set is compact. In the proof of
the theorem we only use continuity and the definition of a compact set. For
subsets of R, an alternate proof using the Heine-Borel-Bolzano-Weierstrass
theorem (Theorem 2.4.2) is suggested in the exercises (Exercise 25).

THEOREM 4.2.8 If K is a compact subset of a metric space X and if


f : K → R is continuous on K, then f (K) is compact.

Proof. Let {V_α}_{α∈A} be an open cover of f(K). Since f is continuous on K,
f⁻¹(V_α) is open in K for every α ∈ A. By Theorem 2.2.23, for each α there
exists an open subset U_α of X such that
\[ f^{-1}(V_\alpha) = K \cap U_\alpha. \]
We claim that {U_α}_{α∈A} is an open cover of K. If p ∈ K, then f(p) ∈ f(K)
and thus f(p) ∈ V_α for some α ∈ A. But then p is in f⁻¹(V_α) and hence also
in U_α. Since each U_α is also open, the collection {U_α}_{α∈A} is an open cover of
K. Since K is compact, there exist α_1, ..., α_n ∈ A such that
\[ K \subset \bigcup_{j=1}^{n} U_{\alpha_j}. \]
Therefore,
\[ K = \bigcup_{j=1}^{n} (U_{\alpha_j} \cap K) = \bigcup_{j=1}^{n} f^{-1}(V_{\alpha_j}), \]
and by Theorem 1.7.14(a),
\[ f(K) = \bigcup_{j=1}^{n} f\bigl(f^{-1}(V_{\alpha_j})\bigr). \]
Since f(f⁻¹(V_{α_j})) ⊂ V_{α_j}, f(K) ⊂ ⋃_{j=1}^{n} V_{α_j}. Thus f(K) is compact. □
As a corollary of the previous theorem we obtain the following general-
ization of the usual maximum-minimum theorem normally encountered in
calculus.

COROLLARY 4.2.9 Let K be a compact subset of R and let f : K → R be


continuous. Then there exist p, q ∈ K such that

f (q) ≤ f (x) ≤ f (p) for all x ∈ K.

Proof. Let M = sup{f (x) : x ∈ K}. By the previous theorem f (K) is


compact. Thus, since f (K) is bounded, M < ∞. Also, since f (K) is closed,
M ∈ f (K). Thus there exists p ∈ K such that f (p) = M . Similarly for
m = inf{f (x) : x ∈ K}. 
We now provide examples to show that the result is false if K ⊂ R is not
compact; that is, not both closed and bounded.

EXAMPLES 4.2.10 (a) Suppose E is a subset of R. If E is unbounded,
consider
\[ f(x) = \frac{x^2}{1 + x^2}. \]
Then f is continuous on E, sup{f(x) : x ∈ E} = 1, but f(x) < 1 for all
x ∈ E. To see that the supremum is 1, we note that since E is unbounded,
there exists a sequence {x_n} in E such that x_n² → ∞ as n → ∞. But then
\[ \lim_{n\to\infty} f(x_n) = \lim_{n\to\infty} \frac{1}{1 + \dfrac{1}{x_n^2}} = 1. \]
Thus if 0 < β < 1, there exists an integer n such that f(x_n) > β. Hence
sup{f(x) : x ∈ E} = 1.
(b) If E is not closed, let x_o be a limit point of E which is not in E. Then
\[ g(x) = \frac{1}{1 + (x - x_o)^2} \]
is continuous on E, with g(x) < 1 for all x ∈ E. A similar argument as above
shows that sup{g(x) : x ∈ E} = 1. □

Intermediate Value Theorem


The following theorem is attributed to both Bolzano and Cauchy. Cauchy how-
ever implicitly assumed the completeness of R in his proof, whereas the proof
by Bolzano (given below) uses the least upper bound property. An alternate
proof is outlined in the miscellaneous exercises.

THEOREM 4.2.11 (Intermediate Value Theorem) Let f : [a, b] → R


be continuous. Suppose f (a) < f (b). If γ is a number satisfying

f (a) < γ < f (b),

then there exists c ∈ (a, b) such that f (c) = γ.



FIGURE 4.8
Intermediate value theorem

The statement and conclusion of the intermediate value theorem is illus-


trated in Figure 4.8. The theorem simply states that if f is continuous on [a, b]
with f (a) < f (b), and γ ∈ R satisfies that f (a) < γ < f (b), then the graph of
f crosses the line y = γ at least once.
Proof. Let A = {x ∈ [a, b] : f(x) ≤ γ}. The set A ≠ ∅ since a ∈ A, and
A is bounded above by b. Thus by the least upper bound property A has a
supremum in R. Let c = sup A. Since b is an upper bound, c ≤ b.
We now show that f(c) = γ. Suppose f(c) < γ. Then ε = ½(γ − f(c)) > 0.
Since f is continuous at c, for this ε there exists a δ > 0 such that
\[ f(c) - \epsilon < f(x) < f(c) + \epsilon \quad\text{for all } x \in N_\delta(c) \cap [a, b]. \]
Since f(c) < γ, c ≠ b, and thus (c, b] ∩ N_δ(c) ≠ ∅. But for any x ∈ (c, b] with
c < x < c + δ,
\[ f(x) < f(c) + \epsilon = f(c) + \frac{1}{2}\gamma - \frac{1}{2} f(c) = \frac{1}{2}\bigl(f(c) + \gamma\bigr) < \gamma. \]
But then x ∈ A and x > c which contradicts c = sup A. Therefore f(c) ≥ γ.
Since c = sup A, either c ∈ A or c is a limit point of A. If c ∈ A, then
f(c) ≤ γ. If c is a limit point of A, then by Theorem 3.1.4 there exists a
sequence {x_n} in A such that x_n → c. Since x_n ∈ A, f(x_n) ≤ γ. Since f is
continuous,
\[ f(c) = \lim_{n\to\infty} f(x_n) \le \gamma. \]
Thus in either case, f(c) ≤ γ. Therefore f(c) = γ. □



The intermediate value theorem is one of the fundamental theorems of


calculus. Simply stated, the theorem implies that if I is an interval and f :
I → R is continuous, then f (I) is an interval. Due to the importance of this
result we state it as a corollary.

COROLLARY 4.2.12 If I ⊂ R is an interval and f : I → R is continuous


on I, then f (I) is an interval.

Proof. Let s, t ∈ f(I) with s < t, and let a, b ∈ I with a ≠ b be such that
f(a) = s and f(b) = t. Suppose γ satisfies s < γ < t. If a < b, then since f is
continuous on [a, b], by the intermediate value theorem there exists
c ∈ (a, b) such that f(c) = γ. Thus γ ∈ f(I). A similar argument also holds if
a > b. □
There is an alternate way to state the previous corollary using the ter-
minology of connected sets. If I is a connected subset of R and f : I → R
is continuous on I, then f (I) is connected. This result can be proved using
only properties of continuous functions and the definition of a connected set
(Exercise 28). The corollary now follows from the fact that a subset of R
is connected if and only if it is an interval (Theorem 2.2.25). The proof of
Theorem 2.2.25 however also requires the least upper bound property of R.
Consequently, the supremum property of the real numbers cannot be avoided
in proving Corollary 4.2.12.
The following two corollaries are additional applications of the intermedi-
ate value theorem. Our first result is the proof of Theorem 1.5.3.

COROLLARY 4.2.13 For every real number γ > 0 and every positive integer
n, there exists a unique positive real number y so that y^n = γ.

Proof. That y is unique is clear. Let f(x) = x^n, which by Exercise 7 is
continuous on R. Let a = 0 and b = γ + 1. Since (γ + 1)^n > γ, f satisfies the
hypothesis of Theorem 4.2.11. Thus there exists y, 0 < y < γ + 1, such that
\[ f(y) = y^n = \gamma. \qquad \square \]
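The existence argument suggests a way to approximate the root numerically. The Python sketch below uses bisection on the interval [0, γ + 1] from the proof; note that bisection is a different technique from the supremum argument used to prove Theorem 4.2.11, and the function name, tolerance, and iteration cap are assumptions chosen for illustration.

```python
def nth_root(gamma: float, n: int, tol: float = 1e-12) -> float:
    """Approximate the positive y with y**n == gamma by bisection on [0, gamma + 1]."""
    if gamma <= 0 or n < 1:
        raise ValueError("need gamma > 0 and a positive integer n")
    a, b = 0.0, gamma + 1.0          # f(a) = 0 < gamma < (gamma + 1)**n = f(b)
    for _ in range(200):             # iteration cap guards against floating-point stalls
        if b - a <= tol:
            break
        c = (a + b) / 2.0
        if c ** n <= gamma:          # keep the invariant a**n <= gamma < b**n
            a = c
        else:
            b = c
    return (a + b) / 2.0

print(nth_root(2.0, 2))
print(nth_root(27.0, 3))
```

Each step halves an interval on which f(x) = x^n takes values on both sides of γ, so continuity and the intermediate value theorem guarantee that the root stays inside it.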

COROLLARY 4.2.14 If f : [0, 1] → [0, 1] is continuous, then there exists


y ∈ [0, 1] such that f (y) = y.

Proof. Let g(x) = f (x) − x. Then g(0) = f (0) ≥ 0 and g(1) = f (1) − 1 ≤ 0.
Thus there exists y ∈ [0, 1] such that g(y) = 0; i.e., f (y) = y. 

EXAMPLES 4.2.15 (a) In the proof of Theorem 4.2.11 continuity of the
function f was required. The following example shows that the converse of
Theorem 4.2.11 is false; that is, if a function f satisfies the intermediate value
property on an interval [a, b], this does not imply that f is continuous on [a, b].
Let f be defined on [0, 2/π] as follows:
\[ f(x) = \begin{cases} \sin\dfrac{1}{x}, & 0 < x \le \dfrac{2}{\pi}, \\ -1, & x = 0. \end{cases} \]
Then f(0) = −1, f(2/π) = 1, and for every γ, −1 < γ < 1, there exists an
x ∈ (0, 2/π) such that f(x) = γ. However, the function f is not continuous at
x = 0 (see Figure 4.3).
(b) In this example, we show that the conclusion of the intermediate value
theorem is false if the interval [a, b] of real numbers is replaced by an interval
of rational numbers. Let E = {x ∈ Q : 0 ≤ x ≤ 2}, and let f(x) = x². Then f
is continuous on E with f(0) < 2 < f(2). However, there does not exist r ∈ E
such that f(r) = 2. □

Exercises 4.2
1. For each of the following, determine whether the given function is
continuous at the indicated point x_o.
a. f(x) = (2x² − 5x − 3)/(x − 3) for x ≠ 3, f(3) = 6, at x_o = 3.
b. h(x) = (√x − 2)/(x − 4) for x ≠ 4, h(4) = 4, at x_o = 4.
*c. g(x) = (1 − cos x)/x for x ≠ 0, g(0) = 0, at x_o = 0.
*d. k(x) = x² for x ≤ 2, k(x) = 4 − x for x > 2, at x_o = 2.
2. Let f : R → R be defined by
f(x) = 8x when x is rational, and f(x) = 2x² + 8 when x is irrational.
a. Prove, using ε and δ, that f is continuous at 2.
*b. Is f continuous at 1? Justify your answer.
3. Let f : R → R be defined by
f(x) = x² for x ∈ Q, and f(x) = x + 2 for x ∉ Q.
Find all points (if any) where f is continuous.

4. *Prove (without using Example 4.2.7) that f(x) = √x is continuous on
[0, ∞).
5. Define f : (0, 1] → R by f(x) = 1/√x − √((x + 1)/x).
a. Justify that f is continuous on (0, 1].
*b. Can one define f(0) so that f is continuous on [0, 1]?
6. Let E ⊂ R, and suppose f : E → R is continuous at p ∈ E.
*a. Prove that |f| is continuous at p. Is the converse true?
b. Set g(x) = √|f(x)|. Prove that g is continuous at p.
7. *Let E be a subset of a metric space X, and suppose f : E → R is
continuous on E. Prove that f^n defined by f^n(x) = (f(x))^n is continuous
on E for each n ∈ N.
8. *a. Prove that f (x) = cos x is continuous on R.
b. If E ⊂ R and f : E → R is continuous on E, prove that g(x) =
cos(f (x)) is continuous on E.
9. For each of the following equations, determine the largest subset E of R
such that the given equation defines a continuous function on E. In each
case state which theorems or examples are used to show that the function
is continuous on E.
*a. f(x) = (x³ + 4x − 5)/(x(x² − 4)).
b. g(x) = sin(1/x).
*c. h(x) = 1/√(x² + 1).
d. k(x) = (cos x)/(sin x).
10. Prove that the function f defined by f(x) = x² for x ≤ 2 and f(x) = 6 − x
for x > 2 is continuous on R and that Range f = R.
11. As in Example 4.2.7, use Theorem 4.2.6 to prove that each of the following
functions is continuous on the given domain.
a. f(x) = 1/x, Dom f = (0, ∞).
b. g(x) = x², Dom g = R.
12. *a. Let f : R² → R² be defined by f(x, y) = (x + y, x − y). Show that f is continuous on R².
b. Let E be a subset of R, and suppose f, g are continuous real-valued functions on E. Prove that h(x, y) = (f(x), g(y)) is continuous on E × E.
13. Prove that the function f defined on D = R² \ {(0, 0)} by
\[ f(x, y) = \frac{xy}{x^2 + y^2} \]
is continuous and bounded on D.
14. Suppose E is a subset of R and f, g : E → R are continuous at p ∈ E.
Prove that each of the functions defined below is continuous at p.
*a. max{f, g}(x) = max{f (x), g(x)}, x ∈ E.
b. min{f, g}(x) = min{f(x), g(x)}, x ∈ E.
c. $f^+(x) = \max\{f(x), 0\}$.
15. Prove that there exists x ∈ (0, π/2) such that cos x = x.
16. *Use the intermediate value theorem to prove that every polynomial of
odd degree has at least one real root.
17. *Suppose f : [−1, 1] → R is continuous and satisfies f (−1) = f (1). Prove
that there exists γ ∈ [0, 1] such that f (γ) = f (γ − 1).
18. Suppose f : [0, 1] → R is continuous and satisfies f(0) = f(1). Prove that there exists γ ∈ [0, 1/2] such that f(γ) = f(γ + 1/2).
19. *Let E ⊂ R and let f : E → R be continuous. Let F = {x ∈ E : f (x) =
0}. Prove that F is closed in E. Is F necessarily closed in R?
20. Suppose f : (0, 1) → R is continuous and satisfies f (r) = 0 for each
rational number r ∈ (0, 1). Prove that f (x) = 0 for all x ∈ (0, 1).
21. Let E ⊂ R and let f be a real-valued function on E that is continuous
at p ∈ E. If f (p) > 0, prove that there exists an α > 0 and a δ > 0 such
that f (x) ≥ α for all x ∈ Nδ (p) ∩ E.
22. *Let f : E → R be continuous at p ∈ E. Prove that there exists a positive
constant M and δ > 0 such that |f (x)| ≤ M for all x ∈ E ∩ Nδ (p).
23. Let f : (0, 1) → R be defined by
\[ f(x) = \begin{cases} 0, & \text{if } x \text{ is irrational}, \\ n, & \text{if } x \text{ is rational with } x = \dfrac{m}{n} \text{ in lowest terms}. \end{cases} \]
a. Prove that f is unbounded on every open interval I ⊂ (0, 1).
b. Use (a) and the previous exercise to conclude that f is discontinuous
at every point of (0, 1).
24. Suppose E is a subset of R and f, g : E → R are continuous on E. Show
that {x ∈ E : f (x) > g(x)} is open in E.
25. *Let K be a compact subset of a metric space X and let f : K → R be
continuous on K. Prove that f (K) is compact by showing that f (K) is
closed and bounded.
26. Let E ⊂ R and let f be a real-valued function on E. Prove that f is continuous on E if and only if $f^{-1}(F)$ is closed in E for every closed subset F of R.
27. Let A, B ⊂ R and let f : A → R and g : B → R be functions such that
Range f ⊂ B.
*a. If V ⊂ R, prove that $(g \circ f)^{-1}(V) = f^{-1}(g^{-1}(V))$.
b. If f and g are continuous on A and B respectively, use Theorem 4.2.6 to prove that g ∘ f is continuous on A.
28. Suppose I is a connected subset of R and f : I → R is continuous on
I. Prove, using only the properties of continuity and the definition of
connected set, that f (I) is connected.
29. *Let K be a compact subset of a metric space X, and let f be a real-valued function on K. Suppose that for each x ∈ K there exists $\epsilon_x > 0$ such that f is bounded on $N_{\epsilon_x}(x) \cap K$. Prove that f is bounded on K.
30. Let A ⊂ R. For p ∈ R, the distance from p to the set A, denoted d(p, A), is defined by d(p, A) = inf{|p − x| : x ∈ A}.
a. Prove that d(p, A) = 0 if and only if $p \in \overline{A}$ (the closure of A).
b. For x, y ∈ R, prove that |d(x, A) − d(y, A)| ≤ |x − y|.
c. Prove that the function x → d(x, A) is continuous on R.
d. If A, B are disjoint closed subsets of R, prove that
\[ f(x) = \frac{d(x, A)}{d(x, A) + d(x, B)} \]
is a continuous function on R satisfying 0 ≤ f(x) ≤ 1 for all x ∈ R, and
\[ f(x) = \begin{cases} 0, & x \in A, \\ 1, & x \in B. \end{cases} \]
31. Let f be a continuous real-valued function on R satisfying f(0) = 1 and f(x + y) = f(x)f(y) for all x, y ∈ R. Prove that $f(x) = a^x$ for some a ∈ R, a > 0.

4.3 Uniform Continuity


In the previous section we discussed continuity of a function at a point and on
a set. By Definition 4.2.1, a function f : E → R is continuous on E if for each
p ∈ E, given any ǫ > 0, there exists a δ > 0 such that |f (x) − f (p)| < ǫ for all
x ∈ E ∩Nδ (p). In general, for a given ǫ > 0, the choice of δ that works depends
not only on ǫ and the function f , but also on the point p. This was illustrated
in Example 4.1.2(f) for the function f (x) = 1/x, x ∈ (0, 1). Functions for
which a choice of δ independent of p is possible are given a special name.

DEFINITION 4.3.1 Let E be a subset of a metric space (X, d) and f : E → R. The function f is uniformly continuous on E if given ε > 0, there exists a δ > 0 such that
|f(x) − f(y)| < ε for all x, y ∈ E with d(x, y) < δ.

The key point in the definition of uniform continuity is that the choice
of δ must depend only on ǫ, the function f , and the set E; it has to be inde-
pendent of any x ∈ E. To illustrate this, we consider the following examples.

EXAMPLES 4.3.2 (a) If E is a bounded subset of R, then f(x) = x² is uniformly continuous on E. Since E is bounded, there exists a constant C > 0 such that |x| ≤ C for all x ∈ E. If x, y ∈ E, then
\[ |f(x) - f(y)| = |x^2 - y^2| = |x + y|\,|x - y| \le (|x| + |y|)\,|x - y| \le 2C\,|x - y|. \]
Let ε > 0 be given. Take δ = ε/(2C). If x, y ∈ E with |x − y| < δ, then by the above
\[ |f(x) - f(y)| \le 2C\,|x - y| < 2C\delta = \epsilon. \]
Therefore f is uniformly continuous on E. In this example, the choice of δ depends on both ε and the set E. In the exercises you will be asked to show that this result is false if the set E is an unbounded interval.
(b) Let f (x) = sin x. As in Example 4.2.2(f),
|f (y) − f (x)| ≤ |y − x|
for all x, y ∈ R. Consequently, f is uniformly continuous on R.
(c) In this example, we show that the function f(x) = 1/x, x ∈ (0, 1), is not uniformly continuous on (0, 1). Suppose on the contrary that f is uniformly continuous on (0, 1). Then if we take ε = 1, there exists a δ > 0 such that
\[ |f(x) - f(y)| = \left| \frac{1}{x} - \frac{1}{y} \right| < 1 \]
for all x, y ∈ (0, 1) with |x − y| < δ. Since any smaller δ will also work, we can assume that δ < 1. Then for any x ∈ (0, 1/2), y = x + δ/2 is in (0, 1) and satisfies |x − y| < δ. Thus
\[ |f(x) - f(y)| = \frac{\tfrac{1}{2}\delta}{x\left(x + \tfrac{1}{2}\delta\right)} < 1. \]
Since x + δ/2 < 1 for all x ∈ (0, 1/2), we have
\[ \tfrac{1}{2}\delta < x\left(x + \tfrac{1}{2}\delta\right) < x. \]
Thus x > δ/2 for all x ∈ (0, 1/2), which is a contradiction. The function f(x) = 1/x however is uniformly continuous on [a, ∞) for any fixed a > 0 (Exercise 4(a)). □
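The failure of uniform continuity in Example 4.3.2(c) can also be observed numerically. The following Python sketch is only an illustration; the function name `witness_pair` and the particular choice x = δ/2 are assumptions made here, not part of the text. For a given δ < 1 it returns two points of (0, 1) that are closer than δ, yet whose images under 1/x differ by 1/δ > 1.

```python
def witness_pair(delta):
    """For 0 < delta < 1, return (x, y) in (0, 1) with |x - y| < delta
    but |1/x - 1/y| >= 1, so this delta fails for epsilon = 1."""
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    x = delta / 2          # a point close to 0 (illustrative choice)
    y = x + delta / 2      # the companion point y = x + delta/2 from the example
    return x, y


for delta in (0.5, 0.1, 0.001):
    x, y = witness_pair(delta)
    gap = abs(1 / x - 1 / y)
    print(f"delta={delta}: |x - y|={abs(x - y):.4g}, |1/x - 1/y|={gap:.4g}")
```

However small δ is chosen, the printed gap stays above 1, which is exactly what rules out a δ that works for ε = 1.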

Lipschitz Functions
Both of the functions in Example 4.3.2 (a) and (b) are examples of an extensive
class of functions. If E is a subset of a metric space (X, d), a function f : E →
R satisfies a Lipschitz condition on E if there exists a positive constant M
such that
|f (x) − f (y)| ≤ M d(x, y)
for all x, y ∈ E. Functions satisfying the above inequality are usually referred
to as Lipschitz functions. As we will see in the next chapter, functions for
which the derivative is bounded are Lipschitz functions. As a consequence
of the following theorem, every Lipschitz function is uniformly continuous.
However, not every uniformly continuous function is a Lipschitz function. For example, the function f(x) = √x is uniformly continuous on [0, ∞), but f does not satisfy a Lipschitz condition on [0, ∞) (see Exercise 5).
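To see informally why the square root function cannot be Lipschitz near 0, one can tabulate the quotient |f(x) − f(y)| / |x − y|. The Python sketch below is a numerical illustration only (the helper name `difference_quotient` is an assumption introduced here); it does not replace the proof requested in Exercise 5.

```python
import math


def difference_quotient(x, y):
    """Return |sqrt(x) - sqrt(y)| / |x - y| for distinct x, y >= 0."""
    return abs(math.sqrt(x) - math.sqrt(y)) / abs(x - y)


# With y = 0 the quotient equals 1/sqrt(x), which grows without bound
# as x -> 0+, so no single constant M can serve on [0, infinity).
for x in (1e-2, 1e-4, 1e-6):
    print(x, difference_quotient(x, 0.0))

# Away from 0 the quotient is 1/(sqrt(x) + sqrt(y)), which is at most
# 1/(2*sqrt(a)) when x, y >= a > 0.
a = 0.25
print(difference_quotient(a, a + 1e-3) <= 1 / (2 * math.sqrt(a)))
```

The first loop shows quotients of roughly 10, 100, and 1000; the last line illustrates that on [a, ∞) with a > 0 the quotient remains bounded.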
THEOREM 4.3.3 Suppose E is a subset of a metric space (X, d) and f : E → R. If there exists a positive constant M such that

|f (x) − f (y)| ≤ M d(x, y)

for all x, y ∈ E, then f is uniformly continuous on E.

Proof. Exercise 1. 

Uniform Continuity Theorem


If the function f does not satisfy a Lipschitz condition on E, then determining whether f is uniformly continuous on E can be much more difficult. The following theorem provides a sufficient condition on the set E such that every continuous real-valued function on E is uniformly continuous.

THEOREM 4.3.4 If K is compact and f : K → R is continuous on K, then f is uniformly continuous on K.

Proof. Let ε > 0 be given. Since f is continuous, for each p ∈ K, there exists a $\delta_p > 0$ such that
\[ |f(x) - f(p)| < \frac{\epsilon}{2} \tag{3} \]
for all $x \in K \cap N_{\delta_p}(p)$.
The collection $\{N_{\delta_p/2}(p)\}_{p \in K}$ is an open cover of K. Since K is compact, a finite number of these will cover K. Thus there exist a finite number of points $p_1, ..., p_n$ in K such that
\[ K \subset \bigcup_{i=1}^{n} N_{\delta_{p_i}/2}(p_i). \]
Let
\[ \delta = \tfrac{1}{2} \min\{\delta_{p_i} : i = 1, ..., n\}. \]
Then δ > 0. Suppose x, y ∈ K with d(x, y) < δ. Since x ∈ K, $x \in N_{\delta_{p_i}/2}(p_i)$ for some i. Furthermore, since $d(x, y) < \delta \le \delta_{p_i}/2$,
\[ x, y \in N_{\delta_{p_i}}(p_i). \]
Thus by the triangle inequality and (3),
\[ |f(x) - f(y)| \le |f(x) - f(p_i)| + |f(p_i) - f(y)| < \tfrac{1}{2}\epsilon + \tfrac{1}{2}\epsilon = \epsilon. \] □

COROLLARY 4.3.5 A continuous real-valued function on a closed and bounded interval [a, b] is uniformly continuous.

The definition of uniform continuity as well as the proof of Corollary 4.3.5 appeared in a paper by Eduard Heine in 1872.

EXAMPLE 4.3.6 In this example, we show that both the properties closed and bounded are required. The interval [0, ∞) is closed, but not bounded. The function f(x) = x² is continuous on [0, ∞), but not uniformly continuous on [0, ∞) (Exercise 2). On the other hand, the interval (0, 1) is bounded, but not closed. The function f(x) = 1/x is continuous on (0, 1), but f is not uniformly continuous on (0, 1). □
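The first half of Example 4.3.6 can be explored numerically. In the Python sketch below, the name `square_gap` and the choice x = 1/δ, y = x + δ/2 are assumptions made for illustration (one convenient choice among many, not the text's own argument). The two points are closer than δ, but their squares differ by 1 + δ²/4.

```python
def square_gap(delta):
    """Return |y^2 - x^2| for x = 1/delta and y = x + delta/2, so |x - y| < delta."""
    x = 1.0 / delta
    y = x + delta / 2
    return abs(y * y - x * x)


# Algebraically the gap equals 1 + delta^2/4, so it never drops below 1
# no matter how small delta is chosen.
for delta in (1.0, 0.1, 0.01):
    print(delta, square_gap(delta))
```

Because the points are allowed to move farther out as δ shrinks, this obstruction appears only on an unbounded set, in agreement with Example 4.3.2(a).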

Exercises 4.3
1. Prove Theorem 4.3.3.
2. Show that the following functions are not uniformly continuous on the given domain.
*a. $f(x) = x^2$, Dom f = [0, ∞) b. $g(x) = \dfrac{1}{x^2}$, Dom g = (0, ∞)
*c. $h(x) = \sin\dfrac{1}{x}$, Dom h = (0, ∞) d. $k(x) = \dfrac{1}{\sin x}$, Dom k = (0, π)
3. Prove that each of the following functions is uniformly continuous on the indicated set.
*a. $f(x) = \dfrac{x}{1 + x}$, x ∈ [0, ∞) b. $g(x) = x^2$, x ∈ N.
*c. $h(x) = \dfrac{1}{x^2 + 1}$, x ∈ R. d. $k(x) = \cos x$, x ∈ R
e. $e(x) = \dfrac{x^2}{x + 1}$, x ∈ (0, ∞) *f. $f(x) = \dfrac{\sin x}{x}$, x ∈ (0, 1)
4. Show that each of the following functions is a Lipschitz function.
*a. $f(x) = \dfrac{1}{x}$, Dom f = [a, ∞), a > 0.
b. $g(x) = \dfrac{x}{x^2 + 1}$, Dom g = [0, ∞).
*c. $h(x) = \sin\dfrac{1}{x}$, Dom h = [a, ∞), a > 0.
d. p(x) a polynomial, Dom p = [−a, a], a > 0.
5. *a. Show that f(x) = √x satisfies a Lipschitz condition on [a, ∞), a > 0.
b. Prove that √x is uniformly continuous on [0, ∞).
*c. Show that f does not satisfy a Lipschitz condition on [0, ∞).
6. Suppose E ⊂ R and f, g are Lipschitz functions on E.
a. Prove that f + g is a Lipschitz function on E.
b. If in addition f and g are bounded on E, prove that f g is a Lipschitz
function on E.
7. Suppose E ⊂ R, and f, g are uniformly continuous real valued functions
on E.
a. Prove that f + g is uniformly continuous on E.
*b. If in addition f and g are bounded, prove that fg is uniformly continuous on E.
c. Is (b) still true if only one of the two functions is bounded?
8. Suppose E is a subset of a metric space X and f : E → R is uniformly
continuous. If {xn } is a Cauchy sequence in E, prove that {f (xn )} is a
Cauchy sequence.
9. Let f : (a, b) → R be uniformly continuous on (a, b). Use the previous exercise to show that f can be defined at a and b such that f is continuous on [a, b].
10. *Suppose that E is a bounded subset of R and f : E → R is uniformly
continuous on E. Prove that f is bounded on E.
11. Suppose −∞ ≤ a < c < b ≤ ∞, and suppose f : (a, b) → R is continuous
on (a, b).
a. If f is uniformly continuous on (a, c) and also uniformly continuous on (c, b), prove that f is uniformly continuous on (a, b).
b. Show by example that the conclusion in (a) may be false if f is not
continuous on (a, b).
12. Let a ∈ R. Suppose f is a real-valued function on [a, ∞) satisfying $\lim_{x \to \infty} f(x) = L$, where L ∈ R. Prove that
*a. f is bounded on [a, ∞), and
b. f is uniformly continuous on [a, ∞).
13. Let E ⊂ R. A function f : E → E is contractive if there exists a
constant b, 0 < b < 1, such that |f (x) − f (y)| ≤ b |x − y|.
*a. If E is closed, and f : E → E is contractive, prove that there exists a unique point $x_o \in E$ such that $f(x_o) = x_o$. (Such a point $x_o$ is called a fixed point of f.)
b. Let E = (0, 1/3]. Show that f(x) = x² is contractive on E, but that f does not have a fixed point in E.
14. A function f : R → R is periodic if there exists p ∈ R such that
f (x + p) = f (x) for all x ∈ R. Prove that a continuous periodic function
on R is bounded and uniformly continuous on R.

4.4 Monotone Functions and Discontinuities


In this section, we take a closer look both at limits and continuity for real-
valued functions defined on an interval I ⊂ R. More specifically however,
we will be interested in classifying the types of discontinuities which such a
function may have. We will also investigate properties of monotone functions
defined on an interval I. These functions will play a crucial role in Chapter 6 on Riemann-Stieltjes integration. First however we begin with the right and left limit of a real-valued function defined on a subset E of R.

Right and Left Limits

DEFINITION 4.4.1 Let E ⊂ R and let f be a real-valued function defined on E. Suppose p is a limit point of E ∩ (p, ∞). The function f has a right limit at p if there exists a number L ∈ R such that given any ε > 0, there exists a δ > 0 for which

|f(x) − L| < ε for all x ∈ E satisfying p < x < p + δ.

The right limit of f, if it exists, is denoted by f(p+), and we write
\[ f(p+) = \lim_{x \to p^+} f(x) = \lim_{\substack{x \to p \\ x > p}} f(x). \]
Similarly, if p is a limit point of E ∩ (−∞, p), the left limit of f at p, if it exists, is denoted by f(p−), and we write
\[ f(p-) = \lim_{x \to p^-} f(x) = \lim_{\substack{x \to p \\ x < p}} f(x). \]

The hypothesis that p is a limit point of E ∩ (p, ∞) guarantees that for every δ > 0, E ∩ (p, p + δ) ≠ ∅. If E is an open interval (a, b), −∞ < a < b ≤ ∞, then any p satisfying a ≤ p < b is a limit point of E ∩ (p, ∞). Similarly, if −∞ ≤ a < b < ∞, then any p satisfying a < p ≤ b is a limit point of (−∞, p) ∩ E. If I is any interval with Int(I) ≠ ∅, and f : I → R, then f has a limit at p ∈ Int(I) if and only if
(a) f(p+) and f(p−) both exist, and
(b) f(p+) = f(p−).
The hypothesis that p ∈ Int(I) guarantees that p is a limit point of both (−∞, p) ∩ I and also I ∩ (p, ∞). If p is a left endpoint of the interval I, then the right limit of f at p coincides with the limit of f at p. The analogous statement is also true if p is a right endpoint of I.
We also define right and left continuity of a function at a point p as follows:

DEFINITION 4.4.2 Let E ⊂ R and let f be a real-valued function on E. The function f is right continuous (left continuous) at p ∈ E if for any ε > 0, there exists a δ > 0 such that

|f(x) − f(p)| < ε for all x ∈ E with p ≤ x < p + δ (p − δ < x ≤ p).

Remarks. If p ∈ E is an isolated point of E or is not a limit point of E ∩ (p, ∞), then there exists a δ > 0 such that E ∩ (p, p + δ) = ∅. Thus if f : E → R and ε > 0 is arbitrary, then |f(x) − f(p)| < ε for all x ∈ E ∩ [p, p + δ). Thus every f : E → R is right continuous at p. In particular, if E is a closed interval [a, b], then every f : [a, b] → R is right continuous at b. Also, f is left continuous at b if and only if f is continuous at b.
The following theorem, the proof of which is left to the exercises, is an
immediate consequence of the definitions.

THEOREM 4.4.3 A function f : (a, b) → R is right continuous at p ∈ (a, b) if and only if f(p+) exists and equals f(p). Similarly, f is left continuous at p if and only if f(p−) exists and equals f(p).
Proof. Exercise 1. 

Types of Discontinuities
By the previous theorem a function f is continuous at p ∈ (a, b) if and only if
(a) f (p+) and f (p−) both exist, and
(b) f (p+) = f (p−) = f (p).
A real-valued function f defined on an interval I can fail to be continuous at a point $p \in \overline{I}$ (the closure of I) for several reasons. One possibility is that $\lim_{x \to p} f(x)$ exists but either does not equal f(p), or f is not defined at p. Such a function can easily be made continuous at p by either defining, or redefining, f at p as follows:
\[ f(p) = \lim_{x \to p} f(x). \]

For this reason, such a discontinuity is called a removable discontinuity. For example, the function
\[ g(x) = \begin{cases} \dfrac{x^2 - 4}{x - 2}, & x \neq 2, \\ 2, & x = 2, \end{cases} \]
of Example 4.2.2(a) is not continuous at 2 since
\[ \lim_{x \to 2} g(x) = 4 \neq g(2). \]
By redefining g such that g(2) = 4, the resulting function is then continuous at 2. Another example is given by $f(x) = x \sin\dfrac{1}{x}$, x ∈ (0, ∞), which is not defined at 0. If we define f on [0, ∞) by
\[ f(x) = \begin{cases} 0, & x = 0, \\ x \sin\dfrac{1}{x}, & x > 0, \end{cases} \]
then by Example 4.2.2(e), f is now continuous at 0.
Another possibility is that f (p+) and f (p−) both exist, but are not equal.
This type of discontinuity is called a jump discontinuity. (See Figure 4.9)
DEFINITION 4.4.4 Let f be a real-valued function defined on an interval I. The function f has a jump discontinuity at p ∈ Int(I) if f(p+) and f(p−) both exist, but f is not continuous at p. If p ∈ I is a left (right) endpoint of I, then f has a jump discontinuity at p if f(p+) (f(p−)) exists, but f is not continuous at p.

FIGURE 4.9
Jump discontinuity of f at p

Jump discontinuities are also referred to as simple discontinuities, or discontinuities of the first kind. All other discontinuities are said to be of second kind.
If f(p+) and f(p−) both exist, but f is not continuous at p, then either
(a) f(p+) ≠ f(p−), or
(b) f(p+) = f(p−) ≠ f(p).
In case (a), f has a jump discontinuity at p, whereas in case (b), the discontinuity is removable. All discontinuities for which f(p+) or f(p−) does not exist are discontinuities of the second kind.

EXAMPLES 4.4.5 (a) Let f be defined by
\[ f(x) = \begin{cases} x, & 0 < x \le 1, \\ 3 - x^2, & x > 1. \end{cases} \]
If x < 1, then f(x) = x. Therefore,
\[ f(1-) = \lim_{x \to 1^-} f(x) = \lim_{x \to 1} x = 1 = f(1). \]
Likewise, the right limit of f at 1 is
\[ f(1+) = \lim_{x \to 1^+} f(x) = \lim_{x \to 1} (3 - x^2) = 2. \]
Therefore f(1−) = f(1) = 1, and f(1+) = 2. Thus f is left continuous at 1, but not continuous. Since both right and left limits exist at 1, but are not equal, the function f has a jump discontinuity at 1.
(b) Let [x] denote the greatest integer function, that is, for each x, [x] = largest integer n which is less than or equal to x. For example, [2.9] = 2, [3.1] = 3, and [−1.5] = −2. The graph of y = [x] is given in Figure 4.10. It is clear that for each n ∈ Z,
\[ \lim_{x \to n^-} [x] = n - 1, \quad \text{and} \quad \lim_{x \to n^+} [x] = n. \]
Thus f(x) = [x] has a jump discontinuity at each n ∈ Z. Also, since f(n) = [n] = n, f is right continuous at each integer. Finally, since f is constant on each interval (n − 1, n), n ∈ Z, f is continuous at every x ∈ R \ Z.

FIGURE 4.10
Graph of [x]
(c) Let f be defined on R by
\[ f(x) = \begin{cases} 0, & \text{if } x \le 0, \\ \sin\dfrac{1}{x}, & \text{if } x > 0. \end{cases} \]
Then f(0−) = 0, but f(0+) does not exist. Thus the discontinuity is of second kind.
(d) Consider the function g : R → R defined by
\[ g(x) = \sin(2\pi x[x]). \]
For x ∈ (n, n + 1), n ∈ Z, x[x] = nx, and thus g(x) is continuous on every interval (n, n + 1), n ∈ Z. On the other hand, for n ∈ Z,
\[ \lim_{x \to n^+} \sin(2\pi x[x]) = \sin(2\pi n^2) = 0, \quad \text{and} \]
\[ \lim_{x \to n^-} \sin(2\pi x[x]) = \sin(2\pi n(n - 1)) = 0. \]
Since g(n) = sin(2πn²) = 0, g is also continuous at each n ∈ Z. The function g, however, is not uniformly continuous on R (Exercise 7). □
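The one-sided behavior described in Examples 4.4.5(b) and (d) can be sampled numerically. The Python sketch below is illustrative only: the helper `one_sided_values` and the offset h are assumptions introduced here, and evaluating a function at a fixed small distance from a point merely suggests the values of f(p−) and f(p+); it does not prove that the limits exist.

```python
import math


def one_sided_values(f, p, h=1e-9):
    """Sample f just to the left and just to the right of p."""
    return f(p - h), f(p + h)


def g(x):
    """The function g(x) = sin(2*pi*x*[x]) of Example 4.4.5(d)."""
    return math.sin(2 * math.pi * x * math.floor(x))


# math.floor plays the role of the greatest integer function [x].
for n in (-2, 0, 3):
    print(n, one_sided_values(math.floor, n), one_sided_values(g, n))
```

For the greatest integer function the two samples at an integer n are n − 1 and n, reflecting the jump; for g both samples are close to 0 = g(n), consistent with continuity at the integers.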

Monotone Functions

DEFINITION 4.4.6 Let f be a real-valued function defined on an interval I.
(a) f is monotone increasing (increasing, nondecreasing) on I if
f (x) ≤ f (y) for all x, y ∈ I with x < y.
(b) f is monotone decreasing (decreasing, nonincreasing) on I if
f (x) ≥ f (y) for all x, y ∈ I with x < y.
(c) f is monotone on I if f is monotone increasing on I or monotone
decreasing on I.

A function f is strictly increasing on I if f(x) < f(y) for all x, y ∈ I with x < y. The concept of strictly decreasing is defined similarly. Also, f is strictly monotone on I if f is strictly increasing on I or strictly decreasing on I. Our main result for monotone functions is as follows:

THEOREM 4.4.7 Let I ⊂ R be an open interval and let f : I → R be monotone increasing on I. Then f(p+) and f(p−) exist for every p ∈ I and
\[ \sup_{x < p} f(x) = f(p-) \le f(p) \le f(p+) = \inf_{p < x} f(x). \]
Furthermore, if p < q, p, q ∈ I, then f(p+) ≤ f(q−).

Although we stated the theorem for monotone increasing functions, a similar statement is also valid for monotone decreasing functions.
Proof. Fix p ∈ I. Since f is increasing on I, {f(x) : x < p, x ∈ I} is bounded above by f(p). Let
\[ A = \sup\{f(x) : x < p,\ x \in I\}. \]
Then A ≤ f(p). We now show that
\[ \lim_{x \to p^-} f(x) = A. \]
The proof of this is similar to the proof of Theorem 3.3.2. Let ε > 0 be given. Since A is the least upper bound of {f(x) : x < p}, there exists $x_o < p$ such that
\[ A - \epsilon < f(x_o) \le A. \]
Thus if $x_o < x < p$, then $A - \epsilon < f(x_o) \le f(x) \le A$. Therefore,
\[ |f(x) - A| < \epsilon \quad \text{for all } x,\ x_o < x < p. \]
Thus by definition, $\lim_{x \to p^-} f(x) = A$. Similarly
\[ f(p) \le f(p+) = \inf\{f(x) : p < x,\ x \in I\}. \]
Finally, suppose p < q. Then
\[ f(p+) = \inf\{f(x) : x > p,\ x \in I\} \le \inf\{f(x) : p < x < q\} \le \sup\{f(x) : p < x < q\} \le \sup\{f(x) : x < q,\ x \in I\} = f(q-). \] □

COROLLARY 4.4.8 If f is monotone on an open interval I, then the set of discontinuities of f is at most countable.

Proof. Let E = {p ∈ I : f is discontinuous at p}. Suppose f is monotone increasing on I. Then

p ∈ E if and only if f(p−) < f(p+).

For each p ∈ E, choose $r_p \in \mathbb{Q}$ such that
\[ f(p-) < r_p < f(p+). \]
If p < q, then f(p+) ≤ f(q−). Therefore, if p, q ∈ E with p < q, then $r_p < f(p+) \le f(q-) < r_q$, so $r_p \neq r_q$, and thus the function $p \to r_p$ is a one-to-one map of E into Q. Therefore E is equivalent to a subset of Q and thus is at most countable. □

Construction of Monotone Functions with Prescribed Discontinuities
We now proceed to show that given any finite or countable subset A of (a, b),
there exists a monotone increasing function f on [a, b] that is discontinuous
at each x ∈ A and continuous on [a, b] \ A. We first illustrate how this is
accomplished for the case where A = {a1 , ..., an } is a finite subset of (a, b). To
facilitate this construction we define the unit jump function I on R as follows:

DEFINITION 4.4.9 The unit jump function I : R → R is defined by
\[ I(x) = \begin{cases} 0, & \text{when } x < 0, \\ 1, & \text{when } x \ge 0. \end{cases} \]
The function I is right continuous at 0 with I(0+) = I(0) = 1 and I(0−) = 0. For each k = 1, ..., n, let
\[ I_k(x) = I(x - a_k) = \begin{cases} 0, & \text{when } x < a_k, \\ 1, & \text{when } x \ge a_k. \end{cases} \]

FIGURE 4.11
Graph of I(x − a)

Then Ik has a unit jump at each ak and is right continuous at ak (see Fig-
ure 4.11).
Suppose $\{c_1, ..., c_n\}$ are positive real numbers. Define f on [a, b] by
\[ f(x) = \sum_{k=1}^{n} c_k I(x - a_k). \]
The reader should verify that the function f is
(a) monotone increasing on [a, b],
(b) continuous on [a, b] \ {a1 , a2 , ..., an },
(c) right continuous at each ak , k = 1, 2, ..., n, and
(d) discontinuous at each ak with f (ak +) − f (ak −) = ck for all k =
1, 2, ..., n.
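Properties (a) through (d) can be spot-checked on a small instance. The following Python sketch is a hedged illustration: the names `unit_jump` and `make_jump_function`, the jump points, the weights, and the sampling offset h are all choices made here rather than data from the text, and sampling at finitely many points is evidence, not a proof.

```python
def unit_jump(x):
    """The unit jump function I: 0 for x < 0 and 1 for x >= 0."""
    return 1.0 if x >= 0 else 0.0


def make_jump_function(points, weights):
    """Return f(x) = sum_k c_k * I(x - a_k) for jump points a_k and weights c_k > 0."""
    def f(x):
        return sum(c * unit_jump(x - a) for a, c in zip(points, weights))
    return f


# Illustrative data (not from the text): three jump points in (0, 1).
points = [0.25, 0.5, 0.75]
weights = [1.0, 2.0, 0.5]
f = make_jump_function(points, weights)

h = 1e-9  # small offset used to sample f on either side of a jump
for a, c in zip(points, weights):
    # Columns: a_k, value just left of a_k, f(a_k), value just right of a_k, c_k
    print(a, f(a - h), f(a), f(a + h), c)
```

In each printed row the value at a_k agrees with the value just to the right (right continuity), and the difference between the right and left samples equals the weight c_k.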
That such a function exists for any finite subset {a1 , ..., an } of (a, b) is not
surprising. However, that such a function exists for any countable subset A of
(a, b) may take some convincing, especially if one takes A to be dense in [a, b];
e.g., the rational numbers in (a, b).
THEOREM 4.4.10 Let a, b ∈ R with a < b, and let $\{x_n\}_{n \in \mathbb{N}}$ be a countable subset of (a, b). Let $\{c_n\}_{n=1}^{\infty}$ be any sequence of positive real numbers such that $\sum_{n=1}^{\infty} c_n$ converges. Then there exists a monotone increasing function f on [a, b] such that
(a) f(a) = 0 and $f(b) = \sum_{n=1}^{\infty} c_n$,
(b) f is continuous on $[a, b] \setminus \{x_n : n = 1, 2, ...\}$,
(c) $f(x_n+) = f(x_n)$ for all n; i.e., f is right continuous at all $x_n$, and
(d) f is discontinuous at each $x_n$ with
\[ f(x_n) - f(x_n-) = c_n. \]

Proof. For each x ∈ [a, b], define
\[ f(x) = \sum_{n=1}^{\infty} c_n I(x - x_n). \]
Since $0 \le c_n I(x - x_n) \le c_n$ for each x ∈ [a, b], we have
\[ s_n(x) = \sum_{k=1}^{n} c_k I(x - x_k) \le \sum_{k=1}^{n} c_k \le \sum_{k=1}^{\infty} c_k. \]
Thus for each x ∈ [a, b], the sequence $\{s_n(x)\}$ of partial sums is monotone increasing and bounded above and hence by Theorem 3.7.6 converges. Since
\[ I(x - x_n) \le I(y - x_n), \quad n = 1, 2, ..., \]
for all x, y with x < y, f is monotone increasing on [a, b]. Furthermore, since $x_n > a$ for all n, $I(a - x_n) = 0$ for all n. Therefore f(a) = 0. Also, since $I(b - x_n) = 1$ for all n,
\[ f(b) = \sum_{k=1}^{\infty} c_k. \]
This proves (a).


We now prove (b). Fix p ∈ [a, b], $p \neq x_n$ for any n. Let $E = \{x_n : n \in \mathbb{N}\}$. There are two cases to consider.
(i) Suppose p is not a limit point of E. If this is the case, there exists a δ > 0 such that $N_\delta(p) \cap E = \emptyset$. Then
\[ I(x - x_k) = I(p - x_k) \quad \text{for all } x \in (p - \delta, p + \delta) \]
and all k = 1, 2, .... Thus f is constant on (p − δ, p + δ) and hence continuous.
(ii) Suppose p is a limit point of E. Let ε > 0 be given. Since the series $\sum_{k=1}^{\infty} c_k$ converges, by the Cauchy criterion there exists a positive integer N such that
\[ \sum_{k=N+1}^{\infty} c_k < \epsilon. \]
Choose δ such that
\[ 0 < \delta < \min\{|p - x_n| : n = 1, 2, ..., N\}. \]
With this choice of δ, if $x_k \in N_\delta(p) \cap E$, we have k > N. Suppose p < x < p + δ. Then
\[ I(p - x_k) = I(x - x_k) \quad \text{for all } k = 1, 2, ..., N. \]
Furthermore, for any x > p, we always have
\[ 0 \le I(x - x_k) - I(p - x_k) \le 1, \quad \text{for all } k \in \mathbb{N}. \]
Therefore, for p < x < p + δ,
\[ 0 \le f(x) - f(p) \le \sum_{k=N+1}^{\infty} c_k \bigl(I(x - x_k) - I(p - x_k)\bigr) \le \sum_{k=N+1}^{\infty} c_k < \epsilon. \]
Thus f is right continuous at p. Similarly, f is left continuous at p, and therefore f is continuous at p.
For the proof of (c), fix an $x_n \in E$. If $x_n$ is an isolated point of E, then as above, there exists a δ > 0 such that $E \cap (x_n, x_n + \delta) = \emptyset$. Therefore $f(y) = f(x_n)$ for all y, $x_n < y < x_n + \delta$. Thus $f(x_n+) = f(x_n)$. Suppose $x_n$ is a limit point of E. Let ε > 0 be given. Again, choose a positive integer N so that
\[ \sum_{k=N+1}^{\infty} c_k < \epsilon. \]
As in (b), there exists a δ > 0 such that if $x_k \in (x_n, x_n + \delta) \cap E$, then k > N. Thus
\[ 0 \le f(y) - f(x_n) \le \sum_{k=N+1}^{\infty} c_k < \epsilon \quad \text{for all } y \in (x_n, x_n + \delta). \]
Therefore $f(x_n+) = f(x_n)$ and f is right continuous at each $x_n$.


For the proof of (d), suppose $y < x_n$. Again, if $x_n$ is an isolated point of E, there exists a δ > 0 such that $(x_n - \delta, x_n) \cap E = \emptyset$. Therefore, for all k ≠ n,
\[ I(y - x_k) = I(x_n - x_k) \quad \text{for all } y,\ x_n - \delta < y < x_n, \]
and for all $y < x_n$,
\[ 0 = I(y - x_n) \le I(x_n - x_n) = I(0) = 1. \]
Therefore,
\[ f(x_n) - f(y) = c_n \quad \text{for all } y,\ x_n - \delta < y < x_n. \]
Suppose $x_n$ is a limit point of E. Given ε > 0, choose N such that $\sum_{k=N+1}^{\infty} c_k < \epsilon$. For this N, choose δ > 0 such that if $x_k \in (x_n - \delta, x_n) \cap E$ then k > N. Then for all y ∈ [a, b] with $x_n - \delta < y < x_n$,
\[ c_n \le f(x_n) - f(y) \le c_n + \sum_{k=N+1}^{\infty} c_k < c_n + \epsilon. \]
Therefore, $f(x_n) - f(x_n-) = c_n$. □

EXAMPLES 4.4.11 (a) Take $c_n = 2^{-n}$, $x_n = 1 - 1/(n + 1)$, n = 1, 2, ..., (a, b) = (0, 1). As in Theorem 4.4.10 let
\[ f(x) = \sum_{n=1}^{\infty} c_n I(x - x_n). \]
In this example, the sequence $\{x_n\}$ satisfies $0 < x_1 < x_2 < \cdots < 1$. If $0 \le x < x_1 = \tfrac{1}{2}$, then $I(x - x_n) = 0$ for all n. Thus
\[ f(x) = 0, \quad x \in [0, \tfrac{1}{2}). \]
If $x_1 \le x < x_2 = \tfrac{2}{3}$, then $I(x - x_1) = 1$ and $I(x - x_k) = 0$ for all k ≥ 2. Therefore,
\[ f(x) = c_1 = \frac{1}{2}, \quad x \in [\tfrac{1}{2}, \tfrac{2}{3}). \]
If $x_2 \le x < x_3 = \tfrac{3}{4}$, then $I(x - x_k) = 1$ for k = 1, 2 and $I(x - x_k) = 0$ for k ≥ 3. Therefore,
\[ f(x) = c_1 + c_2 = \frac{1}{2} + \frac{1}{2^2} = \frac{3}{4}, \quad x \in [\tfrac{2}{3}, \tfrac{3}{4}), \]
and so forth. The graph of f is depicted in Figure 4.12.
(b) Let $c_n = 2^{-n}$ and let $\{x_n\}$ be an enumeration of the rationals in (0, 1). Theorem 4.4.10 guarantees the existence of a nondecreasing function on [0, 1], which is discontinuous at each rational number in (0, 1), and continuous at every irrational number in (0, 1).
(c) If in the proof of Theorem 4.4.10 we take $\{x_n\}_{n \in \mathbb{N}}$ to be a countable subset of R and choose $\{c_n\}_{n \in \mathbb{N}}$ ($c_n > 0$) such that $\sum_{n=1}^{\infty} c_n = 1$, then we obtain a nondecreasing real-valued function f on R satisfying $\lim_{x \to -\infty} f(x) = 0$ and $\lim_{x \to \infty} f(x) = 1$. (See Exercise 21.) Such a function is called a distribution function on R. Such functions arise naturally in probability theory. □
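The values computed in Example 4.4.11(a) can be reproduced with exact rational arithmetic. The Python sketch below truncates the series after a fixed number of terms; the function name `f_partial` and the cutoff of 60 terms are assumptions made for illustration, so the result is a partial sum $s_n(x)$ rather than f(x) itself, although the two agree whenever x lies below $x_{61}$.

```python
from fractions import Fraction


def f_partial(x, terms=60):
    """Partial sum of f(x) = sum_n c_n I(x - x_n) with c_n = 2^(-n), x_n = 1 - 1/(n + 1)."""
    total = Fraction(0)
    for n in range(1, terms + 1):
        x_n = 1 - Fraction(1, n + 1)
        if x >= x_n:                      # I(x - x_n) = 1 exactly when x >= x_n
            total += Fraction(1, 2 ** n)  # add the jump c_n
    return total


for x in (Fraction(1, 4), Fraction(1, 2), Fraction(7, 10), Fraction(3, 4)):
    print(x, f_partial(x))
```

The printed values 0, 1/2, 3/4, and 7/8 match the step heights on [0, 1/2), [1/2, 2/3), [2/3, 3/4), and [3/4, 4/5) respectively.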
FIGURE 4.12
Graph of $f(x) = \sum c_n I(x - x_n)$

Inverse Functions
Suppose f is a strictly increasing real-valued function on an interval I. Let x, y ∈ I with x ≠ y. If x < y, then since f is strictly increasing, f(x) < f(y). Similarly, if x > y then f(x) > f(y). Thus f(x) ≠ f(y) for any x, y ∈ I with x ≠ y. Therefore f is one-to-one and consequently has an inverse function $f^{-1}$ defined on f(I). In the following theorem we prove that if f is continuous on I then $f^{-1}$ is also continuous on f(I).

THEOREM 4.4.12 Let I ⊂ R be an interval and let f : I → R be strictly monotone and continuous on I. Then $f^{-1}$ is strictly monotone and continuous on J = f(I).

Proof. Without loss of generality we consider the case where f is strictly increasing on I. Since f is continuous, by Corollary 4.2.12 f(I) = J is an interval. Furthermore, since f is strictly increasing on I, f is a one-to-one function from I onto J. Hence $f^{-1}$ is a one-to-one function from J onto I.
Suppose $y_1, y_2 \in J$ with $y_1 < y_2$. Then there exist distinct points $x_1, x_2 \in I$ such that $f(x_i) = y_i$, i = 1, 2. Since f is strictly increasing, we have $x_1 < x_2$. Thus $f^{-1}(y_1) < f^{-1}(y_2)$, i.e., $f^{-1}$ is strictly increasing.
It remains to be shown that $f^{-1}$ is continuous on J. We first show that $f^{-1}$ is left continuous at each $y_o \in J$ for which $(-\infty, y_o) \cap J \neq \emptyset$. This last assumption only means that $y_o$ is not a left endpoint of J. Let $x_o \in I$ be such that $f(x_o) = y_o$. Then $(-\infty, x_o) \cap I \neq \emptyset$. Let ε > 0 be given. Without loss of generality, we can assume that ε is sufficiently small so that $x_o - \epsilon \in I$. Since f is continuous and strictly increasing,
\[ f((x_o - \epsilon, x_o]) = (f(x_o - \epsilon), f(x_o)] = (y_o - \delta, y_o] \]
where δ > 0 is given by $\delta = f(x_o) - f(x_o - \epsilon)$. Thus since $f^{-1}$ is strictly increasing,
\[ x_o - \epsilon = f^{-1}(y_o - \delta) < f^{-1}(y) < f^{-1}(y_o) = x_o \]
for all $y \in (y_o - \delta, y_o)$ (see Figure 4.13). Hence,
\[ |f^{-1}(y_o) - f^{-1}(y)| < \epsilon \quad \text{for all } y \in (y_o - \delta, y_o]. \]
Therefore $f^{-1}$ is left continuous at $y_o$. A similar argument also proves that $f^{-1}$ is right continuous at each $y_o \in J$ that is not a right endpoint of J. Thus $f^{-1}$ is continuous at each $y_o \in J$. □

FIGURE 4.13
Continuity of the inverse function

For a strictly increasing function f on an open interval I, an alternate proof of the continuity of the inverse function $f^{-1}$ is suggested in the exercises (Exercise 14).

EXAMPLE 4.4.13 The function f(x) = x² is continuous and strictly increasing on I = [0, ∞) with J = f(I) = [0, ∞). Thus the inverse function $f^{-1}(y) = \sqrt{y}$ is continuous on [0, ∞). As a consequence, the function $g(x) = \sqrt{x}$ is continuous on [0, ∞). Applying the same argument to $f(x) = x^n$ shows that the function $g(x) = \sqrt[n]{x}$ is strictly increasing and continuous on [0, ∞). □
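Strict monotonicity and continuity are also what make it possible to approximate an inverse function numerically. The Python sketch below uses bisection, a standard technique that is not discussed in the text; the function name `inverse_by_bisection`, its parameters, and the tolerance are assumptions introduced here. It relies on f being strictly increasing and continuous on [lo, hi], so that each y between f(lo) and f(hi) has exactly one preimage.

```python
def inverse_by_bisection(f, y, lo, hi, tol=1e-12):
    """Approximate the inverse of f at y, for f strictly increasing and continuous on [lo, hi]."""
    if not f(lo) <= y <= f(hi):
        raise ValueError("y must lie between f(lo) and f(hi)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid   # the preimage lies to the right of mid
        else:
            hi = mid   # the preimage lies at or to the left of mid
    return (lo + hi) / 2


# Approximate the cube root of 8 and the square root of 2.
print(inverse_by_bisection(lambda x: x ** 3, 8.0, 0.0, 10.0))
print(inverse_by_bisection(lambda x: x * x, 2.0, 0.0, 2.0))
```

The interval [lo, hi] always contains the preimage of y and is halved at each step, so the returned midpoint is within tol of the true value.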
Remark. In the statement of Theorem 4.4.12 we assumed that f was strictly monotone and continuous on the interval I. The fact that f is either strictly increasing or strictly decreasing on I implies that f is one-to-one on the interval I. Conversely, if f is one-to-one and continuous on an interval I, then as a consequence of the intermediate value theorem the function f is strictly monotone on I (Exercise 15). This, however, is false if either f is not continuous on the interval I, or if Dom f is not an interval. (See Exercises 16, 17, and 18.)

Exercises 4.4
1. Prove Theorem 4.4.3.
2. For each of the following functions f defined on R \ {0}, find $\lim_{x \to 0^-} f(x)$ and $\lim_{x \to 0^+} f(x)$, provided the limits exist.
a. $f(x) = \dfrac{x}{|x|}$. *b. $f(x) = \begin{cases} x[x], & x < 0, \\ \dfrac{[x]}{x}, & x > 0. \end{cases}$
c. $f(x) = [1 - x^2]$ *d. $f(x) = [x^2 - 1]$
e. $f(x) = \left[\dfrac{1}{x}\right]$. *f. $f(x) = x\left[\dfrac{1}{x}\right]$.
3. For each of the functions f in Exercise 2, determine whether f has a
removable discontinuity, a jump discontinuity, or a discontinuity of second
kind, at x = 0. If f has a removable discontinuity at 0, specify how f (0)
should be defined in order that f is continuous at 0.
4. *Investigate continuity of g(x) = (x − 2)[x] at xo = 2.
5. Let f (x) = x − [x]. Discuss continuity of f . Sketch the graph of f .
6. For each of the following determine the value of b such that f has a
removable discontinuity at the indicated point xo .
a. $f(x) = \begin{cases} x - 2, & x < 1, \\ bx^3 + 4, & x > 1, \end{cases}$ $x_o = 1$
*b. $f(x) = \begin{cases} -x^2[x], & x < -2, \\ x + b, & x > -2, \end{cases}$ $x_o = -2$
7. a. Sketch the graph of g(x) = sin(2πx[x]) for x ∈ (−4, 4).
*b. Prove that g(x) is not uniformly continuous on R.
8. Prove that the function f of Example 4.4.11(a) is continuous at x = 1.
9. Let E ⊂ R and let f be a real-valued function on E. Suppose p ∈ R is a limit point of E ∩ (p, ∞). Prove that
\[ \lim_{x\to p^+} f(x) = L \quad\text{if and only if}\quad \lim_{n\to\infty} f(p_n) = L \]
for every sequence $\{p_n\}$ in E with $p_n > p$ for all n ∈ N and $p_n \to p$.


10. Let f be a real-valued function on (a, b].
*a. If f is continuous on (a, b] and $\lim_{x\to a^+} f(x)$ exists, prove that f is uniformly continuous on (a, b].
b. If f is uniformly continuous on (a, b], prove that $\lim_{x\to a^+} f(x)$ exists.

11. Let f : [0, 2] → R be defined by
\[ f(x) = \begin{cases} x, & 0 \le x \le 1, \\ 1 + x^2, & 1 < x \le 2. \end{cases} \]
Show that f and $f^{-1}$ are strictly increasing and find f([0, 2]). Are f and $f^{-1}$ continuous at every point of their respective domains?
12. *If m ∈ Z, n ∈ N, prove that $f(x) = x^{m/n}$ is continuous on (0, ∞).
13. a. If f and g are monotone increasing functions on an interval I, prove
that f + g is monotone increasing on I.
b. If in addition f and g are positive, prove that f g is monotone increasing
on I.
c. Show by example that the conclusion in part (b) may be false if f and
g are not both positive on I.
14. Let I ⊂ R be an open interval and let f : I → R be strictly increasing
and continuous on I.
*a. If U ⊂ I is open, prove that f (U ) is open.
b. Use (a) and Theorem 4.2.6 to prove that f −1 is continuous on f (I).
15. Let I ⊂ R be an interval and let f be a one-to-one continuous real-valued
function on I. Prove that f is strictly monotone on I.
16. Let f : [0, 1] → R be defined by
\[ f(x) = \begin{cases} 2x, & 0 \le x < \tfrac12, \\ 3 - 2x, & \tfrac12 \le x \le 1. \end{cases} \]
a. Sketch the graph of f.
*b. Show that f is one-to-one on [0, 1], but not strictly increasing on [0, 1].
*c. Show that f([0, 1]) = [0, 2].
*d. Find $f^{-1}(y)$ for y ∈ [0, 2], and show that $f^{-1}$ is not continuous at $y_o = 1$.
17. Let E = [0, 1] ∪ [2, 3), and for x ∈ E set
\[ f(x) = \begin{cases} x^2, & 0 \le x \le 1, \\ 4 - x, & 2 \le x < 3. \end{cases} \]
a. Sketch the graph of f.
b. Show that f is one-to-one and continuous on E.
c. Show that f(E) = [0, 2].
d. Find $f^{-1}(y)$ for y ∈ [0, 2], and show that $f^{-1}$ is not continuous at $y_o = 1$.
18. Let f : [0, 1] → R be defined by
\[ f(x) = \begin{cases} x, & x \in \mathbb{Q}, \\ 1 - x, & x \notin \mathbb{Q}. \end{cases} \]
Prove that f is one-to-one, that f([0, 1]) = [0, 1], but that f is not monotone on any interval I ⊂ [0, 1].
19. Prove that if f is monotone increasing on [a, ∞), a ∈ R, and bounded above, then $\lim_{x\to\infty} f(x)$ exists.

20. Let I ⊂ R be an interval and let f : I → R be monotone increasing. For p ∈ Int(I), the jump of f at p, denoted $J_f(p)$, is defined by
\[ J_f(p) = f(p+) - f(p-). \]
If p is a left endpoint, set $J_f(p) = f(p+) - f(p)$, and if p is a right endpoint, set $J_f(p) = f(p) - f(p-)$.
a. Prove that f is continuous at p ∈ I if and only if $J_f(p) = 0$.
b. If p ∈ Int(I), prove that $J_f(p) = \inf\{f(y) - f(x) : x < p < y,\ x, y \in I\}$.
21. Let $\{x_n\}_{n=1}^{\infty}$ be a countable subset of R and $\{c_n\}_{n=1}^{\infty}$ a sequence of positive real numbers satisfying $\sum_{n=1}^{\infty} c_n = 1$. Let f : R → R be defined by
\[ f(x) = \sum_{n=1}^{\infty} c_n I(x - x_n). \]
Prove that $\lim_{x\to-\infty} f(x) = 0$ and $\lim_{x\to\infty} f(x) = 1$.

Notes
The limit of a function at a point is one of the fundamental tools of analysis. Not
only is it crucial to continuity, but also to many subsequent topics in the text. The
limit process will occur over and over again. We will encounter it in the next chapter
in the definition of the derivative. It will occur again in the chapters on integration,
series, etc.
Another very important concept that will be encountered on many other oc-
casions in the text is uniform continuity. Uniform continuity is important in that
given ε > 0, it guarantees the existence of a δ > 0 such that |f (x) − f (y)| < ε for all
x, y ∈ Dom f with |x − y| < δ. In Chapter 6 we will use this to prove that every
continuous real-valued function on [a, b], a, b ∈ R, is Riemann integrable on [a, b].
Other applications of uniform continuity will occur in many other theorems and in
the exercises.
One of the most important results of this chapter is the intermediate value
theorem (Theorem 4.2.11). The intermediate value theorem has already been used
in Corollary 4.2.13 to prove the existence of nth roots; namely, for every positive real
number x and n ∈ N, there exists a unique positive real number y such that $y^n = x$.
Even though the existence of nth roots can be proved without the intermediate
value theorem, any such proof however is simply the statement that the function
$f(x) = x^n$ satisfies the intermediate value property on [0, a] for every a > 0. Other
applications of the intermediate value theorem will occur elsewhere in the text.
The proof of the intermediate value theorem depended on the fact that the
connected subsets of R are the intervals (Theorem 2.2.25) and that the continuous
image of a connected set is connected (Exercise 28 of Section 4.2). Assuming these
two results, the intermediate value theorem is an immediate consequence as follows:
Suppose f is continuous on [a, b]. Let I = f ([a, b]). Then I is connected and thus
must be an interval. Thus if f (a) < γ < f (b), γ ∈ I and hence there exists c ∈ [a, b]
such that f (c) = γ. That the continuous image of a connected set is connected
follows from the definition. However, the proof that the connected subsets of R are
the intervals requires the least upper bound property.

Miscellaneous Exercises
1. Let f be a continuous real-valued function on [a, b] with f(a) < 0 < f(b). Let $c_1 = \tfrac12(a + b)$. If $f(c_1) > 0$, let $c_2 = \tfrac12(a + c_1)$. If $f(c_1) < 0$, let $c_2 = \tfrac12(c_1 + b)$. Continue this process inductively to obtain a sequence $\{c_n\}$ in (a, b) which converges to a point c ∈ (a, b) for which f(c) = 0.
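The midpoint construction in Exercise 1 is the bisection method. The following is a minimal numerical sketch of it, not part of the text: the function name `bisect_root`, the tolerance `tol`, and the iteration cap `max_iter` are illustrative choices, and floating-point arithmetic only approximates the limit point c.

```python
def bisect_root(f, a, b, tol=1e-12, max_iter=200):
    """Approximate a zero of f on [a, b] by repeated halving.

    Assumes f is continuous on [a, b] with f(a) < 0 < f(b), as in the exercise.
    """
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    c = 0.5 * (a + b)
    for _ in range(max_iter):
        c = 0.5 * (a + b)      # the midpoint c_n of the current interval
        fc = f(c)
        if fc == 0 or (b - a) < tol:
            return c
        if fc > 0:
            b = c              # f changes sign on [a, c]
        else:
            a = c              # f changes sign on [c, b]
    return c


# Illustrative use: f(x) = x^2 - 2 on [0, 2] has f(0) < 0 < f(2).
root = bisect_root(lambda x: x * x - 2, 0.0, 2.0)
```

Each step halves the interval on which f changes sign, so the midpoints form a Cauchy sequence; continuity of f is what forces f(c) = 0 at the limit.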
2. Let E ⊂ R, p a limit point of E, and f a real-valued function defined on E. The limit superior of f at p, denoted $\overline{\lim}_{x\to p} f(x)$, is defined by
\[ \overline{\lim_{x\to p}}\, f(x) = \inf_{\delta > 0} \sup\{f(x) : x \in (N_\delta(p) \setminus \{p\}) \cap E\}. \]
Similarly, the limit inferior of f at p, denoted $\underline{\lim}_{x\to p} f(x)$, is defined by
\[ \underline{\lim_{x\to p}}\, f(x) = \sup_{\delta > 0} \inf\{f(x) : x \in (N_\delta(p) \setminus \{p\}) \cap E\}. \]
Prove each of the following:
a. $\overline{\lim}_{x\to p} f(x) \le L$ if and only if given ε > 0, there exists a δ > 0 such that f(x) < L + ε for all x ∈ E, 0 < d(x, p) < δ.
b. $\overline{\lim}_{x\to p} f(x) \ge L$ if and only if given ε > 0 and δ > 0, there exists x ∈ E with 0 < d(x, p) < δ such that f(x) > L − ε.
c. If $\overline{\lim}_{x\to p} f(x) = L$, then for any sequence $\{x_n\}$ in E with $x_n \ne p$ for all n ∈ N, $\overline{\lim}_{n\to\infty} f(x_n) \le L$.
d. There exists a sequence $\{x_n\}$ in E with $x_n \ne p$ for all n ∈ N, such that
\[ \lim_{n\to\infty} f(x_n) = \overline{\lim_{x\to p}}\, f(x). \]

3. Let X ⊂ R and f a real-valued function on X. For p ∈ X, the oscillation of f at p, denoted ω(f; p), is defined as
\[ \omega(f; p) = \inf_{\delta > 0} \sup\{|f(x) - f(y)| : x, y \in N_\delta(p) \cap X\}. \]
Prove each of the following:
a. The function f is continuous at p if and only if ω(f; p) = 0.


b. For every s ∈ R, the set {x ∈ X : ω(f ; x) < s} is open.
c. The set {x ∈ X : f is continuous at x} is the intersection of at most
countably many sets that are open in X.

The following set of exercises involves the Cantor ternary function. Let P denote the Cantor ternary set of Section 2.3. For each x ∈ (0, 1], let $x = .a_1a_2a_3\ldots$ denote the ternary expansion of x. Define N as follows:
\[ N = \begin{cases} \infty, & \text{if } a_n \ne 1 \text{ for all } n \in \mathbb{N}, \\ \min\{n : a_n = 1\}, & \text{otherwise.} \end{cases} \]
Define $b_n = \tfrac12 a_n$ for n < N, and $b_N = 1$ if N is finite. (Note: $b_n \in \{0, 1\}$ for all n.)
4. If x ∈ (0, 1] has two ternary expansions, show that $\sum_{n=1}^{N} \dfrac{b_n}{2^n}$ is independent of the expansion of x.

The Cantor ternary function f on [0, 1] is defined as follows: f(0) = 0, and if x ∈ (0, 1] with ternary expansion $x = .a_1a_2a_3\ldots$, set
\[ f(x) = \sum_{n=1}^{N} \frac{b_n}{2^n}, \]
where N and $b_n$ are defined as above.


5. Prove each of the following:
a. f is monotone increasing on [0, 1].
b. f is constant on each interval in the complement of the Cantor set in
[0, 1].
c. f is continuous on [0, 1].
d. f (P ) = [0, 1].
e. Sketch the graph of f .
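The digit-by-digit definition of the Cantor ternary function translates directly into a short computation, which may help with the sketch in part (e). The following is only an illustrative sketch: the name `cantor` and the cutoff `digits` are assumptions made here, exact rational arithmetic is used to avoid rounding in the ternary digits, and when x has no digit equal to 1 the series is truncated after `digits` terms.

```python
from fractions import Fraction


def cantor(x, digits=40):
    """Approximate the Cantor ternary function at x in [0, 1].

    Uses the first `digits` ternary digits a_n of x, with b_n = a_n / 2
    before the first digit equal to 1 and b_N = 1 at that digit.
    """
    x = Fraction(x)
    if not 0 <= x <= 1:
        raise ValueError("x must lie in [0, 1]")
    if x == 1:
        return Fraction(1)        # x = .222... gives the sum of 1/2^n, which is 1
    total = Fraction(0)
    for n in range(1, digits + 1):
        x *= 3
        a = int(x)                # n-th ternary digit a_n in {0, 1, 2}
        x -= a
        if a == 1:
            total += Fraction(1, 2 ** n)   # b_N = 1; the sum stops at n = N
            break
        total += Fraction(a // 2, 2 ** n)  # b_n = a_n / 2
    return total


values = [cantor(Fraction(k, 9)) for k in range(10)]
```

For a point with two ternary expansions the loop always picks the terminating one; by Exercise 4 the value does not depend on that choice.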

Supplemental Reading

Bryant, J., Kuzmanovich, J. and Pavlichenkov, A., “Functions with compact preimages of compact sets,” Math. Mag. 70 (1997), 362–364.
Bumcrot, R. and Sheingorn, M., “Variations on continuity: Sets of infinite limits,” Math. Mag. 47 (1974), 41–43.
Cauchy, A. L., Cours d’analyse, Paris, 1821, in Oeuvres complètes d’Augustin Cauchy, series 2, vol. 3, Gauthier-Villars, Paris, 1899.
Dupree, E. and Mathes, B., “Functions with dense graphs,” Math. Mag. 86 (2013), 366–369.
Fleron, Julian F., “A note on the history of the Cantor set and Cantor function,” Math. Mag. 67 (1994), 136–140.
Grabiner, Judith V., “Who gave you the epsilon? Cauchy and the origins of rigorous calculus,” Amer. Math. Monthly 90 (1983), 185–194.
Radcliffe, D. G., “A function that is surjective on every interval,” Amer. Math. Monthly 123 (2016), 88–89.
Snipes, Ray F., “Is every continuous function uniformly continuous,” Math. Mag. 57 (1994), 169–173.
Straffin, Jr., Philip D., “Periodic points of continuous functions,” Math. Mag. 51 (1978), 99–105.
Velleman, D. J., “Characterizing continuity,” Amer. Math. Monthly 104 (1997), 318–322.
5
Differentiation

The development of differential and integral calculus by Isaac Newton (1642–


1727) and Gottfried Wilhelm Leibniz (1646–1716) in the mid seventeenth cen-
tury constitutes one of the great advances in mathematics. In the two years
following his degree from Cambridge in 1664, Newton invented the method
of fluxions (derivatives) and fluents (integrals) to solve problems in physics
involving velocity and motion. During the same period he also discovered the
laws of universal gravitation and made significant contributions to the study
of optics. Leibniz on the other hand, whose contributions came 10 years later,
was led to the invention of calculus through the study of tangents to curves
and the problem of area. The first published account of Newton’s calculus
appeared in his 1687 treatise Philosophia Naturalis Principia Mathematica.
Unfortunately however, much of Newton’s work on calculus did not appear
until 1737, ten years after his death, in a work entitled Methodus fluxionum
et serierum infinitorum.
Mathematicians prior to the time of Newton and Leibniz knew how to
compute tangents to specific curves and velocities in particular situations.
They also knew how to compute areas under elementary curves. What dis-
tinguished the work of Newton and Leibniz from that of their predecessors
was that they realized that the problems of finding the tangent to a curve
and the area under a curve were inversely related. More importantly however,
they also developed the notation and a set of techniques (a calculus) to solve
these problems for arbitrary functions, whether algebraic or transcendental.
In Newton’s presentation of his infinitesimal calculus he looked upon y as a
flowing quantity, or fluent, of which the quantity ẏ was the fluxion or rate
of change. The notation of Newton is still in use in physics and differential
geometry, whereas every student of calculus is exposed to the d (for difference)
and ∫ (for sum) notation of Leibniz to denote differentiation and integration.
Many of the basic rules and formulas of the differential calculus were devel-
oped by these two remarkable mathematicians. In the paper A New Method
for Maxima and Minima, and also for Tangents, which is not Obstructed by
Irrational Quantities published in 1684, Leibniz gave correct rules for differ-
entiation of sums, products, quotients, powers, and roots. In addition to his
many contributions to the subject, Leibniz also disseminated his results in
publications and correspondence with colleagues throughout Europe.
Newton and Leibniz with their invention of the calculus had created a tool
of such novel subtlety that its utility was proved for over 150 years before


its limitations forced mathematicians to clarify its foundations. The rigorous


formulation of the derivative did not occur until 1821 when Cauchy provided a
formal definition of limit. This helped to place the theory on a firm mathemat-
ical footing. Cauchy’s contributions to the rigorous development of calculus
will be evident both in this and subsequent chapters.
In this chapter, we develop the theory of differentiation based on the defi-
nition of Cauchy, with special emphasis on the mean value theorem and con-
sequences thereof. The first section presents the standard results concerning
derivatives of functions obtained by means of algebraic operations and com-
position. In the examples and exercises we will derive the derivatives of some
of the basic algebraic and trigonometric functions. However, throughout the
chapter we will assume that the reader is already familiar with standard tech-
niques of differentiation and some of its applications. As a consequence we
will concentrate on the mathematical concepts of the derivative, emphasizing
many of its more subtle properties.

5.1 The Derivative


In an elementary calculus course the derivative is usually introduced by con-
sidering the problem of the tangent line to a curve or as the problem of finding
the velocity of an object moving in a straight line. Suppose y = f(x) is a real-valued function defined on an interval [a, b]. Fix p ∈ [a, b]. For x ∈ [a, b], x ≠ p, the quantity
\[ Q(x) = \frac{f(x) - f(p)}{x - p} \]
represents the slope of the straight line (secant line) joining the points (p, f(p)) and (x, f(x)) on the graph of f (see Figure 5.1). The function Q(x) is defined for all values of x ∈ [a, b], x ≠ p. The limit of Q(x) as x approaches
p, provided this limit exists, is defined as the slope of the tangent line to the
curve y = f (x) at the point (p, f (p)).
A similar type of limit occurs if we consider the problem of defining the
velocity of a moving object. Suppose that an object is moving in a straight
line and that its distance s from a fixed point P is given as a function of t;
namely s = s(t). If $t_o$ is fixed, then the average velocity over the time interval from t to $t_o$, t ≠ $t_o$, is defined as
\[ \frac{s(t) - s(t_o)}{t - t_o}. \]
The limit of this quantity as t approaches $t_o$, again provided that the limit exists, is taken as the definition of the velocity of the object at time $t_o$.

FIGURE 5.1
Secant line between two points on the graph of f

Both of the previous two examples involve identical limits; namely,
\[ \lim_{x\to p} \frac{f(x) - f(p)}{x - p} \quad\text{and}\quad \lim_{t\to t_o} \frac{s(t) - s(t_o)}{t - t_o}. \]
These limits, if they exist, are called the derivatives of the functions f and s at p and $t_o$ respectively. The term derivative comes from the French fonction dérivée.

DEFINITION 5.1.1 Let I ⊂ R be an interval and let f be a real-valued function with domain I. For fixed p ∈ I, the derivative of f at p, denoted $f'(p)$, is defined to be
\[ f'(p) = \lim_{x\to p} \frac{f(x) - f(p)}{x - p}, \]
provided the limit exists. If $f'(p)$ is defined at a point p ∈ I, we say that f is differentiable at p. If the derivative $f'$ is defined at every point of a set E ⊂ I, we say that f is differentiable on E.

If p is an interior point of I, then p + h ∈ I for all h sufficiently small. If we set x = p + h, h ≠ 0, then the definition of the derivative of f at p can be expressed as
\[ f'(p) = \lim_{h\to 0} \frac{f(p + h) - f(p)}{h}, \]
provided that the limit exists. This formulation of the derivative is sometimes
easier to use.
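The h-form of the definition can be explored numerically by evaluating the difference quotient for a shrinking sequence of h. The sketch below is only an illustration: the helper name `difference_quotient`, the choice f(x) = x², and the point p = 3 are assumptions made here, and a table of floating-point values suggests the limit but does not prove it.

```python
def difference_quotient(f, p, h):
    """Return (f(p + h) - f(p)) / h for h != 0."""
    return (f(p + h) - f(p)) / h


def square(x):
    return x * x


p = 3.0
# For f(x) = x^2 the quotient equals 2p + h exactly, so it approaches 2p = 6.
quotients = [difference_quotient(square, p, 10.0 ** (-k)) for k in range(1, 7)]
```

Taking h much smaller than shown here eventually makes things worse in floating point, since the subtraction f(p + h) − f(p) loses significant digits.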
In the definition of the derivative we do not exclude the possibility that p
is an endpoint of I. If p ∈ I is the left endpoint of I, then
\[ f'(p) = \lim_{x\to p^+} \frac{f(x) - f(p)}{x - p} = \lim_{h\to 0^+} \frac{f(p + h) - f(p)}{h}, \]
provided of course that the limit exists. The analogous formula also holds if p ∈ I is the right endpoint of I. In analogy with the right and left limit of a function we also define the right and left derivative of a function.
DEFINITION 5.1.2 Let I ⊂ R be an interval and let f be a real-valued function with domain I. If p ∈ I is such that I ∩ (p, ∞) ≠ ∅, then the right derivative of f at p, denoted $f'_+(p)$, is defined as
\[ f'_+(p) = \lim_{h\to 0^+} \frac{f(p + h) - f(p)}{h}, \]
provided the limit exists. Similarly, if p ∈ I satisfies (−∞, p) ∩ I ≠ ∅, then the left derivative of f at p, denoted $f'_-(p)$, is given by
\[ f'_-(p) = \lim_{h\to 0^-} \frac{f(p + h) - f(p)}{h}, \]
provided the limit exists.

Remarks. (a) If p ∈ Int(I), then $f'(p)$ exists if and only if both $f'_+(p)$ and $f'_-(p)$ exist and are equal. On the other hand, if p ∈ I is the left (right) endpoint of I, then $f'(p)$ exists if and only if $f'_+(p)$ ($f'_-(p)$) exists. In this case, $f'(p) = f'_+(p)$ ($f'_-(p)$).

The reader should note the distinction between $f'_+(p)$ and $f'(p+)$. The first denotes the right derivative of f at p, whereas the latter is the right limit of the derivative; i.e.,
\[ f'(p+) = \lim_{x\to p^+} f'(x). \]

Here of course we are assuming that f is defined for all x ∈ (p, p + δ) for some
δ > 0.
(b) If f is a differentiable function on an interval I, we will also occasionally use Leibniz’s notation
\[ \frac{d}{dx} f(x), \quad \frac{df}{dx}, \quad\text{or}\quad \frac{dy}{dx}, \]
to denote the derivative of y = f(x).
(c) If f is differentiable on an interval I, then the derivative f ′ (x) is itself
a function on I. Therefore we can consider the existence of the derivative of
the function f ′ at a point p ∈ I. If the function f ′ has a derivative at a point
p ∈ I, we refer to this quantity as the second derivative of f at p, which we denote $f''(p)$. Thus
\[ f''(p) = \lim_{h\to 0} \frac{f'(p + h) - f'(p)}{h}. \]
In a similar fashion we can define the third derivative of f at p, denoted $f'''(p)$ or $f^{(3)}(p)$. In general, for n ∈ N, $f^{(n)}(p)$ denotes the nth derivative of f at p. In order to discuss the existence of the nth derivative of f at p, we require the existence of the (n − 1)st derivative of f on an interval containing p.
EXAMPLES 5.1.3 (a) In the exercises (Exercise 2) you will be asked to prove that if $f(x) = x^n$, n ∈ Z, then $f'(x) = nx^{n-1}$ for all x ∈ R (x ≠ 0 if n is negative). For the function $f(x) = x^2$, the result is obtained as follows:
\[ f'(x) = \lim_{h\to 0} \frac{(x + h)^2 - x^2}{h} = \lim_{h\to 0} (2x + h) = 2x. \]
A similar computation shows that $f''(x) = 2$.



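(b) Consider $f(x) = \sqrt{x}$, x > 0. We first note that for h ≠ 0,
\[ \frac{f(x + h) - f(x)}{h} = \frac{\sqrt{x + h} - \sqrt{x}}{h} = \frac{(\sqrt{x + h} - \sqrt{x})(\sqrt{x + h} + \sqrt{x})}{h(\sqrt{x + h} + \sqrt{x})} = \frac{1}{\sqrt{x + h} + \sqrt{x}}. \]
Since $\lim_{h\to 0} \sqrt{x + h} = \sqrt{x}$, we have
\[ f'(x) = \lim_{h\to 0} \frac{1}{\sqrt{x + h} + \sqrt{x}} = \frac{1}{2\sqrt{x}}. \]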

(c) Consider f(x) = sin x. From the identity
\[ \sin(x + h) = \sin x \cos h + \cos x \sin h \]
we obtain
\[ \frac{\sin(x + h) - \sin x}{h} = \sin x \left(\frac{\cos h - 1}{h}\right) + \cos x \left(\frac{\sin h}{h}\right). \]
By Example 4.1.10(d) and Exercise 5, Section 4.1,
\[ \lim_{h\to 0} \frac{\sin h}{h} = 1 \quad\text{and}\quad \lim_{h\to 0} \frac{\cos h - 1}{h} = 0. \]
Therefore,
\[ f'(x) = \lim_{h\to 0} \frac{\sin(x + h) - \sin x}{h} = \sin x \lim_{h\to 0} \left(\frac{\cos h - 1}{h}\right) + \cos x \lim_{h\to 0} \left(\frac{\sin h}{h}\right) = \cos x. \]
In Exercise 3 you will be asked to prove that $\dfrac{d}{dx}(\cos x) = -\sin x$.
(d) Let f be defined by
\[ f(x) = |x| = \begin{cases} x, & x \ge 0, \\ -x, & x < 0. \end{cases} \]
Then
\[ f'_+(0) = \lim_{h\to 0^+} \frac{|h|}{h} = \lim_{h\to 0^+} \frac{h}{h} = 1, \quad\text{and}\quad f'_-(0) = \lim_{h\to 0^-} \frac{|h|}{h} = \lim_{h\to 0^-} \frac{-h}{h} = -1. \]
Thus $f'_+(0)$ and $f'_-(0)$ both exist, but are unequal. Therefore $f'(0)$ does not exist.
(e) In this example, let $g(x) = x^{3/2}$ with Dom g = [0, ∞). Then for p = 0,
\[ g'(0) = g'_+(0) = \lim_{x\to 0^+} \frac{x^{3/2}}{x} = \lim_{x\to 0^+} \sqrt{x} = 0. \]
Thus g is differentiable at 0 with $g'(0) = 0$.


(f) Let f be defined by
\[ f(x) = \begin{cases} x \sin\dfrac{1}{x}, & x \ne 0, \\ 0, & x = 0. \end{cases} \]
For x ≠ 0, $f'(x) = -\dfrac{1}{x}\cos\dfrac{1}{x} + \sin\dfrac{1}{x}$ (Exercise 8). When x = 0,
\[ f'(0) = \lim_{h\to 0} \frac{f(h) - f(0)}{h} = \lim_{h\to 0} \sin\frac{1}{h}, \]
which by Example 4.1.5(a) does not exist. Therefore $f'(0)$ does not exist.
(g) Consider the following variation of (f). Let
\[ g(x) = \begin{cases} x^2 \sin\dfrac{1}{x}, & x \ne 0, \\ 0, & x = 0. \end{cases} \]
This example is very important! In Exercise 9 you will be asked to show that $g'(x)$ exists for all x ∈ R with $g'(0) = 0$, but that the derivative $g'$ is not continuous at 0.
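The behavior described in Example (g) can be observed numerically. The sketch below is an illustration only and does not replace the proof asked for in Exercise 9: it assumes the closed form g'(x) = 2x sin(1/x) − cos(1/x) for x ≠ 0, which comes from the product and chain rules, and the names `g`, `g_prime`, `steps`, `quotients_at_0`, and `samples` are chosen here for convenience.

```python
import math


def g(x):
    """g(x) = x^2 sin(1/x) for x != 0, and g(0) = 0."""
    return x * x * math.sin(1.0 / x) if x != 0 else 0.0


def g_prime(x):
    """Derivative of g: product and chain rules for x != 0, the definition at 0."""
    if x == 0:
        return 0.0
    return 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x)


# Difference quotients at 0 equal h sin(1/h), so they are bounded by |h|.
steps = (1e-2, 1e-4, 1e-6)
quotients_at_0 = [(g(h) - g(0.0)) / h for h in steps]

# Along x_n = 1/(n pi) the derivative is -cos(n pi), which alternates in sign,
# so g'(x) has no limit as x -> 0 even though g'(0) = 0.
samples = [g_prime(1.0 / (n * math.pi)) for n in range(1, 7)]
```

The two lists show the contrast: the quotients at 0 shrink with h, while the sampled derivative values keep jumping between values near 1 and −1.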

When Cauchy gave the rigorous definition of the derivative, he assumed


that the given function was continuous on its domain. As a consequence of
the following theorem this requirement is not necessary.
THEOREM 5.1.4 If I ⊂ R is an interval and f : I → R is differentiable at p ∈ I, then f is continuous at p.

Proof. For t ≠ p,
\[ f(t) - f(p) = \left(\frac{f(t) - f(p)}{t - p}\right)(t - p). \]
Since
\[ \lim_{t\to p} \frac{f(t) - f(p)}{t - p} \]
exists and equals $f'(p)$, by Theorem 4.1.6(b),
\[ \lim_{t\to p} (f(t) - f(p)) = \lim_{t\to p} \left(\frac{f(t) - f(p)}{t - p}\right) \lim_{t\to p} (t - p) = f'(p) \cdot 0 = 0. \]
Therefore, $\lim_{t\to p} f(t) = f(p)$ and thus f is continuous at p. In the above, if p is an endpoint, then the limits are either the right or left limit at p, whichever is appropriate.
Remark. In both Examples 5.1.3(d) and (f), the given function is continuous at 0, but not differentiable at 0. Given a finite number of points, say $p_1, \ldots, p_n$, it is easy to construct a function f that is continuous but not differentiable at $p_1, \ldots, p_n$. For example,
\[ f(x) = \sum_{k=1}^{n} |x - p_k| \]
has the desired properties. In 1861, Weierstrass constructed a function f that


is continuous at every point of R but nowhere differentiable. When published
in 1874, this example astounded the mathematical community. Prior to this
time mathematicians generally believed that continuous functions were differ-
entiable (except perhaps at a finite number of points). In Example 8.5.3 we
will consider the function of Weierstrass in detail.

Derivatives of Sums, Products, and Quotients


We now derive the formulas for the derivative of sums, products, and quotients
of functions. These rules were discovered by Leibniz in 1675.

THEOREM 5.1.5 Suppose f, g are real-valued functions defined on an interval I. If f and g are differentiable at x ∈ I, then f + g, fg and f/g (if g(x) ≠ 0) are differentiable at x and
(a) $(f + g)'(x) = f'(x) + g'(x)$,
(b) $(fg)'(x) = f'(x)g(x) + f(x)g'(x)$,
(c) $\left(\dfrac{f}{g}\right)'(x) = \dfrac{f'(x)g(x) - f(x)g'(x)}{(g(x))^2}$, provided g(x) ≠ 0.

Proof. The proof of (a) is left as an exercise (Exercise 4). For the proof of (b), by adding and subtracting the term f(x + h)g(x), we have for h ≠ 0,
\[ \frac{(fg)(x + h) - (fg)(x)}{h} = f(x + h)\left(\frac{g(x + h) - g(x)}{h}\right) + \left(\frac{f(x + h) - f(x)}{h}\right) g(x). \]
By Theorem 5.1.4, since f is differentiable at x, $\lim_{h\to 0} f(x + h) = f(x)$. Thus since each of the limits exists, by Theorem 4.1.6
\[ (fg)'(x) = \lim_{h\to 0} \frac{(fg)(x + h) - (fg)(x)}{h} = \lim_{h\to 0} f(x + h) \lim_{h\to 0} \left(\frac{g(x + h) - g(x)}{h}\right) + g(x) \lim_{h\to 0} \left(\frac{f(x + h) - f(x)}{h}\right) = f(x)g'(x) + g(x)f'(x). \]

To prove (c), we first prove that $(1/g)'(x) = -g'(x)/[g(x)]^2$, provided g(x) ≠ 0. The result then follows by writing f/g as f · (1/g) and applying the product formula (b). If g(x) ≠ 0, then since g is continuous at x, as in Theorem 4.1.6(c), g(x + h) ≠ 0 for all h sufficiently small. Thus for h sufficiently small and nonzero,
\[ \frac{\dfrac{1}{g(x + h)} - \dfrac{1}{g(x)}}{h} = -\left(\frac{g(x + h) - g(x)}{h}\right) \frac{1}{g(x)g(x + h)}. \]
Again, using the fact that $\lim_{h\to 0} g(x + h) = g(x)$, by Theorem 4.1.6
\[ \left(\frac{1}{g}\right)'(x) = \lim_{h\to 0} \frac{\dfrac{1}{g(x + h)} - \dfrac{1}{g(x)}}{h} = -\lim_{h\to 0} \left(\frac{g(x + h) - g(x)}{h}\right) \lim_{h\to 0} \frac{1}{g(x)g(x + h)} = \frac{-g'(x)}{[g(x)]^2}. \]
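The three formulas of Theorem 5.1.5 can be read as rules for combining pairs (value, derivative) at a fixed point x. The sketch below encodes exactly those rules; the class name `Dual` and the sample functions f(x) = x² and g(x) = x + 1 are illustrative assumptions and do not appear in the text.

```python
class Dual:
    """A pair (value, derivative) of a function at one fixed point x."""

    def __init__(self, val, der):
        self.val = val
        self.der = der

    def __add__(self, other):
        # (f + g)' = f' + g'
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        # (fg)' = f'g + fg'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    def __truediv__(self, other):
        # (f/g)' = (f'g - fg') / g^2, provided g(x) != 0
        if other.val == 0:
            raise ZeroDivisionError("quotient rule requires g(x) != 0")
        return Dual(self.val / other.val,
                    (self.der * other.val - self.val * other.der) / other.val ** 2)


x = 2.0
f = Dual(x ** 2, 2 * x)     # f(x) = x^2,  f'(x) = 2x
g = Dual(x + 1.0, 1.0)      # g(x) = x + 1, g'(x) = 1
s, p, q = f + g, f * g, f / g
```

At x = 2 the derivative slots can be compared with direct differentiation: (x² + x + 1)' = 2x + 1, (x³ + x²)' = 3x² + 2x, and (x²/(x + 1))' = (x² + 2x)/(x + 1)².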

The Chain Rule


The previous theorem allows us to compute the derivatives of sums, products,
and quotients of differentiable functions. The chain rule on the other hand
allows us to compute the derivative of a function obtained from the composi-
tion of two or more differentiable functions. Prior to stating and proving the
result, we introduce some useful notation.
Suppose f is differentiable at x ∈ I. For t ∈ I, t ≠ x, set
\[ Q(t) = \frac{f(t) - f(x)}{t - x}. \]
Then by Definition 5.1.1, $Q(t) \to f'(x)$ as t → x. If we let $u(t) = Q(t) - f'(x)$, then u(t) → 0 as t → x. Therefore, if f is differentiable at x, for t ≠ x,
\[ f(t) - f(x) = (t - x)[f'(x) + u(t)], \quad\text{where } u(t) \to 0 \text{ as } t \to x. \tag{1} \]
By setting u(x) = 0, the above identity is valid for all t ∈ I.

THEOREM 5.1.6 (Chain Rule) Suppose f is a real-valued function defined on an interval I and g is a real-valued function defined on some interval J such that Range f ⊂ J. If f is differentiable at x ∈ I and g is differentiable at f(x), then h = g ∘ f is differentiable at x and
\[ h'(x) = g'(f(x)) f'(x). \]

Proof. Let y = f(x). Then by (1)
\[ f(t) - f(x) = (t - x)[f'(x) + u(t)], \tag{2} \]
\[ g(s) - g(y) = (s - y)[g'(y) + v(s)], \tag{3} \]
where t ∈ I, s ∈ J and u(t) → 0 as t → x and v(s) → 0 as s → y. Let s = f(t). Since f is continuous at x, s → y as t → x. By identity (3), and then (2),
\[ h(t) - h(x) = g(f(t)) - g(f(x)) = [f(t) - f(x)][g'(y) + v(s)] = (t - x)[f'(x) + u(t)][g'(y) + v(f(t))]. \]
Therefore, for t ≠ x,
\[ \frac{h(t) - h(x)}{t - x} = [f'(x) + u(t)][g'(y) + v(f(t))]. \]
Since v(f(t)) and u(t) both have limit 0 as t → x,
\[ \lim_{t\to x} \frac{h(t) - h(x)}{t - x} = f'(x)g'(y) = g'(f(x))f'(x). \]

To illustrate the previous theorem we consider the following examples.

EXAMPLES 5.1.7 (a) By Example 5.1.3(c) the function f(x) = sin x is differentiable on R. Hence if g : I → R is differentiable on the interval I, h(x) = (f ∘ g)(x) = sin g(x) is differentiable on I with
\[ h'(x) = f'(g(x))g'(x) = g'(x)\cos g(x). \]
In particular, if $g(x) = 1/x^2$, Dom g = (0, ∞), then by Theorem 5.1.5(c),
\[ g'(x) = \frac{-2x}{(x^2)^2} = -\frac{2}{x^3}, \quad x \in (0, \infty). \]
Therefore,
\[ \frac{d}{dx} \sin\frac{1}{x^2} = -\frac{2}{x^3} \cos\frac{1}{x^2}. \]

(b) By Exercise 2
\[ \frac{d}{dx} x^n = nx^{n-1} \quad\text{for all } n \in \mathbb{N}. \]
Thus if f : I → R is differentiable on the interval I, then by the chain rule $g(x) = [f(x)]^n$, n ∈ N, is differentiable on I with $g'(x) = n[f(x)]^{n-1}f'(x)$. This formula can also be obtained from Theorem 5.1.5(b) using mathematical induction.
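The chain rule computation in Example (a) for sin(1/x²) can be sanity-checked against a numerical derivative. The sketch below is an illustration under stated assumptions: the symmetric difference quotient, the step size `step`, and the evaluation point `x0` are choices made here, and agreement to several digits is evidence for the formula, not a proof of it.

```python
import math


def h(x):
    """h(x) = sin(1/x^2) for x > 0."""
    return math.sin(1.0 / x ** 2)


def h_prime(x):
    """Chain rule: g'(x) cos g(x) with g(x) = 1/x^2 and g'(x) = -2/x^3."""
    return -(2.0 / x ** 3) * math.cos(1.0 / x ** 2)


def central_difference(f, x, step=1e-6):
    """Symmetric difference quotient (f(x + step) - f(x - step)) / (2 step)."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


x0 = 1.5
approx = central_difference(h, x0)
exact = h_prime(x0)
```

The symmetric quotient is used because its error shrinks like the square of the step for smooth functions, so a moderate step already gives a close match.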

Exercises 5.1
1. Use the definition to find the derivative of each of the following functions.
*a. $f(x) = x^3$, x ∈ R
b. $g(x) = \sqrt{x + 2}$, x > −2
*c. $h(x) = \dfrac{1}{x}$, x ≠ 0
d. $k(x) = \dfrac{1}{\sqrt{x + 2}}$, x > −2
*e. $f(x) = \dfrac{x}{x + 1}$, x ≠ −1
f. $g(x) = \dfrac{x}{x^2 + 1}$, x ∈ R
2. *Prove that for all integers n, $\dfrac{d}{dx} x^n = nx^{n-1}$ (x ≠ 0 if n is negative).
3. *a. Prove that $\dfrac{d}{dx}(\cos x) = -\sin x$.
b. Find the derivative of $\tan x = \dfrac{\sin x}{\cos x}$.
4. Prove Theorem 5.1.5(a).
5. For each of the following, determine whether the given function is differentiable at the indicated point $x_o$. Justify your answer!
*a. f(x) = x|x| at $x_o = 0$
b. $f(x) = \begin{cases} x^2, & x \in \mathbb{Q}, \\ 0, & x \notin \mathbb{Q}, \end{cases}$ at $x_o = 0$
*c. g(x) = (x − 2)[x] at $x_o = 2$
d. $h(x) = \sqrt{x + 2}$, x ∈ [−2, ∞), at $x_o = -2$
*e. $f(x) = \begin{cases} \sin x, & x \in \mathbb{Q}, \\ x, & x \notin \mathbb{Q}, \end{cases}$ at $x_o = 0$
f. $g(x) = \begin{cases} x^{4/3} \cos x, & x \ne 0, \\ 0, & x = 0, \end{cases}$ at $x_o = 0$

6. Let $f(x) = |x|^3$. Compute $f'(x)$, $f''(x)$, and show that $f'''(0)$ does not exist.
7. Determine where each of the following functions from R to R is differentiable and find the derivative.
*a. f(x) = x[x].
b. g(x) = |x − 2| + |x + 1|.
*c. h(x) = |sin x|.
d. $k(x) = \begin{cases} x^2\left[\dfrac{1}{x}\right], & 0 < x \le 1, \\ 0, & x = 0. \end{cases}$

8. Use the product rule, quotient rule, and chain rule to find the derivative
of each of the following.
a. $f(x) = x \sin\frac{1}{x}$, x ≠ 0
b. $f(x) = (\cos(\sin x)^n)^m$, n, m ∈ N
c. $f(x) = \sqrt{x + \sqrt{2 + x}}$
d. $f(x) = x^4\left(x + \sin\frac{1}{x}\right)$, x ≠ 0
2
sin x  
e. f (x) = 2 f. f (x) = cos4 x−1
x+1
, x 6= 1
1 + sin x
9. Let g be defined by
\[ g(x) = \begin{cases} x^2 \sin\dfrac{1}{x}, & x \ne 0, \\ 0, & x = 0. \end{cases} \]
a. Prove that g is differentiable at 0 and that $g'(0) = 0$.
*b. Show that $g'(x)$ is not continuous at 0.
10. Let f be defined by
\[ f(x) = \begin{cases} x^2 + 2, & x \le 2, \\ ax + b, & x > 2. \end{cases} \]
*a. For what values of a and b is f continuous at 2?
*b. For what values of a and b is f differentiable at 2?
11. Let f be defined by
\[ f(x) = \begin{cases} ax + b, & x < -1, \\ x^3 + 1, & -1 \le x \le 2, \\ cx + d, & x > 2. \end{cases} \]
Determine the constants a, b, c, and d such that f is differentiable on R.


12. Assume there exists a function L : (0, ∞) → R satisfying $L'(x) = 1/x$ for all x ∈ (0, ∞). Find the derivative of each of the following.
*a. f(x) = L(2x + 1), x > −1/2
b. $g(x) = L(x^2)$, x ≠ 0
*c. $h(x) = [L(x)]^3$, x > 0
d. k(x) = L(L(x)), x ∈ {x > 0 : L(x) > 0}
13. Let L be the function of Exercise 12.
a. Show that L is one-to-one on (0, ∞).
b. Let $E = L^{-1}$ on R. By considering L(E(y)) prove that $E'(y) = E(y)$.
14. For b real, let f be defined by
\[ f(x) = \begin{cases} x^b \sin\dfrac{1}{x}, & x > 0, \\ 0, & x \le 0. \end{cases} \]
Prove the following:
a. f is continuous at 0 if and only if b > 0.
*b. f is differentiable at 0 if and only if b > 1.
c. $f'$ is continuous at 0 if and only if b > 2.
15. a. If f is differentiable at $x_o$, prove that
\[ \lim_{h\to 0} \frac{f(x_o + h) - f(x_o - h)}{2h} = f'(x_o). \]
*b. If $\lim_{h\to 0} \dfrac{f(x_o + h) - f(x_o - h)}{2h}$ exists, is f differentiable at $x_o$?
16. If f : (a, b) → R is differentiable at p ∈ (a, b), prove that
\[ f'(p) = \lim_{n\to\infty} n\left[f\left(p + \tfrac{1}{n}\right) - f(p)\right]. \]
Show by example that the existence of the limit of the sequence $\{n[f(p + \tfrac{1}{n}) - f(p)]\}$ does not imply the existence of $f'(p)$.
17. (Leibniz’s Rule) Suppose f and g have nth order derivatives on (a, b). Prove that
\[ (fg)^{(n)}(x) = \sum_{k=0}^{n} \binom{n}{k} f^{(k)}(x) g^{(n-k)}(x). \]

5.2 The Mean Value Theorem


In this section, we will prove the mean value theorem and give several conse-
quences of this important result. Even though the proof itself is elementary,
the theorem is one of the most useful results of analysis. Its importance is
based on the fact that it allows us to relate the values of a function to values
of its derivative. We begin the section with a discussion of local maxima and
minima.

Local Maxima and Minima

DEFINITION 5.2.1 Suppose E ⊂ R and f is a real-valued function with


domain E. The function f has a local maximum at a point p ∈ E if there
exists a δ > 0 such that f (x) ≤ f (p) for all x ∈ E ∩ Nδ (p). The function f
has an absolute maximum at p ∈ E if f (x) ≤ f (p) for all x ∈ E.
Similarly, f has a local minimum at a point q ∈ E if there exists a
δ > 0 such that f (x) ≥ f (q) for all x ∈ E ∩ Nδ (q), and f has an absolute
minimum at q ∈ E if f (x) ≥ f (q) for all x ∈ E.

Remark. As a consequence of Corollary 4.2.9, every continuous real-valued


function defined on a compact subset K of R has an absolute maximum and
minimum on K.

FIGURE 5.2
Absolute maxima and minima on the graph of f

The function f illustrated in Figure 5.2 has a local maximum at a, $p_2$, and $p_4$, and a local minimum at $p_1$, $p_3$, and b. The points $(p_4, f(p_4))$ and $(p_1, f(p_1))$ are absolute maxima and absolute minima respectively.
The following theorem gives the relationship between local maxima of a
function defined on an interval and the values of its derivative.

THEOREM 5.2.2 Let f be a real-valued function defined on an interval I,


and suppose f has either a local minimum or local maximum at p ∈ Int(I). If
f is differentiable at p, then $f'(p) = 0$.

Proof. If f is differentiable at p ∈ Int(I), then $f'_-(p)$ and $f'_+(p)$ both exist and are equal. Suppose f has a local maximum at p. Then there exists a δ > 0 such that f(t) ≤ f(p) for all t ∈ I with |t − p| < δ. In particular, if p < t < p + δ, t ∈ I, then
\[ \frac{f(t) - f(p)}{t - p} \le 0. \]
Thus $f'_+(p) \le 0$. Similarly, if p − δ < t < p,
\[ \frac{f(t) - f(p)}{t - p} \ge 0, \]
and therefore $f'_-(p) \ge 0$. Finally, since $f'_+(p) = f'_-(p) = f'(p)$, we have $f'(p) = 0$. The proof of the case where f has a local minimum at p is similar.

As a consequence of the previous theorem we have the following corollary.

COROLLARY 5.2.3 Let f be a continuous real-valued function on [a, b]. If


f has a relative maximum or minimum at p ∈ (a, b), then either the derivative
of f at p does not exist, or f ′ (p) = 0.

Remark. The conclusion of Theorem 5.2.2 is not valid if p ∈ I is an endpoint of the interval. For example, if f : [a, b] → R has a relative maximum at a, and if f is differentiable at a, then we can only conclude that $f'(a) = f'_+(a) \le 0$. This is illustrated in the following examples.

EXAMPLES 5.2.4 (a) The function
\[ f(x) = \left(x - \tfrac12\right)^2, \quad 0 \le x \le 2, \]
has a local maximum at p = 0 and p = 2, and an absolute minimum at $q = \tfrac12$. By computation, we have $f'_+(0) = -1$, $f'_-(2) = 3$, and $f'(\tfrac12) = 0$. In Exercise 1 you will be asked to graph the function f.
(b) The function f(x) = |x|, x ∈ [−1, 1], has an absolute minimum at p = 0. However, by Example 5.1.3(d) the derivative does not exist at p = 0.


Rolle’s theorem
Prior to stating and proving the mean value theorem we first state and prove
the following theorem due to Michel Rolle (1652–1719).

THEOREM 5.2.5 (Rolle’s Theorem) Suppose $f$ is a continuous real-valued function on $[a, b]$ with $f(a) = f(b)$, and that $f$ is differentiable on $(a, b)$. Then there exists $c \in (a, b)$ such that $f'(c) = 0$.

Since the derivative of f at c gives the slope of the tangent line at (c, f (c)), a
geometric interpretation of Rolle’s theorem is that if f satisfies the hypothesis
of the theorem, then there exists at least one value of c ∈ (a, b) where the
tangent line to the graph of f is horizontal. For the function f depicted in
Figure 5.3, there are exactly two such points.
Proof. If f is constant on [a, b], then f ′ (x) = 0 for all x ∈ (a, b). Thus, we
assume that f is not constant. Since the closed interval [a, b] is compact, by
Corollary 4.2.9, f has a maximum and a minimum on [a, b]. If f (t) > f (a) for
some t, then f has a maximum at some c ∈ (a, b). Thus by Theorem 5.2.2,
f ′ (c) = 0. If f (t) < f (a) for some t, then f has a minimum at some c ∈ (a, b),
and thus again f ′ (c) = 0. 
FIGURE 5.3
Rolle’s theorem

Remarks. (a) Continuity of $f$ on $[a, b]$ is required in the proof of Rolle’s theorem. The function
\[ f(x) = \begin{cases} x, & 0 \le x < 1, \\ 0, & x = 1, \end{cases} \]
is differentiable on $(0, 1)$ and satisfies $f(0) = f(1) = 0$; yet $f'(x) \ne 0$ for all $x \in (0, 1)$. The function $f$ fails to be continuous at 1.
(b) For Rolle’s theorem, differentiability of $f$ at $a$ and $b$ is not required. For example, the function $f(x) = \sqrt{4 - x^2}$, $x \in [-2, 2]$, satisfies the hypothesis of Rolle’s theorem, yet the derivative does not exist at $-2$ and $2$. For $x \in (-2, 2)$,
\[ f'(x) = \frac{-x}{\sqrt{4 - x^2}}, \]
and the conclusion of Rolle’s theorem is satisfied with $c = 0$.

The Mean Value Theorem


As a consequence of Rolle’s theorem we obtain the mean value theorem. This
result is usually attributed to Joseph Lagrange (1736–1813).

THEOREM 5.2.6 (Mean Value Theorem) If $f : [a, b] \to \mathbb{R}$ is continuous on $[a, b]$ and differentiable on $(a, b)$, then there exists $c \in (a, b)$ such that
\[ f(b) - f(a) = f'(c)\,(b - a). \]

Graphically, the mean value theorem states that there exists at least one
point c ∈ (a, b) such that the slope of the tangent line to the graph of the
function f is equal to the slope of the straight line passing through (a, f (a))
and (b, f (b)). For the function of Figure 5.4, there are two such values of c,
namely c1 and c2 .
Proof. Consider the function $g$ defined on $[a, b]$ by
\[ g(x) = f(x) - f(a) - \left[\frac{f(b) - f(a)}{b - a}\right](x - a). \]
Then $g$ is continuous on $[a, b]$, differentiable on $(a, b)$, with $g(a) = g(b)$. Thus by Rolle’s theorem there exists $c \in (a, b)$ such that $g'(c) = 0$. But
\[ g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a} \]
for all $x \in (a, b)$. Taking $x = c$ gives $f'(c) = \dfrac{f(b) - f(a)}{b - a}$, from which the conclusion now follows. □
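
The mean value theorem only asserts that a suitable $c$ exists. As a rough numerical illustration, the sketch below locates one such point by bisection on $f'(x) - \frac{f(b) - f(a)}{b - a}$. The helper name `mean_value_point`, the sample function $f(x) = x^3$ on $[0, 2]$, and the extra assumptions (that $f'$ is continuous, is defined at the endpoints, and that the difference changes sign on $[a, b]$) are choices made for this illustration and are not part of the theorem.

```python
import math


def mean_value_point(f, df, a, b, tol=1e-12):
    """Approximate one c in (a, b) with df(c) = (f(b) - f(a)) / (b - a).

    Illustrative assumptions (not required by the theorem itself):
    df is continuous on [a, b] and df - slope changes sign there.
    """
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    g_lo = df(lo) - slope
    g_hi = df(hi) - slope
    if g_lo * g_hi > 0:
        raise ValueError("df - slope does not change sign on [a, b]")
    # Standard bisection: keep the half-interval on which the sign changes.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = df(mid) - slope
        if g_mid == 0.0:
            return mid
        if g_lo * g_mid < 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)


# f(x) = x**3 on [0, 2]: the chord has slope 4, so 3*c**2 = 4 and c = 2/sqrt(3).
c = mean_value_point(lambda x: x**3, lambda x: 3 * x**2, 0.0, 2.0)
print(c, 2 / math.sqrt(3))
```

For this sample function the equation $3c^2 = 4$ can of course be solved by hand; the numerical search is only meant to make the statement of the theorem concrete.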

FIGURE 5.4
Mean value theorem

The mean value theorem is one of the fundamental results of differential calculus. Its importance lies in the fact that it enables us to obtain information about a function $f$ from its derivative $f'$. In Example 5.2.7 we will illustrate how the mean value theorem can be used to derive inequalities. Other applications will be given later in this section and in the exercises. It will also be used in many other instances later in the text.

EXAMPLE 5.2.7 In this example, we illustrate how the mean value theorem may be used in proving elementary inequalities. We will use it to prove that
\[ \frac{x}{1 + x} \le \ln(1 + x) \le x \quad \text{for all } x > -1, \]
where $\ln$ denotes the natural logarithm function. This function is considered in greater detail in Example 6.3.5 of the next chapter. Let $f(x) = \ln(1 + x)$, $x \in (-1, \infty)$. Then $f(0) = 0$. If $x > 0$, then by the mean value theorem, there exists $c \in (0, x)$ such that
\[ \ln(1 + x) = f(x) - f(0) = f'(c)\,x. \]
But $f'(c) = (1 + c)^{-1}$ and $(1 + x)^{-1} < (1 + c)^{-1} < 1$ for all $c \in (0, x)$. Therefore
\[ \frac{x}{1 + x} < f'(c)\,x < x, \]
and as a consequence
\[ \frac{x}{1 + x} \le \ln(1 + x) \le x \quad \text{for all } x \ge 0. \]
Now suppose $-1 < x < 0$. Then again by the mean value theorem there exists $c \in (x, 0)$ such that
\[ \ln(1 + x) = f(x) - f(0) = \frac{x}{1 + c}. \]
But since $x < c < 0$, we have $1 < (1 + c)^{-1} < (1 + x)^{-1}$, and since $x$ is negative,
\[ \frac{x}{1 + x} < \ln(1 + x) < x. \]
Hence the desired inequality holds for all $x > -1$, with equality if and only if $x = 0$. □
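
The inequality of Example 5.2.7 can be spot-checked numerically. The following is a minimal sketch, not a proof: it evaluates the three quantities at a handful of sample points chosen here for illustration. The helper name `log_bounds` and the list `samples` are hypothetical.

```python
import math


def log_bounds(x):
    """Return (x / (1 + x), ln(1 + x), x) for x > -1."""
    if x <= -1:
        raise ValueError("x must be greater than -1")
    return x / (1 + x), math.log1p(x), x


# Sample points on both sides of 0, chosen only for illustration.
samples = [-0.9, -0.5, -0.1, 0.0, 0.1, 1.0, 10.0, 1000.0]
results = [log_bounds(x) for x in samples]
for x, (lower, middle, upper) in zip(samples, results):
    print(f"x = {x:8.2f}: {lower:12.6f} <= {middle:12.6f} <= {upper:12.6f}")
```

A finite table of values cannot replace the mean value theorem argument, but it does show how tight the bounds are near $x = 0$ and how far apart they drift for large $x$.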

The following theorem, attributed to Cauchy, is a useful generalization of


the mean value theorem.

THEOREM 5.2.8 (Cauchy Mean Value Theorem) If $f, g$ are continuous real-valued functions on $[a, b]$ that are differentiable on $(a, b)$, then there exists $c \in (a, b)$ such that
\[ [f(b) - f(a)]\,g'(c) = [g(b) - g(a)]\,f'(c). \]

Proof. Let
h(x) = [f (b) − f (a)] g(x) − [g(b) − g(a)] f (x).
Then h is continuous on [a, b], differentiable on (a, b) with

h(a) = f (b)g(a) − f (a)g(b) = h(b).

Thus by Rolle’s theorem, there exists c ∈ (a, b) such that h′ (c) = 0, which
gives the result. 
The geometric interpretation of the Cauchy mean value theorem is very similar to that of the mean value theorem. If $g'(x) \ne 0$ for all $x \in (a, b)$, then $g(a) \ne g(b)$ and the conclusion of Theorem 5.2.8 can be written as
\[ \frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(c)}{g'(c)}. \]

FIGURE 5.5
Cauchy mean value theorem

Suppose $x = g(t)$, $y = f(t)$, $a \le t \le b$, is a parametric representation of a curve $C$ in the plane. As $t$ moves along the interval $[a, b]$ the point $(x, y)$ moves along $C$ from the point $P = (g(a), f(a))$ to $Q = (g(b), f(b))$. The slope of the line joining $P$ to $Q$ is given by $[f(b) - f(a)]/[g(b) - g(a)]$ (see Figure 5.5). On the other hand, the quantity $f'(t)/g'(t)$ is the slope of the curve $C$ at the point $(g(t), f(t))$. Thus one meaning of Theorem 5.2.8 is that there must be a point on the curve $C$ where the slope of the curve is the same as the slope of the line joining $P$ to $Q$.
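
As a small worked illustration of this interpretation (the choice of functions is ours and is not taken from the text), let $f(t) = \sin t$ and $g(t) = \cos t$ for $0 \le t \le \pi/2$, so that $C$ is a quarter of the unit circle running from $P = (1, 0)$ to $Q = (0, 1)$. Here $g'(t) = -\sin t \ne 0$ on $(0, \pi/2)$, and the conclusion of Theorem 5.2.8 reads
\[ [f(\tfrac{\pi}{2}) - f(0)]\,g'(c) = [g(\tfrac{\pi}{2}) - g(0)]\,f'(c) \iff (1 - 0)(-\sin c) = (0 - 1)\cos c \iff \sin c = \cos c, \]
which holds for $c = \pi/4$. The slope of the line joining $P$ to $Q$ is
\[ \frac{f(\tfrac{\pi}{2}) - f(0)}{g(\tfrac{\pi}{2}) - g(0)} = \frac{1 - 0}{0 - 1} = -1, \]
and the slope of the curve at $t = \pi/4$ is
\[ \frac{f'(\tfrac{\pi}{4})}{g'(\tfrac{\pi}{4})} = \frac{\cos(\pi/4)}{-\sin(\pi/4)} = -1, \]
so the tangent to the quarter circle at its midpoint is parallel to the chord from $P$ to $Q$.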

Applications of the Mean Value Theorem


We now give several consequences of the mean value theorem. Additional
applications are also given in the exercises. In the following, I will denote an
arbitrary interval in R.

THEOREM 5.2.9 Suppose f : I → R is differentiable on the interval I.


(a) If f ′ (x) ≥ 0 for all x ∈ I, then f is monotone increasing on I.
(b) If f ′ (x) > 0 for all x ∈ I, then f is strictly increasing on I.
(c) If f ′ (x) ≤ 0 for all x ∈ I, then f is monotone decreasing on I.
(d) If f ′ (x) < 0 for all x ∈ I, then f is strictly decreasing on I.
(e) If f ′ (x) = 0 for all x ∈ I, then f is constant on I.

Proof. Suppose $x_1, x_2 \in I$ with $x_1 < x_2$. By the mean value theorem applied to $f$ on $[x_1, x_2]$,
\[ f(x_2) - f(x_1) = f'(c)\,(x_2 - x_1) \]
for some $c \in (x_1, x_2)$. If $f'(c) \ge 0$, then $f(x_2) \ge f(x_1)$. Thus, if $f'(x) \ge 0$ for all $x \in I$, we have $f(x_2) \ge f(x_1)$ for all $x_1, x_2 \in I$ with $x_1 < x_2$. Thus $f$ is monotone increasing on $I$. The other results follow similarly. □

Remark. It needs to be emphasized that if the derivative of a function f
is positive at a point c, then this does not imply that f is increasing on
an interval containing c. The function f of Exercise 20 satisfies f ′ (0) = 1,
but f ′ (x) assumes both negative and positive values in every neighborhood
of 0. Thus f is not monotone on any interval containing 0. If f ′ (c) > 0, the
only conclusion that can be reached is that there exists a δ > 0 such that
f (x) < f (c) for all x ∈ (c − δ, c) and f (x) > f (c) for all x ∈ (c, c + δ) (Exercise
17). This however does not mean that f is increasing on (c−δ, c+δ). However,
if f ′ (c) > 0 and f ′ is continuous at c, then there exists a δ > 0 such that
f ′ (x) > 0 for all x ∈ (c − δ, c + δ). Thus f is increasing on (c − δ, c + δ).
Theorem 5.2.9 is often used to determine maxima and minima of functions
as follows: Suppose f is a real-valued continuous function on (a, b) and c ∈
(a, b) is such that f ′ (c) = 0 or f ′ (c) does not exist. Suppose f is differentiable
on (a, c) and (c, b). If f ′ (x) < 0 for all x ∈ (a, c) and f ′ (x) > 0 for all x ∈ (c, b),
then by Theorem 5.2.9, f is decreasing on (a, c) and increasing on (c, b). As a
consequence one concludes that f has a relative minimum at c. This method
is usually referred to as the first derivative test for relative maxima or
minima. The natural inclination is to think that the converse is also true;
namely, if f has a relative minimum at c, then f is decreasing to the left of c
and increasing to the right of c. This however, as the following example shows,
is false!

EXAMPLE 5.2.10 Let $f$ be defined by
\[ f(x) = \begin{cases} x^4\left(2 + \sin\dfrac{1}{x}\right), & x \ne 0, \\ 0, & x = 0. \end{cases} \]
The function $f$ has an absolute minimum at $x = 0$; however, $f'(x)$ has both negative and positive values in every neighborhood of 0. The details are left as an exercise (Exercise 21). The graph of $f'(x) = 4x^3(2 + \sin(1/x)) - x^2\cos(1/x)$, $x \ne 0$, is given in Figure 5.6. □
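
The sign behaviour of $f'$ claimed in Example 5.2.10 can be observed numerically. The sketch below evaluates the formula for $f'$ at two families of points tending to 0; the function names and the particular sample points $1/(2\pi k)$ and $1/((2k+1)\pi)$ are choices made for this illustration, and the computation is not a substitute for the argument requested in Exercise 21.

```python
import math


def fprime(x):
    """f'(x) = 4x^3 (2 + sin(1/x)) - x^2 cos(1/x) for x != 0."""
    return 4 * x**3 * (2 + math.sin(1 / x)) - x**2 * math.cos(1 / x)


def sample_signs(k):
    """Evaluate f' at two points near 0 indexed by the integer k >= 2.

    At x = 1/(2*pi*k):      sin(1/x) = 0, cos(1/x) = 1,  so f'(x) = x^2 (8x - 1).
    At x = 1/((2k + 1)*pi): sin(1/x) = 0, cos(1/x) = -1, so f'(x) = x^2 (8x + 1).
    The first value is negative as soon as x < 1/8, i.e. for k >= 2.
    """
    x_neg = 1 / (2 * math.pi * k)
    x_pos = 1 / ((2 * k + 1) * math.pi)
    return fprime(x_neg), fprime(x_pos)


for k in (2, 10, 100):
    negative_value, positive_value = sample_signs(k)
    print(k, negative_value, positive_value)
```

Both families of points converge to 0 as $k \to \infty$, so every neighborhood of 0 contains points of each kind.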

The following theorem, besides being useful in computing right or left
derivatives at a point, also states that the derivative (if it exists everywhere
on an interval) can only have discontinuities of the second kind.

THEOREM 5.2.11 Suppose $f : [a, b) \to \mathbb{R}$ is continuous on $[a, b)$ and differentiable on $(a, b)$. If $\lim_{x \to a^+} f'(x)$ exists, then $f'_+(a)$ exists and
\[ f'_+(a) = \lim_{x \to a^+} f'(x). \]


Proof. Let $L = \lim_{x \to a^+} f'(x)$, which is assumed to exist. Given $\epsilon > 0$, there exists a $\delta > 0$ such that
\[ |f'(x) - L| < \epsilon \quad \text{for all } x,\ a < x < a + \delta. \]
Suppose $0 < h < \delta$ is such that $a + h < b$. Since $f$ is continuous on $[a, a + h]$ and differentiable on $(a, a + h)$, by the mean value theorem $f(a + h) - f(a) = f'(\zeta_h)\,h$ for some $\zeta_h \in (a, a + h)$. Therefore
\[ \left|\frac{f(a + h) - f(a)}{h} - L\right| = |f'(\zeta_h) - L| < \epsilon \]
for all $h$, $0 < h < \delta$. Thus $f'_+(a) = L$. □

FIGURE 5.6
Graph of $f'(x) = 4x^3(2 + \sin(1/x)) - x^2\cos(1/x)$, $x \ne 0$

EXAMPLES 5.2.12 (a) To illustrate the previous theorem, consider the function
\[ f(x) = \begin{cases} x^2 + 1, & x < 1, \\ 3 - x^2, & x \ge 1. \end{cases} \]
For $x < 1$, $f'(x) = 2x$, which has a left limit of 2 at $x = 1$. Thus by the theorem,
\[ f'_-(1) = \lim_{x \to 1^-} 2x = 2. \]
Similarly,
\[ f'_+(1) = \lim_{x \to 1^+} (-2x) = -2. \]

(b) The converse of Theorem 5.2.11 is false. The function
\[ g(x) = \begin{cases} x^2 \sin\dfrac{1}{x}, & x \ne 0, \\ 0, & x = 0, \end{cases} \]
of Example 5.1.3(g) has the property that $g'(0)$ exists but $\lim_{x \to 0} g'(x)$ does not.
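
The one-sided derivatives computed in part (a) can be compared with one-sided difference quotients. The following is a small numerical sketch; the helper name `one_sided_quotient` and the step size $10^{-6}$ are arbitrary choices for this illustration.

```python
def f(x):
    """The piecewise function of part (a): x^2 + 1 for x < 1, 3 - x^2 for x >= 1."""
    return x * x + 1 if x < 1 else 3 - x * x


def one_sided_quotient(func, p, h):
    """Difference quotient (func(p + h) - func(p)) / h.

    A positive h approximates the right derivative at p,
    a negative h approximates the left derivative at p.
    """
    return (func(p + h) - func(p)) / h


right = one_sided_quotient(f, 1.0, 1e-6)   # should be close to f'_+(1) = -2
left = one_sided_quotient(f, 1.0, -1e-6)   # should be close to f'_-(1) = 2
print(left, right)
```

The two quotients approach different values, which reflects the fact that $f$ is continuous but not differentiable at $x = 1$.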


Intermediate Value Theorem for Derivatives


Our second important result of this section, due to Jean Gaston Darboux
(1842–1917), is the intermediate value theorem for derivatives. The remarkable
aspect of this theorem is that the hypothesis does not require continuity of
the derivative. If the derivative were continuous, then the result would follow
from Theorem 4.2.11 applied to f ′ .

THEOREM 5.2.13 (Intermediate Value Theorem for Derivatives) Suppose $I \subset \mathbb{R}$ is an interval and $f : I \to \mathbb{R}$ is differentiable on $I$. Then given $a, b$ in $I$ with $a < b$ and a real number $\lambda$ between $f'(a)$ and $f'(b)$, there exists $c \in (a, b)$ such that $f'(c) = \lambda$.

Proof. Define $g$ by $g(x) = f(x) - \lambda x$. Then $g$ is differentiable on $I$ with
\[ g'(x) = f'(x) - \lambda. \]
Suppose $f'(a) < \lambda < f'(b)$. Then $g'(a) < 0$ and $g'(b) > 0$. As in the remark following Theorem 5.2.9, since $g'(a) < 0$ there exists an $x_1 > a$ such that $g(x_1) < g(a)$. Also, since $g'(b) > 0$, there exists an $x_2 < b$ such that $g(x_2) < g(b)$. As a consequence, the continuous function $g$ attains its absolute minimum on $[a, b]$ at some point $c \in (a, b)$. But then by Theorem 5.2.2,
\[ g'(c) = f'(c) - \lambda = 0, \]
i.e., $f'(c) = \lambda$. The case $f'(a) > \lambda > f'(b)$ is treated similarly. □
The previous theorem is often used in calculus to determine where a function is increasing or decreasing. Suppose it has been determined that the derivative $f'$ is zero at $c_1$ and $c_2$ with $c_1 < c_2$, and that $f'(x) \ne 0$ for all $x \in (c_1, c_2)$. Then by the previous theorem, it suffices to check the sign of the derivative at a single point in the interval $(c_1, c_2)$ to determine whether $f'$ is positive or negative on the whole interval $(c_1, c_2)$. Theorem 5.2.9 then allows us to determine whether $f$ is increasing or decreasing on $(c_1, c_2)$.

Inverse Function Theorem


We conclude this section with the following version of the inverse function
theorem.

THEOREM 5.2.14 (Inverse Function Theorem) Suppose $I \subset \mathbb{R}$ is an interval and $f : I \to \mathbb{R}$ is differentiable on $I$ with $f'(x) \ne 0$ for all $x \in I$. Then $f$ is one-to-one on $I$, the inverse function $f^{-1}$ is continuous and differentiable on $J = f(I)$ with
\[ \left(f^{-1}\right)'(f(x)) = \frac{1}{f'(x)} \]
for all $x \in I$.

Proof. Since $f'(x) \ne 0$ for all $x \in I$, by Theorem 5.2.13, $f'$ is either positive on $I$, or negative on $I$. Assume that $f'(x) > 0$ for all $x \in I$. Then by Theorem 5.2.9, $f$ is strictly increasing on $I$ and by Theorem 4.4.12 $f^{-1}$ is continuous on $J = f(I)$.
It remains to be shown that $f^{-1}$ is differentiable on $J$. Let $y_o \in J$, and let $\{y_n\}$ be any sequence in $J$ with $y_n \to y_o$, and $y_n \ne y_o$ for all $n$. For each $n$, there exists $x_n \in I$ such that $f(x_n) = y_n$. Since $f^{-1}$ is continuous, $x_n \to x_o = f^{-1}(y_o)$. Hence
\[ \lim_{n \to \infty} \frac{f^{-1}(y_n) - f^{-1}(y_o)}{y_n - y_o} = \lim_{n \to \infty} \frac{x_n - x_o}{f(x_n) - f(x_o)} = \frac{1}{f'(x_o)}. \]
Since this holds for any sequence $\{y_n\}$ with $y_n \to y_o$, $y_n \ne y_o$, by Theorem 4.1.3 and the definition of the derivative
\[ \left(f^{-1}\right)'(y_o) = \frac{1}{f'(x_o)}. \] □

Remark. The hypothesis that $f'(x) \ne 0$ for all $x \in I$ is crucial. For example, the function $f(x) = x^3$ is strictly increasing on $[-1, 1]$ with $f'(0) = 0$. The inverse function $f^{-1}(y) = y^{1/3}$ however is not differentiable at $y = 0$.

EXAMPLES 5.2.15 (a) As an application of the previous theorem we show that $f(x) = x^{1/n}$, $x \in (0, \infty)$, $n \in \mathbb{N}$, is differentiable on $(0, \infty)$ with
\[ f'(x) = \frac{1}{n}\,x^{\frac{1}{n} - 1} \]
for all $x \in (0, \infty)$. Consider the function $g(x) = x^n$, $n \in \mathbb{N}$, $\operatorname{Dom} g = (0, \infty)$. Then $g'(x) = n x^{n-1}$ and $g'(x) > 0$ for all $x \in (0, \infty)$. By the previous theorem $g^{-1}$ is differentiable on $J = g((0, \infty)) = (0, \infty)$ with
\[ (g^{-1})'(g(x)) = \frac{1}{g'(x)} = \frac{1}{n x^{n-1}}. \]
If we set $y = g(x) = x^n$, then $x = y^{1/n}$ and
\[ (g^{-1})'(y) = \frac{1}{n\,(y^{1/n})^{n-1}} = \frac{1}{n}\,y^{\frac{1}{n} - 1}. \]
Since $f = g^{-1}$ the desired result follows.


(b) As in Example 5.2.7, let $L(x) = \ln x$ denote the natural logarithm function on $(0, \infty)$. Since $L'(x) = 1/x$ is strictly positive on $(0, \infty)$, the function $L$ is one-to-one, the inverse function $L^{-1}$ is continuous on $\mathbb{R} = \operatorname{Range} L$, and by Theorem 5.2.14,
\[ (L^{-1})'(L(x)) = \frac{1}{L'(x)} = x. \]
If we set $E = L^{-1}$, then $E'(L(x)) = x$, or $E'(y) = E(y)$ where $y = L(x)$. The function $E(x)$, $x \in \mathbb{R}$, is called the natural exponential function on $\mathbb{R}$ and is usually denoted by $e^x$, where $e$ is Euler’s number of Example 3.3.5. The exponential function $E(x)$ is considered in greater detail in Example 8.7.20(d).

(c) In this example, we consider the inverse function of $g(x) = \cos x$, $x \in [0, \pi]$. Since $g'(x) = -\sin x$ is strictly negative for $x \in (0, \pi)$, the function $g$ is strictly decreasing on $[0, \pi]$ with $g([0, \pi]) = [-1, 1]$. Thus for $y \in [-1, 1]$, $x = \arccos y$ if and only if $\cos x = y$. Finally, since $g'(x) \ne 0$ for $x \in (0, \pi)$, by the inverse function theorem
\[ (g^{-1})'(g(x)) = \frac{1}{g'(x)} = \frac{1}{-\sin x} = \frac{-1}{\sqrt{1 - \cos^2 x}}, \]
or since $y = \cos x$,
\[ \frac{d}{dy} \arccos y = \frac{-1}{\sqrt{1 - y^2}}. \] □
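
The formula of part (c) can be cross-checked numerically. The sketch below compares three quantities at a sample point: the value $1/g'(g^{-1}(y))$ given by the inverse function theorem, the closed form $-1/\sqrt{1 - y^2}$, and a central difference quotient of `math.acos`. The helper names, the sample point $y = 0.3$, and the step size are assumptions made only for this illustration.

```python
import math


def inverse_derivative(dg, g_inv, y):
    """Return 1 / g'(g^{-1}(y)), the derivative of g^{-1} at y when g'(x) != 0."""
    x = g_inv(y)
    return 1.0 / dg(x)


def central_difference(func, y, h=1e-6):
    """Symmetric difference quotient approximating func'(y)."""
    return (func(y + h) - func(y - h)) / (2 * h)


y = 0.3
# g(x) = cos x on [0, pi], so g'(x) = -sin x and g^{-1} = arccos.
from_theorem = inverse_derivative(lambda x: -math.sin(x), math.acos, y)
closed_form = -1.0 / math.sqrt(1.0 - y * y)
numeric = central_difference(math.acos, y)
print(from_theorem, closed_form, numeric)
```

All three numbers should agree to several decimal places for any $y$ in $(-1, 1)$ that is not too close to the endpoints, where the derivative of $\arccos$ is unbounded.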

Exercises 5.2
1. Graph the function $f(x) = (x - \tfrac{1}{2})^2$, $0 \le x \le 2$. Show that $f'_+(0) = -1$ and $f'_-(2) = 3$.
2. Which of the following functions satisfy the hypothesis of the mean value theorem? For those to which the mean value theorem applies, calculate a suitable $c$.
a. $f(x) = |x|$, $-2 \le x \le 2$
b. $f(x) = 2x - x^3$, $0 \le x \le 2$
c. $f(x) = \dfrac{x}{x + 2}$, $-1 \le x \le 2$
d. $f(x) = 1 - x^{2/3}$, $-2 \le x \le 1$
3. For each of the following functions determine the interval(s) where the function is increasing, decreasing, and find all local maxima and minima.
*a. $f(x) = x^3 + 6x - 5$, $x \in \mathbb{R}$.
b. $g(x) = 4x - x^4$, $x \in \mathbb{R}$.
2
x √
*c. h(x) = , x ∈ R. d. k(x) = x − 12 x, x ≥ 0.
1 + x2
*e. $l(x) = x + \dfrac{4}{x^2}$, $x \ne 0$.
f. $f(x) = \dfrac{x - a}{x - b}$, $a \ne b$, $x \ne b$
4. Let $f(x) = \sum_{i=1}^{n} (x - a_i)^2$, where $a_1, a_2, \ldots, a_n$ are constants. Find the value of $x$ where $f$ is a minimum.
5. As in Example 5.2.7 use the mean value theorem to establish each of the following inequalities.
*a. $\sqrt{1 + x} < 1 + \tfrac{1}{2}x$, $x > -1$
b. $e^x \ge 1 + x$, $x \in \mathbb{R}$
*c. $x^\alpha - a^\alpha < \alpha a^{\alpha - 1}(x - a)$, $0 < a < x$, $0 < \alpha < 1$
d.¹ $(1 + x)^\alpha \ge 1 + \alpha x$, $x > -1$, $\alpha > 1$

¹ For $\alpha \in \mathbb{N}$, this inequality was proved by mathematical induction in Example 1.3.2(b). In this exercise and in Exercise 6(b) you may assume that for $\alpha \in \mathbb{R}$, $\frac{d}{dx} x^\alpha = \alpha x^{\alpha - 1}$.

6. Prove each of the following inequalities.
*a. $a^{1/n} - b^{1/n} < (a - b)^{1/n}$, $a > b > 0$, $n \in \mathbb{N}$, $n \ge 2$.
*b. $a^\alpha b^{1 - \alpha} \le \alpha a + (1 - \alpha) b$, $a, b > 0$, $0 < \alpha < 1$
7. (Second Derivative Test) Let f : [a, b] → R be differentiable on (a, b).
Suppose c ∈ (a, b) is such that f ′ (c) = 0, and f ′′ (c) exists.
*a. If f ′′ (c) > 0, prove that f has a local minimum at c.
b. If f ′′ (c) < 0, prove that f has a local maximum at c.
c. Show by examples that no conclusion can be made if f ′′ (c) = 0.
8. *Suppose $f : (a, b) \to \mathbb{R}$ satisfies $|f(x) - f(y)| \le M |x - y|^\alpha$ for some $\alpha > 1$ and all $x, y \in (a, b)$. Prove that $f$ is a constant function on $(a, b)$.
9. *Find a polynomial $P(x)$ of degree less than or equal to 2 with $P(2) = 0$ such that the function
\[ f(x) = \begin{cases} x^2, & x \le 1, \\ P(x), & x > 1, \end{cases} \]
is differentiable at $x = 1$.
10. *Let $g$ be defined by
\[ g(x) = \begin{cases} 2\sin x + \cos 2x, & x \le 0, \\ ax^2 + bx + c, & x > 0. \end{cases} \]
Determine the constants $a$, $b$, and $c$ such that $g'(0)$ and $g''(0)$ exist.
11. a. Suppose f is differentiable on an interval I. Prove that f ′ is bounded
on I if and only if there exists a constant M such that |f (x) − f (y)| ≤
M |x − y| for all x, y ∈ I.
b. Prove that | sin x − sin y| ≤ |x − y| for all x, y ∈ R.
c. Prove that $|\sqrt{x} - \sqrt{y}| \le \dfrac{1}{2\sqrt{a}}\,|x - y|$ for all $x, y \in [a, \infty)$, $a > 0$.
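12. *a. Show that $\tan x > x$ for $x \in (0, \tfrac{\pi}{2})$.
*b. Set
\[ f(x) = \begin{cases} \dfrac{\sin x}{x}, & x \in (0, \tfrac{\pi}{2}], \\ 1, & x = 0. \end{cases} \]
Show that $f$ is strictly decreasing on $[0, \tfrac{\pi}{2}]$.
c. Using the result of part (b), prove that $\dfrac{2}{\pi}\,x \le \sin x \le x$ for all $x \in [0, \tfrac{\pi}{2}]$.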
13. Give an example of a uniformly continuous function on [0, 1] that is differentiable on (0, 1) but for which f ′ is not bounded on (0, 1).
14. *a. Suppose f ′ (x) exists for all x ∈ (a, b). Let c ∈ (a, b). Show that
there exists a sequence {xn } in (a, b) with xn ≠ c and xn → c such that
f ′ (xn ) → f ′ (c).
b. Does f ′ (xn ) → f ′ (c) for every sequence {xn } with xn → c?

15. Let $f, g : [0, \infty) \to \mathbb{R}$ be continuous on $[0, \infty)$ and differentiable on $(0, \infty)$. If $f(0) = g(0)$ and $f'(x) \ge g'(x)$ for all $x \in (0, \infty)$, prove that $f(x) \ge g(x)$ for all $x \in [0, \infty)$.
16. A differentiable function $f : [a, b] \to \mathbb{R}$ is uniformly differentiable on $[a, b]$ if for every $\epsilon > 0$ there exists a $\delta > 0$ such that
\[ \left|\frac{f(t) - f(x)}{t - x} - f'(x)\right| < \epsilon \]
for all $t, x \in [a, b]$ with $0 < |t - x| < \delta$. Show that $f$ is uniformly differentiable on $[a, b]$ if and only if $f'$ is continuous on $[a, b]$.

17. *Suppose $f : [a, b] \to \mathbb{R}$ with $f'_+(a) > 0$. Prove that there exists a $\delta > 0$ such that $f(x) > f(a)$ for all $x$, $a < x < a + \delta$.
18. Prove that the equation $x^3 - 3x + b = 0$ has at most one root in the interval $[-1, 1]$.
19. *Suppose g is differentiable on (a, b) with |g ′ (x)| ≤ M for all x ∈ (a, b).
Prove that there exists an ǫ > 0 such that the function f (x) = x + ǫ g(x)
is one-to-one on (a, b).
20. Let
\[ f(x) = \begin{cases} x + 2x^2 \sin\dfrac{1}{x}, & x \ne 0, \\ 0, & x = 0. \end{cases} \]
a. Show that $f'(0) = 1$.
*b. Prove that $f'(x)$ assumes both positive and negative values in every neighborhood of 0.
21. Let $f$ be defined by
\[ f(x) = \begin{cases} x^4\left(2 + \sin\dfrac{1}{x}\right), & x \ne 0, \\ 0, & x = 0. \end{cases} \]
a. Show that $f$ has an absolute minimum at $x = 0$.
b. Show that $f'(x)$ assumes both negative and positive values in every neighborhood of 0.
22. Let $f(x) = x^2$, $g(x) = x^3$, $x \in [-1, 1]$.
a. Find $c \in (-1, 1)$ such that the conclusion of Theorem 5.2.8 holds.
b. Show that there does not exist any $c \in (-1, 1)$ for which
\[ \frac{f(1) - f(-1)}{g(1) - g(-1)} = \frac{f'(c)}{g'(c)}. \]
23. For $r \in \mathbb{Q}$ and $x > 0$, let $f(x) = x^r$. Prove that $f'(x) = r\,x^{r-1}$.
24. Suppose L : (0, ∞) → R is a differentiable function satisfying L′ (x) = 1/x
with L(1) = 0. Prove each of the following:
*a. L(ab) = L(a) + L(b) for all a, b ∈ (0, ∞)
b. L(1/b) = −L(b), b > 0
*c. $L(b^r) = rL(b)$, $b > 0$, $r \in \mathbb{R}$

d. L(e) = 1, where e is Euler’s number


e. Range L = R
25. Let $g(x) = \tan x$, $-\tfrac{\pi}{2} < x < \tfrac{\pi}{2}$.
a. Show that $g$ is one-to-one on $(-\tfrac{\pi}{2}, \tfrac{\pi}{2})$ with $\operatorname{Range} g = \mathbb{R}$.
*b. Let $\arctan x$, $x \in \mathbb{R}$, denote the inverse function of $g$. Use Theorem 5.2.14 to prove that
\[ \frac{d}{dx} \arctan x = \frac{1}{1 + x^2}. \]
c. Sketch the graph of $\tan x$ and $\arctan x$.
26. a. Show that $f(x) = \sin x$ is one-to-one on $[-\tfrac{\pi}{2}, \tfrac{\pi}{2}]$ with $f([-\tfrac{\pi}{2}, \tfrac{\pi}{2}]) = [-1, 1]$.
b. For $x \in [-1, 1]$, let $\arcsin x$ denote the inverse function of $f$. Show that $\arcsin x$ is differentiable on $(-1, 1)$ and find the derivative of $\arcsin x$.
27. Let $f : (0, \infty) \to \mathbb{R}$ be differentiable on $(0, \infty)$ and suppose that $\lim_{x \to \infty} f'(x) = L$.
a. Show that for any $h > 0$, $\lim_{x \to \infty} \dfrac{f(x + h) - f(x)}{h} = L$.
b. Show that $\lim_{x \to \infty} \dfrac{f(x)}{x} = L$.

5.3 L’Hospital’s Rule


As another application of the mean value theorem we now prove l’Hospital’s
rule for evaluating limits. Although the theorem is named after the Marquis
de l’Hospital (1661–1704), it should more appropriately be called Bernoulli’s
rule. The story is that in 1691, l’Hospital asked Johann Bernoulli (1667–1748)
to provide, for a fee, lectures on the new subject of calculus. L’Hospital subsequently
incorporated these lectures into the first calculus text L’Analyse des
infiniment petits (Analysis of infinitely small quantities) published in 1696.
The initial version (stated without the use of limits) of what is now known as
l’Hospital’s rule first appeared in this text.

Infinite Limits
Since l’Hospital’s rule allows for infinite limits, we provide the following definitions.

DEFINITION 5.3.1 Let $f$ be a real-valued function defined on a subset $E$ of $\mathbb{R}$ and let $p$ be a limit point of $E$. We say that $f$ tends to $\infty$ (or diverges to $\infty$) as $x$ approaches $p$, denoted
\[ \lim_{x \to p} f(x) = \infty, \]
if for every $M \in \mathbb{R}$, there exists a $\delta > 0$ such that
\[ f(x) > M \quad \text{for all } x \in E \text{ with } 0 < |x - p| < \delta. \]
Similarly,
\[ \lim_{x \to p} f(x) = -\infty, \]
if for every $M \in \mathbb{R}$, there exists a $\delta > 0$ such that
\[ f(x) < M \quad \text{for all } x \in E \text{ with } 0 < |x - p| < \delta. \]

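For $f$ defined on an appropriate subset $E$ of $\mathbb{R}$ it is also possible to define each of the following limits:
\[ \lim_{x \to p^+} f(x) = \pm\infty, \quad \lim_{x \to p^-} f(x) = \pm\infty, \quad \lim_{x \to \infty} f(x) = \pm\infty, \quad \lim_{x \to -\infty} f(x) = \pm\infty. \]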

Since these definitions are similar to Definitions 4.1.11 and 4.4.1 they are left
to the exercises (Exercise 1).
Remark. Since we now allow the possibility of a function having infinite
limits, it needs to be emphasized that when we say that a function f has a
limit at p ∈ R (or at ±∞), we mean a finite limit.

L’Hospital’s Rule
L’Hospital’s rule is useful for evaluating limits of the form
\[ \lim_{x \to p} \frac{f(x)}{g(x)} \]
where either (a) $\lim_{x \to p} f(x) = \lim_{x \to p} g(x) = 0$ or (b) $f$ and $g$ tend to $\pm\infty$ as $x \to p$. If (a) holds, then $\lim_{x \to p} (f(x)/g(x))$ is usually referred to as indeterminate of form $0/0$, whereas in (b) the limit is referred to as indeterminate of form $\infty/\infty$. The reason that (a) and (b) are called indeterminate is that previous methods may no longer apply.
In (a), if either $\lim_{x \to p} f(x)$ or $\lim_{x \to p} g(x)$ is nonzero, then previous methods discussed in Section 4.1 apply. For example, if both $f$ and $g$ have limits at $p$ and $\lim_{x \to p} g(x) \ne 0$, then by Theorem 4.1.6(c)
\[ \lim_{x \to p} \frac{f(x)}{g(x)} = \frac{\lim_{x \to p} f(x)}{\lim_{x \to p} g(x)}. \]
On the other hand, if $\lim_{x \to p} f(x) = A \ne 0$ and $g(x) > 0$ with $\lim_{x \to p} g(x) = 0$, then as $x \to p$, $f(x)/g(x)$ tends to $\infty$ if $A > 0$, and to $-\infty$ if $A < 0$ (Exercise 5). However, if $\lim_{x \to p} f(x) = \lim_{x \to p} g(x) = 0$, then unless the quotient $f(x)/g(x)$ can somehow be simplified, previous methods may no longer be applicable.

THEOREM 5.3.2 (L’Hospital’s Rule) Suppose $f, g$ are real-valued differentiable functions on $(a, b)$, with $g'(x) \ne 0$ for all $x \in (a, b)$, where $-\infty \le a < b \le \infty$. Suppose
\[ \lim_{x \to a^+} \frac{f'(x)}{g'(x)} = L, \quad \text{where } L \in \mathbb{R} \cup \{-\infty, \infty\}. \]
If
(a) $\lim_{x \to a^+} f(x) = 0$ and $\lim_{x \to a^+} g(x) = 0$, or
(b) $\lim_{x \to a^+} g(x) = \pm\infty$,
then
\[ \lim_{x \to a^+} \frac{f(x)}{g(x)} = L. \]
Remark. The analogous result where $x \to b^-$ is obviously also true. A more elementary version of l’Hospital’s rule that relies only on the definition of the derivative is given in Exercise 2. Also, Exercise 7 provides examples of two functions $f$ and $g$ satisfying (a) for which $\lim_{x \to a} (f(x)/g(x))$ exists but $\lim_{x \to a} (f'(x)/g'(x))$ does not exist.
Proof. Suppose (a) holds. We first prove the case where $a$ is finite. Let $\{x_n\}$ be a sequence in $(a, b)$ with $x_n \to a$ and $x_n \ne a$ for all $n$. Since we want to apply the generalized mean value theorem (Theorem 5.2.8) to $f$ and $g$ on the interval $[a, x_n]$, we need both $f$ and $g$ continuous at $a$. This is accomplished by setting
\[ f(a) = g(a) = 0. \]
Then by hypothesis (a), $f$ and $g$ are continuous at $a$. Thus by the generalized mean value theorem, for each $n \in \mathbb{N}$ there exists $c_n$ between $a$ and $x_n$ such that
\[ [f(x_n) - f(a)]\,g'(c_n) = [g(x_n) - g(a)]\,f'(c_n), \]
or
\[ \frac{f(x_n)}{g(x_n)} = \frac{f'(c_n)}{g'(c_n)}. \]
Note, since $g'(x) \ne 0$ for all $x \in (a, b)$, $g(x_n) \ne g(a)$ for all $n$. As $n \to \infty$, $c_n \to a^+$. Thus by Theorem 4.1.3 and the hypothesis,
\[ \lim_{n \to \infty} \frac{f(x_n)}{g(x_n)} = \lim_{n \to \infty} \frac{f'(c_n)}{g'(c_n)} = \lim_{x \to a^+} \frac{f'(x)}{g'(x)} = L. \]
Since the above holds for every sequence $\{x_n\}$ with $x_n \to a^+$, the result follows.
Suppose $a = -\infty$. To handle this case, we make the substitution $x = -1/t$. Then as $t \to 0^+$, $x \to -\infty$. Define the functions $\varphi(t)$ and $\psi(t)$ on $(0, c)$ for some $c > 0$ by
\[ \varphi(t) = f(-1/t) \quad \text{and} \quad \psi(t) = g(-1/t). \]
We leave it as an exercise (Exercise 3) to verify that
\[ \lim_{t \to 0^+} \frac{\varphi'(t)}{\psi'(t)} = \lim_{x \to -\infty} \frac{f'(x)}{g'(x)} = L, \]
and that
\[ \lim_{t \to 0^+} \varphi(t) = \lim_{t \to 0^+} \psi(t) = 0. \]
Thus by the above,
\[ \lim_{x \to -\infty} \frac{f(x)}{g(x)} = \lim_{t \to 0^+} \frac{\varphi(t)}{\psi(t)} = L. \]
Suppose now that (b) holds, i.e., $\lim_{x \to a^+} g(x) = \infty$. The case where $g(x) \to -\infty$ is treated similarly. Rather than treating the finite case and infinite case separately, we provide a proof that works for both.
Suppose first that $-\infty \le L < \infty$, and $\beta \in \mathbb{R}$ satisfies $\beta > L$. Choose $r$ such that $L < r < \beta$. Since
\[ \lim_{x \to a^+} \frac{f'(x)}{g'(x)} < r, \]
there exists $c_1 \in (a, b)$ such that
\[ \frac{f'(\zeta)}{g'(\zeta)} < r \quad \text{for all } \zeta,\ a < \zeta < c_1. \]
Fix a $y$, $a < y < c_1$. Since $g(x) \to \infty$ as $x \to a^+$, there exists a $c_2$, $a < c_2 < y$, such that $g(x) > g(y)$ and $g(x) > 0$ for all $x$, $a < x < c_2$. Let $x \in (a, c_2)$ be arbitrary. Then by the generalized mean value theorem, there exists $\zeta \in (x, y)$ such that
\[ \frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(\zeta)}{g'(\zeta)} < r. \tag{4} \]
Multiplying (4) by $(g(x) - g(y))/g(x)$, which is positive, we obtain
\[ \frac{f(x) - f(y)}{g(x)} < r\left(1 - \frac{g(y)}{g(x)}\right), \]
or
\[ \frac{f(x)}{g(x)} < \frac{f(y)}{g(x)} + r\left(1 - \frac{g(y)}{g(x)}\right) \tag{5} \]
for all $x$, $a < x < c_2$. Now for fixed $y$, since $g(x) \to \infty$,
\[ \lim_{x \to a^+} \frac{f(y)}{g(x)} = \lim_{x \to a^+} \frac{g(y)}{g(x)} = 0. \]
Therefore
\[ \lim_{x \to a^+} \left[\frac{f(y)}{g(x)} + r\left(1 - \frac{g(y)}{g(x)}\right)\right] = r < \beta. \]

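Thus there exists $c_3$, $a < c_3 < c_2$, such that
\[ \frac{f(y)}{g(x)} + r\left(1 - \frac{g(y)}{g(x)}\right) < \beta \]
for all $x$, $a < x < c_3$. Thus by (5),
\[ \frac{f(x)}{g(x)} < \beta \quad \text{for all } x,\ a < x < c_3. \tag{6} \]
If $L = -\infty$, then for any $\beta \in \mathbb{R}$, there exists $c_3$ such that (6) holds for all $x$, $a < x < c_3$. Thus by definition,
\[ \lim_{x \to a^+} \frac{f(x)}{g(x)} = -\infty. \]
If $L$ is finite, then given $\epsilon > 0$, by taking $\beta = L + \epsilon$, there exists $c_3$ such that
\[ \frac{f(x)}{g(x)} < L + \epsilon \quad \text{for all } x,\ a < x < c_3. \tag{7} \]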

Suppose $-\infty < L \le \infty$. Let $\alpha \in \mathbb{R}$, $\alpha < L$ be arbitrary. Then an argument similar to the above gives the existence of $c_3' \in (a, b)$ such that
\[ \frac{f(x)}{g(x)} > \alpha \quad \text{for all } x,\ a < x < c_3'. \]
If $L = \infty$, then this implies that
\[ \lim_{x \to a^+} \frac{f(x)}{g(x)} = \infty. \]
On the other hand, if $L$ is finite, taking $\alpha = L - \epsilon$ gives the existence of a $c_3'$ such that
\[ \frac{f(x)}{g(x)} > L - \epsilon \quad \text{for all } x,\ a < x < c_3'. \]
Combining this with (7) proves that
\[ \lim_{x \to a^+} \frac{f(x)}{g(x)} = L. \] □

Remarks. (a) The proof of case (a) could have been done similarly to that of
(b), treating the case where a is finite and −∞ simultaneously. I chose not to
do so since making the substitution x = −1/t is a useful technique, reducing
problems involving limits at −∞ to right limits at 0. Conversely, limits at 0
can be transformed to limits at ±∞ with the substitution x = 1/t. These
new limits are in many instances easier to evaluate than the original. This is
illustrated in Example 5.3.4(c).

(b) In hypothesis (b) we only required that $\lim_{x \to a^+} g(x) = \pm\infty$. If $\lim_{x \to a^+} f(x)$ is finite, then it immediately follows that
\[ \lim_{x \to a^+} \frac{f(x)}{g(x)} = 0, \]
and l’Hospital’s rule is not required (Exercise 4). Thus in practice, hypothesis (b) of l’Hospital’s rule is used only if both $f$ and $g$ have infinite limits.
For convenience we stated and proved l’Hospital’s rule in terms of right
limits. Since the analogous results for left limits are also true, combining the
two results gives the following corollary.

COROLLARY 5.3.3 (L’Hospital’s Rule) Suppose $f, g$ are real-valued differentiable functions on $(a, p) \cup (p, b)$, with $g'(x) \ne 0$ for all $x \in (a, p) \cup (p, b)$, where $-\infty \le a < b \le \infty$. Suppose
\[ \lim_{x \to p} \frac{f'(x)}{g'(x)} = L, \quad \text{where } L \in \mathbb{R} \cup \{-\infty, \infty\}. \]
If
(a) $\lim_{x \to p} f(x) = \lim_{x \to p} g(x) = 0$, or
(b) $\lim_{x \to p} g(x) = \pm\infty$,
then
\[ \lim_{x \to p} \frac{f(x)}{g(x)} = L. \]

EXAMPLES 5.3.4 (a) Consider $\lim_{x \to 0^+} \dfrac{\ln(1 + x)}{x}$, where $\ln$ is the natural logarithm function on $(0, \infty)$. This limit is indeterminate of form $0/0$. With $f(x) = \ln(1 + x)$ and $g(x) = x$,
\[ \lim_{x \to 0^+} \frac{f'(x)}{g'(x)} = \lim_{x \to 0^+} \frac{1}{1 + x} = 1. \]
Thus by l’Hospital’s rule $\lim_{x \to 0^+} \dfrac{\ln(1 + x)}{x} = 1$.
Although l’Hospital’s rule provides an easy method for evaluating this limit, the result can also be obtained by using previous techniques. In Example 5.2.7 we proved that
\[ \frac{x}{1 + x} \le \ln(1 + x) \le x \]
for all $x > -1$. Thus
\[ \frac{1}{1 + x} \le \frac{\ln(1 + x)}{x} \le 1 \]
for all $x > 0$. Thus by Theorem 4.1.9 $\lim_{x \to 0^+} \dfrac{\ln(1 + x)}{x} = 1$.
(b) In this example, we consider $\lim_{x \to 0} \dfrac{1 - \cos x}{x^2}$. This is indeterminate of form $0/0$. If we apply l’Hospital’s rule we obtain
\[ \lim_{x \to 0} \frac{\sin x}{2x}, \]
which is again indeterminate of form $0/0$. However, applying l’Hospital’s rule one more time gives
\[ \lim_{x \to 0} \frac{\cos x}{2} = \frac{1}{2}. \]
Therefore $\lim_{x \to 0} \dfrac{1 - \cos x}{x^2} = \tfrac{1}{2}$.
(c) Consider
\[ \lim_{x \to 0^+} \frac{e^{-1/x}}{x}. \]
Since $\lim_{x \to 0^+} e^{-1/x} = 0$, the above limit is indeterminate of form $0/0$. If we apply l’Hospital’s rule we obtain
\[ \lim_{x \to 0^+} \frac{e^{-1/x}}{x^2}, \]
and this limit is more complicated than the original limit. However, if we let $t = 1/x$, then
\[ \lim_{x \to 0^+} \frac{e^{-1/x}}{x} = \lim_{t \to \infty} \frac{t}{e^t}. \]
This limit is indeterminate of form $\infty/\infty$. By l’Hospital’s rule
\[ \lim_{t \to \infty} \frac{t}{e^t} = \lim_{t \to \infty} \frac{1}{e^t} = 0. \]
Therefore, $\lim_{x \to 0^+} e^{-1/x}/x = 0$. □
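
The three limits of Examples 5.3.4 can be sanity-checked by evaluating the quotients at points close to 0. This is only a numerical sketch: the function names `ratio_a`, `ratio_b`, `ratio_c` and the sample points are choices made here, and floating-point evaluation near an indeterminate form is not a substitute for l’Hospital’s rule.

```python
import math


def ratio_a(x):
    """ln(1 + x) / x, expected to approach 1 as x -> 0+."""
    return math.log1p(x) / x


def ratio_b(x):
    """(1 - cos x) / x^2, expected to approach 1/2 as x -> 0."""
    return (1 - math.cos(x)) / x**2


def ratio_c(x):
    """exp(-1/x) / x, expected to approach 0 as x -> 0+."""
    return math.exp(-1 / x) / x


for x in (1e-1, 1e-2, 1e-3):
    print(x, ratio_a(x), ratio_b(x), ratio_c(x))
```

The step sizes are kept moderate on purpose: for much smaller $x$ the subtraction $1 - \cos x$ loses accuracy to rounding, and $e^{-1/x}$ underflows to zero.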

Exercises 5.3
1. Provide definitions for each of the following limits:
a. $\lim_{x \to p^+} f(x) = \infty$.
b. $\lim_{x \to \infty} f(x) = \infty$.

2. *Suppose $f, g$ are differentiable on $(a, b)$, $x_o \in (a, b)$ and $g'(x_o) \ne 0$. If $f(x_o) = g(x_o) = 0$, prove that
\[ \lim_{x \to x_o} \frac{f(x)}{g(x)} = \frac{f'(x_o)}{g'(x_o)}. \]
(Hint: apply the definition of the derivative.)

3. Let $h(x)$ be defined on $(-\infty, b)$. Show that there exists a $c > 0$ such that $\varphi(t) = h(-1/t)$ is defined on $(0, c)$, and that $\lim_{x \to -\infty} h(x) = \lim_{t \to 0^+} \varphi(t)$.
4. *Let $f, g$ be real-valued functions defined on $(a, b)$. If $\lim_{x \to a^+} f(x)$ exists in $\mathbb{R}$ and $\lim_{x \to a^+} g(x) = \infty$, prove that
\[ \lim_{x \to a^+} \frac{f(x)}{g(x)} = 0. \]
5. Suppose $f, g$ are real-valued functions on $(a, b)$ satisfying $\lim_{x \to a^+} f(x) = A \ne 0$, $\lim_{x \to a^+} g(x) = 0$, and $g(x) > 0$ for all $x \in (a, b)$. If $A > 0$, prove that
\[ \lim_{x \to a^+} \frac{f(x)}{g(x)} = \infty. \]
6. Use l’Hospital’s rule and any of the differentiation formulas from calculus to find each of the following limits. In the following, $\ln x$, $x > 0$, denotes the natural logarithm function.
*a. $\lim_{x \to 1} \dfrac{x^5 + 2x - 3}{2x^3 - x^2 - 1}$.
b. $\lim_{x \to -1} \dfrac{x^5 + 2x - 3}{2x^3 + x^2 + 1}$.
*c. $\lim_{x \to \infty} \dfrac{\ln x}{x}$.
d. $\lim_{x \to 0} \dfrac{1 - \cos 2x}{\sin x}$.
*e. $\lim_{x \to 0^+} x^a \ln x$, where $a > 0$.
f. $\lim_{x \to 0} \dfrac{\ln(1 + x)}{\sin x}$.
*g. $\lim_{x \to \infty} \dfrac{(\ln x)^a}{e^x}$, where $a > 0$.
h. $\lim_{x \to \infty} \dfrac{(\ln x)^p}{x^q}$, $p, q \in \mathbb{R}$.
*i. $\lim_{x \to 0^+} \left(\dfrac{1}{x} - \dfrac{1}{\sin x}\right)$.
j. $\lim_{x \to \infty} x^{1/x}$

7. Let $f(x) = x^2 \sin(1/x)$ and $g(x) = \sin x$. Show that $\lim_{x \to 0} \dfrac{f(x)}{g(x)}$ exists but that $\lim_{x \to 0} \dfrac{f'(x)}{g'(x)}$ does not exist.

8. Investigate $\lim_{x \to \infty} \dfrac{p(x)}{q(x)}$, where $p$ and $q$ are polynomials of degree $n$ and $m$, respectively.
9. Let f(x) = (sin x)/x for x ≠ 0, and f(0) = 1.
*a. Show that f′(0) exists, and determine its value.
*b. Show that f′′(0) exists, and determine its value.
10. Let f be defined on R by
\[ f(x) = \begin{cases} e^{-1/x^2}, & x \ne 0, \\ 0, & x = 0. \end{cases} \]
Prove that $f^{(n)}(0) = 0$ for all n = 1, 2, ....
5.4 Newton’s Method²


In this section, we consider the iterative method, commonly known as New-
ton’s method, for finding approximations to the solutions of the equation
f (x) = 0. Although the method is named after Newton, it is actually due
to Joseph Raphson (1648–1715) and in many texts the method is referred to
as the Newton-Raphson method. Newton did derive an iterative method for
finding the roots of a cubic equation; his method however is not the one used
in the procedure named after him. That was developed by Raphson.
Suppose f is a continuous function on [a, b] satisfying f (a)f (b) < 0. Then
f has opposite sign at the endpoints a and b and thus by the intermediate
value theorem (Theorem 4.2.11) there exists at least one value c ∈ (a, b) for
which f(c) = 0. If in addition f is differentiable on (a, b) with f′(x) ≠ 0 for
all x ∈ (a, b), then f is either strictly increasing or decreasing on [a, b], and in
this case the value c is unique; that is, there is exactly one point where the
graph of f crosses the x-axis.
An elementary approach to finding a numerical approximation to the value
c is the method of bisection. For this method, differentiability of f is not
required. To illustrate the method, suppose f satisfies f(a) < 0 < f(b). Let
\[ c_1 = \tfrac{1}{2}(a + b). \]
If $f(c_1) = 0$, we are done. If $f(c_1) \ne 0$, then c belongs to one of the two
intervals $(a, c_1)$ or $(c_1, b)$, and thus $|c_1 - c| < \tfrac{1}{2}(b - a)$. Suppose $f(c_1) > 0$.
Then $c \in (a, c_1)$, and in this case we set $c_0 = a$ and
\[ c_2 = \tfrac{1}{2}(c_0 + c_1). \]
If $f(c_2) = 0$, we are done. If not, then suppose $f(c_2) < 0$. Then $c \in (c_2, c_1)$,
and as above we set
\[ c_3 = \tfrac{1}{2}(c_1 + c_2). \]
In general, suppose $c_1, c_2, ..., c_n$, n ≥ 2, have been determined. If by happenstance
$f(c_n) = 0$, then we have obtained the exact value. If $f(c_{n-1})f(c_n) < 0$,
then c lies between $c_{n-1}$ and $c_n$, and we define
\[ c_{n+1} = \tfrac{1}{2}(c_n + c_{n-1}). \]
On the other hand, if $f(c_{n-1})f(c_n) > 0$, then c lies between $c_n$ and $c_{n-2}$, and
in this case, we define
\[ c_{n+1} = \tfrac{1}{2}(c_n + c_{n-2}). \]
² The topics of this section are not required in subsequent chapters.
FIGURE 5.7
Newton’s method

This gives us a sequence $\{c_n\}$ that satisfies
\[ |c_n - c| \le \frac{1}{2^n}(b - a), \]
and thus $\lim_{n \to \infty} c_n = c$.
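The method of bisection can be sketched in a few lines of Python. The version below is an illustration under our own assumptions: it keeps track of the current bracketing interval instead of the indices used in the description above, and the name `bisection` and the test function f(x) = x² − 2 on [1, 2] are hypothetical choices.

```python
import math

def bisection(f, a, b, steps):
    """Return the midpoints c_1, ..., c_steps; assumes f continuous, f(a) < 0 < f(b)."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    midpoints = []
    for _ in range(steps):
        c = 0.5 * (a + b)          # midpoint of the current bracket
        midpoints.append(c)
        fc = f(c)
        if fc == 0:                # exact zero found
            break
        if fc > 0:                 # the zero lies in (a, c)
            b = c
        else:                      # the zero lies in (c, b)
            a = c
    return midpoints

f = lambda x: x * x - 2.0
a, b = 1.0, 2.0
mids = bisection(f, a, b, 30)
root = math.sqrt(2.0)

for n, c_n in enumerate(mids, start=1):
    print(n, c_n, abs(c_n - root) <= (b - a) / 2**n + 1e-15)
```

Each midpoint lies within (b − a)/2ⁿ of the zero, so roughly one binary digit of accuracy is gained per step.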
Although this method provides a sequence of numbers that converges to
the zero of f , it has the disadvantage that the convergence is rather slow.
An alternate method, due to Raphson, uses tangent lines to the curve to find
successive points cn approximating the zero of f . As we will see, this method
will often converge much more rapidly to the solution.
As above, assume that f is differentiable on [a, b] with f(a)f(b) < 0 and
f′(x) ≠ 0 for all x ∈ [a, b]. Let $c_1$ be an initial guess to the value c. The line
tangent to the graph of f at $(c_1, f(c_1))$ has equation given by
\[ y = f(c_1) + f'(c_1)(x - c_1). \]
Since $f'(c_1) \ne 0$, the line crosses the x-axis at a point that we denote by $c_2$
(Figure 5.7). Thus
\[ 0 = f(c_1) + f'(c_1)(c_2 - c_1), \]
which upon solving for $c_2$ gives
\[ c_2 = c_1 - \frac{f(c_1)}{f'(c_1)}. \]
We now replace the point $c_1$ by the second estimate $c_2$ to obtain $c_3$, and so
forth. Inductively, we obtain a sequence $\{c_n\}$ given by the formula
\[ c_{n+1} = c_n - \frac{f(c_n)}{f'(c_n)}, \quad n = 1, 2, ..., \tag{8} \]
where $c_1$ is an initial guess to the solution f(c) = 0. As we will see, under
suitable hypotheses, the sequence $\{c_n\}$ will converge very rapidly to a solution
of the equation f(x) = 0. Before we prove the main result, we illustrate the
above with an example.

EXAMPLE 5.4.1 Let α > 0 and consider the function
\[ f(x) = x^2 - \alpha. \]
If α > 1, then f has exactly one zero on [0, α], namely √α. If 0 < α < 1, then
the zero of f lies in [0, 1]. Let $c_1$ be an initial guess to √α. Then by formula
(8), for n ≥ 1,
\[ c_{n+1} = c_n - \frac{c_n^2 - \alpha}{2c_n} = \frac{1}{2}\left( c_n + \frac{\alpha}{c_n} \right). \]
This is exactly the sequence of Exercise 6 of Section 3.3, where the reader
was asked to prove that the sequence converges to √α. With α = 2, taking
$c_1 = 1.4$ as an initial guess yields

c2 = 1.4142857,
c3 = 1.4142135,

which is already correct to at least seven decimal places. □
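The iteration of formula (8) is short enough to try directly. The following Python sketch is an illustration only; the helper name `newton` and the number of steps are our own choices. It applies the iteration to f(x) = x² − 2 with c1 = 1.4.

```python
def newton(f, fprime, c1, steps):
    """Return [c_1, c_2, ..., c_{steps+1}] computed from formula (8)."""
    iterates = [c1]
    for _ in range(steps):
        c = iterates[-1]
        iterates.append(c - f(c) / fprime(c))   # c_{n+1} = c_n - f(c_n)/f'(c_n)
    return iterates

alpha = 2.0
cs = newton(lambda x: x * x - alpha, lambda x: 2.0 * x, 1.4, 4)

for n, c in enumerate(cs, start=1):
    print(f"c{n} = {c:.10f}")
```

The second and third entries agree with the values of c2 and c3 quoted in the example, up to rounding in the last displayed digit.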

THEOREM 5.4.2 (Newton’s Method) Let f be a real-valued function on
[a, b] that is twice differentiable on [a, b]. Suppose that f(a)f(b) < 0 and that
there exist constants m and M such that |f′(x)| ≥ m > 0 and |f′′(x)| ≤ M
for all x ∈ [a, b]. Then there exists a subinterval I of [a, b] containing a zero c
of f such that for any $c_1 \in I$, the sequence $\{c_n\}$ defined by
\[ c_{n+1} = c_n - \frac{f(c_n)}{f'(c_n)}, \quad n \in \mathbb{N}, \]
is in I, and $\lim_{n \to \infty} c_n = c$. Furthermore,
\[ |c_{n+1} - c| \le \frac{M}{2m} |c_n - c|^2. \tag{9} \]
Prior to proving Theorem 5.4.2 we first state and prove the following
lemma. The result is in fact a special case of Taylor’s theorem (8.7.16) that
will be discussed in detail in Chapter 8.

LEMMA 5.4.3 Suppose f : [a, b] → R is such that f and f′ are continuous
on [a, b] and f′′(x) exists for all x ∈ (a, b). Let $x_0 \in [a, b]$. Then for any
x ∈ [a, b], there exists a real number ζ between $x_0$ and x such that
\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + \tfrac{1}{2} f''(\zeta)(x - x_0)^2. \]

Proof. For x ∈ [a, b], let α ∈ R be determined by
\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + \alpha (x - x_0)^2. \]
Define g on [a, b] by
\[ g(t) = f(t) - f(x_0) - f'(x_0)(t - x_0) - \alpha (t - x_0)^2. \]
If $x = x_0$ then the conclusion is true with $\zeta = x_0$. Assume that $x > x_0$. Then
g is continuous and differentiable on $[x_0, x]$ with $g(x_0) = g(x) = 0$. Thus by
Rolle’s theorem there exists $c \in (x_0, x)$ such that g′(c) = 0. But
\[ g'(t) = f'(t) - f'(x_0) - 2\alpha (t - x_0). \]
By hypothesis g′ is continuous on $[x_0, c]$, differentiable on $(x_0, c)$, and satisfies
$g'(x_0) = g'(c) = 0$. Thus by Rolle’s theorem again, there exists $\zeta \in (x_0, c)$
such that g′′(ζ) = 0. But
\[ g''(t) = f''(t) - 2\alpha. \]
Therefore $\alpha = \tfrac{1}{2} f''(\zeta)$. The case $x < x_0$ is handled in the same way. □
Proof of Theorem 5.4.2. Since f(a)f(b) < 0 and f′(x) ≠ 0 for all x ∈ [a, b],
f has exactly one zero c in the interval (a, b).
Let $x_0 \in [a, b]$ be arbitrary. By Lemma 5.4.3 there exists a point ζ between
c and $x_0$ such that
\[ 0 = f(c) = f(x_0) + f'(x_0)(c - x_0) + \tfrac{1}{2} f''(\zeta)(c - x_0)^2, \]
or
\[ -f(x_0) = f'(x_0)(c - x_0) + \tfrac{1}{2} f''(\zeta)(c - x_0)^2. \tag{10} \]
If $x_1$ is defined by
\[ x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}, \]
then by equation (10)
\[ x_1 = x_0 + (c - x_0) + \frac{1}{2} \frac{f''(\zeta)}{f'(x_0)} (c - x_0)^2. \]
Therefore
\[ |x_1 - c| = \frac{1}{2} \frac{|f''(\zeta)|}{|f'(x_0)|} |c - x_0|^2 \le \frac{M}{2m} |c - x_0|^2. \tag{11} \]
Choose δ > 0 so that δ < 2m/M and I = [c − δ, c + δ] ⊂ [a, b]. If $c_n \in I$, then
$|c - c_n| \le \delta$. If $c_{n+1}$ is defined by (8), then by (11)
\[ |c_{n+1} - c| \le \frac{M}{2m} \delta^2 < \delta. \]
Therefore $c_{n+1} \in I$. Thus if the initial choice $c_1 \in I$, then $c_n \in I$ for all n = 2, 3, ....
It remains to be shown that $\lim_{n \to \infty} c_n = c$. If $c_1 \in I$, then by induction
\[ |c_{n+1} - c| \le \left( \frac{M}{2m} \delta \right)^n |c_1 - c|. \]
But by our choice of δ, $\frac{M}{2m} \delta < 1$, and as a consequence $c_n \to c$. □

FIGURE 5.8
An illustration of Remark (c)

Remarks. (a) For a given function f satisfying the hypothesis of the theorem,
the constants M and m, and thus δ can be determined. To determine the
interval I, one can use the method of bisection to find an approximation xn
to c satisfying |xn − xn−1 | < δ. If c1 is taken to be xn , then |c1 − c| < δ.
In practice however, one usually makes a judicious guess for c1 and proceeds
with the computations.
(b) Let $e_n = c - c_n$ be the error in approximating c, and let K = M/2m.
Then inequality (9) can be expressed as
\[ |e_{n+1}| \le K |e_n|^2. \]
Consequently, if $|e_n| < 10^{-k}$ for some positive integer k, then $|e_{n+1}| < K \cdot 10^{-2k}$.
Thus, except for the constant factor K, the number of accurate decimal places
doubles at each step. For this reason, Newton’s method is usually referred to
as a second order or quadratic method.
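The error estimate of Remark (b) can be observed numerically. The sketch below is an illustration under our own assumptions: it uses f(x) = x² − 2 on [1, 2], where |f′(x)| ≥ 2 = m and |f′′(x)| = 2 = M, so that K = M/2m = 1/2, and it uses Python's `decimal` module with 100 significant digits so that the very small errors are not lost to rounding.

```python
from decimal import Decimal, getcontext

getcontext().prec = 100            # working precision (our choice)

alpha = Decimal(2)
root = alpha.sqrt()                # high-precision value of the zero c
K = Decimal(1) / Decimal(2)        # K = M/(2m) with M = 2 and m = 2 on [1, 2]

c = Decimal("1.4")                 # initial guess c_1
errors = [abs(root - c)]           # |e_1|
for _ in range(5):
    c = c - (c * c - alpha) / (2 * c)   # formula (8) for f(x) = x^2 - 2
    errors.append(abs(root - c))

for n in range(5):
    bound = K * errors[n] ** 2
    print(f"|e_{n + 2}| = {errors[n + 1]:.3E}   K|e_{n + 1}|^2 = {bound:.3E}")
```

In each row the error is below the bound K|eₙ|², and the exponent roughly doubles from one row to the next.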
FIGURE 5.9
An example of oscillating c1 and c2

(c) Even though Newton’s method is very efficient, there are a number of
things that can go wrong if $c_1$ is poorly chosen. For example, in Figure 5.8,
the initial choice of $c_1$ gives a $c_2$ outside the interval, and the subsequent $c_n$
tend to −∞. Such a function is given by $f(x) = x/(x^2 + 1)$. In Figure 5.9, the
initial choice of $c_1$ causes the subsequent values to oscillate between $c_1$ and
$c_2$. A function having this property is given by $g(x) = x - \tfrac{1}{5} x^3$. Taking $c_1 = 1$
gives $c_2 = -1$, $c_3 = 1$, etc. For this reason, the initial choice of $c_1$ for many
functions has to be sufficiently close to c in order to be sure that the method
works.
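Both failures described in Remark (c) are easy to reproduce. The following Python sketch uses exact rational arithmetic; the helper name `newton_steps` and the starting value c1 = 9/10 for f are our own assumptions (the text does not specify the starting point used in Figure 5.8).

```python
from fractions import Fraction

def newton_steps(f, fprime, c1, steps):
    """Return [c_1, ..., c_{steps+1}] from formula (8), in exact arithmetic."""
    cs = [Fraction(c1)]
    for _ in range(steps):
        c = cs[-1]
        cs.append(c - f(c) / fprime(c))
    return cs

# g(x) = x - x^3/5: starting at c_1 = 1 the iterates alternate between 1 and -1.
g = lambda x: x - x**3 / 5
g_prime = lambda x: 1 - 3 * x**2 / 5
oscillating = newton_steps(g, g_prime, 1, 4)

# f(x) = x/(x^2 + 1): the starting value 9/10 is a hypothetical choice.
f = lambda x: x / (x**2 + 1)
f_prime = lambda x: (1 - x**2) / (x**2 + 1) ** 2
diverging = newton_steps(f, f_prime, Fraction(9, 10), 6)

print([int(c) for c in oscillating])
print([round(float(c), 2) for c in diverging])
```

For f the iterates leave a neighborhood of the zero at 0 after one step and then decrease without bound, while for g they never settle down at all.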

Exercises 5.4
1. *For α > 0, apply Newton’s method to $f(x) = x^3 - \alpha$ to obtain a sequence
$\{c_n\}$ that converges to the cube root of α.
2. Use Newton’s method to find approximations to the roots, accurate to
six decimal places, of the given functions on the interval [0, 1].
*a. $f(x) = x^3 - 3x + 1$. b. $f(x) = 3x^3 - 5x^2 + 1$.
c. $f(x) = 8x^3 - 8x^2 + 1$.
3. Use Newton’s method to approximate the real zeros of $f(x) = x^4 - 4x - 3$
accurate to four decimal places.
4. Show that f(x) = ln x − x + 3, x ∈ (0, ∞) has two real zeros. Use Newton’s
method to approximate them accurate to four decimal places.
5. Let f : [a, b] → R be differentiable on [a, b] with f(a) < 0 < f(b). Suppose
there exist constants m and M such that 0 < m ≤ f′(x) ≤ M for all
x ∈ [a, b]. Let $c_1 \in [a, b]$ be arbitrary, and define
\[ c_{n+1} = c_n - \frac{f(c_n)}{M}. \]
Prove that the sequence $\{c_n\}$ converges to the unique zero of f on [a, b].
(Hint: Consider the function g(x) = x − f(x)/M.)

Notes
Without question the most significant result of this chapter is the mean value theo-
rem. The simplicity of its proof disguises the importance and usefulness of the result.
The theorem allows us to obtain information about the function from its derivative.
This has many applications as was illustrated by the subsequent theorems and ex-
ercises. Additional applications will be encountered throughout the text.
Although the mean value theorem is attributed to Lagrange, his proof, that ap-
peared around 1772, was based on the false assumption that every function could be
expanded in a power series. Cauchy, in his 1823 text Résumé des Leçons données à
l’École Royale Polytechnique sur le Calcul Infinitésimal, used the modern definition
of the derivative to provide a proof of the mean value theorem. His statement and
proof of the theorem however differs from the version in the text in that he assumed
continuity of the derivative. What Cauchy actually proved was that if f′ is continuous
on [a, b], then the quantity {f(b) − f(a)}/{b − a} lies between the minimum
and maximum values of f′ on [a, b]. (See Miscellaneous Exercise 4.) Then by the
intermediate value theorem (Theorem 4.2.11) applied to the continuous function f ′ ,
there exists c ∈ (a, b) such that

f (b) − f (a) = f ′ (c)(b − a).

It is worth noting that our proof of the mean value theorem depends ultimately on
the completeness property of R (through Rolle’s theorem and Corollary 4.2.9).
The mean value theorem can justifiably be called the fundamental theorem of
differential calculus. It allowed the development of rigorous proofs of many results
that were previously taken as fact or “proved” from geometric constructions. Al-
though the modern proof of l’Hospital’s rule uses the mean value theorem, it should
be remembered that the original version for calculating the limit of a quotient where
both the numerator and denominator become zero first appeared in 1696, 70 years
before Lagrange’s proof of the mean value theorem. The original version was stated
and “proved” in a purely geometric manner without reference to limits. For further
details, including a history of calculus, the reader is referred to the text by Katz
listed in the Bibliography.
The Bernoulli brothers, Jakob (1654–1705) and Johann (1667–1748) were among
the first mathematicians in Europe to use the new techniques of Newton and Leib-
niz in the study of curves and related physical problems. Among these were finding
the equations of the catenary and isochrone.³ Both brothers also contributed to the
study of differential equations by solving the Bernoulli equation $y' + P(x)y = Q(x)y^n$.
Through their numerous publications and correspondence with other mathemati-
cians the Bernoulli brothers helped to establish the utility of the new calculus. The
first text on differential calculus by l’Hospital also contributed significantly to pop-
ularizing the subject.
Leonhard Euler (1707–1783), one of the most prolific mathematicians in his-
tory, contributed significantly to establishing calculus as an independent science.
Even though the calculus of exponential and logarithmic functions was basically
developed by Johann Bernoulli, it was Euler’s expositions on these topics in the
eighteenth century that brought them into the mainstream of mathematics. Much
of what we know today about the exponential, logarithmic and trigonometric func-
tions is due to Euler. He was also among the first mathematicians to define the
concept of a function. However to Euler, as for the other mathematicians of that
period, a function was one that had a power series expansion. It is important to note
that most mathematicians of the eighteenth century, including Euler, were primarily
concerned with computations needed for the applications of calculus; proofs did not
gain prominence until the nineteenth century. For this reason, numerous results of
that era that were assumed to be true were subsequently proved to be true only
under more restrictive conditions.

³ The catenary problem involves finding the equation of a freely hanging cable, whereas
the isochrone problem involves finding the equation of a curve along which an object would
fall with uniform vertical velocity.

Miscellaneous Exercises
1. Let f : (a, b) → R and suppose f′′ exists at $x_o \in (a, b)$. Prove that
\[ f''(x_o) = \lim_{h \to 0} \frac{f(x_o + h) + f(x_o - h) - 2f(x_o)}{h^2}. \]
Give an example where this limit exists at $x_o$ but f′′ does not exist at
$x_o$.
2. Let f be a real-valued differentiable function on R.
a. If there exists a constant b < 1 such that |f′(x)| < b for all x ∈ R,
prove that f has a fixed point in R. (See Exercise 13 of Section 4.3).
b. Show that the function $f(x) = x + (1 + e^x)^{-1}$ satisfies |f′(x)| < 1 for
all x ∈ R but that f has no fixed point in R.
3. A function f is convex (or concave up) on the interval (a, b) if for any
x, y ∈ (a, b), and 0 < t < 1, f(tx + (1 − t)y) ≤ t f(x) + (1 − t) f(y).
a. If f is convex on (a, b), prove that f is continuous on (a, b).
b. If f is convex on (a, b), prove that $f'_+(p)$ and $f'_-(p)$ exist for every
p ∈ (a, b). Show by example that a convex function on (a, b) need not be
differentiable on (a, b).
c. Suppose f′′(x) exists for all x ∈ (a, b). Prove that f is convex on (a, b)
if and only if f′′(x) ≥ 0 for all x ∈ (a, b).
4. Suppose f is differentiable on [a, b] and that f′ is continuous on [a, b].
Without using the mean value theorem, prove that
\[ \min\{f'(x) : x \in [a, b]\} \le \frac{f(b) - f(a)}{b - a} \le \max\{f'(x) : x \in [a, b]\}. \]
5. (T.M. Flett) If f is differentiable on [a, b] and f′(a) = f′(b), prove that
there exists ζ ∈ (a, b) such that
\[ \frac{f(\zeta) - f(a)}{\zeta - a} = f'(\zeta). \]
6. Suppose f : (a, b) → R is differentiable at c ∈ (a, b). If $\{s_n\}$ and $\{t_n\}$ are
sequences in (a, b) with $s_n < c < t_n$ and $\lim_{n \to \infty} (t_n - s_n) = 0$, prove that
\[ \lim_{n \to \infty} \frac{f(t_n) - f(s_n)}{t_n - s_n} = f'(c). \]
Supplemental Reading

Baxley, J. V. and Hayashi, E. K., “Indeterminate forms of exponential type,” Amer. Math. Monthly 85 (1978), 484–486.
Cajori, F., “Historical note on the Newton-Raphson method of approximation,” Amer. Math. Monthly 18 (1911), 29–33.
Corless, R. M., “Variations on a theme of Newton,” Math. Mag. 71 (1998), 34–41.
Flett, T.M., “A mean value theorem,” Math. Gaz. 42 (1958), 38–39.
Hall, W. S. and Newell, M. L., “The mean value theorem for vector valued functions,” Math. Mag. 52 (1979), 157–158.
Hartig, Donald, “L’Hopitals rule via integration,” Amer. Math. Monthly 98 (1991), 156–157.
Katznelson, Y. and Stromberg, K., “Everywhere differentiable, nowhere monotone function,” Amer. Math. Monthly 81 (1974), 349–354.
Langlois, W. E. and Holder, L. I., “The relation of $f'_+(a)$ to $f'(a+)$,” Math. Mag. 39 (1966), 112–120.
Lynch, M., “A continuous function that is differentiable only at the rationals,” Math. Mag. 86 (2013), 132–135.
Miller, A. D. and Vyborny, R., “Some remarks on functions with one-sided derivatives,” Amer. Math. Monthly 93 (1986), 471–475.
Pan, D., “A maximum principle for high-order derivatives,” Amer. Math. Monthly 120 (2013), 846–848.
Range, R. M., “Where are limits needed in Calculus,” Amer. Math. Monthly 118 (2011), 404–417.
Rosenholtz, “There is no differentiable metric for $R^n$,” Amer. Math. Monthly 86 (1979), 585–586.
Rotando, L. M. and Korn, H., “The indeterminate form $0^0$,” Math. Mag. 50 (1977), 41–42.
Sahoo, M. R., “Example of a monotonic everywhere differentiable function on R whose derivative is not continuous,” Amer. Math. Monthly 120 (2013), 566–568.
Tandra, H., “A yet simpler proof of the chain rule,” Amer. Math. Monthly 120 (2013), 900.
Thurston, H. A., “On the definition of the tangent line,” Amer. Math. Monthly 71 (1964), 1099–1103.
Tong, J. and Braza, P. A., “A converse to the mean value theorem,” Amer. Math. Monthly 104 (1997), 939–942.
6
Integration

When Newton and Leibniz developed the calculus, both considered integration
as the inverse operation of differentiation. For example, in the De analysi¹,
Newton proved that the area under the curve $y = ax^{m/n}$ ($m/n \ne -1$) is given
by
\[ \frac{an}{m+n} \, x^{\frac{m}{n}+1} \]
by using his differential calculus to prove that if A(x) represents the area from
0 to x then $A'(x) = ax^{m/n}$. Even though Leibniz arrived at the concept of the
integral by using sums to compute the area, integration itself was always the
inverse operation of differentiation. Throughout the eighteenth century, the
definite integral of a function f(x) on [a, b], denoted $\int_a^b f(x)\,dx$, was defined
as F(b) − F(a) where F was any function whose derivative was f(x). This
remained as the definition of the definite integral until the 1820’s.
The modern approach to integration is again due to Cauchy, who was the
first mathematician to construct a theory of integration based on approximating
the area under the curve. Euler had previously used sums of the form
$\sum_{k=1}^{n} f(x_{k-1})(x_k - x_{k-1})$ to approximate the integral of a function f(x) in
situations where the function F(x) could not be computed. Cauchy however used
limits of such sums to develop a theory of integration that was independent of
the differential calculus. One of the difficulties with Cauchy’s definition of the
integral was that it was very restrictive; only functions that were continuous or
continuous except at a finite number of points were proved to be integrable.
However, one of the key achievements of Cauchy was that using his defini-
tion he was able to prove the fundamental theorem of calculus; specifically,
if f is continuous on [a, b], then there exists a function F on [a, b] such that
F ′ (x) = f (x) for all x ∈ [a, b].
The modern definition of integration was developed in 1853 by Georg
Bernhard Riemann (1826–1866). Riemann was led to the development of the
integral by trying to characterize which functions were integrable accord-
ing to Cauchy’s definition. In the process, he modified Cauchy’s definition
and developed the theory of integration which now bears his name. One of
his achievements was that he was able to provide necessary and sufficient
conditions for a real-valued bounded function to be integrable. In Section 6.1,
we develop the theory of the Riemann integral using the approach of Jean
Gaston Darboux (1842–1917). In this section, we also include the statement
of Lebesgue’s theorem which provides necessary and sufficient conditions that
a bounded real-valued function defined on a closed and bounded interval be
Riemann integrable. The equivalence of the Riemann and Darboux approaches
will be proved in Section 6.2.

¹ The Mathematical Works of Isaac Newton, edited by D. T. Whiteside, Johnson Reprint
Corporation, New York, 1964.

In Section 6.5 we will consider the more general Riemann-Stieltjes integral
which will give meaning to the following types of integrals:
\[ \int_0^1 f(x)\,dx^2, \quad \int_a^b f(x)\,d[x], \quad \text{or} \quad \int_a^b f(x)\,d\alpha(x), \]
where α is a monotone increasing function on [a, b]. These types of integrals
were developed by Thomas-Jean Stieltjes (1856–1894) and arise in many appli-
cations in both mathematics and physics. The theory itself involves only minor
modifications in the definition of the Riemann integral; the consequences how-
ever are far reaching. The Riemann-Stieltjes integral permits the expression
of many seemingly diverse results as a single formula.

6.1 The Riemann Integral


There are traditionally two approaches to the theory of the Riemann inte-
gral; namely the original method of Riemann, and the method introduced by
Darboux in 1875 using lower and upper sums. I have chosen the latter ap-
proach to define the Riemann integral because of its easy adaptability to the
Riemann-Stieltjes integral. We will however consider both methods and show
that they are in fact equivalent.

Upper and Lower Sums


Let [a, b], a < b, be a given closed and bounded interval in R. By a partition
P of [a, b] we mean a finite set of points P = {x0 , x1 , ..., xn } such that
a = x0 < x1 < · · · < xn = b.
There is no requirement that the points xi be equally spaced. For each i =
1, 2, ..., n, set
∆xi = xi − xi−1 ,
which is equal to the length of the interval [xi−1 , xi ].
Suppose f is a bounded real-valued function on [a, b]. Given a partition
P = {x0 , x1 , ..., xn } of [a, b], for each i = 1, 2, ..., n, let
mi = inf{f (t) : xi−1 ≤ t ≤ xi },
Mi = sup{f (t) : xi−1 ≤ t ≤ xi }.
Since f is bounded, by the least upper bound property the quantities $m_i$
and $M_i$ exist in R. If f is a continuous function on [a, b], then by Corollary
4.2.9, for each i there exist points $t_i, s_i \in [x_{i-1}, x_i]$ such that $M_i = f(t_i)$ and
$m_i = f(s_i)$.
The upper sum U(P, f) for the partition P and function f is defined by
\[ U(P, f) = \sum_{i=1}^{n} M_i \Delta x_i. \]
Similarly, the lower sum L(P, f) is defined by
\[ L(P, f) = \sum_{i=1}^{n} m_i \Delta x_i. \]

Since mi ≤ Mi for all i = 1, ..., n, we always have

L(P, f ) ≤ U(P, f )

for any partition P of [a, b]. The upper sum for a nonnegative continuous
function f is illustrated in Figure 6.1. In this case, U (P, f ) represents the
circumscribed rectangular approximation to the area under the graph of f .
Similarly the lower sum represents the inscribed rectangular approximation
to the area under the graph of f .

FIGURE 6.1
Upper sum U (P, f )
Upper and Lower Integrals


If the function f satisfies m ≤ f (t) ≤ M for all t ∈ [a, b], then

m (b − a) ≤ L(P, f ) ≤ U(P, f ) ≤ M (b − a), (1)

for any partition P of [a, b]. To see that (1) holds, let $P = \{x_0, ..., x_n\}$ be any
partition of [a, b]. Since $M_i \le M$ for all i = 1, ..., n,
\[ U(P, f) = \sum_{i=1}^{n} M_i \Delta x_i \le M \sum_{i=1}^{n} (x_i - x_{i-1}) = M(b - a). \]
Similarly L(P, f) ≥ m(b − a). Thus the set {U(P, f) : P is a partition of [a, b]}
is bounded above and below, as is the set {L(P, f)}.

DEFINITION 6.1.1 Let f be a bounded real-valued function on the closed
and bounded interval [a, b]. The upper and lower integrals of f, denoted
$\overline{\int_a^b} f$ and $\underline{\int_a^b} f$ respectively, are defined by
\[ \overline{\int_a^b} f = \inf\{U(P, f) : P \text{ is a partition of } [a, b]\}, \]
\[ \underline{\int_a^b} f = \sup\{L(P, f) : P \text{ is a partition of } [a, b]\}. \]

Since the sets {U(P, f)} and {L(P, f)} are nonempty and bounded, the
lower and upper integrals of a bounded function f : [a, b] → R always exist.
Our first goal is to prove that $\underline{\int_a^b} f \le \overline{\int_a^b} f$ for any bounded real-valued function
f on [a, b]. To this end we make the following definition.

DEFINITION 6.1.2 A partition P ⋆ of [a, b] is a refinement of P if


P ⊂ P ⋆.

A refinement of a given partition P is obtained by adding additional points


to P. If P1 and P2 are two partitions of [a, b], then P1 ∪ P2 is a refinement of
both P1 and P2 .

LEMMA 6.1.3 If $P^\star$ is a refinement of P, then
\[ L(P, f) \le L(P^\star, f) \le U(P^\star, f) \le U(P, f). \]

Proof. Suppose $P = \{x_0, x_1, ..., x_n\}$ and $P^\star = P \cup \{x^\star\}$, where $x^\star \ne x_j$ for
any j = 0, 1, ..., n. Then there exists an index k such that $x_{k-1} < x^\star < x_k$.
Let
\[ M_k^1 = \sup\{f(t) : t \in [x_{k-1}, x^\star]\}, \]
\[ M_k^2 = \sup\{f(t) : t \in [x^\star, x_k]\}. \]
Since $f(t) \le M_k$ for all $t \in [x_{k-1}, x_k]$, we have that $f(t) \le M_k$ for all
$t \in [x_{k-1}, x^\star]$ and also for all $t \in [x^\star, x_k]$. Thus both $M_k^1$ and $M_k^2$ are less than or
equal to $M_k$. Now
\[ U(P^\star, f) = \sum_{j=1}^{k-1} M_j \Delta x_j + M_k^1 (x^\star - x_{k-1}) + M_k^2 (x_k - x^\star) + \sum_{j=k+1}^{n} M_j \Delta x_j. \]
But
\[ M_k^1 (x^\star - x_{k-1}) + M_k^2 (x_k - x^\star) \le M_k \Delta x_k. \]
Therefore
\[ U(P^\star, f) \le U(P, f). \]
The proof for the lower sum is similar. If $P^\star$ contains k more points than P,
we need only repeat the above argument k times to obtain the result. □
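The chain of inequalities in Lemma 6.1.3 can be checked on a small example. The Python sketch below is an illustration under our own assumptions: it treats only an increasing function, for which the infimum and supremum on each subinterval are the values at the left and right endpoints, and the function names, the choice f(x) = x², and the partitions are hypothetical.

```python
from fractions import Fraction

def lower_sum(f, partition):
    """L(P, f) for an increasing f: the infimum on [x_{i-1}, x_i] is f(x_{i-1})."""
    return sum(f(left) * (right - left)
               for left, right in zip(partition, partition[1:]))

def upper_sum(f, partition):
    """U(P, f) for an increasing f: the supremum on [x_{i-1}, x_i] is f(x_i)."""
    return sum(f(right) * (right - left)
               for left, right in zip(partition, partition[1:]))

f = lambda x: x * x                                # increasing on [0, 1]
P = [Fraction(0), Fraction(1, 2), Fraction(1)]
P_star = sorted(set(P) | {Fraction(1, 4)})         # refinement: one added point

chain = [lower_sum(f, P), lower_sum(f, P_star),
         upper_sum(f, P_star), upper_sum(f, P)]
print(chain)
```

Adding the single point 1/4 raises the lower sum from 1/8 to 9/64 and lowers the upper sum from 5/8 to 37/64, as the lemma predicts.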

THEOREM 6.1.4 Let f be a bounded real-valued function on [a, b]. Then
\[ \underline{\int_a^b} f \le \overline{\int_a^b} f. \]

Proof. Given any two partitions P, Q of [a, b], the partition P ∪ Q is a refinement
of both P and Q, and thus by Lemma 6.1.3,
\[ L(P, f) \le L(P \cup Q, f) \le U(P \cup Q, f) \le U(Q, f). \]
Thus L(P, f) ≤ U(Q, f) for any partitions P, Q. Hence
\[ \underline{\int_a^b} f = \sup_P L(P, f) \le U(Q, f) \]
for any partition Q. Taking the infimum over Q gives the result. □

The Riemann Integral


If f : [a, b] → R is bounded, then the lower and upper integrals of f on [a, b]
always exist and satisfy $\underline{\int_a^b} f \le \overline{\int_a^b} f$. As we will shortly see, there is a large
family of functions for which equality holds; such functions are said to be
integrable.

DEFINITION 6.1.5 Let f be a bounded real-valued function on the closed
and bounded interval [a, b]. If
\[ \underline{\int_a^b} f = \overline{\int_a^b} f, \]
then f is said to be Riemann integrable or integrable on [a, b]. The common
value, denoted
\[ \int_a^b f, \]
is called the Riemann integral or integral of f over [a, b]. We denote by
R[a, b] the set of Riemann integrable functions on [a, b]. In addition, if
f ∈ R[a, b], we define
\[ \int_b^a f = -\int_a^b f. \]

Alternate notation that is sometimes used to denote the Riemann integral
of f is
\[ \int_a^b f(x)\,dx. \]
In the above, the variable x could just as easily have been t, or any other
convenient letter.
If f : [a, b] → R satisfies m ≤ f(t) ≤ M for all t ∈ [a, b], then by inequality
(1),
\[ m(b - a) \le \underline{\int_a^b} f \le \overline{\int_a^b} f \le M(b - a). \]
If in addition f ∈ R[a, b], then
\[ m(b - a) \le \int_a^b f \le M(b - a). \]
In particular, if f(x) ≥ 0 for all x ∈ [a, b], then $\int_a^b f \ge 0$. If f ∈ R[a, b] is
nonnegative, then the quantity $\int_a^b f$ represents the area of the region bounded
above by the graph y = f(x), below by the x-axis, and the lines x = a and
x = b.

EXAMPLES 6.1.6 (a) The following function, attributed to Dirichlet, is
the canonical example of a function that is not Riemann integrable on a closed
interval. Let f be defined by
\[ f(x) = \begin{cases} 1, & x \in \mathbb{Q}, \\ 0, & x \notin \mathbb{Q}. \end{cases} \]
Suppose a < b. If $P = \{x_0, ..., x_n\}$ is any partition of [a, b], then, since every
subinterval $[x_{i-1}, x_i]$ contains both rational and irrational numbers, $m_i = 0$ and
$M_i = 1$ for all i = 1, ..., n. Thus

L(P, f) = 0 and U(P, f) = (b − a)

for any partition P of [a, b]. Therefore
\[ \underline{\int_a^b} f = 0 \quad \text{and} \quad \overline{\int_a^b} f = (b - a), \]
and as a consequence f is not Riemann integrable on [a, b].

(b) Let f : [0, 1] → R be defined by
\[ f(x) = \begin{cases} 0, & 0 \le x < \tfrac{1}{2}, \\ 1, & \tfrac{1}{2} \le x \le 1. \end{cases} \]
In this example, we prove that f is integrable on [0, 1] with $\int_0^1 f = \tfrac{1}{2}$. Let
$P = \{x_0, x_1, ..., x_n\}$ be a partition of [0, 1], and let k ∈ {1, ..., n} be such
that $x_{k-1} < \tfrac{1}{2} \le x_k$ (see Figure 6.2). If $m_i$ and $M_i$ denote the infimum and
supremum of f on $[x_{i-1}, x_i]$ respectively, then
\[ m_i = \begin{cases} 0, & i = 1, ..., k, \\ 1, & i = k + 1, ..., n, \end{cases} \quad \text{and} \quad M_i = \begin{cases} 0, & i = 1, ..., k - 1, \\ 1, & i = k, ..., n. \end{cases} \]

FIGURE 6.2
The function of Example 6.1.6(b)

Thus
\[ L(P, f) = \sum_{i=k+1}^{n} \Delta x_i = (1 - x_k), \quad \text{and} \quad U(P, f) = \sum_{i=k}^{n} \Delta x_i = (1 - x_{k-1}). \]
Since $1 - x_k \le \tfrac{1}{2} < 1 - x_{k-1}$, $L(P, f) \le \tfrac{1}{2} \le U(P, f)$ for all partitions
P of [0, 1]. Thus $\underline{\int_0^1} f \le \tfrac{1}{2} \le \overline{\int_0^1} f$. Hence if f is integrable on [0, 1], then
$\int_0^1 f = 1/2$. We conclude by proving that f is indeed integrable on [0, 1]. Since
$1 - x_{k-1} = (1 - x_k) + (x_k - x_{k-1})$, we have
\[ U(P, f) = L(P, f) + (x_k - x_{k-1}). \]
Let ε > 0 be arbitrary. If P is any partition of [0, 1] with $\Delta x_i < \varepsilon$ for all i,
then U(P, f) < L(P, f) + ε. Thus
\[ \overline{\int_0^1} f = \inf_Q U(Q, f) \le U(P, f) \le L(P, f) + \varepsilon \le \sup_Q L(Q, f) + \varepsilon \le \underline{\int_0^1} f + \varepsilon, \]
where the infimum and supremum are taken over all partitions Q of [0, 1].
Since ε > 0 was arbitrary, together with Theorem 6.1.4 we have $\overline{\int_0^1} f = \underline{\int_0^1} f$.
Therefore f is integrable on [0, 1] with $\int_0^1 f = 1/2$.
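The computation in part (b) can be reproduced exactly for specific partitions. The following Python sketch is an illustration only; the function names and the use of uniform partitions are our own choices. It computes L(P, f) and U(P, f) for the step function of the example.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def inf_sup(left, right):
    """Infimum and supremum on [left, right] of the step function of part (b)."""
    m = 0 if left < HALF else 1      # f takes the value 0 on the interval iff left < 1/2
    M = 1 if right >= HALF else 0    # f takes the value 1 on the interval iff right >= 1/2
    return m, M

def sums(partition):
    """Return (L(P, f), U(P, f)) for a partition of [0, 1]."""
    lower = upper = Fraction(0)
    for left, right in zip(partition, partition[1:]):
        m, M = inf_sup(left, right)
        lower += m * (right - left)
        upper += M * (right - left)
    return lower, upper

def uniform(n):
    """Uniform partition {0, 1/n, ..., 1} (a hypothetical choice of partition)."""
    return [Fraction(i, n) for i in range(n + 1)]

L7, U7 = sums(uniform(7))
L10, U10 = sums(uniform(10))
print(L7, U7, L10, U10)
```

In both cases the difference U(P, f) − L(P, f) equals the length of the single subinterval that contains the jump at 1/2, in agreement with the identity derived above.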
(c) We now provide another example to illustrate how tedious even a trivial
integral can be if one relies only on the definition of the integral. Luckily, the
fundamental theorem of calculus (Theorem 6.3.2) will allow us to avoid such
tedious computations. Let f(x) = x, x ∈ [a, b], where for the purpose of
illustration we take a ≥ 0 (Figure 6.3). Interpreting the integral as the area
under the curve, we intuitively see that
\[ \int_a^b x\,dx = \tfrac{1}{2}(b - a)(b + a) = \tfrac{1}{2}(b^2 - a^2). \]
This is obtained from the formula for the area of a trapezoid. Let
$P = \{x_0, x_1, ..., x_n\}$ be any partition of [a, b]. Since f(x) = x is increasing on [a, b],
$m_i = f(x_{i-1}) = x_{i-1}$, and $M_i = f(x_i) = x_i$.

FIGURE 6.3
Example 6.1.6(c)

Therefore
\[ L(P, f) = \sum_{i=1}^{n} x_{i-1} \Delta x_i \quad \text{and} \quad U(P, f) = \sum_{i=1}^{n} x_i \Delta x_i. \]
For each index i,
\[ x_{i-1} \le \tfrac{1}{2}(x_{i-1} + x_i) \le x_i. \]
If we multiply this by $\Delta x_i = x_i - x_{i-1}$ and sum from i = 1 to n, we obtain
\[ L(P, f) \le \sum_{i=1}^{n} \tfrac{1}{2}\left( x_i^2 - x_{i-1}^2 \right) \le U(P, f). \]
But
\[ \sum_{i=1}^{n} \tfrac{1}{2}\left( x_i^2 - x_{i-1}^2 \right) = \tfrac{1}{2}\left( x_n^2 - x_0^2 \right) = \tfrac{1}{2}\left( b^2 - a^2 \right). \]
Finally, if we take the supremum and infimum over all partitions P of [a, b],
we have
\[ \underline{\int_a^b} x\,dx \le \tfrac{1}{2}\left( b^2 - a^2 \right) \le \overline{\int_a^b} x\,dx. \]
This in itself does not prove that
\[ \int_a^b x\,dx = \tfrac{1}{2}\left( b^2 - a^2 \right). \]
To prove that f is integrable on [a, b], we note that for any partition P of
[a, b],
\[ 0 \le \overline{\int_a^b} x\,dx - \underline{\int_a^b} x\,dx \le U(P, f) - L(P, f) = \sum_{i=1}^{n} (x_i - x_{i-1}) \Delta x_i \le \left( \max_{1 \le i \le n} \Delta x_i \right)(b - a). \]
Let ε > 0 be given. If we choose a partition P such that $\Delta x_i < \varepsilon/(b - a + 1)$ for
all i, then $\overline{\int_a^b} x\,dx - \underline{\int_a^b} x\,dx < \varepsilon$. Since this holds for every ε > 0, we conclude
that $\overline{\int_a^b} x\,dx = \underline{\int_a^b} x\,dx$. Thus f(x) = x is Riemann integrable on [a, b] with
$\int_a^b x\,dx = \tfrac{1}{2}(b^2 - a^2)$.
(d) Consider the function $f(x) = x^2$, x ∈ [0, 1]. For n ∈ N, let $P_n$ be
the partition $\{0, \tfrac{1}{n}, \tfrac{2}{n}, ..., 1\}$. Since f is increasing on [0, 1], its infimum and
supremum on each interval $[\tfrac{i-1}{n}, \tfrac{i}{n}]$ are attained at the left and right endpoint
respectively, with $m_i = (i - 1)^2/n^2$ and $M_i = i^2/n^2$. Since $\Delta x_i = \tfrac{1}{n}$ for all i,
\[ L(P_n, f) = \frac{1}{n^3}\left[ 1^2 + 2^2 + \cdots + (n - 1)^2 \right], \quad \text{and} \]
\[ U(P_n, f) = \frac{1}{n^3}\left[ 1^2 + 2^2 + \cdots + n^2 \right]. \]
Using the identity $1^2 + 2^2 + \cdots + m^2 = \tfrac{1}{6} m(m + 1)(2m + 1)$ (Exercise 1(c),
Section 1.3), we have
\[ L(P_n, f) = \frac{1}{6}\left( 1 - \frac{1}{n} \right)\left( 2 - \frac{1}{n} \right) \quad \text{and} \quad U(P_n, f) = \frac{1}{6}\left( 1 + \frac{1}{n} \right)\left( 2 + \frac{1}{n} \right). \]
Thus $\sup_n L(P_n, f) = 1/3$ and $\inf_n U(P_n, f) = 1/3$. Since the collection
$\{P_n : n \in \mathbb{N}\}$ is a subset of the set of all partitions of [0, 1],
\[ \frac{1}{3} = \sup_n L(P_n, f) \le \sup_P L(P, f) = \underline{\int_0^1} x^2\,dx, \quad \text{and} \]
\[ \frac{1}{3} = \inf_n U(P_n, f) \ge \inf_P U(P, f) = \overline{\int_0^1} x^2\,dx. \]
Since the lower integral never exceeds the upper integral (Theorem 6.1.4), both
are equal to 1/3. Therefore $f(x) = x^2$ is integrable on [0, 1] with $\int_0^1 x^2\,dx = 1/3$. □
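The closed forms for L(Pₙ, f) and U(Pₙ, f) in part (d) can be confirmed by summing directly. The sketch below is an illustration in exact rational arithmetic; the function names are our own.

```python
from fractions import Fraction

def lower_upper(n):
    """L(P_n, f) and U(P_n, f) for f(x) = x^2 on the uniform partition of [0, 1]."""
    dx = Fraction(1, n)
    lower = sum(Fraction(i - 1, n) ** 2 * dx for i in range(1, n + 1))
    upper = sum(Fraction(i, n) ** 2 * dx for i in range(1, n + 1))
    return lower, upper

def closed_forms(n):
    """The two closed-form expressions obtained from the sum-of-squares identity."""
    r = Fraction(1, n)
    return (Fraction(1, 6) * (1 - r) * (2 - r),
            Fraction(1, 6) * (1 + r) * (2 + r))

for n in (1, 2, 10, 100):
    lower, upper = lower_upper(n)
    print(n, float(lower), float(upper), (lower, upper) == closed_forms(n))
```

Both sums approach 1/3 as n grows, the lower sums from below and the upper sums from above, with U(Pₙ, f) − L(Pₙ, f) = 1/n.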

Riemann’s Criterion for Integrability


The following theorem, the original version of which is due to Riemann, pro-
vides necessary and sufficient conditions for the existence of the Riemann
integral.

THEOREM 6.1.7 A bounded real-valued function f is Riemann integrable


on [a, b] if and only if for every ǫ > 0, there exists a partition P of [a, b] such
that
U (P, f ) − L(P, f ) < ǫ. (2)
Furthermore, if P is a partition of [a, b] for which (2) holds, then the inequality
also holds for all refinements of P.

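Proof. Suppose that for every $\epsilon > 0$ there is a partition $\mathcal{P}$ for which (2) holds. Then for each such $\epsilon$,
$$0 \le \overline{\int_a^b} f - \underline{\int_a^b} f \le U(\mathcal{P}, f) - L(\mathcal{P}, f) < \epsilon.$$
Since $\epsilon > 0$ is arbitrary, the upper and lower integrals of $f$ are equal. Thus $f$ is integrable on $[a, b]$.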


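Conversely, suppose $f$ is integrable on $[a, b]$. Let $\epsilon > 0$ be given. Then there exist partitions $\mathcal{P}_1$ and $\mathcal{P}_2$ of $[a, b]$ such that
$$U(\mathcal{P}_2, f) - \int_a^b f < \frac{\epsilon}{2} \quad\text{and}\quad \int_a^b f - L(\mathcal{P}_1, f) < \frac{\epsilon}{2}.$$
Let $\mathcal{P} = \mathcal{P}_1 \cup \mathcal{P}_2$. Then
$$U(\mathcal{P}, f) \le U(\mathcal{P}_2, f) < \int_a^b f + \frac{\epsilon}{2} < L(\mathcal{P}_1, f) + \epsilon \le L(\mathcal{P}, f) + \epsilon.$$

Therefore, $U(\mathcal{P}, f) - L(\mathcal{P}, f) < \epsilon$, which proves (2). If $\mathcal{Q}$ is any refinement of $\mathcal{P}$, then by Lemma 6.1.3
$$0 \le U(\mathcal{Q}, f) - L(\mathcal{Q}, f) \le U(\mathcal{P}, f) - L(\mathcal{P}, f) < \epsilon.$$

Thus (2) is also valid for any refinement $\mathcal{Q}$ of $\mathcal{P}$. □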

Integrability of Continuous and Monotone Functions


As an application of the previous theorem we prove that every continuous real-
valued function and every monotone function on [a, b] is Riemann integrable
on [a, b]. As we will see, both of these results will also follow from Lebesgue’s
theorem (Theorem 6.1.13).

THEOREM 6.1.8 Let f be a real-valued function on [a, b].


(a) If f is continuous on [a, b], then f is Riemann integrable on [a, b].
(b) If f is monotone on [a, b], then f is Riemann integrable on [a, b].

Proof. (a) Let $\epsilon > 0$ be given. Choose $\eta > 0$ such that $(b - a)\eta < \epsilon$. Since $f$ is continuous on $[a, b]$, by Theorem 4.3.4 $f$ is uniformly continuous on $[a, b]$. Thus there exists a $\delta > 0$ such that
$$|f(x) - f(t)| < \eta \tag{3}$$
for all $x, t \in [a, b]$ with $|x - t| < \delta$. Choose a partition $\mathcal{P}$ of $[a, b]$ such that $\Delta x_i < \delta$ for all $i = 1, 2, \dots, n$. Then by (3),
$$M_i - m_i \le \eta$$
for all $i = 1, 2, \dots, n$. Therefore
$$U(\mathcal{P}, f) - L(\mathcal{P}, f) = \sum_{i=1}^n (M_i - m_i)\,\Delta x_i \le \eta \sum_{i=1}^n \Delta x_i = \eta\,(b - a) < \epsilon.$$

Thus by Theorem 6.1.7 $f$ is integrable on $[a, b]$.


(b) Suppose $f$ is monotone increasing on $[a, b]$. For $n \in \mathbb{N}$, set $h = (b - a)/n$. Also for $i = 0, 1, \dots, n$, set $x_i = a + ih$. Then $\mathcal{P} = \{x_0, x_1, \dots, x_n\}$ is a partition of $[a, b]$ which satisfies $\Delta x_i = h$ for all $i = 1, \dots, n$. Since $f$ is monotone increasing on $[a, b]$, $m_i = f(x_{i-1})$ and $M_i = f(x_i)$. Therefore,
$$U(\mathcal{P}, f) - L(\mathcal{P}, f) = \sum_{i=1}^n [f(x_i) - f(x_{i-1})]\,\Delta x_i = h \sum_{i=1}^n [f(x_i) - f(x_{i-1})] = \frac{(b - a)}{n}\,[f(b) - f(a)].$$

Given $\epsilon > 0$, choose $n \in \mathbb{N}$ such that
$$\frac{(b - a)}{n}\,[f(b) - f(a)] < \epsilon.$$
For this $n$ and corresponding partition $\mathcal{P}$, $U(\mathcal{P}, f) - L(\mathcal{P}, f) < \epsilon$. Thus $f$ is integrable on $[a, b]$. □
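The telescoping identity in part (b) also says how fine a uniform partition must be for a given tolerance. The sketch below is illustrative only: the helper names `uniform_gap` and `subintervals_needed` are hypothetical, and exact rational arithmetic is assumed so that the identity can be checked without rounding.

```python
from fractions import Fraction
import math

def uniform_gap(f, a, b, n):
    """U(P, f) - L(P, f) for an increasing f on the uniform partition
    of [a, b] with n subintervals (hypothetical helper)."""
    a, b = Fraction(a), Fraction(b)
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    return sum((f(xs[i]) - f(xs[i - 1])) * h for i in range(1, n + 1))

def subintervals_needed(f, a, b, eps):
    """Smallest n with (b - a)(f(b) - f(a)) / n < eps."""
    a, b, eps = Fraction(a), Fraction(b), Fraction(eps)
    bound = (b - a) * (f(b) - f(a)) / eps
    return math.floor(bound) + 1

def cube(x):
    return x ** 3

eps = Fraction(1, 100)
n = subintervals_needed(cube, 0, 2, eps)
gap = uniform_gap(cube, 0, 2, n)
# The sum telescopes to (b - a)(f(b) - f(a)) / n, as in the proof.
assert gap == Fraction(2 * 8, n)
assert gap < eps
print(n, gap)
```

The same `n` works for any increasing function with the same endpoint values, since only $f(b) - f(a)$ enters the bound.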

The Composition Theorem


We next prove that the composition ϕ ◦ f of a continuous function ϕ with a
Riemann integrable function f is again Riemann integrable. As an application
of Lebesgue’s theorem we will present a much shorter proof of this result later
in the section.

THEOREM 6.1.9 Let f be a bounded Riemann integrable function on [a, b]


with Range f ⊂ [c, d]. If ϕ is continuous on [c, d], then ϕ ◦ f is Riemann
integrable on [a, b].

Proof. Since $\varphi$ is continuous on the closed and bounded interval $[c, d]$, $\varphi$ is bounded and uniformly continuous on $[c, d]$. Let $K = \sup\{|\varphi(t)| : t \in [c, d]\}$, and let $\epsilon > 0$ be given. Set $\epsilon' = \epsilon/(b - a + 2K)$.
Since $\varphi$ is uniformly continuous on $[c, d]$, there exists $\delta$, $0 < \delta < \epsilon'$, such that
$$|\varphi(s) - \varphi(t)| < \epsilon' \tag{4}$$
for all $s, t \in [c, d]$ with $|s - t| < \delta$. Furthermore, since $f \in \mathcal{R}[a, b]$, by Theorem 6.1.7 there exists a partition $\mathcal{P} = \{x_0, x_1, \dots, x_n\}$ of $[a, b]$ such that
$$U(\mathcal{P}, f) - L(\mathcal{P}, f) < \delta^2.$$

To complete the proof we will show that
$$U(\mathcal{P}, \varphi \circ f) - L(\mathcal{P}, \varphi \circ f) \le \epsilon. \tag{5}$$

By Theorem 6.1.7 it then follows that $\varphi \circ f \in \mathcal{R}[a, b]$.


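For each $k = 1, 2, \dots, n$, let $m_k$ and $M_k$ denote the infimum and supremum of $f$ on $[x_{k-1}, x_k]$. Also, set
$$m_k^* = \inf\{\varphi(f(t)) : t \in [x_{k-1}, x_k]\} \quad\text{and}\quad M_k^* = \sup\{\varphi(f(t)) : t \in [x_{k-1}, x_k]\}.$$

We partition the set $\{1, 2, \dots, n\}$ into disjoint sets $A$ and $B$ as follows:
$$A = \{k : M_k - m_k < \delta\} \quad\text{and}\quad B = \{k : M_k - m_k \ge \delta\}.$$

Since $|f(t) - f(s)| \le M_k - m_k$ for all $s, t \in [x_{k-1}, x_k]$, if $k \in A$, then by (4)
$$|\varphi(f(t)) - \varphi(f(s))| < \epsilon'$$
for all $s, t \in [x_{k-1}, x_k]$. But
$$M_k^* - m_k^* = \sup\{\varphi(f(t)) - \varphi(f(s)) : s, t \in [x_{k-1}, x_k]\}.$$

Therefore $M_k^* - m_k^* \le \epsilon'$ for all $k \in A$. On the other hand, if $k \in B$, then $M_k^* - m_k^* \le 2K$. Thus
$$U(\mathcal{P}, \varphi \circ f) - L(\mathcal{P}, \varphi \circ f) = \sum_{k \in A} (M_k^* - m_k^*)\,\Delta x_k + \sum_{k \in B} (M_k^* - m_k^*)\,\Delta x_k \le \epsilon' \sum_{k \in A} \Delta x_k + 2K \sum_{k \in B} \Delta x_k \le \epsilon'(b - a) + 2K \sum_{k \in B} \Delta x_k.$$

But for $k \in B$, $\delta \le M_k - m_k$. Therefore,
$$\sum_{k \in B} \Delta x_k \le \frac{1}{\delta} \sum_{k \in B} (M_k - m_k)\,\Delta x_k \le \frac{1}{\delta}\,\big(U(\mathcal{P}, f) - L(\mathcal{P}, f)\big) < \delta < \epsilon',$$
and hence by the above
$$U(\mathcal{P}, \varphi \circ f) - L(\mathcal{P}, \varphi \circ f) \le \epsilon'(b - a) + 2K\epsilon' = \epsilon.$$

This establishes (5), and thus $\varphi \circ f \in \mathcal{R}[a, b]$. □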


As a consequence of the previous theorem, if $f$ is Riemann integrable on $[a, b]$, then so are the functions $|f|$ and $f^2$. For emphasis we state this as a corollary.

COROLLARY 6.1.10 If $f \in \mathcal{R}[a, b]$, then $|f|$ and $f^2$ are Riemann integrable on $[a, b]$.

A natural question to ask is whether the composition of two Riemann


integrable functions is Riemann integrable. In Example 6.1.14(b) we will show
that the answer is an emphatic no!

Lebesgue’s Theorem
In Theorem 6.1.8(a) we proved that every continuous function on $[a, b]$ is Riemann integrable on $[a, b]$. By Exercise 16, this is also true for every bounded function on $[a, b]$ that is continuous except at a finite number of points. On the other hand, as a consequence of Theorem 6.1.8(b), every monotone function on $[a, b]$ is Riemann integrable. Hence for example, if $\{r_n\}_{n=1}^\infty$ is an enumeration of the rational numbers in $[0, 1]$ and $c_n > 0$ are such that $\sum c_n$ converges, then by Theorem 4.4.10
$$f(x) = \sum_{n=1}^\infty c_n I(x - r_n)$$
is monotone increasing on $[0, 1]$, and thus is Riemann integrable on $[0, 1]$. By Theorem 4.4.10, the function $f$ is continuous at every irrational number and discontinuous at every rational number in $[0, 1]$.
We now state the beautiful result of Lebesgue which provides necessary and
sufficient conditions that a bounded real-valued function on [a, b] be Riemann
integrable. To properly state Lebesgue’s result we need to introduce the idea
of a set of measure zero. The concept of measure of a set will be treated in
detail in Chapter 10. The basic idea is that the measure of an interval is its
length. This is then used to define what we mean by measurable set and the
measure of a measurable set. At this point we only need to know what it
means for a set to have measure zero.
DEFINITION 6.1.11 A subset $E$ of $\mathbb{R}$ has measure zero if given any $\epsilon > 0$, there exists a finite or countable collection $\{I_n\}_{n \in A}$ of open intervals such that
$$E \subset \bigcup_{n \in A} I_n \quad\text{and}\quad \sum_{n \in A} \ell(I_n) < \epsilon,$$
where $\ell(I_n)$ denotes the length of the interval $I_n$.

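EXAMPLES 6.1.12 (a) Every finite set $E$ has measure zero. Suppose $E = \{x_1, \dots, x_N\}$ is a finite subset of $\mathbb{R}$. Let $\epsilon > 0$ be given. For each $n = 1, 2, \dots, N$, as in Figure 6.4 let
$$I_n = \left(x_n - \frac{\epsilon}{2N},\; x_n + \frac{\epsilon}{2N}\right).$$

FIGURE 6.4
Example 6.1.12(a)

Then
$$E \subset \bigcup_{n=1}^N I_n \quad\text{and}\quad \sum_{n=1}^N \ell(I_n) = \epsilon.$$
Since $\epsilon > 0$ is arbitrary, the same construction with $\epsilon/2$ in place of $\epsilon$ gives a cover of total length less than $\epsilon$. Therefore $E$ has measure zero.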
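(b) Every countable subset of $\mathbb{R}$ has measure zero. Suppose $E = \{x_n\}_{n=1}^\infty$ is a countable subset of $\mathbb{R}$. Let $\epsilon > 0$ be given. For each $n \in \mathbb{N}$, let
$$I_n = \left(x_n - \frac{\epsilon}{2^{n+1}},\; x_n + \frac{\epsilon}{2^{n+1}}\right).$$

Since $x_n \in I_n$ for all $n$, $E \subset \bigcup_{n=1}^\infty I_n$. Thus since $\ell(I_n) = \epsilon/2^n$,
$$\sum_{n=1}^\infty \ell(I_n) = \epsilon \sum_{n=1}^\infty \frac{1}{2^n} = \epsilon.$$
As in (a), replacing $\epsilon$ by $\epsilon/2$ gives a cover of total length less than $\epsilon$, so $E$ has measure zero. As an example, the set $\mathbb{Q}$ of rational numbers has measure zero.

(c) As we shall see in the exercises of Section 10.2, the Cantor set $P$ in $[0, 1]$, which is uncountable, also has measure zero. □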
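The covering in part (b) can be made concrete for the rationals in $[0, 1]$. The sketch below is illustrative and rests on assumptions not in the text: the enumeration by increasing denominator and the helper names `rationals_in_unit_interval` and `cover` are choices made here, and only finitely many intervals are built.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` rationals in [0, 1], enumerated by increasing
    denominator (one of many possible enumerations)."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """Intervals I_n = (x_n - eps/2^(n+1), x_n + eps/2^(n+1)), n = 1, 2, ..."""
    eps = Fraction(eps)
    return [(x - eps / 2 ** (n + 1), x + eps / 2 ** (n + 1))
            for n, x in enumerate(points, start=1)]

eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
intervals = cover(points, eps)
total = sum(hi - lo for lo, hi in intervals)
# Partial sum of the geometric series: eps * (1 - 2^(-50)) < eps.
assert total == eps * (1 - Fraction(1, 2 ** 50))
print(float(total))
```

However many points are covered, the total length stays below $\epsilon$, even though the points themselves are dense in $[0, 1]$.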

We now state the following theorem of Henri Lebesgue, the proof of which
will be given in Section 6.7. This result appeared in 1902 and provides the most
succinct form of necessary and sufficient conditions for Riemann integrability.

THEOREM 6.1.13 (Lebesgue) A bounded real-valued function f on [a, b]


is Riemann integrable if and only if the set of discontinuities of f has measure
zero.

Remark. If f is continuous on [a, b], then clearly f satisfies the hypothesis


of Theorem 6.1.13 and thus is Riemann integrable. If f is a bounded function
which is continuous except at a finite number of points, then by Example
6.1.12(a) the set of discontinuities of f has measure zero. Hence f ∈ R[a, b].
If f is monotone on [a, b], then by Corollary 4.4.8, the set of discontinuities
of f is at most countable, and thus by Example 6.1.12(b), has measure zero.
Hence again f ∈ R[a, b].
As an application of Lebesgue’s theorem we give the following short proof
of Theorem 6.1.9.
Proof of Theorem 6.1.9 using Lebesgue’s Theorem. As in Theorem
6.1.9, suppose f ∈ R[a, b] with Range f ⊂ [c, d], and suppose ϕ : [c, d] → R is
continuous. Let

E = {x ∈ [a, b] : f is not continuous at x} and


F = {x ∈ [a, b] : ϕ ◦ f is not continuous at x}.

By Theorem 4.2.4, F ⊂ E. Since f is Riemann integrable on [a, b], the set E


has measure zero, and as a consequence so does the set F . Therefore ϕ ◦ f ∈
R[a, b]. 

EXAMPLES 6.1.14 (a) As in Example 4.2.2(g), let $f$ be defined on $[0, 1]$ by
$$f(x) = \begin{cases} 1, & x = 0, \\ 0, & \text{if } x \text{ is irrational}, \\ \dfrac{1}{n}, & \text{if } x = \dfrac{m}{n} \text{ in lowest terms, } x \ne 0. \end{cases}$$

Since $f$ is continuous except at the rational numbers, which have measure zero, $f$ is Riemann integrable on $[0, 1]$. Furthermore, since $L(\mathcal{P}, f) = 0$ for all partitions $\mathcal{P}$ of $[0, 1]$,
$$\int_0^1 f(x)\,dx = 0.$$

(b) Let $f$ be the Riemann integrable function on $[0, 1]$ given in (a), and let $g : [0, 1] \to \mathbb{R}$ be defined by
$$g(x) = \begin{cases} 0, & x = 0, \\ 1, & x \in (0, 1]. \end{cases}$$

Since $g$ is continuous except at $0$, $g \in \mathcal{R}[0, 1]$. But for $x \in [0, 1]$,
$$(g \circ f)(x) = \begin{cases} 1, & \text{if } x \text{ is rational}, \\ 0, & \text{if } x \text{ is irrational}. \end{cases}$$

By Example 6.1.6(a), $g \circ f \notin \mathcal{R}[0, 1]$. □

Exercises 6.1
1. Let $f(x) = 1 - x^2$, $x \in [-1, 2]$. Find $L(\mathcal{P}, f)$ and $U(\mathcal{P}, f)$ for each of the following partitions of $[-1, 2]$.
*a. $\mathcal{P} = \{-1, 0, 1, 2\}$
b. $\mathcal{P} = \{-1, -\frac{1}{2}, 0, \frac{1}{2}, 1, \frac{3}{2}, 2\}$
2. Show that each of the following functions is Riemann integrable on $[0, 2]$ and use the definition to find $\int_0^2 f$.
*a. $f(x) = \begin{cases} -1, & 0 \le x < 1, \\ 2, & 1 \le x \le 2 \end{cases}$
b. $f(x) = \begin{cases} 1, & 0 \le x < \frac{1}{2}, \\ -3, & \frac{1}{2} \le x < \frac{3}{2}, \\ -2, & \frac{3}{2} \le x \le 2 \end{cases}$

3. Show that each of the following functions is Riemann integrable on $[a, b]$, and find $\int_a^b f$.
*a. $f(x) = c$ ($c$ a constant)
b. $f(x) = \begin{cases} 0, & a \le x < c, \\ \frac{1}{2}, & x = c, \\ 1, & c < x \le b \end{cases}$

4. Use one of the methods of Examples 6.1.6 to find $\int_0^1 f$ for each of the following functions $f$. In the following, $[x]$ denotes the greatest integer function.
*a. $f(x) = [3x]$
b. $f(x) = x[2x]$
*c. $f(x) = 3x + 1$
d. $f(x) = 1 - x^2$
5. Prove that $\int_a^a f = 0$ for any real-valued function $f$ on $[a, a]$.
6. *If $f, g \in \mathcal{R}[a, b]$ with $f(x) \le g(x)$ for all $x \in [a, b]$, prove that $\int_a^b f \le \int_a^b g$.
7. a. Suppose $f$ is continuous on $[a, b]$ with $f(x) \ge 0$ for all $x \in [a, b]$. If $\int_a^b f = 0$, prove that $f(x) = 0$ for all $x \in [a, b]$.
b. Show by example that the conclusion may be false if $f$ is not continuous.
8. *a. Let $f : [0, 1] \to \mathbb{R}$ be defined by
$$f(x) = \begin{cases} 0, & x \in \mathbb{Q}, \\ x, & x \notin \mathbb{Q}. \end{cases}$$
Compute $\underline{\int_0^1} f$ and $\overline{\int_0^1} f$. Is $f$ integrable on $[0, 1]$?
9. Suppose $f$ is a nonnegative Riemann integrable function on $[a, b]$ satisfying $f(r) = 0$ for all $r \in \mathbb{Q} \cap [a, b]$. Prove that $\int_a^b f = 0$.
10. *Use the method of Example 6.1.6(c) to show that
$$\int_a^b x^2\,dx = \frac{1}{3}\,(b^3 - a^3) \qquad (0 \le a < b).$$
11. Suppose $f$ is monotone increasing on $[a, b]$. For $n \in \mathbb{N}$, set $h = (b - a)/n$. Let $\mathcal{P}_n = \{x_0, x_1, \dots, x_n\}$ where for each $k = 0, \dots, n$, $x_k = a + kh$.
a. Prove that
$$0 \le U(\mathcal{P}_n, f) - \int_a^b f \le \frac{(b - a)}{n}\,[f(b) - f(a)].$$
b. Prove that $\int_a^b f = \lim_{n \to \infty} U(\mathcal{P}_n, f)$.
12. Use the previous exercise to evaluate the following integrals.
*a. $\int_0^1 x\,dx$
b. $\int_{-2}^1 (3x - 2)\,dx$
*c. $\int_0^1 x^3\,dx$
d. $\int_a^b x^3\,dx$

13. *Let $f$ be a bounded function on $[a, b]$. Suppose there exists a sequence $\{\mathcal{P}_n\}$ of partitions of $[a, b]$ such that
$$\lim_{n \to \infty} L(\mathcal{P}_n, f) = \lim_{n \to \infty} U(\mathcal{P}_n, f) = L, \quad L \in \mathbb{R}.$$
Prove that $f$ is Riemann integrable on $[a, b]$ with $\int_a^b f = L$.
14. *a. If $f \in \mathcal{R}[a, b]$, prove directly (without using Theorems 6.1.9 or 6.1.13) that $|f| \in \mathcal{R}[a, b]$.
b. If $|f| \in \mathcal{R}[a, b]$, is $f \in \mathcal{R}[a, b]$?
c. If $f \in \mathcal{R}[a, b]$, prove that $\left|\int_a^b f\right| \le \int_a^b |f|$.
15. *a. If $f \in \mathcal{R}[a, b]$, prove directly that $f^2 \in \mathcal{R}[a, b]$.
b. Give an example of a bounded function $f$ on $[a, b]$ for which $f^2 \in \mathcal{R}[a, b]$, but $f \notin \mathcal{R}[a, b]$.
16. *Suppose $f$ is a bounded real-valued function on $[a, b]$ that has only a finite number of discontinuities. Prove directly that $f$ is Riemann integrable on $[a, b]$.
17. a. If $E$ has measure zero, prove that every subset of $E$ has measure zero.
b. If $E_1, E_2$ have measure zero, prove that $E_1 \cup E_2$ has measure zero.
c. If each $E_n$, $n = 1, 2, \dots$, has measure zero, prove that $\bigcup_{n=1}^\infty E_n$ has measure zero.
18. Use Theorem 6.1.13 to prove that if $f, g \in \mathcal{R}[a, b]$, then $f + g \in \mathcal{R}[a, b]$.
19. Prove directly that the function $f$ of Example 6.1.14(a) is Riemann integrable on $[a, b]$.
20. Let $f : [a, b] \to \mathbb{R}$ be continuous. Suppose that for every Riemann integrable function $g : [a, b] \to \mathbb{R}$ the product $fg$ is Riemann integrable and $\int_a^b fg = 0$. Prove that $f(x) = 0$ for all $x \in [a, b]$.
21. a. Let $\{I_n\}$ be a finite or countable collection of disjoint open intervals in $[a, b]$, and let $U = \bigcup I_n$. Define $f$ on $[a, b]$ by $f(x) = 1$ if $x \in U$, and $0$ elsewhere. Prove that $f$ is Riemann integrable on $[a, b]$ and that $\int_a^b f = \sum_n \ell(I_n)$.
b. Let $P$ denote the Cantor ternary set in $[0, 1]$ and let $f$ be defined on $[0, 1]$ by $f(x) = 0$ if $x \in P$, and $f(x) = 1$ elsewhere. Prove that $f$ is Riemann integrable on $[0, 1]$ and that $\int_0^1 f = 1$.

6.2 Properties of the Riemann Integral


In this section, we derive some basic properties of the Riemann integral. As
in the previous section, [a, b], a < b, will be a closed and bounded interval in
R, and R[a, b] denotes the set of Riemann integrable functions on [a, b].

THEOREM 6.2.1 Let $f, g \in \mathcal{R}[a, b]$. Then
(a) $f + g \in \mathcal{R}[a, b]$ with $\int_a^b (f + g) = \int_a^b f + \int_a^b g$,
(b) $cf \in \mathcal{R}[a, b]$ for all $c \in \mathbb{R}$ with $\int_a^b cf = c \int_a^b f$, and
(c) $fg \in \mathcal{R}[a, b]$.

Proof. (a) The integrability of $f + g$ is actually a consequence of Theorem 6.1.13. However, in establishing the formula for the integral of $(f + g)$, we will also obtain the integrability of $f + g$ as a consequence. Let $\mathcal{P} = \{x_0, \dots, x_n\}$ be a partition of $[a, b]$. For each $i = 1, \dots, n$ let
$$M_i(f) = \sup\{f(t) : t \in [x_{i-1}, x_i]\}, \qquad M_i(g) = \sup\{g(t) : t \in [x_{i-1}, x_i]\}.$$

Then $f(t) + g(t) \le M_i(f) + M_i(g)$ for all $t \in [x_{i-1}, x_i]$ and thus
$$\sup\{f(t) + g(t) : t \in [x_{i-1}, x_i]\} \le M_i(f) + M_i(g).$$

Therefore, for all partitions $\mathcal{P}$ of $[a, b]$,
$$U(\mathcal{P}, f + g) \le U(\mathcal{P}, f) + U(\mathcal{P}, g).$$

Let $\epsilon > 0$ be given. Since $f, g \in \mathcal{R}[a, b]$, there exist partitions $\mathcal{P}_f$ and $\mathcal{P}_g$ of $[a, b]$ such that
$$U(\mathcal{P}_f, f) < \int_a^b f + \tfrac{1}{2}\epsilon \quad\text{and}\quad U(\mathcal{P}_g, g) < \int_a^b g + \tfrac{1}{2}\epsilon.$$

Let $\mathcal{Q} = \mathcal{P}_f \cup \mathcal{P}_g$. Since $\mathcal{Q}$ is a refinement of both $\mathcal{P}_f$ and $\mathcal{P}_g$,
$$U(\mathcal{Q}, f + g) \le U(\mathcal{Q}, f) + U(\mathcal{Q}, g) \le U(\mathcal{P}_f, f) + U(\mathcal{P}_g, g) < \int_a^b f + \int_a^b g + \epsilon.$$

Therefore
$$\overline{\int_a^b} (f + g) < \int_a^b f + \int_a^b g + \epsilon.$$
Since the above holds for all $\epsilon > 0$,
$$\overline{\int_a^b} (f + g) \le \int_a^b f + \int_a^b g.$$

A similar argument proves that
$$\underline{\int_a^b} (f + g) \ge \int_a^b f + \int_a^b g.$$

Thus the lower integral of $(f + g)$ is equal to the upper integral of $(f + g)$, and as a consequence $(f + g) \in \mathcal{R}[a, b]$ with
$$\int_a^b (f + g) = \int_a^b f + \int_a^b g.$$

(b) The proof of (b) is left as an exercise (Exercise 1).

(c) To prove (c), we first note that
$$fg = \frac{1}{4}\left[(f + g)^2 - (f - g)^2\right].$$
By (a) the functions $(f + g)$ and $(f - g)$ are integrable on $[a, b]$, and by Corollary 6.1.10, $(f + g)^2$ and $(f - g)^2$ are integrable. Hence by (a) and (b), $fg$ is integrable on $[a, b]$. □

THEOREM 6.2.2 If $f \in \mathcal{R}[a, b]$, then $|f| \in \mathcal{R}[a, b]$ with
$$\left|\int_a^b f\right| \le \int_a^b |f|.$$

Proof. Since $f \in \mathcal{R}[a, b]$, by Corollary 6.1.10 $|f| \in \mathcal{R}[a, b]$. Let $c = \pm 1$ be such that $\left|\int_a^b f\right| = c \int_a^b f$. Since $cf(x) \le |f(x)|$ for all $x \in [a, b]$, $U(\mathcal{P}, cf) \le U(\mathcal{P}, |f|)$ for any partition $\mathcal{P}$ of $[a, b]$. Therefore, since both $cf$ and $|f|$ are integrable, $\int_a^b cf \le \int_a^b |f|$. Combining the above we have
$$\left|\int_a^b f\right| = c \int_a^b f = \int_a^b cf \le \int_a^b |f|. \qquad □$$

THEOREM 6.2.3 Let $f$ be a bounded real-valued function on $[a, b]$, and suppose $a < c < b$. Then $f \in \mathcal{R}[a, b]$ if and only if $f \in \mathcal{R}[a, c]$ and $f \in \mathcal{R}[c, b]$. If this is the case, then
$$\int_a^b f = \int_a^c f + \int_c^b f. \tag{6}$$

Proof. Let $c \in \mathbb{R}$ satisfy $a < c < b$. We first prove that if $f$ is a bounded real-valued function on $[a, b]$, then
$$\overline{\int_a^b} f = \overline{\int_a^c} f + \overline{\int_c^b} f. \tag{7}$$

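Suppose $\mathcal{P}_1$ and $\mathcal{P}_2$ are partitions of $[a, c]$ and $[c, b]$ respectively. Then $\mathcal{P} = \mathcal{P}_1 \cup \mathcal{P}_2$ is a partition of $[a, b]$ with $c \in \mathcal{P}$. Conversely, if $\mathcal{P}$ is any partition of $[a, b]$ with $c \in \mathcal{P}$, then $\mathcal{P} = \mathcal{P}_1 \cup \mathcal{P}_2$ where $\mathcal{P}_1$ and $\mathcal{P}_2$ are partitions of $[a, c]$ and $[c, b]$, respectively. For such a partition $\mathcal{P}$,
$$U(\mathcal{P}, f) = U(\mathcal{P}_1, f) + U(\mathcal{P}_2, f) \ge \overline{\int_a^c} f + \overline{\int_c^b} f.$$

If $\mathcal{Q}$ is any partition of $[a, b]$, then $\mathcal{P} = \mathcal{Q} \cup \{c\}$ is a refinement of $\mathcal{Q}$ containing $c$. Therefore
$$U(\mathcal{Q}, f) \ge U(\mathcal{P}, f) \ge \overline{\int_a^c} f + \overline{\int_c^b} f.$$

Taking the infimum over all partitions $\mathcal{Q}$ of $[a, b]$ gives
$$\overline{\int_a^b} f \ge \overline{\int_a^c} f + \overline{\int_c^b} f.$$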

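To prove the reverse inequality, let $\epsilon > 0$ be given. Then there exist partitions $\mathcal{P}_1$ and $\mathcal{P}_2$ of $[a, c]$ and $[c, b]$ respectively such that
$$U(\mathcal{P}_1, f) < \overline{\int_a^c} f + \frac{\epsilon}{2} \quad\text{and}\quad U(\mathcal{P}_2, f) < \overline{\int_c^b} f + \frac{\epsilon}{2}.$$

Let $\mathcal{P} = \mathcal{P}_1 \cup \mathcal{P}_2$. Then
$$\overline{\int_a^b} f \le U(\mathcal{P}, f) = U(\mathcal{P}_1, f) + U(\mathcal{P}_2, f) < \overline{\int_a^c} f + \overline{\int_c^b} f + \epsilon.$$

Since $\epsilon > 0$ was arbitrary,
$$\overline{\int_a^b} f \le \overline{\int_a^c} f + \overline{\int_c^b} f,$$
which when combined with the previous inequality proves (7). A similar argument also proves
$$\underline{\int_a^b} f = \underline{\int_a^c} f + \underline{\int_c^b} f. \tag{8}$$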

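If $f$ is integrable on $[a, c]$ and $[c, b]$, then by (7) and (8)
$$\int_a^c f + \int_c^b f \le \underline{\int_a^b} f \le \overline{\int_a^b} f = \int_a^c f + \int_c^b f.$$

Therefore $f \in \mathcal{R}[a, b]$ and identity (6) holds. Conversely, if $f \in \mathcal{R}[a, b]$, then by (7) and (8),
$$\underline{\int_a^c} f + \underline{\int_c^b} f = \overline{\int_a^c} f + \overline{\int_c^b} f.$$

Since the lower integral of $f$ is always less than or equal to the upper integral of $f$, the above holds if and only if
$$\underline{\int_a^c} f = \overline{\int_a^c} f \quad\text{and}\quad \underline{\int_c^b} f = \overline{\int_c^b} f.$$

Hence $f \in \mathcal{R}[a, c]$ and $f \in \mathcal{R}[c, b]$. □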

Riemann’s Definition of the Integral


We close this section by comparing the approach of Darboux with the original
method of Riemann.

DEFINITION 6.2.4 Let $f$ be a bounded real-valued function on $[a, b]$ and let $\mathcal{P} = \{x_0, x_1, \dots, x_n\}$ be a partition of $[a, b]$. For each $i = 1, 2, \dots, n$, choose $t_i \in [x_{i-1}, x_i]$. The sum
$$S(\mathcal{P}, f) = \sum_{i=1}^n f(t_i)\,\Delta x_i$$
is called a Riemann sum of $f$ with respect to the partition $\mathcal{P}$ and the points $\{t_i\}$.

FIGURE 6.5
A Riemann sum S(P, f ) of f

In the Riemann approach to integration, one defines the integral of a bounded real-valued function $f$ as the limit of the Riemann sums of $f$. Since $S(\mathcal{P}, f)$ depends not only on the partition $\mathcal{P} = \{x_0, x_1, \dots, x_n\}$ but also on the points $t_i \in [x_{i-1}, x_i]$, we first need to clarify what we mean by the limit of the Riemann sums $S(\mathcal{P}, f)$. For a partition $\mathcal{P} = \{x_0, x_1, \dots, x_n\}$ of $[a, b]$, set
$$\|\mathcal{P}\| = \max\{\Delta x_i : i = 1, 2, \dots, n\}.$$

The quantity $\|\mathcal{P}\|$ is called the norm or the mesh of the partition $\mathcal{P}$.

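DEFINITION 6.2.5 Let $f$ be a bounded real-valued function on $[a, b]$. Then
$$\lim_{\|\mathcal{P}\| \to 0} S(\mathcal{P}, f) = I$$
if given $\epsilon > 0$, there exists a $\delta > 0$ such that
$$\left|\sum_{i=1}^n f(t_i)\,\Delta x_i - I\right| < \epsilon \tag{9}$$
for all partitions $\mathcal{P}$ of $[a, b]$ with $\|\mathcal{P}\| < \delta$, and all choices of $t_i \in [x_{i-1}, x_i]$.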

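THEOREM 6.2.6 Let $f$ be a bounded real-valued function on $[a, b]$. If
$$\lim_{\|\mathcal{P}\| \to 0} S(\mathcal{P}, f) = I,$$
then $f \in \mathcal{R}[a, b]$ and $\int_a^b f = I$. Conversely, if $f \in \mathcal{R}[a, b]$, then $\lim_{\|\mathcal{P}\| \to 0} S(\mathcal{P}, f)$ exists and
$$\lim_{\|\mathcal{P}\| \to 0} S(\mathcal{P}, f) = \int_a^b f.$$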

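Proof. Suppose $\lim_{\|\mathcal{P}\| \to 0} S(\mathcal{P}, f) = I$. Let $\epsilon > 0$ be given, and let $\delta > 0$ be such that (9) holds for all partitions $\mathcal{P} = \{x_0, x_1, \dots, x_n\}$ of $[a, b]$ with $\|\mathcal{P}\| < \delta$, and all $t_i \in [x_{i-1}, x_i]$. Fix such a partition $\mathcal{P}$. By the definition of $M_i$, for each $i = 1, \dots, n$, there exists $\zeta_i \in [x_{i-1}, x_i]$ such that $f(\zeta_i) > M_i - \epsilon$. Thus
$$U(\mathcal{P}, f) = \sum_{i=1}^n M_i\,\Delta x_i < \sum_{i=1}^n f(\zeta_i)\,\Delta x_i + \epsilon \sum_{i=1}^n \Delta x_i < I + \epsilon + \epsilon\,[b - a] = I + \epsilon\,[1 + b - a].$$

Similarly $L(\mathcal{P}, f) > I - \epsilon\,[1 + b - a]$. Therefore,
$$U(\mathcal{P}, f) - L(\mathcal{P}, f) < 2\epsilon\,[1 + b - a].$$

Thus as a consequence of Theorem 6.1.7, $f \in \mathcal{R}[a, b]$ with $\int_a^b f = I$.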
Conversely, suppose $f \in \mathcal{R}[a, b]$. Let $M > 0$ be such that $|f(x)| \le M$ for all $x \in [a, b]$. Let $\epsilon > 0$ be given. Since $f \in \mathcal{R}[a, b]$, by Theorem 6.1.7 there exists a partition $\mathcal{Q}$ of $[a, b]$ such that
$$\int_a^b f - \epsilon < L(\mathcal{Q}, f) \le U(\mathcal{Q}, f) < \int_a^b f + \epsilon.$$

Suppose $\mathcal{Q} = \{x_0, \dots, x_N\}$. Let $\delta = \epsilon/(NM)$, and let $\mathcal{P} = \{y_0, \dots, y_n\}$ be any partition of $[a, b]$ with $\|\mathcal{P}\| < \delta$. As in the definition of the integral, let
$$M_i = \sup\{f(x) : x \in [x_{i-1}, x_i]\}, \quad i = 1, \dots, N.$$

Consider any interval $[y_{k-1}, y_k]$, $k = 1, \dots, n$. This interval may or may not contain points $x_i \in \mathcal{Q}$. Since $\mathcal{Q}$ contains $N + 1$ points, there are at most $N - 1$ intervals $[y_{k-1}, y_k]$ which contain an $x_i \in \mathcal{Q}$, $i \ne 0, N$. Suppose as in Figure 6.6 $\{x_j, \dots, x_{j+m}\} \subset [y_{k-1}, y_k]$. If $x_j \ne y_{k-1}$, set $M_k^1 = \sup\{f(x) : x \in [y_{k-1}, x_j]\}$. Similarly, if $x_{j+m} \ne y_k$, set $M_k^2 = \sup\{f(x) : x \in [x_{j+m}, y_k]\}$.
Let $t_k \in [y_{k-1}, y_k]$ be arbitrary. Since $|f(t) - f(s)| \le 2M$ for all $t, s \in [y_{k-1}, y_k]$,
$$f(t_k) \le 2M + M_{j+s}, \quad s = 1, \dots, m, \quad\text{and}\quad f(t_k) \le 2M + M_k^i, \quad i = 1, 2.$$

FIGURE 6.6
Partition of the interval [yk−1 , yk ]

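Therefore
$$f(t_k)\,\Delta y_k = f(t_k)(x_j - y_{k-1}) + \sum_{s=1}^m f(t_k)\,\Delta x_{j+s} + f(t_k)(y_k - x_{j+m})$$
$$\le 2M\,\Delta y_k + M_k^1 (x_j - y_{k-1}) + \sum_{s=1}^m M_{j+s}\,\Delta x_{j+s} + M_k^2 (y_k - x_{j+m}) < 2M\delta + U(\mathcal{P}_k, f),$$
where $\mathcal{P}_k = \{y_{k-1}, x_j, \dots, x_{j+m}, y_k\}$ is a partition of $[y_{k-1}, y_k]$. If the interval $[y_{k-1}, y_k]$ contains no $x_i \in \mathcal{Q}$, $i \ne 0, N$, we simply let $\mathcal{P}_k = \{y_{k-1}, y_k\}$. Let $\mathcal{P}' = \bigcup \mathcal{P}_k$. Then $\mathcal{P}'$ is a partition of $[a, b]$ that is also a refinement of $\mathcal{Q}$. Since at most $N - 1$ intervals $[y_{k-1}, y_k]$ contain a point of $\mathcal{Q}$ other than $x_0$ and $x_N$,
$$S(\mathcal{P}, f) = \sum_{k=1}^n f(t_k)\,\Delta y_k < 2M(N - 1)\,\delta + \sum_{k=1}^n U(\mathcal{P}_k, f) < 2\epsilon + U(\mathcal{P}', f) \le 2\epsilon + U(\mathcal{Q}, f) < 3\epsilon + \int_a^b f.$$

A similar argument proves that
$$S(\mathcal{P}, f) > \int_a^b f - 3\epsilon.$$

Therefore,
$$\left|S(\mathcal{P}, f) - \int_a^b f\right| < 3\epsilon.$$

Since this holds for any partition $\mathcal{P} = \{y_0, \dots, y_n\}$ of $[a, b]$ with $\|\mathcal{P}\| < \delta$ and any choice of $t_k \in [y_{k-1}, y_k]$, $k = 1, \dots, n$,
$$\lim_{\|\mathcal{P}\| \to 0} S(\mathcal{P}, f) = \int_a^b f. \qquad □$$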

EXAMPLE 6.2.7 In this example, we will use the method of Riemann sums to evaluate $\int_a^b x\,dx$. Since $f(x) = x$ is Riemann integrable on $[a, b]$,
$$\int_a^b x\,dx = \lim_{\|\mathcal{P}\| \to 0} S(\mathcal{P}, f).$$

Since the limit exists for any $t_i \in [x_{i-1}, x_i]$, we can take $t_i = (x_{i-1} + x_i)/2$. With this choice of $t_i$,
$$S(\mathcal{P}, f) = \frac{1}{2} \sum_{i=1}^n (x_i + x_{i-1})(x_i - x_{i-1}) = \frac{1}{2} \sum_{i=1}^n (x_i^2 - x_{i-1}^2) = \frac{1}{2}\,(b^2 - a^2),$$
which proves the result. □
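The midpoint computation in Example 6.2.7 holds for every partition, not only in the limit, because the sum telescopes. The sketch below checks this on a randomly generated partition with exact rational arithmetic; the helper names `riemann_sum` and `random_partition` and the fixed seed are assumptions made for illustration.

```python
from fractions import Fraction
import random

def riemann_sum(f, partition, tags):
    """S(P, f) = sum of f(t_i) * (x_i - x_{i-1}) for tags t_i in [x_{i-1}, x_i]."""
    return sum(f(t) * (partition[i + 1] - partition[i])
               for i, t in enumerate(tags))

def random_partition(a, b, n, rng):
    """A partition of [a, b] with up to n random interior points (hypothetical helper)."""
    a, b = Fraction(a), Fraction(b)
    interior = {a + (b - a) * Fraction(rng.randint(1, 999), 1000) for _ in range(n)}
    return sorted({a, b} | interior)

rng = random.Random(0)
a, b = Fraction(1), Fraction(4)
partition = random_partition(a, b, 8, rng)
midpoints = [(partition[i] + partition[i + 1]) / 2
             for i in range(len(partition) - 1)]
value = riemann_sum(lambda x: x, partition, midpoints)
# Telescoping gives exactly (b^2 - a^2) / 2, whatever the partition.
assert value == (b * b - a * a) / 2
print(value)
```

With any other choice of tags, such as left endpoints, the sum is only approximately $\frac{1}{2}(b^2 - a^2)$ and the limit over $\|\mathcal{P}\| \to 0$ is genuinely needed.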

Exercises 6.2
1. Prove Theorem 6.2.1(b).
2. *a. Use the method of Riemann sums to evaluate $\int_a^b x^2\,dx$.
b. Use the method of Riemann sums to evaluate $\int_a^b x^n\,dx$, $n \in \mathbb{N}$, $n \ge 3$.
3. *a. Let $f$ be a real-valued function on $[a, b]$ such that $f(x) = 0$ for all $x \ne c_1, \dots, c_n$. Prove that $f \in \mathcal{R}[a, b]$ with $\int_a^b f = 0$.
*b. Let $f, g \in \mathcal{R}[a, b]$ be such that $f(x) = g(x)$ for all but a finite number of points in $[a, b]$. Prove that $\int_a^b f = \int_a^b g$.
c. Is the result of (a) still true if $f(x) = 0$ for all but countably many points in $[a, b]$?
4. Let $f \in \mathcal{R}[-a, a]$, $a > 0$. Prove each of the following:
a. If $f$ is even (i.e. $f(-x) = f(x)$ for all $x \in [-a, a]$), then $\int_{-a}^a f = 2 \int_0^a f$.
b. If $f$ is odd (i.e. $f(-x) = -f(x)$ for all $x \in [-a, a]$), then $\int_{-a}^a f = 0$.
5. *Let $f$ be a bounded real-valued function on $[a, b]$ such that $f \in \mathcal{R}[c, b]$ for every $c$, $a < c < b$. Prove that $f \in \mathcal{R}[a, b]$ with $\int_a^b f = \lim_{c \to a^+} \int_c^b f$.
6. Let $f$ be continuous on $[0, 1]$. Prove that
$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^n f\!\left(\frac{k}{n}\right) = \int_0^1 f(x)\,dx.$$
7. Use the previous exercise to evaluate each of the following limits. (You may use any applicable methods from calculus to evaluate the definite integrals.)
*a. $\lim_{n \to \infty} \dfrac{1}{n^3} \sum_{k=1}^n k^2$
b. $\lim_{n \to \infty} \sum_{k=1}^n \dfrac{k}{n^2 + k^2}$
*c. $\lim_{n \to \infty} \sum_{k=1}^n \dfrac{n}{n^2 + k^2}$
d. $\lim_{n \to \infty} \dfrac{1}{n^3} \sum_{k=1}^n k \sqrt{k^2 + n^2}$
8. *As in Example 4.4.11(a), let $f$ be a monotone function on $[0, 1]$ defined by
$$f(x) = \sum_{n=1}^\infty \frac{1}{2^n}\, I(x - x_n),$$
where $x_n = n/(n + 1)$, $n \in \mathbb{N}$. Find $\int_0^1 f(x)\,dx$. (Leave your answer in the form of an infinite series.)
9. Suppose $f \in \mathcal{R}[a, b]$ and $c \in \mathbb{R}$. Define $f_c$ on $[a - c, b - c]$ by $f_c(x) = f(x + c)$. Prove that $f_c \in \mathcal{R}[a - c, b - c]$ with
$$\int_{a-c}^{b-c} f_c(x)\,dx = \int_a^b f(x)\,dx.$$
10. Let $f, g, h$ be bounded real-valued functions on $[a, b]$ satisfying $f(x) \le g(x) \le h(x)$ for all $x \in [a, b]$. If $f, h \in \mathcal{R}[a, b]$ with $\int_a^b f = \int_a^b h = I$, prove that $g \in \mathcal{R}[a, b]$ with $\int_a^b g = I$.

6.3 Fundamental Theorem of Calculus


In this section, we will prove two well known and very important results. Collectively they are commonly referred to as the fundamental theorem of calculus. To Newton and Leibniz, integration was the inverse operation of differentiation. In Leibniz’s notation this result would simply be expressed as $\int f'(x)\,dx = f(x)$, where here the symbol $\int$ (denoting sum) was Leibniz’s notation for the integral. This notation is still used in most calculus texts to denote the antiderivative of a function. Since the modern definition of the integral is based on either Riemann or Darboux sums, we now need to prove that integration is indeed the inverse operation of differentiation. Both versions of the fundamental theorem of calculus presented here are essentially due to Cauchy who proved the results for continuous functions.
As was illustrated in Examples 6.1.6 and 6.2.7, the computation of the integral of a function, using either Darboux’s or Riemann’s definition, can be extremely tedious. For nontrivial functions these computations are in most instances impossible. The first version of the fundamental theorem of calculus provides a major tool for the evaluation of Riemann integrals. We begin with the following definition.

DEFINITION 6.3.1 Let f be a real-valued function on an interval I. A


function F on I is called an antiderivative of f on I if F ′ (x) = f (x) for all
x ∈ I.

Remark. An antiderivative, if it exists, is not unique. If $F$ is an antiderivative of $f$, then so is $F + C$ for any constant $C$. Conversely, if $F$ and $G$ are antiderivatives of $f$ on $[a, b]$, then $F'(x) - G'(x) = 0$ for all $x \in [a, b]$. Thus by Theorem 5.2.9, $G(x) = F(x) + C$ for some constant $C$.

THEOREM 6.3.2 (Fundamental Theorem of Calculus) If $f \in \mathcal{R}[a, b]$ and if $F$ is an antiderivative of $f$ on $[a, b]$, then
$$\int_a^b f(x)\,dx = F(b) - F(a).$$
(See footnote 2.)

Proof. Let $\mathcal{P} = \{x_0, x_1, \dots, x_n\}$ be any partition of $[a, b]$. If $F$ is an antiderivative of $f$, then by the mean value theorem, for each $i = 1, \dots, n$, there exists $t_i \in (x_{i-1}, x_i)$ such that
$$F(x_i) - F(x_{i-1}) = f(t_i)\,\Delta x_i.$$
Therefore,
$$\sum_{i=1}^n f(t_i)\,\Delta x_i = \sum_{i=1}^n \big[F(x_i) - F(x_{i-1})\big] = F(b) - F(a).$$

Since
$$L(\mathcal{P}, f) \le \sum_{i=1}^n f(t_i)\,\Delta x_i \le U(\mathcal{P}, f),$$
we have
$$\underline{\int_a^b} f(x)\,dx \le F(b) - F(a) \le \overline{\int_a^b} f(x)\,dx.$$

Thus if $f$ is Riemann integrable on $[a, b]$,
$$\int_a^b f(x)\,dx = F(b) - F(a). \qquad □$$

Remark. The above version of the fundamental theorem of calculus is con-


siderably stronger than the version given in most elementary calculus texts in
that it does not require continuity of the function f . We only need that f is
integrable and that it has an antiderivative on [a, b]. We will illustrate this in
(b) of the following example.

2 The quantity $F(b) - F(a)$ is usually denoted by $F(x)\big|_a^b$.
EXAMPLES 6.3.3 (a) If $f(x) = x^n$, $n \in \mathbb{N}$, then $F(x) = \frac{1}{n+1}\,x^{n+1}$ is an antiderivative of $f$. Thus for any $a, b \in \mathbb{R}$, $a < b$,
$$\int_a^b x^n\,dx = \frac{1}{n+1}\left(b^{n+1} - a^{n+1}\right).$$

(b) Consider the function $F$ on $[0, 1]$ defined by
$$F(x) = \begin{cases} x^2 \sin \dfrac{1}{x}, & x \ne 0, \\ 0, & x = 0. \end{cases}$$

A straightforward computation gives
$$f(x) = F'(x) = \begin{cases} -\cos \dfrac{1}{x} + 2x \sin \dfrac{1}{x}, & x \ne 0, \\ 0, & x = 0. \end{cases}$$

Then $f$ is bounded on $[0, 1]$ and continuous everywhere except $x = 0$. Thus $f \in \mathcal{R}[0, 1]$, and by the previous theorem
$$\int_0^1 f = F(1) - F(0) = \sin 1.$$

(c) Let $f$ be defined on $[0, 2]$ by
$$f(x) = \begin{cases} 1, & 0 \le x < 1, \\ x - 1, & 1 \le x \le 2, \end{cases}$$
and for $x \in [0, 2]$ let $F(x)$ be defined by
$$F(x) = \int_0^x f(t)\,dt.$$

If $0 \le x \le 1$, then $F(x) = \int_0^x 1\,dt = x$. On the other hand, if $1 < x \le 2$, then
$$F(x) = \int_0^1 1\,dt + \int_1^x (t - 1)\,dt = \frac{1}{2}x^2 - x + \frac{3}{2}.$$

Thus
$$F(x) = \begin{cases} x, & 0 \le x \le 1, \\ \frac{1}{2}x^2 - x + \frac{3}{2}, & 1 < x \le 2. \end{cases}$$
Even though $f$ is not continuous at $x = 1$, the function $F$ is continuous everywhere. This in fact is always the case. (See Exercise 2.) □
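The piecewise formula for $F$ in part (c) can be compared against a direct numerical approximation of $\int_0^x f$. This is a rough floating-point sketch, not a proof: the midpoint rule, the number of subintervals, and the names `F_closed` and `F_numeric` are assumptions chosen for illustration.

```python
def f(t):
    """The integrand of Example 6.3.3(c), with a jump at t = 1."""
    return 1.0 if t < 1 else t - 1.0

def F_closed(x):
    """The piecewise formula for F derived in the example."""
    return x if x <= 1 else 0.5 * x * x - x + 1.5

def F_numeric(x, n=20000):
    """Midpoint Riemann sum for the integral of f over [0, x]."""
    h = x / n
    return sum(f((k + 0.5) * h) for k in range(n)) * h

for x in (0.5, 1.0, 1.5, 2.0):
    # The jump of f at 1 limits the accuracy to roughly the mesh size.
    assert abs(F_numeric(x) - F_closed(x)) < 1e-3
    print(x, F_closed(x), round(F_numeric(x), 6))
```

The two branches of `F_closed` agree at $x = 1$, which is the continuity of $F$ noted in the example.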

The following version of the fundamental theorem of calculus, also due


to Cauchy, proves that for continuous functions, integration is the inverse
operation of differentiation.

THEOREM 6.3.4 (Fundamental Theorem of Calculus) Let $f \in \mathcal{R}[a, b]$. Define $F$ on $[a, b]$ by
$$F(x) = \int_a^x f(t)\,dt.$$

Then $F$ is continuous on $[a, b]$. Furthermore, if $f$ is continuous at a point $c \in [a, b]$, then $F$ is differentiable at $c$ and
$$F'(c) = f(c).$$

Proof. The proof that $F$ is continuous is left as an exercise (Exercise 2).

Suppose $f$ is continuous at $c \in [a, b)$. We will show that $F'_+(c) = f(c)$. Let $h > 0$. Then by Theorem 6.2.3,
$$F(c + h) - F(c) = \int_a^{c+h} f(t)\,dt - \int_a^c f(t)\,dt = \int_c^{c+h} f(t)\,dt.$$

Therefore,
$$\frac{F(c + h) - F(c)}{h} - f(c) = \frac{1}{h} \int_c^{c+h} f(t)\,dt - f(c) = \frac{1}{h} \int_c^{c+h} [f(t) - f(c)]\,dt.$$

Let $\epsilon > 0$ be given. Since $f$ is continuous at $c$, there exists a $\delta > 0$ such that
$$|f(t) - f(c)| < \epsilon$$
for all $t \in [a, b]$ with $|t - c| < \delta$. Therefore, if $0 < h < \delta$,
$$\left|\frac{F(c + h) - F(c)}{h} - f(c)\right| \le \frac{1}{h} \int_c^{c+h} |f(t) - f(c)|\,dt \le \frac{1}{h} \int_c^{c+h} \epsilon\,dt = \epsilon.$$

Thus
$$\lim_{h \to 0^+} \frac{F(c + h) - F(c)}{h} = f(c),$$
i.e., $F'_+(c) = f(c)$. Similarly, if $f$ is continuous at $c \in (a, b]$, $F'_-(c) = f(c)$, which proves the result. □

Remarks. (a) If $f$ is continuous on $[a, b]$, then an antiderivative of $f$ always exists; namely the function $F$ given by
$$F(x) = \int_a^x f(t)\,dt, \quad a \le x \le b.$$

Since $f$ is continuous, $F'(x) = f(x)$ for all $x \in [a, b]$. As a consequence, we obtain the following more elementary version of Theorem 6.3.2 normally encountered in the study of calculus: If $f$ is continuous on $[a, b]$ and $G$ is any antiderivative of $f$, then $\int_a^b f = G(b) - G(a)$.
(b) Integrability of a function $f$ on $[a, b]$ does not imply the existence of an antiderivative of $f$. For example, if $f$ is monotone increasing on $[a, b]$ and $F(x) = \int_a^x f$, then for any $c \in (a, b)$, $F'_+(c) = f(c+)$ and $F'_-(c) = f(c-)$ (Exercise 15). Thus if $f$ is not continuous at $c$, the derivative of $F$ does not exist at $c$.

The Natural Logarithm Function


As our first application of the fundamental theorem of calculus, we use the
result to define the natural logarithm function ln x.

EXAMPLE 6.3.5 For $x > 0$, let $L(x)$ be defined by
$$L(x) = \int_1^x \frac{1}{t}\,dt.$$

Since $f(t) = 1/t$ is continuous on $(0, \infty)$, by Theorem 6.3.4 $L(x)$ satisfies $L'(x) = 1/x$ for all $x > 0$. Furthermore, since $L'(x) > 0$ for all $x \in (0, \infty)$, $L$ is strictly increasing on $(0, \infty)$.
We now prove that the function $L(x)$ satisfies the usual properties of a logarithm function; namely
(a) $L(ab) = L(a) + L(b)$ for all $a, b > 0$,
(b) $L(1/b) = -L(b)$, $b > 0$, and
(c) $L(b^r) = r\,L(b)$, $b > 0$, $r \in \mathbb{R}$.
To prove (a), consider the function $L(ax)$, $x > 0$. By the chain rule (Theorem 5.1.6),
$$\frac{d}{dx} L(ax) = \frac{1}{ax}\,a = \frac{1}{x} = L'(x).$$
Thus by Theorem 5.2.9, $L(ax) = L(x) + C$ for some constant $C$. From the definition of $L$ we have $L(1) = 0$. Therefore,
$$L(a) = L(1) + C = C.$$

Hence $L(ax) = L(a) + L(x)$ for all $x > 0$, which proves (a). The proof of (b) proceeds analogously. It is worth noting that for the proof of (a) and (b) we only used the fact that $L'(x) = 1/x$ and $L(1) = 0$.
To prove (c), if $n \in \mathbb{N}$, then by (a) $L(b^n) = n\,L(b)$. Also by (b),
$$L(b^{-n}) = n\,L\!\left(\frac{1}{b}\right) = -n\,L(b).$$

Therefore, $L(b^n) = n\,L(b)$ for all $n \in \mathbb{Z}$. Consider $L(\sqrt[n]{b})$ where $n \in \mathbb{N}$. Since $n\,L(\sqrt[n]{b}) = L(b)$, $L(\sqrt[n]{b}) = \frac{1}{n}\,L(b)$. Therefore
$$L(b^r) = r\,L(b)$$
for all $r \in \mathbb{Q}$. Since $L$ is continuous the above holds for all $r \in \mathbb{R}$.
Our final step will be to prove that $L(e) = 1$, where $e$ is Euler’s number of Example 3.3.5. To accomplish this we use the definition to compute the derivative of $L$ at 1. Since $L'(1)$ exists,
$$1 = L'(1) = \lim_{n \to \infty} \frac{L(1 + \frac{1}{n}) - L(1)}{\frac{1}{n}} = \lim_{n \to \infty} n\,L\!\left(1 + \tfrac{1}{n}\right) = \lim_{n \to \infty} L\!\left(\left(1 + \tfrac{1}{n}\right)^n\right) = L(e).$$

The last equality follows by the continuity of $L$ and the definition of $e$. Therefore $L(e) = 1$ and the function $L(x)$ is the logarithm function to the base $e$. This function is usually denoted by $\log_e x$ or $\ln x$, and is called the natural logarithm function. □
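The properties derived in Example 6.3.5 can be observed numerically by approximating the defining integral directly. The sketch below is illustrative only: it assumes a midpoint Riemann sum with a fixed number of subintervals, and the function name `L` simply mirrors the notation of the example.

```python
import math

def L(x, n=20000):
    """Midpoint Riemann sum for the integral of 1/t from 1 to x, for x > 0.
    For 0 < x < 1 the step h is negative, so the result is negative."""
    h = (x - 1.0) / n
    return sum(1.0 / (1.0 + (k + 0.5) * h) for k in range(n)) * h

# (a) L(ab) = L(a) + L(b), (b) L(1/b) = -L(b), and L(e) = 1, up to quadrature error.
assert abs(L(6.0) - (L(2.0) + L(3.0))) < 1e-6
assert abs(L(0.5) + L(2.0)) < 1e-6
assert abs(L(math.e) - 1.0) < 1e-6
print(L(2.0), math.log(2.0))
```

The comparison with `math.log` is only a sanity check; the point of the example is that the three properties follow from $L'(x) = 1/x$ and $L(1) = 0$ alone.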

Consequences of the Fundamental Theorem of Calculus


We now prove several other consequences of Theorem 6.3.4. Our first result is
the mean value theorem for integrals.

THEOREM 6.3.6 (Mean Value Theorem for Integrals) Let f be a continuous real-valued function on [a, b]. Then there exists c ∈ [a, b] such that
\[ \int_a^b f = f(c)\,(b - a). \]
Proof. Let $F(x) = \int_a^x f$. Since f is continuous on [a, b], F′(x) = f(x) for all x ∈ [a, b]. Thus by the mean value theorem (Theorem 5.2.6), there exists c ∈ [a, b] such that
\[ \int_a^b f = F(b) - F(a) = F'(c)(b - a) = f(c)(b - a). \]
□

An alternate proof of the above can also be based on the intermediate value theorem using the continuity of f. This alternate method will be used in the proof of the analogous result for the Riemann-Stieltjes integral.

THEOREM 6.3.7 (Integration by Parts Formula) Let f, g be differentiable functions on [a, b] with f′, g′ ∈ R[a, b]. Then
\[ \int_a^b f g' = f(b)g(b) - f(a)g(a) - \int_a^b g f'. \]

Proof. Since f, g are differentiable on [a, b], they are continuous and thus also integrable on [a, b]. Therefore by Theorem 6.2.1(c), f g′ and g f′ are integrable on [a, b]. Since
\[ (fg)' = g f' + f g', \]
the function (f g)′ ∈ R[a, b]. By the fundamental theorem of calculus (Theorem 6.3.2),
\[ f(b)g(b) - f(a)g(a) = \int_a^b (fg)' = \int_a^b g f' + \int_a^b f g', \]
from which the result follows. □

THEOREM 6.3.8 (Change of Variable Theorem) Let ϕ be differentiable on [a, b] with ϕ′ ∈ R[a, b]. If f is continuous on I = ϕ([a, b]), then
\[ \int_a^b f(\varphi(t))\,\varphi'(t)\,dt = \int_{\varphi(a)}^{\varphi(b)} f(x)\,dx. \]

Proof. Since ϕ is continuous, I = ϕ([a, b]) is a closed and bounded interval. Also, since f ◦ ϕ is continuous and ϕ′ ∈ R[a, b], by Theorem 6.2.1(c), (f ◦ ϕ)ϕ′ ∈ R[a, b]. If I = ϕ([a, b]) is a single point, then ϕ is constant on [a, b]. In this case, ϕ′(t) = 0 for all t and both integrals above are zero. Otherwise, for x ∈ I define
\[ F(x) = \int_{\varphi(a)}^{x} f(s)\,ds. \]
Since f is continuous, F′(x) = f(x) for all x ∈ I. By the chain rule,
\[ \frac{d}{dt} F(\varphi(t)) = F'(\varphi(t))\,\varphi'(t) = f(\varphi(t))\,\varphi'(t) \]
for all t ∈ [a, b]. Therefore by Theorem 6.3.2,
\[ \int_a^b f(\varphi(t))\,\varphi'(t)\,dt = F(\varphi(b)) - F(\varphi(a)) = \int_{\varphi(a)}^{\varphi(b)} f(s)\,ds. \]
□

Remark. Another version of the change of variable theorem is given in Exercise 11.

EXAMPLES 6.3.9 (a) To illustrate the change of variable theorem, consider $\int_0^2 t/(1 + t^2)\,dt$. If we let $\varphi(t) = 1 + t^2$ and $f(x) = 1/x$, then
\[ \int_0^2 \frac{t}{1 + t^2}\,dt = \frac{1}{2} \int_0^2 f(\varphi(t))\,\varphi'(t)\,dt, \]
which by Theorem 6.3.8
\[ = \frac{1}{2} \int_1^5 \frac{1}{x}\,dx = \frac{1}{2} \ln 5. \]

(b) For integrals involving $\sqrt{a^2 - x^2}$, $\sqrt{x^2 - a^2}$, and $\sqrt{x^2 + a^2}$, an appropriate trigonometric substitution is useful in evaluating the integral. We illustrate this with the following. Find $\int_0^a \sqrt{a^2 - x^2}\,dx$. We make the substitution
\[ x = a \sin t, \qquad 0 \le t \le \frac{\pi}{2}. \]
Then $dx = a \cos t\,dt$ and $\sqrt{a^2 - x^2} = a \cos t$. Thus
\[ \int_0^a \sqrt{a^2 - x^2}\,dx = \int_0^{\pi/2} a^2 \cos^2 t\,dt, \]
which by the identity $\cos^2 t = \frac{1}{2}(1 + \cos 2t)$
\[ = \frac{a^2}{2} \int_0^{\pi/2} (1 + \cos 2t)\,dt = \frac{a^2}{2} \left[ t + \frac{1}{2} \sin 2t \right]_0^{\pi/2} = \frac{a^2 \pi}{4}. \]
□
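
Both substitutions in Examples 6.3.9 can be sanity-checked numerically. The sketch below assumes a composite midpoint rule (the helper `integrate` and the choice a = 3 are illustrative, not from the text) and compares each original integral with its transformed version and with the closed form.

```python
import math

def integrate(f, a, b, n=200_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Example (a): integral of t/(1+t^2) over [0, 2] versus (1/2) ln 5.
lhs_a = integrate(lambda t: t / (1.0 + t * t), 0.0, 2.0)
rhs_a = 0.5 * math.log(5.0)

# Example (b): integral of sqrt(a^2 - x^2) over [0, a], the substituted
# integral of a^2 cos^2 t over [0, pi/2], and the closed form pi a^2 / 4.
a = 3.0
lhs_b = integrate(lambda x: math.sqrt(a * a - x * x), 0.0, a)
mid_b = integrate(lambda t: (a * math.cos(t)) ** 2, 0.0, math.pi / 2)
rhs_b = math.pi * a * a / 4.0

print(lhs_a, rhs_a)
print(lhs_b, mid_b, rhs_b)
```

The midpoint rule never evaluates the integrand at the endpoint x = a, so the square root in (b) is always taken of a positive number.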

Exercises 6.3
1. Sketch both the graph of f (x) and the graph of F (x) of Example 6.3.3
(c).
2. *Let f ∈ R[a, b]. For x ∈ [a, b], set $F(x) = \int_a^x f$. Prove that F is continuous on [a, b].
3. For x ∈ [0, 1], find $F(x) = \int_0^x f(t)\,dt$ for each of the following functions f defined on [0, 1]. In each case verify that F is continuous on [0, 1], and that F′(x) = f(x) at all points where f is continuous.
a. $f(x) = x^3 - 3x + 5$
*b. $f(x) = \begin{cases} 1, & 0 \le x < \frac{1}{2}, \\ -2, & \frac{1}{2} \le x < 1 \end{cases}$
c. $f(x) = x - [3x]$
*d. $f(x) = x[3x]$.
4. Let f(t) be defined by
\[ f(t) = \begin{cases} t, & 0 \le t < 1, \\ b - t^2, & 1 \le t \le 2, \end{cases} \]
and let F(x) be defined by $F(x) = \int_0^x f(t)\,dt$, 0 ≤ x ≤ 2.
a. Find F(x).
b. For what value of b in the definition of f is F(x) differentiable for all x ∈ [0, 2]?
5. Let f be a continuous real-valued function on [a, b] and define H on [a, b] by $H(x) = \int_x^b f$. Find H′(x).

6. Find F′(x) where F is defined on [0, 1] as follows:
a. $F(x) = \int_0^x \frac{1}{1 + t^2}\,dt$
*b. $F(x) = \int_0^x \cos t^2\,dt$
c. $F(x) = \int_x^1 \sqrt{1 + t^3}\,dt$
*d. $F(x) = \int_0^{x^2} f(t)\,dt$, where f is continuous on [0, 1]
7. *Let L(x) be defined as in Example 6.3.5. Prove that L(1/x) = −L(x).


8. Let f : R → R be continuous. For a > 0 define g on R by
\[ g(x) = \int_{x-a}^{x+a} f(t)\,dt. \]
Show that g is differentiable and find g′(x).

9. Suppose f : [a, b] → R is continuous and g, h : [c, d] → [a, b] are differentiable. For x ∈ [c, d] define
\[ H(x) = \int_{h(x)}^{g(x)} f(t)\,dt. \]
Find H′(x).
10. *Let f be a continuous real-valued function on [a, b], g ∈ R[a, b] with g(x) ≥ 0 for all x ∈ [a, b]. Prove that there exists c ∈ [a, b] such that
\[ \int_a^b f(x) g(x)\,dx = f(c) \int_a^b g(x)\,dx. \]

11. Prove the following change of variables formula: Let ϕ be a real-valued differentiable function on [a, b] with ϕ′(x) ≠ 0 for all x ∈ [a, b]. Let ψ be the inverse function of ϕ on I = ϕ([a, b]). If f : I → R is continuous on I, then
\[ \int_a^b f(\varphi(x))\,dx = \int_{\varphi(a)}^{\varphi(b)} f(t)\,\psi'(t)\,dt. \]

12. Evaluate each of the following integrals. Justify each step.
*a. $\int_1^2 \frac{\ln x}{x}\,dx$
*b. $\int_1^4 \frac{\sqrt{1 + \sqrt{x}}}{\sqrt{x}}\,dx$
c. $\int_0^1 x \ln x\,dx$
d. $\int_0^x t \sqrt{at + b}\,dt$, a, b > 0
*e. $\int_0^1 \frac{\sqrt{x}}{1 + \sqrt{x}}\,dx$
f. $\int_1^4 \frac{1}{x \sqrt{x + 1}}\,dx$
13. Use the suggested trigonometric substitution to evaluate each of the following integrals.
a. $\int_{-a}^{a} \frac{x^2}{\sqrt{a^2 - x^2}}\,dx$, x = a sin t, $-\frac{1}{2}\pi \le t \le \frac{1}{2}\pi$
*b. $\int_0^a \frac{1}{\sqrt{x^2 + a^2}}\,dx$, x = a tan t, $0 \le t \le \frac{\pi}{4}$
c. $\int_a^{2a} \sqrt{x^2 - a^2}\,dx$, x = a sec t, $0 \le t \le \frac{1}{3}\pi$
14. *Suppose f : [a, b] → R is continuous. Let M = max{|f(x)| : x ∈ [a, b]}. Show that
\[ \lim_{n\to\infty} \left( \int_a^b |f(x)|^n\,dx \right)^{1/n} = M. \]

15. *Let f be a monotone increasing function on [a, b] and let $F(x) = \int_a^x f(t)\,dt$. Prove that $F'_+(c) = f(c+)$ and $F'_-(c) = f(c-)$ for every c ∈ (a, b).
16. Use Theorem 4.4.10 and the previous exercise to construct a continuous,
increasing function F on [0, 1] that is differentiable at every irrational
number in [0, 1], and not differentiable at any rational number in [0, 1].
17. (Cauchy-Schwarz Inequality for Integrals) Let f, g ∈ R[a, b]. Prove that
\[ \left( \int_a^b f(x) g(x)\,dx \right)^2 \le \left( \int_a^b f^2(x)\,dx \right) \left( \int_a^b g^2(x)\,dx \right). \]
(Hint: For α, β ∈ R consider $\int_a^b (\alpha f - \beta g)^2$.)
18. (The Exponential Function) As in Example 6.3.5, let L : (0, ∞) → R be defined by
\[ L(x) = \int_1^x \frac{1}{t}\,dt. \]
a. Show that L is strictly increasing on (0, ∞) with Range L = R.
b. Let E : R → (0, ∞) denote the inverse function of L. Use Theorem 5.2.14 to prove that E′(x) = E(x) for all x ∈ R. (The function E is called the natural exponential function, and is often denoted by exp(x).)
c. Prove that E(x + y) = E(x)E(y) for all x, y ∈ R.
d. Prove that $E(x) = e^x$ for all x ∈ R, where e is Euler’s number, and $e^x$ is as defined in Miscellaneous Exercise 3 of Chapter 1.

6.4 Improper Riemann Integrals


In the definition of the Riemann integral of a real-valued function f , we re-
quired that f be a bounded function defined on a closed and bounded interval
[a, b]. If these two hypotheses are not satisfied, then the preceding theory does
not apply and we have to make some modifications in the definition. In this
section, we will briefly consider the changes that are required if the function
f is unbounded at some point of its domain, or if the interval on which f is
defined is itself unbounded.

Unbounded Functions on Finite Intervals


If f ∈ R[a, b] with |f(x)| ≤ M for all x ∈ [a, b], then by Theorem 6.2.3 f ∈ R[c, b] for every c ∈ (a, b), and
\[ \left| \int_a^b f - \int_c^b f \right| = \left| \int_a^c f \right| \le \int_a^c |f| \le M (c - a). \]
As a consequence, if f ∈ R[a, b], then f ∈ R[c, b] for every c ∈ (a, b) and
\[ \lim_{c \to a^+} \int_c^b f = \int_a^b f. \]

Suppose now that f is a bounded real-valued function defined only on (a, b] with f ∈ R[c, b] for every c ∈ (a, b). If we define f(a) = 0, then f is a bounded function on [a, b] satisfying f ∈ R[c, b] for every c ∈ (a, b). By Exercise 5, Section 6.2, f ∈ R[a, b] with
\[ \lim_{c \to a^+} \int_c^b f = \int_a^b f. \]
Furthermore, by Exercise 3 of Section 6.2 the answer is independent of how we define f(a).
Using the above as motivation, we extend the definition of the integral to include the case where f becomes unbounded at an endpoint. This extension of the integral is also due to Cauchy.

DEFINITION 6.4.1 Let f be a real-valued function on (a, b] such that f ∈ R[c, b] for every c ∈ (a, b). The improper Riemann integral of f on (a, b], denoted $\int_a^b f$, is defined to be
\[ \int_a^b f = \lim_{c \to a^+} \int_c^b f, \]
provided the limit exists. If the limit exists, then the improper integral is said to be convergent. Otherwise, the improper integral is said to be divergent.

A similar definition can also be given if f is defined on [a, b) and becomes unbounded at b. If f becomes unbounded at a point p, a < p < b, we consider the improper integrals of f on the intervals [a, p) and (p, b]. If each of these improper integrals exists, then we define the improper integral of f on [a, b] to be the sum
\[ \int_a^p f + \int_p^b f. \]

EXAMPLES 6.4.2 (a) Consider the function f(x) = 1/x on (0, 1]. This function is clearly unbounded at 0. Since f is continuous on (0, 1], f ∈ R[c, 1] for every c ∈ (0, 1). By Example 6.3.5
\[ \int_c^1 \frac{1}{x}\,dx = -\ln c. \]
To evaluate $\lim_{c \to 0^+} \ln c$, we consider $\ln r^n$, where 0 < r < 1 and n ∈ N. Since $\ln r^n = n \ln r$ and ln r < 0,
\[ \lim_{n\to\infty} \ln r^n = \lim_{n\to\infty} n \ln r = -\infty. \]
Therefore, since $r^n \to 0$ as n → ∞ and ln x is monotone increasing, $\lim_{c \to 0^+} \ln c = -\infty$, and hence $-\ln c \to \infty$ as $c \to 0^+$. Thus the improper integral of 1/x on (0, 1] diverges.
(b) For our second example we consider the improper integral of L(x) = ln x on (0, 1]. Since L(x) is continuous on (0, 1], L ∈ R[c, 1] for every c ∈ (0, 1). Consider $\int_c^1 \ln x\,dx$. If we take g(x) = x, then by the integration by parts formula,
\begin{align*}
\int_c^1 \ln x\,dx &= \int_c^1 L(x) g'(x)\,dx \\
&= L(1) g(1) - L(c) g(c) - \int_c^1 L'(x) g(x)\,dx \\
&= -c \ln c - 1 + c.
\end{align*}
By the substitution c = 1/t, t > 0, and l’Hospital’s rule,
\[ \lim_{c \to 0^+} c \ln c = \lim_{t\to\infty} \frac{-\ln t}{t} = -\lim_{t\to\infty} \frac{1}{t} = 0. \]
Therefore,
\[ \lim_{c \to 0^+} \int_c^1 \ln x\,dx = -1. \]
Hence the improper integral of ln x converges on (0, 1] with
\[ \int_0^1 \ln x\,dx = -1. \]

(c) Consider the function f defined on [−1, 1] by
\[ f(x) = \begin{cases} 0, & -1 \le x \le 0, \\ \dfrac{1}{x}, & 0 < x \le 1. \end{cases} \]
Here f is unbounded at 0. For this example, the improper integral of f over [−1, 1] fails to exist because the improper integral of f over (0, 1] does not exist.
(d) There are some significant differences between the Riemann integral and the improper Riemann integral. For example, if f ∈ R[a, b], then by Theorem 6.2.1 $f^2$ ∈ R[a, b]. This however is false for the improper Riemann integral! For example, if $f(x) = 1/\sqrt{x}$, x ∈ (0, 1], then
\[ \int_0^1 \frac{1}{\sqrt{x}}\,dx = \lim_{c \to 0^+} \int_c^1 \frac{1}{\sqrt{x}}\,dx = \lim_{c \to 0^+} (2 - 2\sqrt{c}) = 2. \]
Thus the improper integral of f converges on (0, 1]. However, $f^2(x) = 1/x$, and the improper integral of 1/x on (0, 1] diverges. Also, if f ∈ R[a, b], then |f| ∈ R[a, b]. This again is false for improper Riemann integrals. A function f for which the improper integral of f converges, but the improper integral of |f| diverges is given in Exercise 4. □
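
The limits in Examples 6.4.2 can be tabulated directly. The sketch below is only an illustration: it evaluates the integral over [c, 1] of 1/x, ln x, and 1/√x from the antiderivatives ln x, x ln x − x, and 2√x for c = 10⁻¹, ..., 10⁻⁸ (the function name `tail_integrals` and the sample values of c are assumptions made for the example).

```python
import math

def tail_integrals(c):
    """Integrals over [c, 1] of 1/x, ln x and 1/sqrt(x), via antiderivatives."""
    one_over_x = math.log(1.0) - math.log(c)                    # [ln x] from c to 1
    log_x = (1.0 * math.log(1.0) - 1.0) - (c * math.log(c) - c)  # [x ln x - x]
    inv_sqrt = 2.0 * math.sqrt(1.0) - 2.0 * math.sqrt(c)        # [2 sqrt(x)]
    return one_over_x, log_x, inv_sqrt

rows = [(10.0 ** -k, *tail_integrals(10.0 ** -k)) for k in range(1, 9)]
for c, i1, i2, i3 in rows:
    print(f"c={c:.0e}  1/x: {i1:8.3f}   ln x: {i2:9.6f}   1/sqrt(x): {i3:9.6f}")
```

As c decreases, the first column grows without bound while the second and third columns settle near −1 and 2 respectively, matching parts (a), (b), and (d).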

Infinite Intervals
We now turn our attention to functions defined on infinite intervals.

DEFINITION 6.4.3 Let f be a real-valued function on [a, ∞) that is Riemann integrable on [a, c] for every c > a. The improper Riemann integral of f on [a, ∞), denoted $\int_a^\infty f$, is defined to be
\[ \int_a^\infty f = \lim_{c\to\infty} \int_a^c f, \]
provided the limit exists. If the limit exists, then the improper integral is said to be convergent. Otherwise, the improper integral is said to be divergent.

If f is a real-valued function defined on (−∞, b] satisfying f ∈ R[c, b] for every c < b, then the improper integral of f on (−∞, b] is defined as
\[ \int_{-\infty}^b f = \lim_{c\to-\infty} \int_c^b f, \]
provided the limit exists. If f is defined on (−∞, ∞), then the improper integral of f is defined as
\[ \int_{-\infty}^p f + \int_p^\infty f, \]
for some fixed p ∈ R, provided that the improper integrals of f on (−∞, p] and [p, ∞) are both convergent. For a function f defined on (−∞, ∞), care must be exercised in computing the improper integral of f. It is incorrect to compute
\[ \lim_{c\to\infty} \int_{-c}^c f. \]
For example, if f(x) = x, then
\[ \lim_{c\to\infty} \int_{-c}^c x\,dx = \lim_{c\to\infty} \tfrac{1}{2} (c^2 - (-c)^2) = 0. \]
However, the improper integral of f on (−∞, ∞) is divergent since
\[ \int_0^\infty f = \lim_{c\to\infty} \int_0^c x\,dx = \lim_{c\to\infty} \tfrac{1}{2} c^2 = \infty. \]
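
The difference between the symmetric limit and the improper integral for f(x) = x is easy to see in numbers. A minimal sketch, using the exact antiderivative x²/2 (the function name `integral_of_x` and the sample cutoffs are illustrative):

```python
def integral_of_x(lo, hi):
    """Exact integral of f(x) = x over [lo, hi], from the antiderivative x^2/2."""
    return 0.5 * (hi * hi - lo * lo)

cutoffs = (1.0, 10.0, 100.0, 1000.0)

# Symmetric truncation [-c, c]: the positive and negative parts cancel.
symmetric = [integral_of_x(-c, c) for c in cutoffs]

# One-sided truncation [0, c]: grows like c^2 / 2, so there is no finite limit.
one_sided = [integral_of_x(0.0, c) for c in cutoffs]

print(symmetric)
print(one_sided)
```

The symmetric values are identically zero, while the one-sided values increase without bound, which is why the definition requires both one-sided improper integrals to converge.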

Remark. If f is nonnegative on [a, ∞) with f ∈ R[a, c] for every c > a, then $\int_a^c f$ is a monotone increasing function of c on [a, ∞). Thus $\lim_{c\to\infty} \int_a^c f$ either exists as a real number or diverges to ∞. For this reason, if f(x) ≥ 0 for all x ∈ [a, ∞), we use the notation
\[ \int_a^\infty f(x)\,dx < \infty \quad \text{or} \quad \int_a^\infty f(x)\,dx = \infty \]
to denote that the improper integral of f on [a, ∞) converges or diverges respectively.

EXAMPLES 6.4.4 (a) Let $f(x) = 1/x^2$, x ∈ [1, ∞). Since f is continuous on [1, ∞), f ∈ R[1, c] for every c > 1. Therefore,
\[ \int_1^\infty f = \lim_{c\to\infty} \int_1^c \frac{1}{x^2}\,dx = \lim_{c\to\infty} \left( -\frac{1}{c} + 1 \right) = 1. \]
Thus the improper integral converges to the value 1.
(b) In this example, we consider the function f(x) = (sin x)/x, x ∈ [π, ∞). Since f is continuous, f ∈ R[π, c] for every c > π. This function has the property that the improper integral of f on [π, ∞) converges, but the improper integral of |f| diverges. The proof of the convergence of the improper integral of f is left as an exercise (Exercise 7). Here we will show that
\[ \int_\pi^\infty |f| = \int_\pi^\infty \frac{|\sin x|}{x}\,dx = \infty. \]
For n ∈ N, consider
\[ \int_\pi^{(n+1)\pi} \frac{|\sin x|}{x}\,dx = \sum_{k=1}^{n} \int_{k\pi}^{(k+1)\pi} \frac{|\sin x|}{x}\,dx. \]
Since the integrand is nonnegative,
\[ \int_{k\pi}^{(k+1)\pi} \frac{|\sin x|}{x}\,dx \ge \int_{(k+\frac{1}{4})\pi}^{(k+\frac{3}{4})\pi} \frac{|\sin x|}{x}\,dx. \]
On the interval $[(k + \frac{1}{4})\pi, (k + \frac{3}{4})\pi]$, $|\sin x| \ge \sqrt{2}/2$. Also,
\[ \frac{1}{x} \ge \frac{1}{(k+1)\pi} \quad \text{for all } x \in [(k + \tfrac{1}{4})\pi, (k + \tfrac{3}{4})\pi]. \]
Therefore
\[ \int_{(k+\frac{1}{4})\pi}^{(k+\frac{3}{4})\pi} \frac{|\sin x|}{x}\,dx \ge \frac{\sqrt{2}}{2}\, \frac{1}{(k+1)\pi} \left[ (k + \tfrac{3}{4})\pi - (k + \tfrac{1}{4})\pi \right] = \frac{\sqrt{2}}{4}\, \frac{1}{k+1}, \]
and as a consequence
\[ \int_\pi^{(n+1)\pi} \frac{|\sin x|}{x}\,dx \ge \frac{\sqrt{2}}{4} \sum_{k=1}^{n} \frac{1}{k+1}. \]
By Example 3.7.4 the series $\sum_{k=1}^{\infty} 1/k$ diverges. Therefore,
\[ \int_\pi^\infty \frac{|\sin x|}{x}\,dx = \lim_{n\to\infty} \int_\pi^{(n+1)\pi} \frac{|\sin x|}{x}\,dx = \infty. \]
□
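
The lower bound in Example 6.4.4(b) can be compared with numerical values of the partial integrals. The sketch below assumes a composite midpoint rule applied period by period; the helper names and the sample values of n are illustrative choices rather than part of the text.

```python
import math

def integrate(f, a, b, n=200):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def abs_sinc_integral(n):
    """Integral of |sin x| / x over [pi, (n+1) pi], summed period by period."""
    return sum(
        integrate(lambda x: abs(math.sin(x)) / x, k * math.pi, (k + 1) * math.pi)
        for k in range(1, n + 1)
    )

def harmonic_lower_bound(n):
    """The bound (sqrt(2)/4) * sum_{k=1}^{n} 1/(k+1) from the example."""
    return math.sqrt(2.0) / 4.0 * sum(1.0 / (k + 1) for k in range(1, n + 1))

results = {n: (abs_sinc_integral(n), harmonic_lower_bound(n)) for n in (10, 100, 1000)}
for n, (integral, bound) in results.items():
    print(n, integral, bound)
```

Both columns keep growing as n increases, roughly like a multiple of ln n, with the integral staying above the harmonic lower bound.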

As the previous example shows, the convergence of the improper integral of f does not imply the convergence of the improper integral of |f|. If f is a real-valued function on [a, ∞) such that f ∈ R[a, c] for every c > a and the improper integral of |f| exists on [a, ∞), then f is said to be absolutely integrable on [a, ∞). An analogous definition can also be given for unbounded functions on a finite interval. We leave it as an exercise to prove that if f is absolutely integrable on [a, ∞), then the improper integral of f also converges on [a, ∞) (Exercise 5).
We conclude this section with the following useful comparison test for improper integrals.

THEOREM 6.4.5 (Comparison Test) Let g : [a, ∞) → R be a nonnegative function satisfying g ∈ R[a, c] for every c > a and $\int_a^\infty g(x)\,dx < \infty$. If f : [a, ∞) → R satisfies
(a) f ∈ R[a, c] for every c > a, and
(b) |f(x)| ≤ g(x) for all x ∈ [a, ∞),
then the improper integral of f on [a, ∞) converges, and
\[ \left| \int_a^\infty f(x)\,dx \right| \le \int_a^\infty g(x)\,dx. \]

Proof. The proof is left to the exercises (Exercise 6). □

Exercises 6.4
1. For each of the following functions f defined on (0, 1], determine whether the improper integral of f exists. If it exists, find $\int_0^1 f$.
*a. $f(x) = \frac{1}{x^p}$, 0 < p < 1
b. $f(x) = \frac{x}{\sqrt{1 - x}}$
c. $f(x) = \frac{\ln x}{x}$
*d. $f(x) = x \ln x$
e. $f(x) = \frac{1}{(1 + x) \ln(1 + x)}$
*f. $f(x) = \tan\left( \frac{\pi}{2} x \right)$
2. For each of the following determine whether the improper integral converges or diverges. If it converges, evaluate the integral.
*a. $\int_0^\infty e^{-x}\,dx$
b. $\int_1^\infty \frac{1}{x}\,dx$
*c. $\int_1^\infty x^{-p}\,dx$, p > 1
d. $\int_1^\infty \frac{\ln x}{x}\,dx$
*e. $\int_2^\infty \frac{1}{x \ln x}\,dx$
f. $\int_2^\infty \frac{1}{x (\ln x)^p}\,dx$, p > 1
*g. $\int_0^\infty \frac{x}{x^2 + 1}\,dx$
h. $\int_0^\infty \frac{x}{(x^2 + 1)^p}\,dx$, p > 1
3. For each of the following, determine the values of p and q for which the improper integral converges.
*a. $\int_0^{1/2} x^p |\ln x|^q\,dx$
b. $\int_2^\infty x^p (\ln x)^q\,dx$
c. $\int_0^\infty x^p [\ln(1 + x)]^q\,dx$
4. Let f be defined on (0, 1] by
\[ f(x) = \frac{d}{dx} \left( x^2 \sin \frac{1}{x^2} \right) = 2x \sin \frac{1}{x^2} - \frac{2}{x} \cos \frac{1}{x^2}. \]
Show that the improper Riemann integral of f converges on (0, 1], but that the improper integral of |f| diverges on (0, 1].
5. *If f is absolutely integrable on [a, ∞), and integrable on [a, c] for every
c > a, prove that the improper Riemann integral of f on [a, ∞) exists.
6. Prove Theorem 6.4.5.
7. Let $f(x) = \frac{\cos x}{x^2}$, x ∈ [π, ∞).
*a. Show that the improper integral of |f| converges on [π, ∞).
*b. Use integration by parts on [π, c], c > π, to show that $\int_\pi^\infty \frac{\sin x}{x}\,dx$ exists.
8. Show that $\int_0^\infty x^{-p} \sin x\,dx$ converges for all p, 0 < p < 2.
9. For x > 0, set
\[ \Gamma(x) = \int_0^\infty e^{-t} t^{x-1}\,dt. \]
The function Γ is called the Gamma function.
*a. Show that the improper integral converges for all x > 0.
b. Use integration by parts to show that Γ(x + 1) = xΓ(x), x > 0.
c. Show that Γ(1) = 1.
d. For n ∈ N, prove that Γ(n + 1) = n!.

6.5 The Riemann-Stieltjes Integral³


In this section, we consider the Riemann-Stieltjes integral, which as we will
see is an extension of the Riemann integral. To motivate the Riemann-Stieltjes
integral we consider the following example from physics involving the moment
of inertia.

EXAMPLE 6.5.1 Consider n masses, each of mass $m_i$, i = 1, ..., n, located along the x-axis at distances $r_i$ from the origin with $0 < r_1 < \cdots < r_n$ (Figure 6.7). The moment of inertia I about an axis through the origin at right angles to the system of masses is given by
\[ I = \sum_{i=1}^{n} r_i^2 m_i. \]
On the other hand, if we have a wire of length ℓ along the x-axis with one end at the origin, then the moment of inertia I is given by
\[ I = \int_0^\ell x^2 \rho(x)\,dx, \]
where for each x ∈ [0, ℓ], ρ(x) denotes the cross-sectional density at x. □

FIGURE 6.7
Example 6.5.1

Although these two problems are totally different, the first being discrete and the second continuous, the Riemann-Stieltjes integral will allow us to express both of these formulas as a single integral. In the definition of the Riemann integral we used the length $\Delta x_i$ of the ith interval to define the upper and lower Riemann sums of a bounded function f. The only difference between the Riemann and Riemann-Stieltjes integral is that we replace $\Delta x_i$ by
\[ \Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1}), \]
where α is a nondecreasing function on [a, b]. Taking α(x) = x will give the usual Riemann integral. Although the modification in the definition is only minor, the consequences are far reaching. Not only will we obtain a more extensive theory of integration, we also obtain an integral which has broad applications in the mathematical sciences.

³ Since the results of this section are not specifically required in subsequent chapters, this topic can be omitted.

Definition of the Riemann-Stieltjes Integral


Let α be a monotone increasing function on [a, b], and let f be a bounded real-valued function on [a, b]. For each partition $P = \{x_0, x_1, \dots, x_n\}$ of [a, b], set
\[ \Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1}), \qquad i = 1, \dots, n. \]
Since α is monotone increasing, $\Delta\alpha_i \ge 0$ for all i. As in Section 6.1, let
\[ m_i = \inf\{f(t) : t \in [x_{i-1}, x_i]\}, \qquad M_i = \sup\{f(t) : t \in [x_{i-1}, x_i]\}. \]
As for the Riemann integral, the upper Riemann-Stieltjes sum of f with respect to α and the partition P, denoted U(P, f, α), is defined by
\[ U(P, f, \alpha) = \sum_{i=1}^{n} M_i \Delta\alpha_i. \]
Similarly, the lower Riemann-Stieltjes sum of f with respect to α and the partition P, denoted L(P, f, α), is defined by
\[ L(P, f, \alpha) = \sum_{i=1}^{n} m_i \Delta\alpha_i. \]
Since $m_i \le M_i$ and $\Delta\alpha_i \ge 0$, we always have L(P, f, α) ≤ U(P, f, α). Furthermore, if m ≤ f(x) ≤ M for all x ∈ [a, b], then
\[ m [\alpha(b) - \alpha(a)] \le L(P, f, \alpha) \le U(P, f, \alpha) \le M [\alpha(b) - \alpha(a)] \tag{10} \]
for all partitions P of [a, b]. Let P be any partition of [a, b]. Since $M_i \le M$ for all i and $\Delta\alpha_i \ge 0$,
\[ \sum_{i=1}^{n} M_i \Delta\alpha_i \le \sum_{i=1}^{n} M \Delta\alpha_i = M \sum_{i=1}^{n} \Delta\alpha_i = M [\alpha(b) - \alpha(a)]. \]
Thus U(P, f, α) ≤ M[α(b) − α(a)]. The other inequality follows similarly. In the above we have used the fact that
\begin{align*}
\sum_{i=1}^{n} \Delta\alpha_i &= (\alpha(x_1) - \alpha(x_0)) + (\alpha(x_2) - \alpha(x_1)) + \cdots + (\alpha(x_n) - \alpha(x_{n-1})) \\
&= \alpha(x_n) - \alpha(x_0) = \alpha(b) - \alpha(a).
\end{align*}

In analogy with the Riemann integral, the upper and lower Riemann-Stieltjes integrals of f with respect to α over [a, b], denoted $\overline{\int_a^b} f\,d\alpha$ and $\underline{\int_a^b} f\,d\alpha$ respectively, are defined by
\begin{align*}
\overline{\int_a^b} f\,d\alpha &= \inf\{U(P, f, \alpha) : P \text{ is a partition of } [a, b]\}, \\
\underline{\int_a^b} f\,d\alpha &= \sup\{L(P, f, \alpha) : P \text{ is a partition of } [a, b]\}.
\end{align*}
By inequality (10) the set {U(P, f, α) : P is a partition of [a, b]} is bounded below, and thus the upper integral of f with respect to α exists as a real number. Similarly, the lower sums are bounded above and thus the supremum defining the lower integral is also finite. As for the Riemann integral, our first step is to prove that the lower integral is less than or equal to the upper integral.

THEOREM 6.5.2 Let f be a bounded real-valued function on [a, b], and α a monotone increasing function on [a, b]. Then
\[ \underline{\int_a^b} f\,d\alpha \le \overline{\int_a^b} f\,d\alpha. \]

Proof. As in the proof of Lemma 6.1.3, if $P^*$ is a refinement of the partition P, then
\[ L(P, f, \alpha) \le L(P^*, f, \alpha) \le U(P^*, f, \alpha) \le U(P, f, \alpha). \]
Thus if P, Q are any two partitions of [a, b],
\[ L(P, f, \alpha) \le L(P \cup Q, f, \alpha) \le U(P \cup Q, f, \alpha) \le U(Q, f, \alpha). \]
Therefore L(P, f, α) ≤ U(Q, f, α) for any partitions P, Q. Hence
\[ \underline{\int_a^b} f\,d\alpha = \sup_{P} L(P, f, \alpha) \le U(Q, f, \alpha) \]
for any partition Q. Taking the infimum over Q gives the result. □

DEFINITION 6.5.3 Let f be a bounded real-valued function on [a, b], and α a monotone increasing function on [a, b]. If
\[ \underline{\int_a^b} f\,d\alpha = \overline{\int_a^b} f\,d\alpha, \]
then f is said to be Riemann-Stieltjes integrable or integrable with respect to α on [a, b]. The common value is denoted by
\[ \int_a^b f\,d\alpha \quad \text{or} \quad \int_a^b f(x)\,d\alpha(x), \]
and is called the Riemann-Stieltjes integral of f with respect to α.

As was indicated previously, the special case α(x) = x gives the usual Riemann integral on [a, b].

EXAMPLES 6.5.4 (a) Fix a < c ≤ b. As in Definition 4.4.9, let $I_c(x) = I(x - c)$ be the unit jump function at c defined by
\[ I_c(x) = \begin{cases} 0, & x < c, \\ 1, & x \ge c. \end{cases} \]
We now prove the following: If f is a bounded real-valued function on [a, b] that is continuous at c, a < c ≤ b, then f is integrable with respect to $I_c$ and
\[ \int_a^b f\,dI_c = \int_a^b f(x)\,dI(x - c) = f(c). \]
For convenience, we set $\alpha(x) = I_c(x)$, which is clearly monotone increasing on [a, b]. Let $P = \{x_0, x_1, \dots, x_n\}$ be any partition of [a, b]. Since a < c ≤ b, there exists an index k, 1 ≤ k ≤ n, such that
\[ x_{k-1} < c \le x_k. \]
Then
\[ \Delta\alpha_k = \alpha(x_k) - \alpha(x_{k-1}) = 1 - 0 = 1, \quad \text{and} \quad \Delta\alpha_i = 0 \ \text{ for all } i \ne k. \]
Therefore
\begin{align*}
U(P, f, \alpha) &= M_k \Delta\alpha_k = M_k = \sup\{f(t) : x_{k-1} \le t \le x_k\}, \quad \text{and} \\
L(P, f, \alpha) &= m_k \Delta\alpha_k = m_k = \inf\{f(t) : x_{k-1} \le t \le x_k\}.
\end{align*}
Since f is continuous at c, given ε > 0 there exists a δ > 0 such that
\[ f(c) - \varepsilon < f(t) < f(c) + \varepsilon \]
for all t ∈ [a, b] with |t − c| < δ. If P is any partition of [a, b] with $x_j - x_{j-1} < \delta$ for all j, then
\[ f(c) - \varepsilon \le m_k \le M_k \le f(c) + \varepsilon. \]
Therefore f(c) − ε ≤ L(P, f, α) ≤ U(P, f, α) ≤ f(c) + ε. As a consequence
\[ f(c) - \varepsilon \le \underline{\int_a^b} f\,d\alpha \le \overline{\int_a^b} f\,d\alpha \le f(c) + \varepsilon. \]
Since ε > 0 was arbitrary, the upper and lower integrals of f are equal, and thus f is integrable with respect to α on [a, b] with
\[ \int_a^b f\,d\alpha = f(c). \]

(b) The function
\[ f(x) = \begin{cases} 1, & x \in \mathbb{Q}, \\ 0, & x \notin \mathbb{Q}, \end{cases} \]
is not integrable with respect to any non-constant monotone increasing function α. Suppose α is monotone increasing on [a, b], a < b, with α(a) ≠ α(b). If $P = \{x_0, x_1, \dots, x_n\}$ is any partition of [a, b], then $m_i = 0$ and $M_i = 1$ for all i = 1, ..., n. Therefore L(P, f, α) = 0 and
\[ U(P, f, \alpha) = \sum_{i=1}^{n} \Delta\alpha_i = \alpha(b) - \alpha(a). \]
Thus f is not integrable with respect to α. □

Remark: It should be noted that in Example 6.5.4(a) only left continuity of f at c is required (Exercise 2(a)).
The following theorem, which is the analogue of Theorem 6.1.7 for the
Riemann integral, provides necessary and sufficient conditions for the existence
of the Riemann-Stieltjes integral. The proof of the theorem follows verbatim
the proof of Theorem 6.1.7 and thus is omitted.

THEOREM 6.5.5 Let α be a monotone increasing function on [a, b]. A bounded real-valued function f is Riemann-Stieltjes integrable with respect to α on [a, b] if and only if for every ε > 0, there exists a partition P of [a, b] such that
\[ U(P, f, \alpha) - L(P, f, \alpha) < \varepsilon. \]
Furthermore, if P is a partition of [a, b] for which the above holds, then the inequality also holds for all refinements of P.

We now use the previous theorem to prove the following analogue of Theorem 6.1.8. Except for some minor differences, the two proofs are very similar.

THEOREM 6.5.6 Let f be a real-valued function on [a, b] and α a monotone increasing function on [a, b].
(a) If f is continuous on [a, b], then f is integrable with respect to α on [a, b].
(b) If f is monotone on [a, b] and α is continuous on [a, b], then f is integrable with respect to α on [a, b].

Proof. (a) The proof of (a) is identical to the proof of Theorem 6.1.8(a) except that given ε > 0, we choose η > 0 such that
\[ [\alpha(b) - \alpha(a)]\, \eta < \varepsilon. \]
The remainder of the proof now follows verbatim the proof of Theorem 6.1.8(a).
(b) For any positive integer n, choose a partition $P = \{x_0, x_1, \dots, x_n\}$ of [a, b] such that
\[ \Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1}) = \frac{1}{n} [\alpha(b) - \alpha(a)]. \]
Since α is continuous, such a choice is possible by the intermediate value theorem. Assume f is monotone increasing on [a, b]. Then $M_i = f(x_i)$ and $m_i = f(x_{i-1})$. Therefore,
\begin{align*}
U(P, f, \alpha) - L(P, f, \alpha) &= \sum_{i=1}^{n} [f(x_i) - f(x_{i-1})] \Delta\alpha_i \\
&= \frac{[\alpha(b) - \alpha(a)]}{n} \sum_{i=1}^{n} [f(x_i) - f(x_{i-1})] \\
&= \frac{[\alpha(b) - \alpha(a)]}{n} [f(b) - f(a)].
\end{align*}
Given ε > 0, choose n ∈ N such that
\[ \frac{[\alpha(b) - \alpha(a)]}{n} [f(b) - f(a)] < \varepsilon. \]
For this n and corresponding partition P, U(P, f, α) − L(P, f, α) < ε, which proves the result. □
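
The identity used in part (b) can be checked on a concrete pair of functions. The following is a minimal sketch, assuming the illustrative choices f(x) = x³ and α(x) = x² on [0, 1], for which the partition points $x_i = \sqrt{i/n}$ give equal increments Δα_i = 1/n; none of these choices come from the text.

```python
import math

def upper_minus_lower(f, alpha, partition):
    """U(P, f, alpha) - L(P, f, alpha) for a monotone increasing f,
    for which M_i = f(x_i) and m_i = f(x_{i-1}) on each subinterval."""
    return sum(
        (f(x1) - f(x0)) * (alpha(x1) - alpha(x0))
        for x0, x1 in zip(partition, partition[1:])
    )

def f(x):
    return x ** 3          # monotone increasing on [0, 1]

def alpha(x):
    return x * x           # continuous and monotone increasing on [0, 1]

def equal_alpha_partition(n):
    """Partition of [0, 1] with alpha(x_i) - alpha(x_{i-1}) = 1/n for every i."""
    return [math.sqrt(i / n) for i in range(n + 1)]

gaps = {n: upper_minus_lower(f, alpha, equal_alpha_partition(n)) for n in (10, 100, 1000)}
for n, gap in gaps.items():
    # The proof predicts [alpha(1) - alpha(0)] * [f(1) - f(0)] / n = 1/n.
    print(n, gap, 1.0 / n)
```

The computed gap matches 1/n up to rounding, so it can be made smaller than any ε by taking n large, as in the proof.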
Remark. In part (b) above, the result may be false if α is not continuous.
For example, if a < c ≤ b, then the monotone function Ic is not integrable
with respect to Ic on [a, b] (Exercise 2(b)).

Properties of the Riemann-Stieltjes Integral


We next consider several basic properties of the Riemann-Stieltjes integral.
For notational convenience, we make the following definition.

DEFINITION 6.5.7 For a given monotone increasing function α on [a, b], R(α) denotes the set of bounded real-valued functions f on [a, b] which are Riemann-Stieltjes integrable with respect to α.

THEOREM 6.5.8
(a) If f, g ∈ R(α), then f + g and cf are in R(α) for every c ∈ R, and
\[ \int_a^b (f + g)\,d\alpha = \int_a^b f\,d\alpha + \int_a^b g\,d\alpha \quad \text{and} \quad \int_a^b cf\,d\alpha = c \int_a^b f\,d\alpha. \]
(b) If $f \in R(\alpha_i)$, i = 1, 2, then $f \in R(\alpha_1 + \alpha_2)$ and
\[ \int_a^b f\,d(\alpha_1 + \alpha_2) = \int_a^b f\,d\alpha_1 + \int_a^b f\,d\alpha_2. \]
(c) If f ∈ R(α) and a < c < b, then f is integrable with respect to α on [a, c] and [c, b] with
\[ \int_a^b f\,d\alpha = \int_a^c f\,d\alpha + \int_c^b f\,d\alpha. \]
(d) If f, g ∈ R(α) with f(x) ≤ g(x) for all x ∈ [a, b], then
\[ \int_a^b f\,d\alpha \le \int_a^b g\,d\alpha. \]
(e) If |f(x)| ≤ M on [a, b] and f ∈ R(α), then |f| ∈ R(α) and
\[ \left| \int_a^b f\,d\alpha \right| \le \int_a^b |f|\,d\alpha \le M [\alpha(b) - \alpha(a)]. \]

Proof. We provide the proofs of (b) and (e). The proofs of (a), (c), and (d), along with other properties of the Riemann-Stieltjes integral are left to the exercises.
(b) Since $f \in R(\alpha_i)$, given ε > 0, there exists a partition $P_i$, i = 1, 2, such that
\[ U(P_i, f, \alpha_i) - L(P_i, f, \alpha_i) < \frac{\varepsilon}{2}. \tag{11} \]
Let $P = P_1 \cup P_2$. Since P is a refinement of both $P_1$ and $P_2$, inequality (11) is still valid for the partition P. Thus since
\[ \Delta(\alpha_1 + \alpha_2)_i = \Delta(\alpha_1)_i + \Delta(\alpha_2)_i \]
for all i = 1, ..., n,
\begin{align*}
U(P, f, \alpha_1 + \alpha_2) &- L(P, f, \alpha_1 + \alpha_2) \\
&= [U(P, f, \alpha_1) - L(P, f, \alpha_1)] + [U(P, f, \alpha_2) - L(P, f, \alpha_2)] < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{align*}
Therefore by Theorem 6.5.5, $f \in R(\alpha_1 + \alpha_2)$. Furthermore, for any partition P of [a, b],
\begin{align*}
L(P, f, \alpha_1 + \alpha_2) &= L(P, f, \alpha_1) + L(P, f, \alpha_2) \\
&\le \int_a^b f\,d\alpha_1 + \int_a^b f\,d\alpha_2 \\
&\le U(P, f, \alpha_1) + U(P, f, \alpha_2) = U(P, f, \alpha_1 + \alpha_2).
\end{align*}
Thus since $f \in R(\alpha_1 + \alpha_2)$,
\[ \int_a^b f\,d(\alpha_1 + \alpha_2) = \int_a^b f\,d\alpha_1 + \int_a^b f\,d\alpha_2. \]
(e) Suppose f ∈ R(α) and $P = \{x_0, x_1, \dots, x_n\}$ is a partition of [a, b]. For each i = 1, ..., n, let
\begin{align*}
M_i &= \sup\{f(t) : t \in [x_{i-1}, x_i]\}, & M_i^* &= \sup\{|f(t)| : t \in [x_{i-1}, x_i]\}, \\
m_i &= \inf\{f(t) : t \in [x_{i-1}, x_i]\}, & m_i^* &= \inf\{|f(t)| : t \in [x_{i-1}, x_i]\}.
\end{align*}
If $t, x \in [x_{i-1}, x_i]$, then
\[ \big|\, |f(t)| - |f(x)| \,\big| \le |f(t) - f(x)| \le M_i - m_i. \]
Thus $M_i^* - m_i^* \le M_i - m_i$ for all i = 1, ..., n, and as a consequence
\[ U(P, |f|, \alpha) - L(P, |f|, \alpha) \le U(P, f, \alpha) - L(P, f, \alpha). \]
Therefore, by Theorem 6.5.5, |f| ∈ R(α). Choose c = ±1 such that $c \int_a^b f\,d\alpha \ge 0$. Then
\begin{align*}
\left| \int_a^b f\,d\alpha \right| = c \int_a^b f\,d\alpha = \int_a^b c f\,d\alpha &\le \int_a^b |f|\,d\alpha \\
&\le M \int_a^b d\alpha = M [\alpha(b) - \alpha(a)].
\end{align*}
□

As for the Riemann integral, we also have the following mean value theorem and integration by parts formula for the Riemann-Stieltjes integral.

THEOREM 6.5.9 (Mean Value Theorem) Let f be a continuous real-valued function on [a, b], and α a monotone increasing function on [a, b]. Then there exists c ∈ [a, b] such that
\[ \int_a^b f\,d\alpha = f(c) [\alpha(b) - \alpha(a)]. \]

Proof. Let m and M denote the minimum and maximum of f on [a, b] respectively. Then by Theorem 6.5.8(d),
\[ m [\alpha(b) - \alpha(a)] \le \int_a^b f\,d\alpha \le M [\alpha(b) - \alpha(a)]. \]
If α(b) − α(a) = 0, then any c ∈ [a, b] will work. If α(b) − α(a) ≠ 0, then by the intermediate value theorem there exists c ∈ [a, b] such that
\[ f(c) = \frac{1}{\alpha(b) - \alpha(a)} \int_a^b f\,d\alpha, \]
which proves the result. □

THEOREM 6.5.10 (Integration by Parts Formula) Suppose α and β are monotone increasing functions on [a, b]. Then α ∈ R(β) if and only if β ∈ R(α). If this is the case,
\[ \int_a^b \alpha\,d\beta = \alpha(b)\beta(b) - \alpha(a)\beta(a) - \int_a^b \beta\,d\alpha. \]

Proof. By Exercise 9, for any partition P of [a, b],
\begin{align*}
U(P, \alpha, \beta) &= \alpha(b)\beta(b) - \alpha(a)\beta(a) - L(P, \beta, \alpha), \quad \text{and} \\
L(P, \alpha, \beta) &= \alpha(b)\beta(b) - \alpha(a)\beta(a) - U(P, \beta, \alpha).
\end{align*}
Therefore,
\[ U(P, \alpha, \beta) - L(P, \alpha, \beta) = U(P, \beta, \alpha) - L(P, \beta, \alpha). \]
From this identity it immediately follows by Theorem 6.5.5 that α ∈ R(β) if and only if β ∈ R(α). Furthermore, if β ∈ R(α), then given ε > 0, there exists a partition P of [a, b] such that
\[ L(P, \beta, \alpha) > \int_a^b \beta\,d\alpha - \varepsilon. \]
Hence,
\[ \int_a^b \alpha\,d\beta \le U(P, \alpha, \beta) < \alpha(b)\beta(b) - \alpha(a)\beta(a) - \int_a^b \beta\,d\alpha + \varepsilon. \]
Since the above holds for any ε > 0,
\[ \int_a^b \alpha\,d\beta \le \alpha(b)\beta(b) - \alpha(a)\beta(a) - \int_a^b \beta\,d\alpha. \]
A similar argument using the lower sum proves the reverse inequality. □
We conclude this section with two results that represent the extremes encountered in Riemann-Stieltjes integration. As in Example 6.5.4(a), let $I_c$ be the unit jump function at c ∈ R. Suppose $\{s_n\}_{n=1}^{N}$ is a finite subset of (a, b] and $\{c_n\}_{n=1}^{N}$ are non-negative real numbers. Define the monotone increasing function α on [a, b] by
\[ \alpha(x) = \sum_{n=1}^{N} c_n I(x - s_n). \]
If f is continuous on [a, b], then by Example 6.5.4(a) and Theorem 6.5.8(b),
\[ \int_a^b f\,d\alpha = \sum_{n=1}^{N} c_n \int_a^b f(x)\,dI(x - s_n) = \sum_{n=1}^{N} c_n f(s_n). \tag{12} \]
Suppose $\{s_n\}_{n=1}^{\infty}$ is a countable subset of (a, b] and $\{c_n\}_{n=1}^{\infty}$ is a sequence of nonnegative real numbers for which $\sum c_n$ converges. As in Theorem 4.4.10 define α on [a, b] by
\[ \alpha(x) = \sum_{n=1}^{\infty} c_n I(x - s_n). \tag{13} \]
Since $0 \le I(x - s_n) \le 1$ for all n, the series in (13) converges for every x ∈ [a, b], and α is a monotone increasing function on [a, b]. For such a function α we have the following theorem.

THEOREM 6.5.11 Let f be a continuous real-valued function on [a, b] and let α be the monotone function defined by (13). Then f ∈ R(α) and
\[ \int_a^b f\,d\alpha = \sum_{n=1}^{\infty} c_n f(s_n). \]

Proof. Since f is continuous on [a, b], f ∈ R(α) (Theorem 6.5.6(a)). Let ε > 0 be given. Choose a positive integer N such that
\[ \sum_{n=N+1}^{\infty} c_n < \varepsilon. \]
Define $\beta_1$ and $\beta_2$ as follows:
\[ \beta_1(x) = \sum_{n=1}^{N} c_n I(x - s_n), \qquad \beta_2(x) = \sum_{n=N+1}^{\infty} c_n I(x - s_n). \]
Then $\alpha = \beta_1 + \beta_2$, and by identity (12)
\[ \int_a^b f\,d\beta_1 = \sum_{n=1}^{N} c_n f(s_n). \]
Let M = max{|f(x)| : x ∈ [a, b]}. Then by Theorem 6.5.8(b) and (e),
\begin{align*}
\left| \int_a^b f\,d\alpha - \sum_{n=1}^{N} c_n f(s_n) \right| = \left| \int_a^b f\,d\beta_2 \right| &\le M [\beta_2(b) - \beta_2(a)] \\
&\le M \sum_{n=N+1}^{\infty} c_n < M \varepsilon,
\end{align*}
which proves the result. □


At the other extreme, if the monotone function α is also differentiable,
then we have the following result.

THEOREM 6.5.12 Suppose f ∈ R[a, b] and α is a monotone increasing differentiable function on [a, b] with α′ ∈ R[a, b]. Then f ∈ R(α) and
\[ \int_a^b f\,d\alpha = \int_a^b f(x)\,\alpha'(x)\,dx. \]

Proof. Since both f and α′ are Riemann integrable on [a, b], by Theorem
6.2.1(c), f α′ ∈ R[a, b]. Let ǫ > 0 be given. Since α′ ∈ R[a, b], by Theorem
6.1.7 there exists a partition P of [a, b] such that

U (P, α′ ) − L(P, α′ ) < ǫ. (14)

Let Q = {x0, ..., xn} be any refinement of P. As in Theorem 6.2.6, for each
i = 1, ..., n, we can choose $s_i \in [x_{i-1}, x_i]$ such that
\[ U(Q, f, \alpha) < \sum_{i=1}^{n} f(s_i)\,\Delta\alpha_i + \varepsilon. \tag{15} \]

By the mean value theorem, for each i = 1, ..., n, there exists $t_i \in [x_{i-1}, x_i]$
such that
\[ \Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1}) = \alpha'(t_i)\,\Delta x_i. \]
Therefore,
\[ \sum_{i=1}^{n} f(s_i)\,\Delta\alpha_i = \sum_{i=1}^{n} f(s_i)\,\alpha'(t_i)\,\Delta x_i. \tag{16} \]

Let M = sup{|f(x)| : x ∈ [a, b]}, and for i = 1, ..., n, let $m_i$ and $M_i$ denote the
infimum and supremum respectively of α′ over the interval $[x_{i-1}, x_i]$. Then
\[ |\alpha'(s_i) - \alpha'(t_i)| \le M_i - m_i \]



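for all i = 1, ..., n. Therefore,
\[
\begin{aligned}
\left| \sum_{i=1}^{n} f(s_i)\alpha'(t_i)\Delta x_i - \sum_{i=1}^{n} f(s_i)\alpha'(s_i)\Delta x_i \right|
&\le \sum_{i=1}^{n} |f(s_i)|\,|\alpha'(t_i) - \alpha'(s_i)|\,\Delta x_i \\
&\le M \sum_{i=1}^{n} (M_i - m_i)\,\Delta x_i \\
&= M\,(U(Q, \alpha') - L(Q, \alpha')) < M\varepsilon.
\end{aligned}
\]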

The last inequality follows by (14) since Q is a refinement of P. Therefore
\[
\sum_{i=1}^{n} f(s_i)\alpha'(t_i)\Delta x_i
\le \sum_{i=1}^{n} f(s_i)\alpha'(s_i)\Delta x_i + M\varepsilon
\le U(Q, f\alpha') + M\varepsilon.
\]

Thus by (15) and (16),
\[ U(Q, f, \alpha) < U(Q, f\alpha') + (M + 1)\varepsilon. \]

Since f α′ ∈ R[a, b], there exists a partition Q of [a, b], which is a refinement
of P, such that
\[ U(Q, f\alpha') < \int_a^b f\alpha' + \varepsilon. \]
Thus
\[ \overline{\int_a^b} f\,d\alpha < \int_a^b f\alpha' + (M + 2)\varepsilon. \]
Since this holds for any ε > 0,
\[ \overline{\int_a^b} f\,d\alpha \le \int_a^b f\alpha'. \]

A similar argument using lower sums proves the reverse inequality for the
lower integral. Thus f ∈ R(α) and
\[ \int_a^b f\,d\alpha = \int_a^b f\alpha'. \]
□
We now give several examples to illustrate the previous two theorems.

EXAMPLES 6.5.13 (a) For our first example we illustrate the finite version
of Theorem 6.5.11; namely, identity (12). Consider
\[ \int_0^2 e^x\,d[x]. \]
For x ∈ [0, 2],
\[ [x] = I(x - 1) + I(x - 2). \]

Therefore, since $e^x$ is continuous, it is integrable with respect to [x], and by
(12),
\[ \int_0^2 e^x\,d[x] = e^1 + e^2. \]

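(b) To illustrate Theorem 6.5.12, if α(x) = x² on [0, 1], then for any
Riemann integrable function f on [0, 1],
\[ \int_0^1 f(x)\,d(x^2) = 2\int_0^1 f(x)\,x\,dx. \]
In particular, if f(x) = sin πx, then
\[ \int_0^1 \sin \pi x\,d(x^2) = 2\int_0^1 x \sin \pi x\,dx, \]
which by the integration by parts formula
\[
\begin{aligned}
&= 2\left[ -\frac{1}{\pi}\,x\cos \pi x\,\Big|_0^1 + \frac{1}{\pi}\int_0^1 \cos \pi x\,dx \right] \\
&= 2\left[ -\frac{1}{\pi}\cos \pi + \frac{1}{\pi^2}\sin \pi x\,\Big|_0^1 \right] = \frac{2}{\pi}.
\end{aligned}
\]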

(c) As another illustration of Theorem 6.5.12,
\[
\begin{aligned}
\int_0^3 [x]\,d(e^{2x}) &= 2\int_0^3 [x]\,e^{2x}\,dx \\
&= 2\int_1^2 e^{2x}\,dx + 2\int_2^3 2e^{2x}\,dx = 2e^6 - e^4 - e^2.
\end{aligned}
\]
□
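The three values in Examples 6.5.13 can be checked numerically with Riemann-Stieltjes sums. The sketch below is only a sanity check under stated assumptions: the helper `rs_sum`, the uniform partitions, the midpoint tags, and the partition sizes are choices made for this illustration and do not come from the text.

```python
import math

def rs_sum(f, alpha, a, b, m):
    """Riemann-Stieltjes sum of f with respect to alpha: uniform partition, midpoint tags."""
    total = 0.0
    for i in range(1, m + 1):
        left = a + (b - a) * (i - 1) / m
        right = a + (b - a) * i / m
        total += f(0.5 * (left + right)) * (alpha(right) - alpha(left))
    return total

# (a) integrand e^x, integrator [x] on [0, 2]; expected value e + e^2
val_a = rs_sum(math.exp, math.floor, 0.0, 2.0, 2000)

# (b) integrand sin(pi x), integrator x^2 on [0, 1]; expected value 2/pi
val_b = rs_sum(lambda x: math.sin(math.pi * x), lambda x: x * x, 0.0, 1.0, 2000)

# (c) integrand [x], integrator e^{2x} on [0, 3]; expected value 2e^6 - e^4 - e^2
val_c = rs_sum(math.floor, lambda x: math.exp(2 * x), 0.0, 3.0, 3000)

print(val_a, math.e + math.e ** 2)
print(val_b, 2 / math.pi)
print(val_c, 2 * math.exp(6) - math.exp(4) - math.exp(2))
```

In (a) the sum picks up the two unit jumps of [x] at 1 and 2, so it agrees with e + e² only up to an error of the order of the mesh size; in (b) and (c) the integrator is smooth and the agreement is much closer.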

The Riemann-Stieltjes integral has important applications in a variety of
areas, including physics and probability theory. It allows us to express
diverse formulas as a single expression. To illustrate this, we again consider the
moment of inertia problem of Example 6.5.1.

EXAMPLE 6.5.14 In this example, we will show that both formulas of
Example 6.5.1 can be expressed as
\[ I = \int_0^{\ell} x^2\,dm(x), \]
where m(x) denotes the mass of the wire or system from 0 to x. Clearly the
function m(x) is nondecreasing, and thus since x² is continuous, the above
integral exists.

In the first case, since $0 < r_1 < r_2 < \cdots < r_n$, and the masses $m_i$ are
located at $r_i$ (see Figure 6.7), m(x) is given by
\[ m(x) = \sum_{i=1}^{n} m_i I(x - r_i). \]
Thus by Theorem 6.5.11,
\[ \int_0^{\ell} x^2\,dm(x) = \sum_{i=1}^{n} r_i^2 m_i. \]

On the other hand, if m(x) is differentiable and m′(x) is Riemann integrable,
then by Theorem 6.5.12,
\[ \int_0^{\ell} x^2\,dm(x) = \int_0^{\ell} x^2 m'(x)\,dx. \]
It only remains to be shown that m′(x) is the density. The density ρ(x) of a
wire (mass per unit length) is defined as the limit of the average density. In
the interval [x, x + ∆x], the average density is
\[ \frac{m(x + \Delta x) - m(x)}{\Delta x}. \]
Thus if m(x) is differentiable, ρ(x) = m′(x). □

Riemann-Stieltjes Sums
We conclude this section with a few remarks concerning Riemann-Stieltjes
sums. As in Definition 6.2.4, let f be a bounded real-valued function on [a, b],
α a monotone increasing function on [a, b], and P = {x0, ..., xn} a partition of
[a, b]. For each i = 1, 2, ..., n, choose $t_i \in [x_{i-1}, x_i]$. Then
\[ S(P, f, \alpha) = \sum_{i=1}^{n} f(t_i)\,\Delta\alpha_i \]
is called a Riemann-Stieltjes sum of f with respect to α and the partition P.
A natural question to ask is whether the analogue of Theorem 6.2.6 is valid
for the Riemann-Stieltjes integral. Unfortunately, only part of the theorem
holds. If $\lim_{\|P\| \to 0} S(P, f, \alpha) = I$, then f ∈ R(α) on [a, b] and $\int_a^b f\,d\alpha = I$
(Exercise 11). The converse, as the following example will illustrate, is false!
However, if f and α satisfy either of the hypotheses of Theorem 6.5.6, then
\[ \int_a^b f\,d\alpha = \lim_{\|P\| \to 0} S(P, f, \alpha). \]
The result where f is continuous and α is monotone increasing is given as an
exercise (Exercise 10).

EXAMPLE 6.5.15 Let f and α be defined on [0, 2] as follows:
\[
f(x) = \begin{cases} 0, & 0 \le x \le 1, \\ 1, & 1 < x \le 2, \end{cases}
\qquad
\alpha(x) = \begin{cases} 0, & 0 \le x < 1, \\ 1, & 1 \le x \le 2. \end{cases}
\]
Since f is left continuous at 1, by Example 6.5.4(a) $\int_0^2 f\,d\alpha = f(1) = 0$.
Suppose P = {x0, x1, ..., xn} is any partition of [0, 2] with 1 ∉ P. Let k be such
that $x_{k-1} < 1 < x_k$. Then $\Delta\alpha_k = 1$ and $\Delta\alpha_i = 0$ for all i ≠ k. If $t_k \in (1, x_k]$,
then S(P, f, α) = 1. On the other hand, if $t_k = 1$, then S(P, f, α) = 0. As a
consequence,
\[ \lim_{\|P\| \to 0} S(P, f, \alpha) \]
does not exist. □
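The dependence on the tag $t_k$ in Example 6.5.15 is easy to reproduce. The Python sketch below uses uniform partitions of [0, 2] with an odd number of subintervals, so that 1 is never a partition point; the function names and the particular partitions are assumptions made for this illustration.

```python
def f(x):
    return 0.0 if x <= 1 else 1.0

def alpha(x):
    return 0.0 if x < 1 else 1.0

def rs_sum(points, tags):
    """S(P, f, alpha) with tags[i] chosen in [points[i], points[i + 1]]."""
    return sum(f(t) * (alpha(points[i + 1]) - alpha(points[i]))
               for i, t in enumerate(tags))

def two_sums(n):
    """Uniform partition of [0, 2] with n odd, so that 1 is not a partition point."""
    points = [2.0 * i / n for i in range(n + 1)]
    k = next(i for i in range(1, n + 1) if points[i] > 1)   # x_{k-1} < 1 < x_k
    right_tags = points[1:]            # t_i = x_i, so t_k lies in (1, x_k]
    mixed_tags = list(right_tags)
    mixed_tags[k - 1] = 1.0            # identical tags except t_k = 1
    return rs_sum(points, right_tags), rs_sum(points, mixed_tags)

for n in (11, 101, 1001):
    print(n, two_sums(n))
```

For every odd n the two sums are 1 and 0, no matter how fine the partition is, which is exactly why the limit as ‖P‖ → 0 fails to exist.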

Exercises 6.5
1. *Evaluate $\int_{-1}^{1} f(x)\,d\alpha(x)$ where f is bounded on [−1, 1] and continuous
at 0, and α is given by
\[ \alpha(x) = \begin{cases} -1, & x < 0, \\ 0, & x = 0, \\ 1, & x > 0. \end{cases} \]

2. a. In Example 6.5.4(a), prove that if the function f is left continuous at
c, then f is integrable with respect to $I_c$ on [a, b].
b. Show that $I_c$ is not integrable with respect to $I_c$.
3. Let α be non-decreasing on [a, b]. Suppose f is bounded on [a, b] and
integrable with respect to α on [a, b]. For x ∈ [a, b] set $F(x) = \int_a^x f\,d\alpha$.
*a. Prove that |F(x) − F(y)| ≤ M|α(x) − α(y)| for some positive constant
M and all x, y ∈ [a, b].
b. Prove that if α is continuous at $x_o$ ∈ [a, b], then F is also continuous
at $x_o$.
4. a. Prove Theorem 6.5.8(a).
b. Prove Theorem 6.5.8(c).
c. Prove Theorem 6.5.8(d).
5. Use the theorems from the text to compute each of the following integrals:
*a. $\int_0^{\pi/2} x\,d(\sin x)$.   b. $\int_0^3 [x]\,d(x^2)$.   *c. $\int_0^3 x^2\,d[x]$.
d. $\int_1^3 ([x] + x)\,d(x^2 + e^x)$.   e. $\int_0^1 \sin \pi x\,d[4x]$.   *f. $\int_1^4 (x - [x])\,d(x^3)$.

6. Verify the integration by parts formula with (b) and (c) of the previous
exercise.
7. Find $\int_0^1 f\,d\alpha$, where f is continuous on [0, 1] and
\[ \alpha(x) = \sum_{n=1}^{\infty} \frac{1}{2^n}\,I\!\left(x - \frac{1}{n}\right). \]
Leave your answer in the form of an infinite series.
8. Let α be as in Exercise 7. Evaluate each of the following integrals. Leave
your answers in the form of an infinite series.
*a. $\int_0^1 x\,d\alpha(x)$   b. $\int_0^1 \alpha(x)\,dx$
9. Suppose α and β are monotone increasing on [a, b], and P is a partition
of [a, b]. Prove that
U (P, α, β) = α(b)β(b) − α(a)β(a) − L(P, β, α).
10. Prove that if f is a continuous real-valued function on [a, b] and α is
monotone increasing on [a, b], then $\lim_{\|P\| \to 0} S(P, f, \alpha)$ exists and
\[ \lim_{\|P\| \to 0} S(P, f, \alpha) = \int_a^b f\,d\alpha. \]
11. Let f be a bounded real-valued function on [a, b] and α a monotone
increasing function on [a, b]. Prove the following: If $\lim_{\|P\| \to 0} S(P, f, \alpha) = I$,
then f ∈ R(α) and $\int_a^b f\,d\alpha = I$.

12. *a. Let α be a monotone increasing function on [a, b]. If f ∈ R(α), prove
that f² ∈ R(α).
b. If f, g ∈ R(α), prove that f g ∈ R(α).
13. If f ∈ R(α) on [a, b] with Range f ⊂ [c, d], and ϕ is continuous on [c, d],
prove that ϕ ◦ f ∈ R(α) on [a, b].
14. Suppose f is a nonnegative continuous function on [a, b], and α is
nondecreasing on [a, b]. Define the function β on [a, b] by $\beta(x) = \int_a^x f\,d\alpha$. If g
is continuous on [a, b], prove that
\[ \int_a^b g\,d\beta = \int_a^b f g\,d\alpha. \]

6.6 Numerical Methods⁴

⁴ The topics of this section may be omitted on first reading.

In this section, we will take a brief look at some elementary numerical
methods that are useful in obtaining approximations to Riemann integrals. Even
though the fundamental theorem of calculus provides an easy method for
evaluating definite integrals, it is useful only if we can find an antiderivative of
the function being integrated. To illustrate this, in Example 6.3.9 we showed
that
\[ \int_0^2 \frac{t}{1 + t^2}\,dt = \tfrac{1}{2} \ln 5. \]
This however is not particularly useful if we do not know the value of ln 5. By
Example 6.3.5,
\[ \ln 5 = \int_1^5 \frac{1}{t}\,dt. \]
To obtain an approximation to ln 5, we can choose from several available
methods for obtaining numerical approximations to the definite integral.

Approximations Using Riemann Sums


Since upper and lower sums are in general difficult to evaluate, Darboux’s
definition of the Riemann integral is not particularly useful in obtaining numerical
approximations. The most elementary numerical method is to use Riemann
sums. One of the first mathematicians to use numerical methods was Euler,
who considered sums of the form
\[ \sum_{k=1}^{n} f(x_{k-1})(x_k - x_{k-1}) \]
as an approximation to the integral. This is nothing but the Riemann sum for
the partition P = {x0, x1, ..., xn} of [a, b] with $t_k = x_{k-1}$ for all k.
In using Riemann sums to approximate the integral of f it is convenient
to take equally spaced partitions. Let n ∈ N, and set h = (b − a)/n. Define
\[ x_0 = a,\quad x_1 = a + h,\quad x_2 = a + 2h,\quad \ldots,\quad x_n = a + nh = b. \tag{17} \]
Thus if $P_n = \{x_0, x_1, \ldots, x_n\}$ with $x_i$ as defined, we always have $\Delta x_i = h$ for
all i = 1, ..., n, and
\[ S(P_n, f) = \sum_{i=1}^{n} f(t_i)\,\Delta x_i = h \sum_{i=1}^{n} f(t_i), \]

where for each i = 1, ..., n, ti ∈ [xi−1 , xi ]. If we take ti = xi−1 for all i, then
we obtain the above formula of Euler. Similarly, we could take ti to be the
right endpoint xi of the interval [xi−1 , xi ]. Another choice of ti would be the
midpoint; i.e., ti = (xi−1 + xi )/2. For monotone functions, it is intuitively
obvious that the midpoint gives a better approximation than either the right
or left endpoint to the integral of f over the interval [xi−1 , xi ] (see Figure 6.8).
With $x_i$ as defined by (17) and $t_i = (x_{i-1} + x_i)/2$ the above formula becomes
\[ M_n(f) = h \sum_{i=1}^{n} f\!\left(a + (i - \tfrac{1}{2})h\right). \tag{18} \]

FIGURE 6.8
Midpoint approximation

The quantity $M_n(f)$ is called the nth midpoint approximation to the
integral of f over [a, b].
Regardless of the method used, it is important to be able to estimate the
error between the true value and the approximate value. If f ∈ R[a, b], then
for any partition P of [a, b], we always have
\[ \left| \int_a^b f(x)\,dx - S(P, f) \right| \le U(P, f) - L(P, f) \tag{19} \]

for any choice of $t_k \in [x_{k-1}, x_k]$. Inequality (19) follows from the fact that
both the Riemann sum of f and the integral of f lie between the lower and
upper sum of f. If f is monotone increasing on [a, b], then by the proof of
Theorem 6.1.8(b),
\[ U(P_n, f) - L(P_n, f) = h[f(b) - f(a)]. \]
Thus by inequality (19),
\[ \left| \int_a^b f(x)\,dx - h \sum_{i=1}^{n} f(t_i) \right| \le h\,|f(b) - f(a)| \tag{20} \]

for any choice of $t_i \in [x_{i-1}, x_i]$. Although we proved (20) for a monotone
increasing function, the inequality is also valid for a monotone decreasing
function. Thus for monotone functions, inequality (20) provides an estimate
on the error between the true value and the approximate value.
EXAMPLE 6.6.1 The function f(x) = 1/x is decreasing on [1, 5]. Thus by
inequality (20), for n ∈ N and h = (b − a)/n = 4/n,
\[ \left| \ln 5 - h \sum_{i=1}^{n} f(t_i) \right| \le \frac{4}{n} \left| \frac{1}{5} - 1 \right| = \frac{16}{5}\,\frac{1}{n} \]
for any choice of $t_i \in [x_{i-1}, x_i]$. Thus with n = 8, the error is less than
2/5 = 0.4; n = 160 only guarantees an error of less than 1/50 = 0.02. We
would be required to take very large values of n to be guaranteed a sufficiently
small error.
With n = 8, h = 1/2 and $x_i = 1 + i/2$, i = 0, 1, ..., 8, and $t_i = x_{i-1} =
(1 + i)/2$ (the left endpoint),
\[
\begin{aligned}
S(P_8, f) &= \sum_{i=1}^{8} \frac{1}{1 + i} \\
&= \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} + \frac{1}{9} \\
&= 1.8290 \quad \text{(to four decimal places)}.
\end{aligned}
\]
To four decimal places, ln 5 = 1.6094. Thus the error is 0.2196, which is less
than the predicted error of 0.4.
If we use the midpoint approximation (18) (again with n = 8), then $t_i =
1 + (i - \tfrac{1}{2})\tfrac{1}{2}$, which upon simplification equals (3 + 2i)/4. Thus
\[
\begin{aligned}
M_8(f) &= \frac{1}{2} \sum_{i=1}^{8} \frac{4}{3 + 2i} \\
&= 2\left[ \frac{1}{5} + \frac{1}{7} + \frac{1}{9} + \frac{1}{11} + \frac{1}{13} + \frac{1}{15} + \frac{1}{17} + \frac{1}{19} \right] \\
&= 1.5998 \quad \text{(to four decimal places)}.
\end{aligned}
\]
Using the midpoint approximation the error is less than 0.01. This is
considerably better than predicted by inequality (20). As we will shortly see, this
improved accuracy is not a coincidence. □
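The arithmetic of Example 6.6.1 can be reproduced with a few lines of Python. This is a minimal sketch; the function names `left_sum` and `midpoint_sum` are chosen for this illustration only.

```python
import math

def left_sum(f, a, b, n):
    """Riemann sum on n equal subintervals with t_i = x_{i-1} (Euler's sum)."""
    h = (b - a) / n
    return h * sum(f(a + (i - 1) * h) for i in range(1, n + 1))

def midpoint_sum(f, a, b, n):
    """Midpoint approximation M_n(f) of formula (18)."""
    h = (b - a) / n
    return h * sum(f(a + (i - 0.5) * h) for i in range(1, n + 1))

f = lambda x: 1.0 / x
left8 = left_sum(f, 1.0, 5.0, 8)
mid8 = midpoint_sum(f, 1.0, 5.0, 8)
bound8 = (4 / 8) * abs(f(5.0) - f(1.0))     # right side of (20) with n = 8

print(f"left endpoint: {left8:.4f}, error {abs(math.log(5) - left8):.4f}")
print(f"midpoint:      {mid8:.4f}, error {abs(math.log(5) - mid8):.4f}")
print(f"bound (20):    {bound8:.4f}")
```

The printed approximations are 1.8290 and 1.5998, matching the values in the example, and both errors fall below the bound 0.4 from inequality (20).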

Inequality (20) provides an error estimate for monotone functions. For arbi-
trary real-valued functions, under the additional hypothesis that the derivative
is bounded, we have the following theorem.

THEOREM 6.6.2 Let f be a real-valued differentiable function on [a, b] for
which f′(x) is bounded on [a, b]. Then for any partition P of [a, b],
\[ \left| \int_a^b f(x)\,dx - S(P, f) \right| \le \|P\|\,M\,(b - a), \]
where M = sup{|f′(x)| : x ∈ [a, b]}.

Proof. Suppose P = {x0, x1, ..., xn} is a partition of [a, b]. Then
\[ U(P, f) - L(P, f) = \sum_{i=1}^{n} (M_i - m_i)\,\Delta x_i. \]

Since f is continuous on [a, b], for each i there exist $s_i, s_i' \in [x_{i-1}, x_i]$ such
that
\[ M_i - m_i = f(s_i) - f(s_i'). \]
By the mean value theorem, $f(s_i) - f(s_i') = f'(t_i)(s_i - s_i')$ for some $t_i$ between
$s_i$ and $s_i'$. Therefore, since $|f'(t_i)|\,|s_i - s_i'| \le M\|P\|$,
\[ U(P, f) - L(P, f) \le M\|P\| \sum_{i=1}^{n} \Delta x_i = \|P\|\,M\,(b - a). \]
The result now follows by inequality (19). □


If f satisfies the hypothesis of the theorem, h = (b − a)/n, n ∈ N, and the
$x_i$ are defined by (17), then
\[ \left| \int_a^b f(x)\,dx - h \sum_{i=1}^{n} f(t_i) \right| \le M(b - a)\,h \]
for any choice of $t_i \in [x_{i-1}, x_i]$. A slight improvement of this inequality is
given in the exercises (Exercise 6). If $E_n(f)$ denotes the error between the
true value and the approximate value, that is,
\[ E_n(f) = \int_a^b f(x)\,dx - S(P_n, f), \]
then the above inequality can simply be written as $|E_n(f)| \le C h$, where C
is a fixed constant depending only on f and the interval [a, b]. Since the term
“h” occurs to the first power, this method is commonly referred to as a first
order method.
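The phrase "first order" can be observed numerically: halving h should roughly halve the error of a left-endpoint Riemann sum. The following Python sketch checks this for f(x) = 1/x on [1, 5], where M = sup|f′| = 1; the test function and the helper names are assumptions made for this illustration.

```python
import math

def left_sum(f, a, b, n):
    """Riemann sum on n equal subintervals with left-endpoint tags."""
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(n))

def left_error(n):
    """|E_n(f)| for f(x) = 1/x on [1, 5], whose integral is ln 5."""
    return abs(math.log(5.0) - left_sum(lambda x: 1.0 / x, 1.0, 5.0, n))

M = 1.0                                    # sup of |f'(x)| = 1/x^2 on [1, 5]
for n in (8, 16, 32, 64, 128):
    h = 4.0 / n
    # columns: n, actual error, bound M (b - a) h from Theorem 6.6.2
    print(n, left_error(n), M * 4.0 * h)
```

Each doubling of n cuts the error by a factor close to 2, and the error stays well below the bound M(b − a)h throughout.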
In Example 6.6.1 we saw that the midpoint rule provided much greater ac-
curacy than using the left end point. This is not a coincidence as the following
theorem proves.

THEOREM 6.6.3 If f is twice differentiable on [a, b] and f′′(x) is bounded
on [a, b], then
\[ \left| \int_a^b f(x)\,dx - M_n(f) \right| \le \frac{M(b - a)}{24}\,h^2, \]
where M = sup{|f′′(x)| : x ∈ [a, b]}.

Remark. Since the error between the true value and the approximate value
involves the term h², the midpoint approximation is a second order method.

Proof. To prove the result, we first prove that for each i = 1, ..., n,
\[ \left| \int_{x_{i-1}}^{x_i} f(x)\,dx - h\,f\!\left(a + (i - \tfrac{1}{2})h\right) \right| \le \frac{M}{24}\,h^3. \tag{21} \]

For t ∈ [0, h/2], consider the function
\[ g_i(t) = \int_{c_i - t}^{c_i + t} f(x)\,dx - 2t\,f(c_i), \]
where $c_i = (x_{i-1} + x_i)/2$. Then
\[ g_i(\tfrac{h}{2}) = \int_{x_{i-1}}^{x_i} f(x)\,dx - h\,f\!\left(a + (i - \tfrac{1}{2})h\right). \]

Since f is continuous, by Theorem 6.3.4,
\[ g_i'(t) = f(c_i + t) + f(c_i - t) - 2f(c_i), \quad \text{and} \]
\[ g_i''(t) = f'(c_i + t) - f'(c_i - t). \]

By the mean value theorem
\[ f'(c_i + t) - f'(c_i - t) = f''(\zeta)\,2t \]
for some $\zeta \in (c_i - t, c_i + t)$. Since |f′′(ζ)| ≤ M, we obtain
\[ |g_i''(t)| \le 2Mt \]
for all t ∈ [0, h/2]. Since $g_i'(0) = 0$, by the fundamental theorem of calculus,
\[ |g_i'(t)| = \left| \int_0^t g_i''(x)\,dx \right| \le \int_0^t |g_i''(x)|\,dx \le 2M \int_0^t x\,dx = Mt^2. \]

Also, since $g_i(0) = 0$, by the fundamental theorem of calculus again,
\[ |g_i(\tfrac{h}{2})| = \left| \int_0^{h/2} g_i'(t)\,dt \right| \le \int_0^{h/2} |g_i'(t)|\,dt \le \frac{M}{3}\left(\frac{h}{2}\right)^3 = \frac{M}{24}\,h^3, \]

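which proves inequality (21). Therefore,
\[
\begin{aligned}
\left| \int_a^b f(x)\,dx - M_n(f) \right|
&= \left| \sum_{i=1}^{n} \left[ \int_{x_{i-1}}^{x_i} f(x)\,dx - h f\!\left(a + (i - \tfrac{1}{2})h\right) \right] \right|
= \left| \sum_{i=1}^{n} g_i(\tfrac{h}{2}) \right| \\
&\le \sum_{i=1}^{n} \left| g_i(\tfrac{h}{2}) \right| \le \frac{M}{24}\,h^3\,n = \frac{M(b - a)}{24}\,h^2.
\end{aligned}
\]
□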
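The second order behaviour asserted by Theorem 6.6.3 can also be observed numerically. The sketch below, again for f(x) = 1/x on [1, 5] with M = sup|f′′| = 2, compares the actual midpoint error with the bound M(b − a)h²/24; the function names are assumptions made for this illustration.

```python
import math

def midpoint(f, a, b, n):
    """Midpoint approximation M_n(f) on n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i - 0.5) * h) for i in range(1, n + 1))

def midpoint_error(n):
    """Actual error for f(x) = 1/x on [1, 5], whose integral is ln 5."""
    return abs(math.log(5.0) - midpoint(lambda x: 1.0 / x, 1.0, 5.0, n))

def midpoint_bound(n, M=2.0, a=1.0, b=5.0):
    """Right-hand side M (b - a) h^2 / 24 of Theorem 6.6.3."""
    h = (b - a) / n
    return M * (b - a) * h ** 2 / 24

for n in (8, 16, 32, 64, 128):
    print(n, midpoint_error(n), midpoint_bound(n))
```

Doubling n divides the error by a factor close to 4, as expected of a second order method, and the error never exceeds the theoretical bound.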

FIGURE 6.9
Trapezoidal approximation

The Trapezoidal Rule


We now consider another common second order approximation method known
as the trapezoidal rule. In using Riemann sums, regardless of the choice of
the points $t_i$, we used rectangles to approximate the integral of the function f.
In our second numerical method we will replace rectangles by trapezoids. Let
f be a Riemann integrable function on [a, b] and let P = {x0, x1, ..., xn} be a
partition of [a, b]. As in Figure 6.9, for each interval $[x_{i-1}, x_i]$, the area of the
trapezoid formed by the points $(x_{i-1}, 0)$, $(x_{i-1}, f(x_{i-1}))$, $(x_i, f(x_i))$, $(x_i, 0)$ is
given by $\tfrac{1}{2}[f(x_{i-1}) + f(x_i)]\Delta x_i$. Summing these up gives
\[ \frac{1}{2} \sum_{i=1}^{n} [f(x_{i-1}) + f(x_i)]\,\Delta x_i \]
as an approximation to the integral of f on [a, b]. If as previously we set
h = (b − a)/n and $x_i = a + ih$, i = 0, ..., n, then the above sum becomes
\[ \frac{h}{2} \sum_{i=1}^{n} [f(a + (i - 1)h) + f(a + ih)] = \frac{h}{2} \left[ f(a) + 2 \sum_{i=1}^{n-1} f(a + ih) + f(b) \right]. \]

If f is Riemann integrable on [a, b] and n ∈ N, the quantity $T_n(f)$ defined by
\[ T_n(f) = \frac{h}{2} \left[ f(a) + 2 \sum_{i=1}^{n-1} f(a + ih) + f(b) \right] \tag{22} \]
where h = (b − a)/n, is called the nth trapezoidal approximation to the
integral of f on [a, b]. If we set $y_i = f(a + ih)$, then $T_n(f)$ can be expressed as
\[ T_n(f) = \frac{h}{2} \left[ y_0 + 2 \sum_{i=1}^{n-1} y_i + y_n \right]. \]

The following theorem, under suitable hypotheses on f, provides an
estimate on the error of the trapezoidal approximation.

THEOREM 6.6.4 If f is twice differentiable on [a, b] and f′′(x) is bounded
on [a, b], then
\[ \left| \int_a^b f(x)\,dx - T_n(f) \right| \le \frac{M(b - a)}{12}\,h^2, \]
where M = sup{|f′′(x)| : x ∈ [a, b]}.

The above error estimate can also be expressed as follows: If f satisfies the
hypothesis of the theorem, then for n ∈ N,
\[ \left| \int_a^b f(x)\,dx - T_n(f) \right| \le \frac{M(b - a)^3}{12}\,\frac{1}{n^2}. \tag{23} \]
In this form it is possible to determine the value of n required to guarantee
predetermined accuracy.
Proof. The proof of this theorem is very similar to the proof of Theorem
6.6.3. As a consequence we leave most of the details to the exercises (Exercise
5). The first step is to prove that for each i = 1, ..., n,
\[ \left| \int_{x_{i-1}}^{x_i} f(x)\,dx - \frac{h}{2}[f(x_{i-1}) + f(x_i)] \right| \le \frac{M}{12}\,h^3. \]
To accomplish this, consider the function
\[ g_i(t) = \int_{x_{i-1}}^{x_{i-1} + t} f(x)\,dx - \frac{t}{2}[f(x_{i-1}) + f(x_{i-1} + t)] \]
for t ∈ [0, h]. Then $g_i(0) = 0$ and
\[ g_i(h) = \int_{x_{i-1}}^{x_i} f(x)\,dx - \frac{h}{2}[f(x_{i-1}) + f(x_i)]. \]
By computing $g_i'(t)$ and $g_i''(t)$ one obtains from the hypothesis on f that
$|g_i''(t)| \le \tfrac{1}{2} tM$. Since $g_i(0) = g_i'(0) = 0$, applying the fundamental theorem
of calculus twice we obtain as in Theorem 6.6.3 that $|g_i(h)| \le \tfrac{1}{12} M h^3$. The
remainder of the proof is identical to Theorem 6.6.3. □

Simpson’s Rule
The trapezoidal approximation Tn (f ) amounts to approximating the func-
tion f with a piecewise linear function gn that passes through the points
{(xi , f (xi ))}, i = 0, . . . , n. Our intuition should convince us that one way to
obtain a better approximation to the integral of f over [a, b] is to use smoother
curves. This is exactly what is done in Simpson’s rule which uses parabolas
to approximate the integral. To use quadratic approximations we will need to
use three successive points of the partition of [a, b]. This is due to the fact
that three points are required to uniquely determine a parabola.
Prior to deriving Simpson’s rule, we first establish the following formula:
If p(x) = Ax² + Bx + C is the quadratic function passing through the points
$(0, y_0)$, $(h, y_1)$, $(2h, y_2)$, then
\[ \int_0^{2h} p(x)\,dx = \frac{h}{3}[y_0 + 4y_1 + y_2]. \tag{24} \]

One way to derive this formula would be to first determine the coefficients
A, B, C so that $p(0) = y_0$, $p(h) = y_1$, $p(2h) = y_2$, and then integrate p(x).
This however is not necessary. By integrating first,
\[
\begin{aligned}
\int_0^{2h} p(x)\,dx &= \frac{A}{3}(2h)^3 + \frac{B}{2}(2h)^2 + C(2h) \\
&= \frac{h}{3}[8Ah^2 + 6Bh + 6C] \\
&= \frac{h}{3}[p(0) + 4p(h) + p(2h)] = \frac{h}{3}[y_0 + 4y_1 + y_2].
\end{aligned}
\]
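Identity (24) is exact for every quadratic, which can be confirmed with rational arithmetic. The following Python sketch evaluates both sides for a few sample coefficients; the sample values and function names are arbitrary choices made for this illustration.

```python
from fractions import Fraction

def integral_of_quadratic(A, B, C, h):
    """Exact integral of A x^2 + B x + C over [0, 2h], from the antiderivative."""
    return A * (2 * h) ** 3 / 3 + B * (2 * h) ** 2 / 2 + C * (2 * h)

def three_point_formula(A, B, C, h):
    """Right side of (24): (h/3)[p(0) + 4 p(h) + p(2h)]."""
    p = lambda x: A * x * x + B * x + C
    return h / 3 * (p(0) + 4 * p(h) + p(2 * h))

# Each tuple is (A, B, C, h); exact fractions avoid any rounding error.
samples = [
    (Fraction(1), Fraction(-2), Fraction(3), Fraction(1, 2)),
    (Fraction(-7, 3), Fraction(5, 4), Fraction(0), Fraction(3)),
    (Fraction(0), Fraction(1), Fraction(1), Fraction(2, 7)),
]
checks = [integral_of_quadratic(*t) == three_point_formula(*t) for t in samples]
print(checks)
```

The step hidden in the last line of the derivation is the expansion p(0) + 4p(h) + p(2h) = C + 4(Ah² + Bh + C) + (4Ah² + 2Bh + C) = 8Ah² + 6Bh + 6C.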
Let f ∈ R[a, b] and let n ∈ N be even. Set h = (b − a)/n. On each of the
intervals [a + 2(i − 1)h, a + 2ih], i = 1, ..., n/2, we approximate the integral of
f by the integral of the quadratic function that agrees with f at the points
\[ y_0 = f(a + 2(i - 1)h), \quad y_1 = f(a + (2i - 1)h), \quad y_2 = f(a + 2ih). \]
By identity (24) this gives
\[ \frac{h}{3}[f(a + 2(i - 1)h) + 4f(a + (2i - 1)h) + f(a + 2ih)] \]
as an approximation to the integral of f over the interval [a + 2(i − 1)h, a + 2ih].
Summing these terms from i = 1 to n/2 gives
\[ S_n(f) = \frac{h}{3}[f(a) + 4f(a + h) + 2f(a + 2h) + 4f(a + 3h) + \cdots + 4f(a + (n - 1)h) + f(b)] \]
as an approximation to the integral of f over [a, b]. The quantity $S_n(f)$ is
called the nth Simpson approximation to the integral of f over [a, b]. If we
set $y_i = f(a + ih)$, then $S_n(f)$ is given by
\[ S_n(f) = \frac{h}{3}[y_0 + 4y_1 + 2y_2 + 4y_3 + \cdots + 2y_{n-2} + 4y_{n-1} + y_n]. \tag{25} \]
The following theorem, again under suitable restrictions on f, provides an
error estimate for Simpson’s rule.

THEOREM 6.6.5 If f is four times differentiable on [a, b] and $f^{(4)}(x)$ is
bounded on [a, b], then for n ∈ N even,
\[ \left| \int_a^b f(x)\,dx - S_n(f) \right| \le \frac{M(b - a)}{180}\,h^4, \]
where M = sup{$|f^{(4)}(x)|$ : x ∈ [a, b]}.


Remarks. Since the error term involves h⁴, Simpson’s rule is a fourth order
method. Also, if f(x) is a polynomial of degree less than or equal to 3, then
$f^{(4)}(x) = 0$ and thus
\[ \int_a^b f(x)\,dx = S_n(f). \]
If f satisfies the hypothesis of the theorem and n ∈ N is even, then the above
error estimate can be expressed as
\[ \left| \int_a^b f(x)\,dx - S_n(f) \right| \le \frac{M(b - a)^5}{180}\,\frac{1}{n^4}. \tag{26} \]

Proof. The proof of the theorem proceeds in an analogous manner as the
proofs of the previous two results. We first prove that for i = 1, ..., n/2,
\[ \left| \int_{c_i - h}^{c_i + h} f(x)\,dx - \frac{h}{3}[f(c_i - h) + 4f(c_i) + f(c_i + h)] \right| \le \frac{M}{90}\,h^5, \tag{27} \]
where $c_i = a + (2i - 1)h$. To accomplish this, define $g_i$ on [0, h] by
\[ g_i(t) = \int_{c_i - t}^{c_i + t} f(x)\,dx - \frac{t}{3}[f(c_i - t) + 4f(c_i) + f(c_i + t)]. \]
To prove inequality (27), we are required to show that $|g_i(h)| \le M h^5/90$.
Upon computing the successive derivatives of $g_i$ we obtain
\[
\begin{aligned}
g_i'(t) &= \frac{t}{3}[f'(c_i - t) - f'(c_i + t)] + \frac{2}{3}[f(c_i - t) + f(c_i + t)] - \frac{4}{3} f(c_i), \\
g_i''(t) &= \frac{t}{3}[-f''(c_i - t) - f''(c_i + t)] + \frac{1}{3}[-f'(c_i - t) + f'(c_i + t)], \quad \text{and} \\
g_i'''(t) &= \frac{t}{3}[f'''(c_i - t) - f'''(c_i + t)].
\end{aligned}
\]

By the mean value theorem $f'''(c_i - t) - f'''(c_i + t) = -f^{(4)}(\zeta)(2t)$ for some
$\zeta \in (c_i - t, c_i + t)$. Therefore
\[ |g_i'''(t)| \le \frac{2}{3} M t^2. \]
As in the previous two theorems, since $g_i(0) = g_i'(0) = g_i''(0) = 0$, upon three
integrations we have $|g_i(h)| \le M h^5/90$. This proves inequality (27). Finally,
\[
\begin{aligned}
\left| \int_a^b f(x)\,dx - S_n(f) \right| &= \left| \sum_{i=1}^{n/2} g_i(h) \right| \le \sum_{i=1}^{n/2} |g_i(h)| \\
&\le \frac{M h^5}{90} \left( \frac{n}{2} \right) = \frac{M(b - a)}{180}\,h^4.
\end{aligned}
\]
□

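EXAMPLE 6.6.6 In this example, we will use the trapezoidal rule and
Simpson’s rule with n = 8 to obtain approximations to ln 5. For $f(x) = x^{-1}$,
$f''(x) = 2x^{-3}$ and $f^{(4)}(x) = 24x^{-5}$. Therefore
\[ \sup\{|f''(x)| : x \in [1, 5]\} = 2 \quad \text{and} \quad \sup\{|f^{(4)}(x)| : x \in [1, 5]\} = 24. \]
By Theorem 6.6.4 with h = 1/2, the error $E_8$ in the trapezoidal approximation
satisfies
\[ |E_8(f)| \le \frac{2 \cdot 4}{12} \left( \frac{1}{2} \right)^2 = \frac{1}{6} = 0.16666\ldots \]
On the other hand, the error $E_8$ in using Simpson’s rule is guaranteed to
satisfy
\[ |E_8(f)| \le \frac{24 \cdot 4}{180} \left( \frac{1}{2} \right)^4 = \frac{1}{30} = 0.03333\ldots \]
which is considerably better.
With $x_i = 1 + ih$ and $y_i = f(x_i)$, i = 0, 1, ..., 8,
\[ y_0 = 1,\ y_1 = \tfrac{2}{3},\ y_2 = \tfrac{1}{2},\ y_3 = \tfrac{2}{5},\ y_4 = \tfrac{1}{3},\ y_5 = \tfrac{2}{7},\ y_6 = \tfrac{1}{4},\ y_7 = \tfrac{2}{9},\ y_8 = \tfrac{1}{5}. \]
Therefore,
\[
\begin{aligned}
T_8(f) &= \frac{1}{4} \left[ y_0 + 2 \sum_{i=1}^{7} y_i + y_8 \right] \\
&= \frac{1}{4} \left[ 1 + \frac{4}{3} + 1 + \frac{4}{5} + \frac{2}{3} + \frac{4}{7} + \frac{1}{2} + \frac{4}{9} + \frac{1}{5} \right] \\
&= 1.6290 \quad \text{(to four decimal places)}.
\end{aligned}
\]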

Since ln 5 = 1.6094 to four decimal places, the error is less than 0.02, well
within the tolerance predicted by the theory. With Simpson’s rule,
\[
\begin{aligned}
S_8(f) &= \frac{1}{6}[y_0 + 4y_1 + 2y_2 + 4y_3 + 2y_4 + 4y_5 + 2y_6 + 4y_7 + y_8] \\
&= \frac{1}{6} \left[ 1 + \frac{8}{3} + 1 + \frac{8}{5} + \frac{2}{3} + \frac{8}{7} + \frac{1}{2} + \frac{8}{9} + \frac{1}{5} \right] \\
&= 1.6108 \quad \text{(to four decimal places)}.
\end{aligned}
\]
With Simpson’s rule the actual error is approximately 0.0014.


We can also use the results of the theory to determine how large n must be
chosen to guarantee predetermined accuracy. For the trapezoidal rule, since
M = 2 and (b − a) = 4, by inequality (23)
\[ |E_n(f)| \le \frac{2 \cdot 4^3}{12}\,\frac{1}{n^2}. \]
Thus to obtain accuracy to within 0.001 we need
\[ n^2 > \frac{2 \cdot 4^3}{12}\,10^3 = 10666.66\ldots \]
which is accomplished with n ≥ 104. On the other hand, using Simpson’s rule,
\[ |E_n(f)| \le \frac{24 \cdot 4^5}{180}\,\frac{1}{n^4}. \]
Thus to obtain accuracy to within 0.001, we need n ∈ N even with
\[ n^4 > \frac{24 \cdot 4^5}{180}\,10^3. \]
This will be satisfied with n ≥ 20. □
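The computations of Example 6.6.6 can be reproduced as follows. This is a minimal sketch of formulas (22) and (25) together with a brute-force search for the smallest n allowed by the error bounds (23) and (26); the function names, including `smallest_n`, are assumptions made for this illustration.

```python
import math

def trapezoid(f, a, b, n):
    """T_n(f) as in formula (22)."""
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h / 2 * (f(a) + 2 * interior + f(b))

def simpson(f, a, b, n):
    """S_n(f) as in formula (25); n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))    # coefficients 4
    even = sum(f(a + i * h) for i in range(2, n, 2))   # coefficients 2
    return h / 3 * (f(a) + 4 * odd + 2 * even + f(b))

def smallest_n(bound, tol, step=1):
    """Smallest n (in multiples of step) whose error bound is below tol."""
    n = step
    while bound(n) >= tol:
        n += step
    return n

f = lambda x: 1.0 / x
T8 = trapezoid(f, 1.0, 5.0, 8)
S8 = simpson(f, 1.0, 5.0, 8)
n_trap = smallest_n(lambda n: 2 * 4 ** 3 / (12 * n ** 2), 1e-3)            # bound (23)
n_simp = smallest_n(lambda n: 24 * 4 ** 5 / (180 * n ** 4), 1e-3, step=2)  # bound (26)

print(f"T8 = {T8:.4f}, S8 = {S8:.4f}, ln 5 = {math.log(5):.4f}")
print(n_trap, n_simp)
```

The sketch prints T8 = 1.6290 and S8 = 1.6108 and returns n = 104 and n = 20, in agreement with the example.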

Exercises 6.6
1. a. Use the midpoint rule, trapezoidal rule, and Simpson’s rule to
approximate $\ln 2 = \int_1^2 (1/x)\,dx$ with n = 4. For each method, determine
the estimated error and compare your answer to ln 2 = 0.69315 (to five
decimal places).
b. Repeat (a) with n = 8.
c. For each of the three methods (midpoint rule, trapezoidal rule,
Simpson’s rule) determine how large n must be chosen to assure accuracy in
the approximation of ln 2 to within 0.0001.

2. *a. Using the fact that
\[ \int_0^1 \frac{dt}{1 + t^2} = \frac{\pi}{4}, \]
use the midpoint rule, trapezoidal rule, and Simpson’s rule to
approximate π/4 with n = 4. For each method, determine the estimated error.
b. Repeat (a) with n = 8.
c. For each of the three methods determine how large n must be chosen
to obtain an approximation of π to within 0.0001.
3. Use Simpson’s rule to obtain approximations of each of the following
integrals accurate to at least four decimal places.
a. $\int_0^1 \frac{1}{1 + x^4}\,dx$   *b. $\int_0^2 \sqrt{1 + x^2}\,dx$   c. $\int_0^1 \sin(x^2)\,dx$.
4. How large must n be chosen so that the trapezoidal approximation $T_n$
approximates $\int_0^2 e^{-x^2}\,dx$ with an error less than $10^{-6}$?
5. Fill in the details of the proof of Theorem 6.6.4.
6. Let f be a differentiable function on [a, b] with f′(x) bounded on [a, b].
Let n ∈ N. Prove that
\[ \left| \int_a^b f(x)\,dx - h \sum_{i=1}^{n} f(a + ih) \right| \le \frac{M(b - a)}{2}\,h, \]
where h = (b − a)/n and M = sup{|f′(x)| : x ∈ [a, b]}.
7. a. Show that $T_{2n}(f) = \tfrac{1}{2}[M_n(f) + T_n(f)]$.
b. Show that $S_{2n}(f) = \tfrac{2}{3} M_n(f) + \tfrac{1}{3} T_n(f)$.
8. Prove the following variation of Theorem 6.6.4: If f is twice differentiable
on [a, b] and f′′ is continuous on [a, b], then there exists a point c ∈ [a, b]
such that
\[ T_n(f) - \int_a^b f(x)\,dx = \frac{(b - a)h^2}{12}\,f''(c). \]
9. a. Use the previous exercise to show that $\frac{1}{3072} < T_8 - \ln 2 < \frac{1}{384}$.
b. Using (a) and Exercise 1(b) show that .6915 < ln 2 < .6938.

6.7 Proof of Lebesgue’s Theorem


In this final section, we present a self-contained proof of Lebesgue’s
characterization of the Riemann integrable functions on [a, b]. Recall that a subset
E of R has measure zero if for any ε > 0, there exists a finite or countable
collection $\{I_n\}$ of open intervals with $E \subset \bigcup_n I_n$ and $\sum_n \ell(I_n) < \varepsilon$, where
$\ell(I_n)$ denotes the length of the interval $I_n$. We begin with several preparatory
lemmas.
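To make the definition concrete, the following Python sketch covers finitely many points of a countable set (here the first 50 rationals in [0, 1], in an enumeration chosen for this illustration) by open intervals whose lengths shrink geometrically, so that the total length stays below a prescribed ε. The enumeration, the lengths ε/2^{n+1}, and the function names are assumptions of the sketch; the same scheme applied to the whole sequence is the standard argument that a countable set has measure zero.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` distinct rationals p/q in [0, 1], listed by increasing denominator."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """The n-th point (n = 1, 2, ...) gets an open interval of length eps / 2^(n+1)."""
    intervals = []
    for n, x in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (n + 2)      # half of the interval length
        intervals.append((x - half, x + half))
    return intervals

eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
intervals = cover(points, eps)
total_length = sum(b - a for a, b in intervals)
print(float(total_length), float(eps))
```

However many points are covered, the total length is at most ε(1/4 + 1/8 + ···) = ε/2 < ε.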

LEMMA 6.7.1 A finite or countable union of sets of measure zero has
measure zero.
Proof. We will prove the lemma for the case of a countable collection of sets of
measure zero. The result for a finite union is an immediate consequence.
Suppose $\{E_n\}_{n \in N}$ is a countable collection of sets of measure zero. Set
$E = \bigcup_n E_n$ and let ε > 0 be given. Since each set $E_n$ is a set of measure
zero, for each n ∈ N there exists a finite or countable collection $\{I_{n,k}\}_k$ of
open intervals such that $E_n \subset \bigcup_k I_{n,k}$ and $\sum_k \ell(I_{n,k}) < \varepsilon/2^n$. Since we can
always take $I_{n,k}$ to be the empty set, there is no loss of generality in assuming
that the collection $\{I_{n,k}\}_k$ is countable. Then $\{I_{n,k}\}_{n,k}$ is again a countable
collection of open intervals with $E \subset \bigcup_{n,k} I_{n,k}$.
Since N × N is countable, there exists a one-to-one function f from N onto
N × N. For each m ∈ N, set $J_m = I_{f(m)}$. Then $\{J_m\}_{m \in N}$ is a countable collection
of open intervals with $E \subset \bigcup_m J_m$. Since f is one-to-one, for each N ∈ N, the
set $F_N = f(\{1, \ldots, N\})$ is a finite subset of N × N. Hence there exist positive
integers $N_1$ and $K_1$ such that for all $(n, k) \in F_N$ we have $1 \le n \le N_1$ and
$1 \le k \le K_1$. Hence
\[ \sum_{m=1}^{N} \ell(J_m) = \sum_{(n,k) \in F_N} \ell(I_{n,k}) \le \sum_{(n,k) \in N_1 \times K_1} \ell(I_{n,k}). \]
But
\[
\sum_{(n,k) \in N_1 \times K_1} \ell(I_{n,k})
= \sum_{n=1}^{N_1} \sum_{k=1}^{K_1} \ell(I_{n,k})
\le \sum_{n=1}^{N_1} \sum_{k=1}^{\infty} \ell(I_{n,k})
< \sum_{n=1}^{N_1} \frac{\varepsilon}{2^n} < \varepsilon.
\]
Thus $\sum_{m=1}^{\infty} \ell(J_m) < \varepsilon$. Therefore E has measure zero. □

LEMMA 6.7.2 Suppose f is a nonnegative Riemann integrable function on
[a, b] with $\int_a^b f = 0$. Then {x ∈ [a, b] : f(x) > 0} has measure zero.

Proof. We first prove that for each c > 0, the set $E_c$ = {x ∈ [a, b] : f(x) ≥ c}
has measure zero. Let ε > 0 be given. Since $\int_a^b f = 0$, there exists a partition
P = {x0, x1, ..., xn} of [a, b] with U(P, f) < cε, where
\[ U(P, f) = \sum_{i=1}^{n} M_i\,\Delta x_i, \]
and $M_i = \sup\{f(x) : x \in [x_{i-1}, x_i]\}$. Let $I = \{i : E_c \cap [x_{i-1}, x_i] \ne \emptyset\}$. If i ∈ I,
then $M_i \ge c$. Hence
\[ c\,\varepsilon > U(P, f) \ge \sum_{i \in I} M_i\,\Delta x_i \ge c \sum_{i \in I} \Delta x_i. \]
Thus $\sum_{i \in I} \ell([x_{i-1}, x_i]) = \sum_{i \in I} \Delta x_i < \varepsilon$. Finally, since $E_c \subset \bigcup_{i \in I} [x_{i-1}, x_i]$, it
follows that $E_c$ has measure zero.
To conclude the proof we note that $\{x \in [a, b] : f(x) > 0\} = \bigcup_{n \in N} E_n$,
where for each n ∈ N,
\[ E_n = \left\{ x \in [a, b] : f(x) \ge \frac{1}{n} \right\}. \]
By the above, each $E_n$ has measure zero. Thus by Lemma 6.7.1, the set
{x ∈ [a, b] : f(x) > 0} has measure zero. □

DEFINITION 6.7.3 If E is a subset of R, the characteristic function of
E, denoted $\chi_E$, is the function defined by
\[ \chi_E(x) = \begin{cases} 1, & x \in E, \\ 0, & x \notin E. \end{cases} \]

As in Section 6.2, if P = {x0, x1, ..., xn} is a partition of [a, b], the norm
‖P‖ of the partition P is defined by $\|P\| = \max_{1 \le i \le n} \Delta x_i$. If f is a bounded
function on [a, b], we denote by $m_i$ and $M_i$ the infimum and supremum
respectively of f on $[x_{i-1}, x_i]$. The lower function $L_f$ and upper function
$U_f$ for f and the partition P are defined respectively by
\[ L_f(x) = \sum_{i=1}^{n} m_i\,\chi_{[x_{i-1}, x_i)}(x), \quad \text{and} \]
\[ U_f(x) = \sum_{i=1}^{n} M_i\,\chi_{[x_{i-1}, x_i)}(x). \]
The graphs of the lower function $L_f$ and upper function $U_f$ are depicted in
Figure 6.10. Since $m_i \le f(x) \le M_i$ for all $x \in [x_{i-1}, x_i)$,
\[ L_f(x) \le f(x) \le U_f(x) \quad \text{for all } x \in [a, b). \]
Also, since $L_f$ and $U_f$ are continuous except at a finite number of points,
they are Riemann integrable on [a, b] (Exercise 1) with
\[ \int_a^b L_f = L(P, f) \quad \text{and} \quad \int_a^b U_f = U(P, f). \]

THEOREM 6.7.4 (Lebesgue) A bounded real-valued function f on [a, b]
is Riemann integrable on [a, b] if and only if the set of discontinuities of f has
measure zero.

FIGURE 6.10
Graphs of the lower and upper function of f

Proof. Assume first that f is Riemann integrable on [a, b]. Then for each
n ∈ N there exists a partition $P_n$ of [a, b] with $\|P_n\| < 1/n$, such that
\[ 0 \le \int_a^b f - L(P_n, f) < \frac{1}{n} \quad \text{and} \quad 0 \le U(P_n, f) - \int_a^b f < \frac{1}{n}. \tag{28} \]
Since $L(P_n, f) \le L(P, f)$ and $U(P, f) \le U(P_n, f)$ for any refinement P of $P_n$,
the partitions $P_n$, n = 1, 2, ..., can be chosen so that $P_{n+1}$ is a refinement of
$P_n$.
For each n ∈ N, let $L_n$ and $U_n$ respectively denote the lower and upper
functions of f for the partition $P_n$. Then $L_n(x) \le f(x) \le U_n(x)$ for all
x ∈ [a, b] and
\[ \int_a^b L_n = L(P_n, f) \quad \text{and} \quad \int_a^b U_n = U(P_n, f). \tag{29} \]

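Since $P_{n+1}$ is a refinement of $P_n$, the functions $L_n$ and $U_n$ satisfy $L_n(x) \le
L_{n+1}(x)$ and $U_{n+1}(x) \le U_n(x)$ for all x ∈ [a, b]. Define the functions L and U
on [a, b] by
\[ L(x) = \lim_{n \to \infty} L_n(x) \quad \text{and} \quad U(x) = \lim_{n \to \infty} U_n(x). \]
Then
\[ L_n(x) \le L(x) \le f(x) \le U(x) \le U_n(x) \]
for all x ∈ [a, b]. Hence
\[ \int_a^b L_n \le \underline{\int_a^b} L \le \overline{\int_a^b} L \le \int_a^b f \le \underline{\int_a^b} U \le \overline{\int_a^b} U \le \int_a^b U_n. \]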

But by equations (28) and (29),
\[ \lim_{n \to \infty} \int_a^b L_n = \lim_{n \to \infty} \int_a^b U_n = \int_a^b f. \]
Hence L and U are Riemann integrable on [a, b] with
\[ \int_a^b L = \int_a^b U = \int_a^b f. \]

Since U(x) − L(x) ≥ 0 for all x ∈ [a, b], by Lemma 6.7.2, {x ∈ [a, b] : U(x) ≠
L(x)} has measure zero. Furthermore, since each $P_n$ is finite and hence has
measure zero, by Lemma 6.7.1, the set
\[ E = \{x \in [a, b] : U(x) \ne L(x)\} \cup \left( \bigcup_n P_n \right) \]
also has measure zero. We conclude by showing that f is continuous on [a, b] \ E.
Fix $x_o$ ∈ [a, b] \ E, and let ε > 0 be given. Since $L(x_o) = U(x_o)$, there
exists an integer k ∈ N such that $U_k(x_o) - L_k(x_o) < \varepsilon$. Also, since $x_o \notin P_k$,
the functions $U_k$ and $L_k$ are constant in a neighborhood of $x_o$. Hence there
exists a δ > 0 such that
\[ 0 \le U_k(x_o) - L_k(x) = U_k(x) - L_k(x_o) < \varepsilon \]
for all x ∈ [a, b] with $|x - x_o| < \delta$. Finally, since $L_k(x) \le f(x) \le U_k(x)$ for all
x ∈ [a, b),
\[ -\varepsilon < L_k(x) - U_k(x_o) \le f(x) - f(x_o) \le U_k(x) - L_k(x_o) < \varepsilon \]
for all x with $|x - x_o| < \delta$. Therefore f is continuous at $x_o$.


Conversely, suppose f is continuous on [a, b]\E, where E is a set of measure
zero and a < b. Let M > 0 be such that |f (x)| ≤ M for all x ∈ [a, b], and let
ǫ > 0 be given. Since E has measure zero, there S exists a finite
P or countable
collection {In } of open intervals such that E ⊂ n In and n ℓ(In ) < ǫ/4M .
Also, since f is continuous on [a, b] \ E, for each x ∈ [a, b] there exists an
open interval Jx such that |f (z) − f (y)| < ǫ/2(b − a) for all y, z ∈ Jx ∩ [a, b].
The collection {In } ∪ {Jx : x ∈ [a, b] \ E} is an open cover of [a, b]. Thus by
compactness, a finite number, say {Ik }nk=1 and {Jxj }m j=1 also cover [a, b]. Let

\[ P = \{a = t_0, t_1, \dots, t_N = b\} \]
be the partition of $[a, b]$ determined by those endpoints of $I_k$, $k = 1, \dots, n$, and $J_{x_j}$, $j = 1, \dots, m$, that are in $[a, b]$. For each $j$, $1 \le j \le N$, the interval $(t_{j-1}, t_j)$ is contained in some $I_k$ or in some $J_{x_i}$. Let $J = \{j : (t_{j-1}, t_j) \subset I_k \text{ for some } k\}$. Also, for each $j \in \{1, \dots, N\}$, let $m_j$ and $M_j$ denote the infimum and supremum of $f$ on $[t_{j-1}, t_j]$ respectively. Then $M_j - m_j \le 2M$ for all $j \in J$, and $M_j - m_j < \varepsilon/2(b - a)$ for all $j \notin J$. Thus
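\begin{align*}
U(P, f) - L(P, f) &= \sum_{j \in J} (M_j - m_j)\,\Delta t_j + \sum_{j \notin J} (M_j - m_j)\,\Delta t_j \\
&\le 2M \sum_{j \in J} \Delta t_j + \frac{\varepsilon}{2(b - a)} \sum_{j \notin J} \Delta t_j \\
&\le 2M \sum_{k=1}^{n} \ell(I_k) + \frac{\varepsilon}{2(b - a)} \sum_{j=1}^{N} \Delta t_j \\
&\le 2M \Bigl( \frac{\varepsilon}{4M} \Bigr) + \frac{\varepsilon}{2(b - a)}\,(b - a) = \varepsilon.
\end{align*}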

Hence by Theorem 6.1.7 the function f is Riemann integrable on [a, b]. □

Exercises 6.7
1. Let $f$ be a real-valued function on $[a, b]$ and $P$ a partition of $[a, b]$. Prove that the lower function $L_f$ and upper function $U_f$ for the partition $P$ are Riemann integrable on $[a, b]$ with
\[ \int_a^b L_f = L(P, f) \quad\text{and}\quad \int_a^b U_f = U(P, f). \]
2. *Let $P$ be the Cantor set in $[0, 1]$. Show that $\chi_P \in R[0, 1]$ and find $\int_0^1 \chi_P(x)\,dx$.
3. Let $f \in R[a, b]$, and for $x \in [a, b]$, set $F(x) = \int_a^x f(t)\,dt$. Prove that there exists a subset $E$ of $[a, b]$ of measure zero such that $F'(x) = f(x)$ for all $x \in [a, b] \setminus E$.
4. *Suppose $f \in R[a, b]$ and $g$ is a bounded real-valued function on $[a, b]$. If $\{x \in [a, b] : g(x) \ne f(x)\}$ has measure zero, is $g$ Riemann integrable on $[a, b]$?

Notes
The fundamental theorem of calculus is without question the key theorem of cal-
culus; it relates the Cauchy-Riemann theory of integration with differentiation. For
Newton, Leibniz, and their successors, integration was the inverse operation of differ-
entiation. When Cauchy however defined integration independent of differentiation,
the fundamental theorem of calculus became a necessity. It was crucial in proving
that Cauchy’s integral was the inverse of differentiation, thereby providing both a
convenient tool for the evaluation of definite integrals and proving that every con-
tinuous real-valued function defined on an interval I has an antiderivative on I.

Although we stated the fundamental theorem of calculus as two separate theorems, for continuous functions they are the same. Specifically, if $f$ is a continuous real-valued function on $[a, b]$ and $F$ is an antiderivative of $f$, then for any $x \in [a, b]$,
\[ F(x) = F(a) + \int_a^x f(t)\,dt. \]
Conversely, if $f$ is continuous and $F$ is defined as above, then $F'(x) = f(x)$.


Lebesgue’s theorem is one of the most beautiful results of analysis. It pro-
vides very concise necessary and sufficient conditions for Riemann integrability of
a bounded real-valued function f . Although Riemann was the first to provide such
conditions, his result lacked the simplicity and elegance of Lebesgue’s theorem. Un-
fortunately for Riemann, the concept of measure had not yet been developed when
he stated and proved his result. In Chapter 10 we will develop the theory of measure
and Lebesgue’s extension of the Riemann integral.
The need for numerical methods for the evaluation of definite integrals was rec-
ognized as early as the eighteenth century. Euler and Thomas Simpson (1710–1761),
among others, used numerical techniques to approximate the definite integral in
problems where an antiderivative could not be found. Even the error estimates devel-
oped in Section 6 date back to that era. With the availability of efficient calculators
and high speed computers, numerical methods have increased in importance in the
past few decades. This has led to the development of very sophisticated numerical
algorithms for the evaluation of definite integrals.
Although Newton and Leibniz are credited with inventing the differential and
integral calculus, many mathematicians prior to their time knew formulas for com-
puting tangents and areas in particular instances. Archimedes (around 200 B.C.) in
his treatise Quadrature of the Parabola used the method of exhaustion by inscribed
triangles to derive a formula for the area under a parabolic segment. By the mid
1640’s, Pierre de Fermat (1601–1665) had determined the formulas for the area under
any curve of the form $y = x^k$ ($k \ne -1$), and for finding tangents to such curves.
Isaac Barrow (1630–1677), who was a professor of geometry at Cambridge, in his
1670 treatise Lectiones geometriae developed an algebraic procedure which is virtu-
ally identical to the differential calculus for finding tangents to a curve. However,
all these early methods were developed using geometric arguments. Newton and
Leibniz developed the concepts, the notation, and the algorithms for making these
computations for arbitrary functions. Most importantly however, both men realized
the inverse nature of the problems of tangents and areas. For these reasons they
are credited with the development of the differential and integral calculus. Further
information on the historical development of calculus may be found in the text by
Katz listed in the Bibliography and the article by Rosenthal listed at the end of the
chapter.

Miscellaneous Exercises
A real-valued function $f$ on $[a, b]$ is a step function if there exist a finite number of disjoint intervals $\{I_j\}_{j=1}^{n}$ with $\bigcup I_j = [a, b]$ such that $f$ is constant on each of the intervals $I_j$, $j = 1, \dots, n$.

1. a. If $f$ is a step function on $[a, b]$, prove that $f \in R[a, b]$ with
\[ \int_a^b f = \sum_{i=1}^{n} c_i\,\ell(I_i), \]
where $c_i$ is the value of $f$ on $I_i$.
b. If $f \in R[a, b]$ and $\varepsilon > 0$ is given, prove that there exists a step function $h$ on $[a, b]$ such that
\[ \int_a^b |f - h| < \varepsilon. \]
c. If $h$ is a step function on $[a, b]$ and $\varepsilon > 0$ is given, prove that there exists a continuous function $g$ on $[a, b]$ such that
\[ \int_a^b |h - g| < \varepsilon. \]
d. If $f \in R[a, b]$ and $\varepsilon > 0$ is given, prove that there exists a continuous function $g$ on $[a, b]$ such that
\[ \int_a^b |f - g| < \varepsilon. \]

Let $f$ be a real-valued function defined on $[0, \infty)$. The Laplace transform of $f$, denoted $\mathcal{L}(f)$, is the function defined by
\[ \mathcal{L}(f)(s) = \int_0^\infty e^{-st} f(t)\,dt, \]
whenever the improper Riemann integral exists.


2. Let f be defined on [0, ∞). Prove that there exists a ∈ R∪{−∞, ∞} such
that L(f )(s) is defined for all s ∈ (a, ∞), and that the integral defining
L(f )(s) diverges for all s ∈ (−∞, a).
(Hint: First show that if L(f )(so ) exists for some so , then L(f )(s) exists
for all s > so .)
3. Suppose $f \in R[0, c]$ for every $c > 0$, and that there exists a positive constant $C$, and $a \in \mathbb{R}$, such that $|f(t)| \le C e^{at}$ for all $t \ge 0$. Prove that $\mathcal{L}(f)(s)$ exists for all $s > a$.
4. Compute the Laplace transform of each of the following functions. In each case, specify the interval on which $\mathcal{L}(f)(s)$ is defined.
a. $f(t) = 1$.
b. $f(t) = e^{at}$, $a \in \mathbb{R}$.
c. $f(t) = \cos \omega t$.
d. $f(t) = \sin \omega t$.
e. $f(t) = t^n$, $n \in \mathbb{N}$. (Use induction.)
f. $f(t) = I(t - c)$, where $I(t - c)$ is the unit jump function at $t = c$.
g. $f(t) = t^{\alpha}$, $\alpha > -1$. (See Exercise 8, Section 6.4.)
5. Suppose $f$ is differentiable on $[0, \infty)$ and $a \in \mathbb{R}$ is such that $\mathcal{L}(f)(s)$ exists for all $s > a$. If $\lim_{t\to\infty} e^{-st} f(t) = 0$ for all $s > a$, prove that $\mathcal{L}(f')(s) = s\,\mathcal{L}(f)(s) - f(0)$.

Supplemental Reading

Bagby, R. J., “A convergence of limits,” Math. Mag. 71 (1998), 270–277.
Bao-lin, Z., “A note on the mean-value theorem for integrals,” Amer. Math. Monthly 104 (1997), 561–562.
Bartle, R. G., “Return to the Riemann integral,” Amer. Math. Monthly 103 (1996), 625–632.
Botsko, M. W., “An elementary proof that a bounded a.e. continuous function is Riemann integrable,” Amer. Math. Monthly 95 (1988), 249–252.
Bressoud, D. M., “Historical reflections on teaching the fundamental theorem of calculus,” Amer. Math. Monthly 118 (2011), 99–115.
Bullock, G. L., “A geometric interpretation of the Riemann-Stieltjes integral,” Amer. Math. Monthly 95 (1988), 448–455.
Fazekas, Jr., E. C. and Mercer, P. R., “Elementary proofs of error estimates for the midpoint and Simpson’s rules,” Math. Mag. 82 (2009), 365–370.
Goel, S. K. and Rodriguez, D. M., “A note on evaluating limits using Riemann sums,” Math. Mag. 60 (1987), 225–228.
Gordon, R. A., “Some integrals involving the Cantor function,” Amer. Math. Monthly 116 (2009), 218–227.
Gordon, R. A., “A bounded derivative that is not Riemann integrable,” Math. Mag. 89 (2016), 364–370.
Jacobson, B., “On the mean-value theorem for integrals,” Amer. Math. Monthly 89 (1982), 300–301.
Jones, L. K., “An elementary derivation of the numerical integration bounds in beginning calculus,” Amer. Math. Monthly 124 (2017), 558–561.
Jones, W. R. and Landau, M. D., “One-sided limits and integrability,” Math. Mag. 45 (1972), 19–21.
Klippert, J., “On the right-hand derivative of a certain integral function,” Amer. Math. Monthly 98 (1991), 751–752.
Kristensen, E., Poulsen, E. T., and Reich, E., “A characterization of Riemann integrability,” Amer. Math. Monthly 69 (1962), 498–505.
Rickey, V. F. and Tuchinsky, P. M., “An application of geography to mathematics: History of the integral of the secant,” Math. Mag. 53 (1980), 162–166.
Rosenthal, A., “The history of calculus,” Amer. Math. Monthly 58 (1951), 75–86.
Stein, S. K., “The error of the trapezoidal method for a concave curve,” Amer. Math. Monthly 83 (1976), 643–645.
Talman, L. A., “Simpson’s rule is exact for quintics,” Amer. Math. Monthly 113 (2006), 141–155.
Tandra, H., “A new proof of the change of variable theorem for the Riemann integral,” Amer. Math. Monthly 122 (2015), 795–799.
Williams, K. S., “Note on $\int_0^\infty (\sin x / x)\,dx$,” Math. Mag. 44 (1971), 9–11.
7
Series of Real Numbers

Although the study of series has a long history in mathematics,1 the modern
definition of convergence dates back only to the beginning of the nineteenth
century. In 1821, Cauchy, in his text Cours d’Analyse, used his definition of
limit to provide the first formal definition of convergence of a series in terms
of convergence of the sequence of partial sums. The Cauchy criterion (Theo-
rem 3.7.3) was the first significant result to provide necessary and sufficient
conditions for convergence of a series. Cauchy not only stated and proved the
result, he also applied his result to prove convergence and divergence of given
series. Many of the early convergence tests, such as the root and ratio test, are
due to him. Cauchy, with his formal development of series, placed the subject
matter on a rigorous mathematical foundation.
In this chapter, we will continue our study of series of real numbers begun
earlier in the text. Our primary emphasis in Section 7.1 will be on deriving
several tests that are useful in determining the convergence or divergence of a
given series. In Section 7.3 we will study the concepts of absolute convergence,
conditional convergence, and rearrangements of series. One of the key results
of this section is that every rearrangement of an absolutely convergent series
not only converges, but converges to the same sum. As we will also see, this
fails dramatically if the series converges but fails to converge absolutely.
In Section 7.4 we give a brief introduction to the topic of square summable
sequences. These play an important role in the study of Fourier series. One
of the main results of this section will be the Cauchy-Schwarz inequality for
series. This section also contains a brief introduction to normed linear spaces.

7.1 Convergence Tests


In Section 3.7, we provided a very brief introduction to the subject of infinite
series. In the study of infinite series, it is very useful to have tests available by
means of which one is able to determine whether a given series converges or
diverges. For example, Corollary 3.7.5 is very useful in determining divergence
of a series. If the sequence $\{a_k\}$ does not converge to zero, then the series $\sum a_k$ diverges. On the other hand however, if $\lim a_k = 0$, then nothing can be ascertained concerning convergence or divergence of the series $\sum a_k$. In this section, we will state and prove several useful results that can be used to establish convergence or divergence of a given series. Additional tests for convergence will also be given in the exercises and subsequent sections. With the exception of Theorem 7.1.1, all of our results in this section will be stated for series of nonnegative terms.

1 See the notes at the end of the chapter.

As in Definition 3.7.1, given an infinite series $\sum_{k=1}^{\infty} a_k$ of real numbers, $\{s_n\}_{n=1}^{\infty}$ will denote the associated sequence of partial sums defined by
\[ s_n = \sum_{k=1}^{n} a_k. \]
The series $\sum a_k$ converges if and only if the sequence $\{s_n\}$ of $n$th partial sums converges. Furthermore, if $\lim_{n\to\infty} s_n = s$ ($s \in \mathbb{R}$), then $s$ is called the sum of the series, and we write
\[ \sum_{k=1}^{\infty} a_k = s. \]
If the sequence $\{s_n\}$ diverges, then the series $\sum a_k$ is said to diverge. Furthermore, if $\lim_{n\to\infty} s_n = \infty$ (or $-\infty$), then we write $\sum a_k = \infty$ ($-\infty$) to denote that the series diverges to $\infty$ (or $-\infty$).
If $a_k \ge 0$ for all $k$, then by Theorem 3.7.6 $\sum a_k$ converges if and only if $\lim_{n\to\infty} s_n < \infty$. Thus for series of nonnegative terms we adopt the notation
\[ \sum_{k=1}^{\infty} a_k < \infty \]
to denote that the series converges.


Remarks. (a) Although we generally index a series by the positive integers $\mathbb{N}$, it is sometimes more convenient to start with $k = 0$ or with $k = k_o$ for some integer $k_o$. In this case, the resulting series are denoted as
\[ \sum_{k=0}^{\infty} a_k, \qquad \sum_{k=k_o}^{\infty} a_k. \]
Also, from the Cauchy criterion (Theorem 3.7.3) it is clear that $\sum_{k=1}^{\infty} a_k$ converges if and only if $\sum_{k=k_o}^{\infty} a_k$ converges for some, and hence every, $k_o \in \mathbb{N}$.

(b) Given any sequence $\{s_n\}_{n=1}^{\infty}$ of real numbers we can always find a series $\sum a_k$ whose $n$th partial sum is $s_n$. If we set $a_1 = s_1$ and $a_k = s_k - s_{k-1}$, $k > 1$, then
\[ \sum_{k=1}^{n} a_k = s_1 + (s_2 - s_1) + \cdots + (s_n - s_{n-1}) = s_n. \]
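The construction in Remark (b) can be checked mechanically. The following is a minimal sketch in Python, assuming a sample sequence $s_n = n/(n+1)$ chosen purely for illustration (it is not taken from the text); the helper name `terms_from_partial_sums` is likewise hypothetical.

```python
from fractions import Fraction
from itertools import accumulate


def terms_from_partial_sums(s):
    """Return a_1, ..., a_n with a_1 = s_1 and a_k = s_k - s_{k-1} for k > 1."""
    return [s[0]] + [s[k] - s[k - 1] for k in range(1, len(s))]


# Hypothetical sample sequence s_n = n / (n + 1), n = 1, ..., 6, in exact arithmetic.
s = [Fraction(n, n + 1) for n in range(1, 7)]
a = terms_from_partial_sums(s)

# The partial sums of the constructed terms telescope back to the original s_n.
partial_sums = list(accumulate(a))
print(a)
print(partial_sums == s)
```

For this particular choice the terms come out as $a_k = 1/(k(k+1))$, which is the familiar telescoping series.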


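THEOREM 7.1.1 If $\sum_{k=1}^{\infty} a_k = \alpha$ and $\sum_{k=1}^{\infty} b_k = \beta$, then

(a) $\sum_{k=1}^{\infty} c\,a_k = c\,\alpha$, for any $c \in \mathbb{R}$, and

(b) $\sum_{k=1}^{\infty} (a_k + b_k) = \alpha + \beta$.

Proof. The proof of (a) is similar to (b) and thus is omitted. To prove (b), for each $n \in \mathbb{N}$, let
\[ s_n = \sum_{k=1}^{n} a_k \quad\text{and}\quad t_n = \sum_{k=1}^{n} b_k. \]
Since the series converge to $\alpha$ and $\beta$ respectively, $\lim s_n = \alpha$ and $\lim t_n = \beta$. Therefore by Theorem 3.2.1, $\lim (s_n + t_n) = \alpha + \beta$. But
\[ s_n + t_n = \sum_{k=1}^{n} a_k + \sum_{k=1}^{n} b_k = \sum_{k=1}^{n} (a_k + b_k). \]
Therefore $s_n + t_n$ is the $n$th partial sum of the series $\sum (a_k + b_k)$. Since the sequence $\{s_n + t_n\}$ converges to $\alpha + \beta$,
\[ \sum_{k=1}^{\infty} (a_k + b_k) = \alpha + \beta. \qquad \square \]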

Comparison Test
One of the most important convergence tests is the comparison test. Although
very elementary, it provides one of the most useful tools in determining con-
vergence or divergence of a series. It is useful both in applications and theory.
Several of the proofs of subsequent theorems rely on this test. In applications,
by comparing the terms of a given series with the terms of a series for which
convergence or divergence is known, we are then able to determine whether
the given series converges or diverges.
THEOREM 7.1.2 (Comparison Test) Suppose $\sum a_k$ and $\sum b_k$ are two given series of nonnegative real numbers satisfying
\[ 0 \le a_k \le M b_k \]
for some positive constant $M$ and all integers $k \ge k_o$, for some fixed $k_o \in \mathbb{N}$.

(a) If $\sum_{k=1}^{\infty} b_k < \infty$, then $\sum_{k=1}^{\infty} a_k < \infty$.

(b) If $\sum_{k=1}^{\infty} a_k = \infty$, then $\sum_{k=1}^{\infty} b_k = \infty$.

Proof. Suppose the terms $\{a_k\}$ and $\{b_k\}$ satisfy $a_k \le M b_k$ for all $k \ge k_o$ and some positive constant $M$. Then for $n > m \ge k_o$
\[ 0 \le \sum_{k=m+1}^{n} a_k \le M \sum_{k=m+1}^{n} b_k. \]
Suppose $\sum b_k$ converges. Then given $\varepsilon > 0$, by the Cauchy criterion (Theorem 3.7.3) there exists an integer $n_o \ge k_o$ such that
\[ \sum_{k=m+1}^{n} b_k < \frac{\varepsilon}{M} \]
for all $n > m \ge n_o$. Thus $0 \le \sum_{k=m+1}^{n} a_k < \varepsilon$ for all $n > m \ge n_o$, and hence by the Cauchy criterion $\sum a_k$ converges. On the other hand, if $\sum a_k$ diverges, then $\sum b_k$ must also diverge. □
As a corollary of the previous theorem we also have the following version
of the comparison test.
COROLLARY 7.1.3 (Limit Comparison Test) Suppose $\sum a_k$ and $\sum b_k$ are two given series of positive real numbers.

(a) If $\lim_{n\to\infty} \dfrac{a_n}{b_n} = L$ with $0 < L < \infty$, then $\sum a_k$ converges if and only if $\sum b_k$ converges.

(b) If $\lim_{n\to\infty} \dfrac{a_n}{b_n} = 0$ and $\sum b_k$ converges, then $\sum a_k$ converges.

Proof. The proof, the details of which are left to the exercises (Exercise 6), follows immediately from the definition of the limit and the comparison test.

Remark. If $\lim_{n\to\infty} a_n/b_n = 0$ and $\sum a_k$ converges, then nothing can be concluded about the convergence of the series $\sum b_k$. In Example 7.1.4(d), we provide an example of a divergent series $\sum b_k$ and a convergent series $\sum a_k$ for which $\lim_{n\to\infty} a_n/b_n = 0$. On the other hand, in Exercise 23, given a convergent series $\sum a_k$ with $a_k > 0$, you will be asked to construct a convergent series $\sum b_k$ with $b_k > 0$ such that $\lim_{n\to\infty} a_n/b_n = 0$.

EXAMPLES 7.1.4 (a) As an application of the comparison test consider the series
\[ \sum_{k=1}^{\infty} \frac{k}{3^k}. \]
We will compare the given series with the convergent geometric series $\sum (1/2)^k$. Thus we wish to show that there exists $k_o \in \mathbb{N}$ such that $k/3^k \le 1/2^k$ for all $k \ge k_o$. Since $2/3 < 1$, by Theorem 3.2.6(d)
\[ \lim_{k\to\infty} k \Bigl( \frac{2}{3} \Bigr)^k = 0. \]
Thus by taking $\varepsilon = 1$, there exists an integer $k_o$ such that $k(2/3)^k \le 1$ for all $k \ge k_o$. As a consequence,
\[ \frac{k}{3^k} \le \frac{1}{2^k} \quad\text{for all } k \ge k_o. \]
Since $\sum (1/2)^k < \infty$, by the comparison test the given series also converges. A similar argument can be used to prove that $\sum k^n / a^k$ converges for any $n \in \mathbb{Z}$ and $a \in \mathbb{R}$ with $a > 1$ (Exercise 2(l)). This is accomplished by comparing the given series with $\sum b^{-k}$, where $1 < b < a$.

(b) As our second example consider the series
\[ \sum_{k=1}^{\infty} \sqrt{\frac{k+1}{2k^3+1}}. \]
In order to use the comparison test we have to determine what series we want to compare with. Since
\[ \sqrt{\frac{k+1}{2k^3+1}} = \frac{1}{k} \sqrt{\frac{1 + \frac{1}{k}}{2 + \frac{1}{k^3}}} \quad\text{and}\quad \lim_{k\to\infty} \sqrt{\frac{1 + \frac{1}{k}}{2 + \frac{1}{k^3}}} = \tfrac{1}{2}\sqrt{2}, \]
we will compare the given series with the series $\sum 1/k$. This series is known to diverge (Example 3.7.4). If we take $\varepsilon = \tfrac{1}{4}\sqrt{2}$, then we can conclude that there exists $k_o \in \mathbb{N}$ such that
\[ \sqrt{\frac{1 + \frac{1}{k}}{2 + \frac{1}{k^3}}} \ge \tfrac{1}{2}\sqrt{2} - \varepsilon = \tfrac{1}{4}\sqrt{2} \]
for all $k \ge k_o$. Thus there exists a positive constant $M$ and $k_o \in \mathbb{N}$ such that
\[ \sqrt{\frac{k+1}{2k^3+1}} \ge M\,\frac{1}{k} \]
for all $k \ge k_o$, and as a consequence, the given series diverges.

(c) The divergence of the series $\sum_{k=1}^{\infty} \sqrt{\dfrac{k+1}{2k^3+1}}$ can also be obtained by the limit comparison test. Comparing the terms of the given series to the terms of the series $\sum 1/k$ we have
\[ \lim_{k\to\infty} \frac{\sqrt{\frac{k+1}{2k^3+1}}}{\frac{1}{k}} = \lim_{k\to\infty} \sqrt{\frac{1 + \frac{1}{k}}{2 + \frac{1}{k^3}}} = \frac{\sqrt{2}}{2}. \]
Thus since the series $\sum 1/k$ diverges, by the limit comparison test the series $\sum_{k=1}^{\infty} \sqrt{\dfrac{k+1}{2k^3+1}}$ also diverges.

(d) Let $a_k = 2^{-k}$ and $b_k = 1/k$. Then $\sum a_k$ converges, $\sum b_k$ diverges, and
\[ \lim_{n\to\infty} \frac{a_n}{b_n} = \lim_{n\to\infty} \frac{n}{2^n} = 0. \]
Thus if $\lim_{n\to\infty} a_n/b_n = 0$, convergence of the series $\sum a_k$ does not imply convergence of the series $\sum b_k$. □
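The termwise comparison in Example 7.1.4(a) can be inspected numerically. The sketch below, in Python with exact rational arithmetic, is only an illustration: the cutoff `N = 60` is an arbitrary assumption, and the value 3/4 it compares against comes from the standard identity $\sum_{k \ge 1} k x^k = x/(1-x)^2$ for $|x| < 1$, which is not stated in the text.

```python
from fractions import Fraction

N = 60  # number of terms inspected; an arbitrary illustrative cutoff

a = [Fraction(k, 3**k) for k in range(1, N + 1)]  # a_k = k / 3^k
b = [Fraction(1, 2**k) for k in range(1, N + 1)]  # b_k = (1/2)^k

# Termwise domination a_k <= b_k; for this series it already holds from k = 1.
dominated = all(ak <= bk for ak, bk in zip(a, b))

partial_a = sum(a)  # partial sum of the given series
partial_b = sum(b)  # partial sum of the geometric series, always below 1

print(dominated, float(partial_a), float(partial_b))
```

Because $a_k \le b_k$, the tail of $\sum a_k$ beyond $N$ terms is at most the geometric tail $2^{-N}$, which is why the partial sum sits so close to its limit.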

Integral Test
Our second major convergence test is the integral test. Recall from Section
6.4, if f is a real-valued function defined on [a, ∞) with f ∈ R[a, c] for every
c > a, then the improper Riemann integral of f is defined by
\[ \int_a^{\infty} f(x)\,dx = \lim_{c\to\infty} \int_a^{c} f(x)\,dx, \]
provided the limit exists. If $f(x) \ge 0$, we use the notation
\[ \int_a^{\infty} f(x)\,dx < \infty \quad \Bigl( \text{or } \int_a^{\infty} f(x)\,dx = \infty \Bigr) \]
to denote that the improper integral of $f$ converges (diverges).

THEOREM 7.1.5 (Integral Test) Let $\{a_k\}_{k=1}^{\infty}$ be a decreasing sequence of nonnegative real numbers, and let $f$ be a nonnegative monotone decreasing function on $[1, \infty)$ satisfying $f(k) = a_k$ for all $k \in \mathbb{N}$. Then
\[ \sum_{k=1}^{\infty} a_k < \infty \quad\text{if and only if}\quad \int_1^{\infty} f(x)\,dx < \infty. \]

[FIGURE 7.1: Integral test]

Proof. Since $f$ is monotone on $[1, \infty)$, by Theorem 6.1.8 it is Riemann integrable on $[1, c]$ for every $c > 1$. Let $n \in \mathbb{N}$, $n \ge 2$, and consider the partition $P = \{1, 2, \dots, n\}$ of $[1, n]$. Since $f$ is decreasing, for each $k = 2, 3, \dots, n$ (see Figure 7.1),
\[ \sup\{f(t) : t \in [k-1, k]\} = f(k-1) = a_{k-1}, \quad\text{and}\quad \inf\{f(t) : t \in [k-1, k]\} = f(k) = a_k. \]
Therefore
\[ \sum_{k=2}^{n} a_k = L(P, f) \le \int_1^{n} f(x)\,dx \le U(P, f) = \sum_{k=1}^{n-1} a_k, \]
from which the result now follows. □
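The two-sided estimate at the end of the proof can be checked on a concrete case. The following Python sketch assumes the illustrative choice $f(x) = 1/x^2$ (so $a_k = 1/k^2$), for which the integral over $[1, n]$ equals $1 - 1/n$ exactly; the function name `integral_test_bounds` is hypothetical.

```python
from fractions import Fraction


def integral_test_bounds(n):
    """Return (lower sum, integral, upper sum) for f(x) = 1/x^2 on [1, n]."""
    lower = sum(Fraction(1, k * k) for k in range(2, n + 1))  # L(P, f) = a_2 + ... + a_n
    upper = sum(Fraction(1, k * k) for k in range(1, n))      # U(P, f) = a_1 + ... + a_{n-1}
    integral = 1 - Fraction(1, n)                             # exact integral of x^(-2) over [1, n]
    return lower, integral, upper


for n in (2, 10, 100):
    lo, mid, hi = integral_test_bounds(n)
    print(n, float(lo), float(mid), float(hi))
```

Since the integrals stay below 1 for every $n$, the lower sums are bounded, which is the convergence half of the test for this example.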

EXAMPLES 7.1.6 (a) As our first application of the integral test we consider the p-series
\[ \sum_{k=1}^{\infty} \frac{1}{k^p}. \]
When $p = 1$ this series is referred to as the harmonic series. If $p \le 0$, then $\{k^{-p}\}$ does not converge to zero, and thus by Corollary 3.7.5 the series diverges. Suppose $p > 0$, $p \ne 1$. Let $f(x) = x^{-p}$, which is decreasing on $[1, \infty)$. Then
\[ \int_1^{c} x^{-p}\,dx = \frac{1}{p-1} \Bigl( 1 - \frac{1}{c^{p-1}} \Bigr). \]
Therefore,
\[ \int_1^{\infty} x^{-p}\,dx = \lim_{c\to\infty} \int_1^{c} x^{-p}\,dx = \begin{cases} \dfrac{1}{p-1}, & p > 1, \\ \infty, & p < 1. \end{cases} \]
Thus by the integral test the series diverges for $p < 1$ and converges for $p > 1$. When $p = 1$, then by Example 6.3.5,
\[ \int_1^{c} \frac{1}{x}\,dx = \ln c. \]
Since $\lim_{x\to\infty} \ln x = -\lim_{t\to 0^+} \ln t = \infty$ (Example 6.4.2(a)), by the integral test the series also diverges for $p = 1$. Summarizing our results we find
\[ \sum_{k=1}^{\infty} \frac{1}{k^p} \quad \begin{cases} \text{converges}, & \text{if } p > 1, \\ \text{diverges}, & \text{if } p \le 1. \end{cases} \]

(b) As our second application of the integral test we consider the series
\[ \sum_{k=2}^{\infty} \frac{1}{k \ln k}. \]
Let $f(x) = (x \ln x)^{-1}$, $x \in [2, \infty)$. Since
\[ f'(x) = -\frac{1 + \ln x}{(x \ln x)^2}, \]
$f'(x) < 0$ for all $x > 1$. Thus $f$ is monotone decreasing on $[2, \infty)$. But
\[ \lim_{c\to\infty} \int_2^{c} \frac{1}{x \ln x}\,dx = \lim_{c\to\infty} \bigl[ \ln(\ln c) - \ln(\ln 2) \bigr] = \infty. \]
Thus by the integral test the given series diverges.

(c) As our final example, we consider the series
\[ \sum_{k=2}^{\infty} \frac{\ln k}{k^p}, \quad p \in \mathbb{R}. \]
We first consider the case $p = 1$. Since $f(x) = (\ln x)/x$ is decreasing on $[e, \infty)$ and
\[ \lim_{c\to\infty} \int_e^{c} \frac{\ln x}{x}\,dx = \lim_{c\to\infty} \tfrac{1}{2} \bigl[ (\ln c)^2 - 1 \bigr] = \infty, \]
by the integral test the series $\sum_{k=2}^{\infty} (\ln k)/k$ diverges.
Suppose now that $p > 1$. Write $p = q + r$, where $q > 1$ and $r > 0$. By l’Hospital’s rule, $\lim_{x\to\infty} (\ln x)/x^r = 0$. Thus there exists $k_o \in \mathbb{N}$ such that $(\ln k)/k^r \le 1$ for all $k \ge k_o$. As a consequence
\[ \frac{\ln k}{k^p} = \frac{1}{k^q}\,\frac{\ln k}{k^r} \le \frac{1}{k^q} \]
for all $k \ge k_o$. Since $q > 1$ the series $\sum 1/k^q$ converges. Hence by the comparison test the series $\sum (\ln k)/k^p$ also converges when $p > 1$.
Finally, if $p < 1$, then again the divergence of $\sum 1/k^p$ implies that $\sum (\ln k)/k^p$ also diverges. This follows from the fact that $(\ln k)/k^p \ge (\ln 2)/k^p$ for all $k \in \mathbb{N}$, $k \ge 2$. Summarizing our results we find
\[ \sum_{k=2}^{\infty} \frac{\ln k}{k^p} \quad \begin{cases} \text{converges}, & \text{if } p > 1, \\ \text{diverges}, & \text{if } p \le 1. \end{cases} \]

Root and Ratio Test


We now consider the well known root and ratio tests. Although useful in
determining the convergence or divergence of certain types of series, both of
these tests are also very important in the study of power series. The ratio
test, attributed to Jean d’Alembert (1717–1783), is particularly applicable to
series involving factorials. The root test, due to Cauchy, will be used in the
next chapter to define the interval of convergence of a power series. Because
of the close similarity between these two tests we state them together.
Prior to stating the ratio and root tests, we recall the definition of the limit inferior and limit superior of a sequence of real numbers (Definition 3.5.1). If $\{s_n\}$ is a sequence in $\mathbb{R}$, then
\[ \liminf_{n\to\infty} s_n = \sup_{k \in \mathbb{N}} \inf\{s_n : n \ge k\} = \lim_{k\to\infty} \inf\{s_n : n \ge k\}, \quad\text{and} \]
\[ \limsup_{n\to\infty} s_n = \inf_{k \in \mathbb{N}} \sup\{s_n : n \ge k\} = \lim_{k\to\infty} \sup\{s_n : n \ge k\}. \]
By Theorem 3.5.7, if $E$ denotes the set of subsequential limits of $\{s_n\}$ in $\mathbb{R} \cup \{-\infty, \infty\}$, then
\[ \liminf_{n\to\infty} s_n = \inf E \quad\text{and}\quad \limsup_{n\to\infty} s_n = \sup E. \]
In particular, if the sequence $\{s_n\}$ converges (or diverges to either $-\infty$ or $\infty$), then $\liminf s_n = \limsup s_n = \lim s_n$.
THEOREM 7.1.7 (Ratio Test) Let $\sum a_k$ be a series of positive terms, and let
\[ R = \limsup_{k\to\infty} \frac{a_{k+1}}{a_k}, \quad\text{and}\quad r = \liminf_{k\to\infty} \frac{a_{k+1}}{a_k}. \]

(a) If $R < 1$, then $\sum_{k=1}^{\infty} a_k < \infty$.

(b) If $r > 1$, then $\sum_{k=1}^{\infty} a_k = \infty$.

(c) If $r \le 1 \le R$, then the test is inconclusive.

THEOREM 7.1.8 (Root Test) Let $\sum a_k$ be a series of nonnegative terms, and let
\[ \alpha = \limsup_{k\to\infty} \sqrt[k]{a_k}. \]

(a) If $\alpha < 1$, then $\sum_{k=1}^{\infty} a_k < \infty$.

(b) If $\alpha > 1$, then $\sum_{k=1}^{\infty} a_k = \infty$.

(c) If $\alpha = 1$, then the test is inconclusive.

Before proving the theorems we give several examples that illustrate these
two convergence tests.

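EXAMPLES 7.1.9 (a) Consider the p-series $\sum_{k=1}^{\infty} \dfrac{1}{k^p}$ for $p > 0$. For this series,
\[ r = R = \alpha = 1 \]
for any $p > 0$; thus both tests are inconclusive. The series however diverges if $0 < p \le 1$ and converges for $p > 1$.

(b) As our second example we consider
\[ \sum_{k=1}^{\infty} \frac{p^k}{k!}, \quad 0 < p < \infty. \]
Here $a_k = p^k/k!$. Thus
\[ \lim_{k\to\infty} \frac{a_{k+1}}{a_k} = \lim_{k\to\infty} \frac{p^{k+1}\,k!}{p^k\,(k+1)!} = p \lim_{k\to\infty} \frac{1}{k+1} = 0. \]
By the ratio test the series converges for all $p$, $0 < p < \infty$. In this example, the presence of $k!$ makes the root test difficult to use.

(c) Consider $\sum a_n$ where
\[ a_n = \begin{cases} \dfrac{1}{2^k}, & \text{if } n = 2k, \\[1ex] \dfrac{1}{3^k}, & \text{if } n = 2k + 1. \end{cases} \]
Here
\[ \sum_{n=1}^{\infty} a_n = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots. \]
By computation,
\[ \frac{a_{n+1}}{a_n} = \begin{cases} \Bigl( \dfrac{2}{3} \Bigr)^k, & n = 2k, \\[1ex] \dfrac{1}{2} \Bigl( \dfrac{3}{2} \Bigr)^k, & n = 2k + 1. \end{cases} \]
For the subsequence of $\{a_{n+1}/a_n\}$ with even $n$,
\[ \lim_{k\to\infty} \frac{a_{2k+1}}{a_{2k}} = \lim_{k\to\infty} \Bigl( \frac{2}{3} \Bigr)^k = 0. \]
For the subsequence with odd $n$,
\[ \lim_{k\to\infty} \frac{a_{2k+2}}{a_{2k+1}} = \frac{1}{2} \lim_{k\to\infty} \Bigl( \frac{3}{2} \Bigr)^k = \infty. \]
Thus $\{0, \infty\}$ is the set of subsequential limits of $\{a_{n+1}/a_n\}$. Consequently, $r = 0$ and $R = \infty$. Therefore the ratio test is inconclusive. On the other hand
\[ \sqrt[n]{a_n} = \begin{cases} \dfrac{1}{\sqrt{2}}, & n = 2k, \\[1ex] \dfrac{1}{\sqrt{3}} \bigl( \sqrt{3} \bigr)^{1/(2k+1)}, & n = 2k + 1. \end{cases} \]
By Theorem 3.2.6(b), $\lim_{k\to\infty} (\sqrt{3})^{1/(2k+1)} = 1$. Thus the sequence $\{\sqrt[n]{a_n}\}$ has two subsequential limits; namely $1/\sqrt{2}$ and $1/\sqrt{3}$. Therefore $\alpha = 1/\sqrt{2}$, and the series converges by the root test. □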
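The behaviour described in Example 7.1.9(c) is easy to observe numerically. The Python sketch below uses floating-point arithmetic and arbitrary illustrative index ranges (both are assumptions of the sketch, not part of the text): the even-index ratios shrink toward 0, the odd-index ratios grow without bound, and the $n$th roots settle near $1/\sqrt{2}$ and $1/\sqrt{3}$.

```python
import math


def a(n):
    """a_n = 1/2^k if n = 2k, and a_n = 1/3^k if n = 2k + 1 (n >= 1)."""
    k = n // 2
    return 0.5 ** k if n % 2 == 0 else (1 / 3) ** k


ratios_even = [a(n + 1) / a(n) for n in range(2, 60, 2)]  # equals (2/3)^k, tends to 0
ratios_odd = [a(n + 1) / a(n) for n in range(1, 60, 2)]   # equals (1/2)(3/2)^k, unbounded

# n-th roots far out in the sequence: the largest values approximate the limit superior.
tail_roots = [a(n) ** (1 / n) for n in range(100, 200)]

print(ratios_even[-1], ratios_odd[-1])
print(max(tail_roots), 1 / math.sqrt(2))
print(min(tail_roots), 1 / math.sqrt(3))
```

The largest tail root is the quantity the root test looks at, and it stays below 1 even though the ratios oscillate wildly.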

Proof of Ratio Test. Suppose
\[ \limsup_{n\to\infty} \frac{a_{n+1}}{a_n} = R < 1. \]
Choose $c$ such that $R < c < 1$. Then by Theorem 3.5.3(a) there exists a positive integer $n_o$ such that
\[ \frac{a_{n+1}}{a_n} < c \quad\text{for all } n \ge n_o. \]
In particular, $a_{n_o+1} < c\,a_{n_o}$. By induction on $m$, $a_{n_o+m} < c^m a_{n_o}$. Therefore, writing $n = n_o + m$, $m \ge 0$,
\[ a_n \le M c^n \quad\text{for all } n \ge n_o, \]
where $M = a_{n_o}/c^{n_o}$. Thus since $0 < c < 1$, by Example 3.7.2(a) the series $\sum c^n$ converges. Therefore $\sum a_n$ also converges by the comparison test.
Suppose
\[ \liminf_{n\to\infty} \frac{a_{n+1}}{a_n} = r > 1. \]
Again choose $c$ so that $r > c > 1$. As above, there exists a positive integer $n_o$ such that $a_n \ge M c^n$ for some constant $M$ and all $n \ge n_o$. But since $c > 1$, $\sum c^n = \infty$, and thus by the comparison test, $\sum a_n = \infty$. □
Proof of Root Test. Let
\[ \alpha = \limsup_{n\to\infty} \sqrt[n]{a_n}. \]
Suppose $\alpha < 1$. Choose $c$ so that $\alpha < c < 1$. Again by Theorem 3.5.3 there exists a positive integer $n_o$ such that
\[ \sqrt[n]{a_n} < c \quad\text{for all } n \ge n_o. \]
But then $a_n \le c^n$ for all $n \ge n_o$, and $\sum a_n < \infty$ by the comparison test.
If $\alpha > 1$, then $\sqrt[n]{a_n} > 1$ for infinitely many $n$. Thus $a_n > 1$ for infinitely many $n$, and as a consequence $\{a_n\}$ does not converge to zero. Hence by Corollary 3.7.5 the series $\sum a_n$ diverges. □
Example 7.1.9(c) provides an example of a series for which the ratio test
is inconclusive, but the root test worked. Thus it appears that the root test is
a stronger test, a fact which is confirmed by the following theorem.

THEOREM 7.1.10 Let $\{a_n\}$ be a sequence of positive numbers. Then
\[ \liminf_{n\to\infty} \frac{a_{n+1}}{a_n} \le \liminf_{n\to\infty} \sqrt[n]{a_n} \le \limsup_{n\to\infty} \sqrt[n]{a_n} \le \limsup_{n\to\infty} \frac{a_{n+1}}{a_n}. \]
Remark. As a consequence of the theorem, if a series $\sum a_k$ converges by virtue of the ratio test, i.e., $R < 1$, then we also have $\alpha < 1$. Similarly, if $\sum a_k$ diverges by virtue of the ratio test, i.e., $r > 1$, then we also have $\alpha > 1$. Thus if the ratio test proves convergence or divergence of the series, so does the root test. The converse however, as indicated above, is false.
Proof. Let
\[ R = \limsup_{n\to\infty} \frac{a_{n+1}}{a_n}. \]
If $R = \infty$, then there is nothing to prove. Thus assume that $R < \infty$, and let $\beta > R$ be arbitrary. Then there exists a positive integer $n_o$ such that
\[ \frac{a_{n+1}}{a_n} \le \beta \quad\text{for all } n \ge n_o. \]
As in the proof of the ratio test, this gives $a_n \le M \beta^n$ for all $n \ge n_o$, where $M = a_{n_o}/\beta^{n_o}$. Hence
\[ \sqrt[n]{a_n} \le \beta\,\sqrt[n]{M} \quad\text{for all } n \ge n_o. \]
Since $\lim_{n\to\infty} \sqrt[n]{M} = 1$, we have
\[ \limsup_{n\to\infty} \sqrt[n]{a_n} \le \beta. \]
Since $\beta > R$ was arbitrary, $\limsup_{n\to\infty} \sqrt[n]{a_n} \le R$, which proves the result. The inequality for the limit inferior is proved similarly. □

EXAMPLE 7.1.11 If $a_k = k!$, then $\lim (a_{k+1}/a_k) = \infty$. Thus as a consequence of the previous theorem,
\[ \lim_{k\to\infty} \sqrt[k]{k!} = \infty. \qquad \square \]
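The growth of $\sqrt[k]{k!}$ in Example 7.1.11 can be observed directly. The Python sketch below evaluates it through the log-gamma function, using $\ln k! = \texttt{lgamma}(k+1)$, to avoid overflow; the sample indices and the helper name `kth_root_of_factorial` are illustrative assumptions.

```python
import math


def kth_root_of_factorial(k):
    """(k!)^(1/k), computed as exp(ln(k!) / k) with ln(k!) = lgamma(k + 1)."""
    return math.exp(math.lgamma(k + 1) / k)


sample_indices = (1, 10, 100, 1000, 10000)
values = [kth_root_of_factorial(k) for k in sample_indices]

for k, v in zip(sample_indices, values):
    print(k, v)
```

The printed values keep increasing with $k$, in line with the limit being infinite; a finite table of course illustrates the statement rather than proving it.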
k→∞

Exercises 7.1
1. If $a$ and $b$ are positive real numbers, prove that
\[ \sum_{k=1}^{\infty} \frac{1}{(ak + b)^p} \]
converges if $p > 1$ and diverges if $p \le 1$.


2. Test each of the following series for convergence:

P∞ k ∞
P k ∞ k2
P
*a. 2
b. 2
*c. k
k=1 k + 1 k=1 k + 2k − 1 k=1 2

P 3 −k ∞
P 3 k ∞
P 3 k
d. k e *e. 3
f.
k=1 k=1 k k=1 k! √

P∞ (k!)2 ∞ k!
P P∞ k+1− k
*g. h. k
*i.
k=1 (2k)! k=1 k k=1 k
∞ √ ∞ 1
P k k P
j. ( k − 1) *k. 2
k=1 k=2 (ln k)

P k n ∞
sin(1/kp ), p > 0
P
l. k
, a > 1, n ∈ Z *m.
k=1 a k=1
∞ ∞ 1
 
1
cos(1/kp ), p > 0 *o.
P P
n. ln 1 +
k=1 k=1 k k
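3. For each of the following, determine all values of $p \in \mathbb{R}$ for which the given geometric series converges, and find the sum of the series.
*a. $\sum_{k=0}^{\infty} (\sin p)^k$    b. $\sum_{k=1}^{\infty} \Bigl( \dfrac{p}{3} \Bigr)^{2k}$    c. $\sum_{k=0}^{\infty} \Bigl( \dfrac{1+p}{1-p} \Bigr)^k$, $p \ne 1$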
P
4. Suppose ak ≥ 0 for all k ∈ N and ak < ∞. For each of the following,
either prove that the given series converges or provide an example for
which the series diverges.
∞ ak ∞ ∞ √
a2k
P P P
a. *b. c. ak
k=1 1 + a k k=1 k=1r

P P √

k P∞ ak
*d. kak e. k ak *f.
k=1 k=1 k=1 k
P
g. ank where {ank } is a subsequence of {an }
5. *Determine all values of $p$ and $q$ for which the following series converges:
\[ \sum_{k=2}^{\infty} \frac{1}{k^q (\ln k)^p}. \]
(Hint: Consider the three cases $q > 1$, $q = 1$, $q < 1$.)
6. *Prove Corollary 7.1.3.
7. If $\sum a_k$ converges and $\sum b_k = \infty$, prove that $\sum (a_k + b_k) = \infty$.
8. *Suppose $\{a_n\}$ is a sequence in $\mathbb{R}$ with $a_n > 0$ for all $n \in \mathbb{N}$. For each $k \in \mathbb{N}$ set
\[ b_k = \frac{1}{k} \sum_{n=1}^{k} a_n. \]
Prove that $\sum_{k=1}^{\infty} b_k$ diverges.

9. Determine all values of $p$ for which the following series converges:
\[ \sum_{k=1}^{\infty} \frac{1}{k^p} \Bigl( \sum_{n=1}^{k} \frac{1}{n^p} \Bigr) \]
10. Suppose that the series $\sum a_k$ converges and $\{n_j\}$ is a strictly increasing sequence of positive integers. Define the sequence $\{b_k\}$ as follows.
\begin{align*}
b_1 &= a_1 + \cdots + a_{n_1} \\
b_2 &= a_{n_1+1} + \cdots + a_{n_2} \\
&\ \ \vdots \\
b_k &= a_{n_{k-1}+1} + \cdots + a_{n_k}.
\end{align*}
Prove that $\sum b_k$ converges and that $\sum_{k=1}^{\infty} b_k = \sum_{k=1}^{\infty} a_k$. (This exercise proves that if the series $\sum a_k$ converges, then any series obtained from $\sum a_k$ by inserting parentheses also converges to the same sum. The following exercise shows that removing parentheses may lead to difficulties.)
11. *Give an example of a series $\sum a_k$ such that $\sum_{k=1}^{\infty} (a_{2k-1} + a_{2k})$ converges, but $\sum a_k$ diverges.
12. *Suppose that the series $\sum a_k$ of positive real numbers converges by virtue of the root or ratio test. Show that the series $\sum_{k=1}^{\infty} k^n a_k$ converges for all $n \in \mathbb{N}$.
13. *Show that the series
\[ \frac{1}{1^2} + \frac{1}{2^3} + \frac{1}{3^2} + \frac{1}{4^3} + \cdots \]
converges, but that both the ratio and root tests are inconclusive.
14. Apply the root and ratio tests to the series $\sum a_k$ where
\[ a_k = \begin{cases} \dfrac{1}{2^k}, & \text{when } k \text{ is even}, \\[1ex] \dfrac{1}{2^{k+2}}, & \text{when } k \text{ is odd}. \end{cases} \]
15. Suppose $a_k \ge 0$ for all $k \in \mathbb{N}$. Prove that the series $\sum_{k=1}^{\infty} a_k$ converges if and only if some subsequence $\{s_{n_k}\}$ of the sequence $\{s_n\}$ of partial sums converges.
16. *(Cauchy Condensation Test) Suppose that $a_1 \ge a_2 \ge a_3 \ge \cdots \ge 0$. Use the previous exercise to prove that $\sum_{k=1}^{\infty} a_k$ converges if and only if $\sum_{k=0}^{\infty} 2^k a_{2^k}$ converges.
17. Use the Cauchy condensation test to show that $\sum_{n=1}^{\infty} \dfrac{1}{n^p}$ converges for all $p > 1$, and diverges for all $p$, $0 < p \le 1$.
18. Use the Cauchy condensation test to determine the convergence or divergence of each of the following series.
*a. $\sum_{k=2}^{\infty} \dfrac{1}{k \ln k}$    b. $\sum_{k=2}^{\infty} \dfrac{1}{k (\ln k)^p}$ $(p > 1)$    c. $\sum_{k=3}^{\infty} \dfrac{1}{k (\ln k)(\ln(\ln k))}$
19. *For $k \in \mathbb{N}$ let $c_k$ be defined by
\[ c_k = 1 + \frac{1}{2} + \cdots + \frac{1}{k} - \ln k. \]
Prove that $\{c_k\}$ is a monotone decreasing sequence of positive numbers that is bounded below. The limit $c$ of the sequence is called Euler’s constant. Show that $c$ is approximately 0.577.
20. (Raabe’s Test) Let $a_k > 0$ for all $k \in \mathbb{N}$. Prove the following:
a. If $a_{k+1}/a_k \le 1 - r/k$ for some $r > 1$ and all $k \ge k_o$, $k_o \in \mathbb{N}$, then $\sum a_k < \infty$. (Hint: Show that $(k - 1)a_k - k a_{k+1} \ge (r - 1)a_k$ for all $k \ge k_o$.)
b. If $a_{k+1}/a_k \ge 1 - 1/k$ for all $k \ge k_o$, $k_o \in \mathbb{N}$, then $\sum a_k = \infty$. (Hint: Show that $\{k a_{k+1}\}$ is monotone increasing for $k \ge k_o$.)
21. *If $p, q > 0$, show that the series
\[ \sum_{k=1}^{\infty} \frac{(p+1)(p+2) \cdots (p+k)}{(q+1)(q+2) \cdots (q+k)} \]
converges for $q > p + 1$ and diverges for $q \le p + 1$.
22. For $p > 0$ consider the series
\[ \Bigl( \frac{1}{2} \Bigr)^p + \Bigl( \frac{1 \cdot 3}{2 \cdot 4} \Bigr)^p + \cdots + \Bigl( \frac{1 \cdot 3 \cdots (2k-1)}{2 \cdot 4 \cdots (2k)} \Bigr)^p + \cdots \]
a. Show that the ratio test fails for this series.
b. Use Raabe’s test to show that the series converges for $p > 2$, diverges for $p < 2$, and that the test is inconclusive when $p = 2$.
*c. Prove that the series diverges for $p = 2$.
23. Let $\{a_n\}$ be a sequence of positive real numbers.
*a. Suppose $\sum a_n$ converges. Construct a convergent series $\sum b_n$ with $b_n > 0$ such that $\lim_{n\to\infty} a_n/b_n = 0$.
b. Suppose $\sum a_n$ diverges. Construct a divergent series $\sum b_n$ with $b_n > 0$ such that $\lim_{n\to\infty} b_n/a_n = 0$.

7.2 The Dirichlet Test


In this section, we prove the Dirichlet convergence test, named after Peter Lejeune Dirichlet (1805–1859), and then apply it to both alternating series and trigonometric series. The key to the Dirichlet test is the following summation by parts formula of Niels Abel (1802–1829). This formula is the analogue for series of the integration by parts formula.

THEOREM 7.2.1 (Abel Partial Summation Formula) Let $\{a_k\}$ and $\{b_k\}$ be sequences of real numbers. Set
\[ A_0 = 0 \quad\text{and}\quad A_n = \sum_{k=1}^{n} a_k, \quad\text{if } n \ge 1. \]
Then if $1 \le p \le q$,
\[ \sum_{k=p}^{q} a_k b_k = \sum_{k=p}^{q-1} A_k (b_k - b_{k+1}) + A_q b_q - A_{p-1} b_p. \]

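Proof. Since $a_k = A_k - A_{k-1}$,
\[ \begin{aligned} \sum_{k=p}^{q} a_k b_k &= \sum_{k=p}^{q} (A_k - A_{k-1}) b_k \\ &= \sum_{k=p}^{q} A_k b_k - \sum_{k=p-1}^{q-1} A_k b_{k+1} \\ &= \sum_{k=p}^{q-1} A_k (b_k - b_{k+1}) + A_q b_q - A_{p-1} b_p. \quad\square \end{aligned} \]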
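The partial summation formula can be checked numerically. The following is a minimal sketch, not part of the text: the function name `abel_sides` and the sample sequences are assumptions chosen only for illustration, and floating-point arithmetic means the two sides agree up to rounding.

```python
from itertools import accumulate


def abel_sides(a, b, p, q):
    """Return (lhs, rhs) of the Abel partial summation formula for 1 <= p <= q.

    a and b are Python lists holding a_1..a_n and b_1..b_n (so a[0] is a_1).
    """
    A = [0.0] + list(accumulate(a))      # A[0] = 0 and A[n] = a_1 + ... + a_n

    def term(seq, k):
        return seq[k - 1]                # 1-based access to a_k or b_k

    lhs = sum(term(a, k) * term(b, k) for k in range(p, q + 1))
    rhs = sum(A[k] * (term(b, k) - term(b, k + 1)) for k in range(p, q))
    rhs += A[q] * term(b, q) - A[p - 1] * term(b, p)
    return lhs, rhs


# Illustrative (hypothetical) data: a_k = (-1)^(k+1), b_k = 1/k.
a = [(-1.0) ** (k + 1) for k in range(1, 11)]
b = [1.0 / k for k in range(1, 11)]
print(abel_sides(a, b, 3, 8))
```

Both printed numbers should agree to within rounding error for any admissible choice of p and q.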

As an application of the partial summation formula we prove the following theorem of Dirichlet. Another application is given in Exercise 1.

THEOREM 7.2.2 (Dirichlet Test) Suppose $\{a_k\}$ and $\{b_k\}$ are sequences of real numbers satisfying
(a) the partial sums $A_n = \sum_{k=1}^{n} a_k$ form a bounded sequence,
(b) $b_1 \ge b_2 \ge b_3 \ge \cdots \ge 0$, and
(c) $\lim_{k\to\infty} b_k = 0$.
Then $\sum_{k=1}^{\infty} a_k b_k$ converges.

Proof. Since $\{A_n\}$ is a bounded sequence, we can choose $M > 0$ such that $|A_n| \le M$ for all $n$. Also, since $b_n \to 0$, given $\epsilon > 0$, there exists a positive integer $n_o$ such that $b_n < \epsilon/(2M)$ for all $n \ge n_o$. Thus, if $n_o \le p \le q$, by the partial summation formula
\[ \begin{aligned} \left| \sum_{k=p}^{q} a_k b_k \right| &= \left| \sum_{k=p}^{q-1} A_k (b_k - b_{k+1}) + A_q b_q - A_{p-1} b_p \right| \\ &\le \sum_{k=p}^{q-1} |A_k| (b_k - b_{k+1}) + |A_q|\, b_q + |A_{p-1}|\, b_p \\ &\le M \left[ \sum_{k=p}^{q-1} (b_k - b_{k+1}) + b_q + b_p \right] \\ &\le 2 M b_p < \epsilon. \end{aligned} \]
Hence by the Cauchy criterion (Theorem 3.7.3), $\sum_{k=1}^{\infty} a_k b_k$ converges. □

Alternating Series
Our first application of the Dirichlet test is to alternating series. An alternating series is a series of the form $\sum (-1)^k b_k$ or $\sum (-1)^{k+1} b_k$, with $b_k \ge 0$ for all $k$.

THEOREM 7.2.3 (Alternating Series Test) If $\{b_k\}$ is a sequence of real numbers satisfying
(a) $b_1 \ge b_2 \ge \cdots \ge 0$, and
(b) $\lim_{k\to\infty} b_k = 0$,
then $\sum_{k=1}^{\infty} (-1)^{k+1} b_k$ converges.

Proof. Let $a_k = (-1)^{k+1}$. Then $|A_n| \le 1$ for all $n$, and the Dirichlet test applies. □
Remark. The hypothesis (a) that $\{b_k\}$ is decreasing is required. If we only assume that $b_k \ge 0$ and $\lim_{k\to\infty} b_k = 0$, then the conclusion is false (Exercise 2).

For an alternating series satisfying the hypothesis of Theorem 7.2.3, we can actually do better than just prove convergence. The following theorem provides an estimate on the rate of convergence of the partial sums $\{s_n\}$ to the sum of the series.

THEOREM 7.2.4 Consider the series $\sum_{k=1}^{\infty} (-1)^{k+1} b_k$, where the sequence $\{b_k\}$ satisfies the hypothesis of Theorem 7.2.3. Let
\[ s_n = \sum_{k=1}^{n} (-1)^{k+1} b_k \quad\text{and}\quad s = \sum_{k=1}^{\infty} (-1)^{k+1} b_k. \]
Then $|s - s_n| \le b_{n+1}$ for all $n \in \mathbb{N}$.


Proof. Consider the sequence $\{s_{2n}\}$. Since
\[ s_{2n} = \sum_{k=1}^{2n} (-1)^{k+1} b_k = (b_1 - b_2) + \cdots + (b_{2n-1} - b_{2n}), \]
and $(b_{k-1} - b_k) \ge 0$ for all $k$, the sequence $\{s_{2n}\}$ is monotone increasing. Similarly $\{s_{2n+1}\}$ is monotone decreasing. Since $\{s_n\}$ converges to $s$, so do the subsequences $\{s_{2n}\}$ and $\{s_{2n+1}\}$. Therefore $s_{2n} \le s \le s_{2n+1}$ for all $n \in \mathbb{N}$. As a consequence $|s - s_k| \le |s_{k+1} - s_k| = b_{k+1}$ for all $k \in \mathbb{N}$. □

EXAMPLE 7.2.5 Since the sequence $\{1/(2k-1)\}$ decreases to zero, by Theorem 7.2.3 the series
\[ \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{2k-1} \]
converges. As we will see in Chapter 9 (Example 9.5.7(a)), this series converges to $\pi/4$. Thus by Theorem 7.2.4, if $s_n$ is the $n$th partial sum of the series,
\[ \left| \frac{\pi}{4} - s_n \right| \le \frac{1}{2n+1} \]
for all $n \in \mathbb{N}$. Although this can be used to obtain an approximation to $\pi$, the convergence is very slow. We would have to take $n = 50$ to be guaranteed accuracy to two decimal places. □
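The error estimate of Theorem 7.2.4 for this series can be observed numerically. The sketch below is only an illustration: the function name `leibniz_partial_sum` is an assumption, and `math.pi` stands in for the exact value of the sum.

```python
import math


def leibniz_partial_sum(n):
    """Return s_n = sum_{k=1}^{n} (-1)^(k+1) / (2k - 1)."""
    return sum((-1) ** (k + 1) / (2 * k - 1) for k in range(1, n + 1))


# Compare the actual error |pi/4 - s_n| with the bound b_{n+1} = 1/(2n + 1).
for n in (5, 50, 500):
    error = abs(math.pi / 4 - leibniz_partial_sum(n))
    bound = 1 / (2 * n + 1)
    print(n, error, bound, error <= bound)
```

The bound is a guarantee rather than the exact error, so the observed error is typically somewhat smaller than $1/(2n+1)$.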

Trigonometric Series
Our next application of the Dirichlet test is to the convergence of trigonometric series. These types of series will be studied in much greater detail in Chapter 9.

THEOREM 7.2.6 (Trigonometric Series) Suppose $\{b_k\}$ is a sequence of real numbers satisfying $b_1 \ge b_2 \ge \cdots \ge 0$, and $\lim_{k\to\infty} b_k = 0$. Then
(a) $\sum_{k=1}^{\infty} b_k \sin kt$ converges for all $t \in \mathbb{R}$, and
(b) $\sum_{k=1}^{\infty} b_k \cos kt$ converges for all $t \in \mathbb{R}$, except perhaps $t = 2p\pi$, $p \in \mathbb{Z}$.

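Proof. To prove the result, we require the following two identities: For $t \ne 2p\pi$, $p \in \mathbb{Z}$,
\[ \sum_{k=1}^{n} \sin kt = \frac{\cos \frac{1}{2}t - \cos(n + \frac{1}{2})t}{2 \sin \frac{1}{2}t}, \tag{1} \]
\[ \sum_{k=1}^{n} \cos kt = \frac{\sin(n + \frac{1}{2})t - \sin \frac{1}{2}t}{2 \sin \frac{1}{2}t}. \tag{2} \]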
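We will prove (1), leaving (2) for the exercises (Exercise 4). Set $A_n = \sum_{k=1}^{n} \sin kt$. Using the trigonometric identity $\sin A \sin B = \frac{1}{2}[\cos(A - B) - \cos(A + B)]$ we obtain
\[ \begin{aligned} (\sin \tfrac{1}{2}t)\, A_n &= \sum_{k=1}^{n} \sin \tfrac{1}{2}t \, \sin kt \\ &= \frac{1}{2} \sum_{k=1}^{n} \left[ \cos(k - \tfrac{1}{2})t - \cos(k + \tfrac{1}{2})t \right] \\ &= \frac{1}{2} \left[ \cos \tfrac{1}{2}t - \cos(n + \tfrac{1}{2})t \right]. \end{aligned} \]
Thus for $t \ne 2p\pi$, $p \in \mathbb{Z}$,
\[ A_n = \frac{\cos \frac{1}{2}t - \cos(n + \frac{1}{2})t}{2 \sin \frac{1}{2}t}, \]
and therefore
\[ |A_n| \le \frac{|\cos \frac{1}{2}t| + |\cos(n + \frac{1}{2})t|}{2 |\sin \frac{1}{2}t|} \le \frac{1}{|\sin \frac{1}{2}t|}, \]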

which is finite provided $t \ne 2p\pi$, $p \in \mathbb{Z}$. Consequently, by the Dirichlet test the series in (a) converges for all $t \ne 2p\pi$, $p \in \mathbb{Z}$. If $t = 2p\pi$, $p \in \mathbb{Z}$, then $\sin kt = 0$ for all $k$. Thus the series in (a) converges for all $t \in \mathbb{R}$.
The proof of the convergence of the series in (b) is similar. However in this case, when $t = 2p\pi$, $\cos kt = 1$ for all $k \in \mathbb{N}$ and the given series may or may not converge. □
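Identity (1) and the resulting bound on the partial sums $A_n$ can be checked numerically. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the value $t = 1$ is an arbitrary choice that is not a multiple of $2\pi$.

```python
import math


def sine_sum(n, t):
    """Direct evaluation of A_n = sum_{k=1}^{n} sin(kt)."""
    return sum(math.sin(k * t) for k in range(1, n + 1))


def sine_sum_closed_form(n, t):
    """Right-hand side of identity (1); requires t not a multiple of 2*pi."""
    half = t / 2
    return (math.cos(half) - math.cos((n + 0.5) * t)) / (2 * math.sin(half))


def sine_sum_bound(t):
    """The bound 1 / |sin(t/2)| on |A_n| used in the proof."""
    return 1 / abs(math.sin(t / 2))


t = 1.0
for n in (1, 10, 100, 1000):
    print(n, sine_sum(n, t), sine_sum_closed_form(n, t), sine_sum_bound(t))
```

The first two columns should agree up to rounding, and neither should exceed the bound in absolute value, which is what makes the Dirichlet test applicable.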

EXAMPLE 7.2.7 By Theorem 7.2.6, the series
\[ \sum_{k=1}^{\infty} \frac{1}{k} \cos kt \]
converges for all $t \ne 2p\pi$, $p \in \mathbb{Z}$. When $t = 2p\pi$, $p \in \mathbb{Z}$, then $\cos kt = 1$ for all $k$ and the series $\sum 1/k$ diverges. On the other hand, the series
\[ \sum_{k=1}^{\infty} \frac{1}{k^2} \cos kt \]
converges for all $t \in \mathbb{R}$. □



Exercises 7.2
1. *(Abel's Test) Prove that if $\sum a_k$ converges, and $\{b_k\}$ is monotone and bounded, then $\sum a_k b_k$ converges.
2. *Show by example that the hypothesis of Theorem 7.2.3 cannot be replaced by $b_k \ge 0$ and $\lim_{k\to\infty} b_k = 0$.
3. If $\sum a_k$ converges, does $\sum a_k^2$ always converge?
4. *Prove that
\[ \sum_{k=1}^{n} \cos kt = \frac{\sin(n + \frac{1}{2})t - \sin \frac{1}{2}t}{2 \sin \frac{1}{2}t}, \quad t \ne 2p\pi,\ p \in \mathbb{Z}. \]
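5. Test each of the following series for convergence:
*a. $\sum_{k=1}^{\infty} \dfrac{(-1)^{k+1}}{k^p}$, $p > 0$
b. $\sum_{k=2}^{\infty} \dfrac{(-1)^k \ln k}{k}$
*c. $\sum_{k=2}^{\infty} \dfrac{(-1)^k}{k \ln k}$
*d. $\sum_{k=1}^{\infty} (-1)^{k+1} \dfrac{k^k}{(k+1)^k}$
e. $\sum_{k=1}^{\infty} (-1)^{k+1} \dfrac{k^k}{(k+1)^{k+1}}$
*f. $\sum_{k=2}^{\infty} \dfrac{\sin k}{\ln k}$
g. $\sum_{k=1}^{\infty} \dfrac{\sin kt}{k^p}$, $t \in \mathbb{R}$, $p > 0$
*h. $\sum_{k=1}^{\infty} \dfrac{\cos kt}{k^p}$, $t \in \mathbb{R}$, $p > 0$
i. $\sum_{k=1}^{\infty} (-1)^{k+1} \sin(\pi/k)$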
6. Given that
\[ \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k^2} = \frac{\pi^2}{12}, \]
determine how large $n \in \mathbb{N}$ must be chosen so that $|\pi^2/12 - s_n| < 10^{-4}$, where $s_n$ is the $n$th partial sum of the series.
7. If $p$ and $q$ are strictly positive real numbers, show that
\[ \sum_{k=2}^{\infty} (-1)^k \frac{(\ln k)^p}{k^q} \]
converges.
8. *Suppose that $\sum a_k$ converges. Prove that
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} k a_k = 0. \]
9. As in Exercise 19 of Section 7.1, let $c_k = 1 + \frac{1}{2} + \cdots + \frac{1}{k} - \ln k$. Set
\[ b_n = 1 - \frac{1}{2} + \frac{1}{3} - \cdots - \frac{1}{2n} = \sum_{k=1}^{2n} \frac{(-1)^{k+1}}{k}. \]
Show that $\lim_{n\to\infty} b_n = \ln 2$. (Hint: $b_n = c_{2n} - c_n + \ln 2$.)

7.3 Absolute and Conditional Convergence


In this section, we introduce the concept of absolute convergence of a series. As we will see in this and subsequent sections of the text, the notion of absolute convergence is very important in the study of series. We begin with the definition of absolute and conditional convergence.

DEFINITION 7.3.1 A series $\sum a_k$ of real numbers is said to be absolutely convergent (or converges absolutely) if $\sum |a_k|$ converges. The series is said to be conditionally convergent if it is convergent but not absolutely convergent.

We illustrate these two definitions with the following examples.

EXAMPLES 7.3.2 (a) Since the sequence $\{1/k\}$ decreases to zero, by Theorem 7.2.3 the series $\sum (-1)^{k+1}/k$ converges. However,
\[ \sum_{k=1}^{\infty} \left| \frac{(-1)^{k+1}}{k} \right| = \sum_{k=1}^{\infty} \frac{1}{k} = \infty. \]
Thus the series $\sum (-1)^{k+1}/k$ is conditionally convergent.
(b) Consider the series $\sum (-1)^{k+1}/k^2$. By Theorem 7.2.3 the alternating series converges. Furthermore, since $\sum 1/k^2 < \infty$, the series is absolutely convergent. □

Our first result for absolutely convergent series is as follows:

THEOREM 7.3.3 Every absolutely convergent series of real numbers converges.
Proof. Suppose $\sum a_k$ converges absolutely; i.e., $\sum |a_k| < \infty$. By the triangle inequality, for $1 \le p \le q$,
\[ \left| \sum_{k=p}^{q} a_k \right| \le \sum_{k=p}^{q} |a_k|, \]
and the result now follows by the Cauchy criterion (Theorem 3.7.3). □
Remark. To test a series $\sum a_k$ for absolute convergence we can apply any of the appropriate convergence tests of Section 7.1 to the series $\sum |a_k|$. There is however one important fact which needs to be emphasized. If the series $\sum |a_k|$ diverges by virtue of the ratio or root test, i.e.,
\[ r = \liminf_{n\to\infty} \frac{|a_{n+1}|}{|a_n|} > 1 \quad\text{or}\quad \alpha = \limsup_{n\to\infty} \sqrt[n]{|a_n|} > 1, \]
then not only does $\sum |a_k|$ diverge, but $\sum a_k$ also diverges.
To see this, suppose $\alpha > 1$. Then as in the proof of the root test, $|a_k| > 1$ for infinitely many $k$. Hence the sequence $\{a_k\}$ does not converge to zero, and thus by Corollary 3.7.5, $\sum a_k$ diverges. Similarly, if $r > 1$ and if $1 < c < r$, then as in the proof of Theorem 7.1.7(b), there exists a positive integer $n_o$ and constant $M$ such that
\[ |a_n| \ge M c^n \]
for all $n \ge n_o$. Thus again, since $c > 1$, $\{a_n\}$ does not converge to zero, and the series $\sum a_k$ diverges. We summarize this as follows:
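THEOREM 7.3.4 (Root and Ratio Test) Let $\sum a_k$ be a series of real numbers, and let
\[ \alpha = \limsup_{k\to\infty} \sqrt[k]{|a_k|}. \]
Also, if $a_k \ne 0$ for all $k \in \mathbb{N}$, let
\[ R = \limsup_{k\to\infty} \left| \frac{a_{k+1}}{a_k} \right| \quad\text{and}\quad r = \liminf_{k\to\infty} \left| \frac{a_{k+1}}{a_k} \right|. \]
(a) If $\alpha < 1$ or $R < 1$, then the series $\sum a_k$ is absolutely convergent.
(b) If $\alpha > 1$ or $r > 1$, then the series $\sum a_k$ is divergent.
(c) If $\alpha = 1$ or $r \le 1 \le R$, then the test is inconclusive.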

EXAMPLE 7.3.5 To illustrate the previous theorem we consider the series $\sum_{k=1}^{\infty} p(k)\, b^k$, where $p$ is a polynomial and $b \in \mathbb{R}$ with $|b| < 1$. In this example, $a_k = p(k) b^k$ and thus
\[ \left| \frac{a_{k+1}}{a_k} \right| = |b| \left| \frac{p(k+1)}{p(k)} \right|. \]
Since $\lim_{k\to\infty} |p(k+1)/p(k)| = 1$ (Exercise 8), $\lim_{k\to\infty} |a_{k+1}/a_k| = |b| < 1$. Thus by the ratio test the series is absolutely convergent. □
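The behaviour of the ratio $|a_{k+1}/a_k|$ in this example can be seen numerically. The sketch below is only an illustration: the polynomial $p(k) = k^3 + 2k$ and the value $b = -0.9$ are hypothetical choices, not taken from the text.

```python
def p(k):
    # Hypothetical polynomial chosen only for illustration.
    return k ** 3 + 2 * k


B = -0.9  # hypothetical value of b with |b| < 1


def term(k):
    """a_k = p(k) * b^k."""
    return p(k) * B ** k


def ratio(k):
    """|a_{k+1} / a_k|, which should approach |b| as k grows."""
    return abs(term(k + 1) / term(k))


for k in (1, 10, 100, 500):
    print(k, ratio(k))
```

For small k the ratio can exceed 1, since the polynomial factor still dominates; only the limiting value $|b| = 0.9$ matters for the ratio test.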

Rearrangements of Series
We next take up the topic of rearrangements of series. Loosely speaking, a series $\sum a'_k$ is a rearrangement of the series $\sum a_k$ if all the terms in the original series $\sum a_k$ appear exactly once in the series $\sum a'_k$, but not necessarily in the same order. For example, the series
\[ \frac{1}{1^2} + \frac{1}{3^2} + \frac{1}{2^2} + \frac{1}{4^2} + \frac{1}{5^2} + \frac{1}{7^2} + \frac{1}{6^2} + \cdots \]
is a rearrangement of the series
\[ \sum_{k=1}^{\infty} \frac{1}{k^2}. \]
The following provides a formal definition of this concept.



DEFINITION 7.3.6 A series $\sum a'_k$ is a rearrangement of the series $\sum a_k$ if there exists a one-to-one function $j$ from $\mathbb{N}$ onto $\mathbb{N}$ such that $a'_k = a_{j(k)}$ for all $k \in \mathbb{N}$.
A natural question to ask is the following: If the series $\sum a_k$ converges and $\sum a'_k$ is a rearrangement of $\sum a_k$, does the series $\sum a'_k$ converge? If it converges, does it converge to the same sum? As the following example illustrates, the answer to both of these questions is no!

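EXAMPLE 7.3.7 Consider the series
\[ \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \cdots \]
which converges, but not absolutely. Consider also the following series which is a rearrangement of the above:
\[ 1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} + \cdots. \tag{3} \]
Let
\[ s = \sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k}. \]
As in the proof of Theorem 7.2.4, $s < s_{2n+1}$ for all $n \in \mathbb{N}$. In particular,
\[ s < s_3 = 1 - \frac{1}{2} + \frac{1}{3} = \frac{5}{6}. \]
Let $s'_n$ denote the $n$th partial sum of the series (3). Then
\[ s'_{3n} = \sum_{k=1}^{n} \left( \frac{1}{4k-3} + \frac{1}{4k-1} - \frac{1}{2k} \right). \]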

Since
\[ \frac{1}{4k-3} + \frac{1}{4k-1} - \frac{1}{2k} = \frac{8k-3}{2k(4k-1)(4k-3)} \]
we have
\[ 0 < \frac{1}{4k-3} + \frac{1}{4k-1} - \frac{1}{2k} \le M \frac{1}{k^2} \]
for some constant $M$. Thus the sequence $\{s'_{3n}\}$ is strictly increasing, and by the comparison test converges. Let $s' = \lim_{n\to\infty} s'_{3n}$. Since
\[ s'_{3n+1} = s'_{3n} + \frac{1}{4n+1}, \quad\text{and}\quad s'_{3n+2} = s'_{3n} + \frac{1}{4n+1} + \frac{1}{4n+3}, \]

the sequences $\{s'_{3n+1}\}$ and $\{s'_{3n+2}\}$ also converge to $s'$. Therefore $\lim_{n\to\infty} s'_n = s'$. Thus the series (3) also converges. However, since
\[ \frac{5}{6} = s'_3 < s'_6 < s'_9 < \cdots, \]
$s' = \lim_{n\to\infty} s'_n > \frac{5}{6}$. Thus the series (3) does not converge to the same sum as the original series. This, as we will see, is due to the fact that the given series does not converge absolutely. □
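The two sums in this example can be compared numerically. The sketch below is illustrative only, and the function names are assumptions. It relies on two standard facts that are not proved at this point in the text: the alternating harmonic series has sum $\ln 2$, and the rearrangement (3) has sum $\frac{3}{2}\ln 2$.

```python
import math


def original_partial_sum(n):
    """s_n for the alternating harmonic series 1 - 1/2 + 1/3 - ..."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))


def rearranged_partial_sum(n):
    """s'_{3n}: n blocks of the form 1/(4k-3) + 1/(4k-1) - 1/(2k)."""
    return sum(1 / (4 * k - 3) + 1 / (4 * k - 1) - 1 / (2 * k)
               for k in range(1, n + 1))


n = 20000
print("original   :", original_partial_sum(3 * n), "vs ln 2       =", math.log(2))
print("rearranged :", rearranged_partial_sum(n), "vs 1.5 * ln 2 =", 1.5 * math.log(2))
```

The original partial sums stay below 5/6 for odd index and approach about 0.693, while the rearranged partial sums increase from 5/6 toward about 1.040, in line with the inequalities derived in the example.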

THEOREM 7.3.8 If the series $\sum a_k$ converges absolutely, then every rearrangement of $\sum a_k$ converges to the same sum.
Proof. Let $\sum a'_k$ be a rearrangement of $\sum a_k$. Since $\sum |a_k| < \infty$, given $\epsilon > 0$, there exists a positive integer $N$ such that
\[ \sum_{k=n}^{m} |a_k| < \epsilon \tag{4} \]
for all $m \ge n \ge N$. Suppose $a'_k = a_{j(k)}$, where $j$ is a one-to-one function of $\mathbb{N}$ onto $\mathbb{N}$. Choose an integer $p$ ($p \ge N$) such that
\[ \{1, 2, \ldots, N\} \subset \{j(1), j(2), \ldots, j(p)\}. \]
Such a $p$ exists since the function $j$ is one-to-one and onto. Let
\[ s_n = \sum_{k=1}^{n} a_k \quad\text{and}\quad s'_n = \sum_{k=1}^{n} a'_k. \]
If $n \ge p$,
\[ s_n - s'_n = \sum_{k=1}^{n} a_k - \sum_{k=1}^{n} a_{j(k)}. \]
By the choice of $p$, the numbers $a_1, \ldots, a_N$ appear in both sums and consequently will cancel. Thus the only terms remaining will have index $k$ or $j(k)$ greater than or equal to $N$. Therefore by (4),
\[ |s_n - s'_n| \le 2\epsilon \]
for all $n \ge p$. Therefore $\lim_{n\to\infty} s'_n = \lim_{n\to\infty} s_n$, and thus the rearrangement converges to the same sum as the original series. □
For conditionally convergent series we have the following very interesting
result of Riemann.
THEOREM 7.3.9 Let $\sum a_k$ be a conditionally convergent series of real numbers. Suppose $\alpha \in \mathbb{R}$. Then there exists a rearrangement $\sum a'_k$ of $\sum a_k$ which converges to $\alpha$.

Before proving the theorem, we illustrate the method of proof with the alternating series
\[ \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots. \]
This series converges, but fails to converge absolutely. The positive terms of this series are $P_k = 1/(2k-1)$, $k \in \mathbb{N}$, and the absolute values of the negative terms are $Q_k = 1/2k$, $k \in \mathbb{N}$. Since $\sum \frac{1}{k} = \infty$, we also have $\sum P_k = \sum Q_k = \infty$. Suppose for purposes of illustration $\alpha = 1.5$. Our first step is to add "just enough" positive terms to exceed $\alpha$. More precisely, we let $m_1$ be the smallest integer such that $P_1 + \cdots + P_{m_1} > \alpha$. For $\alpha = 1.5$, $m_1 = 3$; i.e., $1 + \frac{1}{3} + \frac{1}{5} > 1.5$ whereas $1 + \frac{1}{3} < 1.5$. Our next step is to go in the other direction; namely, we let $n_1$ be the smallest integer such that
\[ P_1 + \cdots + P_{m_1} - Q_1 - \cdots - Q_{n_1} < \alpha. \]
Again, for $\alpha = 1.5$, $n_1 = 1$, and $1 + \frac{1}{3} + \frac{1}{5} - \frac{1}{2} < 1.5$. We are able to do this since the series $\sum P_k$ and $\sum Q_k$ both diverge to $\infty$. We continue this process inductively. Choosing the smallest integers $m_k$ and $n_k$ at each stage of the construction will be the key in proving that the resulting series converges to $\alpha$.
Proof. Without loss of generality we assume that $a_k \ne 0$ for all $k$. Let $p_k$ and $q_k$ be defined by
\[ p_k = \tfrac{1}{2}(|a_k| + a_k) \quad\text{and}\quad q_k = \tfrac{1}{2}(|a_k| - a_k). \]
Then $p_k - q_k = a_k$ and $p_k + q_k = |a_k|$. Furthermore, if $a_k > 0$, then $q_k = 0$ and $p_k = a_k$; if $a_k < 0$, then $p_k = 0$ and $q_k = |a_k|$.
We first prove that the series $\sum p_k$ and $\sum q_k$ both diverge. Since
\[ \sum_{k=1}^{\infty} (p_k + q_k) = \sum_{k=1}^{\infty} |a_k|, \]
they cannot both converge. Also, since
\[ \sum_{k=1}^{n} a_k = \sum_{k=1}^{n} p_k - \sum_{k=1}^{n} q_k, \]
the convergence of one implies the convergence of the other. Thus they both must diverge.
Let $P_1, P_2, P_3, \ldots$ denote the positive terms of $\sum a_k$ in the original order, and let $Q_1, Q_2, Q_3, \ldots$ be the absolute values of the negative terms, also in the original order. The series $\sum P_k$ and $\sum Q_k$ differ from $\sum p_k$ and $\sum q_k$ only by zero terms and thus are also divergent.

We will inductively construct sequences $\{m_k\}$ and $\{n_k\}$ such that the series
\[ P_1 + \cdots + P_{m_1} - Q_1 - \cdots - Q_{n_1} + P_{m_1+1} + \cdots + P_{m_2} - Q_{n_1+1} - \cdots - Q_{n_2} + \cdots \tag{5} \]
has the desired property. Clearly (5) is a rearrangement of the original series. Let $m_1$ be the smallest integer such that
\[ X_1 = P_1 + \cdots + P_{m_1} > \alpha. \]
Such an $m_1$ exists since $\sum P_k = \infty$. Similarly, let $n_1$ be the smallest integer such that
\[ Y_1 = X_1 - Q_1 - \cdots - Q_{n_1} < \alpha. \]
Suppose $\{m_1, \ldots, m_k\}$ and $\{n_1, \ldots, n_k\}$ have been chosen. Let $m_{k+1}$ and $n_{k+1}$ be the smallest integers greater than $m_k$ and $n_k$ respectively such that
\[ X_{k+1} = Y_k + P_{m_k+1} + \cdots + P_{m_{k+1}} > \alpha, \quad\text{and} \]
\[ Y_{k+1} = X_{k+1} - Q_{n_k+1} - \cdots - Q_{n_{k+1}} < \alpha. \]
Such integers always exist due to the divergence of the series $\sum P_k$ and $\sum Q_k$. Since $m_{k+1}$ was chosen to be the smallest integer such that the above holds,
\[ X_{k+1} - P_{m_{k+1}} \le \alpha. \]
Therefore
\[ 0 < X_{k+1} - \alpha \le P_{m_{k+1}}. \]
Similarly,
\[ 0 < \alpha - Y_{k+1} \le Q_{n_{k+1}}. \]
Since $\sum a_n$ converges, we have $\lim_{k\to\infty} P_k = \lim_{k\to\infty} Q_k = 0$. Therefore,
\[ \lim_{k\to\infty} X_k = \lim_{k\to\infty} Y_k = \alpha. \]
Let $S_n$ be the $n$th partial sum of the series (5). If the last term of $S_n$ is a $P_n$, then there exists a $k$ such that
\[ Y_k < S_n \le X_{k+1}. \]
If the last term of $S_n$ is $-Q_n$, then there exists a $k$ such that
\[ Y_{k+1} \le S_n < X_{k+1}. \]
In either case we obtain $\lim_{n\to\infty} S_n = \alpha$, which proves the result. □
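The construction in the proof is effectively a greedy algorithm, and it can be run directly on the alternating harmonic series. The following is a minimal sketch, not the author's code: the function name `rearranged_terms` is an assumption, and the target $\alpha = 1.5$ is the illustrative value used before the proof.

```python
def rearranged_terms(alpha, count):
    """Return the first `count` terms of a rearrangement of sum (-1)^(k+1)/k.

    Positive terms P_k = 1/(2k-1) are added until the running sum exceeds
    alpha, then negative terms -Q_k = -1/(2k) until it drops below alpha,
    and so on, each family being used in its original order.
    """
    terms, total = [], 0.0
    odd, even = 1, 2          # denominators of the next unused P_k and Q_k
    going_up = True           # True while positive terms are being added
    while len(terms) < count:
        if going_up:
            term = 1.0 / odd
            odd += 2
        else:
            term = -1.0 / even
            even += 2
        terms.append(term)
        total += term
        # Switch direction as soon as the running sum crosses alpha.
        if going_up and total > alpha:
            going_up = False
        elif not going_up and total < alpha:
            going_up = True
    return terms


terms = rearranged_terms(1.5, 20000)
print(terms[:4])      # first block: 1, 1/3, 1/5, then -1/2
print(sum(terms))     # close to 1.5
```

Because each block stops at the first crossing of $\alpha$, the overshoot is never larger than the last term added, which is the estimate $0 < X_{k+1} - \alpha \le P_{m_{k+1}}$ from the proof.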
Remark. By a variation of the above proof one can show that if the series $\sum a_k$ is conditionally convergent, then given $\alpha, \beta$ with
\[ -\infty \le \alpha \le \beta \le \infty, \]
there exists a rearrangement $\sum a'_k$ of $\sum a_k$ such that
\[ \liminf_{n\to\infty} S_n = \alpha, \quad\text{and}\quad \limsup_{n\to\infty} S_n = \beta, \]
where $S_n$ is the $n$th partial sum of $\sum a'_k$ (Exercise 13).

Exercises 7.3
1. Prove the following:
a. If $\lim_{k\to\infty} k^p a_k = A$ for some $p > 1$, then $\sum a_k$ converges absolutely.
b. If $\lim_{k\to\infty} k\, a_k = A \ne 0$, then $\sum a_k$ diverges.
c. If $\lim_{k\to\infty} k\, a_k = 0$, then the test is inconclusive.
2. *Suppose $\sum a_k^2 < \infty$ and $\sum b_k^2 < \infty$. Prove that the series $\sum a_k b_k$ converges absolutely.
3. a. Prove that if $\sum a_k$ converges and $\sum b_k$ converges absolutely, then $\sum a_k b_k$ converges.
b. Show by example that the conclusion may be false if one of the two series does not converge absolutely.
4. *Suppose that the sequence $\{b_n\}$ is monotone decreasing with $\lim b_n = 0$. If $\{a_n\}$ is a sequence in $\mathbb{R}$ satisfying $|a_n| \le b_n - b_{n+1}$ for all $n \in \mathbb{N}$, prove that $\sum a_k$ converges absolutely.
5. a. If $\sum a_k$ converges absolutely, prove that $\sum \epsilon_k a_k$ converges for every choice of $\epsilon_k \in \{-1, 1\}$.
b. If $\sum \epsilon_k a_k$ converges for every choice of $\epsilon_k \in \{-1, 1\}$, prove that $\sum a_k$ converges absolutely.
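6. Test each of the following series for absolute and conditional convergence.
*a. $\sum_{k=1}^{\infty} \dfrac{(-1)^{k+1}}{\sqrt{k}}$
b. $\sum_{k=3}^{\infty} \dfrac{(-1)^k}{\sqrt{k}\, \ln(\ln k)}$
*c. $\sum_{k=1}^{\infty} \dfrac{(-1)^{k+1}}{k^p}$, $p > 0$
d. $\sum_{k=3}^{\infty} \dfrac{(-1)^k \ln(\ln k)}{\sqrt{\ln k}}$
*e. $\sum_{k=2}^{\infty} \dfrac{(-1)^k}{k (\ln k)^p}$, $p > 0$
f. $\sum_{k=1}^{\infty} \dfrac{(-1)^{k+1} k^k}{(k+1)^{k+1}}$
*g. $\sum_{k=1}^{\infty} \dfrac{(-1)^{k+1} k^k}{(k+1)^k}$
h. $\sum_{k=1}^{\infty} (-1)^{k+1} \sin\left(\dfrac{1}{k}\right)$
*i. $\sum_{k=2}^{\infty} \dfrac{(-1)^k}{k^2 + (-1)^k}$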
7. Test the series $\sum_{k=1}^{\infty} \dfrac{p^k}{k^p}$, $p \in \mathbb{R}$, for absolute and conditional convergence.
8. Prove that $\lim_{k\to\infty} \dfrac{|p(k+1)|}{|p(k)|} = 1$ for any polynomial $p$.
9. *Show that the series $1 + \frac{1}{2} - \frac{1}{3} + \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots$ diverges.
10. Determine whether the series $1 - \frac{1}{2} - \frac{1}{3} + \frac{1}{4} + \frac{1}{5} - \frac{1}{6} - \frac{1}{7} + \cdots$ converges or diverges.
11. *Prove that the series $\sum_{k=1}^{\infty} \dfrac{\sin k}{k}$ is conditionally convergent.
12. If $a_k \ge 0$ for all $k \in \mathbb{N}$, and $\sum a_k = \infty$, prove that $\sum a'_k = \infty$ for any rearrangement $\sum a'_k$ of $\sum a_k$.
13. Suppose that the series $\sum a_k$ is conditionally convergent. Given $\alpha, \beta$ with $-\infty \le \alpha \le \beta \le \infty$, show that there exists a rearrangement $\sum a'_k$ of $\sum a_k$ such that $\liminf_{n\to\infty} S_n = \alpha$ and $\limsup_{n\to\infty} S_n = \beta$, where $S_n$ is the $n$th partial sum of $\sum a'_k$.
14. Suppose every rearrangement of the series $\sum a_k$ converges. Prove that $\sum a_k$ converges absolutely.

7.4 Square Summable Sequences²

In this section, we introduce the set $\ell^2$ of square summable sequences of real numbers and derive several useful inequalities. This set occurs naturally in the study of Fourier series.

DEFINITION 7.4.1 A sequence $\{a_k\}_{k=1}^{\infty}$ of real numbers is said to be in $\ell^2$, or to be square summable, if
\[ \sum_{k=1}^{\infty} a_k^2 < \infty. \]
For $\{a_k\} \in \ell^2$ set
\[ \|\{a_k\}\|_2 = \sqrt{\sum_{k=1}^{\infty} a_k^2}. \]
The set $\ell^2$ is called the space of square summable sequences, and the quantity $\|\{a_k\}\|_2$ is called the norm of the sequence $\{a_k\}$.

Remark. Since a sequence $\{a_k\}$ in $\mathbb{R}$ is by definition a function $a$ from $\mathbb{N}$ into $\mathbb{R}$ with $a_k = a(k)$, it is sometimes convenient to think of $\ell^2$ as the set of all functions $a : \mathbb{N} \to \mathbb{R}$ for which
\[ \|a\|_2 = \sqrt{\sum_{k=1}^{\infty} |a(k)|^2} < \infty. \]

² The topic of square summable sequences, although important in the study of Fourier series, is not specifically required in Chapter 9 and thus can be omitted on first reading. The concept of a normed linear space occurs on several occasions in the discussion of subsequent topics in the text. At that point the reader can study this topic more carefully.

EXAMPLES 7.4.2 (a) For the sequence $\{1/k\}_{k=1}^{\infty}$,
\[ \|\{1/k\}\|_2^2 = \sum_{k=1}^{\infty} \frac{1}{k^2}. \]
Since this is a $p$-series with $p = 2$, the series $\sum 1/k^2$ converges and thus $\{1/k\} \in \ell^2$. On the other hand, since
\[ \|\{1/\sqrt{k}\}\|_2^2 = \sum_{k=1}^{\infty} \frac{1}{k} = \infty, \]
we have $\{1/\sqrt{k}\} \notin \ell^2$.
(b) For fixed $q$, $0 < q < \infty$, consider the sequence $\{1/k^q\}$. Then
\[ \|\{1/k^q\}\|_2^2 = \sum_{k=1}^{\infty} \frac{1}{k^{2q}}. \]
By Example 7.1.9(a) this series converges for all $q > 1/2$ and diverges for all $q \le 1/2$. Thus $\{1/k^q\} \in \ell^2$ if and only if $q > 1/2$. □

Cauchy-Schwarz Inequality
Our main goal in this section is to prove the Cauchy-Schwarz inequality for sequences in $\ell^2$. First however we prove the finite version of this inequality.

THEOREM 7.4.3 (Cauchy-Schwarz Inequality) If $n \in \mathbb{N}$ and $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ are real numbers, then
\[ \sum_{k=1}^{n} |a_k b_k| \le \sqrt{\sum_{k=1}^{n} a_k^2}\, \sqrt{\sum_{k=1}^{n} b_k^2}. \]

Proof. Let $\lambda \in \mathbb{R}$ and consider
\[ 0 \le \sum_{k=1}^{n} (|a_k| - \lambda |b_k|)^2 = \sum_{k=1}^{n} a_k^2 - 2\lambda \sum_{k=1}^{n} |a_k b_k| + \lambda^2 \sum_{k=1}^{n} b_k^2. \]
The above can be written as
\[ 0 \le A - 2\lambda C + \lambda^2 B, \]
where $A = \sum_{k=1}^{n} a_k^2$, $C = \sum_{k=1}^{n} |a_k b_k|$, and $B = \sum_{k=1}^{n} b_k^2$. If $B = 0$, then $b_k = 0$ for all $k = 1, \ldots, n$ and the Cauchy-Schwarz inequality certainly holds. If $B \ne 0$, we take $\lambda = C/B$ which gives
\[ 0 \le A - \frac{C^2}{B} \]
or $C^2 \le AB$; that is,
\[ \left( \sum_{k=1}^{n} |a_k b_k| \right)^2 \le \left( \sum_{k=1}^{n} a_k^2 \right) \left( \sum_{k=1}^{n} b_k^2 \right). \]
Taking the square root of both sides gives the desired result. □
As a consequence of the previous result we have the following corollary.

COROLLARY 7.4.4 (Cauchy-Schwarz Inequality) If $\{a_k\}, \{b_k\} \in \ell^2$, then $\sum_{k=1}^{\infty} a_k b_k$ is absolutely convergent and
\[ \sum_{k=1}^{\infty} |a_k b_k| \le \|\{a_k\}\|_2\, \|\{b_k\}\|_2. \]

Proof. For each $n \in \mathbb{N}$, by the previous theorem
\[ \sum_{k=1}^{n} |a_k b_k| \le \sqrt{\sum_{k=1}^{n} a_k^2}\, \sqrt{\sum_{k=1}^{n} b_k^2} \le \|\{a_k\}\|_2\, \|\{b_k\}\|_2. \]
Letting $n \to \infty$ gives the desired result. □


For $a, b \in \ell^2$ the inner product of $a$ and $b$, denoted $\langle a, b \rangle$, is defined as
\[ \langle a, b \rangle = \sum_{k=1}^{\infty} a_k b_k. \]
As a consequence of the Cauchy-Schwarz inequality we have
\[ |\langle a, b \rangle| \le \|a\|_2\, \|b\|_2. \]

Minkowski’s Inequality
Our next result shows that the norm $\|\ \|_2$ satisfies the "triangle inequality" on $\ell^2$.

THEOREM 7.4.5 (Minkowski's Inequality) If $\{a_k\}$ and $\{b_k\}$ are in $\ell^2$, then $\{a_k + b_k\} \in \ell^2$ and
\[ \|\{a_k + b_k\}\|_2 \le \|\{a_k\}\|_2 + \|\{b_k\}\|_2. \]


Proof. By hypothesis, each of the series $\sum a_k^2$ and $\sum b_k^2$ converges. Also, by Corollary 7.4.4 the series $\sum a_k b_k$ converges absolutely. Since
\[ (a_k + b_k)^2 = a_k^2 + 2 a_k b_k + b_k^2 \le a_k^2 + 2|a_k b_k| + b_k^2, \]
we have
\[ \|\{a_k + b_k\}\|_2^2 = \sum_{k=1}^{\infty} (a_k + b_k)^2 \le \|\{a_k\}\|_2^2 + 2 \sum_{k=1}^{\infty} |a_k b_k| + \|\{b_k\}\|_2^2, \]
which by the Cauchy-Schwarz inequality
\[ \le \|\{a_k\}\|_2^2 + 2 \|\{a_k\}\|_2 \|\{b_k\}\|_2 + \|\{b_k\}\|_2^2 = \left( \|\{a_k\}\|_2 + \|\{b_k\}\|_2 \right)^2. \]
Taking the square root of both sides gives the desired result. □
Although not specifically stated, Minkowski's inequality is also true for finite sums. In particular if $n \in \mathbb{N}$ and $a_1, \ldots, a_n, b_1, \ldots, b_n$ are real numbers, then
\[ \sqrt{\sum_{k=1}^{n} (a_k + b_k)^2} \le \sqrt{\sum_{k=1}^{n} a_k^2} + \sqrt{\sum_{k=1}^{n} b_k^2}. \tag{6} \]
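The finite versions of the Cauchy-Schwarz and Minkowski inequalities are easy to test on sample data. The sketch below is illustrative only; the helper names and the two sample vectors are assumptions.

```python
import math


def norm2(a):
    """Euclidean norm sqrt(a_1^2 + ... + a_n^2) of a finite list."""
    return math.sqrt(sum(x * x for x in a))


def cauchy_schwarz_sides(a, b):
    """Return (sum |a_k b_k|, norm2(a) * norm2(b))."""
    lhs = sum(abs(x * y) for x, y in zip(a, b))
    return lhs, norm2(a) * norm2(b)


def minkowski_sides(a, b):
    """Return (norm2(a + b), norm2(a) + norm2(b))."""
    return norm2([x + y for x, y in zip(a, b)]), norm2(a) + norm2(b)


# Hypothetical sample vectors of length n.
n = 1000
a = [1 / k for k in range(1, n + 1)]
b = [(-1) ** k / math.sqrt(k) for k in range(1, n + 1)]
print(cauchy_schwarz_sides(a, b))   # first entry <= second entry
print(minkowski_sides(a, b))        # first entry <= second entry
```

Taking b to be a scalar multiple of a makes the two sides of the Cauchy-Schwarz inequality coincide up to rounding, which is one way to see that the inequality cannot be improved in general.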

In the following theorem we summarize some of the properties of the norm $\|\ \|_2$ on $\ell^2$. These are very similar to the properties of the absolute value function on $\mathbb{R}$. As for the absolute value, the inequality
\[ \|\{a_k + b_k\}\|_2 \le \|\{a_k\}\|_2 + \|\{b_k\}\|_2 \]
is also referred to as the triangle inequality for $\ell^2$.

THEOREM 7.4.6 The norm $\|\ \|_2$ on $\ell^2$ satisfies the following properties:
(a) $\|\{a_k\}\|_2 \ge 0$ for all $\{a_k\} \in \ell^2$.
(b) $\|\{a_k\}\|_2 = 0$ if and only if $a_k = 0$ for all $k \in \mathbb{N}$.
(c) If $\{a_k\} \in \ell^2$ and $c \in \mathbb{R}$, then $\{c a_k\} \in \ell^2$ and
\[ \|\{c a_k\}\|_2 = |c|\, \|\{a_k\}\|_2. \]
(d) If $\{a_k\}, \{b_k\} \in \ell^2$, then $\{a_k + b_k\} \in \ell^2$ and
\[ \|\{a_k + b_k\}\|_2 \le \|\{a_k\}\|_2 + \|\{b_k\}\|_2. \]

Proof. The results (a) and (b) are obvious from the definition, and (d) is just
a restatement of Minkowski’s inequality. The verification of (c) is left as an
exercise (Exercise 5). 

Normed Linear Spaces


The space $\ell^2$ as well as $\mathbb{R}^n$ are both examples of vector spaces over $\mathbb{R}$. For completeness we include the definition of a vector space.

DEFINITION 7.4.7 A set $X$ with two operations "+", vector addition, and "·", scalar multiplication, satisfying
$x + y \in X$ for all $x, y \in X$, and
$c \cdot x \in X$ for all $x \in X$, $c \in \mathbb{R}$
is a vector space over $\mathbb{R}$ if the following are satisfied:
(a) $x + y = y + x$. (commutative law)
(b) $x + (y + z) = (x + y) + z$. (associative law)
(c) There is a unique element in $X$ called the zero element, denoted 0, such that $x + 0 = x$ for all $x \in X$.
(d) For each $x \in X$, there exists a unique element $-x \in X$ such that $x + (-x) = 0$.
(e) $(ab) \cdot x = a \cdot (b \cdot x)$ for all $a, b \in \mathbb{R}$, $x \in X$.
(f) $a \cdot (x + y) = a \cdot x + a \cdot y$.
(g) $(a + b) \cdot x = a \cdot x + b \cdot x$.
(h) $1 \cdot x = x$.

It is clear that $\mathbb{R}^n$ with + and · defined by
\[ a + b = (a_1 + b_1, \ldots, a_n + b_n), \qquad c \cdot a = (c a_1, \ldots, c a_n) \]
is a vector space over $\mathbb{R}$. Similarly if addition and scalar multiplication of sequences $\{a_k\}, \{b_k\}$ in $\ell^2$ are defined by
\[ \{a_k\} + \{b_k\} = \{a_k + b_k\}, \qquad c \cdot \{a_k\} = \{c a_k\}, \quad c \in \mathbb{R}, \]
then it is easily shown that $\ell^2$ is a vector space over $\mathbb{R}$. The fact that $\{a_k + b_k\}$ and $\{c a_k\}$ are in $\ell^2$ is part of the statement of Theorems 7.4.5 and 7.4.6. The zero element 0 in $\ell^2$ is the sequence $\{a_k\}$ where $a_k = 0$ for all $k \in \mathbb{N}$.
In addition to being vector spaces, the spaces $\mathbb{R}^n$ and $\ell^2$ are also examples of normed linear spaces. The concepts of a "normed linear space" as well as that of a "norm" are defined as follows:

DEFINITION 7.4.8 Let $X$ be a vector space over $\mathbb{R}$. A function $\|\ \| : X \to \mathbb{R}$ satisfying
(a) $\|x\| \ge 0$ for all $x \in X$,
(b) $\|x\| = 0$ if and only if $x = 0$,
(c) $\|cx\| = |c|\, \|x\|$ for all $c \in \mathbb{R}$, $x \in X$, and
(d) $\|x + y\| \le \|x\| + \|y\|$ for all $x, y \in X$,
is called a norm on $X$. The pair $(X, \|\ \|)$ is called a normed linear space.

Inequality (d) is called the triangle inequality for the norm $\|\ \|$. By Theorem 7.4.6, $(\ell^2, \|\ \|_2)$ is a normed linear space. Additional examples of normed linear spaces will occur elsewhere in the book, both in the text and the exercises.
If $(X, \|\ \|)$ is a normed linear space, the distance $d(x, y)$ between two points $x, y \in X$ is defined as $d(x, y) = \|x - y\|$. From the definition of the norm $\|\ \|$, it immediately follows that
(a) $d(x, y) \ge 0$ for all $x, y \in X$,
(b) $d(x, y) = 0$ if and only if $x = y$,
(c) $d(x, y) = d(y, x)$, and
(d) $d(x, y) \le d(x, z) + d(z, y)$ for all $x, y, z \in X$.
Thus $d$ is a metric on the vector space $X$. Since the notions of "limit point of a set" (Definition 2.2.12) and "convergence of a sequence" (Definition 3.1.1) are defined in terms of $\epsilon$-neighborhoods, both of these concepts have analogous definitions in a normed linear space $(X, \|\ \|)$. Thus for example, a sequence $\{x_n\}$ in $X$ converges to $x \in X$, if given $\epsilon > 0$, there exists $n_o \in \mathbb{N}$ such that $\|x_n - x\| < \epsilon$ for all $n \in \mathbb{N}$, $n \ge n_o$. This type of convergence, referred to as norm convergence, will be encountered again later in the text.
As a general rule, many of the results involving sequences of real numbers are still valid in the setting of normed linear spaces. This is especially true for those theorems whose proofs relied only on properties of the absolute value. Great care however must be exercised with theorems that rely on the supremum property of $\mathbb{R}$. For example, the Bolzano-Weierstrass theorem fails for $(\ell^2, \|\ \|_2)$ (Exercise 6). On the other hand however, the Bolzano-Weierstrass theorem still holds in $(\mathbb{R}^n, \|\ \|_2)$ (Miscellaneous Exercise 5).
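The failure of the Bolzano-Weierstrass theorem in $\ell^2$ comes from sequences whose members stay a fixed distance apart. A standard example is the family $e_n$ with a 1 in position $n$ and 0 elsewhere. The sketch below is an illustration under stated assumptions: the names `unit_sequence` and `distance` are hypothetical, and each $e_n$ is truncated to finitely many coordinates, which does not change the distance because all omitted coordinates are zero.

```python
import math


def unit_sequence(n, length):
    """First `length` coordinates of e_n: 1 in position n, 0 elsewhere."""
    return [1.0 if k == n else 0.0 for k in range(1, length + 1)]


def distance(a, b):
    """The norm of a - b computed on the stored coordinates."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


length = 20
for n, m in ((1, 2), (3, 17), (5, 6)):
    print(n, m, distance(unit_sequence(n, length), unit_sequence(m, length)))
```

Each printed distance is the square root of 2, so the sequence is bounded (every member has norm 1) yet no subsequence can be Cauchy.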

Exercises 7.4
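1. Determine which of the following sequences are in $\ell^2$.
*a. $\left\{ \dfrac{1}{\ln k} \right\}_{k=2}^{\infty}$
b. $\left\{ \dfrac{1}{\sqrt{k}\, \ln k} \right\}_{k=2}^{\infty}$
*c. $\left\{ \dfrac{\ln k}{\sqrt{k}} \right\}_{k=2}^{\infty}$
d. $\left\{ \dfrac{\sin k}{k} \right\}_{k=1}^{\infty}$
2. Determine all values of $p \in \mathbb{R}$ such that the given sequence is in $\ell^2$.
*a. $\{p^k\}_{k=1}^{\infty}$
b. $\left\{ \dfrac{k^p}{p^k} \right\}_{k=1}^{\infty}$
*c. $\left\{ \dfrac{1}{k^p \ln k} \right\}_{k=2}^{\infty}$
d. $\left\{ \dfrac{1}{\sqrt{k}\, (\ln k)^p} \right\}_{k=2}^{\infty}$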

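3. *If $\{a_k\} \in \ell^2$, prove that $\sum_{k=1}^{\infty} a_k / k$ converges absolutely.
4. Give an example of a sequence $\{a_k\} \in \ell^2$ for which $\sum_{k=1}^{\infty} |a_k| = \infty$.
5. If $\{a_k\} \in \ell^2$ and $c \in \mathbb{R}$, prove that $\{c a_k\} \in \ell^2$ and $\|\{c a_k\}\|_2 = |c|\, \|\{a_k\}\|_2$.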
6. For each $n \in \mathbb{N}$, let $e_n$ be the sequence in $\ell^2$ defined by
\[ e_n(k) = \begin{cases} 0, & k \ne n, \\ 1, & k = n. \end{cases} \]
Show that $\|e_n - e_m\|_2 = \sqrt{2}$ if $n \ne m$. (Remark. The sequence $\{e_n\}$ is a bounded sequence in $\ell^2$ with no convergent subsequence. Thus the Bolzano-Weierstrass theorem (3.4.6) fails in $\ell^2$.)
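7. Show that equality holds in the Cauchy-Schwarz inequality if and only if for all $k \in \mathbb{N}$, $b_k = c a_k$ for some $c \in \mathbb{R}$.
8. For $a, b \in \mathbb{R}^n$, let $\langle a, b \rangle$ denote the inner product of $a$ and $b$. Prove each of the following:
a. $\langle a, a \rangle \ge 0$.
b. $\langle a, a \rangle = 0$ if and only if $a = (0, \ldots, 0)$.
c. $\langle a, b + c \rangle = \langle a, b \rangle + \langle a, c \rangle$.
d. $\langle a, b \rangle = \langle b, a \rangle$.
e. $\langle a, b \rangle = \frac{1}{2} \left[ \|a\|_2^2 + \|b\|_2^2 - \|a - b\|_2^2 \right]$.
9. *For $a, b \in \mathbb{R}^n$, use the law of cosines to prove that $\langle a, b \rangle = \|a\|_2 \|b\|_2 \cos\theta$, where $\theta$ is the angle between the vectors $a$ and $b$.
10. Let $(X, \|\ \|)$ be a normed linear space. Prove that $|\, \|x\| - \|y\| \,| \le \|x - y\|$ for all $x, y \in X$.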

11. Let $\ell^1$ denote the set of all sequences $\{a_k\}$ satisfying $\|\{a_k\}\|_1 = \sum_{k=1}^{\infty} |a_k| < \infty$.
a. Prove that $(\ell^1, \|\ \|_1)$ is a normed linear space.
*b. Prove that $\ell^1 \subsetneq \ell^2$.
12. Determine all values of $p \in \mathbb{R}$ for which each of the sequences in Exercise 2 is in $\ell^1$.
13. Let $X$ be a non-empty set, and let $B(X)$ denote the set of all bounded real-valued functions on $X$. For $f \in B(X)$, set
\[ \|f\|_\infty = \sup\{|f(x)| : x \in X\}. \]
Prove that $(B(X), \|\ \|_\infty)$ is a normed linear space.

Notes
The geometric series is perhaps one of the most important series in analysis. In the seventeenth and eighteenth centuries, the convergence of many series was established by comparison with the geometric series. It forms the basis for the proof of the root and ratio test, and thus also for the study of convergence of power series. The geometric series dates back to Euclid in the 3rd century B.C. The formula for a finite sum of a geometric progression appeared in Euclid's Elements, and Archimedes, in his treatise Quadrature of the Parabola, indirectly used the geometric series to find the area under a parabolic arc.³ Even though Greek mathematicians knew how to sum a finite geometric progression, they used reductio ad absurdum arguments to avoid dealing with infinite quantities.
³ See p. 105 of the text by Katz.
Infinite series first appeared in the Middle Ages. In the fourteenth century, Nicole Oresme (1323?–1382) of France provided a geometric proof to the effect that the infinite series
\[ \frac{1}{2} + \frac{2}{2^2} + \cdots + \frac{n}{2^n} + \cdots \]
had sum equal to 2, a result now quite easy to prove using power series. He was also the first to prove that the harmonic series diverged, whereas the Italian mathematician Pietro Mengoli (1624–1686) was the first to show that the sum of the alternating harmonic series $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$ is $\ln 2$. The results of Oresme and Mengoli, as well as others of this era, were stated verbally; the infinite sum notation did not appear until the seventeenth century.
With the development of the calculus in the mid seventeenth century, the emphasis shifted to the study of power series expansions of functions, and computations involving power series. The expansion of functions in power series was for Newton, and his successors, an indispensable tool. In his treatise Of the Method of Fluxions and Infinite Series⁴, Newton includes a discussion of infinite series techniques for the solution of both algebraic and differential equations. Both the geometric series and the binomial series were employed in many of his computations.
During the seventeenth and eighteenth century, mathematicians operated with
power series in the same way as with polynomials; often ignoring questions of con-
vergence. An excellent illustration of these eighteenth century techniques is Euler’s
derivation of the sum of the series
1 1 1
+ 2 + 2 + ··· .
12 2 3
The convergence of this series had already been established earlier by Johann
Bernoulli. Using results from the theory of equations and applying them to the
power series expansion of sin x, Euler was able to show that the sum of the given
series was $\pi^2/6$ (see footnote 5).
As power series were used more frequently to approximate mathematical quan-
tities, the emphasis turned to deriving precise error estimates in approximating a
function by a finite sum of the terms of its power series. In 1768, d’Alembert made
a careful examination of the binomial series $(1 + r)^p$, $p \in \mathbb{Q}$, discovered earlier by
Newton (see footnote 6). D’Alembert computed the value of n for which the absolute value of the
ratio of successive terms is less than 1, thereby ensuring that the successive terms
decrease. More importantly, d’Alembert computed the bounds on the error made
in approximating $(1 + r)^p$ by a finite number of terms of this series. His argument
relied on a term by term comparison with a geometric series, similar to the proof
used in establishing the ratio test. D’Alembert’s result can easily be converted into
a proof of the convergence of the series.
Computations analogous to those of d’Alembert were also undertaken by Lagrange
in his book Théorie des Fonctions Analytiques, published in 1797. The remainder
term for Taylor’s formula (to be discussed in the next chapter) first appeared
in this text. Cauchy was undoubtedly influenced by the works of d’Alembert and
Lagrange in his definition of convergence of a series. However, unlike the results of
d’Alembert, which applied only to the binomial series, and those of Lagrange for
Taylor series, Cauchy’s definition of convergence was entirely general, applicable to
any series of real numbers.
4 The Mathematical Works of Isaac Newton, edited by D. T. Whiteside, Johnson Reprint
Corporation, New York, 1964.
5 For details the reader is referred to the article by J. Grabiner.
6 Réflexions sur les suites et sur les racines imaginaires, Opuscules Mathématiques, vol.
5 (1768), pp. 171–215.
Using his definition of convergence, Cauchy proved that the statement now
known as the Cauchy criterion was necessary for the convergence of a series of
real numbers. This was also known earlier to Bolzano. However, without the
completeness property of the real number system, neither Bolzano nor Cauchy was
able to prove that the Cauchy criterion was also sufficient for convergence of the
series.

Miscellaneous Exercises

1. Given two series $\sum_{k=0}^{\infty} a_k$ and $\sum_{k=0}^{\infty} b_k$, set $c_n = \sum_{k=0}^{n} a_k b_{n-k}$, n = 0, 1, 2, ....
The series $\sum_{n=0}^{\infty} c_n$ is called the Cauchy product of $\sum a_k$ and $\sum b_k$.
a. If $\sum a_k$ and $\sum b_k$ converge absolutely, prove that $\sum c_n$ converges
absolutely and that
\[ \sum_{n=0}^{\infty} c_n = \Bigl( \sum_{k=0}^{\infty} a_k \Bigr) \Bigl( \sum_{k=0}^{\infty} b_k \Bigr). \]
b. Let $a_k = b_k = (-1)^k / \sqrt{k + 1}$, k = 0, 1, 2, .... Prove that the Cauchy
product of $\sum a_k$ and $\sum b_k$ diverges.
c. Prove that the result of (a) is still true if only one of the two series
converges absolutely; the other series must still converge.
2. Let X be a non-empty set. If f is a real-valued function on X, define
\[ \|f\|_1 = \sup\Bigl\{ \sum_{x \in F} |f(x)| : F \text{ is a finite subset of } X \Bigr\}. \]
In the above, the supremum is taken over all finite subsets F of X. Denote
by $\ell^1(X)$ the set of all real-valued functions f on X for which $\|f\|_1 < \infty$.
a. Suppose X is infinite. If $f \in \ell^1(X)$, prove that $\{x \in X : f(x) \neq 0\}$ is
at most countable.
b. If $\{x \in X : f(x) \neq 0\} = \{x_n : n \in A\}$, where $A \subset \mathbb{N}$, prove that
\[ \|f\|_1 = \sum_{n \in A} |f(x_n)|. \]
c. Prove that $(\ell^1(X), \|\cdot\|_1)$ is a normed linear space.


3. For f ∈ R[a, b], set
\[ \|f\|_2 = \Bigl( \int_a^b (f(x))^2 \, dx \Bigr)^{1/2}. \]
a. Prove that for f, g ∈ R[a, b], $\int_a^b |f(x) g(x)| \, dx \le \|f\|_2 \|g\|_2$.
b. For f, g ∈ R[a, b], prove that $\|f + g\|_2 \le \|f\|_2 + \|g\|_2$.
c. Prove that $\|\cdot\|_2$ defines a norm on C[a, b], the space of continuous
real-valued functions on [a, b].
d. Is $\|\cdot\|_2$ a norm on R[a, b]?
4. As in the previous exercise, let C[a, b] denote the vector space of continuous
real-valued functions on [a, b]. For f ∈ C[a, b], set $\|f\|_u = \max\{|f(x)| : x \in [a, b]\}$.
a. Prove that $\|\cdot\|_u$ is a norm on C[a, b].
b. Prove that a sequence $\{f_n\}$ in C[a, b] converges to f ∈ C[a, b] in the norm $\|\cdot\|_u$ if and
only if given $\epsilon > 0$, there exists $n_o \in \mathbb{N}$ such that $|f(x) - f_n(x)| < \epsilon$ for
all x ∈ [a, b] and all n ∈ N, $n \ge n_o$.

Definition (a) Let $(X, \|\cdot\|)$ be a normed linear space, and let E ⊂ X.
A function f : E → R is continuous at p ∈ E if given $\epsilon > 0$, there exists
a δ > 0 such that
\[ |f(x) - f(p)| < \epsilon \quad \text{for all } x \in E \text{ with } \|x - p\| < \delta. \]
(b) Let X be a vector space over R. A function f : X → R is linear if
\[ f(ax + by) = a f(x) + b f(y) \]
for all x, y ∈ X and a, b ∈ R.


5. Let $(X, \|\cdot\|)$ be a normed linear space and let f : X → R be a linear
function. Prove that the following are equivalent.
a. f is continuous at some p ∈ X.
b. f is continuous at 0.
c. There exists a positive constant M such that $|f(x)| \le M \|x\|$ for all
x ∈ X.
d. There exists a positive constant M such that $|f(x) - f(y)| \le M \|x - y\|$
for all x, y ∈ X.
e. f is uniformly continuous on X.

6. For fixed $b \in \ell^2$, define $\Gamma : \ell^2 \to \mathbb{R}$ by
\[ \Gamma(a) = \langle a, b \rangle = \sum_{k=1}^{\infty} a(k) b(k). \]
Prove that Γ is a continuous linear function on $\ell^2$.
7. As in Exercise 3, let C[a, b] denote the vector space of continuous real-valued
functions on [a, b] with norm $\|\cdot\|_2$. For fixed g ∈ R[a, b], prove
that Γ defined by
\[ \Gamma(f) = \int_a^b f(x) g(x) \, dx \]
is a continuous linear function on C[a, b].



Supplemental Reading

Ali, S. A., “The mth ratio test: new convergence tests for series,” Amer. Math. Monthly 115 (2008), 514–524.
Behforooz, G. H., “Thinning out the harmonic series,” Math. Mag. 68 (1995), 289–293.
Boas, R. P., “Estimating remainders,” Math. Mag. 51 (1978), 83–89.
Boas, R. P., “Partial sums of infinite series and how they grow,” Amer. Math. Monthly 84 (1977), 237–258.
Bradley, D. M., “An infinite series that displays the concavity of the natural logarithm,” Math. Mag. 90 (2017), 353–354.
Cowen, C. C., Davidson, K. R., and Kaufman, R. P., “Rearranging the alternating harmonic series,” Amer. Math. Monthly 87 (1980), 817–819.
Creswell, S. H., “A continuous bijection of $\ell^2$ onto a subset of $\ell^2$ whose inverse is discontinuous everywhere,” Amer. Math. Monthly 117 (2010), 823–828.
Daners, D., “A short elementary proof of $\sum 1/k^2 = \pi^2/6$,” Math. Mag. 85 (2012), 361–364.
Dence, T. P. and Dence, J. P., “A survey of Euler’s constant,” Math. Mag. 82 (2009), 255–265.
Efthimiou, C. J., “Finding exact values for infinite series,” Math. Mag. 72 (1999), 45–51.
Goar, M., “Olivier and Abel on series convergence: An episode from early 19th century analysis,” Math. Mag. 72 (1999), 347–355.
Grabiner, Judith V., The Origins of Cauchy’s Rigorous Calculus, The MIT Press, Cambridge, Massachusetts, 1981.
Hoang, N. S., “A limit comparison test for general series,” Amer. Math. Monthly 122 (2015), 893–896.
Jungck, G., “An alternative to the integral test,” Math. Mag. 56 (1983), 232–235.
Katz, V. J., “Ideas of Calculus in Islam and India,” Math. Mag. 68 (1995), 163–174.
Kortram, R. A., “Simple proofs for $\sum_{k=1}^{\infty} 1/k^2 = \frac{1}{6}\pi^2$ and $\sin x = x \prod_{k=1}^{\infty} \bigl(1 - \frac{x^2}{k^2 \pi^2}\bigr)$,” Math. Mag. 69 (1996), 122–125.
Krantz, S. G. and McNeal, J. D., “Creating more convergent series,” Amer. Math. Monthly 111 (2004), 32–38.
Maher, P., “Jensen’s inequality by differentiation,” Math. Gaz. 73 (1989), 139–140.
Passare, M., “How to compute $\sum 1/n^2$ by solving triangles,” Amer. Math. Monthly 115 (2008), 745–752.
Tolsted, E., “An elementary derivation of the Cauchy, Hölder, and Minkowski inequalities from Young’s inequality,” Math. Mag. 37 (1964), 2–12.
Young, R. M., “Euler’s constant,” Math. Gazette 75 (1991), 187–190.
8
Sequences and Series of Functions

In this chapter, we will study convergence properties of sequences and series


of real-valued functions defined on a set E. In most instances E will be a
subset of R. Since we are dealing with sequences and series of functions, there
naturally arise questions involving preservation of continuity, differentiability,
and integrability. Specifically, is the limit function of a convergent sequence
of continuous, differentiable, or integrable functions again continuous, differ-
entiable, or integrable? We will discuss these questions in detail in Section 1
and show by examples that the answer to all of these questions is in general
no! Convergence by itself is not sufficient for preservation of either continuity,
differentiability, or integrability. Additional hypotheses will be required.
In the 1850’s Weierstrass made a careful distinction between convergence
of a sequence or series of numbers and that of a sequence or series of func-
tions. It is to him that we are indebted for the concept of uniform convergence
which is the additional hypothesis required for the preservation of continuity
and integrability. It was also Weierstrass who constructed a continuous but
nowhere differentiable function, and who proved that every continuous real-
valued function on a closed and bounded interval can be uniformly approx-
imated by a polynomial. As prominently as Cauchy is associated with the
study of sequences and series of numbers, Weierstrass is likewise associated
with the study of sequences and series of functions. For his many contributions
to the subject area, Weierstrass is often referred to as the father of modern
analysis.
The study of sequences and series of functions has its origins in the study
of power series representation of functions. The power series of ln(1 + x) was
known to Nicolaus Mercator (1620–1687) by 1668, and the power series for
many of the transcendental functions such as arctan x, arcsin x, among others,
were discovered around 1670 by James Gregory (1638–1675). All of these
series were obtained without any reference to calculus. Newton’s first discov-
eries, dating back to the early months of 1665, resulted from his ability to
express functions in terms of power series. His treatise on calculus, published
posthumously in 1737, was appropriately entitled A treatise of the method of
fluxions and infinite series. Among his many accomplishments, Newton derived
the power series expansion of $(1 + x)^{m/n}$ using algebraic techniques.
This series and the geometric series were crucial in many of his computations.
Newton also displayed the power of his calculus by deriving the power
series expansion of ln(1 + x) using term-by-term integration of the expansion
of 1/(1 + x). The mathematicians Colin Maclaurin (1698–1746) and Brook
Taylor (1685–1731) are prominent for being the first mathematicians to use
the methods of the new calculus in determining the coefficients in the power
series expansion of a function.

8.1 Pointwise Convergence and Interchange of Limits


In this section, we consider a number of questions involving sequences and
series of functions and interchange of limits. Some of these questions were
actually believed to be true by many mathematicians prior to the nineteenth
century. Even Cauchy in his text Cours d’Analyse “proved” a theorem to the
effect that the limit of a convergent sequence of continuous functions was again
continuous. As we will shortly see, this result is false!
To begin our study of sequences and series of functions we first define
what we mean by pointwise convergence of a sequence of functions. If E is a
nonempty set and if for each n ∈ N, fn is a real-valued function on E, then
we say that {fn } is a sequence of real-valued functions on E. For each
x ∈ E, such a sequence gives rise to the sequence {fn (x)} of real numbers,
which may or may not converge. If the sequence {fn (x)} converges for all
x ∈ E, then the sequence {fn } is said to converge pointwise on E, and by the
uniqueness of the limit
\[ f(x) = \lim_{n \to \infty} f_n(x) \]
defines a function f from E into R. We summarize this in the following defi-
nition.

DEFINITION 8.1.1 Let (X, d) be a metric space and let E ⊂ X. Let
$\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued functions defined on E. The sequence
$\{f_n\}$ converges pointwise on E if $\{f_n(x)\}_{n=1}^{\infty}$ converges for every x ∈ E.
If this is the case, then f defined by
\[ f(x) = \lim_{n \to \infty} f_n(x), \quad x \in E, \]
defines a function on E. The function f is called the limit of the sequence $\{f_n\}$.

In terms of $\epsilon$ and $n_o$, the sequence $\{f_n\}$ converges pointwise to f if for
each x ∈ E, given $\epsilon > 0$, there exists a positive integer $n_o = n_o(x, \epsilon)$ such that
\[ |f_n(x) - f(x)| < \epsilon \]
for all $n \ge n_o$. The expression $n_o = n_o(x, \epsilon)$ indicates that the positive integer
$n_o$ may depend both on $\epsilon$ and x ∈ E.
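To see concretely how the integer in the definition can depend on x, the following Python sketch computes the smallest admissible index for the sequence of powers $f_n(x) = x^n$ on [0, 1), whose pointwise limit there is 0 (this sequence is treated in Example 8.1.2(a)). The helper name `smallest_index` and the sample points are assumptions chosen only for illustration.

```python
def smallest_index(x, eps):
    """Smallest n >= 1 with |x**n - 0| < eps, assuming 0 <= x < 1."""
    n = 1
    while abs(x ** n) >= eps:
        n += 1
    return n

eps = 0.01
# The required index grows without bound as x approaches 1.
for x in (0.5, 0.9, 0.99, 0.999):
    print(x, smallest_index(x, eps))
```

For a fixed tolerance, no single index works for every x in [0, 1) at once, which is exactly the dependence on x that the notation records.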
If, as above, $\{f_n\}_{n=1}^{\infty}$ is a sequence of real-valued functions on a nonempty
set E, then with the sequence $\{f_n\}$ we can associate the sequence $\{S_n\}$ of
nth partial sums, where for each n ∈ N, $S_n$ is the real-valued function on E
defined by
\[ S_n(x) = f_1(x) + \cdots + f_n(x) = \sum_{k=1}^{n} f_k(x). \]
The sequence $\{S_n\}$ is called a series of functions on E, denoted by $\sum_{k=1}^{\infty} f_k$ or
simply $\sum f_k$. The series $\sum f_k$ converges pointwise on E if for each x ∈ E
the sequence $\{S_n(x)\}$ of partial sums converges; that is, the series $\sum_{k=1}^{\infty} f_k(x)$
converges for each x ∈ E. If the sequence $\{S_n\}$ converges pointwise to the
function S on E, then S is called the sum of the series $\sum f_k$ and we write
$S = \sum_{k=1}^{\infty} f_k$, or if we wish to emphasize the variable x,
\[ S(x) = \sum_{k=1}^{\infty} f_k(x), \quad x \in E. \]

Suppose $f_n : [a, b] \to \mathbb{R}$ for all n ∈ N, and $f_n(x) \to f(x)$ for each x ∈ [a, b].
Among the questions we want to consider are the following:
(a) If each $f_n$ is continuous at p ∈ [a, b], is the function f continuous at
p? Recall that the function f is continuous at p if and only if
\[ \lim_{t \to p} f(t) = f(p). \]
Since $f(x) = \lim_{n \to \infty} f_n(x)$, what we are really asking is whether
\[ \lim_{t \to p} \Bigl( \lim_{n \to \infty} f_n(t) \Bigr) = \lim_{n \to \infty} \Bigl( \lim_{t \to p} f_n(t) \Bigr)? \]
(b) If for each n ∈ N the function $f_n$ is differentiable at p ∈ [a, b], is f
differentiable at p? If so, does
\[ f'(p) = \lim_{n \to \infty} f_n'(p)? \]
(c) If for each n ∈ N the function $f_n$ is Riemann integrable on [a, b], is f
Riemann integrable? If so, does
\[ \int_a^b f = \lim_{n \to \infty} \int_a^b f_n? \]
We now provide a number of examples to show that the answer to all of
the above questions is in general no.

EXAMPLES 8.1.2 (a) Let E = [0, 1], and for each x ∈ E, n ∈ N, let
$f_n(x) = x^n$. Clearly each $f_n$ is continuous on E. Since $f_n(1) = 1$ for all n,
$\lim_{n \to \infty} f_n(1) = 1$. If 0 ≤ x < 1, then by Theorem 3.2.6(e), $\lim_{n \to \infty} f_n(x) = 0$.
Therefore
\[ \lim_{n \to \infty} f_n(x) = f(x) = \begin{cases} 0, & 0 \le x < 1, \\ 1, & x = 1. \end{cases} \]
The function f however is not continuous on [0, 1]. (In Exercise 1 you will be
asked to sketch the graphs of $f_1$, $f_2$, $f_4$.)
(b) Consider the sequence $\{f_k\}_{k=0}^{\infty}$ defined by
\[ f_k(x) = \frac{x^2}{(1 + x^2)^k}, \quad x \in \mathbb{R}. \]
For each k = 0, 1, 2, ..., $f_k$ is continuous on R. Consider the series $\sum_{k=0}^{\infty} f_k$, which
for each x ∈ R is given by
\[ \sum_{k=0}^{\infty} f_k(x) = x^2 \sum_{k=0}^{\infty} \Bigl( \frac{1}{1 + x^2} \Bigr)^k. \]
We now show that this series converges for all x ∈ R and also find its sum
f. If x = 0, then $f_k(0) = 0$ for all k, and thus f(0) = 0. If x ≠ 0, then
$1/(1 + x^2) < 1$ and hence by Example 3.7.2(a)
\[ x^2 \sum_{k=0}^{\infty} \Bigl( \frac{1}{1 + x^2} \Bigr)^k = x^2 \Biggl[ \frac{1}{1 - \frac{1}{1 + x^2}} \Biggr] = 1 + x^2 = f(x). \]
Therefore
\[ f(x) = \begin{cases} 0, & x = 0, \\ 1 + x^2, & x \neq 0, \end{cases} \]
which again is not continuous on R.
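The jump of the sum at 0 can also be observed numerically. The Python sketch below accumulates the partial sums of the series in part (b) and compares them with the closed form just derived; the function names `partial_sum` and `limit_function` are assumptions introduced only for this illustration.

```python
def partial_sum(x, n):
    """S_n(x) = sum over k = 0..n of x^2 / (1 + x^2)^k, built term by term."""
    ratio = 1.0 / (1.0 + x * x)
    term = x * x
    total = 0.0
    for _ in range(n + 1):
        total += term
        term *= ratio
    return total

def limit_function(x):
    """Sum of the series: 0 at x = 0 and 1 + x^2 otherwise."""
    return 0.0 if x == 0 else 1.0 + x * x

for x in (0.0, 0.1, 1.0):
    print(x, partial_sum(x, 5000), limit_function(x))

# For a fixed n, points close to 0 are still far from the limit value.
print(partial_sum(1e-3, 100), limit_function(1e-3))
```

The last line shows a partial sum that is still near 0 at a point where the limit is near 1, which hints that the convergence is slow near the origin.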
(c) Let $\{x_k\}$ be an enumeration of the rational numbers in [0, 1]. For each
n ∈ N, define $f_n$ as follows:
\[ f_n(x) = \begin{cases} 0, & \text{if } x = x_k, \ 1 \le k \le n, \\ 1, & \text{otherwise.} \end{cases} \]
Since each $f_n$ is continuous except at $x_1, \ldots, x_n$, $f_n$ is Riemann integrable on
[0, 1] with $\int_0^1 f_n(x) \, dx = 1$. On the other hand,
\[ \lim_{n \to \infty} f_n(x) = f(x) = \begin{cases} 0, & \text{if } x \text{ is rational,} \\ 1, & \text{if } x \text{ is irrational,} \end{cases} \]
which by Example 6.1.6(a) is not Riemann integrable on [0, 1].

(d) For x ∈ [0, 1], n ∈ N, let $f_n(x) = n x (1 - x^2)^n$. Since each $f_n$ is
continuous, $f_n$ is Riemann integrable on [0, 1]. If 0 < x < 1, then $0 < 1 - x^2 < 1$,
and thus by Theorem 3.2.6
\[ \lim_{n \to \infty} n x (1 - x^2)^n = 0, \quad \text{if } 0 < x < 1. \]
Finally, since $f_n(0) = f_n(1) = 0$, we have
\[ f(x) = \lim_{n \to \infty} f_n(x) = 0 \quad \text{for all } x \in [0, 1]. \]
Thus f is also Riemann integrable on [0, 1]. On the other hand,
\[ \int_0^1 f_n(x) \, dx = n \int_0^1 x (1 - x^2)^n \, dx = \frac{1}{2} \, \frac{n}{n + 1}. \]
Therefore,
\[ \lim_{n \to \infty} \int_0^1 f_n(x) \, dx = \frac{1}{2} \neq 0 = \int_0^1 f(x) \, dx. \]
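As a numerical check of the computation in part (d), the following Python sketch approximates the integrals with a midpoint rule and compares them with the closed form n/(2(n + 1)). The midpoint rule, the step count, and the names `f` and `midpoint_integral` are assumptions of this sketch, not part of the argument above.

```python
def f(n, x):
    """f_n(x) = n x (1 - x^2)^n on [0, 1]."""
    return n * x * (1.0 - x * x) ** n

def midpoint_integral(n, steps=20000):
    """Midpoint-rule approximation of the integral of f_n over [0, 1]."""
    h = 1.0 / steps
    return h * sum(f(n, (i + 0.5) * h) for i in range(steps))

for n in (1, 10, 100):
    print(n, midpoint_integral(n), n / (2.0 * (n + 1)))

# At a fixed interior point the values themselves tend to 0.
print(f(1000, 0.5))
```

The integrals approach 1/2 even though the function values at each fixed point tend to 0.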

(e) As our final example, consider $f_n(x) = \dfrac{\sin nx}{n}$, x ∈ R. Since |sin nx| ≤ 1
for all x ∈ R and n ∈ N,
\[ f(x) = \lim_{n \to \infty} f_n(x) = 0 \quad \text{for all } x \in \mathbb{R}. \]
Therefore $f'(x) = 0$ for all x. On the other hand,
\[ f_n'(x) = \cos nx. \]
In particular $f_n'(0) = 1$, so that $\lim_{n \to \infty} f_n'(0) = 1 \neq f'(0)$. This example shows
that in general
\[ \frac{d}{dx} \Bigl( \lim_{n \to \infty} f_n(x) \Bigr) \neq \lim_{n \to \infty} f_n'(x). \]
Additional examples are also given in the exercises. □
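The behavior in part (e) is easy to tabulate. The Python sketch below estimates the largest value of $|f_n|$ on a sample grid and evaluates the derivative at 0; the grid on [−5, 5] and the function names are assumptions made for illustration only.

```python
import math

def f(n, x):
    """f_n(x) = sin(n x) / n."""
    return math.sin(n * x) / n

def f_prime(n, x):
    """f_n'(x) = cos(n x)."""
    return math.cos(n * x)

xs = [-5.0 + 0.01 * i for i in range(1001)]
for n in (1, 10, 100, 1000):
    sup_f = max(abs(f(n, x)) for x in xs)
    # sup_f never exceeds 1/n, while the derivative at 0 stays equal to 1.
    print(n, sup_f, f_prime(n, 0.0))
```

The functions shrink to 0 while their slopes at the origin do not, so the derivative of the limit differs from the limit of the derivatives.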

Exercises 8.1
1. Let $f_n$ be as in Example 8.1.2(a). Sketch the graphs of $f_1$, $f_2$, and $f_4$.
2. Find the pointwise limits of each of the following sequences of functions
on the given set.
*a. $\Bigl\{ \dfrac{nx}{1 + nx} \Bigr\}$, x ∈ [0, ∞)
b. $\Bigl\{ \dfrac{\sin nx}{1 + nx} \Bigr\}$, x ∈ [0, ∞)
*c. $\{ (\cos x)^{2n} \}$, x ∈ R.
d. $\{ n x e^{-nx^2} \}$, x ∈ R.


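3. Determine the values of x for which each of the following series converges.
*a. $\sum_{n=1}^{\infty} \dfrac{n x^n}{2^n}$
b. $\sum_{n=1}^{\infty} \dfrac{x^n}{(1 - x)^n}$, x ≠ 1
*c. $\sum_{n=1}^{\infty} \dfrac{1}{3^{nx}}$
d. $\sum_{n=1}^{\infty} \dfrac{2^n (\sin x)^n}{n}$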
4. For n ∈ N, define $f_n : \mathbb{N} \to \mathbb{R}$ by $f_n(m) = n/(m + n)$. Prove that
\[ \lim_{m \to \infty} \Bigl( \lim_{n \to \infty} f_n(m) \Bigr) \neq \lim_{n \to \infty} \Bigl( \lim_{m \to \infty} f_n(m) \Bigr). \]

5. Consider the sequence $\{f_n\}$ with n ≥ 2, defined on [0, 1] by
\[ f_n(x) = \begin{cases} n^2 x, & 0 \le x \le 1/n, \\ 2n - n^2 x, & 1/n < x \le 2/n, \\ 0, & 2/n < x \le 1. \end{cases} \]
a. Sketch the graph of $f_n$ for n = 2, 3, and 4.
b. Prove that $\lim_{n \to \infty} f_n(x) = 0$ for each x ∈ [0, 1].
*c. Show that $\int_0^1 f_n(x) \, dx = 1$ for all n = 2, 3, ....
6. Let $g_n(x) = e^{-nx}/n$, x ∈ [0, ∞), n ∈ N. Find $\lim_{n \to \infty} g_n(x)$ and $\lim_{n \to \infty} g_n'(x)$.
7. Let $f_n(x) = (x/n) e^{-x/n}$, x ∈ [0, ∞).
*a. Show that $\lim_{n \to \infty} f_n(x) = 0$ for all x ∈ [0, ∞).
*b. Given $\epsilon > 0$, does there exist an integer $n_o \in \mathbb{N}$ such that $|f_n(x)| < \epsilon$
for all x ∈ [0, ∞) and all $n \ge n_o$? (Hint: determine the maximum of $f_n$
on [0, ∞).)
c. Answer the same question as in (b) for x ∈ [0, a], a > 0.
8. *If $a_{n,m} \ge 0$, n, m ∈ N, prove that
\[ \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{n,m} = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{n,m}, \]
with the convention that if one of the sums is finite, the other is also and
equality holds, and if one is infinite, so is the other.

8.2 Uniform Convergence


All of the examples of the previous section show that pointwise convergence by
itself is not sufficient to allow the interchange of limit operations; additional
hypotheses are required. It was Weierstrass who realized in the 1850’s what
additional assumptions were needed to ensure that the limit function of a
convergent sequence of continuous functions was again continuous.
Recall from Definition 8.1.1 that a sequence $\{f_n\}$ of real-valued functions defined
on a set E converges pointwise to a function f on E if for each x ∈ E,
given $\epsilon > 0$, there exists a positive integer $n_o = n_o(x, \epsilon)$ such that
\[ |f(x) - f_n(x)| < \epsilon \]
for all $n \ge n_o$. The key here is that the choice of the integer $n_o$ may depend
not only on $\epsilon$, but also on x ∈ E. If this dependence on x can be removed,
then we have the following:

DEFINITION 8.2.1 A sequence of real-valued functions $\{f_n\}$ defined on a
set E converges uniformly to f on E, if for every $\epsilon > 0$, there exists a
positive integer $n_o$ such that
\[ |f_n(x) - f(x)| < \epsilon \]
for all x ∈ E and all $n \ge n_o$. Similarly, a series $\sum_{k=1}^{\infty} f_k$ of real-valued functions
converges uniformly on a set E if and only if the sequence $\{S_n\}$ of partial
sums converges uniformly on E.

The inequality in the definition can also be expressed as
\[ f(x) - \epsilon < f_n(x) < f(x) + \epsilon \]
for all x ∈ E and $n \ge n_o$. If E is a subset of R, then the geometric interpretation
of the above inequality is that for $n \ge n_o$ the graph of $y = f_n(x)$ lies
between the graphs of $y = f(x) - \epsilon$ and $y = f(x) + \epsilon$.

EXAMPLES 8.2.2 (a) For x ∈ [0, 1], n ∈ N, let $f_n(x) = x^n$. By Example
8.1.2(a), the sequence $\{f_n\}$ converges pointwise to the function
\[ f(x) = \begin{cases} 0, & 0 \le x < 1, \\ 1, & x = 1. \end{cases} \]
We now show that the convergence is not uniform. If the convergence were
uniform, then given $\epsilon > 0$, there would exist a positive integer $n_o$ such that
$|f_n(x) - f(x)| < \epsilon$ for all x ∈ [0, 1] and all $n \ge n_o$. In particular,
\[ x^{n_o} < \epsilon \quad \text{for all } x \in [0, 1). \]
This however is a contradiction if $\epsilon < 1$, since $x^{n_o} \to 1$ as $x \to 1^-$. Even though
the convergence is not uniform on [0, 1], the sequence does converge uniformly to 0 on [0, a] for every
a, 0 < a < 1. This follows immediately from the fact that for x ∈ [0, a],
$|f_n(x)| = |x^n| \le a^n$.
(b) Consider the series
\[ \sum_{k=1}^{\infty} \Bigl[ k x e^{-kx^2} - (k - 1) x e^{-(k-1)x^2} \Bigr], \quad 0 \le x \le 1. \]
Since the series is a telescoping series, the nth partial sum $S_n(x)$ is given by
\[ S_n(x) = n x e^{-nx^2}. \]
It is easily shown (Exercise 2(d), Section 8.1) that
\[ S(x) = \lim_{n \to \infty} S_n(x) = 0 \quad \text{for all } x \in [0, 1]. \]
The graphs of $S_4$, $S_8$, $S_{16}$ are given in Figure 8.1.

FIGURE 8.1
Graphs of $S_4$, $S_8$, $S_{16}$

We now show that the convergence is not uniform. Suppose that the sequence
$\{S_n\}$ converges uniformly to 0 on [0, 1]. Then if we take $\epsilon = 1$, there
exists a positive integer $n_o$ such that
\[ |S_n(x) - S(x)| = S_n(x) = n x e^{-nx^2} < 1 \]
for all $n \ge n_o$ and x ∈ [0, 1]. However, for each n ∈ N, by the first derivative
test $S_n$ has a maximum at $x_n = 1/\sqrt{2n}$ with
\[ M_n = \max_{0 \le x \le 1} S_n(x) = \sqrt{\frac{n}{2e}}. \]
This term however is greater than 1 for n ≥ 6. Thus the convergence is not
uniform. Since the maximum of each $S_n$ moves along the x-axis as n → ∞,
such functions are often referred to as “sliding-hump” functions. For this
example $M_n \to \infty$ as n → ∞. □
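The location and height of the sliding hump can be checked numerically. The Python sketch below compares a grid maximum of $S_n$ on [0, 1] with the value obtained from the first derivative test; the grid size and the names `S` and `grid_max` are assumptions of this sketch.

```python
import math

def S(n, x):
    """Partial sum S_n(x) = n x exp(-n x^2)."""
    return n * x * math.exp(-n * x * x)

def grid_max(n, steps=20000):
    """Largest value of S_n on an evenly spaced grid in [0, 1]."""
    return max(S(n, i / steps) for i in range(steps + 1))

for n in (4, 8, 16):
    x_n = 1.0 / math.sqrt(2 * n)          # critical point from S_n'(x) = 0
    predicted = math.sqrt(n / (2 * math.e))
    print(n, x_n, grid_max(n), predicted)
```

As n grows the critical point moves toward 0 while the maximum grows like the square root of n, so no single index can keep every $S_n(x)$ below 1.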

The Cauchy Criterion


Our first criterion for uniform convergence is the Cauchy criterion. The state-
ment of this result is very similar to the definition of Cauchy sequence.
THEOREM 8.2.3 (Cauchy Criterion) A sequence $\{f_n\}$ of real-valued
functions defined on a set E converges uniformly on E if and only if for
every $\epsilon > 0$, there exists an integer $n_o \in \mathbb{N}$ such that
\[ |f_n(x) - f_m(x)| < \epsilon \tag{1} \]
for all x ∈ E and all $n, m \ge n_o$.

Proof. If $\{f_n\}$ converges uniformly to f on E, then the proof that (1) holds
is similar to the proof that every convergent sequence is Cauchy. Conversely,
suppose that the sequence $\{f_n\}$ satisfies (1). Then for each x ∈ E, the sequence
$\{f_n(x)\}$ is a Cauchy sequence in R, and hence converges (Theorem 3.6.5).
Therefore,
\[ f(x) = \lim_{n \to \infty} f_n(x) \]
exists for every x ∈ E.
We now show that the sequence $\{f_n\}$ converges uniformly to f on E. Let
$\epsilon > 0$ be given. By hypothesis, there exists $n_o \in \mathbb{N}$ such that (1) holds for all
x ∈ E and all $n, m \ge n_o$. Fix an $m \ge n_o$. Then
\[ |f(x) - f_m(x)| = \lim_{n \to \infty} |f_n(x) - f_m(x)| \le \epsilon \]
for all x ∈ E. Since the above holds for all $m \ge n_o$, the sequence $\{f_n\}$ converges
uniformly to f on E. □
The analogous result for series is as follows:

COROLLARY 8.2.4 The series $\sum_{k=1}^{\infty} f_k$ of real-valued functions on E
converges uniformly on E if and only if given $\epsilon > 0$, there exists a positive integer
$n_o$ such that
\[ \Bigl| \sum_{k=n+1}^{m} f_k(x) \Bigr| < \epsilon \]
for all x ∈ E and all integers $m > n \ge n_o$.

Proof. The proof of the Corollary follows by applying the previous theorem
to the partial sums $S_n(x)$ of the series $\sum f_k(x)$. □

THEOREM 8.2.5 Suppose the sequence $\{f_n\}$ of real-valued functions on
the set E converges pointwise to f on E. For each n ∈ N, set
\[ M_n = \sup_{x \in E} |f_n(x) - f(x)|. \]
Then $\{f_n\}$ converges uniformly to f on E if and only if $\lim_{n \to \infty} M_n = 0$.

Proof. Exercise 1. □

EXAMPLES 8.2.6 (a) To illustrate the previous theorem we consider the
sequence
\[ S_n(x) = n x e^{-nx^2}, \quad n = 1, 2, \ldots \]
of Example 8.2.2(b). For this sequence, $\lim_{n \to \infty} S_n(x) = 0$ for all x, 0 ≤ x < ∞.
However,
\[ M_n = \sup_{x \in [0, \infty)} S_n(x) = \sqrt{\frac{n}{2e}}, \]
which diverges to ∞. Thus the convergence is not uniform on [0, ∞). However,
the sequence $\{S_n\}$ does converge uniformly to the zero function on [a, ∞) for
every fixed a > 0. (Exercise 6)
(b) Consider the sequence $\{f_n\}$ of Example 8.2.2(a) given by $f_n(x) = x^n$,
x ∈ [0, 1]. This sequence converges pointwise to the function f(x) = 0, 0 ≤ x < 1,
and f(1) = 1. Since
\[ |f_n(x) - f(x)| = \begin{cases} x^n, & 0 \le x < 1, \\ 0, & x = 1, \end{cases} \]
we have
\[ M_n = \sup_{x \in [0, 1]} |f_n(x) - f(x)| = 1. \]
Thus since $\{M_n\}$ does not converge to zero, the sequence $\{f_n\}$ does not converge
uniformly to f on [0, 1]. On the other hand, if 0 < a < 1 is fixed,
then
\[ M_n = \sup_{x \in [0, a]} |f_n(x)| = a^n. \]
Since $\lim_{n \to \infty} a^n = 0$, by Theorem 8.2.5 the sequence $\{f_n\}$ converges uniformly
to the zero function on [0, a] for every fixed a, 0 < a < 1. □
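The two suprema in part (b) can be estimated numerically. The Python sketch below evaluates $x^n$ on a grid in [0, a] and at points approaching 1 from the left; the sample points, the value a = 0.9, and the function names are assumptions chosen for illustration.

```python
def sup_closed(n, a, steps=1000):
    """Grid estimate of the supremum of x**n over the closed interval [0, a]."""
    return max((a * i / steps) ** n for i in range(steps + 1))

def sup_half_open(n, digits=12):
    """Estimate of the supremum of x**n over [0, 1) using x = 1 - 10**(-j)."""
    return max((1.0 - 10.0 ** (-j)) ** n for j in range(1, digits + 1))

for n in (10, 100, 1000):
    # First column stays near 1; the last two agree and tend to 0.
    print(n, sup_half_open(n), sup_closed(n, 0.9), 0.9 ** n)
```

The estimate on [0, 1) stays near 1 for every n, while on [0, 0.9] it equals the endpoint value and tends to 0, in agreement with Theorem 8.2.5.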

The Weierstrass M-Test


The following theorem of Weierstrass provides a very useful test for uniform
convergence of a series of functions.

THEOREM 8.2.7 (Weierstrass M-Test) Suppose $\{f_k\}$ is a sequence of
real-valued functions defined on a set E, and $\{M_k\}$ is a sequence of real numbers
satisfying
\[ |f_k(x)| \le M_k, \quad \text{for all } x \in E \text{ and } k \in \mathbb{N}. \]
If $\sum_{k=1}^{\infty} M_k < \infty$, then $\sum_{k=1}^{\infty} f_k(x)$ converges uniformly and absolutely on E.

Proof. Let $S_n(x) = \sum_{k=1}^{n} f_k(x)$. Then for n > m,
\[ |S_n(x) - S_m(x)| = \Bigl| \sum_{k=m+1}^{n} f_k(x) \Bigr| \le \sum_{k=m+1}^{n} |f_k(x)| \le \sum_{k=m+1}^{n} M_k. \]
Since $\sum M_k$ converges, the sum on the right can be made less than any given $\epsilon > 0$
for all sufficiently large m and n, independently of x ∈ E.
Uniform convergence now follows by the Cauchy Criterion. That $\sum |f_k(x)|$
also converges is clear. □
EXAMPLES 8.2.8 (a) If $\sum a_k$ converges absolutely, then since $|a_k \cos kx| \le |a_k|$
for all x ∈ R, by the Weierstrass M-test the series $\sum a_k \cos kx$ converges
uniformly on R. Similarly for the series $\sum a_k \sin kx$. In particular, the series
\[ \sum_{k=1}^{\infty} \frac{\cos kx}{k^p}, \quad \sum_{k=1}^{\infty} \frac{\sin kx}{k^p}, \quad p > 1, \]
converge uniformly on R.
(b) Consider the series $\sum_{k=1}^{\infty} (x/2)^k$. This is a geometric series that converges
for all x ∈ R satisfying |x| < 2. If 0 < a < 2 and |x| ≤ a, then
\[ \Bigl| \frac{x}{2} \Bigr|^k \le \Bigl( \frac{a}{2} \Bigr)^k. \]
Since a/2 < 1 the series $\sum (a/2)^k$ converges. Thus by the Weierstrass M-test,
the series $\sum_{k=1}^{\infty} (x/2)^k$ converges uniformly on [−a, a] for any a, 0 < a < 2. The
series however does not converge uniformly on (−2, 2) (Exercise 11). □
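The estimate behind the M-test can be observed directly for the cosine series of part (a). The Python sketch below takes p = 2 and $M_k = 1/k^2$ and checks, on a sample grid, that a block of terms is bounded by the corresponding block of the $M_k$, whatever the value of x. The choice of p, the indices, the grid, and the function names are assumptions made only for this illustration.

```python
import math

def partial(x, n, p=2.0):
    """S_n(x) = sum over k = 1..n of cos(k x) / k^p."""
    return sum(math.cos(k * x) / k ** p for k in range(1, n + 1))

def tail_bound(n, m, p=2.0):
    """Sum of M_k = 1 / k^p for k = n+1..m, which bounds |S_m(x) - S_n(x)|."""
    return sum(1.0 / k ** p for k in range(n + 1, m + 1))

n, m = 50, 400
xs = [-10.0 + 0.05 * i for i in range(401)]
worst = max(abs(partial(x, m) - partial(x, n)) for x in xs)
print(worst, tail_bound(n, m))
```

The bound does not depend on x, which is what allows the Cauchy criterion to be applied uniformly on R.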

Although the Weierstrass M-test automatically implies absolute conver-


gence, the following example shows that uniform convergence as a general
rule does not imply absolute convergence.

EXAMPLES 8.2.9 (a) Consider the series
\[ \sum_{k=1}^{\infty} (-1)^{k+1} \frac{x^k}{k}, \quad 0 \le x \le 1. \]
For each k ∈ N, set $a_k(x) = x^k/k$. For x ∈ [0, 1], we have
\[ a_1(x) \ge a_2(x) \ge \cdots \ge 0, \quad \text{and} \quad \lim_{k \to \infty} a_k(x) = 0. \]
Thus by Theorem 7.2.3, the series $\sum (-1)^{k+1} a_k(x)$ converges for all x ∈ [0, 1].
Let
\[ S(x) = \sum_{k=1}^{\infty} (-1)^{k+1} a_k(x). \]
If $S_n(x)$ is the nth partial sum of the series, then by Theorem 7.2.4
\[ |S(x) - S_n(x)| \le a_{n+1}(x) \le \frac{1}{n + 1}, \quad \text{for all } x \in [0, 1]. \]
Thus $\{S_n\}$ converges uniformly to S on [0, 1]. However, the given series does
not converge absolutely when x = 1. The series $\sum (-1)^{k+1} x^k / k$, x ∈ [0, 1],
also provides an example of a series that converges uniformly on [0, 1] but for
which the Weierstrass M-test fails.
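The uniform error bound of part (a) can be tested numerically. The Python sketch below uses the standard fact, not derived in this section, that the sum of this series on [0, 1] is ln(1 + x), and compares the worst error on a grid with the bound 1/(n + 1). The grid and the function names are assumptions of this sketch.

```python
import math

def S_n(x, n):
    """nth partial sum of the series with terms (-1)^(k+1) x^k / k."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, n + 1))

def worst_error(n, steps=200):
    """Largest value of |ln(1 + x) - S_n(x)| on an evenly spaced grid in [0, 1]."""
    return max(abs(math.log1p(i / steps) - S_n(i / steps, n))
               for i in range(steps + 1))

for n in (10, 100, 1000):
    print(n, worst_error(n), 1.0 / (n + 1))
```

The worst error is attained at x = 1, where the series is the alternating harmonic series and absolute convergence fails.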
(b) The converse is also false; absolute convergence need not imply uniform
convergence! As an example, consider the series $\sum_{k=0}^{\infty} x^2 (1 + x^2)^{-k}$ of Example
8.1.2(b). Since all the terms are nonnegative, the series converges absolutely
to
\[ f(x) = \begin{cases} 0, & x = 0, \\ 1 + x^2, & x \neq 0, \end{cases} \]
on R. However, as a consequence of Corollary 8.3.2 of the next section, since
f is not continuous at 0, the convergence cannot be uniform on any interval
containing 0. □

Exercises 8.2
1. Prove Theorem 8.2.5.
2. a. If {fn } and {gn } converge uniformly on a set E, prove that {fn + gn }
converges uniformly on E.
*b. If {fn } and {gn } converge uniformly on a set E, and if in addition
there exist constants M and N such that |fn (x)| ≤ M and |gn (x)| ≤ N
for all n ∈ N and all x ∈ E, prove that {fn gn } converges uniformly on E.
c. Find examples of sequences {fn } and {gn } that converge uniformly
on a set E, but for which {fn gn } does not converge uniformly on E.
3. Show that if {fn } converges uniformly on (a, b) and {fn (a)} and {fn (b)}
converge, then {fn } converges uniformly on [a, b].
4. *Let $f_n(x) = n x (1 - x^2)^n$, 0 ≤ x ≤ 1. Show that $\{f_n\}$ does not converge
uniformly to 0 on [0, 1].
5. Let $f_n(x) = \dfrac{x^n}{1 + x^n}$, 0 ≤ x ≤ 1.
*a. Show that $\{f_n\}$ converges uniformly to 0 on [0, a] for any a, 0 < a < 1.
*b. Does $\{f_n\}$ converge uniformly on [0, 1]?
6. Show that the sequence $\{n x e^{-nx^2}\}$ converges uniformly to 0 on [a, ∞)
for every a > 0.
7. For each n ∈ N, set $f_n(x) = x + \dfrac{x}{n} \sin nx$, x ∈ R. Show that the sequence
$\{f_n\}$ converges uniformly to f(x) = x for all x ∈ [−a, a], a > 0. Does
$\{f_n\}$ converge uniformly to f on R?

8. Show that each of the following series converges uniformly on the indicated
interval.
*a. $\sum_{k=1}^{\infty} \dfrac{1}{k^2 + x^2}$, 0 ≤ x < ∞.
b. $\sum_{k=1}^{\infty} e^{-kx} x^k$, 0 ≤ x < ∞.
*c. $\sum_{k=1}^{\infty} k^2 e^{-kx}$, 1 ≤ x < ∞.
d. $\sum_{k=1}^{\infty} \dfrac{(-1)^{k+1}}{k + x}$, 0 ≤ x < ∞.
9. Test each of the following series for uniform convergence on the indicated
interval.
*a. $\sum_{k=1}^{\infty} \dfrac{\sin 2kx}{(2k + 1)^{3/2}}$, x ∈ R.
b. $\sum_{k=2}^{\infty} \dfrac{x^k}{k (\ln k)^2}$, |x| ≤ 1.
c. $\sum_{k=0}^{\infty} \dfrac{(-1)^{k+1} x^{2k+1}}{2k + 1}$, |x| ≤ 1.
*d. $\sum_{k=1}^{\infty} \sin(x/k^p)$, p > 1, |x| ≤ 2.
*e. $\sum_{k=0}^{\infty} \Bigl( \dfrac{1}{kx + 2} - \dfrac{1}{(k + 1)x + 2} \Bigr)$, 0 ≤ x ≤ 1.
10. Show that each of the following series converges uniformly on [a, ∞) for
any a > 0, but does not converge uniformly on (0, ∞).
*a. $\sum_{k=0}^{\infty} \dfrac{1}{1 + k^2 x}$.
b. $\sum_{k=1}^{\infty} \dfrac{1}{k^{1+x}}$.
11. Show that the series $\sum_{k=1}^{\infty} (x/2)^k$ does not converge uniformly on (−2, 2).
12. *If $\sum_{k=0}^{\infty} a_k$ converges absolutely, prove that $\sum_{k=0}^{\infty} a_k x^k$ converges
uniformly on [−1, 1].
13. Let $\{f_n\}$ be a sequence of functions that converges uniformly to a continuous
function f on (−∞, ∞). Prove that
\[ \lim_{n \to \infty} f_n\bigl( x + \tfrac{1}{n} \bigr) = f(x) \quad \text{for all } x \in (-\infty, \infty). \]
14. Let $\{c_k\}$ be a sequence of real numbers satisfying $\sum |c_k| < \infty$, and let
$\{x_k\}$ be a countable subset of [a, b]. Prove that the series
\[ \sum_{k=1}^{\infty} c_k I(x - x_k) \]
converges uniformly on [a, b]. Here I is the unit jump function defined in
Definition 4.4.9.
15. (Dirichlet Test for Uniform Convergence) Suppose $\{f_k\}$ and $\{g_k\}$
are sequences of functions on a set E satisfying
(a) the partial sums $S_n(x) = \sum_{k=1}^{n} g_k(x)$ are uniformly bounded on E,
i.e., there exists M > 0 such that $|S_n(x)| \le M$ for all n ∈ N and x ∈ E,
(b) $f_k(x) \ge f_{k+1}(x) \ge 0$ for all k ∈ N and x ∈ E, and
(c) $\lim_{k \to \infty} f_k(x) = 0$ uniformly on E.
Prove that $\sum f_k(x) g_k(x)$ converges uniformly on E.
16. *Prove that $\sum_{k=1}^{\infty} \dfrac{\sin kx}{k^p}$ and $\sum_{k=1}^{\infty} \dfrac{\cos kx}{k^p}$ (p > 0) converge uniformly on any
closed interval which does not contain an integer multiple of 2π.
17. Define a sequence of functions $\{f_n\}$ on [0, 1] by
\[ f_n(x) = \begin{cases} \dfrac{1}{n}, & \text{if } \dfrac{1}{2^{n+1}} < x \le \dfrac{1}{2^n}, \\ 0, & \text{elsewhere.} \end{cases} \]
Prove that $\sum_{n=1}^{\infty} f_n(x)$ converges uniformly on [0, 1], but that the
Weierstrass M-test fails.
18. *Let $F_0$ be a bounded Riemann integrable function on [0, 1]. For n ∈ N,
define $F_n(x)$ on [0, 1] by
\[ F_n(x) = \int_0^x F_{n-1}(t) \, dt. \]
Prove that $\sum_{k=0}^{\infty} F_k(x)$ converges uniformly on [0, 1].
k=0

8.3 Uniform Convergence and Continuity


In this section, we will prove that the limit of a uniformly convergent sequence
of continuous functions is again continuous. Prior to proving this result, we
first prove a stronger result that will have additional applications later.

THEOREM 8.3.1 Suppose $\{f_n\}$ is a sequence of real-valued functions that
converges uniformly to a function f on a subset E of a metric space (X, d).
Let p be a limit point of E, and suppose that for each n ∈ N,
\[ \lim_{x \to p} f_n(x) = A_n. \]
Then the sequence $\{A_n\}$ converges and
\[ \lim_{x \to p} f(x) = \lim_{n \to \infty} A_n. \]

Remark. The last statement can be rewritten as
\[ \lim_{x \to p} \Bigl( \lim_{n \to \infty} f_n(x) \Bigr) = \lim_{n \to \infty} \Bigl( \lim_{x \to p} f_n(x) \Bigr). \]
It should be noted that p is not required to be a point of E; only a limit point
of E.
Proof. Let $\epsilon > 0$ be given. Since the sequence $\{f_n\}$ converges uniformly to f
on E, by the Cauchy criterion (Theorem 8.2.3) there exists a positive integer $n_o$ such that
\[ |f_n(x) - f_m(x)| < \epsilon \tag{2} \]
for all $n, m \ge n_o$ and all x ∈ E. Since (2) holds for all x ∈ E, letting x → p
gives
\[ |A_n - A_m| \le \epsilon, \quad \text{for all } n, m \ge n_o. \]
Thus $\{A_n\}$ is a Cauchy sequence in R, which as a consequence of Theorem
3.6.5 converges. Let $A = \lim_{n \to \infty} A_n$.
It remains to be shown that $\lim_{x \to p} f(x) = A$. Again, let $\epsilon > 0$ be given. First,
by the uniform convergence of the sequence $\{f_n\}$ and the convergence of
the sequence $\{A_n\}$, there exists a positive integer m such that
\[ |f(x) - f_m(x)| < \frac{\epsilon}{3} \]
for all x ∈ E, and also that
\[ |A - A_m| < \frac{\epsilon}{3}. \]
Since $\lim_{x \to p} f_m(x) = A_m$, there exists a δ > 0 such that
\[ |f_m(x) - A_m| < \frac{\epsilon}{3} \quad \text{for all } x \in E, \ 0 < d(x, p) < \delta. \]
By the triangle inequality,
\[ |f(x) - A| \le |f(x) - f_m(x)| + |f_m(x) - A_m| + |A_m - A| < \frac{2\epsilon}{3} + |f_m(x) - A_m|. \]
Thus if x ∈ E with 0 < d(x, p) < δ,
\[ |f(x) - A| < \epsilon; \]
i.e., $\lim_{x \to p} f(x) = A$. □

COROLLARY 8.3.2 Let E be a subset of a metric space X.


(a) If {fn } is a sequence of continuous real-valued functions on E, and if
{fn } converges uniformly to f on E, then f is continuous on E.
(b) If {fn } is a sequence of continuous real-valued functions on E, and if
$\sum_{n=1}^{\infty} f_n$ converges uniformly on E, then

\[ S(x) = \sum_{n=1}^{\infty} f_n(x) \]

is continuous on E.

Proof. (a) If p ∈ E is an isolated point, then f is automatically continuous
at p. If p ∈ E is a limit point of E, then since fn is continuous for each n ∈ N,

\[ \lim_{x \to p} f_n(x) = f_n(p). \]

Thus by the previous theorem,

\[ \lim_{x \to p} f(x) = \lim_{n \to \infty} f_n(p) = f(p). \]

Therefore f is continuous at p.
(b) For the proof of (b), let

\[ S_n(x) = \sum_{k=1}^{n} f_k(x). \]

Then for each n ∈ N, Sn is continuous on E. Since {Sn } converges uniformly
to S on E, by part (a) S is also continuous on E. □

EXAMPLE 8.3.3 The sequence $\{x^n\}_{n=1}^{\infty}$, x ∈ [0, 1], of Example 8.1.2(a)
does not converge uniformly on [0, 1] since the limit function

\[ f(x) = \begin{cases} 0, & 0 \le x < 1, \\ 1, & x = 1, \end{cases} \]

is not continuous on [0, 1]. Likewise, the series

\[ \sum_{k=0}^{\infty} x^2 \left( \frac{1}{1 + x^2} \right)^{k} = \begin{cases} 0, & x = 0, \\ 1 + x^2, & x \ne 0, \end{cases} \]

of Example 8.1.2(b) cannot converge uniformly on any interval containing 0,
since the sum of the series is not continuous at 0. □
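The failure of uniform convergence for the sequence {x^n} can also be observed numerically. The following is a minimal sketch, not part of the text's argument: it samples [0, 1) on a uniform grid (the grid size and the helper name `sup_gap` are assumptions made for illustration) and shows that the sampled supremum of |x^n − 0| stays close to 1 instead of tending to 0.

```python
def sup_gap(n, samples=10_000):
    """Largest value of |x**n - 0| over the grid points k/samples in [0, 1)."""
    return max((k / samples) ** n for k in range(samples))

# The true supremum over [0, 1) equals 1 for every n; the grid values
# only approximate it from below.
gaps = {n: sup_gap(n) for n in (1, 10, 100)}
for n, gap in gaps.items():
    print(n, gap)
```

A finer grid pushes each sampled value closer to 1, which is consistent with the limit function being discontinuous at x = 1.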

Dini’s Theorem1
A natural question to ask is whether the converse of Corollary 8.3.2 is true.
Namely, if f and fn are continuous for all n and fn → f pointwise, is the
convergence necessarily uniform? The following example shows that this need
not be the case. However, in Theorem 8.3.5 we will prove that under the additional
assumptions that the domain is compact and that the sequence {fn (x)} is
monotone for each x, the convergence is indeed uniform.

1 This topic is not required in subsequent sections and thus can be omitted on first reading.

EXAMPLE 8.3.4 As in Example 8.2.2(b), for each n ∈ N, let

\[ S_n(x) = n x e^{-n x^2}, \qquad x \in [0, 1]. \]

Then Sn is continuous on [0, 1] for each n, and $\lim_{n \to \infty} S_n(x) = S(x) = 0$, which
is also continuous. However, since

\[ \max_{0 \le x \le 1} S_n(x) = \sqrt{\frac{n}{2e}}, \]

by Theorem 8.2.5 the convergence cannot be uniform. □
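The value of the maximum can be checked numerically. The sketch below is illustrative only; the grid resolution and the function names `S` and `grid_max` are assumptions. It compares a grid maximum of S_n on [0, 1] with the closed form, which is attained at x = 1/√(2n).

```python
import math

def S(n, x):
    """S_n(x) = n x exp(-n x^2)."""
    return n * x * math.exp(-n * x * x)

def grid_max(n, samples=100_000):
    """Maximum of S_n over the grid points k/samples in [0, 1]."""
    return max(S(n, k / samples) for k in range(samples + 1))

for n in (1, 4, 16, 64):
    # Second column: sampled maximum; third column: sqrt(n / (2e)).
    print(n, grid_max(n), math.sqrt(n / (2 * math.e)))
```

The two columns agree closely and grow without bound, so the maxima do not tend to 0.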

THEOREM 8.3.5 (Dini’s Theorem) Suppose K is a compact subset of a
metric space X and {fn } is a sequence of continuous real-valued functions on
K satisfying
(a) {fn } converges pointwise on K to a continuous function f , and
(b) fn (x) ≥ fn+1 (x) for all x ∈ K and n ∈ N.
Then {fn } converges uniformly to f on K.
Proof. For each n ∈ N let gn = fn − f . Then gn is continuous on K,
$g_n(x) \ge g_{n+1}(x) \ge 0$ for all x ∈ K and all n ∈ N, and

\[ \lim_{n \to \infty} g_n(x) = 0 \quad \text{for each } x \in K. \]

Let ε > 0 be given. For each n ∈ N, let

\[ K_n = \{x \in K : g_n(x) \ge \epsilon\}. \]

We first prove that Kn is closed. Let p be a limit point of Kn . By Theorem
3.1.4 there exists a sequence {xk } in Kn which converges to p. Since gn is
continuous and gn (xk ) ≥ ε for all k ∈ N,

\[ g_n(p) = \lim_{k \to \infty} g_n(x_k) \ge \epsilon. \]

Therefore p ∈ Kn . Thus Kn is closed and, as a consequence of Theorem 2.3.5,
also compact. Furthermore, since gn (x) ≥ gn+1 (x) for all x ∈ K,

\[ K_{n+1} \subset K_n \quad \text{for all } n. \]

Finally, since gn (x) → 0 for each x ∈ K,

\[ \bigcap_{n=1}^{\infty} K_n = \emptyset. \]

However, by Theorem 2.3.8 this can only be the case if $K_{n_o} = \emptyset$ for some
$n_o$ ∈ N. Thus for all n ≥ $n_o$,

\[ 0 \le g_n(x) < \epsilon \quad \text{for all } x \in K. \]

Therefore the sequence {gn } converges uniformly to 0 on K. □

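EXAMPLE 8.3.6 We now provide an example to show that compactness is
required. For each n ∈ N, set

\[ f_n(x) = \frac{1}{nx + 1}, \qquad 0 < x < 1. \]

Then {fn (x)} monotonically decreases to f (x) = 0 for each x ∈ (0, 1). However,
$\lim_{x \to 0} f_n(x) = 1$ for each n. If the convergence were uniform on (0, 1), then
Theorem 8.3.1 would give $\lim_{x \to 0} f(x) = 1$, whereas f is identically 0. Hence
the convergence cannot be uniform. □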
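The role of compactness can be seen numerically. The following sketch is an illustration under stated assumptions (the helper names `f` and `sup_on` and the sample points are not from the text): on a compact interval [a, 1] with a > 0 the maximum of f_n tends to 0, as Dini's theorem predicts, while arbitrarily close to 0 the values of f_n stay near 1.

```python
def f(n, x):
    """f_n(x) = 1 / (n x + 1)."""
    return 1.0 / (n * x + 1.0)

def sup_on(a, n):
    """f_n is decreasing in x, so its maximum on [a, 1] is attained at x = a."""
    return f(n, a)

for n in (10, 100, 1000):
    # Maximum on the compact set [0.1, 1] versus a value very close to 0.
    print(n, sup_on(0.1, n), f(n, 1e-9))
```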

The Space C(K)2


We conclude this section with a brief discussion of the space C(K) of all
continuous real-valued functions on a compact set K. If f, g ∈ C(K) and
c ∈ R, then by Theorem 4.2.3 the functions f + g and cf are also continuous
on K. Thus C(K) is a vector space over R where for the zero element we take
the constant function 0; that is, the function given by f (x) = 0 for all x ∈ K.
We define a norm on C(K) as follows.

DEFINITION 8.3.7 For each f ∈ C(K), set

\[ \|f\|_u = \max\{|f(x)| : x \in K\}. \]

The quantity $\|f\|_u$ is called the uniform norm of f on K.

That $\|\ \|_u$ is indeed a norm on C(K) is left to the exercises (Exercise 12).
We now take a closer look at uniform convergence of a sequence of continuous
real-valued functions and also introduce the concept of convergence in norm.
We first note that a sequence {fn } of continuous real-valued functions on K is
nothing else but a sequence in the set C(K). Suppose that the sequence {fn }
in C(K) converges uniformly to f on K. Then by Corollary 8.3.2 the function
f ∈ C(K). By the definition of uniform convergence, given ε > 0 there exists
a positive integer $n_o$ such that

\[ |f_n(x) - f(x)| < \epsilon \]

for all x ∈ K and all n ≥ $n_o$. Since fn − f is continuous, if n ≥ $n_o$, by Corollary
4.2.9

\[ \|f_n - f\|_u = \max\{|f_n(x) - f(x)| : x \in K\} < \epsilon. \]

Therefore $\|f_n - f\|_u < \epsilon$ for all n ≥ $n_o$.
Conversely, suppose f, fn ∈ C(K) satisfy the following: For each ε > 0
there exists a positive integer $n_o$ such that $\|f - f_n\|_u < \epsilon$ for all n ≥ $n_o$. But
then |f (x) − fn (x)| < ε for all x ∈ K and all n ≥ $n_o$; i.e., {fn } converges
uniformly to f on K. This proves the following theorem.

2 This topic can also be omitted on first reading. The concept of a complete normed
linear space (Definition 8.3.10) is only required in Section 10.9.



THEOREM 8.3.8 A sequence {fn } in C(K) converges uniformly to f ∈
C(K) if and only if given ε > 0, there exists $n_o$ ∈ N such that $\|f - f_n\|_u < \epsilon$
for all n ≥ $n_o$.

Using the above as motivation, we define convergence in a normed linear
space as follows:

DEFINITION 8.3.9 Let $(X, \|\ \|)$ be a normed linear space. A sequence
{xn } in X converges in norm if there exists x ∈ X such that for every
ε > 0, there exists a positive integer $n_o$ such that $\|x - x_n\| < \epsilon$ for all n ≥ $n_o$.
If this is the case, we say that {xn } converges in norm to x and denote this
by $x_n \xrightarrow{\|\ \|} x$ as n → ∞.

From the definition it is clear that a sequence {xn } in X converges in norm
to x ∈ X if and only if $\lim_{n \to \infty} \|x_n - x\| = 0$. Also, as in the proof of Theorem
3.1.4, if {xn } converges in norm then its limit is unique. Using the norm it
is also possible to define what we mean by a Cauchy sequence in a normed
linear space.

DEFINITION 8.3.10 (a) A sequence {xn } in a normed linear space
$(X, \|\ \|)$ is a Cauchy sequence if for every ε > 0 there exists a positive
integer $n_o$ such that

\[ \|x_n - x_m\| < \epsilon \]

for all integers n, m ≥ $n_o$.
(b) A normed linear space $(X, \|\ \|)$ is complete if every Cauchy sequence
in X converges in norm to an element of X.

As for sequences of real numbers, every sequence {xn } in X that converges
in norm to x ∈ X is a Cauchy sequence. In Theorem 3.6.5 we proved that the
normed linear space (R, | |) is complete. The following theorem proves that
$(C(K), \|\ \|_u)$ is also complete.

THEOREM 8.3.11 If K is a compact subset of a metric space X, then the
normed linear space $(C(K), \|\ \|_u)$ is complete.

Proof. Let {fn } be a Cauchy sequence in C(K); i.e., given ε > 0, there exists
a positive integer $n_o$ such that $\|f_n - f_m\|_u < \epsilon$ for all n, m ≥ $n_o$. But then

\[ |f_n(x) - f_m(x)| \le \|f_n - f_m\|_u < \epsilon \]

for all x ∈ K and all n, m ≥ $n_o$. Thus by Theorem 8.2.3 and Corollary
8.3.2 the sequence {fn } converges uniformly to a continuous function f on K.
Finally, since the convergence is uniform, given ε > 0, there exists an integer
$n_o$ such that

\[ |f_n(x) - f(x)| < \epsilon \quad \text{for all } x \in K \text{ and } n \ge n_o. \]

As a consequence we have $\|f_n - f\|_u < \epsilon$ for all n ≥ $n_o$. Therefore the sequence
{fn } converges to f in the norm $\|\ \|_u$. □

Contraction Mappings
In Exercise 13 of Section 4.3 we defined the notion of a contractive function
on a subset E of R. We now extend this to normed linear spaces.

DEFINITION 8.3.12 Let $(X, \|\ \|)$ be a normed linear space. A mapping
(function) T : X → X is called a contraction mapping (function) if there
exists a constant c, 0 < c < 1, such that

\[ \|T(x) - T(y)\| \le c\,\|x - y\| \]

for all x, y ∈ X.

Clearly every contraction mapping on X is continuous, in fact uniformly
continuous on X. As in Exercise 13 of Section 4.3, we now prove that if T is a
contraction mapping on a complete normed linear space $(X, \|\ \|)$, then T has
a unique fixed point in X.

THEOREM 8.3.13 Let $(X, \|\ \|)$ be a complete normed linear space and let
T : X → X be a contraction mapping. Then there exists a unique point x ∈ X
such that T (x) = x.
Proof. Suppose T : X → X satisfies $\|T(x) - T(y)\| \le c\,\|x - y\|$ for fixed
c, 0 < c < 1, and all x, y ∈ X. We now define a sequence {xn } in X as
follows: Let $x_o$ ∈ X be arbitrary. For n ∈ N set $x_n = T(x_{n-1})$. That is,
$x_1 = T(x_o)$, $x_2 = T(x_1)$, etc. Since

\[ \|x_{n+1} - x_n\| = \|T(x_n) - T(x_{n-1})\| \le c\,\|x_n - x_{n-1}\|, \]

the sequence {xn } is a contractive sequence in X (see Definition 3.6.8). An
argument similar to the one used for a contractive sequence in Section 3.6
shows that

\[ \|x_{n+m} - x_n\| \le \frac{c^n}{1 - c}\,\|x_1 - x_o\| \]

for all n, m ∈ N. The details are left as an exercise (Exercise 10). Since $c^n \to 0$,
the sequence {xn } is a Cauchy sequence in X. By completeness, the sequence
{xn } converges (in the norm) to some x ∈ X. But by continuity of the mapping
T,

\[ x = \lim_{n \to \infty} x_{n+1} = \lim_{n \to \infty} T(x_n) = T(x), \]

i.e., T (x) = x. Suppose y ∈ X also satisfies T (y) = y. But then

\[ \|y - x\| = \|T(y) - T(x)\| \le c\,\|y - x\|. \]

Since 0 < c < 1, the above can be true if and only if $\|y - x\| = 0$, that is,
y = x. Thus x is unique. □
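The iteration used in the proof is easy to carry out in the complete normed linear space (R, | |). The sketch below is an illustration only: the map T(x) = (cos x)/2, the starting point 3.0, and the number of steps are assumptions chosen for the example. By the mean value theorem this T is a contraction with constant c = 1/2, so the bound from the proof can be checked along the computed sequence.

```python
import math

def T(x):
    """A contraction on R: |T(x) - T(y)| <= 0.5 |x - y| since |T'(x)| <= 0.5."""
    return 0.5 * math.cos(x)

def iterate(x0, steps):
    """Return [x_0, x_1, ..., x_steps] with x_n = T(x_{n-1})."""
    xs = [x0]
    for _ in range(steps):
        xs.append(T(xs[-1]))
    return xs

c = 0.5
xs = iterate(3.0, 60)
fixed = xs[-1]                      # numerical approximation of the fixed point
print(fixed, abs(T(fixed) - fixed))

# Check ||x_{n+m} - x_n|| <= c^n / (1 - c) * ||x_1 - x_0|| for m = 5.
for n in range(5):
    lhs = abs(xs[n + 5] - xs[n])
    rhs = c**n / (1 - c) * abs(xs[1] - xs[0])
    print(n, lhs, rhs)
```

Starting from any other point produces the same limit, in agreement with the uniqueness part of the theorem.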

Exercises 8.3

1. *Show that the series $\sum_{k=0}^{\infty} x(1 - x)^k$ cannot converge uniformly for 0 ≤ x ≤ 1.
2. For n ∈ N, let $f_n(x) = x^n/(1 + x^n)$, x ∈ [0, 1]. Prove that the sequence
{fn } does not converge uniformly on [0, 1].
3. Give an example of a sequence of functions that are not continuous at
any point but which converges uniformly to a continuous function.
4. *Suppose that f is uniformly continuous on R. For each n ∈ N, set
$f_n(x) = f(x + \frac{1}{n})$. Prove that the sequence {fn } converges uniformly to
f on R.
5. Let {fn } be a sequence of continuous real-valued functions that converges
uniformly to a function f on a set E ⊂ R. Prove that $\lim_{n \to \infty} f_n(x_n) = f(x)$
for every sequence {xn } ⊂ E such that xn → x ∈ E.
6. *Let E ⊂ R and let D be a dense subset of E. If {fn } is a sequence of
continuous real-valued functions on E, and if {fn } converges uniformly
on D, prove that {fn } converges uniformly on E. (Recall that D is dense
in E if every point of E is either a point of D or a limit point of D.)
7. Find a sequence {fn } in C[0, 1] with $\|f_n\|_u = 1$ such that no subsequence
of {fn } converges (in norm) in C[0, 1].
8. Suppose {fn } is a sequence of continuous functions on [a, b] that converges
uniformly on [a, b]. For each x ∈ [a, b], set $g(x) = \sup_n \{f_n(x)\}$.
a. Prove that g is continuous on [a, b].
b. Show by example that the conclusion may be false if the sequence
{fn } converges only pointwise on [a, b].
9. *For each n ∈ N and x ∈ R, set $f_n(x) = (1 + \frac{x}{n})^n$. Use Dini’s theorem to
prove that the sequence {fn } converges uniformly to $e^x$ on [a, b] for any
fixed a, b ∈ R.
10. Let $(X, \|\ \|)$ be a normed linear space and let T : X → X be a contraction
mapping with constant c, 0 < c < 1. If {xn } is the sequence in X as
defined in the proof of Theorem 8.3.13, prove that

\[ \|x_{n+m} - x_n\| \le \frac{c^n}{1 - c}\,\|x_1 - x_o\| \quad \text{for all } n, m \in N. \]

11. Define T : C[0, 1] → C[0, 1] by $(T\varphi)(x) = \int_0^x \varphi(t)\,dt$, 0 ≤ x ≤ 1,
ϕ ∈ C[0, 1], and set $T^2 = T \circ T$.
*a. Prove that $|(T^2\varphi)(x)| \le \frac{1}{2} x^2 \|\varphi\|_u$.
b. Show that $T^2$ is a contraction mapping on C[0, 1] and thus has a fixed
point in C[0, 1].
c. Prove that T has a fixed point in C[0, 1].
12. Prove that $(C(K), \|\ \|_u)$ is a normed linear space.
13. Prove that $(\ell^2, \|\ \|_2)$ is a complete normed linear space.

8.4 Uniform Convergence and Integration


In Example 8.1.2(c) we provided an example of a sequence of Riemann integrable
functions that converges pointwise, but for which the limit function
is not Riemann integrable. Furthermore, in Example 8.1.2(d) we provided an
example of a sequence of continuous functions on [0, 1] for which $\lim_{n \to \infty} f_n(x) = 0$
for all x ∈ [0, 1] but for which

\[ \int_0^1 f_n(x)\,dx = \frac{1}{2}\,\frac{n}{n+1}. \]

Thus $\lim_{n \to \infty} \int_0^1 f_n(x)\,dx \ne \int_0^1 \lim_{n \to \infty} f_n(x)\,dx$. Hence, pointwise convergence, even if
the limit function is Riemann integrable, is also not sufficient for the interchange
of limits.
In this section, we will prove that uniform convergence of a sequence {fn }
of Riemann integrable functions is again sufficient for the limit function f
to be Riemann integrable, and for convergence of the definite integrals of fn
to the definite integral of f . The analogous result for the Riemann-Stieltjes
integral is left to the exercises (Exercise 2).

THEOREM 8.4.1 Suppose fn ∈ R[a, b] for all n ∈ N, and suppose that the
sequence {fn } converges uniformly to f on [a, b]. Then f ∈ R[a, b] and

\[ \int_a^b f(x)\,dx = \lim_{n \to \infty} \int_a^b f_n(x)\,dx. \]

Proof. For each n ∈ N, set

\[ \epsilon_n = \sup_{x \in [a,b]} |f_n(x) - f(x)|. \]

Since fn → f uniformly on [a, b], by Theorem 8.2.5, $\lim_{n \to \infty} \epsilon_n = 0$. Also, for all
x ∈ [a, b],

\[ f_n(x) - \epsilon_n \le f(x) \le f_n(x) + \epsilon_n. \]

Hence

\[ \int_a^b (f_n - \epsilon_n) \le \underline{\int_a^b} f \le \overline{\int_a^b} f \le \int_a^b (f_n + \epsilon_n). \tag{3} \]

Therefore

\[ 0 \le \overline{\int_a^b} f - \underline{\int_a^b} f \le 2\epsilon_n [b - a]. \]

Since $\epsilon_n \to 0$, the upper and lower integrals of f coincide; that is, f ∈ R[a, b]. Also by (3),

\[ \left| \int_a^b f(x)\,dx - \int_a^b f_n(x)\,dx \right| \le \epsilon_n [b - a], \]

and thus

\[ \lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx. \] □
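The estimate obtained from (3) can be illustrated with a concrete uniformly convergent sequence. The sketch below is an assumption-laden example, not taken from the text: it uses f_n(x) = sin(nx)/n on [0, π], for which f = 0, the supremum of |f_n − f| is at most 1/n, and the integral of f_n has the closed form (1 − cos nπ)/n².

```python
import math

def integral_fn(n):
    """Exact value of the integral of sin(n x)/n over [0, pi]."""
    return (1.0 - math.cos(n * math.pi)) / n**2

def bound(n):
    """epsilon_n * (b - a), using epsilon_n <= 1/n and b - a = pi."""
    return (1.0 / n) * math.pi

for n in (1, 2, 5, 10, 50):
    # |integral of f_n - integral of f| is bounded by epsilon_n (b - a).
    print(n, abs(integral_fn(n) - 0.0), bound(n))
```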

COROLLARY 8.4.2 If fk ∈ R[a, b] for all k ∈ N, and if

\[ f(x) = \sum_{k=1}^{\infty} f_k(x), \qquad x \in [a, b], \]

where the series converges uniformly on [a, b], then f ∈ R[a, b] and

\[ \int_a^b f(x)\,dx = \sum_{k=1}^{\infty} \int_a^b f_k(x)\,dx. \]

Proof. Apply the previous theorem to $S_n(x) = \sum_{k=1}^{n} f_k(x)$, which by Theorem
6.2.1 is integrable for each n ∈ N. □
Although uniform convergence is sufficient for the conclusion of Theorem
8.4.1, it is not necessary. For example, if $f_n(x) = x^n$, x ∈ [0, 1], then {fn }
converges pointwise, but not uniformly, to the function

\[ f(x) = \begin{cases} 0, & 0 \le x < 1, \\ 1, & x = 1. \end{cases} \]

The function f ∈ R[0, 1] and

\[ \lim_{n \to \infty} \int_0^1 x^n\,dx = \lim_{n \to \infty} \frac{1}{n+1} = 0 = \int_0^1 f(x)\,dx. \]

In Section 10.6, using results from the Lebesgue theory of integration, we will
be able to prove a stronger convergence result that does not require uniform
convergence of the sequence {fn }. However, it does require that the limit
function f is Riemann integrable. For completeness we include a statement of
that result at this point.

THEOREM 8.4.3 (Bounded Convergence Theorem) Suppose f and
fn , n ∈ N, are Riemann integrable functions on [a, b] with $\lim_{n \to \infty} f_n(x) = f(x)$
for all x ∈ [a, b]. Suppose also that there exists a positive constant M such
that |fn (x)| ≤ M for all x ∈ [a, b] and all n ∈ N. Then

\[ \lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx. \]

It is easily checked that the sequence {fn } on [0, 1], where for each n ∈ N
$f_n(x) = x^n$, satisfies the hypothesis of the previous theorem. Also, since the
limit function f is continuous except at x = 1, f ∈ R[0, 1]. On the other hand,
the sequence {fn } of Example 8.1.2(d) does not satisfy the hypothesis of the
theorem.
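For the sequence f_n(x) = x^n the conclusion of the bounded convergence theorem can be observed directly. The following sketch is illustrative only; the midpoint rule, the number of subintervals, and the name `integral_x_pow` are assumptions made for the example. It approximates the integrals over [0, 1] and compares them with the exact value 1/(n + 1), which tends to 0, the integral of the limit function.

```python
def integral_x_pow(n, pieces=20_000):
    """Midpoint-rule approximation of the integral of x**n over [0, 1]."""
    h = 1.0 / pieces
    return sum(((j + 0.5) * h) ** n for j in range(pieces)) * h

for n in (1, 5, 50, 200):
    # Second column: numerical integral; third column: exact value 1/(n+1).
    print(n, integral_x_pow(n), 1.0 / (n + 1))
```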

Exercises 8.4
1. *If $\sum |a_k| < \infty$, prove that

\[ \int_0^1 \left( \sum_{k=0}^{\infty} a_k x^k \right) dx = \sum_{k=0}^{\infty} \frac{a_k}{k+1}. \]

2. Let α be a monotone increasing function on [a, b]. Suppose fn ∈ R(α)
on [a, b] for all n ∈ N, and suppose that the sequence {fn } converges
uniformly to f on [a, b]. Then f ∈ R(α) and

\[ \int_a^b f\,d\alpha = \lim_{n \to \infty} \int_a^b f_n\,d\alpha. \]

3. For each n ∈ N, let $f_n(x) = nx/(1 + nx)$, x ∈ [0, 1]. Show that the
sequence {fn } converges pointwise, but not uniformly, to an integrable
function f on [0, 1], and that

\[ \lim_{n \to \infty} \int_0^1 f_n(x)\,dx = \int_0^1 f(x)\,dx. \]

4. *If f is Riemann integrable on [0, 1], use the bounded convergence theorem
to prove that

\[ \lim_{n \to \infty} \int_0^1 x^n f(x)\,dx = 0. \]
5. Let {fn } be a sequence in R[a, b] that converges uniformly to f ∈ R[a, b].
For n ∈ N set $F_n(x) = \int_a^x f_n$, and let $F(x) = \int_a^x f$, x ∈ [a, b]. Prove that
{Fn } converges uniformly to F on [a, b].
6. *Suppose f : [0, 1] → R is continuous. Prove that $\lim_{n \to \infty} \int_0^1 f(x^n)\,dx = f(0)$.

7. *Let {rn } be an enumeration of the rational numbers in [0, 1], and let
f : [0, 1] → R be defined by

\[ f(x) = \sum_{k=1}^{\infty} \frac{1}{2^k}\, I(x - r_k), \]

where I is the unit jump function of Definition 4.4.9. Prove that
f ∈ R[0, 1].
8. Define g on R by g(x) = x − [x], where [x] denotes the greatest integer
function. Prove that the function

\[ f(x) = \sum_{n=1}^{\infty} \frac{g(nx)}{n^2} \]

is Riemann integrable on [0, 1]. (This function was given by Riemann as
an example of a function that is not integrable according to Cauchy’s
definition.)

9. *Let g ∈ R[a, b] and let {fn } be a sequence of Riemann integrable
functions on [a, b] that converges uniformly to f on [a, b]. Prove that

\[ \lim_{n \to \infty} \int_a^b f_n g = \int_a^b f g. \]

10. *Let g be a nonnegative real-valued function on [0, ∞) for which $\int_0^{\infty} g(x)\,dx$
is finite. Suppose {fn } is a sequence of real-valued functions on [0, ∞)
satisfying fn ∈ R[0, c] for every c > 0 and |fn (x)| ≤ g(x) for all x ∈ [0, ∞)
and n ∈ N. If the sequence {fn } converges uniformly to f on [0, c] for
every c > 0, prove that

\[ \lim_{n \to \infty} \int_0^{\infty} f_n(x)\,dx = \int_0^{\infty} f(x)\,dx. \]

11. For f ∈ C[a, b], define $\|f\|_1 = \int_a^b |f(x)|\,dx$.
a. Prove that $(C[a, b], \|\ \|_1)$ is a normed linear space.
*b. Show by example that the normed linear space $(C[a, b], \|\ \|_1)$ is not
complete.

8.5 Uniform Convergence and Differentiation


In this section, we consider the question of interchange of limits and differ-
entiation. Example 8.1.2(e) shows that even if the sequence {fn } converges
uniformly to f , this is not sufficient for convergence of the sequence {fn′ }
of derivatives. Example 8.5.3 will further demonstrate very dramatically the
failure of the interchange of limits and differentiation. There we will give an
example of a series, each of whose terms has derivatives of all orders, that
converges uniformly to a continuous function f , but for which f ′ fails to exist
at every point of R. Clearly, uniform convergence of the sequence {fn } is not
sufficient. What is required is uniform convergence of the sequence {fn′ }.

THEOREM 8.5.1 Suppose {fn } is a sequence of differentiable functions on
[a, b]. If
(a) {fn′ } converges uniformly on [a, b], and
(b) {fn ($x_o$)} converges for some $x_o$ ∈ [a, b],
then {fn } converges uniformly to a function f on [a, b], with

\[ f'(x) = \lim_{n \to \infty} f_n'(x). \]

Remarks. (a) Convergence of {fn ($x_o$)} at some $x_o$ ∈ [a, b] is required. For
example, if we let gn (x) = fn (x) + n, then gn′ (x) = fn′ (x), but {gn (x)} need
not converge for any x ∈ [a, b]. In Exercise 1 you will be asked to show that
uniform convergence of {fn′ } is also required; pointwise convergence is not
sufficient.
(b) If in addition to the hypotheses we assume that fn′ is continuous
on [a, b], then a much shorter and easier proof can be provided using the
fundamental theorem of calculus. Since fn′ is continuous, by Theorem 6.3.2

\[ f_n(x) = f_n(x_o) + \int_{x_o}^{x} f_n'(t)\,dt \]

for all x ∈ [a, b]. The result can now be proved using Corollary 8.3.2 and
Theorem 8.4.1. The details are left to the exercises (Exercise 2).
Proof. Let ε > 0 be given. Since {fn ($x_o$)} converges and {fn′ } converges
uniformly, there exists $n_o$ ∈ N such that

\[ |f_n(x_o) - f_m(x_o)| < \frac{\epsilon}{2} \quad \text{for all } n, m \ge n_o, \tag{4} \]

and

\[ |f_n'(t) - f_m'(t)| < \frac{\epsilon}{2(b - a)} \quad \text{for all } t \in [a, b] \text{ and all } n, m \ge n_o. \tag{5} \]

Apply the mean value theorem to the functions fn − fm with n, m ≥ $n_o$ fixed.
Then for x, y ∈ [a, b], there exists t between x and y such that

\[ |(f_n(x) - f_m(x)) - (f_n(y) - f_m(y))| = |[f_n'(t) - f_m'(t)](x - y)|. \]

Thus by (5),

\[ |(f_n(x) - f_m(x)) - (f_n(y) - f_m(y))| \le \frac{\epsilon}{2(b - a)}\,|x - y| \le \frac{\epsilon}{2}. \tag{6} \]

Take y = $x_o$ in (6). Then by (4) and (6), for all x ∈ [a, b] and n, m ≥ $n_o$,

\begin{align*}
|f_n(x) - f_m(x)| &\le |(f_n(x) - f_m(x)) - (f_n(x_o) - f_m(x_o))| + |f_n(x_o) - f_m(x_o)| \\
&< \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.
\end{align*}

Hence by Theorem 8.2.3 the sequence {fn } converges uniformly on [a, b]. Let

\[ f(x) = \lim_{n \to \infty} f_n(x). \]

It remains to be shown that f is differentiable and that

\[ f'(x) = \lim_{n \to \infty} f_n'(x) \]

for all x ∈ [a, b]. Fix p ∈ [a, b], and for t ≠ p, t ∈ [a, b], define

\[ g_n(t) = \frac{f_n(t) - f_n(p)}{t - p}, \qquad g(t) = \frac{f(t) - f(p)}{t - p}. \]

Then gn (t) → g(t) for each t ∈ [a, b], t ≠ p, and for each n

\[ \lim_{t \to p} g_n(t) = f_n'(p). \]

Let E = [a, b] \ {p}. Take y = p in inequality (6) and divide by |t − p|. Then for all t ∈ E,

\[ |g_n(t) - g_m(t)| \le \frac{\epsilon}{2(b - a)} \quad \text{for all } n, m \ge n_o. \]

Therefore {gn } converges uniformly to g on E. Hence by Theorem 8.3.1,

\[ f'(p) = \lim_{t \to p} g(t) = \lim_{n \to \infty} f_n'(p). \] □

EXAMPLE 8.5.2 To illustrate the previous theorem consider the series

\[ \sum_{k=1}^{\infty} \frac{\sin kx}{2^k}. \]

Since $|2^{-k} \sin kx| \le 2^{-k}$ for all x ∈ R, by the Weierstrass M-test this series
converges uniformly to a function S on R. For n ∈ N, let

\[ S_n(x) = \sum_{k=1}^{n} \frac{\sin kx}{2^k}. \]

Then

\[ S_n'(x) = \sum_{k=1}^{n} \frac{k \cos kx}{2^k}. \]

Since $\sum k 2^{-k}$ converges, by the Weierstrass M-test the sequence {Sn′ } converges
uniformly on R. Thus by Theorem 8.5.1,

\[ S'(x) = \lim_{n \to \infty} S_n'(x) = \sum_{k=1}^{\infty} \frac{k \cos kx}{2^k}. \] □
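Term-by-term differentiation of this series can be checked numerically. The sketch below is an illustration under stated assumptions: the series are truncated after 60 terms, the derivative of S is approximated by a central difference with step 1e-5, and the names `S`, `dS`, and `central_difference` are chosen for the example.

```python
import math

def S(x, terms=60):
    """Partial sum of sum_{k>=1} sin(kx) / 2^k."""
    return sum(math.sin(k * x) / 2**k for k in range(1, terms + 1))

def dS(x, terms=60):
    """Partial sum of the differentiated series sum_{k>=1} k cos(kx) / 2^k."""
    return sum(k * math.cos(k * x) / 2**k for k in range(1, terms + 1))

def central_difference(g, x, h=1e-5):
    """Symmetric difference quotient approximating g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

for x in (0.0, 0.7, 2.0):
    # The two columns should agree up to the discretization error.
    print(x, central_difference(S, x), dS(x))
```

At x = 0 the differentiated series reduces to the sum of k/2^k, which equals 2.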

A Continuous Nowhere Differentiable Function


We conclude this section with the following example of Weierstrass of a con-
tinuous function which is nowhere differentiable. When this example first ap-
peared in 1874, it astounded the mathematical community.

EXAMPLE 8.5.3 Consider the function f on R defined by

\[ f(x) = \sum_{k=0}^{\infty} \frac{\cos a^k \pi x}{2^k}, \tag{7} \]

where a is an odd positive integer satisfying a > 3π + 2. Since

\[ \left| \frac{\cos a^k \pi x}{2^k} \right| \le \frac{1}{2^k}, \]

the series (7) converges uniformly on R by the Weierstrass M-test, and hence f is
continuous. The graphs of the partial sums (with a = 13) $S_1(x) = \cos(\pi x)$,
$S_2(x) = S_1(x) + \frac{1}{2} \cos(13\pi x)$, and $S_3(x) = S_2(x) + \frac{1}{2^2} \cos(13^2 \pi x)$ are
illustrated in Figure 8.2.

FIGURE 8.2
Graphs of S1 , S2 , S3

We prove that f is nowhere differentiable by showing that for each x ∈ R,
there exists a sequence hn → 0 such that

\[ \lim_{n \to \infty} \left| \frac{f(x + h_n) - f(x)}{h_n} \right| = \infty. \]
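For n ∈ N, set

\[ S_n(x) = \sum_{k=0}^{n-1} \frac{\cos a^k \pi x}{2^k}, \qquad R_n(x) = \sum_{k=n}^{\infty} \frac{\cos a^k \pi x}{2^k}. \]

Then for h > 0,

\[ \frac{f(x + h) - f(x)}{h} = \frac{S_n(x + h) - S_n(x)}{h} + \frac{R_n(x + h) - R_n(x)}{h}. \]

Our first step will be to estimate

\[ \left| \frac{S_n(x + h) - S_n(x)}{h} \right| \]

from above. By the mean value theorem,

\[ \frac{\cos a^k \pi (x + h) - \cos a^k \pi x}{h} = -a^k \pi \sin[a^k \pi (x + \zeta)] \]

for some ζ, 0 < ζ < h. Since

\[ |-a^k \pi \sin[a^k \pi (x + \zeta)]| \le a^k \pi, \]

we obtain

\begin{align*}
\left| \frac{S_n(x + h) - S_n(x)}{h} \right|
&\le \sum_{k=0}^{n-1} \frac{1}{2^k} \left| \frac{\cos a^k \pi (x + h) - \cos a^k \pi x}{h} \right| \tag{8} \\
&\le \pi \sum_{k=0}^{n-1} \left( \frac{a}{2} \right)^{k} = \pi\, \frac{1 - \left( \frac{a}{2} \right)^{n}}{1 - \frac{a}{2}} \\
&< \frac{2\pi}{a - 2} \left( \frac{a}{2} \right)^{n}.
\end{align*}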
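We now proceed to obtain a lower estimate on the term involving Rn . To
do so we write

\[ a^n x = k_n + \delta_n, \]

where $k_n$ is an integer, and $-\frac{1}{2} \le \delta_n < \frac{1}{2}$. Set

\[ h_n = \frac{1 - \delta_n}{a^n}. \]

Since $-\frac{1}{2} \le \delta_n < \frac{1}{2}$, we have $\frac{3}{2} \ge 1 - \delta_n > \frac{1}{2}$. Therefore,

\[ \frac{2}{3}\, a^n \le \frac{1}{h_n} < 2 a^n. \tag{9} \]

For k ≥ n,

\begin{align*}
a^k \pi (x + h_n) &= a^{k-n} a^n \pi (x + h_n) \\
&= a^{k-n} \pi (a^n x + (1 - \delta_n)) = a^{k-n} \pi (k_n + 1).
\end{align*}

Since $a^{k-n}$ is odd and $k_n$ is an integer,

\[ \cos[a^k \pi (x + h_n)] = \cos[a^{k-n} \pi (k_n + 1)] = (-1)^{k_n + 1}. \]

Also, $a^k \pi x = a^{k-n} a^n \pi x = a^{k-n} \pi (k_n + \delta_n)$. Using the trigonometric identity

\[ \cos(A + B) = \cos A \cos B - \sin A \sin B \]

and the fact that $\sin(k_n a^{k-n} \pi) = 0$, we have

\begin{align*}
\cos a^k \pi x &= \cos(a^{k-n} k_n \pi) \cos(a^{k-n} \delta_n \pi) \\
&= (-1)^{k_n} \cos(a^{k-n} \delta_n \pi).
\end{align*}

Therefore,

\begin{align*}
\cos a^k \pi (x + h_n) - \cos a^k \pi x &= (-1)^{k_n + 1} - (-1)^{k_n} \cos(a^{k-n} \delta_n \pi) \\
&= (-1)^{k_n + 1} [1 + \cos(a^{k-n} \delta_n \pi)].
\end{align*}

As a consequence,

\begin{align*}
\left| \frac{R_n(x + h_n) - R_n(x)}{h_n} \right|
&= \left| \sum_{k=n}^{\infty} \frac{1}{2^k}\, \frac{\cos a^k \pi (x + h_n) - \cos a^k \pi x}{h_n} \right| \\
&= \left| \sum_{k=n}^{\infty} \frac{(-1)^{k_n + 1} [1 + \cos a^{k-n} \delta_n \pi]}{h_n 2^k} \right| \\
&= \frac{1}{h_n} \sum_{k=n}^{\infty} \frac{1 + \cos a^{k-n} \delta_n \pi}{2^k} \ge \frac{1}{h_n}\, \frac{1 + \cos \delta_n \pi}{2^n}.
\end{align*}

Since $-\frac{1}{2} \le \delta_n < \frac{1}{2}$, $\cos \delta_n \pi \ge 0$. Therefore by (9) and the above,

\[ \left| \frac{R_n(x + h_n) - R_n(x)}{h_n} \right| \ge \frac{1}{h_n}\, \frac{1}{2^n} \ge \frac{2}{3} \left( \frac{a}{2} \right)^{n}. \tag{10} \]

Using the reverse triangle inequality, |a + b| ≥ |a| − |b|, we have

\[ \left| \frac{f(x + h_n) - f(x)}{h_n} \right| \ge \left| \frac{R_n(x + h_n) - R_n(x)}{h_n} \right| - \left| \frac{S_n(x + h_n) - S_n(x)}{h_n} \right|, \]

which by (8) and (10),

\begin{align*}
&\ge \frac{2}{3} \left( \frac{a}{2} \right)^{n} - \frac{2\pi}{a - 2} \left( \frac{a}{2} \right)^{n} \\
&= \left( \frac{a}{2} \right)^{n} \left( \frac{2}{3} - \frac{2\pi}{a - 2} \right).
\end{align*}

Since a/2 > 1, we obtain

\[ \left| \frac{f(x + h_n) - f(x)}{h_n} \right| \to \infty \quad \text{as } n \to \infty, \]

provided a is an odd positive integer satisfying

\[ \frac{2}{3} - \frac{2\pi}{a - 2} > 0; \]

i.e., a > 3π + 2. Since π < 3.15, we need a ≥ 13. □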
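The growth of the difference quotients along the sequence {hn} can be observed numerically for a = 13. The sketch below is an illustration under stated assumptions: the series is truncated after 60 terms, the point x = 1/3 is an arbitrary choice, and exact rational arithmetic (Python's `fractions.Fraction`) is used to reduce the argument of the cosine modulo 2 before rounding, since a^k grows far too quickly for floating-point arguments. The names `cos_pi`, `step`, `quotient`, and `LOWER` are hypothetical helpers for this example.

```python
from fractions import Fraction
import math

A = 13        # odd integer with A > 3*pi + 2
TERMS = 60    # truncation length of the series (assumption of this sketch)

def cos_pi(r):
    """cos(pi * r) for an exact rational r, reduced modulo 2 before rounding."""
    return math.cos(math.pi * float(r % 2))

def f(x):
    """Partial sum of sum_k cos(A^k pi x) / 2^k for a Fraction x."""
    return sum(cos_pi(A**k * x) / 2**k for k in range(TERMS))

def step(x, n):
    """h_n = (1 - delta_n) / A^n, where A^n x = k_n + delta_n, -1/2 <= delta_n < 1/2."""
    y = A**n * x
    k_n = math.floor(y + Fraction(1, 2))
    delta_n = y - k_n
    return (1 - delta_n) / A**n

def quotient(x, n):
    """Difference quotient (f(x + h_n) - f(x)) / h_n."""
    h = step(x, n)
    return (f(x + h) - f(x)) / float(h)

# Constant 2/3 - 2*pi/(A - 2) from the final estimate; positive because A > 3*pi + 2.
LOWER = Fraction(2, 3) - 2 * math.pi / (A - 2)

x = Fraction(1, 3)
for n in range(1, 6):
    # |quotient| should dominate (A/2)^n * LOWER, which grows without bound.
    print(n, abs(quotient(x, n)), (A / 2) ** n * LOWER)
```

The printed quotients exceed the lower estimate by a wide margin, which is expected since the estimate is not sharp.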

Remark. The above proof is based on the proof of a more general result given
in the text by E. Hewitt and K. Stromberg. There it is proved (Theorem 17.7)
that

\[ f(x) = \sum_{k=0}^{\infty} \frac{\cos a^k \pi x}{b^k} \]

has the desired property if a is an odd positive integer, and b is any real
number with b > 1 satisfying

\[ \frac{a}{b} > 1 + \frac{3}{2} \pi. \]

The above function was carefully examined by G.H. Hardy [Trans. Amer.
Math. Soc., 17 (1916), 301–325] who proved that the above f has the stated
properties provided 1 < b ≤ a.
These are by no means the only examples of such functions. A slightly
easier construction of a continuous function which is nowhere differentiable is
given in Exercise 7.

Exercises 8.5
1. For n ∈ N, set $f_n(x) = x^n/n$, x ∈ [0, 1]. Prove that the sequence {fn }
converges uniformly to f (x) = 0 on [0, 1], that the sequence {fn′ (x)}
converges pointwise on [0, 1], but that {fn′ (1)} does not converge to f ′ (1).
2. *Let {fn } be a sequence of differentiable functions on [a, b] for which
fn′ is continuous on [a, b] for all n ∈ N. If {fn′ } converges uniformly on
[a, b], and {fn ($x_o$)} converges for some $x_o$ ∈ [a, b], use the fundamental
theorem of calculus to prove that {fn } converges uniformly to a function
f on [a, b] and that $f'(x) = \lim_{n \to \infty} f_n'(x)$ for all x ∈ [a, b].
3. Let $\{a_k\}_{k=0}^{\infty}$ be a sequence of real numbers satisfying $\sum k|a_k| < \infty$. Show
that the series $\sum_{k=0}^{\infty} a_k x^k$ converges uniformly to a function f on |x| ≤ 1
and that

\[ f'(x) = \sum_{k=1}^{\infty} k a_k x^{k-1} \]

for all x, |x| ≤ 1.
4. *Let {fn } be a sequence of differentiable real-valued functions on (a, b)
that converges pointwise to a function f on (a, b). Suppose the sequence
{fn′ } converges uniformly on every compact subset of (a, b). Prove that
f is differentiable on (a, b) and that $f'(x) = \lim_{n \to \infty} f_n'(x)$ for all x ∈ (a, b).
5. State and prove an analogue of Theorem 8.5.1 for a series of functions
$\sum f_k(x)$.
6. Show that each of the following series converges on the indicated interval
and that the derivative of the sum can be obtained by term by term
differentiation of the series:
*a. $\sum_{k=1}^{\infty} \dfrac{1}{(1 + kx)^2}$, x ∈ (0, ∞)
b. $\sum_{k=1}^{\infty} e^{-kx}$, x ∈ (0, ∞)
*c. $\sum_{k=0}^{\infty} x^k$, |x| < 1
d. $\sum_{k=0}^{\infty} \dfrac{x^k}{k!}$, x ∈ (−∞, ∞)

7. This exercise provides another construction of a continuous function f
on R which is nowhere differentiable. Set g(x) = |x|, −1 ≤ x ≤ 1, and
extend g to R to be periodic of period 2 by setting g(x + 2) = g(x). Define
f on R by

\[ f(x) = \sum_{k=0}^{\infty} \left( \frac{3}{4} \right)^{k} g(4^k x). \]

a. Prove that f is continuous on R.
b. Fix $x_o$ ∈ R and m ∈ N. Set $\delta_m = \pm \frac{1}{2} 4^{-m}$, where the sign is chosen so
that no integer lies between $4^m x_o$ and $4^m (x_o + \delta_m)$. Show that
$g(4^n (x_o + \delta_m)) - g(4^n x_o) = 0$ for all n > m.
c. Show that

\[ \left| \frac{f(x_o + \delta_m) - f(x_o)}{\delta_m} \right| \ge \frac{1}{2} (3^m + 1). \]

Since $\delta_m \to 0$ as m → ∞, it now follows that f ′ ($x_o$) does not exist.

8.6 The Weierstrass Approximation Theorem


In this section, we will prove the following well known theorem of Weierstrass.

THEOREM 8.6.1 (Weierstrass) If f is a continuous real-valued function
on [a, b], then given ε > 0, there exists a polynomial P such that

\[ |f(x) - P(x)| < \epsilon \]

for all x ∈ [a, b].

An equivalent version, and what we will actually prove, is the following:
If f is a continuous real-valued function on [a, b], then there exists a sequence
{Pn } of polynomials such that

\[ f(x) = \lim_{n \to \infty} P_n(x) \quad \text{uniformly on } [a, b]. \]

Before we prove Theorem 8.6.1, we state and prove a more fundamental


result that will also have applications later. Prior to doing so, we need the
following definitions.

DEFINITION 8.6.2 A real-valued function f on R is periodic with period


p if
f (x + p) = f (x) for all x ∈ R.

FIGURE 8.3
Graph of a periodic function

The canonical examples of periodic functions are the functions sin x and
cos x, both of which are periodic of period 2π. The graph of a periodic function
of period p is illustrated in Figure 8.3. The graphs of a periodic function of
period p on any two successive intervals of length p are identical. It is clear
that if f is periodic of period p, then

f (x + kp) = f (x) for all k ∈ Z.

Another useful property of periodic functions is as follows:

THEOREM 8.6.3 If f is periodic of period p and Riemann integrable on
[0, p], then f is Riemann integrable on [a, a + p] for every a ∈ R, and

\[ \int_a^{a+p} f(x)\,dx = \int_0^{p} f(x)\,dx. \]

Proof. Exercise 2. □

Approximate Identities

DEFINITION 8.6.4 A sequence {Qn } of nonnegative Riemann integrable
functions on [−a, a] satisfying
(a) $\int_{-a}^{a} Q_n(t)\,dt = 1$, and
(b) $\lim_{n \to \infty} \int_{\{\delta \le |t|\}} Q_n(t)\,dt = 0$ for every δ > 0,
is called an approximate identity on [−a, a].

An approximate identity {Qn } is sometimes also referred to as a Dirac
sequence.
Remark. In (b), by the integral over the set {δ ≤ |t|} we mean the integral
over the two intervals [−a, −δ] and [δ, a].
As we will see in Theorem 8.6.5, and again in Chapter 9, approximate
identities play a very important role in analysis. An elementary example of
such a sequence $\{Q_n\}_{n=1}^{\infty}$ is as follows:

\[ Q_n(t) = \begin{cases} \dfrac{n}{2}, & -\dfrac{1}{n} \le t \le \dfrac{1}{n}, \\[1ex] 0, & \dfrac{1}{n} < |t| \le 1. \end{cases} \]

It is easily shown that the sequence {Qn } is an approximate identity on [−1, 1]
(Exercise 3). Other examples will be encountered in the proof of Theorem 8.6.1
and in the exercises, and still others when we study Fourier series.
As a general rule, the Qn are usually taken to be even functions; i.e.,
Qn (−t) = Qn (t). The fact that the integrals over the set {t : |t| ≥ δ} become
small as n → ∞ seems to suggest that in some sense the functions themselves
become small as n becomes large. On the other hand, since the integrals over
[−a, a] are always 1, by property (b) of Definition 8.6.4

\[ \lim_{n \to \infty} \int_{-\delta}^{\delta} Q_n(t)\,dt = 1 \]

for every δ > 0. This seems to indicate that the functions are concentrated
near 0 and must become very large near 0 (see Exercise 6). The graphs of the
first few functions Q1 , Q2 , and Q3 of a typical approximate identity {Qn } are
given in Figure 8.4.

THEOREM 8.6.5 Let {Qn } be an approximate identity on [−1, 1], and let f
be a bounded real-valued periodic function on R of period 2 with f ∈ R[−1, 1].
For n ∈ N, x ∈ R, define

\[ S_n(x) = \int_{-1}^{1} f(x + t)\, Q_n(t)\,dt. \tag{11} \]

If f is continuous at x ∈ R, then

\[ \lim_{n \to \infty} S_n(x) = f(x). \]

Furthermore, if f is continuous on [−1, 1], then

\[ \lim_{n \to \infty} S_n(x) = f(x) \quad \text{uniformly on } R. \]

Proof. We first note that since f is periodic and integrable on [−1, 1], f is
integrable on every finite subinterval of R. Thus the integral in (11) is defined
Sequences and Series of Functions 373

FIGURE 8.4
Graphs of Q1 , Q2 , Q3

for all x ∈ R. Also, since f is bounded, there exists a constant M > 0 such
that |f (x)| ≤ M for all x ∈ R.
Suppose first that f is continuous at x ∈ R. By (a) of Definition 8.6.4
Z 1
f (x) = f (x)Qn (t) dt.
−1

Therefore,
Z 1
|Sn (x) − f (x)| = [f (x + t) − f (x)]Qn (t)dt
−1
Z 1
≤ |f (x + t) − f (x)|Qn (t) dt. (12)
−1

Let ǫ > 0 be given. Since f is continuous at x, there exists a δ > 0 such that
ǫ
|f (x + t) − f (x)| <
2
for all t, |t| < δ. Therefore
δ δ 1
ǫ ǫ ǫ
Z Z Z
|f (x + t) − f (x)|Qn (t) dt < Qn (t) dt ≤ Qn (t) dt = (13)
−δ 2 −δ 2 −1 2

On the other hand,


Z Z
|f (x + t) − f (x)|Qn (t) dt ≤ 2M Qn (t) dt.
{δ≤|t|} {δ≤|t|}
374 Introduction to Real Analysis

Since {Qn } is an approximate identity, by property (b) there exists no ∈ N


such that
ǫ
Z
Qn (t) dt <
{δ≤|t|} 4M
for all n ≥ no . Thus if n ≥ no ,
ǫ
Z
|f (x + t) − f (x)|Qn (t) dt < .
{δ≤|t|} 2

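Therefore by (12) and the above,
$$\begin{aligned}
|S_n(x) - f(x)| &\le \int_{-\delta}^{\delta} |f(x+t) - f(x)|\,Q_n(t)\,dt + \int_{\{\delta \le |t|\}} |f(x+t) - f(x)|\,Q_n(t)\,dt \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon
\end{aligned}$$
for all n ≥ n_o. Thus $\lim_{n\to\infty} S_n(x) = f(x)$.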
Suppose f is continuous on [−1, 1]. Since f is periodic, this implies that
f (−1) = f (1). By Theorem 4.3.4, f is uniformly continuous on [−1, 1], and
hence by periodicity, also on R (Exercise 1). Thus given ε > 0, there exists a
δ > 0 such that
$$|f(x+t) - f(x)| < \frac{\varepsilon}{2}$$
for all x ∈ R and all t, |t| < δ. As a consequence, inequality (13) holds for all
x ∈ R. Therefore, as above, there exists n_o ∈ N such that
$$|S_n(x) - f(x)| < \varepsilon$$
for all x ∈ R and all n ≥ n_o. This proves that the sequence {Sn } converges
uniformly to f on R.

Proof of the Weierstrass Approximation Theorem. We now use
Theorem 8.6.5 to prove the Weierstrass approximation theorem (Theorem
8.6.1). Suppose f is a continuous real-valued function on [a, b]. By making a
change of variable, i.e.,

g(x) = f ((b − a)x + a), x ∈ [0, 1],

we can assume that f is continuous on [0, 1]. Also, if we let

g(x) = f (x) − f (0) − x[f (1) − f (0)], x ∈ [0, 1],

then g(0) = g(1) = 0 and g(x) − f (x) is a polynomial. If we can approximate
g by a polynomial Q and set

P (x) = Q(x) + [f (x) − g(x)],

then P is also a polynomial with |f (x) − P (x)| = |g(x) − Q(x)|. Therefore,
without loss of generality, we can assume that f is defined on [0, 1] satisfying

f (0) = f (1) = 0.

Extend f to [−1, 1] by defining f (x) = 0 for all x ∈ [−1, 0). Then f is
continuous on [−1, 1]. Finally, we extend f to all of R by defining

f (x) = f (x − 2k), k ∈ Z,

where k ∈ Z is chosen so that x − 2k ∈ (−1, 1].


Our next step is to find an approximate identity {Qn } on [−1, 1] such that
the corresponding functions Sn of Theorem 8.6.5 defined by equation (11) are
polynomials. To accomplish this we let
$$Q_n(t) = c_n (1 - t^2)^n,$$
where $c_n > 0$ is chosen such that
$$\int_{-1}^{1} Q_n(t)\,dt = 1.$$
Thus the sequence {Qn } satisfies hypothesis (a) of Definition 8.6.4. To show
that it also satisfies (b) we need an estimate on the magnitude of $c_n$. Since
$$\begin{aligned}
1 &= c_n \int_{-1}^{1} (1 - t^2)^n\,dt = 2c_n \int_{0}^{1} (1 - t^2)^n\,dt \\
&\ge 2c_n \int_{0}^{1/\sqrt{n}} (1 - t^2)^n\,dt \\
&\ge 2c_n \int_{0}^{1/\sqrt{n}} (1 - nt^2)\,dt = 2c_n \left( \frac{1}{\sqrt{n}} - \frac{1}{3\sqrt{n}} \right) = \frac{4c_n}{3\sqrt{n}},
\end{aligned}$$
we obtain
$$c_n \le \tfrac{3}{4}\sqrt{n} < \sqrt{n}.$$
In the above we have used the inequality $(1 - t^2)^n \ge 1 - nt^2$ valid for all
t ∈ [0, 1] (Example 1.3.2(b)). Finally, for any δ, 0 < δ < 1,
$$Q_n(t) = c_n (1 - t^2)^n \le \sqrt{n}\,(1 - \delta^2)^n, \quad \text{for all } t,\ \delta \le |t| \le 1.$$
Thus since $0 < (1 - \delta^2) < 1$, by Theorem 3.2.6(d), $\lim_{n\to\infty} Q_n(t) = 0$ uniformly
in δ ≤ |t| ≤ 1. Therefore,
$$\lim_{n\to\infty} \int_{\{\delta \le |t|\}} Q_n(t)\,dt = 0.$$

For x ∈ [0, 1], set
$$P_n(x) = \int_{-1}^{1} f(x+t)\,Q_n(t)\,dt.$$
This is the function Sn (x) of Theorem 8.6.5 except restricted to x ∈ [0, 1]. Let
x ∈ [0, 1]. Since f (t) = 0 for t ∈ [−1, 0] ∪ [1, 2],
$$\int_{-1}^{1} f(x+t)\,Q_n(t)\,dt = \int_{-x}^{1-x} f(x+t)\,Q_n(t)\,dt,$$
which by the change of variables s = t + x gives
$$P_n(x) = \int_{0}^{1} f(s)\,Q_n(s-x)\,ds.$$
For each fixed s, $Q_n(s-x) = c_n(1-(s-x)^2)^n$ is a polynomial in x of degree 2n.
Therefore Pn (x), for x ∈ [0, 1], is a polynomial of degree less than or equal to
2n. As a consequence of Theorem 8.6.5,
$$\lim_{n\to\infty} P_n(x) = f(x) \quad \text{uniformly on } [0, 1],$$

thereby proving the result. 
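
The construction in the proof can be checked numerically. The following Python sketch evaluates Pn (x) = ∫ f (s)Qn (s − x) ds over [0, 1] with a midpoint rule. The function names, the midpoint rule, and the sample function sin(πx) (which vanishes at 0 and 1) are assumptions made for illustration only; they are not part of the text.

```python
import math

def landau_constant(n, steps=2000):
    """Approximate c_n so that c_n * integral of (1 - t^2)^n over [-1, 1] equals 1."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        t = -1.0 + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += (1.0 - t * t) ** n
    return 1.0 / (total * h)

def landau_approx(f, n, x, steps=2000):
    """Midpoint-rule value of P_n(x) = integral_0^1 f(s) c_n (1 - (s - x)^2)^n ds."""
    c_n = landau_constant(n, steps)
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += f(s) * c_n * (1.0 - (s - x) ** 2) ** n
    return total * h

def sup_error(f, n, grid=21):
    """Largest error |f(x) - P_n(x)| over an evenly spaced grid in [0, 1]."""
    points = [k / (grid - 1) for k in range(grid)]
    return max(abs(f(x) - landau_approx(f, n, x)) for x in points)

def sample(x):
    # Hypothetical test function with sample(0) = sample(1) = 0.
    return math.sin(math.pi * x)

if __name__ == "__main__":
    for n in (5, 40, 200):
        print(n, landau_constant(n), sup_error(sample, n))
```

In this sketch the grid error shrinks as n grows, but slowly, which is consistent with the proof giving uniform convergence without any rate.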


Remarks. (a) The above proof of the Weierstrass approximation theorem
is a variation of a proof found in the text by Walter Rudin listed in the
Bibliography.
(b) The Weierstrass approximation theorem proves that the set of polynomials
is dense in C([0, 1]). A natural question is the following: Do we need
all polynomials? Let $N = \{n_j\}_{j=1}^{\infty}$ be a strictly increasing sequence of positive
integers, and let $P_N$ be the set of all polynomials of the form
$$P(x) = a_0 + a_1 x^{n_1} + \cdots + a_k x^{n_k}.$$
A very interesting result, whose proof is beyond the scope of the text, is the
Müntz-Szasz Theorem³ as follows: The set $P_N$ is dense in C([0, 1]) if and
only if
$$\sum_{k=1}^{\infty} \frac{1}{n_k} = \infty.$$

Hence, the set of all polynomials with even exponents is dense, whereas the
set of all polynomials of the form
$$P(x) = a_0 + a_1 x^{2} + a_2 x^{2^2} + a_3 x^{3^2} + \cdots + a_n x^{n^2}$$
is not.

3 A proof of the Müntz-Szasz theorem may be found in the following text by Walter
Rudin: Real and Complex Analysis, McGraw–Hill, New York, 1966.



Exercises 8.6
1. If f : R → R is periodic of period 2 and continuous on [−1, 1], prove that
f is uniformly continuous on R.
2. *Prove Theorem 8.6.3.
3. For n ∈ N, define Qn on [−1, 1] as follows:
$$Q_n(x) = \begin{cases} \dfrac{n}{2}, & -\dfrac{1}{n} \le x \le \dfrac{1}{n}, \\[1ex] 0, & \dfrac{1}{n} < |x| \le 1. \end{cases}$$
Show that {Qn } is an approximate identity on [−1, 1].
4. For n ∈ N, set $Q_n(x) = c_n (1 - |x|)^n$, x ∈ [−1, 1].
*a. Determine $c_n > 0$ so that $\int_{-1}^{1} Q_n(t)\,dt = 1$.
b. Prove that with the above choice of $c_n$ the sequence {Qn } is an
approximate identity on [−1, 1].
c. Sketch the graph of Qn (x) for n = 2, 4, 8.
5. For n ∈ N, set $Q_n(x) = c_n |x| e^{-nx^2}$, x ∈ [−1, 1].
a. Determine $c_n > 0$ so that $\int_{-1}^{1} Q_n(t)\,dt = 1$.
b. Prove that with the above choice of $c_n$ the sequence {Qn } is an
approximate identity on [−1, 1].
6. *If {Qn } is an approximate identity on [−1, 1], prove that
$$\lim_{n\to\infty} \sup\{Q_n(x) : x \in [-\delta, \delta]\} = \infty$$
for every δ > 0.
7. Let f be a continuous real-valued function on [0, 1]. Prove that given
ε > 0, there exists a polynomial P with rational coefficients such that
|f (x) − P (x)| < ε for all x ∈ [0, 1].
8. Suppose f is a continuous real-valued function on [0, 1] satisfying
$$\int_{0}^{1} f(x)\,x^n\,dx = 0 \quad \text{for all } n = 0, 1, 2, \ldots$$
Prove that f (x) = 0 for all x ∈ [0, 1]. (Hint: First show that
$\int_{0}^{1} f(x)P(x)\,dx = 0$ for every polynomial P , then use the Weierstrass
theorem to show that $\int_{0}^{1} f^2(x)\,dx = 0$.)

8.7 Power Series Expansions


In this section, we turn our attention to the study of power series and the
representation of functions by means of power series. Because of their special
nature, power series possess certain properties which are not valid for general
series of functions. We begin with the following definition.

DEFINITION 8.7.1 Let $\{a_k\}_{k=0}^{\infty}$ be a sequence of real numbers, and let
c ∈ R. A series of the form
$$\sum_{k=0}^{\infty} a_k (x-c)^k = a_0 + a_1 (x-c) + a_2 (x-c)^2 + a_3 (x-c)^3 + \cdots$$
is called a power series in (x − c). When c = 0, the series is called a power
series in x. The numbers $a_k$ are called the coefficients of the power series.

Even though the study of representation of functions by means of power
series dates back to the mid-seventeenth century, the rigorous study of
convergence is much more recent. Certainly Newton and his successors were
concerned with questions involving the convergence of a power series to its
defining function. It was Cauchy, however, who with his formal development of
series brought mathematical rigor to the subject. Cauchy was among the first
to use the root and ratio tests to determine the interval of convergence of a
power series. This is accomplished as follows: Consider a power series
$\sum_{k=0}^{\infty} a_k (x-c)^k$. Applying the root test to this series gives
$$\limsup_{k\to\infty} \sqrt[k]{|a_k|\,|x-c|^k} = |x-c| \limsup_{k\to\infty} \sqrt[k]{|a_k|} = |x-c|\,\alpha,$$
where $\alpha = \limsup_{k\to\infty} \sqrt[k]{|a_k|}$. Thus by Theorem 7.3.4 the series converges absolutely
if α|x − c| < 1, and diverges if α|x − c| > 1. If α = 0, then α|x − c| < 1 for all
x ∈ R. If 0 < α < ∞, then
$$\alpha |x-c| < 1 \quad \text{if and only if} \quad |x-c| < \frac{1}{\alpha}.$$

DEFINITION 8.7.2 Given a power series $\sum a_k (x-c)^k$, the radius of
convergence R is defined by
$$\frac{1}{R} = \limsup_{n\to\infty} \sqrt[n]{|a_n|}.$$
If $\limsup \sqrt[n]{|a_n|} = \infty$ we take R = 0, and if $\limsup \sqrt[n]{|a_n|} = 0$ we set R = ∞.

When R = 0, the power series $\sum a_k (x-c)^k$ converges only for x = c. On
the other hand, if R = ∞, then the power series converges for all x ∈ R.

Remark. If $a_k \ne 0$ for all k and $\lim_{k\to\infty} |a_{k+1}|/|a_k|$ exists, then by Theorem
7.1.10 the radius of convergence of $\sum a_k x^k$ is also given by
$$\frac{1}{R} = \lim_{k\to\infty} \frac{|a_{k+1}|}{|a_k|}.$$

This formulation is particularly useful if the coefficients involve factorials.



THEOREM 8.7.3 Given a power series $\sum_{k=0}^{\infty} a_k (x-c)^k$ with radius of
convergence R, 0 < R ≤ ∞, then the series
(a) converges absolutely for all x with |x − c| < R, and
(b) diverges for all x with |x − c| > R.
(c) Furthermore, if 0 < ρ < R, then the series converges uniformly for all
x with |x − c| ≤ ρ.

Proof. Statements (a) and (b) were proved in the discussion preceding the
statement of the theorem. Suppose 0 < ρ < R. Choose β such that ρ < β < R.
Since
$$\limsup_{k\to\infty} \sqrt[k]{|a_k|} = \frac{1}{R} < \frac{1}{\beta},$$
there exists n_o ∈ N such that
$$\sqrt[k]{|a_k|} < \frac{1}{\beta} \quad \text{for all } k \ge n_o.$$
Hence for k ≥ n_o and |x − c| ≤ ρ,
$$|a_k (x-c)^k| \le |a_k| \rho^k < \left( \frac{\rho}{\beta} \right)^k.$$
But (ρ/β) < 1 and thus $\sum (\rho/\beta)^k < \infty$. Therefore by the Weierstrass M-test,
the series converges uniformly on |x − c| ≤ ρ.
The previous theorem provides no suggestion as to what happens when
|x − c| = R. As the following examples (with c = 0) illustrate, the series may
either converge or diverge when |x| = R.


EXAMPLES 8.7.4 (a) The series $\sum_{k=0}^{\infty} x^k$ has radius of convergence R = 1.
This series diverges at both x = 1 and −1.
(b) The series $\sum_{k=1}^{\infty} \frac{x^k}{k}$ also has radius of convergence R = 1. In this case,
when x = 1 the series diverges; whereas when x = −1, the series is an
alternating series that converges by Theorem 7.2.3.
(c) Consider the series $\sum_{k=1}^{\infty} \frac{x^k}{k^2}$. Again the radius of convergence is R = 1.
In this example, the series converges at both x = 1 and −1.
(d) Consider the series
$$1 + 2x + 3^2 x^2 + 2^3 x^3 + 3^4 x^4 + \cdots = \sum_{k=0}^{\infty} a_k x^k,$$
where
$$a_k = \begin{cases} 3^k, & \text{if } k \text{ is even}, \\ 2^k, & \text{if } k \text{ is odd}. \end{cases}$$
Hence $\limsup \sqrt[k]{|a_k|} = 3$, and therefore R = 1/3. The series diverges at both
x = 1/3 and x = −1/3.
(e) Finally, consider the series $\sum k!\,x^k$. Here $a_k = k!$, and
$$\lim_{k\to\infty} \frac{a_{k+1}}{a_k} = \lim_{k\to\infty} (k+1) = \infty.$$
Thus by Theorem 7.1.10 $\sqrt[k]{|a_k|} \to \infty$, and R = 0. Therefore the power series
converges only for x = 0.
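
The root-test quantities in examples (d) and (e) can be inspected numerically. The Python sketch below computes the k-th roots of |ak| from log |ak| (to avoid overflow) and takes a maximum over a finite range of k as a rough stand-in for the limiting value; the helper names and the finite ranges are assumptions chosen for illustration.

```python
import math

def kth_root(log_abs_coeff, k):
    """Return |a_k|**(1/k), computed from log|a_k| to avoid overflow."""
    return math.exp(log_abs_coeff(k) / k)

def log_coeff_d(k):
    """log|a_k| for Example 8.7.4(d): a_k = 3^k (k even), 2^k (k odd)."""
    base = 3.0 if k % 2 == 0 else 2.0
    return k * math.log(base)

def log_coeff_e(k):
    """log|a_k| for Example 8.7.4(e): a_k = k!."""
    return math.lgamma(k + 1)

def tail_sup(log_abs_coeff, start, stop):
    """Largest k-th root over start <= k < stop (a finite stand-in for the upper limit)."""
    return max(kth_root(log_abs_coeff, k) for k in range(start, stop))

if __name__ == "__main__":
    # Example (d): roots alternate between 3 and 2, so the largest value is 3 and R = 1/3.
    print(tail_sup(log_coeff_d, 100, 200))
    # Example (e): the roots grow without bound, consistent with R = 0.
    print([round(kth_root(log_coeff_e, k), 2) for k in (10, 100, 1000)])
```

In example (d) the k-th roots themselves do not converge (they alternate between 2 and 3), which is why the largest limiting value is the one that determines R.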

Abel’s Theorem
Suppose we are given a power series $\sum a_k (x-c)^k$ with radius of convergence
R > 0. By setting
$$f(x) = \sum_{k=0}^{\infty} a_k (x-c)^k, \tag{14}$$
we obtain a function that is defined for all x, |x − c| < R. Functions that
are defined in terms of a power series (as in (14)) are usually referred to as
real analytic functions. Since the series converges uniformly for all x with
|x − c| ≤ ρ, for any ρ, 0 < ρ < R, the function f is continuous on |x − c| ≤ ρ.
Since this holds for all ρ < R, the function f is continuous on |x − c| < R.
If the series (14) also converges at an endpoint, say at x = c + R, then f is
continuous not only in (c − R, c + R) but also at x = c + R. This follows from
the following theorem of Abel. For convenience, we take c = 0 and R = 1.

THEOREM 8.7.5 (Abel’s Theorem) Suppose $f(x) = \sum_{k=0}^{\infty} a_k x^k$ has radius
of convergence R = 1, and that $\sum_{k=0}^{\infty} a_k$ converges. Then
$$\lim_{x\to 1^-} f(x) = \sum_{k=0}^{\infty} a_k.$$

Proof. Set $s_{-1} = 0$, and for n = 0, 1, 2, ... let $s_n = \sum_{k=0}^{n} a_k$. Then by the partial
summation formula (7.2.1)
$$\sum_{k=0}^{n} a_k x^k = \sum_{k=0}^{n-1} s_k (x^k - x^{k+1}) + s_n x^n = (1-x) \sum_{k=0}^{n-1} s_k x^k + s_n x^n.$$
Since the sequence $\{s_n\}$ converges, if we let n → ∞, then for all x, |x| < 1,
$$f(x) = (1-x) \sum_{k=0}^{\infty} s_k x^k.$$
Let $s = \lim_{n\to\infty} s_n$, and let ε > 0 be given. Choose n_o ∈ N such that
$|s - s_n| < \varepsilon/2$ for all n ≥ n_o. Since
$$(1-x) \sum_{k=0}^{\infty} x^k = 1, \quad |x| < 1,$$
we have for all x, 0 < x < 1,
$$\begin{aligned}
|f(x) - s| &= \left| (1-x) \sum_{k=0}^{\infty} (s_k - s) x^k \right| \le (1-x) \sum_{k=0}^{\infty} |s_k - s|\,x^k \\
&\le (1-x) \sum_{k=0}^{n_o} |s_k - s| + \frac{\varepsilon}{2} (1-x) \sum_{k=n_o+1}^{\infty} x^k \\
&\le (1-x) M + \frac{\varepsilon}{2},
\end{aligned}$$
where $M = \sum_{k=0}^{n_o} |s_k - s|$. If we now choose δ > 0 such that 1 − δ < x < 1
implies that (1 − x)M < ε/2, then |f (x) − s| < ε for all x, 1 − δ < x < 1. Thus
$\lim_{x\to 1^-} f(x) = s$.

EXAMPLE 8.7.6 To illustrate Abel’s theorem, consider the series
$\sum_{k=0}^{\infty} (-1)^k t^k$. This series has radius of convergence R = 1. Furthermore, the
series converges to f (t) = 1/(1 + t) for all t, |t| < 1. Since the convergence is
uniform on |t| ≤ |x|, where |x| < 1, by Corollary 8.4.2
$$\ln(1+x) = \int_{0}^{x} \frac{dt}{1+t} = \sum_{k=0}^{\infty} (-1)^k \int_{0}^{x} t^k\,dt = \sum_{k=0}^{\infty} \frac{(-1)^k}{k+1} x^{k+1} = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} x^k$$
for all x, |x| < 1. The series $\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} x^k$ has radius of convergence R = 1,
and also converges when x = 1. Thus by Abel’s theorem and the continuity of
ln(1 + x) at x = 1,
$$\ln 2 = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots.$$
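
As a numerical illustration of Abel's theorem in this example, the following Python sketch evaluates partial sums of the series for x close to 1 and at x = 1. The function name and the numbers of terms are assumptions chosen for illustration.

```python
import math

def log_series(x, terms):
    """Partial sum of sum_{k>=1} (-1)^(k+1) x^k / k with the given number of terms."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, terms + 1))

if __name__ == "__main__":
    # Inside the interval of convergence the sums track ln(1 + x) ...
    for x in (0.9, 0.99, 0.999):
        print(x, log_series(x, 20000), math.log(1.0 + x))
    # ... and at the endpoint x = 1 the (slowly converging) sums approach ln 2.
    print(log_series(1.0, 1000), math.log(2.0))
```

At x = 1 the series is alternating with decreasing terms, so the error after N terms is at most 1/(N + 1); the convergence there is much slower than at interior points.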

Differentiation of Power Series



Suppose the power series $\sum_{k=0}^{\infty} a_k (x-c)^k$ has radius of convergence R > 0. If
we differentiate the series term-by-term we obtain the new power series
$$\sum_{k=1}^{\infty} k\,a_k (x-c)^{k-1} = \sum_{k=0}^{\infty} (k+1)\,a_{k+1} (x-c)^k. \tag{15}$$
The obvious question to ask is, what is the radius of convergence of the
differentiated series (15)? Furthermore, if f is defined by $f(x) = \sum_{k=0}^{\infty} a_k (x-c)^k$,
|x − c| < R, does the series (15) converge to f ′(x)? The answers to both of
these questions are provided by the following theorem.

THEOREM 8.7.7 Suppose $\sum_{k=0}^{\infty} a_k (x-c)^k$ has radius of convergence R > 0,
and
$$f(x) = \sum_{k=0}^{\infty} a_k (x-c)^k, \quad |x-c| < R.$$
Then
(a) $\sum_{k=1}^{\infty} k\,a_k (x-c)^{k-1}$ has radius of convergence R, and
(b) $f'(x) = \sum_{k=1}^{\infty} k\,a_k (x-c)^{k-1}$ for all x, |x − c| < R.

Proof. For convenience we take c = 0. Consider the differentiated series
$\sum k\,a_k x^{k-1}$. By Theorem 3.2.6, $\lim_{k\to\infty} \sqrt[k]{k} = 1$, and for x ≠ 0,
$$\lim_{k\to\infty} \sqrt[k]{|x|^{k-1}} = \lim_{k\to\infty} \frac{|x|}{\sqrt[k]{|x|}} = |x|.$$
Therefore $\lim_{k\to\infty} \sqrt[k]{k |x|^{k-1}} = |x|$. By Exercise 10 of Section 3.5, for x ≠ 0,
$$\limsup_{k\to\infty} \sqrt[k]{|a_k|\,k\,|x|^{k-1}} = \limsup_{k\to\infty} \sqrt[k]{|a_k|} \cdot \lim_{k\to\infty} \sqrt[k]{k |x|^{k-1}} = |x| \limsup_{k\to\infty} \sqrt[k]{|a_k|}.$$
Therefore, if R = ∞, the differentiated series (15) converges for all x, and if
0 < R < ∞, the differentiated series converges for all x, |x| < R, and diverges
for all x, |x| > R. Thus the radius of convergence of $\sum_{k=1}^{\infty} k\,a_k x^{k-1}$ is also R.
Furthermore, for any ρ, 0 < ρ < R, by Theorem 8.7.3 the series $\sum k\,a_k x^{k-1}$
converges uniformly for all x, |x| ≤ ρ. Thus by Theorem 8.5.1, the series (15)
obtained by term-by-term differentiation converges to f ′(x), i.e.,
$$f'(x) = \sum_{k=1}^{\infty} k\,a_k x^{k-1}, \quad \text{for all } x,\ |x| < R.$$


COROLLARY 8.7.8 Suppose $\sum_{k=0}^{\infty} a_k (x-c)^k$ has radius of convergence R > 0,
and
$$f(x) = \sum_{k=0}^{\infty} a_k (x-c)^k, \quad |x-c| < R.$$
Then f has derivatives of all orders in |x − c| < R, and for each n ∈ N,
$$f^{(n)}(x) = \sum_{k=n}^{\infty} k(k-1) \cdots (k-n+1)\,a_k (x-c)^{k-n}. \tag{16}$$
In particular,
$$f^{(n)}(c) = n!\,a_n. \tag{17}$$

Proof. The result is obtained by successively applying the previous theorem
to f, f ′, f ′′, etc. Equation (17) follows by setting x = c in (16).

DEFINITION 8.7.9 A real-valued function f defined on an open interval
I is said to be infinitely differentiable on I if $f^{(n)}(x)$ exists on I for all
n ∈ N. The set of infinitely differentiable functions on an open interval I is
denoted by $C^{\infty}(I)$.

As a consequence of Corollary 8.7.8, if $\sum a_k (x-c)^k$ has radius of
convergence R > 0 and if f is defined by $f(x) = \sum_{k=0}^{\infty} a_k (x-c)^k$ for |x − c| < R,
then the function f is infinitely differentiable on (c − R, c + R) and its nth
derivative is given by (16). We illustrate this with the following example.

EXAMPLE 8.7.10 For |x| < 1,
$$\frac{1}{1-x} = \sum_{k=0}^{\infty} x^k.$$
Thus by the previous corollary,
$$\frac{1}{(1-x)^2} = \sum_{k=1}^{\infty} k\,x^{k-1} = \sum_{k=0}^{\infty} (k+1)\,x^k,$$
$$\frac{2}{(1-x)^3} = \sum_{k=2}^{\infty} k(k-1)\,x^{k-2} = \sum_{k=0}^{\infty} (k+2)(k+1)\,x^k,$$
and for arbitrary n ∈ N,
$$\frac{(n-1)!}{(1-x)^n} = \sum_{k=0}^{\infty} (k+n-1) \cdots (k+1)\,x^k.$$
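
The last identity of this example can be spot-checked numerically. The Python sketch below compares a truncated version of the series with the closed form (n − 1)!/(1 − x)^n; the function names and the truncation at 400 terms are assumptions made for illustration.

```python
import math

def derived_geometric_series(x, n, terms=400):
    """Partial sum of sum_{k>=0} (k+n-1)...(k+1) x^k for |x| < 1."""
    total = 0.0
    for k in range(terms):
        coeff = math.prod(range(k + 1, k + n))  # empty product is 1 when n = 1
        total += coeff * x ** k
    return total

def closed_form(x, n):
    """(n-1)! / (1-x)^n, the (n-1)-st derivative of 1/(1-x)."""
    return math.factorial(n - 1) / (1.0 - x) ** n

if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, derived_geometric_series(0.5, n), closed_form(0.5, n))
```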

Uniqueness Theorem for Power Series


The following uniqueness result for power series is another consequence of
Corollary 8.7.8 .

COROLLARY 8.7.11 Suppose $\sum a_k (x-c)^k$ and $\sum b_k (x-c)^k$ are two power
series which converge for all x, |x − c| < R, for some R > 0. Then
$$\sum_{k=0}^{\infty} a_k (x-c)^k = \sum_{k=0}^{\infty} b_k (x-c)^k, \quad |x-c| < R,$$
if and only if $a_k = b_k$ for all k = 0, 1, 2, ....

Proof. Clearly, if $a_k = b_k$ for all k, then the two power series are equal and
converge to the same function. Conversely, set
$$f(x) = \sum_{k=0}^{\infty} a_k (x-c)^k \quad \text{and} \quad g(x) = \sum_{k=0}^{\infty} b_k (x-c)^k.$$
If f (x) = g(x) for all x, |x − c| < R, then $f^{(n)}(x) = g^{(n)}(x)$ for all n = 0, 1, 2, ...,
and all x, |x − c| < R. In particular, $f^{(n)}(c) = g^{(n)}(c)$ for all n = 0, 1, 2, ....
Thus by (17), $a_n = b_n$ for all n.

Representation of a Function by a Power Series


Up to this point we have shown that if a function f is defined by a power
series, that is
$$f(x) = \sum_{k=0}^{\infty} a_k (x-c)^k, \quad |x-c| < R,$$
with radius of convergence R > 0, then by Corollary 8.7.8, f is infinitely
differentiable on (c − R, c + R) and the coefficients $a_k$ are given by $a_k = f^{(k)}(c)/k!$.
We now consider the converse question. Given an infinitely differentiable
function f on an open interval I and c ∈ I, can f be expressed as a power series in a
neighborhood of the point c? Specifically, does there exist an ε > 0 such that
$$f(x) = \sum_{k=0}^{\infty} a_k (x-c)^k$$
for all x, |x − c| < ε, with $a_k = f^{(k)}(c)/k!$ for all k = 0, 1, 2, ...? The following
example of Cauchy shows that this is not always possible.

EXAMPLE 8.7.12 Let f be defined on R by
$$f(x) = \begin{cases} e^{-1/x^2}, & x \ne 0, \\ 0, & x = 0. \end{cases}$$
Since $\lim_{x\to 0} e^{-1/x^2} = \lim_{t\to\infty} e^{-t^2} = 0$, f is continuous at 0. For x ≠ 0,
$$f'(x) = \frac{2\,e^{-1/x^2}}{x^3}.$$
When x = 0, we have
$$f'(0) = \lim_{h\to 0} \frac{f(h) - f(0)}{h} = \lim_{h\to 0} \frac{e^{-1/h^2}}{h} = \lim_{t\to\infty} \frac{t}{e^{t^2}} = 0.$$
The last step follows from l’Hospital’s rule. Thus,
$$f'(x) = \begin{cases} \dfrac{2\,e^{-1/x^2}}{x^3}, & x \ne 0, \\ 0, & x = 0. \end{cases}$$
By induction, it follows as above, that for each n ∈ N,
$$f^{(n)}(x) = \begin{cases} P(1/x)\,e^{-1/x^2}, & x \ne 0, \\ 0, & x = 0, \end{cases}$$
where P is a polynomial of degree 3n. The details are left to the exercises
(Exercise 16). Thus the function f is infinitely differentiable on R. If there
exists R > 0 such that $f(x) = \sum_{k=0}^{\infty} a_k x^k$ for all x, |x| < R, then by (17)
$a_k = f^{(k)}(0)/k! = 0$ for all k, so that f would vanish identically on (−R, R).
This is impossible since f (x) > 0 for x ≠ 0. As a consequence, f cannot be
represented by a power series which converges to f in a neighborhood of 0.
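
The extreme flatness of this function at 0 can be seen numerically. The Python sketch below, with helper names chosen only for illustration, evaluates f (h)/h^n for small h: the ratios are tiny for every power n tried, which is consistent with all derivatives vanishing at 0, while f itself is strictly positive away from 0.

```python
import math

def flat(x):
    """Cauchy's example: exp(-1/x^2) for x != 0, and 0 at x = 0."""
    return math.exp(-1.0 / (x * x)) if x != 0.0 else 0.0

def ratio(h, n):
    """f(h) / h^n, which tends to 0 as h -> 0 for every fixed n."""
    return flat(h) / h ** n

if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, [ratio(h, n) for h in (0.5, 0.2, 0.1)])
    # Every Taylor polynomial at 0 is identically zero, yet f(0.5) is not zero.
    print(flat(0.5))
```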

Taylor Polynomials and Taylor Series


We now consider the problem of representing a function f in terms of a power
series in greater detail. Newton derived the power series expansion of many of
the elementary functions by algebraic techniques or term-by-term integration.
For example, the series expansion of 1/(1 + x) can easily be obtained by long
division, which upon term-by-term integration gives the power series expan-
sion of ln(1 + x). Maclaurin and Taylor were among the first mathematicians
to use Newton’s calculus in determining the coefficients in the power series
expansion of a function. Both realized that if a function f (x) had a power
series expansion $\sum a_k (x-c)^k$, then the coefficients $a_k$ had to be given by
$f^{(k)}(c)/k!$.

DEFINITION 8.7.13 Let f be a real-valued function defined on an open
interval I, and let c ∈ I and n ∈ N. Suppose $f^{(n)}(x)$ exists for all x ∈ I. The
polynomial
$$T_n(f,c)(x) = \sum_{k=0}^{n} \frac{f^{(k)}(c)}{k!} (x-c)^k$$
is called the Taylor polynomial of order n of f at the point c. If f is
infinitely differentiable on I, the series
$$\sum_{k=0}^{\infty} \frac{f^{(k)}(c)}{k!} (x-c)^k$$
is called the Taylor series of f at c.

For the special case c = 0, the Taylor series of a function f is often referred
to as the Maclaurin series. The first three Taylor polynomials T0 , T1 , T2
are given specifically by
$$T_0(f,c)(x) = f(c),$$
$$T_1(f,c)(x) = f(c) + f'(c)(x-c),$$
$$T_2(f,c)(x) = f(c) + f'(c)(x-c) + \frac{f''(c)}{2!} (x-c)^2.$$
The Taylor polynomial T1 (f, c) is the linear approximation to f at c; that
is, its graph is the straight line passing through (c, f (c)) with slope f ′(c).
In general, the Taylor polynomial Tn of f is a polynomial of degree less
than or equal to n that satisfies
$$T_n^{(k)}(f,c)(c) = f^{(k)}(c)$$
for all k = 0, 1, ..., n. Since $f^{(n)}(c)$ might possibly be zero, Tn (as the next
example shows) could very well be a polynomial of degree strictly less than n.

EXAMPLES 8.7.14 In the following examples we compute the Taylor series
of several functions. At this stage nothing is implied about the convergence of
the series to the function.
(a) Let f (x) = sin x and take c = π/2. Then
$$f(\tfrac{\pi}{2}) = \sin\tfrac{\pi}{2} = 1, \quad f'(\tfrac{\pi}{2}) = \cos\tfrac{\pi}{2} = 0, \quad f''(\tfrac{\pi}{2}) = -\sin\tfrac{\pi}{2} = -1, \quad f^{(3)}(\tfrac{\pi}{2}) = -\cos\tfrac{\pi}{2} = 0.$$
Thus
$$T_3(f,\tfrac{\pi}{2})(x) = 1 - \frac{1}{2!} (x - \tfrac{\pi}{2})^2,$$
which is a polynomial of degree 2. In general, if n is odd, $f^{(n)}(\tfrac{\pi}{2}) = 0$, and if
n = 2k is even, $f^{(2k)}(\tfrac{\pi}{2}) = (-1)^k$. Therefore, if n is even,
$$T_n(f,\tfrac{\pi}{2})(x) = T_{n+1}(f,\tfrac{\pi}{2})(x) = \sum_{k=0}^{n/2} \frac{(-1)^k}{(2k)!} (x - \tfrac{\pi}{2})^{2k}.$$
The Taylor expansion of f (x) = sin x about c = π/2 is given by
$$\sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!} (x - \tfrac{\pi}{2})^{2k}.$$

(b) For the function $f(x) = e^{-1/x^2}$ (with f (0) = 0), by Example 8.7.12

Tn (f, 0)(x) = 0 for all n ∈ N.

Thus the Taylor series of f at c = 0 converges for all x ∈ R; namely to the
zero function. It however does not converge to f (x) for any x ≠ 0.
(c) In many instances, the Taylor expansion of a given function can be
computed from a known series. As an example, we find the Taylor series
expansion of f (x) = 1/x about c = 2. This could be done by computing the
derivatives of f and evaluating them at c = 2. However, it would still remain
to be shown that the given series converges to f (x). An easier method is as
follows: We first write
$$\frac{1}{x} = \frac{1}{2 - (2 - x)} = \frac{1}{2} \cdot \frac{1}{1 - \left( \frac{2-x}{2} \right)}.$$
For |w| < 1,
$$\frac{1}{1-w} = \sum_{k=0}^{\infty} w^k.$$
Setting w = (2 − x)/2, we have
$$\frac{1}{x} = \frac{1}{2} \sum_{k=0}^{\infty} \frac{(2-x)^k}{2^k} = \sum_{k=0}^{\infty} \frac{(-1)^k}{2^{k+1}} (x-2)^k,$$
for all x, |x − 2| < 2. By uniqueness, the given series must be the Taylor series
of f (x) = 1/x. In this instance, the power series also converges to the function
f (x) for all x satisfying |x − 2| < 2.
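
The expansion of 1/x about c = 2 can be checked with a few lines of Python. The sketch below (the function name and the sample points are assumptions for illustration) sums the series with coefficients (−1)^k / 2^(k+1) and compares the result with 1/x inside the interval |x − 2| < 2; outside that interval the partial sums grow without bound.

```python
def recip_partial_sum(x, n):
    """n-th partial sum of sum_k (-1)^k (x - 2)^k / 2^(k+1), the series for 1/x about c = 2."""
    return sum((-1) ** k * (x - 2.0) ** k / 2.0 ** (k + 1) for k in range(n + 1))

if __name__ == "__main__":
    for x in (1.0, 3.0, 3.9):
        print(x, recip_partial_sum(x, 60), 1.0 / x)
    # |4.5 - 2| > 2, so the terms grow and the partial sums do not settle down.
    print([recip_partial_sum(4.5, n) for n in (10, 20, 60)])
```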

Remainder Estimates
To investigate when the Taylor series of a function f converges to f (x), we
consider
$$R_n(x) = R_n(f,c)(x) = f(x) - T_n(f,c)(x). \tag{18}$$
The function Rn is called the remainder or error function between f and
Tn (f, c). Clearly,
$$f(x) = \lim_{n\to\infty} T_n(f,c)(x) \quad \text{if and only if} \quad \lim_{n\to\infty} R_n(f,c)(x) = 0.$$
Since the Taylor polynomial Tn is the nth partial sum of the Taylor series of f ,
the Taylor series converges to f at a point x if and only if $\lim_{n\to\infty} R_n(f,c)(x) = 0$.
To emphasize this fact, we state it as a theorem.

THEOREM 8.7.15 Suppose f is an infinitely differentiable real-valued
function on the open interval I and c ∈ I. Then for x ∈ I,
$$f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(c)}{k!} (x-c)^k,$$
if and only if $\lim_{n\to\infty} R_n(f,c)(x) = 0$.

The formula
$$f(x) = f(c) + f'(c)(x-c) + \frac{f''(c)}{2!} (x-c)^2 + \cdots + \frac{f^{(n)}(c)}{n!} (x-c)^n + R_n(f,c)(x)$$
is known as Taylor’s formula with remainder. We now proceed to derive several
formulas for the remainder term Rn . These can be used to show convergence
of Tn to f .

Lagrange Form of the Remainder


Our first result, due to Joseph Lagrange (1736–1813), is called the Lagrange
form of the remainder. This result, sometimes also referred to as Taylor’s
theorem, was previously proved for the special case n = 2 in Lemma 5.4.3.

THEOREM 8.7.16 Suppose f is a real-valued function on an open interval
I, c ∈ I and n ∈ N. If $f^{(n+1)}(t)$ exists for every t ∈ I, then for any x ∈ I,
there exists a ζ between x and c such that
$$R_n(x) = R_n(f,c)(x) = \frac{f^{(n+1)}(\zeta)}{(n+1)!} (x-c)^{n+1}. \tag{19}$$

Remark. Continuity of $f^{(n+1)}$ is not required.

Proof. Fix x ∈ I, and let M be defined by
$$f(x) = T_n(f,c)(x) + M (x-c)^{n+1}.$$
To prove the result we need to show that $(n+1)!\,M = f^{(n+1)}(\zeta)$ for some
ζ between x and c. To accomplish this, set
$$g(t) = f(t) - T_n(f,c)(t) - M (t-c)^{n+1} = R_n(t) - M (t-c)^{n+1}.$$
First, since Tn is a polynomial of degree less than or equal to n,
$$g^{(n+1)}(t) = f^{(n+1)}(t) - (n+1)!\,M.$$
Also, since $T_n^{(k)}(f,c)(c) = f^{(k)}(c)$, k = 0, 1, ..., n,
$$g(c) = g'(c) = \cdots = g^{(n)}(c) = 0.$$
For convenience, let’s assume x > c. By the choice of M , g(x) = 0. By
the mean value theorem applied to g on the interval [c, x], there exists $x_1$,
$c < x_1 < x$, such that
$$0 = g(x) - g(c) = g'(x_1)(x - c).$$
Thus $g'(x_1) = 0$ for some $x_1$, $c < x_1 < x$. Since $g'(c) = 0$, by the mean value
theorem applied to g ′ on the interval $[c, x_1]$, $g''(x_2) = 0$ for some $x_2$, $c < x_2 < x_1$.
Continuing in this manner we obtain a point $x_n$ satisfying $c < x_n < x$,
such that $g^{(n)}(x_n) = 0$. Applying the mean value theorem one more time to
the function $g^{(n)}$ on the interval $[c, x_n]$, we obtain the existence of a $\zeta \in (c, x_n)$
such that
$$0 = g^{(n)}(x_n) - g^{(n)}(c) = g^{(n+1)}(\zeta)(x_n - c).$$
Thus $g^{(n+1)}(\zeta) = 0$; i.e., $f^{(n+1)}(\zeta) - (n+1)!\,M = 0$, for some ζ between x
and c.

In 8.7.20 we will give several examples to show how the remainder estimates
may be used to prove convergence of the Taylor series to its defining function.
In the following example we show how the previous theorem may be used to
derive simple estimates and inequalities.

EXAMPLES 8.7.17 (a) In this example, we use Theorem 8.7.16 with n = 2
to approximate $f(x) = \sqrt{1+x}$, x > −1. With c = 0 we find that
$$f(0) = 1, \quad f'(0) = \frac{1}{2}, \quad f''(0) = -\frac{1}{4}.$$
Therefore $T_2(f,0)(x) = 1 + \frac{1}{2}x - \frac{1}{8}x^2$, and thus
$$\sqrt{1+x} = 1 + \frac{1}{2}x - \frac{1}{8}x^2 + R_2(f,0)(x).$$
By formula (19),
$$R_2(f,0)(x) = \frac{f^{(3)}(\zeta)}{3!} x^3 = \frac{1}{16} (1+\zeta)^{-5/2} x^3$$
for some ζ between 0 and x. If x > 0, then ζ > 0 and in this case $(1+\zeta)^{-5/2} < 1$.
Therefore we have
$$|\sqrt{1+x} - T_2(f,0)(x)| < \frac{1}{16} x^3$$
for any x > 0. If we let x = 0.4, then $T_2(f,0)(0.4) = 1.18$, and by the above
$|\sqrt{1.4} - 1.18| < 0.004$, so that two decimal place accuracy is assured. In fact,
to five decimal places $\sqrt{1.4} = 1.18322$.
(b) The error estimates can also be used to derive inequalities. As in the
previous example,
$$\sqrt{1+x} = 1 + \frac{1}{2}x - \frac{1}{8}x^2 + R_2(f,0)(x).$$
For x > 0 we have $0 < R_2(f,0)(x) < \frac{1}{16}x^3$. Thus
$$1 + \frac{1}{2}x - \frac{1}{8}x^2 < \sqrt{1+x} < 1 + \frac{1}{2}x - \frac{1}{8}x^2 + \frac{1}{16}x^3$$
for all x > 0.
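
The estimate in part (a) and the two-sided inequality in part (b) are easy to confirm numerically. In the Python sketch below the function names are hypothetical and the sample points are chosen only for illustration.

```python
import math

def t2_sqrt(x):
    """Taylor polynomial T_2 of sqrt(1 + x) at c = 0: 1 + x/2 - x^2/8."""
    return 1.0 + x / 2.0 - x * x / 8.0

def remainder_bound(x):
    """Bound x^3 / 16 on R_2 for x > 0, obtained from the Lagrange form (19)."""
    return x ** 3 / 16.0

if __name__ == "__main__":
    for x in (0.1, 0.4, 1.0):
        error = math.sqrt(1.0 + x) - t2_sqrt(x)
        print(x, t2_sqrt(x), error, remainder_bound(x))
```

For x = 0.4 this sketch gives T2 = 1.18 and an actual error of about 0.0032, inside the guaranteed bound 0.004.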

Integral Form of the Remainder


Another formula for Rn (f, c) is given by the following integral form of the
remainder. This however does require the additional hypothesis that the
(n + 1)st derivative is Riemann integrable.

THEOREM 8.7.18 Suppose f is a real-valued function on an open interval
I, c ∈ I and n ∈ N. If $f^{(n+1)}(t)$ exists for every t ∈ I and is Riemann
integrable on every closed and bounded subinterval of I, then
$$R_n(x) = R_n(f,c)(x) = \frac{1}{n!} \int_{c}^{x} f^{(n+1)}(t)\,(x-t)^n\,dt, \quad x \in I. \tag{20}$$
Proof. The result is proved by induction on n. Suppose n = 1. Then
$$R_1(x) = f(x) - f(c) - f'(c)(x-c),$$
which by the fundamental theorem of calculus
$$= \int_{c}^{x} f'(t)\,dt - \int_{c}^{x} f'(c)\,dt = \int_{c}^{x} [f'(t) - f'(c)]\,dt.$$
From the integration by parts formula (Theorem 6.3.7) with
$$u(t) = f'(t) - f'(c), \quad v'(t) = 1, \quad u'(t) = f''(t), \quad v(t) = (t - x),$$
we obtain
$$\int_{c}^{x} [f'(t) - f'(c)]\,dt = [f'(t) - f'(c)](t-x) \Big|_{c}^{x} - \int_{c}^{x} f''(t)(t-x)\,dt = \int_{c}^{x} (x-t) f''(t)\,dt.$$
To complete the proof, we assume that the result holds for n = k, and
prove that this implies the result for n = k + 1. Thus assume $R_k(x)$ is given
by (20). Then
$$\begin{aligned}
R_{k+1}(x) &= f(x) - T_{k+1}(f,c)(x) \\
&= f(x) - T_k(f,c)(x) - \frac{f^{(k+1)}(c)}{(k+1)!} (x-c)^{k+1} \\
&= R_k(x) - \frac{f^{(k+1)}(c)}{(k+1)!} (x-c)^{k+1} \\
&= \frac{1}{k!} \int_{c}^{x} (x-t)^k f^{(k+1)}(t)\,dt - \frac{1}{k!} f^{(k+1)}(c) \int_{c}^{x} (x-t)^k\,dt \\
&= \frac{1}{k!} \int_{c}^{x} (x-t)^k\,[f^{(k+1)}(t) - f^{(k+1)}(c)]\,dt.
\end{aligned}$$
As for the case n = 1, we again use integration by parts with
$$u(t) = f^{(k+1)}(t) - f^{(k+1)}(c) \quad \text{and} \quad v(t) = -\frac{1}{k+1} (x-t)^{k+1},$$
which upon simplification gives,
$$R_{k+1}(x) = \frac{1}{(k+1)!} \int_{c}^{x} (x-t)^{k+1} f^{(k+2)}(t)\,dt.$$
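
Formula (20) can be tested numerically on a function whose derivatives are all known. The Python sketch below uses the exponential function with c = 0 (every derivative is again the exponential) and approximates the integral with a midpoint rule. The choice of function, the helper names, and the quadrature rule are assumptions made for illustration.

```python
import math

def taylor_exp(x, n):
    """Taylor polynomial T_n of exp at c = 0."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def integral_remainder_exp(x, n, steps=4000):
    """Midpoint-rule value of (1/n!) * integral_0^x exp(t) (x - t)^n dt, as in (20)."""
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.exp(t) * (x - t) ** n
    return total * h / math.factorial(n)

if __name__ == "__main__":
    x = 1.5
    for n in (1, 2, 3, 6):
        direct = math.exp(x) - taylor_exp(x, n)  # R_n(x) from definition (18)
        print(n, direct, integral_remainder_exp(x, n))
```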

Cauchy’s Form for the Remainder


Under the additional assumption of continuity of $f^{(n+1)}$ we obtain Cauchy’s
form for the remainder as follows:

COROLLARY 8.7.19 Let f be a real-valued function on an open interval
I, c ∈ I and n ∈ N. If $f^{(n+1)}$ is continuous on I, then for each x ∈ I, there
exists a ζ between c and x such that
$$R_n(x) = R_n(f,c)(x) = \frac{f^{(n+1)}(\zeta)}{n!} (x-c)(x-\zeta)^n. \tag{21}$$
Proof. Since $f^{(n+1)}(t)(x-t)^n$ is continuous on the interval from c to x, by the
mean value theorem for integrals (Theorem 6.3.6), there exists a ζ between c
and x such that
$$\int_{c}^{x} f^{(n+1)}(t)(x-t)^n\,dt = (x-c)\,f^{(n+1)}(\zeta)\,(x-\zeta)^n.$$
The result now follows by (20).


We now compute the Taylor series for several elementary functions and
use the previous formulas for the remainder to show that the series converges
to the function.

EXAMPLES 8.7.20 (a) As our first example we prove the binomial
theorem (Theorem 3.2.5). For n ∈ N let $f(x) = (1+x)^n$, x ∈ R. Since f is a
polynomial of degree n, if k > n then $f^{(k)}(x) = 0$ for all x ∈ R. Therefore by
Theorem 8.7.16,
$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(0)}{k!} x^k.$$
By computation, $f^{(k)}(0) = n!/(n-k)!$ for k = 0, 1, ..., n. Therefore
$$(1+x)^n = \sum_{k=0}^{n} \frac{n!}{k!\,(n-k)!} x^k.$$
The series expansion of $(1+x)^{\alpha}$ for α ∈ R with α < 0 is given in Theorem
8.8.4, whereas the expansion for α > 0 is given in Exercise 7 of Section 8.8.
For rational numbers α, the expansion of $(1+x)^{\alpha}$ was known to Newton as
early as 1664.
(b) Let f (x) = sin x with c = 0. Then
$$f^{(n)}(x) = \begin{cases} (-1)^k \cos x, & n = 2k+1, \\ (-1)^k \sin x, & n = 2k. \end{cases}$$
Thus $f^{(n)}(0) = 0$ for all even n ∈ N, and $f^{(n)}(0) = (-1)^k$ whenever n =
2k + 1, k = 0, 1, 2, .... Therefore the Taylor series of f at c = 0 is given by
$$\sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!} x^{2k+1}.$$
To show convergence of the series to sin x we consider the remainder term
Rn (x). By Theorem 8.7.16, for each x ∈ R there exists a ζ such that
$$R_n(x) = \frac{f^{(n+1)}(\zeta)}{(n+1)!} x^{n+1}.$$
Since $|f^{(n+1)}(x)| \le 1$ for all x, we have
$$|R_n(x)| \le \frac{|x|^{n+1}}{(n+1)!}.$$
By Theorem 3.2.6(f), $\lim_{n\to\infty} |x|^{n+1}/(n+1)! = 0$ for any x ∈ R. As a
consequence, $\lim_{n\to\infty} R_n(x) = 0$ for all x ∈ R, and thus
$$\sin x = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!} x^{2k+1}, \quad x \in \mathbb{R}.$$
The sine function, as well as the cosine function, can be defined strictly in
terms of power series. For further details, the reader is encouraged to look at
Miscellaneous Exercise 3.
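
The remainder bound for the sine series can be observed directly. The Python sketch below compares the actual error |sin x − Tn (x)| with the bound |x|^(n+1)/(n + 1)! from the Lagrange form; the function names and the sample point x = 2 are assumptions made for illustration.

```python
import math

def sin_taylor(x, n):
    """T_n(sin, 0)(x): sum of the Maclaurin terms of sin of degree at most n."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range((n + 1) // 2))

def lagrange_bound(x, n):
    """Bound |x|^(n+1) / (n+1)! on |R_n(x)|, valid since every derivative of sin is at most 1 in absolute value."""
    return abs(x) ** (n + 1) / math.factorial(n + 1)

if __name__ == "__main__":
    x = 2.0
    for n in (1, 3, 5, 9, 15):
        print(n, abs(math.sin(x) - sin_taylor(x, n)), lagrange_bound(x, n))
```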
(c) As our third example we derive the Taylor series for f (x) = ln(1 + x),
where as in Example 6.3.5
$$\ln x = \int_{1}^{x} \frac{1}{t}\,dt, \quad x > 0,$$
denotes the natural logarithm function on (0, ∞). Then f (0) = ln(1) = 0, and
by the fundamental theorem of calculus f ′(x) = 1/(1 + x). Thus for n = 1, 2, ...,
$$f^{(n)}(x) = (-1)^{n+1} \frac{(n-1)!}{(1+x)^n}.$$
In particular, $f^{(n)}(0) = (-1)^{n+1} (n-1)!$, and the Taylor series of f at 0
becomes
$$\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} x^n.$$
Although we have already proved that this series converges to ln(1 + x) for all
x, −1 < x ≤ 1 (Example 8.7.6), we will prove this again to illustrate the use
of the remainder formulas.
Suppose first that 0 < x ≤ 1. By Theorem 8.7.16,
$$R_n(x) = R_n(f,0)(x) = \frac{(-1)^{n+2}\,x^{n+1}}{(n+1)(1+\zeta)^{n+1}}$$
for some ζ, 0 < ζ < x. In this case, (1 + ζ) > 1, and thus
$$|R_n(x)| \le \frac{1}{n+1} x^{n+1} \le \frac{1}{n+1},$$
for all x, 0 ≤ x ≤ 1. Therefore,
$$\lim_{n\to\infty} R_n(x) = 0, \quad \text{for all } x \in [0, 1].$$
We next consider the more difficult case −1 < x < 0. By the Cauchy form
of the remainder, if −1 < x < 0, there exists ζ, x ≤ ζ ≤ 0, such that
$$R_n(x) = \frac{1}{n!} f^{(n+1)}(\zeta)(x-\zeta)^n x = (-1)^{n+2} \frac{(x-\zeta)^n x}{(1+\zeta)^{n+1}}.$$
Therefore,
$$|R_n(x)| \le \frac{|x|}{(1+\zeta)} \left( \frac{\zeta - x}{1+\zeta} \right)^n.$$
Consider the function ϕ(t) defined on [x, 0] by
$$\varphi(t) = \frac{t-x}{1+t}.$$
Then $\varphi'(t) = (1+x)/(1+t)^2$ which is positive on [x, 0] provided x > −1.
Therefore
$$\varphi(t) \le \varphi(0) = -x = |x|$$
for all t, x ≤ t ≤ 0. Thus
$$|R_n(x)| \le \frac{|x|}{(1+\zeta)} |x|^n.$$
Although ζ depends on n, we always have 1 + ζ ≥ 1 + x > 0, so that
$|R_n(x)| \le |x|^{n+1}/(1+x)$. Since |x| < 1, $\lim_{n\to\infty} R_n(x) = 0$. Therefore the Taylor
series converges to ln(1 + x) for all x, −1 < x ≤ 1; i.e.,
$$\ln(1+x) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} x^n, \quad -1 < x \le 1.$$
The Taylor expansion of ln(1 + x) was first obtained in 1668 by Nicolaus
Mercator. Newton shortly afterward also obtained the same expansion by
term by term integration of the series expansion of 1/(1 + x) (see Example
8.7.6).
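
Both remainder estimates of this example can be checked numerically. The Python sketch below compares the actual error of the partial sums of the series for ln(1 + x) with the bound 1/(n + 1) at x = 1 (from the Lagrange form) and with |x|^(n+1)/(1 + x) for a negative x (which follows from the Cauchy-form estimate because 1 + ζ ≥ 1 + x). The function names and sample points are assumptions made for illustration.

```python
import math

def log1p_taylor(x, n):
    """Partial sum sum_{k=1}^{n} (-1)^(k+1) x^k / k of the series for ln(1 + x)."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, n + 1))

def cauchy_bound(x, n):
    """Bound |x|^(n+1) / (1 + x) on |R_n(x)| for -1 < x < 0."""
    return abs(x) ** (n + 1) / (1.0 + x)

if __name__ == "__main__":
    x = -0.9
    for n in (5, 20, 80, 200):
        print(n, abs(math.log(1.0 + x) - log1p_taylor(x, n)), cauchy_bound(x, n))
    # At x = 1 the Lagrange form only gives the slowly decaying bound 1/(n + 1).
    for n in (10, 100, 1000):
        print(n, abs(math.log(2.0) - log1p_taylor(1.0, n)), 1.0 / (n + 1))
```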

(d) As our final example, we consider the natural exponential function


E(x) = exp x which is defined as the inverse function of the natural logarithm
function L(x) = ln x. The domain of L is (0, ∞) with range (−∞, ∞). Since
L′ (x) = 1/x is strictly positive on (0, ∞), L is a strictly increasing function
on (0, ∞). The inverse function E(x) is defined by

y = E(x) if and only if x = ln y.

By the inverse function theorem (Theorem 5.2.14), E is differentiable on R with
\[ E'(x) = E'(L(y)) = \frac{1}{L'(y)} = y = E(x). \]
Thus E′(x) = E(x) for all x ∈ R. Since ln ab = ln a + ln b, it immediately
follows that
\[ E(x+y) = E(x)\,E(y) \quad \text{and} \quad E(-x) = \frac{1}{E(x)}. \]

Also, E(0) = 1, and by Example 6.3.5 E(1) = e, where e is Euler’s number
defined in Example 3.3.5. Therefore $E(n) = e^n$ for every integer n, and if r =
m/n, m, n ∈ Z with n ≠ 0, then $E(nr) = E(m) = e^m$. But $E(nr) = (E(r))^n$.
Therefore, if r ∈ Q,
\[ E(r) = e^r. \]
For arbitrary x ∈ R we define $e^x$ by $e^x = E(x)$. This definition of $e^x$ is
consistent with the definition given in Miscellaneous Exercise 3 of Chapter 1
(see Exercise 17).
Since $E^{(n)}(x) = E(x)$ for all x ∈ R, $E^{(n)}(0) = E(0) = 1$. Thus the Taylor
series expansion of E(x) about c = 0 is given by

\[ \sum_{k=0}^{\infty} \frac{1}{k!}\, x^k. \]

It is left as an exercise (Exercise 11) to show that this series converges to $e^x$
for all x ∈ R. □

There is a more subtle question involving power series representation of
functions which we have not touched upon. The question concerns the following:
How is the radius of convergence R of the Taylor series of f related to the
function f? The full answer to this question requires a knowledge of complex
analysis and thus is beyond the scope of this text. However, we will illustrate
the question and provide a hint of the answer with the following examples. If
f(x) = 1/(1 + x), then the Taylor expansion of f about x = 0 is given by

\[ \sum_{k=0}^{\infty} (-1)^k x^k, \]

which has radius of convergence R = 1. This is expected since the given
function f is not defined at x = −1, and thus the series could not have radius
of convergence R > 1. If it did, this would imply that f would then have a
finite limit at x = −1.
On the other hand, the function $g(x) = 1/(1+x^2)$ is infinitely differentiable
on all of R. However, the Taylor series expansion of g about c = 0 is given by

\[ \sum_{k=0}^{\infty} (-1)^k x^{2k}, \]

which again has only radius of convergence R = 1. The reason for this is that
even though g is well behaved on R, if we extend g to the complex plane C
by
\[ g(z) = \frac{1}{1+z^2}, \]
then g is not defined when $z^2 = -1$; i.e., z = ±i, where i is the complex
number that satisfies $i^2 = -1$.
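
The contrast between the smoothness of g on R and the radius of convergence R = 1 can be seen numerically. The Python sketch below (function names are illustrative, not from the text) evaluates partial sums of the series for g at points inside and outside the interval (−1, 1); it is meant only as an experiment that is consistent with the discussion above.

```python
def g(x):
    """The function g(x) = 1 / (1 + x^2), defined for all real x."""
    return 1.0 / (1.0 + x * x)


def taylor_partial_sum(x, n):
    """Sum of (-1)^k x^(2k) for k = 0, ..., n."""
    return sum((-1) ** k * x ** (2 * k) for k in range(n + 1))


for x in (0.5, 0.9, 1.1):
    errors = [abs(taylor_partial_sum(x, n) - g(x)) for n in (10, 40, 160)]
    print(x, errors)
```

For |x| < 1 the errors shrink as n grows, while for x = 1.1 they grow without bound even though g(1.1) itself is perfectly well defined.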

Exercises 8.7
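1. Find the radius of convergence of each of the following power series:
a. $\sum_{k=1}^{\infty} \frac{3^k}{k^3}\, x^k$
*b. $\sum_{k=0}^{\infty} \frac{1}{4^k}\, (x+1)^{2k}$
c. $\sum_{k=1}^{\infty} \left(1 - \frac{1}{k}\right)^{k} x^k$
*d. $\sum_{k=1}^{\infty} \left(1 - \frac{1}{k}\right)^{k^2} x^k$
e. $\sum_{k=1}^{\infty} k \left(\frac{x}{2}\right)^{k}$
*f. $\sum_{k=0}^{\infty} a_k x^k$, where $a_k = \frac{1}{2^k}$ when k is even, and $a_k = \frac{1}{2^{k+2}}$ when k is odd.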
2. For each of the following, determine all values of x for which the given
series converges:
a. $\sum_{k=0}^{\infty} \frac{k}{x^k}$, (x ≠ 0)
*b. $\sum_{k=1}^{\infty} \frac{3^k x^k}{2^k (1-x)^k}$, (x ≠ 1)
c. $\sum_{k=0}^{\infty} \left(\frac{1+x}{1-x}\right)^{k}$, (x ≠ 1)

3. Using the power series expansion of 1/(1 − x) and its derivatives, find
*a. $\sum_{k=1}^{\infty} k x^k$, |x| < 1
b. $\sum_{k=1}^{\infty} k^2 x^k$, |x| < 1
c. $\sum_{k=1}^{\infty} \frac{k}{2^k}$

4. a. Use Theorem 8.7.16 to show that
\[ \left| \sqrt[3]{1+x} - \left( 1 + \frac{1}{3}x - \frac{1}{9}x^2 \right) \right| < \frac{5}{81}\, x^3 \]
for all x > 0.
b. Use the above inequality to approximate $\sqrt[3]{1.2}$ and $\sqrt[3]{2}$, and provide
an estimate of the error.
5. Determine how large n must be chosen so that | sin x − Tn (sin, 0)(x)| <
.001 for all x, |x| ≤ 1.
6. Use the Taylor series and remainder estimate of Example 8.7.20(c) to
compute ln 1.2 accurate to four decimal places.

7. Suppose $f(x) = \sum_{k=0}^{\infty} a_k (x-c)^k$ has radius of convergence R > 0. For
|x − c| < R, set $F(x) = \int_c^x f(t)\,dt$. Prove that
\[ F(x) = \sum_{k=0}^{\infty} \frac{a_k}{k+1}\, (x-c)^{k+1}, \quad |x-c| < R. \]
8. *a. Use the previous exercise and the fact that $\frac{d}{dx} \arctan x = \frac{1}{1+x^2}$ to
obtain the Taylor series expansion of arctan x about c = 0.
b. Use part (a) to obtain a series expansion for π.
*c. How large must n be chosen so that the nth partial sum of the series
in (b) provides an approximation of π correct to four decimal places?
9. Use Exercise 8 and Abel’s theorem to prove that $\sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1} = \frac{\pi}{4}$.
10. *Find constants $a_0, a_1, a_2, a_3$, and $a_4$ such that
\[ x^4 + 3x^2 - 2x + 5 = a_4 (x-1)^4 + a_3 (x-1)^3 + a_2 (x-1)^2 + a_1 (x-1) + a_0. \]
11. Prove that the Taylor series of $e^x$ (with c = 0) converges to $e^x$ for all
x ∈ R.
12. Using any applicable method, find the Taylor series of each of the following
functions at the indicated point, and specify the interval on which
the series converges to the function.
a. f(x) = cos x, c = 0
*b. f(x) = ln x, c = 1
c. $f(x) = \ln \frac{1+x}{1-x}$, c = 0
*d. $f(x) = (1-x)^{-1/2}$, c = 0
e. $f(x) = \frac{x^2}{1-x^2}$, c = 0
*f. f(x) = arcsin x, c = 0
g. $f(x) = \sqrt{x}$, c = 1
*h. $f(x) = (1+x)^p$, c = 0, p real

13. Suppose $f(x) = \sum_{k=0}^{\infty} a_k x^k$, |x| < R, where R > 0. Prove the following:
a. f(x) is even if and only if $a_k = 0$ for all odd k.
b. f(x) is odd if and only if $a_k = 0$ for all even k.
14. If $\sum a_k$ converges, prove that $\sum_{k=0}^{\infty} a_k x^k$ converges uniformly on [0, 1].
15. Suppose $f(x) = \sum_{k=0}^{\infty} a_k x^k$, $|x| < R_1$, and $g(x) = \sum_{k=0}^{\infty} b_k x^k$, $|x| < R_2$. Prove
that
\[ f(x)g(x) = \sum_{k=0}^{\infty} c_k x^k, \quad |x| < \min\{R_1, R_2\}, \quad \text{where } c_k = \sum_{j=0}^{k} a_j b_{k-j}. \]
16. Let f : R → R be defined by $f(x) = e^{-1/x^2}$ for x ≠ 0, and f(0) = 0.
Prove that for each n ∈ N,
\[ f^{(n)}(x) = \begin{cases} P(1/x)\, e^{-1/x^2}, & x \ne 0, \\ 0, & x = 0, \end{cases} \]
where P is a polynomial of degree 3n.

17. Suppose b > 1. For x ∈ R define b(x) = E(x ln b), where E is the natural
exponential function.
a. Prove that $b(r) = b^r$ for all r ∈ Q.
b. For x ∈ R, prove that $b(x) = \sup\{b^r : r \in Q,\ r < x\}$.

8.8 The Gamma Function


We close this chapter with a brief discussion of the Beta and Gamma functions,
both of which are due to Euler. The Gamma function is closely related to
factorials and arises in many areas of mathematics. The origin, history, and
the development of the Gamma function are described very nicely in the article
by Philip Davis listed in the supplemental reading. Our primary application
of the Gamma function will be in the Taylor expansion of $(1-x)^{-\alpha}$, where
α > 0 is arbitrary.

DEFINITION 8.8.1 For 0 < x < ∞, the Gamma function Γ(x) is defined by
\[ \Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\, dt. \tag{22} \]

When 0 < x < 1, the integral in (22) is an improper integral not only at ∞,
but also at 0. The convergence of the improper integral defining Γ(x), x > 0,
was given as Exercise 9 in Section 6.4. The graph of Γ(x) for 0 < x < 5 is
given in Figure 8.5. The following properties of the Gamma function show
that it is closely related to factorials.

THEOREM 8.8.2 (a) For each x, 0 < x < ∞, Γ(x + 1) = x Γ(x).


(b) For n ∈ N, Γ(n + 1) = n!.

Proof. Let 0 < c < R < ∞. We apply integration by parts to
\[ \int_c^R t^x e^{-t}\, dt. \]
With $u = t^x$ and $v' = e^{-t}$,
\[ \begin{aligned}
\int_c^R t^x e^{-t}\, dt &= -t^x e^{-t} \Big|_c^R + x \int_c^R t^{x-1} e^{-t}\, dt \\
&= -\frac{R^x}{e^R} + c^x e^{-c} + x \int_c^R t^{x-1} e^{-t}\, dt.
\end{aligned} \]
Since $\lim_{c \to 0^+} c^x e^{-c} = 0$ and $\lim_{R \to \infty} R^x e^{-R} = 0$, taking the appropriate limits in
the above yields
\[ \Gamma(x+1) = \int_0^{\infty} t^x e^{-t}\, dt = x \int_0^{\infty} t^{x-1} e^{-t}\, dt = x\, \Gamma(x). \]

This proves (a). For the proof of (b) we first note that
\[ \Gamma(1) = \int_0^{\infty} e^{-t}\, dt = 1. \]

Thus by induction, Γ(n + 1) = n!. □

FIGURE 8.5
Graph of Γ(x), 0 < x < 5
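
The two parts of Theorem 8.8.2 can be checked numerically. The Python sketch below uses the standard-library function `math.gamma` and, as an independent check of the integral (22), a composite Simpson rule on a truncated interval. The helper `gamma_by_quadrature`, the cutoff `upper=60.0`, and the restriction to x ≥ 1 (so that the integrand is bounded at 0) are assumptions made for this illustration only.

```python
import math


def gamma_by_quadrature(x, upper=60.0, steps=20000):
    """Composite Simpson approximation of the integral (22) on [0, upper].

    Assumes x >= 1 so the integrand t**(x-1) * exp(-t) is bounded at t = 0;
    steps must be even.
    """
    h = upper / steps

    def integrand(t):
        return t ** (x - 1) * math.exp(-t)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3


# Part (a): Gamma(x + 1) = x * Gamma(x).
for x in (0.3, 1.7, 4.25):
    assert math.isclose(math.gamma(x + 1), x * math.gamma(x), rel_tol=1e-12)

# Part (b): Gamma(n + 1) = n!.
for n in range(1, 15):
    assert math.isclose(math.gamma(n + 1), math.factorial(n), rel_tol=1e-12)

print(gamma_by_quadrature(3.0), math.gamma(3.0))
```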

EXAMPLE 8.8.3 Since the value of $\Gamma(\tfrac{1}{2})$ occurs frequently, we now show
that $\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$. By definition,
\[ \Gamma(\tfrac{1}{2}) = \int_0^{\infty} t^{-1/2} e^{-t}\, dt. \]
With the substitution $t = s^2$,
\[ \int_0^{\infty} t^{-1/2} e^{-t}\, dt = 2 \int_0^{\infty} e^{-s^2}\, ds. \]

To complete the result, we need to evaluate the so-called probability integral
$\int_0^{\infty} e^{-s^2}\, ds$. This can be accomplished by the following trick using the change of
variables theorem from multivariable calculus. Consider the double integral
\[ J = \int_0^{\infty} \int_0^{\infty} e^{-x^2 - y^2}\, dx\, dy. \]

By changing to polar coordinates

x = r cos θ, y = r sin θ,

with 0 < r < ∞, θ ∈ [0, π/2],
\[ J = \int_0^{\pi/2} \int_0^{\infty} e^{-r^2} r\, dr\, d\theta = \frac{\pi}{2} \int_0^{\infty} e^{-r^2} r\, dr = \frac{\pi}{4}. \]
On the other hand,
\[ J = \int_0^{\infty} \int_0^{\infty} e^{-x^2} e^{-y^2}\, dx\, dy = \left[ \int_0^{\infty} e^{-x^2}\, dx \right]^2. \]

Therefore,
\[ \int_0^{\infty} e^{-x^2}\, dx = \frac{\sqrt{\pi}}{2}, \]
from which the result follows. □
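
The value of the probability integral, and hence of Γ(1/2), is easy to confirm numerically. The following Python sketch approximates the integral of e^{−s²} over [0, 8] with a composite Simpson rule (the tail beyond 8 is far below double precision); the helper `simpson` and the cutoff 8 are choices made for this illustration.

```python
import math


def simpson(f, a, b, steps):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


probability_integral = simpson(lambda s: math.exp(-s * s), 0.0, 8.0, 4000)

print(probability_integral, math.sqrt(math.pi) / 2)   # the two should agree closely
print(2 * probability_integral, math.gamma(0.5))      # Gamma(1/2) = 2 * integral
```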

The Binomial Series


As an application of the Gamma function we will derive the power series
expansion of $f(x) = (1-x)^{-\alpha}$, where α > 0 is real. The coefficients of this
expansion are expressed very nicely in terms of the Gamma function. By
Example 8.7.10, for n ∈ N,
\[ \begin{aligned}
(1-x)^{-n} &= \frac{1}{(n-1)!} \sum_{k=0}^{\infty} (k+n-1) \cdots (k+1)\, x^k \\
&= \frac{1}{(n-1)!} \sum_{k=0}^{\infty} \frac{(k+n-1)!}{k!}\, x^k,
\end{aligned} \]
which in terms of the Gamma function gives
\[ \frac{1}{(1-x)^n} = \frac{1}{\Gamma(n)} \sum_{k=0}^{\infty} \frac{\Gamma(k+n)}{k!}\, x^k. \]

We will now prove that this formula is still valid for all α ∈ R with α > 0.

THEOREM 8.8.4 (Binomial Series) For α > 0,
\[ \frac{1}{(1-x)^{\alpha}} = \frac{1}{\Gamma(\alpha)} \sum_{n=0}^{\infty} \frac{\Gamma(n+\alpha)}{n!}\, x^n, \quad |x| < 1. \tag{23} \]

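Proof. We first show that the radius of convergence of the series (23) is R = 1.
Set $a_n = \Gamma(n+\alpha)/n!$. Then
\[ \frac{a_{n+1}}{a_n} = \frac{\Gamma(n+1+\alpha)}{(n+1)!} \cdot \frac{n!}{\Gamma(n+\alpha)}. \]
But by Theorem 8.8.2, Γ(n + 1 + α) = (n + α)Γ(n + α). Therefore,
\[ \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \lim_{n \to \infty} \frac{n+\alpha}{n+1} = 1, \]
and as a consequence of Theorem 7.1.10 we have R = 1.

To show that the series actually converges to $(1-x)^{-\alpha}$, we set
\[ f_\alpha(x) = \frac{1}{\Gamma(\alpha)} \sum_{n=0}^{\infty} \frac{\Gamma(n+\alpha)}{n!}\, x^n, \quad |x| < 1. \]

Since a power series can be differentiated term-by-term,
\[ f_\alpha'(x) = \frac{1}{\Gamma(\alpha)} \sum_{n=1}^{\infty} \frac{n\, \Gamma(n+\alpha)}{n!}\, x^{n-1}. \]

Multiplying by (1 − x) gives
\[ \begin{aligned}
(1-x)\, f_\alpha'(x) &= (1-x)\, \frac{1}{\Gamma(\alpha)} \sum_{n=1}^{\infty} \frac{n\, \Gamma(n+\alpha)}{n!}\, x^{n-1} \\
&= \frac{1}{\Gamma(\alpha)} \left[ \sum_{n=1}^{\infty} \frac{\Gamma(n+\alpha)}{(n-1)!}\, x^{n-1} - \sum_{n=1}^{\infty} \frac{n\, \Gamma(n+\alpha)}{n!}\, x^n \right] \\
&= \frac{1}{\Gamma(\alpha)} \sum_{n=0}^{\infty} \left[ \frac{\Gamma(n+1+\alpha)}{n!} - \frac{n\, \Gamma(n+\alpha)}{n!} \right] x^n.
\end{aligned} \]

But Γ(n + 1 + α) − n Γ(n + α) = α Γ(n + α). Therefore,
\[ (1-x)\, f_\alpha'(x) = \alpha f_\alpha(x). \]

As a consequence,
\[ \begin{aligned}
\frac{d}{dx} \left[ (1-x)^{\alpha} f_\alpha(x) \right] &= -\alpha (1-x)^{\alpha-1} f_\alpha(x) + (1-x)^{\alpha} f_\alpha'(x) \\
&= -\alpha (1-x)^{\alpha-1} f_\alpha(x) + \alpha (1-x)^{\alpha-1} f_\alpha(x) = 0.
\end{aligned} \]

Therefore $(1-x)^{\alpha} f_\alpha(x)$ is equal to a constant for all x, |x| < 1. But $f_\alpha(0) = 1$.
Thus $(1-x)^{\alpha} f_\alpha(x) = 1$; that is,
\[ f_\alpha(x) = (1-x)^{-\alpha}, \]
which proves the result. □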
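
The recurrence Γ(n + 1 + α) = (n + α)Γ(n + α) used in the proof also gives a convenient way to evaluate the series (23) without calling the Gamma function at all. The Python sketch below (the name `binomial_series` and the chosen test points are illustrative) builds the coefficients from the ratio a_{n+1}/a_n = (n + α)/(n + 1) and compares the partial sum with (1 − x)^{−α}.

```python
def binomial_series(x, alpha, terms):
    """Partial sum of (1/Gamma(alpha)) * sum_{n < terms} Gamma(n+alpha)/n! * x**n."""
    coeff = 1.0   # Gamma(alpha) / (Gamma(alpha) * 0!) = 1
    power = 1.0   # x**0
    total = 0.0
    for n in range(terms):
        total += coeff * power
        # Next coefficient via Gamma(n+1+alpha) = (n+alpha) * Gamma(n+alpha).
        coeff *= (n + alpha) / (n + 1)
        power *= x
    return total


for x, alpha in ((0.5, 2.5), (-0.9, 0.5), (0.3, 1.0)):
    print(x, alpha, binomial_series(x, alpha, 2000), (1 - x) ** (-alpha))
```

For α = 1 every coefficient equals 1 and the series reduces to the geometric series, which is a quick consistency check on the recurrence.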



The Beta Function


There are a number of important integrals that can be expressed in terms of
the Gamma function. Some of these, which can be obtained by a change of
variables, are given in the exercises. There is one integral however which is
very important and thus we state it as a theorem. Since the proof is nontrivial
and would take us too far astray, we state the result without proof. For a proof
of the theorem the reader is referred to Theorem 8.20 in the text by Rudin.

THEOREM 8.8.5 For x > 0, y > 0,
\[ \int_0^1 t^{x-1} (1-t)^{y-1}\, dt = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}. \tag{24} \]

The function
\[ B(x, y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}, \quad x, y > 0, \]
is called the Beta function.
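
Although the proof of Theorem 8.8.5 is omitted, identity (24) is easy to test numerically. The Python sketch below compares a Simpson approximation of the integral with the Gamma-function expression computed by `math.gamma`. The helper names and the restriction to x, y ≥ 1 (so that the integrand is bounded on [0, 1]) are assumptions of this illustration, not part of the theorem.

```python
import math


def simpson(f, a, b, steps):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def beta_integral(x, y, steps=20000):
    """Left side of (24); assumes x >= 1 and y >= 1."""
    return simpson(lambda t: t ** (x - 1) * (1 - t) ** (y - 1), 0.0, 1.0, steps)


def beta_gamma(x, y):
    """Right side of (24)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)


for x, y in ((2, 3), (2.5, 1.5), (4, 4)):
    print(x, y, beta_integral(x, y), beta_gamma(x, y))
```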

Exercises 8.8
1. *a. Compute $\Gamma(\tfrac{3}{2})$, $\Gamma(\tfrac{7}{2})$.
b. Prove that for n ∈ N, $\Gamma(n + \tfrac{1}{2}) = \dfrac{(2n)!\, \sqrt{\pi}}{4^n\, n!}$.
2. *By making a change of variable, prove that
\[ \Gamma(x) = \int_0^1 \left( \ln \frac{1}{t} \right)^{x-1} dt, \quad 0 < x < \infty. \]
3. Evaluate each of the following definite integrals:
*a. $\int_0^{\infty} e^{-t}\, t^{3/2}\, dt$
b. $\int_0^1 \left( \ln \frac{1}{t} \right)^{n} dt$, n ∈ N.
4. By making the change of variable $t = \sin^2 u$ in Theorem 8.8.5, prove that
\[ \int_0^{\pi/2} (\sin u)^{2n-1} (\cos u)^{2m-1}\, du = \frac{1}{2}\, \frac{\Gamma(n)\Gamma(m)}{\Gamma(n+m)}, \quad n, m > 0. \]
5. Evaluate each of the following integrals:
*a. $\int_0^{\pi/2} (\sin x)^{2n}\, dx$, n ∈ N.
b. $\int_0^{\pi/2} (\sin x)^{2n+1}\, dx$, n ∈ N.
6. Use the binomial series and term-by-term integration to find the power
series expansion of
\[ \arcsin x = \int_0^x (1-t^2)^{-1/2}\, dt. \]
7. Let α > 0. Set $\binom{\alpha}{0} = 1$ and for k ∈ N set
\[ \binom{\alpha}{k} = \frac{\alpha(\alpha-1)(\alpha-2) \cdots (\alpha-k+1)}{k!}. \]
Note, if m ∈ N,
\[ \binom{m}{k} = \frac{m!}{k!\,(m-k)!}, \quad k \le m, \quad \text{and} \quad \binom{m}{k} = 0 \text{ for } k > m. \]
a. Prove that the series $\sum_{k=0}^{\infty} \binom{\alpha}{k} x^k$ converges uniformly and absolutely
for x ∈ [−1, 1].
b. Prove that $\sum_{k=0}^{\infty} \binom{\alpha}{k} x^k = (1+x)^{\alpha}$, x ∈ [−1, 1].

Notes
Without question the most important concept of this chapter is that of uniform
convergence of a sequence or series of functions. It is the additional hypothesis
required in proving that the limit function of a sequence of continuous or integrable
functions is again continuous or integrable. As was shown by numerous examples,
pointwise convergence is not sufficient. For differentiation, uniform convergence of
$\{f_n\}$ is not sufficient; what is also required is uniform convergence of the sequence
of derivatives $\{f_n'\}$.
The example of Weierstrass (Example 8.5.3) is interesting for several reasons.
First, it provides an example of a continuous function which is nowhere differentiable
on R. Furthermore, it also provides an example of a sequence of infinitely differen-
tiable functions which converges uniformly on R, but for which the limit function is
nowhere differentiable. Exercise 7 of Section 8.5 provides another construction of a
continuous function f that is nowhere differentiable. Although this construction is
much easier, the partial sums of the series defining the function f are themselves not
differentiable everywhere. Thus it is not so surprising that f itself is not differentiable
anywhere on R.
The proof of the Weierstrass approximation theorem presented in the text is
only one of the many proofs available. A constructive proof by S. N. Bernstein using
the so-called Bernstein polynomials can be found on p. 107 of the text by Natanson
listed in the Bibliography. The proof in the text using approximate identities was
chosen because the technique involved is very important in analysis and will be
encountered later in the text. In Theorem 9.4.5 we will prove a variation of the
Weierstrass approximation theorem. At that point we will show that every continuous
real-valued function on [−π, π] with f(−π) = f(π) can be uniformly approximated
to within a given ε by a finite sum of a trigonometric series.

Miscellaneous Exercises
1. Using Miscellaneous Exercise 1 of Chapter 6 and the Weierstrass approximation
theorem, prove the following: If f ∈ R[a, b] and ε > 0 is given,
then there exists a polynomial P such that
\[ \int_a^b |f - P| < \epsilon. \]
2. Define f on R by
\[ f(x) = \begin{cases} c\, \exp\left( \dfrac{-1}{1-x^2} \right), & |x| < 1, \\ 0, & |x| \ge 1, \end{cases} \]
where $\exp(x) = e^x$, and c > 0 is chosen so that $\int_{-\infty}^{\infty} f(x)\, dx = 1$. For
λ > 0, set $f_\lambda(x) = \frac{1}{\lambda} f\left(\frac{1}{\lambda} x\right)$.
a. Prove that $f_\lambda \in C^{\infty}(R)$ for all λ > 0.
b. Prove that $f_\lambda(x) = 0$ for all x ∈ R, |x| ≥ λ, and that $\int_{-\infty}^{\infty} f_\lambda(x)\, dx = 1$.
c. Prove that for every δ > 0, $\lim_{\lambda \to 0^+} \int_{\{\delta \le |t|\}} f_\lambda(t)\, dt = 0$.

3. In this exercise, we show how the trigonometric functions may be defined
by means of power series. Define the functions S and C on R by
\[ S(x) = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!}\, x^{2k+1}, \qquad C(x) = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!}\, x^{2k}. \]

a. Show that the power series defining S and C converge for all x ∈ R.
b. Show that S′(x) = C(x) and C′(x) = −S(x), x ∈ R.
c. Show that S′′(x) = −S(x) and C′′(x) = −C(x).
d. Show that if f : R → R satisfies f′′(x) = −f(x) with f(0) = 0, f′(0) =
1, then f(x) = S(x) for all x ∈ R.
e. If f : R → R satisfies f′′(x) = −f(x), prove that there exist constants
$c_1, c_2$ such that $f(x) = c_1 S(x) + c_2 C(x)$.
f. Show that $(S(x))^2 + (C(x))^2 = 1$. (Hint: Consider the function f(x) =
$(S(x))^2 + (C(x))^2$.)
g. Show that C(x + y) = C(x)C(y) − S(x)S(y) and S(x + y) = S(x)C(y) +
C(x)S(y) for all x, y ∈ R.

Supplemental Reading

Andrushkiw, J. W., “A note on multiple series of positive terms,” Amer. Math. Monthly 68 (1961), 253–258.
Billingsly, P., “Van der Waerden’s continuous nowhere differentiable function,” Amer. Math. Monthly 89 (1982), 691.
Blank, A. A., “A simple example of a Weierstrass function,” Amer. Math. Monthly 73 (1966), 515–519.
Boas, Jr., R. P., “Partial sums of infinite series and how they grow,” Amer. Math. Monthly 84 (1977), 237–258.
Boas, Jr., R. P. and Pollard, H., “Continuous analogues of series,” Amer. Math. Monthly 80 (1973), 18–25.
Cunningham, Jr., F., “Taking limits under the integral sign,” Math. Mag. 40 (1967), 179–186.
Davis, P. J., “Leonhard Euler’s integral; a historical profile of the Gamma function,” Amer. Math. Monthly 66 (1959), 849–869.
de Silva, N., “A concise elementary proof of Arzelá’s bounded convergence theorem,” Amer. Math. Monthly 117 (2010), 918–920.
Ferguson, Le Baron O., “What can be approximated by polynomials with integer coefficients,” Amer. Math. Monthly 113 (2006), 403–414.
French, A. P., “The integral definition of the logarithm and the logarithmic series,” Amer. Math. Monthly 85 (1978), 580–582.
Garcia-Caballero, E. M. and Moreno, S. G., “Yet another generalization of a celebrated inequality of the Gamma function,” Amer. Math. Monthly 120 (2013), 821.
Kestleman, H., “Riemann integration of limit functions,” Amer. Math. Monthly 77 (1970), 182–187.
Lewin, J. W., “Some applications of the bounded convergence theorem for an introductory course in analysis,” Amer. Math. Monthly 94 (1987), 988–993.
Mathé, P., “Approximation of Hölder continuous functions by Bernstein polynomials,” Amer. Math. Monthly 106 (1999), 568–725.
McLoughlin, P. F., “A simple proof of Taylor’s Theorem,” Amer. Math. Monthly 120 (2013), 767–768.
Miller, K. S., “Derivatives of non-integer order,” Math. Mag. 68 (1995), 183–192.
Minassian, D. P. and Gaisser, J. W., “A simple Weierstrass function,” Amer. Math. Monthly 91 (1984), 254–256.
Neuschel, T., “A new proof of Stirling’s formula,” Amer. Math. Monthly 121 (2014), 350–352.
Patin, J. M., “A very short proof of Stirling’s formula,” Amer. Math. Monthly 96 (1989), 41–42.
Roy, R., “The discovery of the series formula for π by Leibniz, Gregory and Nilakantha,” Math. Mag. 63 (1990), 291–306.
Sagan, H., “An elementary proof that Schoenberg’s space filling curve is nowhere differentiable,” Math. Mag. 65 (1992), 125–128.
Schenkman, E., “The Weierstrass approximation theorem,” Amer. Math. Monthly 79 (1972), 65–66.
Weinstock, R., “Elementary evaluations of $\int_0^{\infty} e^{-x^2}\, dx$, $\int_0^{\infty} \cos x^2\, dx$, and $\int_0^{\infty} \sin x^2\, dx$,” Amer. Math. Monthly 97 (1990), 39–42.
Wen, L., “A nowhere differentiable continuous function constructed by infinite products,” Amer. Math. Monthly 109 (2002), 378–380.

9
Fourier Series

In this chapter, we consider the problem of expressing a real-valued periodic
function of period 2π in terms of a trigonometric series
\[ \tfrac{1}{2} a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx), \]
where the $a_n$ and $b_n$ are real numbers. As we will see, such series afford
much greater generality in the type of functions that can be represented as
opposed to Taylor series. The study of trigonometric series has its origins in
the monumental work of Joseph Fourier (1768–1830) on heat conduction in
solids. His 1807 presentation to the French Academy introduced a whole new
subject area in mathematics while at the same time providing very useful
techniques for solving physical problems.
Fourier’s work is the source of all modern methods in mathematical physics
involving boundary value problems and has been a source of new ideas in
mathematical analysis for the past two centuries. To see how greatly mathe-
matics has been influenced by the studies of Fourier one only needs to look
at the two volume work Trigonometric Series by A. Zygmund (Cambridge
University Press, 1968). In addition to trigonometric series, Fourier’s origi-
nal method of separation of variables leads very naturally to the study of
orthogonal functions and the representation of functions in terms of a series
of orthogonal functions. All of these have many applications in mathematical
physics and engineering.
Fourier initially claimed and tried to show, with no success, that the
Fourier series expansion of a function actually represented the function. Al-
though his claim is false, in view of the eighteenth century concept of a func-
tion this was not an unrealistic expectation. Fourier’s claim had an immediate
impact on nineteenth century mathematics. It caused mathematicians to re-
consider the definition of “function.” The question of what type of function has
a Fourier series expansion also led Riemann to the development of the theory
of the integral and the notion of an integrable function. The first substantial
progress on the convergence of a Fourier series to its defining function is due
to Dirichlet in 1829. Instead of trying to prove like Fourier that the Fourier
series always converged to its defining function, Dirichlet considered the more
restrictive problem of trying to find sufficient conditions on the function f for
which the Fourier series converges pointwise to the function.


In the first section we provide a brief introduction to the theory of orthogonal
functions and to the concept of approximation in the mean. In Section 9.2
we also introduce the notion of a complete sequence of orthogonal functions
and show that this is equivalent to the convergence in the mean of the sequence
of partial sums of the Fourier series to its defining function. The proof
of the completeness of the trigonometric system $\{1, \sin nx, \cos nx\}_{n=1}^{\infty}$ will be
presented in Section 9.4. In this section, we also prove Fejér’s theorem on the
uniform approximation of a continuous function by the nth partial sum of a
trigonometric series. In the final section, we present Dirichlet’s contributions
to the pointwise convergence problem.

9.1 Orthogonal Functions


In this section, we provide a brief introduction to the topic of orthogonal
functions and the question of representing a function by means of a series of
orthogonal functions. Although these topics have their origins in the study of
partial differential equations and boundary value problems,¹ they are closely
related to concepts normally encountered in the study of vector spaces.
If X is a vector space over R, a function ⟨ , ⟩ : X × X → R is an inner
product on X if
(a) ⟨x, x⟩ ≥ 0 for all x ∈ X,
(b) ⟨x, x⟩ = 0 if and only if x = 0,
(c) ⟨x, y⟩ = ⟨y, x⟩ for all x, y ∈ X, and
(d) ⟨ax + by, z⟩ = a⟨x, z⟩ + b⟨y, z⟩ for all x, y, z ∈ X and a, b ∈ R.
In $R^n$, the usual inner product is given by
\[ \langle a, b \rangle = \sum_{j=1}^{n} a_j b_j \]
for $a = (a_1, ..., a_n)$ and $b = (b_1, ..., b_n)$ in $R^n$. If ⟨ , ⟩ is an inner product on X,
then two non-zero vectors x, y ∈ X are orthogonal if ⟨x, y⟩ = 0. The term
“orthogonal” is synonymous with “perpendicular” and comes from geometric
considerations in $R^n$. Two non-zero vectors a and b in $R^n$ are orthogonal if
and only if they are mutually perpendicular; that is, the angle θ between the
two vectors a and b is π/2 or 90° (see Exercise 9, Section 7.4).
In the study of analysis we typically encounter vector spaces whose elements
are functions. For example, in previous sections we have shown that the
space $\ell^2$ of square summable sequences as well as the space C[a, b] of continuous
real-valued functions on [a, b] are vector spaces over R. With the usual
rules of addition and scalar multiplication, R[a, b], the set of Riemann integrable
functions on [a, b], is also a vector space over R. If for f, g ∈ R[a, b] we
define
\[ \langle f, g \rangle = \int_a^b f(x) g(x)\, dx, \]
then it is easily shown that ⟨ , ⟩ satisfies (a), (c), and (d) of the definition of
an inner product. It however does not satisfy (b). If a < b and $c_1, ..., c_n$ are a
finite number of points in [a, b], then the function
\[ f(x) = \begin{cases} 0, & x \ne c_i, \\ 1, & x = c_i, \end{cases} \]
is in R[a, b] satisfying ⟨f, f⟩ = 0, but f is not the zero function. Thus technically
⟨ , ⟩ is not an inner product on R[a, b], a minor difficulty which can
easily be overcome by defining two Riemann integrable functions f and g to
be equal if f(x) = g(x) for all x ∈ [a, b] except on a set of measure zero. This
will be explored in greater detail in Chapter 10. Alternately, if we restrict
ourselves to the subset C[a, b] of R[a, b], then ⟨f, g⟩ as defined above is an inner
product on C[a, b] (Exercise 11).

¹ For a detailed treatment of this subject the reader is referred to the texts by Berg and
McGregor or by Weinberger listed in the Bibliography.

Orthogonal Functions
We now define orthogonality with respect to the above inner product on
R[a, b].

DEFINITION 9.1.1 A finite or countable collection of Riemann integrable
functions $\{\phi_n\}$ on [a, b] satisfying $\int_a^b \phi_n^2 \ne 0$ is orthogonal on [a, b] if
\[ \langle \phi_n, \phi_m \rangle = \int_a^b \phi_n(x) \phi_m(x)\, dx = 0, \quad \text{for all } n \ne m. \]

EXAMPLES 9.1.2 (a) For our first example we consider the two functions
φ(x) = 1 and ψ(x) = x, x ∈ [−1, 1]. Since
\[ \int_{-1}^{1} \phi(x)\psi(x)\, dx = \int_{-1}^{1} x\, dx = 0, \]
the functions φ and ψ are orthogonal on the interval [−1, 1].

(b) In this example, we show that the sequence of functions $\{\sin nx\}_{n=1}^{\infty}$
is orthogonal on [−π, π]. By the trigonometric identity
\[ \sin A \sin B = \tfrac{1}{2} [\cos(A - B) - \cos(A + B)], \]
for n ≠ m,
\[ \begin{aligned}
\int_{-\pi}^{\pi} \sin nx \sin mx\, dx &= \frac{1}{2} \int_{-\pi}^{\pi} [\cos(n-m)x - \cos(n+m)x]\, dx \\
&= \frac{1}{2} \left[ \frac{\sin(n-m)x}{(n-m)} - \frac{\sin(n+m)x}{(n+m)} \right]_{-\pi}^{\pi} = 0.
\end{aligned} \]

For future reference, when n = m,
\[ \begin{aligned}
\int_{-\pi}^{\pi} \sin^2 nx\, dx &= \frac{1}{2} \int_{-\pi}^{\pi} (1 - \cos 2nx)\, dx \\
&= \frac{1}{2} \left[ x - \frac{\sin 2nx}{2n} \right]_{-\pi}^{\pi} = \pi.
\end{aligned} \]

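(c) As our final example we consider the collection
\[ \left\{ 1, \sin \tfrac{n\pi x}{L}, \cos \tfrac{n\pi x}{L} \right\}_{n=1}^{\infty} \]
on the interval [−L, L] where L > 0. As in (b), if n ≠ m, then
\[ \int_{-L}^{L} \sin \tfrac{n\pi x}{L} \sin \tfrac{m\pi x}{L}\, dx = 0. \]
Thus the collection $\{\sin \tfrac{n\pi x}{L}\}$ is orthogonal on [−L, L]. Also, by the trigonometric
identities
\[ \cos A \cos B = \tfrac{1}{2} [\cos(A - B) + \cos(A + B)], \]
\[ \sin A \cos B = \tfrac{1}{2} [\sin(A - B) + \sin(A + B)], \]
we have for n ≠ m
\[ \int_{-L}^{L} \cos \tfrac{n\pi x}{L} \cos \tfrac{m\pi x}{L}\, dx = \int_{-L}^{L} \sin \tfrac{n\pi x}{L} \cos \tfrac{m\pi x}{L}\, dx = 0. \]

Thus the functions in the collection $\{\cos \tfrac{n\pi x}{L}\}$ are all orthogonal on [−L, L] as
are the functions $\sin \tfrac{n\pi x}{L}$ and $\cos \tfrac{m\pi x}{L}$ for all n, m ∈ N with n ≠ m. For m = n
\[ \int_{-L}^{L} \sin \tfrac{n\pi x}{L} \cos \tfrac{n\pi x}{L}\, dx = \frac{L}{2n\pi} \sin^2 \tfrac{n\pi x}{L} \Big|_{-L}^{L} = 0. \]
This last identity shows that the functions $\sin \tfrac{n\pi x}{L}$ and $\cos \tfrac{n\pi x}{L}$ are also
orthogonal on [−L, L] for all n ∈ N. Finally, since
\[ \int_{-L}^{L} \sin \tfrac{n\pi x}{L}\, dx = \int_{-L}^{L} \cos \tfrac{n\pi x}{L}\, dx = 0 \]
for all n ∈ N, the constant function 1 is orthogonal to $\sin \tfrac{n\pi x}{L}$ and $\cos \tfrac{n\pi x}{L}$ for
all n ∈ N. In this example, we also have
\[ \int_{-L}^{L} \sin^2 \tfrac{n\pi x}{L}\, dx = \int_{-L}^{L} \cos^2 \tfrac{n\pi x}{L}\, dx = L. \quad \square \]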
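
The orthogonality relations of part (c) can be spot-checked numerically. The Python sketch below approximates the inner product on [−L, L] with a midpoint rule, which is very accurate for trigonometric integrands over a full period. The helper names `inner`, `s`, `c`, and the choice L = 2 are assumptions made for this illustration.

```python
import math


def inner(f, g, a, b, n=4000):
    """Midpoint-rule approximation of the integral of f(x) g(x) over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h) for i in range(n))


L = 2.0


def s(n):
    """The function x -> sin(n pi x / L)."""
    return lambda x: math.sin(n * math.pi * x / L)


def c(n):
    """The function x -> cos(n pi x / L)."""
    return lambda x: math.cos(n * math.pi * x / L)


def one(x):
    """The constant function 1."""
    return 1.0


print(inner(s(2), s(3), -L, L))   # distinct sines: close to 0
print(inner(s(2), c(2), -L, L))   # sine against cosine: close to 0
print(inner(one, c(4), -L, L))    # constant against cosine: close to 0
print(inner(s(3), s(3), -L, L))   # norm squared: close to L
```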

If in Example 9.1.2(b) we define $\phi_n(x) = \frac{1}{\sqrt{\pi}} \sin nx$, then the sequence
$\{\phi_n(x)\}_{n=1}^{\infty}$ satisfies
\[ \int_{-\pi}^{\pi} \phi_n(x)\phi_m(x)\, dx = \begin{cases} 0, & \text{when } n \ne m, \\ 1, & \text{when } n = m. \end{cases} \]

Such a sequence of orthogonal functions is given a special name.

DEFINITION 9.1.3 A finite or countable collection of Riemann integrable
functions $\{\phi_n\}$ is orthonormal on [a, b] if
\[ \int_a^b \phi_n(x)\, \phi_m(x)\, dx = \begin{cases} 0, & \text{when } n \ne m, \\ 1, & \text{when } n = m. \end{cases} \]

Given a collection $\{\phi_n\}$ of orthogonal functions on [a, b], we can always
construct a family $\{\psi_n\}$ of orthonormal functions on [a, b] by setting
\[ \psi_n(x) = \frac{1}{c_n}\, \phi_n(x), \]
where $c_n$ is defined by
\[ c_n^2 = \int_a^b \phi_n^2(x)\, dx. \]

Approximation in the Mean


Let $\{\phi_n\}$ be a finite or countable family of orthogonal functions defined on
an interval [a, b]. For each N ∈ N and $c_1, ..., c_N \in R$, consider the Nth partial
sum
\[ S_N(x) = \sum_{n=1}^{N} c_n \phi_n(x). \tag{1} \]

A natural question is, given a real-valued function f on [a, b], how must the
coefficients $c_n$ be chosen so that $S_N$ gives the best approximation to f on [a, b]?
In the Weierstrass approximation theorem we have already encountered one
form of approximation; namely uniform approximation or approximation in
the uniform norm. However, for the study of orthogonal functions there is
another type of norm approximation that turns out to be more useful.
If X is a vector space over R with inner product ⟨ , ⟩, then there is a natural
norm on X associated with this inner product. If for x ∈ X we define
\[ \|x\| = \sqrt{\langle x, x \rangle}, \]
then ‖ ‖ is a norm on X as defined in Definition 7.4.8. The verification that ‖ ‖
is a norm is left to the exercises (Exercise 12). The crucial step in proving the
triangle inequality for ‖ ‖ is the following version of the Cauchy-Schwarz
inequality: For all x, y ∈ X,
\[ |\langle x, y \rangle| \le \|x\|\, \|y\|. \]
The proof of this inequality follows verbatim the proof of Theorem 7.4.3. For
the vector space R[a, b] with inner product $\langle f, g \rangle = \int_a^b f(x)g(x)\, dx$, the norm
of a function f, denoted $\|f\|_2$, is given by
\[ \|f\|_2 = \left[ \int_a^b (f(x))^2\, dx \right]^{1/2}. \]

Thus for f ∈ R[a, b], the natural problem to consider is how must the constants
$c_n$ be chosen in order to minimize the quantity
\[ \|f - S_N\|_2^2 = \int_a^b [f(x) - S_N(x)]^2\, dx\ ? \]

This type of norm approximation is referred to as approximation in the
mean or least squares approximation. The following theorem specifies the
choice of $\{c_n\}$ so that $S_N$ provides the best approximation to f in the mean.

THEOREM 9.1.4 Let f ∈ R[a, b] and let $\{\phi_n\}$ be a finite or countable
collection of orthogonal functions on [a, b]. For N ∈ N, let $S_N$ be defined by
(1). Then the quantity
\[ \int_a^b [f(x) - S_N(x)]^2\, dx \]
is minimal if and only if
\[ c_n = \frac{\int_a^b f(x)\phi_n(x)\, dx}{\int_a^b \phi_n^2(x)\, dx}, \quad n = 1, 2, ..., N. \tag{2} \]

Furthermore, for this choice of $c_n$,
\[ \int_a^b [f(x) - S_N(x)]^2\, dx = \int_a^b f^2(x)\, dx - \sum_{n=1}^{N} c_n^2 \int_a^b \phi_n^2(x)\, dx. \tag{3} \]

Prior to proving the result, we give the following alternate statement of
the previous theorem.

COROLLARY 9.1.5 Let f ∈ R[a, b] and let $S_N(x) = \sum_{n=1}^{N} c_n \phi_n(x)$ where
the $c_n$ are defined by (2). If $T_N(x) = \sum_{n=1}^{N} a_n \phi_n(x)$, $a_n \in R$, then
\[ \int_a^b [f(x) - S_N(x)]^2\, dx \le \int_a^b [f(x) - T_N(x)]^2\, dx, \]
for any choice of $a_n$, n = 1, 2, ..., N.

Proof of Theorem 9.1.4 For fixed N ∈ N,
\[ \begin{aligned}
0 &\le \int_a^b [f(x) - S_N(x)]^2\, dx \\
&= \int_a^b f^2(x)\, dx - 2 \int_a^b f(x) S_N(x)\, dx + \int_a^b S_N^2(x)\, dx.
\end{aligned} \tag{4} \]

By linearity of the integral,
\[ \int_a^b f(x) S_N(x)\, dx = \sum_{n=1}^{N} c_n \int_a^b f(x)\phi_n(x)\, dx. \]

Also,
\[ \int_a^b S_N^2(x)\, dx = \int_a^b S_N(x) \left( \sum_{n=1}^{N} c_n \phi_n(x) \right) dx = \sum_{n=1}^{N} c_n \int_a^b S_N(x)\phi_n(x)\, dx. \]

But
\[ \int_a^b S_N(x)\phi_n(x)\, dx = \sum_{k=1}^{N} c_k \int_a^b \phi_k(x)\phi_n(x)\, dx, \]
which by orthogonality,
\[ = c_n \int_a^b \phi_n^2(x)\, dx. \]

Therefore,
\[ \int_a^b S_N^2(x)\, dx = \sum_{n=1}^{N} c_n^2 \int_a^b \phi_n^2(x)\, dx. \]

Upon substituting into (4) we obtain
\[ \begin{aligned}
0 &\le \int_a^b [f(x) - S_N(x)]^2\, dx \\
&= \int_a^b f^2(x)\, dx - 2 \sum_{n=1}^{N} c_n \int_a^b f(x)\phi_n(x)\, dx + \sum_{n=1}^{N} c_n^2 \int_a^b \phi_n^2(x)\, dx,
\end{aligned} \]
which upon completing the square
\[ = \int_a^b f^2(x)\, dx + \sum_{n=1}^{N} \int_a^b \phi_n^2(x)\, dx \left[ c_n - \frac{\int_a^b f \phi_n}{\int_a^b \phi_n^2} \right]^2 - \sum_{n=1}^{N} \frac{\left[ \int_a^b f \phi_n \right]^2}{\int_a^b \phi_n^2}. \]

The coefficients $c_n$ occur only in the middle term. Since this term is nonnegative,
the right side is a minimum if and only if
\[ c_n = \frac{\int_a^b f \phi_n}{\int_a^b \phi_n^2}, \quad n = 1, 2, ..., N. \]

With this choice of $c_n$, we also obtain formula (3) upon substitution. □

FIGURE 9.1
Graphs of f and $S_2$

EXAMPLE 9.1.6 As was previously shown, the functions $\phi_1(x) = 1$ and
$\phi_2(x) = x$ are orthogonal on [−1, 1]. Let $f(x) = x^3 + 1$. Then
\[ c_1 = \frac{\int_{-1}^{1} f(x)\phi_1(x)\, dx}{\int_{-1}^{1} \phi_1^2(x)\, dx} = \frac{1}{2} \int_{-1}^{1} (x^3 + 1)\, dx = 1, \]
and
\[ c_2 = \frac{\int_{-1}^{1} f(x)\phi_2(x)\, dx}{\int_{-1}^{1} \phi_2^2(x)\, dx} = \frac{3}{2} \int_{-1}^{1} (x^4 + x)\, dx = \frac{3}{5}. \]

Therefore, $S_2(x) = 1 + \frac{3}{5} x$ is the best approximation in the mean to $f(x) = 1 + x^3$
on [−1, 1]. The graphs of f and $S_2$ are given in Figure 9.1. □
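
The computation in Example 9.1.6, together with identity (3) of Theorem 9.1.4, can be reproduced exactly with rational arithmetic. In the Python sketch below, polynomials are stored as coefficient lists (lowest degree first); this representation and the helper names `poly_mul` and `inner` are choices made for the illustration.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out


def inner(p, q, a=-1, b=1):
    """Exact value of the integral of p(x) q(x) over [a, b]."""
    prod = poly_mul(p, q)
    return sum(coef * (Fraction(b) ** (k + 1) - Fraction(a) ** (k + 1)) / (k + 1)
               for k, coef in enumerate(prod))


f = [1, 0, 0, 1]   # f(x) = 1 + x^3
phi1 = [1]         # phi_1(x) = 1
phi2 = [0, 1]      # phi_2(x) = x

# Coefficients from formula (2).
c1 = inner(f, phi1) / inner(phi1, phi1)
c2 = inner(f, phi2) / inner(phi2, phi2)

# Both sides of identity (3) with N = 2.
residual = [f[0] - c1, f[1] - c2, f[2], f[3]]   # f - S_2
lhs = inner(residual, residual)
rhs = inner(f, f) - c1 ** 2 * inner(phi1, phi1) - c2 ** 2 * inner(phi2, phi2)

print(c1, c2, lhs, rhs)
```

Because the arithmetic is exact, the two sides of (3) agree as fractions rather than only up to rounding.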

DEFINITION 9.1.7 Let $\{\phi_n\}_{n=1}^{\infty}$ be a sequence of orthogonal functions on
[a, b] and let f ∈ R[a, b]. For each n ∈ N, the number
\[ c_n = \frac{\int_a^b f(x)\phi_n(x)\, dx}{\int_a^b \phi_n^2(x)\, dx} \tag{5} \]
is called the Fourier coefficient of f with respect to $\{\phi_n\}$. The series
$\sum_{n=1}^{\infty} c_n \phi_n(x)$ is called the Fourier series of f. This is denoted by
\[ f(x) \sim \sum_{n=1}^{\infty} c_n \phi_n(x). \tag{6} \]

Remark. The notation “∼” in (6) only means that the coefficients $\{c_n\}$ in
the series are given by formula (5). Nothing is implied about convergence of
the series!

EXAMPLE 9.1.8 In Example 9.1.2(b) it was shown that the sequence of
functions $\{\sin nx\}_{n=1}^{\infty}$ is orthogonal on [−π, π]. Since
\[ \int_{-\pi}^{\pi} \sin^2 nx\, dx = \pi, \quad n = 1, 2, ..., \]
if f ∈ R[−π, π], the Fourier coefficients $c_n$, n = 1, 2, ..., of f with respect to
the orthogonal system {sin nx} are given by
\[ c_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx\, dx, \]
and the Fourier series of f becomes
\[ f(x) \sim \sum_{n=1}^{\infty} c_n \sin nx. \]

As indicated above, nothing is implied about convergence. Even if the series
should converge, it need not converge to the function f. Since the terms of
the series are odd functions of x, the series, if it converges, defines an odd
function on [−π, π]. Thus unless f itself is odd, the series could not converge
to f. For example, if f(x) = 1, then
\[ c_n = \frac{1}{\pi} \int_{-\pi}^{\pi} \sin nx\, dx = \frac{-1}{n\pi} \cos nx \Big|_{-\pi}^{\pi} = 0. \]

In this case, the series converges for all x, but clearly not to f(x) = 1. □

Bessel’s Inequality
For each N ∈ N, let SN (x) denote the N th partial sum of the Fourier series
of f ; i.e.,
N
X
SN (x) = cn φn (x),
n=1
416 Introduction to Real Analysis

where the cn are the Fourier coefficients of f with respect to the sequence
{φn } of orthogonal functions on [a, b]. Then by identity (3) of Theorem 9.1.4,
Z b N
X Z b
0≤ f 2 (x) dx − c2n φ2n (x) dx.
a n=1 a

Therefore
N
X Z b Z b
c2n φ2n (x) dx ≤ f 2 (x) dx.
n=1 a a

Since this holds for every N ∈ N, by letting N → ∞ we obtain the following


inequality.

THEOREM 9.1.9 (Bessel’s Inequality) If f ∈ R[a, b] and $\{c_n\}_{n=1}^{\infty}$ are
the Fourier coefficients of f with respect to the sequence of orthogonal functions
$\{\varphi_n\}_{n=1}^{\infty}$, then

\[ \sum_{n=1}^{\infty} c_n^2 \int_a^b \varphi_n^2(x)\,dx \le \int_a^b f^2(x)\,dx. \]


In Example 9.1.8 with f(x) = 1, $\int_{-\pi}^{\pi} f^2(x)\,dx = 2\pi$, and cn = 0 for all
n = 1, 2, .... Thus it is clear that equality need not hold in Bessel’s inequality.
However, there is one consequence of Theorem 9.1.9 which will prove useful
later.
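
Bessel’s inequality can also be observed numerically. The Python sketch below compares partial sums of the left side with the integral of f². The helper names, the midpoint rule, and the test case f(x) = x with the system {sin nx} on [−π, π] are illustrative assumptions, not part of the text.

```python
import math

def integrate(g, a, b, n=4000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

A, B = -math.pi, math.pi
f = lambda x: x                      # illustrative choice of f

def bessel_partial_sum(N):
    """Sum over n <= N of c_n^2 * (integral of phi_n^2), with phi_n(x) = sin(n x)."""
    total = 0.0
    for n in range(1, N + 1):
        norm_sq = integrate(lambda x: math.sin(n * x) ** 2, A, B)
        c_n = integrate(lambda x: f(x) * math.sin(n * x), A, B) / norm_sq
        total += c_n ** 2 * norm_sq
    return total

rhs = integrate(lambda x: f(x) ** 2, A, B)       # integral of f^2 over [-pi, pi]
partial_sums = [bessel_partial_sum(N) for N in (1, 5, 20)]
print(partial_sums, rhs)
```

Each partial sum stays below the right-hand side, which is about 20.67 for this f, as the inequality requires.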

COROLLARY 9.1.10 Suppose $\{\varphi_n\}_{n=1}^{\infty}$ is a sequence of orthogonal functions
on [a, b]. If f ∈ R[a, b], then

\[ \lim_{n\to\infty} \frac{\int_a^b f(x)\,\varphi_n(x)\,dx}{\sqrt{\int_a^b \varphi_n^2(x)\,dx}} = 0. \]

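Proof. Since f ∈ R[a, b], $\int_a^b f^2(x)\,dx$ is finite. Thus by Bessel’s inequality, the
series $\sum c_n^2 \int_a^b \varphi_n^2$ converges. As a consequence,

\[ \lim_{n\to\infty} c_n^2 \int_a^b \varphi_n^2(x)\,dx = 0, \]

and thus

\[ \lim_{n\to\infty} c_n \sqrt{\int_a^b \varphi_n^2(x)\,dx} = \lim_{n\to\infty} \frac{\int_a^b f(x)\,\varphi_n(x)\,dx}{\sqrt{\int_a^b \varphi_n^2(x)\,dx}} = 0. \qquad \square \]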

Exercises 9.1
1. *Let f (x) = sin πx, φ1 (x) = 1, and φ2 (x) = x. Find c1 and c2 so that
S2 (x) = c1 φ1 (x) + c2 φ2 (x) gives the best approximation in the mean to
f on [−1, 1].
2. a. Show that the polynomials P0(x) = 1, P1(x) = x, and $P_2(x) = \frac{3}{2}x^2 - \frac{1}{2}$
are orthogonal on [−1, 1].
b. Let
\[ f(x) = \begin{cases} 0, & -1 \le x < 0, \\ 1, & 0 \le x \le 1. \end{cases} \]
Find the constants c0, c1, and c2, such that S2(x) = c0 P0(x) + c1 P1(x) +
c2 P2(x) gives the best approximation in the mean to f on [−1, 1].
3. *a. Let φ0(x) = 1, φ1(x) = x − a1, $\varphi_2(x) = x^2 - a_2 x - a_3$. Determine the
constants a1, a2, and a3, so that {φ0, φ1, φ2} are orthogonal on [0, 1].
*b. Find the polynomial of degree less than or equal to 2 that best
approximates f (x) = sin πx in the mean on [0, 1].
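4. Let $\{\varphi_n\}_{n=1}^{\infty}$ be a sequence of orthogonal functions on [a, b]. For f, g ∈
R[a, b] with $f \sim \sum a_n \varphi_n$ and $g \sim \sum b_n \varphi_n$, show that for α, β ∈ R,

\[ (\alpha f + \beta g) \sim \sum_{n=1}^{\infty} (\alpha a_n + \beta b_n)\varphi_n. \]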

5. a. Show that the sequence $\{\sin nx\}_{n=1}^{\infty}$ is orthogonal on [0, π].
b. For f ∈ R[0, π], show that the Fourier series of f with respect to the
sequence {sin nx} is given by

\[ \sum_{n=1}^{\infty} b_n \sin nx \quad\text{where}\quad b_n = \frac{2}{\pi}\int_0^{\pi} f(x)\sin nx\,dx. \]

*c. Find the Fourier series of f(x) = x on [0, π] in terms of the orthogonal
sequence {sin nx}.
6. a. Show that the sequence $\{1, \cos nx\}_{n=1}^{\infty}$ is orthogonal on [0, π].
b. For f ∈ R[0, π], show that the Fourier series of f with respect to the
above orthogonal sequence is given by

\[ \frac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \cos nx, \]

where $a_0 = \frac{2}{\pi}\int_0^{\pi} f(x)\,dx$ and $a_n = \frac{2}{\pi}\int_0^{\pi} f(x)\cos nx\,dx$, n = 1, 2, ....
*c. Find the Fourier series of f(x) = x on [0, π] in terms of the orthogonal
sequence {1, cos nx}.
7. If f ∈ R[0, π], prove that
\[ \lim_{n\to\infty} \int_0^{\pi} f(x)\sin nx\,dx = \lim_{n\to\infty} \int_0^{\pi} f(x)\cos nx\,dx = 0. \]

8. Let {φn} be a sequence of orthogonal functions on [a, b]. If the series
$\sum_{n=1}^{\infty} a_n \varphi_n(x)$ converges uniformly to a function f(x) on [a, b], prove that
for each n ∈ N, an is the Fourier coefficient of f.

9. Let {an} be a sequence in (0, 1) satisfying $0 < a_{n+1} < a_n < 1$ for all
n ∈ N. Define φn on [0, 1] by

\[ \varphi_n(x) = \begin{cases}
0, & 0 \le x < a_{n+1}, \\[4pt]
\dfrac{2(x - a_{n+1})}{a_n - a_{n+1}}, & a_{n+1} \le x < \frac{1}{2}(a_{n+1} + a_n), \\[8pt]
\dfrac{-2(x - a_n)}{a_n - a_{n+1}}, & \frac{1}{2}(a_{n+1} + a_n) \le x \le a_n, \\[4pt]
0, & a_n < x \le 1.
\end{cases} \]

Show that {φn} is orthogonal in R[0, 1] and compute $\|\varphi_n\|_2$ for each
n ∈ N.
10. Let P0(x) = 1 and for n ∈ N let
\[ P_n(x) = \frac{1}{2^n n!}\,\frac{d^n}{dx^n}(1 - x^2)^n, \qquad x \in [-1, 1]. \]
The polynomials Pn are called the Legendre polynomials on [−1, 1].
a. Find P1, P2, and P3.
b. Show that the sequence $\{P_n\}_{n=0}^{\infty}$ is orthogonal on [−1, 1]. (Hint: Use
repeated integration by parts.)
11. For f, g ∈ C[a, b] define $\langle f, g\rangle = \int_a^b f(x)g(x)\,dx$. Prove that ⟨ , ⟩ is an
inner product on C[a, b].
12. Let X be a vector space over R with inner product ⟨ , ⟩. Prove each of
the following:
*a. |⟨x, y⟩| ≤ ‖x‖ ‖y‖ for all x, y ∈ X.
b. ‖ ‖ is a norm on X.
13. (Cauchy-Schwarz Inequality) Use the previous exercise to prove that
if f, g ∈ R[a, b], then
\[ \left( \int_a^b f(x)g(x)\,dx \right)^2 \le \left( \int_a^b f^2(x)\,dx \right)\left( \int_a^b g^2(x)\,dx \right). \]

14. Let X be a vector space over R with inner product ⟨ , ⟩. If {y1, ..., yn}
are non-zero orthogonal vectors in X and x ∈ X, prove that the quantity
‖x − (c1 y1 + · · · + cn yn)‖ is a minimum if and only if
\[ c_i = \frac{\langle x, y_i\rangle}{\|y_i\|^2} \]
for all i = 1, ..., n. (Imitate the proof of Theorem 9.1.4.)

9.2 Completeness and Parseval’s Equality


In this section, we look for necessary and sufficient conditions on the sequence
{φn } of orthogonal functions on [a, b] for which equality holds in Bessel’s
inequality. To accomplish this it will be useful to introduce the notion of
convergence in the mean.

DEFINITION 9.2.1 A sequence {fn} of Riemann integrable functions on
[a, b] converges in the mean to f ∈ R[a, b] if

\[ \lim_{n\to\infty} \int_a^b [f(x) - f_n(x)]^2\,dx = 0. \]

If we consider R[a, b] as a normed linear space with norm

\[ \|f\|_2 = \left[ \int_a^b (f(x))^2\,dx \right]^{1/2}, \]

then convergence in the mean is nothing else but convergence in norm
as defined in Definition 8.3.9. Thus a sequence {fn} in R[a, b] converges to
f ∈ R[a, b] in the mean if and only if $\lim_{n\to\infty} \|f - f_n\|_2 = 0$. Convergence in the
mean is sometimes also referred to as mean-square convergence.
It is natural to ask how convergence in the mean is related to pointwise or
uniform convergence. Our first theorem proves that uniform convergence implies
convergence in the mean. As should be expected, pointwise convergence
is not sufficient (Exercise 2). In the other direction, we will show in Example
9.2.3 that convergence in the mean does not imply pointwise convergence, and
thus certainly not uniform convergence. There we construct a sequence {fn}
of Riemann integrable functions on [0, 1] such that $\|f_n\|_2 \to 0$, but for which
{fn(x)} fails to converge for any x ∈ [0, 1].

THEOREM 9.2.2 If f, fn, n = 1, 2, ..., are Riemann integrable on [a, b],
and {fn} converges uniformly to f on [a, b], then {fn} converges in the mean
to f on [a, b].

Proof. Since the proof of this result is similar to the proof of Theorem 8.4.1,
we leave it as an exercise (Exercise 1). 

EXAMPLE 9.2.3 In this example, we construct a sequence {fn} on [0, 1]
that converges to zero in the mean, but for which {fn(x)} does not converge
for any x ∈ [0, 1]. This sequence is constructed as follows: For each n ∈ N,
write $n = 2^k + j$ where k = 0, 1, 2, ... and $0 \le j < 2^k$. For example, $1 = 2^0 + 0$,
$2 = 2^1 + 0$, $3 = 2^1 + 1$, etc. Define fn on [0, 1] by

\[ f_n(x) = \begin{cases} 1, & \dfrac{j}{2^k} \le x \le \dfrac{j+1}{2^k}, \\[6pt] 0, & \text{otherwise.} \end{cases} \]

The first four functions f1, f2, f3, and f4 are given as follows: f1(x) = 1, and

\[ f_2(x) = \begin{cases} 1, & 0 \le x \le \frac{1}{2}, \\ 0, & \frac{1}{2} < x \le 1, \end{cases} \qquad
f_3(x) = \begin{cases} 0, & 0 \le x < \frac{1}{2}, \\ 1, & \frac{1}{2} \le x \le 1, \end{cases} \qquad
f_4(x) = \begin{cases} 1, & 0 \le x \le \frac{1}{4}, \\ 0, & \frac{1}{4} < x \le 1. \end{cases} \]

For each n ∈ N, fn ∈ R[0, 1] with

\[ \int_0^1 f_n^2(x)\,dx = \int_{j/2^k}^{(j+1)/2^k} 1\,dx = \frac{1}{2^k}. \]

Thus $\lim_{n\to\infty} \int_0^1 f_n^2(x)\,dx = 0$. On the other hand, if x ∈ [0, 1], then the sequence
{fn(x)} contains an infinite number of 0’s and 1’s, and thus does not converge.
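
The construction can be reproduced in a few lines of Python. The function names below are hypothetical and the sample point x = 0.3 is an arbitrary choice; the sketch only illustrates that the mean-square integral shrinks while fn(x) keeps returning to both 0 and 1.

```python
def decompose(n):
    """Return (k, j) with n = 2**k + j and 0 <= j < 2**k, for an integer n >= 1."""
    k = n.bit_length() - 1
    return k, n - 2 ** k

def f(n, x):
    """f_n(x): 1 on [j/2^k, (j+1)/2^k], 0 elsewhere on [0, 1]."""
    k, j = decompose(n)
    return 1 if j / 2 ** k <= x <= (j + 1) / 2 ** k else 0

def mean_square(n):
    """Exact value of the integral of f_n^2 over [0, 1], namely 1 / 2^k."""
    k, _ = decompose(n)
    return 1 / 2 ** k

x = 0.3
values = [f(n, x) for n in range(1, 65)]      # f_1(x), ..., f_64(x)
print(values)
print([mean_square(n) for n in (1, 2, 4, 8, 16, 32, 64)])
```

For k ≥ 1 and this x, each block $2^k \le n < 2^{k+1}$ contains exactly one index with fn(0.3) = 1 and the remaining values are 0, which is why {fn(0.3)} cannot converge.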


In the following theorem we prove that convergence in the mean of the
partial sums of the Fourier series is equivalent to equality in Bessel’s inequality.

THEOREM 9.2.4 Let $\{\varphi_n\}_{n=1}^{\infty}$ be a sequence of orthogonal functions on
[a, b]. Then the following are equivalent:
(a) For every f ∈ R[a, b],

\[ \lim_{N\to\infty} \int_a^b [f(x) - S_N(x)]^2\,dx = 0, \]

where SN is the N th partial sum of the Fourier series of f.
(b) For every f ∈ R[a, b],

\[ \sum_{n=1}^{\infty} c_n^2 \int_a^b \varphi_n^2(x)\,dx = \int_a^b f^2(x)\,dx, \qquad \text{(Parseval’s equality)} \]

where the cn are the Fourier coefficients of f.


Proof. Suppose $S_N(x) = \sum_{n=1}^{N} c_n \varphi_n(x)$ is the N th partial sum of the Fourier
series of f. Then by Theorem 9.1.4,

\[ \int_a^b [f(x) - S_N(x)]^2\,dx = \int_a^b f^2(x)\,dx - \sum_{n=1}^{N} c_n^2 \int_a^b \varphi_n^2(x)\,dx. \]

From this it follows immediately that {SN} converges in the mean to f if and
only if Parseval’s equality holds. □

DEFINITION 9.2.5 A sequence $\{\varphi_n\}_{n=1}^{\infty}$ of orthogonal functions on [a, b]
is said to be complete if for every f ∈ R[a, b],

\[ \sum_{n=1}^{\infty} c_n^2 \int_a^b \varphi_n^2(x)\,dx = \int_a^b f^2(x)\,dx. \]

As a consequence of the previous theorem, the orthogonal sequence {φn}
is complete on [a, b] if and only if for every f ∈ R[a, b], the sequence {SN}
of the partial sums of the Fourier series of f converges in the mean to f.
We now prove some additional consequences of completeness of an orthogonal
sequence.

THEOREM 9.2.6 If the sequence $\{\varphi_n\}_{n=1}^{\infty}$ of orthogonal functions on [a, b]
is complete, and if f is a continuous real-valued function on [a, b] satisfying

\[ \int_a^b f(x)\,\varphi_n(x)\,dx = 0 \quad \text{for all } n = 1, 2, \ldots, \]

then f(x) = 0 for all x ∈ [a, b].

Proof. The hypothesis implies that the Fourier coefficients cn of f are zero
for all n ∈ N. Thus by Parseval’s equality,

\[ \int_a^b f^2(x)\,dx = 0. \]

Since $f^2$ is continuous and nonnegative, by Exercise 7, Section 6.1, this holds
if and only if $f^2(x) = 0$ for all x ∈ [a, b]. Thus f(x) = 0 for all x ∈ [a, b]. □
There is a converse to Theorem 9.2.6. Since the proof of the converse
requires a knowledge of the Lebesgue integral, we only state the result. A
sketch of the proof is provided in the miscellaneous exercises (Exercise 3) of
Chapter 10.

THEOREM 9.2.7 If $\{\varphi_n\}_{n=1}^{\infty}$ is a sequence of orthogonal functions on [a, b]
having the property that the only real-valued continuous function f on [a, b]
satisfying

\[ \int_a^b f(x)\,\varphi_n(x)\,dx = 0 \quad \text{for all } n = 1, 2, \ldots \]

is the zero function, then the system $\{\varphi_n\}_{n=1}^{\infty}$ is complete.

For the orthogonal system $\{\sin nx\}_{n=1}^{\infty}$ on [−π, π] and f(x) = 1, we have
cn = 0 for all n = 1, 2, .... Since f is continuous and not identically zero,
it follows from Theorem 9.2.6 that the orthogonal system {sin nx} is not
complete on [−π, π]. However, as we will see in Exercise 3 of Section 9.4,
this system is complete on [0, π].

Another consequence of completeness is the following: If {φn} is
complete on [a, b] and f, g are continuous real-valued functions on [a, b] satisfying

\[ \int_a^b f(x)\,\varphi_n(x)\,dx = \int_a^b g(x)\,\varphi_n(x)\,dx \]

for all n = 1, 2, ..., then f(x) = g(x) for all x ∈ [a, b]. The above assumption
simply means that f and g have the same Fourier coefficients. To prove the
result, apply Theorem 9.2.6 to h(x) = f(x) − g(x).

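THEOREM 9.2.8 Suppose $\{\varphi_n\}_{n=1}^{\infty}$ is complete on [a, b], f, g ∈ R[a, b] with

\[ f(x) \sim \sum_{n=1}^{\infty} c_n \varphi_n(x) \quad\text{and}\quad g(x) \sim \sum_{n=1}^{\infty} b_n \varphi_n(x). \]

Then,

\[ \int_a^b f(x)g(x)\,dx = \sum_{n=1}^{\infty} c_n b_n \int_a^b \varphi_n^2(x)\,dx. \]

Proof. Exercise 5. □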

Exercises 9.2
1. Prove Theorem 9.2.2.
2. For n ∈ N, define the function fn on [0, 1] by
\[ f_n(x) = \begin{cases} \sqrt{n}, & 0 < x < \frac{1}{n}, \\ 0, & \text{elsewhere.} \end{cases} \]
Show that {fn} converges to 0 pointwise, but not in the mean.
3. Consider the orthogonal system $\{1, \cos\frac{n\pi x}{L}, \sin\frac{n\pi x}{L}\}_{n=1}^{\infty}$ on [−L, L].
a. Show that if f ∈ R[−L, L], then the Fourier series of f with respect
to the above orthogonal system is given by

\[ f(x) \sim \frac{1}{2}a_0 + \sum_{n=1}^{\infty}\left( a_n \cos\frac{n\pi x}{L} + b_n \sin\frac{n\pi x}{L} \right), \]

where
\[ a_n = \frac{1}{L}\int_{-L}^{L} f(x)\cos\frac{n\pi x}{L}\,dx, \quad n = 0, 1, 2, \ldots, \quad\text{and} \]
\[ b_n = \frac{1}{L}\int_{-L}^{L} f(x)\sin\frac{n\pi x}{L}\,dx, \quad n = 1, 2, \ldots \]
b. Show that Bessel’s inequality becomes
\[ \frac{1}{2}a_0^2 + \sum_{n=1}^{\infty}(a_n^2 + b_n^2) \le \frac{1}{L}\int_{-L}^{L} f^2(x)\,dx. \]

4. *a. Assuming that the orthogonal system {sin nx} is complete on [0, π],
show that Parseval’s equality becomes
\[ \sum_{n=1}^{\infty} b_n^2 = \frac{2}{\pi}\int_0^{\pi} [f(x)]^2\,dx \quad\text{where}\quad b_n = \frac{2}{\pi}\int_0^{\pi} f(x)\sin nx\,dx. \]
*b. Use Parseval’s equality and the indicated function to find the sum of
the given series.
(i) $\sum_{k=1}^{\infty} \frac{1}{(2k-1)^2}$, f(x) = 1. (ii) $\sum_{k=1}^{\infty} \frac{1}{k^2}$, f(x) = x.

5. *Prove Theorem 9.2.8.


6. *Show by example that continuity of f is required in Theorem 9.2.6.

9.3 Trigonometric and Fourier Series


In Section 9.1, we introduced Fourier series with respect to any system
$\{\varphi_n\}_{n=1}^{\infty}$ of orthogonal functions on [a, b]. In this section, we will emphasize
the trigonometric system

\[ \left\{ 1, \cos\frac{n\pi x}{L}, \sin\frac{n\pi x}{L} \right\}_{n=1}^{\infty}, \]

which by Example 9.1.2(c) is orthogonal on [−L, L]. For convenience we will
take L = π.
Any series of the form

\[ \frac{1}{2}A_0 + \sum_{n=1}^{\infty}(A_n \cos nx + B_n \sin nx), \]

where the An and Bn are real numbers, is called a trigonometric series.
For example, the series

\[ \sum_{n=2}^{\infty} \frac{\sin nx}{\ln n} \quad\text{and}\quad \sum_{n=1}^{\infty} \frac{\cos nx}{n} \]

are both examples of trigonometric series. Since the coefficients

\[ \left\{ \frac{1}{\ln n} \right\}_{n=2}^{\infty} \quad\text{and}\quad \left\{ \frac{1}{n} \right\}_{n=1}^{\infty} \]

are nonnegative and decrease to zero, by Theorem 7.2.6 the first series converges
for all x ∈ R, whereas the second converges for all x ∈ R, except
x = 2pπ, p ∈ Z.

Fourier Series
For the orthogonal system $\{1, \cos nx, \sin nx\}_{n=1}^{\infty}$ on [−π, π] we have

\[ \int_{-\pi}^{\pi} \sin^2 nx\,dx = \int_{-\pi}^{\pi} \cos^2 nx\,dx = \pi, \quad\text{and}\quad \int_{-\pi}^{\pi} 1^2\,dx = 2\pi. \]

Thus by Definition 9.1.7, the Fourier coefficients of a function f ∈ R[−π, π]
with respect to the orthogonal system are defined as follows.

DEFINITION 9.3.1 Let f ∈ R[−π, π]. The Fourier coefficients of f with
respect to the orthogonal system {1, cos nx, sin nx} are defined by

\[ a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,dx, \]
\[ a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx, \quad n = 1, 2, \ldots, \]
\[ b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx, \quad n = 1, 2, \ldots \]

Also, the Fourier series of f is given by

\[ f(x) \sim \frac{1}{2}a_0 + \sum_{n=1}^{\infty}(a_n \cos nx + b_n \sin nx). \]

Remark. For the constant function φ0 = 1, since $\int_{-\pi}^{\pi} (\varphi_0)^2 = 2\pi$, the term a0
according to Definition 9.1.7 should be defined as $\frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,dx$. However, for
notational convenience it is easier to define a0 as $\frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,dx$ and to include
the constant $\frac{1}{2}$ in the definition of the Fourier series.

For the orthogonal system {1, cos nx, sin nx}, by Exercise 3(b) of the previous
section, Bessel’s inequality of Theorem 9.1.9 becomes

\[ \frac{1}{2}a_0^2 + \sum_{n=1}^{\infty}(a_n^2 + b_n^2) \le \frac{1}{\pi}\int_{-\pi}^{\pi} [f(x)]^2\,dx. \qquad \text{(Bessel’s Inequality)} \]

Thus for f ∈ R[−π, π] the sequences {an} and {bn} of Fourier coefficients of
f are square summable sequences. In Theorem 10.8.7 we will prove that if
{an} and {bn} are square summable sequences, then there exists a Lebesgue
integrable function f such that {an} and {bn} are the Fourier coefficients of
f.
Remark on Notation. In subsequent sections we will primarily be interested
in real-valued functions defined on all of R that are periodic of period 2π. As a
consequence, in the examples and exercises, rather than defining our functions
on [−π, π], we only define the function f on [−π, π) with the convention that
f (π) = f (−π). This then allows us to extend f to all of R as a 2π periodic
function according to the following definition.

DEFINITION 9.3.2 For a real-valued function f defined on [−π, π), the
periodic extension (of period 2π) of f to R is obtained by defining f(x) =
f(x − 2kπ), where k ∈ Z is such that x − 2kπ ∈ [−π, π).
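
The definition translates directly into code. In the Python sketch below, the name `periodic_extension` and the sample step function are assumptions chosen for illustration; the integer k is found with a floor so that x − 2kπ lands in [−π, π).

```python
import math

def periodic_extension(f):
    """Extend f, defined on [-pi, pi), to all of R with period 2*pi."""
    def extended(x):
        # Choose the integer k with x - 2*k*pi in [-pi, pi).
        k = math.floor((x + math.pi) / (2 * math.pi))
        return f(x - 2 * k * math.pi)
    return extended

# Sample function on [-pi, pi): 0 on [-pi, 0) and 1 on [0, pi).
step = lambda x: 0.0 if x < 0 else 1.0
g = periodic_extension(step)
print(g(0.5), g(0.5 + 2 * math.pi), g(-0.5 - 4 * math.pi), g(math.pi))
```

Note that g(π) is computed as f(−π), in line with the convention f(π) = f(−π) adopted above.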

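EXAMPLES 9.3.3 (a) Let
\[ f(x) = \begin{cases} 0, & -\pi \le x < 0, \\ 1, & 0 \le x < \pi. \end{cases} \]
Then

\[ a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,dx = \frac{1}{\pi}\int_0^{\pi} 1\,dx = 1, \]

and for n = 1, 2, ...,

\[ a_n = \frac{1}{\pi}\int_0^{\pi} \cos nx\,dx = \left. \frac{1}{n\pi}\sin nx \right|_0^{\pi} = 0, \]
\[ b_n = \frac{1}{\pi}\int_0^{\pi} \sin nx\,dx = \left. \frac{-1}{n\pi}\cos nx \right|_0^{\pi} = \frac{1}{n\pi}[1 - \cos n\pi] = \frac{1}{n\pi}[1 - (-1)^n]. \]

In the above we have used the fact that $\cos n\pi = (-1)^n$. Thus the Fourier
series of f is given by

\[ f(x) \sim \frac{1}{2} + \sum_{n=1}^{\infty} \frac{1}{n\pi}[1 - (-1)^n]\sin nx = \frac{1}{2} + \frac{2}{\pi}\sum_{k=0}^{\infty} \frac{1}{2k+1}\sin(2k+1)x. \]

If SN(x) denotes the N th partial sum of the Fourier series, then S1 and S3
are given by

\[ S_1(x) = \frac{1}{2} + \frac{2}{\pi}\sin x \quad\text{and}\quad S_3(x) = \frac{1}{2} + \frac{2}{\pi}\left( \sin x + \frac{1}{3}\sin 3x \right). \]

The graphs of f, S1, S3, S5, and S15 are given in Figure 9.2.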
FIGURE 9.2
Graphs of f, S1, S3, S5, and S15

(b) Let f(x) = x. Before we compute the Fourier coefficients, we will make
several observations which simplify this task. Recall that a function g(x) is
even on [−a, a] if g(−x) = g(x) for all x, and g(x) is odd if g(−x) = −g(x)
for all x. By Exercise 4 of Section 6.2, if g(x) is even on [−a, a], then

\[ \int_{-a}^{a} g(x)\,dx = 2\int_0^{a} g(x)\,dx, \]

whereas if g(x) is odd,

\[ \int_{-a}^{a} g(x)\,dx = 0. \]

The functions sin nx are all odd, whereas cos nx are even for all n. Therefore
since f(x) = x is odd, x cos nx is odd and x sin nx is even. Thus, an = 0 for
all n = 0, 1, 2, ..., and

\[ b_n = \frac{2}{\pi}\int_0^{\pi} x\sin nx\,dx, \]

which by an integration by parts

\[ = \frac{2}{\pi}\left[ \left. \frac{-x}{n}\cos nx \right|_0^{\pi} + \frac{1}{n}\int_0^{\pi} \cos nx\,dx \right] = -\frac{2}{n}\cos n\pi = \frac{2}{n}(-1)^{n+1}. \]

Therefore

\[ x \sim 2\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}\sin nx. \qquad \square \]
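
The partial sums in parts (a) and (b) are easy to evaluate numerically. The following Python sketch uses the coefficients computed in the example; the function names and the sample point x = π/2 are assumptions for illustration, and the observed behavior is a numerical observation, not a convergence proof.

```python
import math

def square_wave_partial_sum(N, x):
    """S_N(x) for part (a): 1/2 + sum_{n<=N} [1 - (-1)^n] / (n*pi) * sin(n x)."""
    return 0.5 + sum((1 - (-1) ** n) / (n * math.pi) * math.sin(n * x)
                     for n in range(1, N + 1))

def sawtooth_partial_sum(N, x):
    """S_N(x) for part (b): 2 * sum_{n<=N} (-1)^(n+1) / n * sin(n x)."""
    return 2 * sum((-1) ** (n + 1) / n * math.sin(n * x)
                   for n in range(1, N + 1))

x = math.pi / 2
for N in (1, 3, 15, 2001):
    print(N, square_wave_partial_sum(N, x), sawtooth_partial_sum(N, x))
```

At x = π/2 the values for N = 2001 are close to f(π/2) = 1 in (a) and to π/2 in (b).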

Riemann-Lebesgue Lemma
There is one additional result from the general theory that will be needed later.
For the orthogonal system {1, cos nx, sin nx}, Corollary 9.1.10 is as follows:

THEOREM 9.3.4 (Riemann-Lebesgue Lemma) If f ∈ R[−π, π], then

\[ \lim_{n\to\infty} \int_{-\pi}^{\pi} f(x)\cos nx\,dx = \lim_{n\to\infty} \int_{-\pi}^{\pi} f(x)\sin nx\,dx = 0. \]

The following example shows that integrability of the function f is required.

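EXAMPLE 9.3.5 Let f be defined on [−π, π) as follows:

\[ f(x) = \begin{cases} 0, & -\pi \le x \le 0, \\ \dfrac{1}{x}, & 0 < x < \pi. \end{cases} \]

Then

\[ \int_{-\pi}^{\pi} f(x)\sin nx\,dx = \int_0^{\pi} \frac{\sin nx}{x}\,dx = \int_0^{n\pi} \frac{\sin x}{x}\,dx. \]

Hence,

\[ \lim_{n\to\infty} \int_{-\pi}^{\pi} f(x)\sin nx\,dx = \lim_{n\to\infty} \int_0^{n\pi} \frac{\sin x}{x}\,dx = \int_0^{\infty} \frac{\sin x}{x}\,dx = \frac{\pi}{2}. \]

(For the value π/2, see footnote 2.) □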
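
The behavior of these integrals can be checked numerically. The Python sketch below approximates the integral of (sin x)/x over [0, nπ] by a midpoint rule; the function name `sine_integral` and the step density are illustrative assumptions.

```python
import math

def sine_integral(upper, steps_per_unit=200):
    """Midpoint-rule approximation of the integral of sin(x)/x over [0, upper]."""
    n = max(1, int(upper * steps_per_unit))
    h = upper / n
    # Midpoints are never 0, so the quotient sin(x)/x is always defined.
    return h * sum(math.sin((i + 0.5) * h) / ((i + 0.5) * h) for i in range(n))

values = {n: sine_integral(n * math.pi) for n in (1, 10, 100)}
print(values, math.pi / 2)
```

The computed values move toward π/2 ≈ 1.5708 rather than toward 0, in agreement with the example.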

Is a Trigonometric Series a Fourier Series?


Since every Fourier series is a trigonometric series, an obvious question to
ask is whether every trigonometric series is a Fourier series. More specifically,
given a trigonometric series

\[ \frac{1}{2}A_0 + \sum_{n=1}^{\infty}(A_n \cos nx + B_n \sin nx), \]

with {An} and {Bn} converging to zero, does there exist a function f on
[−π, π] such that the coefficients An and Bn are given by Definition 9.3.1?
As we will see, the answer is no! First however, in the positive direction, we
prove the following.

THEOREM 9.3.6 If the trigonometric series

\[ \frac{1}{2}A_0 + \sum_{n=1}^{\infty}(A_n \cos nx + B_n \sin nx) \]

converges uniformly on [−π, π], then it is the Fourier series of a continuous
real-valued function on [−π, π].

Proof. For n ∈ N, let

\[ S_n(x) = \frac{1}{2}A_0 + \sum_{k=1}^{n}(A_k \cos kx + B_k \sin kx). \]

Since the series converges uniformly on [−π, π], and Sn is continuous for each
n,

\[ f(x) = \lim_{n\to\infty} S_n(x) \]

is a continuous function on [−π, π]. For m ∈ N, consider

\[ \int_{-\pi}^{\pi} f(x)\cos mx\,dx = \int_{-\pi}^{\pi} \left( \lim_{n\to\infty} S_n(x) \right)\cos mx\,dx = \lim_{n\to\infty} \int_{-\pi}^{\pi} S_n(x)\cos mx\,dx. \]

Since for each m, the sequence {Sn(x) cos mx} converges uniformly to
f(x) cos mx on [−π, π], the above interchange of limits and integration is valid
by Theorem 8.4.1. If n > m, then

\begin{align*}
\int_{-\pi}^{\pi} S_n(x)\cos mx\,dx &= \frac{1}{2}A_0 \int_{-\pi}^{\pi} \cos mx\,dx + \sum_{k=1}^{n} A_k \int_{-\pi}^{\pi} \cos kx\cos mx\,dx \\
&\quad + \sum_{k=1}^{n} B_k \int_{-\pi}^{\pi} \sin kx\cos mx\,dx.
\end{align*}

Thus by orthogonality,

\[ \int_{-\pi}^{\pi} S_n(x)\cos mx\,dx = A_m \pi. \]

Letting n → ∞ gives

\[ A_m = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos mx\,dx. \]

The analogous formula also holds for Bm, and thus the given series is the
Fourier series of f. □

2 The value of π/2 for the improper integral is most easily obtained by contour integration
and the theory of residues of complex analysis. A real variables approach that computes
the value of this integral is given in the article by K. S. Williams listed in the supplemental
readings.
Remarks. (a) If $\sum |A_k|$ and $\sum |B_k|$ both converge, then by the Weierstrass
M-test the series

\[ \frac{1}{2}A_0 + \sum_{k=1}^{\infty}(A_k \cos kx + B_k \sin kx) \]

converges uniformly on R, and thus is the Fourier series of a continuous function
on [−π, π]. Convergence of the series $\sum |A_k|$ and $\sum |B_k|$ is however not
necessary for uniform convergence of the trigonometric series. For example,
the trigonometric series

\[ \sum_{n=2}^{\infty} \frac{\sin nx}{n \ln n} \]

converges uniformly on R (Exercise 12), yet $\sum_{n=2}^{\infty} \frac{1}{n \ln n} = \infty$.
(b) In 1903 Lebesgue proved the following stronger version of Theorem
9.3.6: If $f(x) = \frac{1}{2}A_0 + \sum_{k=1}^{\infty}(A_k \cos kx + B_k \sin kx)$ for all x ∈ (−π, π), and if f
is continuous (in fact, measurable) then Ak and Bk are the Fourier coefficients
of the function f (see footnote 3). (See also Miscellaneous Exercise 1.)
We now turn to the negative results. Consider the series

\[ \sum_{n=2}^{\infty} \frac{\sin nx}{\ln n}, \tag{7} \]

which by Theorem 7.2.6 converges for all x ∈ R. However, there does not exist
a Riemann integrable function f on [−π, π] such that

\[ \frac{1}{\ln n} = b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx. \]

If such a function f exists, then by Bessel’s inequality we would have

\[ \sum_{n=2}^{\infty} b_n^2 = \sum_{n=2}^{\infty} \frac{1}{(\ln n)^2} \le \frac{1}{\pi}\int_{-\pi}^{\pi} f^2(x)\,dx. \]

But, since f is Riemann integrable on [−π, π], so is $f^2$, and thus the integral
is finite. On the other hand however,

\[ \sum_{n=2}^{\infty} \frac{1}{(\ln n)^2} = \infty, \]

which gives a contradiction.
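
The divergence of the last series is asserted without proof. One way to see it, offered here as a sketch and not taken from the text, is a comparison with the harmonic series. It uses the elementary inequality ln t ≤ √t for t ≥ 1, which holds because the function √t − ln t attains its minimum value 2 − ln 4 > 0 at t = 4.

```latex
\[
0 < \ln n \le \sqrt{n}
\quad\Longrightarrow\quad
\frac{1}{(\ln n)^2} \ge \frac{1}{n} \quad (n \ge 2),
\qquad\text{so}\qquad
\sum_{n=2}^{N} \frac{1}{(\ln n)^2} \;\ge\; \sum_{n=2}^{N} \frac{1}{n} \;\longrightarrow\; \infty
\quad (N \to \infty).
\]
```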


The above argument only shows that the series (7) is not obtained by
means of Definition 9.3.1 from a Riemann integrable function. There remains
however the question of whether this is still the case if we extend our definition
to allow the class of Lebesgue integrable functions, to be introduced in Chapter
10. As we will see in Section 10.8, the answer to this is still no!

Fourier Sine and Cosine Series


We close this section with a brief discussion of Fourier sine and cosine series.
As we have seen in Exercise 5 of Section 9.1, the sequence $\{\sin nx\}_{n=1}^{\infty}$ is
orthogonal on [0, π]. Also, by Exercise 6 of Section 9.1 the same is true of the
sequence $\{1, \cos nx\}_{n=1}^{\infty}$. For f ∈ R[0, π], the Fourier series with respect to
each of these two orthogonal systems are called the Fourier sine and cosine
series of f respectively. Since

\[ \int_0^{\pi} \sin^2 nx\,dx = \int_0^{\pi} \cos^2 nx\,dx = \frac{1}{2}\pi, \]

the formulas of Definition 9.1.7 give the following.

3 Sur les séries trigonométriques, Annales Scientifiques de l’École Normale Supérieure, (3)
20 (1903), 453–485.

DEFINITION 9.3.7 For f ∈ R[0, π], the Fourier sine series of f is given
by

\[ f(x) \sim \sum_{n=1}^{\infty} b_n \sin nx, \]

where

\[ b_n = \frac{2}{\pi}\int_0^{\pi} f(x)\sin nx\,dx, \quad n = 1, 2, \ldots \]

are the Fourier sine coefficients of f. Similarly, the Fourier cosine series
of f ∈ R[0, π] is given by

\[ f(x) \sim \frac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \cos nx, \]

where

\[ a_0 = \frac{2}{\pi}\int_0^{\pi} f(x)\,dx \quad\text{and}\quad a_n = \frac{2}{\pi}\int_0^{\pi} f(x)\cos nx\,dx, \quad n = 1, 2, \ldots \]

are the Fourier cosine coefficients of f.

There is a simple connection between Fourier series and the Fourier sine
and cosine series. As in Example 9.3.3(b), we first note that if f is an even
function on [−π, π], then by Exercise 1,

\[ f(x) \sim \frac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \cos nx, \quad\text{where}\quad a_n = \frac{2}{\pi}\int_0^{\pi} f(x)\cos nx\,dx. \]

Similarly, if f is an odd function on [−π, π], then

\[ f(x) \sim \sum_{n=1}^{\infty} b_n \sin nx, \quad\text{where}\quad b_n = \frac{2}{\pi}\int_0^{\pi} f(x)\sin nx\,dx. \]

Thus the coefficients {an } and {bn } depend only on the values of the function
f on [0, π]. Conversely, given a function f on [0, π), the following definition
extends f both as an even and an odd function to the interval (−π, π):

DEFINITION 9.3.8 Let f be a real-valued function defined on [0, π). The
even extension of f to (−π, π), denoted fe, is the function defined by

\[ f_e(x) = \begin{cases} f(-x), & -\pi < x < 0, \\ f(x), & 0 \le x < \pi. \end{cases} \]

Similarly, the odd extension of f to (−π, π), denoted fo, is the function
defined by

\[ f_o(x) = \begin{cases} -f(-x), & -\pi < x < 0, \\ 0, & x = 0, \\ f(x), & 0 < x < \pi. \end{cases} \]


It is easily seen that the functions fe (x) and fo (x) are even and odd re-
spectively on (−π, π), and agree with the given function f on (0, π). From
the above discussion it is easily seen that if f ∈ R[0, π], then the Fourier sine
series of f is equal to the Fourier series of fo , and the Fourier cosine series of
f is equal to the Fourier series of fe (Exercise 2).
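
The two extensions can be written as small Python helpers. The names `even_extension` and `odd_extension` and the sample function f(x) = 1 + x are assumptions for illustration only.

```python
def even_extension(f):
    """f_e on (-pi, pi): f(-x) for x < 0 and f(x) for x >= 0."""
    return lambda x: f(-x) if x < 0 else f(x)

def odd_extension(f):
    """f_o on (-pi, pi): -f(-x) for x < 0, 0 at x = 0, and f(x) for x > 0."""
    def fo(x):
        if x < 0:
            return -f(-x)
        return 0.0 if x == 0 else f(x)
    return fo

f = lambda x: 1.0 + x                 # sample function on [0, pi)
fe, fo = even_extension(f), odd_extension(f)
print(fe(-1.0), fe(1.0), fo(-1.0), fo(0.0), fo(1.0))
```

Both extensions agree with f on (0, π); they differ at x = 0 whenever f(0) ≠ 0, as for the sample function here.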
Remark. In the definition of the even and odd extension we only assumed
that f was defined on [0, π), and then defined fe and fo on (−π, π). If

\[ f(\pi-) = \lim_{x\to\pi^-} f(x) \]

exists, then for the even extension we define fe(−π) = f(π−). Thus fe is
now defined on [−π, π) and hence can be extended to all of R as a 2π-periodic
function. For the odd extension fo, we set fo(−π) = −f(π−), thereby defining
fo on [−π, π).

Exercises 9.3
1. Let f ∈ R[−π, π]. Prove the following:
*a. If f is even on [−π, π], then the Fourier series of f is given by
\[ f(x) \sim \frac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \cos nx, \quad\text{where}\quad a_n = \frac{2}{\pi}\int_0^{\pi} f(x)\cos nx\,dx, \quad n = 0, 1, 2, \ldots \]
b. If f is odd on [−π, π], then the Fourier series of f is given by
\[ f(x) \sim \sum_{n=1}^{\infty} b_n \sin nx, \quad\text{where}\quad b_n = \frac{2}{\pi}\int_0^{\pi} f(x)\sin nx\,dx, \quad n = 1, 2, \ldots \]
2. Let f ∈ R[0, π]. Prove the following:
a. The Fourier sine series of f is equal to the Fourier series of the odd
extension fo of f .
b. The Fourier cosine series of f is equal to the Fourier series of the even
extension fe of f .
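3. Find the Fourier series of each of the following functions f on [−π, π).
*a. $f(x) = \begin{cases} -1, & -\pi \le x < 0, \\ 1, & 0 \le x < \pi. \end{cases}$
b. $f(x) = \begin{cases} 0, & -\pi \le x \le 0, \\ x, & 0 < x < \pi. \end{cases}$
*c. f(x) = |x|.
d. $f(x) = x^2$.
*e. f(x) = 1 + x.
f. $f(x) = \begin{cases} 0, & -\pi \le x < -\frac{\pi}{2}, \\ -1, & -\frac{\pi}{2} \le x < 0, \\ 1, & 0 \le x < \frac{\pi}{2}, \\ 0, & \frac{\pi}{2} \le x < \pi. \end{cases}$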

4. On the interval [−2π, 2π] sketch the graph of the 2π-periodic extension
of each of the functions in Exercise 3.
5. Find the Fourier sine and cosine series on [0, π) of each of the following
functions.
*a. f(x) = 1
b. f(x) = x
c. $f(x) = x^2$
d. $f(x) = e^{-x}$
e. $f(x) = \begin{cases} 1, & 0 \le x < \frac{\pi}{2}, \\ -1, & \frac{\pi}{2} \le x < \pi. \end{cases}$
*f. $f(x) = \begin{cases} 0, & 0 \le x < \frac{\pi}{2}, \\ 1, & \frac{\pi}{2} \le x < \pi. \end{cases}$
6. Let h be defined on [0, π) as follows:
\[ h(x) = \begin{cases} cx, & 0 \le x \le \frac{\pi}{2}, \\ c(\pi - x), & \frac{\pi}{2} < x < \pi, \end{cases} \]
where c > 0 is a constant.
a. Sketch the graph of h on [0, π).
b. Sketch the graph of the even extension he on (−π, π).
*c. Find the Fourier series of he .
d. Sketch the graph of the odd extension ho on (−π, π).
e. Find the Fourier series of ho .
7. a. Find the Fourier series of f (x) = sin x on [−π, π].
*b. Find the Fourier cosine series of f (x) = sin x on [0, π].
8. *Suppose f, f′ are continuous on [−π, π], and f′′ ∈ R[−π, π]. Also, suppose
that
f(−π) = f(π) and f′(−π) = f′(π).
Prove that the Fourier series of f converges uniformly to f on [−π, π].
(Hint: Apply integration by parts to show that |ak| and |bk| are less than
$C k^{-2}$ for some positive constant C.)
9. Show that the trigonometric series
\[ \sum_{n=1}^{\infty} \frac{1}{\sqrt{n}}\sin nx \]
converges for all x ∈ R but is not the Fourier series of a Riemann integrable
function on [−π, π].
10. If f is absolutely integrable on [−π, π] (see Section 6.4), prove that
\[ \lim_{n\to\infty} \int_{-\pi}^{\pi} f(x)\cos nx\,dx = \lim_{n\to\infty} \int_{-\pi}^{\pi} f(x)\sin nx\,dx = 0. \]
11. Using the previous exercise, show that each of the following holds.
a. $\lim_{n\to\infty} \int_0^{\pi} \ln x \sin nx\,dx = \lim_{n\to\infty} \int_0^{\pi} \ln x \cos nx\,dx = 0.$
b. $\lim_{n\to\infty} \int_0^{\pi} x^{\alpha} \sin nx\,dx = \lim_{n\to\infty} \int_0^{\pi} x^{\alpha} \cos nx\,dx = 0$ for all α > −1.

12. Suppose an ≥ 0 for all n ∈ N, and {n an} is monotone decreasing with
$\lim_{n\to\infty} n a_n = 0$. Prove that $\sum_{n=1}^{\infty} a_n \sin nx$ converges uniformly. (Hint: Write
$\sum a_n \sin nx$ as $\sum (n a_n)\left( \frac{\sin nx}{n} \right)$ and use the fact that the partial sums of
$\sum \frac{\sin nx}{n}$ are uniformly bounded. See also Exercise 15 of Section 8.2.)

9.4 Convergence in the Mean of Fourier Series


Let f be a real-valued function defined on [−π, π), and extend f to all of
R to be periodic of period 2π. Throughout this section we will assume that
f ∈ R[−π, π]. Let

\[ f(x) \sim \frac{1}{2}a_0 + \sum_{n=1}^{\infty}(a_n \cos nx + b_n \sin nx) \]

be the Fourier series of f, and for each n ∈ N, let

\[ S_n(x) = \frac{1}{2}a_0 + \sum_{k=1}^{n}(a_k \cos kx + b_k \sin kx) \]

be the nth partial sum of the Fourier series.


Our goal in this section will be to prove that if f ∈ R[−π, π], then the
sequence {Sn } converges in the mean to the function f on [−π, π]. By The-
orem 9.2.4 this is equivalent to completeness of the trigonometric system
{1, cos nx, sin nx}. To investigate mean-square convergence, and also point-
wise convergence in the next section, it will be useful to obtain an integral
expression for Sn .

The Dirichlet Kernel

THEOREM 9.4.1 Let f ∈ R[−π, π]. Then for each n ∈ N and x ∈ R,

\[ S_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)D_n(x - t)\,dt, \]

where Dn is the Dirichlet kernel, given by

\[ D_n(t) = \frac{1}{2} + \sum_{k=1}^{n} \cos kt = \begin{cases} \dfrac{\sin(n + \frac{1}{2})t}{2\sin\frac{t}{2}}, & t \ne 2p\pi,\ p \in \mathbb{Z}, \\[8pt] n + \frac{1}{2}, & t = 2p\pi. \end{cases} \]

Proof. By the definition of the Fourier coefficients ak and bk (Definition 9.3.1),

\begin{align*}
S_n(x) &= \frac{1}{2}a_0 + \sum_{k=1}^{n}(a_k \cos kx + b_k \sin kx) \\
&= \frac{1}{\pi}\int_{-\pi}^{\pi} \frac{1}{2}f(t)\,dt + \sum_{k=1}^{n}\left( \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos kt\,dt \right)\cos kx
 + \sum_{k=1}^{n}\left( \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin kt\,dt \right)\sin kx \\
&= \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\left[ \frac{1}{2} + \sum_{k=1}^{n}(\cos kt\cos kx + \sin kt\sin kx) \right] dt \\
&= \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\left[ \frac{1}{2} + \sum_{k=1}^{n} \cos k(x - t) \right] dt.
\end{align*}

In the last step we have used the trigonometric identity

\[ \cos k(x - t) = \cos kx\cos kt + \sin kx\sin kt. \]

Set $D_n(s) = \frac{1}{2} + \sum_{k=1}^{n} \cos ks$. Then by the above,

\[ S_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)D_n(x - t)\,dt. \]

To conclude the proof, it remains to derive the formula for Dn(s). If s = 2pπ,
p ∈ Z, then

\[ D_n(2p\pi) = \frac{1}{2} + \sum_{k=1}^{n} 1 = n + \frac{1}{2}. \]

By identity (2) of Theorem 7.2.6, for s ≠ 2pπ,

\[ D_n(s) = \frac{1}{2} + \frac{\sin(n + \frac{1}{2})s - \sin\frac{s}{2}}{2\sin\frac{s}{2}} = \frac{\sin(n + \frac{1}{2})s}{2\sin\frac{s}{2}}, \]

which establishes the result. □
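
The closed form can be compared with the defining sum numerically. In the Python sketch below the function names and the tolerance used to detect t = 2pπ are assumptions made for illustration.

```python
import math

def dirichlet_sum(n, t):
    """D_n(t) from the defining sum 1/2 + sum_{k=1}^n cos(k t)."""
    return 0.5 + sum(math.cos(k * t) for k in range(1, n + 1))

def dirichlet_closed(n, t):
    """Closed form of D_n(t); the value n + 1/2 is used when sin(t/2) vanishes."""
    s = math.sin(t / 2)
    if abs(s) < 1e-12:          # t is (numerically) a multiple of 2*pi
        return n + 0.5
    return math.sin((n + 0.5) * t) / (2 * s)

for n in (1, 3, 5):
    for t in (-3.0, -1.2, 0.4, 2.0, 3.1):
        print(n, t, dirichlet_sum(n, t), dirichlet_closed(n, t))
```

The two columns agree to rounding error, and the printed values also show that Dn changes sign.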


If the sequence $\{D_n\}_{n=1}^{\infty}$ were an approximate identity, convergence results
would follow easily from Theorem 8.6.5. Unfortunately however, the functions
Dn are neither nonnegative nor do they satisfy (b) of Definition 8.6.4. They
however still satisfy

\[ \frac{1}{\pi}\int_{-\pi}^{\pi} D_n(t)\,dt = 1, \tag{8} \]

a fact which will prove useful later. To see that (8) holds, it suffices to integrate
the sum defining Dn(s) term by term. The only nonzero term will be the
integral involving 1/2, and thus

\[ \frac{1}{\pi}\int_{-\pi}^{\pi} D_n(t)\,dt = \frac{1}{\pi}\int_{-\pi}^{\pi} \frac{1}{2}\,dt = 1. \]

The graphs of D1, D3, and D5 are illustrated in Figure 9.3.

FIGURE 9.3
Graphs of D1, D3, and D5

The Fejér Kernel


To prove mean-square convergence, it turns out to be more useful to first
consider the arithmetic means of the partial sums Sn. For each n ∈ N, set

\[ \sigma_n(x) = \frac{S_0(x) + \cdots + S_n(x)}{n + 1}. \]

By Exercise 14 of Section 3.2, if $\lim_{n\to\infty} S_n(x) = S(x)$, then we also have

\[ \lim_{n\to\infty} \sigma_n(x) = S(x). \]

However, it is possible that the sequence {Sn(x)} diverges for a particular x,
whereas the sequence {σn(x)} may converge.
LEMMA 9.4.2 For n ∈ N, let $S_n(x) = \frac{1}{2}a_0 + \sum_{k=1}^{n}(a_k \cos kx + b_k \sin kx)$. Then

(a) $\sigma_n(x) = \frac{1}{2}a_0 + \sum_{j=1}^{n}\left( 1 - \frac{j}{n+1} \right)(a_j \cos jx + b_j \sin jx)$, and

(b) $\int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx \le \int_{-\pi}^{\pi} [f(x) - \sigma_n(x)]^2\,dx$.

FIGURE 9.4
Graphs of F1, F3, and F5

Proof. The proof of (a) is left as an exercise (Exercise 1). Since σn is itself
the partial sum of a trigonometric series, the result (b) follows by Corollary
9.1.5. □
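
Part (a) can be checked numerically before it is proved. The Python sketch below computes σn both as the arithmetic mean of S0, ..., Sn and from the weighted formula; the function names are hypothetical, and the coefficient lists are those of Example 9.3.3(a), used here only as sample data.

```python
import math

def partial_sum(a, b, n, x):
    """S_n(x) = a[0]/2 + sum_{k=1}^n (a[k] cos kx + b[k] sin kx)."""
    return a[0] / 2 + sum(a[k] * math.cos(k * x) + b[k] * math.sin(k * x)
                          for k in range(1, n + 1))

def cesaro_mean(a, b, n, x):
    """sigma_n(x) as the arithmetic mean of S_0(x), ..., S_n(x)."""
    return sum(partial_sum(a, b, m, x) for m in range(n + 1)) / (n + 1)

def cesaro_weighted(a, b, n, x):
    """sigma_n(x) from the weighted formula of Lemma 9.4.2(a)."""
    return a[0] / 2 + sum((1 - j / (n + 1)) * (a[j] * math.cos(j * x) + b[j] * math.sin(j * x))
                          for j in range(1, n + 1))

N = 12
a = [1.0] + [0.0] * N                                           # a_0 = 1, a_k = 0
b = [0.0] + [(1 - (-1) ** k) / (k * math.pi) for k in range(1, N + 1)]
for x in (0.3, 1.0, 2.5):
    print(x, cesaro_mean(a, b, N, x), cesaro_weighted(a, b, N, x))
```

The two values agree to rounding error at each sample point, as the identity in (a) predicts.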
Our next step is to express σn (x) as an integral analogous to that of Sn (x)
in the previous theorem.

THEOREM 9.4.3 Let f ∈ R[−π, π]. Then for each n ∈ N and x ∈ R,

\[ \sigma_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)F_n(x - t)\,dt, \]

where Fn is the Fejér kernel, given by

\[ F_n(t) = \frac{1}{n+1}\sum_{k=0}^{n} D_k(t) = \begin{cases} \dfrac{1}{2(n+1)}\left[ \dfrac{\sin(n+1)\frac{t}{2}}{\sin\frac{t}{2}} \right]^2, & t \ne 2p\pi, \\[10pt] \dfrac{n+1}{2}, & t = 2p\pi. \end{cases} \]

The graphs of F1, F3, and F5 are illustrated in Figure 9.4.
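Proof. By Theorem 9.4.1,

\[ S_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(s)D_n(x - s)\,ds, \]

where Dn is the Dirichlet kernel. Therefore,

\[ \sigma_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(s)F_n(x - s)\,ds, \]

where

\[ F_n(t) = \frac{1}{n+1}\sum_{k=0}^{n} D_k(t). \]

If t = 2pπ, p ∈ Z, then $D_k(2p\pi) = k + \frac{1}{2}$, and thus

\[ F_n(2p\pi) = \frac{1}{n+1}\sum_{k=0}^{n}\left( k + \frac{1}{2} \right) = \frac{n+1}{2}. \]

If t ≠ 2pπ, p ∈ Z, then

\begin{align*}
F_n(t) &= \frac{1}{2(n+1)\sin\frac{t}{2}}\sum_{k=0}^{n} \sin\left(k + \tfrac{1}{2}\right)t \\
&= \frac{1}{2(n+1)\sin^2\frac{t}{2}}\sum_{k=0}^{n} \sin\frac{t}{2}\,\sin\left(k + \tfrac{1}{2}\right)t.
\end{align*}

By the identity $\sin A\sin B = \frac{1}{2}[\cos(B - A) - \cos(B + A)]$,

\begin{align*}
\sum_{k=0}^{n} \sin\frac{t}{2}\,\sin\left(k + \tfrac{1}{2}\right)t &= \frac{1}{2}\sum_{k=0}^{n}(\cos kt - \cos(k+1)t) \\
&= \frac{1}{2}(1 - \cos(n+1)t) = \sin^2 (n+1)\frac{t}{2}.
\end{align*}

Therefore, for t ≠ 2pπ,

\[ F_n(t) = \frac{1}{2(n+1)}\left[ \frac{\sin(n+1)\frac{t}{2}}{\sin\frac{t}{2}} \right]^2. \qquad \square \]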
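
As with the Dirichlet kernel, the closed form of Fn can be compared numerically with its definition as an average. The function names and the tolerance for detecting t = 2pπ in this Python sketch are illustrative assumptions.

```python
import math

def dirichlet(k, t):
    """D_k(t) = 1/2 + sum_{j=1}^k cos(j t); in particular D_0(t) = 1/2."""
    return 0.5 + sum(math.cos(j * t) for j in range(1, k + 1))

def fejer_mean(n, t):
    """F_n(t) as the arithmetic mean of D_0(t), ..., D_n(t)."""
    return sum(dirichlet(k, t) for k in range(n + 1)) / (n + 1)

def fejer_closed(n, t):
    """Closed form of F_n(t); the value (n + 1)/2 is used when sin(t/2) vanishes."""
    s = math.sin(t / 2)
    if abs(s) < 1e-12:          # t is (numerically) a multiple of 2*pi
        return (n + 1) / 2
    return (math.sin((n + 1) * t / 2) / s) ** 2 / (2 * (n + 1))

for n in (1, 3, 5):
    for t in (-3.0, -1.2, 0.4, 2.0, 3.1):
        print(n, t, fejer_mean(n, t), fejer_closed(n, t))
```

The two columns agree to rounding error, and every value of the closed form is nonnegative because it is a square divided by a positive number.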

We now prove the following properties of the Fejér kernel.

THEOREM 9.4.4
(a) Fn is periodic of period 2π with Fn (−t) = Fn (t).
(b) Fn (t) ≥ 0 for all t.
(c) $\frac{1}{\pi}\int_{-\pi}^{\pi} F_n(t)\,dt = 1$.
(d) For 0 < δ < π, $\lim_{n\to\infty} F_n(t) = 0$ uniformly for all t, δ ≤ |t| ≤ π.

Proof. (a) Clearly Fn(−t) = Fn(t). Also, since

\[ \sin\frac{(n+1)}{2}(t + 2\pi) = \sin\left( (n+1)\frac{t}{2} \right)\cos(n+1)\pi = (-1)^{n+1}\sin\left( (n+1)\frac{t}{2} \right), \]

and $\sin\frac{1}{2}(t + 2\pi) = -\sin\frac{t}{2}$, substituting into the formula for Fn gives
Fn(t + 2π) = Fn(t). Therefore Fn is periodic of period 2π. The proof of (b) is
obvious.
(c) Since $\frac{1}{\pi}\int_{-\pi}^{\pi} D_k(t)\,dt = 1$, we have

\[ \frac{1}{\pi}\int_{-\pi}^{\pi} F_n(t)\,dt = \frac{1}{n+1}\sum_{k=0}^{n} \frac{1}{\pi}\int_{-\pi}^{\pi} D_k(t)\,dt = 1. \]

To prove (d), we first note that $\sin^2\frac{t}{2} \ge \sin^2\frac{\delta}{2}$ for all t, 0 < δ ≤ |t| ≤ π.
Also, since $|\sin(n+1)\frac{t}{2}| \le 1$ for all t,

\[ F_n(t) \le \frac{1}{2(n+1)\sin^2\frac{\delta}{2}} \quad \text{for all } t,\ \delta \le |t| \le \pi. \]

Therefore, for any δ, 0 < δ < π, $\lim_{n\to\infty} F_n(t) = 0$ uniformly on δ ≤ |t| ≤ π. □
As a consequence of (b), (c), and (d) above, the sequence $\{\frac{1}{\pi}F_n(t)\}_{n=1}^{\infty}$ is
an approximate identity on [−π, π]. If f is a real-valued function on R that is
periodic of period 2π, then

\[ \sigma_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)F_n(x - t)\,dt, \]

which by the change of variable s = t − x,

\[ = \frac{1}{\pi}\int_{-\pi-x}^{\pi-x} f(s + x)F_n(s)\,ds. \]

Since the function s → f(s + x)Fn(s) is periodic of period 2π, by Theorem
8.6.3

\[ \sigma_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(s + x)F_n(s)\,ds. \]

Thus if f is continuous on [−π, π] with f(−π) = f(π), by Theorem 8.6.5,

\[ \lim_{n\to\infty} \sigma_n(x) = f(x) \]

uniformly on [−π, π]. We summarize this in the following theorem of L. Fejér
(1880–1959).

THEOREM 9.4.5 (Fejér) If $f$ is a continuous real-valued function on
$[-\pi, \pi]$ with $f(-\pi) = f(\pi)$, then
\[
\lim_{n\to\infty} \sigma_n(x) = f(x)
\]
uniformly on $[-\pi, \pi]$.
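
As an illustration of Fejér's theorem, the following sketch computes the arithmetic means $\sigma_n = (S_0 + \cdots + S_n)/(n+1)$ for $f(x) = |x|$, whose cosine coefficients are $a_0 = \pi$ and $a_k = -4/(\pi k^2)$ for odd $k$ (Example 9.5.7(b)). The sampling grid and the chosen values of $n$ are assumptions made for the example, and a sampled maximum only approximates the true supremum.

```python
import math

def a(k):
    """Cosine coefficients of |x| on [-pi, pi]; all sine coefficients vanish."""
    if k == 0:
        return math.pi
    return -4.0 / (math.pi * k * k) if k % 2 == 1 else 0.0

def cesaro_mean(n, x):
    """sigma_n(x) = (S_0(x) + ... + S_n(x)) / (n + 1)."""
    partial = 0.5 * a(0)          # S_0(x)
    total = partial
    for k in range(1, n + 1):
        partial += a(k) * math.cos(k * x)   # S_k(x)
        total += partial
    return total / (n + 1)

# Sample points in [-pi, pi], including the endpoints and (up to rounding) 0.
grid = [-math.pi + 2 * math.pi * j / 200 for j in range(201)]

def sup_error(n):
    """Sampled value of max |f(x) - sigma_n(x)| over the grid."""
    return max(abs(abs(x) - cesaro_mean(n, x)) for x in grid)

errors = [sup_error(n) for n in (5, 20, 80)]
```

The sampled errors in `errors` should decrease as $n$ grows, as uniform convergence requires.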



COROLLARY 9.4.6 If $f$ is a continuous real-valued function on $[-\pi, \pi]$
with $f(-\pi) = f(\pi)$, then
\[
\lim_{n\to\infty}\int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx = 0.
\]

Proof. The proof of the Corollary is an immediate consequence of Lemma
9.4.2(b). $\square$
Remark. There is a similarity between Theorem 9.4.5 and the Weierstrass
approximation theorem. An alternate way of expressing Theorem 9.4.5 is as
follows: Let $f$ be continuous on $[-\pi, \pi]$ with $f(-\pi) = f(\pi)$. Given $\epsilon > 0$,
there exists a trigonometric polynomial
\[
T_n(x) = \frac12 A_0 + \sum_{k=1}^{n} (A_k \cos kx + B_k \sin kx)
\]
such that
\[
|f(x) - T_n(x)| < \epsilon
\]
for all $x \in [-\pi, \pi]$. The function $T_n(x)$ is called a trigonometric polynomial
since in complex form it can be rewritten as
\[
T_n(x) = \sum_{k=-n}^{n} c_k e^{ikx},
\]
where by Euler's formula $e^{ikx} = \cos kx + i \sin kx$.

Convergence in the Mean


Corollary 9.4.6 only proves convergence in the mean for the case where f is
a continuous real-valued function on [−π, π]. We are now ready to prove the
main result of this section.

THEOREM 9.4.7 If $f \in \mathcal{R}[-\pi, \pi]$, then
\[
\lim_{n\to\infty}\int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx = 0,
\]
where $S_n$ is the $n$th partial sum of the Fourier series of $f$.

For the proof of the theorem we require the following lemma.

LEMMA 9.4.8 Let $f \in \mathcal{R}[a, b]$ with $|f(x)| \le M$ for all $x \in [a, b]$. Then
given $\epsilon > 0$, there exists a continuous function $g$ on $[a, b]$ with $g(a) = g(b)$ and
$|g(x)| \le M$ for all $x \in [a, b]$, such that
\[
\int_a^b |f(x) - g(x)|\,dx < \epsilon.
\]

Before proving the lemma, we will first use the result to prove Theorem
9.4.7, then consider some consequences of this theorem.
Proof of Theorem 9.4.7. Suppose $|f(x)| \le M$ with $M > 0$, and let $\epsilon > 0$
be given. By the lemma, there exists a continuous function $g$ on $[-\pi, \pi]$ with
$g(-\pi) = g(\pi)$ and $|g(x)| \le M$ for all $x \in [-\pi, \pi]$ such that
\[
\int_{-\pi}^{\pi} |f(x) - g(x)|\,dx < \frac{\epsilon}{8M}. \tag{9}
\]
Let $S_n(g)(x)$ be the function defined by
\[
S_n(g)(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} g(t)\,D_n(x-t)\,dt,
\]
where $D_n$ is the Dirichlet kernel. Then since $g$ is continuous, by Corollary
9.4.6 the sequence $\{S_n(g)\}$ converges in the mean to $g$ on $[-\pi, \pi]$. Thus there
exists $n_o \in \mathbb{N}$ such that
\[
\int_{-\pi}^{\pi} [g(x) - S_n(g)(x)]^2\,dx < \frac{\epsilon}{4}
\]
for all $n \ge n_o$. Consider $\int_{-\pi}^{\pi} [f(x) - S_n(g)(x)]^2\,dx$. Since
\[
|f(x) - S_n(g)(x)| \le |f(x) - g(x)| + |g(x) - S_n(g)(x)|,
\]
we have
\[
|f(x) - S_n(g)(x)|^2 \le 2\left[\,|f(x) - g(x)|^2 + |g(x) - S_n(g)(x)|^2\,\right].
\]
But since $|f(x) - g(x)| \le 2M$, inequality (9) gives
\[
\int_{-\pi}^{\pi} |f(x) - g(x)|^2\,dx \le 2M\int_{-\pi}^{\pi} |f(x) - g(x)|\,dx < \frac{\epsilon}{4}.
\]
Therefore, for all $n \ge n_o$,
\[
\int_{-\pi}^{\pi} [f(x) - S_n(g)(x)]^2\,dx
  < \frac{\epsilon}{2} + 2\int_{-\pi}^{\pi} [g(x) - S_n(g)(x)]^2\,dx < \epsilon.
\]

However, for each $n \in \mathbb{N}$, $S_n(g)$ is the $n$th partial sum of a trigonometric series.
Thus if $S_n$ is the $n$th partial sum of the Fourier series of $f$, by Corollary 9.1.5
\[
\int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx \le \int_{-\pi}^{\pi} [f(x) - S_n(g)(x)]^2\,dx.
\]
Thus, given $\epsilon > 0$, there exists an integer $n_o$ such that
\[
\int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx < \epsilon
\]
for all $n \ge n_o$; i.e., $\{S_n\}$ converges in the mean to $f$ on $[-\pi, \pi]$. $\square$
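
Convergence in the mean can be observed numerically for a discontinuous function. The sketch below uses the step function of Example 9.3.3(a) ($f = 0$ on $[-\pi, 0)$, $f = 1$ on $[0, \pi)$), whose partial sums are $S_n(x) = \tfrac12 + \tfrac{2}{\pi}\sum \sin(kx)/k$ over odd $k \le n$. The midpoint rule and the grid size `m` are assumptions made for the illustration.

```python
import math

def f(x):
    """Step function: 0 on [-pi, 0), 1 on [0, pi)."""
    return 1.0 if x >= 0 else 0.0

def S(n, x):
    """n-th partial sum: 1/2 + (2/pi) * sum over odd k <= n of sin(kx)/k."""
    return 0.5 + (2 / math.pi) * sum(math.sin(k * x) / k for k in range(1, n + 1, 2))

def mean_square_error(n, m=4000):
    """Midpoint-rule approximation of the integral of (f - S_n)^2 over [-pi, pi]."""
    h = 2 * math.pi / m
    return h * sum((f(x) - S(n, x)) ** 2
                   for x in (-math.pi + (j + 0.5) * h for j in range(m)))

errs = [mean_square_error(n) for n in (1, 5, 25)]
```

The values in `errs` should decrease toward 0, even though $S_n$ does not converge uniformly near the jump at $x = 0$.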


As a consequence of Theorems 9.2.4 and 9.4.7, we have the following corollary:

COROLLARY 9.4.9 (Parseval's equality) If $f \in \mathcal{R}[-\pi, \pi]$, then
\[
\frac12 a_0^2 + \sum_{n=1}^{\infty} (a_n^2 + b_n^2) = \frac{1}{\pi}\int_{-\pi}^{\pi} f^2(x)\,dx.
\]

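EXAMPLE 9.4.10 By Example 9.3.3(b), the Fourier series of $f(x) = x$ is
given by
\[
x \sim \sum_{n=1}^{\infty} \frac{2(-1)^{n+1}}{n}\sin nx.
\]
Here $a_n = 0$ for all $n$, and $b_n = 2(-1)^{n+1}/n$. Thus by Parseval's equality,
\[
\sum_{n=1}^{\infty} \frac{4}{n^2} = \frac{1}{\pi}\int_{-\pi}^{\pi} x^2\,dx = \frac{2}{3}\pi^2,
\]
which gives
\[
\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}. \qquad \square
\]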
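
The identity obtained from Parseval's equality can be checked with partial sums. This is a minimal sketch; the cutoff `N` is an arbitrary choice, and the remaining gap is simply the tail of the series, which is of order $4/N$.

```python
import math

N = 100000
# Left side of Parseval's equality for f(x) = x: sum of b_n^2 with b_n = 2(-1)^(n+1)/n.
lhs = sum(4.0 / n ** 2 for n in range(1, N + 1))
# Right side: (1/pi) * integral of x^2 over [-pi, pi] = 2 pi^2 / 3.
rhs = 2.0 * math.pi ** 2 / 3.0
gap = rhs - lhs            # tail of the series, positive and roughly 4/N
zeta2_estimate = lhs / 4   # partial sum of 1/n^2, to be compared with pi^2/6
```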

Proof of Lemma 9.4.8. Let $f \in \mathcal{R}[a, b]$ with $|f(x)| \le M$ for all $x \in [a, b]$.
Since $f_1(x) = f(x) + M$ is nonnegative, if we can prove the result for the
function $f_1$, the result also follows for the function $f$. Thus we can assume
that $f$ satisfies $0 \le f(x) \le M$ for all $x \in [a, b]$.
Let $\epsilon > 0$ be given. Since $f$ is Riemann integrable on $[a, b]$, there exists a
partition $P = \{x_0, x_1, \ldots, x_n\}$ of $[a, b]$ such that
\[
0 \le \int_a^b f(x)\,dx - L(P, f) < \frac{\epsilon}{2},
\]
where
\[
L(P, f) = \sum_{i=1}^{n} m_i \Delta x_i,
\]
and $m_i = \inf\{f(t) : t \in [x_{i-1}, x_i]\}$, $i = 1, 2, \ldots, n$. For each $i = 1, 2, \ldots, n$, define
the functions $h_i$ on $[a, b]$ by
\[
h_i(x) = \begin{cases} m_i, & x_{i-1} \le x < x_i, \\ 0, & \text{elsewhere}, \end{cases}
\]
and let $h(x) = \sum_{i=1}^{n} h_i(x)$ (see Figure 9.5). The function $h$ is called a step
function on $[a, b]$. Since $h$ is continuous except at a finite number of points,
$h$ is Riemann integrable and
\[
\int_a^b h(x)\,dx = \sum_{i=1}^{n} m_i \Delta x_i = L(P, f).
\]

FIGURE 9.5
Graph of the Step Function h

Therefore,
\[
0 \le \int_a^b [f(x) - h(x)]\,dx < \frac{\epsilon}{2}.
\]
Also, since $m_i = \inf\{f(t) : t \in [x_{i-1}, x_i]\}$, $0 \le h(x) \le f(x)$ for all $x \in [a, b]$. By
taking slightly shorter intervals, and connecting the endpoints with straight
line segments, we leave it as an exercise (Exercise 9) to show that there exists
a continuous function $g$ on $[a, b]$ with $g(a) = g(b) = 0$, $0 \le g(x) \le h(x)$, such
that
\[
\int_a^b [h(x) - g(x)]\,dx < \frac{\epsilon}{2}.
\]
Then $0 \le g(x) \le f(x)$, and
\[
0 \le \int_a^b [f(x) - g(x)]\,dx = \int_a^b [f(x) - h(x)]\,dx + \int_a^b [h(x) - g(x)]\,dx
  < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon. \qquad \square
\]

Exercises 9.4
1. Prove Lemma 9.4.2(a).
2. *Using the Fourier series of $f(x) = x^2$ and Parseval's equality, find
$\sum_{n=1}^{\infty} \dfrac{1}{n^4}$.
3. *Prove that the orthogonal systems $\{\sin nx\}_{n=1}^{\infty}$ and $\{1, \cos nx\}_{n=1}^{\infty}$ are
both complete on $[0, \pi]$.
4. Show that the set $\{\sin\tfrac12 x, \sin\tfrac32 x, \sin\tfrac52 x, \ldots\}$ is complete on $[0, \pi]$.


5. Let $f \in \mathcal{R}[-\pi, \pi]$, and let $S_n$ denote the $n$th partial sum of the Fourier
series of $f$.
a. Using the Cauchy-Schwarz inequality for integrals (Exercise 13, Section
9.1), prove that
\[
\int_{-\pi}^{\pi} |S_n(x) - f(x)|\,dx \le \sqrt{2\pi}\left(\int_{-\pi}^{\pi} |S_n(x) - f(x)|^2\,dx\right)^{1/2}.
\]
b. Prove that $\lim\limits_{n\to\infty}\displaystyle\int_{-\pi}^{\pi} S_n(x)\,dx = \int_{-\pi}^{\pi} f(x)\,dx$.
n→∞ −π −π

6. *Use the previous exercise to prove the following: Suppose $f \in \mathcal{R}[-\pi, \pi]$
with
\[
f(x) \sim \tfrac12 a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx).
\]
For $x \in [-\pi, \pi]$, set $F(x) = \displaystyle\int_{-\pi}^{x} f(t)\,dt - \tfrac12 a_0 x$.
Then $F$ is continuous on $[-\pi, \pi]$ with $F(-\pi) = F(\pi)$, and for all $x \in [-\pi, \pi]$,
\[
F(x) = \frac12 a_0 \pi + \sum_{n=1}^{\infty}\left[\frac{a_n}{n}\sin nx - \frac{b_n}{n}(\cos nx - \cos n\pi)\right],
\]
where the convergence is uniform on $[-\pi, \pi]$.
7. By integrating the Fourier series of f (x) = x (Example 9.3.3(b)), find the
Fourier series of g(x) = x2 .
8. Suppose $f$ is a real-valued periodic function of period $2\pi$ with
$f \in \mathcal{R}[-\pi, \pi]$, and $x_o \in \mathbb{R}$ is such that $f(x_o-)$ and $f(x_o+)$ both exist. Prove
that
\[
\lim_{n\to\infty} \sigma_n(x_o) = \tfrac12 [f(x_o-) + f(x_o+)].
\]
9. Complete the proof of Lemma 9.4.8; namely, given the step function $h$
on $[a, b]$ and $\epsilon > 0$, there exists a continuous function $g$ on $[a, b]$ with
$0 \le g(x) \le h(x)$, $g(a) = g(b) = 0$, and $\displaystyle\int_a^b [h(x) - g(x)]\,dx < \epsilon$.
10. *Let $f \in \mathcal{R}[a, b]$. Given $\epsilon > 0$, prove that there exists a polynomial $p(x)$
such that $\displaystyle\int_a^b |f(x) - p(x)|\,dx < \epsilon$.

9.5 Pointwise Convergence of Fourier Series


We now turn our attention to the question of pointwise convergence of a
Fourier series to its defining function. It is known that even if the function is
continuous, its Fourier series need not converge at every point; additional
hypotheses on the function f will be required. As was indicated in the
introduction, Dirichlet was the first to find sufficient conditions on the
function f so that the Fourier series of f converges to f. For the proof of his
result we need several preliminary lemmas.

LEMMA 9.5.1 Let $f$ be a periodic real-valued function on $\mathbb{R}$ of period $2\pi$
with $f \in \mathcal{R}[-\pi, \pi]$. Fix $x \in \mathbb{R}$. Then for any real number $A$,
\[
S_n(x) - A = \frac{2}{\pi}\int_0^{\pi}\left[\frac{f(x+t) + f(x-t)}{2} - A\right] D_n(t)\,dt,
\]
where $D_n$ is the Dirichlet kernel and $S_n$ is the $n$th partial sum of the Fourier
series of $f$.

Proof. By Theorem 9.4.1,
\[
S_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\,D_n(x-t)\,dt,
\]
which by the change of variable $s = x - t$,
\[
= -\frac{1}{\pi}\int_{x+\pi}^{x-\pi} f(x-s)\,D_n(s)\,ds = \frac{1}{\pi}\int_{x-\pi}^{x+\pi} f(x-s)\,D_n(s)\,ds.
\]
Since both $f$ and $D_n$ are periodic of period $2\pi$, by Theorem 8.6.3
\[
\frac{1}{\pi}\int_{x-\pi}^{x+\pi} f(x-s)\,D_n(s)\,ds = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x-s)\,D_n(s)\,ds.
\]
Therefore,
\[
S_n(x) = \frac{1}{\pi}\int_{-\pi}^{0} f(x-s)\,D_n(s)\,ds + \frac{1}{\pi}\int_{0}^{\pi} f(x-s)\,D_n(s)\,ds.
\]
In the first integral set $u = -s$. Then since $D_n(-u) = D_n(u)$,
\[
\frac{1}{\pi}\int_{-\pi}^{0} f(x-s)\,D_n(s)\,ds = -\frac{1}{\pi}\int_{\pi}^{0} f(x+u)\,D_n(u)\,du
  = \frac{1}{\pi}\int_{0}^{\pi} f(x+s)\,D_n(s)\,ds.
\]
Therefore,
\[
S_n(x) = \frac{1}{\pi}\int_{0}^{\pi} [f(x+s) + f(x-s)]\,D_n(s)\,ds.
\]
Finally, since $\frac{2}{\pi}\int_0^{\pi} D_n(s)\,ds = 1$,
\[
S_n(x) - A = \frac{2}{\pi}\int_{0}^{\pi}\left[\frac{f(x+s) + f(x-s)}{2} - A\right] D_n(s)\,ds,
\]
which is the desired identity. $\square$

LEMMA 9.5.2 If $g \in \mathcal{R}[0, \pi]$, then
\[
\lim_{n\to\infty}\int_0^{\pi} g(t)\sin\left(n + \tfrac12\right)t\,dt = 0.
\]

Proof. Extend $g$ to $[-\pi, \pi]$ by defining $g(x) = 0$ for all $x$, $-\pi \le x < 0$. Since
$\sin(n + \tfrac12)t = \cos\tfrac{t}{2}\sin nt + \sin\tfrac{t}{2}\cos nt$, we have
\[
\int_0^{\pi} g(t)\sin\left(n + \tfrac12\right)t\,dt = \int_{-\pi}^{\pi} g(t)\sin\left(n + \tfrac12\right)t\,dt
  = \int_{-\pi}^{\pi} g_1(t)\sin nt\,dt + \int_{-\pi}^{\pi} g_2(t)\cos nt\,dt,
\]
where $g_1(t) = g(t)\cos\tfrac{t}{2}$ and $g_2(t) = g(t)\sin\tfrac{t}{2}$. Since $g$ is Riemann integrable
on $[-\pi, \pi]$, so are both $g_1$ and $g_2$. Thus by Theorem 9.3.4,
\[
\lim_{n\to\infty}\int_{-\pi}^{\pi} g_1(t)\sin nt\,dt = \lim_{n\to\infty}\int_{-\pi}^{\pi} g_2(t)\cos nt\,dt = 0,
\]
from which the result follows. $\square$
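
For a concrete instance of Lemma 9.5.2, take $g(t) = t$ (a choice made here only for illustration). Integration by parts gives $\int_0^{\pi} t \sin(n + \tfrac12)t\,dt = (-1)^n/(n + \tfrac12)^2$, which tends to 0. The sketch below compares this value with a midpoint-rule approximation; the grid size `m` is an assumption.

```python
import math

def integral(n, m=20000):
    """Midpoint rule for the integral of t * sin((n + 1/2) t) over [0, pi]."""
    h = math.pi / m
    return h * sum(t * math.sin((n + 0.5) * t)
                   for t in ((j + 0.5) * h for j in range(m)))

orders = (1, 10, 100)
values = {n: integral(n) for n in orders}
# Value obtained by integration by parts: (-1)^n / (n + 1/2)^2.
exact = {n: (-1) ** n / (n + 0.5) ** 2 for n in orders}
```

The entries of `values` should shrink in absolute value as $n$ increases and agree with `exact` up to quadrature error.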

Dirichlet’s Theorem
Before we state and prove Dirichlet's theorem, we briefly review some notation
introduced in Chapter 4. For a real-valued function $f$ defined in a neighborhood
of a given point $p$, the right and left limits of $f$ at $p$, denoted $f(p+)$
and $f(p-)$ respectively, are defined by
\[
f(p+) = \lim_{x\to p^+} f(x) \quad\text{and}\quad f(p-) = \lim_{x\to p^-} f(x),
\]
provided of course that the limits exist.

THEOREM 9.5.3 (Dirichlet) Let $f$ be a real-valued periodic function on
$\mathbb{R}$ of period $2\pi$ with $f \in \mathcal{R}[-\pi, \pi]$. Suppose $x_o \in \mathbb{R}$ is such that
(a) $f(x_o+)$ and $f(x_o-)$ both exist, and
(b) there exists a constant $M$ and a $\delta > 0$ such that
\[
|f(x_o + t) - f(x_o+)| \le Mt \quad\text{and}\quad |f(x_o - t) - f(x_o-)| \le Mt \tag{10}
\]
for all $t$, $0 < t \le \delta$. Then
\[
\lim_{n\to\infty} S_n(x_o) = \frac12 [f(x_o+) + f(x_o-)].
\]
Remark. If $f$ is continuous at $x_o$, then $\tfrac12[f(x_o+) + f(x_o-)] = f(x_o)$, and
thus
\[
\lim_{n\to\infty} S_n(x_o) = f(x_o),
\]
provided of course that the inequalities (10) hold.


Proof. Set $A = \tfrac12[f(x_o+) + f(x_o-)]$. By Lemma 9.5.1
\begin{align*}
S_n(x_o) - A &= \frac{2}{\pi}\int_0^{\pi}\left[\frac{f(x_o+t) + f(x_o-t)}{2} - A\right] D_n(t)\,dt \\
 &= \frac{1}{\pi}\int_0^{\pi}\bigl[(f(x_o+t) - f(x_o+)) + (f(x_o-t) - f(x_o-))\bigr]\frac{\sin(n+\frac12)t}{2\sin\frac{t}{2}}\,dt \\
 &= \frac{1}{\pi}\int_0^{\pi} g_1(t)\sin(n+\tfrac12)t\,dt + \frac{1}{\pi}\int_0^{\pi} g_2(t)\sin(n+\tfrac12)t\,dt,
\end{align*}
where
\[
g_1(t) = \frac{f(x_o+t) - f(x_o+)}{2\sin\frac{t}{2}} \quad\text{and}\quad g_2(t) = \frac{f(x_o-t) - f(x_o-)}{2\sin\frac{t}{2}}.
\]
To prove the result, it suffices to show that the functions $g_1$ and $g_2$ are Riemann
integrable on $[0, \pi]$. If this is the case, then by Lemma 9.5.2,
\[
\lim_{n\to\infty}\int_0^{\pi} g_i(t)\sin(n+\tfrac12)t\,dt = 0, \quad i = 1, 2.
\]
Therefore,
\[
\lim_{n\to\infty} S_n(x_o) = A = \frac12 [f(x_o+) + f(x_o-)].
\]
To finish the proof we still have to show that $g_i \in \mathcal{R}[0, \pi]$, $i = 1, 2$. We
will prove the result for the function $g_1$, the proof for $g_2$ being similar. Set
$h(t) = f(x_o + t) - f(x_o+)$. Since $f \in \mathcal{R}[-\pi, \pi]$ and is periodic of period $2\pi$,
$h$ is Riemann integrable on $[0, \pi]$. Also, $(\sin\tfrac{t}{2})^{-1}$ is continuous on $(0, \pi]$ and
thus Riemann integrable on $[c, \pi]$ for any $c$, $0 < c < \pi$. Therefore,
\[
g_1(t) = \frac{h(t)}{2\sin\frac{t}{2}}
\]
is Riemann integrable on $[c, \pi]$ for any $c$, $0 < c < \pi$. Let $\delta > 0$ be as in
hypothesis (b). Then for $0 < t < \delta$, using inequality (10),
\[
|g_1(t)| = \frac{|f(x_o+t) - f(x_o+)|}{|2\sin\frac{t}{2}|}
  = \frac{|f(x_o+t) - f(x_o+)|}{|t|}\cdot\frac{\frac12 t}{\sin\frac12 t} \le M\,\frac{\frac12 t}{\sin\frac12 t}.
\]
Since $\lim\limits_{x\to 0^+} \dfrac{x}{\sin x} = 1$, there exists a constant $C$ such that
\[
\frac{\frac12 t}{\sin\frac12 t} \le C
\]
for all $t$, $0 < t \le \pi$. Therefore, $g_1$ is bounded on $(0, \delta)$, and hence also on
$(0, \pi]$. Thus since $g_1$ is Riemann integrable on $[c, \pi]$ for any $c$, $0 < c < \pi$, by
Exercise 5 of Section 6.2, $g_1$ is Riemann integrable on $[0, \pi]$. $\square$

Piecewise Continuous Functions


Dirichlet's theorem required that $f(x_o-)$ and $f(x_o+)$ both exist at $x_o$, and
that $f$ satisfy the inequalities (10) at the point $x_o$. The existence of both
$f(x_o-)$ and $f(x_o+)$ means that either $f$ is continuous at $x_o$, or in the
terminology of Section 4.4, has a removable or jump discontinuity at $x_o$. Functions
that have only simple discontinuities, and at most a finite number on an
interval $[a, b]$, are said to be piecewise continuous on $[a, b]$. We make this precise
with the following definition.

DEFINITION 9.5.4 A real-valued function $f$ is piecewise continuous on
$[a, b]$ if there exist finitely many points $a = x_0 < x_1 < \cdots < x_n = b$ such that
(a) $f$ is continuous on $(x_{i-1}, x_i)$ for all $i = 1, 2, \ldots, n$, and
(b) for each $i = 0, 1, 2, \ldots, n$, $f(x_i+)$ and $f(x_i-)$ exist as finite limits.

For $i = 0$ and $n$, we of course only require the existence of the right and left
limit respectively. Also, we do not require that $f$ be defined at $x_0, x_1, \ldots, x_n$.
In addition to piecewise continuous functions, it will also be useful to
consider functions for which the derivative is piecewise continuous. If f is
piecewise continuous on [a, b], then the derivative f ′ is piecewise continuous
on [a, b] if there exist a finite number of points a = x0 < x1 < · · · < xn = b
such that
(a) f ′ (x) exists and is continuous on each interval (xi−1 , xi ), i = 1, 2, ..., n,
and
(b) for each i = 0, 1, 2, ..., n, the quantities f ′ (xi +) and f ′ (xi −) both exist as
finite limits. Again, for the endpoints a and b we obviously only require the
existence of f ′ (a+) and f ′ (b−).
EXAMPLES 9.5.5 (a) Consider the function
\[
f(x) = \begin{cases} x, & 0 < x < 1, \\ x^2 + 2, & 1 < x < 2. \end{cases}
\]
Since $f$ is continuous on $(0, 1)$ and $(1, 2)$ and $f(0+)$, $f(1-)$, $f(1+)$, and $f(2-)$
all exist, $f$ is piecewise continuous on $[0, 2]$. Also, since
\[
f'(x) = \begin{cases} 1, & 0 < x < 1, \\ 2x, & 1 < x < 2, \end{cases}
\]
and $f'(0+)$, $f'(1-)$, $f'(1+)$, and $f'(2-)$ all exist, $f'$ is also piecewise
continuous on $[0, 2]$.
(b) As our second example, consider the function
\[
f(x) = \begin{cases} 0, & x = 0, \\ x^2 \sin\frac{1}{x}, & 0 < x \le 1. \end{cases}
\]
The function $f$ is both continuous and differentiable on $[0, 1]$ with
\[
f'(x) = \begin{cases} 0, & x = 0, \\ 2x \sin\frac{1}{x} - \cos\frac{1}{x}, & 0 < x < 1. \end{cases}
\]
However, since $f'(0+)$ does not exist, $f'$ is not piecewise continuous on $[0, 1]$.


Returning to Dirichlet's theorem, suppose $f$ is such that both $f'(x_o-)$ and
$f'(x_o+)$ exist at the point $x_o$. If $f'$ is piecewise continuous on $[-\pi, \pi]$, then
there exists $x_1 > x_o$ such that $f$ and $f'$ are continuous on $(x_o, x_1)$. If $f$ is not
continuous at $x_o$, redefine $f$ at $x_o$ as $f(x_o+)$. Then $f$ (redefined if necessary)
is continuous on $[x_o, x_1)$, and thus by Theorem 5.2.11
\[
\lim_{t\to 0^+}\frac{f(x_o + t) - f(x_o+)}{t} = f'(x_o+).
\]
As a consequence, there exists a constant M and a δ > 0 such that

|f (xo + t) − f (xo +)| ≤ M t

for all t, 0 < t < δ. Similarly, the existence of f ′ (xo −) implies the existence
of a constant M and a δ > 0 such that

|f (xo − t) − f (xo −)| ≤ M t

for all t, 0 < t < δ. Thus f satisfies the hypothesis of Dirichlet’s theorem. On
combining the above discussion with Theorem 9.5.3, we obtain the following
corollary.

COROLLARY 9.5.6 Suppose $f$ is a real-valued periodic function on $\mathbb{R}$ of
period $2\pi$. If $f$ and $f'$ are piecewise continuous on $[-\pi, \pi)$, then
\[
\lim_{n\to\infty} S_n(x) = \tfrac12 [f(x+) + f(x-)]
\]
for all $x \in \mathbb{R}$.

EXAMPLES 9.5.7 (a) Consider the function $f$ defined on $[-\pi, \pi)$ by
\[
f(x) = \begin{cases} 0, & -\pi \le x < -\frac{\pi}{2}, \\ 3, & -\frac{\pi}{2} \le x \le \frac{\pi}{2}, \\ 0, & \frac{\pi}{2} < x < \pi. \end{cases}
\]
Extend $f$ to all of $\mathbb{R}$ as a periodic function of period $2\pi$. The function $f$ is
then continuous except at $x = \tfrac12\pi + k\pi$, $k \in \mathbb{Z}$. At each discontinuity $x_o$,
$\tfrac12 [f(x_o+) + f(x_o-)] = \tfrac32$.

FIGURE 9.6
Graph of $\tfrac12 [f(x+) + f(x-)]$

The graph of $\tfrac12 [f(x+) + f(x-)]$ on the interval $[-2\pi, 2\pi]$ is given in Figure
9.6.
To discuss convergence of the Fourier series of $f$, we first note that $f$ is
piecewise continuous on $[-\pi, \pi]$ and differentiable on $(-\pi, -\frac{\pi}{2})$, $(-\frac{\pi}{2}, \frac{\pi}{2})$, and
$(\frac{\pi}{2}, \pi)$ with $f'(x) = 0$ in the respective intervals. Thus $f'$ is also piecewise
continuous on $[-\pi, \pi]$. As a consequence of Corollary 9.5.6 the Fourier series
of $f$ converges to $\tfrac12 [f(x+) + f(x-)]$ for all $x \in \mathbb{R}$.
The Fourier series of $f$ is obtained as follows:
\[
a_0 = \frac{1}{\pi}\int_{-\pi/2}^{\pi/2} 3\,dx = 3,
\]
\[
a_n = \frac{1}{\pi}\int_{-\pi/2}^{\pi/2} 3\cos nx\,dx = \frac{6}{n\pi}\sin\frac{n\pi}{2}, \quad n = 1, 2, \ldots
\]
and since $f$ is an even function, $b_n = 0$ for all $n \in \mathbb{N}$. Therefore,
\[
f(x) \sim \frac32 + \sum_{n=1}^{\infty}\frac{6}{n\pi}\sin\frac{n\pi}{2}\cos nx.
\]
If $n$ is even, $\sin\frac{n\pi}{2} = 0$. If $n$ is odd, i.e., $n = 2k + 1$,
\[
\sin(2k+1)\tfrac{\pi}{2} = \sin\left(k\pi + \frac{\pi}{2}\right) = \cos k\pi\,\sin\frac{\pi}{2} = (-1)^k.
\]
Thus
\[
f(x) \sim \frac32 + \frac{6}{\pi}\sum_{k=0}^{\infty}\frac{(-1)^k}{2k+1}\cos(2k+1)x.
\]
When $x = 0$, the series converges to $f(0) = 3$. As a consequence
\[
3 = \frac32 + \frac{6}{\pi}\sum_{k=0}^{\infty}\frac{(-1)^k}{2k+1},
\]
which upon simplification gives
\[
\sum_{k=0}^{\infty}\frac{(-1)^k}{2k+1} = \frac{\pi}{4}.
\]
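
The value of the alternating series can be confirmed with partial sums. In this short sketch the cutoff of 100000 terms is arbitrary; by the alternating series estimate, the error of the partial sum is at most the first omitted term.

```python
import math

def leibniz_partial_sum(N):
    """Sum of (-1)^k / (2k + 1) for k = 0, ..., N - 1."""
    return sum((-1) ** k / (2 * k + 1) for k in range(N))

def series_at_zero(N):
    """Partial sum of the Fourier series of f at x = 0: 3/2 + (6/pi) * (Leibniz sum)."""
    return 1.5 + (6 / math.pi) * leibniz_partial_sum(N)

approx_quarter_pi = leibniz_partial_sum(100000)   # should be close to pi/4
approx_f0 = series_at_zero(100000)                # should be close to f(0) = 3
```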

(b) Dirichlet's theorem can also be applied to the Fourier sine and cosine
series of a real-valued function $f$ defined on $[0, \pi)$. As an example, consider
the cosine series of $f(x) = x$ on $(0, \pi)$. By Exercise 5(b) of Section 9.3, the
cosine series of $f(x) = x$ is given by
\[
x \sim \frac12\pi - \frac{4}{\pi}\sum_{k=1}^{\infty}\frac{1}{(2k-1)^2}\cos(2k-1)x.
\]
Since the even extension $f_e$ of $f$ to $[-\pi, \pi]$ is given by $f_e(x) = |x|$, the cosine
series of $x$ is the Fourier series of $|x|$ on $[-\pi, \pi]$. Since both $f_e$ and $f_e'$ are
piecewise continuous on $[-\pi, \pi]$, the Fourier series converges to the $2\pi$-periodic
extension of $|x|$ for all $x \in \mathbb{R}$.
By Dirichlet's theorem, the Fourier series converges to $|x|$ for all
$x \in [-\pi, \pi]$. Taking $x = 0$ gives
\[
0 = \frac{\pi}{2} - \frac{4}{\pi}\sum_{k=1}^{\infty}\frac{1}{(2k-1)^2},
\]
or $\displaystyle\sum_{k=1}^{\infty}\frac{1}{(2k-1)^2} = \frac{\pi^2}{8}$. $\square$
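
The convergence of the cosine series to $|x|$ can be sampled numerically. The sketch below is illustrative only; the sample points and the number of terms `K` are arbitrary choices.

```python
import math

def cosine_series(x, K):
    """Partial sum pi/2 - (4/pi) * sum_{k=1}^K cos((2k-1)x) / (2k-1)^2."""
    return math.pi / 2 - (4 / math.pi) * sum(
        math.cos((2 * k - 1) * x) / (2 * k - 1) ** 2 for k in range(1, K + 1))

points = [-3.0, -1.0, 0.0, 0.5, 2.0]
max_err = max(abs(cosine_series(x, 2000) - abs(x)) for x in points)

# Partial sum of 1/(2k-1)^2, to be compared with pi^2/8.
odd_squares = sum(1 / (2 * k - 1) ** 2 for k in range(1, 200001))
```

Because the coefficients decay like $1/k^2$, the error at every point is bounded by the tail of the coefficient series, so `max_err` is small uniformly in the sample point.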

Remarks. Although the inequalities (10) of Theorem 9.5.3 are sufficient, they
are by no means necessary. There are variations of Dirichlet's theorem which
provide sufficient conditions on $f$ to guarantee convergence of the series to
$\tfrac12 [f(x_o+) + f(x_o-)]$. For example, the inequalities of Theorem 9.5.3 can be
replaced by the following:
\[
|f(x_o + t) - f(x_o+)| \le Mt^{\alpha} \quad\text{and}\quad |f(x_o - t) - f(x_o-)| \le Mt^{\alpha} \tag{11}
\]
for all $t$, $0 < t < \delta$, and some $\alpha$, $0 < \alpha \le 1$. If $f$ satisfies the above at $x_o$,
then the conclusion of Dirichlet's theorem is still valid. An even more general
condition is due to Ulisse Dini (1845–1918) who proved that if $f \in \mathcal{R}[-\pi, \pi]$
satisfies
\[
\int_0^{\delta}\frac{|f(x_o + t) - f(x_o+)|}{t}\,dt < \infty \quad\text{and}\quad
\int_0^{\delta}\frac{|f(x_o - t) - f(x_o-)|}{t}\,dt < \infty
\]
for some $\delta > 0$, then the Fourier series of $f$ converges to $\tfrac12 [f(x_o+) + f(x_o-)]$
at $x_o$. Both of the above hold if $f$ satisfies (11) at $x_o$.

Differentiation of Fourier Series


Our final result of this section involves the derivative of a Fourier series.
Consider the function $f(x) = x$. Since both $f$ and $f'$ are continuous on $[-\pi, \pi]$, the
Fourier series of $f$ converges to $f$ for all $x \in (-\pi, \pi)$. Therefore, by Example
9.3.3(b),
\[
x = \sum_{n=1}^{\infty}\frac{2(-1)^{n+1}}{n}\sin nx, \quad x \in (-\pi, \pi).
\]
The differentiated series
\[
\sum_{n=1}^{\infty} 2(-1)^{n+1}\cos nx
\]
does not converge since its $n$th term fails to approach zero as $n \to \infty$. The
$2\pi$-periodic extension of $f$ has discontinuities at $x = \pm(2k-1)\pi$, $k \in \mathbb{N}$. It turns
out that continuity of the periodic function is important for differentiability
of the Fourier series. Sufficient conditions are given by the following theorem.
THEOREM 9.5.8 Let $f$ be a continuous function on $[-\pi, \pi]$ with $f(-\pi) =
f(\pi)$, and let $f'$ be piecewise continuous on $[-\pi, \pi]$. If
\[
f(x) = \frac12 a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx), \quad x \in [-\pi, \pi],
\]
is the Fourier series of $f$, then at each $x \in (-\pi, \pi)$ where $f''(x)$ exists,
\[
f'(x) = \sum_{n=1}^{\infty} (-n a_n \sin nx + n b_n \cos nx). \tag{12}
\]
Remark. At a point $x$ where $f''(x)$ does not exist but $f''(x-)$ and $f''(x+)$
both exist, the series (12) converges to $\tfrac12 [f'(x-) + f'(x+)]$.
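Proof. Suppose $x \in (-\pi, \pi)$ is such that $f''(x)$ exists. Since $f'$ is continuous
at $x$, by Dirichlet's theorem
\[
f'(x) = \frac12 \alpha_0 + \sum_{n=1}^{\infty} (\alpha_n \cos nx + \beta_n \sin nx),
\]
where $\alpha_n$ and $\beta_n$ are the Fourier coefficients of $f'$.
Since $f$ is continuous with $f(-\pi) = f(\pi)$, and $f' \in \mathcal{R}[-\pi, \pi]$,
\[
\alpha_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f'(t)\,dt = \frac{1}{\pi}[f(\pi) - f(-\pi)] = 0.
\]
Also, by integration by parts,
\begin{align*}
\alpha_n &= \frac{1}{\pi}\int_{-\pi}^{\pi} f'(t)\cos nt\,dt \\
 &= \frac{1}{\pi}[f(\pi)\cos n\pi - f(-\pi)\cos(-n\pi)] + \frac{n}{\pi}\int_{-\pi}^{\pi} f(t)\sin nt\,dt \\
 &= n b_n.
\end{align*}
Similarly, $\beta_n = -n a_n$, which proves the result. $\square$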

EXAMPLE 9.5.9 Consider the function $f(x) = x^2$. Since $f$ is even on
$(-\pi, \pi)$,
\[
f(x) = \tfrac12 a_0 + \sum_{n=1}^{\infty} a_n \cos nx.
\]
Also, since $f$ satisfies the hypothesis of Theorem 9.5.8,
\[
2x = \sum_{n=1}^{\infty} (-n a_n \sin nx).
\]
On the other hand, by Example 9.3.3(b),
\[
2x = \sum_{n=1}^{\infty}\frac{4(-1)^{n+1}}{n}\sin nx.
\]
Therefore $a_n = 4(-1)^n / n^2$ for $n = 1, 2, 3, \ldots$. To find $a_0$ we use the definition.
This gives $a_0 = 2\pi^2/3$, and thus
\[
x^2 = \frac{\pi^2}{3} + 4\sum_{n=1}^{\infty}\frac{(-1)^n}{n^2}\cos nx, \quad x \in [-\pi, \pi]. \qquad \square
\]
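
Both the series for $x^2$ and its term-by-term derivative can be sampled numerically. The following is a sketch under arbitrary choices of sample points and cutoff `N`; the differentiated series converges only conditionally, so its partial sums approach $2x$ more slowly and are checked only at interior points.

```python
import math

def x_squared_series(x, N):
    """Partial sum pi^2/3 + 4 * sum_{n=1}^N (-1)^n cos(nx) / n^2."""
    return math.pi ** 2 / 3 + 4 * sum((-1) ** n * math.cos(n * x) / n ** 2
                                      for n in range(1, N + 1))

def differentiated_series(x, N):
    """Term-by-term derivative: sum_{n=1}^N 4 (-1)^(n+1) sin(nx) / n."""
    return sum(4 * (-1) ** (n + 1) * math.sin(n * x) / n for n in range(1, N + 1))

N = 20000
# Largest error of the x^2 series at sample points in [-pi, pi], endpoints included.
err_f = max(abs(x_squared_series(x, N) - x * x)
            for x in (-math.pi, -2.0, 0.0, 1.0, math.pi))
# Largest error of the differentiated series against 2x at interior points.
err_df = max(abs(differentiated_series(x, N) - 2 * x) for x in (-2.0, 0.0, 1.0))
```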

Remarks. In closing this section, it should be mentioned that there exist
continuous functions $f$ for which the Fourier series of $f$ fails to converge at
a given point. This was first shown by P. du Bois-Reymond. Other examples
were subsequently constructed by L. Fejér and Lebesgue. The example of Fejér
can be found on p. 416 of the text by E.C. Titchmarsh. Given any countable
set $E$ in $(-\pi, \pi)$, it is possible to construct a continuous function $f$ on $[-\pi, \pi]$
such that the Fourier series of $f$ diverges on $E$ and converges on $(-\pi, \pi) \setminus E$.⁴
In fact, it is possible to construct a continuous function on $[-\pi, \pi]$ whose
Fourier series diverges on an uncountable subset of $(-\pi, \pi)$.⁵ The existence
of functions having such pathological behavior is due to the fact that for the
Dirichlet kernel $D_n$,
\[
\lim_{n\to\infty}\int_0^{\pi} |D_n(t)|\,dt = \infty.
\]
The details are left to the miscellaneous exercises.
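
The growth of $\int_0^{\pi} |D_n(t)|\,dt$ can be observed numerically. The sketch below assumes the closed form $D_n(t) = \sin(n + \tfrac12)t / (2\sin\tfrac{t}{2})$ and uses a midpoint rule with an arbitrarily chosen grid size; it illustrates the slow growth but proves nothing about the limit.

```python
import math

def dirichlet(n, t):
    """D_n(t) = sin((n + 1/2) t) / (2 sin(t/2)) for t not a multiple of 2*pi."""
    return math.sin((n + 0.5) * t) / (2 * math.sin(t / 2))

def abs_integral(n, m=50000):
    """Midpoint-rule value of the integral of |D_n(t)| over [0, pi]."""
    h = math.pi / m
    return h * sum(abs(dirichlet(n, (j + 0.5) * h)) for j in range(m))

values = [abs_integral(n) for n in (1, 10, 100)]
```

For $n = 1$ the integral can be computed by hand, since $D_1(t) = \tfrac12 + \cos t$ changes sign at $t = 2\pi/3$; this gives $\pi/6 + \sqrt{3}$, which the first entry of `values` should reproduce. The later entries increase slowly, which is consistent with logarithmic growth in $n$.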
For all of the above examples it is still the case that the Fourier series of $f$
converges to $f$ except on a set of measure zero. This in fact is the case for the
Fourier series of every Riemann integrable function $f$ on $[-\pi, \pi]$. In 1926,
Kolmogorov showed that there exist Lebesgue integrable functions whose Fourier
series diverge everywhere.⁶ The biggest question about the convergence of
Fourier series was asked by Lusin: If a function $f$ satisfies the hypothesis that
$f^2$ is Lebesgue integrable on $[-\pi, \pi]$, does the Fourier series of $f$ converge to $f$,
except perhaps on a set of measure zero? This problem remained unanswered
for 50 years. The first proof that this was indeed the case was provided by L.
Carleson in 1966.⁷

⁴ See Chapter VIII of the text by A. Zygmund.
⁵ Ibid.

Exercises 9.5
1. For each of the functions of Exercise 3, Section 9.3, sketch the graph of the
function on the interval [−2π, 2π] to which the Fourier series converges.
2. *Use the Fourier series of $f(x) = x^2$ to find each of the following sums:
a. $\displaystyle\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n^2}$, b. $\displaystyle\sum_{n=1}^{\infty}\frac{1}{n^2}$.

3. Using the Fourier cosine series of $f(x) = \sin x$ on $[0, \pi)$ (Exercise 7,
Section 9.3), find each of the following sums:
*a. $\displaystyle\sum_{n=1}^{\infty}\frac{1}{4n^2 - 1}$, *b. $\displaystyle\sum_{n=1}^{\infty}\frac{(-1)^n}{4n^2 - 1}$.

4. a. Find the Fourier series of $f(x) = (\pi - |x|)^2$ on $[-\pi, \pi]$.
b. On the interval $[-2\pi, 2\pi]$ sketch the graph of the function to which
the series in (a) converges.
c. Use the results of (a) to find the sums of the following series:
i. $\displaystyle\sum_{n=1}^{\infty}\frac{1}{n^2}$, ii. $\displaystyle\sum_{n=1}^{\infty}\frac{1}{n^4}$.

5. a. Show that for $-\pi < x < \pi$,
\[
e^x = \frac{e^{\pi} - e^{-\pi}}{2\pi}\left[1 - 2\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n^2 + 1}(\cos nx - n\sin nx)\right].
\]
b. Use the results of (a) to find the sums of the following series:
i. $\displaystyle\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n^2 + 1}$, ii. $\displaystyle\sum_{n=1}^{\infty}\frac{1}{n^2 + 1}$.
6. *a. Find the Fourier cosine series of $e^{ax}$ on $[0, \pi)$.
*b. On the interval $[-2\pi, 2\pi]$, sketch the graph of the function to which
the series in (a) converges.
7. Let f be a continuous function on [−π, π] with f (−π) = f (π). If in
addition f ′ is piecewise continuous on [−π, π], prove that the Fourier
series of f converges uniformly to f on [−π, π].

⁶ “Une série de Fourier–Lebesgue divergente partout,” Comptes Rendus, 183 (1926), 1327–1328.
⁷ “On convergence and growth of partial sums of Fourier Series,” Acta Math., 116 (1966), 135–157.

8. Suppose $f \in \mathcal{R}[-\pi, \pi]$, and $x_o \in (-\pi, \pi)$ is such that $f(x_o-)$ and $f(x_o+)$
exist, and that
\[
\lim_{t\to 0^+}\frac{f(x_o + t) - f(x_o+)}{t}, \qquad \lim_{t\to 0^+}\frac{f(x_o - t) - f(x_o-)}{t}
\]
both exist as finite limits. Prove that $f$ satisfies the hypothesis of Dirichlet's
theorem (9.5.3).
9. Suppose $f \in \mathcal{R}[-\pi, \pi]$ and $x_o \in (-\pi, \pi)$ is such that $f(x_o-)$ and $f(x_o+)$
both exist. If $f$ satisfies the inequalities (11) at $x_o$ for some $\alpha$, $0 < \alpha \le 1$,
prove that the Fourier series of $f$ at $x_o$ converges to $\tfrac12 [f(x_o+) + f(x_o-)]$.

Notes
There is no doubt about the significance of Fourier’s contributions to the areas of
mathematical physics and applied mathematics; one only needs to consult a text on
partial differential equations. The methods which he developed in connection with
the theory of heat conduction are applicable to a large class of physical phenom-
ena, including problems in acoustics, elasticity, optics, and the theory of electrical
networks, among others. Fourier’s work however is even more significant in that it
inaugurated a new area of mathematics.
The study of Fourier series led to the development of the fundamental concepts
and methods of what is now called real analysis. The study of the concept of a
function by Dirichlet and others was directly linked to their interest in Fourier series.
The study of Fourier series by Riemann led to his development of the Riemann
integral. He was concerned with the question of finding sufficient conditions for the
existence of the integrals which gave the Fourier coefficients of a function f , that is
\[
a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx \quad\text{and}\quad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx.
\]

The quest for an understanding of what types of functions possessed Fourier series is
partly responsible for the development of Lebesgue's theory of integration. The need
for a more extensive theory of integration was illustrated by Lebesgue in 1903 in the
paper “Sur les séries trigonométriques.”⁸ In this paper, he constructed an example of a
function that is not Riemann integrable but that is representable everywhere by its
Fourier series. Such a function is $f(x) = -\ln|2\sin(x/2)|$ whose Fourier series is
\[
\sum_{k=1}^{\infty}\frac{\cos kx}{k}.
\]
This series converges everywhere to the function $f$ on $[-\pi, \pi]$, but since $f$ is
unbounded, it is not Riemann integrable on $[-\pi, \pi]$. It is however integrable in the
sense of Lebesgue. The article by Alan Gluchoff provides an excellent exposition
on the influence of trigonometric series on the theories of integration of Cauchy,
Riemann, and Lebesgue.

⁸ Annales Scientifiques de l’École Normale Supérieure, (3) 20 (1903), 453–485.



There are several important topics that we have not touched upon in the course
of this chapter. One of these is the Gibbs phenomenon, named after Josiah Gibbs
(1839–1903). To explain this phenomenon, we consider the Fourier series of
$f(x) = 0$, $x \in [-\pi, 0)$, and $f(x) = 1$, $x \in [0, \pi)$ of Example 9.3.3(a). By Dirichlet's
theorem we have
\[
\frac12 + \frac{2}{\pi}\sum_{k=1}^{\infty}\frac{1}{2k-1}\sin(2k-1)x =
\begin{cases} 0, & -\pi < x < 0, \\ \tfrac12, & x = 0, \\ 1, & 0 < x < \pi. \end{cases}
\]
A careful examination of the graphs (see Figure 9.2) of the partial sums $S_3$, $S_5$,
and $S_{15}$ shows that near 0, each of the functions $S_i$, $i = 3, 5, 15$, has an absolute
maximum at a point $t_i$, where the $t_i$ get closer to 0, but $S_i(t_i)$ is bounded away from
1.
We now consider the behavior of the partial sums $S_n$ more closely. For
$n = 2k - 1$, $k \in \mathbb{N}$,
\[
S_{2k-1}(x) = \frac12 + \frac{2}{\pi}\left[\sin x + \frac13\sin 3x + \cdots + \frac{1}{2k-1}\sin(2k-1)x\right],
\]
and
\[
S'_{2k-1}(x) = \frac{2}{\pi}\left[\cos x + \cos 3x + \cdots + \cos(2k-1)x\right].
\]
If we multiply $S'_{2k-1}(x)$ by $\sin x$ and use the identity
\[
\sin x\cos jx = \tfrac12 [\sin(j+1)x - \sin(j-1)x],
\]
the sum telescopes and we obtain
\[
\sin x\,S'_{2k-1}(x) = \frac{1}{\pi}\sin 2kx.
\]
From this it now follows that $S_{2k-1}(x)$ has relative maxima and minima at the points
\[
2kx = \pm\pi, \pm 2\pi, \ldots, \pm(2k-1)\pi.
\]
These points are equally spaced in $(-\pi, \pi)$. Consider the points $x_k = \pi/(2k)$, at
which each $S_{2k-1}$ has an (absolute) maximum with
\[
S_{2k-1}(x_k) = \frac12 + \frac{2}{\pi}\left[\sin\frac{\pi}{2k} + \frac13\sin\frac{3\pi}{2k} + \cdots + \frac{1}{2k-1}\sin\frac{(2k-1)\pi}{2k}\right].
\]
To find $\lim\limits_{k\to\infty} S_{2k-1}(x_k)$, we write the above sum as a Riemann sum of the function
$g(x) = (\sin x)/x$. This function is Riemann integrable on $[0, \pi]$. With the partition
$P$ of $[0, \pi]$ given by $y_j = j\frac{\pi}{k}$, $j = 0, 1, 2, \ldots, k$, and $t_j = \tfrac12(y_{j-1} + y_j)$,
\[
\frac{2}{\pi}\left[\sin\frac{\pi}{2k} + \frac13\sin\frac{3\pi}{2k} + \cdots + \frac{1}{2k-1}\sin\frac{(2k-1)\pi}{2k}\right]
  = \frac{1}{\pi}\sum_{j=1}^{k} g(t_j)\,\Delta y_j.
\]
Therefore,
\[
\lim_{k\to\infty} S_{2k-1}(x_k) = \frac12 + \frac{1}{\pi}\int_0^{\pi}\frac{\sin x}{x}\,dx.
\]
To approximate the integral we use the Taylor series expansion of $\sin x$. This gives
\[
\frac{\sin x}{x} = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \frac{x^6}{7!} + \frac{x^8}{9!} - \cdots, \quad x \in \mathbb{R}.
\]
Therefore,
\begin{align*}
\frac{1}{\pi}\int_0^{\pi}\frac{\sin x}{x}\,dx
 &= \frac{1}{\pi}\left[\pi - \frac{\pi^3}{3\cdot 3!} + \frac{\pi^5}{5\cdot 5!} - \frac{\pi^7}{7\cdot 7!} + \frac{\pi^9}{9\cdot 9!} - \cdots\right] \\
 &= 1 - \frac{\pi^2}{3\cdot 3!} + \frac{\pi^4}{5\cdot 5!} - \frac{\pi^6}{7\cdot 7!} + \frac{\pi^8}{9\cdot 9!} - \cdots \\
 &\approx 1 - .54831 + .16235 - .02725 + .00291 \\
 &\approx .59 \quad\text{(to two decimal places)}.
\end{align*}
The notation “≈” denotes approximately equal to. Therefore $\lim\limits_{k\to\infty} S_{2k-1}(x_k) \approx 1.09$.
Even though $\lim\limits_{n\to\infty} S_n(x) = 1$ for all $x \in (0, \pi)$, if we approach 0 along the points $x_k$,
then $S_{2k-1}(x_k)$ overshoots the value 1 by approximately .09, i.e.,
\[
\lim_{k\to\infty} |S_{2k-1}(x_k) - f(x_k)| \approx 0.09.
\]
If the Fourier series converges uniformly, this cannot happen.
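
The overshoot can be reproduced numerically. The sketch below evaluates $S_{2k-1}$ at $x_k = \pi/(2k)$ for a few values of $k$ (chosen arbitrarily) and compares the results with $\tfrac12 + \tfrac{1}{\pi}\int_0^{\pi}\frac{\sin x}{x}\,dx$, where the integral is summed from the Taylor series above using 20 terms.

```python
import math

def S(k, x):
    """Partial sum S_{2k-1}(x) = 1/2 + (2/pi) * sum_{j=1}^k sin((2j-1)x)/(2j-1)."""
    return 0.5 + (2 / math.pi) * sum(
        math.sin((2 * j - 1) * x) / (2 * j - 1) for j in range(1, k + 1))

# Values of S_{2k-1} at x_k = pi/(2k), the first maximum to the right of 0.
peaks = [S(k, math.pi / (2 * k)) for k in (10, 100, 1000)]

# (1/pi) * integral of sin(x)/x over [0, pi], from the term-by-term integrated
# Taylor series: sum of (-1)^m pi^(2m) / ((2m+1) (2m+1)!).
si_over_pi = sum((-1) ** m * math.pi ** (2 * m) / ((2 * m + 1) * math.factorial(2 * m + 1))
                 for m in range(20))
limit = 0.5 + si_over_pi
```

The entries of `peaks` should approach `limit` as $k$ grows, and `limit` should agree with the value 1.09 found above to two decimal places.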


The above behavior, known as the Gibbs phenomenon, is due to the fact that f
has a jump discontinuity at x0 = 0, and is typical of the behavior of the Fourier series
of a piecewise continuous function at a jump discontinuity. Furthermore, if f and f ′
are piecewise continuous on [−π, π], then the amount of overshoot at a discontinuity
x0 , due to the Gibbs phenomenon, is approximately equal to 0.09[f (x0 +) − f (x0 −)].
The article by Shelupsky in the supplemental readings provides a very nice discussion
of this phenomenon.
Another important question involves the uniqueness of the representation of a
function by a trigonometric series. Specifically, suppose
\[
f(x) = \sum_{k=0}^{\infty} (A_k \cos kx + B_k \sin kx) = \sum_{k=0}^{\infty} (C_k \cos kx + D_k \sin kx)
\]
for all $x \in [-\pi, \pi]$, must $A_k = C_k$ and $B_k = D_k$ for all $k = 0, 1, 2, \ldots$? Alternately, if
\[
\sum_{k=0}^{\infty} (A_k \cos kx + B_k \sin kx) = 0 \tag{13}
\]
for all $x \in [-\pi, \pi]$, must $A_k = B_k = 0$ for all $k = 0, 1, 2, \ldots$? By (13) we mean that
$\lim\limits_{n\to\infty} S_n(x) = 0$ for all $x \in [-\pi, \pi]$, where $S_n$ is the $n$th partial sum of the series.
In 1870 Eduard Heine proved that if the sequence {Sn } converges uniformly
to 0 on [−π, π] then Ak = Bk = 0 for all k. This is Theorem 9.3.6. The general
uniqueness problem was solved by Cantor in the early 1870’s. He was also able to
prove uniqueness if (13) holds for all but a finite number of x in [−π, π]. This then led
Cantor to consider the uniqueness problem for infinite subsets of [−π, π]; specifically,
if E ⊂ [−π, π] is infinite and (13) holds for all x ∈ [−π, π] \ E, must Ak = Bk = 0
for all k? Since point set theory was undeveloped at this time, this question also led
Cantor to devote much of his time and effort to studying point subsets of R. For a
thorough discussion of the uniqueness problem the reader is referred to the article
by Marshall Ash listed in the supplemental readings or to the text by A. Zygmund
listed in the Bibliography.

Miscellaneous Exercises

1. Suppose $f$ is continuous on $[-\pi, \pi)$ and
$f(x) = \tfrac12 A_0 + \sum_{k=1}^{\infty} (A_k \cos kx + B_k \sin kx)$.
Let $S_n$ be the $n$th partial sum of the series. If there exists a
positive constant $M$ such that $|S_n(x)| \le M$ for all $x \in [-\pi, \pi]$ and $n \in \mathbb{N}$,
prove that $A_k$ and $B_k$ are the Fourier coefficients of $f$.
2. Prove that $\lim\limits_{n\to\infty}\displaystyle\int_0^{\pi} |D_n(t)|\,dt = \infty$, where $D_n$ is the Dirichlet kernel. (See
Example 6.4.4(b).)
3. Let $f(x) = -\ln|2\sin(x/2)|$. Show that the Fourier series of $f$ is given
by $\sum_{k=1}^{\infty} (\cos kx)/k$. (Note: Since $f$ is unbounded at $x = 0$, the integrals
defining $a_k$ and $b_k$ have to be interpreted as improper integrals.)

Supplemental Reading

Ash, M. J., “Uniqueness of representation by trigonometric series,” Amer.
Math. Monthly 96 (1989), 873–885.
Askey, R. and Haimo, D. T., “Similarities between Fourier series and
power series,” Amer. Math. Monthly 103 (1996), 297–304.
Gluchoff, A. D., “Trigonometric series and theories of integration,” Math.
Mag. 67 (1994), 3–20.
Gonzalez-Velasco, E. A., “Connections in mathematical analysis: The case
of Fourier series,” Amer. Math. Monthly 99 (1992), 427–441.
Halmos, P., “Fourier series,” Amer. Math. Monthly 85 (1978), 33–34.
Lanczos, C., Discourse on Fourier Series, Hafner Publ. Co., New York, NY,
1966.
Shelupsky, D., “Derivation of the Gibbs phenomenon,” Amer. Math.
Monthly 87 (1980), 210–212.
Shepp, L. A. and Kruskal, J. B., “Computerized tomography: The new
medical X-ray technology,” Amer. Math. Monthly 85 (1978), 420–439.
Simon, B., “Uniform convergence of Fourier series,” Amer. Math. Monthly 67
(1969), 55–56.
Williams, K. S., “Note on $\int_0^{\infty} (\sin x / x)\,dx$,” Math. Mag. 44 (1971),
9–11.
10
Lebesgue Measure and Integration

The concept of measure plays a very important role in the theory of real
analysis. On the real line the idea of measure generalizes the length of an
interval, in the plane, the area of a rectangle, and so forth. It allows us to talk
about the measure of a set in the same way that we talk about the length of
an interval. The development of the Riemann integral of a bounded function
on a closed and bounded interval [a, b] depended very much on the fact that
we partitioned [a, b] into intervals. The notion of measure and measurable set
will play a prominent role in the development of the Lebesgue integral in that
we will partition [a, b] not into intervals, but instead into pairwise disjoint
measurable sets.
The theory of measure is due to Henri Lebesgue (1875–1941) who in his
famous 1902 thesis defined measure of subsets of the line and the plane, and
also the Lebesgue integral of a nonnegative function. Like Riemann, Lebesgue
was also led to the development of his theory of integration by the problem
of finding sufficient conditions on a function f for which the integrals defining
the Fourier coefficients of f exist. In this chapter, we will develop the theory of
Lebesgue measure of subsets of R following the original approach of Lebesgue
using inner and outer measure. Although this approach is somewhat more
tedious than the modern approach due to Constantin Carathéodory (1873–
1950), it has the advantage of being more intuitive and conceptually easier to
visualize.
In the first section we will illustrate the need for the concept of measure
of a set and measurable function by considering an alternate approach to in-
tegration developed by Lebesgue in 1928. Although this ultimately will not
be how we define the Lebesgue integral, the approach is instructive in empha-
sizing the concepts required for the development of the Lebesgue theory of
integration. In Section 10.2, we use the fact that every open subset of R can
be expressed as a finite or countable union of disjoint open intervals to define
the measure of open sets, and then of compact sets. These are then used to
define the inner and outer measure of subsets of R. A bounded subset of R is
then said to be measurable if these two quantities are the same.
In Section 10.6, we develop the theory of the Lebesgue integral of a
bounded real-valued function using upper and lower sums as in the devel-
opment of the Riemann integral. However, rather than using point partitions
of the interval, we will use measurable partitions. The key result of the section
is that a bounded real-valued function on [a, b] is Lebesgue integrable if


and only if it is measurable. As we will see, the class of Lebesgue integrable


functions contains the class of Riemann integrable functions, and for Riemann
integrable functions, the two integrals coincide. One of the advantages of the
Lebesgue theory of integration involves the interchange of limits of sequences
of functions and integration. We will prove several very important and useful
convergence theorems, including the well known bounded convergence theo-
rem and Lebesgue’s dominated convergence theorem.

10.1 Introduction to Measure


In Definition 6.1.11, we defined what it means for a subset of R to have
measure zero. In this chapter, we will consider the concept of measure of a
set in greater detail. When introducing a new concept it is of course very
natural to ask both “why,” and “does it lead to something useful?” Both
of these questions were answered by Lebesgue in 1903 when he exhibited
a trigonometric series that converged everywhere to a nonnegative function
f that was not Riemann integrable.1 The function f however is integrable
according to Lebesgue’s definition and the trigonometric series is the Fourier
series of f .
In this section, we will illustrate why it is necessary to consider the concept
of measure of a set by considering an alternate approach to integration. As
we will see later in the chapter, this approach leads to greater generality in
the types of functions that can be integrated. In addition, Lebesgue’s theory
of integration also allows us to prove interchange of limit and integration
theorems without requiring uniform convergence of the sequence of functions.
Let f be a bounded function on [a, b]. In developing the theory of the
Riemann integral we partitioned the interval [a, b] and defined the upper and
lower sums of f corresponding to the partition. An alternate approach to
integration, due to Lebesgue, is to partition the range of the function, rather
than the domain. For purposes of illustration suppose f is nonnegative with
Range f ⊂ [0, β). Let n ∈ N, and partition [0, β) into n disjoint subintervals

$$\left[(j-1)\frac{\beta}{n},\; j\,\frac{\beta}{n}\right), \qquad j = 1, 2, \ldots, n.$$

See Figure 10.1 with n = 8.


For each j = 1, 2, ..., n, we let

$$E_j = \left\{ x \in [a,b] : (j-1)\frac{\beta}{n} \le f(x) < j\,\frac{\beta}{n} \right\}.$$
1 “Sur les séries trigonométriques,” Annales Scientifiques de l’École Normale Supérieure, (3) 20 (1903), 453–485.



FIGURE 10.1
Partition of the range of f

In Figure 10.1,

$$E_4 = \left\{ x \in [a,b] : 3\,\frac{\beta}{8} \le f(x) < 4\,\frac{\beta}{8} \right\} = [a, x_1) \cup [x_2, x_3) \cup [x_4, x_5).$$

For a set such as E4 , which is a finite union of disjoint intervals, it is reasonable
to define the measure of E4 , denoted m(E4 ), as the sum of the lengths of the
intervals, i.e.,

m(E4 ) = (x1 − a) + (x3 − x2 ) + (x5 − x4 ).

Assuming that we can do this for each of the sets Ej , a lower approximation
to the area under the graph of f is given by

$$\sum_{j=1}^{n} (j-1)\frac{\beta}{n}\, m(E_j),$$

and an upper approximation would be given by

$$\sum_{j=1}^{n} j\,\frac{\beta}{n}\, m(E_j).$$

Taking limits as n → ∞, assuming that they exist and are equal, would in
fact provide another approach to integration. That this indeed is the case for
a large class of functions, including the Riemann integrable functions, will be
proved in Section 10.6.
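
The range-partition sums above can be tried numerically. The following is a minimal sketch under assumptions of our own choosing, not taken from the text: f(x) = x² on [0, 1] with β = 2, for which each set Ej is a single interval (possibly empty or a single point), so m(Ej) is just its length. The function name range_partition_sums is hypothetical.

```python
import math

def range_partition_sums(n, beta=2.0):
    """Lower and upper sums for f(x) = x**2 on [0, 1], partitioning the range [0, beta)."""
    lower = upper = 0.0
    for j in range(1, n + 1):
        lo, hi = (j - 1) * beta / n, j * beta / n
        # E_j = {x in [0, 1] : lo <= x**2 < hi} is an interval with endpoints
        # min(1, sqrt(lo)) and min(1, sqrt(hi)); its measure is its length.
        left = min(1.0, math.sqrt(lo))
        right = min(1.0, math.sqrt(hi))
        m_Ej = right - left
        lower += lo * m_Ej
        upper += hi * m_Ej
    return lower, upper

for n in (8, 64, 512):
    print(n, range_partition_sums(n))
```

Since the sets Ej partition [0, 1], the two sums differ by exactly (β/n) · 1, so both are squeezed toward the same limit as n grows; for this f the common limit is the Riemann integral 1/3.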

For nice functions, namely those for which the sets Ej are finite unions of
intervals, the above is perfectly reasonable. However, suppose our function f
is defined on [0, 1] by

$$f(x) = \begin{cases} 0, & x \in \mathbb{Q} \cap [0,1], \\ x, & \text{elsewhere}. \end{cases}$$

As above, for each j = 1, 2, ..., n, let

$$E_j = \left\{ x \in [0,1] : \frac{j-1}{n} \le f(x) < \frac{j}{n} \right\}.$$

Thus E1 = (Q ∩ [0, 1]) ∪ {x irrational : 0 < x < 1/n}, and for j ≥ 2,

$$E_j = \left\{ x \text{ irrational} : \frac{j-1}{n} < x < \frac{j}{n} \right\}.$$
Here the sets Ej are no longer unions of intervals, and so what is meant by
the measure of the set is by no means obvious.
Our goal in the next two sections is to define a function λ defined on a
family M of subsets of R, called the measurable sets, which contains all the
intervals, and has the property that
(a) for an interval J, λ(J) = length of J,
(b) λ(E + x) = λ(E), for all E ∈ M and x ∈ R, where

E + x = {y + x : y ∈ E}, and
(c) $\lambda\left(\bigcup_k E_k\right) = \sum_k \lambda(E_k)$ for any finite or countable family {Ek } of pairwise
disjoint sets in M.

10.2 Measure of Open Sets: Compact Sets


We begin our discussion of measure theory by first defining the measure of
open and compact subsets of R.

DEFINITION 10.2.1 If J is an interval, we define the measure of J,


denoted m(J), to be the length of J.

Thus if J is (a, b), (a, b], [a, b), or [a, b], a, b ∈ R, then

m(J) = b − a.

If J is R, (a, ∞), [a, ∞), (−∞, b) or (−∞, b], we set m(J) = ∞. In dealing with
the symbols ∞ and −∞, it is customary to adopt the following conventions:

(a) If x is real, then x + ∞ = ∞, x − ∞ = −∞.


(b) If x > 0 then x · ∞ = ∞, x · (−∞) = −∞.
(c) If x < 0 then x · ∞ = −∞, x · (−∞) = ∞.
(d) Also, ∞ + ∞ = ∞, −∞ − ∞ = −∞, ∞ · (±∞) = ±∞, −∞ · (±∞) =
∓∞.
The symbols ∞ − ∞ and −∞ + ∞ are undefined, but we shall adopt the
arbitrary convention that 0 · ∞ = 0.
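
As a side illustration, the conventions above can be compared with ordinary floating-point arithmetic. The sketch below uses Python's math.inf; the helper ext_mul is a hypothetical name introduced here only to encode the convention 0 · ∞ = 0, which IEEE floating-point arithmetic does not adopt.

```python
import math

INF = math.inf

def ext_mul(x, y):
    """Product on the extended reals with the convention 0 * (+-inf) = 0."""
    if x == 0 or y == 0:
        return 0.0
    return x * y

print(3 + INF, 3 - INF)        # inf -inf
print(-2 * INF, -2 * -INF)     # -inf inf
print(math.isnan(INF - INF))   # True: inf - inf is left undefined
print(math.isnan(0 * INF))     # True: floating point does not set 0 * inf = 0
print(ext_mul(0, INF))         # 0.0
```

The convention matters later because sums such as Σ c · m(E) should contribute nothing when c = 0, even if m(E) = ∞.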

Measure of Open Sets


Our first step will be to extend the set function m to the open subsets of R.
Since this extension relies on Theorem 2.2.20, we restate that result at this
point.
THEOREM 2.2.20 If U is an open subset of R, then there exists a finite or
countable collection {In } of pairwise disjoint open intervals such that

$$U = \bigcup_n I_n.$$

Recall, the family {In } is pairwise disjoint if and only if In ∩ Im = ∅
whenever n ≠ m.
DEFINITION 10.2.2 If U is an open subset of R with $U = \bigcup_n I_n$, where
{In } is a finite or countable collection of pairwise disjoint open intervals, we
define the measure of U , denoted m(U ), by

$$m(U) = \sum_n m(I_n).$$

Remarks. (a) For the empty set ∅, we set m(∅) = 0.


(b) The sum defining m(U ) may be either finite or infinite. If any of the
intervals are of infinite length, then m(U ) = ∞. On the other hand, if

$$U = \bigcup_{n=1}^{\infty} I_n,$$

where the In are pairwise disjoint bounded open intervals, we may still have

$$m(U) = \sum_{n=1}^{\infty} m(I_n) = \infty,$$

due to the divergence of the series to ∞. Since m(In ) ≥ 0 for all n, the
sequence of partial sums is monotone increasing and thus will either converge
to a real number or diverge to ∞.

EXAMPLES 10.2.3 (a) For each n = 1, 2, ..., set

$$I_n = \left(n - \frac{1}{2^n},\; n + \frac{1}{2^n}\right).$$

Then I1 = (1 − 1/2, 1 + 1/2), I2 = (2 − 1/4, 2 + 1/4), etc. Since
$n + 2^{-n} < (n+1) - 2^{-(n+1)}$ for all n ∈ N, the collection $\{I_n\}_{n=1}^{\infty}$ is pairwise
disjoint. Let $U = \bigcup_{n=1}^{\infty} I_n$. Then

$$m(U) = \sum_{n=1}^{\infty} m(I_n) = \sum_{n=1}^{\infty} 2\cdot\frac{1}{2^n} = \sum_{n=0}^{\infty} \frac{1}{2^n} = \frac{1}{1-\frac{1}{2}} = 2.$$

The set U is an example of an unbounded set with finite measure.


(b) Let Jn = (n, n + 1/n), n = 1, 2, ..., and let $V = \bigcup_{n=1}^{\infty} J_n$. Then

$$m(V) = \sum_{n=1}^{\infty} m(J_n) = \sum_{n=1}^{\infty} \frac{1}{n} = \infty.$$

(c) As in Section 2.5, let U denote the complement of the Cantor set in
[0, 1]. Since U is the union of the open intervals that have been removed, by
Property 4 of the Cantor set,

m(U ) = 1. 
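
The two series in Examples 10.2.3(a) and (b) behave very differently, which a short computation makes visible. This is only an illustrative sketch; the function names are hypothetical, and exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction

def partial_measure_a(N):
    """Sum of m(I_n) for I_n = (n - 2**-n, n + 2**-n), n = 1..N."""
    return sum(Fraction(2, 2**n) for n in range(1, N + 1))

def partial_measure_b(N):
    """Sum of m(J_n) for J_n = (n, n + 1/n), n = 1..N."""
    return sum(Fraction(1, n) for n in range(1, N + 1))

for N in (5, 10, 20):
    print(N, float(partial_measure_a(N)), float(partial_measure_b(N)))
```

The partial sums for (a) equal 2 − 2^(1−N) and stay below 2, while those for (b) are the harmonic partial sums, which grow without bound (although slowly).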

We now state and prove several results concerning the measure of open
sets.

THEOREM 10.2.4 If U and V are open subsets of R with U ⊂ V , then

m(U ) ≤ m(V ).

Proof. The statement of the theorem appears to be so obvious that no proof
seems to be required. However, it is important to keep in mind how the measure
of an open set is defined. Suppose

$$U = \bigcup_n I_n \quad\text{and}\quad V = \bigcup_m J_m,$$

where {In }n and {Jm }m are finite or countable collections of pairwise disjoint
open intervals. Since U ⊂ V , each interval In ⊂ Jm for some m. For each m,
let

$$N_m = \{ n : I_n \subset J_m \}.$$

Since the collection {Jm }m is pairwise disjoint, so is the collection {Nm }m ,
and

$$U = \bigcup_n I_n = \bigcup_m \bigcup_{n \in N_m} I_n.$$

Therefore

$$m(U) = \sum_m \sum_{n \in N_m} m(I_n).$$

But by Exercise 1,

$$\sum_{n \in N_m} m(I_n) \le m(J_m),$$

from which the result follows. 


Remark. As a consequence of the previous theorem, if U is an open subset
of (a, b), a, b ∈ R, then
m(U ) ≤ b − a.
Thus every bounded open set has finite measure.

THEOREM 10.2.5 If U is an open subset of R, then

$$m(U) = \lim_{k\to\infty} m(U_k),$$

where for each k ∈ N, Uk = U ∩ (−k, k).

Proof. For each k, Uk is open, with

Uk ⊂ Uk+1 ⊂ U

for all k ∈ N. By Theorem 10.2.4, the sequence {m(Uk )} is monotone increasing
with m(Uk ) ≤ m(U ) for all k. Therefore

$$\lim_{k\to\infty} m(U_k) \le m(U). \tag{1}$$

If U is bounded, then there exists ko ∈ N such that

U ∩ (−k, k) = U

for all k ≥ ko . Hence m(Uk ) = m(U ) for all k ≥ ko , and thus equality holds
in (1).
Suppose that U is an unbounded open subset of R with

$$U = \bigcup_n I_n,$$

where {In } is a finite or countable collection of pairwise disjoint open intervals.
If m(In ) = ∞ for some n, then m(U ) = ∞, and for that n, either In = R or
In is an interval of the form

(−∞, an ) or (an , ∞)

for some an ∈ R. Suppose In = (an , ∞). Choose ko ∈ N such that ko ≥ |an |.
Then for all k ≥ ko ,

In ∩ (−k, k) = (an , k),

and thus

$$\infty = \lim_{k\to\infty} m(I_n \cap (-k,k)) \le \lim_{k\to\infty} m(U_k) \le m(U).$$

Therefore equality holds in (1). The other two cases follow similarly.
Suppose m(In ) < ∞ for all n. Since U is unbounded, the collection {In }
must be infinite: if the collection were finite, then since each interval has
finite length, each interval would be bounded, and as a consequence U would
also be bounded. Let α ∈ R with α < m(U ). Since

$$\sum_{n=1}^{\infty} m(I_n) = m(U) > \alpha,$$

there exists a positive integer N such that

$$\sum_{n=1}^{N} m(I_n) > \alpha.$$

Let $V = \bigcup_{n=1}^{N} I_n$. Then V is a bounded open set, and thus by the above,

$$m(V) = \lim_{k\to\infty} m(V \cap (-k,k)).$$

Since m(V ) > α, there exists ko ∈ N such that

m(V ∩ (−k, k)) > α for all k ≥ ko .

But V ∩(−k, k) ⊂ Uk for all k ∈ N. Hence by Theorem 10.2.4, m(V ∩(−k, k)) ≤
m(Uk ) and as a consequence

m(Uk ) > α for all k ≥ ko .

If m(U ) = ∞, then since α < m(U ) was arbitrary, we have m(Uk ) → ∞
as k → ∞. If m(U ) < ∞, then given ε > 0, take α = m(U ) − ε. By the above,
there exists ko ∈ N such that

m(U ) − ε < m(Uk ) ≤ m(U ) for all k ≥ ko .

Therefore, $\lim_{k\to\infty} m(U_k) = m(U)$. □
Remark. In proving results about open sets, the previous theorem allows us
to first prove the result for the case where U is a bounded open set, and then
to use the limit process to extend the result to the unbounded case.
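
Theorem 10.2.5 can be observed on the unbounded set U of Example 10.2.3(a). The sketch below computes m(U ∩ (−k, k)) exactly; the function name truncated_measure and the cutoff parameter n_max (which must be at least k for the result to be exact) are assumptions made for this illustration.

```python
from fractions import Fraction

def truncated_measure(k, n_max=200):
    """m(U ∩ (-k, k)) for U the union of I_n = (n - 2**-n, n + 2**-n), n = 1..n_max."""
    total = Fraction(0)
    for n in range(1, n_max + 1):
        # Intersect I_n with (-k, k); the I_n are pairwise disjoint, so lengths add.
        left = max(Fraction(-k), n - Fraction(1, 2**n))
        right = min(Fraction(k), n + Fraction(1, 2**n))
        if right > left:
            total += right - left
    return total

for k in (1, 2, 5, 10):
    print(k, float(truncated_measure(k)))
```

For this set one finds m(Uk) = 2 − 3 · 2^(−k), a monotone increasing sequence converging to m(U) = 2, as the theorem asserts.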
Our next goal is to prove the following:

THEOREM 10.2.6 If {Un }n is a finite or countable collection of open subsets
of R, then

$$m\left(\bigcup_n U_n\right) \le \sum_n m(U_n).$$

For the proof of the theorem we require the following lemma.

LEMMA 10.2.7 If $\{I_n\}_{n=1}^{N}$ is a finite collection of bounded open intervals,
then

$$m\left(\bigcup_{n=1}^{N} I_n\right) \le \sum_{n=1}^{N} m(I_n).$$

It should be noted that the collection {In } is not assumed to be pairwise


disjoint. The lemma is most easily proved by resorting to the theory of the
Riemann integral.

DEFINITION 10.2.8 If E is a subset of R, the characteristic function
of E, denoted χE , is the function defined by

$$\chi_E(x) = \begin{cases} 1, & x \in E, \\ 0, & x \notin E. \end{cases}$$

Suppose I is a bounded open interval. Choose a, b ∈ R such that I ⊂ [a, b].
Since χI is continuous on [a, b] except at the two endpoints of I, χI ∈ R[a, b]
with

$$\int_a^b \chi_I(x)\,dx = m(I).$$

It should be clear that this is independent of the interval [a, b] containing I.
If U is an open subset of [a, b] with

$$U = \bigcup_{k=1}^{m} J_k, \qquad m \le N,$$

where the {Jk } are pairwise disjoint open intervals, then

$$\chi_U(x) = \sum_{k=1}^{m} \chi_{J_k}(x),$$

and thus

$$m(U) = \sum_{k=1}^{m} m(J_k) = \sum_{k=1}^{m} \int_a^b \chi_{J_k}(x)\,dx = \int_a^b \chi_U(x)\,dx. \tag{2}$$

Proof of Lemma 10.2.7 Let $\{I_n\}_{n=1}^{N}$ be a finite collection of bounded open
intervals and let $U = \bigcup_{n=1}^{N} I_n$. Choose a, b ∈ R such that U ⊂ [a, b]. Then χU
is continuous except at a finite number of points, and

$$\chi_U(x) \le \sum_{n=1}^{N} \chi_{I_n}(x).$$

Therefore by (2) and the above,

$$m(U) = \int_a^b \chi_U(x)\,dx \le \int_a^b \sum_{n=1}^{N} \chi_{I_n}(x)\,dx = \sum_{n=1}^{N} \int_a^b \chi_{I_n}(x)\,dx = \sum_{n=1}^{N} m(I_n).$$ □
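
Lemma 10.2.7 is easy to check on concrete data. The sketch below does not use the integral argument of the proof; instead it merges overlapping intervals by sorting, which is a different (standard) technique for computing the measure of a finite union. The function name union_measure is hypothetical.

```python
def union_measure(intervals):
    """Measure of a finite union of bounded open intervals, given as (a, b) pairs."""
    total = 0.0
    current_left = current_right = None
    for a, b in sorted(intervals):
        if b <= a:
            continue  # empty interval, contributes nothing
        if current_right is None or a >= current_right:
            # A new component starts here. Intervals that merely touch share no
            # points, and a single point has length zero, so the total is the same.
            if current_right is not None:
                total += current_right - current_left
            current_left, current_right = a, b
        else:
            # Overlap with the current component: extend it.
            current_right = max(current_right, b)
    if current_right is not None:
        total += current_right - current_left
    return total

intervals = [(0, 2), (1, 3), (5, 6)]
print(union_measure(intervals), sum(b - a for a, b in intervals))
```

Here the union is (0, 3) ∪ (5, 6), of measure 4, while the lengths sum to 5; the inequality is strict exactly because (0, 2) and (1, 3) overlap.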

Proof of Theorem 10.2.6 Since the result for a finite collection follows obviously
from that of a countable collection, we suppose $\{U_n\}_{n=1}^{\infty}$ is a countable
collection of open sets and

$$U = \bigcup_{n=1}^{\infty} U_n.$$

Since U is open, $U = \bigcup_m J_m$ where {Jm }m is a finite or countable collection
of pairwise disjoint open intervals. Also, for each n,

$$U_n = \bigcup_k I_{n,k},$$

where for each n, {In,k }k is a finite or countable collection of pairwise disjoint
open intervals.
We assume first that U is bounded with m(U ) < ∞. Let ε > 0 be given.
If the collection {Jm }m is infinite, then since $\sum_m m(J_m) < \infty$, there exists a
positive integer N such that

$$\sum_{m=N+1}^{\infty} m(J_m) < \varepsilon.$$

Thus

$$m(U) < \sum_{m=1}^{N} m(J_m) + \varepsilon.$$

If the collection {Jm }m is finite, the previous step is not necessary.


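Consider the collection $\{J_m\}_{m=1}^{N}$. For each m = 1, 2, ..., N , let Km be an
open interval such that

$$\overline{K_m} \subset J_m \quad\text{and}\quad m(J_m) < m(K_m) + \frac{\varepsilon}{N}.$$

Then

$$m(U) < \sum_{m=1}^{N} m(J_m) + \varepsilon < \sum_{m=1}^{N} m(K_m) + 2\varepsilon. \tag{3}$$

Let $A = \bigcup_{m=1}^{N} \overline{K_m}$. Since each $\overline{K_m}$ is closed and bounded, so is the set A.
Thus A is compact, and since

$$A \subset U = \bigcup_{n=1}^{\infty} \left( \bigcup_k I_{n,k} \right),$$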

the collection {In,k } is an open cover of A. Hence by compactness there exists
a finite number of intervals $I_{n_i,k_{ij}}$, i = 1, ..., J, j = 1, ..., mi , such that

$$A \subset \bigcup_{i,j} I_{n_i,k_{ij}}.$$

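Since the intervals {Km } are pairwise disjoint,

$$\sum_{m=1}^{N} m(K_m) = m\left(\bigcup_{m=1}^{N} K_m\right) \le m\left(\bigcup_{i,j} I_{n_i,k_{ij}}\right),$$

which by Lemma 10.2.7

$$\le \sum_{i,j} m(I_{n_i,k_{ij}}) = \sum_{i=1}^{J} \sum_{j=1}^{m_i} m(I_{n_i,k_{ij}}) \le \sum_{i=1}^{J} m(U_{n_i}) \le \sum_{n=1}^{\infty} m(U_n).$$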

Combining this with inequality (3) gives

$$m(U) < \sum_{n=1}^{\infty} m(U_n) + 2\varepsilon.$$

Since ε > 0 was arbitrary, the result follows for the case where U is bounded.
If U is unbounded, then for each k ∈ N,

$$m(U \cap (-k,k)) \le \sum_{n=1}^{\infty} m(U_n \cap (-k,k)) \le \sum_{n=1}^{\infty} m(U_n).$$

The result now follows by Theorem 10.2.5. 

THEOREM 10.2.9 If U and V are open subsets of R, then

m(U ) + m(V ) = m(U ∪ V ) + m(U ∩ V ).

Proof. (a) If both U and V are unions of a finite number of bounded open
intervals, then so are U ∩ V and U ∪ V . Thus the functions

χU , χV , χU ∪V , χU ∩V

are all Riemann integrable on some interval [a, b] with

χU (x) + χV (x) = χU ∪V (x) + χU ∩V (x)



for all x ∈ [a, b]. Therefore by identity (2),

m(U ) + m(V ) = m(U ∪ V ) + m(U ∩ V ).

(b) For the general case, suppose

$$U = \bigcup_{n=1}^{\infty} I_n \quad\text{and}\quad V = \bigcup_{n=1}^{\infty} J_n,$$

where the collections {In } and {Jn } consist of pairwise disjoint open intervals
respectively. If one of m(U ) or m(V ) is ∞, then by Theorem 10.2.4, m(U ∪V ) =
∞ and the conclusion holds. Thus we can assume that both m(U ) and m(V )
are finite.
Let ε > 0 be given. Choose N ∈ N such that

$$\sum_{n=N+1}^{\infty} m(I_n) < \varepsilon \quad\text{and}\quad \sum_{n=N+1}^{\infty} m(J_n) < \varepsilon.$$

Let U ∗ and U ∗∗ be defined by

$$U^{*} = \bigcup_{n=1}^{N} I_n, \qquad U^{**} = \bigcup_{n=N+1}^{\infty} I_n.$$

Also let V ∗ and V ∗∗ be defined analogously. Then

m(U ) = m(U ∗ ) + m(U ∗∗ ) and m(V ) = m(V ∗ ) + m(V ∗∗ ).

Since m(U ∗∗ ) < ε and m(V ∗∗ ) < ε,

m(U ) + m(V ) < m(U ∗ ) + m(V ∗ ) + 2ε.

Since the sets U ∗ and V ∗ are finite unions of open intervals, by part (a)

m(U ∗ ) + m(V ∗ ) = m(U ∗ ∪ V ∗ ) + m(U ∗ ∩ V ∗ ),

which by Theorem 10.2.4

≤ m(U ∪ V ) + m(U ∩ V ).

The last inequality follows since both U ∗ ∪ V ∗ and U ∗ ∩ V ∗ are subsets of


U ∪ V and U ∩ V respectively. Hence

m(U ) + m(V ) < m(U ∪ V ) + m(U ∩ V ) + 2ε.

Since ε > 0 was arbitrary, we have

m(U ) + m(V ) ≤ m(U ∪ V ) + m(U ∩ V ).



We now proceed to prove the reverse inequality. We first note that

U ∪ V = (U ∗ ∪ V ∗ ) ∪ (U ∗∗ ∪ V ∗∗ ),

and as a consequence of Theorem 10.2.6

m(U ∪ V ) ≤ m(U ∗ ∪ V ∗ ) + 2ε.

Also by the distributive law,

U ∩ V = (U ∗ ∪ U ∗∗ ) ∩ (V ∗ ∪ V ∗∗ )
= (U ∗ ∩ V ∗ ) ∪ (U ∗∗ ∩ V ∗ ) ∪ (U ∗ ∩ V ∗∗ ) ∪ (U ∗∗ ∩ V ∗∗ )
⊂ (U ∗ ∩ V ∗ ) ∪ U ∗∗ ∪ V ∗∗ .

Therefore,
m(U ∩ V ) < m(U ∗ ∩ V ∗ ) + 2ε.
Combining the above gives

m(U ∪ V ) + m(U ∩ V ) < m(U ∗ ∪ V ∗ ) + m(U ∗ ∩ V ∗ ) + 4ε,

which since U ∗ and V ∗ are finite unions of intervals

= m(U ∗ ) + m(V ∗ ) + 4ε
≤ m(U ) + m(V ) + 4ε.

Again since ε > 0 was arbitrary, this proves the reverse inequality. □
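
For finite unions of bounded open intervals, the identity of Theorem 10.2.9 can be verified directly. The following sketch represents an open set as a list of (a, b) pairs; the helper names normalize, measure, and intersection are hypothetical and introduced only for this illustration.

```python
def normalize(intervals):
    """Merge a finite list of open intervals into sorted, pairwise disjoint ones."""
    merged = []
    for a, b in sorted(i for i in intervals if i[0] < i[1]):
        if merged and a < merged[-1][1]:
            # Overlaps the previous component: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def measure(intervals):
    """m of the union: sum of lengths of the disjoint components."""
    return sum(b - a for a, b in normalize(intervals))

def intersection(U, V):
    """Pairwise intersections of the intervals of U with those of V."""
    return [(max(a, c), min(b, d))
            for a, b in U for c, d in V
            if max(a, c) < min(b, d)]

U = [(0, 2), (3, 5)]
V = [(1, 4), (6, 7)]
lhs = measure(U) + measure(V)
rhs = measure(U + V) + measure(intersection(U, V))
print(lhs, rhs)
```

In this instance U ∪ V = (0, 5) ∪ (6, 7) and U ∩ V = (1, 2) ∪ (3, 4), so both sides equal 8.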

Measure of Compact Sets


We now define the measure of a compact subset of R. If K is a compact subset
of R and U is any bounded open set containing K, then

U = K ∪ (U \ K).

Using the fact that U \ K is also open and bounded, and thus has finite
measure, we define the measure of K as follows:

DEFINITION 10.2.10 Let K be a compact subset of R. The measure of


K, denoted m(K), is defined to be

m(K) = m(U ) − m(U \ K),

where U is any bounded open subset of R containing K.

We first show that the definition of m(K) is independent of the choice of


U.

THEOREM 10.2.11 If K is compact, then m(K) is well defined.


Proof. Suppose U and V are any two bounded open sets containing K. Then
by Theorem 10.2.9
m(U ) + m(V \ K) = m(U ∪ (V \ K)) + m(U ∩ (V \ K))
= m(U ∪ V ) + m((U ∩ V ) \ K).
In the above we have used the fact that
U ∪ (V \ K) = U ∪ V and U ∩ (V \ K) = (U ∩ V ) \ K.
Similarly
m(U \ K) + m(V ) = m(U ∪ V ) + m((U ∩ V ) \ K).
Therefore
m(U ) + m(V \ K) = m(V ) + m(U \ K).
Since all the terms are finite,
m(U ) − m(U \ K) = m(V ) − m(V \ K).
Thus the definition of m(K) is independent of the choice of U ; i.e., m(K) is
well defined. 

EXAMPLES 10.2.12 (a) In our first example we show that for a closed
and bounded interval [a, b], a, b ∈ R, Definition 10.2.10 is consistent with
Definition 10.2.1; namely
m([a, b]) = b − a.
Let U = (a − ε, b + ε), ε > 0. Then
U \ [a, b] = (a − ε, a) ∪ (b, b + ε).
Therefore,
m([a, b]) = m(U ) − m(U \ [a, b])
= (b − a) + 2ε − 2ε = b − a.

(b) If K = {x1 , ..., xn } with xi ∈ R, then m(K) = 0. Choose δ > 0 such
that the intervals
Ij = (xj − δ, xj + δ), j = 1, 2, ..., n,
are pairwise disjoint, and let $U = \bigcup_{j=1}^{n} I_j$. Then

$$U \setminus K = \bigcup_{j=1}^{n} \left[ (x_j - \delta, x_j) \cup (x_j, x_j + \delta) \right].$$

Thus m(U ) = m(U \ K); i.e., m(K) = 0. 
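
Definition 10.2.10 and the independence of the choice of U (Theorem 10.2.11) can be illustrated for simple compact sets. The sketch below assumes K is a finite union of closed intervals [c, d] (a single point is the case c = d) and U = (A, B) is an open interval containing K; the function name compact_measure is hypothetical.

```python
def compact_measure(closed_intervals, A, B):
    """m(K) = m(U) - m(U minus K) for K a finite union of closed intervals in U = (A, B)."""
    gaps = 0
    cursor = A
    for c, d in sorted(closed_intervals):
        if c > cursor:
            gaps += c - cursor      # the open gap (cursor, c) lies in U minus K
        cursor = max(cursor, d)
    gaps += B - cursor              # the final open gap (cursor, B)
    return (B - A) - gaps

K = [(0, 1), (2, 4)]
print(compact_measure(K, -1, 5), compact_measure(K, -10, 100))
print(compact_measure([(0, 0), (1, 1)], -1, 2))
```

Both choices of U give m(K) = 3 for K = [0, 1] ∪ [2, 4], and the two-point set has measure 0, in agreement with Examples 10.2.12.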



THEOREM 10.2.13
(a) If K is compact and U is open with K ⊂ U , then m(K) ≤ m(U ).
(b) If K1 and K2 are compact with K1 ⊂ K2 , then m(K1 ) ≤ m(K2 ).

Proof. The proof of (a) is an immediate consequence of the definition. For


the proof of (b), if U is a bounded open set containing K2 , then since U \K2 ⊂
U \ K1 ,

m(K1 ) = m(U ) − m(U \ K1 ) ≤ m(U ) − m(U \ K2 ) = m(K2 ) 

If I is an open interval and a, b ∈ R, then I ∩ [a, b] is an interval and thus


m(I ∩ [a, b]) is defined with

m(I ∩ [a, b]) = m(I ∩ (a, b)).

We extend this to open subsets of R as follows:


DEFINITION 10.2.14 If U is an open subset of R with $U = \bigcup_n I_n$ where
{In }n is a finite or countable collection of pairwise disjoint open intervals,
and a, b ∈ R, we define

$$m(U \cap [a,b]) = \sum_n m(I_n \cap [a,b]).$$

Since m(In ∩ [a, b]) = m(In ∩ (a, b)) for all n, we have

m(U ∩ [a, b]) = m(U ∩ (a, b)).

Remark. What the above definition really defines is the measure of relatively
open subsets of [a, b] (see Definition 2.2.21). By Theorem 2.2.23, a subset G
of [a, b] is open in [a, b] if and only if there exists an open subset U of R such
that G = U ∩ [a, b]. Since the set U may not be unique, we leave it as an
exercise (Exercise 4) to show that if U, V are open subsets of R with

U ∩ [a, b] = V ∩ [a, b],

then m(U ∩ [a, b]) = m(V ∩ [a, b]).

THEOREM 10.2.15 If U is an open subset of R and a, b ∈ R, then

m(U ∩ [a, b]) + m(U^c ∩ [a, b]) = b − a.

Recall from Chapter 1 that U^c = R \ U = {x ∈ R : x ∉ U }. If U is open,
then U^c is closed and thus U^c ∩ [a, b] is a compact subset of [a, b].
Proof. Suppose V ⊃ [a, b] is a bounded open set. Let K = U^c ∩ [a, b]. Then since V ⊃ K,

m(K) = m(V ) − m(V \ K).



But
V \ K = V ∩ (U^c ∩ [a, b])^c = (V ∩ U ) ∪ (V ∩ [a, b]^c ) ⊃ V ∩ U.
Therefore, m(V ∩ U ) ≤ m(V \ K). Since U ∩ [a, b] ⊂ U ∩ V ,
m(U ∩ [a, b]) + m(K) ≤ m(U ∩ V ) + m(V ) − m(V \ K) ≤ m(V ).
Given ε > 0, take V = (a − ε, b + ε). Then
m(U ∩ [a, b]) + m(U^c ∩ [a, b]) ≤ b − a + 2ε.
Since ε > 0 is arbitrary, this proves
m(U ∩ [a, b]) + m(U^c ∩ [a, b]) ≤ b − a.
To prove the reverse inequality, let Iε = [a + ε, b − ε], where 0 < ε < ½(b − a).
Then
m(U ∩ [a, b]) + m(U^c ∩ [a, b]) ≥ m(U ∩ (a, b)) + m(U^c ∩ Iε ).
Since (a, b) is an open set containing U^c ∩ Iε ,
m(U^c ∩ Iε ) = b − a − m((a, b) \ (U^c ∩ Iε )).
But
m((a, b) \ (U^c ∩ Iε )) = m(((a, b) ∩ U ) ∪ ((a, b) ∩ Iε^c ))
= m(((a, b) ∩ U ) ∪ (a, a + ε) ∪ (b − ε, b)),

which by Theorem 10.2.6

≤ m(U ∩ (a, b)) + 2ε.

Therefore
m(U ∩ [a, b]) + m(U^c ∩ [a, b]) ≥ b − a − 2ε.
Since ε > 0 was arbitrary, the reverse inequality follows. □

Exercises 10.2
1. If {In }n is a finite or countable collection of disjoint open intervals with
$\bigcup_n I_n \subset (a, b)$, prove that $\sum_n m(I_n) \le m((a, b))$.
2. *If U ≠ ∅ is an open subset of R, prove that m(U ) > 0.
3. *Let P denote the Cantor set in [0, 1]. Prove that m(P ) = 0.
4. Suppose U, V are open subsets of R, a, b ∈ R with U ∩ [a, b] = V ∩ [a, b].
Prove that m(U ∩ [a, b]) = m(V ∩ [a, b]).
5. If A, B are subsets of R, prove that
χA∩B (x) = χA (x)χB (x),
χA∪B (x) = χA (x) + χB (x) − χA∩B (x),
χAc (x) = 1 − χA (x).
6. *If K1 and K2 are disjoint compact subsets of R, prove that
m(K1 ∪ K2 ) = m(K1 ) + m(K2 ).

10.3 Inner and Outer Measure: Measurable Sets


Our goal in this section is to define a function λ on a large family M of subsets
of R, called the measurable sets, which agrees with the function m on the open
and compact subsets of R. We begin with the definition of inner and outer
measure of a set.

DEFINITION 10.3.1 Let E be a subset of R. The outer measure of E,
denoted λ^*(E), is defined by

λ^*(E) = inf{m(U ) : U is open with E ⊂ U }.

The inner measure of E, denoted λ_*(E), is defined by

λ_*(E) = sup{m(K) : K is compact with K ⊂ E}.

THEOREM 10.3.2
(a) For any subset E of R,

0 ≤ λ_*(E) ≤ λ^*(E).

(b) If E1 and E2 are subsets of R with E1 ⊂ E2 , then

λ_*(E1 ) ≤ λ_*(E2 ) and λ^*(E1 ) ≤ λ^*(E2 ).

Proof. (a) If K is compact and U is open with K ⊂ E ⊂ U , then

0 ≤ m(K) ≤ m(U ).

If we fix K, then m(K) ≤ m(U ) for all open sets U containing E. Taking the
infimum over all such U gives

0 ≤ m(K) ≤ λ^*(E).

Taking the supremum over all compact subsets K of E proves (a). The proof
of (b) is similar and is left as an exercise (Exercise 7). 

EXAMPLES 10.3.3 (a) If E is any countable subset of R, then

λ_*(E) = λ^*(E) = 0.

Suppose $E = \{x_n\}_{n=1}^{\infty}$. Let ε > 0 be arbitrary. For each n, let

$$I_n = \left( x_n - \frac{\varepsilon}{2^n},\; x_n + \frac{\varepsilon}{2^n} \right),$$
and set $U = \bigcup_{n=1}^{\infty} I_n$. Then U is open with E ⊂ U . By Theorem 10.2.6,

$$m(U) \le \sum_{n=1}^{\infty} m(I_n) = \sum_{n=1}^{\infty} \frac{\varepsilon}{2^{n-1}} = 2\varepsilon.$$

Therefore, λ^*(E) ≤ 2ε. Since ε > 0 was arbitrary, λ^*(E) = 0. As a consequence
of Theorem 10.3.2(a), we also have λ_*(E) = 0.
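
The covering argument in Example 10.3.3(a) rests on a simple geometric series, which the following sketch evaluates exactly. The function name cover_length is hypothetical; note that the total length does not depend on the points xn at all, only on how many intervals are used.

```python
from fractions import Fraction

def cover_length(num_points, eps):
    """Total length of I_n = (x_n - eps/2**n, x_n + eps/2**n), n = 1..num_points.

    Each I_n has length 2 * eps / 2**n, whatever the point x_n is.
    """
    return sum(2 * eps / 2**n for n in range(1, num_points + 1))

eps = Fraction(1, 100)
for N in (1, 10, 50):
    print(N, float(cover_length(N, eps)), "<", float(2 * eps))
```

However many points of E are covered, the intervals have total length 2ε(1 − 2^(−N)) < 2ε, which is why a countable set such as Q can be enclosed in an open set of arbitrarily small measure.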
(b) If I is any bounded interval, then

λ_*(I) = λ^*(I) = m(I).

Suppose I = (a, b) with a, b ∈ R. Since I itself is open,

λ^*(I) ≤ m(I) = b − a.

On the other hand, if 0 < ε < (b − a), then [a + ε/2, b − ε/2] is a compact
subset of I, and as a consequence

$$b - a - \varepsilon = m\left( \left[ a + \frac{\varepsilon}{2},\; b - \frac{\varepsilon}{2} \right] \right) \le \lambda_*(I).$$

Therefore,
b − a − ε ≤ λ_*(I) ≤ λ^*(I) ≤ b − a.
Since ε > 0 was arbitrary, equality holds. A similar argument proves that if I
is any closed and bounded interval, then

λ_*(I) = λ^*(I) = m(I).

As a consequence of Theorem 10.3.2(b), the result holds for any bounded
interval I.
(c) For any open set U ,

λ_*(U ) = λ^*(U ) = m(U ).

By definition, λ^*(U ) = m(U ). But $m(U) = \sum_n m(I_n)$, where {In } is a pairwise
disjoint collection of open intervals with $U = \bigcup_n I_n$. Suppose α ∈ R
satisfies α < λ^*(U ). Since m(U ) > α, there exists a finite number of intervals
I1 , . . . , IN such that $\sum_{j=1}^{N} m(I_j) > \alpha$. For each j, choose a closed and bounded
interval Jj ⊂ Ij such that $\sum_{j=1}^{N} m(J_j) > \alpha$. Let $K = \bigcup_{j=1}^{N} J_j$. Then K is a compact
subset of U and thus λ_*(U ) ≥ m(K). Finally, since the intervals $\{J_j\}_{j=1}^{N}$
are pairwise disjoint, by Exercise 6 of Section 10.2,

$$m(K) = \sum_{j=1}^{N} m(J_j) > \alpha.$$

Therefore λ_*(U ) > α. If λ^*(U ) = ∞, then by the above λ_*(U ) > α for
every α ∈ R; that is, λ_*(U ) = ∞. On the other hand, if λ^*(U ) is finite, take
α = λ^*(U ) − ε, where ε > 0 is arbitrary. But then λ^*(U ) ≥ λ_*(U ) > λ^*(U ) − ε
for every ε > 0. From this it now follows that λ_*(U ) = λ^*(U ) = m(U ). □

Measurable Sets
In each of the previous examples, the inner and outer measure of the sets
are equal. As we shall see, all subsets of R built out of open sets or closed
sets by countable unions, intersections, and complementation will have this
property, and this includes most sets encountered in practice. In fact, the
explicit construction of a set whose inner and outer measure are different
requires use of an axiom from set theory, the Axiom of Choice, which we have
not discussed. The construction of such a set is outlined in the miscellaneous
exercises.

DEFINITION 10.3.4
(a) A bounded subset E of R is said to be Lebesgue measurable or
measurable if
λ_*(E) = λ^*(E).
If this is the case, then the measure of E, denoted λ(E), is defined to be

λ(E) = λ_*(E) = λ^*(E).

(b) An unbounded set E is measurable if E ∩ [a, b] is measurable for
every closed and bounded interval [a, b]. If this is the case, we define

$$\lambda(E) = \lim_{k\to\infty} \lambda(E \cap [-k, k]).$$

Remarks. (a) As a consequence of Example 10.3.3(c) every open set U is
measurable with
λ(U ) = m(U ).
(b) If E is unbounded and E ∩ I is measurable for every closed and bounded
interval I, then by Theorem 10.3.2 the sequence $\{\lambda(E \cap [-k, k])\}_{k=1}^{\infty}$ is
nondecreasing, and as a consequence

$$\lambda(E) = \lim_{k\to\infty} \lambda(E \cap [-k, k])$$

exists.
(c) There is no discrepancy between the two parts of the definition. We
will shortly prove in Theorem 10.4.1 that if E is a bounded measurable set,
then E ∩ I is measurable for every interval I. Conversely, if E is a bounded
set for which
λ_*(E ∩ [a, b]) = λ^*(E ∩ [a, b])

for all a, b ∈ R, then by choosing a and b sufficiently large such that E ⊂ [a, b],
we have λ_*(E) = λ^*(E). The two separate definitions are required due to the
existence of unbounded nonmeasurable sets E for which

λ_*(E) = λ^*(E) = ∞.

An example of such a set will be given in Exercise 5 of Section 10.4.

THEOREM 10.3.5 Every set E of outer measure zero is measurable with


λ(E) = 0.

Proof. Suppose E ⊂ R with λ^*(E) = 0. Then for any closed and bounded
interval I,
λ^*(E ∩ I) ≤ λ^*(E) = 0.
Thus λ_*(E ∩ I) = λ^*(E ∩ I) = 0, and hence E ∩ I is measurable for every closed
and bounded interval I. Since λ(E ∩ [−k, k]) = 0 for every k ∈ N, λ(E) = 0.

As a consequence of the previous theorem and Example 10.3.3(a), every
countable set E is measurable with λ(E) = 0. In particular, Q is measurable
with λ(Q) = 0. Another consequence of Theorem 10.3.5 is that every subset
of a set of measure zero is measurable.

THEOREM 10.3.6 Every interval I is measurable with λ(I) = m(I).

Proof. By Example 10.3.3(b), if I is a bounded interval, λ_*(I) = λ^*(I) =
m(I). Thus I is measurable with λ(I) = m(I). On the other hand, if I is
unbounded, then I ∩ [a, b] is a bounded interval for every a, b ∈ R, and thus
measurable. In this case,

$$\lambda(I) = \lim_{k\to\infty} \lambda(I \cap [-k, k]) = \lim_{k\to\infty} m(I \cap [-k, k]) = \infty.$$ □

THEOREM 10.3.7 For any a, b ∈ R and E ⊂ R,

λ^*(E ∩ [a, b]) + λ_*(E^c ∩ [a, b]) = b − a.

Proof. Let U be any open subset of R with E ∩ [a, b] ⊂ U . Then U^c ∩ [a, b] is
compact with U^c ∩ [a, b] ⊂ E^c ∩ [a, b]. Therefore,

m(U ) + λ_*(E^c ∩ [a, b]) ≥ m(U ∩ [a, b]) + m(U^c ∩ [a, b]) = b − a.

The last equality follows by Theorem 10.2.15. Taking the infimum over all
open sets U containing E ∩ [a, b] gives

λ^*(E ∩ [a, b]) + λ_*(E^c ∩ [a, b]) ≥ b − a.



To prove the reverse inequality, let K be a compact subset of E^c ∩ [a, b].
Then K^c is open with K^c ∩ [a, b] ⊃ E ∩ [a, b]. Therefore

λ^*(E ∩ [a, b]) + m(K ∩ [a, b]) ≤ m(K^c ∩ [a, b]) + m(K ∩ [a, b]) = b − a.

The last equality again follows by Theorem 10.2.15 with U = K^c . Thus taking
the supremum over all compact subsets K of E^c ∩ [a, b] gives

λ^*(E ∩ [a, b]) + λ_*(E^c ∩ [a, b]) ≤ b − a,

which combined with the above, proves the result. □

LEMMA 10.3.8 For any subset E of R,

$$\lambda_*(E) = \lim_{k\to\infty} \lambda_*(E \cap [-k, k]).$$

Proof. Since the proof is similar to that of Theorem 10.2.5, we leave it as an


exercise (Exercise 8). 

THEOREM 10.3.9 Suppose E1 , E2 are subsets of R. Then
(a) λ^*(E1 ∪ E2 ) + λ^*(E1 ∩ E2 ) ≤ λ^*(E1 ) + λ^*(E2 ), and
(b) λ_*(E1 ∪ E2 ) + λ_*(E1 ∩ E2 ) ≥ λ_*(E1 ) + λ_*(E2 ).

Proof. (a) If λ^*(Ei ) = ∞ for some i, i = 1, 2, then inequality (a) certainly
holds. Thus suppose λ^*(Ei ) < ∞ for i = 1, 2. Let ε > 0 be given. By the
definition of outer measure, for each i, we can choose an open set Ui containing
Ei such that

$$m(U_i) < \lambda^*(E_i) + \frac{\varepsilon}{2}.$$

Therefore

ε + λ^*(E1 ) + λ^*(E2 ) > m(U1 ) + m(U2 ),

which by Theorem 10.2.9

= m(U1 ∪ U2 ) + m(U1 ∩ U2 )
≥ λ^*(E1 ∪ E2 ) + λ^*(E1 ∩ E2 ).

The last inequality follows from the definition of outer measure. Since ε > 0
was arbitrary, inequality (a) follows.
(b) Let a, b ∈ R be arbitrary. By (a) applied to [a, b] ∩ Ei^c , we have

λ^*([a, b] ∩ E1^c ) + λ^*([a, b] ∩ E2^c )
≥ λ^*([a, b] ∩ (E1^c ∪ E2^c )) + λ^*([a, b] ∩ (E1^c ∩ E2^c ))
= λ^*([a, b] ∩ (E1 ∩ E2 )^c ) + λ^*([a, b] ∩ (E1 ∪ E2 )^c ).

But by Theorem 10.3.7, for any E ⊂ R,

λ^*([a, b] ∩ E^c ) = (b − a) − λ_*(E ∩ [a, b]).

Therefore,

λ_*(E1 ∩ [a, b]) + λ_*(E2 ∩ [a, b])
≤ λ_*([a, b] ∩ (E1 ∩ E2 )) + λ_*([a, b] ∩ (E1 ∪ E2 )).

For each k ∈ N, set Ik = [−k, k]. By the above,

λ_*(E1 ∩ Ik ) + λ_*(E2 ∩ Ik ) ≤ λ_*((E1 ∩ E2 ) ∩ Ik ) + λ_*((E1 ∪ E2 ) ∩ Ik )
≤ λ_*(E1 ∩ E2 ) + λ_*(E1 ∪ E2 ).

The result now follows by Lemma 10.3.8. □

Exercises 10.3
1. a. If E ⊂ R, a, b ∈ R, prove that $\lambda_*(E \cap (a, b)) = \lambda_*(E \cap [a, b])$, and
$\lambda^*(E \cap (a, b)) = \lambda^*(E \cap [a, b])$.
*b. If E ⊂ R, prove that $\lambda_*(E + x) = \lambda_*(E)$ and $\lambda^*(E + x) = \lambda^*(E)$ for
every x ∈ R, where E + x = {a + x : a ∈ E}.
2. Prove that every subset of a set of measure zero is measurable.
3. *Let E1 ⊂ R with λ∗ (E1 ) = 0. If E2 is a measurable subset of R, prove
that E1 ∩ E2 and E1 ∪ E2 are measurable.
4. Let E = [0, 1] \ Q. Prove that E is measurable and λ(E) = 1.
5. Let P denote the Cantor set in [0, 1].
a. Prove that λ∗ (P c ∩ [0, 1]) = 1.
b. Prove that λ∗ (P ) = 0.
6. *If E ⊂ R, prove that there exists a sequence $\{U_n\}$ of open sets with
$E \subset U_n$ for all n ∈ N such that $\lambda^*(E) = \lambda^*\left(\bigcap_n U_n\right)$.
7. Prove Theorem 10.3.2(b).
8. *Prove Lemma 10.3.8.
9. a. Prove that every compact set K is measurable with λ(K) = m(K).
b. Prove that every closed set is measurable.

10.4 Properties of Measurable Sets


In this section, we will study some of the basic properties of measurable sets.
Our first result proves that the union and intersection of two measurable sets
are again measurable.

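THEOREM 10.4.1 If $E_1$ and $E_2$ are measurable subsets of R, then

\[ E_1 \cap E_2 \quad \text{and} \quad E_1 \cup E_2 \]

are measurable with

\[ \lambda(E_1) + \lambda(E_2) = \lambda(E_1 \cup E_2) + \lambda(E_1 \cap E_2). \]

Proof. (a) We consider first the case where both $E_1$ and $E_2$ are bounded
measurable sets, in which case $E_1 \cap E_2$ and $E_1 \cup E_2$ are also bounded. Since

\[ \lambda(E_i) = \lambda_*(E_i) = \lambda^*(E_i), \quad i = 1, 2, \]

by Theorem 10.3.9,

\[
\begin{aligned}
\lambda(E_1) + \lambda(E_2) &\le \lambda_*(E_1 \cup E_2) + \lambda_*(E_1 \cap E_2) \\
&\le \lambda^*(E_1 \cup E_2) + \lambda^*(E_1 \cap E_2) \le \lambda(E_1) + \lambda(E_2).
\end{aligned}
\]

Therefore,

\[ \lambda_*(E_1 \cup E_2) + \lambda_*(E_1 \cap E_2) = \lambda^*(E_1 \cup E_2) + \lambda^*(E_1 \cap E_2), \]

and as a consequence,

\[ \lambda^*(E_1 \cup E_2) - \lambda_*(E_1 \cup E_2) = \lambda_*(E_1 \cap E_2) - \lambda^*(E_1 \cap E_2). \]

But for any bounded set E, $\lambda^*(E) - \lambda_*(E) \ge 0$. Thus equality can hold in the
above if and only if both sides are zero; namely,

\[ \lambda_*(E_1 \cup E_2) = \lambda^*(E_1 \cup E_2) \quad \text{and} \quad \lambda_*(E_1 \cap E_2) = \lambda^*(E_1 \cap E_2). \]

Therefore $E_1 \cap E_2$ and $E_1 \cup E_2$ are measurable, with

\[ \lambda(E_1) + \lambda(E_2) = \lambda(E_1 \cup E_2) + \lambda(E_1 \cap E_2). \]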

(b) Suppose one or both of the measurable sets E1 and E2 are unbounded.
Let I = [a, b] with a, b ∈ R. If both E1 and E2 are unbounded, then E1 ∩ I and
E2 ∩ I are measurable by definition. If one of the two sets, say E1 , is bounded,
then by part (a) E1 ∩ I is measurable. Thus in both cases, E1 ∩ I and E2 ∩ I
are bounded measurable sets. But then

(E1 ∩ E2 ) ∩ I and (E1 ∪ E2 ) ∩ I

are measurable for every closed and bounded interval I with

λ(E1 ∩ I) + λ(E2 ∩ I) = λ((E1 ∩ E2 ) ∩ I) + λ((E1 ∪ E2 ) ∩ I). (4)

Since E1 ∪ E2 is unbounded, E1 ∪ E2 is measurable by definition. Also, if
E1 ∩ E2 is unbounded, then it is measurable by definition. On the other hand,
if E1 ∩ E2 is bounded, choose I such that E1 ∩ E2 ⊂ I. In this case,

(E1 ∩ E2 ) ∩ I = E1 ∩ E2 .

Therefore, E1 ∩ E2 is measurable. Finally, to prove that

λ(E1 ) + λ(E2 ) = λ(E1 ∪ E2 ) + λ(E1 ∩ E2 ),

take Ik = [−k, k] in (4) and let k → ∞. 
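The identity of Theorem 10.4.1 can be checked numerically on finite unions of intervals, where the measure is simply total length. The following is a minimal sketch under that assumption; the helper names `normalize`, `measure`, `union`, and `intersection` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def normalize(intervals):
    """Merge a list of (lo, hi) pairs into disjoint, sorted intervals."""
    merged = []
    for lo, hi in sorted(i for i in intervals if i[0] < i[1]):
        if merged and lo <= merged[-1][1]:
            # Overlapping or touching: extend the last interval.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged

def measure(intervals):
    """Total length of a finite union of intervals (endpoints do not matter)."""
    return sum(hi - lo for lo, hi in normalize(intervals))

def union(a, b):
    return normalize(a + b)

def intersection(a, b):
    pieces = [(max(l1, l2), min(h1, h2))
              for l1, h1 in normalize(a) for l2, h2 in normalize(b)]
    return normalize(pieces)  # empty pieces (lo >= hi) are dropped

# Illustrative sets: E1 = [0, 2] ∪ [3, 5], E2 = [1, 4].
E1 = [(Fraction(0), Fraction(2)), (Fraction(3), Fraction(5))]
E2 = [(Fraction(1), Fraction(4))]

lhs = measure(E1) + measure(E2)
rhs = measure(union(E1, E2)) + measure(intersection(E1, E2))
```

Here both sides evaluate to 7: the union is [0, 5] and the intersection is [1, 2] ∪ [3, 4]. Exact rational arithmetic is used so that the comparison is not affected by rounding.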


Remark. As a consequence of part (a) of the previous theorem, if E is a
bounded measurable subset of R, then E ∩ I is measurable for every bounded
interval I. Also, if $E_1, \ldots, E_n$ are measurable sets, then by induction

\[ \bigcap_{k=1}^{n} E_k \quad \text{and} \quad \bigcup_{k=1}^{n} E_k \]

are measurable.

THEOREM 10.4.2 A subset E of R is measurable if and only if

\[ \lambda^*(E \cap [a, b]) + \lambda^*(E^c \cap [a, b]) \le b - a \]

for every a, b ∈ R.

Proof. Let E ⊂ R. Interchanging the roles of E and $E^c$ in Theorem 10.3.7
gives

\[ \lambda_*(E \cap [a, b]) + \lambda^*(E^c \cap [a, b]) = b - a \]

for any a, b ∈ R. If E is measurable, then E ∩ [a, b] is measurable for every
a, b ∈ R, and thus

\[ \lambda_*(E \cap [a, b]) = \lambda^*(E \cap [a, b]). \]

Thus if E is measurable, $\lambda^*(E \cap [a, b]) + \lambda^*(E^c \cap [a, b]) = b - a$.
Conversely, suppose E satisfies $\lambda^*(E \cap [a, b]) + \lambda^*(E^c \cap [a, b]) \le b - a$ for
every a, b ∈ R. Since we always have

\[ \lambda_*(E \cap [a, b]) + \lambda^*(E^c \cap [a, b]) = b - a, \]

we obtain

\[ \lambda^*(E \cap [a, b]) - \lambda_*(E \cap [a, b]) \le 0. \]

Since $\lambda_*(E \cap [a, b]) \le \lambda^*(E \cap [a, b])$, the above can hold if and only if

\[ \lambda_*(E \cap [a, b]) = \lambda^*(E \cap [a, b]). \]

Thus E ∩ [a, b] is measurable for every a, b ∈ R. Hence E is measurable.



COROLLARY 10.4.3 A set E is measurable if and only if $E^c$ is measurable.

Proof. This is an immediate consequence of the fact that E satisfies Theorem
10.4.2 if and only if $E^c$ satisfies Theorem 10.4.2.

Our next goal is to show that the union and intersection of a countable
collection of measurable sets is again measurable. For the proof of this result
we require the following theorem.

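THEOREM 10.4.4
(a) If $\{E_n\}_{n=1}^{\infty}$ is a sequence of subsets of R, then

\[ \lambda^*\left( \bigcup_{n=1}^{\infty} E_n \right) \le \sum_{n=1}^{\infty} \lambda^*(E_n). \]

(b) If $\{E_n\}_{n=1}^{\infty}$ is a sequence of pairwise disjoint subsets of R, then

\[ \lambda_*\left( \bigcup_{n=1}^{\infty} E_n \right) \ge \sum_{n=1}^{\infty} \lambda_*(E_n). \]

Proof. (a) If $\sum_{n=1}^{\infty} \lambda^*(E_n) = \infty$, then the conclusion in (a) is certainly true.
Thus we assume that $\sum_{n=1}^{\infty} \lambda^*(E_n) < \infty$. Let ε > 0 be given. For each n ∈ N,
there exists an open set $U_n$ with $E_n \subset U_n$ such that

\[ m(U_n) < \lambda^*(E_n) + \frac{\epsilon}{2^n}. \]

Let $U = \bigcup_{n=1}^{\infty} U_n$. Then U is an open subset of R with $E = \bigcup_{n=1}^{\infty} E_n \subset U$.
Thus, by Theorem 10.2.6,

\[
\begin{aligned}
\lambda^*(E) \le m\left( \bigcup_{n=1}^{\infty} U_n \right)
&\le \sum_{n=1}^{\infty} m(U_n) < \sum_{n=1}^{\infty} \left( \lambda^*(E_n) + \frac{\epsilon}{2^n} \right) \\
&= \sum_{n=1}^{\infty} \lambda^*(E_n) + \epsilon.
\end{aligned}
\]

Since this holds for all ε > 0, the result follows.

(b) Suppose the sets $E_n$, n = 1, 2, ... are pairwise disjoint. Since $E_1 \cap E_2 = \emptyset$,
by Theorem 10.3.9(b)

\[ \lambda_*(E_1) + \lambda_*(E_2) \le \lambda_*(E_1 \cup E_2). \]

By induction,

\[ \sum_{k=1}^{n} \lambda_*(E_k) \le \lambda_*\left( \bigcup_{k=1}^{n} E_k \right) \le \lambda_*\left( \bigcup_{k=1}^{\infty} E_k \right). \]

Since the above holds for all n ∈ N, letting n → ∞ gives the desired result.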
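The ε/2^n bookkeeping in part (a) can be seen concretely for a countable set such as Q ∩ [0, 1]: covering the n-th point by an interval of length ε/2^n keeps the total length below ε. The sketch below is only an illustration; the enumeration order and the function names `rationals_in_unit_interval` and `cover` are assumptions made here, not notation from the text.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` terms of one enumeration of Q ∩ [0, 1] (by increasing denominator)."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """Open interval of length eps / 2**n centred at the n-th point, n = 1, 2, ..."""
    return [(x - eps / 2 ** (n + 1), x + eps / 2 ** (n + 1))
            for n, x in enumerate(points, start=1)]

eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
# Sum of the lengths eps/2 + eps/4 + ... + eps/2**50, which stays below eps.
total_length = sum(hi - lo for lo, hi in cover(points, eps))
```

However many points are covered, the partial sums of the geometric series never reach ε, which is the mechanism behind the bound in (a) when each $\lambda^*(E_n)$ is zero.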


THEOREM 10.4.5 Let $\{E_n\}_{n=1}^{\infty}$ be a sequence of measurable sets. Then
$\bigcup_{n=1}^{\infty} E_n$ and $\bigcap_{n=1}^{\infty} E_n$ are measurable with

(a) $\lambda\left( \bigcup_{n=1}^{\infty} E_n \right) \le \sum_{n=1}^{\infty} \lambda(E_n)$.
(b) If in addition the sets $E_n$, n = 1, 2, ..., are pairwise disjoint, then
$\lambda\left( \bigcup_{n=1}^{\infty} E_n \right) = \sum_{n=1}^{\infty} \lambda(E_n)$.

Proof. Let $E = \bigcup_{n=1}^{\infty} E_n$. Without loss of generality we can assume that E
(and hence all the sets $E_n$) is bounded. If not, we consider

\[ E \cap [a, b] = \bigcup_{n=1}^{\infty} (E_n \cap [a, b]) \]

for a, b ∈ R.
Set $A_1 = E_1$, and for each n ∈ N, n ≥ 2, set

\[ A_n = E_n \setminus \left( \bigcup_{k=1}^{n-1} E_k \right) = E_n \cap \left( \bigcup_{k=1}^{n-1} E_k \right)^c. \]

Since finite unions, intersections, and complements of measurable sets are
again measurable, $A_n$ is measurable for each n ∈ N. Furthermore, the sets
$A_n$, n ∈ N, are pairwise disjoint with

\[ \bigcup_{n=1}^{\infty} A_n = E. \]

Thus by Theorem 10.4.4(a) and (b),

\[ \sum_{n=1}^{\infty} \lambda(A_n) \le \lambda_*(E) \le \lambda^*(E) \le \sum_{n=1}^{\infty} \lambda(A_n). \]

Therefore $\lambda_*(E) = \lambda^*(E)$, and thus since E is bounded, E is measurable.
Furthermore, by Theorem 10.4.4

\[ \lambda(E) \le \sum_{n=1}^{\infty} \lambda(E_n), \]

with equality if the sets $E_n$, n = 1, 2, ..., are pairwise disjoint.

By Corollary 10.4.3, $E_n^c$ is measurable for each n. Thus by the above,
$\bigcup_{n=1}^{\infty} E_n^c$ is measurable. Since

\[ \bigcap_{n=1}^{\infty} E_n = \left( \bigcup_{n=1}^{\infty} E_n^c \right)^c, \]

the intersection is also measurable.

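THEOREM 10.4.6
(a) If $\{E_n\}_{n=1}^{\infty}$ is a sequence of measurable sets with $E_1 \subset E_2 \subset \cdots$, then

\[ \lambda\left( \bigcup_{n=1}^{\infty} E_n \right) = \lim_{n \to \infty} \lambda(E_n). \]

(b) If $\{E_n\}_{n=1}^{\infty}$ is a sequence of measurable sets with $E_1 \supset E_2 \supset \cdots$ and
$\lambda(E_1) < \infty$, then

\[ \lambda\left( \bigcap_{n=1}^{\infty} E_n \right) = \lim_{n \to \infty} \lambda(E_n). \]

Proof. (a) Let $E = \bigcup_{n=1}^{\infty} E_n$. By the previous theorem, E is measurable. If
$\lambda(E_k) = \infty$ for some k, then $\lambda(E_n) = \infty$ for all n ≥ k and λ(E) = ∞, thus
proving the result. Hence we assume that $\lambda(E_n) < \infty$ for all n.
Set $E_0 = \emptyset$, and for n ∈ N, let $A_n = E_n \setminus E_{n-1}$. Then each $A_n$ is measurable,
the collection $\{A_n\}$ is pairwise disjoint, and

\[ \bigcup_{n=1}^{\infty} A_n = E. \]

Thus by Theorem 10.4.5(b),

\[ \lambda(E) = \sum_{n=1}^{\infty} \lambda(A_n) = \lim_{N \to \infty} \sum_{n=1}^{N} \lambda(E_n \setminus E_{n-1}). \]

But $E_n = (E_n \setminus E_{n-1}) \cup E_{n-1}$. Since the sets $E_n \setminus E_{n-1}$ and $E_{n-1}$ are disjoint,

\[ \lambda(E_n) = \lambda(E_n \setminus E_{n-1}) + \lambda(E_{n-1}). \]

Therefore

\[ \sum_{n=1}^{N} \lambda(E_n \setminus E_{n-1}) = \sum_{n=1}^{N} [\lambda(E_n) - \lambda(E_{n-1})] = \lambda(E_N) - \lambda(E_0) = \lambda(E_N), \]

from which the result now follows.

(b) Let $E = \bigcap_{n=1}^{\infty} E_n$. Again by the previous theorem, E is measurable.
Since

\[ E_1 = E \cup \bigcup_{n=1}^{\infty} (E_n \setminus E_{n+1}), \]

which is a union of pairwise disjoint measurable sets, by Theorem 10.4.5(b)

\[
\begin{aligned}
\lambda(E_1) &= \lambda(E) + \sum_{n=1}^{\infty} \lambda(E_n \setminus E_{n+1}) \\
&= \lambda(E) + \lim_{N \to \infty} \sum_{n=1}^{N} [\lambda(E_n) - \lambda(E_{n+1})] \\
&= \lambda(E) + \lambda(E_1) - \lim_{N \to \infty} \lambda(E_{N+1}).
\end{aligned}
\]

Since $\lambda(E_1) < \infty$, $\lambda(E) = \lim_{N \to \infty} \lambda(E_{N+1})$.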
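For a concrete check of both parts, consider nested intervals, whose measure is their length by Theorem 10.3.6. The particular sets $E_n$ and $F_n$ below are chosen only for illustration.

```latex
% Part (a): an increasing sequence of closed intervals.
\[
E_n = \left[0,\, 1 - \tfrac{1}{n}\right], \qquad
\bigcup_{n=1}^{\infty} E_n = [0, 1), \qquad
\lambda([0, 1)) = 1 = \lim_{n \to \infty} \left(1 - \tfrac{1}{n}\right).
\]
% Part (b): a decreasing sequence with \lambda(F_1) = 1 < \infty.
\[
F_n = \left[0,\, \tfrac{1}{n}\right], \qquad
\bigcap_{n=1}^{\infty} F_n = \{0\}, \qquad
\lambda(\{0\}) = 0 = \lim_{n \to \infty} \tfrac{1}{n}.
\]
```

In the second line the hypothesis $\lambda(F_1) < \infty$ of part (b) is satisfied, so the limit formula applies.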

The Sigma-Algebra of Measurable Sets


Let M denote the collection of measurable subsets of R. By Theorem 10.3.6,
every interval I is in M with λ(I) = m(I). Since every open set U can be
expressed as a finite or countable union of pairwise disjoint open intervals, by
Theorem 10.4.5 every open set U is measurable with

λ(U ) = m(U ).

Since the complement of every measurable set is measurable, M also contains


all the closed subsets of R. In particular, every compact set K is measurable
with λ(K) = m(K), where m(K) is as defined in Definition 10.2.10. These
by no means exhaust the measurable sets. Any set obtained from a countable
union or intersection of open sets or closed sets, or of sets obtained in this
manner, is again measurable.
The collection M is very large. To illustrate just how large we use the fact
that the Cantor set P has measure zero. Thus any subset of the Cantor set has
outer measure zero, and as a consequence of Theorem 10.3.5 is measurable.
By Property 6 of the Cantor set (Section 2.5), P has the same cardinality
(Definition 1.7.1) as the set of all sequences of 0's and 1's. By Miscellaneous
Exercise 5 of Chapter 1, the set of all sequences of 0’s and 1’s is equivalent
to [0, 1], and this set has the same cardinality as all of R. Thus the set of
all subsets of P is equivalent (not equal) to the set of all subsets of R. As
a consequence, in the terminology of equivalence of sets, M has the same
cardinality as the set of all subsets of R. However, nonmeasurable subsets of
R do exist. The construction of such a set will be outlined in the miscellaneous
exercises.
We conclude this section by summarizing some of the properties of M.

THEOREM 10.4.7
(a) If E ∈ M, then E c ∈ M.
(b) ∅, R ∈ M.
(c) If $E_n \in \mathcal{M}$, n = 1, 2, ..., then

\[ \bigcup_{n=1}^{\infty} E_n \in \mathcal{M} \quad \text{and} \quad \bigcap_{n=1}^{\infty} E_n \in \mathcal{M}. \]

(d) Every interval I ∈ M with λ(I) = m(I).


(e) If E ∈ M, then E + x ∈ M for all x ∈ R with

λ(E + x) = λ(E).

Proof. The result (a) is Corollary 10.4.3, whereas (b) follows from Theorem
10.3.6 and Corollary 10.4.3. The statement (c) is Theorem 10.4.5, whereas (d)
is Theorem 10.3.6. The proof of (e) follows from Exercise 1(b) of the previous
section. 
Any collection A of subsets of a set X satisfying (a), (b), and (c) of the
previous theorem is called a sigma-algebra (σ-algebra) of sets. The σ denotes
that the collection A is closed under countable unions.
Remark. An alternate approach to the theory of measure is due to Constantin
Carathéodory (1873–1950) in which a subset E of R is said to be measurable
if

\[ \lambda^*(E \cap T) + \lambda^*(E^c \cap T) = \lambda^*(T) \tag{5} \]

for every subset T of R. Since $T = (E \cap T) \cup (E^c \cap T)$, by Theorem 10.3.9 one
always has $\lambda^*(T) \le \lambda^*(E \cap T) + \lambda^*(E^c \cap T)$. Thus E satisfies (5) if and only
if

\[ \lambda^*(E \cap T) + \lambda^*(E^c \cap T) \le \lambda^*(T). \]

The advantage to this approach is that it does not require the concept of inner
measure, and it includes both unbounded and bounded sets simultaneously.
If a subset E of R satisfies (5), taking T = [a, b], a, b ∈ R gives

\[ \lambda^*(E \cap [a, b]) + \lambda^*(E^c \cap [a, b]) = \lambda^*([a, b]) = b - a. \]

Thus E satisfies Theorem 10.4.2 and hence is measurable. In Exercise 6, the
reader will be asked to prove that if E is measurable as defined in the text,
then E satisfies (5) for every subset T of R.

Exercises 10.4
1. Find a sequence $\{E_n\}_{n=1}^{\infty}$ of measurable sets with $E_1 \supset E_2 \supset \cdots$ such
that
\[ \lambda\left( \bigcap_{n=1}^{\infty} E_n \right) \ne \lim_{n \to \infty} \lambda(E_n). \]
2. *If E is a measurable subset of R, prove that given ε > 0, there exists
an open set U ⊃ E and a closed set F ⊂ E such that λ(U \ E) < ε and
λ(E \ F) < ε.
3. Let E be a bounded subset of R. Prove that E is measurable if and only
if given ε > 0 there exists an open set U and a compact set K with
K ⊂ E ⊂ U such that λ(U \ K) < ε.
4. *If $E_1, E_2$ are measurable subsets of [0, 1], and if $\lambda(E_1) = 1$, prove that
$\lambda(E_1 \cap E_2) = \lambda(E_2)$.
5. Suppose $E_1$ is a nonmeasurable subset of [0, 1]. Set $E = E_1 \cup (1, \infty)$.
Prove that E is nonmeasurable but that
$\lambda_*(E) = \lambda^*(E) = \infty$.
6. *(Carathéodory) Prove that a subset E of R is measurable if and only
if $\lambda^*(E \cap T) + \lambda^*(E^c \cap T) = \lambda^*(T)$ for every subset T of R.

10.5 Measurable Functions


In our discussion of Lebesgue's approach to integration, we defined the sets

\[ E_j = \left\{ x \in [a, b] : (j - 1)\frac{\beta}{n} \le f(x) < j\frac{\beta}{n} \right\}, \]

where f : [a, b] → [0, β) is a bounded real-valued function. As we saw in
Section 10.1, in order to define the integral of f by partitioning the range, it
is necessary that the sets $E_j$, j = 1, ..., n, be measurable. Thus we make the
following definition.

DEFINITION 10.5.1 Let f be a real-valued function defined on [a, b]. The


function f is said to be measurable if for every s ∈ R, the set

{x ∈ [a, b] : f (x) > s}

is measurable. More generally, if E is a measurable subset of R, a function


f : E → R is measurable if

{x ∈ E : f (x) > s}

is measurable for every s ∈ R.


Since $f^{-1}((s, \infty)) = \{x : f(x) > s\}$, f is measurable if and only if
$f^{-1}((s, \infty))$ is a measurable set for every s ∈ R. We illustrate the idea of
a measurable function with the following examples.

EXAMPLES 10.5.2 (a) Let A be a measurable subset of R and let $\chi_A$
denote the characteristic function of A. Then

\[
\{x : \chi_A(x) > s\} =
\begin{cases}
\mathbb{R}, & s < 0, \\
A, & 0 \le s < 1, \\
\emptyset, & s \ge 1.
\end{cases}
\]

Since each of the sets ∅, A, and R is measurable, $\chi_A$ is a measurable function
on R.
(b) Let f : [0, 1] → R be defined by

\[
f(x) =
\begin{cases}
0, & x \in \mathbb{Q} \cap [0, 1], \\
x, & x \in [0, 1] \setminus \mathbb{Q}.
\end{cases}
\]

Then

\[
\{x \in [0, 1] : f(x) > s\} =
\begin{cases}
[0, 1], & \text{if } s < 0, \\
\mathbb{Q}^c \cap (s, 1), & \text{if } 0 \le s < 1, \\
\emptyset, & \text{if } s \ge 1.
\end{cases}
\]

Again, since each of the sets is a measurable subset of R, f is measurable.

Properties of Measurable Functions


We now consider some properties of measurable functions. Our first result
provides several equivalent conditions for measurability.

THEOREM 10.5.3 Let f be a real-valued function defined on a measurable
set E. Then f is measurable if and only if any of the following hold:
(a) {x : f(x) > s} is measurable for every s ∈ R.
(b) {x : f(x) ≥ s} is measurable for every s ∈ R.
(c) {x : f(x) < s} is measurable for every s ∈ R.
(d) {x : f(x) ≤ s} is measurable for every s ∈ R.

Proof. The set of (d) is the complement of the set in (a). Thus by Corollary
10.4.3, one is measurable if and only if the other is. Similarly for the sets of
(b) and (c). Thus it suffices to prove that (a) is equivalent to (b).
Suppose (a) holds. For each n ∈ N, let

\[ E_n = \left\{ x : f(x) > s - \frac{1}{n} \right\}. \]

By (a), $E_n$ is measurable for all n ∈ N. But

\[ \{x : f(x) \ge s\} = \bigcap_{n=1}^{\infty} E_n, \]

which is measurable by Theorem 10.4.5. Conversely, since

\[ \{x : f(x) > s\} = \bigcup_{n=1}^{\infty} \left\{ x : f(x) \ge s + \frac{1}{n} \right\}, \]

if (b) holds, then by Theorem 10.4.5, (a) also holds.

THEOREM 10.5.4 Suppose f, g are measurable real-valued functions defined
on a measurable set E. Then
(a) f + c and cf are measurable for every c ∈ R.
(b) f + g is measurable.
(c) fg is measurable.
(d) 1/g is measurable provided g(x) ≠ 0 for all x ∈ E.
Proof. The proof of (a) is straightforward and is omitted. The proof of (a) also
follows from (b) and (c) upon showing that constant functions are measurable.
(b) Let s ∈ R. Then f(x) + g(x) > s if and only if f(x) > s − g(x). If
x ∈ E is such that f(x) > s − g(x), then there exists r ∈ Q such that
f(x) > r > s − g(x).
Let $\{r_n\}_{n=1}^{\infty}$ be an enumeration of Q. Then

\[ \{x : f(x) + g(x) > s\} = \bigcup_{n=1}^{\infty} \left( \{x : f(x) > r_n\} \cap \{x : r_n > s - g(x)\} \right). \]

Since f and g are measurable functions,
$\{x : f(x) > r_n\}$ and $\{x : r_n > s - g(x)\}$
are measurable sets for every n ∈ N. Thus their intersection and the resulting
union are also measurable. Therefore f + g is measurable.
(c) To prove (c) we first show that $f^2$ is measurable. If s < 0, then
$\{x \in E : f^2(x) > s\} = E$,
which is measurable. Assume s ≥ 0. Then

\[ \{x : f^2(x) > s\} = \{x : f(x) > \sqrt{s}\} \cup \{x : f(x) < -\sqrt{s}\}. \]

But each of these two sets is measurable. Thus their union is measurable.
Since

\[ fg = \frac{1}{4}\left[ (f + g)^2 - (f - g)^2 \right], \]

the function fg is measurable.
(d) The proof of (d) is left as an exercise (Exercise 5).

THEOREM 10.5.5 Every continuous real-valued function on [a, b] is measurable.

Proof. Exercise 7.

A Property Holding Almost Everywhere


A very important concept in the study of measure theory involves the idea of
a property being true for all x except for a set of measure zero. This idea was
previously encountered in the statement of Lebesgue’s theorem in Chapter
6; namely, a bounded real-valued function f on [a, b] is Riemann integrable
if and only if {x : f is not continuous at x} has measure zero. An equivalent
formulation is that f is continuous except on a set of measure zero. In this
section, we will encounter several other properties that are assumed to hold
except on sets of measure zero.

DEFINITION 10.5.6 A property P is said to hold almost everywhere


(abbreviated a.e.) if the set of points where P does not hold has measure zero,
i.e.,
λ({x : P does not hold }) = 0.

Remark. The assertion that a set is of measure zero includes the assertion
that it is measurable. This however is not necessary. If instead we only require
that $\lambda^*(\{x : P \text{ does not hold}\}) = 0$, then by Theorem 10.3.5 the set
{x : P does not hold} is in fact measurable.
We will illustrate the concept of a property holding almost everywhere by
means of the following examples.

EXAMPLES 10.5.7 (a) Suppose f and g are real-valued functions defined
on [a, b]. The functions f and g are said to be equal almost everywhere,
denoted f = g a.e., if
{x ∈ [a, b] : f(x) ≠ g(x)}
has measure zero. For example, if g(x) = 1 for all x ∈ [0, 1] and

\[
f(x) =
\begin{cases}
1, & x \in [0, 1] \setminus \mathbb{Q}, \\
0, & x \in [0, 1] \cap \mathbb{Q},
\end{cases}
\]

then {x ∈ [0, 1] : f(x) ≠ g(x)} = [0, 1] ∩ Q, which has measure zero. Therefore
f = g a.e.
(b) In Theorem 10.5.4 we proved that if g is a real-valued measurable
function on [a, b] with g(x) ≠ 0 for all x ∈ [a, b], then 1/g is also measurable
on [a, b]. Suppose we replace the hypothesis g(x) ≠ 0 for all x ∈ [a, b] with
g ≠ 0 a.e.; that is, the set

E = {x ∈ [a, b] : g(x) = 0}

has measure zero. If we now define f by

\[
f(x) =
\begin{cases}
g(x), & x \in [a, b] \setminus E, \\
1, & x \in E,
\end{cases}
\]

then f(x) ≠ 0 for all x ∈ [a, b] and f(x) = g(x) except for x ∈ E, which has
measure zero. Thus f = g a.e. on [a, b]. As a consequence of our next theorem,
the function f will also be measurable on [a, b].
(c) A real-valued function f on [a, b] is continuous almost everywhere
if
{x ∈ [a, b] : f is not continuous at x}
has measure zero. As in Example 6.1.14, consider the function f on [0, 1]
defined by

\[
f(x) =
\begin{cases}
1, & x = 0, \\
0, & \text{if } x \text{ is irrational}, \\
\dfrac{1}{n}, & \text{if } x = \dfrac{m}{n} \text{ in lowest terms},\ x \ne 0.
\end{cases}
\]

As was shown in Example 4.2.2(g), the function f is continuous at every
irrational number in [0, 1], and discontinuous at every rational number in
[0, 1]. Therefore,

λ({x ∈ [0, 1] : f is not continuous at x}) = λ(Q ∩ [0, 1]) = 0.

Thus f is continuous a.e. on [0, 1].


(d) Let f and $f_n$, n = 1, 2, ... be a sequence of real-valued functions defined
on [a, b]. The sequence $\{f_n\}$ is said to converge almost everywhere to f,
denoted $f_n \to f$ a.e., if

\[ \{x \in [a, b] : \{f_n(x)\} \text{ does not converge to } f(x)\} \]

has measure zero. To illustrate this, consider the sequence $\{f_n\}$ defined in
Example 8.1.2(c) as follows: Let $\{x_k\}$ be an enumeration of Q ∩ [0, 1]. For
each n ∈ N, define $f_n$ on [0, 1] by

\[
f_n(x) =
\begin{cases}
0, & x = x_k,\ 1 \le k \le n, \\
1, & \text{otherwise}.
\end{cases}
\]

Then

\[
\lim_{n \to \infty} f_n(x) = f(x) =
\begin{cases}
0, & \text{if } x \text{ is rational}, \\
1, & \text{if } x \text{ is irrational}.
\end{cases}
\]

Thus $\{x \in [0, 1] : \{f_n(x)\} \text{ does not converge to } 1\} = \mathbb{Q} \cap [0, 1]$, which has
measure zero. Hence $f_n \to 1$ a.e. on [0, 1].

One of the key results needed in the sequel is the following:



THEOREM 10.5.8 Suppose f and g are real-valued functions defined on a
measurable set A. If f is measurable and g = f a.e., then g is measurable on
A.

Proof. Let E = {x ∈ A : g(x) ≠ f(x)}. Then λ(E) = 0. Let B = A \ E. Since
E is measurable, so is B. Also on B, g(x) = f(x). Fix s ∈ R. Then

\[
\begin{aligned}
\{x \in A : g(x) > s\} &= \{x \in B : g(x) > s\} \cup \{x \in E : g(x) > s\} \\
&= \{x \in B : f(x) > s\} \cup \{x \in E : g(x) > s\}.
\end{aligned}
\]

Since $E_1 = \{x \in E : g(x) > s\}$ is a subset of E and λ(E) = 0, the set $E_1$ is
measurable. Also, since f is measurable, {x ∈ B : f(x) > s} is measurable.
Therefore {x : g(x) > s} is measurable, and thus g is measurable.

THEOREM 10.5.9 Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable
functions defined on a measurable set A such that $\{f_n(x)\}_{n=1}^{\infty}$ is bounded for
every x ∈ A. Let

\[ \varphi(x) = \sup\{f_n(x) : n \in \mathbb{N}\} \quad \text{and} \quad \psi(x) = \inf\{f_n(x) : n \in \mathbb{N}\}. \]

Then ϕ and ψ are measurable on A.

Proof. The result follows by Theorem 10.4.5, and the fact that for every
s ∈ R,

\[
\{x : \varphi(x) > s\} = \bigcup_{n=1}^{\infty} \{x : f_n(x) > s\}
\quad \text{and} \quad
\{x : \psi(x) < s\} = \bigcup_{n=1}^{\infty} \{x : f_n(x) < s\}.
\]
n=1

COROLLARY 10.5.10 Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable
functions defined on a measurable set A, and let f be a real-valued function
on A. If $f_n \to f$ a.e. on A, then f is measurable on A.

Proof. Let $E = \{x : \{f_n(x)\} \text{ does not converge to } f(x)\}$. By hypothesis
λ(E) = 0. Set

\[
g_n(x) =
\begin{cases}
f_n(x), & x \in A \setminus E, \\
0, & x \in E.
\end{cases}
\]

Then $g_n = f_n$ a.e. and thus is measurable. Also, $\lim_{n \to \infty} g_n(x) = g(x)$ exists for
all x ∈ A. But

\[ g(x) = \lim_{n \to \infty} g_n(x) = \limsup_{n \to \infty} g_n(x) = \inf_n \sup\{g_k(x) : k \ge n\}. \]

Since $\{g_n(x)\}$ converges, and hence is bounded, for every x ∈ A, by the previous
theorem each of the functions

\[ F_n(x) = \sup\{g_k(x) : k \ge n\} \quad \text{and} \quad g(x) = \inf\{F_n(x) : n \in \mathbb{N}\} \]

is measurable on A. Finally, since f = g a.e., f itself is measurable.


Suppose $\{f_n\}$ is a sequence of measurable functions on [a, b] such that
$f_n \to f$ a.e. Then by definition there exists a subset E of [a, b] such that
λ([a, b] \ E) = 0, and $\lim_{n \to \infty} f_n(x) = f(x)$ for all x ∈ E. Exercise 15 provides a
significant strengthening of this result. There you will be asked to prove that
given ε > 0, there exists a measurable set E ⊂ [a, b], such that λ([a, b] \ E) < ε,
and $\{f_n\}$ converges uniformly to f on E. This result is known as Egorov's
theorem.

Exercises 10.5
1. *Let f be defined on [0, 1] by

\[
f(x) =
\begin{cases}
0, & x = 0, \\
\dfrac{1}{x}, & 0 < x < 1, \\
2, & x = 1.
\end{cases}
\]

Prove directly that f is measurable.
2. Let f be a real-valued function on a measurable set A with finite range,
i.e., Range f = {α1, ..., αn}, αj ∈ R. Prove that f is measurable if and
only if $f^{-1}(\alpha_j)$ is measurable for all j = 1, ..., n.
3. Let A be a measurable subset of R, and let f : A → R be measurable.
Prove that for each n ∈ N, the function $f_n$ defined by

\[
f_n(x) =
\begin{cases}
f(x), & \text{if } |f(x)| \le n, \\
n, & \text{if } |f(x)| > n,
\end{cases}
\]

is measurable on A.
4. If f is measurable on [a, b], prove that f + c and cf are measurable on
[a, b] for every c ∈ R.
5. *If g is measurable on [a, b] with g(x) ≠ 0 for all x ∈ [a, b], prove that
1/g is measurable on [a, b].
6. Let f be a real-valued function on [a, b]. Prove that f is measurable if
and only if $f^{-1}(U)$ is a measurable subset of [a, b] for every open subset
U of R.
7. *Prove that every continuous real-valued function f on [a, b] is measurable.
8. Let A be a measurable subset of R. If g : A → R is measurable and
f : R → R is continuous, prove that f ◦ g is measurable.
9. If f is monotone on [a, b] and g : R → [a, b] is measurable, prove that
f ◦ g is measurable.
10. Let E be a measurable subset of R, and let f be a measurable function
on E. Define the functions $f^+$ and $f^-$ on E as follows:
$f^+(x) = \max\{f(x), 0\}$, $f^-(x) = \max\{-f(x), 0\}$.
*a. Prove directly that $f^+$ and $f^-$ are nonnegative measurable functions
on E with $f(x) = f^+(x) - f^-(x)$.
b. Prove that $|f(x)| = f^+(x) + f^-(x)$ and that |f| is measurable.
*c. If f is a real-valued function on [a, b] such that |f| is measurable on
[a, b], is f measurable on [a, b]?
d. If $f(x) = \frac{1}{2} + \cos x$, x ∈ [0, 2π], find $f^+(x)$ and $f^-(x)$.
11. Let A be a measurable subset of R and let $\{f_n\}$ be a sequence of measurable
functions on A. Let $E = \{x \in A : \{f_n(x)\}_{n=1}^{\infty} \text{ converges}\}$. Prove
that E is measurable.
12. *Let f be a bounded measurable function on [0, 1] and let $\{f_n\}$ be defined
on [0, 1] by $f_n(x) = x^n f(x)$. Prove that each $f_n$ is measurable and that
$\{f_n\}$ converges to 0 almost everywhere on [0, 1].
13. Let $\{x_k\}$ be an enumeration of the rational numbers in [0, 1]. For each
n ∈ N let

\[
f_n(x) =
\begin{cases}
1, & \text{if } x = x_k,\ 1 \le k \le n, \\
0, & \text{otherwise}.
\end{cases}
\]

Show that $f_n$ is measurable for each n ∈ N, and that $\{f_n\}$ converges to
0 almost everywhere on [0, 1].
14. *If f is differentiable on [a, b], prove that f′ is measurable on [a, b].
15. *Egorov's Theorem: Let $\{f_n\}$ be a sequence of measurable functions
on [a, b] such that $f_n \to f$ a.e. on [a, b]. Given ε > 0, prove that there
exists a measurable subset E of [a, b] such that λ([a, b] \ E) < ε and $\{f_n\}$
converges uniformly to f on E.
16. Construct a sequence $\{f_n\}$ of measurable functions on [0, 1] such that
$\{f_n(x)\}$ converges for each x ∈ [0, 1] but that $\{f_n\}$ does not converge
uniformly on any measurable set E ⊂ [0, 1] with λ([0, 1] \ E) = 0.
17. Let f be a measurable function on [a, b]. Prove that the function
λ({x ∈ [a, b] : f(x) > t}), t > 0,
is nonincreasing and right continuous.
18. If $\{f_n\}$ is a nondecreasing sequence of measurable functions on [a, b] and
$f = \lim_{n \to \infty} f_n$, then for all t > 0,

\[ \lim_{n \to \infty} \lambda(\{x \in [a, b] : |f_n(x)| > t\}) = \lambda(\{x \in [a, b] : |f(x)| > t\}). \]

10.6 Lebesgue Integral of a Bounded Function


There are many different approaches to the development of the Lebesgue
integral. One is the method outlined in the first section of this chapter which
is pursued further in Exercise 1. The drawback to this approach is that it a
priori assumes that f is a measurable function. The approach that we will
follow is patterned on the Darboux approach to the Riemann integral.

DEFINITION 10.6.1 Let E be a measurable subset of R. A measurable
partition of E is a finite collection $\mathcal{P} = \{E_1, \ldots, E_n\}$ of pairwise disjoint
measurable subsets of E such that

\[ \bigcup_{k=1}^{n} E_k = E. \]

Suppose $P = \{x_0, x_1, \ldots, x_n\}$ is a point partition of [a, b] as considered
in Chapter 6. Set $E_1 = [x_0, x_1]$, and for k = 2, ..., n, set $E_k = (x_{k-1}, x_k]$.
Then the collection $\mathcal{P} = \{E_1, \ldots, E_n\}$ is a measurable partition of [a, b]. A
measurable partition of [a, b] however need not consist of intervals. For example
if $E_1 = [a, b] \cap \mathbb{Q}$ and $E_2 = [a, b] \setminus E_1$, then $\{E_1, E_2\}$ is a measurable partition
of [a, b].
As for the Riemann integral, if f is a bounded real-valued function on
[a, b], and $\mathcal{P} = \{E_1, \ldots, E_n\}$ is a measurable partition of [a, b], we define the
lower and upper Lebesgue sums of f with respect to $\mathcal{P}$, denoted $L_L(\mathcal{P}, f)$
and $U_L(\mathcal{P}, f)$ respectively, by

\[ L_L(\mathcal{P}, f) = \sum_{k=1}^{n} m_k \lambda(E_k) \quad \text{and} \quad U_L(\mathcal{P}, f) = \sum_{k=1}^{n} M_k \lambda(E_k), \]

where $m_k = \inf\{f(x) : x \in E_k\}$ and $M_k = \sup\{f(x) : x \in E_k\}$. Clearly
$L_L(\mathcal{P}, f) \le U_L(\mathcal{P}, f)$ for every measurable partition $\mathcal{P}$ of [a, b]. As for the
Riemann integral, we could now define the upper and lower Lebesgue integrals
of f, and then define a function to be Lebesgue integrable if and only if these
two quantities are equal. The following theorem however shows that this is
unnecessary.
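To make the lower and upper Lebesgue sums concrete, the sketch below evaluates them for the illustrative choice f(x) = x² on [0, 1], using a partition obtained by slicing the range into n equal levels, in the spirit of the sets $E_j$ from Section 10.5. The function f, the value of n, and the name `range_partition_sums` are assumptions made for this example; because f is increasing, each set of the partition happens to be an interval whose measure is its length.

```python
import math

def range_partition_sums(n):
    """Lower and upper Lebesgue sums of f(x) = x**2 on [0, 1] for the
    partition E_k = {x : (k-1)/n <= f(x) < k/n}, k = 1..n (x = 1 joins E_n)."""
    lower = upper = 0.0
    for k in range(1, n + 1):
        # E_k runs from sqrt((k-1)/n) to sqrt(k/n); its measure is its length.
        length = math.sqrt(k / n) - math.sqrt((k - 1) / n)
        lower += (k - 1) / n * length   # m_k = inf of f on E_k
        upper += k / n * length         # M_k = sup of f on E_k
    return lower, upper

lower, upper = range_partition_sums(1000)
gap = upper - lower   # (1/n) times the total measure of [0, 1], up to rounding
```

The two sums bracket 1/3, and their difference is 1/n times the measure of [0, 1], so refining the range partition forces the sums together regardless of how irregular the sets $E_k$ might be for a less well-behaved f.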

THEOREM 10.6.2 Let f be a bounded real-valued function on [a, b]. Then

\[ \sup_{\mathcal{P}} L_L(\mathcal{P}, f) = \inf_{\mathcal{Q}} U_L(\mathcal{Q}, f), \]

where the infimum and supremum are taken over all measurable partitions $\mathcal{Q}$
and $\mathcal{P}$ of [a, b], if and only if f is measurable on [a, b].

Remark. Although the previous theorem is stated for a closed interval [a, b],
the result is also true for f defined on any measurable subset A of R with
λ(A) < ∞.
As a consequence of the previous theorem, which we will shortly prove, we
make the following definition.

DEFINITION 10.6.3 If f is a bounded real-valued measurable function on
[a, b], the Lebesgue integral of f over [a, b], denoted $\int_{[a,b]} f\,d\lambda$ (or $\int_a^b f\,d\lambda$),
is defined by

\[ \int_{[a,b]} f\,d\lambda = \sup_{\mathcal{P}} L_L(\mathcal{P}, f), \]

where the supremum is taken over all measurable partitions of [a, b]. If A is a
measurable subset of [a, b], the Lebesgue integral of f over A, denoted $\int_A f\,d\lambda$,
is defined by

\[ \int_A f\,d\lambda = \int_{[a,b]} f\chi_A\,d\lambda. \]

Remarks. (a) In defining $\int_A f\,d\lambda$ it was implicitly assumed that f is defined
on all of [a, b]. If f is only defined on the measurable set A, A ⊂ [a, b], then
$f\chi_A$ can still be defined on all of [a, b] in the obvious manner; namely,

\[
(f\chi_A)(x) =
\begin{cases}
f(x), & x \in A, \\
0, & x \notin A.
\end{cases}
\]

Alternately, if f is a bounded measurable function defined on a measurable
subset A of [a, b], we could define

\[ \int_A f\,d\lambda = \sup_{\mathcal{Q}} L_L(\mathcal{Q}, f), \]

where the supremum is taken over all measurable partitions $\mathcal{Q}$ of A. However,
if $\mathcal{Q} = \{E_1, \ldots, E_n\}$ is any measurable partition of A, then

\[ \mathcal{P} = \{E_1, \ldots, E_n, [a, b] \setminus A\} \]

is a measurable partition of [a, b] with

\[ L_L(\mathcal{Q}, f) = L_L(\mathcal{P}, f\chi_A). \]

Conversely, if $\mathcal{P} = \{E_1, \ldots, E_n\}$ is any measurable partition of [a, b], then
$\mathcal{Q} = \{E_1 \cap A, \ldots, E_n \cap A\}$ is a measurable partition of A for which
$L_L(\mathcal{Q}, f) = L_L(\mathcal{P}, f\chi_A)$. Thus the two definitions for $\int_A f\,d\lambda$ give the same value.
(b) To distinguish between the Lebesgue and Riemann integral of a
bounded real-valued function f on [a, b], the Riemann integral of f, if it exists,
will be denoted by

\[ \int_a^b f(x)\,dx. \]

If in the Lebesgue integral we wish to emphasize the variable x, we will write
$\int_a^b f(x)\,d\lambda(x)$ to denote the Lebesgue integral of f. The two different notations
should cause no confusion. In fact, in Corollary 10.6.8 we will prove that every
Riemann integrable function on [a, b] is also Lebesgue integrable, and the two
integrals are equal.
Prior to proving Theorem 10.6.2, we first need the analogue of Theorem
6.1.4.

DEFINITION 10.6.4 Let E be a measurable set and let $\mathcal{P}$ be a measurable
partition of E. A measurable partition $\mathcal{Q}$ of E is a refinement of $\mathcal{P}$ if every
set in $\mathcal{Q}$ is a subset of some set in $\mathcal{P}$.

A useful fact about refinements is the following: If $\mathcal{P} = \{E_1, \ldots, E_n\}$ and
$\mathcal{Q} = \{A_1, \ldots, A_m\}$ are measurable partitions of E, then the collection

\[ \{E_i \cap A_j\}_{i=1,\,j=1}^{n,\,m} \]

is a measurable partition of E that is a refinement of both $\mathcal{P}$ and $\mathcal{Q}$.
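The common refinement can be computed explicitly when the partitions consist of intervals. The sketch below is a minimal illustration under that assumption: each set is a half-open interval [lo, hi) stored as a pair, empty intersections are dropped, and the name `common_refinement` and the two sample partitions of [0, 1) are hypothetical.

```python
from fractions import Fraction as F

def common_refinement(P, Q):
    """All nonempty intersections E_i ∩ A_j of half-open intervals [lo, hi)."""
    pieces = []
    for lo1, hi1 in P:
        for lo2, hi2 in Q:
            lo, hi = max(lo1, lo2), min(hi1, hi2)
            if lo < hi:                 # keep only nonempty intersections
                pieces.append((lo, hi))
    return sorted(pieces)

# Two partitions of [0, 1): halves and thirds.
P = [(F(0), F(1, 2)), (F(1, 2), F(1))]
Q = [(F(0), F(1, 3)), (F(1, 3), F(2, 3)), (F(2, 3), F(1))]

R = common_refinement(P, Q)
```

Here R consists of [0, 1/3), [1/3, 1/2), [1/2, 2/3), and [2/3, 1); each piece lies inside one set of P and one set of Q, and the lengths still add up to 1.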

LEMMA 10.6.5 If $\mathcal{P}$, $\mathcal{Q}$ are measurable partitions of [a, b] such that $\mathcal{Q}$ is a
refinement of $\mathcal{P}$, then

\[ L_L(\mathcal{P}, f) \le L_L(\mathcal{Q}, f) \le U_L(\mathcal{Q}, f) \le U_L(\mathcal{P}, f). \]

As a consequence,

\[ \sup_{\mathcal{P}} L_L(\mathcal{P}, f) \le \inf_{\mathcal{Q}} U_L(\mathcal{Q}, f). \]

Proof. The proof of the lemma is almost verbatim the proof of Lemma 6.1.3
and Theorem 6.1.4, and thus is left to the exercises (Exercise 2).

EXAMPLE 10.6.6 In this example, we calculate the Lebesgue integral of
what is commonly called a simple function on [a, b]. A simple function on
[a, b] is a measurable real-valued function on [a, b] that assumes only a finite
number of values.
Suppose s is a simple function on [a, b] with Range s = {α1, ..., αn}, where
$\alpha_i \ne \alpha_j$ whenever i ≠ j. For each i = 1, ..., n, set

\[ E_i = \{x \in [a, b] : s(x) = \alpha_i\} = s^{-1}(\{\alpha_i\}). \]

Since s is measurable, each $E_i$ is a measurable set, and

\[ s(x) = \sum_{i=1}^{n} \alpha_i \chi_{E_i}(x). \tag{6} \]

Furthermore, since $\alpha_i \ne \alpha_j$ if i ≠ j, the sets $E_i$, i = 1, ..., n, are pairwise
disjoint with $\bigcup_{i=1}^{n} E_i = [a, b]$. Equation (6) is called the canonical representation
of s. If all the sets $E_i$ are intervals, then s is a step function on
[a, b].

We will now show that every simple function s is Lebesgue integrable on


[a, b] and compute the Lebesgue integral of s.

LEMMA 10.6.7 If s is a simple function on [a, b] with canonical representation
\[ s = \sum_{i=1}^{n} \alpha_i \chi_{E_i}, \]
where $\{E_i\}_{i=1}^{n}$ are pairwise disjoint measurable subsets of [a, b] with $\bigcup_{i=1}^{n} E_i = [a, b]$, then s is Lebesgue integrable on [a, b] with
\[ \int_{[a,b]} s\,d\lambda = \sum_{j=1}^{n} \alpha_j \lambda(E_j). \]

Proof. To show that s is Lebesgue integrable on [a, b] we will prove that
\[ \sup_{Q} L_L(Q, s) = \inf_{Q} U_L(Q, s), \]
where the supremum and infimum are taken over all measurable partitions Q of [a, b]. Since P = {E1, ..., En} is a measurable partition of [a, b] and s(x) = αj for all x ∈ Ej,
\[ L_L(P, s) = U_L(P, s) = \sum_{j=1}^{n} \alpha_j \lambda(E_j). \]
But then
\[ \sum_{j=1}^{n} \alpha_j \lambda(E_j) \le \sup_{Q} L_L(Q, s) \le \inf_{Q} U_L(Q, s) \le \sum_{j=1}^{n} \alpha_j \lambda(E_j). \]
Therefore s is Lebesgue integrable on [a, b] with $\int_{[a,b]} s\,d\lambda = \sum_{i=1}^{n} \alpha_i \lambda(E_i)$. □
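As a small numerical illustration of Lemma 10.6.7, the sketch below evaluates the sum of αj λ(Ej) for a step function, the special case in which every Ej is an interval. The function name `integrate_step` and the sample values are hypothetical and chosen only for the illustration.

```python
from fractions import Fraction

def integrate_step(pieces):
    """Sum alpha * length over pieces (alpha, left, right) of a step function."""
    return sum(alpha * (right - left) for alpha, left, right in pieces)

# Hypothetical step function on [0, 1]:
# value 2 on [0, 1/4), value -1 on [1/4, 1/2), value 3 on [1/2, 1]
pieces = [
    (Fraction(2), Fraction(0), Fraction(1, 4)),
    (Fraction(-1), Fraction(1, 4), Fraction(1, 2)),
    (Fraction(3), Fraction(1, 2), Fraction(1)),
]
step_integral = integrate_step(pieces)
print(step_integral)  # 2*(1/4) - 1*(1/4) + 3*(1/2) = 7/4
```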

Remark. Suppose f is a bounded real-valued measurable function on [a, b].


If P = {E1, ..., En} is a measurable partition of [a, b], set
\[ \varphi(x) = \sum_{k=1}^{n} m_k \chi_{E_k}(x) \quad\text{and}\quad \psi(x) = \sum_{k=1}^{n} M_k \chi_{E_k}(x), \tag{7} \]

where mk = inf{f (x) : x ∈ Ek } and Mk = sup{f (x) : x ∈ Ek }. Then ϕ and


ψ are simple functions on [a, b] with ϕ(x) ≤ f (x) ≤ ψ(x) for all x ∈ [a, b].
Furthermore, by Lemma 10.6.7,
\[ \int_{[a,b]} \varphi\,d\lambda = \sum_{k=1}^{n} m_k \lambda(E_k) = L_L(P, f), \quad\text{and}\quad \int_{[a,b]} \psi\,d\lambda = \sum_{k=1}^{n} M_k \lambda(E_k) = U_L(P, f). \]

Thus if f is a bounded real-valued measurable function on [a, b],


\[ \int_{[a,b]} f\,d\lambda = \sup_{\varphi \le f} \int_{[a,b]} \varphi\,d\lambda, \]

where the supremum is taken over all simple functions ϕ on [a, b] satisfying
ϕ(x) ≤ f (x) for all x ∈ [a, b].
Proof of Theorem 10.6.2. Suppose f is a measurable function on [a, b] with
m ≤ f(x) < M for all x ∈ [a, b]. Let β = M − m, and for n ∈ N, partition
[m, M) into n subintervals of length β/n. For each k = 1, ..., n, set
\[ E_k = \left\{ x \in [a, b] : m + (k-1)\frac{\beta}{n} \le f(x) < m + k\frac{\beta}{n} \right\}. \]
Then Pn = {E1 , ..., En } is a measurable partition of [a, b]. Also, if mk =
inf{f (x) : x ∈ Ek } and Mk = sup{f (x) : x ∈ Ek }, then
\[ m_k \ge m + (k-1)\frac{\beta}{n} \quad\text{and}\quad M_k \le m + k\frac{\beta}{n}. \]
Therefore

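\[
\begin{aligned}
0 \le \inf_{P} U_L(P, f) - \sup_{Q} L_L(Q, f) &\le U_L(P_n, f) - L_L(P_n, f) \\
&= \sum_{k=1}^{n} (M_k - m_k)\lambda(E_k) \\
&\le \sum_{k=1}^{n} \left[ \left(m + k\tfrac{\beta}{n}\right) - \left(m + (k-1)\tfrac{\beta}{n}\right) \right] \lambda(E_k) \\
&= \frac{\beta}{n} \sum_{k=1}^{n} \lambda(E_k) = \frac{\beta}{n}(b - a).
\end{aligned}
\]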

Since n ∈ N is arbitrary, by letting n → ∞ we obtain

\[ \sup_{Q} L_L(Q, f) = \inf_{P} U_L(P, f). \tag{8} \]

Conversely, suppose (8) holds. By taking a common refinement if necessary,


for each n ∈ N, there exists a measurable partition Pn of [a, b] such that
\[ U_L(P_n, f) < \inf_{P} U_L(P, f) + \frac{1}{2n}, \quad\text{and}\quad L_L(P_n, f) > \sup_{Q} L_L(Q, f) - \frac{1}{2n}. \]

Since equality holds in (8),


\[ U_L(P_n, f) - L_L(P_n, f) < \frac{1}{n}. \]

For the partition Pn , let ϕn and ψn be simple functions on [a, b] as defined by


equation (7), satisfying ϕn (x) ≤ f (x) ≤ ψn (x) for all x ∈ [a, b], and
\[ \int_{[a,b]} \varphi_n\,d\lambda = L_L(P_n, f), \qquad \int_{[a,b]} \psi_n\,d\lambda = U_L(P_n, f). \]

Define ϕ and ψ on [a, b] by ϕ(x) = supn ϕn (x) and ψ(x) = inf n ψn (x). By
Theorem 10.5.9 the functions ϕ and ψ are measurable functions on [a, b], with

ϕ(x) ≤ f (x) ≤ ψ(x)

for all x ∈ [a, b].


To complete the proof we will show that ϕ = ψ a.e. on [a, b]. Then as a
consequence of Theorem 10.5.8, the function f will be measurable on [a, b].
Let
\[ E = \{x \in [a, b] : \varphi(x) < \psi(x)\}, \]
and for each k ∈ N, let
\[ E_k = \left\{ x \in [a, b] : \varphi(x) < \psi(x) - \frac{1}{k} \right\}. \]
Then $E = \bigcup_{k=1}^{\infty} E_k$. If x ∈ Ek, then $\varphi_n(x) < \psi_n(x) - \frac{1}{k}$ for all n ∈ N. For n, k ∈ N, let
\[ A_{n,k} = \left\{ x : \varphi_n(x) < \psi_n(x) - \frac{1}{k} \right\}. \]
If x ∈ An,k, then ψn(x) − ϕn(x) > 1/k. Consider the simple function

sn (x) = (ψn (x) − ϕn (x))χAn,k (x).

Suppose the measurable partition Pn is given by {B1 , ..., BN }. Then


\[ s_n(x) = \sum_{j=1}^{N} (M_j - m_j)\,\chi_{B_j \cap A_{n,k}}(x), \]

where Mj and mj denote the supremum and infimum of f respectively over


Bj . The collection

\[ Q = \{B_j \cap A_{n,k}\}_{j=1}^{N} \cup \{[a, b] \setminus A_{n,k}\} \]

is a measurable partition of [a, b]. If $m_j^* = \inf\{s_n(x) : x \in B_j \cap A_{n,k}\}$, then $m_j^* > 1/k$ for all j = 1, ..., N. Also, since sn(x) = 0 for all x ∈ [a, b] \ An,k,
\[ L_L(Q, s_n) = \sum_{j=1}^{N} m_j^* \lambda(B_j \cap A_{n,k}) > \frac{1}{k} \sum_{j=1}^{N} \lambda(B_j \cap A_{n,k}) = \frac{\lambda(A_{n,k})}{k}. \]

On the other hand,


\[
\begin{aligned}
U_L(Q, s_n) &= \sum_{j=1}^{N} (M_j - m_j)\lambda(B_j \cap A_{n,k}) \\
&\le \sum_{j=1}^{N} (M_j - m_j)\lambda(B_j) = U_L(P_n, f) - L_L(P_n, f) < \frac{1}{n}.
\end{aligned}
\]

Combining the above two inequalities gives λ(An,k ) < k/n for all k, n ∈ N.
Since Ek ⊂ An,k for all n, for each k,
\[ \lambda(E_k) < \frac{k}{n} \]
for all n ∈ N. Therefore λ(Ek ) = 0. Finally since

\[ \lambda(E) \le \sum_{k=1}^{\infty} \lambda(E_k), \]

we have λ(E) = 0. Thus ϕ = ψ a.e. on [a, b] which proves the result. 
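The range partition used in the first half of the proof can be tried numerically. The sketch below is an illustration only: it assumes the sample function f(x) = x² on [0, 1] with m = 0 and M = 2 (choices made here, not taken from the text), for which each Ek is an interval because f is increasing. The two sums bracket the integral 1/3 and differ by (β/n)(b − a).

```python
import math

def lebesgue_sums(n, m=0.0, M=2.0):
    """Range-partition sums for f(x) = x^2 on [0, 1], with m <= f < M."""
    beta = M - m
    lower = upper = 0.0
    for k in range(1, n + 1):
        lo = m + (k - 1) * beta / n
        hi = m + k * beta / n
        # E_k = {x in [0, 1] : lo <= x^2 < hi} is an interval since x^2 is increasing
        measure = min(1.0, math.sqrt(hi)) - min(1.0, math.sqrt(lo))
        lower += lo * measure
        upper += hi * measure
    return lower, upper

lower, upper = lebesgue_sums(1000)
print(lower, upper, upper - lower)  # the gap equals beta / n = 0.002 up to rounding
```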

Comparison with the Riemann Integral


The definition of the Lebesgue integral is very similar to that of the Riemann
integral, except that in the Lebesgue theory we use measurable partitions
rather than point partitions. If P = {x0 , x1 , ..., xn } is a partition of [a, b], then
\[ P^* = \{[x_0, x_1]\} \cup \{(x_{k-1}, x_k]\}_{k=2}^{n} \]
is a measurable partition of [a, b]. Furthermore, if f is a bounded real-valued
function on [a, b], then
L(P, f ) = LL (P ∗ , f ) and U (P, f ) = UL (P ∗ , f ).
Therefore, the lower Riemann integral of f satisfies
\[ \underline{\int_a^b} f = \sup\{L(P, f) : P \text{ is a partition of } [a, b]\} \le \sup\{L_L(Q, f) : Q \text{ is a measurable partition of } [a, b]\}. \]
Similarly, for the upper Riemann integral of f we have
\[ \overline{\int_a^b} f \ge \inf\{U_L(Q, f) : Q \text{ is a measurable partition of } [a, b]\}. \]

If f is Riemann integrable on [a, b], then the upper and lower Riemann inte-
grals of f are equal, and thus
\[ \int_a^b f(x)\,dx \le \sup_{Q} L_L(Q, f) \le \inf_{Q} U_L(Q, f) \le \int_a^b f(x)\,dx, \]

where the supremum and infimum are taken over all measurable partitions Q
of [a, b]. As a consequence of Theorem 10.6.2, this proves the following result.

COROLLARY 10.6.8 If f is Riemann integrable on [a, b], then f is


Lebesgue integrable on [a, b], and
\[ \int_{[a,b]} f\,d\lambda = \int_a^b f(x)\,dx. \]

The converse however is false! This is illustrated by the following example.

EXAMPLE 10.6.9 Let E = [0, 1] \ Q, and set


\[ f(x) = \chi_E(x) = \begin{cases} 1, & \text{when } x \text{ is irrational}, \\ 0, & \text{when } x \text{ is rational}. \end{cases} \]

By Example 6.1.6(a) the function f is not Riemann integrable. On the other


hand, since f is a simple function, f is Lebesgue integrable, and by Lemma 10.6.7,
\[ \int_{[0,1]} f\,d\lambda = \lambda(E) = 1. \] □

Properties of the Lebesgue Integral for Bounded Functions
The following theorem summarizes some basic properties of the Lebesgue in-
tegral for bounded functions.

THEOREM 10.6.10 Suppose f, g are bounded real-valued measurable func-


tions on [a, b]. Then
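(a) For all α, β ∈ R,
\[ \int_{[a,b]} (\alpha f + \beta g)\,d\lambda = \alpha \int_{[a,b]} f\,d\lambda + \beta \int_{[a,b]} g\,d\lambda. \]
(b) If A1, A2 are disjoint measurable subsets of [a, b], then
\[ \int_{A_1 \cup A_2} f\,d\lambda = \int_{A_1} f\,d\lambda + \int_{A_2} f\,d\lambda. \]
(c) If f ≥ g a.e. on [a, b], then $\int_{[a,b]} f\,d\lambda \ge \int_{[a,b]} g\,d\lambda$.
(d) If f = g a.e. on [a, b], then $\int_{[a,b]} f\,d\lambda = \int_{[a,b]} g\,d\lambda$.
(e) $\left| \int_{[a,b]} f\,d\lambda \right| \le \int_{[a,b]} |f|\,d\lambda$.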

Proof. Since the proof of (a) is similar to the proof of the corresponding result
for the Riemann integral we leave it as an exercise (Exercise 4). For the proof
of (b), by definition
\[ \int_{A_1 \cup A_2} f\,d\lambda = \int_{[a,b]} f \chi_{A_1 \cup A_2}\,d\lambda. \]

Since A1 ∩ A2 = ∅, f χA1 ∪A2 = f χA1 + f χA2 , and the result now follows by
(a).
(c) Consider the function h(x) = f (x) − g(x). By hypothesis h ≥ 0 a.e. on
[a, b]. Let
E1 = {x : h(x) ≥ 0} and E2 = [a, b] \ E1 .
Consider the measurable partition P = {E1 , E2 } of [a, b]. Then
\[ \int_{[a,b]} h\,d\lambda \ge L_L(P, h) = m_1 \lambda(E_1) + m_2 \lambda(E_2). \]
Since h(x) ≥ 0 for all x ∈ E1, m1 = inf{h(x) : x ∈ E1} ≥ 0. On the other hand, since h ≥ 0 a.e., λ(E2) = 0. Therefore $\int_a^b h\,d\lambda \ge 0$. The result now
follows by (a).
The result (d) is an immediate consequence of (c), and (e) is left for the
exercises. The measurability of |f | follows from Exercise 8 or 10 of the previous
section. 

Bounded Convergence Theorem


One of the main advantages of the Lebesgue theory of integration involves the
interchange of limits. If {fn } is a sequence of Riemann integrable functions on
[a, b] such that fn (x) converges to a function f (x) for all x ∈ [a, b], then there
is no guarantee that f is Riemann integrable on [a, b]. An example of such a
sequence was given in Example 8.1.2(c). For the Lebesgue integral however
we have the following very useful result.

THEOREM 10.6.11 (Bounded Convergence Theorem) Suppose {fn }


is a sequence of real-valued measurable functions on [a, b] for which there exists
a positive constant M such that |fn (x)| ≤ M for all n ∈ N, and all x ∈ [a, b].
If
\[ \lim_{n\to\infty} f_n(x) = f(x) \quad \text{a.e. on } [a, b], \]
then f is Lebesgue integrable on [a, b] and
\[ \int_{[a,b]} f\,d\lambda = \lim_{n\to\infty} \int_{[a,b]} f_n\,d\lambda. \]

Remark. Although we state and prove the bounded convergence theorem for
a closed and bounded interval [a, b], the conclusion is still valid if the sequence

{fn } is defined on a bounded measurable set A. The necessary modifications


to the proof are left to the exercises.
Proof. Since fn → f a.e., f is measurable by Corollary 10.5.10, and thus
Lebesgue integrable. Let
E = {x ∈ [a, b] : fn (x) does not converge to f (x)}.
Define the functions g and gn , n ∈ N, on [a, b] as follows:
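\[ g_n(x) = \begin{cases} f_n(x), & x \in [a, b] \setminus E, \\ 0, & x \in E, \end{cases} \quad\text{and}\quad g(x) = \begin{cases} f(x), & x \in [a, b] \setminus E, \\ 0, & x \in E. \end{cases} \]
Since λ(E) = 0, gn = fn a.e. and g = f a.e. Therefore
\[ \int_a^b g_n\,d\lambda = \int_a^b f_n\,d\lambda \quad\text{and}\quad \int_a^b g\,d\lambda = \int_a^b f\,d\lambda. \]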

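Furthermore, gn(x) → g(x) for all x ∈ [a, b]. Let ε > 0 be given. For m ∈ N, set
\[ E_m = \{x \in [a, b] : |g(x) - g_n(x)| < \varepsilon \text{ for all } n \ge m\}. \]
Then E1 ⊂ E2 ⊂ · · · with $\bigcup_{m=1}^{\infty} E_m = [a, b]$. Therefore
\[ \bigcap_{m=1}^{\infty} E_m^c = \emptyset. \]
Here $E_m^c = [a, b] \setminus E_m$. Thus by Theorem 10.4.6, $\lim_{m\to\infty} \lambda(E_m^c) = 0$. Choose m ∈ N such that $\lambda(E_m^c) < \varepsilon$. Then |g(x) − gn(x)| < ε for all n ≥ m and all x ∈ Em. Since |g − gn| ≤ 2M on [a, b], for all n ≥ m,
\[
\begin{aligned}
\left| \int_a^b f\,d\lambda - \int_a^b f_n\,d\lambda \right| &= \left| \int_a^b g\,d\lambda - \int_a^b g_n\,d\lambda \right| \le \int_{[a,b]} |g - g_n|\,d\lambda \\
&= \int_{E_m} |g - g_n|\,d\lambda + \int_{E_m^c} |g - g_n|\,d\lambda \\
&\le \varepsilon\,\lambda(E_m) + 2M \lambda(E_m^c) < \varepsilon\,[b - a + 2M].
\end{aligned}
\]
Since ε > 0 was arbitrary, we have $\lim_{n\to\infty} \int_{[a,b]} f_n\,d\lambda = \int_{[a,b]} f\,d\lambda$. □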
Combining the bounded convergence theorem with Corollary 10.6.8, we
obtain the bounded convergence theorem for Riemann integrable functions
previously stated in Chapter 8. The theorem does require the additional hy-
pothesis that the limit function f is Riemann integrable.
THEOREM 8.4.3 Let f and fn, n ∈ N, be Riemann integrable functions on
[a, b] with $\lim_{n\to\infty} f_n(x) = f(x)$ for all x ∈ [a, b]. Suppose there exists a positive
constant M such that |fn(x)| ≤ M for all x ∈ [a, b] and all n ∈ N. Then
\[ \lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx. \]

EXAMPLES 10.6.12 (a) In the first example we show that the conclu-
sion of the bounded convergence theorem is false if the sequence {fn } is
not bounded; that is, there does not exist a finite constant M such that
|fn(x)| ≤ M for all n ∈ N and all x ∈ [a, b]. For each n ∈ N, define fn
on [0, 1] by
\[ f_n(x) = \begin{cases} n, & 0 < x \le \frac{1}{n}, \\ 0, & \text{otherwise}. \end{cases} \]
Then $\{f_n\}_{n=1}^{\infty}$ is a sequence of measurable functions on [0, 1] that is not
bounded but which satisfies
\[ \lim_{n\to\infty} f_n(x) = f(x) = 0 \quad \text{for all } x \in [0, 1]. \]
However, $\int_0^1 f_n\,d\lambda = n\lambda((0, 1/n]) = 1$. Thus
\[ \lim_{n\to\infty} \int_{[0,1]} f_n\,d\lambda = 1 \ne 0 = \int_{[0,1]} f\,d\lambda. \]
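The behaviour in part (a) can be checked with exact arithmetic. The sketch below is an illustration only; the sample point x = 1/10 and the helper names are arbitrary choices. Each integral equals 1, while fn(x) is eventually 0 at the fixed point.

```python
from fractions import Fraction

def f(n, x):
    """f_n(x) = n on (0, 1/n], and 0 otherwise."""
    return n if 0 < x <= Fraction(1, n) else 0

def integral(n):
    """Integral of f_n over [0, 1], computed as n * lambda((0, 1/n])."""
    return n * Fraction(1, n)

x = Fraction(1, 10)
values = [f(n, x) for n in range(1, 21)]
print(values)                                # nonzero only while 1/n >= x
print([integral(n) for n in range(1, 6)])    # every integral equals 1
```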

(b) As our second example we consider the sequence {fn} of Example 9.2.3. For each n ∈ N, write $n = 2^k + j$, where k = 0, 1, 2, ..., and $0 \le j < 2^k$.
Define fn on [0, 1] by
\[ f_n(x) = \begin{cases} 1, & \frac{j}{2^k} \le x \le \frac{j+1}{2^k}, \\ 0, & \text{otherwise}. \end{cases} \]
The first few of these are as follows: $f_1 = \chi_{[0,1]}$, $f_2 = \chi_{[0,\frac{1}{2}]}$, $f_3 = \chi_{[\frac{1}{2},1]}$, $f_4 = \chi_{[0,\frac{1}{4}]}$, ... For each n ∈ N, fn ∈ R[0, 1] with
\[ \int_0^1 f_n(x)\,dx = \frac{1}{2^k}. \]
Thus $\lim_{n\to\infty} \int_0^1 f_n(x)\,dx = 0$. On the other hand, if x ∈ [0, 1], then the sequence
{fn(x)} contains an infinite number of 0's and 1's, and thus does not converge.
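The indexing n = 2^k + j in part (b) can be made concrete. The sketch below is an illustration with hypothetical helper names and an arbitrarily chosen sample point x = 1/3; it recovers the interval attached to each n and counts how often fn(x) = 1 among the first ten dyadic levels.

```python
from fractions import Fraction

def index_to_interval(n):
    """Write n = 2**k + j with 0 <= j < 2**k and return [j/2**k, (j+1)/2**k]."""
    k = n.bit_length() - 1   # largest k with 2**k <= n
    j = n - 2**k
    return Fraction(j, 2**k), Fraction(j + 1, 2**k)

def f(n, x):
    left, right = index_to_interval(n)
    return 1 if left <= x <= right else 0

x = Fraction(1, 3)
values = [f(n, x) for n in range(1, 2**10)]   # levels k = 0, ..., 9
print(sum(values), len(values) - sum(values))
```

Since 1/3 is not a dyadic rational, exactly one interval on each level contains it, so the value 1 occurs once per level and the value 0 occurs at all other indices.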


Exercises 10.6.
1. *Let f be a bounded real-valued measurable function on [a, b] with m ≤
f(x) < M for all x ∈ [a, b]. Set β = M − m. For n ∈ N and j = 1, ..., n,
let $E_j = \{x \in [a, b] : m + (j-1)\frac{\beta}{n} \le f(x) < m + j\frac{\beta}{n}\}$.
The Lebesgue sums for f are defined by
$S_n(f) = \sum_{j=1}^{n} \left(m + (j-1)\frac{\beta}{n}\right)\lambda(E_j)$. Prove that
\[ \lim_{n\to\infty} S_n(f) = \int_{[a,b]} f\,d\lambda. \]

2. Prove Lemma 10.6.5.


3. *Let f be a bounded measurable function on [a, b]. If A is a measurable
subset of [a, b] with λ(A) = 0, prove that $\int_A f\,d\lambda = 0$.
4. a. Prove Theorem 10.6.10(a).
b. Prove Theorem 10.6.10(e).
5. *Let f be a nonnegative bounded measurable function on [a, b]. If E, F
are measurable subsets of [a, b] with E ⊂ F, prove that $\int_E f\,d\lambda \le \int_F f\,d\lambda$.

6. Let f be a bounded measurable function on [a, b]. For each c > 0, prove
that
\[ \lambda(\{x \in [a, b] : |f(x)| > c\}) \le \frac{1}{c} \int_{[a,b]} |f|\,d\lambda. \]
7. *Let f be a nonnegative bounded measurable function on [a, b] satisfying
$\int_{[a,b]} f\,d\lambda = 0$. Use the previous exercise to prove that f = 0 a.e. on [a, b].
8. (Fundamental Theorem of Calculus for the Lebesgue Integral) If
f is differentiable on [a, b] and f′ is bounded on [a, b], then f′ is Lebesgue
integrable, and
\[ \int_{[a,b]} f'\,d\lambda = f(b) - f(a). \]

9. Prove that $\sum_{n=0}^{\infty} \int_0^{\pi/2} (1 - \sqrt{\cos x})^n \sin x\,dx$ converges to a finite limit, and
find that limit.

10. Let f be a bounded measurable function on [a, b] such that |f| < 1 a.e.
on [a, b]. Prove that $\lim_{n\to\infty} \int_{[a,b]} f^n\,d\lambda = 0$.

11. Let {fn } be a sequence of nonnegative measurable functions on [a, b]


satisfying fn (x) ≤ M for all x ∈ [a, b] and n ∈ N. If {fn } converges to f
a.e. on [a, b], prove that
\[ \lim_{n\to\infty} \int_{[a,b]} f_n e^{-f_n}\,d\lambda = \int_{[a,b]} f e^{-f}\,d\lambda. \]

12. *If f is a bounded real-valued measurable function on [a, b], prove that
there exists a sequence {sn } of simple functions on [a, b] such that
$\lim_{n\to\infty} s_n(x) = f(x)$ uniformly on [a, b].

13. Modify the proof of the bounded convergence theorem where the interval
[a, b] is replaced by a bounded measurable set A.
14. Use Egorov’s theorem (Exercise 15, Section 10.5) to provide an alternate
proof of the bounded convergence theorem.
15. Let f be a bounded measurable function on [a, b].
a. Given ε > 0, prove that there exists a simple function ϕ on [a, b] such
that $\int_{[a,b]} |f - \varphi|\,d\lambda < \varepsilon$.
*b. If ϕ is a simple function on [a, b] and ε > 0, prove that there exists a
step function h on [a, b] such that ϕ(x) = h(x) except on a set of measure
less than ε.
c. Given ε > 0, prove that there exists a step function h on [a, b] such
that $\int_{[a,b]} |f - h|\,d\lambda < \varepsilon$.
16. Let $S_n(x) = \frac{1}{2}A_0 + \sum_{k=1}^{n} (A_k \cos kx + B_k \sin kx)$. If |Sn(x)| ≤ M for all
x ∈ [−π, π] and n ∈ N and $f(x) = \lim_{n\to\infty} S_n(x)$ exists a.e. on [−π, π],
prove that f is measurable and that the Ak and Bk are the Fourier
coefficients of f.

10.7 The General Lebesgue Integral


In this section, we extend the definition of the Lebesgue integral to include
both the case where the function f is unbounded, and also where the domain of
integration is unbounded. We will then prove the well known results of Fatou
and Lebesgue on the interchange of limits and integration. We first consider
the extension of the Lebesgue integral to nonnegative measurable functions.

The Lebesgue Integral of a Nonnegative Measurable Function


Suppose first that A is a bounded measurable subset of R, and that f is a
nonnegative measurable function defined on A. For each n ∈ N, consider the
function fn defined on A by
\[ f_n(x) = \min\{f(x), n\} = \begin{cases} f(x), & \text{if } f(x) \le n, \\ n, & \text{if } f(x) > n. \end{cases} \]

Then {fn} is a sequence of nonnegative bounded measurable functions defined
on A, with $\lim_{n\to\infty} f_n(x) = f(x)$ for all x ∈ A. Furthermore, if m > n, then

fn (x) ≤ fm (x) ≤ f (x)

for all x ∈ A, and thus the sequence


\[ \left\{ \int_A f_n\,d\lambda \right\}_{n=1}^{\infty} \]

is monotone increasing, and therefore converges either to a real number, or


diverges to ∞. This leads us to make the following definition:

DEFINITION 10.7.1
(a) Let f be a nonnegative measurable function defined on a bounded measurable
subset A of R. The Lebesgue integral of f over A, denoted $\int_A f\,d\lambda$,
is defined by
\[ \int_A f\,d\lambda = \lim_{n\to\infty} \int_A f_n\,d\lambda = \sup_{n} \int_A \min\{f, n\}\,d\lambda. \]
(b) If A is an unbounded measurable subset of R and f is a nonnegative
measurable function on A, the Lebesgue integral of f over A, denoted
$\int_A f\,d\lambda$, is defined by
\[ \int_A f\,d\lambda = \lim_{n\to\infty} \int_{A \cap [-n,n]} f\,d\lambda. \]

In part (b) of the definition, the sequence


\[ \left\{ \int_{A \cap [-n,n]} f\,d\lambda \right\}_{n \in \mathbb{N}} \]

is also monotone increasing, and thus converges either to a nonnegative real


number, or diverges to ∞. In the definition we do not exclude the possible
value of ∞ for the integral of f . If the integral however is finite, we make the
following definition.

DEFINITION 10.7.2 A nonnegative measurable function f defined on a


measurable subset A of R is said to be Lebesgue integrable on A if
Z
f dλ < ∞.
A

Remark. If A is either a finite or infinite interval with endpoints a, b ∈
R ∪ {−∞, ∞}, then the integral of a nonnegative measurable function f on A
is also denoted by $\int_a^b f\,d\lambda$.

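EXAMPLES 10.7.3 (a) For our first example we consider the function
$f(x) = 1/\sqrt{x}$ defined on (0, 1). Then for each n ∈ N,
\[ f_n(x) = \min\{f(x), n\} = \begin{cases} n, & 0 < x < \frac{1}{n^2}, \\ \frac{1}{\sqrt{x}}, & \frac{1}{n^2} \le x < 1. \end{cases} \]
Therefore
\[ \int_0^1 f_n\,d\lambda = \int_0^{1/n^2} n\,d\lambda + \int_{1/n^2}^{1} \frac{1}{\sqrt{x}}\,dx = \frac{1}{n} + \left(2 - \frac{2}{n}\right) = 2 - \frac{1}{n}. \]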

As a consequence
\[ \int_{(0,1)} f\,d\lambda = \lim_{n\to\infty} \int_{(0,1)} f_n\,d\lambda = \lim_{n\to\infty} \left(2 - \tfrac{1}{n}\right) = 2. \]

The answer in this example corresponds to the improper Riemann integral


of the function f . This will always be the case for nonnegative functions for
which the improper Riemann integral exists (Exercise 18).
(b) Let g(x) = 1/x, 0 < x ≤ 1. For the function g,
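\[ \min\{g(x), n\} = \begin{cases} n, & 0 < x < \frac{1}{n}, \\ \frac{1}{x}, & \frac{1}{n} \le x \le 1. \end{cases} \]
Therefore
\[ \int_0^1 \min\{g, n\}\,d\lambda = \int_0^{1/n} n\,d\lambda + \int_{1/n}^{1} \frac{1}{x}\,dx = 1 + \ln n. \]
Thus
\[ \int_0^1 g\,d\lambda = \lim_{n\to\infty} (1 + \ln n) = \infty. \]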

Since the Lebesgue integral of g is infinite, g is not integrable on (0, 1].


(c) As our final example, consider $f(x) = x^{-2}$ defined on A = (1, ∞). In
this example, for n ≥ 2, A ∩ [−n, n] = (1, n], and
\[ \int_1^n f\,d\lambda = \int_1^n x^{-2}\,dx = 1 - \frac{1}{n}. \]
Therefore
\[ \int_1^{\infty} f\,d\lambda = \lim_{n\to\infty} \left(1 - \frac{1}{n}\right) = 1. \]
Thus f is integrable on (1, ∞). 
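The three closed forms in Examples 10.7.3 can be compared with a crude numerical quadrature. The sketch below is only a sanity check under stated assumptions: it uses a midpoint rule with a fixed number of steps (the helper `midpoint` and the choice n = 10 are made for the illustration), which approximates the integrals of the bounded truncations but proves nothing.

```python
import math

def midpoint(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def truncated_integral(func, n):
    """Approximate the integral over (0, 1) of min{func, n}."""
    return midpoint(lambda x: min(func(x), n), 0.0, 1.0)

n = 10
part_a = truncated_integral(lambda x: 1 / math.sqrt(x), n)  # text: 2 - 1/n
part_b = truncated_integral(lambda x: 1 / x, n)             # text: 1 + ln n
part_c = midpoint(lambda x: x ** -2, 1.0, float(n))         # text: 1 - 1/n
print(part_a, part_b, part_c)
```

As n grows, the values in (a) and (c) stay bounded while the value in (b) grows like ln n, in line with the conclusions of the examples.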

The following theorem summarizes some of the basic properties of the


Lebesgue integral of nonnegative measurable functions. Integrability of the
functions f and g is not required.
THEOREM 10.7.4 Let f, g be nonnegative measurable functions defined on
a measurable set A. Then
(a) $\int_A (f + g)\,d\lambda = \int_A f\,d\lambda + \int_A g\,d\lambda$ and $\int_A cf\,d\lambda = c \int_A f\,d\lambda$ for all
c > 0.
(b) If A1, A2 are disjoint measurable subsets of A, then
\[ \int_{A_1 \cup A_2} f\,d\lambda = \int_{A_1} f\,d\lambda + \int_{A_2} f\,d\lambda. \]
(c) If f ≤ g a.e. on A, then $\int_A f\,d\lambda \le \int_A g\,d\lambda$, with equality if f = g a.e.
on A.

Proof. We will indicate the method of proof by proving part of (a). The
remaining proofs are left to the exercises (Exercise 2). Suppose first that the
set A is bounded. Let h = f + g. Since

min{f (x) + g(x), n} ≤ min{f (x), n} + min{g(x), n} ≤ min{f (x) + g(x), 2n},

we have hn ≤ fn + gn ≤ h2n for all n ∈ N. As a consequence


\[ \int_A h_n\,d\lambda \le \int_A f_n\,d\lambda + \int_A g_n\,d\lambda \le \int_A h_{2n}\,d\lambda. \]

Suppose f, g are integrable on A. Then


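\[ \lim_{n\to\infty} \left( \int_A f_n\,d\lambda + \int_A g_n\,d\lambda \right) = \lim_{n\to\infty} \int_A f_n\,d\lambda + \lim_{n\to\infty} \int_A g_n\,d\lambda = \int_A f\,d\lambda + \int_A g\,d\lambda. \]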

Therefore, since
\[ \lim_{n\to\infty} \int_A h_n\,d\lambda = \lim_{n\to\infty} \int_A h_{2n}\,d\lambda = \int_A (f + g)\,d\lambda, \]
the result follows from the above by letting n → ∞. If one or both of the
sequences $\{\int_A f_n\,d\lambda\}$, $\{\int_A g_n\,d\lambda\}$ diverge to ∞, then so does their sum. In
this case, we obtain $\int_A (f + g)\,d\lambda = \infty$. If A is unbounded, then by the above,
for each n ∈ N,
\[ \int_{A \cap [-n,n]} (f + g)\,d\lambda = \int_{A \cap [-n,n]} f\,d\lambda + \int_{A \cap [-n,n]} g\,d\lambda, \]

and the result again follows by letting n → ∞. 
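The pointwise inequality min{f + g, n} ≤ min{f, n} + min{g, n} ≤ min{f + g, 2n} that drives the proof can be spot-checked on a grid of nonnegative values. The sketch below is an illustration only; the grid and the helper name `trunc` are arbitrary choices, and a finite check is of course not a proof.

```python
def trunc(y, n):
    """Truncation min{y, n} of a nonnegative value y."""
    return min(y, n)

# Sample values 0, 0.25, ..., 20 standing in for f(x) and g(x)
samples = [i / 4 for i in range(0, 81)]
ok = all(
    trunc(u + v, n) <= trunc(u, n) + trunc(v, n) <= trunc(u + v, 2 * n)
    for n in range(1, 8)
    for u in samples
    for v in samples
)
print(ok)
```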


As a consequence of the previous theorem, if f and g are nonnegative
integrable functions on the measurable set A, then so are f + g and cf for every
c > 0. Furthermore, if f ≤ g a.e. and g is integrable, then so is f.

Fatou’s Lemma
Our first major convergence theorem for integrals of nonnegative measurable
functions is the following result of Fatou.

THEOREM 10.7.5 (Fatou's Lemma) If {fn} is a sequence of nonnegative
measurable functions on a measurable set A, and $\lim_{n\to\infty} f_n(x) = f(x)$ a.e.
on A, then
\[ \int_A f\,d\lambda \le \liminf_{n\to\infty} \int_A f_n\,d\lambda. \]

Proof. Suppose first that the set A is bounded. For each k ∈ N, let

hn (x) = min{fn (x), k} and h(x) = min{f (x), k}.



Then for each k ∈ N, the sequence {hn } converges a.e. to h on A. Since


|hn (x)| ≤ k for all x ∈ A, by the bounded convergence theorem
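\[ \int_A \min\{f, k\}\,d\lambda = \lim_{n\to\infty} \int_A \min\{f_n, k\}\,d\lambda \le \liminf_{n\to\infty} \int_A f_n\,d\lambda. \]
Since the above holds for each k ∈ N,
\[ \int_A f\,d\lambda = \lim_{k\to\infty} \int_A \min\{f, k\}\,d\lambda \le \liminf_{n\to\infty} \int_A f_n\,d\lambda. \]
If A is unbounded, then by the above, for each k ∈ N,
\[ \int_{A \cap [-k,k]} f\,d\lambda \le \liminf_{n\to\infty} \int_{A \cap [-k,k]} f_n\,d\lambda \le \liminf_{n\to\infty} \int_A f_n\,d\lambda. \]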

Letting k → ∞ will give the desired result. 

EXAMPLE 10.7.6 In this example, we show that equality need not hold in
Fatou’s lemma. Consider the sequence {fn } on [0, 1] of Example 10.6.12(a).
For each n ∈ N, fn(x) = n if 0 < x ≤ 1/n, and fn(x) = 0 elsewhere. This
sequence satisfies
\[ 0 = \int_{[0,1]} \left( \lim_{n\to\infty} f_n \right) d\lambda < 1 = \lim_{n\to\infty} \int_{[0,1]} f_n\,d\lambda. \] □

Remark. Fatou's lemma is often used to prove that the limit function f of a
convergent sequence of nonnegative Lebesgue integrable functions is Lebesgue
integrable. For if $\liminf_{n\to\infty} \int_A f_n\,d\lambda < \infty$ and if fn → f a.e. on A with fn ≥ 0 a.e.
for all n, then by Fatou's lemma $\int_A f\,d\lambda < \infty$. Thus f is integrable on A.

The General Lebesgue Integral


We now turn our attention to the case where f is an arbitrary real-valued
measurable function defined on a measurable subset A of R. As in Exercise
10 of Section 10.5, we define the functions f + and f − on A as follows:

f + (x) = max{f (x), 0}, f − (x) = max{−f (x), 0}.

If f (x) > 0, then f + (x) = f (x) and f − (x) = 0. On the other hand, if f (x) < 0,
then f + (x) = 0 and f − (x) = −f (x). If f is measurable on A, then f + and
f − are nonnegative measurable functions on A with

f (x) = f + (x) − f − (x) and |f (x)| = f + (x) + f − (x)

for all x ∈ A.
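The decomposition into positive and negative parts is easy to verify on sample values. The sketch below is an illustration with hypothetical helper names `pos` and `neg`; it checks the two identities stated above, together with the fact that at most one of the two parts is nonzero at any point.

```python
def pos(y):
    """Positive part: max{y, 0}."""
    return max(y, 0.0)

def neg(y):
    """Negative part: max{-y, 0}."""
    return max(-y, 0.0)

values = [-2.5, -1.0, 0.0, 0.75, 3.0]
for y in values:
    # y = y^+ - y^-, |y| = y^+ + y^-, and the two parts never overlap
    print(y, pos(y), neg(y), pos(y) - neg(y), pos(y) + neg(y))
```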
Our natural inclination is to define the integral of f over A by
\[ \int_A f\,d\lambda = \int_A f^+\,d\lambda - \int_A f^-\,d\lambda. \]
The only problem with this definition is that it is possible that $\int_A f^+\,d\lambda = \int_A f^-\,d\lambda = \infty$, giving the undefined ∞ − ∞ in the above. However, if we assume
that both f + and f − are integrable on A, then the above definition makes
sense. Furthermore, if f is measurable, and f + and f − are both integrable on
A, then |f | is also integrable on A. Conversely, if f is measurable and |f | is
integrable on A, then since f + ≤ |f | and f − ≤ |f |, by Theorem 10.7.4(c) both
f + and f − are integrable on A. Therefore we make the following definition.

DEFINITION 10.7.7 Let f be a measurable real-valued function defined


on a measurable subset A of R. The function f is said to be Lebesgue inte-
grable on A if |f | is Lebesgue integrable on A. The set of Lebesgue integrable
functions on A is denoted by L(A). For f ∈ L(A), the Lebesgue integral of
f on A is defined by
\[ \int_A f\,d\lambda = \int_A f^+\,d\lambda - \int_A f^-\,d\lambda. \]

Remark. The set L(A) of Lebesgue integrable functions on A is often also
denoted by $L^1(A)$.
The definition of the general Lebesgue integral is consistent with our defini-
tion of the Lebesgue integral of a bounded function on [a, b]. If f is a bounded
real-valued measurable function on [a, b] then so are the functions f + and
f − . Let E1 = {x : f (x) ≥ 0} and E2 = {x : f (x) < 0}. Then E1 and E2
are disjoint measurable subsets of [a, b] with E1 ∪ E2 = [a, b]. Furthermore
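$f\chi_{E_1} = f^+$ everywhere and $f\chi_{E_2} = -f^-$ a.e. By Theorem 10.6.10(b),
\[
\begin{aligned}
\int_a^b f\,d\lambda &= \int_{E_1} f\,d\lambda + \int_{E_2} f\,d\lambda \\
&= \int_a^b f\chi_{E_1}\,d\lambda + \int_a^b f\chi_{E_2}\,d\lambda = \int_a^b f^+\,d\lambda - \int_a^b f^-\,d\lambda.
\end{aligned}
\]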

Remark. For a nonnegative measurable function, the Lebesgue integral and


the improper Riemann integral are the same, provided of course that the
latter exists (Exercise 18). This however is false for functions that are not
nonnegative. For example, consider the function f(x) = (sin x)/x, x ∈ [π, ∞),
of Example 6.4.4(b). By Exercise 7 of Section 6.4, the improper integral of f
exists on [π, ∞). However, as was shown in Example 6.4.4,
\[ \int_{\pi}^{\infty} |f|\,d\lambda = \lim_{n\to\infty} \int_{\pi}^{(n+1)\pi} \frac{|\sin x|}{x}\,dx = \infty. \]
Thus f is not Lebesgue integrable on [π, ∞). Another such example for a
finite interval is given in Exercise 23. The crucial fact to remember is that a
measurable function f is Lebesgue integrable on a measurable set A if and
only if $\int_A |f|\,d\lambda < \infty$.
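The divergence of the integral of |sin x|/x can be made quantitative with a standard estimate, stated here as an aside rather than taken from the text: on [kπ, (k+1)π] we have 1/x ≥ 1/((k+1)π), and the integral of |sin x| over that interval is 2, so each such interval contributes at least 2/((k+1)π). The sketch below sums these lower bounds; the helper name is hypothetical.

```python
import math

def lower_bound(n):
    """Sum of 2/((k+1)*pi) for k = 1..n, a lower bound for the
    integral of |sin x|/x over [pi, (n+1)*pi]."""
    return sum(2 / ((k + 1) * math.pi) for k in range(1, n + 1))

for n in (10, 100, 1000, 10_000):
    print(n, lower_bound(n))   # grows without bound, roughly like (2/pi) ln n
```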
Our first result is the following extension of Theorem 10.7.4 to the class of
integrable functions.

THEOREM 10.7.8 Suppose f and g are Lebesgue integrable functions on


the measurable set A. Then
(a) f + g and cf, c ∈ R, are integrable on A with
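\[ \int_A (f + g)\,d\lambda = \int_A f\,d\lambda + \int_A g\,d\lambda \quad\text{and}\quad \int_A cf\,d\lambda = c \int_A f\,d\lambda. \]
(b) If f ≤ g a.e. on A, then $\int_A f\,d\lambda \le \int_A g\,d\lambda$, with equality if f = g a.e.
(c) If A1 and A2 are disjoint measurable subsets of A, then
\[ \int_{A_1 \cup A_2} f\,d\lambda = \int_{A_1} f\,d\lambda + \int_{A_2} f\,d\lambda. \]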
Proof. The proof that cf is integrable and that $\int_A cf\,d\lambda = c \int_A f\,d\lambda$ follows
immediately from the definition. Before proving the result about the sum, we
first note that the definition of the integral of f on A is independent of the
decomposition f = f + − f − . Suppose that f = f1 − f2 where f1 and f2 are
nonnegative integrable functions on A. Then
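\[ f^+ + f_2 = f^- + f_1, \]
and thus by Theorem 10.7.4,
\[ \int_A f^+\,d\lambda + \int_A f_2\,d\lambda = \int_A f^-\,d\lambda + \int_A f_1\,d\lambda. \]
Since all the integrals are finite,
\[ \int_A f\,d\lambda = \int_A f^+\,d\lambda - \int_A f^-\,d\lambda = \int_A f_1\,d\lambda - \int_A f_2\,d\lambda. \]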
If f and g are integrable on A, then by definition so are the functions
f + + g + and f − + g − . Since f + g = (f + + g + ) − (f − + g − ), by the above
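\[ \int_A (f + g)\,d\lambda = \int_A (f^+ + g^+)\,d\lambda - \int_A (f^- + g^-)\,d\lambda, \]
which by Theorem 10.7.4
\[
\begin{aligned}
&= \int_A f^+\,d\lambda - \int_A f^-\,d\lambda + \int_A g^+\,d\lambda - \int_A g^-\,d\lambda \\
&= \int_A f\,d\lambda + \int_A g\,d\lambda.
\end{aligned}
\]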

(b) If f ≤ g a.e., then g − f ≥ 0 a.e. Therefore by part (a),


\[ 0 \le \int_A (g - f)\,d\lambda = \int_A g\,d\lambda - \int_A f\,d\lambda, \]

from which the result follows. (c) The proof of (c) also follows from (a) and
the fact that since A1 ∩ A2 = ∅, f χA1 ∪A2 = f χA1 + f χA2 . 

Lebesgue’s Dominated Convergence Theorem


Our second major convergence result is the following theorem of Lebesgue.
THEOREM 10.7.9 (Lebesgue’s Dominated Convergence Theorem)
Let {fn } be a sequence of measurable functions defined on a measurable set A
such that $\lim_{n\to\infty} f_n(x) = f(x)$ exists a.e. on A. Suppose there exists a nonnegative
integrable function g on A such that |fn(x)| ≤ g(x) a.e. on A. Then f
is integrable on A and
\[ \int_A f\,d\lambda = \lim_{n\to\infty} \int_A f_n\,d\lambda. \]

Proof. Since g is integrable on A, and |fn | ≤ g a.e. on A, by Theorem 10.7.4


each fn is also integrable on A. Also, by Corollary 10.5.10, the function f is
measurable on A. Furthermore, by Fatou’s lemma,
\[ \int_A |f|\,d\lambda \le \liminf_{n\to\infty} \int_A |f_n|\,d\lambda \le \int_A g\,d\lambda < \infty. \]

Thus f is also integrable on A.


By redefining all the fn , n ∈ N, on a set of measure zero if necessary,
we can without loss of generality assume that |fn (x)| ≤ g(x) for all x ∈ A.
Consider the sequence {g + fn }n∈N of nonnegative measurable functions on
A. By Fatou’s lemma,
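\[
\begin{aligned}
\int_A (g + f)\,d\lambda = \int_A \lim_{n\to\infty} (g + f_n)\,d\lambda &\le \liminf_{n\to\infty} \int_A (g + f_n)\,d\lambda \\
&= \int_A g\,d\lambda + \liminf_{n\to\infty} \int_A f_n\,d\lambda.
\end{aligned}
\]
Therefore,
\[ \int_A f\,d\lambda \le \liminf_{n\to\infty} \int_A f_n\,d\lambda. \]
Similarly, by applying Fatou's lemma to the sequence $\{g - f_n\}_{n \in \mathbb{N}}$, which is
again a sequence of nonnegative functions on A,
\[ \int_A (g - f)\,d\lambda \le \liminf_{n\to\infty} \int_A (g - f_n)\,d\lambda = \int_A g\,d\lambda + \liminf_{n\to\infty} \int_A (-f_n)\,d\lambda. \]
But $\liminf_{n\to\infty} \int_A (-f_n)\,d\lambda = -\limsup_{n\to\infty} \int_A f_n\,d\lambda$. Therefore,
\[ \int_A f\,d\lambda \ge \limsup_{n\to\infty} \int_A f_n\,d\lambda. \]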
Combining the two inequalities gives the desired result. 


Remark. The hypothesis that there exists an integrable function g satisfying
|fn| ≤ g a.e. is required in the proof in order to subtract $\int_A g\,d\lambda$ in the above
inequalities. This is not possible if $\int_A g\,d\lambda = \infty$. As the following example
shows, if such a function g does not exist, then the conclusion may be false.

EXAMPLE 10.7.10 As in Example 10.6.12(a) consider the sequence {fn}
on [0, 1] defined by
\[ f_n(x) = \begin{cases} n, & 0 < x \le \frac{1}{n}, \\ 0, & \text{elsewhere}. \end{cases} \]
For the sequence {fn} we have $\lim_{n\to\infty} f_n(x) = 0$ for all x ∈ [0, 1] but $\int_{[0,1]} f_n\,d\lambda = 1$ for all n. We now show that any measurable function g satisfying g(x) ≥
fn(x) for all x ∈ [0, 1] and n ∈ N satisfies $\int_{[0,1]} g\,d\lambda = \infty$. Since g(x) ≥ fn(x)
for all x ∈ [0, 1] we have g(x) ≥ n for all $x \in (\frac{1}{n+1}, \frac{1}{n}]$. Since the collection
$\{(\frac{1}{k+1}, \frac{1}{k}]\}_{k=1}^{n}$ of intervals is pairwise disjoint, by Theorem 10.7.4
\[ \int_{[0,1]} g\,d\lambda \ge \sum_{k=1}^{n} \int_{(\frac{1}{k+1}, \frac{1}{k}]} g\,d\lambda \ge \sum_{k=1}^{n} k\,\lambda\left(\left(\tfrac{1}{k+1}, \tfrac{1}{k}\right]\right) = \sum_{k=1}^{n} \frac{1}{k+1}. \]
Since the series $\sum 1/(k+1)$ diverges, we have $\int_{[0,1]} g\,d\lambda = \infty$. □
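The lower bound in this example can be computed exactly. The sketch below is an illustration with hypothetical helper names; it confirms that each term k·λ((1/(k+1), 1/k]) equals 1/(k+1) and shows the partial sums growing with n.

```python
from fractions import Fraction

def term(k):
    """k * lambda((1/(k+1), 1/k]), the contribution of the k-th interval."""
    length = Fraction(1, k) - Fraction(1, k + 1)
    return k * length

def lower_bound(n):
    """Partial sum bounding the integral of g over [0, 1] from below."""
    return sum(term(k) for k in range(1, n + 1))

print(term(4))                # 1/5
print(lower_bound(3))         # 1/2 + 1/3 + 1/4 = 13/12
print(float(lower_bound(1000)))
```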

Exercises 10.7
1. Let f be a nonnegative measurable function on a measurable set A. If
$\int_A f\,d\lambda = 0$, prove that f = 0 a.e. on A.
2. *a. Prove Theorem 10.7.4(b).
b. Prove Theorem 10.7.4(c).
3. Let A be a measurable subset of R.
a. If f is integrable on A and g is bounded and measurable on A, prove
that f g is integrable on A.
b. If f and g are integrable on A, is the function f g integrable on A?
4. *Let $f_p(x) = x^{-p}$, x ∈ (0, 1). Prove that fp is integrable on (0, 1) for all
p, 0 < p < 1, and that
\[ \int_{(0,1)} f_p\,d\lambda = \frac{1}{1-p}. \]
5. Define f on [1, ∞) by
\[ f(x) = \begin{cases} \sqrt{n}, & \text{if } x \in [n, n + 1/n^2),\ n = 1, 2, \ldots, \\ 0, & \text{otherwise}. \end{cases} \]
Show that f ∈ L([1, ∞)) but $f^2 \notin L([1, \infty))$.
6. Let P denote the Cantor set of Section 2.5. Define f on [0, 1] as follows:
f(x) = 0 for every x ∈ P, and f(x) = k for each x in the open interval
of length $1/3^k$ on [0, 1] \ P. Prove that f is integrable on [0, 1] and that
$\int_0^1 f\,d\lambda = 3$.

7. *Let f be a nonnegative integrable function on [a, b]. For each n ∈ N, let
En = {x : n ≤ f(x) < n + 1}. Prove that $\sum_{n=1}^{\infty} n\lambda(E_n) < \infty$.
8. Evaluate each of the following limits:
a. $\lim_{n\to\infty} \int_{[0,1]} (1 - e^{-x^2/n})\,x^{-1/2}\,dx$. b. $\lim_{n\to\infty} \int_{[0,n]} (1 - x/n)^n e^{x/2}\,dx$.
9. *Let f be an integrable function on [a, b]. Given ε > 0, prove that there
exists a bounded measurable function g on [a, b] such that $\int_a^b |f - g|\,d\lambda < \varepsilon$.
10. *Suppose f is integrable on a measurable set A with λ(A) = ∞. Given
ε > 0, prove that there exists a bounded measurable set E ⊂ A such that
\[ \int_{A \setminus E} |f|\,d\lambda < \varepsilon. \]

11. Show by example that Fatou’s lemma is false if the functions fn , n ∈ N,


are not nonnegative.
12. Show by example that the bounded convergence theorem is false for a
measurable set A with λ(A) = ∞.
13. *Let f be a Lebesgue integrable function on a measurable set A. Prove
that given ǫ > 0, there exists a δ > 0 such that
Z
|f | dλ < ǫ
E
for all measurable subsets E of A with λ(E) < δ.
14. Let f be an integrable function on (a, b), where −∞ ≤ a < b ≤ ∞. Define
F on (a, b) by
\[ F(x) = \int_a^x f\,d\lambda, \quad x \in (a, b). \]
a. Prove that F is continuous on (a, b).
b. If f is continuous at xo ∈ (a, b), prove that F is differentiable at xo
with F ′ (xo ) = f (xo ).

15. *If f ∈ L([0, 1]), show that for every t ∈ R the function x → sin(tf(x))
is in L([0, 1]), and that $g(t) = \int_{[0,1]} \sin(t f(x))\,d\lambda(x)$ is a differentiable
function of t ∈ R. Find g′(t).
16. *If f ∈ L(R), prove that $\lim_{n\to\infty} \int_{\mathbb{R}} f(x) \sin nx\,d\lambda = 0$.
17. Let f ∈ L(R).
a. Prove that $\int_{\mathbb{R}} f(x + t)\,d\lambda = \int_{\mathbb{R}} f(x)\,d\lambda$.
b. Prove that $\lim_{t\to 0} \int_{\mathbb{R}} |f(x + t) - f(x)|\,d\lambda = 0$.
18. Let f be a nonnegative measurable function on (a, b] satisfying f ∈ R[c, b]
for every c, a < c < b. Prove that
\[ \int_a^b f\,d\lambda = \lim_{c\to a^+} \int_c^b f(x)\,dx. \]

19. *(Monotone Convergence Theorem) Let {fn} be a monotone increasing
sequence of nonnegative measurable functions on a measurable
set A. Prove that
\[ \int_A \left( \lim_{n\to\infty} f_n \right) d\lambda = \lim_{n\to\infty} \int_A f_n\,d\lambda. \]

20. Show that the monotone convergence theorem is false for decreasing se-
quences of measurable functions.
21. *a. Let f be a nonnegative measurable function on a measurable set A,
and let {An} be a sequence of pairwise disjoint measurable subsets of A
with $\bigcup_n A_n = A$. Prove that
\[ \int_A f\,d\lambda = \sum_{n=1}^{\infty} \int_{A_n} f\,d\lambda. \]

b. Prove that the conclusion of part (a) is still valid for arbitrary
f ∈ L(A).
22. Let {fn } be a sequence of measurable functions on [a, b] satisfying
|fn (x)| ≤ g(x) a.e., where g is integrable on [a, b]. If lim fn (x) = f (x)
n→∞
exists a.e. on [a, b], and h is any bounded measurable function on [a, b],
prove that
\[ \int_a^b f h\,d\lambda = \lim_{n\to\infty} \int_a^b f_n h\,d\lambda. \]
23. *As in Exercise 4, Section 6.4, let f be defined on (0, 1) by
\[ f(x) = \frac{d}{dx}\left( x^2 \sin \frac{1}{x^2} \right). \]
Prove that f is not Lebesgue integrable on (0, 1).
24. Let $f$ be a nonnegative measurable function on $[a, b]$. For each $t \ge 0$, let
$m_f(t) = \lambda(\{x \in [a, b] : f(x) > t\})$.
a. Prove that $m_f(t)$ is monotone decreasing on $[0, \infty)$.
b. Prove that $\int_a^b f \, d\lambda = \int_0^{\infty} m_f(t) \, dt$.

10.8 Square Integrable Functions


In analogy with the space ℓ2 of square summable sequences, we define the
space L2 of square integrable functions as follows.

DEFINITION 10.8.1 Let $A$ be a measurable subset of $\mathbb{R}$. We denote by
$L^2(A)$ the set of all measurable functions $f$ on $A$ for which $|f|^2$ is integrable
on $A$. For $f \in L^2(A)$, set
\[ \|f\|_2 = \left( \int_A |f|^2 \, d\lambda \right)^{1/2}. \]

The quantity $\|f\|_2$ is called the 2-norm or norm of $f$. Clearly, $\|f\|_2 \ge 0$,
and from the definition it follows that if $f \in L^2(A)$ and $c \in \mathbb{R}$, then $cf \in L^2(A)$
with $\|cf\|_2 = |c| \, \|f\|_2$. We will shortly prove that if $f, g \in L^2(A)$, then
$f + g \in L^2(A)$ with $\|f + g\|_2 \le \|f\|_2 + \|g\|_2$. Thus $L^2(A)$ is a vector space
over $\mathbb{R}$.
If $f \in L^2(A)$ satisfies $\|f\|_2 = 0$, then by Exercise 1, Section 10.7, $|f|^2 = 0$
a.e., and thus $f = 0$ a.e. on $A$. This does not mean that $f(x) = 0$ for all
$x \in A$; only that $f = 0$ except on a set of measure zero. Thus $\| \cdot \|_2$ satisfies
all the properties of a norm except for $\|f\|_2 = 0$ if and only if $f = 0$. To get
around this difficulty we will consider any two functions $f$ and $g$ in $L^2(A)$ for
which $f = g$ a.e. as representing the same function. Formally, we define two
measurable functions $f$ and $g$ on $A$ to be equivalent if $f = g$ a.e. In this
way it is possible to define $L^2(A)$ as the set of equivalence classes of square
integrable functions on $A$. Rather than proceeding in this formal fashion, we
will take the customary approach of simply saying that two functions in $L^2$
are equal if and only if they are equal almost everywhere. With this definition
$L^2(A)$ is a normed linear space.

EXAMPLES 10.8.2
(a) For our first example let $f(x) = 1/\sqrt{x}$, $x \in (0, 1)$. By Exercise 4 of
Section 10.7, $f$ is integrable on $(0, 1)$ with $\int_0^1 f \, d\lambda = 2$. Since $f^2(x) = 1/x$, by
Example 10.7.3(a), $f^2$ is not integrable on $(0, 1)$ and thus $f \notin L^2((0, 1))$. On
the other hand, if $g(x) = x^{-1/3}$, then $g^2(x) = x^{-2/3}$, which by Exercise 4 of
the previous section is integrable. Thus $g \in L^2((0, 1))$ with
\[ \|g\|_2^2 = \int_0^1 x^{-2/3} \, dx = 3. \]

(b) Consider the function $f(x) = 1/x$ for $x \in [1, \infty)$. For any $n \in \mathbb{N}$, $n \ge 2$,
\[ \int_1^n |f|^2 \, d\lambda = \int_1^n x^{-2} \, dx = \left( 1 - \frac{1}{n} \right). \]
Thus
\[ \|f\|_2^2 = \int_1^{\infty} |f|^2 \, d\lambda = \lim_{n \to \infty} \int_1^n |f|^2 \, d\lambda = 1. \]
Therefore $f \in L^2([1, \infty))$ with $\|f\|_2 = 1$. It is easily shown that $f \notin L([1, \infty))$.
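
The values in Examples 10.8.2 can be checked numerically. The following Python sketch is only an illustration and not part of the text's argument: the helper names are hypothetical, and each truncated integral is evaluated from its elementary antiderivative. It suggests that $\int_\epsilon^1 x^{-2/3}\,dx$ approaches 3 while $\int_\epsilon^1 x^{-1}\,dx$ grows without bound as $\epsilon \to 0^+$, and that $\int_1^n x^{-2}\,dx$ approaches 1.

```python
import math

def g_squared_integral(eps):
    """Integral of x**(-2/3) over [eps, 1], from the antiderivative 3*x**(1/3)."""
    return 3.0 * (1.0 - eps ** (1.0 / 3.0))

def f_squared_integral(eps):
    """Integral of 1/x over [eps, 1], from the antiderivative ln x."""
    return -math.log(eps)

def tail_integral(n):
    """Integral of x**(-2) over [1, n], from the antiderivative -1/x."""
    return 1.0 - 1.0 / n

# Example (a): g^2 stays bounded near 0, f^2 does not.
for eps in (1e-3, 1e-6, 1e-12):
    print(eps, g_squared_integral(eps), f_squared_integral(eps))

# Example (b): the truncated integrals increase to 1.
for n in (10, 1000, 10**6):
    print(n, tail_integral(n))
```

The first column pair illustrates why $g \in L^2((0,1))$ but $f \notin L^2((0,1))$; the second loop mirrors the limit computed in part (b).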




Cauchy-Schwarz Inequality
Our first result will be the analogue of the Cauchy-Schwarz inequality for ℓ2 .
The following inequality is sometimes also referred to as Hölder’s inequality.

THEOREM 10.8.3 (Cauchy-Schwarz Inequality) Let $A$ be a measurable
subset of $\mathbb{R}$. If $f, g \in L^2(A)$, then $fg$ is integrable on $A$ with
\[ \int_A |fg| \, d\lambda \le \|f\|_2 \, \|g\|_2. \]

Proof. By Theorem 10.5.4, the product $fg$ is measurable, and for any $x \in A$,
we have
\[ |f(x) g(x)| \le \tfrac{1}{2} \left( |f(x)|^2 + |g(x)|^2 \right). \]
Since by hypothesis $f, g \in L^2(A)$, the function $|fg|$ is integrable on $A$, and
thus $fg$ is integrable on $A$. As in the proof of Theorem 7.4.3, for $\gamma \in \mathbb{R}$
\[ 0 \le \int_A (|f| - \gamma |g|)^2 \, d\lambda = \|f\|_2^2 + \gamma^2 \|g\|_2^2 - 2\gamma \int_A |fg| \, d\lambda. \tag{10} \]
If $\|g\|_2 = 0$, then by Exercise 1 of Section 10.7, $g = 0$ a.e. on $A$. As a
consequence, $\int_A |fg| \, d\lambda = 0$, and the conclusion holds. If $\|g\|_2 \ne 0$, set
$\gamma = \left( \int_A |fg| \, d\lambda \right) / \|g\|_2^2$. With $\gamma$ as defined, (10) becomes
\[ 0 \le \|f\|_2^2 - \frac{\left( \int_A |fg| \, d\lambda \right)^2}{\|g\|_2^2}, \]
and thus $\left( \int_A |fg| \, d\lambda \right)^2 \le \|f\|_2^2 \, \|g\|_2^2$, which proves the result. □
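
As a numerical illustration of Theorem 10.8.3, the following Python sketch compares both sides of the inequality for one pair of functions. It is a rough check under stated assumptions, not a proof: the sample functions $f(x) = x$ and $g(x) = e^x$ on $[0, 1]$ and the helper `midpoint_integral` are choices made here for illustration, and the integrals are approximated by the midpoint rule.

```python
import math

def midpoint_integral(h, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    width = (b - a) / n
    return width * sum(h(a + (i + 0.5) * width) for i in range(n))

f = lambda x: x            # sample function in L^2([0, 1])
g = lambda x: math.exp(x)  # sample function in L^2([0, 1])

# Left side: integral of |f g|.  Right side: product of the 2-norms.
lhs = midpoint_integral(lambda x: abs(f(x) * g(x)), 0.0, 1.0)
norm_f = math.sqrt(midpoint_integral(lambda x: f(x) ** 2, 0.0, 1.0))
norm_g = math.sqrt(midpoint_integral(lambda x: g(x) ** 2, 0.0, 1.0))
rhs = norm_f * norm_g

print(lhs, rhs)
```

For this pair the exact values are $\int_0^1 x e^x \, dx = 1$ and $\|f\|_2 \|g\|_2 = \sqrt{(e^2 - 1)/6}$, so the inequality holds with a small gap.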

Our next result is Minkowski’s inequality for the space L2 . Since the proof
of this is identical to the proof of Theorem 7.4.5, we leave the details to the
exercises.

THEOREM 10.8.4 (Minkowski's Inequality) Let $A$ be a measurable
subset of $\mathbb{R}$. If $f, g \in L^2(A)$, then $f + g \in L^2(A)$ with
\[ \|f + g\|_2 \le \|f\|_2 + \|g\|_2. \]

Proof. Exercise 4. □

The Normed Linear Space $L^2([a, b])$

If $A$ is a measurable subset of $\mathbb{R}$, the norm $\| \cdot \|_2$ on $L^2(A)$ satisfies the
following properties:
(a) $\|f\|_2 \ge 0$ for any $f \in L^2(A)$.
(b) $\|f\|_2 = 0$ if and only if $f = 0$ a.e. on $A$.
(c) $\|cf\|_2 = |c| \, \|f\|_2$ for all $f \in L^2(A)$ and $c \in \mathbb{R}$.
(d) $\|f + g\|_2 \le \|f\|_2 + \|g\|_2$ for all $f, g \in L^2(A)$.
Properties (a) and (c) follow from the definition, and (d) is Minkowski's
inequality. With the convention that two functions $f$ and $g$ are equal if and
only if $f = g$ a.e., $L^2(A)$ is a normed linear space.

By Definition 8.3.9, a sequence $\{f_n\}$ in $L^2$ converges to a function $f$ in $L^2$
if and only if $\lim_{n \to \infty} \|f - f_n\|_2 = 0$. Example 10.6.12(b) shows that convergence
in $L^2$ does not imply pointwise convergence of the sequence. As in Chapter
9, convergence in $L^2$ is usually called convergence in the mean (or norm
convergence). We now prove that $L^2([a, b])$ is a complete normed linear
space.

THEOREM 10.8.5 The normed linear space $(L^2([a, b]), \| \cdot \|_2)$ is complete.

Remark. Although we state and prove the result for a closed and bounded
interval, the same method of proof will work for L2 (A) where A is any mea-
surable subset of R.
Before proving the theorem, we first state and prove the following lemma.

LEMMA 10.8.6 Let $A$ be a measurable subset of $\mathbb{R}$. Suppose $\{f_n\}$ is a monotone
increasing sequence of nonnegative measurable functions on $A$ satisfying
\[ \int_A f_n \, d\lambda \le C, \quad \text{for all } n \in \mathbb{N}, \]
and for some finite constant $C$. Then $f(x) = \lim_{n \to \infty} f_n(x)$ is finite a.e. on $A$.

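Proof. Since $\{f_n(x)\}$ is monotone increasing for each $x \in A$, the sequence
either converges to a real number or diverges to $\infty$. Let $f(x) = \lim_{n \to \infty} f_n(x)$,
and let
\[ E = \{x \in A : f(x) = \infty\}. \]
We will prove that $\lambda(E) = 0$. For each $k \in \mathbb{N}$, let
\[ E_k = \{x \in A : f(x) > k\}. \]
Then $E_k \supset E_{k+1}$ for all $k \in \mathbb{N}$ with $\bigcap_{k \in \mathbb{N}} E_k = E$. For fixed $k \in \mathbb{N}$, set
\[ A_{n,k} = \{x \in A : f_n(x) > k\}, \quad n \in \mathbb{N}. \]
Then $A_{n,k} \subset A_{n+1,k}$ with $\bigcup_{n \in \mathbb{N}} A_{n,k} = E_k$. Thus by Theorem 10.4.6(a),
\[ \lambda(E_k) = \lim_{n \to \infty} \lambda(A_{n,k}). \]
But
\[ \lambda(A_{n,k}) = \int_{A_{n,k}} 1 \, d\lambda \le \frac{1}{k} \int_{A_{n,k}} f_n \, d\lambda \le \frac{1}{k} \int_A f_n \, d\lambda \le \frac{C}{k}. \]
Therefore $\lambda(E_k) \le C/k$ for all $k \in \mathbb{N}$. Since $\lambda(E_1) < \infty$, by Theorem 10.4.6(b),
\[ \lambda(E) = \lim_{k \to \infty} \lambda(E_k) = 0. \]
Therefore $f$ is finite a.e. on $A$. □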



Proof of Theorem 10.8.5. Let $\{f_n\}$ be a Cauchy sequence in $L^2$. Then
given $\epsilon > 0$, there exists $n_o \in \mathbb{N}$ such that $\|f_n - f_m\|_2 < \epsilon$ for all $n, m \ge n_o$.
For each $k \in \mathbb{N}$, let $n_k$ be the smallest integer such that $\|f_m - f_n\|_2 < 1/2^k$
for all $m, n \ge n_k$. Then $n_1 \le n_2 \le \cdots \le n_k \le \cdots$, and
\[ \|f_{n_{k+1}} - f_{n_k}\|_2 < \frac{1}{2^k}. \]
For each $k \in \mathbb{N}$, set
\[ g_k = |f_{n_1}| + |f_{n_2} - f_{n_1}| + \cdots + |f_{n_{k+1}} - f_{n_k}|. \]
By Minkowski's inequality,
\[ \int_{[a,b]} g_k^2 \, d\lambda = \|g_k\|_2^2 \le \Big( \|f_{n_1}\|_2 + \sum_{j=1}^{k} \|f_{n_{j+1}} - f_{n_j}\|_2 \Big)^2 \le \Big( \|f_{n_1}\|_2 + \sum_{j=1}^{k} \frac{1}{2^j} \Big)^2 \le (\|f_{n_1}\|_2 + 1)^2. \]
Thus the sequence $\{g_k^2\}$ satisfies the hypothesis of Lemma 10.8.6. Therefore
$\lim_{k \to \infty} g_k^2$ is finite a.e. on $[a, b]$. But then $\lim_{k \to \infty} g_k$ is also finite a.e. on $[a, b]$. As a
consequence the series
\[ |f_{n_1}(x)| + \sum_{j=1}^{\infty} |f_{n_{j+1}}(x) - f_{n_j}(x)| \]
converges a.e. on $[a, b]$, and therefore so does the series
\[ f_{n_1}(x) + \sum_{j=1}^{\infty} (f_{n_{j+1}}(x) - f_{n_j}(x)). \]
But the $k$th partial sum of this series is $f_{n_{k+1}}(x)$. Therefore the sequence
$\{f_{n_k}\}_{k \in \mathbb{N}}$ converges a.e. on $[a, b]$. Let $E$ denote the set of $x \in [a, b]$ for which
this sequence converges. Then $\lambda([a, b] \setminus E) = 0$. Define
\[ f(x) = \begin{cases} \lim_{k \to \infty} f_{n_k}(x), & x \in E, \\ 0, & \text{otherwise}. \end{cases} \]
Then $\{f_{n_k}\}$ converges to $f$ a.e. on $[a, b]$.

It remains to be shown that $f \in L^2$, and that $\{f_n\}$ converges to $f$ in $L^2$.
Since
\[ |f_{n_{k+1}}| \le |f_{n_1}| + \sum_{j=1}^{k} |f_{n_{j+1}} - f_{n_j}| = g_k, \]

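by Fatou's lemma
\[ \int_{[a,b]} |f|^2 \, d\lambda \le \liminf_{k \to \infty} \int_{[a,b]} g_k^2 \, d\lambda < \infty. \]
Thus $f \in L^2$. Finally, since
\[ f(x) - f_{n_k}(x) = \lim_{j \to \infty} (f_{n_{j+1}}(x) - f_{n_k}(x)) \quad \text{a.e.}, \]
by Fatou's lemma again,
\[ \int_{[a,b]} |f - f_{n_k}|^2 \, d\lambda \le \liminf_{j \to \infty} \int_{[a,b]} |f_{n_{j+1}} - f_{n_k}|^2 \, d\lambda \le \left( \frac{1}{2^k} \right)^2. \]
Therefore $\|f - f_{n_k}\|_2 \le 1/2^k$ for all $k \in \mathbb{N}$. Thus the subsequence $\{f_{n_k}\}_{k \in \mathbb{N}}$
converges to $f$ in the norm of $L^2$. Finally by the triangle inequality,
\[ \|f - f_n\|_2 \le \|f - f_{n_k}\|_2 + \|f_{n_k} - f_n\|_2 \le \frac{1}{2^k} + \|f_{n_k} - f_n\|_2. \]
From this it now follows that the original sequence $\{f_n\}$ also converges to $f$
in the norm of $L^2$. □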

Fourier Series
We close this chapter by making a few observations about Fourier series and
the space $L^2([-\pi, \pi])$. As in Definition 9.3.1, if $f$ is Lebesgue integrable on
$[-\pi, \pi]$, the Fourier coefficients of $f$ with respect to the orthogonal system
$\{1, \cos nx, \sin nx\}_{n=1}^{\infty}$ are given by
\[ a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx \, dx, \quad n = 0, 1, 2, \ldots, \]
\[ b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx \, dx, \quad n = 1, 2, \ldots \]
Since $f$ is Lebesgue integrable, the functions $f(x) \sin nx$ and $f(x) \cos nx$ are
measurable on $[-\pi, \pi]$, and in absolute value less than or equal to $|f(x)|$. Thus
the functions $f(x) \cos nx$ and $f(x) \sin nx$ are all integrable on $[-\pi, \pi]$.
The same method of proof used in proving Bessel's inequality for Riemann
integrable functions proves the following (Exercise 8): If $f \in L^2([-\pi, \pi])$ and
$\{a_k\}$ and $\{b_k\}$ are the Fourier coefficients of $f$, then
\[ \frac{1}{2} a_0^2 + \sum_{k=1}^{\infty} (a_k^2 + b_k^2) \le \frac{1}{\pi} \int_{-\pi}^{\pi} f^2 \, d\lambda. \quad \text{(Bessel's Inequality)} \]
Thus the sequences $\{a_k\}_{k=0}^{\infty}$ and $\{b_k\}_{k=1}^{\infty}$ are square summable. We now use
completeness of the space $L^2([-\pi, \pi])$ to prove the converse.

THEOREM 10.8.7 If $\{a_k\}_{k=0}^{\infty}$ and $\{b_k\}_{k=1}^{\infty}$ are any sequences of real
numbers satisfying
\[ \frac{1}{2} a_0^2 + \sum_{k=1}^{\infty} (a_k^2 + b_k^2) < \infty, \]
then there exists $f \in L^2([-\pi, \pi])$ whose Fourier coefficients are precisely $\{a_k\}$
and $\{b_k\}$.

Proof. For each $n \in \mathbb{N}$, set
\[ S_n(x) = \frac{1}{2} a_0 + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx). \]
Since each $S_n$ is continuous, $S_n$ is square integrable on $[-\pi, \pi]$. If $m < n$, then
\[ \|S_n - S_m\|_2^2 = \int_{-\pi}^{\pi} \Big[ \sum_{k=m+1}^{n} (a_k \cos kx + b_k \sin kx) \Big]^2 dx, \]
which by orthogonality
\[ = \pi \sum_{k=m+1}^{n} (a_k^2 + b_k^2). \]
Since the series converges, the sequence $\{S_n\}$ is a Cauchy sequence in
$L^2([-\pi, \pi])$. Thus by Theorem 10.8.5, there exists a function $f \in L^2([-\pi, \pi])$
such that $S_n$ converges to $f$ in $L^2$; i.e., $\lim_{n \to \infty} \|f - S_n\|_2 = 0$. If $n > m$, then
\[ \int_{-\pi}^{\pi} S_n(x) \cos mx \, dx = \pi a_m \quad \text{and} \quad \int_{-\pi}^{\pi} S_n(x) \sin mx \, dx = \pi b_m. \]
Therefore
\[ \left| \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos mx \, dx - a_m \right| = \left| \frac{1}{\pi} \int_{-\pi}^{\pi} (f(x) - S_n(x)) \cos mx \, dx \right|, \]
which by the Cauchy-Schwarz inequality
\[ \le \frac{1}{\pi} \|f - S_n\|_2 \, \|\cos mx\|_2 = \frac{1}{\sqrt{\pi}} \|f - S_n\|_2. \]
Since this holds for all $n > m$, letting $n \to \infty$ gives
\[ a_m = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos mx \, dx. \]
A similar argument proves that the $b_m$ are the sine coefficients of $f$. □
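
The orthogonality identity $\|S_n - S_m\|_2^2 = \pi \sum_{k=m+1}^{n} (a_k^2 + b_k^2)$ used in the proof can be checked numerically. The Python sketch below is an illustration only: the coefficient choices $a_k = 1/k$ and $b_k = 1/k^2$ and all function names are assumptions made here, and the integral over $[-\pi, \pi]$ is approximated by an equally spaced midpoint rule.

```python
import math

def a(k):
    return 1.0 / k        # hypothetical sample cosine coefficients

def b(k):
    return 1.0 / k ** 2   # hypothetical sample sine coefficients

def partial_sum_difference(x, m, n):
    """S_n(x) - S_m(x) for m < n: the terms with index m+1, ..., n."""
    return sum(a(k) * math.cos(k * x) + b(k) * math.sin(k * x)
               for k in range(m + 1, n + 1))

def norm_squared_difference(m, n, points=2000):
    """Midpoint-rule approximation of the integral of (S_n - S_m)^2 over [-pi, pi]."""
    width = 2.0 * math.pi / points
    return width * sum(
        partial_sum_difference(-math.pi + (i + 0.5) * width, m, n) ** 2
        for i in range(points))

m, n = 2, 5
lhs = norm_squared_difference(m, n)
rhs = math.pi * sum(a(k) ** 2 + b(k) ** 2 for k in range(m + 1, n + 1))
print(lhs, rhs)
```

Because the tail $\sum_{k > m} (a_k^2 + b_k^2)$ of a convergent series tends to 0, the right side can be made small by taking $m$ large, which is the Cauchy property of $\{S_n\}$ used in the proof.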

Is every Trigonometric Series a Fourier Series?


In Section 9.3 we showed that the series $\sum_{n=2}^{\infty} \frac{\sin nx}{\ln n}$, even though it converges
for all $x \in \mathbb{R}$, is not the Fourier series of a Riemann integrable function on
$[-\pi, \pi]$. Since $\sum (\ln n)^{-2} = \infty$, by Bessel's inequality it is also not the Fourier
series of a square integrable function. This however does not rule out the
possibility that it is the Fourier series of a Lebesgue integrable function. The
following interesting classical result is very useful in providing an answer to
this question. Since the proof of the theorem is beyond the level of this text,
we state the result without proof.²

THEOREM 10.8.8

(a) If $b_n > 0$ for all $n$ and $\sum_{n=1}^{\infty} \frac{b_n}{n} = \infty$, then
\[ \sum_{n=1}^{\infty} b_n \sin nx \]
is not the Fourier series of a Lebesgue integrable function.

(b) If $\{a_n\}$ is a sequence of nonnegative real numbers with $\lim_{n \to \infty} a_n = 0$,
satisfying $a_n \le \frac{1}{2}(a_{n-1} + a_{n+1})$, then the series
\[ \sum_{n=1}^{\infty} a_n \cos nx \]
is the Fourier series of a nonnegative Lebesgue integrable function on $[-\pi, \pi]$.

Since the sequence $\{1/(\ln n)\}$ satisfies hypothesis (a), the series $\sum_{n=2}^{\infty} \frac{\sin nx}{\ln n}$
is not the Fourier series of any integrable function on $[-\pi, \pi]$. However, it is
interesting to note that the sequence $\{1/(\ln n)\}$ also satisfies hypothesis (b)
(Exercise 13), and thus the series
\[ \sum_{n=2}^{\infty} \frac{\cos nx}{\ln n} \]
is the Fourier series of a nonnegative Lebesgue integrable function.
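
The two hypotheses of Theorem 10.8.8 can be explored numerically for the sequence $1/\ln n$. The Python sketch below offers only numerical evidence, not a proof (the proof of hypothesis (b) is Exercise 13): the function names and the ranges tested are choices made here. It checks the convexity condition $c_n \le \frac{1}{2}(c_{n-1} + c_{n+1})$ on a finite range and prints partial sums of $\sum 1/(n \ln n)$, which grow very slowly, roughly like $\ln \ln N$.

```python
import math

def c(n):
    """The sequence 1/ln(n), defined for n >= 2."""
    return 1.0 / math.log(n)

# Hypothesis (b): c(n) <= (c(n-1) + c(n+1)) / 2, checked for 3 <= n < 5000.
convex = all(c(n) <= 0.5 * (c(n - 1) + c(n + 1)) for n in range(3, 5000))

def partial_sum(N):
    """Partial sum of c(n)/n = 1/(n ln n) for 2 <= n <= N."""
    return sum(c(n) / n for n in range(2, N + 1))

print(convex)
# Hypothesis (a): the partial sums keep growing, although slowly.
for N in (10**2, 10**4, 10**6):
    print(N, partial_sum(N))
```

The slow growth is consistent with the integral test, since $\int_2^N \frac{dx}{x \ln x} = \ln \ln N - \ln \ln 2$ is unbounded in $N$.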

² A proof of the result can be found in Chapter V of the text by Zygmund.



Exercises 10.8
1. *For $x \in (0, 1)$ let $f_p(x) = x^{-p}$, $p > 0$. Determine all values of $p$ such
that $f_p \in L^2((0, 1))$.
2. For each $n \in \mathbb{N}$, let $I_n = (n, n + 1/n^2)$. For a given sequence $\{c_n\}$ of
real numbers, define $f$ on $[1, \infty)$ by $f(x) = \sum_{n=1}^{\infty} c_n \chi_{I_n}(x)$. Show that
$f \in L^2([1, \infty))$ if and only if $\sum_{n=1}^{\infty} \frac{c_n^2}{n^2} < \infty$.
3. Find an example of a real-valued function $f$ on $(0, \infty)$ such that $f^2$ is
integrable on $(0, \infty)$, but $|f|^p$ is not integrable on $(0, \infty)$ for any $p$, $0 < p < \infty$, $p \ne 2$.
(Hint: Consider the function $g(x) = \dfrac{1}{x(1 + |\ln x|)^2}$.)
4. *Prove Theorem 10.8.4.
5. Let $A$ be a measurable subset of $\mathbb{R}$ with $\lambda(A) < \infty$. If $f \in L^2(A)$, prove
that $f \in L(A)$ with
\[ \int_A |f| \, d\lambda \le \|f\|_2 \, (\lambda(A))^{1/2}. \]
6. Let $\{f_n\}$ be a sequence in $L^2([a, b])$. Suppose $\{f_n\}$ converges in $L^2$ to
$f \in L^2$ and $\{f_n\}$ converges a.e. to some measurable function $g$. Prove
that $f = g$ a.e. on $[a, b]$.
7. *Let $\{f_n\}$ be a sequence in $L^2(A)$ that converges in $L^2$ to a function
$f \in L^2(A)$. If $g \in L^2(A)$, prove that
\[ \lim_{n \to \infty} \int_A f_n g \, d\lambda = \int_A f g \, d\lambda. \]
8. If $f \in L^2([-\pi, \pi])$ and $\{a_k\}$ and $\{b_k\}$ are the Fourier coefficients of $f$,
prove that
\[ \frac{1}{2} a_0^2 + \sum_{k=1}^{\infty} (a_k^2 + b_k^2) \le \frac{1}{\pi} \int_{-\pi}^{\pi} f^2 \, d\lambda. \]

9. Which of the following trigonometric series are Fourier series of an $L^2$
function?
*a. $\sum_{n=1}^{\infty} \dfrac{\cos nx}{n}$. b. $\sum_{n=1}^{\infty} \dfrac{\sin nx}{\sqrt{n}}$. c. $\sum_{n=2}^{\infty} \dfrac{\cos nx}{\sqrt{n} \, \ln n}$.
10. Suppose E is a measurable subset of (−π, π) with λ(E) > 0. Prove that
for each δ > 0, there exist at most finitely many integers n such that
sin nx ≥ δ for all x ∈ E.
11. Let $f \in L^2([a, b])$. Prove that given $\epsilon > 0$, there exists a continuous
function $g$ on $[a, b]$ such that $\|f - g\|_2 < \epsilon$. (Hint: First prove that there exists
a simple function having the desired properties, and then use Exercise
15, Section 10.6, and Lemma 9.4.8.)
12. For $f \in L^2([-\pi, \pi])$, let $\{a_k\}$ and $\{b_k\}$ be the Fourier coefficients of $f$.
a. Prove Parseval's equality:
\[ \frac{1}{2} a_0^2 + \sum_{k=1}^{\infty} (a_k^2 + b_k^2) = \frac{1}{\pi} \int_{-\pi}^{\pi} f^2 \, d\lambda. \]
b. If $a_k = b_k = 0$ for all $k$, prove that $f = 0$ a.e.
13. *Show that the sequence $\{1/(\ln n)\}$ satisfies the hypothesis of Theorem
10.8.8(b).
14. Let $\{\varphi_k\}_{k=1}^{\infty}$ be an orthonormal family in $L^2([a, b])$. If $\{c_k\}_{k=1}^{\infty} \in \ell^2$,
prove that there exists a function $f \in L^2([a, b])$ such that $c_k = \langle f, \varphi_k \rangle$.
Furthermore, for this $f$, $\lim_{n \to \infty} \|s_n - f\|_2 = 0$, where $s_n = \sum_{k=1}^{n} c_k \varphi_k$.

Notes
Lebesgue’s development of the theory of measure and integration was one of the
great mathematical achievements of the twentieth century. His proof that every
bounded measurable function is Lebesgue integrable was based on the new idea of
partitioning the range of a function, rather than its domain. Lebesgue’s theory of
integration also permitted him to provide necessary and sufficient conditions for
Riemann integrability of a bounded function f .
In addition to the fact that the Lebesgue integral enlarged the family of integrable
functions, the power of the Lebesgue integral results from the ease with which it
handles the interchange of limits and integration. For the Riemann integral, uniform
convergence of the sequence {fn } is required. Otherwise, the limit function may not
be Riemann integrable. On the other hand, if {fn } is a sequence of measurable
functions on [a, b], then its pointwise limit f is also measurable. Hence if f is also
bounded, then f is integrable. The bounded convergence theorem is notable for its
simplicity of hypotheses and proof. It only requires that {fn } be uniformly bounded
and converge a.e. on [a, b]. This is sufficient to ensure that
\[ \lim_{n \to \infty} \int_{[a,b]} f_n \, d\lambda = \int_{[a,b]} \left( \lim_{n \to \infty} f_n \right) d\lambda. \]

With the additional hypothesis that the pointwise limit f is Riemann integrable,
the bounded convergence theorem is also applicable to a sequence {fn } of Riemann
integrable functions.
The bounded convergence theorem, or the dominated convergence theorem, are
also the tools required to prove the fundamental theorem of calculus for the Lebesgue
integral.
Theorem A If $f$ is differentiable and $f'$ is bounded on $[a, b]$, then $f'$ is Lebesgue
integrable, and $\int_a^b f' \, d\lambda = f(b) - f(a)$.
The proof of this result was requested in Exercise 8 of Section 10.6. It follows
simply by applying the bounded convergence theorem to the sequence $\{g_n\}$
defined by $g_n(x) = n[f(x + \frac{1}{n}) - f(x)]$. Since $f$ is differentiable, the sequence $\{g_n\}$
converges pointwise to $f'$. Also, by the mean value theorem the sequence $\{g_n\}$ is
uniformly bounded on $[a, b]$. This then establishes the analogue of Theorem 6.3.2 for the
Lebesgue integral. If instead of bounded, one assumes that $f'$ is Lebesgue integrable,
then the result also follows by Lebesgue's dominated convergence theorem.
The Riemann theory of integration allows us to prove that if $f \in \mathcal{R}[a, b]$, and
$F$ is defined by $F(x) = \int_a^x f(t) \, dt$, then $F'(x) = f(x)$ at any $x \in [a, b]$ at which $f$
is continuous. Since $f \in \mathcal{R}[a, b]$ if and only if $f$ is continuous a.e. on $[a, b]$, we have
that $F'(x) = f(x)$ a.e. on $[a, b]$. Although not proved in the text, this result is still
valid for the Lebesgue integral.
Theorem B Let $f$ be Lebesgue integrable on $[a, b]$, and define $F(x) = \int_a^x f \, d\lambda$.
Then $F'(x) = f(x)$ a.e. on $[a, b]$.
By writing $f = f^+ - f^-$, it suffices to assume that $f \ge 0$, and thus $F$ is
monotone increasing on $[a, b]$. It is a fact independent of integration that every
monotone function is differentiable a.e. on $[a, b]$.³ A slight generalization of Theorem
A then gives that
\[ \int_a^x F' \, d\lambda = F(x) - F(a) = \int_a^x f \, d\lambda, \]
or that $\int_a^x [F' - f] \, d\lambda = 0$ for all $x \in [a, b]$. As a consequence of Miscellaneous Exercise
5 we have $F' = f$ a.e. Newton and Leibniz realized the inverse relationship of
differentiation and integration. The above two versions of the fundamental theorem
of calculus provide a rigorous formulation of this inverse relationship for a large class
of functions.
The Lebesgue theory of integration also provides the proper setting for the study
of Fourier series. The bounded convergence theorem was used by Lebesgue to prove
uniqueness of a Fourier series. If
\[ f(x) = \frac{1}{2} A_0 + \sum_{k=1}^{\infty} (A_k \cos kx + B_k \sin kx), \]
for all $x \in [-\pi, \pi]$, then $f$, being the pointwise limit of a sequence of continuous
functions, is automatically measurable on $[-\pi, \pi]$. If $f$ is also bounded, then $f$ is
integrable. If the sequence $\{S_n\}$ of partial sums of the trigonometric series is uniformly
bounded, then by the bounded convergence theorem, the trigonometric series is the
Fourier series of $f$. In his 1903 paper, "Sur les séries trigonométriques,"⁴ Lebesgue
showed that the hypothesis of uniform boundedness of the partial sums may be removed;
boundedness of the function $f$ itself is sufficient. This result was extended in 1912 by de
la Vallée-Poussin⁵ to the case where the function $f$ is integrable on $[-\pi, \pi]$. The reader
is referred to the article by Alan Gluchoff for an overview of how trigonometric series
have influenced the various theories of integration.
³ See page 208 of the text by Natanson.
⁴ Annales Scientifiques de l'École Normale Supérieure, (3)20 (1903), 453–485.
⁵ "Sur l'unicité du développement trigonométrique," Bull de l'Acad. Royale de Belgique
(1912), 702–718; see also Chapter 9 of the text by Zygmund.



Miscellaneous Exercises
1. Let $A$ be a measurable subset of $\mathbb{R}$. For $f \in L(A)$ set
\[ \|f\|_1 = \int_A |f| \, d\lambda. \]
Prove that $(L(A), \| \cdot \|_1)$ is a complete normed linear space.


2. Let A be a measurable subset of R, and let f, fn , n = 1, 2, ... be measur-
able functions on A. The sequence {fn } is said to converge in measure
to f if for every δ > 0,
\[ \lim_{n \to \infty} \lambda(\{x \in A : |f_n(x) - f(x)| \ge \delta\}) = 0. \]
a. If {fn } is a sequence in L(A), and {fn } converges in the norm of L(A)
to f ∈ L(A), prove that {fn } converges in measure to f .
b. Find a sequence {fn } of measurable functions on a measurable set A
that converges to a function f in measure, but does not converge to f in
norm.
c. If λ(A) is finite and {fn } is a sequence of measurable functions that
converges in measure to a measurable function f , prove that there exists
a subsequence {fnk } of {fn } such that fnk → f a.e. on A.
d. Show that the sequence {fn } of Example 10.6.12(b) converges to 0 in
measure.
e. Find a sequence {fn } of measurable functions on a measurable set A
such that fn → 0 everywhere on A but {fn } does not converge to 0 in
measure.
3. Let $\{\varphi_n\}$ be a sequence of orthogonal functions on $[a, b]$ having the
property that the only continuous real-valued function $f$ satisfying $\int_a^b f \varphi_n \, d\lambda = 0$
for all $n \in \mathbb{N}$, is the zero function. Prove that the system $\{\varphi_n\}$ is
complete. (Hint: First use the hypothesis to prove that if $f \in L^2([a, b])$ satisfies
$\int_a^b f \varphi_n \, d\lambda = 0$ for all $n \in \mathbb{N}$ then $f = 0$ a.e. Next use completeness of
$L^2$ to prove that Parseval's equality holds for every $f \in L^2([a, b])$.)
4. (Construction of a Nonmeasurable Set) For each $x \in [-\frac{1}{2}, \frac{1}{2}]$, define
the set $K(x)$ by
\[ K(x) = \{y \in [-\tfrac{1}{2}, \tfrac{1}{2}] : y - x \in \mathbb{Q}\}. \]
a. Prove that for any $x, y \in [-\frac{1}{2}, \frac{1}{2}]$, either $K(x) \cap K(y) = \emptyset$ or $K(x) = K(y)$.
(Note: $K(x) = K(y)$ does not imply that $x = y$; it only implies
that $x - y$ is rational.)
Consider the family $\mathcal{F} = \{K(x) : x \in [-\frac{1}{2}, \frac{1}{2}]\}$ of disjoint subsets of
$[-\frac{1}{2}, \frac{1}{2}]$. Choose one point from each distinct set in this family and let $A$
denote the set of points selected. The ability to choose such a point from
each of the disjoint sets requires an axiom from set theory known as the
axiom of choice. Further information about this very important axiom
can be found in the text by Halmos.
Let $r_k$, $k = 0, 1, 2, \ldots$, be an enumeration of the rationals in $[-1, 1]$, with
$r_0 = 0$, and for each $k = 0, 1, 2, \ldots$, set $A_k = A + r_k$.
b. Show that the collection $\{A_k\}$ is pairwise disjoint with
$[-\frac{1}{2}, \frac{1}{2}] \subset \bigcup_{k=0}^{\infty} A_k \subset [-\frac{3}{2}, \frac{3}{2}]$.
c. Use the above to show that $\lambda_*(A) = 0$ and $\lambda^*(A) > 0$, thus proving
that $A$ is nonmeasurable.
5. Suppose $f$ is Lebesgue integrable on $[a, b]$. If $\int_a^x f \, d\lambda = 0$ for every $x \in [a, b]$,
prove that $f = 0$ a.e.

Supplemental Reading

Botts, T., "Probability theory and the Lebesgue integral," Math. Mag. 42 (1969), 105–111.
Burkill, H., "The periods of a periodic function," Math Mag. 47 (1974), 206–210.
Darst, R. B., "Some Cantor sets and Cantor functions," Math. Mag. 45 (1972), 2–7.
Dressler, R. E. and Stromberg, K. R., "The Tonelli integral," Amer. Math. Monthly 81 (1974), 67–68.
Gluchoff, A. D., "Trigonometric series and theories of integration," Math. Mag. 67 (1994), 3–20.
Katznelson, Y. and Stromberg, K., "Everywhere differentiable, nowhere monotone function," Amer. Math. Monthly 81 (1974), 349–354.
Koliha, J. J., "A fundamental theorem of calculus for Lebesgue integration," Amer. Math. Monthly 113 (2006), 551–555.
Kraft, R. L., "What's the difference between Cantor sets," Amer. Math. Monthly 101 (1994), 640–650.
Maligranda, L., "A simple proof of the Hölder and Minkowski inequality," Amer. Math. Monthly 102 (1995), 256–259.
Mc Shane, E. J., "A unified theory of integration," Amer. Math. Monthly 80 (1973), 349–359.
Priestly, W. M., "Sets thick and thin," Amer. Math. Monthly 83 (1976), 648–650.
Thompson, B. S., "Monotone convergence theorem for the Riemann integral," Amer. Math. Monthly 117 (2010), 547–550.
Varberg, D. E., "On absolutely continuous functions," Amer. Math. Monthly 72 (1965), 831–841.
Xiang, J. X., "A note on the Cauchy-Schwarz inequality," Amer. Math. Monthly 120 (2013), 456–459.
Wade, W. R., "The bounded convergence theorem," Amer. Math. Monthly 81 (1974), 387–389.

Bibliography

Berg, P. W. & McGregor, J. L., Elementary Partial Differential Equations,
Holden-Day, Oakland, CA, 1966.
Hewitt, E. & Stromberg, K., Real and Abstract Analysis, Springer-Verlag,
New York, 1965.
Katz, Victor J., A History of Mathematics, Harper Collins, New York,
1993.
Natanson, I. P., Theory of Functions of a Real Variable, vol. I, Frederick
Ungar Publ. Co., New York, 1964.
Rudin, W., Principles of Mathematical Analysis, McGraw-Hill, Inc., New
York, 1976.
Titchmarsh, E. C., The Theory of Functions, Oxford University Press,
1939.
Weinberger, H. F., Partial Differential Equations, John Wiley & Sons, New
York, 1965.
Zygmund, A., Trigonometric Series, vol. I & II, Cambridge University
Press, London, 1968.

Hints and Solutions

Chapter 1
Exercises 1.1 page 5
2. (a) A ∩ B = B, A ∩ Z = {−1, 0, 1, 2, 3, 4, 5}, B ∩ C = {x : 2 ≤ x ≤ 3}. (b)
A × B = {(x, y) : −1 ≤ x ≤ 5, 0 ≤ y ≤ 3}. 4. (a) Suppose x ∈ A ∩ (B ∩ C).
Then x ∈ A and x ∈ B ∩ C. Since x ∈ B ∩ C, x ∈ B and x ∈ C. Thus x ∈ A ∩ B
and x ∈ C. Therefore x ∈ (A ∩ B) ∩ C. This proves A ∩ (B ∩ C) ⊂ (A ∩ B) ∩ C.
The reverse containment is proved similarly. 7.(a) Let x ∈ A ∪ (B ∩ C). Then
x ∈ A or x ∈ B ∩ C. If x ∈ A, then x ∈ A ∪ B and x ∈ A ∪ C. Therefore
x ∈ (A ∪ B) ∩ (A ∪ C). If x ∈ B ∩ C, then x ∈ B and x ∈ C. Hence x ∈ A ∪ B and
x ∈ A ∪ C, i.e., x ∈ (A ∪ B) ∩ (A ∪ C). Thus A ∪ (B ∩ C) ⊂ (A ∪ B) ∩ (A ∪ C). The
reverse containment is proved similarly. (c) Let x ∈ C \ (A ∩ B). Then x ∈ C and
x ∉ A ∩ B. Since x ∉ A ∩ B we have x ∉ A or x ∉ B. If x ∉ A then x ∈ C \ A.
Likewise, if x ∉ B, then x ∈ C \ B. In either case, x ∈ (C \ A) ∪ (C \ B). Therefore
C \ (A ∩ B) ⊂ (C \ A) ∪ (C \ B). The reverse containment is proved similarly.
8. If A = {1, 2, 3} then P(A) = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, A}. 11. Let
(x, y) ∈ A × (B1 ∪ B2 ). Then x ∈ A and y ∈ B1 ∪ B2 . Thus x ∈ A and y ∈ B1 or
y ∈ B2 . If y ∈ B1 then (x, y) ∈ A × B1 . If y ∈ B2 , then (x, y) ∈ A × B2 . In either
case, (x, y) ∈ (A × B1 ) ∪ (A × B2 ). The reverse containment is proved similarly.
Exercises 1.2 page 14
1. (b) No. The ordered pairs (1, −1) and (1, 3) contradict the definition of function.
(d) In terms of ordered pairs k = {(−1, 1), (0, 3), (1, 5), (4, 7)} and thus is a function
from A into B. 2. (a) No! The ordered pairs (0, 1) and (0, −1) are both elements
of A. This however contradicts the definition of function. 3. (a) f ({1, 2, 3, 4}) =
{1, 3, 5, 7} and f −1 ({1, 2, 3,√4}) = {1.2}. 4. (a) f (A) = √ {y : 0 ≤ y ≤ 9}; f −1 (A) =
3 3 −1
{x : x + 1 ∈ A} = {x : − 3 ≤ x ≤ 1}. (c) f (x) = 3 x − 1, x ∈ R.
5. (b) For k ∈ N, (f ◦ g)(k) = 2k + 3. Therefore (f ◦ g)(N) = {2k + 3 : k = 1, 2, 3, . . . }
which is the set of odd integers greater than or equal to 5. 6. (b) Range f = R, f is
one-to-one, and $x = f^{-1}(y) = \frac{1}{3}(y + 2)$. (e) Range f = R, f is not one-to-one: If
$y_1 \ne y_2$, then $(x, y_1) \ne (x, y_2)$ for any x, yet $f(x, y_1) = f(x, y_2)$.
(f) Range $f = \{y : \frac{1}{2} \le y \le 1\}$. f is not one-to-one. If 0 < x ≤ 1, then f(−x) =
f(x). 7. (a) Range $f = \{(x, y) \in \mathbb{R} \times \mathbb{R} : x^2 + y^2 = 1\}$. (b) $f^{-1}((1, 0)) = 0$,
$f^{-1}((0, -1)) = \frac{3\pi}{2}$. 11. Assume f is not one-to-one and show that this leads to
a contradiction.
a contradiction.
Exercises 1.3 page 20
1. (b) For n = 1, $1 = 1^2$. Assume the result is true for n = k. Then for n = k + 1,
$1 + 3 + 5 + \cdots + (2k - 1) + (2(k + 1) - 1) = k^2 + (2k + 1) = (k + 1)^2$. (d) When n = 1,
$1^3 = [\frac{1}{2} \cdot 1 \cdot 2]^2$. Assume true for n = k. Then for n = k + 1, $1^3 + 2^3 + \cdots + k^3 + (k + 1)^3 =
[\frac{1}{2} k(k + 1)]^2 + (k + 1)^3 = \frac{1}{4}(k + 1)^2 (k^2 + 4k + 4) = [\frac{1}{2}(k + 1)(k + 2)]^2$. (f) When
n = 1, $x^2 - y^2 = (x - y)(x + y)$ and equality holds. For n = k + 1 write
$x^{k+2} - y^{k+2} = x^{k+2} - x y^{k+1} + x y^{k+1} - y^{k+2} = x(x^{k+1} - y^{k+1}) + (x - y) y^{k+1}$, and now
apply the induction hypothesis. 2. (a) The result is true for n = 1. Assume that for
$k \in \mathbb{N}$, $2^k > k$. Then by the induction hypothesis, $2^{k+1} = 2^k \cdot 2 > k \cdot 2 = k + k \ge k + 1$.
(c) For n = 4, $4! = 24 > 16 = 2^4$. Thus the inequality is true when n = 4. Assume
that $n! > 2^n$ for some n ≥ 4. Then $(n + 1)! = (n + 1) n! > (n + 1) 2^n > 2 \cdot 2^n = 2^{n+1}$.
Thus by the modified principle of mathematical induction the inequality holds for
all $n \in \mathbb{N}$, n ≥ 4. (d) True for n = 3. Assume true for k ≥ 3. Then for n = k + 1,
$1^3 + 2^3 + \cdots + k^3 + (k + 1)^3 < \frac{1}{2} k^4 + (k + 1)^3 = \frac{1}{2}[k^4 + 2k^3 + 6k^2 + 6k + 2]$.
But for k ≥ 2, $6k + 2 < 4k + 1 + 2k^3$ from which the result now follows. 4. For
$n \in \mathbb{N}$ let P(n) be the statement $f(n) = 3 \cdot 2^n + (-1)^n$. Then P(n) is true for
n = 1, 2. For k ≥ 3, assume that P(j) is true for all $j \in \mathbb{N}$, j < k. Use the fact that
f(k) = 2f(k − 2) + f(k − 1) and the induction hypothesis to show that P(k) is true.
Thus by the second principle of mathematical induction the result holds
for all n ∈ N. 5. (b) f (n) = n2 . (d) f (n) = 0 if n is even and f (n) =
(n−1)/2 1 3
(−1) n! if n is odd. (f ) f (2k + 1) = k+2 a1 and f (2k) = 2k+3 a2 . These can
be proved by induction on k. 7. For each $n \in \mathbb{N}$ let $S_n = r + r^2 + \cdots + r^n$. Then
$S_n - r S_n = r - r^{n+1}$, from which the result follows. 8. Hint: Let $A = \frac{1}{n}(a_1 + \cdots + a_n)$
and write $a_{n+1} = xA$ for some x ≥ 0. Use the induction hypothesis to prove that
$(a_1 \cdots a_n \cdot a_{n+1})^{1/(n+1)} \le x^{1/(n+1)} A$. Now use Bernoulli's inequality to prove that
$x^{1/(n+1)} \le (n + x)/(n + 1)$. From this it now follows that
\[ x^{1/(n+1)} A \le \frac{n + x}{n + 1} A = \frac{1}{n + 1}(a_1 + \cdots + a_n + a_{n+1}). \]
Exercises 1.4. page 28
4. (a) Consider (a − b)2 . 5. (a) inf A = 0, sup A = 1. (c) inf C = 0, sup C = ∞.
(f ) inf F = 1, sup E = 3. (h) inf H = −2, sup H = 2 8. Apply Theorem
1.4.4. 14. (b) Since A and B are non-empty and bounded above, α = sup A and
β = sup B both exist in R. Since α = sup A we have a ≤ α for all a ∈ A. Similarly
b ≤ β for all b ∈ B. Therefore a + b ≤ α + β for all a ∈ A, b ∈ B. Thus α + β
is an upper bound for A + B, and thus γ = sup(A + B) ≤ α + β. To prove the
reverse inequality, we first note that since γ is an upper bound for A + B, a + b ≤ γ
for all a ∈ A, b ∈ B. Let b ∈ B be arbitrary, but fixed. Then a ≤ γ − b for all
a ∈ A. Thus γ − b is an upper bound for A and hence α ≤ γ − b. Since this holds
for all b ∈ B, we also have that b ≤ γ − α for all b ∈ B. Thus β ≤ γ − α; i.e.,
α + β ≤ γ. 15. (a) Let α = sup{f (x) : x ∈ X}, β = sup{g(x) : x ∈ X}. Since
the ranges of f and g are bounded, α and β are finite with f (x) + g(x) ≤ α + β
for every x ∈ X. Therefore α + β is an upper bound for {f (x) + g(x) : x ∈ X}.
Thus sup{f (x) + g(x) : x ∈ X} ≤ α + β. (d) Let α = sup{g(x) : x ∈ X}. Hence
g(x) ≤ α for all x ∈ X. Thus by hypothesis f (x) ≤ α for all x ∈ X. Therefore α
is an upper bound for {f (x) : x ∈ X}. As a consequence sup{f (x) : x ∈ X} ≤ α.
16. (a) F (x) = 3x + 2, sup{F (x) : x ∈ [0, 1]} = 5. (c) Range f = [0, 5]. Therefore
sup{f (x, y) : (x, y) ∈ X × Y } = 5. 20. (a) F (x) = 3x + 2 and H(y) = 2y. Thus
sup{H(y) : y ∈ [0, 1]} = 2 and inf{F (x) : x ∈ [0, 1]} = 2.
Exercises 1.5. page 32
1. First prove that if p and q are positive integers then there exists n ∈ N such that
np > q. If p > q then n = 1 works. If p ≤ q, consider (q + 1)p. 4. Suppose r1 , r2
are rational with $r_1 < r_2$. Then $r_2 - r_1 > 0$ and $r = r_1 + \frac{1}{2}(r_2 - r_1)$ is rational with
$r_1 < r < r_2$. 6. (a) Use the fact that $\sqrt{2}/2$ is irrational. (b) Use Theorem 1.5.2 and
(a). 8. Since x < y and u > 0, we have x/u < y/u. Now apply Theorem 1.5.2 and
(a).
Exercises 1.6. page 36
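1. (a) .0202020 . . . 2. (c) $.0101 = \frac{1}{2^2} + \frac{1}{2^4} = \frac{5}{16}$. (d) $.010101\cdots = \frac{1}{2^2} + \frac{1}{2^4} + \cdots = \frac{1}{2^2} \sum_{n=0}^{\infty} (\frac{1}{4})^n = \frac{1}{4} \left[ \frac{1}{1 - \frac{1}{4}} \right] = \frac{1}{3}$.
(f) $.001001\cdots = \frac{1}{2^3} + \frac{1}{2^6} + \frac{1}{2^9} + \cdots = \frac{1}{2^3} \sum_{n=0}^{\infty} (\frac{1}{8})^n = \frac{1}{8} \left[ \frac{1}{1 - \frac{1}{8}} \right] = \frac{1}{7}$.
3. (a) $.0022 = \frac{0}{3} + \frac{0}{3^2} + \frac{2}{3^3} + \frac{2}{3^4} = \frac{2}{27} + \frac{2}{81} = \frac{8}{81}$.
(d) $.101010\cdots = \frac{1}{3} + \frac{0}{3^2} + \frac{1}{3^3} + \frac{0}{3^4} + \frac{1}{3^5} + \cdots = \frac{1}{3} \sum_{n=0}^{\infty} (\frac{1}{9})^n = \frac{1}{3} \left[ \frac{1}{1 - \frac{1}{9}} \right] = \frac{3}{8}$.
(f) $.121212\cdots = \sum_{k=0}^{\infty} \frac{1}{3^{2k+1}} + \sum_{k=1}^{\infty} \frac{2}{3^{2k}} = \frac{1}{3} \sum_{n=0}^{\infty} (\frac{1}{9})^n + \frac{2}{9} \sum_{n=0}^{\infty} (\frac{1}{9})^n = \frac{1}{3} \cdot \frac{9}{8} + \frac{2}{9} \cdot \frac{9}{8} = \frac{5}{8}$.
4. $.010101\cdots$. 5. Finite binary expansion is .0011 whereas the infinite binary
expansion is $.0010111\cdots$.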
Exercises 1.7. page 45
1. (c) Let f : N → O be defined by f (n) = 2n − 1. 4. (a) g(x) = a + x(b − a) is a
one-to-one mapping of (0, 1) onto (a, b). 6. (a) Since A ∼ X, there exists a one-to-one
function h from A onto X. Similarly, there exists a one-to-one function g from B onto
Y. To prove the result, show that F : A × B → X × Y defined by F (a, b) = (h(a), g(b))
is one-to-one and onto. 8. (a) $\bigcup A_n = \mathbb{R}$, $\bigcap A_n = \{x : -1 < x < 1\}$.
(c) $\bigcup A_n = (-1, 2)$, $\bigcap A_n = [0, 1]$. (e) $\bigcup A_n = (0, 1)$, $\bigcap A_n = \{\frac{1}{2}\}$. 12. (a) Let
y ∈ f (∪α Eα ). Then y = f (x) for some x ∈ ∪α Eα . But then x ∈ Eα for some α.
Therefore y = f (x) ∈ f (Eα ). Thus f (∪α Eα ) ⊂ ∪α f (Eα ). The reverse containment
follows similarly. 13. (b) Since the set of rational number Q is countable and
the set of real numbers R is uncountable, by part (a) the set of irrational numbers,
namely R \ Q, is uncountable. 15. (a) For n ∈ N let Pn denote the set of all
polynomials in x of degree less than or equal to n with rational coefficients, and let
$\mathbb{Q}^{n+1} = \mathbb{Q} \times \cdots \times \mathbb{Q}$ ($n + 1$ times). By repeated application of Exercise 6 (b), $\mathbb{Q}^{n+1}$ is
countable. Define $f : \mathbb{Q}^{n+1} \to P_n$ by $f(a_0, a_1, \ldots, a_n) = a_n x^n + \cdots + a_1 x + a_0$. Since
$f$ maps $\mathbb{Q}^{n+1}$ onto $P_n$ and $P_n$ is infinite, $P_n$ is countable. (b) By part (a) $P_n$ is
countable. Thus by Theorem 1.7.15, ∪n Pn , the set of all polynomials with rational
coefficients, is countable. 18. (a) Consider the function on $(0, 1)$ that for each
$n \in \mathbb{N}$, $n \ge 2$, maps $\frac1n$ to $\frac{1}{n-1}$, and is the identity mapping elsewhere. 19. For a
polynomial $p(x) = a_n x^n + \cdots + a_1 x + a_0$, consider the height $h$ of the polynomial
defined by $h = n + |a_0| + |a_1| + \cdots + |a_n|$. Prove that there are only a finite number of
polynomials with integer coefficients of a given height h, and therefore only a finite
number of algebraic numbers arising from polynomials of a given height h. 22. If
$f$ is a function from $A$ to $\mathcal{P}(A)$, show that $f$ is not onto by considering the set
$\{x \in A : x \notin f(x)\}$. 23. For $a, b \in [0, 1]$ with decimal expansion $a = .a_1a_2\ldots$ and
$b = .b_1b_2\ldots$, consider the function $f : [0, 1] \times [0, 1] \to [0, 1]$ by $f(a, b) = .a_1b_1a_2b_2\ldots$.

Chapter 2
Exercises 2.1. page 56
2. (b) We first note that |x| = |x − y + y| ≤ |x − y| + |y|. Therefore |x| − |y| ≤ |x − y|.
Interchanging x and y gives |y| − |x| ≤ |y − x| = |x − y|. Now use the definition of
||x| − |y||. 5. (a) −3 ≤ x ≤ 13/3. (c) −1 < x < 2. 7. (c) This is a metric.
The only nontrivial part is the triangle inequality. This follows from the following.
Since the ln function is increasing on (0, ∞), for a, b positive we have
ln(1 + a + b) ≤ ln((1 + a)(1 + b)) = ln(1 + a) + ln(1 + b).
11. (a)(i) Since the points are collinear, the distance is just the usual Euclidean
distance, i.e. $d((\frac12, \frac14), (-\frac12, -\frac14)) = \frac12\sqrt5$. (ii) The points are not collinear. So
$d((\frac12, \frac12), (0, 1)) = \frac12\sqrt2 + 1$. 12. (b) For $x \in [0, 1]$, $|x - x^2| = x - x^2$, which has
a maximum at $x = \frac12$. Therefore $d(f, g) = \frac14$. 14. Again the only non-trivial part is
the triangle inequality and it follows from the following: for $a, b$ positive,
\[ \frac{a+b}{1+a+b} = \frac{a}{1+a+b} + \frac{b}{1+a+b} \le \frac{a}{1+a} + \frac{b}{1+b}. \]
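The inequality used in Exercise 14 can be sanity-checked numerically. The sketch below assumes the metric in question is the bounded metric $d(x, y) = |x - y|/(1 + |x - y|)$ on the real line (the hint suggests this form, but the exercise statement is not reproduced here), and tests the triangle inequality exactly on a small grid of rational points.

```python
from fractions import Fraction
from itertools import product

def d(x, y):
    """Bounded metric |x - y| / (1 + |x - y|) (assumed form of the exercise's metric)."""
    t = abs(x - y)
    return t / (1 + t)

# Sample points: multiples of 1/4 in [-3, 3], kept exact with Fraction.
points = [Fraction(k, 4) for k in range(-12, 13)]

# Collect every triple that would violate d(x, z) <= d(x, y) + d(y, z).
violations = [
    (x, y, z)
    for x, y, z in product(points, repeat=3)
    if d(x, z) > d(x, y) + d(y, z)
]
print(len(violations))
```

A finite check of this kind is only evidence, not a proof; the proof is the displayed inequality together with the monotonicity of $t \mapsto t/(1+t)$.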
Exercises 2.2 page 67
2. Let $p \in O = \bigcup_\alpha O_\alpha$ be arbitrary. Then $p \in O_{\alpha_o}$ for some $\alpha_o \in A$. Since $O_{\alpha_o}$
is open, there exists $\epsilon > 0$ such that $N_\epsilon(p) \subset O_{\alpha_o}$. But then $N_\epsilon(p) \subset O$; i.e., $p$ is
an interior point of $O$. 3. (a) By Corollary 2.2.16 a finite set has no limit points.
Now apply Theorem 2.2.14. This can also be proved directly by showing that the
complement of a finite set is the finite union of open intervals.
5. (b) Since $\sqrt{x_1^2 + x_2^2} \le |x_1| + |x_2|$ we obtain $N^1_\epsilon(p) \subset N^2_\epsilon(p)$. Likewise, since
$\max\{|x_1|, |x_2|\} \le \sqrt{x_1^2 + x_2^2}$ it follows that $N^2_\epsilon(p) \subset N^\infty_\epsilon(p)$. The last containment
follows since $|x_1| + |x_2| \le 2\max\{|x_1|, |x_2|\}$. 8. (a) For $E = (0, 1) \cup \{2\}$, $\operatorname{Int} E =
(0, 1)$, $E' = [0, 1]$, isolated points $= \{2\}$, $\overline{E} = [0, 1] \cup \{2\}$. (d) For $E = \{\frac1n : n \in
\mathbb{N}\}$, $\operatorname{Int}(E) = \emptyset$, $E' = \{0\}$, isolated points $= E$, $\overline{E} = E \cup \{0\}$. 9. (a) $\{a, b\}$.
13. (a) Closed in $X$. (c) Neither. 16. Since $\alpha \notin A$, for every $\epsilon > 0$ there exists
an $a \in A$ such that $\alpha - \epsilon < a < \alpha$. Therefore $\alpha$ is a limit point of $A$. 17. (a)
Let $p \in \operatorname{Int}(E)$ be arbitrary. Since $p$ is an interior point of $E$, there exists an $\epsilon > 0$
such that $N_\epsilon(p) \subset E$. To show that $N_\epsilon(p) \subset \operatorname{Int}(E)$ it remains to be shown that
every $q \in N_\epsilon(p)$ is an interior point of $E$. 19. (a) Since $A \cup B$ is a subset of $\overline{A} \cup \overline{B}$
(which is closed), by Theorem 2.2.18(c) $\overline{A \cup B} \subset \overline{A} \cup \overline{B}$. The reverse containment
follows analogously. 22. Let $\{r_n\}_{n=1}^{\infty}$ be an enumeration of $\mathbb{Q}$, and $\{\epsilon_n\}_{n=1}^{\infty}$ an
enumeration of the positive rational numbers. Take $\mathcal{I} = \{N_{\epsilon_j}(r_n) : j, n \in \mathbb{N}\}$.
23. Suppose U ⊂ R is open and suppose E ⊂ U is open in U . Let p ∈ E be arbitrary.
Use the fact that E is open in U and that U is open to show that there exists an ǫ > 0
such that Nǫ (p) ⊂ E. Thus p is an interior point of E. Since p ∈ E was arbitrary,
E is open in R. The converse is obvious. 25. (a) Let U = (0, 1) and V = ( 23 , 52 ).
26. Hint: Use the fact that $\overline{A} = A \cup A'$, and if $U$ open satisfies $U \cap A' \ne \emptyset$ then
$U \cap A \ne \emptyset$. 27. Let $A$ be a non-empty subset of $\mathbb{R}$. If $A$ contains at least two points
and is not an interval, then there exist $r, s \in A$ with $r < s$ and $t \in \mathbb{R}$ with $r < t < s$,
but $t \notin A$. The open sets $U = (-\infty, t)$ and $V = (t, \infty)$ will prove that $A$ is not
connected. Therefore, every connected set is an interval. Conversely, suppose $A$ is
an interval and $A$ is not connected. Then there exist disjoint open sets $U$ and $V$
with $A \cap U \ne \emptyset$, $A \cap V \ne \emptyset$, and $A \subset U \cup V$. Suppose $a \in A \cap U$ and $b \in A \cap V$. By
Theorem 2.2.20 applied to $U$ and $V$ there exist disjoint open intervals $I$ and $J$ such
that $a \in I$ and $b \in J$. Suppose $a < b$ and $J = (t, s)$. Show that $t \notin U \cup V$, but $t \in A$.
This contradiction proves that A is connected.
Exercises 2.3. page 73
1. (b) Use the fact that 0 is a limit point of A. 3. (a) Let U = {Uα }α∈A be an
open cover of A ∪ B. Then U is also an open cover of A and B, respectively. Now
use the compactness of A and B to obtain a finite subcover of A ∪ B. 4. Since
K is compact, by Theorem 2.3.5 K is closed. Now show that K is bounded. Let
α = sup K. This exists since K is non-empty and bounded. Now use the fact that
K is closed to show that α ∈ K.
Exercises 2.4. page 76
1. Take Kn = [0, n]. 3. Suppose In = [an , bn ]. Since In ⊃ Im for all m ≥ n, we
have an ≤ am ≤ bm ≤ bn for all m ≥ n. Let a = sup{an } and b = inf{bn }. Now
show that ∩In = [a, b].
Exercises 2.5. page 79
4. If $x \in P$ with $x = .a_1a_2\cdots$, $a_n \in \{0, 2\}$, set $b_n = \frac12 a_n$.
Consider the function
$x \to .b_1b_2\cdots$.

Chapter 3
Exercises 3.1. page 87
1. (a) Let $a_n = (3n + 5)/(2n + 7)$. Then $|a_n - \frac32| = 11/(4n + 14) < 3/n$. Given $\epsilon > 0$,
choose $n_o \in \mathbb{N}$ such that $n_o \ge 3/\epsilon$. Then for all $n \ge n_o$, $|a_n - \frac32| < \epsilon$. Therefore
$\lim a_n = \frac32$. (c) Set $a_n = (n^2 + 1)/2n^2$. Then $|a_n - \frac12| = \frac{1}{2n^2} \le \frac{1}{2n}$. Given $\epsilon > 0$,
choose $n_o \in \mathbb{N}$ such that $n_o > 1/(2\epsilon)$. Then for all $n \ge n_o$, $|a_n - \frac12| < \epsilon$. Thus
$\lim a_n = \frac12$. (f) First show that $\sqrt{n+1} - \sqrt{n} = 1/(\sqrt{n+1} + \sqrt{n}) < 1/(2\sqrt{n})$.
Now given $\epsilon > 0$ choose $n_o$ such that $1/(2\sqrt{n}) < \epsilon$ for all $n \ge n_o$. 2. (a) If $n$ is
even then $n(1 + (-1)^n) = 2n$. Thus the sequence is unbounded and diverges in $\mathbb{R}$.
(c) When $n$ is even, i.e., $n = 2k$, then $\sin\frac{n\pi}{2} = \sin k\pi = 0$. On the other hand, when
$n$ is odd, i.e. $n = 2k + 1$, then $\sin\frac{n\pi}{2} = \sin(2k+1)\frac{\pi}{2} = (-1)^k$. Thus $\sin\frac{n\pi}{2}$ assumes the
values $-1$, $0$, and $1$ for infinitely many values of $n$. Therefore the sequence diverges.
4. (a) Write $b = 1 + a$ where $a > 0$. By Example 1.3.2(b), $b^n \ge 1 + na$. Now use
the previous exercise. 5. First show that $|a_n^2 - a^2| \le (|a_n| + |a|)|a_n - a|$. Now use
the fact that since $\{a_n\}$ converges, there exists a positive constant $M$ such that
$|a_n| \le M$ for all $n \in \mathbb{N}$. 6. Consider $a = 0$ and $a > 0$ separately. For $a > 0$,
$\sqrt{a_n} - \sqrt{a} = (a_n - a)/(\sqrt{a_n} + \sqrt{a})$. 8. Take $\epsilon = a/2$. Then for this $\epsilon$, there exists
$n_o \in \mathbb{N}$ such that $|a_n - a| < a/2$ for all $n \ge n_o$. From this it now follows that
$a_n > a/2$ for all $n \ge n_o$.
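The $\epsilon$-$n_o$ argument of Exercise 1(c) can be tested directly. This is a small sketch for $a_n = (n^2+1)/2n^2$; the function name `n_o` and the sampled tolerances are illustrative choices, and the index is taken strictly larger than $1/(2\epsilon)$ so that the final inequality is strict.

```python
from fractions import Fraction
from math import floor

def a(n):
    """The sequence a_n = (n^2 + 1) / (2 n^2) of Exercise 1(c)."""
    return Fraction(n * n + 1, 2 * n * n)

def n_o(eps):
    """Smallest integer strictly greater than 1/(2*eps)."""
    return floor(1 / (2 * eps)) + 1

for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    start = n_o(eps)
    # Check |a_n - 1/2| < eps on a window of indices beyond n_o.
    ok = all(abs(a(n) - Fraction(1, 2)) < eps for n in range(start, start + 500))
    print(eps, start, ok)
```

Only finitely many indices are checked, so this illustrates the choice of $n_o$ rather than proving convergence.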
Exercises 3.2. page 93

5. (b) If $p > 1$, let $x_n = \sqrt[n]{p} - 1$. Apply the inequality of Example 1.3.2(b) to
$(1 + x_n)^n$. 6. (a) 3/5. (c) $-1$. (e) 0. (g) $a/2$. 7. (a) converges to 1.
(c) converges to 0. (e) converges to 2/3. 8. Use the fact that $|\cos x| \le 1$.
10. (a) Suppose $\lim a_{n+1}/a_n = L < 1$. Choose $\epsilon > 0$ such that $L + \epsilon < 1$. For
this $\epsilon$ there exists $n_o \in \mathbb{N}$ such that $a_{n+1}/a_n < L + \epsilon$ for all $n \ge n_o$. From this
one obtains that for $n > n_o$, $0 < a_n \le (L + \epsilon)^{n - n_o} a_{n_o} = M(L + \epsilon)^n$, where
$M = a_{n_o}/(L + \epsilon)^{n_o}$. Since $(L + \epsilon) < 1$, by Theorem 3.2.6(e), $\lim_{n\to\infty}(L + \epsilon)^n = 0$.
The result now follows by Theorem 3.2.4. 11. (a) With $a_n = n^2 a^n$, $0 < a < 1$,
$L = \lim_{n\to\infty}\frac{a_{n+1}}{a_n} = a\lim_{n\to\infty}(1 + \frac1n)^2 = a$. Thus since $L = a < 1$, the sequence converges.
12. Set $x_n = (a_n - 1)/(a_n + 1)$ and solve for $a_n$. 13. (b) Verify the result for $n = 1$.
Assume the result holds for $n = k$. For $n = k + 1$, $(1 + a)^{k+1} = (1 + a)(1 + a)^k$
which by the induction hypothesis
\[ = (1 + a)\sum_{j=0}^{k}\binom{k}{j}a^j = \sum_{j=0}^{k}\binom{k}{j}a^j + \sum_{j=1}^{k+1}\binom{k}{j-1}a^j. \]
Using part (a) show that the above is equal to $\sum_{j=0}^{k+1}\binom{k+1}{j}a^j$. 14. (a) By considering
the sequence $\{a_k - a\}$ show first that one can assume $a = 0$.
Exercises 3.3. page 101

1. Take $I_n = [n, \infty)$. 2. (a) Set $a_n = \sqrt{n^2 + 1}/n = \sqrt{1 + 1/n^2}$. Since $1/(n+1)^2 <
1/n^2$ we have $a_{n+1} < a_n$ and $\lim a_n = 1$. 4. Use mathematical induction to show
that $a_n > 1$ for all $n \in \mathbb{N}$. From the inequality $2ab \le a^2 + b^2$, $a, b \ge 0$, we have
$2a_n \le a_n^2 + 1$ or $a_{n+1} = 2 - 1/a_n \le a_n$. Therefore $\{a_n\}$ is monotone decreasing.
Finally, if $a = \lim_{n\to\infty} a_n$, then $a = \lim_{n\to\infty} a_{n+1} = \lim_{n\to\infty}(2 - 1/a_n) = 2 - 1/a$. Therefore
$a = 1$. 6. (a) Use induction to show that $x_n > \sqrt{\alpha}$ for all $n$. The inequality
$ab \le \frac12(a^2 + b^2)$, $a, b \ge 0$ should prove useful. Use the fact that $x_n > \sqrt{\alpha}$ to prove
that $\{x_n\}$ is monotone decreasing. Consider $x_{n+1} - x_n$ and simplify. 8. (c) $\{a_n\}$
is monotone increasing with $a_n \le 3$ for all $n$. If $a = \lim a_n$ then $a^2 = 2a + 3$. Thus
$a = 3$. (e) $\{a_n\}$ is monotone decreasing with $a_n > 2$ for all $n$ and $\lim a_n = 2$.
10. (a) $e^2$ (c) $e^{3/2}$. 11. To show that $\{s_n\}$ is unbounded show that $s_{2^n} \ge 1 + \frac{n}{2}$.
Hint: First show it for $n = 1, 2$, and $3$, then use mathematical induction to prove the
result for all $n \in \mathbb{N}$. 13. Hint: For $k \ge 2$, $\frac{1}{k^2} \le \frac{1}{k(k-1)} = \frac{1}{k-1} - \frac1k$. 15. (a) Write
$a = 1 + b$ with $b > 0$. Now use the binomial theorem to show that $a^n/n \ge cn$ for
some positive constant $c$ and $n$ sufficiently large. (c) $n + \frac{(-1)^n}{n} \ge n - \frac1n \ge (n - 1)$.
16. (d) The sequence is not monotone: If $x_n = n + (-1)^n\sqrt{n}$, then $x_{2n+1} < x_{2n}$.
21. This problem is somewhat tricky. It is not sufficient to just choose a monotone
increasing sequence in the set; one also has to guarantee that the sequence converges
to the least upper bound of the set. Let E be a non-empty subset of R that is bounded
above. Let $U$ denote the set of upper bounds of $E$. Since $E \ne \emptyset$, we can choose an
element $x_1 \in E$. Also, since $E$ is bounded above, $U \ne \emptyset$. Choose $\beta_1 \in U$. Let
$\alpha_1 = \frac12(x_1 + \beta_1)$, and consider the two intervals $[x_1, \alpha_1]$ and $(\alpha_1, \beta_1]$. Since $x_1 \in E$,
one or both of these intervals have non-empty intersection with $E$. If $(\alpha_1, \beta_1] \cap E \ne \emptyset$,
choose $x_2 \in E$ such that $\alpha_1 < x_2 \le \beta_1$. In this case set $\beta_2 = \beta_1$. If $(\alpha_1, \beta_1] \cap E = \emptyset$,
choose $x_2 \in E$ such that $x_1 \le x_2 \le \alpha_1$, and set $\beta_2 = \alpha_1$. In this case $\beta_2 \in U$.
Proceeding inductively construct two monotone sequences $\{x_n\}$ and $\{\beta_n\}$ such that
(a) $\{x_n\} \subset E$ with $x_n \le x_{n+1}$ for all $n$, (b) $\{\beta_n\} \subset U$ with $\beta_n \ge \beta_{n+1}$ for all $n$,
and (c) $0 \le \beta_n - x_n \le 2^{-n+1}(\beta_1 - x_1)$. Assuming that every bounded monotone
sequence converges, let $\beta = \lim \beta_n$. By (c) we also have $\beta = \lim x_n$. It only remains
to be shown that $\beta = \sup E$. 22. Suppose $A = \{x_n : n \in \mathbb{N}\}$ is a countable subset
of $[0, 1]$. To show that $A \subsetneq [0, 1]$ proceed as follows: At least one of the three closed
intervals $[0, \frac13]$, $[\frac13, \frac23]$, $[\frac23, 1]$ does not contain $x_1$. Call it $I_1$. Divide $I_1$ into three closed
intervals of length $1/3^2$. At least one of these, say $I_2$, does not contain $x_2$.
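The halving construction in the hint for Exercise 21 can be traced on a concrete set. The sketch below runs it on a small finite set of rationals, where membership in $(\alpha_n, \beta_n]$ can be tested directly; the function name `bisection_sup`, the sample set, and the starting upper bound are all illustrative assumptions.

```python
from fractions import Fraction

def bisection_sup(E, beta1, steps):
    """Run the halving construction on a finite set E with upper bound beta1.

    Returns the list of pairs (x_n, beta_n) for n = 1, ..., steps + 1.
    """
    x, beta = min(E), Fraction(beta1)      # x_1 in E, beta_1 an upper bound of E
    pairs = [(x, beta)]
    for _ in range(steps):
        alpha = (x + beta) / 2
        upper_part = [e for e in E if alpha < e <= beta]
        if upper_part:
            x = min(upper_part)            # any point of E in (alpha, beta] works
        else:
            beta = alpha                   # no point of E exceeds alpha, so alpha is an upper bound
        pairs.append((x, beta))
    return pairs

E = [Fraction(1, 3), Fraction(3, 4), Fraction(5, 4), Fraction(7, 5)]   # illustrative set
pairs = bisection_sup(E, 10, 60)
x1, beta1 = pairs[0]
for n, (x, beta) in enumerate(pairs, start=1):
    # Property (c): 0 <= beta_n - x_n <= 2^(-n+1) (beta_1 - x_1).
    assert 0 <= beta - x <= (beta1 - x1) / 2 ** (n - 1)
print(pairs[-1][0], float(pairs[-1][1]))
```

For a finite set the supremum is just the maximum, so the example only illustrates how the two sequences squeeze together; the point of the exercise is that the same squeeze works for an arbitrary non-empty set that is bounded above.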
Exercises 3.4. page 106
3. (a) $\{-1, 0, 1\}$. (c) $\{1\}$. (e) $\{1, -3\}$. 4. (a) $e^2$. 10. For convenience we
take $n = 2$. Let $\{p_n\}$ be a bounded sequence in $\mathbb{R}^2$ where for each $n$, $p_n = (a_n, b_n)$.
But then the sequences $\{a_n\}$ and $\{b_n\}$ are also bounded. Since $\{a_n\}$ is bounded, by
the Bolzano-Weierstrass theorem there exists a subsequence $\{a_{n_k}\}$ that converges,
to say $a$. Since the sequence $\{b_{n_k}\}$ is also bounded, it has a convergent subsequence,
say $\{b_{n_{k_j}}\}$ that converges to say $b$. But then the subsequence $\{p_{n_{k_j}}\}$ converges to
$(a, b)$.
Exercises 3.5. page 112


1. (a) $\{-\infty, \infty\}$. (c) $\{-1, 1\}$. (f ) {− 32 , 23 }. 3. 0, 1. 6. (a) Let $\alpha = \liminf a_n$ and
$\gamma = \limsup(a_n + b_n)$. Since the sequences are all bounded, $\alpha, \gamma \in \mathbb{R}$. Let $\epsilon > 0$ be given.
Then by Theorems 3.5.3 and 3.5.4 there exists $n_o \in \mathbb{N}$ such that $a_n > \alpha - \epsilon/2$ for all
$n \ge n_o$ and $(a_n + b_n) < \gamma + \epsilon/2$ for all $n \ge n_o$. Therefore $b_n < \gamma + \epsilon/2 - a_n < \gamma - \alpha + \epsilon$.
From this it now follows that $\limsup b_n \le \gamma - \alpha + \epsilon$. Since $\epsilon > 0$ was arbitrary, we have
$\limsup b_n \le \gamma - \alpha$; i.e., $\liminf a_n + \limsup b_n \le \limsup(a_n + b_n)$. The other inequality is proved
similarly. 8. $\{\frac12, 1\}$. Hint: Consider the subsequences $\{s_{2m}\}$ and $\{s_{2m+1}\}$.
10. By Theorem 3.5.7, there exists a subsequence {ank } of {an } such that ank → a.
Since {bn } converges to b, ank bnk → ab. Therefore ab is a subsequential limit of
{an bn }. Thus lim an bn ≤ ab. The reverse inequality follows similarly. The fact that
b ≠ 0 is crucial.
Exercises 3.6. page 118
2. (a) The sequence {(n + 1)/n} converges and thus is Cauchy. (d) The sequence
converges to zero and thus is Cauchy. 4. (a) Consider $s_{2n} - s_n$. 10. For $n \ge 3$,
$|a_{n+1} - a_n| < \frac14|a_n - a_{n-1}|$. Therefore the sequence $\{a_n\}$ is contractive. If $a = \lim a_n$,
then $0 < a < 1$ and $a$ is a solution of $a^2 + 2a - 1 = 0$. 13. (b) Since $(a_{n+1} - a_n) =
(b - 1)(a_n - a_{n-1})$, by induction $(a_{n+1} - a_n) = (b - 1)^{n-1}(a_2 - a_1)$. Therefore,
\[ a_{n+1} - a_1 = \sum_{k=1}^{n}(a_{k+1} - a_k) = (a_2 - a_1)\sum_{k=0}^{n-1}(b - 1)^k = (a_2 - a_1)\frac{1 - (b - 1)^n}{2 - b}. \]
Letting $n \to \infty$ gives $a = a_1 + \frac{1}{2 - b}(a_2 - a_1)$.
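The telescoping formula of Exercise 13(b) can be verified exactly for sample data. In this sketch the recurrence is written as $a_{n+1} = a_n + (b-1)(a_n - a_{n-1})$, and the values of $a_1$, $a_2$, and $b$ are illustrative choices with $|b - 1| < 1$; they are not taken from the exercise.

```python
from fractions import Fraction

def sequence(a1, a2, b, count):
    """First `count` terms of a_{n+1} = a_n + (b - 1)(a_n - a_{n-1})."""
    terms = [Fraction(a1), Fraction(a2)]
    while len(terms) < count:
        terms.append(terms[-1] + (b - 1) * (terms[-1] - terms[-2]))
    return terms

a1, a2, b = Fraction(0), Fraction(1), Fraction(3, 2)   # illustrative values
terms = sequence(a1, a2, b, 40)

# Closed form from the telescoping sum:
# a_{n+1} = a_1 + (a_2 - a_1)(1 - (b - 1)^n) / (2 - b).
for n in range(1, 39):
    closed = a1 + (a2 - a1) * (1 - (b - 1) ** n) / (2 - b)
    assert terms[n] == closed      # terms[n] holds a_{n+1} (0-based list)

limit = a1 + (a2 - a1) / (2 - b)
print(limit, float(terms[-1]))
```

With exact fractions the closed form matches term by term, and the last computed term is already very close to the predicted limit.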
Exercises 3.7. page 123
1. Let $s_n = \sum_{k=1}^{n}\frac{1}{k^2}$. Since $\frac{1}{k^2} \le \frac{1}{k-1} - \frac1k$, $k \ge 2$, for $n \ge 2$,
\[ s_n \le 1 + \sum_{k=2}^{n}\left(\frac{1}{k-1} - \frac1k\right) = 2 - \frac1n \le 2. \]
Therefore $\{s_n\}$ is bounded above and hence converges by Theorem 3.7.6.
5. Use the inequality $ab \le \frac12(a^2 + b^2)$, $a, b \ge 0$.
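The telescoping bound $s_n \le 2 - \frac1n$ from Exercise 1 is easy to confirm for specific $n$. A brief sketch with exact arithmetic (the helper name `s` mirrors the notation of the solution):

```python
from fractions import Fraction

def s(n):
    """Partial sum 1 + 1/4 + ... + 1/n^2, computed exactly."""
    return sum(Fraction(1, k * k) for k in range(1, n + 1))

for n in (2, 10, 100, 500):
    bound = 2 - Fraction(1, n)
    print(n, float(s(n)), float(bound), s(n) <= bound)
```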

Chapter 4
Exercises 4.1. page 142
1. (a) If $f(x) = 2x - 7$, $L = -3$, then $|f(x) - L| = 2|x - 2|$. Given $\epsilon > 0$ take
$\delta = \epsilon/2$. Then for all $x$ with $|x - 2| < \delta$, $|f(x) - L| = 2|x - 2| < 2\delta = \epsilon$. (c) If
$f(x) = x/(1 + x)$, then $|f(x) - \frac12| = \frac{|x - 1|}{2|x + 1|} < \frac12|x - 1|$ for all $x > 0$. Hence given $\epsilon > 0$,
choose $\delta = \min\{2\epsilon, 1\}$. With this choice of $\delta$, $x > 0$ and thus $|f(x) - \frac12| < \frac12\delta \le \epsilon$.
(e) Let $f(x) = (x^3 + 1)/(x + 1)$. Since $x^3 + 1 = (x + 1)(x^2 - x + 1)$, we have
$f(x) = x^2 - x + 1$ for $x \ne -1$. Therefore $|f(x) - 3| = |x + 1||x - 2|$. But for $-2 < x < 0$
we have $|x - 2| < 4$. Therefore for all such $x$ we have $|f(x) - 3| < 4|x - (-1)|$. Given
$\epsilon > 0$ take $\delta = \min\{\frac{\epsilon}{4}, 1\}$. Then for $|x - (-1)| < \delta$ we have $|f(x) - 3| < \epsilon$.
2. (c) We first note that $x^3 - p^3 = (x - p)(x^2 + xp + p^2)$. For $|x - p| < 1$ we have
$|x| < |p| + 1$. Therefore $|x^3 - p^3| < (3|p|^2 + 3|p| + 1)|x - p|$. Hence given $\epsilon > 0$
choose $\delta$ so that $0 < \delta \le \min\{1, \epsilon/(3|p|^2 + 3|p| + 1)\}$. Then for $|x - p| < \delta$ we
have $|x^3 - p^3| < \epsilon$. (e) Note that for $x > 0$, $\sqrt{x} - \sqrt{p} = (x - p)/(\sqrt{x} + \sqrt{p})$.
Therefore $|\sqrt{x} - \sqrt{p}| < |x - p|/\sqrt{p}$. Given $\epsilon > 0$ choose $\delta$ so that $0 < \delta < \sqrt{p}\,\epsilon$.
Then if $|x - p| < \delta$ we have $x > 0$ and $|\sqrt{x} - \sqrt{p}| < \epsilon$. 3. (a) The limit does not
exist. For $x > 0$, $x/|x| = 1$, whereas for $x < 0$, $x/|x| = -1$. (c) The limit does
not exist. Consider the sequence $\{1/n\pi\}$ which has limit 0 as $n \to \infty$. (e) Since
$(x + 1)^2 - 1 = x^2 + 2x$ we have that for $x \ne 0$, $[(x + 1)^2 - 1]/x = x + 2$. Thus the
limit as $x \to 0$ exists and equals 2. 4. $\lim_{x\to -1} f(x) = -3$. 5. (a) By Figure 4.5,
for $0 < t < \pi/2$, $\sin t$ = length of $PQ$ < length of arc $PR$ = $t$. 7. (a) Use the
q− |L|| ≤ |f (x) − L|. (c) Use induction on n and Theorem 4.1.6.
4 1
8. (a) 0. (c) 7
. (e) 4
. (g) 2. Note: sin 2x/x = 2 (sin 2x/2x). 9. Let L =
1
lim f (x). By hypothesis L > 0. Take ǫ = 2
L and use the definition of limit.
x→p
12. Since $g$ is bounded on $E$, there exists a positive constant $M$ such that $|g(x)| \le M$
for all $x \in E$. Thus $|f(x)g(x)| \le M|f(x)|$ for all $x \in E$. Now use the fact that
$\lim_{x\to p} f(x) = 0$. 14. (a) By Theorem 4.1.6 (a), $\lim_{x\to p} g(x) = \lim_{x\to p}(f(x) + g(x)) -
\lim_{x\to p} f(x)$, both of which are assumed to exist. 16. Suppose $\lim_{x\to\infty} f(x) = L$. Let
$\epsilon > 0$ be given. By definition there exists $M > 0$ such that $|f(x) - L| < \epsilon$ for
all $x \in (a, \infty)$ with $x > M$. Let $\delta = 1/M$. Then for all $t \in (0, 1/a)$ with $t < \delta$,
$1/t \in (a, \infty)$ and $1/t > M$. Therefore $|g(t) - L| = |f(\frac1t) - L| < \epsilon$. The proof that
$\lim_{t\to 0} g(t) = L$ implies $\lim_{x\to\infty} f(x) = L$ is similar. 17. (a) 3/2. (c) 2. (e) 1/2.
(g) Limit does not exist. For all $x > \pi/3$, $\cos\frac1x > 1/2$. Thus $x\cos\frac1x > x/2$ and
$x\cos\frac1x \to \infty$ as $x \to \infty$.
Exercises 4.2. page 155
1. (c) Since $1 - \cos x = 2\sin^2(x/2)$, for $x \ne 0$, $g(x) = \frac{2}{x}\sin^2(x/2)$. Now use the
fact that $|\sin t| \le |t|$. (d) Since $\lim_{x\to 2} k(x)$ does not exist, $k$ is not continuous
at $x_o = 2$. 2. (b) The function $f$ is not continuous at 1. Consider $f(p_n)$ where
$\{p_n\}$ is a sequence of irrational numbers with $p_n \to 1$. 4. If $p > 0$, $|f(x) - f(p)| =
|\sqrt{x} - \sqrt{p}| = |x - p|/(\sqrt{x} + \sqrt{p}) < \frac{1}{\sqrt{p}}|x - p|$. Let $\epsilon > 0$ be given. Set $\delta = \min\{p, \sqrt{p}\,\epsilon\}$.
Then $|x - p| < \delta$ implies that $|f(x) - f(p)| < \epsilon$. Therefore $f$ is continuous at $p$. If
$p = 0$, set $\delta = \epsilon^2$. Alternately use Theorem 4.1.3 and Exercise 6 of Section 3.1.
5. (b) Verify that $\lim_{x\to 0} f(x) = 0$. Hence if we set $f(0) = 0$, the function $f$ is
continuous on $[0, 1]$. 6. (a) See Exercise 7(a) of Section 4.1. 7. Use Theorem
4.2.4 and the fact that $x^n$ is continuous on $\mathbb{R}$ for all $n \in \mathbb{N}$. 9. (a) $\mathbb{R} \setminus \{-2, 0, 2\}$.
(c) $\mathbb{R}$. 8. (a) Use an identity for $\cos x - \cos y$ and Exercise 5(a) of Section 4.1.
12. (a) With the metric $d_2$, verify that
\[ d_2(f(x_1, y_1), f(x_2, y_2)) = \sqrt{2[(x_1 - x_2) + (y_1 - y_2)]^2} \le 2\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} = 2d_2((x_1, y_1), (x_2, y_2)). \]
Hence given $\epsilon > 0$, take $\delta < \frac12\epsilon$. 14. (a) Hint: Use the fact that $\max\{f(x), g(x)\} =
\frac12(f(x) + g(x) + |f(x) - g(x)|)$. 16. If $p(x)$ is a polynomial of odd degree, show
that limx→∞ p(x)/p(−x) = −1. Hence there exists an r > 0 such that p(r) and
p(−r) are of opposite sign. Now apply the Intermediate Value Theorem to p(x) on
[−r, r]. 17. Consider g(x) = f (x) − f (x − 1), x ∈ [0, 1]. 19. Let p ∈ E be a limit
point of F . Use Theorem 3.1.4 and continuity of f to show that p ∈ F . 22. Take
ǫ = 1. Then for this choice of ǫ there exists a δ > 0 such that |f (x) − f (p)| ≤ 1 for all
x ∈ Nδ (p) ∩ E. Show that this implies that |f (x)| ≤ (|f (p)| + 1) for all x ∈ Nδ (p) ∩ E.
25. First show that f (K) is closed as follows. Let q be a limit point of f (K). Then
there exists a sequence {pn } in K such that f (pn ) → q. Now use Theorem 3.4.5 and
continuity of f to prove that q ∈ f (K). Now assume f (K) is not bounded. Then
there exists a sequence {pn } in K such that |f (pn )| → ∞. Obtain a contradiction as
above. 27. (a) Suppose $x \in (g \circ f)^{-1}(V)$. Then $g(f(x)) \in V$. Hence by definition
$f(x) \in g^{-1}(V)$. But then $x \in f^{-1}(g^{-1}(V))$. The reverse containment follows likewise.
29. By hypothesis, for each x ∈ K there exists ǫx > 0 and Mx > 0 such that
|f (y)| ≤ Mx for all y ∈ Nǫx (x) ∩ K. The collection {Nǫx (x)}x∈K is an open cover of
K. Now use compactness of K to show that there exists a positive constant M such
that |f (y)| ≤ M for all y ∈ K.
Exercises 4.3. page 161
2. (a) Suppose $f(x) = x^2$ is uniformly continuous on $[0, \infty)$. Then with $\epsilon = 1$, there
exists a $\delta > 0$ such that $|f(x) - f(y)| < 1$ for all $x, y \in [0, \infty)$ satisfying $|x - y| < \delta$.
Set $x_n = n$ and $y_n = n + \frac1n$. If $n_o \in \mathbb{N}$ is such that $n_o\delta > 1$, then $|y_n - x_n| = \frac1n < \delta$
for all $n \ge n_o$. But $|f(y_n) - f(x_n)| = 2 + \frac{1}{n^2} \ge 2$ for all $n$. This is a contradiction!
(c) Take $p_n = \frac{1}{(2n+1)\frac{\pi}{2}}$ and $q_n = \frac{1}{n\pi}$. Then $|h(p_n)| = 1$ and $h(q_n) = 0$ for all
$n$. But $\{p_n\}$ and $\{q_n\}$ both converge to 0. Hence given any $\delta > 0$ there exists
an integer $n_o$ such that $|p_n - q_n| < \delta$ for all $n \ge n_o$ and $|h(p_n) - h(q_n)| = 1$.
Hence $h$ is not uniformly continuous on $(0, \infty)$. 3. (a) For all $x, y \in [0, \infty)$,
\[ |f(x) - f(y)| = \left|\frac{x}{1+x} - \frac{y}{1+y}\right| = \frac{|x - y|}{(1+x)(1+y)} \le |x - y|. \]
Thus given $\epsilon > 0$, the choice $\delta = \epsilon$ will work. (c) We first note that
\[ |h(x) - h(y)| = \frac{|y^2 - x^2|}{(x^2+1)(y^2+1)} \le |y - x|\,\frac{|x| + |y|}{x^2 + 1}. \]
But for $|y - x| < 1$, $|y| < |x| + 1$. As a consequence $(|x| + |y|)/(x^2 + 1) < 2$.
(Verify!) Therefore $|h(x) - h(y)| \le 2|y - x|$. Hence given $\epsilon > 0$, choose $\delta$ so that
$0 < \delta < \min\{1, \epsilon/2\}$. (f ) Set $g(x) = (\sin x)/x$, $x \in (0, 1]$ and $g(0) = 1$. Then $g$
is continuous on $[0, 1]$, and thus by Theorem 4.3.4 uniformly continuous on $[0, 1]$.
From this it now follows that $f$ is uniformly continuous on $(0, 1)$. 4. (a) Show
that $|f(x) - f(y)| = |\frac1x - \frac1y| \le \frac{1}{a^2}|x - y|$ for all $x, y \in [a, \infty)$, $a > 0$. (c) Using a
trigonometric identity for $\sin A - \sin B$ we obtain
\[ \left|\sin\frac1x - \sin\frac1y\right| = 2\left|\sin\frac12\left(\frac1x - \frac1y\right)\cos\frac12\left(\frac1x + \frac1y\right)\right| \le 2\left|\sin\frac12\left(\frac{y - x}{xy}\right)\right|. \]
Now using the inequality $|\sin h| \le |h|$ we have
\[ 2\left|\sin\frac12\left(\frac{y - x}{xy}\right)\right| \le \frac{|y - x|}{xy} \le \frac{1}{a^2}|y - x| \quad \text{for all } x, y \in [a, \infty). \]
5. (a) $|\sqrt{x} - \sqrt{y}| = |x - y|/(\sqrt{x} + \sqrt{y}) \le \frac{1}{2\sqrt{a}}|x - y|$ provided $x, y \in [a, \infty)$.
(c) Assume that it does and obtain a contradiction. 7. (b) Suppose |f | and |g|
are bounded by M1 and M2 respectively. Then
|f (x)g(x) − f (y)g(y)| ≤ |f (x)||g(x) − g(y)| + |g(y)||f (x) − f (y)| ≤ M1 |g(x) − g(y)| +
M2 |f (x) − f (y)|.
Now use the uniform continuity of f and g. 10. Suppose f is not bounded. Then
there exists a sequence {xn } in E such that |f (xn )| → ∞. Since E is bounded,
{xn } has a convergent subsequence {xnk } in R. Thus {xnk } is Cauchy. Since f is
uniformly continuous, {f (xnk )} is Cauchy. But then by Theorem 3.6.2 the sequence
{f (xnk )} is bounded, which is a contradiction. 12. (a) By taking ǫ = 1 show that
there exists ro such that |f (x)| < |L| + 1 for all x ∈ [ro , ∞). Now use the continuity
of f on [a, ro ] to conclude that f is bounded on [a, ∞). 13. (a) Let x1 ∈ E be
arbitrary. For n ≥ 1 set xn+1 = f (xn ). Show that the sequence {xn } is contractive.
Exercises 4.4. page 175
2. (b) $\lim_{x\to 0^+} f(x) = \lim_{x\to 0^-} f(x) = 0$. (d) For $0 < |x| < 1$, $[x^2 - 1] = -1$. Hence
both the right and left limit at 0 exist and are equal to $-1$. (f) $\lim_{x\to 0^+} f(x) = 1$.
Hint: For $\frac{1}{n+1} < x \le \frac1n$, $[\frac1x] = n$. 4. Use the fact that $[x]$ is bounded near
$x_o = 2$. 6. (b) $b = -10$. 7. (b) For $n \in \mathbb{N}$ set $x_n = n$ and $y_n = n + \frac{1}{n\pi}$, and
show that $|f(x_n) - f(y_n)| = \sin 2$. 10. (a) Define $F$ on $[a, b]$ by $F(x) = f(x)$
for x ∈ (a, b] and F (a) = f (a+), which is assumed to exist. Then F is continuous
on [a, b], and thus uniformly continuous by Theorem 4.3.4. Hence f is uniformly
continuous on $(a, b]$. 12. If $n \in \mathbb{N}$, $g(x) = x^n$ is continuous and strictly increasing
on $(0, \infty)$ with Range $g = (0, \infty)$. Therefore by Theorem 4.4.12, its inverse function
$g^{-1}(x) = x^{1/n}$ is also continuous on $(0, \infty)$. From this it now follows that $f(x)$ is
continuous on $(0, \infty)$. 14. (a) Suppose first that $U = (a, b) \subset I$. Then since $f$ is
strictly increasing and continuous on $I$, $f((a, b)) = (f(a), f(b))$, which is open. For
an arbitrary open set $U \subset I$, write $U = \bigcup_n I_n$, where $\{I_n\}$ is a finite or countable
collection of open intervals and use Theorems 1.7.14 and 2.2.9. 16. (b) $f$ is strictly
increasing on $[0, \frac12)$ and strictly decreasing on $[\frac12, 1]$, and thus one-to-one on each of
the intervals. (c) $f([0, \frac12)) = [0, 1)$ and $f([\frac12, 1]) = [1, 2]$. Therefore $f([0, 1]) = [0, 2]$.
(d) For $y \in [0, 1)$, $f^{-1}(y) = \frac12 y$, and for $y \in [1, 2]$, $f^{-1}(y) = \frac12(3 - y)$. Therefore
$f^{-1}(1-) = \frac12$ and $f^{-1}(1+) = 1$. Thus $f^{-1}$ is not continuous at $y_o = 1$.

Chapter 5
Exercises 5.1. page 190
1. (a) For $f(x) = x^3$,
\[ f'(x) = \lim_{h\to 0}\frac{f(x + h) - f(x)}{h} = \lim_{h\to 0}\frac{(x + h)^3 - x^3}{h} = \lim_{h\to 0}(3x^2 + 3xh + h^2) = 3x^2. \]
(c) For $h(x) = 1/x$, $x \ne 0$,
\[ h'(x) = \lim_{h\to 0}\frac{\frac{1}{x+h} - \frac1x}{h} = \lim_{h\to 0}\frac{-1}{x(x + h)} = -\frac{1}{x^2}. \]
(e) For $f(x) = \frac{x}{x+1}$, $x \ne -1$, $\frac{f(x + h) - f(x)}{h} = \frac{1}{(x + 1)(x + h + 1)}$.
Thus $f'(x) = \lim_{h\to 0}\frac{1}{(x + 1)(x + h + 1)} = \frac{1}{(x + 1)^2}$.
2. If $n \in \mathbb{N}$, by the binomial theorem
\[ (x + h)^n - x^n = \sum_{k=1}^{n}\binom{n}{k}h^k x^{n-k} = nhx^{n-1} + h\sum_{k=2}^{n}\binom{n}{k}h^{k-1}x^{n-k}. \]
Dividing by $h$ and taking the limit as $h \to 0$ proves the result for $n \in \mathbb{N}$. If $n \in \mathbb{Z}$
is negative, write $x^n = 1/x^m$, $m \in \mathbb{N}$, and use Theorem 5.1.5(c). 3. (a) Since
$\cos x = \sin(x + \frac{\pi}{2})$, by the chain rule $\frac{d}{dx}\cos x = \cos(x + \frac{\pi}{2}) = -\sin x$. Alternately
use the definition of the derivative. 5. (a) Yes. (c) No. (e) Yes.
7. (a) $f'(x)$ exists for all $x \in \mathbb{R} \setminus \mathbb{Z}$. For $x \in (k, k + 1)$, $k \in \mathbb{Z}$, $f(x) = x[x] = kx$.
Thus $f'(x) = k$. (c) The function $h$ is differentiable at all $x$ where $\sin x \ne 0$.
For $x \in (2k\pi, (2k + 1)\pi)$, $k \in \mathbb{Z}$, $h'(x) = \cos x$. For $x \in ((2k - 1)\pi, 2k\pi)$, $k \in \mathbb{Z}$,
$h'(x) = -\cos x$. 9. (b) For $x \ne 0$, $g'(x) = 2x\sin\frac1x - \cos\frac1x$. Since $\lim_{x\to 0}\cos\frac1x$ does
not exist, $\lim_{x\to 0} g'(x)$ does not exist. 10. (a) $2a + b = 6$. (b) $a = 4$, $b = -2$.
Justify why $b = -2$. 12. (a) $f'(x) = 2/(2x + 1)$. (c) $h'(x) = 3[L(x)]^2/x$.
14. (b) $f'_+(0) = \lim_{h\to 0^+} h^{(b-1)}\sin\frac1h$ which exists and equals 0 if and only if $(b - 1) > 0$;
i.e., $b > 1$. 15. (b) No. Consider $f(x) = |x|$ at $x_o = 0$.
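The simplified difference quotient in Exercise 1(e) can be confirmed with exact arithmetic. This is a minimal sketch for $f(x) = x/(x+1)$ at the sample point $x = 2$ (an arbitrary choice); the helper name `difference_quotient` is illustrative.

```python
from fractions import Fraction

def f(x):
    return x / (x + 1)

def difference_quotient(g, x, h):
    """(g(x + h) - g(x)) / h for a nonzero increment h."""
    return (g(x + h) - g(x)) / h

x = Fraction(2)
for k in range(1, 6):
    h = Fraction(1, 10 ** k)
    q = difference_quotient(f, x, h)
    # Algebraic simplification of the quotient: 1 / ((x + 1)(x + h + 1)).
    assert q == 1 / ((x + 1) * (x + h + 1))
    print(h, q, float(q))

print(1 / (x + 1) ** 2)   # value of f'(2) from the formula 1/(x + 1)^2
```

As $h$ shrinks, the printed quotients approach the value given by the derivative formula, which is what the limit computation asserts.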
Exercises 5.2. page 203
3. (a) Increasing on $\mathbb{R}$. (c) Decreasing on $(-\infty, 0)$ and increasing on $(0, \infty)$,
with an absolute minimum at $x = 0$. (e) Increasing on $(-\infty, 0) \cup (2, \infty)$, decreasing
on $(0, 2)$. The function has a local minimum at $x = 2$. 5. (a) Let $f(x) =$
$(1 + x)^{\frac12}$, $x > -1$. By the mean value theorem $f(x) - f(0) = f'(\zeta)x$ where $\zeta$ is between
0 and $x$. If $x > 0$, then $f'(\zeta) < \frac12$. On the other hand, if $x < 0$ and $x < \zeta < 0$,
then $f'(\zeta) > \frac12$. But then $f'(\zeta)x < \frac12 x$. (c) Take $f(x) = x^\alpha$, $0 < \alpha < 1$. Then
$f(x) - f(a) = f'(\zeta)(x - a)$ where $a < \zeta < x$. But then $f'(\zeta) < \alpha a^{\alpha - 1}$.
6. (a) Show that the function $f(x) = x^{1/n} - (x - 1)^{1/n}$ is decreasing on the interval
$1 \le x \le a/b$. (b) Set $f(x) = \alpha x - x^\alpha$, $x \ge 0$. Prove that $f(x) \ge f(1)$ to obtain
$x^\alpha \le \alpha x + (1 - \alpha)$, $x \ge 0$. Now take $x = a/b$. 7. (a) If $f''(c) > 0$, then there exists
a $\delta > 0$ such that $f'(x)/(x - c) > 0$ for all $x$, $0 < |x - c| < \delta$. Therefore $f'(x) < 0$ on
$(c - \delta, c)$ and $f'(x) > 0$ on $(c, c + \delta)$. Thus $f$ has a local minimum at $c$.
8. Show that $f'(x) = 0$ for all $x \in (a, b)$. 9. Since $P(2) = 0$ we can assume that
$P(x) = a(x - 2)^2 + b(x - 2)$. Now use the fact that $P$ must satisfy $P(1) = 1$ and
$P'(1) = 2$ to determine $a$ and $b$. 10. $a = -2$, $b = 2$, $c = 1$. 12. (a) See Example
4.1.10 (d). (b) By the result of (a), $f'(x) < 0$ for all $x \in (0, \frac{\pi}{2}]$. 14. (a) Let
$t_n \to c$. Since $f'(c)$ exists, $\lim_{n\to\infty}(f(t_n) - f(c))/(t_n - c) = f'(c)$. Now apply the mean
value theorem. 17. Since $f'_+(a) = \lim_{x\to a^+}(f(x) - f(a))/(x - a) > 0$, there exists a
$\delta > 0$ such that $(f(x) - f(a))/(x - a) > 0$ for all $x$, $a < x < a + \delta$.
19. Hint: Consider $f'(x)$. 20. (b) Check the values of $f'(x)$ at $p_n = 1/(n\pi)$, $n \in \mathbb{N}$.
24. (a) For fixed $a > 0$ consider $f(x) = L(ax)$, $x \in (0, \infty)$. (c) By (a) and (b),
$L(b^n) = nL(b)$ for all $n \in \mathbb{Z}$ and $b \in (0, \infty)$. But then $L(b) = L((b^{1/n})^n) = nL(b^{1/n})$.
From this it now follows that $L(b^r) = rL(b)$ for all $r \in \mathbb{Q}$. Now use the continuity
of $L$ to prove that $L(b^x) = xL(b)$ for all $x \in \mathbb{R}$ where $b^x = \sup\{b^r : r \in \mathbb{Q}, r \le
x\}$. 25. (b) Since $\tan(\arctan x) = x$, by Theorem 5.2.14 and the chain rule,
$\frac{d}{dx}\tan(\arctan x) = (\sec^2(\arctan x))(\frac{d}{dx}\arctan x) = 1$. The result now follows from
the identity $\sec^2(\arctan x) = 1 + x^2$. To prove this, consider the right triangle with
sides of length $1$, $|x|$, $\sqrt{1 + x^2}$ respectively.
Exercises 5.3. page 212
2. $\frac{f(x)}{g(x)} = \frac{f(x) - f(x_o)}{x - x_o} \Big/ \frac{g(x) - g(x_o)}{x - x_o}$. Now use Theorem 4.1.6(c) and the definition
of the derivative. 4. Use the fact that since $\lim_{x\to a^+} f(x)$ exists, $f(x)$ is bounded on
$(a, a + \delta)$ for some $\delta > 0$. 6. (a) $\lim_{x\to 1}\frac{x^5 + 2x - 3}{2x^3 - x^2 - 1} = \lim_{x\to 1}\frac{5x^4 + 2}{6x^2 - 2x} = \frac74$. (c) By
l'Hospital's rule, $\lim_{x\to\infty}\frac{\ln x}{x} = \lim_{x\to\infty}\frac1x = 0$. (e) Make the substitution $x = 1/t$.
(g) 0. Use repeated applications of l'Hospital's rule until the exponent of $\ln x$ is less
than or equal to zero. (i) 0. Use l'Hospital's rule twice on $(\sin x - x)/(x\sin x)$, $x \ne 0$.
9. (a) $f'(0) = 0$. (b) $f''(0) = -\frac13$.
Exercises 5.4. page 219
1. Let $c_1 > 0$ be arbitrary. By Newton's method $c_{n+1} = (2c_n^3 + \alpha)/(3c_n^2)$.
2. (a) $f(0) = 1$ and $f(1) = -1$. Therefore $f$ has a zero on the interval $[0, 1]$.
With $c_1 = 0.5$, $c_2 = 0.33333333$, $f(c_2) = 0.037037037$, $c_3 = 0.34722222$, $f(c_3) =
0.000195587$, $c_4 = 0.34729635$, $f(c_4) = 0.000000015$.
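The Newton iterations above can be reproduced with a few lines of code. In the sketch below, the function $f(x) = x^3 - 3x + 1$ is an assumption: the exercise statement is not reproduced here, and this cubic was chosen because it satisfies $f(0) = 1$, $f(1) = -1$ and reproduces the listed iterates. The helper name `newton` and the value of `alpha` are likewise illustrative.

```python
def newton(f, fprime, c1, steps):
    """Return [c1, c2, ..., c_{steps+1}] from c_{n+1} = c_n - f(c_n)/f'(c_n)."""
    cs = [c1]
    for _ in range(steps):
        c = cs[-1]
        cs.append(c - f(c) / fprime(c))
    return cs

# Assumed function for Exercise 2(a): it reproduces the values listed in the solution.
f = lambda x: x**3 - 3*x + 1
fprime = lambda x: 3*x**2 - 3

for c in newton(f, fprime, 0.5, 3):
    print(f"{c:.8f}  {f(c):.9f}")

# Exercise 1: for f(x) = x^3 - alpha the Newton step simplifies to
# c_{n+1} = (2 c_n^3 + alpha) / (3 c_n^2), which converges to the cube root of alpha.
alpha = 10.0   # illustrative value
g = lambda x: x**3 - alpha
gprime = lambda x: 3*x**2
print(newton(g, gprime, 2.0, 6)[-1], alpha ** (1 / 3))
```

Each Newton step roughly doubles the number of correct digits near a simple root, which is why three steps already agree with the root to about eight decimal places.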

Chapter 6
Exercises 6.1. page 238
1.(a) Since f is increasing on [−1, 0] and decreasing on [0, 2], m1 = m2 = 0, m3 = −3
and M1 = M2 = 1, M3 = 0. Since ∆xi = 1, L(P, f ) = −3 and U(P, f ) = 2.
2. (a) $f(x) = \begin{cases} -1, & 0 \le x < 1, \\ 2, & 1 \le x \le 2. \end{cases}$ Let $P = \{x_0, x_1, \ldots, x_n\}$ be any partition of $[0, 2]$
and let $k \in \{1, \ldots, n\}$ be such that $x_{k-1} < 1 \le x_k$. Then $L(P, f) = 4 - 3x_k$ and
$U(P, f) = 4 - 3x_{k-1}$. Thus $U(P, f) - L(P, f) = 3(x_k - x_{k-1})$. By Theorem 6.1.7 it now
follows that $f$ is Riemann integrable on $[0, 2]$. Also, since $x_{k-1} < 1 \le x_k$, $L(P, f) \le
1 < U(P, f)$ for any partition $P$. Hence $\int_0^2 f = 1$. Alternatively, consider the partition
$P = \{0, c, 1, 2\}$ where $0 < c < 1$ is arbitrary. For this partition, $L(P, f) = 1$ and
$U(P, f) = 4 - 3c$. The results now follow as above. 3. (a) If $P = \{x_o, x_1, \ldots, x_n\}$
is a partition of $[a, b]$, then $\inf\{f(t) : t \in [x_{i-1}, x_i]\} = \sup\{f(t) : t \in [x_{i-1}, x_i]\} = c$.
Therefore $L(P, f) = U(P, f) = \sum_{i=1}^{n} c(x_i - x_{i-1}) = c(b - a)$. 4. (a) Since $f(x) = [3x]$
is monotone increasing on $[0, 1]$, $f$ is Riemann integrable on $[0, 1]$. For $n \ge 4$ consider
the partition $P_n = \{0, \frac13 - \frac1n, \frac13, \frac23 - \frac1n, \frac23, 1 - \frac1n, 1\}$. For this partition $L(P_n, f) = 1$
and $U(P_n, f) = 1 + \frac3n$. From this it now follows that $\int_0^1 [3x]\,dx = 1$.
(c) Since $f$ is increasing on $[0, 1]$ it is Riemann integrable. Take $P_n =
\{0, \frac1n, \frac2n, \ldots, 1\}$. Then
\[ U(P_n, f) = \sum_{i=1}^{n}\left(3\frac{i}{n} + 1\right)\frac1n = \frac{3}{n^2}\sum_{i=1}^{n} i + \frac1n\sum_{i=1}^{n} 1 = \frac32\,\frac{n(n + 1)}{n^2} + 1. \]
Therefore $\int_0^1 (3x + 1)\,dx = \lim_{n\to\infty}\left[\frac32\left(1 + \frac1n\right) + 1\right] = \frac52$. 6. Let
$P = \{x_0, x_1, \ldots, x_n\}$ be a partition of $[a, b]$. Since $f(x) \le g(x)$ for all $x \in [a, b]$,
$\sup\{f(t) : t \in [x_{i-1}, x_i]\} \le \sup\{g(t) : t \in [x_{i-1}, x_i]\}$ for all $i = 1, \ldots, n$. As a
consequence $U(P, f) \le U(P, g)$ for all partitions $P$ of $[a, b]$. Taking the infimum
over $P$ gives $\overline{\int_a^b} f \le \overline{\int_a^b} g$. The result now follows from the fact that $f, g \in R[a, b]$.
8. $\underline{\int_0^1} f = 0$, $\overline{\int_0^1} f = \frac12$. 10. Since $a \ge 0$, $f$ is increasing on $[a, b]$. Thus if
$P = \{x_0, x_1, \ldots, x_n\}$ is a partition of $[a, b]$, $m_i = x_{i-1}^2$ and $M_i = x_i^2$. Therefore
$L(P, f) = \sum_{i=1}^{n} x_{i-1}^2\Delta x_i$ and $U(P, f) = \sum_{i=1}^{n} x_i^2\Delta x_i$. Now show that $x_{i-1}^2\Delta x_i \le
\frac13(x_i^3 - x_{i-1}^3) \le x_i^2\Delta x_i$. From this it will now follow that $L(P, f) \le \frac13(b^3 - a^3) \le
U(P, f)$. Since $f$ is continuous, $f \in R[a, b]$ with $\int_a^b f = \frac13(b^3 - a^3)$. 12. (a) With
$P_n = \{\frac{k}{n} : k = 0, 1, \ldots, n\}$ and $f(x) = x$,
$U(P_n, f) = \sum_{k=1}^{n}\frac{k}{n}\cdot\frac1n = \frac{1}{n^2}\sum_{k=1}^{n} k$. By Exercise 1(a), Section 1.3, $\sum_{k=1}^{n} k = \frac12 n(n + 1)$.
Therefore $\int_0^1 x\,dx = \lim_{n\to\infty} n(n + 1)/(2n^2) = 1/2$. (c) Take $P_n = \{0, \frac1n, \frac2n, \ldots, 1\}$.
Then
\[ U(P_n, f) = \sum_{i=1}^{n}\frac{i^3}{n^3}\cdot\frac1n = \frac{1}{n^4}\sum_{i=1}^{n} i^3 = \frac{1}{n^4}\left[\frac12 n(n + 1)\right]^2 = \frac14\left(1 + \frac1n\right)^2. \]
Therefore, $\int_0^1 x^3\,dx = \lim_{n\to\infty} U(P_n, f) = 1/4$. 13. Use Theorem 6.1.7. 14. (a) Let $P_n =
\{x_o, x_1, x_2, \ldots, x_n\}$ be a partition of $[a, b]$, and for each $i = 1, 2, \ldots, n$, let $M_i$ and
$M_i^*$ denote the supremum of $f$ and $|f|$ respectively on $[x_{i-1}, x_i]$. Likewise, let $m_i$
and $m_i^*$ denote the infimum of $f$ and $|f|$ respectively on the same intervals. Use the
inequality $||f(s)| - |f(t)|| \le |f(s) - f(t)|$ to prove that $M_i^* - m_i^* \le M_i - m_i$ from
which the result follows. 15. (a) Since $f$ is bounded on $[a, b]$, $|f(x)| \le M$ for all
$x \in [a, b]$. Let $P = \{x_0, x_1, \ldots, x_n\}$ be any partition of $[a, b]$. For each $i$ let $M_i(f^2)$ and
$m_i(f^2)$ denote the supremum and infimum of $f^2$ respectively over $[x_{i-1}, x_i]$, with an
analogous definition for $M_i(f)$ and $m_i(f)$. Let $\epsilon > 0$ be given. Then for each $i$ there
exist $s_i, t_i \in [x_{i-1}, x_i]$ such that $M_i(f^2) < f^2(s_i) + \frac12\epsilon$ and $m_i(f^2) > f^2(t_i) - \frac12\epsilon$.
Therefore
\[ 0 \le M_i(f^2) - m_i(f^2) < f^2(s_i) - f^2(t_i) + \epsilon \le |f(s_i) + f(t_i)||f(s_i) - f(t_i)| + \epsilon \le 2M[M_i(f) - m_i(f)] + \epsilon. \]
Since this holds for any $\epsilon > 0$, $M_i(f^2) - m_i(f^2) \le 2M[M_i(f) - m_i(f)]$. Now use
Theorem 6.1.7. 16. Assume that $f$ is continuous on $[a, b]$ except perhaps at $a$ or
$b$, or both. Since $f$ is bounded there exists $M > 0$ such that $|f(x)| \le M$ for all
$x \in [a, b]$. Let $\epsilon > 0$ be given. Choose $y_1, y_2$, $a < y_1 < y_2 < b$ so that $y_1 - a < \epsilon/8M$
and $b - y_2 < \epsilon/8M$. Then
\[ [\sup\{f(t) : t \in [a, y_1]\} - \inf\{f(t) : t \in [a, y_1]\}](y_1 - a) \le 2M(y_1 - a) < \tfrac14\epsilon. \]
Similarly for the interval $[y_2, b]$. Since $f$ is continuous on $[y_1, y_2]$ there exists a
partition $P$ of $[y_1, y_2]$ such that $U(P, f) - L(P, f) < \frac12\epsilon$. Let $P^* = P \cup \{a, b\}$. Then $P^*$
is a partition of $[a, b]$, and by the above $U(P^*, f) - L(P^*, f) < \epsilon$. Thus by Theorem
6.1.7, $f \in R[a, b]$. Suppose $f$ is continuous on $[a, b]$ except at a finite number of
points $c_1, c_2, \ldots, c_n$ with $a \le c_1 < c_2 < \cdots < c_n \le b$. Apply the above to each of
the intervals $[a, c_1], [c_1, c_2], \ldots, [c_n, b]$ to obtain a partition of $[a, b]$ for which Theorem
6.1.7 holds.
Exercises 6.2. page 247
2. (a) Take $t_i = \frac13(x_i^2 + x_i x_{i-1} + x_{i-1}^2)$. 3. (a) If $c_1 \ne a$, set $c_0 = a$. Similarly, set
$c_{n+1} = b$ if $c_n \ne b$. First prove that $f \in R[c_{i-1}, c_i]$, i = 1, 2, ..., n+1, with $\int_{c_{i-1}}^{c_i} f = 0$.
Now use Theorem 6.2.3. (b) Set h = f − g. 5. Since f is bounded, |f(x)| ≤ M
for all x ∈ [a, b] for some M > 0. Use equations (7), (8), and the fact that f ∈ R[c, b]
for every c ∈ (a, b) to prove that
$0 \le \overline{\int_a^b} f - \underline{\int_a^b} f = \overline{\int_a^c} f - \underline{\int_a^c} f \le 2M(c - a).$
Use this to show that f ∈ R[a, b]. 7. (a) $\lim_{n\to\infty} \frac{1}{n^3}\sum_{k=1}^{n} k^2 = \int_0^1 x^2\,dx = \frac13$.
(c) $\lim_{n\to\infty} \sum_{k=1}^{n} \frac{n}{n^2 + k^2} = \int_0^1 \frac{1}{1+x^2}\,dx = \frac{\pi}{4}$. 8. For each fixed c, 0 < c < 1, the series
defining f becomes a finite sum on [0, c]. Evaluate $\int_0^c f$ and use Exercise 5.
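The two limits in Exercise 7 are limits of Riemann sums, so they can be checked numerically. The following Python sketch is an illustration only and is not part of the text; the function names are made up for this example.

```python
import math

def riemann_sum_7a(n):
    # (1/n^3) * sum_{k=1}^n k^2: right-endpoint Riemann sum of x^2 on [0, 1]
    return sum(k * k for k in range(1, n + 1)) / n**3

def riemann_sum_7c(n):
    # sum_{k=1}^n n/(n^2 + k^2): right-endpoint Riemann sum of 1/(1 + x^2) on [0, 1]
    return sum(n / (n * n + k * k) for k in range(1, n + 1))

# Print the distance of each sum from its claimed limit for growing n.
for n in (10, 100, 1000, 10000):
    print(n, riemann_sum_7a(n) - 1 / 3, riemann_sum_7c(n) - math.pi / 4)
```

Both differences shrink roughly like 1/n, which is the usual behavior of endpoint Riemann sums for a monotone integrand.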
Exercises 6.3. page 255
2. Since f is bounded, |f(x)| ≤ M for all x ∈ [a, b]. If x, y ∈ [a, b] with x < y, then
$|F(y) - F(x)| = \left|\int_x^y f\right| \le \int_x^y |f| \le M|y - x|.$
Thus F satisfies a Lipschitz condition on [a, b] and hence is uniformly continuous.
3. (b) $F(x) = x$ for $0 \le x \le \frac12$; $F(x) = \frac32 - 2x$ for $\frac12 < x \le 1$. (d) $F(x) = 0$
for $0 \le x \le \frac13$; $F(x) = \frac12(x^2 - \frac19)$ for $\frac13 < x \le \frac23$; $F(x) = x^2 - \frac{5}{18}$
for $\frac23 < x \le 1$.
6. (b) $F'(x) = \cos x^2$. (d) $F'(x) = 2x f(x^2)$. 7. By the chain rule, $\frac{d}{dx} L(\frac1x) =
-\frac1x = \frac{d}{dx}[-L(x)]$. Therefore $L(\frac1x) = -L(x) + C$ for some constant C. Taking x = 1
shows that C = 0. 10. Let m and M denote the minimum and maximum of f on
[a, b] respectively. Since g(x) ≥ 0 for all x, m g(x) ≤ f(x)g(x) ≤ M g(x). If $\int_a^b g > 0$,
then from the previous inequality one obtains
$m \le \int_a^b fg \Big/ \int_a^b g \le M.$
Now apply the intermediate value theorem to f. If $\int_a^b g = 0$, use Theorem 6.2.2 to
prove that $\int_a^b fg = 0$ and thus the conclusion holds for any c ∈ [a, b]. 12. (a) With
φ(x) = ln x, by Theorem 6.3.8 $\int_1^2 \frac{\ln x}{x}\,dx = \int_0^{\ln 2} x\,dx = \frac12(\ln 2)^2$. (b) Use integration
by parts. (e) 2 ln 2 − 1. Use Exercise 9 with $\varphi(x) = \sqrt{x}$ and f(t) = t/(1 + t).
13. (b) With the given change of variable, $dx = a\sec^2 t\,dt$ and $\sqrt{a^2 + x^2} = a\sec t$.
Therefore $\int_0^a \frac{1}{\sqrt{x^2 + a^2}}\,dx = \int_0^{\pi/4} \sec t\,dt$. To evaluate this last integral first establish
that $\sec t = \frac{\sin t}{\cos t} + \frac{\cos t}{1 + \sin t}$. 14. That $\left(\int_a^b |f(x)|^n\,dx\right)^{1/n} \le M$ is straightforward.
For the reverse inequality, given ε, 0 < ε < M, using continuity of f show that there
exists an interval I ⊂ [a, b] such that |f(x)| ≥ M − ε for all x ∈ I. Using this show
that $\left(\int_a^b |f(x)|^n\,dx\right)^{1/n} \ge (M - \epsilon)\,\ell(I)^{1/n}$. Explain why the result now follows.
15. First show that $F'_+(c) = \lim_{h\to 0^+} \frac1h \int_c^{c+h} f(x)\,dx$. Since f is monotone increasing
f(c+) exists. Thus given ε > 0, there exists δ > 0 such that f(c+) ≤ f(x) < f(c+) + ε
for all x, c < x < c + δ. Explain how the conclusion now follows.
Exercises 6.4 page 263
1. (a) Since 0 < p < 1, $\int_0^1 x^{-p}\,dx = \lim_{c\to 0^+} \int_c^1 x^{-p}\,dx = \lim_{c\to 0^+} \frac{1}{1-p}[1 - c^{1-p}] = \frac{1}{1-p}$.
Note: If p ≤ 0, then $x^{-p}$ is continuous on [0, 1], and if p ≥ 1 then the improper integral
diverges. (d) Converges, with $\int_0^1 x\ln x\,dx = -\frac14$. (f) In this problem the
integrand is undefined at x = 1. For 0 < c < 1, $\int_0^c \tan\frac{\pi}{2}x\,dx = -\frac{2}{\pi}\ln(\cos\frac{\pi}{2}c)$. Since
$\lim_{c\to 1^-} \ln(\cos\frac{\pi}{2}c)$ does not exist, the improper integral diverges. 2. (a) Converges,
with $\int_0^\infty e^{-x}\,dx = \lim_{c\to\infty} \int_0^c e^{-x}\,dx = \lim_{c\to\infty} (1 - e^{-c}) = 1$. (c) The improper integral
converges with $\lim_{c\to\infty} \int_1^c x^{-p}\,dx = \lim_{c\to\infty} \frac{1}{p-1}[1 - c^{1-p}] = \frac{1}{p-1}$. (e) Diverges.
$\int_2^c \frac{1}{x\ln x}\,dx = \ln(\ln c) - \ln(\ln 2)$, which diverges to ∞ as c → ∞.
(g) $\int_0^c \frac{x}{x^2+1}\,dx = \frac12\ln(c^2 + 1)$. The improper integral diverges since $\lim_{c\to\infty} \ln(c^2 + 1)$
does not exist. 3. (a) For p > −1, the improper integral converges for all q ∈ R,
and for p < −1, the improper integral diverges for all q ∈ R. When p = −1,
the improper integral
converges for all q < −1, and diverges for all q ≥ −1. 4.
Since $\lim_{c\to 0^+} \int_c^1 f(x)\,dx = \sin 1$ the improper integral converges. To show that the
improper integral of |f| diverges, first show that $|f(x)| \ge \frac1x - 2x$ on each of the
intervals $\left[\frac{1}{\sqrt{(2n+\frac13)\pi}}, \frac{1}{\sqrt{(2n-\frac13)\pi}}\right]$. Use the above to find a partition $P_N$ such that
$L(P_N, |f|) \ge C\sum_{k=1}^{N} 1/k$, for some positive constant C independent of N. From this
you can conclude that the improper integral of |f| diverges. 5. Hint: Use the fact
that 0 ≤ |f(x)| − f(x) ≤ 2|f(x)|. 7. (a) $\int_\pi^c \frac{|\cos x|}{x^2}\,dx \le \int_\pi^c \frac{1}{x^2}\,dx = \frac1\pi - \frac1c$. Thus
$\lim_{c\to\infty} \int_\pi^c |f| < \infty$. (b) By integration by parts, $\int_\pi^c \frac{\sin x}{x}\,dx = -\frac1\pi - \frac{\cos c}{c} - \int_\pi^c \frac{\cos x}{x^2}\,dx$.
By (a) and Exercise 5, $\lim_{c\to\infty} \int_\pi^c \frac{\cos x}{x^2}\,dx$ exists. Also, $\lim_{c\to\infty} \frac{\cos c}{c} = 0$. Therefore,
$\lim_{c\to\infty} \int_\pi^c \frac{\sin x}{x}\,dx$ exists. 9. (a) To show convergence of the improper integral consider
the integrals of $t^{x-1}e^{-t}$ over the two intervals (0, 1] and [1, ∞). If x ≥ 1, then
$t^{x-1}e^{-t}$ is continuous on [0, 1], and thus the integral over [0, 1] clearly exists. If
0 < x < 1, then since $e^{-t} \le 1$ for t ∈ (0, 1],
$0 \le \int_c^1 t^{x-1}e^{-t}\,dt \le \int_c^1 t^{x-1}\,dt = \frac1x(1 - c^x)$. Thus $\lim_{c\to 0^+} \int_c^1 t^{x-1}e^{-t}\,dt < \infty$. For t ≥ 1, use
l’Hospital’s rule to show that there exists a $t_o \ge 1$ such that $t^{x-1} \le e^{\frac12 t}$ for all $t \ge t_o$.
Thus $t^{x-1}e^{-t} \le e^{-t/2}$ for all $t \ge t_o$, and as a consequence $\lim_{R\to\infty} \int_1^R t^{x-1}e^{-t}\,dt < \infty$.
Thus the improper integral defining Γ(x) converges for all x > 0.
Exercises 6.5. page 278
1. 2f(0). 3. (a) See Exercise 2, Section 6.3. 5. (a) $\frac{\pi}{2} - 1$. Use Theorem 6.5.10.
(c) $1^2 + 2^2 + 3^2$. For x ∈ [0, 3], [x] = I(x − 1) + I(x − 2) + I(x − 3). Now use formula
(12). (f) $\int_1^4 (x - [x])\,dx^3 = \int_1^4 (x - [x])\,3x^2\,dx = \int_1^4 3x^3\,dx - \left(\int_1^2 3x^2\,dx + \int_2^3 6x^2\,dx + \int_3^4 9x^2\,dx\right)$

P 1
etc. 8. (a) n
. 12. (a) As in the solution of Exercise 15(a) of Section 6.1,
n=1 n2
if $P = \{x_0, x_1, ..., x_n\}$ is a partition of [a, b], $M_i(f^2) - m_i(f^2) \le 2M[M_i(f) - m_i(f)]$
where M > 0 is such that |f(x)| ≤ M for all x ∈ [a, b]. From this it now follows that
$0 \le U(P, f^2, \alpha) - L(P, f^2, \alpha) \le 2M[U(P, f, \alpha) - L(P, f, \alpha)]$. Now apply Theorem
6.5.5.
Exercises 6.6. page 290
2. (a) With n = 4, h = .25. Set $x_i = .25\,i$, $y_i = f(x_i)$, i = 0, 1, 2, 3, 4. Then
$y_0 = 1.00000$, $y_1 = 0.94118$, $y_2 = 0.80000$, $y_3 = 0.64000$, $y_4 = 0.50000$. Therefore,
$T_4(f) = \frac{.25}{2}[y_0 + 2y_1 + 2y_2 + 2y_3 + y_4] = 0.782795.$
$S_4(f) = \frac{.25}{3}[y_0 + 4y_1 + 2y_2 + 4y_3 + y_4] = 0.785393.$
By computation $f''(t) = 2(3t^2 - 1)/(1 + t^2)^3$. Using the first derivative test, $f''(t)$
has a local maximum of $\frac12$ at t = 1. Therefore $|f''(t)| \le \frac12$ for all t ∈ [0, 1]. Thus by
equation (23) with n = 4,
$\left|\frac{\pi}{4} - T_4(f)\right| \le \frac{1}{12}\cdot\frac12\cdot\frac{1}{4^2} = 0.0026042.$ Since π/4 = 0.7853982 (to seven decimal
places), $|\pi/4 - T_4(f)| = 0.0026032$. 3. (b) By computation, $f^{(4)}(x) =
3(4x^2 - 1)(1 + x^2)^{-7/2}$. By the first derivative test the function $f^{(4)}$ has a maximum
on [0, 2] at $x = \sqrt{3}/2$. Thus $|f^{(4)}(x)| \le 6/(1 + \frac34)^{7/2} < 1$. If we choose n
(even) so that $|\int_0^2 f - S_n(f)| < 10^{-5}$, then we will be guaranteed accuracy to
four decimal places. By inequality (26) with M = 1, we need to choose n so that
$2^5/(180n^4) < 10^{-5}$, or $n^4 > 17{,}778$. The value n = 12 will work. This value of n will
guarantee that $E_{12}(f) < 0.0000086$. Compare your answer with the exact answer of
$\sqrt{5} + \frac12\ln(2 + \sqrt{5})$.
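The values of T_4(f) and S_4(f) in Exercise 2(a), and a Simpson approximation with n = 12 as in Exercise 3(b), can be reproduced with a short Python sketch. This is a hedged illustration, not the book's code: the helper names `trapezoid` and `simpson` are assumptions of this example, and the integrands 1/(1 + x²) on [0, 1] and √(1 + x²) on [0, 2] are inferred from the tabulated values y_i and from the exact answer quoted above.

```python
import math

def trapezoid(f, a, b, n):
    # Composite trapezoidal rule with n equal subintervals.
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h / 2 * (f(a) + 2 * interior + f(b))

def simpson(f, a, b, n):
    # Composite Simpson rule; n must be even.
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))
    even = sum(f(a + i * h) for i in range(2, n, 2))
    return h / 3 * (f(a) + 4 * odd + 2 * even + f(b))

# Exercise 2(a): assumed integrand 1/(1 + x^2) on [0, 1], exact value pi/4.
f = lambda x: 1 / (1 + x * x)
t4 = trapezoid(f, 0.0, 1.0, 4)
s4 = simpson(f, 0.0, 1.0, 4)
print(t4, s4, abs(math.pi / 4 - t4))

# Exercise 3(b): assumed integrand sqrt(1 + x^2) on [0, 2].
g = lambda x: math.sqrt(1 + x * x)
s12 = simpson(g, 0.0, 2.0, 12)
exact = math.sqrt(5) + 0.5 * math.log(2 + math.sqrt(5))
print(s12, exact, abs(exact - s12))
```

The printed values of `t4` and `s4` differ from 0.782795 and 0.785393 only in the last digit, because the text rounds each y_i to five decimal places before summing.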
Exercises 6.7. page 296
2. Hint: Consider f (x) = 1 − χp (x), 0 ≤ x ≤ 1, and use Exercise 21(b) of Section
6.1. 4. No. Consider g = χQ on [0, 1] and let f be the zero function.

Chapter 7
Exercises 7.1. page 313
2. (a) Diverges. (c) Converges. (e) Diverges. (g) Converges by the ratio
test. (i) Converges. (k) Diverges. First show that $\sqrt{k} \ge \ln k$ for k ≥ 4. Thus
since $\sum 1/k$ diverges, by the comparison test $\sum 1/(\ln k)^2$ diverges. (m) Converges
for p > 1; diverges for 0 < p ≤ 1. (o) Converges. Hint: rewrite $\frac1k\ln(1 + \frac1k)$ as
$\frac{1}{k^2}\ln(1 + \frac1k)^k$ and use the comparison test. 3. (a) Converges to $1/(1 - \sin p)$ for
all p ∈ R for which |sin p| < 1; that is, for all $p \ne (2k + 1)\frac{\pi}{2}$, k ∈ Z.
4. (b) Since $\sum a_k$ converges, $\lim a_k = 0$. Thus there exists $k_o \in N$ such that $0 \le
a_k \le 1$ for all $k \ge k_o$. But then $0 \le a_k^2 \le a_k$ for all $k \ge k_o$, and $\sum a_k^2$ converges
by the comparison test. (d) Take $a_k = 1/k^2$. (f) Converges. Use the inequality
$ab \le \frac12(a^2 + b^2)$, a, b ≥ 0. 5. The series diverges for all q < 1, p ∈ R, and converges
for all q > 1, p ∈ R. If q = 1, the series diverges for p ≤ 1 and converges for p > 1.
6. Suppose $\lim(a_n/b_n) = L$, where 0 < L < ∞. Take $\epsilon = \frac12 L$. For this ε, there exists
$k_o \in N$ such that $\frac12 L \le a_k/b_k \le \frac32 L$ for all $k \ge k_o$. The result now follows by the
comparison test. 8. Since $a_n > 0$ for all n, we have $b_k \ge a_1/k$ for all k. Hence by
the comparison test $\sum b_k$ diverges. 11. For a simple example take $a_k = (-1)^k$.
12. The proof uses the fact that $\lim_{k\to\infty} \left(\frac{k+1}{k}\right)^n = \lim_{k\to\infty} \sqrt[k]{k^n} = 1$ for all n ∈ Z.
13. The given series is the sum of the two series $\sum_{k=1}^{\infty} \frac{1}{(2k-1)^2}$ and $\sum_{k=1}^{\infty} \frac{1}{(2k)^3}$, each
of which converges. 16. Let $s_n = a_1 + a_2 + \cdots + a_n$ and $t_k = a_1 + 2a_2 + \cdots + 2^k a_{2^k}$.
By writing $s_n = a_1 + (a_2 + a_3) + (a_4 + a_5 + a_6 + a_7) + \cdots$, show that if $n < 2^k$, then
$s_n \le t_k$, and if $n > 2^k$, then $s_n \ge \frac12 t_k$. From these two inequalities it now follows that
$\sum a_k < \infty$ if and only if $\sum 2^k a_{2^k} < \infty$. 18. (a) Diverges. If $a_k = 1/(k\ln k)$, then
$2^k a_{2^k} = 1/(k\ln 2)$. 19. Use Example 5.2.7 to show that $c_k - c_{k+1} \ge 0$ for all k.
Thus $\{c_k\}$ is monotone decreasing. Use the definition of ln k and the method of proof
of the integral test to show that $c_k \ge 0$ for all k. 21. Write $a_{k+1}/a_k = 1 - x_k/k$
where $x_k = (q - p)(k/(q + k + 1))$.
 2 k  2
1 2k
22. (c) When p = 2, ak = 1·3···(2k−1) 1
Q 
2·4···(2k)
= 1 − 2j ≥ 1 − 2k . Now
j=1
use the fact that $\lim_{h\to 0} (1 + h)^{1/h} = e$. 23. (a) Set $s_n = \sum_{k=1}^{n} a_k$, and let $s = \lim s_n$.
Consider the series $\sum b_k$ where $b_1 = \sqrt{s} - \sqrt{s - s_1}$ and for k ≥ 2, $b_k = \sqrt{s - s_{k-1}} - \sqrt{s - s_k}$.
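The two inequalities in the hint for Exercise 16 (the Cauchy condensation comparison) can be checked on a concrete decreasing sequence. The sketch below uses exact rational arithmetic; the helper names are assumptions of this example, and checking a few values of n and k is of course not a proof.

```python
from fractions import Fraction

def partial_sum(a, n):
    # s_n = a_1 + a_2 + ... + a_n
    return sum(a(j) for j in range(1, n + 1))

def condensed_sum(a, k):
    # t_k = a_1 + 2 a_2 + 4 a_4 + ... + 2^k a_{2^k}
    return sum(2**j * a(2**j) for j in range(k + 1))

def check_bounds(a, k):
    # Test s_n <= t_k for one n < 2^k and s_n >= t_k / 2 for one n > 2^k.
    t_k = condensed_sum(a, k)
    below = partial_sum(a, 2**k - 1) <= t_k
    above = 2 * partial_sum(a, 2**k + 1) >= t_k
    return below, above

inverse_square = lambda j: Fraction(1, j * j)   # convergent example
harmonic = lambda j: Fraction(1, j)             # divergent example
for k in range(1, 8):
    print(k, check_bounds(inverse_square, k), check_bounds(harmonic, k))
```

Both bounds rely on the terms being nonnegative and decreasing, which holds for the two sample sequences.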
Exercises 7.2. page 320

1. If $\{b_n\}$ is monotone increasing to b, consider $\sum_{k=1}^{\infty} (b - b_k)a_k$. 2. Take $b_k = 1/k$
for k odd, and $b_k = 1/k^2$ for k even. 4. If $B_n = \sum_{k=1}^{n} \cos kt$, then
$\sin\tfrac12 t\; B_n = \tfrac12 \sum_{k=1}^{n} [\sin(k + \tfrac12)t - \sin(k - \tfrac12)t] = \tfrac12[\sin(n + \tfrac12)t - \sin\tfrac12 t].$
5. (a) Converges. (c) Converges. (d) Diverges; $\lim_{k\to\infty} \frac{k^k}{(1 + k)^k} = 1/e \ne 0$.
(f) Converges. (h) Converges for all t ≠ 2nπ, n ∈ Z. If t = 2nπ, then the series
converges for p > 1 and diverges for 0 < p ≤ 1. 8. Use the partial summation
formula to prove that
$\sum_{k=1}^{n} k a_k = nA_n - \sum_{k=1}^{n-1} A_k$ where $A_n = \sum_{k=1}^{n} a_k$.
Now use Exercise 14 of Section 3.2.
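The partial summation identity in the hint for Exercise 8 can be sanity-checked on any finite list of numbers. The sketch below does this with exact fractions; the function name `both_sides` and the sample sequence are assumptions made only for illustration.

```python
from fractions import Fraction
from itertools import accumulate

def both_sides(a):
    # a holds a_1, ..., a_n; returns (sum k*a_k, n*A_n - sum_{k<n} A_k).
    n = len(a)
    A = list(accumulate(a))                       # A[k-1] = a_1 + ... + a_k
    lhs = sum(k * a[k - 1] for k in range(1, n + 1))
    rhs = n * A[n - 1] - sum(A[k - 1] for k in range(1, n))
    return lhs, rhs

# Sample sequence a_k = (-1)^k / k for k = 1, ..., 12.
sample = [Fraction((-1) ** k, k) for k in range(1, 13)]
lhs, rhs = both_sides(sample)
print(lhs, rhs, lhs == rhs)
```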
Exercises 7.3. page 327
2. Use the inequality $|ab| \le \frac12(a^2 + b^2)$, a, b ∈ R. 4. Use the hypothesis on $|a_n|$ to
show that $s_n = \sum_{k=1}^{n} |a_k| \le b_1 - b_{n+1}$. 6. (a) Converges conditionally.
(c) Converges absolutely for p > 1 and conditionally for 0 < p ≤ 1. (e) Converges
absolutely for p > 1, and conditionally for 0 < p ≤ 1. (g) Rewrite $a_k = \frac{k^k}{(k+1)^k}$
as $a_k = (1 + \frac1k)^{-k}$. Since $\lim_{k\to\infty} a_k = e^{-1}$, the series $\sum (-1)^{k+1} a_k$ diverges. (i)
By the comparison test (with $\sum 1/k^2$) the series converges absolutely. 9. First
note that $S_{3n} = \sum_{k=0}^{n-1} \left[\frac{1}{3k+1} + \frac{1}{3k+2} - \frac{1}{3k+3}\right]$. Now show that $S_{3n} \to \infty$ as n → ∞.
11. By Theorem 7.2.6 the series converges. To show that $\sum |\sin k|/k = \infty$, show
that for any three consecutive integers, at least one satisfies $|\sin k| \ge \frac12$.
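The claim used in Exercise 11, that among any three consecutive integers at least one satisfies |sin k| ≥ 1/2, can be spot-checked numerically. This Python sketch is only an illustration over a finite range (the name `some_large_sine` is hypothetical); the exercise still asks for a proof.

```python
import math

def some_large_sine(k):
    # True if at least one of k, k+1, k+2 has |sin| >= 1/2.
    return max(abs(math.sin(j)) for j in (k, k + 1, k + 2)) >= 0.5

# Check every block of three consecutive integers starting at 1, ..., 100000.
print(all(some_large_sine(k) for k in range(1, 100001)))
```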
Exercises 7.4. page 333

1. (a) $\|\{1/(\ln k)\}\|_2^2 = \sum_{k=2}^{\infty} 1/(\ln k)^2$, which diverges (Exercise 5, Section 7.1).
(c) $\|\{\ln k/\sqrt{k}\}\|_2^2 = \sum_{k=2}^{\infty} (\ln k)^2/k$, which diverges by the comparison test.
2. (a) |p| < 1. (c) $p \ge \frac12$. 3. Since $\{1/k\} \in \ell^2$, the result follows by the
Cauchy-Schwarz inequality. 9. If we interpret the vectors a and b as forming
two sides of a triangle, with the third side given by b − a, then by the law of cosines
$\|b - a\|_2^2 = \|b\|_2^2 + \|a\|_2^2 - 2\|a\|_2\|b\|_2\cos\theta$. Now apply Exercise 8(e). 11. (b)
Suppose $\{a_k\} \in \ell^1$. Since $\lim_{k\to\infty} a_k = 0$ there exists $k_o \in N$ such that $|a_k| \le 1$ for
all $k \ge k_o$. But then $|a_k|^2 \le |a_k|$ for all $k \ge k_o$. Hence $\{a_k\} \in \ell^2$.

Chapter 8
Exercises 8.1. page 343
2. (a) $\lim_{n\to\infty} \frac{nx}{1 + nx} = \begin{cases} 0, & x = 0, \\ 1, & x > 0. \end{cases}$
(c) $\lim_{n\to\infty} (\cos x)^{2n} = \begin{cases} 0, & x \ne k\pi,\ k \in \mathbb{Z}, \\ 1, & x = k\pi,\ k \in \mathbb{Z}. \end{cases}$ 3. (a) By the root test the
series converges for all |x| < 2; diverges for |x| ≥ 2. (c) converges for all x > 0;
diverges for x ≤ 0. 5. (c) $\int_0^1 f_n = \int_0^{1/n} n^2x\,dx + \int_{1/n}^{2/n} (2n - n^2x)\,dx = 1/2 + 1/2 =
1$. 7. (a) If x = 0, $f_n(0) = 0$ for all n ∈ N. If x > 0, then $0 < f_n(x) < x/n$, from
which the result follows. (b) For each n ∈ N, $f_n(x)$ has a maximum of $e^{-1}$ at
x = n. 8. Use the fact that for N, M ∈ N,
$\sum_{n=1}^{N} \left(\sum_{m=1}^{M} a_{n,m}\right) = \sum_{m=1}^{M} \left(\sum_{n=1}^{N} a_{n,m}\right) \le \sum_{m=1}^{M} \left(\sum_{n=1}^{\infty} a_{n,m}\right) \le \sum_{m=1}^{\infty} \left(\sum_{n=1}^{\infty} a_{n,m}\right).$
The above inequalities hold since $a_{n,m} \ge 0$ for all n, m ∈ N. Now first let M → ∞,
and then N → ∞, to obtain
$\sum_{n=1}^{\infty} \left(\sum_{m=1}^{\infty} a_{n,m}\right) \le \sum_{m=1}^{\infty} \left(\sum_{n=1}^{\infty} a_{n,m}\right).$
The same argument also proves the reverse inequality.
Exercises 8.2. page 350
2. (b) Suppose {fn } and {gn } converge uniformly to f and g respectively on E. Then
|fn (x)gn (x) − f (x)g(x)| ≤ |gn (x)||fn (x) − f (x)| + |f (x)||gn (x) − g(x)|. By hypothesis
|gn (x)| ≤ N for all x ∈ E, n ∈ N. Also, since |fn (x)| ≤ M for all x ∈ E, n ∈ N,
|f (x)| ≤ M for all x ∈ E. Therefore
|fn (x)gn (x) − f (x)g(x)| ≤ N |fn (x) − f (x)| + M |gn (x) − g(x)|.
Now use the definition of uniform convergence of {fn } and {gn } to show that given
ǫ > 0, there exists no ∈ N such that |fn (x)gn (x) − f (x)g(x)| < ǫ for all x ∈ E
and n ≥ no . 4. Find Mn = max{fn (x) : x ∈ [0, 1]}, and show that Mn → ∞.
5. (a) For x ∈ [0, a], $|f_n(x)| \le a^n$. If 0 < a < 1, then $\lim_{n\to\infty} a^n = 0$. Thus given
ε > 0 there exists $n_o \in N$ so that $a^n < \epsilon$ for all $n \ge n_o$; that is, $|f_n(x)| < \epsilon$ for
all x ∈ [0, a], $n \ge n_o$. Therefore $\{f_n\}$ converges uniformly to 0 on [0, a] whenever
a < 1. (b) No. Obtain a contradiction to Theorem 8.2.5. 8. (a) $\frac{1}{k^2 + x^2} \le \frac{1}{k^2}$
for all x ∈ R. Since $\sum 1/k^2 < \infty$, the series $\sum \frac{1}{k^2 + x^2}$ converges uniformly by
the Weierstrass M-test. (c) For x ≥ 1, $k^2 e^{-kx} \le k^2 (1/e)^k$. Since 1/e < 1, the
series $\sum k^2 (1/e)^k$ converges. 9. (a) $\left|\frac{\sin 2kx}{(2k + 1)^{3/2}}\right| \le \frac{1}{(2k + 1)^{3/2}} \le C\,\frac{1}{k^{3/2}}$ for all
x ∈ R. Since $\sum \frac{1}{k^{3/2}} < \infty$, the given series converges uniformly for all x ∈ R by the
Weierstrass M-test. (d) Since |sin h| ≤ |h| we have $|\sin(x/k^p)| \le 2/k^p$ for |x| ≤ 2.
Since p > 1, by the Weierstrass M-test the series converges uniformly and absolutely
for |x| ≤ 2. (e) Hint: Let $S_n(x) = \sum_{k=0}^{n} \left[\frac{1}{kx + 2} - \frac{1}{(k + 1)x + 2}\right]$. 10. (a) For
x ≥ a > 0, $1 + k^2x \ge ak^2$. Thus since $\sum 1/k^2 < \infty$, by the Weierstrass M-test the
given series converges uniformly on [a, ∞) for every a > 0. To show that it does not
converge uniformly on (0, ∞), consider $(S_{2n} - S_n)(1/n^2)$, where $S_n$ is the nth partial
sum of the series. 12. Since $|a_k x^k| \le |a_k|$ for all x ∈ [−1, 1] and $\sum |a_k|$ converges,
the given series converges absolutely and uniformly by the Weierstrass M-test.
16. Set $A_n = \sum_{k=1}^{n} \sin kt$. By (1) of the proof of Theorem 7.2.6, $|A_n| \le 1/|\sin\frac12 t|$.
Thus $\{A_n\}$ is uniformly bounded on any closed interval that does not contain an
integer multiple of 2π. The conclusion now follows by the Abel partial summation
formula. 18. Suppose $|F_0(x)| \le M$ for all x ∈ [0, 1]. Show that $|F_n(x)| \le M\frac{x^n}{n!}$
for all x ∈ [0, 1], n ∈ N. Now use the Weierstrass M-test.
Exercises 8.3. page 359
(
∞ 0, x = 0, 1,
P k
1. Show that x(1 − x) = Thus by Corollary 8.3.2 the conver-
k=0 1, 0 < x < 1.
gence cannot be uniform on [0, 1]. 4. Since f is uniformly continuous on R, given


ε > 0, there exists a δ > 0 such that |f(x) − f(y)| < ε for all x, y ∈ R, |x − y| < δ.
Choose $n_o \in N$ such that $1/n_o < \delta$. Then for all $n \ge n_o$, $|f(x) - f_n(x)| < \epsilon$ for all
x ∈ R. 6. Let ε > 0 be given. Since $\{f_n\}$ converges uniformly on D, there exists
$n_o \in N$ such that $|f_n(x) - f_m(x)| < \epsilon$ for all x ∈ D, $n, m \ge n_o$. Use continuity of
the functions and the fact that D is dense in E to prove that $|f_n(y) - f_m(y)| \le \epsilon$
for all y ∈ E, $n, m \ge n_o$. 9. Note, $\left(1 + \frac{x}{n}\right)^n = \left[\left(1 + \frac{x}{n}\right)^{n/x}\right]^x$. For x > 0, as in
Example 3.3.5, the sequence $\left(1 + \frac{x}{n}\right)^n$ is an increasing sequence converging to $e^x$.
Set $g_n(x) = e^x - f_n(x)$ and apply Dini’s theorem on [a, b] provided a ≥ 0. This has to
be modified if a < 0. 11. (a) We first note that $|(T\varphi)(x)| \le \|\varphi\|_u \int_0^x dt = x\|\varphi\|_u$.
It now follows that $|T(T\varphi)(x)| \le \frac12 x^2\|\varphi\|_u$ for all x ∈ [0, 1].
Exercises 8.4. page 362
1. By the Weierstrass M-test and the hypothesis on $\{a_k\}$, the series $\sum a_k x^k$ converges
uniformly on [0, 1]. Now apply Corollary 8.4.2. 4. Since f ∈ R[0, 1], f is
bounded on [0, 1], i.e., |f(x)| ≤ M for all x ∈ [0, 1]. Now apply the bounded convergence
theorem to $g_n(x) = x^n f(x)$, which converges pointwise to g(x) = 0, 0 ≤ x < 1,
and f(1) when x = 1. 6. We first note that
$\left|\int_0^1 f(x^n)\,dx - f(0)\right| \le \int_0^1 |f(x^n) - f(0)|\,dx.$
Now use the fact that $x^n \to 0$ uniformly on [0, c] for any c, 0 < c < 1, and that
$\int_c^1 |f(x^n) - f(0)|\,dx \le M(1 - c)$ for some constant M.
7. For each k ∈ N the function $2^{-k} I(x - r_k)$ is Riemann integrable on [0, 1] with
$\int_0^1 2^{-k} I(x - r_k) = 2^{-k}(1 - r_k)$. By the Weierstrass M-test the series converges
uniformly on [0, 1]. Thus f ∈ R[0, 1] with $\int_0^1 f = \sum_{k=1}^{\infty} 2^{-k}(1 - r_k)$. 9. By Theorem
6.2.1, $f_n g \in R[a, b]$ for all n ∈ N. Show that $\{f_n g\}$ converges uniformly to f g on
[a, b], and apply Theorem 8.4.1. 10. Since $|f_n(x)| \le g(x)$ for all x ∈ [0, ∞), n ∈ N,
the same is true for |f(x)|. By Exercise 5, Section 6.4, it now follows that the improper
integrals of $f_n$, n ∈ N, and f on [0, ∞) converge. Since $\int_0^\infty g < \infty$, show
that given ε > 0, there exists c ∈ R, c > 0, so that $\int_c^\infty g < \frac12\epsilon$. Now show that
$\left|\int_0^\infty f - \int_0^\infty f_n\right| \le \int_0^c |f - f_n| + 2\int_c^\infty g$. Use the uniform convergence of $\{f_n\}$ to f on [0, c]
to finish the proof. 11. (b) To show that $(C[a, b], \|\cdot\|_1)$ is not complete it suffices
to find a sequence $\{f_n\}$ of continuous functions that converges in the norm $\|\cdot\|_1$ to
a Riemann integrable function f that is not continuous.
Exercises 8.5. page 369
2. By the fundamental theorem of calculus, $f_n(x) = f_n(x_o) + \int_{x_o}^{x} f_n'(t)\,dt$ for all
x ∈ [a, b]. If $\{f_n'\}$ converges uniformly to g on [a, b], use Theorems 6.3.4 and 8.4.1
to prove that $\{f_n\}$ converges uniformly to a function f on [a, b] with f′(x) = g(x)
for all x ∈ [a, b]. 4. Let x ∈ (a, b) be arbitrary, and choose c, d such that a < c <
x < d < b. Now apply Theorem 8.5.1 to the sequence $\{f_n\}$ on [c, d], to obtain that
f is differentiable at x with $f'(x) = \lim_{n\to\infty} f_n'(x)$. 6. (a) Use the comparison test
to show that the given series converges for all x > 0. Let $S(x) = \sum_{k=1}^{\infty} (1 + kx)^{-2}$
and $S_n(x) = \sum_{k=1}^{n} (1 + kx)^{-2}$. Then $S_n'(x) = -2\sum_{k=1}^{n} k(1 + kx)^{-3}$. Use the Weierstrass
M-test and the comparison test to show that the sequences $\{S_n(x)\}$ and $\{S_n'(x)\}$
converge uniformly on [a, ∞) for every a > 0. Thus by Theorem 8.5.1
$S'(x) = \lim_{n\to\infty} S_n'(x) = -2\sum_{k=1}^{\infty} k(1 + kx)^{-3}$
for all x ∈ [a, ∞). Since this holds for every a > 0, the result holds for all x ∈ (0, ∞).
(c) By the root test the given series converges for all x, |x| < 1. Let $S(x) = \sum_{k=0}^{\infty} x^k$
and $S_n(x) = \sum_{k=0}^{n} x^k$. Then $S_n'(x) = \sum_{k=0}^{n-1} (k + 1)x^k$. Again by the root test, the
series $\sum_{k=0}^{\infty} (k + 1)x^k$ converges pointwise for all x, |x| < 1, and uniformly for all
x, |x| ≤ a for any a, 0 < a < 1. Hence by Theorem 8.5.1
$S'(x) = \lim_{n\to\infty} S_n'(x) = \sum_{k=0}^{\infty} (k + 1)x^k$
for all x, |x| ≤ a. Since this holds for every a, 0 < a < 1, the result holds for all
x, |x| < 1.
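For Exercise 6(c), the limit of the differentiated partial sums should agree with the derivative of the geometric series sum 1/(1 − x), namely 1/(1 − x)². The Python sketch below compares the two on a grid in [−a, a]; the function names, the choice a = 0.9, and the grid size are assumptions made only for this illustration.

```python
def geometric_derivative_partial(x, n):
    # S_n'(x) = sum_{k=0}^{n-1} (k + 1) x^k
    return sum((k + 1) * x**k for k in range(n))

def limit_derivative(x):
    # Derivative of S(x) = 1/(1 - x), valid for |x| < 1.
    return 1.0 / (1.0 - x) ** 2

a = 0.9
grid = [-a + 2 * a * i / 200 for i in range(201)]

def worst_error(n):
    # Largest gap between S_n' and the limit over the grid in [-a, a].
    return max(abs(geometric_derivative_partial(x, n) - limit_derivative(x))
               for x in grid)

for n in (50, 200, 800):
    print(n, worst_error(n))
```

The largest error over the grid shrinks as n grows, which is what uniform convergence on |x| ≤ a predicts.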
Exercises 8.6. page 377
2. Let P = {xo , x1 , ..., xn } be a partition of [a, a + p]. Set yj = xj − a. Then
P ∗ = {yo , y1 , ..., yn } is a partition of [0, p]. If t ∈ [xj−1 , xj ], then t = s + p for
some s ∈ [yj−1 , yj ]. Since f is periodic of period p, f (t) = f (s + p) = f (s). There-
fore, sup{f (t) : t ∈ [xj−1 , xj ]} = sup{f (s) : s ∈ [yj−1 , yj ]}, and as a consequence
$U(P, f) = U(P^*, f)$. From this it now follows that $\overline{\int_a^{a+p}} f = \overline{\int_0^p} f$. The proof for the
lower integral is similar. Thus f ∈ R[0, p] if and only if f ∈ R[a, a + p].
4. (a) cn = 12 (n + 1). 6. Set An (δ) = sup{Qn (x) : x ∈ [−δ, δ]}. Then 0 < δ1 < δ2
implies An (δ1 ) ≤ An (δ2 ). Suppose lim An (δ1 ) < ∞ for some δ1 > 0. Then there
n→∞
exists a finite constant C and no ∈ N such that An (δ) ≤ C for all n ≥ no , 0 < δ ≤ δ1 .
Use this fact to obtain a contradiction to the hypothesis that {Qn } is an approximate
identity.
Exercises 8.7. page 396
1. (b) R = 2. (d) R = e. (f) R = 2. 2. (b) By the root test the series
converges absolutely for all x, −2 < x < 2/5, and diverges for all other x ∈ R.
3. (a) $x/(1 - x)^2$. 8. (a) $\frac{1}{1 + x^2} = \frac{1}{1 - (-x^2)} = \sum_{k=0}^{\infty} (-1)^k x^{2k}$, |x| < 1. Use the
previous exercise and the fact that $\arctan x = \int_0^x (1 + t^2)^{-1}\,dt$ to find the Taylor
series expansion of arctan x at c = 0. (c) Use Theorem 7.2.4. 10. Let $P(x) =
x^4 + 3x^2 - 2x + 5$ and use Taylor’s theorem with c = 1. 12. (b) By Example
8.7.20(c), $\ln x = \ln(1 + (x - 1)) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}(x - 1)^k$, which converges for all x, 0 <
x ≤ 2. (d) By computation, $f^{(k)}(x) = \frac12\cdot\frac32\cdots(k - \frac12)(1 - x)^{-(k + \frac12)}$. Therefore
the Taylor series expansion of $(1 - x)^{-\frac12}$ is given by $1 + \sum_{k=1}^{\infty} \frac{\frac12\cdot\frac32\cdots(k - \frac12)}{k!}\,x^k$. For
−1 < x ≤ 0, use Theorem 8.7.16 to show that $|R_n(x)| \le \frac{\frac12\cdot\frac32\cdots(n + \frac12)}{(n + 1)!}\,|x|^{n+1}$.
Use convergence of the series $\sum_{n=1}^{\infty} \frac{\frac12\cdot\frac32\cdots(n + \frac12)}{(n + 1)!}\,|x|^{n+1}$, |x| < 1, to conclude that
$\lim_{n\to\infty} R_n(x) = 0$, −1 < x ≤ 0. If 0 < x < 1, use Corollary 8.7.19 to show that
$|R_n(x)| \le \frac{\frac12\cdot\frac32\cdots(n + \frac12)}{n!}\,\frac{x}{(1 - x)^{3/2}}\left(\frac{x - \zeta}{1 - \zeta}\right)^n$
for some ζ, 0 < ζ < x. Now use the method of Example 8.7.20(c) to show that
$\lim_{n\to\infty} R_n(x) = 0$ for all x, 0 < x < 1. Thus the series converges to $(1 - x)^{-\frac12}$ for all
x, |x| < 1. (f) Use the fact that $\arcsin x = \int_0^x \frac{1}{\sqrt{1 - t^2}}\,dt$, |x| < 1. (h) For p real,
$(1 + x)^p = 1 + px + \frac{p(p - 1)}{2!}x^2 + \frac{p(p - 1)(p - 2)}{3!}x^3 + \cdots$. If p is a positive integer,
then the expansion is finite.
Exercises 8.8. page 402

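1. (a) $\Gamma(\frac32) = \Gamma(\frac12 + 1) = \frac12\Gamma(\frac12) = \frac12\sqrt{\pi}$. 2. Make the change of variable t = − ln s.
3. (a) $\int_0^\infty e^{-t}t^{3/2}\,dt = \Gamma(\frac52) = \frac34\sqrt{\pi}$. 5. (a) $\int_0^{\pi/2} (\sin x)^{2n}\,dx = \frac12\,\frac{\Gamma(n + \frac12)\Gamma(\frac12)}{\Gamma(n + 1)}$.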
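The Gamma function values above can be checked against Python's standard library, which provides `math.gamma`. The sketch below also compares the formula of Exercise 5(a) with a midpoint-rule approximation of the integral; the helper names and the step count are assumptions of this example.

```python
import math

def sine_power_formula(n):
    # Right-hand side of Exercise 5(a): (1/2) Γ(n + 1/2) Γ(1/2) / Γ(n + 1)
    return 0.5 * math.gamma(n + 0.5) * math.gamma(0.5) / math.gamma(n + 1)

def sine_power_integral(n, steps=20000):
    # Midpoint-rule approximation of the integral of (sin x)^(2n) over [0, pi/2].
    h = (math.pi / 2) / steps
    return h * sum(math.sin((i + 0.5) * h) ** (2 * n) for i in range(steps))

print(math.gamma(1.5), 0.5 * math.sqrt(math.pi))    # Exercise 1(a)
print(math.gamma(2.5), 0.75 * math.sqrt(math.pi))   # Exercise 3(a)
for n in (1, 2, 5):
    print(n, sine_power_formula(n), sine_power_integral(n))
```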

Chapter 9
Exercises 9.1. page 417
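1. $\int_{-1}^{1} \varphi_1^2 = \int_{-1}^{1} 1 = 2$ and $\int_{-1}^{1} \varphi_2^2 = \int_{-1}^{1} x^2 = 2/3$. Therefore $c_1 = \frac12\int_{-1}^{1} \sin\pi x\,dx = 0$ and
$c_2 = \frac32\int_{-1}^{1} x\sin\pi x\,dx = 3/\pi$. Thus by Theorem 9.1.4, $S_2(x) = (3x)/\pi$ gives the best
approximation in the mean to sin πx on [−1, 1]. 3. (a) $a_1 = 1/2$, $a_2 = 1$, $a_3 =
-1/6$. (b) $S_2(x) = \frac{2}{\pi} + \frac{60}{\pi^3}(\pi^2 - 12)(x^2 - x + \frac16)$.
5. (c) $b_n = \frac{2}{\pi}\int_0^\pi x\sin nx\,dx = -\frac{2}{n}\cos n\pi = \frac{2}{n}(-1)^{n+1}$. Therefore
$x \sim 2\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}\sin kx$. 6. (c) $x \sim \frac{\pi}{2} - \frac{4}{\pi}\sum_{k=0}^{\infty} \frac{1}{(2k + 1)^2}\cos(2k + 1)x$.
12. (a) As in the proof of Theorem 7.4.3, for λ ∈ R, $0 \le \|x - \lambda y\|^2 = \|x\|^2 -
2\lambda\langle x, y\rangle + \lambda^2\|y\|^2$. If y ≠ 0, take $\lambda = \langle x, y\rangle/\|y\|^2$ to derive the inequality.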
Exercises 9.2. page 422

4. (a) For the orthogonal system $\{\sin nx\}_{n=1}^{\infty}$ on [0, π], $\int_0^\pi \sin^2 nx\,dx = \frac12\pi$. Therefore
$b_n = \frac{2}{\pi}\int_0^\pi f(x)\sin nx\,dx$. Thus Parseval’s equality for the orthogonal system
$\{\sin nx\}$ becomes $\sum_{n=1}^{\infty} b_n^2 = \frac{2}{\pi}\int_0^\pi f^2(x)\,dx$. (b) (i) $\frac{\pi^2}{8}$. (ii) $\frac{\pi^2}{6}$. 5. Use Parseval’s
equality and the fact that $fg = \frac12[(f + g)^2 - f^2 - g^2]$. 6. Any function that is
identically zero except at a finite number of points will satisfy $\int_a^b f(x)\varphi_n(x)\,dx = 0$.
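As an illustration of the Parseval equality in Exercise 4(a), take f(x) = x on [0, π], whose sine coefficients b_n = (2/n)(−1)^{n+1} are computed in Exercise 5(c) of Section 9.1. The equality then reads 4 Σ 1/n² = (2/π)·(π³/3), that is, Σ 1/n² = π²/6. The Python sketch below compares partial sums of the left side with the right side; the choice of f and the function name are assumptions of this example and are not claimed to be the functions intended in part (b).

```python
import math

def parseval_partial(N):
    # Partial sum of b_n^2 with b_n = (2/n)(-1)^(n+1), the sine coefficients of x.
    return sum((2.0 / n) ** 2 for n in range(1, N + 1))

# Right-hand side: (2/pi) * integral of x^2 over [0, pi] = (2/pi) * pi^3 / 3.
rhs = (2 / math.pi) * (math.pi ** 3 / 3)

for N in (10, 1000, 100000):
    print(N, parseval_partial(N), rhs)
```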
Exercises 9.3. page 431
1. (a) If f is even on [−π, π], then f(x) sin nx is odd and f(x) cos nx is even. Thus
$b_n = 0$ for all n = 1, 2, ... and $a_n = \frac{2}{\pi}\int_0^\pi f(x)\cos nx\,dx$, n = 0, 1, 2, ....
3. (a) $f(x) \sim \frac{2}{\pi}\sum_{k=1}^{\infty} \frac{1 - (-1)^k}{k}\sin kx = \frac{4}{\pi}\sum_{k=0}^{\infty} \frac{1}{2k + 1}\sin(2k + 1)x$.
(c) $|x| \sim \frac{\pi}{2} - \frac{4}{\pi}\sum_{k=0}^{\infty} \frac{1}{(2k + 1)^2}\cos(2k + 1)x$. (e) $1 + x \sim 1 + 2\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}\sin kx$.
∞ ∞ k
4 P 1 1 2 P (−1)
5. (a) 1, sin(2k + 1)x. (f ) − cos(2k + 1)x,
π k=0 2k + 1 2 π k=0 2k + 1
2 ∞ 1
(−1)n − cos 21 nπ sin nx.
P  

π n=0 n
6. (c) Since $h_e$ is even, the Fourier series of $h_e$ is the cosine series of h. Therefore
$a_0 = \frac{2c}{\pi}\int_0^{\pi/2} x\,dx + \frac{2c}{\pi}\int_{\pi/2}^{\pi} (\pi - x)\,dx = \frac{\pi c}{2}$, and
$a_n = \frac{2c}{\pi}\int_0^{\pi/2} x\cos nx\,dx + \frac{2c}{\pi}\int_{\pi/2}^{\pi} (\pi - x)\cos nx\,dx = \frac{2c}{\pi n^2}\left[2\cos\frac{n\pi}{2} + (-1)^{n+1} - 1\right].$
Thus $h_e(x) \sim \frac{c\pi}{4} - \frac{c}{\pi}\sum_{k=1}^{\infty} \frac{1 - (-1)^k}{k^2}\cos 2kx$. 7. (b) $f(x) \sim \frac{2}{\pi} -
\frac{4}{\pi}\sum_{k=1}^{\infty} \frac{1}{4k^2 - 1}\cos 2kx$. 8. By integration by parts, $b_k = -\frac{1}{\pi k^2}\int_{-\pi}^{\pi} f''(x)\sin kx\,dx$.
Since $f'' \in R[-\pi, \pi]$, it is bounded on [−π, π]; i.e., $|f''(x)| \le M$. Therefore
$|b_k| \le 2M/k^2$. Similarly for $a_k$. Thus by the Weierstrass M-test, the Fourier series
of f converges uniformly on [−π, π].
Exercises 9.4. page 442
2. $\frac{\pi^4}{90}$. 3. To show that $\{\sin nx\}_{n=1}^{\infty}$ is complete on [0, π] it suffices to show that
Parseval’s equality holds for every f ∈ R[0, π]; i.e., $\sum_{n=1}^{\infty} b_n^2 = \frac{2}{\pi}\int_0^\pi f^2(x)\,dx$. To accomplish
this, let $f_o$ denote the odd extension of f to [−π, π]. Since $f_o$ is odd, $a_n = 0$
for all n = 0, 1, 2, ..., and $b_n = \frac{2}{\pi}\int_0^\pi f(x)\sin nx\,dx$. Since the orthogonal system
$\{1, \cos nx, \sin nx\}_{n=1}^{\infty}$ is complete on [−π, π], $\sum_{n=1}^{\infty} b_n^2 = \frac{1}{\pi}\int_{-\pi}^{\pi} f_o^2(x)\,dx = \frac{2}{\pi}\int_0^\pi f^2(x)\,dx$.
6. Let $S_n(t) = \frac12 a_0 + \sum_{k=1}^{n} (a_k\cos kt + b_k\sin kt)$. If x ∈ [−π, π], then
$\left|\int_{-\pi}^{x} f(t)\,dt - \int_{-\pi}^{x} S_n(t)\,dt\right| \le \int_{-\pi}^{x} |f(t) - S_n(t)|\,dt.$
Thus by Exercise 5(a) and Theorem 9.4.7, $\lim_{n\to\infty} \int_{-\pi}^{x} S_n(t)\,dt = \int_{-\pi}^{x} f(t)\,dt$, with
the convergence being uniform on [−π, π]. But $\int_{-\pi}^{x} S_n(t)\,dt = \frac12 a_0(x + \pi) +
\sum_{k=1}^{n} \left[\frac{a_k}{k}\sin kx - \frac{b_k}{k}(\cos kx - \cos k\pi)\right]$, from which the result follows. 10. Use
Lemma 9.4.8 and the Weierstrass approximation theorem.
Exercises 9.5. page 453
2. $\frac{\pi^2}{12}$, $\frac{\pi^2}{6}$. 3. (a) $\frac12$. (b) $\frac12 - \frac{\pi}{4}$. 6. (a) $\frac{e^{a\pi} - 1}{a\pi} + \frac{2a}{\pi}\sum_{k=1}^{\infty} \frac{(-1)^k e^{a\pi} - 1}{a^2 + k^2}\cos kx$.
(b) On [−π, π], the series converges to $e^{|ax|}$, and thus to the 2π-periodic extension
of $e^{|ax|}$ on all of R.
Chapter 10
Exercises 10.2. page 474
2. Since U is open and non-empty, there exists x ∈ U and r > 0 such that
(x − r, x + r) ⊂ U . Thus by Theorem 10.2.4, m(U ) ≥ m((x − r, x + r)) = 2r > 0. 3.
Let $V_\epsilon = (-\epsilon, 1 + \epsilon)$. Then $V_\epsilon$ is an open set containing P. Show that $m(V_\epsilon \setminus P) \ge 1$
for all ε > 0 and thus m(P) < 2ε for every ε > 0. Alternately, show that
$m(P^c \cap [0, 1]) = 1$ and use Theorem 10.2.15. 6. First show that there exist disjoint
bounded open sets $U_1, U_2$ with $U_1 \supset K_1$ and $U_2 \supset K_2$. Then $m(K_1 \cup K_2) =
m(U_1 \cup U_2) - m((U_1 \cup U_2) \setminus (K_1 \cup K_2))$. But $(U_1 \cup U_2) \setminus (K_1 \cup K_2) = (U_1 \setminus K_1) \cup (U_2 \setminus K_2)$.
Now use Theorem 10.2.9.
Exercises 10.3. page 480
1. (b) First show that if U is any open set, then U + x is open and m(U + x) = m(U).
Use this and the definition to prove that $\lambda^*(E + x) = \lambda^*(E)$. If K is compact and U
is a bounded open set containing K, show that (U + x) \ (K + x) = (U \ K) + x. Use
this to show that m(K + x) = m(K) and $\lambda_*(E + x) = \lambda_*(E)$. 3. Since $E_1 \cap E_2 \subset E_1$
and $\lambda^*(E_1) = 0$, $\lambda^*(E_1 \cap E_2) = 0$. Thus by Theorem 10.3.5, $E_1 \cap E_2$ is measurable.
For $E_1 \cup E_2$ apply Theorem 10.3.9. 6. If $\lambda^*(E) < \infty$, then for each k ∈ N there
exists an open set $U_k$ with $U_k \supset E$ such that $m(U_k) < \lambda^*(E) + \frac1k$. Now use the fact
that $E \subset \bigcap U_n \subset U_k$ for all k ∈ N. 8. Set $E_k = E \cap [-k, k]$, k ∈ N. Then $\{\lambda_*(E_k)\}$
is monotone increasing with $\lambda_*(E_k) \le \lambda_*(E)$ for all k ∈ N. Let $\alpha = \lim_{k\to\infty} \lambda_*(E_k)$.
Suppose $\alpha < \lambda_*(E)$. Choose β ∈ R such that $\alpha < \beta < \lambda_*(E)$. By definition there
exists a compact set K with K ⊂ E such that m(K) > β. Use this to show that
there exists $k_o \in N$ such that $\lambda_*(E_k) > \beta$ for all $k \ge k_o$, which is a contradiction.
Exercises 10.4. page 488
2. If E is bounded, the result follows from the definition of $\lambda^*(E)$ and $\lambda_*(E)$, and
Theorem 10.4.5(b) (for a finite union). If E is unbounded, let $E_n = E \cap (-n, n)$.
Given ε > 0, choose $U_n$ open such that $E_n \subset U_n$ and $\lambda(U_n \setminus E_n) < \epsilon/2^n$. Let $U = \bigcup U_n$.
Show that $U \setminus E \subset \bigcup (U_n \setminus E_n)$. Now use Theorem 10.3.5 to show that λ(U \ E) < ε.
To obtain a closed set F ⊂ E satisfying λ(E \ F) < ε, apply the result for open sets to
$E^c$. 4. First show that $\lambda(E_1 \cup E_2) = 1$; then use Theorem 10.4.1. 6. If E satisfies
$\lambda^*(E \cap T) + \lambda^*(E^c \cap T) = \lambda^*(T)$ for every T ⊂ R, then E satisfies Theorem 10.4.2 and
thus is measurable. Conversely, suppose E is measurable and T ⊂ R. If $\lambda^*(T) = \infty$,
the result is true. Assume $\lambda^*(T) < \infty$. Let ε > 0 be arbitrary. Then there exists an
open set U ⊃ T such that $\lambda(U) < \lambda^*(T) + \epsilon$. Since E and U are measurable, E ∩ U
and $E^c \cap U$ are disjoint measurable sets with $(E \cap U) \cup (E^c \cap U) = U$. Furthermore,
E ∩ U ⊃ E ∩ T and $E^c \cap U \supset E^c \cap T$. Thus by Theorem 10.3.9
$\lambda^*(T) \le \lambda^*(E \cap T) + \lambda^*(E^c \cap T) \le \lambda(E \cap U) + \lambda(E^c \cap U) = \lambda(U) < \lambda^*(T) + \epsilon.$
Since the above holds for every ε > 0, we have $\lambda^*(E \cap T) + \lambda^*(E^c \cap T) = \lambda^*(T)$.
Exercises 10.5. page 494

[0, 1],


if c < 0,
(0, 1], if 0 ≤ c < 1,
1. {x : f (x) > c} = 1


 (0, c
) ∪ {1}, if 1 ≤ c < 2,
1
(0, c ), if 2 ≤ c.

T
5. If c > 0, then {x : 1/g(x) > c} = {x : g(x) > 0} {x : g(x) < 1/c}. Since g
is measurable, each of the sets {g(x) > 0} and {g(x) < 1/c} is measurable. Thus
556 Hints and Solutions

their intersection is measurable. The case c < 0 is treated similarly. 7. If f is


continuous on [a, b], then f −1 ((s, ∞)) is open in [a, b] for every s ∈ R. Thus for
a fixed s, f −1 ((s, ∞)) = U ∩ [a, b] where U is open in R. Since both U and [a, b]
are measurable, so is f −1 ((s, ∞)), i.e., f is measurable. 10. (a) If c ≥ 0, then
{x : f + (x) > c} = {x : f (x) > c}, and if c < 0, then {x : f + (x) > c} = E. Since
each of the sets {f (x) > c} and E are measurable, f + is measurable. (c) Not in
general. If E is a non-measurable set, consider the function that is 1 on E and −1
on E c . 12. Since both xn and f are measurable, by Theorem 10.5.4 their product
is measurable. Suppose |f (x)| ≤ M for all x ∈ [0, 1], then |fn (x)| ≤ M xn from which
the result now follows. 14. Since f is differentiable on [a, b],
$f'(x) = \lim_{n\to\infty} n[f(x + \frac1n) - f(x)]$
for all x ∈ [a, b]. For each n ∈ N, $g_n(x) = n[f(x + \frac1n) - f(x)]$ is measurable (Justify).
Thus by Corollary 10.5.10, the function f′ is measurable. 15. First show that given
ε, δ > 0, there exists a measurable set E ⊂ [a, b] and $n_o \in N$ such that λ([a, b] \ E) < ε
and $|f_n(x) - f(x)| < \delta$ for all x ∈ E and $n \ge n_o$. To accomplish this, for each k ∈ N
consider
$A_k = \{x : |f_n(x) - f(x)| < \delta \text{ for all } n \ge k\}.$
Now show that $\lim_{k\to\infty} \lambda(A_k^c) = 0$. Here $A_k^c = [a, b] \setminus A_k$. Complete the proof of Egorov’s
theorem as follows: By the above, for each k ∈ N, there exists a measurable set $E_k$
and an integer $n_k$ such that $\lambda(E_k^c) < \epsilon/2^k$ and $|f(x) - f_n(x)| < 1/k$ for all $x \in E_k$
and $n \ge n_k$. The set $E = \bigcap E_k$ will have the desired properties.
Exercises 10.6. page 506
1. For each n ∈ N, let $\varphi_n = \sum_{j=1}^{n} (m + (j - 1)\frac{\beta}{n})\chi_{E_j}$. Then $\varphi_n$ is a simple function on
[a, b] with $\int_a^b \varphi_n\,d\lambda = S_n(f)$. Furthermore, for each x ∈ [a, b], $0 \le f(x) - \varphi_n(x) \le
\beta/n$. Therefore $\lim_{n\to\infty} \varphi_n(x) = f(x)$ for all x ∈ [a, b]. Now apply the bounded convergence
theorem. 3. Suppose |f| ≤ M. Then $\left|\int_A f\,d\lambda\right| \le \int_A |f|\,d\lambda \le M\lambda(A)$, which
proves the result. 5. By Theorem 10.6.10(b), $\int_F f\,d\lambda = \int_E f\,d\lambda + \int_{F\setminus E} f\,d\lambda \ge \int_E f\,d\lambda$.
7. For each n ∈ N, let $E_n = \{x : f(x) > \frac1n\}$. Then $\bigcup E_n = \{x : f(x) > 0\}$. Use
the previous exercise to show $\lambda(E_n) = 0$. Now use Theorem 10.4.5. 12. The function
$\varphi_n$ defined in the solution to Exercise 1 satisfies $|f(x) - \varphi_n(x)| < \beta/n$ for
all x ∈ [a, b]. Thus $\{\varphi_n\}$ converges uniformly to f on [a, b]. 15. (b) Suppose
first that $\varphi = \chi_A$ where A is a measurable subset of [a, b]. By Exercise 2, Section
10.4, there exists an open set U ⊃ A such that λ(U \ A) < ε/2. Use the set U to
show that there exists a finite number of disjoint closed intervals $\{J_n\}_{n=1}^{N}$ such that
$V = \bigcup_{n=1}^{N} J_n \subset U$ and λ(U \ V) < ε/2. Let $h = \sum_{n=1}^{N} \chi_{J_n}$. Then h is a step function
on [a, b] and $\{x : h(x) \ne \varphi(x)\} \subset (U \setminus V) \cup (U \setminus A)$. If $\varphi = \sum_{j=1}^{n} \alpha_j\chi_{A_j}$, where the $A_j$
are disjoint measurable subsets of [a, b], approximate each $\chi_{A_j}$ by a step function $h_j$
which agrees with $\chi_{A_j}$ except on a set of measure less than ε/n.
Exercises 10.7. page 516
2. (a) Assume first that $A_1$ and $A_2$ are bounded measurable sets. For each n ∈ N,
set $f_n = \min\{f, n\}$. By Theorem 10.7.4(b)
$\int_{A_1 \cup A_2} f_n\,d\lambda = \int_{A_1} f_n\,d\lambda + \int_{A_2} f_n\,d\lambda.$
Since each of the sequences $\{\int_{A_i} f_n\}_{n=1}^{\infty}$, i = 1, 2, is monotone increasing, they converge
either to a finite number, or to ∞. In either case,
$\int_{A_1 \cup A_2} f\,d\lambda = \lim_{n\to\infty} \int_{A_1 \cup A_2} f_n\,d\lambda = \lim_{n\to\infty} \left[\int_{A_1} f_n\,d\lambda + \int_{A_2} f_n\,d\lambda\right] = \int_{A_1} f\,d\lambda + \int_{A_2} f\,d\lambda.$
If either $A_1$ or $A_2$ is unbounded, consider the integral of f over $(A_1 \cup A_2) \cap [-n, n]$,
and use the above.
4. For $f_p(x) = x^{-p}$, $x \in (0, 1)$,
$f_n(x) = \min\{f_p(x), n\} = \begin{cases} n, & 0 < x < n^{-1/p}, \\ x^{-p}, & n^{-1/p} \le x < 1. \end{cases}$
Therefore
$\int_0^1 f_n\,d\lambda = \int_0^{n^{-1/p}} n\,dx + \int_{n^{-1/p}}^{1} x^{-p}\,dx = \frac{1}{n^{(1-p)/p}} + \frac{1}{1-p}\left[1 - \frac{1}{n^{(1-p)/p}}\right] = \frac{1}{1-p}\left[1 - \frac{p}{n^{(1-p)/p}}\right].$
Since $(1 - p) > 0$, $\lim_{n\to\infty} \int_0^1 f_n\,d\lambda = 1/(1-p)$.
7. $0 \le \sum_{n=1}^{N} n\lambda(E_n) \le \sum_{n=1}^{N} \int_{E_n} f\,d\lambda = \int_{\bigcup_{n=1}^{N} E_n} f\,d\lambda \le \int_a^b f\,d\lambda < \infty$.
9. Justify first why $f$ can be assumed to be nonnegative and then use Definition 10.7.1.
10. Let $A_n = A \cap [-n, n]$. By definition, $\int_A f\,d\lambda = \lim_{n\to\infty} \int_{A_n} f\,d\lambda$. Since $f$ is integrable, given $\epsilon > 0$, there exists $n \in \mathbb{N}$ such that $0 \le \int_A f\,d\lambda - \int_{A_n} f\,d\lambda < \epsilon$. By Theorem 10.7.4(b), $\int_A f\,d\lambda - \int_{A_n} f\,d\lambda = \int_{A \setminus A_n} f\,d\lambda$. Thus $E = A_n$ is the desired set.
13. Let $f_n = \min\{|f|, n\}$. Then $0 \le f_n \le |f|$. Since $f$ is integrable, by Lebesgue’s dominated convergence theorem $\lim_{n\to\infty} \int_A f_n\,d\lambda = \int_A |f|\,d\lambda$. Therefore, given $\epsilon > 0$, there exists $n \in \mathbb{N}$ such that $\int_A (|f| - f_n)\,d\lambda < \epsilon/2$. In particular, if $E$ is any measurable subset of $A$,
$\int_E |f|\,d\lambda < \int_E f_n\,d\lambda + \tfrac{1}{2}\epsilon.$
But if $\lambda(E) < \infty$, $\int_E f_n\,d\lambda \le n\lambda(E)$. Hence choose $\delta > 0$ so that $n\delta < \epsilon/2$.
15. For $n \in \mathbb{N}$ set $h_n(t, x) = n\left[\sin\bigl((t + \tfrac{1}{n}) f(x)\bigr) - \sin(t f(x))\right]$. Show that $|h_n(t, x)| \le 2|f(x)|$ and apply Lebesgue’s dominated convergence theorem.
16. First prove the result for the characteristic function of an interval. Then use Exercise 9 above and Exercise 15 of Section 10.6.
19. Since $\{f_n\}$ is monotone increasing on $A$, $f(x) = \lim_{n\to\infty} f_n(x)$ exists, either as a finite number or as $\infty$, for every $x \in A$. By Fatou’s lemma, $\int_A f\,d\lambda \le \underline{\lim} \int_A f_n\,d\lambda$. On the other hand, since $f_n \le f$ for all $n \in \mathbb{N}$, $\overline{\lim} \int_A f_n\,d\lambda \le \int_A f\,d\lambda$. Combining the two inequalities proves the result.
21. (a) Use the monotone convergence theorem.
23. Hint: $|f(x)| \ge (1/x)|\cos(1/x^2)| - 2x \ge x^{-1} - 2x$ on each of the intervals $((2n + \tfrac{1}{3})\pi)^{-1/2} \le x \le ((2n - \tfrac{1}{3})\pi)^{-1/2}$.
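The closed form in the hint for Exercise 4 can be checked numerically. The sketch below compares it with a midpoint Riemann sum of $\min\{x^{-p}, n\}$ on $(0, 1)$; the function names, the choice $p = 1/2$, and the number of subintervals are illustrative assumptions, not part of the text.

```python
def closed_form(p, n):
    """Integral of min(x**-p, n) over (0, 1) for 0 < p < 1, as in the hint:
    (1 / (1 - p)) * (1 - p / n**((1 - p) / p))."""
    return (1.0 - p * n ** (-(1.0 - p) / p)) / (1.0 - p)


def midpoint_sum(p, n, steps=200_000):
    """Midpoint Riemann sum of min(x**-p, n) over (0, 1)."""
    h = 1.0 / steps
    return h * sum(min(((i + 0.5) * h) ** (-p), n) for i in range(steps))


p = 0.5
for n in (4, 16, 64):
    # The two columns should agree closely and increase toward 1 / (1 - p).
    print(n, closed_form(p, n), midpoint_sum(p, n))
print("limit 1/(1-p) =", 1.0 / (1.0 - p))
```

For $p = 1/2$ and $n = 4$ the hint gives $2(1 - \tfrac{1}{2}\cdot\tfrac{1}{4}) = 1.75$, which matches the direct computation $\int_0^{1/16} 4\,dx + \int_{1/16}^{1} x^{-1/2}\,dx = 0.25 + 1.5$.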
Exercises 10.8. page 526
1. $0 < p < \tfrac{1}{2}$.
4. Since $|f + g|^2 \le 2(|f|^2 + |g|^2)$, the function $f + g \in L^2(A)$. Assume $\|f + g\|_2 \ne 0$. Then
$\|f + g\|_2^2 = \int_A |f + g|^2 \le \int_A |f + g||f| + \int_A |f + g||g|,$
which by the Cauchy-Schwarz inequality is $\le \|f + g\|_2 \|f\|_2 + \|f + g\|_2 \|g\|_2$. The result follows upon simplification.
7. Use the Cauchy-Schwarz inequality.
9. (a) By Bessel’s inequality it is the Fourier series of an $L^2([0, \pi])$ function.
13. Let $f(x) = 1/\ln x$, $x \in (1, \infty)$. Since $f''(x) > 0$ for all $x \in (1, \infty)$, $f$ is convex on $(1, \infty)$ (see Miscellaneous Exercise 3, Chapter 5). Therefore $f(\tfrac{1}{2}x + \tfrac{1}{2}y) \le \tfrac{1}{2}f(x) + \tfrac{1}{2}f(y)$ for all $x, y \in (1, \infty)$. Since $n = \tfrac{1}{2}(n - 1) + \tfrac{1}{2}(n + 1)$ the inequality follows.
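The convexity claim in Exercise 13 can be verified by a direct computation, sketched here. The final line is the midpoint inequality with $x = n - 1$ and $y = n + 1$, which is presumably the inequality the exercise asks for (its statement is not reproduced in these hints); the restriction $n \ge 3$ is added so that both points lie in $(1, \infty)$.

```latex
\begin{align*}
f(x)   &= \frac{1}{\ln x}, \qquad x \in (1, \infty), \\
f'(x)  &= -\frac{1}{x(\ln x)^2}, \\
f''(x) &= \frac{(\ln x)^2 + 2\ln x}{x^2(\ln x)^4}
        = \frac{\ln x + 2}{x^2(\ln x)^3} > 0 \quad \text{for } x > 1, \\
\frac{1}{\ln n} &= f\!\left(\tfrac{1}{2}(n-1) + \tfrac{1}{2}(n+1)\right)
        \le \frac{1}{2}\left(\frac{1}{\ln(n-1)} + \frac{1}{\ln(n+1)}\right), \qquad n \ge 3.
\end{align*}
```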
Index

$A = B$, 3
$A \cap B$, 3
$A \cup B$, 3
$A \sim B$, 37
$A \subset B$, 3
$A \subsetneq B$, 3
$A \times B$, 5
$A^c$, 3
$B(x, y)$, 402
$B \setminus A$, 3
$E'$, 64
$E_n(f)$, 283
$J_f(p)$, 177
$L_f$ ($U_f$), 293
$M_n(f)$, 281
$R_n(f, c)(x)$, 388
$T_n(f, c)(x)$, 386
$[x]$, 166
$\Gamma(x)$, 398
$\bigcup E_\alpha$, $\bigcap E_\alpha$, 41
$\chi_E$, 293, 467
$\ell^2$, 328
$\inf E$, 24
$\int_A f\,d\lambda$, 509
$\int_A f\,d\lambda$, 497
$\int_a^b f\,d\alpha$, 267
$\int_a^b f$, 228
$\operatorname{Int}(E)$, 59
$\lambda(E)$, 477
$\lambda^*(E)$, $\lambda_*(E)$, 475
$\langle\ ,\ \rangle$, 80, 330, 408
$C(K)$, 356
$L(A)$, $L^1(A)$, 513
$L^2(A)$, 518
$L^2([a, b])$, 520
$L_L(P, f)$, $U_L(P, f)$, 496
$M$, 486
$P(A)$, 4
$R(\alpha)$, 270
$R[a, b]$, 228
$S(P, f)$, 243
$S(P, f, \alpha)$, 277
$U(P, f, \alpha)$ ($L(P, f, \alpha)$), 265
$U(P, f)$, $L(P, f)$, 225
$\overline{E}$, 64
$\overline{\int_a^b} f\,d\alpha$, $\underline{\int_a^b} f\,d\alpha$, 266
$\overline{\int_a^b} f$, $\underline{\int_a^b} f$, 226
$\sigma$-algebra, 487
$\sup E$, 23
$\overline{\lim}\, s_n$, $\underline{\lim}\, s_n$, 107
$d(p, A)$, 158
$d_2(p, q)$, $d_1(p, q)$, 54
$d_\infty(p, q)$, 55
$f(E)$, $f^{-1}(H)$, 10
$f(p+)$, $f(p-)$, 163, 445
$f : A \to B$, 8
$f'_+(p)$, $f'_-(p)$, 184
$f^+$, $f^-$, 512
$m(U)$, 463
$n$th derivative, 184
$n$th root, $\sqrt[n]{x}$, $x^{1/n}$, 31
$n$th term, 39, 84, 120
$x \in A$, $x \notin A$, 2
$y = f(x)$, 7

Abel partial summation formula, 316
Abel’s test, 320
Abel’s theorem, 380
Abel, Niels, 316
absolute maximum (minimum), 192
absolute value, 52
absolutely convergent, 321
absolutely integrable, 262
algebraic number, 46
almost everywhere (a.e.), 491
alternating series, 317
alternating series test, 317
antiderivative, 249
approaches infinity, 100
approximate identity, 371
approximation in the mean, 412
Archimedes, 46, 297, 334
Archimedian Property, 30
arithmetic-geometric mean inequality, 21
associative, 21
associative law, 332
at most countable set, 38
axiom of choice, 530

Barrow, Isaac, 297
Bernoulli’s inequality, 17
Bernoulli, Jakob, 220
Bernoulli, Johann, 206, 220, 335
Bessel’s inequality, 416, 424, 523
Beta function, 402
binary expansion, 33
binomial coefficient, 90
binomial series, 400
binomial theorem, 90, 392
Bolzano, Bernhard, 51, 73, 83, 144
Bolzano-Weierstrass theorem, 75, 105
Borel, Emile, 70, 73
boundary point, 68
bounded, 23
bounded above, 22
bounded below, 23
bounded convergence theorem, 361, 504
bounded function, 55, 138
bounded sequence, 85
bounded set, 74, 76

canonical representation, 498
Cantor ternary function, 179
Cantor ternary set, 78
Cantor, Georg, 1, 43, 46, 51, 124, 456
Carathéodory, Constantin, 459, 487
cardinality, 37
Cartesian product, 4
Cauchy condensation test, 314
Cauchy criterion, 121, 347
Cauchy mean value theorem, 197
Cauchy product, 336
Cauchy sequence, 113, 125, 357
Cauchy’s form of the remainder, 392
Cauchy, Augustin-Louis, 83, 121, 129, 144, 182, 220, 223, 296, 301, 336, 340, 378
Cauchy-Schwarz inequality, 80, 257, 329, 412, 418, 520
chain rule, 189
change of variable theorem, 254
characteristic function, 293, 467
closed in, 66
closed interval, 27
closed set, 60
closure of E, 64
coefficients of a power series, 378
Cohen, Paul, 47
collection, 2
commutative, 21
commutative law, 332
compact set, 70
comparison test, 262, 303
complement of, 3
complete metric space, 115
complete normed linear space, 357
complete orthogonal sequence, 421
completeness property, 25, 115
complex number, 48
composition, 13
conditionally convergent, 321
connected set, 67
constant sequence, 84
contained in, 3
continuous almost everywhere, 492
continuous at a point, 145, 337
continuous on, 145
continuum hypothesis, 47
contraction mapping, 358
contractive, 162
contractive sequence, 117
converge almost everywhere, 492
convergence in the mean, 419, 521
convergent improper integral, 258, 260
convergent sequence, 84
converges, 120, 302
converges in measure, 529
converges in norm, 357, 419
converges in the mean, 439
converges pointwise, 340
converges uniformly, 345
convex (concave up), 221
countable set, 38

d’Alembert, Jean, 309, 335
Darboux, Jean Gaston, 201, 224
De Morgan’s laws, 4, 42
decreasing, 167
Dedekind, Richard, 1
dense, 31, 65
denumerable, 38
derivative, 183
differentiable, 183
Dini’s theorem, 355
Dini, Ulisse, 450
Dirac sequence, 372
Dirichlet kernel, 433
Dirichlet test, 316, 351
Dirichlet’s theorem, 445
Dirichlet, Peter Lejeune, 315, 407, 454
discontinuity of first kind, 165
discontinuity of second kind, 165
disjoint, 3
distance, 333
distance from a point to a set, 158
distance function, 53
distribution function, 172
distributive, 21
distributive laws, 4, 42
diverge, 120
divergent improper integral, 258, 260
divergent sequence, 84
divergent series, 302
diverges to ∞, 100, 206
diverges to infinity, 100, 302
domain, 7

Egorov’s theorem, 495
element of, 2
empty set, 2
enumerable, 38
enumeration, 39
equal, 3, 5
equal almost everywhere, 491
equivalence class, 125
equivalent functions, 519
equivalent sequences, 125
equivalent sets, 37
error, 283
error function, 388
Euclid, 334
euclidean distance, 53
euclidean length, 80
euclidean norm, 54
Euler’s constant, 315
Euler’s number e, 99
Euler, Leonhard, 99, 144, 220, 280, 297, 398
even extension, 430
even function, 247, 425
exponential function, 257

factorial, 19, 91
family, 2
Fatou’s lemma, 511
Fejér kernel, 436, 437
Fejér’s theorem, 438
Fejér, L., 438, 452
Fermat, Pierre de, 297
field, 21
finite expansion, 35
finite set, 38
first derivative test, 199
first order method, 283
fixed point, 162
Fourier coefficient, 415, 424, 523
Fourier cosine coefficient, 430
Fourier cosine series, 430
Fourier series, 415, 424
Fourier sine coefficient, 430
Fourier sine series, 430
Fourier, Joseph, 407, 454
fourth order method, 288
function, 6
fundamental theorem of calculus, 249, 251
fundamental theorem of calculus for the Lebesgue integral, 507

Gödel, Kurt, 47
Galileo, 37
Gamma function, 263, 398
geometric series, 120
Gibbs phenomenon, 455
Gibbs, Josiah, 455
graph, 7
greatest integer, 166
greatest lower bound, 24
Greatest Lower Bound Property, 25
Gregory, James, 339

half-closed interval, 27
half-open interval, 27
harmonic series, 307
Heine, Eduard, 70, 73, 161, 456
Heine-Borel theorem, 74
Heine-Borel-Bolzano-Weierstrass theorem, 74

identity function, 9
image of, 10
improper Riemann integral, 258, 260
increasing, 167
index set, 40
indexed family, 40
infimum, 24
Infimum Property, 25
infinite expansion, 35
infinite intervals, 27
infinite limits, 100
infinite product, 124
infinite series, 120
infinite set, 38
infinitely differentiable, 383
inner measure, 475
inner product, 80, 330, 408
integers, 2
integrable function, 227, 267
integral form of the remainder, 390
integral test, 306
integration by parts formula, 254, 272
interior, 59
interior point, 59
intermediate value theorem, 152
intermediate value theorem for derivatives, 201
intersection, 3, 40
interval, 27
inverse function, 12, 173
inverse function theorem, 201
inverse image, 10
irrational numbers, 2
isolated point, 63

jump discontinuity, 165
jump of $f$ at $p$, 177

l’Hospital’s rule, 208, 211
l’Hospital, Marquis de, 206
Lagrange form of the remainder, 389
Lagrange, Joseph, 195, 220, 336, 389
Laplace transform, 298
largest element, 23
least squares approximation, 412
least upper bound, 23
Least Upper Bound Property, 25
Lebesgue integrable, 509, 513
Lebesgue integral, 497, 509, 513
Lebesgue measurable, 477
Lebesgue sums, 506
Lebesgue’s dominated convergence theorem, 515
Lebesgue’s theorem, 237, 293
Lebesgue, Henri, 74, 237, 297, 454, 459, 527
left continuous, 163
left derivative, 184
left limit, 163, 445
Legendre polynomials, 418
Leibniz’s rule, 192
Leibniz, Gottfried, 129, 181, 223, 296
limit, 84, 130
limit at ∞, 141
limit comparison test, 304
limit inferior, 107
limit of the sequence $\{f_n\}$, 340
limit point, 63
limit superior, 107
limit superior (inferior) of $f$, 178
linear approximation, 386
linear function, 337
Lipschitz condition, 159
Lipschitz function, 159
local maximum (minimum), 192
lower (upper) Lebesgue sums, 496
lower bound, 23
lower function, 293
lower integral, 226
lower Riemann-Stieltjes integral, 266
lower Riemann-Stieltjes sum, 265
lower sum, 225

Müntz-Szasz theorem, 376
Maclaurin series, 386
Maclaurin, Colin, 340, 386
mapping, 8
maps, 8
mathematical induction, 16
maximum element, 23
mean value theorem, 195
mean value theorem for integrals, 253, 272
mean-square convergence, 419
measurable function, 488
measurable partition, 496
measurable set, 477
measure of E, 477
measure of compact set, 471
measure of open set, 463
measure zero, 236, 291
Mengoli, Pietro, 335
Mercator, Nicolaus, 339, 394
mesh of a partition, 244
method of bisection, 214
metric, 53, 333
metric space, 53
midpoint approximation, 281
Minkowski’s inequality, 330, 520
modified principle of mathematical induction, 18
monotone, 95, 167
monotone increasing (decreasing), 95, 167
monotone convergence theorem, 518

natural exponential function, 203, 395
natural logarithm function, 99, 253
natural numbers, 2
neighborhood, 57
nested intervals property, 76, 96
Newton’s method, 216
Newton, Isaac, 129, 181, 214, 223, 296, 335, 339, 378, 394
nondecreasing (nonincreasing), 95, 167
nonnegative integers, 2
nonterminating expansion, 35
norm, 80, 328, 332, 411
norm convergence, 333, 357, 521
norm of a partition, 244, 293
normed linear space, 332
null sequence, 125

odd extension, 430
odd function, 247, 425
one-to-one function, 12
onto, 7
open cover, 70
open in, 66
open interval, 27
open set, 60
order properties, 22
ordered field, 22
ordered pairs, 5
Oresme, Nicole, 122, 335
orthogonal, 408, 409
orthonormal, 411
oscillation of $f$, 178
outer measure, 475

p-series, 307
pairwise disjoint, 65, 463
Parseval’s equality, 420, 441
partial sum, 120, 302, 341
partition, 224
periodic extension, 425
periodic function, 162, 370
piecewise continuous, 447
polynomial function, 139, 149
positive integers, 2
positive rational numbers, 22
positive real numbers, 22
power series, 378
power set, 4
Principle of Mathematical Induction, 16
projection, 9
proper subset, 3
Pythagoras, 1

quadratic method, 218

Raabe’s test, 315
radius of convergence, 378
range, 7
Raphson, Joseph, 214
ratio test, 309, 322
rational function, 149
rational numbers, 2
real analytic functions, 380
real numbers, 2
real-valued function, 8
rearrangement, 323
recursive, 19
refinement, 226, 498
reflexive, 37
relative complement, 3
removable discontinuity, 164
Reymond, P. du Bois, 452
Riemann integrable, 227
Riemann integral, 228
Riemann sum, 243
Riemann, Georg Bernhard, 223, 297, 324, 407, 454
Riemann-Lebesgue lemma, 426
Riemann-Stieltjes integrable, 267
Riemann-Stieltjes sum, 277
right limit, 163
right continuous, 163
right derivative, 184
right limit, 445
Rolle’s theorem, 194
Rolle, Michel, 194
root test, 310, 322

Schröder-Bernstein theorem, 48
secant line, 182
second derivative, 184
second derivative test, 204
second order method, 218, 283
Second Principle of Mathematical Induction, 19
sequence, 39, 84
sequence of functions, 340
sequence of subsets, 40
series, 120
series of functions, 341
set, 2
sigma-algebra, 487
simple discontinuities, 165
simple function, 498
Simpson approximation, 288
Simpson, Thomas, 287, 297
square integrable functions, 518
square root, 13
square summable, 328
squeeze theorem, 90
step function, 297, 441
Stieltjes, Thomas-Jean, 224
strictly decreasing (increasing), 95, 167
strictly monotone, 167
subsequence, 103
subsequential limit, 103
subset, 3
sum of the series, 120, 302, 341
supremum, 23
Supremum Property, 25
symmetric, 37

tangent line, 182
Taylor polynomial, 386
Taylor series, 386
Taylor, Brook, 340, 386
tends to ∞, 206
terminating expansion, 35
ternary expansion, 33
transitive, 37
trapezoidal approximation, 285
trapezoidal rule, 285
triangle inequality, 52, 53, 331, 333
trigonometric polynomial, 439
trigonometric series, 318, 423
trivial metric, 54
two-norm, 519

uncountable set, 38
uniform norm, 356
uniformly continuous, 158
uniformly differentiable, 205
union, 3, 40
unit, 22
unit jump function, 168
upper (lower) Lebesgue sums, 496
upper bound, 22
upper function, 293
upper integral, 226
upper Riemann-Stieltjes integral, 266
upper Riemann-Stieltjes sum, 265
upper sum, 225
usual metric, 53

value of the function, 7
vector, 80
vector space, 332

Weierstrass approximation theorem, 370
Weierstrass M-test, 348
Weierstrass, Karl, 51, 73, 187, 339, 365
well-ordered, 16
Well-Ordering Principle, 16

zero element, 22, 332
