# Turning Up the Heat: Insights from Tao’s ‘Analysis II’

- Foreword
- Analysis II
- 12: Metric Spaces
- Proving Completeness
- 13: Continuous Functions on Metric Spaces
- 14: Uniform Convergence
- Breaking Point
- Progress
- Convoluted No Longer
- Weierstrass Approximation Theorem
- 15: Power Series
- EXP
- Complex Exponentiation
- 16: Fourier Series
- 17: Several Variable Differential Calculus
- Implicit Function Theorem
- 18: Lebesgue Measure
- 19: Lebesgue Integration
- Conceptual Rotation
- Final Thoughts
- Forwards
- Tips
- Verification
- Why Bother?
- Fairness

# Foreword

It’s been too long—a month and a half since my last review, and about three months since *Analysis I*. I’ve been immersed in my work for CHAI, but reality doesn’t grade on a curve, and I want more mathematical firepower.

On the other hand, I’ve been cooking up something really special, so watch this space!

# Analysis II

## 12: Metric Spaces

*Metric spaces; completeness and compactness.*

### Proving Completeness

It sucks, and I hate it.

## 13: Continuous Functions on Metric Spaces

*Generalized continuity, and how it interacts with the considerations introduced in the previous chapter. Also, a terrible introduction to topology.*

There’s a lot I wanted to say here about topology, but I don’t think my understanding is good enough to break things down—I’ll have to read an actual book on the subject.

## 14: Uniform Convergence

*Pointwise and uniform convergence, the Weierstrass $M$-test, and uniform approximation by polynomials.*

### Breaking Point

Suppose we have some sequence of functions $f^{(n)} : [0,1] \to \mathbb{R}$, $f^{(n)}(x) := x^n$, which converge pointwise to the 1-indicator function $f : [0,1] \to \mathbb{R}$ (*i.e.*, $f(1) = 1$ and $f(x) = 0$ otherwise). Clearly, each $f^{(n)}$ is (infinitely) differentiable; however, the limiting function $f$ isn’t even continuous at $x = 1$, let alone differentiable there! Basically, pointwise convergence isn’t at all strong enough to stop the limit from “snapping” the continuity of its constituent functions.
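A quick numerical sketch of why the convergence here is only pointwise. It assumes the sequence in question is $f^{(n)}(x) = x^n$ on $[0,1]$ (the standard example); the helper names `f_n`, `limit_f`, and `witness` are made up for illustration.

```python
def f_n(n: int, x: float) -> float:
    """n-th function of the (assumed) sequence f_n(x) = x^n on [0, 1]."""
    return x ** n


def limit_f(x: float) -> float:
    """Pointwise limit: 1 at x = 1 and 0 everywhere else on [0, 1]."""
    return 1.0 if x == 1.0 else 0.0


def witness(n: int) -> float:
    """A point x_n < 1 where f_n(x_n) = 1/2, so |f_n - limit_f| is 1/2 there."""
    return 0.5 ** (1.0 / n)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        x_n = witness(n)
        gap = abs(f_n(n, x_n) - limit_f(x_n))
        # At a fixed point such as 0.9 the values shrink to 0 (pointwise convergence),
        # but the gap at the moving witness point stays near 1/2 (no uniform convergence).
        print(n, f_n(n, 0.9), round(gap, 6))
```

For every $n$ there is a point just left of $1$ where the function is still at height $1/2$, so the sup-norm distance to the limit never drops below $1/2$.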

### Progress

As in previous posts, I mark my progression by sharing a result derived without outside help.

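*Already proven:* $\int_{-1}^{1} (1 - x^2)^N \, dx \geq \frac{1}{\sqrt{N}}$.

*Definition.* Let $\epsilon > 0$ and $0 < \delta < 1$. A function $f : \mathbb{R} \to \mathbb{R}$ is said to be an *$(\epsilon, \delta)$-approximation to the identity* if it obeys the following three properties:

- $f$ is compactly supported on $[-1, 1]$.
- $f$ is continuous, and $\int_{-\infty}^{\infty} f = 1$.
- $|f(x)| \leq \epsilon$ for all $\delta \leq |x| \leq 1$.

*Lemma:* For every $\epsilon > 0$ and $0 < \delta < 1$, there exists an $(\epsilon, \delta)$-approximation to the identity which is a polynomial $P$ on $[-1, 1]$.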

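*Proof of Exercise 14.8.2(c).* Suppose $c \in \mathbb{R}$ and $N \in \mathbb{N}$; define $f(x) := c(1 - x^2)^N$ for $x \in [-1, 1]$ and $f(x) := 0$ otherwise. Clearly, $f$ is compactly supported on $[-1, 1]$ and is continuous. We want to find $c, N$ such that the second and third properties are satisfied. Since $(1 - x^2)^N$ is non-negative on $[-1, 1]$, $c$ must be positive, as $f$ must integrate to $1$. Therefore, $f$ is non-negative.

We want to show that $|c(1 - x^2)^N| \leq \epsilon$ for all $\delta \leq |x| \leq 1$. Since $f$ is non-negative, we may simplify to $(1 - x^2)^N \leq \frac{\epsilon}{c}$. Since the left-hand side is monotone increasing on $[-1, 0]$ and monotone decreasing on $[0, 1]$, its largest value on $\delta \leq |x| \leq 1$ is attained at $|x| = \delta$, so we substitute $x = \delta$ without loss of generality. As both sides are positive, we may take the reciprocal and multiply by $\epsilon$, arriving at $c \leq \epsilon (1 - \delta^2)^{-N}$.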

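We want $\int_{-\infty}^{\infty} f = 1$; as $f$ is compactly supported on $[-1, 1]$, this is equivalent to $\int_{[-1,1]} f = 1$. Using basic properties of the Riemann integral, we have $\int_{[-1,1]} (1 - x^2)^N = \frac{1}{c}$. Substituting in for $c$,

$$\frac{1}{\epsilon}(1 - \delta^2)^N \leq \frac{1}{\sqrt{N}} \leq \int_{[-1,1]} (1 - x^2)^N = \frac{1}{c},$$

with the second inequality already having been proven earlier. Note that although the first inequality is not always true, we can make it so: since $\delta$ is fixed and $1 - \delta^2 \in (0, 1)$, the left-hand side approaches $0$ more quickly than $\frac{1}{\sqrt{N}}$ does. Therefore, we can make $N$ as large as necessary; isolating $\epsilon$,

$$\sqrt{N}\,(1 - \delta^2)^N \leq \epsilon,$$

which holds for all sufficiently large $N$ because $\sqrt{N}\, r^N \to 0$ as $N \to \infty$ whenever $0 < r < 1$. Then set $N$ to be any natural number such that this inequality is satisfied. Finally, we set $c := \left( \int_{[-1,1]} (1 - x^2)^N \right)^{-1}$. By construction, these values of $c, N$ satisfy the second and third properties. □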
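The construction can be checked numerically. The sketch below assumes the kernel has the form $f(x) = c(1 - x^2)^N$ on $[-1, 1]$ and that $N$ is chosen as the smallest natural number with $\sqrt{N}(1 - \delta^2)^N \leq \epsilon$; the function names are hypothetical, and the closed form $\int_{-1}^{1} (1 - x^2)^N \, dx = 2 \prod_{k=1}^{N} \frac{2k}{2k+1}$ is used in place of numerical integration.

```python
import math


def kernel_integral(N: int) -> float:
    """Exact value of the integral of (1 - x^2)^N over [-1, 1]."""
    value = 2.0
    for k in range(1, N + 1):
        value *= (2 * k) / (2 * k + 1)
    return value


def choose_N(eps: float, delta: float) -> int:
    """Smallest N >= 1 with sqrt(N) * (1 - delta^2)^N <= eps."""
    N = 1
    while math.sqrt(N) * (1 - delta**2) ** N > eps:
        N += 1
    return N


def approximation_to_identity(eps: float, delta: float):
    """Return (N, c, f) where f(x) = c * (1 - x^2)^N on [-1, 1] and 0 elsewhere."""
    N = choose_N(eps, delta)
    c = 1.0 / kernel_integral(N)  # normalizes f so that it integrates to 1

    def f(x: float) -> float:
        return c * (1 - x * x) ** N if -1 <= x <= 1 else 0.0

    return N, c, f


if __name__ == "__main__":
    eps, delta = 0.01, 0.5
    N, c, f = approximation_to_identity(eps, delta)
    print("N =", N, "c =", c)
    print("f(delta) =", f(delta), "<= eps:", f(delta) <= eps)
```

Because $c = 1 / \int (1 - x^2)^N \leq \sqrt{N}$, the value $f(\delta) = c(1 - \delta^2)^N$ is at most $\sqrt{N}(1 - \delta^2)^N \leq \epsilon$, which is the third property at its worst point.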

### Convoluted No Longer

Those looking for an excellent explanation of convolutions, look no further!

### Weierstrass Approximation Theorem

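*Theorem.* Suppose $f : [a, b] \to \mathbb{R}$ is continuous and compactly supported on $[a, b]$. Then for every $\epsilon > 0$, there exists a polynomial $P$ such that $\sup_{x \in [a, b]} |P(x) - f(x)| \leq \epsilon$.

In other words, any continuous, real-valued function on a closed, bounded interval can be uniformly approximated with arbitrary precision by polynomials.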
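To see the theorem in action, here is a small sketch using Bernstein polynomials on $[0, 1]$. This is a different, constructive route to the same conclusion and is not the convolution-based proof from the book; the function names and the test function `kink` are chosen only for illustration.

```python
from math import comb


def bernstein(f, n):
    """Return the degree-n Bernstein polynomial of f on [0, 1]."""
    def B(x):
        return sum(
            f(k / n) * comb(n, k) * x**k * (1 - x) ** (n - k)
            for k in range(n + 1)
        )
    return B


def sup_error(f, p, samples=501):
    """Largest |f - p| over an evenly spaced grid on [0, 1] (a proxy for the sup norm)."""
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - p(x)) for x in grid)


def kink(x):
    """Continuous on [0, 1] but not differentiable at x = 1/2."""
    return abs(x - 0.5)


if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, sup_error(kink, bernstein(kink, n)))
```

The printed errors shrink as the degree grows, even though `kink` has a corner that no single polynomial can reproduce exactly.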

*Why I’m talking about this.* On one hand, this result makes sense, especially after taking machine learning and seeing how polynomials can be contorted into basically whatever shape you want.

On the other hand, I find this theorem intensely beautiful. Tao’s proof was slowly constructed, much to the reader’s benefit. I remember the very moment the proof sketch came to me, newly-installed gears whirring happily.

## 15: Power Series

*Real analytic functions, Abel’s theorem, $\exp$ and $\log$, complex numbers, and trigonometric functions.*

Cached thought from my CS undergrad: exponential functions always end up growing more quickly than polynomials, no matter the degree. Now, I finally have the gears to see why:

$$\exp(x) := \sum_{n=0}^{\infty} \frac{x^n}{n!}.$$

$\exp$ has *all* the degrees, so no polynomial (of necessarily finite degree) could ever hope to compete! This also suggests why $\frac{d}{dx} e^x = e^x$.
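One way to make the “all the degrees” remark precise, as a sketch that assumes the usual power-series definition $\exp(x) = \sum_{k \geq 0} x^k / k!$ and restricts to $x > 0$: every term of the series is positive, so keeping only the degree-$(n+1)$ term gives a lower bound that already beats $x^n$.

$$\exp(x) \;\geq\; \frac{x^{n+1}}{(n+1)!} \quad\Longrightarrow\quad \frac{\exp(x)}{x^n} \;\geq\; \frac{x}{(n+1)!} \;\xrightarrow{\;x \to \infty\;}\; \infty.$$

Differentiating the series term by term (legitimate inside the radius of convergence, which is infinite here) shifts each term down one degree and returns the same series:

$$\frac{d}{dx} \sum_{k=0}^{\infty} \frac{x^k}{k!} = \sum_{k=1}^{\infty} \frac{k\,x^{k-1}}{k!} = \sum_{k=1}^{\infty} \frac{x^{k-1}}{(k-1)!} = \sum_{k=0}^{\infty} \frac{x^k}{k!}.$$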

### Complex Exponentiation

You can multiply a number by itself some number of times.

[*nods*]

You can multiply a number by itself a negative number of times.

[Sure.]

You can multiply a number by itself an irrational number of times.

[OK, I understand limits.]

You can multiply a number by itself an imaginary number of times.

[Out. Now.]

Seriously, this one’s weird (rather, it *seems* weird, but how can “how the world is” be “weird”?).

Suppose we have some $c^{a+bi}$, where $a, b, c \in \mathbb{R}$ and $c > 0$. Then $c^{a+bi} = c^a \cdot c^{bi}$, so “all” we need to figure out is how to take an imaginary exponent. Brian Slesinsky has us covered.
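As a sketch of where that leads, assume Euler’s formula $e^{i\theta} = \cos\theta + i\sin\theta$ (which this post does not derive) and a positive real base $c$, so that $c^{bi} = e^{i b \ln c}$ is a rotation by the angle $b \ln c$. The helper name `real_base_power` is hypothetical.

```python
import cmath
import math


def real_base_power(c: float, a: float, b: float) -> complex:
    """Compute c ** (a + b*i) for real c > 0 as c^a * e^{i * b * ln(c)}."""
    if c <= 0:
        raise ValueError("this sketch only handles a positive real base")
    angle = b * math.log(c)  # c^{bi} = e^{i * b * ln c} is a rotation by this angle
    return (c ** a) * complex(math.cos(angle), math.sin(angle))


if __name__ == "__main__":
    print(real_base_power(2.0, 3.0, 4.0))
    print(2.0 ** complex(3.0, 4.0))  # Python's built-in complex power, for comparison
    print(real_base_power(math.e, 0.0, math.pi))  # approximately -1
```

The real part of the exponent scales the length and the imaginary part only rotates, which is why a purely imaginary exponent always lands on the unit circle.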

*Years before becoming involved with the rationalist community, Nate asks this question, and Qiaochu answers.*

*This isn’t a coincidence, because nothing is ever a coincidence.*

*Or maybe it is a coincidence, because Qiaochu answered every question on StackExchange.*

## 16: Fourier Series

*Periodic functions, trigonometric polynomials, periodic convolutions, and the Fourier theorem.*

## 17: Several Variable Differential Calculus

*A beautiful unification of linear algebra and calculus: linear maps as derivatives of multivariate functions, partial and directional derivatives, Clairaut’s theorem, contractions and fixed points, and the inverse and implicit function theorems.*

### Implicit Function Theorem

If you have a set of points in $\mathbb{R}^n$, when do you know if it’s secretly a function $g : \mathbb{R}^{n-1} \to \mathbb{R}$? For functions $\mathbb{R} \to \mathbb{R}$, we can just use the geometric “vertical line test” to figure this out, but that’s a bit harder when you only have an algebraic definition. Also, sometimes we can implicitly define a function locally by restricting its domain (even if no explicit form exists for the whole set).

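*Theorem.* Let $E$ be an open subset of $\mathbb{R}^n$, let $f : E \to \mathbb{R}$ be continuously differentiable, and let $y = (y_1, \ldots, y_n)$ be a point in $E$ such that $f(y) = 0$ and $\frac{\partial f}{\partial x_n}(y) \neq 0$. Then there exists an open $U \subseteq \mathbb{R}^{n-1}$ containing $(y_1, \ldots, y_{n-1})$, an open $V \subseteq E$ containing $y$, and a function $g : U \to \mathbb{R}$ such that $g(y_1, \ldots, y_{n-1}) = y_n$, and

$$\{(x_1, \ldots, x_n) \in V : f(x_1, \ldots, x_n) = 0\} = \{(x_1, \ldots, x_{n-1}, g(x_1, \ldots, x_{n-1})) : (x_1, \ldots, x_{n-1}) \in U\}.$$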

So, I think what’s really going on here is that we’re using the derivative at this known zero to locally linearize the manifold we’re operating on (similar to Newton’s approximation), which lets us have some neighborhood in which we can derive an implicit function, even if we can’t always write it out.
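A minimal numerical sketch of that idea, using the unit circle $f(x, y) = x^2 + y^2 - 1$ near the known zero $(0, 1)$ as an assumed example (it is not taken from the book). The implicit function `g` is never written in closed form; it is recovered by running Newton’s method in the last variable only.

```python
import math


def f(x: float, y: float) -> float:
    """Defining equation of the unit circle: f(x, y) = 0."""
    return x * x + y * y - 1.0


def df_dy(x: float, y: float) -> float:
    """Partial derivative of f in the last variable; nonzero at (0, 1)."""
    return 2.0 * y


def g(x: float, y_start: float = 1.0, steps: int = 50) -> float:
    """Solve f(x, y) = 0 for y near y_start with Newton's method in y only."""
    y = y_start
    for _ in range(steps):
        y -= f(x, y) / df_dy(x, y)
    return y


if __name__ == "__main__":
    for x in (0.0, 0.3, 0.6):
        # Compare against the explicit upper-semicircle formula, known here by luck.
        print(x, g(x), math.sqrt(1.0 - x * x))
```

Near $(1, 0)$ the partial derivative in $y$ vanishes, the Newton step divides by something close to zero, and the circle stops being a graph over $x$, which matches the hypothesis $\frac{\partial f}{\partial x_n}(y) \neq 0$ in the theorem.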

## 18: Lebesgue Measure

*Outer measure; measurable sets and functions.*

Tao lists desiderata for an ideal measure before deriving it. Imagine that.

## 19: Lebesgue Integration

*Building up the Lebesgue integral, culminating with Fubini’s theorem.*

### Conceptual Rotation

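Suppose $\Omega \subseteq \mathbb{R}^n$ is measurable, and let $f : \Omega \to [0, \infty]$ be a measurable, non-negative function. The Lebesgue integral of $f$ is then defined as

$$\int_\Omega f := \sup \left\{ \int_\Omega s : s \text{ is simple and non-negative, and minorizes } f \right\}.$$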

This hews closely to how we defined the *lower* Riemann integral in Chapter 11; however, we don’t need the equivalent of the upper Riemann integral for the Lebesgue integral.

To see why, let’s review why Riemann integrability demands the equality of the lower and upper Riemann integrals of a function $f$. Suppose that we integrate over $[0, 1]$, and that $f$ is the indicator function for the rationals. As the rationals are dense in the reals, any interval $[a, b]$ ($a < b$) contains rational numbers, no matter how much the interval shrinks! Therefore, the upper Riemann integral equals 1, while the lower equals 0 (for similar reasons: every such interval also contains irrationals), so $f$ is not Riemann integrable. However, $f$ *is* Lebesgue integrable; since it’s 0 almost everywhere (as the rationals have 0 measure), its integral is 0.

This marks a fundamental shift in how we integrate. With the Riemann integral, we consider the $\inf$ of upper Riemann sums and the $\sup$ of lower Riemann sums over increasingly refined partitions of the domain; this is the *length* approach. In Lebesgue integration, however, we consider which $E \subseteq \Omega$ is responsible for each value $y$ in the range (*i.e.*, $E = f^{-1}(\{y\})$), multiplying $y$ by the measure of $E$; this is *inversion*.
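The two viewpoints can be put side by side on a toy step function, where both integrals exist and agree. This is only a sketch: the function, the `PIECES` table, and both helper names are invented for illustration, and the “measure” of each preimage is just a total interval length.

```python
from collections import defaultdict

# Each entry is ((left, right), value): the function equals `value` on [left, right).
PIECES = [((0.0, 0.25), 2.0), ((0.25, 0.5), 5.0), ((0.5, 0.75), 2.0), ((0.75, 1.0), 0.0)]


def f(x: float) -> float:
    """Evaluate the step function described by PIECES (0 outside [0, 1))."""
    for (left, right), value in PIECES:
        if left <= x < right:
            return value
    return 0.0


def riemann_sum(n: int) -> float:
    """Domain-first: chop [0, 1] into n equal pieces and sample f at each midpoint."""
    width = 1.0 / n
    return sum(f((i + 0.5) * width) for i in range(n)) * width


def lebesgue_sum(pieces) -> float:
    """Range-first: for each value y, add up the measure of f^{-1}({y}), then weight by y."""
    measure_of_preimage = defaultdict(float)
    for (left, right), value in pieces:
        measure_of_preimage[value] += right - left
    return sum(value * measure for value, measure in measure_of_preimage.items())


if __name__ == "__main__":
    print(riemann_sum(1000), lebesgue_sum(PIECES))
```

The range-first sum never looks at where the preimage sits in the domain, only at how large it is, which is why a set as scattered as the rationals causes it no trouble.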

In a sense, the Lebesgue integral more cleanly strikes at the heart of what it *means* to integrate. Surely, Riemann integration was not far from the mark; however, if you rotate the problem slightly in your mind, you will find a better, cleaner way of structuring your thinking.

## Final Thoughts

Although Tao botches a few exercises and the section on topology, I’m a big fan of *Analysis I* and *II*. Do note, however, that *II* is far more difficult than *I* (not just in content, but in terms of the exercises). He generally provides relevant, appropriately-difficult problems, and is quite adept at helping the reader develop rigorous and intuitive understanding of the material.

# Forwards

Next is Jaynes’ *Probability Theory*.

## Tips

To avoid getting hung up in Chapter 17, this book should be read after a linear algebra text.

Don’t do exercise 17.6.3; as stated in my edition, it’s wrong.

Deep understanding comes from sweating it out. Don’t hide, and don’t wave away bothersome details; stay and explore. If you follow my strategy of quickly generating proof outlines, ask yourself: can you formally and precisely write out each step?

## Verification

I completed every exercise in this book; in the second half, I started avoiding looking at the hints provided by problems until I’d already thought for a few minutes. Often, I’d solve the problem and then turn to the hint: “be careful when doing *X*—don’t forget edge case *Y*; hint: use lemma *Z*”! A pit would form in my stomach as I prepared to locate my mistake and back-propagate where-I-should-have-looked, before realizing that I’d *already* taken care of that edge case using that lemma.

## Why Bother?

One can argue that my time would be better spent picking up things as I work on problems in alignment. However, while I’ve made, uh, quite a bit of progress with impact measures this way, concept-shaped holes are impossible to notice. If there’s some helpful information-theoretic way of viewing a problem that I’d only realize if I had *already taken* information theory, I’m out of luck.

Also, developing mathematical maturity brings with it a more rigorous thought process.

## Fairness

There’s a sense I get where even though I’ve made immense progress over the past few months, it still *might not be enough*. The standard isn’t “am I doing impressive things for my reference class?”, but rather the stricter “am I good enough to solve serious problems that might not get solved in time otherwise?”. This is quite the standard, and even given my textbook and research progress (including the upcoming posts), I don’t think I meet it.

In a way, this excites me. I welcome any advice for buckling down further and becoming yet stronger.

*If you are interested in working with me or others on the task of learning MIRI-relevant math, if you have a burning desire to knock the alignment problem down a peg—I would be more than happy to work with you. Messaging me may also net you an invitation to the MIRIx Discord server.*

*On a related note: thank you to everyone who has helped me; in particular, TheMajor has been incredibly generous with their explanations and encouragement.*

Can you say more about why exercise 17.6.3 is wrong?

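If we define $f : [0, 1] \to \mathbb{R}$ by $f(x) := \frac{x}{1 + x}$, then for distinct $x, y \in [0, 1]$ we have

$$|f(x) - f(y)| = \left| \frac{x}{1 + x} - \frac{y}{1 + y} \right| = \left| \frac{x - y}{(1 + x)(1 + y)} \right| < |x - y|.$$

We also have $f'(0) = 1$, since

$$\lim_{x \to 0;\, x \in (0, 1]} \frac{f(x) - f(0)}{x - 0} = \lim_{x \to 0;\, x \in (0, 1]} \frac{x}{(x + 1)x} = 1.$$

In general, the derivative is $f'(x) = \frac{1}{(1 + x)^2}$, which is continuous on $[0, 1]$.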

He defined a strict contraction $f : X \to X$ on a metric space $(X, d)$ as requiring $d(f(x), f(y)) \leq c\,d(x, y)$ for some $c \in (0, 1)$ and for all $x, y \in X$. Your proposed solution doesn’t fix such a $c$; in fact, as $x \to 0$ with $x \in (0, 1]$, $c \to 1$, which is why $f'(0) = 1$.

Claim: You can’t solve the exercise.

Proof (thanks to TheMajor). Let $\{x_n\}_{n=1}^{\infty}$ be a sequence in the domain, with each $x_n \neq x$, converging to $x \in [a, b]$, so that $f'(x) = \lim_{n \to \infty} \frac{f(x_n) - f(x)}{x_n - x}$. Since $f$ is a strict contraction with contraction constant $c$, $\forall n \in \mathbb{N}^+ : \frac{|f(x_n) - f(x)|}{|x_n - x|} \leq c < 1$. Since the absolute value is continuous, we conclude that $|f'(x)| \leq c < 1$. □

I think we are working off different editions. According to the errata, the condition for strict contraction was changed to $|f(x) - f(y)| < |x - y|$ for all distinct $x, y \in [a, b]$.

Then you can solve it, yeah.

I’m working through Munkres’ book on topology at the moment, which is part of MIRI’s research guide. It’s super awesome; rigorous, comprehensive, elegant, and quite long (with *lots* of exercises). I’m planning to do a similar post once I’m done, but it’s taking me a while. If you get to it eventually, you’ll probably beat me to it.

Hey there, sorry for the late reply. I wanted to let you know that every now and again I answer Turntrout’s math questions via Discord, and wanted to let you (and anybody else reading this while working through undergrad math textbooks) know that I’d love to help if you have any questions! I’m a math grad student and have been a teaching assistant for over 5 years now, and honestly I just love explaining math. While my time is limited and irregular, please don’t hesitate to shoot me a question if you’re stuck on anything and would like some advice.