This post is also available at HFI Programming Club's blog and Zhihu.

As the title suggests, today we are going to do some integral calculus. First, let's recall the definition of Riemann integral:

Riemann Integral and Its Formal Definition

Traditionally, the integral of a function $f(x)$ over an interval $[a,b]$ is defined as the signed area under the curve of $f(x)$:

$$\int_a^bf(x)\mathrm dx\triangleq\text{Signed area under $f(x)$ where $x\in[a,b]$}$$

and the Riemann integral is one way to define it rigorously. Specifically, we approximate the area by a sum of areas of thin rectangles:

$$\int_a^bf(x)\mathrm dx\approx\sum_{k=1}^Nf(\xi_k)(x_k-x_{k-1})$$

where the $\xi_k$ are sampled from $[x_{k-1},x_k]$ and the increasing sequence $\{x_k\}$ is called a partition of the interval $[a,b]$:

$$a=x_0\le x_1\le x_2\le\cdots\le x_N=b$$

To formalize the Riemann integral, let's define the mesh $\operatorname{mesh}\{x_n\}$ as the length of the largest subinterval in a partition $\{x_n\}$:

$$\operatorname{mesh}\{x_n\}\triangleq\max_{1\le n\le N}(x_n-x_{n-1})$$

Then we say that a function $f(x)$ is Riemann integrable if for all $\varepsilon>0$ there exists $\delta>0$ such that whenever $\operatorname{mesh}\{x_n\}\le\delta$ we always have

$$\left\vert\sum_{k=1}^Nf(\xi_k)(x_k-x_{k-1})-\int_a^bf(x)\mathrm dx\right\vert<\varepsilon$$

Equivalently, if a function $f(x)$ is Riemann-integrable on $[a,b]$, then the limit

$$\lim_{\operatorname{mesh}\{x_n\}\to0}\sum_{k=1}^Nf(\xi_k)(x_k-x_{k-1})$$

exists and equals $\int_a^bf(x)\mathrm dx$.
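To make the definition concrete, here is a minimal numerical sketch (my own illustration; the midpoint sample points and the choice $f(x)=x^2$ on $[0,1]$ are arbitrary) showing the Riemann sums approaching the true value $\frac13$ as the mesh shrinks:

```python
import numpy as np

def riemann_sum(f, a, b, n):
    """Riemann sum of f over [a, b]: n equal subintervals, midpoint samples."""
    x = np.linspace(a, b, n + 1)        # partition a = x_0 <= ... <= x_N = b
    xi = (x[:-1] + x[1:]) / 2           # sample points xi_k in [x_{k-1}, x_k]
    return np.sum(f(xi) * np.diff(x))   # sum_k f(xi_k) (x_k - x_{k-1})

f = lambda x: x**2
for n in (10, 100, 1000):
    print(n, riemann_sum(f, 0.0, 1.0, n))   # approaches 1/3 as the mesh shrinks
```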

From Riemann Integral to Riemann-Stieltjes Integral

Although the Riemann integral appears sufficient for integrating functions, it does not let us integrate against an integrator that is only piecewise continuous, such as a step function. The Riemann-Stieltjes integral addresses this; let's first look at its definition:

$$\int_a^bf(x)\mathrm dg(x)=\lim_{\operatorname{mesh}\{x_n\}\to0}I(x_n,\xi_n)\triangleq\lim_{\operatorname{mesh}\{x_n\}\to0}\sum_{k=1}^Nf(\xi_k)[g(x_k)-g(x_{k-1})]$$

In order for this limit to exist, we need to impose constraints on $f(x)$ and $g(x)$:

Theorem 1: The Riemann-Stieltjes integral $\int_a^bf\mathrm dg$ exists if $f$ is continuous on $[a,b]$ and $g$ is of bounded variation on $[a,b]$.

When we say $g$ is of bounded variation, we mean that the following quantity is finite:

$$\operatorname{Var}_{[a,b]}(g)\triangleq\sup\sum_k\vert g(x_k)-g(x_{k-1})\vert$$

where the supremum is taken over all partitions $\{x_k\}$ of $[a,b]$.

Proof. Let $\{y_n\}=\{y_1,y_2,\dots,y_M\}$ be another partition of $[a,b]$ such that $\{x_n\}$ is a subsequence of it, and let $\eta_m$ be the sample abscissas in $[y_{m-1},y_m]$. If we designate $P_k$ to be the set of indices $m$ such that $y_m$ lies in the interval $(x_{k-1},x_k]$:

$$P_k=\{m\mid x_{k-1}<y_m\le x_k\}$$

then we have

$$\sum_{m\in P_k}[g(y_m)-g(y_{m-1})]=g(x_k)-g(x_{k-1})$$

which implies

$$I(x_n,\xi_n)=\sum_{k=1}^N\sum_{m\in P_k}f(\xi_k)[g(y_m)-g(y_{m-1})]\tag1$$

$$I(y_n,\eta_n)=\sum_{k=1}^N\sum_{m\in P_k}f(\eta_m)[g(y_m)-g(y_{m-1})]\tag2$$

Because $f(x)$ is uniformly continuous on $[a,b]$, we know that for every $\varepsilon>0$ there exists $\delta>0$ such that whenever $s,t\in[a,b]$ satisfy $\vert s-t\vert\le\operatorname{mesh}\{x_n\}\le\delta$ we have $\vert f(s)-f(t)\vert<\varepsilon$. Accordingly, subtracting (2) from (1) and taking absolute values gives

$$
\begin{aligned}
\vert I(x_n,\xi_n)-I(y_n,\eta_n)\vert
&\le\sum_{k=1}^N\sum_{m\in P_k}\vert f(\eta_m)-f(\xi_k)\vert\,\vert g(y_m)-g(y_{m-1})\vert \\
&\le\varepsilon\sum_{m=1}^M\vert g(y_m)-g(y_{m-1})\vert\le\varepsilon\operatorname{Var}_{[a,b]}(g)
\end{aligned}
$$

Now, let $\{z_n\}$ be another partition of $[a,b]$, $\zeta_n$ be its corresponding sample abscissas, and $\{y_n\}$ be the common refinement of the two partitions; then

$$
\begin{aligned}
\vert I(x_n,\xi_n)-I(z_n,\zeta_n)\vert
&=\vert I(x_n,\xi_n)-I(y_n,\eta_n)-[I(z_n,\zeta_n)-I(y_n,\eta_n)]\vert \\
&\le\vert I(x_n,\xi_n)-I(y_n,\eta_n)\vert+\vert I(z_n,\zeta_n)-I(y_n,\eta_n)\vert \\
&\le2\varepsilon\operatorname{Var}_{[a,b]}(g)
\end{aligned}
$$

so the Riemann-Stieltjes sums of any two sufficiently fine partitions differ by arbitrarily little; the limit defining the integral therefore exists, which establishes the theorem. $\square$
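To see Theorem 1 in action numerically, here is a small sketch (my own illustration, not part of the proof): it evaluates Riemann-Stieltjes sums of the continuous function $f(x)=\cos x$ against the bounded-variation step integrator $g(x)=\lfloor x\rfloor$, and as the mesh shrinks the sums settle down to $\cos1+\cos2+\cdots+\cos5$:

```python
import numpy as np

def rs_sum(f, g, a, b, n):
    """Riemann-Stieltjes sum sum_k f(xi_k) [g(x_k) - g(x_{k-1})]
    over a uniform partition of [a, b] with midpoint samples."""
    x = np.linspace(a, b, n + 1)
    xi = (x[:-1] + x[1:]) / 2
    return np.sum(f(xi) * np.diff(g(x)))

f = np.cos
g = np.floor                                    # a step function of bounded variation
for n in (10, 1000, 100000):
    print(n, rs_sum(f, g, 0.5, 5.5, n))
print(sum(np.cos(k) for k in range(1, 6)))      # the limiting value f(1) + ... + f(5)
```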

Properties of Riemann-Stieltjes Integral

In addition to knowing when the Riemann-Stieltjes integral exists, we can also transform it into an ordinary Riemann integral in certain situations:

Theorem 2: If $g'(x)$ exists and is continuous on $[a,b]$, then

$$\int_a^bf(x)\mathrm dg(x)=\int_a^bf(x)g'(x)\mathrm dx\tag3$$

Proof. Since $g(x)$ is differentiable, we can use the mean value theorem to guarantee the existence of $\eta_k\in[x_{k-1},x_k]$ such that $g(x_k)-g(x_{k-1})=g'(\eta_k)(x_k-x_{k-1})$, so

$$
\begin{aligned}
\sum_{k=1}^N\vert g(x_k)-g(x_{k-1})\vert
&=\sum_{k=1}^N\vert g'(\eta_k)\vert(x_k-x_{k-1}) \\
&\to\int_a^b\vert g'(x)\vert\mathrm dx\quad(\operatorname{mesh}\{x_n\}\to0)
\end{aligned}
$$

As a result, $g(x)$ is of bounded variation, so the left-hand side of (3) exists by Theorem 1. Next, split the Riemann-Stieltjes sum using the mean value theorem:

$$
\begin{aligned}
S(x_n,\xi_n)
&=\sum_{k=1}^Nf(\xi_k)g'(\eta_k)(x_k-x_{k-1}) \\
&=\underbrace{\sum_{k=1}^Nf(\xi_k)g'(\xi_k)(x_k-x_{k-1})}_{T_1}
+\underbrace{\sum_{k=1}^Nf(\xi_k)[g'(\eta_k)-g'(\xi_k)](x_k-x_{k-1})}_{T_2}
\end{aligned}
$$

By the uniform continuity of $g'(x)$, for all $\varepsilon>0$ there exists $\delta>0$ such that $\vert g'(s)-g'(t)\vert<\varepsilon$ whenever $\vert s-t\vert<\delta$. Hence, once $\operatorname{mesh}\{x_n\}<\delta$ we have $\vert T_2\vert\le\varepsilon(b-a)\max_{[a,b]}\vert f\vert$, so $T_2\to0$, while $T_1$ is an ordinary Riemann sum and tends to $\int_a^bfg'$. Accordingly, letting $\operatorname{mesh}\{x_n\}\to0$ we arrive at the conclusion that

$$\int_a^bf(x)\mathrm dg(x)=\int_a^bf(x)g'(x)\mathrm dx$$

thus completing the proof. $\square$

In addition, we can also apply integration by parts to Riemann-Stieltjes integrals. In particular, if we assume $f$ has a continuous derivative and $g$ is of bounded variation on $[a,b]$, then

$$\int_a^bf(x)\mathrm dg(x)=f(x)g(x)\Big\vert_a^b-\int_a^bg(x)f'(x)\mathrm dx$$
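Both Theorem 2 and this integration-by-parts rule are easy to check numerically. Below is a minimal sketch; the choices $f(x)=x^2$, $g(x)=\sin x$, $[a,b]=[0,2]$ and the midpoint-sum quadrature are my own, purely for illustration:

```python
import numpy as np

a, b, n = 0.0, 2.0, 200_000
x = np.linspace(a, b, n + 1)
xi = (x[:-1] + x[1:]) / 2

f, fp = (lambda t: t**2), (lambda t: 2*t)       # f and its derivative f'
g, gp = np.sin, np.cos                          # g and its derivative g'

rs       = np.sum(f(xi) * np.diff(g(x)))                                 # sum f(xi_k)[g(x_k) - g(x_{k-1})]
thm2     = np.sum(f(xi) * gp(xi) * np.diff(x))                           # int_a^b f g' dx, as in (3)
by_parts = f(b)*g(b) - f(a)*g(a) - np.sum(g(xi) * fp(xi) * np.diff(x))   # f g |_a^b - int_a^b g f' dx

print(rs, thm2, by_parts)    # the three values should agree to several decimals
```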

Riemann-Stieltjes Integral and Partial Summation

Let $r(n)$ be an arithmetic function and $R(x)$ be its summatory function

$$R(x)=\sum_{n\le x}r(n)\tag4$$

Let $f(t)$ have a continuous derivative on $[0,\infty)$ and let $b>a>0$; then we have

$$\int_a^bf(x)\mathrm dR(x)\approx\sum_{k=1}^Nf(\xi_k)[R(x_k)-R(x_{k-1})]$$

where we require that $\operatorname{mesh}\{x_n\}<1$. Recalling (4), we observe that $R(x)$ is a step function that only jumps at integer values, so we only need to sum over the indices $k_n$ such that $n\in(x_{k_n-1},x_{k_n}]$. Hence, this integral becomes a sum over the integer values in $(a,b]$:

$$
\begin{aligned}
\int_a^bf(x)\mathrm dR(x)
&\approx\sum_{a<n\le b}f(\xi_{k_n})r(n) \\
&=\sum_{a<n\le b}f(n)r(n)+\sum_{a<n\le b}[f(\xi_{k_n})-f(n)]r(n)
\end{aligned}
$$

Since $f$ is uniformly continuous on $[a,b]$, for every $\varepsilon>0$ there is $\delta>0$ such that $\vert f(x)-f(y)\vert<\varepsilon$ whenever $\vert x-y\vert<\delta$, so once $\operatorname{mesh}\{x_n\}<\delta$ the second sum is $\mathcal O(\varepsilon)$. Letting the mesh tend to zero leaves us with

$$\int_a^bf(x)\mathrm dR(x)=\sum_{a<n\le b}f(n)r(n)\tag5$$
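Combining (5) with the integration-by-parts rule from the previous section gives the familiar Abel-summation form $\sum_{a<n\le b}f(n)r(n)=f(b)R(b)-f(a)R(a)-\int_a^bR(t)f'(t)\mathrm dt$. Here is a minimal numerical sketch of that combination (the helper name `abel_sum` and the test case $r(n)=1$, $f(t)=1/t$ are my own choices):

```python
import numpy as np

def abel_sum(f, fp, r, a, b, grid=200_000):
    """Evaluate sum_{a < n <= b} f(n) r(n) via (5) plus integration by parts:
    f(b) R(b) - f(a) R(a) - int_a^b R(t) f'(t) dt, where R(x) = sum_{n <= x} r(n)."""
    ns = np.arange(1, int(np.floor(b)) + 1)
    cum = np.cumsum(r(ns))                        # cum[m-1] = R(m) for integers m >= 1
    R = lambda t: np.where(t < 1, 0.0, cum[np.clip(np.floor(t).astype(int) - 1, 0, len(cum) - 1)])
    t = np.linspace(a, b, grid + 1)
    mid = (t[:-1] + t[1:]) / 2
    integral = np.sum(R(mid) * fp(mid) * np.diff(t))   # midpoint quadrature of int R f' dt
    return f(b) * R(b) - f(a) * R(a) - integral

f, fp = (lambda t: 1.0 / t), (lambda t: -1.0 / t**2)
r = lambda n: np.ones_like(n, dtype=float)        # r(n) = 1, so R(x) = floor(x)
print(abel_sum(f, fp, r, 0.5, 10.5))              # ~ 1 + 1/2 + ... + 1/10 = 2.9289682...
print(sum(1.0 / n for n in range(1, 11)))         # direct evaluation for comparison
```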

Employing (5) in different situations gives a wealth of elegant results. Let's have a look at some of them:

Asymptotic Expansions

It is well known that the harmonic series $1+\frac12+\frac13+\cdots$ diverges, and we can provide a formal proof using the Riemann-Stieltjes integral:

$$
\begin{aligned}
\sum_{n\le N}\frac1n
&=\int_{1-\varepsilon}^N{\mathrm d\lfloor x\rfloor\over x} \\
&=\left.{\lfloor x\rfloor\over x}\right\vert_{1-\varepsilon}^N+\int_{1-\varepsilon}^N{\lfloor x\rfloor\over x^2}\mathrm dx \\
&=\int_{1-\varepsilon}^N{\mathrm dx\over x}+1-\int_{1-\varepsilon}^N{\{x\}\over x^2}\mathrm dx \\
&=\int_1^N{\mathrm dx\over x}+1-\int_1^N{\{x\}\over x^2}\mathrm dx \\
&=\log N+1-\int_1^\infty{\{x\}\over x^2}\mathrm dx+\int_N^\infty{\{x\}\over x^2}\mathrm dx \\
&=\log N+\gamma+\mathcal O\left(\frac1N\right)
\end{aligned}
$$

Here $\gamma=1-\int_1^\infty\{x\}x^{-2}\mathrm dx$ is the Euler-Mascheroni constant. Since $\log N\to\infty$ as $N\to\infty$, we conclude that the harmonic series diverges.
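A quick numerical check of this expansion (a small sketch of my own; the values of $N$ are arbitrary):

```python
import numpy as np

gamma = np.euler_gamma                          # Euler-Mascheroni constant
for N in (10, 100, 1000, 10000):
    H = np.sum(1.0 / np.arange(1, N + 1))       # the harmonic sum up to N
    print(N, H - np.log(N) - gamma)             # the error shrinks roughly like 1/(2N)
```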

Evaluation of an Interesting Series

Our goal is to evaluate the alternating series

$$\sum_{n\ge1}{(-1)^n\log n\over n}$$

Now, let's first consider the finite case:

$$
\begin{aligned}
\sum_{n\le N}{(-1)^n\log n\over n}
&=2\sum_{n\le N/2}{\log(2n)\over2n}-\sum_{n\le N}{\log n\over n}+\mathcal O\left(\log N\over N\right) \\
&=\sum_{n\le N/2}{\log2+\log n\over n}-\sum_{n\le N}{\log n\over n}+\mathcal O\left(\log N\over N\right) \\
&=\log2\sum_{n\le N/2}\frac1n-\sum_{N/2<n\le N}{\log n\over n}+\mathcal O\left(\log N\over N\right)
\end{aligned}
$$

In fact, using the Riemann-Stieltjes integral, we can show

$$
\begin{aligned}
\sum_{N/2<n\le N}{\log n\over n}
&=\int_{N/2}^N{\log x\over x}\mathrm d\lfloor x\rfloor \\
&={N\log N-N\log(N/2)\over N}-\int_{N/2}^N[x-\{x\}]\mathrm d\left(\log x\over x\right)+\mathcal O\left(\log N\over N\right) \\
&=\log2-\int_{N/2}^N\left({1-\log x\over x}\right)\mathrm dx+\mathcal O\left(\log N\over N\right) \\
&=\int_{N/2}^N{\log x\over x}\mathrm dx+\mathcal O\left(\log N\over N\right) \\
&=\frac12[\log^2N-\log^2(N/2)]+\mathcal O\left(\log N\over N\right) \\
&=\frac12[\log N+\log(N/2)][\log N-\log(N/2)]+\mathcal O\left(\log N\over N\right) \\
&=\frac12\log2\,[2\log N-\log2]+\mathcal O\left(\log N\over N\right) \\
&=\log2\log N-\frac12\log^22+\mathcal O\left(\log N\over N\right)
\end{aligned}
$$
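Before moving on, here is a quick numerical check of this expansion (my own sketch; $N=10^6$ is an arbitrary choice):

```python
import numpy as np

N = 1_000_000
n = np.arange(N // 2 + 1, N + 1)
lhs = np.sum(np.log(n) / n)                              # sum_{N/2 < n <= N} log(n)/n
rhs = np.log(2) * np.log(N) - 0.5 * np.log(2) ** 2       # the asymptotic main term
print(lhs, rhs)                                          # agree up to O(log N / N)
```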

Now, employing this identity together with the asymptotic formula for the harmonic sum yields

$$
\begin{aligned}
\sum_{n\le N}{(-1)^n\log n\over n}
&=\log2\left[\log(N/2)+\gamma\right]-\log2\log N+\frac12\log^22+\mathcal O\left(\log N\over N\right) \\
&=\gamma\log2-\frac12\log^22+\mathcal O\left(\log N\over N\right)
\end{aligned}
$$

Now, taking the limit $N\to\infty$ on both sides gives

$$\sum_{n\ge1}{(-1)^n\log n\over n}=\gamma\log2-\frac12\log^22$$
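Numerically, the partial sums do approach this value; here is a minimal sketch (the cutoff $N$ is an arbitrary even number, chosen so the $\mathcal O(\log N/N)$ error is tiny):

```python
import numpy as np

N = 2_000_000                                    # even N keeps the O(log N / N) error small
n = np.arange(1, N + 1)
partial = np.sum((-1.0) ** n * np.log(n) / n)    # the partial sum up to N
closed  = np.euler_gamma * np.log(2) - 0.5 * np.log(2) ** 2
print(partial, closed)                           # both ~ 0.15991...
```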

The Prime Number Theorem

If we were to define the prime indicator function

$$
\mathbf1_p(n)=
\begin{cases}
1 & \text{$n$ is a prime} \\
0 & \text{otherwise}
\end{cases}
$$

Then the prime counting function can be defined as

$$\pi(x)=\sum_{p\le x}1=\sum_{n\le x}\mathbf1_p(n)$$

Now, let's also define Chebyshev's function $\vartheta(x)$:

$$\vartheta(x)=\sum_{p\le x}\log p=\sum_{n\le x}\mathbf1_p(n)\log n$$

Hence, we have

$$
\begin{aligned}
\pi(x)
&=\sum_{n\le x}{1\over\log n}\cdot\mathbf1_p(n)\log n \\
&=\int_{2-\varepsilon}^x{\mathrm d\vartheta(t)\over\log t} \\
&={\vartheta(x)\over\log x}+\int_2^x{\vartheta(t)\over t\log^2t}\mathrm dt
\end{aligned}
$$

It is known that $\vartheta(x)\sim x$, and the remaining integral then satisfies $\int_2^x{\vartheta(t)\over t\log^2t}\mathrm dt=\mathcal O\left(\int_2^x{\mathrm dt\over\log^2t}\right)=\mathcal O\left(x\over\log^2x\right)$, which is of lower order than $x/\log x$. Plugging these into the above equation gives

$$\pi(x)\sim{x\over\log x}$$

which is the prime number theorem.
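As a sanity check of the exact identity $\pi(x)={\vartheta(x)/\log x}+\int_2^x{\vartheta(t)\over t\log^2t}\mathrm dt$ (not of the asymptotic itself), here is a small sketch; the sieve helper `primes_up_to`, the cutoff $x=10^4$, and the midpoint quadrature are my own choices, and the tiny discrepancy in the output is just quadrature error:

```python
import numpy as np

def primes_up_to(x):
    """Simple sieve of Eratosthenes: all primes <= x."""
    sieve = np.ones(x + 1, dtype=bool)
    sieve[:2] = False
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = False
    return np.flatnonzero(sieve)

x = 10_000
primes = primes_up_to(x)
pi_x = len(primes)                                        # pi(x)

# theta(t) = sum of log p over primes p <= t, via a cumulative sum.
cum_log = np.concatenate(([0.0], np.cumsum(np.log(primes))))
theta = lambda t: cum_log[np.searchsorted(primes, t, side="right")]

# Approximate int_2^x theta(t) / (t log^2 t) dt with a midpoint sum.
t = np.linspace(2.0, float(x), 400_001)
mid = (t[:-1] + t[1:]) / 2
integral = np.sum(theta(mid) / (mid * np.log(mid) ** 2) * np.diff(t))

print(pi_x, theta(x) / np.log(x) + integral)              # the two values should nearly coincide
```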

Conclusion

To sum up, in this blog post we first defined and explored the Riemann-Stieltjes integral, then used this integration technique to derive asymptotic expansions and evaluate an alternating series. Lastly, we gave a proof of the prime number theorem conditional on $\vartheta(x)\sim x$.