Ito’s Lemma

Date: 24.06.2016


X(t) is a stochastic process if its value evolves stochastically over time. Start with discrete–time processes whose values can change only at discrete points in time. The change in X has some probability distribution.

X(t+1) – X(t) ~ f(μ, σ, …, other parameters)
e.g., if X(t) = ln(stock price at t) and f(μ, σ) = normal density N(μ, σ²),

then future stock price is log-normally distributed.

Markov Process: stochastic process for which probability density depends only on current value X(t) and not on any earlier value X(t – s). Given current price, no additional information in examination of past prices.
Start with this process (random walk with drift):
X(t+1) = X(t) + μ + σ ε_t
where
(1) ε_t ~ (0, 1) [the notation (a, b) means mean a and variance b; here mean 0 and variance 1, i.e., a standardized random variable, but not necessarily normal]

(2) ε_t is independent of ε_s for t ≠ s.

This specification implies that X(t+2) – X(t) = 2μ + σ ε_t + σ ε_{t+1}
and that: X(t+2) – X(t) ~ (2μ, 2σ²),
and in general: X(t+n) – X(t) ~ (nμ, nσ²)
Implications:

“drift” is proportional to n

variance is proportional to n; so standard deviation is proportional to √n
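The scaling of drift and variance with n can be checked by simulation. The following is a minimal sketch, not part of the original notes: the parameter values, the function name `simulate_increments`, and the choice of uniform innovations (standardized, but deliberately not normal) are all illustrative assumptions.

```python
import random
import statistics

def simulate_increments(mu, sigma, n, n_paths, seed=0):
    """Simulate X(t+n) - X(t) for a random walk with drift."""
    rng = random.Random(seed)
    half_width = 3 ** 0.5  # uniform on [-sqrt(3), sqrt(3)] has mean 0, variance 1
    increments = []
    for _ in range(n_paths):
        total = 0.0
        for _ in range(n):
            eps = rng.uniform(-half_width, half_width)  # non-normal innovation
            total += mu + sigma * eps
        increments.append(total)
    return increments

mu, sigma, n = 0.01, 0.2, 10  # hypothetical per-period values
sample = simulate_increments(mu, sigma, n, n_paths=20000)
sample_mean = statistics.fmean(sample)
sample_var = statistics.pvariance(sample)
print(f"sample mean {sample_mean:.4f} vs n*mu {n * mu:.4f}")
print(f"sample variance {sample_var:.4f} vs n*sigma^2 {n * sigma ** 2:.4f}")
```

The sample moments should sit close to nμ and nσ², up to sampling error.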

Digression:

Interesting implication: what is the probability that the return on the stock index (e.g., the S&P 500) will be positive in a given period?
Using historical (post-war) data,
μ_annual ≈ 12%     σ_annual ≈ 16%

Convert to daily: n = 1/250. Therefore,

μ_daily = 12%/250 = 0.048%     σ_daily = 16%/√250 = 1.01%

On an annual basis, the odds are about 23% that the market will fall if returns are normally distributed. For the market to fall, the return must be below the mean by 12%, which is ¾ of a standard deviation: N(–0.75) = 0.23.

Actual ratio, 1946 – 2001, is 13/56 = 0.23!!

However, on a daily basis, the mean return is insignificant compared to σ_daily: N(–0.048/1.01) = N(–0.0475) = 0.48. So the odds are essentially 50/50 that the market will rise or fall on a given day. Essentially a random walk.
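The two probabilities can be reproduced with the standard library. This is a small sketch under the notes' assumptions (normally distributed returns, 250 trading days); the function name `prob_negative_return` is introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu_annual, sigma_annual = 0.12, 0.16  # figures quoted in the notes
trading_days = 250

def prob_negative_return(mu, sigma):
    """P(return < 0) when the return is N(mu, sigma^2)."""
    return NormalDist(mu, sigma).cdf(0.0)

p_annual = prob_negative_return(mu_annual, sigma_annual)

# Mean scales with the length of the period, standard deviation with its square root.
mu_daily = mu_annual / trading_days
sigma_daily = sigma_annual / sqrt(trading_days)
p_daily = prob_negative_return(mu_daily, sigma_daily)

print(f"annual: P(fall) = {p_annual:.3f}")
print(f"daily:  mu = {mu_daily:.5f}, sigma = {sigma_daily:.5f}, P(fall) = {p_daily:.3f}")
```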

A related paradox:
Suppose your portfolio is $1 million. Put it in a safe (real) annuity, and you can earn about $35,000 (real) annually, clearly not enough to retire on (at least in Boston).
Put it in the market index, and the daily standard deviation of your portfolio will be about .01 × $1,000,000 = $10,000, which dominates your salary. [E.g., even with a salary of $250,000, you earn only $1,000 a day, relatively insignificant compared to your daily volatility.] So why bother working?
How do we resolve these two conclusions?

Given this analysis, we can write

X(t+n) – X(t) = nμ + σ√n ε [where ε ~ (0, 1)]

Notice that variance = nσ²

Now let’s move in the “opposite” direction. Hold the period fixed and divide it into n subperiods, each of length 1/n = Δt = h.

[Timeline: the period from t to t+1 is divided at t, t+h, t+2h, t+3h, …, t+nh = t+1, with one innovation ε₁, ε₂, ε₃, … per subperiod.]

Within any subperiod, the random portion of the increment to X (the “innovation”) is σ√h ε_i,

with variance σ²h. Therefore, the change in X across any subperiod is:
X(t + ih) – X[t + (i – 1)h] = μh + σ√h ε_i
Notice that the mean and variance of X(t+1) – X(t) are unaffected by the number of subperiods:

X(t+1) – X(t) = Σ (μh + σ√h ε_i) = μ + σ√h Σ ε_i   [sums over i = 1, …, n]

Mean = μ

Variance = σ²h × var(Σ ε_i) = σ²h × n = σ²

Notice also that the innovation to X(t+1) – X(t) is the sum of many random variables, each scaled by √h = 1/√n.
This suggests that even if we do not assume the ε_i are each normally distributed, it still may be the case that the scaled sum σ√h Σ ε_i = σ Σ ε_i/√n is normally distributed, N(0, σ²). This would follow from a central limit theorem (CLT).
The CLT requires some “regularity” conditions on the ε_i, however. These in effect require that the sum of the ε_i is not dominated by one (or a small number) of the individual ε_i as n gets large.
Rules out: (a) fat–tailed distributions

(b) “jump” in stock prices – this means stock prices are continuous.

Economic content of these assumptions: over small Δt, you are “very” sure that the possible innovation to X also is small. Conclusion: if stock prices are continuous, the CLT implies that we can act as though the ε_i are normally distributed, even without assuming normality from the outset.
Therefore, assuming continuity is equivalent to assuming that ε ~ N(0, 1).
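The CLT argument can be illustrated exactly for one simple non-normal innovation. The sketch below assumes each innovation is a fair ±1 coin flip, an illustrative choice that is not in the notes, and computes the variance and kurtosis of the scaled sum Σ ε_i/√n from the binomial distribution. A normal variable has kurtosis 3.

```python
from math import comb, sqrt

def scaled_sum_moments(n):
    """Exact variance and kurtosis of (1/sqrt(n)) * sum of n fair +/-1 coin flips."""
    second = fourth = 0.0
    for k in range(n + 1):  # k = number of +1 outcomes
        prob = comb(n, k) / 2 ** n
        value = (2 * k - n) / sqrt(n)
        second += prob * value ** 2
        fourth += prob * value ** 4
    return second, fourth / second ** 2

for n in (1, 4, 25, 400):
    variance, kurtosis = scaled_sum_moments(n)
    print(f"n = {n:4d}: variance = {variance:.4f}, kurtosis = {kurtosis:.4f}")
```

The variance stays at 1 for every n, while the kurtosis rises from 1 (a single coin flip) toward the normal value of 3 as the number of subperiods grows.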

Now take limit as h → dt:

X(t + ih) – X[t + (i – 1)h] = μh + σ√h ε_i

⇒ dX = μ dt + σ dz; dz ≡ ε√dt

dz is called a Wiener process or pure Brownian motion, with properties:

• independent increments

• normally distributed, dz ~ N(0, dt)
While I have written the mean and standard deviation as constants, μ and σ, we can be more general. An Ito process allows μ and σ to depend on X and t:
dX = μ(X, t) dt + σ(X, t) dz
This process is still Markov: transition probabilities depend only on current values of X and t.
Special cases:

Arithmetic Brownian Motion

μ and σ constant. Then increments to X are independent and identically distributed random variables, each normally distributed. Total increment is the sum of these iid normal increments. Hence, X_t – X₀ ~ N(μt, σ²t).

Geometric Brownian Motion

μ(X, t) = μX and σ(X, t) = σX

dX = μX dt + σX dz

dX/X = μ dt + σ dz

Percentage increments to X have constant mean and variance. Implies X_t is log-normal [i.e., ln(X_t/X₀) is normal].

Ornstein – Uhlenbeck process

μ(X, t) = k(X* – X) and σ(X, t) = σ

This specification implies mean reversion to a “long–run” value of X*.

(4) Sometimes interest rates are modeled as
dX = k(X* – X) dt + σ X^γ dz ; 0 < γ < 1
which rules out negative values if X* > 0.
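The special cases above differ only in the drift and volatility functions, so one discretization covers them all. The following is a minimal sketch using the subperiod construction from earlier (step h, innovation √h ε with normal ε); the function name `simulate_ito` and all parameter values are hypothetical.

```python
import math
import random

def simulate_ito(mu, sigma, x0, horizon, n_steps, rng):
    """Discretize dX = mu(X,t) dt + sigma(X,t) dz with step h; returns the path."""
    h = horizon / n_steps
    x, t = x0, 0.0
    path = [x]
    for _ in range(n_steps):
        dz = rng.gauss(0.0, math.sqrt(h))  # dz ~ N(0, h)
        x += mu(x, t) * h + sigma(x, t) * dz
        t += h
        path.append(x)
    return path

rng = random.Random(1)
# Arithmetic Brownian motion: constant drift and volatility.
abm = simulate_ito(lambda x, t: 0.1, lambda x, t: 0.2, 0.0, 1.0, 250, rng)
# Geometric Brownian motion: drift and volatility proportional to X.
gbm = simulate_ito(lambda x, t: 0.1 * x, lambda x, t: 0.2 * x, 100.0, 1.0, 250, rng)
# Ornstein-Uhlenbeck: drift pulls X toward the long-run value X* = 0.05.
ou = simulate_ito(lambda x, t: 2.0 * (0.05 - x), lambda x, t: 0.01, 0.10, 1.0, 250, rng)
print(abm[-1], gbm[-1], ou[-1])
```

Setting the volatility function to zero turns each case into its deterministic counterpart, which is a convenient sanity check on the drift.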
Notice that while X is continuous, it is nowhere differentiable:
Continuity: lim (h → 0) E{[X(t+h) – X(t)]²}

= lim E[(μh + σ√h ε)²]

= lim E[μ²h² + σ²hε² + 2μσh^(3/2) ε] = lim (μ²h² + σ²h) = 0 ⇒ MSE convergence

Differentiability:

If ΔX/Δt = [X(t+h) – X(t)]/h

converges in MSE to some value X'(t), this would be its time derivative.
We require E[(ΔX/Δt)²] to converge for the derivative to exist.
But lim E[(ΔX/Δt)²] = lim (μ² + σ²/h) = ∞
Intuition: think about the stock market example:
ΔX/Δt = (μΔt + σ√Δt ε)/Δt = μ + σε/√Δt

Example: The typical daily fluctuation is a very large rate of fluctuation: for example, a one standard deviation swing in return (e.g., 1% per day) is equivalent to a swing in the annualized % rate of change of

1% × 250 = 250%.
Per hour: one standard deviation swing is 1%/√7 ≈ 0.38%. [7 trading hours per day]
Annualize: (1%/√7) × 7 × 250 ≈ 661%.
As Δt → 0, rate of change → ±∞
If we were to graph the instant-by-instant stock price, it would be continuous, but infinitely vibrating with slope = ±∞. Jagged (nowhere smooth or differentiable) but continuous.
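The blow-up of the rate of change can be tabulated directly. This is a small sketch of the arithmetic above; the function names are introduced here for illustration, and the 250-day, 7-hour conventions are the ones used in the notes.

```python
from math import sqrt

def second_moment_of_quotient(mu, sigma, h):
    """E[(dX/h)^2] for dX = mu*h + sigma*sqrt(h)*eps with eps ~ (0, 1)."""
    return mu ** 2 + sigma ** 2 / h

def annualized_swing(daily_sd, intervals_per_day, days_per_year=250):
    """One-standard-deviation move over a sub-interval, expressed as an annual rate."""
    sd_interval = daily_sd / sqrt(intervals_per_day)
    return sd_interval * intervals_per_day * days_per_year

daily = annualized_swing(0.01, 1)    # one interval per day
hourly = annualized_swing(0.01, 7)   # seven trading hours per day
print(f"daily swing annualized:  {daily:.0%}")
print(f"hourly swing annualized: {hourly:.0%}")

for h in (1 / 250, 1 / 1750, 1e-6):
    print(h, second_moment_of_quotient(0.12, 0.16, h))
```

Shrinking the interval makes the move itself smaller but the annualized rate larger, which is the sense in which the path is continuous but not differentiable.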

[Figure: sample path of S_t plotted against time; continuous but jagged.]

Notice: E(dS/dt) does not exist, but E(dS)/dt does exist!

What about functions of X?
For example, if X = ln S or S = e^X and

dX = μ dt + σ dz,

then what can we say about dS? [or for a more interesting example, if we know process for dS, what about rules for the evolution of the call option value, c(S, t)? ]
** Usual calculus will lead you astray!!**
To see why, let’s consider an example. Note that the usual rule says that S = e^X ⇒ dS = e^X dX

To make this example easy, suppose μ = 0.

Therefore, dX = σ dz ⇒ dX ~ N(0, σ²dt), symmetric around 0.

[Figure: S = exp(X) plotted against X. The symmetric points X₀ – ΔX, X₀, X₀ + ΔX on the horizontal axis map to exp(X₀ – ΔX), exp(X₀), exp(X₀ + ΔX) on the vertical axis, which are not symmetric around exp(X₀).]

Notice that although disturbance around X₀ is symmetric (± ΔX), disturbance around e^X₀ is not.
Convexity of function implies gain > loss, so there is upward drift in stock price even though rate of return is symmetric around zero.
E(S) = E(e^X) > e^E(X) = e^X₀ = S₀

In general, E[f(X)] ≠ f[E(X)]. This is “Jensen’s Inequality”.

Therefore, while usual calculus states that

dS = e^XdX

we’ve just shown that

E(dS) ≠ e^X E(dX) = 0

In this case E(dS) > 0
So usual calculus must be wrong for functions of stochastic processes.
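The asymmetry is easy to see numerically. The sketch below uses two illustrative cases that are assumptions of this example rather than part of the notes: a symmetric two-point disturbance ±ΔX, and a normal disturbance, for which the log-normal mean formula E(e^X) = exp(X₀ + ½s²) applies.

```python
from math import exp

def expected_exp_two_point(x0, dx):
    """E[exp(X)] when X is x0 + dx or x0 - dx with equal probability."""
    return 0.5 * exp(x0 + dx) + 0.5 * exp(x0 - dx)

def expected_exp_normal(x0, s):
    """E[exp(X)] for X ~ N(x0, s^2): the mean of a log-normal variable."""
    return exp(x0 + 0.5 * s ** 2)

x0 = 0.0  # so S0 = exp(x0) = 1
two_point = expected_exp_two_point(x0, 0.2)
normal_case = expected_exp_normal(x0, 0.2)
print(f"S0 = {exp(x0):.4f}")
print(f"E(S), two-point disturbance: {two_point:.4f}")
print(f"E(S), normal disturbance:    {normal_case:.4f}")
```

In both cases E(X) = X₀, yet E(S) exceeds S₀, and the gap widens as the dispersion of X grows.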
Must consider effect of the curvature of the function

[Figure: S = f(X) plotted against X, with the tangent line at X₀ (slope = f′(X₀)); the gap between the curve and the tangent is the error from curvature.]
Obvious tool is a Taylor series expansion, which tells us how to correct for curvature:
f(X₀ + ΔX) = f(X₀) + f′(X₀) ΔX + ½ f″(X₀) (ΔX)² + (1/6) f‴(X₀) (ΔX)³ + . . .

f′ = initial slope

f″ adjusts for change in slope (convex: f″ > 0; concave: f″ < 0)

f‴ adjusts for change in rate of change in slope, etc.

How far do we need to go in Taylor expansion? Answer is given by Ito’s lemma.

In a loose sense, Ito’s lemma tells us that when we take a Taylor series, we can use the following “multiplication rules:”

(i) (dt)² = 0 [ In general, (dt)ⁿ = 0 if n > 1]
(ii) (dt)ⁿ × dz = 0 for any n > 0. [Any (zero-mean) stochastic term of higher order than √dt can be ignored.]

(iii) (dz)² = dt

The first two rules are familiar arguments from usual calculus – these terms are of a smaller order of magnitude than other terms.
The last rule is surprising: the square of a normally distributed random variable is nonstochastic!

Intuition:

dz = ε√dt

(dz)² = ε²dt

E(dz²) = 1 × dt = dt
Var(dz²) = E[{dz² – E(dz²)}²]

= E[(ε²dt – dt)²]

= E(ε⁴dt² – 2dt²ε² + dt²)

= dt² × E(ε⁴ – 2ε² + 1)

So var(dz²) is proportional to dt², which is negligible. So (dz)² is not literally non-stochastic, but as h → dt, the uncertainty approaches zero “very” rapidly. [Notice importance of normal distribution: need very thin tails for E(ε⁴) to be finite.]
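For a standard normal ε, E(ε⁴) = 3, so E(ε⁴ – 2ε² + 1) = 3 – 2 + 1 = 2 and Var(dz²) = 2dt². Summed over the 1/dt subperiods of a unit interval, the total variance is 2dt, which vanishes as dt shrinks. The sketch below illustrates this by summing squared increments; the function name and step counts are illustrative assumptions.

```python
import random
from math import sqrt

def sum_of_squared_increments(n_steps, horizon=1.0, seed=0):
    """Sum of (dz)^2 over n_steps increments, each dz ~ N(0, horizon / n_steps)."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    return sum(rng.gauss(0.0, sqrt(dt)) ** 2 for _ in range(n_steps))

for n_steps in (10, 100, 10000):
    total = sum_of_squared_increments(n_steps)
    # Standard deviation of the sum is sqrt(2 * dt) = sqrt(2 / n_steps).
    print(f"n = {n_steps:6d}: sum of dz^2 = {total:.4f}, sd = {sqrt(2 / n_steps):.4f}")
```

With finer subdivisions the sum settles near the elapsed time, which is the content of the rule (dz)² = dt.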
Notice that Ito’s lemma implies that if

dX = μ(X,t) dt + σ(X,t) dz

then (dX)² = μ²(X,t) dt² + 2μ(X,t)σ(X,t) dt dz + σ²(X,t) dz² = 0 + 0 + σ²(X,t) dt
which is nonstochastic. When you square an Ito process, the only term that survives is the variance.

Now consider a function of X, y = f(X,t ).

e.g., y may be the value of a call option and X the stock price on which the call is written.
Taylor series:

f(X + dX, t + dt) = f(X, t) + f_X dX + f_t dt + ½ f_XX dX² + ½ f_tt dt² + f_Xt dX dt

⇒ df = f_X(X, t) dX + f_t(X, t) dt + ½ f_XX(X, t) dX² + 0 + 0

= f_X [μ(X, t) dt + σ(X, t) dz] + f_t dt + ½ f_XX σ²(X, t) dt + 0 + 0

= [f_X μ(X, t) + f_t + ½ f_XX σ²(X, t)] dt + f_X σ(X, t) dz

This is called Ito’s lemma.

** Notice that dy and dX are (locally) perfectly correlated. Their stochastic terms are both known numbers times dz, the same random variable.

[Figure: y = f(X) plotted against X, with the tangent line at X₀ (slope = f′(X₀)).]

Since ΔX is small for small Δt, we may treat the relationship as locally linear, with correlation → 1.
What does this analysis imply about “portfolios” comprised of y and X? That they can be made riskless. More on this next week.
This graph also explains why the “extra” or second derivative term enters the drift for y. A Jensen’s inequality term that depends on both convexity and volatility.
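Ito's lemma can be applied mechanically once the partial derivatives of f are available. The following is a hedged sketch that approximates them by central finite differences; the helper `ito_coefficients`, its step sizes, and the test function are assumptions made for illustration.

```python
from math import exp

def ito_coefficients(f, mu, sigma, x, t, step_x=1e-4, step_t=1e-4):
    """Drift and diffusion of y = f(X, t) when dX = mu(X,t) dt + sigma(X,t) dz."""
    f_x = (f(x + step_x, t) - f(x - step_x, t)) / (2 * step_x)
    f_t = (f(x, t + step_t) - f(x, t - step_t)) / (2 * step_t)
    f_xx = (f(x + step_x, t) - 2 * f(x, t) + f(x - step_x, t)) / step_x ** 2
    m, s = mu(x, t), sigma(x, t)
    drift = f_x * m + f_t + 0.5 * f_xx * s ** 2  # includes the second-derivative term
    diffusion = f_x * s
    return drift, diffusion

# Illustrative check with y = exp(X) and constant mu, sigma (hypothetical values).
mu_const, sigma_const = 0.05, 0.2
drift, diffusion = ito_coefficients(
    lambda x, t: exp(x), lambda x, t: mu_const, lambda x, t: sigma_const, x=1.0, t=0.0)
print(drift, exp(1.0) * (mu_const + 0.5 * sigma_const ** 2))
print(diffusion, exp(1.0) * sigma_const)
```

For a convex f the ½ f_XX σ² term raises the drift above what the ordinary chain rule would give; for a linear f it drops out.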

Example 1: Log-normality and geometric brownian motion

X = ln(S) or S = e^X = f(X)

If dX = μ dt + σ dz with μ, σ constant

[i.e., ln(S) has arithmetic Brownian motion, equivalent to assuming ln(S_T) is normally dist.]

then

dS = f_X μ dt + f_t dt + ½ f_XX σ² dt + f_X σ dz

= (e^X μ + 0 + ½ e^X σ²) dt + e^X σ dz

= e^X (μ + ½σ²) dt + e^X σ dz

[the ½σ² term is the correction for Jensen's inequality]

But S = e^X. Therefore, dS/S = (μ + ½σ²) dt + σ dz

Define α = μ + ½σ²

We will commonly write: dS/S = α dt + σ dz

S has GBM, which we now know is equivalent to: S_T/S₀ is log–normally distributed.
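The Jensen correction in Example 1 can be checked by Monte Carlo: if ln(S) has drift μ and volatility σ, the expected price should grow at μ + ½σ², not at μ. This is a rough sketch with hypothetical parameter values and a function name introduced for illustration.

```python
import random
from math import exp, sqrt
from statistics import fmean

def mean_terminal_price(s0, mu, sigma, horizon, n_draws, seed=0):
    """Average of S_T = S0 * exp(X_T - X_0), with X_T - X_0 ~ N(mu*T, sigma^2*T)."""
    rng = random.Random(seed)
    draws = (s0 * exp(mu * horizon + sigma * sqrt(horizon) * rng.gauss(0.0, 1.0))
             for _ in range(n_draws))
    return fmean(draws)

s0, mu, sigma, horizon = 100.0, 0.05, 0.2, 1.0
simulated = mean_terminal_price(s0, mu, sigma, horizon, n_draws=100000)
with_correction = s0 * exp((mu + 0.5 * sigma ** 2) * horizon)
without_correction = s0 * exp(mu * horizon)
print(f"simulated E(S_T):            {simulated:.2f}")
print(f"S0*exp((mu + sigma^2/2) T):  {with_correction:.2f}")
print(f"S0*exp(mu T), no correction: {without_correction:.2f}")
```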

Example 2: Futures price dynamics

Suppose expected rate of return on stock = α, but stock pays continuous proportional dividend at rate δ. Then

dS = (α – δ) S dt + σ S dz [Notice that σ²(S, t) = σ²S²]

Spot-futures parity for futures maturing at time T implies that

F(S_t, t) = S_t e^((r – δ)(T – t))
Ito's lemma ⇒

dF = F_S dS + F_t dt + ½ F_SS (dS)²

= F_S (α – δ) S dt + F_t dt + ½ F_SS σ²S² dt + F_S σ S dz
But F_S × S = F and F_t = –(r – δ) F and F_SS = 0 ⇒

dF = (α – δ) F dt – (r – δ) F dt + 0 + σ F dz

⇒ dF/F = (α – r) dt + σ dz
Issues: 1) why is E(dF/F) less than the expected rate of return on the stock by r dt?

2) when will the futures price be an unbiased predictor of E(S_T)?
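The futures drift can be verified numerically from the parity formula. In this sketch the partial derivatives are taken by central differences and plugged into Ito's lemma; the symbols alpha (expected stock return) and delta (dividend yield), the function names, and the parameter values are all assumptions made for illustration.

```python
from math import exp

def futures_price(s, t, r, delta, maturity):
    """Spot-futures parity: F = S * exp((r - delta) * (T - t))."""
    return s * exp((r - delta) * (maturity - t))

def futures_drift_rate(s, t, alpha, delta, sigma, r, maturity, ds=1e-2, dt=1e-4):
    """Drift of dF/F from Ito's lemma, using finite-difference partials of F."""
    f = lambda s_, t_: futures_price(s_, t_, r, delta, maturity)
    f_s = (f(s + ds, t) - f(s - ds, t)) / (2 * ds)
    f_t = (f(s, t + dt) - f(s, t - dt)) / (2 * dt)
    f_ss = (f(s + ds, t) - 2 * f(s, t) + f(s - ds, t)) / ds ** 2
    drift = f_s * (alpha - delta) * s + f_t + 0.5 * f_ss * sigma ** 2 * s ** 2
    return drift / f(s, t)

rate = futures_drift_rate(s=100.0, t=0.0, alpha=0.12, delta=0.02,
                          sigma=0.2, r=0.05, maturity=0.5)
print(f"drift of dF/F: {rate:.6f}  (alpha - r = {0.12 - 0.05:.6f})")
```

The dividend yield cancels: it lowers the stock's price drift and raises F_t by the same amount, leaving the drift of dF/F at α – r.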

Example 3: Valuation

Consider a security paying a continuous dividend stream, i.e., a dividend in period dt of Xdt (equivalently, a rate of dividends X), forever.

Suppose dX/X = μ dt + σ dz
What is value of security? Assume dz is “non–systematic” risk.
Call f(X) the value of security. [Notice that value function does not depend on t. Why not?]
df = f_X μX dt + f_t dt + ½ f_XX X²σ² dt + f_X σX dz
expected income = [(f_X μX + ½ f_XX X²σ²) + X] dt = r f dt

[expected capital gain (term in parentheses) + dividend = fair return: equilibrium condition]
Can we solve this differential equation for f(X)? Boundary condition is f(0) = 0

Hypothesize a solution of form f(X) = kX

Then: f_X X = f

f_XX = 0

Now plug into the differential equation:

μf + 0 + X = rf ⇒ μkX + X = rkX ⇒

k = 1/(r – μ) ⇒ f(X) = X/(r – μ) = growing perpetuity value (Gordon growth model)
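The proposed solution can be checked by substituting it back into the equilibrium condition. Below is a small sketch that evaluates the residual μX f_X + ½σ²X² f_XX + X – r f at a point, with derivatives taken by finite differences; the function name and the numbers are hypothetical, and the check assumes r > μ so that the value is finite.

```python
def valuation_residual(f, x, mu, sigma, r, step=1e-3):
    """Left side minus right side of: mu*X*f_X + 0.5*sigma^2*X^2*f_XX + X = r*f."""
    f_x = (f(x + step) - f(x - step)) / (2 * step)
    f_xx = (f(x + step) - 2 * f(x) + f(x - step)) / step ** 2
    return mu * x * f_x + 0.5 * sigma ** 2 * x ** 2 * f_xx + x - r * f(x)

r, mu, sigma = 0.08, 0.03, 0.2            # illustrative values with r > mu
gordon = lambda x: x / (r - mu)           # candidate solution f(X) = X / (r - mu)
wrong_guess = lambda x: 10.0 * x          # a linear guess with the wrong k

residual = valuation_residual(gordon, 5.0, mu, sigma, r)
wrong_residual = valuation_residual(wrong_guess, 5.0, mu, sigma, r)
print(f"residual for X/(r - mu): {residual:.2e}")
print(f"residual for 10*X:       {wrong_residual:.4f}")
```

Because f is linear in X, the volatility term contributes nothing, which is why σ does not appear in the value.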

Multivariate Ito’s lemma

Need one more multiplication rule. If dz₁ and dz₂ are two Wiener processes, with correlation of ρ, then

dz₁ × dz₂ = ρ dt

Notice that dz₁ dz₂ = ε₁√dt × ε₂√dt = ε₁ε₂ dt.

E(ε₁ε₂) = cov(ε₁, ε₂) = ρ₁₂ = ρ
Now let y = f(X₁, X₂, t)

where dX₁ = μ₁(X₁, X₂, t) dt + σ₁(X₁, X₂, t) dz₁

dX₂ = μ₂(X₁, X₂, t) dt + σ₂(X₁, X₂, t) dz₂
Then dy = f₁ dX₁ + f₂ dX₂ + f₃ dt + ½ f₁₁ (dX₁)² + ½ f₂₂ (dX₂)² + f₁₂ dX₁ dX₂

= [f₁μ₁ + f₂μ₂ + f₃ + ½ f₁₁ σ₁²(X₁, X₂, t) + ½ f₂₂ σ₂²(X₁, X₂, t) + f₁₂ ρσ₁σ₂] dt

+ f₁σ₁ dz₁ + f₂σ₂ dz₂

The generalization to n sources of risk is obvious.

This has an interesting relation to the APT. In continuous time, if X₁ and X₂ are continuous, then (local) uncertainty in y is linear and additive in the sources of uncertainty, even if the original functional form is not additive. Ito’s lemma provides a “factor” loading equation.
Example: suppose production function is Q = L^a K^b and L and K both follow GBM:

dL/L = μ_L dt + σ_L dz_L ⇒ dL = μ_L L dt + σ_L L dz_L

dK/K = μ_K dt + σ_K dz_K ⇒ dK = μ_K K dt + σ_K K dz_K
Notice that Q_L = ∂Q/∂L = aL^(a–1) K^b and

Q_L L = aL^a K^b = aQ

Similarly: Q_K K = bQ

Q_KK K² = b(b–1)Q

Q_LL L² = a(a–1)Q

Q_LK LK = abQ

dQ = Q_L dL + Q_K dK + ½ Q_LL (dL)² + ½ Q_KK (dK)² + Q_KL dK dL

= Q_L (μ_L L dt + σ_L L dz_L) + Q_K (μ_K K dt + σ_K K dz_K) + ½ Q_KK (σ_K² K² dt) +

½ Q_LL (σ_L² L² dt) + Q_KL (KL ρ σ_K σ_L dt)
= dt {aQ μ_L + bQ μ_K + ½ Q a(a–1) σ_L² + ½ Q b(b–1) σ_K² + Q ab ρ σ_L σ_K} +

aQ σ_L dz_L + bQ σ_K dz_K

dQ/Q = [aμ_L + bμ_K + ½ a(a–1) σ_L² + ½ b(b–1) σ_K² + ab ρ σ_L σ_K] dt

+ aσ_L dz_L + bσ_K dz_K

So the elasticities a, b are also the “factor loadings.”
Notice that the constant elasticity specification, f(L,K) = L^aK^b preserves Geometric Brownian Motion.
In a single-variable example, e.g., Q = L^a, we would get
dQ/Q = (aμ_L + ½ a(a–1) σ_L²) dt + aσ_L dz_L
and the second derivative term in the drift is positive if a > 1, i.e., if the function is convex, and negative if 0 < a < 1, i.e., if the function is concave.
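The closed-form drift for the production function can be compared with a direct application of the two-variable lemma. The sketch below approximates the partial derivatives of Q = L^a K^b by finite differences; the function names, the symbols in the signatures, and every parameter value are illustrative assumptions.

```python
def cobb_douglas_drift_rate(a, b, mu_l, mu_k, sig_l, sig_k, rho):
    """Closed-form drift of dQ/Q for Q = L^a K^b with L and K following GBM."""
    return (a * mu_l + b * mu_k + 0.5 * a * (a - 1) * sig_l ** 2
            + 0.5 * b * (b - 1) * sig_k ** 2 + a * b * rho * sig_l * sig_k)

def numerical_drift_rate(q, l, k, mu_l, mu_k, sig_l, sig_k, rho, step=1e-3):
    """Drift of dQ/Q from the two-variable Ito's lemma, with finite-difference partials."""
    q_l = (q(l + step, k) - q(l - step, k)) / (2 * step)
    q_k = (q(l, k + step) - q(l, k - step)) / (2 * step)
    q_ll = (q(l + step, k) - 2 * q(l, k) + q(l - step, k)) / step ** 2
    q_kk = (q(l, k + step) - 2 * q(l, k) + q(l, k - step)) / step ** 2
    q_lk = (q(l + step, k + step) - q(l + step, k - step)
            - q(l - step, k + step) + q(l - step, k - step)) / (4 * step ** 2)
    drift = (q_l * mu_l * l + q_k * mu_k * k
             + 0.5 * q_ll * sig_l ** 2 * l ** 2 + 0.5 * q_kk * sig_k ** 2 * k ** 2
             + q_lk * rho * sig_l * sig_k * l * k)
    return drift / q(l, k)

a, b = 0.7, 0.3
params = dict(mu_l=0.02, mu_k=0.04, sig_l=0.2, sig_k=0.3, rho=0.5)
closed_form = cobb_douglas_drift_rate(a, b, **params)
numerical = numerical_drift_rate(lambda l, k: l ** a * k ** b, 2.0, 3.0, **params)
print(closed_form, numerical)
```

The numerical value does not depend on the levels of L and K chosen, consistent with Q itself following GBM.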
Extreme market timing would shift fully between the market index and Treasury bills.

[Figure: Performance of a perfectly successful timer as a function of market performance. Horizontal axis: market index return; vertical axis: portfolio return; both axes marked at r_f.]

Notice: There is no meaning to "the beta" of the fund. Beta = 0 or 1, depending on the market forecast.

Can you guess the results of this strategy, if executed with perfect success?

Strategy                   1926    1999      Compound growth (annual)

Bills only:                $1      $12.6     3.5%

Market index only:         $1      $973      9.9%

Perfect timer (annual):    $1      $41,022   15.7%

Perfect timer (monthly):   $1      $5.74B    36.0%
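The annual growth rates follow from the ending values by compounding arithmetic. This sketch assumes a 73-year horizon (end of 1926 to end of 1999), which is an inference from the table rather than something the notes state; the function name is introduced for illustration.

```python
def compound_annual_growth(end_value, start_value=1.0, years=73):
    """Annual rate g such that start_value * (1 + g) ** years == end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# Ending values of $1 invested, as reported in the table.
ending_values = {
    "Bills only": 12.6,
    "Market index only": 973.0,
    "Perfect timer (annual)": 41022.0,
    "Perfect timer (monthly)": 5.74e9,
}
growth = {name: compound_annual_growth(value) for name, value in ending_values.items()}
for name, rate in growth.items():
    print(f"{name:26s} {rate:.1%}")
```

Under that horizon assumption the computed rates agree with the table to within rounding of the reported ending values.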