5. Lagrangian for LQ Control#

!pip install quantecon
Hide code cell output
Requirement already satisfied: quantecon in /home/runner/miniconda3/envs/quantecon/lib/python3.11/site-packages (0.7.2)
Requirement already satisfied: numba>=0.49.0 in /home/runner/miniconda3/envs/quantecon/lib/python3.11/site-packages (from quantecon) (0.59.0)
Requirement already satisfied: numpy>=1.17.0 in /home/runner/miniconda3/envs/quantecon/lib/python3.11/site-packages (from quantecon) (1.26.4)
Requirement already satisfied: requests in /home/runner/miniconda3/envs/quantecon/lib/python3.11/site-packages (from quantecon) (2.31.0)
Requirement already satisfied: scipy>=1.5.0 in /home/runner/miniconda3/envs/quantecon/lib/python3.11/site-packages (from quantecon) (1.11.4)
Requirement already satisfied: sympy in /home/runner/miniconda3/envs/quantecon/lib/python3.11/site-packages (from quantecon) (1.12)
Requirement already satisfied: llvmlite<0.43,>=0.42.0dev0 in /home/runner/miniconda3/envs/quantecon/lib/python3.11/site-packages (from numba>=0.49.0->quantecon) (0.42.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /home/runner/miniconda3/envs/quantecon/lib/python3.11/site-packages (from requests->quantecon) (2.0.4)
Requirement already satisfied: idna<4,>=2.5 in /home/runner/miniconda3/envs/quantecon/lib/python3.11/site-packages (from requests->quantecon) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in /home/runner/miniconda3/envs/quantecon/lib/python3.11/site-packages (from requests->quantecon) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /home/runner/miniconda3/envs/quantecon/lib/python3.11/site-packages (from requests->quantecon) (2024.2.2)
Requirement already satisfied: mpmath>=0.19 in /home/runner/miniconda3/envs/quantecon/lib/python3.11/site-packages (from sympy->quantecon) (1.3.0)
import numpy as np
from quantecon import LQ
from scipy.linalg import schur

5.1. Overview#

This is a sequel to this lecture linear quadratic dynamic programming

It can also be regarded as presenting invariant subspace techniques that extend ones that we encountered earlier in this lecture stability in linear rational expectations models

We present a Lagrangian formulation of an infinite horizon linear quadratic undiscounted dynamic programming problem.

Such a problem is also sometimes called an optimal linear regulator problem.

A Lagrangian formulation

  • carries insights about connections between stability and optimality

  • is the basis for fast algorithms for solving Riccati equations

  • opens the way to constructing solutions of dynamic systems that don’t come directly from an intertemporal optimization problem

A key tool in this lecture is the concept of an n×n symplectic matrix.

A symplectic matrix has eigenvalues that occur in reciprocal pairs, meaning that if λi(1,1) is an eigenvalue, then so is λi1.

This reciprocal pairs property of the eigenvalues of a matrix is a tell-tale sign that the matrix describes the joint dynamics of a system of equations describing the states and costates that constitute first-order necessary conditions for solving an undiscounted linear-quadratic infinite-horizon optimization problem.

The symplectic matrix that will interest us describes the first-order dynamics of state and co-state vectors of an optimally controlled system.

In focusing on eigenvalues and eigenvectors of this matrix, we capitalize on an analysis of invariant subspaces.

These invariant subspace formulations of LQ dynamic programming problems provide a bridge between recursive (i.e., dynamic programming) formulations and classical formulations of linear control and linear filtering problems that make use of related matrix decompositions (see for example this lecture and this lecture).

While most of this lecture focuses on undiscounted problems, later sections describe handy ways of transforming discounted problems to undiscounted ones.

The techniques in this lecture will prove useful when we study Stackelberg and Ramsey problem in this lecture.

5.2. Undiscounted LQ DP Problem#

The problem is to choose a sequence of controls {ut}t=0 to maximize the criterion

t=0{xtRxt+utQut}

subject to xt+1=Axt+But, where x0 is a given initial state vector.

Here xt is an (n×1) vector of state variables, ut is a (k×1) vector of controls, R is a positive semidefinite symmetric matrix, Q is a positive definite symmetric matrix, A is an (n×n) matrix, and B is an (n×k) matrix.

The optimal value function turns out to be quadratic, V(x)=xPx, where P is a positive semidefinite symmetric matrix.

Using the transition law to eliminate next period’s state, the Bellman equation becomes

(5.1)#xPx=maxu{xRxuQu(Ax+Bu)P(Ax+Bu)}

The first-order necessary conditions for the maximum problem on the right side of equation (5.1) are

Note

We use the following rules for differentiating quadratic and bilinear matrix forms: xAxx=(A+A)x;yBzy=Bz,yBzz=By.

(Q+BPB)u=BPAx,

which implies that an optimal decision rule for u is

u=(Q+BPB)1BPAx

or

u=Fx,

where

F=(Q+BPB)1BPA.

Substituting u=(Q+BPB)1BPAx into the right side of equation (5.1) and rearranging gives

(5.2)#P=R+APAAPB(Q+BPB)1BPA.

Equation (5.2) is called an algebraic matrix Riccati equation.

There are multiple solutions of equation (5.2).

But only one of them is positive definite.

The positive define solution is associated with the maximum of our problem.

It expresses the matrix P as an implicit function of the matrices R,Q,A,B.

Notice that the gradient of the value function is

(5.3)#V(x)x=2Px

We shall use fact (5.3) later.

5.3. Lagrangian#

For the undiscounted optimal linear regulator problem, form the Lagrangian

(5.4)#L=t=0{xtRxt+utQut+2μt+1[Axt+Butxt+1]}

where 2μt+1 is a vector of Lagrange multipliers on the time t transition law xt+1=Axt+But.

(We put the 2 in front of μt+1 to make things match up nicely with equation (5.3).)

First-order conditions for maximization with respect to {ut,xt+1}t=0 are

(5.5)#2Qut+2Bμt+1=0 , t0μt=Rxt+Aμt+1 , t1.

Define μ0 to be a vector of shadow prices of x0 and apply an envelope condition to (5.4) to deduce that

μ0=Rx0+Aμ1,

which is a time t=0 counterpart to the second equation of system (5.5).

An important fact is that

(5.6)#μt+1=Pxt+1

where P is a positive define matrix that solves the algebraic Riccati equation (5.2).

Thus, from equations (5.3) and (5.6), 2μt is the gradient of the value function with respect to xt.

The Lagrange multiplier vector μt is often called the costate vector that corresponds to the state vector xt.

It is useful to proceed with the following steps:

  • solve the first equation of (5.5) for ut in terms of μt+1.

  • substitute the result into the law of motion xt+1=Axt+But.

  • arrange the resulting equation and the second equation of (5.5) into the form

(5.7)#L (xt+1μt+1) = N (xtμt) , t0,

where

L= (IBQ1B0A),N= (A0RI).

When L is of full rank (i.e., when A is of full rank), we can write system (5.7) as

(5.8)#(xt+1μt+1) =M (xtμt)

where

(5.9)#ML1N=(A+BQ1BA1RBQ1BA1A1RA1).

5.4. State-Costate Dynamics#

We seek to solve the difference equation system (5.8) for a sequence {xt}t=0 that satisfies

  • an initial condition for x0

  • a terminal condition limt+xt=0

This terminal condition reflects our desire for a stable solution, one that does not diverge as t.

We inherit our wish for stability of the {xt} sequence from a desire to maximize

t=0[xtRxt+utQut],

which requires that xtRxt converge to zero as t+.

5.5. Reciprocal Pairs Property#

To proceed, we study properties of the (2n×2n) matrix M defined in (5.9).

It helps to introduce a (2n×2n) matrix

J=(0InIn0).

The rank of J is 2n.

Definition: A matrix M is called symplectic if

(5.10)#MJM=J.

Salient properties of symplectic matrices that are readily verified include:

  • If M is symplectic, then M2 is symplectic

  • The determinant of a symplectic, then det(M)=1

It can be verified directly that M in equation (5.9) is symplectic.

It follows from equation (5.10) and from the fact J1=J=J that for any symplectic matrix M,

(5.11)#M=J1M1J.

Equation (5.11) states that M is related to the inverse of M by a similarity transformation.

For square matrices, recall that

  • similar matrices share eigenvalues

  • eigenvalues of the inverse of a matrix are inverses of eigenvalues of the matrix

  • a matrix and its transpose share eigenvalues

It then follows from equation (5.11) that the eigenvalues of M occur in reciprocal pairs: if λ is an eigenvalue of M, so is λ1.

Write equation (5.8) as

(5.12)#yt+1=Myt

where yt=(xtμt).

Consider a triangularization of M

(5.13)#V1MV=(W11W120W22)

where

  • each block on the right side is (n×n)

  • V is nonsingular

  • all eigenvalues of W22 exceed 1 in modulus

  • all eigenvalues of W11 are less than 1 in modulus

5.6. Schur decomposition#

The Schur decomposition and the eigenvalue decomposition are two decompositions of the form (5.13).

Write equation (5.12) as

(5.14)#yt+1=VWV1yt.

A solution of equation (5.14) for arbitrary initial condition y0 is evidently

(5.15)#yt=V[W11tW12,t0W22t] V1y0

where W12,t=W12 for t=1 and for t2 obeys the recursion

W12,t=W11t1W12,t1+W12,t1W22t1

and where Wiit is Wii raised to the tth power.

Write equation (5.15) as

(y1ty2t) = [W11tW12,t0W22t](y10y20)

where yt=V1yt, and in particular where

(5.16)#y2t=V21xt+V22μt,

and where Vij denotes the (i,j) piece of the partitioned V1 matrix.

Because W22 is an unstable matrix, yt will diverge unless y20=0.

Let Vij denote the (i,j) piece of the partitioned V1 matrix.

To attain stability, we must impose y20=0, which from equation (5.16) implies

V21x0+V22μ0=0

or

μ0=(V22)1V21x0.

This equation replicates itself over time in the sense that it implies

μt=(V22)1V21xt.

But notice that because (V21 V22) is the second row block of the inverse of V, it follows that

(V21 V22)(V11V21)=0

which implies

V21V11+V22V21=0.

Therefore,

(V22)1V21=V21V111.

So we can write

μ0=V21V111x0

and

μt=V21V111xt.

However, we know that μt=Pxt, where P occurs in the matrix that solves the Riccati equation.

Thus, the preceding argument establishes that

(5.17)#P=V21V111.

Remarkably, formula (5.17) provides us with a computationally efficient way of computing the positive definite matrix P that solves the algebraic Riccati equation (5.2) that emerges from dynamic programming.

This same method can be applied to compute the solution of any system of the form (5.8) if a solution exists, even if eigenvalues of M fail to occur in reciprocal pairs.

The method will typically work so long as the eigenvalues of M split half inside and half outside the unit circle.

Systems in which eigenvalues (properly adjusted for discounting) fail to occur in reciprocal pairs arise when the system being solved is an equilibrium of a model in which there are distortions that prevent there being any optimum problem that the equilibrium solves. See [Ljungqvist and Sargent, 2018], ch 12.

5.7. Application#

Here we demonstrate the computation with an example which is the deterministic version of an example borrowed from this quantecon lecture.

# Model parameters
r = 0.05
c_bar = 2
μ = 1

# Formulate as an LQ problem
Q = np.array([[1]])
R = np.zeros((2, 2))
A = [[1 + r, -c_bar + μ],
     [0,              1]]
B = [[-1],
     [0]]

# Construct an LQ instance
lq = LQ(Q, R, A, B)

Given matrices A, B, Q, R, we can then compute L, N, and M=L1N.

def construct_LNM(A, B, Q, R):

    n, k = lq.n, lq.k

    # construct L and N
    L = np.zeros((2*n, 2*n))
    L[:n, :n] = np.eye(n)
    L[:n, n:] = B @ np.linalg.inv(Q) @ B.T
    L[n:, n:] = A.T

    N = np.zeros((2*n, 2*n))
    N[:n, :n] = A
    N[n:, :n] = -R
    N[n:, n:] = np.eye(n)

    # compute M
    M = np.linalg.inv(L) @ N

    return L, N, M
L, N, M = construct_LNM(lq.A, lq.B, lq.Q, lq.R)
M
array([[ 1.05      , -1.        , -0.95238095,  0.        ],
       [ 0.        ,  1.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.95238095,  0.        ],
       [ 0.        ,  0.        ,  0.95238095,  1.        ]])

Let’s verify that M is symplectic.

n = lq.n
J = np.zeros((2*n, 2*n))
J[n:, :n] = np.eye(n)
J[:n, n:] = -np.eye(n)

M @ J @ M.T - J
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

We can compute the eigenvalues of M using np.linalg.eigvals, arranged in ascending order.

eigvals = sorted(np.linalg.eigvals(M))
eigvals
[0.9523809523809523, 1.0, 1.0, 1.05]

When we apply Schur decomposition such that M=VWV1, we want

  • the upper left block of W, W11, to have all of its eigenvalues less than 1 in modulus, and

  • the lower right block W22 to have eigenvalues that exceed 1 in modulus.

To get what we want, let’s define a sorting function that tells scipy.schur to sort the corresponding eigenvalues with modulus smaller than 1 to the upper left.

stable_eigvals = eigvals[:n]

def sort_fun(x):
    "Sort the eigenvalues with modules smaller than 1 to the top-left."

    if x in stable_eigvals:
        stable_eigvals.pop(stable_eigvals.index(x))
        return True
    else:
        return False

W, V, _ = schur(M, sort=sort_fun)
W
array([[ 1.        , -0.02316402, -1.00085948, -0.95000594],
       [ 0.        ,  0.95238095, -0.00237501, -0.95325452],
       [ 0.        ,  0.        ,  1.05      ,  0.02432222],
       [ 0.        ,  0.        ,  0.        ,  1.        ]])
V
array([[ 0.99875234,  0.00121459, -0.04992284,  0.        ],
       [ 0.04993762, -0.02429188,  0.99845688,  0.        ],
       [ 0.        ,  0.04992284,  0.00121459,  0.99875234],
       [ 0.        , -0.99845688, -0.02429188,  0.04993762]])

We can check the modulus of eigenvalues of W11 and W22.

Since they are both triangular matrices, eigenvalues are the diagonal elements.

# W11
np.diag(W[:n, :n])
array([1.        , 0.95238095])
# W22
np.diag(W[n:, n:])
array([1.05, 1.  ])

The following functions wrap M matrix construction, Schur decomposition, and stability-imposing computation of P.

def stable_solution(M, verbose=True):
    """
    Given a system of linear difference equations

        y' = |a b| y
        x' = |c d| x

    which is potentially unstable, find the solution
    by imposing stability.

    Parameter
    ---------
    M : np.ndarray(float)
        The matrix represents the linear difference equations system.
    """
    n = M.shape[0] // 2
    stable_eigvals = list(sorted(np.linalg.eigvals(M))[:n])

    def sort_fun(x):
        "Sort the eigenvalues with modules smaller than 1 to the top-left."

        if x in stable_eigvals:
            stable_eigvals.pop(stable_eigvals.index(x))
            return True
        else:
            return False

    W, V, _ = schur(M, sort=sort_fun)
    if verbose:
        print('eigenvalues:\n')
        print('    W11: {}'.format(np.diag(W[:n, :n])))
        print('    W22: {}'.format(np.diag(W[n:, n:])))

    # compute V21 V11^{-1}
    P = V[n:, :n] @ np.linalg.inv(V[:n, :n])

    return W, V, P

def stationary_P(lq, verbose=True):
    """
    Computes the matrix :math:`P` that represent the value function

         V(x) = x' P x

    in the infinite horizon case. Computation is via imposing stability
    on the solution path and using Schur decomposition.

    Parameters
    ----------
    lq : qe.LQ
        QuantEcon class for analyzing linear quadratic optimal control
        problems of infinite horizon form.

    Returns
    -------
    P : array_like(float)
        P matrix in the value function representation.
    """

    Q = lq.Q
    R = lq.R
    A = lq.A * lq.beta ** (1/2)
    B = lq.B * lq.beta ** (1/2)

    n, k = lq.n, lq.k

    L, N, M = construct_LNM(A, B, Q, R)
    W, V, P = stable_solution(M, verbose=verbose)

    return P
# compute P
stationary_P(lq)
eigenvalues:

    W11: [1.         0.95238095]
    W22: [1.05 1.  ]
array([[ 0.1025, -2.05  ],
       [-2.05  , 41.    ]])

Note that the matrix P computed in this way is close to what we get from the routine in quantecon that solves an algebraic Riccati equation by iterating to convergence on a Riccati difference equation.

The small difference comes from computational errors and will decrease as we increase the maximum number of iterations or decrease the tolerance for convergence.

lq.stationary_values()
(array([[ 0.1025, -2.05  ],
        [-2.05  , 41.01  ]]),
 array([[-0.09761905,  1.95238095]]),
 0)

Using a Schur decomposition is much more efficient.

%%timeit
stationary_P(lq, verbose=False)
72.4 µs ± 241 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%%timeit
lq.stationary_values()
1.16 ms ± 2.52 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

5.8. Other Applications#

The preceding approach to imposing stability on a system of potentially unstable linear difference equations is not limited to linear quadratic dynamic optimization problems.

For example, the same method is used in our Stability in Linear Rational Expectations Models lecture.

Let’s try to solve the model described in that lecture by applying the stable_solution function defined in this lecture above.

def construct_H(ρ, λ, δ):
    "contruct matrix H given parameters."

    H = np.empty((2, 2))
    H[0, :] = ρ,δ
    H[1, :] = - (1 - λ) / λ, 1 / λ

    return H

H = construct_H(ρ=.9, λ=.5, δ=0)
W, V, P = stable_solution(H)
P
eigenvalues:

    W11: [0.9]
    W22: [2.]
array([[0.90909091]])

5.9. Discounted Problems#

5.9.1. Transforming States and Controls to Eliminate Discounting#

A pair of useful transformations allows us to convert a discounted problem into an undiscounted one.

Thus, suppose that we have a discounted problem with objective

t=0βt{xtRxt+utQut}

and that the state transition equation is again xt+1=Axt+But.

Define the transformed state and control variables

  • x^t=βt2xt

  • u^t=βt2ut

and the transformed transition equation matrices

  • A^=β12A

  • B^=β12B

so that the adjusted state and control variables obey the transition law

x^t+1=A^x^t+B^u^t.

Then a discounted optimal control problem defined by A,B,R,Q,β having optimal policy characterized by P,F is associated with an equivalent undiscounted problem defined by A^,B^,Q,R having optimal policy characterized by F^,P^ that satisfy the following equations:

F^=(Q+BP^B)1B^PA^

and

P^=R+A^PA^A^PB^(Q+BP^B^)1B^PA^

It follows immediately from the definitions of A^,B^ that F^=F and P^=P.

By exploiting these transformations, we can solve a discounted problem by solving an associated undiscounted problem.

In particular, we can first transform a discounted LQ problem to an undiscounted one and then solve that discounted optimal regulator problem using the Lagrangian and invariant subspace methods described above.

For example, when β=11+r, we can solve for P with A^=β1/2A and B^=β1/2B.

These settings are adopted by default in the function stationary_P defined above.

β = 1 / (1 + r)
lq.beta = β
stationary_P(lq)
eigenvalues:

    W11: [0.97590007 0.97590007]
    W22: [1.02469508 1.02469508]
array([[ 0.0525, -1.05  ],
       [-1.05  , 21.    ]])

We can verify that the solution agrees with one that comes from applying the routine LQ.stationary_values in the quantecon package.

lq.stationary_values()
(array([[ 0.0525, -1.05  ],
        [-1.05  , 21.    ]]),
 array([[-0.05,  1.  ]]),
 0.0)

5.9.2. Lagrangian for Discounted Problem#

For several purposes, it is useful explicitly briefly to describe a Lagrangian for a discounted problem.

Thus, for the discounted optimal linear regulator problem, form the Lagrangian

(5.18)#L=t=0βt{xtRxt+utQut+2βμt+1[Axt+Butxt+1]}

where 2μt+1 is a vector of Lagrange multipliers on the state vector xt+1.

First-order conditions for maximization with respect to {ut,xt+1}t=0 are

(5.19)#2Qut+2βBμt+1=0 , t0μt=Rxt+βAμt+1 , t1.

Define 2μ0 to be the vector of shadow prices of x0 and apply an envelope condition to (5.18) to deduce that

μ0=Rx0+βAμ1,

which is a time t=0 counterpart to the second equation of system (5.19).

Proceeding as we did above with the undiscounted system (5.5), we can rearrange the first-order conditions into the system

(5.20)#[IβBQ1B0βA][xt+1μt+1]=[A0RI][xtμt]

which in the special case that β=1 agrees with equation (5.5), as expected.

By staring at system (5.20), we can infer identities that shed light on the structure of optimal linear regulator problems, some of which will be useful in this lecture when we apply and extend the methods of this lecture to study Stackelberg and Ramsey problems.

First, note that the first block of equation system (5.20) asserts that when μt+1=Pxt+1, then

(I+βQ1BPBP)xt+1=Axt,

which can be rearranged to sbe

xt+1=(I+βBQ1BP)1Axt.

This expression for the optimal closed loop dynamics of the state must agree with an alternative expression that we had derived with dynamic programming, namely,

xt+1=(ABF)xt.

But using

(5.21)#F=β(Q+βBPB)1BPA

it follows that

ABF=(IβB(Q+βBPB)1BP)A.

Thus, our two expressions for the closed loop dynamics agree if and only if

(5.22)#(I+βBQ1BP)1=(IβB(Q+βBPB)1BP).

Matrix equation (5.22) can be verified by applying a partitioned inverse formula.

Note

Just use the formula (abd1c)1=a1+a1b(dca1b)1ca1 for appropriate choices of the matrices a,b,c,d.

Next, note that for any fixed F for which eigenvalues of ABF are less than 1β in modulus, the value function associated with using this rule forever is x0P~x0 where P~ obeys the following matrix equation:

(5.23)#P~=(R+FQF)+β(ABF)P(ABF).

Evidently, P~=P only when F obeys formula (5.21).

Next, note that the second equation of system (5.20) implies the “forward looking” equation for the Lagrange multiplier

μt=Rxt+βAμt+1

whose solution is

μt=Pxt,

where

(5.24)#P=R+βAP(ABF)

where we must require that F obeys equation (5.21).

Equations (5.23) and (5.24) provide different perspectives on the optimal value function.