Onesimo Hernandez-Lerma, Jean B. Lasserre's Discrete-Time Markov Control Processes: Basic Optimality Criteria PDF

By Onesimo Hernandez-Lerma, Jean B. Lasserre

ISBN-10: 1461207290

ISBN-13: 9781461207290

ISBN-10: 1461268842

ISBN-13: 9781461268840

This book presents the first part of a planned two-volume series devoted to a systematic exposition of some recent developments in the theory of discrete-time Markov control processes (MCPs). Interest is mainly confined to MCPs with Borel state and control (or action) spaces, and possibly unbounded costs and noncompact control constraint sets. MCPs are a class of stochastic control problems, also known as Markov decision processes, controlled Markov processes, or stochastic dynamic programs; sometimes, particularly when the state space is a countable set, they are also called Markov decision (or controlled Markov) chains. Regardless of the name used, MCPs appear in many fields, for example, engineering, economics, operations research, statistics, renewable and nonrenewable resource management, (control of) epidemics, etc. However, most of the literature (say, at least 90%) is concentrated on MCPs for which (a) the state space is a countable set, and/or (b) the costs-per-stage are bounded, and/or (c) the control constraint sets are compact. But curiously enough, the most widely used control model in engineering and economics, namely the LQ (Linear system/Quadratic cost) model, satisfies none of these conditions. Moreover, when dealing with "partially observable" systems, a standard approach is to transform them into equivalent "completely observable" systems in a larger state space (in fact, a space of probability measures), which is uncountable even if the original state process is finite-valued.

Sample text

If (b) holds, then a straightforward induction argument shows that $c_t(x) \le m k^t w(x)$ $\forall x \in X$ and $t = 0, 1, \dots$. Thus, $C(x) \le m w(x)/(1 - \alpha k) < \infty$ for each $x$. (c) implies (d). Suppose that (c) holds. [...] 3) $\sum \alpha^{t-n} c_t(x)$ $\forall n = 0, 1, \dots$. [...] 7c), and, therefore, $E_x^\pi c_0(x_{t+1}) \le E_x^\pi c_1(x_t)$. [...] 3). [...] 4), observe first that it trivially holds for $n = 0$, by the definition of $C(x)$. [...] 7c) again, $E_x^\pi[C(x_n) \mid h_{n-1}, a_{n-1}] = \int C(y)\,Q(dy \mid x_{n-1}, a_{n-1}) = \sum_{t=0}^{\infty} \alpha^t \int c_t(y)\,Q(dy \mid x_{n-1}, a_{n-1}) \le \sum_{t=0}^{\infty} \alpha^t c_{t+1}(x_{n-1})$.

[...] 6). 8 Lemma. [...] 2 hold. [...] 6).

4. Infinite-Horizon Discounted-Cost Problems

Proof. [...] 1), and the assumption that $c \ge 0$, [...]. Therefore, $v_n(x) \le V^*(x)$ $\forall x \in X$. [...], if $u$ and $u'$ are functions in $M(X)_+$ and $u \ge u'$, then $Tu \ge Tu'$. Therefore, since $v_0 := 0$ and $v_n := Tv_{n-1}$ for $n \ge 1$, the $\alpha$-VI functions form a nondecreasing sequence in $M(X)_+$, which implies that $v_n \uparrow v^*$ for some function $v^* \in M(X)_+$. [...] $u(x,a)$ [...] $c(x,a) + \alpha \int v_n(y)\,Q(dy \mid x,a)$, $c(x,a) + \alpha \int v^*(y)\,Q(dy \mid x,a)$. [...] 4, $v^* = \lim_n v_n = \lim_n Tv_{n-1} = Tv^*$; that is, $v^*$ satisfies the $\alpha$-DCOE $v^* = Tv^*$.
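The value-iteration recursion in this excerpt (start from the zero function, apply the operator T repeatedly) can be illustrated numerically. The following is only a minimal sketch for a finite model, not the Borel-space setting the book treats: the names `bellman_operator` and `value_iteration`, the two-state, two-action cost matrix, the transition array, and the discount factor 0.9 are all hypothetical choices made for illustration, and T is assumed to be the usual minimization over actions of cost plus discounted expected value.

```python
import numpy as np

def bellman_operator(v, cost, trans, alpha):
    """Assumed form of T: (Tv)(x) = min_a [c(x,a) + alpha * sum_y Q(y|x,a) v(y)]."""
    # trans has shape (states, actions, states), so trans @ v has shape (states, actions).
    return np.min(cost + alpha * (trans @ v), axis=1)

def value_iteration(cost, trans, alpha, tol=1e-10, max_iter=10_000):
    """Iterate v_n = T v_{n-1} from v_0 = 0 and keep every iterate."""
    v = np.zeros(cost.shape[0])  # v_0 := 0
    history = [v]
    for _ in range(max_iter):
        v_next = bellman_operator(v, cost, trans, alpha)
        history.append(v_next)
        if np.max(np.abs(v_next - v)) < tol:
            break
        v = v_next
    return history[-1], history

# Hypothetical data: nonnegative costs c(x,a) and a transition law Q(y|x,a).
cost = np.array([[1.0, 2.0],
                 [0.5, 3.0]])
trans = np.array([[[0.9, 0.1], [0.2, 0.8]],
                  [[0.5, 0.5], [0.0, 1.0]]])
alpha = 0.9

v_star, history = value_iteration(cost, trans, alpha)
print(v_star)
```

Because the costs are nonnegative and the iteration starts at zero, each iterate in `history` should dominate the previous one, and the final iterate should be (numerically) a fixed point of `bellman_operator`, mirroring the monotone convergence argument in the excerpt.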

For all $t$, $x' q_t x \ge 0$ $\forall x \in \mathbb{R}^n$, and $a' r_t a > 0$ $\forall a \ne 0$ in $\mathbb{R}^m$). The same approach used in the scalar case yields the optimal policy $\pi^* = \{f_0, \dots$ [...] 8)] [...] where $K_N = q_N$ and, for $t = N-1, \dots$ [...] for any initial state $x \in \mathbb{R}^n$. LQ-discounted cost. [...] $) = J(\pi, x) := E_x^\pi \left[ \sum_{t=0}^{\infty} \alpha^t (q x_t^2 + r a_t^2) \right]$. [...] 10) with $q_N = 0$ and $q_t = q \alpha^t$, $r_t$ [...] $0, \dots, N-1$). [...] 15) $t = N-1, \dots$ [...]

[...] 6 A consumption-investment problem

[...] An investor wishes to allocate his/her current wealth $x_t$ between investment ($a_t$) and consumption ($x_t - a_t$), in each period $t = 0, 1, \dots$

