Introduction to Approximate Dynamic Programming

MIKIO KUBO
May 26, 2025

Lecture slides by Prof. Kobayashi of Aoyama Gakuin University.

Transcript

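  1. Introduction to Approximate Dynamic Programming. Kazuhiro Kobayashi, College of Science and Engineering, Aoyama Gakuin University (kobayashi@ise.aoyama.ac.jp)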

  2-4. Dynamic programming. Example: the shortest path from the origin $s$ to the terminal $t$.

    • $p_i$: shortest distance from node $i$ to the terminal.
    • $c_{ij}$: cost of moving directly from node $i$ to node $j$.
    • The origin $s$ has arcs to nodes $i$, $j$, $k$ with costs $c_{si}$, $c_{sj}$, $c_{sk}$, so $p_s = \min\{c_{si} + p_i,\; c_{sj} + p_j,\; c_{sk} + p_k\}$.
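
The recursion $p_s = \min\{c_{si} + p_i, \ldots\}$ can be sketched in a few lines of Python. The network and its arc costs below are hypothetical (the slides give no numbers), and memoized recursion is only one of several ways to evaluate the recursion on an acyclic network.

```python
from functools import lru_cache

# Hypothetical arc costs c[i][j]; not taken from the slides.
COSTS = {
    "s": {"i": 2.0, "j": 5.0, "k": 4.0},
    "i": {"j": 1.0, "t": 7.0},
    "j": {"t": 3.0},
    "k": {"t": 1.5},
    "t": {},
}


@lru_cache(maxsize=None)
def shortest_to_terminal(node: str) -> float:
    """Return p_node, the shortest distance from `node` to the terminal "t"."""
    if node == "t":
        return 0.0
    # Bellman recursion: p_i = min over successors j of (c_ij + p_j).
    return min(
        (cost + shortest_to_terminal(succ) for succ, cost in COSTS[node].items()),
        default=float("inf"),
    )


p_s = shortest_to_terminal("s")
print(p_s)
```

Each $p_i$ is computed once and reused, which is the point of dynamic programming: the value at $s$ only needs the values at its successors.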
  5. Approximate dynamic programming

    • When $p_i$, $p_j$, $p_k$ are uncertain, $p_s = \min\{c_{si} + p_i,\; c_{sj} + p_j,\; c_{sk} + p_k\}$ cannot be computed, so an approximation is used.
  6-9. Multistage decision problems

    • Many decision problems are multistage decision problems.
    • Example: production management. On each successive date: decide that day's production plan, carry out production, then observe the post-production state (inventory level, etc.). Sources of uncertainty: unplanned breakdowns of production machines, unplanned increases in product demand, delays in product shipments.
    • Example: power generation planning. On each successive date: decide that day's generation plan, carry out generation, then observe the post-generation state. This setting also contains uncertain elements.
  10. Multistage decision problems: Powell's notation

    • States, decisions, and new information unfold as the sequence $S_0, x_0, W_1, S_1, x_1, W_2, \ldots, S_t, x_t, W_{t+1}, \ldots$
    • $S_t$: state of the system at period $t$ (e.g. demand forecast, inventory level).
    • $x_t$: decision made at period $t$ (e.g. production quantity).
    • $W_{t+1}$: information revealed during the transition from period $t$ to period $t+1$ (e.g. the production quantity actually realized, the demand actually observed).
  11. Notation and formulation due to W. B. Powell (Professor Emeritus, Princeton University; Optimal Dynamics)

    • W. B. Powell, Approximate Dynamic Programming, Wiley.
    • W. B. Powell, Minisymposium: A Unified Framework for Optimization Under Uncertainty, INFORMS, Washington DC. Available on YouTube.
  13. Multistage decision problems: reading the sequence $S_0, x_0, W_1, S_1, x_1, W_2, \ldots$

    • $S_t$ is what is known, $x_t$ is what is decided, and $W_{t+1}$ is the realization of what was uncertain.
  14. Multistage decision problems: the contribution function $C(S_t, x_t)$

    • $C(S_t, x_t)$ is the contribution of choosing decision $x_t$ at period $t$. A contribution may be a cost, a profit, and so on.
    • Example: the production cost of producing the quantity $x_t$ that was decided at period $t$ after looking at the demand forecast, inventory level, etc.
  15. Multistage decision problems: the decision $x_t = X^\pi(S_t)$

    • The decision $x_t$ is determined by a policy $\pi$.
    • A policy depends on the state $S_t$.
    • Even in the same state $S_t$, policies $\pi_1$ and $\pi_2$ may produce different decisions $x_t$.
    • The goal is to find a policy $\pi$ that optimizes the contribution $C(S_t, x_t)$.
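  16. Multistage decision problems

    • They are studied in many fields.
    • Decisions may be discrete or continuous.
    • They go by many names: reinforcement learning, optimal control, stochastic search, multi-armed bandit problems, multi-agent systems, stochastic programming, online optimization, MPC, etc.
  17. Formulating multistage decision problems

    • Without uncertain elements, the problem can be formulated as a mathematical optimization problem.
    • Example (linear optimization): $\min_x\; cx$ s.t. $Ax = b,\; x \ge 0$.
    • Example (multi-period linear optimization): $\min_{x_0, x_1, \ldots, x_T} \sum_{t=0}^{T} c_t x_t$ s.t. $A_t x_t - B_{t-1} x_{t-1} = b_t,\; D_t x_t \le u_t,\; x_t \ge 0$.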
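  18. The five elements of the mathematical model

    (1) State variables: $S_t = (R_t, I_t, B_t)$
    (2) Decision variables: $(x_t, a_t, u_t)$
    (3) Exogenous information: $W_{t+1}$
    (4) Transition function: $S_{t+1} = S^M(S_t, x_t, W_{t+1})$
    (5) Objective function: $\max_\pi \mathbb{E}_{S_0} \mathbb{E}_{W_1, \ldots, W_T \mid S_0} \sum_{t=0}^{T} C(S_t, X^\pi(S_t))$
  19. The five elements of the mathematical model, (1) to (3)

    (1) State variables $S_t = (R_t, I_t, B_t)$: $R_t$ is the physical state; $I_t$ and $B_t$ hold non-physical information, such as the background information on which decisions are based.
    (2) Decision variables $(x_t, a_t, u_t)$: their values are set by a policy $X^\pi(S_t \mid \theta)$.
    (3) Exogenous information $W_{t+1}$: information that is revealed or realized between periods $t$ and $t+1$.
  20. The five elements of the mathematical model, (4) to (5)

    (4) Transition function $S_{t+1} = S^M(S_t, x_t, W_{t+1})$: in state $S_t$ the policy fixes the decision variable $x_t$; once the information $W_{t+1}$ has been revealed, the system reaches state $S_{t+1}$ at period $t+1$.
    (5) Objective function: find a policy $\pi$ that maximizes the expected total contribution over all periods, $\max_\pi \mathbb{E}_{S_0} \mathbb{E}_{W_1, \ldots, W_T \mid S_0} \sum_{t=0}^{T} C(S_t, X^\pi(S_t))$.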
  22. Examples of the state variable $S_t = (R_t, I_t, B_t)$

    • $R_t$ is the physical state; $I_t$ and $B_t$ are non-physical information and background information for decision making.
    • $R_t$ = state of raw materials, positions of transport vehicles, product inventory levels.
    • $I_t$ = raw material prices, weather.
    • $B_t$ = market response to prices, operating status of production equipment.
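  23. Example: a wind power and storage system

    • $R_t$ = energy stored in the battery at period $t$.
    • $D_t$ = electricity demand at period $t$.
    • $p_t$ = electricity price on the power grid at period $t$.
    • $B_t$ = forecast of future wind.
    • State variable: $S_t = (R_t, D_t, p_t, B_t)$.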
  25. Examples of the decision variable $x_t$

    • It may be discrete or continuous, and may be multidimensional (think of the decision variables in a mathematical optimization problem).
    • Nothing has been said yet about how $x_t$ is obtained; it is not necessarily the optimal solution of a mathematical optimization problem.
    • It is determined by a function $X^\pi(S_t)$, where $\pi$ is the policy.
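  26. Example: a wind power and storage system (decision variable)

    • $x_t$: amount of electricity traded with the grid; $x_t > 0$ is an amount sold and $x_t < 0$ an amount purchased.
    • Constraints: $x_t \le R_t$ (sales cannot exceed the stored energy) and $-x_t \le R^{\max} - R_t$ (purchases cannot exceed the remaining storage capacity).
    • Policy: $x_t = X^\pi(S_t)$, with $S_t = (R_t, D_t, p_t, B_t)$.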
  28. Examples of the exogenous information $W_t$

    • Information revealed at period $t$: $W_t = (\hat{R}_t, \hat{D}_t, \hat{E}_t, \hat{p}_t)$.
    • $\hat{R}_t$: equipment failures, delays, newly contracted transport vehicles, etc.
    • $\hat{D}_t$: new customer demand.
    • $\hat{p}_t$: price changes.
    • $\hat{E}_t$: environmental information such as weather.
    • Because $W_{t+1}$ may depend on $S_t$ and $x_t$, it is sometimes written $W_{t+1}(S_t, x_t)$.
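  29. Example: a wind power and storage system (exogenous information)

    • $W_{t+1} = (\hat{E}_{t+1}, \hat{D}_{t+1}, \hat{p}_{t+1})$.
    • $\hat{E}_{t+1}$ = change in the amount of electricity (from generation) between periods $t$ and $t+1$.
    • $\hat{D}_{t+1}$ = electricity demand revealed between periods $t$ and $t+1$.
    • $\hat{p}_{t+1}$ = change in the electricity price between periods $t$ and $t+1$.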
  31. Examples of the transition function $S_{t+1} = S^M(S_t, x_t, W_{t+1})$

    • $R_{t+1} = R_t + x_t + \hat{R}_{t+1}$ (inventory balance).
    • $p_{t+1} = p_t + \hat{p}_{t+1}$ (the price at period $t+1$ is $p_t$ plus the price change).
    • $D_{t+1} = \hat{D}_{t+1}$ (the demand revealed for period $t+1$).
  33. Examples of the objective function

    • Cumulative reward (online learning): $\max_\pi \mathbb{E}\left\{ \sum_{t=0}^{T} C_t(S_t, X^\pi(S_t), W_{t+1}) \mid S_0 \right\}$.
    • Final reward (offline learning): $\max_\pi \mathbb{E}\left\{ F(x^{\pi,N}, \hat{W}) \mid S_0 \right\}$, where the aim is to make the value of the final decision variable $x^{\pi,N}$ optimal.
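  34. Example: a wind power and storage system (objective function)

    • Electricity sales revenue at period $t$: $C(S_t, x_t) = p_t x_t$. Since this $x_t$ is determined by the policy $\pi$, it is written $X^\pi(S_t)$.
    • Given the initial state $S_0$, maximize the expected sum of the contribution function over $t = 0, 1, \ldots, T$: $\max_\pi \mathbb{E}\left\{ \sum_{t=0}^{T} C(S_t, X^\pi(S_t)) \mid S_0 \right\}$.
    • In other words, optimize the policy $\pi$ that sets the decision variable $x_t$ in every period.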
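  35. Policies

    • A policy is a mapping from states to actions (that is, decisions): a mapping from $S_t$ to $x_t$. Given a state $S_t$, it fixes the action $x_t$ to take.
    • There are infinitely many such mappings, so how should one be chosen?
  36. Example of a policy $\pi$: a policy defined by a parameterized function

    • $X^\pi(S_t \mid \rho)$ selects one of three actions according to the price: one when $p_t < \rho^{\text{charge}}$, another when $\rho^{\text{charge}} < p_t < \rho^{\text{discharge}}$, and a third when $p_t > \rho^{\text{discharge}}$.
    • $p_t$ becomes known only at period $t$, so $x_t$ cannot be fixed beforehand (for example, as the optimal solution of a mathematical optimization problem).
  37. Example of a policy $\pi$: tuning the parameters

    • When the policy is a parameterized function, choosing the policy amounts to choosing the parameter values.
    • Use the parameter $\rho$ that maximizes the performance measure: $\max_\rho \mathbb{E}\left\{ \sum_{t=0}^{T} C(S_t, X^\pi(S_t \mid \rho)) \mid S_0 \right\}$.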
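
A minimal sketch of tuning such a threshold policy by simulation follows. Everything numeric here is an assumption for illustration, not something the slides state: actions are one unit per period ($x_t = -1$ buys, $x_t = +1$ sells, $0$ holds, following the sign convention of the storage example), prices are i.i.d. uniform on $[10, 50]$, the storage capacity is 5 units, and the expectation is replaced by an average over sampled price paths with a plain grid search over $(\rho^{\text{charge}}, \rho^{\text{discharge}})$.

```python
import random


def threshold_policy(price, storage, rho_charge, rho_discharge, r_max):
    """Return x_t: +1 sells one unit, -1 buys one unit, 0 holds (assumed action sizes)."""
    if price < rho_charge and storage < r_max:
        return -1  # price is low: buy (charge)
    if price > rho_discharge and storage >= 1:
        return 1   # price is high: sell (discharge)
    return 0


def simulate(rho_charge, rho_discharge, prices, r_max=5):
    """Total contribution sum_t p_t * x_t along one price path."""
    storage, total = 0, 0.0
    for p in prices:
        x = threshold_policy(p, storage, rho_charge, rho_discharge, r_max)
        total += p * x   # contribution C(S_t, x_t) = p_t * x_t
        storage -= x     # selling lowers the stored energy, buying raises it
    return total


def sample_paths(n_paths, horizon, seed=0):
    """Hypothetical price model: i.i.d. uniform prices."""
    rng = random.Random(seed)
    return [[rng.uniform(10.0, 50.0) for _ in range(horizon)] for _ in range(n_paths)]


def tune(paths, grid):
    """Grid search for the thresholds with the best sample-average contribution."""
    best = None
    for lo in grid:
        for hi in grid:
            if lo >= hi:
                continue
            value = sum(simulate(lo, hi, path) for path in paths) / len(paths)
            if best is None or value > best[0]:
                best = (value, lo, hi)
    return best


paths = sample_paths(n_paths=200, horizon=48)
best_value, best_lo, best_hi = tune(paths, grid=[15, 20, 25, 30, 35, 40, 45])
print(best_lo, best_hi, round(best_value, 1))
```

Grid search is used only because there are two parameters; with more parameters a stochastic search method would be the usual replacement.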
  38-40. Deterministic and stochastic problems

    Deterministic problem:
    • Objective function: $\min_{x_0, \ldots, x_T} \sum_{t=0}^{T} c_t x_t$.
    • Decision variables: $(x_0, \ldots, x_T)$.
    • Constraints at period $t$: $A_t x_t = R_t,\; x_t \ge 0$. These define $\mathcal{X}_t$, the set of values $x_t$ can take at period $t$.
    • Transition function: $R_{t+1} = b_{t+1} + B_t x_t$.

    Stochastic problem:
    • Objective function: $\max_\pi \mathbb{E}\left\{ \sum_{t=0}^{T} C_t(S_t, X^\pi_t(S_t), W_{t+1}) \mid S_0 \right\}$.
    • Policy: $X^\pi : \mathcal{S} \to \mathcal{X}$.
    • Constraints: $x_t = X^\pi_t(S_t) \in \mathcal{X}_t$.
    • Transition function: $S_{t+1} = S^M(S_t, x_t, W_{t+1})$.
    • Exogenous information: $(S_0, W_1, W_2, \ldots, W_T)$.
  41-43. Stochastic multistage linear programming

    • Deterministic objective function: $\max_{x_0, \ldots, x_T} \sum_{t=0}^{T} c_t x_t$.
    • Replace $x_t$ by $X^\pi(S_t)$ and take the expectation: $\max_\pi \mathbb{E} \sum_{t=0}^{T} c_t X^\pi_t(S_t)$.
    • Evaluate the expectation with samples: $\mathbb{E} \sum_{t=0}^{T} c_t X^\pi_t(S_t) \approx \frac{1}{N} \sum_{n=1}^{N} \sum_{t=0}^{T} c_t X^\pi_t(S^n_t(\omega^n))$.
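  44. Example: a wind power and storage system (state variables) $S_t = (R_t, D_t, E_t, p_t)$

    • $R_t$ = energy stored in the battery at period $t$.
    • $D_t$ = electricity demand at period $t$.
    • $E_t$ = energy generated at period $t$.
    • $p_t$ = electricity price on the power grid at period $t$.
  45. Example: a wind power and storage system (decision variables) $x_t = (x^{GB}_t, x^{GD}_t, x^{EB}_t, x^{ED}_t, x^{BD}_t)$

    • $x^{GB}_t$ = energy sent from the grid to the battery at period $t$.
    • $x^{GD}_t$ = demand met from the grid at period $t$.
    • $x^{EB}_t$ = energy flowing from the wind farm to the battery at period $t$.
    • $x^{ED}_t$ = demand met by energy from the wind farm at period $t$.
    • $x^{BD}_t$ = demand met from the battery at period $t$.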
  46-47. Example: a wind power and storage system (constraints)

    • $x^{EB}_t + x^{ED}_t \le E_t$.
    • $x^{GD}_t + x^{BD}_t + x^{ED}_t = D_t$ (demand is fully met by energy from the grid, the battery, and the wind farm).
    • $x^{BD}_t \le R_t$ (demand met from the battery cannot exceed the stored energy).
    • $x^{GD}_t, x^{EB}_t, x^{ED}_t, x^{BD}_t \ge 0$.
    • The policy $X^\pi(S_t)$ must return an $x$ that satisfies these constraints.
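  48. Example: a wind power and storage system (exogenous information) $W_{t+1} = (\hat{E}_{t+1}, \hat{D}_{t+1}, \hat{p}_{t+1})$

    • $\hat{E}_{t+1}$ = change in wind generation from period $t$ to period $t+1$.
    • $\hat{D}_{t+1}$ = change in electricity demand from period $t$ to period $t+1$.
    • $\hat{p}_{t+1}$ = change in the grid electricity price at period $t+1$.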
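  49. Example: a wind power and storage system (transition function) $S_{t+1} = S^M(S_t, x_t, W_{t+1})$

    • $R_{t+1} = R_t + \eta\,(x^{GB}_t + x^{EB}_t - x^{BD}_t)$ (change in the energy stored in the battery).
    • $E_{t+1} = E_t + \hat{E}_{t+1}$ (generation realized at period $t+1$).
    • $D_{t+1} = D_t + \hat{D}_{t+1}$ (demand revealed at period $t+1$).
    • $p_{t+1} = \hat{p}_{t+1}$.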
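
The constraints and transition function of this example translate directly into code. The sketch below is illustrative only: the class and field names are made up for this note, the efficiency value $\eta = 0.9$ is an assumption, and the numbers in the usage lines are arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    R: float  # energy stored in the battery
    D: float  # electricity demand
    E: float  # wind energy generated
    p: float  # grid electricity price


@dataclass(frozen=True)
class Decision:
    GB: float  # grid -> battery
    GD: float  # grid -> demand
    EB: float  # wind -> battery
    ED: float  # wind -> demand
    BD: float  # battery -> demand


@dataclass(frozen=True)
class Exogenous:
    E_hat: float  # change in wind generation
    D_hat: float  # change in demand
    p_hat: float  # price information revealed for the next period


def is_feasible(s: State, x: Decision, tol: float = 1e-9) -> bool:
    """Check the four constraints listed for the wind/storage example."""
    return (
        x.EB + x.ED <= s.E + tol
        and abs(x.GD + x.BD + x.ED - s.D) <= tol
        and x.BD <= s.R + tol
        and min(x.GD, x.EB, x.ED, x.BD) >= -tol
    )


def transition(s: State, x: Decision, w: Exogenous, eta: float = 0.9) -> State:
    """S_{t+1} = S^M(S_t, x_t, W_{t+1}); eta is an assumed efficiency factor."""
    return State(
        R=s.R + eta * (x.GB + x.EB - x.BD),
        D=s.D + w.D_hat,
        E=s.E + w.E_hat,
        p=w.p_hat,
    )


s0 = State(R=2.0, D=5.0, E=3.0, p=30.0)
x0 = Decision(GB=0.0, GD=2.0, EB=1.0, ED=2.0, BD=1.0)
if is_feasible(s0, x0):
    s1 = transition(s0, x0, Exogenous(E_hat=1.0, D_hat=-1.0, p_hat=28.0))
    print(s1)
```

Keeping the state, decision, and exogenous information as separate immutable records mirrors the separation of the model elements in the slides and makes a policy easy to swap in later.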
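  50. Example: a wind power and storage system (objective function)

    • $\max_\pi \mathbb{E}_{S_0} \mathbb{E}_{W_1, \ldots, W_T \mid S_0} \left\{ \sum_{t=0}^{T} C(S_t, X^\pi(S_t)) \mid S_0 \right\}$, where
    • $S_{t+1} = S^M(S_t, x_t = X^\pi(S_t), W_{t+1})$, and
    • $(S_0, W_1, W_2, \ldots, W_T)$ is assumed to be known.
  51. Handling uncertainty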

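  52. Example using a time series model (price)

    • $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \epsilon^p_{t+1}$ (ARIMA model; this instance is purely autoregressive, of order 3).
    • The parameters that define the model are $(\theta_0, \theta_1, \theta_2)$.
    • In vector form: $p_{t+1} = \bar{\theta}^\top \bar{p}_t + \epsilon^p_{t+1}$, with $\bar{\theta} = (\theta_0, \theta_1, \theta_2)^\top$ and $\bar{p}_t = (p_t, p_{t-1}, p_{t-2})^\top$.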
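
To make the vector form concrete, here is a small simulator for the price model. The Gaussian noise, its standard deviation, and the coefficient values in the usage line are assumptions chosen for illustration; the slides do not specify them.

```python
import random


def simulate_prices(theta, history, n_steps, sigma=1.0, seed=0):
    """Simulate p_{t+1} = theta . (p_t, p_{t-1}, p_{t-2}) + eps_{t+1}.

    theta   -- (theta_0, theta_1, theta_2)
    history -- (p_t, p_{t-1}, p_{t-2}), most recent price first
    sigma   -- standard deviation of the (assumed Gaussian) noise
    """
    rng = random.Random(seed)
    p_bar = list(history)
    path = []
    for _ in range(n_steps):
        eps = rng.gauss(0.0, sigma)
        p_next = sum(th * p for th, p in zip(theta, p_bar)) + eps
        path.append(p_next)
        # Shift the lag vector: the newest price moves to the front.
        p_bar = [p_next, p_bar[0], p_bar[1]]
    return path


print(simulate_prices((0.5, 0.3, 0.1), (30.0, 29.0, 31.0), n_steps=5))
```

The lag vector `p_bar` is exactly what has to be carried from one period to the next in order to evaluate the model.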
  53-56. Example: a wind power and storage system (revisited)

    • State variables: $S_t = (R_t, D_t, E_t, p_t)$, where $R_t$ is the energy stored in the battery, $D_t$ the electricity demand, $E_t$ the energy generated, and $p_t$ the grid electricity price at period $t$.
    • With the price model $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \epsilon_{t+1}$, the next price depends on the last three prices, so the state is extended: $S_t = (R_t, D_t, E_t, p_t) \to (R_t, D_t, E_t, (p_t, p_{t-1}, p_{t-2}))$.
    • Constraints: $x^{EB}_t + x^{ED}_t \le E_t$, $x^{GD}_t + x^{BD}_t + x^{ED}_t = D_t$, $x^{BD}_t \le R_t$.
    • Transition function (price): $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \epsilon_{t+1}$.
  57-61. Estimating and updating the parameters

    • The parameters must be estimated. Replace the true parameters by the period-$t$ estimates $\bar{\theta}_t = (\bar{\theta}_{t0}, \bar{\theta}_{t1}, \bar{\theta}_{t2})^\top$: $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \epsilon_{t+1} \;\to\; p_{t+1} = \bar{\theta}_{t0} p_t + \bar{\theta}_{t1} p_{t-1} + \bar{\theta}_{t2} p_{t-2} + \epsilon_{t+1}$.
    • With $\bar{p}_t = (p_t, p_{t-1}, p_{t-2})^\top$, define the forecast $\bar{F}^{\text{price}}_t(\bar{p}_t \mid \bar{\theta}_t) = \bar{\theta}_{t0} p_t + \bar{\theta}_{t1} p_{t-1} + \bar{\theta}_{t2} p_{t-2} = \bar{\theta}_t^\top \bar{p}_t$. Consistent with the model above, the error is $\epsilon_{t+1} = p_{t+1} - \bar{F}^{\text{price}}_t(\bar{p}_t \mid \bar{\theta}_t)$.
    • At period $t+1$, $\bar{\theta}_{t+1}$ is updated from $\bar{\theta}_t$ and $p_t, p_{t-1}, p_{t-2}$ by recursive least squares, using a $3 \times 3$ matrix $M_t = (m_{ij})$:
      $\bar{\theta}_{t+1} = \bar{\theta}_t + \frac{1}{\gamma_{t+1}} M_t \bar{p}_t \epsilon_{t+1}$
      $M_{t+1} = M_t - \frac{1}{\gamma_{t+1}} M_t \bar{p}_t \bar{p}_t^\top M_t$
      $\gamma_{t+1} = 1 + \bar{p}_t^\top M_t \bar{p}_t$
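
The recursive least-squares update can be sketched in plain Python as below. This follows the standard form of the algorithm with the error taken as observed price minus forecast and $\gamma = 1 + \bar{p}^\top M \bar{p}$; the initial matrix $M_0 = 100\,I$, the "true" coefficients, and the unit-variance Gaussian noise in the demo are assumptions made for illustration.

```python
import random


def rls_update(theta, M, p_bar, p_next):
    """One recursive least-squares step for the model p_next ~ theta . p_bar."""
    # Forecast error: observed price minus forecast.
    eps = p_next - sum(th * p for th, p in zip(theta, p_bar))
    Mp = [sum(M[i][j] * p_bar[j] for j in range(3)) for i in range(3)]
    gamma = 1.0 + sum(p_bar[i] * Mp[i] for i in range(3))
    theta_new = [theta[i] + Mp[i] * eps / gamma for i in range(3)]
    # M is symmetric, so M p p^T M equals the outer product of Mp with itself.
    M_new = [[M[i][j] - Mp[i] * Mp[j] / gamma for j in range(3)] for i in range(3)]
    return theta_new, M_new


# Demo on simulated prices (all numbers below are assumptions).
rng = random.Random(0)
true_theta = (0.5, 0.3, 0.1)
prices = [0.0, 0.0, 0.0]
theta = [0.0, 0.0, 0.0]
M = [[100.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
for _ in range(2000):
    p_bar = [prices[-1], prices[-2], prices[-3]]
    p_next = sum(a * p for a, p in zip(true_theta, p_bar)) + rng.gauss(0.0, 1.0)
    theta, M = rls_update(theta, M, p_bar, p_next)
    prices.append(p_next)
print([round(th, 2) for th in theta])
```

Only the current estimate and the matrix need to be stored between periods, which is why the update fits naturally into a state variable.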
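  62. Example: a wind power and storage system (revisited): state variables

    • $S_t = (R_t, D_t, E_t, p_t) \to (R_t, D_t, E_t, (p_t, p_{t-1}, p_{t-2}))$ because of the price model $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \epsilon_{t+1}$.
    • Because the estimate $\bar{\theta}_t$ is updated recursively together with the matrix $M_t$, both also become part of the state: $(R_t, D_t, E_t, (p_t, p_{t-1}, p_{t-2}), (\bar{\theta}_t, M_t))$.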
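  63. How to determine the decision $x_t$

    • The method used to determine the decision $x_t$ is called the policy $\pi$.
    • Given the state $S_t$ and parameters $\theta$, the decision is $x_t = X^\pi(S_t \mid \theta)$.
    • Can such a policy $\pi$ be found?
  64-65. How to determine the decision $x_t$: choosing a policy by learning

    • Learning searches for a function: $\min_{f, \theta} \frac{1}{N} \sum_{n=1}^{N} \left( y^n - f(x^n \mid \theta) \right)^2$, which needs sufficient training data $(x^n, y^n)$. The function may be (1) a lookup table, (2) a parametric model, or (3) a nonparametric model.
    • Multistage decision making searches for a policy: $\max_\pi \frac{1}{N} \sum_{n=1}^{N} \sum_{t=0}^{T} C(S_t, x_t)$.
  66. Policies (recap)

    • A policy is a mapping from states to actions (decisions): given a state, it fixes the action to take. How should one be chosen among infinitely many mappings?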

  67-68. Approaches to designing policies

    (A) Policy search: find the policy $\pi$ that optimizes the objective, that is, find a function $f$ and its parameters $\theta$:
        $\max_{\pi = (f, \theta)} \mathbb{E}\left\{ \sum_{t=0}^{T} C(S_t, X^\pi_t(S_t \mid \theta)) \mid S_0 \right\}$
    (B) Lookahead approximation: identify how to set the decision $x_t$ by approximating the effect of the current decision on the future:
        $X^*_t(S_t) = \arg\max_{x_t} \left( C(S_t, x_t) + \mathbb{E}\left\{ \max_\pi \mathbb{E}\left\{ \sum_{t'=t+1}^{T} C(S_{t'}, X^\pi_{t'}(S_{t'})) \mid S_{t+1} \right\} \mid S_t, x_t \right\} \right)$
  69-70. (A) Policy search

    (1) Policy function approximations:
        ① Lookup tables: case-by-case rules, e.g. "when demand is such-and-such, produce so many units".
        ② Parameterized functions: e.g. the $(s, S)$ policy in inventory management (when inventory falls to $s$, replenish by $S - s$), or a neural network.
        ③ Nonparametric functions: e.g. kernel regression, deep neural networks.
    (2) Cost function approximations: modify a deterministic model so that it handles stochastic elements, $X^{CFA}(S_t \mid \theta) = \arg\max_{x_t} \bar{C}^\pi(S_t, x_t \mid \theta)$.
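  71-72. (A) Policy search: function-based policies (for example, the $(s, S)$ policy in inventory management)

    • Step a: choose a parameterized function.
    • Step b: tune its parameters.
    • Example (inventory management): in step a, adopt the $(s, S)$ policy; in step b, determine the values of $s$ and $S$ by tuning.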
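
Steps a and b can be sketched for the inventory example as follows. The cost structure (selling price, unit cost, fixed ordering cost, holding cost), the uniform integer demand, and the exhaustive search over $(s, S)$ are all assumptions introduced for this illustration; the policy is implemented in its usual order-up-to form, which orders exactly $S - s$ when the inventory sits at $s$.

```python
import random


def simulate_sS(s, S, demands, price=10.0, unit_cost=6.0, order_cost=20.0, holding=0.5):
    """Profit of an (s, S) policy along one demand path (assumed cost structure)."""
    inventory, profit = S, 0.0
    for d in demands:
        # (s, S) policy: once inventory has fallen to s or below, order up to S.
        if inventory <= s:
            order = S - inventory
            profit -= order_cost + unit_cost * order
            inventory = S
        sold = min(inventory, d)
        profit += price * sold
        inventory -= sold
        profit -= holding * inventory  # holding cost on leftover stock
    return profit


def tune_sS(demand_paths, max_level=15):
    """Step b: pick (s, S) with the best sample-average profit by exhaustive search."""
    best = None
    for s in range(max_level):
        for S in range(s + 1, max_level + 1):
            avg = sum(simulate_sS(s, S, path) for path in demand_paths) / len(demand_paths)
            if best is None or avg > best[0]:
                best = (avg, s, S)
    return best


rng = random.Random(0)
demand_paths = [[rng.randint(0, 6) for _ in range(30)] for _ in range(100)]
best_profit, best_s, best_S = tune_sS(demand_paths)
print(best_s, best_S, round(best_profit, 1))
```

Step a fixes the functional form once; step b is then an ordinary search over two numbers, evaluated by simulation.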
  73-75. (A) Policy search

    • Set an objective function and choose the policy so that it performs well on samples.
    • Let $\mathcal{F}$ be a class of functions; the behavior of a function $f \in \mathcal{F}$ is fixed by the value of its parameters $\theta \in \Theta^f$. A policy $\pi$ is determined by $f$ and $\theta$: $\pi = (f, \theta)$.
    • Choose the policy $\pi$ to optimize $\max_\pi \mathbb{E}_{S_0} \mathbb{E}_{W_1, \ldots, W_T \mid S_0} \left\{ \sum_{t=0}^{T} C(S_t, X^\pi(S_t)) \mid S_0 \right\}$.
    • Starting from the initial state $S_0$, the policy $\pi$ sets the decision $X^\pi(S_t)$ at each period $t$; the goal is the policy that maximizes the expected sum of the per-period contributions $C(S_t, X^\pi(S_t))$.
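  76. (A) Policy search

    • Optimizing $\max_\pi \mathbb{E}_{S_0} \mathbb{E}_{W_1, \ldots, W_T \mid S_0} \left\{ \sum_{t=0}^{T} C(S_t, X^\pi(S_t)) \mid S_0 \right\}$ is difficult, so it is approximated with two strategies: ① policy function approximations, ② contribution (cost) function approximations.
  77. (A) Policy search ①: policy function approximation

    • Approximate the policy function $X^\pi(S_t)$ (lookup table, parameterized function, etc.).
    • Example (linear function approximation): $X^\pi(S_t \mid \theta) = \theta_0 + \theta_1 \phi_1(S_t) + \theta_2 \phi_2(S_t)$.
    • Nonlinear functions, neural networks, etc. can also be used.
  78. (A) Policy search ②: contribution (cost) function approximation

    • Use the decision that maximizes an approximation of the contribution function: $X^\pi(S_t \mid \theta) = \arg\max_{x \in \mathcal{X}^\pi_t(\theta)} \bar{C}^\pi_t(S_t, x \mid \theta)$.
    • Here $C(S_t, X^\pi(S_t))$ is approximated by $\bar{C}^\pi_t(S_t, x \mid \theta)$.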
  79-83. (B) Lookahead approximation

    • $X^*_t(S_t) = \arg\max_{x_t} \left( C(S_t, x_t) + \mathbb{E}\left\{ \max_\pi \mathbb{E}\left\{ \sum_{t'=t+1}^{T} C(S_{t'}, X^\pi_{t'}(S_{t'})) \mid S_{t+1} \right\} \mid S_t, x_t \right\} \right)$
    • Diagram: state $S_0$ → decision $x_0$ → post-decision state $(S_0, x_0)$ → stochastic transition $W_1$ → state $S_1$ → decision $x_1$ → post-decision state $(S_1, x_1)$ → stochastic transition $W_2$ → state $S_2$.
  84-88. (B) Lookahead approximation: value function approximation

    • Value function approximation: $X^*_t(S_t) = \arg\max_{x_t} \left( C(S_t, x_t) + \mathbb{E}\left\{ V_{t+1}(S_{t+1}) \mid S_t, x_t \right\} \right)$.
    • That is, the future term $\max_\pi \mathbb{E}\left\{ \sum_{t'=t+1}^{T} C(S_{t'}, X^\pi_{t'}(S_{t'})) \mid S_{t+1} \right\}$ of the lookahead policy is approximated by $V_{t+1}(S_{t+1})$.
    • Diagram: in the sequence $S_0$ → $x_0$ → $(S_0, x_0)$ → $W_1$ → $S_1$ → ..., everything after $S_{t+1}$ is summarized by $V_{t+1}(S_{t+1})$. For example, if $W_1$ can lead from $(S_0, x_0)$ to the states $S_{10}$, $S_{11}$, $S_{12}$, the values $V_1(S_{10})$, $V_1(S_{11})$, $V_1(S_{12})$ are used.
  89-91. (B) Lookahead approximation: value function approximation with the post-decision state

    • $X^*_t(S_t) = \arg\max_{x_t} \left( C(S_t, x_t) + \mathbb{E}\left\{ V_{t+1}(S_{t+1}) \mid S_t, x_t \right\} \right) = \arg\max_{x_t} \left( C(S_t, x_t) + V^x_t(S^x_t) \right)$
    • Post-decision state $S^x_t$: the state reached once $x_t$ has been decided in state $S_t$.
    • The expectation $\mathbb{E}\left\{ V_{t+1}(S_{t+1}) \mid S_t, x_t \right\}$ is what gets approximated, through $V^x_t(S^x_t)$.
  92-100. (B) Lookahead approximation: the post-decision state

    • Post-decision state $S^x_t$: the state reached once $x_t$ has been decided in state $S_t$.
    • Diagram: $S_t$ → $S^x_t$ → (stochastic transition $W_{t+1}$) → $S_{t+1}$ → $S^x_{t+1}$ → (stochastic transition) → $S_{t+2}$.
    • From the state to the post-decision state: $V_t(S_t) = \max_{x_t} \left( C(S_t, x_t) + V^x_t(S^x_t) \right)$.
    • From the post-decision state to the next state: $V^x_t(S^x_t) = \mathbb{E}_{W_{t+1}}\left\{ V_{t+1}(S_{t+1}) \mid S^x_t \right\}$.
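
The two equations $V_t(S_t) = \max_{x_t}(C(S_t, x_t) + V^x_t(S^x_t))$ and $V^x_t(S^x_t) = \mathbb{E}_{W_{t+1}}\{V_{t+1}(S_{t+1}) \mid S^x_t\}$ can be evaluated exactly when the problem is tiny. The sketch below does so for a toy storage problem that is assumed purely for illustration: the state is (stored units, price), $x_t > 0$ sells and $x_t < 0$ buys, the contribution is $p_t x_t$, the post-decision state is the stored amount after the trade, and the next price is drawn independently from a three-point distribution.

```python
# Assumed i.i.d. price distribution: price -> probability.
PRICES = {10.0: 0.3, 30.0: 0.4, 50.0: 0.3}
R_MAX, T = 2, 3  # storage capacity and last period (periods 0..T)


def solve():
    """Backward dynamic programming using the post-decision state R^x = R - x."""
    # Terminal condition: nothing is earned after the last period.
    V_next = {(R, p): 0.0 for R in range(R_MAX + 1) for p in PRICES}
    policy = {}
    for t in range(T, -1, -1):
        # V^x_t(R^x) = E_W[ V_{t+1}(S_{t+1}) | R^x ]: average over the next price.
        Vx = {
            Rx: sum(prob * V_next[(Rx, p)] for p, prob in PRICES.items())
            for Rx in range(R_MAX + 1)
        }
        V = {}
        for R in range(R_MAX + 1):
            for p in PRICES:
                # Feasible trades: sell at most R, buy at most R_MAX - R.
                best_x, best_val = max(
                    ((x, p * x + Vx[R - x]) for x in range(R - R_MAX, R + 1)),
                    key=lambda pair: pair[1],
                )
                V[(R, p)] = best_val
                policy[(t, R, p)] = best_x
        V_next = V
    return V_next, policy


V0, policy = solve()
print(policy[(T - 1, 0, 10.0)], policy[(T - 1, 2, 50.0)])
```

Note that the expectation is taken once per post-decision state, outside the maximization over $x_t$, which is the computational advantage of working with $S^x_t$.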
  101-104. (B) Lookahead approximation: value function approximation

    • $X^*_t(S_t) = \arg\max_{x_t} \left( C(S_t, x_t) + \mathbb{E}\left\{ V_{t+1}(S_{t+1}) \mid S_t, x_t \right\} \right) = \arg\max_{x_t} \left( C(S_t, x_t) + V^x_t(S^x_t) \right)$
    • $V^x_t(S^x_t)$ is unknown (it is a conceptual object), so it has to be represented in some way: $V^x_t(S^x_t) \to \bar{V}^x_t(S^x_t)$.
    • This gives the approximate policy $\arg\max_{x_t} \left( C(S_t, x_t) + \bar{V}^x_t(S^x_t) \right) = \arg\max_{x_t} \bar{Q}_t(S_t, x_t)$ (Q-learning).
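
One common way to build $\bar{V}^x_t$ is to simulate forward and smooth sampled values into a lookup table indexed by the post-decision state. The sketch below does this for an assumed toy storage problem (state: stored units and price; $x_t > 0$ sells, $x_t < 0$ buys; contribution $p_t x_t$; i.i.d. prices). The averaging stepsize $1/n$, the $\epsilon$-greedy exploration, and all numbers are illustrative choices. It is a lookup-table value function approximation, not Q-learning in the strict sense, although the quantity being maximized plays the role of $\bar{Q}_t(S_t, x_t)$.

```python
import random

PRICES = [10.0, 30.0, 50.0]   # assumed price values
WEIGHTS = [0.3, 0.4, 0.3]     # assumed probabilities
R_MAX, T = 2, 3               # storage capacity and last period


def train(n_iter=5000, epsilon=0.3, seed=0):
    """Forward-pass approximate dynamic programming on post-decision states."""
    rng = random.Random(seed)
    # Vbar[t][Rx] approximates V^x_t(S^x_t); Vbar[T] stays 0 (end of horizon).
    Vbar = [[0.0] * (R_MAX + 1) for _ in range(T + 1)]
    visits = [[0] * (R_MAX + 1) for _ in range(T + 1)]
    for _ in range(n_iter):
        R, prev_Rx = 0, None
        for t in range(T + 1):
            p = rng.choices(PRICES, WEIGHTS)[0]        # exogenous price
            actions = range(R - R_MAX, R + 1)          # feasible trades
            values = {x: p * x + Vbar[t][R - x] for x in actions}
            v_hat = max(values.values())               # sampled value of S_t
            if prev_Rx is not None:
                # Smooth v_hat into the previous post-decision state's value.
                visits[t - 1][prev_Rx] += 1
                alpha = 1.0 / visits[t - 1][prev_Rx]
                Vbar[t - 1][prev_Rx] += alpha * (v_hat - Vbar[t - 1][prev_Rx])
            if rng.random() < epsilon:
                x = rng.choice(list(actions))          # explore
            else:
                x = max(values, key=values.get)        # exploit
            R = prev_Rx = R - x
    return Vbar


Vbar = train()
print([round(v, 1) for v in Vbar[T - 1]])
```

Because the table is indexed by the post-decision state, each update needs only one sampled price rather than an explicit expectation.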
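  105. Approximate dynamic programming: summary

    • It is a way of modeling multistage decision problems.
    • It handles situations with uncertainty.
    • Because exact dynamic programming cannot be carried out, the value of the future (later stages) is evaluated approximately.
    • Learning, function approximation, and similar techniques can be used to build the approximation.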

