Publication list
Citation key | Mabrouk:2010:LAS |
---|---|
Author | Mahmoud Mabrouk |
Year | 2010 |
School | TU Berlin |
Abstract | This thesis presents a new method to linearize the Bellman equation for a special class of problems and compares the resulting algorithms with state-of-the-art solutions. Reinforcement learning and dynamic programming are presented, and the state-of-the-art algorithms are discussed. The new framework and its mathematical foundations are then introduced. The framework yields a linear solution for the optimal action in both discrete and continuous domains, and a new formulation of the cost-to-go function that replaces the exhaustive search over actions with a linear solution. An online and an offline algorithm are then developed from these results. They are tested against Policy Iteration and Q-Learning on a stochastic variant of the Mountain Car problem. Results show that the new algorithms bring a substantial improvement in both speed and efficiency. Finally, the limitations of the new framework are discussed. |
Publication type | Bachelor Thesis |