reinforcement learning

Trajectory modeling via random utility inverse reinforcement learning

We consider the problem of modeling trajectories of drivers in a road network from the perspective of inverse reinforcement learning. Cars are detected by sensors placed at sparsely distributed points on the street network of a city. As rational …

A reinforcement learning approach to the stochastic cutting stock problem

We propose a formulation of the stochastic cutting stock problem as a discounted infinite-horizon Markov decision process. At each decision epoch, given the current inventory of items, an agent chooses in which patterns to cut objects in stock in …

Application of reinforcement learning to the stochastic cutting stock problem (Aplicação de aprendizado por reforço ao problema de corte de estoque estocástico)

We propose a formulation of the stochastic cutting stock problem as a discounted infinite-horizon Markov decision process. At each decision epoch, one must choose the quantities of items to be cut in anticipation of demand …