Traffic congestion is a challenging problem in everyday life. It reduces average speed and increases total travel time and fuel consumption; in addition, it is a primary cause of accidents and air pollution. Hence the need for an intelligent, reliable traffic control system. The objective of this paper is to reduce overall freeway congestion by controlling multiple metered ramps. Optimal freeway operation can be reached if the freeway density is always kept within a small margin of the critical ratio that yields maximum traffic flow. In this paper, we propose a multi-agent reinforcement learning control system for ramp metering. Our system introduces a new microscopic framework at the network level based on collaborative Markov Decision Process modeling and an associated cooperative Q-learning algorithm. The technique incorporates payoff propagation (the max-plus algorithm) under the coordination graph framework, which is particularly suited for optimal control purposes. The proposed system provides three control designs (fully independent, fully distributed, and centralized) suited to different network architectures. We tested the framework extensively to assess the proposed model of the joint payoff as well as the global payoff. We evaluated it in experiments with heavy traffic flow using the VISSIM traffic simulator. The experimental results show a significant decrease in total travel time and an increase in average speed compared with the base case, while maintaining an optimal traffic flow.