The thermal performance index (TPI), also known as the efficiency parameter, evaluates the heat transfer enhancement obtained for the same pumping power requirement. Designing a heat exchanger that combines efficient heat transfer with minimum pressure drop is a challenging task from an energy-saving point of view. One way to optimize this efficiency parameter is to change the geometric parameters of the heat exchanger. For this purpose, a multi-objective reinforcement-learning-based non-dominated sorting genetic algorithm (NSGA-RL) is employed to find the optimal design of a double pipe heat exchanger (DPHE) with perforated baffles on the annulus side, maximizing the TPI and the Nusselt number while minimizing the Fanning friction factor. Its performance is compared with that of the non-dominated sorting genetic algorithm version II (NSGA-II) and a chaotic non-dominated sorting genetic algorithm (CNSGA). The numerical simulations were carried out using the shear stress transport (SST) k-ω turbulence model. All three multi-objective optimization techniques gave promising results in terms of the number of non-dominated solutions, the Euclidean distance to the origin (the reference point), and the hypervolume indicator. In general, however, the mean values indicate that NSGA-RL outperforms NSGA-II and CNSGA on the three performance indicators for both DPHE cases evaluated. Compared with a DPHE without baffles, the TPI increases by 78% for case 1 and by 108% for case 2. The numerical results show that a DPHE with internal perforated baffles improves heat transfer significantly, with the Nusselt number increasing by about 7.93–8.25 times and the friction factor increasing by 6.5–9.75 times relative to a plain DPHE.
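To make the comparison criteria concrete, the following minimal Python sketch shows how a set of candidate designs could be filtered down to its non-dominated solutions and how the Euclidean distance to the origin could be computed. It is an illustration only, not the implementation used in this work: the function names and the design values are hypothetical, and the distance is computed on raw objective values, whereas in practice the objectives would normally be normalized first.

```python
import math


def to_minimization(tpi, nu, f):
    """Map (TPI, Nu, f) to a pure minimization problem.

    TPI and Nu are maximized, so they are negated; f is already minimized.
    """
    return (-tpi, -nu, f)


def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points):
    """Keep only the points that no other point in the list dominates."""
    return [
        p
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


def distance_to_origin(point):
    """Euclidean distance from an objective vector to the origin."""
    return math.sqrt(sum(x * x for x in point))


# Hypothetical (TPI, Nu, f) values, invented purely for illustration.
designs = [
    (1.2, 80.0, 0.30),
    (1.5, 95.0, 0.45),
    (1.1, 70.0, 0.35),
    (1.5, 95.0, 0.50),
]

objectives = [to_minimization(*d) for d in designs]
front = non_dominated(objectives)
```

In this sketch, a larger number of non-dominated solutions and a smaller distance to the reference point would both count in favor of an algorithm; the hypervolume indicator is omitted for brevity.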