Combining Unmanned Aerial Vehicles (UAVs) with Non-Orthogonal Multiple Access (NOMA) is a promising direction for sustaining the exponentially growing traffic requirements of the forthcoming Sixth Generation (6G) networks. In this paper, we investigate effective Power Allocation (PA) and a Trajectory Planning Algorithm (TPA) for UAV-aided NOMA systems that assist multiple survivors in a post-disaster scenario where ground stations are malfunctioning. Here, the UAV maneuvers to collect data from survivors, who are grouped into multiple clusters within the disaster area, to satisfy their traffic demands. The problem is formulated as a Budgeted Multi-Armed Bandit (BMAB) problem that optimizes the UAV trajectory while minimizing battery consumption. Herein, the UAV is the bandit player, the disaster area clusters are the bandit arms, the sum rate of each cluster is the payoff, and the UAV energy consumption is the budget. Because practical challenges may arise in real-world scenarios, two Upper Confidence Bound (UCB) BMAB schemes, namely BUCB1 and BUCB2, are leveraged to solve this problem. Simulation results confirm the superior performance of the proposed BMAB solution against benchmark solutions for UAV-aided NOMA communication. Notably, the BMAB-NOMA solution achieves a 60% enhancement in the total number of assisted survivors, an 80% improvement in convergence speed, and considerable energy savings compared to UAV-OMA.
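To make the bandit mapping concrete, the following is a minimal sketch of a generic budget-limited UCB loop. It is an illustration under stated assumptions, not the BUCB1 or BUCB2 schemes themselves: the index (optimistic reward estimate divided by mean cost), the toy Gaussian environment, and the names `budgeted_ucb` and `make_toy_env` are all hypothetical. Each arm stands for a cluster, the reward for a normalized cluster sum rate, and the cost for the UAV energy spent serving that cluster.

```python
import math
import random


def budgeted_ucb(pull, n_arms, budget, seed=0):
    """Generic budget-limited UCB loop (illustrative only; not BUCB1/BUCB2).

    pull(arm, rng) must return (reward, cost) with cost > 0.
    Returns the sequence of served arms, per-arm counts, and leftover budget.
    """
    rng = random.Random(seed)
    counts = [0] * n_arms
    reward_sum = [0.0] * n_arms
    cost_sum = [0.0] * n_arms
    history = []
    remaining = budget
    t = 0
    while remaining > 0:
        t += 1
        if t <= n_arms:
            # Visit every cluster once to initialize the estimates.
            arm = t - 1
        else:
            def index(k):
                mean_reward = reward_sum[k] / counts[k]
                mean_cost = cost_sum[k] / counts[k]
                bonus = math.sqrt(2.0 * math.log(t) / counts[k])
                # Assumed index: optimistic payoff per unit of energy.
                return (mean_reward + bonus) / mean_cost
            arm = max(range(n_arms), key=index)
        reward, cost = pull(arm, rng)
        if cost > remaining:
            break  # not enough energy left for this visit
        remaining -= cost
        counts[arm] += 1
        reward_sum[arm] += reward
        cost_sum[arm] += cost
        history.append(arm)
    return history, counts, remaining


def make_toy_env(mean_rates, mean_energy):
    """Hypothetical environment: noisy normalized sum rate and energy cost."""
    def pull(arm, rng):
        reward = min(1.0, max(0.0, rng.gauss(mean_rates[arm], 0.05)))
        cost = max(0.1, rng.gauss(mean_energy[arm], 0.05))
        return reward, cost
    return pull


if __name__ == "__main__":
    env = make_toy_env([0.2, 0.8, 0.5], [1.0, 0.8, 1.2])
    _, visits, left = budgeted_ucb(env, n_arms=3, budget=200.0, seed=1)
    print("visits per cluster:", visits, "leftover energy:", round(left, 2))
```

In this toy setting the loop stops once the remaining energy cannot cover the next visit, which mirrors the role of the UAV battery as the bandit budget.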