COBEM 2023

27th International Congress of Mechanical Engineering

Deep reinforcement learning-based algorithm to replace the PID for controlling satellite axis pointing

Submission Author: Gabriel Goes Aragão Santana, AL
Co-Authors: Gabriel Goes Aragão Santana, Ronan Arraes Jardim Chagas
Presenter: Gabriel Goes Aragão Santana

doi://10.26678/ABCM.COBEM2023.COB2023-0126

Abstract

Attitude is the orientation of a solid body relative to a chosen reference frame. Satellites must be able to control their attitude throughout their lifetimes to carry out their missions. Attitude control is essential to these missions, which require pointing instruments and antennae towards positions on Earth and in space, accurate orbital maneuvers, and proper thermal control. Reaction wheels driven by electric motors are frequently employed to store and exchange momentum with the satellite body, providing a way to control the satellite attitude. The torque applied to the wheels must be governed by some control law, usually a proportional-derivative (PD) controller. However, tuning the controller gains is often laborious and time-consuming, carried out mainly by trial and error. Advancing past research at INPE, this work presents an alternative approach to controller development that employs artificial intelligence techniques. In Reinforcement Learning (RL), an agent learns how to behave in an environment by means of a scalar reward signal. Control coupled with RL promises some unique and powerful capabilities: improved performance in nominal cases, adaptability to varying conditions, fault tolerance in unusual scenarios, attitude control for future capture missions, and a general framework for controller design capable of controlling satellites with masses from a few kilograms up to several tons. This generality would provide designers with a quick and robust solution, reducing mission cost. Recent developments in the RL field have led to novel and efficient algorithms able to tackle continuous control problems. Some of these algorithms, namely Deep Deterministic Policy Gradient (DDPG), Twin-Delayed DDPG (TD3), and Soft Actor-Critic (SAC), are presented and applied to the current problem. Simple heuristics that inject prior knowledge into the agent and thereby enhance learning are also discussed.
All implementations and training were run in a simulated environment using the parameters of the Amazonia-1 satellite, which was developed by INPE and successfully launched in 2021. We compared the new controllers with the PD controller employed in Amazonia-1. The RL controllers outperformed their PD counterpart in most scenarios, suggesting their viability and hinting at a wide range of possible uses.
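
For readers unfamiliar with the PD baseline mentioned in the abstract, the following is a minimal single-axis sketch of such a control law. It is not the Amazonia-1 controller: the function names, the inertia, the gains, and the time step are all hypothetical values chosen for illustration, and the commanded torque is assumed to act directly on the satellite body (wheel dynamics and saturation are ignored).

```python
def pd_torque(angle_error, rate, kp, kd):
    """PD law: torque opposing both the pointing error and the body rate."""
    return -kp * angle_error - kd * rate


def simulate_single_axis(theta0, inertia=10.0, kp=4.0, kd=12.0,
                         dt=0.01, steps=3000):
    """Integrate inertia * theta_ddot = u for one axis (semi-implicit Euler).

    All defaults are hypothetical illustration values, not taken from the
    paper: inertia in kg*m^2, kp in N*m/rad, kd in N*m*s/rad, dt in seconds.
    """
    theta, omega = theta0, 0.0  # pointing error (rad) and body rate (rad/s)
    for _ in range(steps):
        u = pd_torque(theta, omega, kp, kd)
        omega += (u / inertia) * dt  # update the rate first
        theta += omega * dt          # then the angle, using the new rate
    return theta, omega


if __name__ == "__main__":
    final_theta, final_omega = simulate_single_axis(theta0=0.1)
    print(f"final error: {final_theta:.2e} rad, final rate: {final_omega:.2e} rad/s")
```

The two gains `kp` and `kd` are the parameters whose manual tuning the abstract describes as laborious; an RL agent would replace `pd_torque` with a learned policy that maps the observed state to a torque command.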

Keywords

Attitude control, Satellite, Reinforcement learning, Intelligent control
