While there is ample research on hardware design and reconfiguration control for modular self-reconfigurable satellites, relatively few reconfiguration planning algorithms have been developed, especially algorithms suited to real-world reconfiguration. Decentralized path planning, in which each module makes decisions using only partial observations, is an important problem for real-world tasks. This article presents partially observable self-reconfiguration path planning, which addresses the reconfiguration path planning problem for a single module using partial observations while aiming to maximize policy learning efficiency. An end-to-end algorithm is proposed that combines a recurrent Q-learning algorithm with a deep neural network, in which a long short-term memory (LSTM) network retains useful features from historical observations. Moreover, a 3-D convolutional neural network is used to automatically extract high-level features from the observation data and is shown to significantly increase learning efficiency. Experiments performed on a test rig of an electromagnetic self-reconfigurable satellite verified the effectiveness of the proposed algorithm.