To address the inability of existing grasp detection methods to effectively perceive sparse and weak features, which degrades robot grasp detection performance in low-light environments, a robotic grasp detection method integrating spatial-Fourier domain information was proposed for low-light environments. Firstly, the proposed model used an encoder-decoder architecture as its backbone and performed spatial-Fourier domain feature extraction during the fusion of deep and shallow features within the network. Specifically, in the spatial domain, global contextual information was captured by strip convolutions applied in the horizontal and vertical directions, enabling the extraction of information critical to the grasp detection task; in the Fourier domain, image details and texture features were restored by independently modulating the amplitude and phase components. Furthermore, an R-CoA (Row-Column Attention) module was incorporated to effectively balance global and local image information, while encoding the relative positional relationships of image rows and columns to emphasize positional information relevant to grasping tasks. Finally, validation on the low-light Cornell, low-light Jacquard, and the constructed low-light C-Cornell datasets demonstrates that the proposed method achieves the highest accuracies of 96.62%, 92.01%, and 95.50%, respectively. Specifically, on the low-light Cornell dataset (Gaussian noise and
), the proposed method outperforms GR-ConvNetv2 (Generative Residual Convolutional Neural Network v2) and SE-ResUNet (Squeeze-and-Excitation ResUNet) in accuracy by 2.24 percentage points and 1.12 percentage points, respectively. The proposed method can effectively improve the robustness and accuracy of grasp detection in low-light environments, providing support for robotic grasping tasks under insufficient illumination conditions.
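To make the described spatial-Fourier feature extraction concrete, the following PyTorch sketch shows one plausible form of such a fusion block: depthwise strip convolutions along the horizontal and vertical directions for the spatial branch, and independent modulation of the FFT amplitude and phase for the Fourier branch. The class name SpatialFourierBlock, the kernel size, and the 1×1 modulation convolutions are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class SpatialFourierBlock(nn.Module):
    """Minimal sketch of a spatial-Fourier fusion block (assumed design)."""

    def __init__(self, channels: int, kernel: int = 7):
        super().__init__()
        # Spatial branch: horizontal and vertical depthwise strip convolutions
        self.strip_h = nn.Conv2d(channels, channels, (1, kernel),
                                 padding=(0, kernel // 2), groups=channels)
        self.strip_v = nn.Conv2d(channels, channels, (kernel, 1),
                                 padding=(kernel // 2, 0), groups=channels)
        # Fourier branch: 1x1 convolutions modulating amplitude and phase separately
        self.amp_conv = nn.Conv2d(channels, channels, 1)
        self.pha_conv = nn.Conv2d(channels, channels, 1)
        # Fusion of the two branches back to the original channel count
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Spatial domain: strip convolutions capture long-range row/column context
        spatial = self.strip_h(x) + self.strip_v(x)

        # Fourier domain: transform, modulate amplitude and phase independently,
        # then rebuild the complex spectrum and transform back
        freq = torch.fft.rfft2(x, norm="ortho")
        amp = self.amp_conv(torch.abs(freq))
        pha = self.pha_conv(torch.angle(freq))
        freq_mod = torch.polar(amp, pha)
        fourier = torch.fft.irfft2(freq_mod, s=x.shape[-2:], norm="ortho")

        # Concatenate and fuse the spatial and Fourier features
        return self.fuse(torch.cat([spatial, fourier], dim=1))


# Usage example on a hypothetical decoder feature map
block = SpatialFourierBlock(64)
features = torch.randn(1, 64, 56, 56)
out = block(features)  # shape preserved: (1, 64, 56, 56)
```

In this sketch the fused output would replace the plain skip-connection features during the deep-shallow fusion stage of the encoder-decoder; the exact placement and channel widths in the proposed network are not specified by the abstract.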