The diagnosis of major depressive disorder relies predominantly on subjective methods, including physician consultations and scale assessments, which may lead to misdiagnosis. Electroencephalography (EEG) offers high temporal resolution, low cost, ease of setup, and non-invasiveness, making it a potential quantitative measurement tool for psychiatric disorders, including depressive disorder. Recently, deep learning algorithms have been widely applied to EEG signals, notably for the diagnosis and classification of depressive disorder. Because significant redundancy is observed when EEG signals are processed with a self-attention mechanism, a convolutional neural network using a Probabilistic sparse Self-Attention mechanism (PSANet) is proposed. First, a limited number of pivotal attention points are selected in the self-attention mechanism according to a sampling factor, which reduces the high computational cost of self-attention and allows it to be applied to long EEG data sequences. Second, EEG data are combined with patients' physiological scales for a comprehensive diagnosis. Experiments were conducted on a dataset comprising both patients with depressive disorder and a healthy control group. The results show that PSANet achieves higher classification accuracy with fewer parameters than alternative methods such as EEGNet.
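The selection of pivotal attention points by a sampling factor can be illustrated with a minimal sketch. It assumes a scheme in the style of ProbSparse attention, which may differ from the mechanism actually used in PSANet: each query is scored by the gap between its maximum and mean attention score, only the top `u = ceil(c * ln L)` queries attend over the keys, and the remaining queries fall back to the mean of the values. The function name `sparse_self_attention`, the parameter `sampling_factor`, and the random input are hypothetical. For readability the full score matrix is computed here, so the sketch shows the selection rule only and not the reduction in computational cost.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_self_attention(Q, K, V, sampling_factor=1.0):
    """Illustrative sparse self-attention (assumed ProbSparse-style rule).

    Q, K, V are lists of L vectors. Returns the output vectors and the
    sorted indices of the queries that were selected as "active".
    """
    L, d = len(Q), len(Q[0])
    dv = len(V[0])
    scale = 1.0 / math.sqrt(d)

    # scores[i][j]: scaled dot product between query i and key j.
    scores = [
        [scale * sum(q * k for q, k in zip(Q[i], K[j])) for j in range(L)]
        for i in range(L)
    ]

    # Sparsity measure per query: max score minus mean score.
    # A large gap means the query's attention is far from uniform.
    sparsity = [max(row) - sum(row) / L for row in scores]

    # Number of active queries, controlled by the sampling factor.
    u = min(L, max(1, math.ceil(sampling_factor * math.log(L))))
    active = sorted(range(L), key=lambda i: sparsity[i], reverse=True)[:u]

    # Inactive queries receive the mean of V (equivalent to uniform attention).
    mean_v = [sum(v[c] for v in V) / L for c in range(dv)]
    out = [list(mean_v) for _ in range(L)]

    # Active queries receive ordinary softmax attention.
    for i in active:
        w = softmax(scores[i])
        out[i] = [sum(w[j] * V[j][c] for j in range(L)) for c in range(dv)]

    return out, sorted(active)


# Hypothetical input: 16 time steps with 4 features each.
random.seed(0)
X = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(16)]
out, active = sparse_self_attention(X, X, X, sampling_factor=1.0)
```

With 16 time steps and a sampling factor of 1, `ceil(ln 16) = 3` queries are treated as active. Raising `sampling_factor` moves the result toward full self-attention.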