Facial features carry rich information and are central to facial attribute and expression analysis tasks, but their diversity and complexity make these tasks difficult. To address this issue, a model for Facial Attribute estimation and Expression Recognition based on a contextual channel attention mechanism (FAER) was proposed from the perspective of fine-grained facial features. Firstly, a local feature encoding backbone network was constructed on the basis of ConvNeXt; its effectiveness in encoding local features was exploited to adequately represent the differences among local facial features. Secondly, a Contextual Channel Attention (CC Attention) mechanism was introduced, which dynamically and adaptively adjusts the weights of feature channels so that both global and local information of deep features is represented, thereby compensating for the backbone network's limited ability to encode global features. Finally, different classification strategies were designed: for the Facial Attribute Estimation (FAE) and Facial Expression Recognition (FER) tasks, different combinations of loss functions were employed to encourage the model to learn more fine-grained facial features. Experimental results show that the proposed model achieves an average accuracy of 91.87% on the facial attribute dataset CelebA (CelebFaces Attributes), surpassing the second-best model SwinFace (Swin transformer for Face) by 0.55 percentage points, and achieves accuracies of 91.75% and 66.66% on the facial expression datasets RAF-DB and AffectNet respectively, surpassing the second-best model TransFER (Transformers for Facial Expression Recognition) by 0.84 and 0.43 percentage points respectively.
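
The abstract does not detail how CC Attention re-weights channels; the following PyTorch-style sketch is only one plausible illustration of a contextual channel attention block that rescales backbone feature channels using globally pooled context. The module name ContextualChannelAttention, the reduction parameter, and the placement after a ConvNeXt stage are assumptions for illustration, not the authors' implementation.

    import torch
    import torch.nn as nn

    class ContextualChannelAttention(nn.Module):
        """Illustrative sketch: re-weight feature channels using global context."""
        def __init__(self, channels: int, reduction: int = 16):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)          # global context: one value per channel
            self.fc = nn.Sequential(
                nn.Conv2d(channels, channels // reduction, kernel_size=1),
                nn.GELU(),
                nn.Conv2d(channels // reduction, channels, kernel_size=1),
                nn.Sigmoid(),                            # per-channel weights in (0, 1)
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (B, C, H, W) deep feature map from the backbone
            w = self.fc(self.pool(x))                    # dynamic, input-dependent channel weights
            return x * w                                 # local features rescaled by global context

    # Usage sketch (channel count 768 is a hypothetical ConvNeXt stage width):
    # feats = convnext_stage(images)                     # (B, 768, H, W)
    # feats = ContextualChannelAttention(768)(feats)

In this reading, the squeeze-and-excitation-like gating supplies the global, input-adaptive weighting described in the abstract, while the ConvNeXt backbone continues to provide the local feature encoding.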