Speaker Identification (SI) in novels aims to determine the speaker of a quotation from its context. This task greatly helps in assigning appropriate voices to different characters in audiobook production. However, existing methods mainly select the context of a quotation with a fixed window size, which is inflexible and may include redundant segments, making it difficult for the model to capture useful information. Besides, because the number of quotations and the writing style differ significantly between novels, a small number of labeled samples cannot enable the model to generalize fully, and labeling datasets is expensive. To solve the above problems, a novel speaker identification framework that integrates narrative units and reliable labels was proposed. Firstly, a Narrative Unit-based Context Selection (NUCS) method was used to select a context of suitable length so that the model focuses on the segment closest to the quotation attribution. Secondly, a Speaker Scoring Network (SSN) was constructed with the generated context as input. In addition, self-training was introduced, and a Reliable Pseudo Label Selection (RPLS) algorithm was designed to screen out higher-quality, more reliable pseudo-labeled samples, compensating to some extent for the lack of labeled samples. Finally, a Chinese Novel Speaker Identification corpus (CNSI) containing 11 Chinese novels was built and labeled. To evaluate the proposed framework, experiments were conducted on two public datasets and the self-built dataset. The results show that the proposed framework is superior to methods such as CSN (Candidate Scoring Network), E2E_SI and ChatGPT-3.5.
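The abstract does not specify how RPLS decides that a pseudo label is reliable. As a generic stand-in only, the sketch below shows a common self-training filter: a pseudo label is kept when the top speaker score is high and clearly ahead of the runner-up. The function name `select_pseudo_labels`, the two thresholds and the toy scores are all assumptions for illustration, not the paper's algorithm.

```python
def select_pseudo_labels(predictions, min_conf=0.9, min_margin=0.3):
    """Keep (sample_id, speaker) pairs whose top score is confident and well separated.

    predictions: list of (sample_id, {speaker_name: score}) with scores in [0, 1].
    Both thresholds are hypothetical values chosen for this example.
    """
    selected = []
    for sample_id, scores in predictions:
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        best_speaker, best_score = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
        if best_score >= min_conf and best_score - runner_up >= min_margin:
            selected.append((sample_id, best_speaker))
    return selected


# Toy scores for three unlabeled quotations (made up for illustration).
toy_predictions = [
    ("q1", {"Alice": 0.95, "Bob": 0.05}),   # confident and well separated
    ("q2", {"Alice": 0.55, "Bob": 0.45}),   # not confident enough
    ("q3", {"Bob": 0.92, "Carol": 0.70}),   # confident but margin too small
]
print(select_pseudo_labels(toy_predictions))
```

In a self-training loop, the selected pairs would be added to the labeled set before the scoring model is retrained.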
To address the image over-smoothing and block artifacts that the Total Variation (TV) minimization method tends to produce in Low-Dose Computed Tomography (LDCT) image reconstruction, an LDCT image reconstruction method based on low-rank and TV joint regularization was proposed to improve the visual quality of LDCT reconstructed images. Firstly, an image reconstruction model based on low-rank and TV joint regularization was established, which in theory yields more accurate and natural reconstruction results. Secondly, a low-rank prior with the non-local self-similarity property was introduced to overcome the limitations of using TV minimization alone. Finally, the Chambolle-Pock (CP) algorithm was used to solve the model, improving solution efficiency while ensuring that the model is solved effectively. The effectiveness of the proposed method was verified under three different LDCT scanning conditions. Experimental results on the Mayo dataset show that, compared with the PWLS-LDMM (Penalized Weighted Least-Squares based on Low-Dimensional Manifold) method, the NOWNUNM (NOnlocal Weighted NUclear Norm Minimization) method and the CP method, the proposed method increases the Visual Information Fidelity (VIF) by 28.39%, 8.30% and 2.93%, respectively, at 25% dose; by 29.96%, 13.83% and 4.53%, respectively, at 15% dose; and by 30.22%, 17.10% and 7.66%, respectively, at 10% dose. The proposed method therefore retains more detailed texture information while removing noise and stripe artifacts, which verifies its better noise and artifact suppression capability.
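To make the TV term concrete, the following minimal sketch computes the anisotropic total variation of a small image stored as nested lists. It shows why minimizing TV favors piecewise-constant images. The function name `anisotropic_tv` and the toy images are assumptions for illustration; the low-rank term and the CP solver of the proposed method are not shown.

```python
def anisotropic_tv(image):
    """Sum of absolute differences between horizontally and vertically adjacent pixels."""
    rows, cols = len(image), len(image[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # horizontal neighbour
                total += abs(image[r][c + 1] - image[r][c])
            if r + 1 < rows:  # vertical neighbour
                total += abs(image[r + 1][c] - image[r][c])
    return total


flat = [[1, 1], [1, 1]]    # constant image
edge = [[0, 1], [0, 1]]    # one clean vertical edge
noisy = [[0, 1], [1, 0]]   # checkerboard, noise-like pattern

print(anisotropic_tv(flat), anisotropic_tv(edge), anisotropic_tv(noisy))
```

The constant image has zero TV and the checkerboard has the largest, so a pure TV penalty suppresses noise but also pushes fine texture toward flat patches, which is the over-smoothing the joint regularization is meant to counter.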
To address the difficulty of monitoring and controlling enterprise emissions, a Vertical Federated Learning Enterprise Emission Prediction (VFL-EEP) model that integrates electricity data was proposed under the premise of secure data sharing and privacy protection. Firstly, within the framework of Vertical Federated Learning (VFL), the logistic regression model was enhanced to separate data usage from model training without leaking the monitoring data of electricity and environmental protection enterprises. Then, the logistic regression algorithm was combined with Paillier encryption to secure the transmission of model parameters, thereby effectively solving the problem of insecure communication among participants in VFL. Finally, through experiments on simulated data, the pollution prediction results of the proposed model were compared with those of the centralized logistic regression model. The results show that the proposed model integrates electricity data under the premise of privacy security and improves accuracy, recall, precision and F1 value by 8.92%, 7.62%, 3.95% and 11.86%, respectively, effectively balancing privacy protection and model performance.
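The property that makes Paillier encryption useful here is additive homomorphism: multiplying two ciphertexts yields an encryption of the sum of the plaintexts, so a party can aggregate encrypted intermediate values without seeing them. The sketch below is a toy textbook Paillier scheme with small hard-coded primes. It is insecure and for illustration only. All function names and parameters are assumptions, and the abstract does not say how VFL-EEP encodes or exchanges values (real-valued gradients would need a fixed-point encoding that is not shown).

```python
import math
import random


def keygen(p=1789, q=1861):
    """Toy key generation with tiny primes; returns (public n, private (lam, mu))."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid shortcut because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, message):
    """Encrypt an integer 0 <= message < n with fresh randomness r."""
    n_sq = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(n, private_key, ciphertext):
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Ciphertext of (m1 + m2) mod n, computed without decrypting."""
    return (c1 * c2) % (n * n)


def scale_encrypted(n, c, k):
    """Ciphertext of (k * m) mod n for a plaintext integer k."""
    return pow(c, k, n * n)


n, private_key = keygen()
total = add_encrypted(n, encrypt(n, 12), encrypt(n, 30))
print(decrypt(n, private_key, total))
```

Only the holder of the private key can read the aggregated value, which is how parameter exchange between participants can stay confidential.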
Accurate prediction of port traffic flow is challenging because of its stochastic uncertainty and temporal non-stationarity. To improve the accuracy of port traffic flow prediction, a model based on a knowledge graph and a spatio-temporal diffusion graph convolution network, named KG-DGCN-GRU, was proposed, taking into account external disturbances such as meteorological conditions and the opening and closing status of the port-adjacent highway. The factors related to the port traffic network were represented by a knowledge graph, and the semantic information of the various external factors was learned from the port knowledge graph with a knowledge representation method. A Diffusion Graph Convolutional Network (DGCN) and a Gated Recurrent Unit (GRU) were then used to extract the spatio-temporal dependency features of the port traffic flow. Experimental results on the Tianjin Port traffic dataset show that KG-DGCN-GRU effectively improves prediction accuracy through the knowledge graph and the diffusion graph convolutional network: under single-step prediction (15 min), the Root Mean Squared Error (RMSE) is reduced by 4.85% and 7.04%, and the Mean Absolute Error (MAE) by 5.80% and 8.17%, compared with the Temporal Graph Convolutional Network (T-GCN) and the Diffusion Convolutional Recurrent Neural Network (DCRNN), respectively.
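As a rough picture of what a diffusion graph convolution computes, the sketch below propagates a single-channel node signal along random-walk steps and sums the results with per-step weights, i.e. it evaluates sum over k of theta_k * P^k * x with P = D^-1 A. This is a simplified, one-direction form under stated assumptions: the weights `thetas` are fixed here rather than learned, only one feature channel is used, and the function names are illustrative. The exact layer used in KG-DGCN-GRU is not given in the abstract.

```python
def transition_matrix(adj):
    """Row-normalize a weighted adjacency matrix into a random-walk matrix D^-1 A."""
    result = []
    for row in adj:
        degree = sum(row)
        result.append([w / degree if degree else 0.0 for w in row])
    return result


def mat_vec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def diffusion_conv(adj, signal, thetas):
    """Return sum_k thetas[k] * P^k * signal for one feature channel."""
    p = transition_matrix(adj)
    out = [0.0] * len(signal)
    current = list(signal)  # P^0 * signal
    for theta in thetas:
        out = [o + theta * c for o, c in zip(out, current)]
        current = mat_vec(p, current)  # advance one diffusion step
    return out


# Two road segments connected to each other; traffic observed only on the first.
adjacency = [[0, 1], [1, 0]]
print(diffusion_conv(adjacency, [1.0, 0.0], thetas=[1.0, 0.5]))
```

Each additional step lets information from nodes one hop further away reach a node, which is how spatial dependencies along the road network are captured before a recurrent unit such as a GRU models the temporal part.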
Deploying the YOLOv8L model on edge devices for road crack detection can achieve high accuracy, but it is difficult to guarantee real-time detection. To solve this problem, a target detection algorithm based on the improved YOLOv8 model that can be deployed on the edge computing device Jetson AGX Xavier was proposed. First, the Faster Block structure was designed using partial convolution to replace the Bottleneck structure in the YOLOv8 C2f module, and the improved C2f module was recorded as C2f-Faster; second, an SE (Squeeze-and-Excitation) channel attention layer was connected after each C2f-Faster module in the YOLOv8 backbone network to further improve the detection accuracy. Experimental results on the open source road damage dataset RDD20 (Road Damage Detection 20) show that the average F1 score of the proposed method is 0.573, the number of detection Frames Per Second (FPS) is 47, and the model size is 55.5 MB. Compared with the SOTA (State-Of-The-Art) model of GRDDC2020 (Global Road Damage Detection Challenge 2020), the F1 score is increased by 0.8 percentage points, the FPS is increased by 291.7%, and the model size is reduced by 41.8%, which realizes the real-time and accurate detection of road cracks on edge devices.
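For readers unfamiliar with the SE layer, the sketch below shows its three steps on plain Python lists: squeeze (global average pooling per channel), excitation (a small bottleneck network ending in a sigmoid) and scale (reweighting each channel). It is a framework-free illustration under simplifying assumptions: biases are omitted, the weights `w1` and `w2` are hand-picked rather than learned, and the function name `se_scale` is hypothetical. It is not the layer configuration used in the paper.

```python
import math


def se_scale(feature_maps, w1, w2):
    """Apply a bias-free Squeeze-and-Excitation step.

    feature_maps: list of C channels, each a 2D list of floats.
    w1: hidden x C weight matrix, w2: C x hidden weight matrix.
    """
    # Squeeze: one average value per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]
    # Excitation: FC -> ReLU -> FC -> sigmoid gives one gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    return [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_maps, gates)
    ]


maps = [[[2.0, 4.0]], [[6.0, 8.0]]]  # two channels of size 1 x 2
w1 = [[1.0, 1.0]]                     # reduce 2 channels to 1 hidden unit
w2 = [[0.0], [0.0]]                   # zero weights make every gate sigmoid(0) = 0.5
print(se_scale(maps, w1, w2))
```

With learned weights the gates differ per channel, so informative channels are emphasized at the cost of only two small fully connected layers.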
Panoramic videos have attracted wide attention due to their unique immersive and interactive experience. The high bandwidth and low delay required for wireless streaming of panoramic videos have brought challenges to existing network streaming systems. Tile-based viewport adaptive streaming can effectively alleviate the streaming pressure brought by panoramic video, and has become the current mainstream scheme and hot research topic. By analyzing the research status and development trend of tile-based viewport adaptive streaming, the two important modules of this streaming scheme, namely viewport prediction and bit rate allocation, were discussed, and the methods in relevant fields were summarized from different perspectives. Firstly, based on the panoramic video streaming framework, the relevant technologies were clarified. Secondly, the user experience quality indicators to evaluate the performance of the streaming system were introduced from the subjective and objective dimensions. Then, the classic research methods were summarized from the aspects of viewport prediction and bit rate allocation. Finally, the future development trend of panoramic video streaming was discussed based on the current research status.
To address the issue that traditional Sequential Pattern Mining (SPM) does not consider pattern repetition and ignores the effects of utility (unit price or profit) and pattern length on user interest, a Top-k One-off high average Utility sequential Pattern mining (TOUP) algorithm was proposed. The TOUP algorithm includes two core steps: average utility calculation and candidate pattern generation. Firstly, a CSP (Calculation Support of Pattern) algorithm based on the occurrence position of each item and the item repetition relation array was proposed to calculate pattern support, thereby allowing the average utility of patterns to be calculated quickly. Secondly, candidate patterns were generated by itemset extension and sequence extension, and a maximum average utility upper bound was proposed, on the basis of which candidate patterns were pruned effectively. Experimental results on five real datasets and one synthetic dataset show that, compared with the TOUP-dfs and HAOP-ms algorithms, the TOUP algorithm reduces the number of candidate patterns by 38.5% to 99.8% and 0.9% to 77.6%, respectively, and decreases the running time by 33.6% to 97.1% and 57.9% to 97.2%, respectively. Therefore, TOUP performs better and mines patterns of interest to users more efficiently.
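The top-k bookkeeping can be pictured with a min-heap whose smallest element acts as a rising threshold. The sketch below assumes, for illustration only, that the average utility of a pattern is its support times the sum of its items' unit utilities, divided by the pattern length. The abstract does not state the exact definition, and the one-off support counting (CSP), candidate generation and upper-bound pruning are not shown. All names are hypothetical.

```python
import heapq


def average_utility(pattern, unit_utility, support):
    """Assumed definition: support * sum of unit utilities / pattern length."""
    return support * sum(unit_utility[item] for item in pattern) / len(pattern)


def top_k_patterns(candidates, unit_utility, k):
    """candidates: iterable of (pattern tuple, support). Returns the k best, highest first."""
    heap = []  # min-heap of (score, pattern); heap[0] is the current threshold
    for pattern, support in candidates:
        score = average_utility(pattern, unit_utility, support)
        if len(heap) < k:
            heapq.heappush(heap, (score, pattern))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, pattern))
    return sorted(heap, reverse=True)


unit_utility = {"a": 4, "b": 2, "c": 1}   # toy unit profits
candidates = [
    (("a",), 3),        # 3 * 4 / 1 = 12.0
    (("a", "b"), 2),    # 2 * 6 / 2 = 6.0
    (("b", "c"), 5),    # 5 * 3 / 2 = 7.5
    (("c",), 4),        # 4 * 1 / 1 = 4.0
]
print(top_k_patterns(candidates, unit_utility, k=2))
```

Once the heap is full, any candidate whose upper bound falls below `heap[0][0]` can be discarded without further extension, which is the role an average utility upper bound plays in pruning.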
To address the fire hazard caused by electric bicycles and gas tanks being taken into elevators, a method for detecting dangerous goods in elevator scenes based on an improved attention mechanism was proposed. With YOLOX-s as the baseline model, firstly, depthwise separable convolution was introduced into the enhanced feature extraction network to replace standard convolution, which improved the inference speed of the model. Secondly, a mixed-domain Efficient Convolutional Block Attention Module (ECBAM) was proposed and embedded into the backbone feature extraction network. In the channel attention part of ECBAM, the two fully connected layers were replaced by a one-dimensional convolution, which not only reduced the complexity of the Convolutional Block Attention Module (CBAM) but also improved detection precision. Finally, a multi-frame collaboration algorithm was proposed to reduce false alarms of dangerous goods entering the elevator by combining the detection results of multiple images. Experimental results show that, compared with YOLOX-s, the improved model increases the mean Average Precision (mAP) by 1.05 percentage points, reduces the floating-point computational cost by 34.1% and reduces the model size by 42.8%. The improved model reduces false alarms in practical applications and meets the precision and speed requirements of dangerous goods detection in elevator scenes.
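One simple way to combine the detection results of multiple frames is a sliding-window vote: an alarm is raised only when the detector fires in at least a minimum number of the most recent frames. The sketch below illustrates that idea. The class name, the window size and the hit threshold are assumptions, and the paper's actual multi-frame collaboration rule may differ.

```python
from collections import deque


class MultiFrameAlarm:
    """Raise an alarm only if enough of the last `window` frames contain a detection."""

    def __init__(self, window=5, min_hits=3):
        self.history = deque(maxlen=window)  # oldest frame drops out automatically
        self.min_hits = min_hits

    def update(self, detected):
        """Record one frame's detection result and return the current alarm state."""
        self.history.append(bool(detected))
        return sum(self.history) >= self.min_hits


alarm = MultiFrameAlarm(window=5, min_hits=3)
frames = [True, False, True, False, True, False, False, False]
states = [alarm.update(frame) for frame in frames]
print(states)
```

A single spurious detection never triggers the alarm, while an object that stays in view for several frames does, at the cost of a short delay.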
Betweenness centrality is a common metric for evaluating the importance of nodes in a graph. However, the update efficiency of betweenness centrality in large-scale dynamic graphs is not high enough to meet the application requirements. With the development of multi-core technology, algorithm parallelization has become one of the effective ways to solve this problem. Therefore, a Parallel Algorithm of Betweenness centrality for dynamic networks (PAB) was proposed. Firstly, the time cost of redundant point pairs was reduced through operations such as community filtering, equidistant pruning and classification screening. Then, the determinacy of the algorithm was analyzed and processed to realize parallelization. Comparison experiments were conducted on real datasets and synthetic datasets, and the results show that the update efficiency of PAB is 4 times that of the latest batch-iCENTRAL algorithm on average when adding edges. It can be seen that the proposed algorithm can improve the update efficiency of betweenness centrality in dynamic networks effectively.
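For reference, the static baseline that dynamic update algorithms try to avoid recomputing is Brandes' algorithm, which obtains betweenness centrality with one breadth-first search per source node followed by a backward accumulation of dependencies. The sketch below implements it for undirected, unweighted graphs. It is not the PAB algorithm: community filtering, pruning, incremental updates and parallelization are not shown, and the function name is an assumption.

```python
from collections import deque


def betweenness(graph):
    """Brandes' algorithm. graph: dict mapping node -> list of neighbours (undirected)."""
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        order = []                               # nodes in non-decreasing distance
        preds = {v: [] for v in graph}           # predecessors on shortest paths
        sigma = {v: 0 for v in graph}            # number of shortest paths from source
        dist = {v: -1 for v in graph}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:                  # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:       # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):                # accumulate from farthest to nearest
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: score / 2 for v, score in centrality.items()}


path = {0: [1], 1: [0, 2], 2: [1]}
print(betweenness(path))
```

Running this from scratch after every edge insertion costs a full pass over all sources, which is the cost that incremental algorithms reduce by restricting work to the node pairs whose shortest paths can actually change.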
The parity blocks of a Maximum-Distance-Separable (MDS) code are all global parity blocks. As the storage system expands, the reconstruction chain grows longer and reconstruction performance gradually decreases. To address these problems, a new type of Non-Maximum-Distance-Separable (Non-MDS) code, the local redundant hybrid code Code-LM(s,c), was proposed. Firstly, two types of local parity blocks, the horizontal parity block in the strip-set and the horizontal-diagonal parity block, were added to every strip-set to shorten the reconstruction chain, and the parity layout of the local redundant hybrid code was designed. Then, four reconstruction formulations for lost data blocks were designed according to the generation rules of the parity blocks and the common blocks shared by the reconstruction chains of different data blocks. Finally, double-disk failures were divided into three situations depending on the distance between the strip-sets where the failed disks are located, and the corresponding reconstruction methods were designed. Theoretical analysis and experimental results show that, at the same storage scale, compared with RDP (Row-Diagonal Parity), the reconstruction time of Code-LM(s,c) for single-disk failure and double-disk failure is reduced by 84% and 77%, respectively; compared with V2-Code, the reconstruction time of Code-LM(s,c) for single-disk failure and double-disk failure is reduced by 67% and 73%, respectively. Therefore, the local redundant hybrid code supports fast recovery from failed disks and improves the reliability of the storage system.
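The link between chain length and reconstruction cost can be seen with a single XOR parity block: rebuilding one lost block requires reading every other block in its chain. The sketch below shows generic XOR parity only, not the Code-LM(s,c) layout. The function names and the toy blocks are assumptions for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity):
    """Recover the single missing block of a chain from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# Simulate losing data[1]: the two surviving blocks plus the parity are read.
recovered = rebuild([data[0], data[2]], parity)
print(recovered == data[1])
```

A local parity block that covers only a few data blocks keeps this read set small, whereas a global parity block forces the rebuild to read across the whole stripe.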
To solve the problem of trajectory privacy leakage caused by the collection of large amounts of trajectory information of moving objects, a dummy trajectory-based trajectory privacy protection algorithm was proposed. In this algorithm, considering the user’s disclosed locations, a heuristic rule was designed based on a comprehensive measure of trajectory similarity and location diversity to select the dummy trajectories, so that the generated dummy trajectories effectively hide the real trajectory and sensitive locations. Besides, a trajectory directed graph strategy and a grid-based map strategy were proposed to optimize the execution efficiency of the algorithm. Experimental results on real trajectory datasets demonstrate that the proposed algorithm effectively protects the real trajectory while preserving high data utility.
To address the problems of current aviation card readers, including poor portability, slow speed and small tag capacity, a design method for a large-capacity Radio Frequency Identification (RFID) system based on STM32 was proposed. With an STM32 microprocessor as the core and a CR95HF radio chip, a new handheld RFID card reader that works at High Frequency (HF) and supports the ISO 15693 and ISO 18092 protocols was designed. The design of the power supply and antenna, and the optimization of software speed and error rate, were discussed in detail. A new large-capacity passive tag with a capacity of up to 32 KB was also designed to form a large-capacity RFID system together with the card reader. Experimental results show that, compared with a traditional card reader, the reading and writing speed of the card reader increases by 2.2 times, the error rate is reduced by 91.7% and the tag capacity increases by 255 times. The system provides a better choice for the fast, accurate and high-data-volume requirements of aviation logistics.
As a common disaster on transmission lines, icing seriously affects the safe operation of the power system. A digital twin system for ice shedding of overhead transmission lines was designed and implemented to meet the requirements of precise mapping, real-time communication and visualization of the de-icing vibrations of actual transmission lines. Firstly, the idea of real-time mapping of physical information was introduced, and the overall framework of the transmission line de-icing digital twin system was built on the Unity3D platform. Then, an iterative model based on an analytical solution over any time period was established and solved, and the iterative model was combined with measured time-varying parameters for segmented iterative correction, thereby completing the digital twin model. At the same time, the calculated twin data were displayed on the twin interaction platform for transmission line de-icing vibrations. Finally, the system was applied to simulate three successive levels of conductor ice shedding to verify the effectiveness of the model. Experimental results show that the results calculated by the twin model are highly consistent with the actual measurements, and the Mean Absolute Percentage Error (MAPE) of the twin model system is within 0.5%. The proposed method can therefore establish an accurate model that reflects the operating status of the transmission lines.
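For clarity on the reported metric, MAPE averages the absolute error of each prediction relative to the corresponding measured value and expresses it as a percentage. The sketch below computes it for two short lists. The function name and the sample values are made up for illustration and are not data from the paper.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; actual values must be non-zero."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    relative_errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


measured = [100.0, 200.0]    # hypothetical measured displacements
simulated = [99.0, 202.0]    # hypothetical twin-model outputs
print(mape(measured, simulated))
```

Because each error is divided by the measured value, MAPE is undefined when a measurement is zero and becomes unstable near zero, so it suits quantities that stay away from zero over the evaluation window.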