Concerning the trade-off between convergence and diversity in solving the multi-trip pickup and delivery Vehicle Routing Problem (VRP), a hybrid Non-dominated Sorting Genetic Algorithm II (NSGA-II) combining the Adaptive Large Neighborhood Search (ALNS) algorithm with Adaptive Neighborhood Selection (ANS), called NSGA-II-ALNS-ANS, was proposed. Firstly, considering the influence of the initial population on the convergence speed of the algorithm, an improved regret insertion method was employed to obtain a high-quality initial population. Secondly, to improve the global and local search capabilities of the algorithm, various destroy-repair operators and neighborhood structures were designed according to the characteristics of the pickup and delivery problem. Finally, a Best Fit Decreasing (BFD) algorithm based on random sampling and an efficient feasible solution evaluation criterion were proposed to generate vehicle routing schemes. Simulation experiments were conducted on public benchmark instances of different scales; in comparison experiments with the Memetic Algorithm (MA), the optimal solution quality of the proposed algorithm was improved by 27%. The experimental results show that the proposed algorithm can rapidly generate high-quality vehicle routing schemes that satisfy multiple constraints, and outperforms the existing algorithms in terms of both convergence and diversity.
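For intuition, the sketch below shows a plain regret-2 insertion heuristic of the kind the abstract builds on: at each step the unassigned customer whose best insertion would be most "regretted" if postponed is inserted at its cheapest position. It assumes a symmetric distance matrix `dist` and routes bounded by depot node 0, and it ignores the capacity/time constraints and the paper's specific improvements.

```python
# A minimal sketch of regret-2 insertion for building an initial VRP solution.
# `dist` is assumed to be a nested list/dict of pairwise distances; constraint
# checks are omitted for brevity.
import math

def insertion_cost(route, customer, pos, dist):
    """Extra distance incurred by inserting `customer` at `pos` in `route`."""
    prev, nxt = route[pos - 1], route[pos]
    return dist[prev][customer] + dist[customer][nxt] - dist[prev][nxt]

def regret2_insertion(routes, unassigned, dist):
    """Repeatedly insert the customer whose regret (second-best minus best
    insertion cost) is largest, at its best feasible position."""
    while unassigned:
        best = None  # (regret, -cost, customer, route_idx, pos)
        for c in unassigned:
            costs = sorted(
                (insertion_cost(r, c, p, dist), ri, p)
                for ri, r in enumerate(routes)
                for p in range(1, len(r))
            )
            regret = (costs[1][0] - costs[0][0]) if len(costs) > 1 else math.inf
            cand = (regret, -costs[0][0], c, costs[0][1], costs[0][2])
            if best is None or cand > best:
                best = cand
        _, _, c, ri, p = best
        routes[ri].insert(p, c)
        unassigned.remove(c)
    return routes
```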
Sparse-dense Matrix Multiplication (SpMM) is widely used in fields such as scientific computing and deep learning, and improving its efficiency is of great importance. For a class of sparse matrices with the band feature, a new storage format BRCV (Banded Row Column Value), an SpMM algorithm based on this format, and an efficient Graphics Processing Unit (GPU) implementation were proposed. Since each sparse band can contain multiple sparse blocks, the proposed format can be seen as a generalization of the block sparse matrix format. Compared with the commonly used CSR (Compressed Sparse Row) format, the BRCV format was able to significantly reduce the storage complexity by avoiding redundant storage of column indices in sparse bands. At the same time, the GPU implementation of SpMM based on the BRCV format was able to make more efficient use of the GPU's shared memory and improve the computational efficiency of the SpMM algorithm by reusing the rows of both the sparse and the dense matrices. For randomly generated band sparse matrices, experimental results on two different GPU platforms show that BRCV outperforms not only cuBLAS (CUDA Basic Linear Algebra Subroutines), but also cuSPARSE based on the CSR and block sparse formats. Specifically, compared with cuSPARSE based on the CSR format, BRCV achieves maximum speedup ratios of 6.20 and 4.77 on the two platforms, respectively. Moreover, the new implementation was applied to accelerate the SpMM operator in Graph Neural Networks (GNN). Experimental results on real application datasets show that BRCV outperforms cuBLAS and cuSPARSE based on the CSR format, and also outperforms cuSPARSE based on the block sparse format in most cases. Specifically, compared with cuSPARSE based on the CSR format, BRCV achieves a maximum speedup ratio of 4.47. The above results indicate that BRCV can effectively improve the efficiency of SpMM.
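The CPU sketch below illustrates the core storage idea as described: a band is stored by its row range, starting column, and width, so no per-nonzero column index is needed as in CSR. The field names and layout are illustrative guesses; the paper's exact BRCV layout and its GPU kernel are not reproduced here.

```python
# A rough CPU sketch of the idea behind a banded row/column/value format:
# each band records only (row range, first column, width, values).
import numpy as np

class Band:
    def __init__(self, row_start, row_end, col_start, width, values):
        self.row_start, self.row_end = row_start, row_end  # row range of the band
        self.col_start, self.width = col_start, width      # first column and band width
        self.values = values                               # (row_end-row_start, width) array

def spmm_banded(bands, B):
    """Multiply a band-sparse matrix (list of Band) with a dense matrix B."""
    n_rows = max(b.row_end for b in bands)
    C = np.zeros((n_rows, B.shape[1]))
    for b in bands:
        # Each band is a small dense block: it reuses a contiguous slice of
        # B's rows, which is what the GPU version stages in shared memory.
        C[b.row_start:b.row_end] += b.values @ B[b.col_start:b.col_start + b.width]
    return C
```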
The Quantum Dynamics Framework (QDF) is a basic iterative process of optimization algorithms with representative and universal significance, derived from the quantum dynamics model of optimization algorithms. Accepting worse solutions is an important mechanism for preventing an optimization algorithm from falling into local optima and for solving the premature convergence problem. In order to introduce a worse-solution acceptance mechanism into QDF, based on the quantum dynamics model, a worse solution was regarded as a potential barrier encountered during particle motion, and the probability of particles penetrating the barrier was calculated using the transmission coefficient of the quantum tunneling effect. Thus, the worse-solution acceptance criterion of the quantum dynamics model, the Potential Barrier Estimation Criterion (PBEC), was obtained. PBEC is related to the height and width of the potential barrier and to the mass of the particles. Compared with the classical Metropolis acceptance criterion, PBEC can more comprehensively estimate the behavior of an optimization algorithm when it encounters a worse solution during sampling. The experimental results show that the QDF algorithm based on PBEC has a stronger ability to jump out of local optima and higher search efficiency than the QDF algorithm based on the Metropolis acceptance criterion, and that PBEC is a feasible and effective worse-solution acceptance mechanism in quantum optimization algorithms.
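A minimal sketch of a barrier-penetration acceptance rule in the spirit of PBEC follows, using the textbook rectangular-barrier transmission coefficient T ≈ exp(-2w·sqrt(2m·ΔE)/ħ). The mapping of barrier height to the objective difference, and the width and mass parameters, are assumptions of this illustration, not the paper's exact definitions.

```python
# Tunneling-style acceptance: a worse solution is a barrier of height
# delta = f(new) - f(old); accept (tunnel through) with probability T.
import math, random

def pbec_accept(f_old, f_new, width, mass, hbar=1.0):
    delta = f_new - f_old           # barrier height seen by the particle
    if delta <= 0:
        return True                 # better solutions are always accepted
    t = math.exp(-2.0 * width * math.sqrt(2.0 * mass * delta) / hbar)
    return random.random() < t      # penetrate the barrier with probability T

# Metropolis, for contrast, would accept with probability exp(-delta / T_temp),
# depending only on the height, not on barrier width or particle mass.
```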
The initial cluster number of the K-means clustering algorithm is usually determined randomly, the original datasets contain a large number of redundant features, which reduces clustering accuracy, and the Cuckoo Search (CS) algorithm suffers from slow convergence and weak local search. To address these issues, a K-means clustering algorithm combined with Dynamic CS Feature Selection (DCFSK) was proposed. Firstly, an adaptive step size factor was designed for the Lévy flight phase to improve the search speed and accuracy of the CS algorithm; then, to balance global and local search and accelerate convergence, the discovery probability was adjusted dynamically. An Improved Dynamic CS algorithm (IDCS) was thus constructed, and a Dynamic CS-based Feature Selection algorithm (DCFS) was built on it. Secondly, to improve the accuracy of the traditional Euclidean distance, a weighted Euclidean distance was designed to simultaneously consider the contributions of samples and features to the distance calculation; to determine the selection scheme for the optimal number of clusters, weighted intra-cluster and inter-cluster distances were constructed based on the improved weighted Euclidean distance. Finally, to overcome the defect that the objective function of traditional K-means clustering only considers the intra-cluster distance and ignores the inter-cluster distance, an objective function based on the median silhouette coefficient was proposed, and a K-means clustering algorithm based on adaptive cuckoo optimization feature selection was thus designed. Experimental results show that IDCS achieves the best metrics on ten benchmark test functions, and that, compared to algorithms such as K-means and DBSCAN (Density-Based Spatial Clustering of Applications with Noise), DCFSK achieves the best clustering effect on six synthetic datasets and six UCI datasets.
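The short sketch below illustrates the weighted-distance idea the abstract describes: per-feature weights (e.g., importance scores from the feature-selection stage) and a sample weight jointly scale the usual squared differences. The specific weighting scheme is an assumption of this illustration, not DCFSK's exact formula.

```python
# Weighted Euclidean distance: features and samples contribute unequally.
import numpy as np

def weighted_euclidean(x, c, feature_w, sample_w=1.0):
    """Distance between sample x and cluster center c with per-feature weights."""
    return np.sqrt(sample_w * np.sum(feature_w * (x - c) ** 2))

x = np.array([1.0, 2.0, 3.0])
c = np.array([0.0, 2.0, 5.0])
w = np.array([0.5, 0.2, 0.3])   # e.g. normalized feature importance from selection
print(weighted_euclidean(x, c, w))
```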
Aiming at the Multi-Depot Vehicle Routing Problem with Time Windows (MDVRPTW), an Improved Adaptive Large Neighborhood Search algorithm (IALNS) was proposed. Firstly, a path segmentation algorithm was improved for constructing the initial solution. Then, in the optimization stage, the designed removal and repair heuristic operators competed with each other: a scoring mechanism was introduced for the operators, and operators were selected by roulette wheel. At the same time, the iteration process was divided into segments, and the operator weight information was adjusted dynamically in each segment to effectively prevent the algorithm from falling into local optima. Finally, the simulated annealing mechanism was adopted as the acceptance criterion for solutions. The relevant parameters of IALNS were determined by experiments on the Cordeau benchmark instances, and the results of the proposed algorithm were compared with other representative results in this field. The experimental results show that the solution error between IALNS and the Variable Neighborhood Search (VNS) algorithm does not exceed 0.8%, and IALNS is even better in some cases; compared with the multi-phase improved shuffled frog leaping algorithm, the average time consumption of the proposed algorithm is reduced by 12.8%, and the runtime is shorter for most instances. The above results verify that IALNS is an effective algorithm for solving MDVRPTW.
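The following compact sketch shows the operator-management loop the abstract describes: operators accumulate scores within a segment, weights are refreshed from those scores at the end of each segment, and operators are drawn by roulette wheel. The reaction factor `r` and the reward points are generic ALNS conventions, not the paper's tuned values.

```python
# ALNS-style adaptive operator selection with segment-wise weight updates.
import random

class OperatorPool:
    def __init__(self, operators, r=0.3):
        self.ops = operators
        self.w = {op: 1.0 for op in operators}     # roulette weights
        self.score = {op: 0.0 for op in operators} # scores within the segment
        self.uses = {op: 0 for op in operators}
        self.r = r                                 # reaction factor

    def select(self):
        """Roulette-wheel selection proportional to current weights."""
        pick, acc = random.uniform(0, sum(self.w.values())), 0.0
        for op in self.ops:
            acc += self.w[op]
            if pick <= acc:
                self.uses[op] += 1
                return op
        return self.ops[-1]

    def reward(self, op, points):
        """Score an operator, e.g. new best > improving > merely accepted."""
        self.score[op] += points

    def end_segment(self):
        """Dynamic weight adjustment at the end of each iteration segment."""
        for op in self.ops:
            if self.uses[op]:
                self.w[op] = (1 - self.r) * self.w[op] \
                             + self.r * self.score[op] / self.uses[op]
            self.score[op] = 0.0
            self.uses[op] = 0
```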
To address the problems of a high cross-shard transaction proportion and complex cross-shard transaction validation in optimizing blockchain performance, a blockchain sharding method for reducing the cross-shard transaction proportion was proposed. Firstly, from the perspective of data sharding, a blockchain transaction sharding model was constructed, and evaluation indicators for sharding performance were given. Then, for the long-term historical transaction data in a blockchain, the sets of transaction frequencies of senders and receivers were constructed from the perspective of account correlation. Finally, a Frequency-considered Blockchain Transaction Sharding algorithm (FBTS) was designed to solve the problem of the high cross-shard proportion in transaction sharding. The proposed algorithm was compared with the Random Sharding Algorithm (RSA) and the Modular Sharding Algorithm (MSA) under sharding sizes of 2, 3, 5, 7, 15, 20, 30, and 50. The proposed algorithm outperformed RSA and MSA in performance indicators such as the cross-shard transaction proportion, the average cross-shard number of accounts, and the weighted average cross-shard number of accounts. In addition, most accounts and transactions were concentrated at low cross-shard numbers, indicating that completing a transaction does not involve many shards. The experimental results show that the proposed algorithm can effectively reduce the cross-shard transaction proportion and shorten the delay of cross-shard transactions.
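An illustrative greedy version of frequency-aware sharding is sketched below: frequently transacting account pairs are co-located in the same shard, processed in descending frequency order. This is a plausible reading of the idea, not the paper's exact FBTS algorithm, and the hash-based fallback placement ignores load balancing.

```python
# Greedy co-location of frequent transaction partners to cut cross-shard traffic.
def assign_shards(tx_freq, n_shards):
    """tx_freq: dict mapping (sender, receiver) -> historical frequency."""
    shard_of = {}
    for (s, r), f in sorted(tx_freq.items(), key=lambda kv: -kv[1]):
        for acc, peer in ((s, r), (r, s)):
            if acc in shard_of:
                continue
            if peer in shard_of:
                shard_of[acc] = shard_of[peer]        # co-locate frequent pairs
            else:
                shard_of[acc] = hash(acc) % n_shards  # fallback placement
    cross = sum(f for (s, r), f in tx_freq.items() if shard_of[s] != shard_of[r])
    return shard_of, cross / sum(tx_freq.values())    # cross-shard proportion
```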
Existing image classification models are becoming more and more complex, and the hardware resources and computation time they require keep increasing. A hybrid Classical-Quantum classification model based on DenseNet (CQDenseNet) was proposed to address this problem. Firstly, a Variational Quantum Circuit (VQC) that can operate on a Noisy Intermediate-Scale Quantum (NISQ) device was used as a classifier to replace the fully connected layer of DenseNet. Secondly, through transfer learning, a DenseNet model pre-trained on the ImageNet dataset was utilized as the pre-training model for CQDenseNet. Finally, the CQDenseNet model was compared with the benchmark models AlexNet, GoogLeNet, VGG19, ResNet, and DenseNet-169 on the Chinese Medicine and CIFAR-100 datasets. Experimental results show that CQDenseNet is more effective than the best-performing benchmark model, with improvements of 2.2 and 7.4 percentage points in accuracy, 2.2 and 7.3 percentage points in precision, 2.2 and 7.1 percentage points in recall, and 2.3 and 6.4 percentage points in F1-score on the two datasets, respectively. This shows that the performance of the hybrid classical-quantum model is better than that of the classical models.
To effectively reduce cache defragmentation overhead and improve the cache hit ratio in data-intensive workflows, a persistent client cache for distributed file systems, namely DFS-Cache (Distributed File System Cache), was designed and implemented. Based on Non-Volatile Memory (NVM), it ensures data persistence and crash consistency while significantly reducing the cold start time. DFS-Cache consisted of a cache defragmentation mechanism based on virtual memory remapping and a cache space management strategy based on Time-To-Live (TTL). The former exploited the fact that NVM can be addressed directly by the memory controller: by dynamically modifying the mapping between virtual and physical addresses, zero-copy memory defragmentation was achieved. The latter was a cold-hot separated grouping management strategy that enhanced cache space management efficiency with the support of the remapping-based defragmentation mechanism. Experiments were conducted on real Intel Optane persistent memory devices with standard benchmarks such as Fio and Filebench. Compared with the commercial distributed file systems MooseFS and GlusterFS, the proposed client cache can increase system throughput by up to 5.73 times and 1.89 times, respectively.
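A simplified sketch of TTL-based cold-hot grouping follows: entries whose remaining lifetime is nearly exhausted are grouped as "cold" and reclaimed first. The threshold and the interplay with the NVM remapping mechanism are assumptions of this illustration.

```python
# TTL-driven cold/hot grouping for cache space management (simplified).
import time

class TTLCache:
    def __init__(self, ttl, cold_fraction=0.25):
        self.ttl, self.cold_fraction = ttl, cold_fraction
        self.entries = {}                     # key -> (value, expire_time)

    def put(self, key, value):
        self.entries[key] = (value, time.time() + self.ttl)

    def groups(self):
        """Split entries into hot/cold groups by remaining lifetime."""
        now = time.time()
        hot, cold = {}, {}
        for k, (v, exp) in self.entries.items():
            bucket = cold if exp - now < self.cold_fraction * self.ttl else hot
            bucket[k] = v
        return hot, cold

    def evict_cold(self):
        """Reclaim cold groups first, keeping fragmentation concentrated."""
        _, cold = self.groups()
        for k in cold:
            del self.entries[k]
```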
Aiming at the problems of the gradually increasing scale of quantum hardware and the insufficient speed of classical simulation, two optimization methods for a quantum simulator implemented on the Sunway supercomputer were proposed. Firstly, the tensor contraction operator library SWTT was reconstructed by improving the tensor transposition and computation strategies, which improved the computing kernel efficiency of partial tensor contractions and reduced redundant memory access. Secondly, a balance between the complexity and efficiency of path computation was achieved by a contraction path adjustment method based on data locality optimization. Test results show that the operator library improvement can raise the simulation efficiency of the "Sycamore" quantum supremacy circuit by 5.4% and the single-step tensor contraction efficiency by up to 49.7 times, and that the path adjustment method can improve the floating-point efficiency by about 4 times while inflating the path computational complexity by a factor of 2. With the two optimization methods, the single-precision and mixed-precision floating-point efficiencies for simulating Google's 53-qubit, 20-layer quantum chip random circuit with one million amplitude samples are improved from 3.98% and 1.69% to 18.48% and 7.42% respectively, and the theoretically estimated simulation time is reduced from 470 s to 226 s for single precision and from 304 s to 134 s for mixed precision, verifying that the two methods significantly improve the speed of quantum computational simulation.
The K-Means algorithm typically utilizes Euclidean distance to calculate the similarity between data points when dealing with large-scale heterogeneous data; however, this approach suffers from low efficiency and high computational complexity. Inspired by the significant advantage of Hamming distance in data similarity calculation, a Quantum K-Means Hamming (QKMH) algorithm was proposed. First, the data was prepared and encoded into quantum states, and the quantum Hamming distance was used to calculate the similarity between the points to be clustered and the K cluster centers. Then, Grover's minimum search algorithm was improved to find the cluster center closest to each point to be clustered. Finally, these steps were repeated until the designated number of iterations was reached or the cluster centers no longer changed. Based on the quantum simulation computing framework Qiskit, the proposed algorithm was validated on the MNIST handwritten digit dataset and compared with various traditional and improved methods. Experimental results show that the F1 score of the QKMH algorithm is improved by 10 percentage points compared with that of the Manhattan-distance-based quantum K-Means algorithm and by 4.6 percentage points compared with that of the latest optimized Euclidean-distance-based quantum K-Means algorithm, and the time complexity of the QKMH algorithm is lower than those of the compared algorithms.
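For reference, the classical analogue of the QKMH assignment step is shown below: binary-encoded points are assigned to the center at minimum Hamming distance. The quantum version evaluates these distances on quantum states and finds the minimum with an improved Grover search; this sketch only mirrors the arithmetic.

```python
# Classical Hamming-distance assignment step of K-Means on binary data.
import numpy as np

def hamming(a, b):
    return int(np.sum(a != b))                 # number of differing bits

def assign(points, centers):
    """points: (n, d) 0/1 array; centers: (k, d) 0/1 array."""
    return np.array([
        min(range(len(centers)), key=lambda j: hamming(p, centers[j]))
        for p in points
    ])

pts = np.array([[0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 0, 1]])
ctr = np.array([[0, 1, 0, 0], [1, 0, 0, 1]])
print(assign(pts, ctr))                        # nearest center index per point
```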
Due to differences in resource endowments and industrial policies among regions, distributed production plays a very important role in improving the competitiveness of manufacturing enterprises, and how to use distributed production to enhance the flexibility of mass customization is an important problem to be solved to boost consumer confidence. Combining the idea of minicells (small manufacturing cells) with a distributed mixed production scenario characterized by multiple markets and multiple products, an integrated model of distributed factory construction and production scheduling was proposed, with the objectives of minimizing operating costs (e.g., labor and transportation costs) and minimizing the makespan. With the proposed model, the minicell construction, the worker and machine configuration, and the production strategies for each batch of products can be solved, enabling enterprises to quickly release production capacity and carry out reasonable mixed-flow production, to realize distributed manufacturing and sales that meet multi-region, multi-product, and differentiated needs, and to reduce operating costs in the manufacturing process while guaranteeing throughput. In addition, a Multi-Objective Particle Swarm Optimization (MOPSO) algorithm was designed to solve the proposed model, and it was compared with the Non-dominated Sorting Genetic Algorithm II (NSGA-II) and the Multi-Objective Simulated Annealing (MOSA) algorithm. The results of extensive numerical experiments show that, with the same running time, MOPSO outperforms NSGA-II and MOSA in terms of three metrics: C-Metric (CM), Mean Ideal Distance (MID), and Maximum Spread (MS). The proposed algorithm can provide high-quality production operation decision schemes for miniaturized distributed production systems.
Optimization of swarm intelligence algorithms is a main way to improve them. As swarm intelligence algorithms are more and more widely applied to problems such as model optimization, production scheduling, and path planning, the performance demands placed on them keep increasing. As an important means of optimizing swarm intelligence algorithms, subgroup strategies can flexibly balance global exploration and local exploitation, and have become one of the research hotspots of swarm intelligence algorithms. In order to promote the development and application of subgroup strategies, the dynamic subgroup strategy, the subgroup strategy based on the master-slave paradigm, and the subgroup strategy based on network structure were investigated in detail. The structural characteristics, improvement methods, and application scenarios of the various subgroup strategies were expounded. Finally, the current problems and the future research trends and development directions of subgroup strategies were summarized.
The implementation of the functions of a multiprocessor system relies heavily on the topological properties of the system's interconnection network. The subnetwork reliability of the k-ary n-cube is an important factor to take into account when computing tasks are processed by multiprocessor systems built with the k-ary n-cube as the topology. In order to measure the reliability of k-ary (n-1)-cube subnetworks in a k-ary n-cube accurately and efficiently under the probabilistic fault condition, an approximate evaluation method based on the Back Propagation (BP) neural network was proposed. Firstly, a method for generating the dataset to train the BP neural network was given, with the aid of Monte Carlo simulation and the known upper and lower bounds on the reliability of k-ary (n-1)-cube subnetworks. Then, the BP neural network model for evaluating the reliability of k-ary (n-1)-cube subnetworks was constructed on the basis of the generated training dataset. Finally, the approximate evaluation results obtained by the BP neural network model were analyzed and compared with the results of the approximate calculation formula and of the Monte Carlo simulation-based evaluation method. The results obtained by the proposed method were more accurate than those of the approximate calculation formula, and the evaluation time of the proposed method was reduced by about 59% on average compared with the Monte Carlo simulation-based method. Experimental results show that the proposed method has certain advantages in balancing accuracy and efficiency.
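A sketch of the Monte Carlo side of the training-set generation is given below: for a given vertex reliability p, random fault patterns are sampled and the fraction of trials in which at least one target subnetwork survives fault-free is counted. The enumeration of subnetwork vertex sets is problem-specific and left as an assumed input.

```python
# Monte Carlo estimate of subnetwork reliability for one vertex reliability p.
import random

def mc_reliability(p, n_vertices, subnetwork_vertex_sets, trials=10000):
    """subnetwork_vertex_sets: list of vertex-index lists, one per subnetwork."""
    hits = 0
    for _ in range(trials):
        alive = [random.random() < p for _ in range(n_vertices)]
        # Does at least one target subnetwork remain fully fault-free?
        if any(all(alive[v] for v in s) for s in subnetwork_vertex_sets):
            hits += 1
    return hits / trials

# Training pairs for the BP network would then be (p, mc_reliability(p, ...))
# sampled over a grid of vertex reliabilities, checked against the known
# upper/lower bound formulas.
```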
The purpose of a task scheduling strategy for multiple data centers is to allocate computing tasks to different servers in each data center so as to improve resource utilization and energy efficiency. Therefore, a deep reinforcement learning-based integrated scheduling strategy for multiple data centers was proposed, which was divided into two stages: data center selection and task allocation within a data center. In the data center selection stage, computing power resources were integrated to improve the overall resource utilization. Firstly, a Deep Q Network with Prioritized Experience Replay (PER-DQN) was used to obtain the communication paths to each data center in a network with data centers as nodes. Then, the resource usage cost and the network communication cost were calculated, and the optimal data center was selected according to the principle of minimizing the sum of the two costs. In the task allocation stage, the computing tasks in the selected data center were first divided and added to the scheduling queue according to the First-Come First-Served (FCFS) principle; then, considering the status of the computing devices and the ambient temperature, a task allocation algorithm based on Double Deep Q Network (Double DQN) was used to obtain the optimal allocation strategy, thereby selecting the server to execute each computing task, avoiding hot spots and reducing the energy consumption of cooling equipment. Experimental results show that the average total cost of the PER-DQN-based data center selection algorithm is reduced by 3.6% and 10.0% compared to those of the Computing Resource First (CRF) and Shortest Path First (SPF) path selection methods, respectively. Compared to the Round-Robin (RR) and Greedy scheduling algorithms, the Double DQN-based task deployment algorithm reduces the average Power Usage Effectiveness (PUE) by 2.5% and 1.7%, respectively. It can be seen that the proposed strategy can effectively reduce the total cost and data center energy consumption, and realize the efficient operation of multiple data centers.
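For the task-allocation stage, the standard Double DQN target is sketched below in minimal PyTorch form: the online network selects the argmax action and the target network evaluates it, which reduces the overestimation bias of vanilla DQN. Network shapes and the reward definition (energy and hot-spot terms) are placeholders for the paper's setting.

```python
# Double DQN target: decouple action selection (online net) from action
# evaluation (target net).
import torch

def double_dqn_target(online, target, reward, next_state, gamma, done):
    """reward, done: float tensors of shape (batch,); returns target Q values."""
    with torch.no_grad():
        best_action = online(next_state).argmax(dim=1, keepdim=True)   # select
        next_q = target(next_state).gather(1, best_action).squeeze(1)  # evaluate
        return reward + gamma * next_q * (1.0 - done)
```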
The rapid development of crowdsourcing has enriched workers' experience and skills, making them more aware of tasks and more inclined to complete multiple tasks at the same time. Therefore, assigning tasks according to workers' subjective preferences has become a common way of task assignment. However, out of personal interest, workers may engage in malicious bidding to obtain higher utility, which is detrimental to the development of crowdsourcing platforms. To this end, an incentive mechanism for crowdsourcing multi-task assignment against malicious bidding, named GIMSM (Greedy Incentive Mechanism for Single-Minded), was proposed. First, a linear ratio was defined by this mechanism as the allocation basis. Then, following a greedy strategy, tasks were selected and assigned in increasing order of the workers' ratios. Finally, the workers selected by the allocation algorithm were paid according to the payment function, and the task assignment result was obtained. Experiments were conducted on the Taxi and Limousine Commission Trip Record Data dataset. Compared to TODA (Truthful Online Double Auction mechanism), TCAM (Truthful Combinatorial Auction Mechanism), and the FU method, GIMSM's average quality level of task results under different numbers of workers increased by 25.20, 13.20, and 4.40 percentage points, respectively, and its average quality level of task results under different numbers of tasks increased by 26.17, 16.17, and 9.67 percentage points, respectively. In addition, GIMSM satisfies individual rationality and incentive compatibility, and can obtain task assignment results in linear time. The experimental results show that GIMSM has good resistance to malicious bidding and performs better on crowdsourcing platforms with large amounts of data.
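A minimal sketch of the greedy allocation step follows: workers are sorted by an increasing bid-to-quality style ratio and greedily granted their requested task bundle while capacity remains (single-minded workers want exactly their bundle). The concrete ratio definition and the critical-value payment rule that secures truthfulness are the paper's and are omitted here.

```python
# Greedy ratio-ordered allocation for single-minded workers.
def greedy_allocate(workers, task_capacity):
    """workers: list of (worker_id, ratio, requested_tasks);
    task_capacity: dict mapping task -> number of available slots."""
    remaining = dict(task_capacity)
    winners = []
    for wid, ratio, tasks in sorted(workers, key=lambda w: w[1]):
        if all(remaining.get(t, 0) > 0 for t in tasks):
            winners.append(wid)            # grant the whole requested bundle
            for t in tasks:
                remaining[t] -= 1
    return winners
```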
The difficulty of solving constrained multi-objective optimization problems lies in balancing objective optimization with constraint satisfaction, while also balancing the convergence and diversity of solution sets. To solve complex constrained multi-objective optimization problems with large infeasible regions and small feasible regions, a constrained multi-objective evolutionary algorithm based on Two-Stage search and Dynamic Resource Allocation (TSDRA) was proposed. In the first stage, infeasible regions were crossed by ignoring constraints; in the second stage, two kinds of computing resources were allocated dynamically to coordinate local exploitation and global exploration, balancing the convergence and diversity of the algorithm. Simulation results on the LIRCMOP and MW series of test problems show that, compared with four representative algorithms, namely the Constrained Multi-objective Evolutionary Algorithm with Multiple Stages (CMOEA-MS), Two-phase (ToP), Push and Pull Search (PPS), and the Multi Stage Constrained Multi-Objective evolutionary algorithm (MSCMO), the proposed algorithm obtains better results in both Inverted Generational Distance (IGD) and HyperVolume (HV): TSDRA obtains 10 best IGD values and 9 best HV values on the LIRCMOP series, and 9 best IGD values and 10 best HV values on the MW series, indicating that the proposed algorithm can effectively solve problems with large infeasible regions and small feasible regions.
Programming on domestic high-performance many-core processors requires using the lowest-level interface to develop software, which makes programming and debugging very difficult. Moreover, the limitations of the programming models for high-performance software on these platforms and the absence of common computing software lead to repetitive development work. Aiming at the above problems, a generalized programming model and the corresponding support library were realized: on the one hand, thread-level parallelism on domestic high-performance many-core processors was supported through a message queue mechanism; on the other hand, data-level parallelism on the slave cores was supported through the Single Instruction Multiple Data (SIMD) programming model. Firstly, the architecture of domestic high-performance many-core processors was abstracted. Then, a message queue mechanism was designed for the proposed model, along with a set of heterogeneous parallel programming interfaces, including a system parameter interface, a slave-core thread control interface, a message queue interface, and a SIMD abstraction interface. Finally, a new software development model and methodology for high-performance computing were formed on this basis, making it convenient for users to develop parallel computing software for domestic high-performance many-core processors. Transmission performance tests show that the transmission bandwidth of the proposed model on domestic many-core processors generally reaches 90% of the peak DMA (Direct Memory Access) bandwidth when a small number of cores are enabled, and that the transmission bandwidth of the message queue model generally reaches 70% of the peak DMA bandwidth when a large number of cores are enabled. In matrix multiplication experiments, the performance of the proposed model reaches 90% of that of the system's original primitives for transferring and computing matrices; in a password guessing system, the performance of code based on the proposed model is basically the same as that of code developed directly with the lowest-level interface. The proposed generalized programming model and support framework make High Performance Computing (HPC) software development easier and more portable, and can help promote the development of domestic independent HPC software.
The SATisfiability problem (SAT) is an NP-complete problem widely applied in artificial intelligence and machine learning, and the Exact SATisfiability problem (XSAT) is an important subproblem of SAT. Most current research on XSAT stays at the theoretical level; few efficient solving algorithms, especially efficiently verifiable stochastic local search algorithms, have been studied. To address the above problems, after analyzing some properties of both the basic and the equivalent encoding formulas, a stochastic local search algorithm, WalkXSAT, was proposed for solving XSAT directly. Firstly, the stochastic local search framework was used for the basic search and condition determination. Secondly, an appropriate unsatisfied score for the clauses to which variables belong was added, and variables that were hard to satisfy appropriately were prioritized. Thirdly, the search space was reduced using a heuristic strategy that avoids repeatedly selecting recently flipped variables. Finally, instances from multiple sources and in multiple formats were used in comparison experiments. Compared with the ProbSAT algorithm, the number of variable flips and the solving time of WalkXSAT are significantly reduced when solving XSAT directly. For instances transformed by the basic encoding, ProbSAT is no longer effective when the number of variables exceeds 100, while WalkXSAT can still solve XSAT in a short time. Experimental results show that WalkXSAT has high accuracy, strong stability, and fast convergence.
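A bare-bones WalkSAT-style loop for XSAT is sketched below (every clause must have exactly one true literal): pick a violated clause and flip the variable that minimizes the number of violated clauses, with a simple check against immediately re-flipping the same variable, echoing the anti-repeat heuristic. WalkXSAT's actual scoring is simplified away here.

```python
# Stochastic local search for Exact-SAT: clauses are lists of signed ints
# (DIMACS style), e.g. [1, -2, 3] means x1 OR NOT x2 OR x3, with the
# exactly-one-true condition required instead of plain satisfaction.
import random

def violated(clauses, assign):
    """Clauses violating the exactly-one-true-literal condition."""
    return [c for c in clauses
            if sum(assign[abs(l)] == (l > 0) for l in c) != 1]

def walk_xsat(clauses, n_vars, max_flips=100000):
    assign = {v: random.random() < 0.5 for v in range(1, n_vars + 1)}
    last_flipped = None
    for _ in range(max_flips):
        bad = violated(clauses, assign)
        if not bad:
            return assign                       # all clauses exactly satisfied
        clause = random.choice(bad)
        # Anti-repeat: avoid the variable flipped in the previous step.
        candidates = [abs(l) for l in clause if abs(l) != last_flipped] \
                     or [abs(l) for l in clause]
        def cost_after_flip(v):
            assign[v] = not assign[v]
            n = len(violated(clauses, assign))
            assign[v] = not assign[v]
            return n
        v = min(candidates, key=cost_after_flip)
        assign[v] = not assign[v]
        last_flipped = v
    return None                                 # no solution found in budget
```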
Both S-boxes (with multiple outputs) in block ciphers and feedback functions in stream ciphers require special Boolean functions to ensure the security of the cipher algorithm. To solve the problems of excessive resource consumption and low clock frequency of reconfigurable hardware operation units caused by Non-Linear Boolean Functions (NLBF) in existing stream cipher algorithms, a high-efficiency AIC (And-Inverter Cone)-based design scheme for NLBF reconfigurable operation units, namely RA-NLBF, was proposed. On the basis of cryptographic theory, after analyzing the NLBF characteristics of various stream cipher algorithms and extracting NLBF features, including the degree of the AND terms, the number of AND terms, and the number of input ports, an NLBF simplification method based on the dual-logic hybrid form of Mixed Polarity Reed-Muller (MPRM) and Traditional Boolean (TB) functions was proposed, which reduced the number of NLBF AND terms by 29% and produced NLBF expressions suitable for AIC. Based on the characteristics of the simplified expressions, such as the distribution of the number and degree of the AND terms, reconfigurable AIC units and interconnection networks were designed, forming reconfigurable units that satisfy the NLBF operations in existing public stream cipher algorithms. RA-NLBF was verified by logic synthesis based on 180 nm CMOS technology, and the results show that the area of RA-NLBF is 12 949.67 μm² and its clock frequency reaches 505 MHz, a 59.7% reduction in area and a 37.3% increase in clock frequency compared with the Reconfigurable Logic Unit for Sequence Cryptographic (RSCLU), an existing method with the same function.
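For intuition on AND-term counting, the sketch below uses the fast Möbius transform to convert a Boolean truth table into its AND-XOR (positive-polarity Reed-Muller) form: nonzero coefficients are the AND terms, and the popcount of a coefficient index is the term's degree. The MPRM/TB dual-logic simplification in RA-NLBF mixes polarities and goes further; this sketch covers only the fixed-polarity case.

```python
# Fast Moebius transform over GF(2): truth table -> ANF (AND-XOR) coefficients.
def anf_coefficients(truth_table):
    """truth_table: list of 0/1 of length 2^n, index bits = input assignment."""
    a = list(truth_table)
    n = len(a).bit_length() - 1
    for i in range(n):
        for j in range(len(a)):
            if j & (1 << i):
                a[j] ^= a[j ^ (1 << i)]        # XOR accumulation in GF(2)
    return a                                    # a[m]=1 -> monomial m present

tt = [0, 1, 1, 0, 1, 0, 0, 1]                   # x0 XOR x1 XOR x2
terms = [m for m, c in enumerate(anf_coefficients(tt)) if c]
print(terms, [bin(m).count('1') for m in terms])  # AND terms and their degrees
```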
Exact cover problems are NP-complete problems in combinatorial optimization, and it is difficult to solve them in polynomial time with classical algorithms. To solve this problem, on the open-source quantum computing framework Qiskit, a quantum circuit solution based on the Quantum Approximate Optimization Algorithm (QAOA) was proposed, and the Constrained Optimization BY Linear Approximation (COBYLA) algorithm, based on the simplex method, was used to optimize the parameters of the quantum logic gates. Firstly, the classical Ising model was established from the mathematical model of the exact cover problem. Secondly, the classical Ising model was quantized using the rotation variables of quantum theory, and the Pauli rotation operators were used to replace the rotation variables to obtain the quantum Ising model and the problem Hamiltonian, which improved the speed of QAOA in finding the optimal solution. Finally, the expectation expression of the problem Hamiltonian was obtained by accumulating products of the unitary transformation generated by the mixing Hamiltonian and the unitary transformation generated by the problem Hamiltonian, and the corresponding quantum circuit was designed. In addition, a classical processor was used to optimize the parameters of the two unitary transformations to adjust the expected value of the problem Hamiltonian, thereby increasing the probability of obtaining the solution. The circuit was simulated on Qiskit. Experimental results show that the proposed scheme can obtain the solution of the problem in polynomial time with a probability of 95.6%, proving that the proposed quantum circuit can find a solution to the exact cover problem with high probability.
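A tiny depth-1 QAOA sketch in the same spirit follows: a given Ising Hamiltonian (here an illustrative 3-qubit instance standing in for the exact-cover mapping) defines the cost unitary, an RX layer is the mixer, and COBYLA tunes the two angles against the expected energy. It requires qiskit and scipy; the actual encoding of a specific exact-cover instance is omitted.

```python
# p=1 QAOA on an illustrative Ising Hamiltonian, optimized with COBYLA.
import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector
from scipy.optimize import minimize

J = {(0, 1): 1.0, (1, 2): 1.0}   # illustrative ZZ couplings
h = {0: -0.5, 1: 1.0, 2: -0.5}   # illustrative local Z fields
N = 3

def qaoa_circuit(gamma, beta):
    qc = QuantumCircuit(N)
    qc.h(range(N))                       # uniform superposition
    for (i, j), w in J.items():          # cost unitary: ZZ rotations
        qc.rzz(2 * gamma * w, i, j)
    for i, w in h.items():               # cost unitary: Z rotations
        qc.rz(2 * gamma * w, i)
    qc.rx(2 * beta, range(N))            # mixer unitary
    return qc

def energy(params):
    probs = np.abs(Statevector(qaoa_circuit(*params)).data) ** 2
    e = 0.0
    for idx, p in enumerate(probs):      # classical Ising energy per basis state
        z = [1 - 2 * ((idx >> q) & 1) for q in range(N)]
        e += p * (sum(w * z[i] * z[j] for (i, j), w in J.items())
                  + sum(w * z[i] for i, w in h.items()))
    return e

res = minimize(energy, x0=[0.8, 0.4], method='COBYLA')
print(res.x, res.fun)                    # tuned (gamma, beta) and expected energy
```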
Route planning is very important for the task execution of Unmanned Aerial Vehicle (UAV) swarms, and the computation is usually complex in high-dimensional scenarios, for which swarm intelligence has provided a good solution. The Particle Swarm Optimization (PSO) algorithm is especially suitable for route planning because of its advantages such as few parameters, fast convergence, and simple operation; however, PSO has poor global search ability and easily falls into local optima when applied to route planning. To solve these problems and improve the effect of UAV swarm route planning, a Dynamic Cluster Particle Swarm Optimization (DCPSO) algorithm was proposed. Firstly, the artificial potential field method and the receding horizon control principle were used to model the task scenario of the UAV swarm route planning problem. Secondly, a Tent chaotic map and a dynamic cluster mechanism were introduced to further improve global search ability and search accuracy. Finally, the DCPSO algorithm was used to optimize the objective function of the model to obtain each trajectory point of the UAV swarm. Simulation experiments were carried out on 10 benchmark functions with different combinations of unimodal/multimodal and low-dimensional/high-dimensional settings. The results show that, compared with the PSO algorithm, Pigeon-Inspired Optimization (PIO), the Sparrow Search Algorithm (SSA), and the Chaotic Disturbance Pigeon-Inspired Optimization (CDPIO) algorithm, the DCPSO algorithm has better optimal values, mean values, and variances, as well as better search accuracy and stronger stability. Moreover, the performance and effect of the DCPSO algorithm were demonstrated in route planning application instances of UAV swarm simulation experiments.
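A short sketch of Tent-map-based swarm initialization follows: the chaotic sequence covers [0,1] more evenly than many pseudo-random draws, which is the usual motivation for seeding PSO particles with it. The map parameter and the scaling into the search bounds follow common practice, not necessarily the paper's exact settings.

```python
# Tent chaotic map initialization for a PSO swarm.
import numpy as np

def tent_sequence(length, x0=0.37, a=0.7):
    xs, x = [], x0
    for _ in range(length):
        x = x / a if x < a else (1 - x) / (1 - a)   # tent map iteration
        xs.append(x)
    return np.array(xs)

def init_swarm(n_particles, dim, lower, upper):
    chaos = tent_sequence(n_particles * dim).reshape(n_particles, dim)
    return lower + chaos * (upper - lower)           # scale into search space

print(init_swarm(5, 3, lower=-10.0, upper=10.0))
```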
Aiming at the power problem of Automated Guided Vehicles (AGV) performing tasks in an Automated Container Terminal (ACT), an integrated scheduling method considering the AGV charging strategy, based on an improved Non-dominated Sorting Genetic Algorithm II (NSGA-II), was proposed. Firstly, considering the power consumption of AGVs under different operating statuses in the integrated scheduling mode of quay cranes, yard cranes, and AGVs, a multi-objective mixed programming model was established with the goals of minimizing the completion time and the total power consumption. Secondly, to improve the performance of the traditional NSGA-II, an adaptive NSGA-II was designed and compared with the CPLEX solver and the Multi-Objective Particle Swarm Optimization (MOPSO) algorithm. Finally, different charging strategies and equipment number ratios of AGVs were designed for experimental research. The algorithm comparison results show that the solutions of the adaptive NSGA-II improve on those of NSGA-II by 2.80% and 2.63% on the two objectives, respectively. The experiments applying the adaptive NSGA-II to study charging strategies and equipment number ratios show that increasing the number of charging AGVs can reduce AGV charging time, and that adjusting the equipment number ratio to 3:3:9 and 3:7:3 leads to the highest time utilization of yard cranes and AGVs, respectively. It can be seen that the AGV charging strategy and the equipment number ratio can influence integrated terminal scheduling with multiple types of equipment.
The k-ary n-cube has many good properties and has become one of the most commonly used interconnection network topologies in multiprocessor systems. The ability to maintain system subnetworks plays an important role in the practical application of systems when failures occur in the interconnection network. In order to accurately measure the fault tolerance of subnetworks of arbitrary size in a k-ary n-cube, the reliability of k-ary (n-m)-cube subnetworks in a k-ary n-cube in the presence of failures was studied. For odd k greater than 2, the upper and lower bounds on the probability that at least one fault-free k-ary (n-m)-cube subnetwork exists in a k-ary n-cube were obtained under the probabilistic fault condition, and an approximate method for evaluating this reliability was proposed. Experimental results show that the upper and lower bounds on the k-ary (n-m)-cube subnetwork reliability gradually converge as the vertex reliability decreases, and that the evaluation result obtained by the approximate method is relatively accurate when the vertex reliability is large.
Traditional Deep Learning (DL)-based multi-objective solvers suffer from low model utilization and easily fall into local optima. Aiming at these problems, a Multi-objective Optimization model for Unmanned aerial vehicle Trajectories based on Decomposition and Trajectory search (DTMO-UT) was proposed, consisting of an encoding part and a decoding part. First, the encoding part contained a Device encoder (Dencoder) and a Weight encoder (Wencoder), which extracted the state information of the Internet of Things (IoT) devices and the features of the weight vectors. The weight vectors represented the scalar optimization sub-problems decomposed from the Multi-objective Optimization Problem (MOP), so the MOP could be solved by solving all the sub-problems; since the Wencoder could encode all sub-problems, the utilization of the model was improved. Then, the decoding part, containing a Trajectory decoder (Tdecoder), was used to decode the encoded features into Pareto optimal solutions. Finally, to alleviate the tendency of the greedy strategy to fall into local optima, trajectory search technology was added to the trajectory decoder: multiple candidate trajectories were generated, and the one with the best scalar value was selected as the Pareto optimal solution. In this way, the exploration ability of the trajectory decoder during trajectory planning was enhanced, and a better-quality Pareto set was found. Simulation results show that, compared with mainstream DL-based MOP solvers, with 98.93% fewer model parameters, the proposed model reduces the distribution metric of MOP solutions by 0.076%, improves the spread of the solutions by 0.014%, and increases the overall performance by 1.23%, showing the strong practical trajectory planning ability of the DTMO-UT model.
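The sketch below shows how decomposition turns an MOP into scalar sub-problems in the standard Tchebycheff form the abstract alludes to: each weight vector defines one sub-problem, and trajectory search keeps the candidate with the best scalar value. The ideal point z* and the weight generation are routine choices assumed here, not taken from the paper.

```python
# Tchebycheff decomposition plus candidate selection (trajectory search step).
import numpy as np

def tchebycheff(objectives, weights, z_star):
    """Scalar value of one sub-problem: max_i w_i * |f_i - z*_i|."""
    return np.max(weights * np.abs(objectives - z_star))

def best_candidate(candidate_objs, weights, z_star):
    """Pick the candidate trajectory with the best (lowest) scalar value."""
    scores = [tchebycheff(f, weights, z_star) for f in candidate_objs]
    return int(np.argmin(scores))

objs = np.array([[3.0, 1.0], [2.0, 2.0], [1.2, 2.8]])   # candidate trajectories
w = np.array([0.5, 0.5])                                 # one weight vector
print(best_candidate(objs, w, z_star=np.array([1.0, 1.0])))
```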
Aiming at the problems that heuristic algorithms yield unstable path lengths and easily fall into local minima in robot path planning, an Adaptively Adjusted Harris Hawks Optimization (AAHHO) algorithm was proposed. Firstly, a convergence factor adjustment strategy was used to balance the global search stage and the local search stage, with the natural constant e as the base, improving search efficiency and convergence accuracy. Then, in the global search phase, an elite cooperative guided search strategy was adopted, in which three elite Harris hawks cooperatively guided the other individuals to update their positions, enhancing search performance and strengthening information exchange within the population through the three best positions. Finally, by simulating an intraspecific competition strategy, the ability of the Harris hawks to jump out of local optima was improved. Comparative results on function tests and robot path planning show that the proposed algorithm is superior to comparison algorithms such as IHHO (Improved Harris Hawks Optimization) and CHHO (Chaotic Harris Hawks Optimization) in both function tests and path planning, with better effectiveness, feasibility, and stability in robot path planning.
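A sketch of an e-based convergence factor is given below, of the kind that shifts HHO from exploration to exploitation as iterations proceed; the precise formula used in AAHHO is an assumption of this illustration.

```python
# An e-based convergence factor and the HHO-style escape energy driven by it.
import math, random

def convergence_factor(t, t_max):
    """(e^(1 - t/t_max) - 1) / (e - 1): decays from 1 to 0 on an e-based curve."""
    return (math.exp(1 - t / t_max) - 1) / (math.e - 1)

def escape_energy(t, t_max):
    e0 = random.uniform(-1, 1)            # initial prey energy, as in HHO
    return 2 * e0 * convergence_factor(t, t_max)

# |escape_energy| >= 1 triggers the global search phase, < 1 the local phase.
```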
With the rapid development of the Internet of Things (IoT), the large amount of data generated in edge scenarios such as sensor networks often needs to be transmitted to cloud nodes for processing, which incurs huge transmission costs and processing delays. Cloud-edge collaboration provides a solution to these problems. Firstly, on the basis of a comprehensive investigation and analysis of the development of cloud-edge collaboration, and combined with current research ideas and progress in intelligent cloud-edge collaboration, data acquisition and analysis, computation offloading, and model-based intelligent optimization technologies in cloud-edge architectures were analyzed and discussed. Secondly, the functions and applications of various technologies in intelligent cloud-edge collaboration were analyzed in depth from both the edge side and the cloud side, and real application scenarios of intelligent cloud-edge collaboration technology were discussed. Finally, the current challenges and future development directions of intelligent cloud-edge collaboration were pointed out.
The purpose of research on Multi-Robot Task Allocation (MRTA) is to improve the task completion efficiency of robots in smart factories. Aiming at the deficiencies of existing algorithms in dealing with large-scale multi-constrained MRTA, an MRTA Algorithm Combining Genetic Algorithm and Rolling Scheduling (ACGARS) was proposed. Firstly, an encoding method based on Directed Acyclic Graphs (DAG) was adopted in the genetic algorithm to efficiently handle the priority constraints among tasks. Then, prior knowledge was added to the initial population of the genetic algorithm to improve its search efficiency. Finally, a rolling scheduling strategy based on task groups was designed to reduce the scale of the problem to be solved, thereby solving large-scale problems efficiently. Experimental results on large-scale problem instances show that, compared with the schemes generated by the Constructive Heuristic Algorithm (CHA), the MinInterfere Algorithm (MIA), and the Genetic Algorithm with Penalty Strategy (GAPS), the scheme generated by the proposed algorithm shortens the average order completion time by 30.02%, 16.86%, and 75.65%, respectively, when the number of task groups is 20, verifying that the proposed algorithm can effectively shorten the average waiting time of orders and improve the efficiency of multi-robot task allocation.
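A sketch of precedence-safe chromosome generation follows, using Kahn's algorithm with random tie-breaking: any decoded sequence is automatically a valid topological order of the task DAG, so crossover and mutation never have to repair precedence violations. The robot-assignment genes of ACGARS are omitted.

```python
# Random topological order of a task DAG as a precedence-feasible chromosome.
import random
from collections import defaultdict

def random_topological_order(n_tasks, edges):
    """edges: list of (u, v) meaning task u must precede task v."""
    indeg = defaultdict(int)
    succ = defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    ready = [t for t in range(n_tasks) if indeg[t] == 0]
    order = []
    while ready:
        t = ready.pop(random.randrange(len(ready)))  # random eligible task
        order.append(t)
        for v in succ[t]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return order                      # a valid chromosome w.r.t. precedence

print(random_topological_order(5, [(0, 2), (1, 2), (2, 3), (2, 4)]))
```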
Genotype imputation can compensate for missing data caused by technical limitations by estimating the sample regions not covered in gene sequencing data, but existing deep learning-based imputation methods cannot effectively capture the linkage among loci of the complete sequence, resulting in low overall imputation accuracy and high dispersion of imputation accuracy across sequence batches. Therefore, FCSA (Fusing Convolution and Self-Attention), an imputation method that fuses convolution and the self-attention mechanism, was proposed to address these problems; two fusion modules were used to form the encoder and decoder of the network model. In the encoder fusion module, a self-attention layer was used to obtain the correlation among loci of the complete sequence, and local features were extracted through the convolutional layer after fusing the correlation into the global loci. In the decoder fusion module, the local features of the encoded low-dimensional vector were reconstructed by convolution, and the complete sequence was modeled and fused by the self-attention layer. Genetic data of multiple animal species were used for model training, and comparison and validation were carried out on the Dog, Pig, and Chicken datasets. The results show that, compared with SCDA (Sparse Convolutional Denoising Autoencoders), AGIC (Autoencoder Genome Imputation and Compression), and U-net, FCSA achieves the highest average imputation accuracy at missing rates of 10%, 20%, and 30%. Ablation results also show that the design of the two fusion modules is effective in improving the accuracy of genotype imputation.
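A compact PyTorch sketch of an encoder-side fusion block in the spirit of FCSA follows: self-attention first captures locus-to-locus correlation over the full sequence, then a convolution extracts local features from the fused result. The channel size, head count, and kernel width are illustrative choices, not the paper's configuration.

```python
# Encoder fusion block: global self-attention followed by local convolution.
import torch
import torch.nn as nn

class EncoderFusionBlock(nn.Module):
    def __init__(self, channels=64, heads=4, kernel=7):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)
        self.conv = nn.Conv1d(channels, channels, kernel, padding=kernel // 2)

    def forward(self, x):                 # x: (batch, seq_len, channels)
        a, _ = self.attn(x, x, x)         # global correlation among loci
        x = self.norm(x + a)              # fuse correlation into the sequence
        y = self.conv(x.transpose(1, 2))  # local features via convolution
        return torch.relu(y).transpose(1, 2)

block = EncoderFusionBlock()
out = block(torch.randn(2, 128, 64))      # 2 samples, 128 loci, 64 channels
print(out.shape)                          # torch.Size([2, 128, 64])
```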
The domestic DCU (Deep Computing Unit) adopts the Single Instruction Multiple Thread (SIMT) parallel execution model. When programs are executed, inconsistent control flow generated in kernel functions causes the threads in a warp to be executed serially; this is warp divergence. Aiming at the problem that kernel function performance is severely restricted by warp divergence, a compilation optimization method for reducing warp divergence time, called Partial-Control-Flow-Merging (PCFM), was proposed. Firstly, divergence analysis was performed to find fusible divergent regions that are isomorphic and contain a large number of identical and similar instructions. Then, the fusion profit of the fusible divergent regions was evaluated by counting the percentage of instruction cycles saved after merging. Finally, the alignment sequence was searched, and the profitable fusible divergent regions were merged. PCFM was tested on a DCU with test cases selected from the GPU (Graphics Processing Unit) benchmark suite Rodinia and a classic sorting algorithm. Experimental results show that PCFM achieves an average speedup ratio of 1.146 on the test cases, and that its speedup is 5.72% higher than that of the branch fusion + tail merging method. It can be seen that the proposed method is effective at reducing warp divergence.
Most Multi-objective Optimization Problems (MOP) in real life are Dynamic Multi-objective Optimization Problems (DMOP), whose objective functions, constraints, and decision variables may change with time. This requires an algorithm to adapt quickly to the new environment after a change, and to guarantee the diversity of the Pareto solution set while converging quickly to the new Pareto front. To this end, an Adaptive Prediction Dynamic Multi-objective Optimization Algorithm based on a New Evaluation Index (NEI-APDMOA) was proposed. Firstly, a new evaluation index superior to crowding distance was proposed for the population non-dominated sorting process, balancing convergence speed and population diversity in different stages and making the population convergence process more reasonable. Secondly, a factor that can judge the strength of environmental changes was proposed, providing valuable information for the prediction stage and guiding the population to better adapt to environmental changes. Finally, three prediction strategies were matched according to the environmental change factor, so that the population could respond to environmental changes quickly. NEI-APDMOA was compared with the DNSGA-II-A (Dynamic Non-dominated Sorting Genetic Algorithm-II-A), DNSGA-II-B (Dynamic Non-dominated Sorting Genetic Algorithm-II-B), and PPS (Population Prediction Strategy) algorithms on nine standard dynamic test functions. Experimental results show that NEI-APDMOA achieves the best average Inverted Generational Distance (IGD) values, average SPacing (SP) values, and average Generational Distance (GD) values on nine, four, and eight test functions respectively, and can respond to environmental changes faster.
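One way to quantify the strength of an environmental change, as the abstract sketches, is shown below: a fixed set of sentinel solutions is re-evaluated after the change and the normalized average objective shift routes the population to one of three prediction strategies. The thresholds and strategy names are illustrative, not the paper's calibrated values.

```python
# Environmental change-strength factor and strategy dispatch (illustrative).
import numpy as np

def change_factor(f_old, f_new):
    """f_old/f_new: objective matrices of shape (n_sentinels, n_objectives)."""
    shift = np.linalg.norm(f_new - f_old, axis=1)
    scale = np.linalg.norm(f_old, axis=1) + 1e-12
    return float(np.mean(shift / scale))

def pick_strategy(factor, weak=0.1, strong=0.5):
    if factor < weak:
        return "reuse previous population"       # weak change
    if factor < strong:
        return "linear prediction of centroids"  # moderate change
    return "partial random reinitialization"     # strong change
```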