[1] NAKAMOTO S. Bitcoin: a peer-to-peer electronic cash system[EB/OL]. [2017-10-10]. https://bitcoin.org/bitcoin.pdf.
[2] COURTOIS N T, BAHACK L. On subversive miner strategies and block withholding attack in bitcoin digital currency[J/OL]. arXiv Preprint, 2014, 2014: arXiv:1402.1718 (2014-01-28) [2014-12-02]. https://arxiv.org/abs/1402.1718.
[3] EYAL I. The miner’s dilemma[C]// Proceedings of the 2015 IEEE Symposium on Security and Privacy. Piscataway, NJ: IEEE, 2015: 89-103.
[4] EYAL I, SIRER E G. Majority is not enough: bitcoin mining is vulnerable[C]// FC 2014: International Conference on Financial Cryptography and Data Security. Berlin: Springer, 2014: 436-454.
[5] KIAYIAS A, KOUTSOUPIAS E, KYROPOULOU M, et al. Blockchain mining games[C]// Proceedings of the 2016 ACM Conference on Economics and Computation. New York: ACM, 2016: 365-382.
[6] LEWENBERG Y, BACHRACH Y, SOMPOLINSKY Y, et al. Bitcoin mining pools: a cooperative game theoretic analysis[C]// Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems. Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems, 2015: 919-927.
[7] LIU X, WANG W, NIYATO D, et al. Evolutionary game for mining pool selection in blockchain networks[J]. IEEE Wireless Communications Letters, 2017, 7(5): 760-763.
[8] TANG C B, YANG Z, ZHENG Z L, et al. Game dilemma analysis and optimization of PoW consensus algorithm[J]. Acta Automatica Sinica, 2017, 43(9): 1520-1531. (in Chinese)
[9] SUTTON R S, McALLESTER D, SINGH S, et al. Policy gradient methods for reinforcement learning with function approximation[C]// NIPS 2000: Neural Information Processing Systems. Boston: MIT Press, 2000: 1057-1063.
[10] WILLIAMS R J. Simple statistical gradient-following algorithms for connectionist reinforcement learning[J]. Machine Learning, 1992, 8(3/4): 229-256.
[11] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.
[12] TAMPUU A, MATIISEN T, KODELJA D, et al. Multiagent cooperation and competition with deep reinforcement learning[J]. PLoS One, 2017, 12(4): e0172395.
[13] LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[J/OL]. arXiv Preprint, 2015, 2015: arXiv:1509.02971 [2015-09-09]. https://arxiv.org/abs/1509.02971.
[14] WANG B T, ZHANG Z Q, ZHAO P F. Numerical Analysis Concise Tutorial (University Mathematics Series)[M]. Beijing: Tsinghua University Press, 2012: 50-60. (in Chinese)