Speeding up the execution-time of Crystals-Kyber PQC Algorithm on FPGA

Document Type : Original Article

Authors

1 Master's student, Boali Sina University, Hamadan, Iran

2 Assistant Professor, Bu- Ali Sina University, Hamadan, Iran

3 Associate Professor, Bu- Ali Sina University, Hamedan, Iran

Abstract

Quantum computers have much more computing power than classical computers and this has created a challenge in the field of public-key cryptography algorithms, which is predicted quantum computers will reach the computational power to break existing public-key cryptography algorithms by 2030. To solve this problem, NIST published a call for post-quantum cryptography algorithms. Implementing these algorithms faces challenges such as execution time and resources. One of the algorithms that made it to the third round is the CRYSTALS-KYBER algorithm. In this algorithm, by optimizing the NTT module, the execution time is reduced. Usually, the implementation of NTT is created with radix-2, but in the proposed method, radix-4 is used, and this reduces the execution time. Changes to NTT are required to implement radix-4 NTT. DIT is used to implement NTT and DIF is used to implement INTT. In NTT and INTT formulas changes are made to the twiddle factors and the values of the twiddle factors stored to the ROM. In the following, we compared radix-4 butterfly unit with radix-2 butterfly unit. By reusing results in CT and GS butterfly units, we need four multiplications, additions, and subtractions, and the structure of radix-4 butterfly unit is mentioned. The memory unit uses eight RAMs to increase read and write speeds, four of which are for writing and the remaining four are for reading. It is also necessary to make corrections to the NTT parameters which are suitable for implementation on Kyber. Next, we implemented the proposed method on two FPGA Artix-7 and Virtex-7 using Vivado software. In the implementation on Artix-7 and Virtex-7 in exchange for a slight increase in the resources, the execution time in Artix-7 is reduced by 28.74% and 12.34% compared to similar implementations.

Keywords


Smiley face

[1] ‏Farahmand, F., Nguyen, D. T., Dang, V. B., Ferozpuri, A., & Gaj, K., “Software/Hardware Codesign of the Post Quantum Cryptography Algorithm NTRUEncrypt Using High-Level Synthesis and Register-Transfer Level Design Methodologies,” In 29th International Conference on Field Programmable Logic and Applications (FPL), pp. 225-231, 2019.
[2] Xie, J., Basu, K., Gaj, K., & Guin, U., “Special Session: The Recent Advance in Hardware Implementation of Post-Quantum Cryptography,”In IEEE 38th VLSI Test Symposium (VTS), pp. 1-10, 2020.
[3] Avanzi, R., Bos, J., Ducas, L., Kiltz, E., Lepoint, T., Lyubashevsky, V., Schanck, J.M., Schwabe, P., Seiler, G. & Stehlé, D., “CRYSTALS-Kyber algorithm specifications and supporting documentation,” NIST PQC Round 2, 2017.
[4] Yarman, F., Mert, A.C., Öztürk, E. & Savaş, E., “A hardware accelerator for polynomial multiplication operation of CRYSTALS-KYBER PQC scheme,” In 2021 Design, Automation & Test in Europe Conference & Exhibition pp. 1020-1025, 2021.
[5] Derya, K., Mert, A.C., Öztürk, E. & Savaş, E., “CoHA-NTT: A Configurable Hardware Accelerator for NTT-based Polynomial Multiplication,” Microprocessors and Microsystems, pp. 104-451, 2022.
[6] Xing, Y. & Li, S., “A compact hardware implementation of CCA-secure key exchange mechanism CRYSTALS-KYBER on FPGA,” IACR Transactions on Cryptographic Hardware and Embedded Systems, pp. 328-356, 2021.
[7] Guo, W., Li, S. & Kong, L., “An Efficient Implementation of KYBER,” IEEE Transactions on Circuits and Systems II: Express Briefs, 2021.
[8] Zhang, C., Liu, D., Liu, X., Zou, X., Niu, G., Liu, B. & Jiang, Q., “Towards efficient hardware implementation of NTT for kyber on FPGAs,” In 2021 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1-5, 2021.
[9] Chen, X., Yang, B., Yin, S., Wei, S. & Liu, L., “CFNTT: Scalable Radix-2/4 NTT Multiplication Architecture with an Efficient Conflict-free Memory Mapping Scheme,” IACR Transactions on Cryptographic Hardware and Embedded Systems, pp. 94-126, 2022.
[10] Garrido, M., Grajal, J., Sanchez, M.A. & Gustafsson, O., “Pipelined radix-$2^{k} $ feedforward FFT architectures,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems,  pp. 23-32, 2011.
[11] Swartzlander, E.E., Young, W.K. & Joseph, S.J., “A radix 4 delay commutator for fast Fourier transform processor implementation,” IEEE Journal of solid-state circuits, pp. 702-709, 1984.
[12] Bisheh-Niasar, M., Azarderakhsh, R. & Mozaffari-Kermani, M., “High-speed NTT-based polynomial multiplication accelerator for CRYSTALS-Kyber post-quantum cryptography,” Cryptology ePrint Archive, 2021.
[13] Doustimotlagh, S. N., “A New Mechanism for Enhancing the Security of Military Internet of Things by Using Quantum and Classic Cryptography,” Electronic and Cyber Defense, vol. 9, no. 6, pp. 29-49, 2021. (In Persian)
Volume 10, Issue 4 - Serial Number 40
January 2023
Pages 101-110
  • Receive Date: 11 September 2022
  • Revise Date: 08 October 2022
  • Accept Date: 15 October 2022
  • Publish Date: 21 January 2023