Implementation updates (April 15, 2020)
The specification document has been updated to reflect the following recent developments from members of the SIKE team.
SIKE team updates
Algorithmic improvements (applicable to all platforms)
New field arithmetic algorithms, available in the Microsoft SIDH library. With the new optimizations, Encaps+Decaps runs in 5.9 ms, 8.2 ms, 16.1 ms and 24.9 ms for SIKEp434, SIKEp503, SIKEp610 and SIKEp751, respectively, on a 3.4GHz x64 Intel (Skylake) CPU.
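Benchmarks of this kind are often compared in clock cycles rather than milliseconds. The following is a minimal sketch of that conversion for the four Encaps+Decaps timings quoted above. It assumes the CPU ran at its nominal 3.4GHz for the whole measurement, which is not stated here, and `ms_to_megacycles` is a hypothetical helper name, not part of any library.

```python
# Convert the reported Encaps+Decaps latencies into approximate cycle counts.
# Assumption: the clock stayed at the nominal 3.4 GHz during measurement.
CLOCK_HZ = 3.4e9

# Latencies in milliseconds, in the order they are reported in the text.
latencies_ms = [5.9, 8.2, 16.1, 24.9]


def ms_to_megacycles(ms, clock_hz=CLOCK_HZ):
    """Return the approximate cycle count, in millions, for a latency in ms."""
    return ms * 1e-3 * clock_hz / 1e6


megacycles = [round(ms_to_megacycles(ms), 1) for ms in latencies_ms]
print(megacycles)  # prints [20.1, 27.9, 54.7, 84.7]
```

These are rough estimates only; cycle counts measured directly on the target machine should be preferred where available.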
New compressed SIKE implementation, which now includes the improvements described in [Naehrig and Renes, 2019] and [Pereira et al., 2020], available in the Microsoft SIDH library, resulting in the following changes:
- Decapsulation is now ~1.6 times faster than the NIST round 2 compressed SIKE implementation, and only 5-9% slower than uncompressed SIKE. KeyGen is ~1.6 times faster, and Encapsulation is ~1.3 times faster.
- Static library size is 1.5 to 4 times smaller than the NIST round 2 compressed SIKE implementation due to smaller discrete logarithm tables. For example, the SIKEp503_compressed library size has decreased from 5.6MB to 1.4MB.
- Decapsulation no longer requires any discrete logarithm tables. As a result, decapsulation-only routines now have nearly the same static library size as SIKE without key compression.
- Ciphertext sizes are 12.5% larger than the NIST round 2 compressed SIKE implementation. Public keys are 1 byte larger. Secret keys are ~50% larger, but still remain ~7% smaller than in SIKE without key compression.
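As a quick check of the library-size figures in the list above, the SIKEp503_compressed example can be worked through directly. This is only an illustrative calculation using the two sizes quoted in the text; the variable names are made up for the example.

```python
# Static library sizes in MB for SIKEp503_compressed, as quoted in the text.
old_size_mb = 5.6  # NIST round 2 compressed SIKE implementation
new_size_mb = 1.4  # updated compressed SIKE implementation

# How many times smaller the new library is, and the share of space saved.
reduction_factor = round(old_size_mb / new_size_mb, 2)
saved_percent = round(100 * (1 - new_size_mb / old_size_mb), 1)

print(reduction_factor, saved_percent)  # prints 4.0 75.0
```

The factor of 4 sits at the upper end of the stated 1.5 to 4 times range.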
New ARMv8 optimized implementations, available in the Microsoft SIDH library, representing a further improvement over previous work published by [Seo et al.] in TCAS (to appear). Encaps+Decaps for SIKEp434, SIKEp503, SIKEp610 and SIKEp751 now run in 29.4 ms, 40.9 ms, 94.9 ms and 141.6 ms, respectively, on a 1.992GHz 64-bit ARM Cortex-A72 processor.
New speed-optimized pure hardware (FPGA, Artix-7) implementations by Brian Koziel, A-Bon Ackie, Rami El Khatib, Reza Azarderakhsh, and Mehran Mozaffari-Kermani (preprint, source code). The new results are faster and smaller than previous work. For example, SIKEp434 takes 14.4 ms for Encap+Decap and occupies 8K slices, 240 DSPs and 26.5 BRAMs.
New ARM Cortex-M4 assembly optimized implementations by Hwajeong Seo, Mila Anastasova, Amir Jalali, and Reza Azarderakhsh (preprint, source code). The new results are 1.8x faster than previously reported results of [Seo et al.] from CANS 2019. For example, SIKEp434 runs in a total time of 1.08 s at 168MHz (Encap+Decap). Results for energy and power consumption have been added to the updated specification document.
New compact and efficient hw/sw co-design implementation by Pedro M.C. Massolino, Patrick Longa, Joost Renes and Lejla Batina (preprint, source code, to appear in CHES 2020), targeting embedded applications. The smallest architecture based on a 128-bit MAC unit takes only 3415 slices, 21 BRAMs and 57 DSPs on a Virtex 7 690T and can perform key generation, encapsulation and decapsulation in 14.4, 24.4 and 26.0 milliseconds for
The following additional recent developments may also be of interest to the community. The updated specification file does not reflect these developments.