Zkt Extension

Versions

1.0.0
  State: ratified
  Ratification date: 2021-11

1.0.1
  State: ratified
  Ratification date:
  Changes:

Synopsis

The Zkt extension attests that the machine has data-independent execution time for a safe subset of instructions. This property is commonly called "constant-time", although it should not be taken with that literal meaning.

All currently proposed cryptographic instructions (scalar K extension) are on this list, together with a set of relevant supporting instructions from the I, M, C, and B extensions.

Failure to prevent leakage of sensitive parameters via the direct timing channel is considered a serious security vulnerability and will typically result in a CERT CVE security advisory.

Scope and Goal

An "ISA contract" is made between a programmer and the RISC-V implementation that Zkt instructions do not leak information about processed secret data (plaintext, keying information, or other "sensitive security parameters", a FIPS 140-3 term) through differences in execution latency. Zkt does not define a set of instructions available in the core; it only restricts the behaviour of certain instructions if they are implemented.

Currently, the scope of this document is within scalar RV32/RV64 processors. Vector cryptography instructions (and appropriate vector support instructions) will be added later, as will other security-related functions that wish to assert leakage-free execution latency properties.

Loads, stores, and conditional branches are excluded, along with a set of instructions that are rarely necessary to process secret data. Also excluded are instructions for which workarounds already exist in standard cryptographic middleware because of the limitations of processors implementing other ISAs.
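
One widely used software workaround for the exclusion of loads is to read every table entry and keep only the wanted one with a mask, so that no load address depends on a secret. The following is a minimal sketch of that dataflow in Python. The helper names `ct_eq_mask` and `ct_lookup` are hypothetical, 32-bit table entries are assumed, and Python itself makes no timing guarantees; the sketch only models logic that C or assembly code would express with xor, or, sub, shift, and and.

```python
MASK32 = 0xFFFFFFFF

def ct_eq_mask(a, b):
    """Return 0xFFFFFFFF if a == b, else 0, without comparing or branching."""
    x = (a ^ b) & MASK32
    # (x | -x) has its top bit set exactly when x is non-zero.
    nonzero = ((x | (-x & MASK32)) >> 31) & 1
    return (nonzero - 1) & MASK32

def ct_lookup(table, secret_index):
    """Touch every entry; the loop bound and all addresses are public."""
    result = 0
    for i, entry in enumerate(table):
        result |= entry & ct_eq_mask(i, secret_index)
    return result
```

The cost grows linearly with the table size, which is one reason dedicated instructions are preferred for S-box style lookups.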

The stated goal is that OpenSSL, BoringSSL (Android), the Linux Kernel, and similar trusted software will not have directly observable timing side channels when compiled and running on a Zkt-enabled RISC-V target. The Zkt extension explicitly states many of the common latency assumptions made by cryptography developers.

Vendors do not have to implement all of the list’s instructions to be Zkt compliant; however, if they claim to have Zkt and implement any of the listed instructions, those instructions must have data-independent latency.

For example, many simple RV32I and RV64I cores (without Multiply, Compressed, Bitmanip, or Cryptographic extensions) are technically compliant with Zkt. A constant-time AES can be implemented on them using "bit-slice" techniques, but it will be excruciatingly slow compared to an implementation that uses AES instructions. There are no guarantees that even a bit-sliced cipher implementation (largely based on boolean logic instructions) is secure on a core without Zkt attestation.
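
To illustrate what "bit-slice" means here, the sketch below evaluates a toy 3-bit nonlinear map on 32 inputs at once using only boolean operations on bit planes. The map is chosen purely for illustration and is not the AES S-box; the names `to_planes`, `from_planes`, and `chi3_bitsliced` are hypothetical. Python is used only to model the dataflow and gives no timing guarantees of its own.

```python
LANES = 32
MASK = (1 << LANES) - 1

def to_planes(values, width):
    """planes[j] collects bit j of every input; lane k sits at bit k."""
    planes = [0] * width
    for lane, v in enumerate(values):
        for j in range(width):
            planes[j] |= ((v >> j) & 1) << lane
    return planes

def from_planes(planes, count):
    """Inverse of to_planes for the first `count` lanes."""
    return [
        sum(((p >> lane) & 1) << j for j, p in enumerate(planes))
        for lane in range(count)
    ]

def chi3_bitsliced(planes):
    """y_j = x_j ^ (~x_{j+1} & x_{j+2}), computed for all lanes in parallel."""
    x0, x1, x2 = planes
    return [
        x0 ^ (~x1 & x2 & MASK),
        x1 ^ (~x2 & x0 & MASK),
        x2 ^ (~x0 & x1 & MASK),
    ]
```

Each output plane costs a handful of logic operations regardless of the input values, so the work done never depends on the data; a real cipher needs far more such gates per S-box, which is where the slowdown comes from.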

Out-of-order implementations adhering to Zkt are still free to fuse, crack, change or even ignore sequences of instructions, so long as the optimisations are applied deterministically, and not based on operand data. The guiding principle should be that no information about the data being operated on should be leaked based on the execution latency.

It is left to future extensions or other techniques to tackle the problem of data-independent execution in implementations with advanced out-of-order capabilities that use value prediction, or that are otherwise data-dependent.

Note to software developers

Programming techniques can only mitigate leakage directly caused by arithmetic, caches, and branches. Other ISAs have had micro-architectural issues such as Spectre, Meltdown, Speculative Store Bypass, Rogue System Register Read, Lazy FP State Restore, Bounds Check Bypass Store, TLBleed, and L1TF/Foreshadow. See, e.g., the NSA Hardware and Firmware Security Guidance.

It is not within the remit of this proposal to mitigate these micro-architectural leakages.

Background

  • Timing attacks are much more powerful than was realised before the 2010s, which has led to a significant mitigation effort in current cryptographic code-bases.

  • Cryptography developers use static and dynamic security testing tools to trace the handling of secret information and detect occasions where it influences a branch or is used for a table lookup.

  • Architectural testing for Zkt can be pragmatic and semi-formal; security by design against basic timing attacks can usually be achieved via conscious implementation (of relevant iterative multi-cycle instructions or instructions composed of micro-ops) in a way that avoids data-dependent latency.

  • Laboratory testing may utilize statistical timing attack leakage analysis techniques such as those described in ISO/IEC 17825 cite:[IS16].

  • Binary executables should not contain secrets in the instruction encodings (Kerckhoffs’s principle), so instruction timing is allowed to leak information about immediates, the ordering of input registers, etc. An exception may arise in systems where a binary loader modifies the executable for purposes of relocation and it is desirable to keep the execution location (PC) secret. This is why instructions such as LUI, AUIPC, and ADDI are on the list.

  • The rules used by audit tools are relatively simple to understand. Very briefly: we call the plaintext, secret keys, expanded keys, nonces, and other such variables "secrets". A secret variable (arithmetically) modifying any other variable or register turns that into a secret too. If a secret ends up in an address calculation affecting a load or store, that is a violation. If a secret affects a branch’s condition, that is also a violation. A secret variable location or register becomes a non-secret via specific zeroization/sanitisation or by being declared ciphertext (or otherwise no-longer-secret information). In essence, secrets can only "touch" instructions on the Zkt list while they are secrets.
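
The audit rules in the last bullet can be sketched as a small register-level taint tracker. This is an illustrative model only, not the behaviour of any particular tool: the trace format, the `declassify` pseudo-operation, and the name `audit` are assumptions, `ZKT_EXCERPT` is a short excerpt of the Zkt list rather than the full list, and memory contents are not tracked.

```python
ZKT_EXCERPT = {"add", "sub", "xor", "or", "and", "sll", "srl", "addi", "mul"}
MEMORY_OPS = {"lw", "sw"}          # sources listed for these are address registers
BRANCHES = {"beq", "bne", "blt", "bge"}

def audit(trace, secrets):
    """trace: list of (mnemonic, dest_or_None, [source registers])."""
    secret = set(secrets)
    violations = []
    for pc, (op, dest, srcs) in enumerate(trace):
        if op == "declassify":     # zeroized or declared ciphertext
            secret.discard(dest)
            continue
        touched = any(r in secret for r in srcs)
        if touched:
            if op in MEMORY_OPS:
                violations.append((pc, op, "secret-dependent address"))
            elif op in BRANCHES:
                violations.append((pc, op, "secret-dependent branch"))
            elif op not in ZKT_EXCERPT:
                violations.append((pc, op, "not on the Zkt list"))
        if dest is not None:
            # A secret input taints the destination; a public result clears it.
            if touched:
                secret.add(dest)
            else:
                secret.discard(dest)
    return violations
```

For example, with `a0` marked secret, an `xor` into `t0` is accepted, while a later `lw` addressed by `t0`, a `div` reading `t0`, or a `beq` on `t0` are each reported until `t0` is declassified.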

Specific Instruction Rationale

  • HINT instruction forms (typically encodings with rd=x0) are excluded from the data-independent time requirement.

  • Floating-point instructions (F, D, Q, L extensions) are currently excluded from the constant-time requirement as they have very few applications in standardised cryptography. We may consider adding floating-point add, sub, and multiply to the constant-time requirement for some floating-point extension if a specific algorithm (such as the PQC signature algorithm Falcon) becomes critical.

  • Cryptographers typically assume division to be variable-time (while multiplication is constant time) and implement their Montgomery reduction routines with that assumption.

  • Zicsr and Zifencei are excluded.

  • Some instructions are on the list simply because we see no harm in including them in testing scope.
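
The point about division and Montgomery reduction can be made concrete with a single-word sketch that reduces a product using only multiplication, addition, shifts, and masks. The word size of 32 bits and the names `neg_inverse` and `mont_reduce` are assumptions for illustration, and Python's big integers only model the arithmetic; they do not provide constant-time execution.

```python
W = 32
R = 1 << W
MASK = R - 1

def neg_inverse(n):
    """Return -n^(-1) mod 2^32 for odd n, using multiplications only."""
    x = n                       # n*n == 1 (mod 8) for odd n: 3 correct bits
    for _ in range(4):          # each Newton step doubles them: 6, 12, 24, 48
        x = (x * (2 - n * x)) & MASK
    return (-x) & MASK

def mont_reduce(t, n, n_prime):
    """Return t * R^(-1) mod n for 0 <= t < n*R, with no division by n."""
    m = ((t & MASK) * n_prime) & MASK
    u = (t + m * n) >> W        # exact: the low W bits of t + m*n are zero
    diff = u - n                # u < 2n, so diff lies in (-n, n)
    borrow = (diff >> (W + 1)) & 1
    return diff + (n & -borrow) # add n back only when diff was negative
```

The final correction uses a mask derived from the borrow rather than a comparison and branch, matching the assumption that multiplication and basic arithmetic are constant-time while division is not.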

Programming Information

For background information on secure programming "models", see:

Zkt listings

The following instructions are included in the Zkt subset. They are listed here grouped by their original parent extension.

Note to implementers

You do not need to implement all of these instructions to implement Zkt. Rather, every one of these instructions that the core does implement must adhere to the requirements of Zkt.
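
Read as a rule, this says that only the intersection of the implemented instructions with the Zkt list is constrained. A minimal sketch of that check follows; `ZKT_EXCERPT` is a short excerpt of the list for illustration, and the function names and the idea of describing a core by two sets of mnemonics are assumptions.

```python
ZKT_EXCERPT = frozenset({"add", "sub", "xor", "mul", "mulh", "aes64es", "clmul"})

def zkt_offenders(implemented, data_independent):
    """Listed instructions that the core implements with data-dependent latency."""
    return sorted((set(implemented) & ZKT_EXCERPT) - set(data_independent))

def zkt_compliant(implemented, data_independent):
    return not zkt_offenders(implemented, data_independent)
```

A core that lacks `mul` entirely passes, and so does one with a variable-latency `div`, because neither affects a listed instruction that is present; a core with an early-terminating `mul` does not pass.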

RVI (Base Instruction Set)

Only basic arithmetic and slt* (for carry computations) are included. The data-independent timing requirement does not apply to HINT instruction encoding forms of these instructions.
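
As an illustration of why slt* matters for carry computations, the sketch below adds two 64-bit values held as 32-bit halves, recovering the carry with an unsigned set-less-than instead of a branch. The helper names are hypothetical and the Python functions only model the RV32 instructions named in the comments.

```python
MASK32 = 0xFFFFFFFF

def sltu(a, b):
    """Model of sltu: 1 if a < b as unsigned 32-bit values, else 0."""
    return int((a & MASK32) < (b & MASK32))

def add64_on_rv32(a_lo, a_hi, b_lo, b_hi):
    lo = (a_lo + b_lo) & MASK32          # add
    carry = sltu(lo, a_lo)               # sltu: a wrapped sum is below its operand
    hi = (a_hi + b_hi + carry) & MASK32  # add, add
    return lo, hi
```

Because RISC-V has no carry flag, this add/sltu pattern is the usual way multi-word arithmetic is written, so it needs the same latency guarantee as the additions themselves.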

RV32  RV64  Mnemonic              Instruction
✓     ✓     lui rd, imm           'lui'
✓     ✓     auipc rd, imm         'auipc'
✓     ✓     addi rd, rs1, imm     'addi'
✓     ✓     slti rd, rs1, imm     'slti'
✓     ✓     sltiu rd, rs1, imm    'sltiu'
✓     ✓     xori rd, rs1, imm     'xori'
✓     ✓     ori rd, rs1, imm      'ori'
✓     ✓     andi rd, rs1, imm     'andi'
✓     ✓     slli rd, rs1, imm     'slli'
✓     ✓     srli rd, rs1, imm     'srli'
✓     ✓     srai rd, rs1, imm     'srai'
✓     ✓     add rd, rs1, rs2      'add'
✓     ✓     sub rd, rs1, rs2      'sub'
✓     ✓     sll rd, rs1, rs2      'sll'
✓     ✓     slt rd, rs1, rs2      'slt'
✓     ✓     sltu rd, rs1, rs2     'sltu'
✓     ✓     xor rd, rs1, rs2      'xor'
✓     ✓     srl rd, rs1, rs2      'srl'
✓     ✓     sra rd, rs1, rs2      'sra'
✓     ✓     or rd, rs1, rs2       'or'
✓     ✓     and rd, rs1, rs2      'and'
-     ✓     addiw rd, rs1, imm    'addiw'
-     ✓     slliw rd, rs1, imm    'slliw'
-     ✓     srliw rd, rs1, imm    'srliw'
-     ✓     sraiw rd, rs1, imm    'sraiw'
-     ✓     addw rd, rs1, rs2     'addw'
-     ✓     subw rd, rs1, rs2     'subw'
-     ✓     sllw rd, rs1, rs2     'sllw'
-     ✓     srlw rd, rs1, rs2     'srlw'
-     ✓     sraw rd, rs1, rs2     'sraw'

RVM (Multiply)

Multiplication is included; division and remainder instructions are excluded.

RV32  RV64  Mnemonic              Instruction
✓     ✓     mul rd, rs1, rs2      'mul'
✓     ✓     mulh rd, rs1, rs2     'mulh'
✓     ✓     mulhsu rd, rs1, rs2   'mulhsu'
✓     ✓     mulhu rd, rs1, rs2    'mulhu'
-     ✓     mulw rd, rs1, rs2     'mulw'

RVC (Compressed)

Same criteria as in RVI. Organised by quadrants.

RV32  RV64  Mnemonic              Instruction
✓     ✓     c.nop                 'c_nop'
✓     ✓     c.addi                'c_addi'
-     ✓     c.addiw               'c_addiw'
✓     ✓     c.lui                 'c_lui'
✓     ✓     c.srli                'c_srli'
✓     ✓     c.srai                'c_srai'
✓     ✓     c.andi                'c_andi'
✓     ✓     c.sub                 'c_sub'
✓     ✓     c.xor                 'c_xor'
✓     ✓     c.or                  'c_or'
✓     ✓     c.and                 'c_and'
-     ✓     c.subw                'c_subw'
-     ✓     c.addw                'c_addw'
✓     ✓     c.slli                'c_slli'
✓     ✓     c.mv                  'c_mv'
✓     ✓     c.add                 'c_add'

RVK (Scalar Cryptography)

All K-specific instructions are included. Additionally, seed CSR latency should be independent of ES16 state output entropy bits, as that is a sensitive security parameter. See <<crypto_scalar_appx_es_access>>.

RV32  RV64  Mnemonic              Instruction
✓     -     aes32dsi              'aes32dsi'
✓     -     aes32dsmi             'aes32dsmi'
✓     -     aes32esi              'aes32esi'
✓     -     aes32esmi             'aes32esmi'
-     ✓     aes64ds               'aes64ds'
-     ✓     aes64dsm              'aes64dsm'
-     ✓     aes64es               'aes64es'
-     ✓     aes64esm              'aes64esm'
-     ✓     aes64im               'aes64im'
-     ✓     aes64ks1i             'aes64ks1i'
-     ✓     aes64ks2              'aes64ks2'
✓     ✓     sha256sig0            'sha256sig0'
✓     ✓     sha256sig1            'sha256sig1'
✓     ✓     sha256sum0            'sha256sum0'
✓     ✓     sha256sum1            'sha256sum1'
✓     -     sha512sig0h           'sha512sig0h'
✓     -     sha512sig0l           'sha512sig0l'
✓     -     sha512sig1h           'sha512sig1h'
✓     -     sha512sig1l           'sha512sig1l'
✓     -     sha512sum0r           'sha512sum0r'
✓     -     sha512sum1r           'sha512sum1r'
-     ✓     sha512sig0            'sha512sig0'
-     ✓     sha512sig1            'sha512sig1'
-     ✓     sha512sum0            'sha512sum0'
-     ✓     sha512sum1            'sha512sum1'
✓     ✓     sm3p0                 'sm3p0'
✓     ✓     sm3p1                 'sm3p1'
✓     ✓     sm4ed                 'sm4ed'
✓     ✓     sm4ks                 'sm4ks'

RVB (Bitmanip)

The Zbkb, Zbkc, and Zbkx extensions are included in their entirety.

Note to implementers

Recall that rev, zip and unzip are pseudoinstructions representing specific instances of grevi, shfli and unshfli respectively.

RV32  RV64  Mnemonic              Instruction
✓     ✓     clmul                 'clmul-sc'
✓     ✓     clmulh                'clmulh-sc'
✓     ✓     xperm4                'xperm4-sc'
✓     ✓     xperm8                'xperm8-sc'
✓     ✓     ror                   'ror-sc'
✓     ✓     rol                   'rol-sc'
✓     ✓     rori                  'rori-sc'
-     ✓     rorw                  'rorw-sc'
-     ✓     rolw                  'rolw-sc'
-     ✓     roriw                 'roriw-sc'
✓     ✓     andn                  'andn-sc'
✓     ✓     orn                   'orn-sc'
✓     ✓     xnor                  'xnor-sc'
✓     ✓     pack                  'pack-sc'
✓     ✓     packh                 'packh-sc'
-     ✓     packw                 'packw-sc'
✓     ✓     brev8                 'brev8-sc'
✓     ✓     rev8                  'rev8-sc'
✓     -     zip                   'zip-sc'
✓     -     unzip                 'unzip-sc'
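
Carry-less multiplication is a typical candidate for an iterative, multi-cycle implementation, so it is a useful example of what data-independent latency asks for. The sketch below models clmul and clmulh for an assumed XLEN of 32 with a fixed number of steps and a mask in place of a conditional. It is a behavioural model under those assumptions, not a description of any hardware, and Python gives no timing guarantees of its own.

```python
XLEN = 32
MASK = (1 << XLEN) - 1

def clmul(rs1, rs2):
    """Low XLEN bits of the carry-less product."""
    acc = 0
    for i in range(XLEN):                  # always XLEN steps, no early exit
        bit = (rs2 >> i) & 1
        acc ^= (rs1 << i) & -bit           # -bit is 0 or an all-ones mask
    return acc & MASK

def clmulh(rs1, rs2):
    """High XLEN bits of the carry-less product."""
    acc = 0
    for i in range(1, XLEN):
        bit = (rs2 >> i) & 1
        acc ^= (rs1 >> (XLEN - i)) & -bit
    return acc & MASK
```

An implementation that skipped iterations for zero bits of rs2, or stopped once the remaining bits were zero, would compute the same result but leak the operand through its latency.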