Clean up NIP 44 to clarify separation of concerns (encryption vs messaging), improve formatting and clarify encryption/decryption steps

2024-09-16 02:03:39 +00:00 · 2023-12-20 11:28:43 -08:00 · 2023-12-20 11:28:43 -08:00 · 732b0ce0a4
commit 732b0ce0a4
parent ff533d7a99
1 changed files with 130 additions and 115 deletions
--- a/44.md
+++ b/44.md
@ -1,42 +1,14 @@
-# NIP-44
+NIP-44
+=====

-## Encrypted Payloads (Versioned)
+Encrypted Payloads (Versioned)
+------------------------------

 `optional`

 The NIP introduces a new data format for keypair-based encryption. This NIP is versioned
-to allow multiple algorithm choices to exist simultaneously.
-
-Nostr is a key directory. Every nostr user has their own public key, which solves key
-distribution problems present in other solutions. The goal of this NIP is to have a
-simple way to send messages between nostr accounts that cannot be read by everyone.
-
-The scheme has a number of important shortcomings:
-
- No deniability: it is possible to prove the event was signed by a particular key
- No forward secrecy: when a user key is compromised, it is possible to decrypt all previous conversations
- No post-compromise security: when a user key is compromised, it is possible to decrypt all future conversations
- No post-quantum security: a powerful quantum computer would be able to decrypt the messages
- IP address leak: user IP may be seen by relays and all intermediaries between user and relay
- Date leak: the message date is public, since it is a part of NIP 01 event
- Limited message size leak: padding only partially obscures true message length
- No attachments: they are not supported
-
-Lack of forward secrecy is partially mitigated by these two factors:
-  1. the messages should only be stored on relays, specified by the user, instead of a set of all public relays.
-  2. the relays are supposed to regularly delete older messages.
-
-For risky situations, users should chat in specialized E2EE messaging software and limit use of nostr to exchanging contacts.
-
-## Dependence on NIP-01
-
-It's not enough to use NIP-44 for encryption: the output must also be signed.
-
-In nostr case, the payload is serialized and signed as per NIP-01 rules.
-
-The same event can be serialized in two different ways, resulting in two distinct signatures. So, it's important to ensure serialization rules, which are defined in NIP-01, are the same across different NIP-44 implementations.
-
-After serialization, the event is signed by Schnorr signature over secp256k1, defined in BIP340. It's important to ensure the key and signature validity as per BIP340 rules.
+to allow multiple algorithm choices to exist simultaneously. This format may be used for
+many things, but MUST be used in the context of a signed event as described in NIP 01.

 ## Versions

@ -46,29 +18,129 @@ Currently defined encryption algorithms:
 - `0x01` - Deprecated and undefined
 - `0x02` - secp256k1 ECDH, HKDF, padding, ChaCha20, HMAC-SHA256, base64

+## Limitations
+
+Every nostr user has their own public key, which solves key distribution problems present
+in other solutions. However, nostr's relay-based architecture makes it difficult to implement
+more robust private messaging protocols with things like metadata hiding, forward secrecy,
+and post-compromise security.
+
+The goal of this NIP is to have a _simple_ way to encrypt payloads used in the context of a signed
+event. When applying this NIP to any use case, it's important to keep in mind your users' threat
+model and this NIP's limitations. For high-risk situations, users should chat in specialized E2EE
+messaging software and limit use of nostr to exchanging contacts.
+
+On their own, messages sent using this scheme have a number of important shortcomings:
+
+- No deniability: it is possible to prove an event was signed by a particular key
+- No forward secrecy: when a key is compromised, it is possible to decrypt all previous conversations
+- No post-compromise security: when a key is compromised, it is possible to decrypt all future conversations
+- No post-quantum security: a powerful quantum computer would be able to decrypt the messages
+- IP address leak: user IP may be seen by relays and all intermediaries between user and relay
+- Date leak: `created_at` is public, since it is part of a NIP 01 event
+- Limited message size leak: padding only partially obscures true message length
+- No attachments: they are not supported
+
+Lack of forward secrecy may be partially mitigated by only sending messages to trusted relays, and asking
+relays to delete stored messages after a certain duration has elapsed.
+
 ## Version 2

-The algorithm choices are justified in a following way:
+NIP-44 version 2 has the following design characteristics:

- Encrypt-then-mac-then-sign instead of encrypt-then-sign-then-mac: only events wrapped in NIP-01 signed envelope are currently accepted by nostr.
- ChaCha instead of AES: it's faster and has [better security against multi-key attacks](https://datatracker.ietf.org/doc/draft-irtf-cfrg-aead-limits/)
- ChaCha instead of XChaCha: XChaCha has not been standardized. Also, we don't need xchacha's improved collision resistance of nonces: every message has a new (key, nonce) pair.
- HMAC-SHA256 instead of Poly1305: polynomial MACs are much easier to forge SHA256 instead of SHA3 or BLAKE: it is already used in nostr. Also blake's
-  speed advantage is smaller in non-parallel environments - Custom padding instead of padmé: better leakage reduction for small messages
- Base64 encoding instead of an other compression algorithm: it is widely available, and is already used in nostr
+- Payloads are authenticated using a MAC before signing rather than afterwards because events are assumed
+  to be signed as specified in NIP-01. The outer signature serves to authenticate the full payload, and MUST
+  be validated before decrypting.
+- ChaCha is used instead of AES because it's faster and has
+  [better security against multi-key attacks](https://datatracker.ietf.org/doc/draft-irtf-cfrg-aead-limits/).
+- ChaCha is used instead of XChaCha because XChaCha has not been standardized. Also, XChaCha's improved collision
+  resistance of nonces isn't necessary since every message has a new (key, nonce) pair.
+- HMAC-SHA256 is used instead of Poly1305 because polynomial MACs are much easier to forge.
+- SHA256 is used instead of SHA3 or BLAKE because it is already used in nostr. Also BLAKE's speed advantage
+  is smaller in non-parallel environments.
+- A custom padding scheme is used instead of padmé because it provides better leakage reduction for small messages.
+- Base64 encoding is used instead of another encoding algorithm because it is widely available, and is already used in nostr.

-### Functions and operations
+### Encryption
+
+1. Calculate a conversation key
+   - Execute ECDH (scalar multiplication) of public key B by private key A.
+     Output `shared_x` must be the unhashed, 32-byte encoded x coordinate of the shared point.
+   - Use HKDF-extract with sha256, `IKM=shared_x` and `salt=utf8_encode('nip44-v2')`
+   - HKDF output will be a `conversation_key` between two users.
+   - It is always the same, when key roles are swapped: `conv(a, B) == conv(b, A)`
+2. Generate a random 32-byte nonce
+   - Always use [CSPRNG](https://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator)
+   - Don't generate a nonce from message content
+   - Don't re-use the same nonce between messages: doing so would make them decryptable,
+     but won't leak the long-term key
+3. Calculate message keys
+   - The keys are generated from `conversation_key` and `nonce`. Validate that both are 32 bytes long
+   - Use HKDF-expand, with sha256, `PRK=conversation_key`, `info=nonce` and `L=76`
+   - Slice 76-byte HKDF output into: `chacha_key` (bytes 0..32), `chacha_nonce` (bytes 32..44), `hmac_key` (bytes 44..76)
+4. Add padding
+   - Content must be encoded from UTF-8 into byte array
+   - Validate plaintext length. Minimum is 1 byte, maximum is 65535 bytes
+   - Padding format is: `[plaintext_length: u16][plaintext][zero_bytes]`
+   - Padding algorithm is related to powers-of-two, with min padded msg size of 32
+   - Plaintext length is encoded in big-endian as first 2 bytes of the padded blob
+5. Encrypt padded content
+   - Use ChaCha20, with key and nonce from step 3
+6. Calculate MAC (message authentication code)
+   - AAD (additional authenticated data) is used - instead of calculating MAC on ciphertext,
+     it's calculated over a concatenation of `nonce` and `ciphertext`
+   - Validate that AAD (nonce) is 32 bytes
+7. Base64-encode (with padding) params using `concat(version, nonce, ciphertext, mac)`
+
+Encrypted payloads MUST be included in an event's payload, hashed, and signed as defined in NIP 01, using the Schnorr
+signature scheme over secp256k1.
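The key-derivation and MAC steps above (1, 3 and 6) can be sketched with the Python standard library alone. This is a minimal illustration, not the reference implementation: the function names are hypothetical, HKDF is written out by hand from RFC 5869, and the ECDH step is assumed to have already produced the 32-byte `shared_x`. ChaCha20 is left out because the standard library does not provide it.

```python
import hashlib
import hmac


def hkdf_extract(ikm: bytes, salt: bytes) -> bytes:
    """HKDF-Extract (RFC 5869) with SHA-256: PRK = HMAC(salt, IKM)."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869) with SHA-256, returning `length` bytes."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def conversation_key_from_shared_x(shared_x: bytes) -> bytes:
    """Step 1: derive the conversation key from the ECDH x coordinate."""
    if len(shared_x) != 32:
        raise ValueError("shared_x must be 32 bytes")
    return hkdf_extract(shared_x, "nip44-v2".encode("utf-8"))


def message_keys(conversation_key: bytes, nonce: bytes):
    """Step 3: expand into (chacha_key, chacha_nonce, hmac_key)."""
    if len(conversation_key) != 32 or len(nonce) != 32:
        raise ValueError("conversation_key and nonce must both be 32 bytes")
    keys = hkdf_expand(conversation_key, nonce, 76)
    return keys[0:32], keys[32:44], keys[44:76]


def hmac_aad(hmac_key: bytes, ciphertext: bytes, aad: bytes) -> bytes:
    """Step 6: MAC over concat(nonce, ciphertext), with the nonce as AAD."""
    if len(aad) != 32:
        raise ValueError("AAD (nonce) must be 32 bytes")
    return hmac.new(hmac_key, aad + ciphertext, hashlib.sha256).digest()
```

Because the message keys depend on both the conversation key and a fresh nonce, each message gets its own `(chacha_key, chacha_nonce)` pair.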
+
+### Decryption
+
+Before decryption, the event's pubkey and signature MUST be validated as defined in NIP 01. The public key MUST be
+a valid non-zero secp256k1 curve point, and the signature MUST be a valid secp256k1 Schnorr signature. For exact
+validation rules, refer to BIP-340.
+
+1. Check if the payload's first character is `#`
+   - `#` is an optional future-proof flag that means non-base64 encoding is used
+   - The `#` is not present in base64 alphabet, but, instead of throwing `base64 is invalid`,
+     implementations MUST indicate that the encryption version is not yet supported
+2. Decode base64
+   - Base64 is decoded into `version, nonce, ciphertext, mac`
+   - If the version is unknown, implementations must indicate that the encryption version is not supported
+   - Validate length of base64 message to prevent DoS on base64 decoder: it can be in range from 132 to 87472 chars
+   - Validate length of decoded message to verify output of the decoder: it can be in range from 99 to 65603 bytes
+3. Calculate conversation key
+   - See step 1 of [encryption](#encryption)
+4. Calculate message keys
+   - See step 3 of [encryption](#encryption)
+5. Calculate MAC (message authentication code) with AAD and compare
+   - Stop and throw an error if MAC doesn't match the decoded one from step 2
+   - Use constant-time comparison algorithm
+6. Decrypt ciphertext
+   - Use ChaCha20 with key and nonce from step 4
+7. Remove padding
+   - Read the first two BE bytes of plaintext that correspond to plaintext length
+   - Verify that the length of sliced plaintext matches the value of the two BE bytes
+   - Verify that calculated padding from step 4 of the [encryption](#encryption) process matches the actual padding
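The payload checks in steps 1, 2, 5 and 7 can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the chunking rule in `calc_padded_len` is assumed to match the padding function in this NIP's pseudocode, and key derivation and ChaCha20 decryption are omitted. The length bounds are derived rather than hard-coded, which shows where the figures 99, 65603, 132 and 87472 come from.

```python
import base64
import hashlib
import hmac

MIN_PLAINTEXT, MAX_PLAINTEXT = 1, 65535


def calc_padded_len(unpadded_len: int) -> int:
    """Padded length: at least 32, then chunks tied to the next power of two (assumed rule)."""
    if unpadded_len <= 32:
        return 32
    next_power = 1 << (unpadded_len - 1).bit_length()
    chunk = 32 if next_power <= 256 else next_power // 8
    return chunk * ((unpadded_len - 1) // chunk + 1)


def b64_len(n: int) -> int:
    """Length of padded base64 text for n raw bytes."""
    return 4 * ((n + 2) // 3)


# version (1) + nonce (32) + ciphertext (2-byte length prefix + padded text) + mac (32)
MIN_DECODED = 1 + 32 + 2 + calc_padded_len(MIN_PLAINTEXT) + 32  # 99
MAX_DECODED = 1 + 32 + 2 + calc_padded_len(MAX_PLAINTEXT) + 32  # 65603


def decode_payload(payload: str):
    """Steps 1-2: return (nonce, ciphertext, mac) or raise ValueError."""
    if payload.startswith("#"):
        raise ValueError("unknown encryption version")
    if not b64_len(MIN_DECODED) <= len(payload) <= b64_len(MAX_DECODED):  # 132..87472
        raise ValueError("invalid payload length")
    data = base64.b64decode(payload, validate=True)
    if not MIN_DECODED <= len(data) <= MAX_DECODED:
        raise ValueError("invalid decoded length")
    if data[0] != 2:
        raise ValueError("unknown encryption version")
    return data[1:33], data[33:-32], data[-32:]


def verify_mac(hmac_key: bytes, nonce: bytes, ciphertext: bytes, mac: bytes) -> None:
    """Step 5: recompute the MAC over nonce + ciphertext, compare in constant time."""
    expected = hmac.new(hmac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, mac):
        raise ValueError("invalid MAC")


def unpad(padded: bytes) -> bytes:
    """Step 7: check the big-endian length prefix and the padded size."""
    unpadded_len = int.from_bytes(padded[0:2], "big")
    plaintext = padded[2:2 + unpadded_len]
    if (
        not MIN_PLAINTEXT <= unpadded_len <= MAX_PLAINTEXT
        or len(plaintext) != unpadded_len
        or len(padded) != 2 + calc_padded_len(unpadded_len)
    ):
        raise ValueError("invalid padding")
    return plaintext
```

Checking the MAC before decrypting means a tampered ciphertext is rejected without ever being passed to ChaCha20.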
+
+### Details

 - Cryptographic methods
-  - `secure_random_bytes(length)` fetches randomness from CSPRNG
-  - `hkdf(IKM, salt, info, L)` represents HKDF [(RFC 5869)](https://datatracker.ietf.org/doc/html/rfc5869) with SHA256 hash function,
-    comprised of methods `hkdf_extract(IKM, salt)` and `hkdf_expand(OKM, info, L)`
-  - `chacha20(key, nonce, data)` is ChaCha20 [(RFC 8439)](https://datatracker.ietf.org/doc/html/rfc8439), with starting counter set to 0
-  - `hmac_sha256(key, message)` is HMAC [(RFC 2104)](https://datatracker.ietf.org/doc/html/rfc2104)
-  - `secp256k1_ecdh(priv_a, pub_b)` is multiplication of point B by scalar a (`a ⋅ B`), defined in [BIP340](https://github.com/bitcoin/bips/blob/e918b50731397872ad2922a1b08a5a4cd1d6d546/bip-0340.mediawiki). The operation produces shared point, and we encode the shared point's 32-byte x coordinate, using method `bytes(P)` from BIP340. Private and public keys must be validated as per BIP340: pubkey must be a valid, on-curve point, and private key must be a scalar in range `[1, secp256k1_order - 1]`
+  - `secure_random_bytes(length)` fetches randomness from CSPRNG.
+  - `hkdf(IKM, salt, info, L)` represents HKDF [(RFC 5869)](https://datatracker.ietf.org/doc/html/rfc5869)
+    with SHA256 hash function, comprised of methods `hkdf_extract(IKM, salt)` and `hkdf_expand(PRK, info, L)`.
+  - `chacha20(key, nonce, data)` is ChaCha20 [(RFC 8439)](https://datatracker.ietf.org/doc/html/rfc8439) with
+    starting counter set to 0.
+  - `hmac_sha256(key, message)` is HMAC [(RFC 2104)](https://datatracker.ietf.org/doc/html/rfc2104).
+  - `secp256k1_ecdh(priv_a, pub_b)` is multiplication of point B by scalar a (`a ⋅ B`), defined in
+    [BIP340](https://github.com/bitcoin/bips/blob/e918b50731397872ad2922a1b08a5a4cd1d6d546/bip-0340.mediawiki).
+    The operation produces a shared point, and we encode the shared point's 32-byte x coordinate, using method
+    `bytes(P)` from BIP340. Private and public keys must be validated as per BIP340: pubkey must be a valid,
+    on-curve point, and private key must be a scalar in range `[1, secp256k1_order - 1]`.
 - Operators
-  - `x[i:j]`, where `x` is a byte array and `i, j <= 0`,
-    returns a `(j - i)`-byte array with a copy of the `i`-th byte (inclusive) to the `j`-th byte (exclusive) of `x`
+  - `x[i:j]`, where `x` is a byte array and `i, j >= 0`, returns a `(j - i)`-byte array with a copy of the
+    `i`-th byte (inclusive) to the `j`-th byte (exclusive) of `x`.
 - Constants `c`:
  - `min_plaintext_size` is 1. 1b msg is padded to 32b.
  - `max_plaintext_size` is 65535 (64kb - 1). It is padded to 65536.
@ -82,7 +154,10 @@ The algorithm choices are justified in a following way:
  - `zeros(length)` creates byte array of length `length >= 0`, filled with zeros
  - `floor(number)` and `log2(number)` are well-known mathematical methods

-User-defined functions:
+### Implementation pseudocode
+
+The following is a collection of Python-like pseudocode functions which implement the above primitives,
+intended to guide implementers. A collection of implementations in different languages is available at https://github.com/paulmillr/nip44.

 ```py
 # Calculates length of the padded byte array.
@ -177,73 +252,13 @@ def decrypt(payload, conversation_key):
 #   'hello world' == decrypt(payload, conversation_key)
 ```

-#### Encryption
+### Audit

-1. Calculate conversation key
-   - Execute ECDH (scalar multiplication) of public key B by private key A.
-     Output `shared_x` must be unhashed, 32-byte encoded x coordinate of the shared point.
-   - Use HKDF-extract with sha256, `IKM=shared_x` and `salt=utf8_encode('nip44-v2')`
-   - HKDF output will be `conversation_key` between two users
-   - It is always the same, when key roles are swapped: `conv(a, B) == conv(b, A)`
-2. Generate random 32-byte nonce
-   - Always use [CSPRNG](https://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator)
-   - Don't generate nonce from message content
-   - Don't re-use the same nonce between messages: doing so would make them decryptable,
-     but won't leak long-term key
-3. Calculate message keys
-   - The keys are generated from `conversation_key` and `nonce`. Validate that both are 32 bytes
-   - Use HKDF-expand, with sha256, `OKM=conversation_key`, `info=nonce` and `L=76`
-   - Slice 76-byte HKDF output into: `chacha_key` (bytes 0..32), `chacha_nonce` (bytes 32..44), `hmac_key` (bytes 44..76)
-4. Add padding
-   - Content must be encoded from UTF-8 into byte array
-   - Validate plaintext length. Minimum is 1 byte, maximum is 65535 bytes
-   - Padding format is: `[plaintext_length: u16][plaintext][zero_bytes]`
-   - Padding algorithm is related to powers-of-two, with min padded msg size of 32
-   - Plaintext length is encoded in big-endian as first 2 bytes of the padded blob
-5. Encrypt padded content
-   - Use ChaCha20, with key and nonce from step 3
-6. Calculate MAC (message authentication code) with AAD
-   - AAD is used: instead of calculating MAC on ciphertext,
-     it's calculated over a concatenation of `nonce` and `ciphertext`
-   - Validate that AAD (nonce) is 32 bytes
-7. Base64-encode (with padding) params: `concat(version, nonce, ciphertext, mac)`
-
-After encryption, it's necessary to sign it. Use NIP-01 to serialize the event, with result base64 assigned to event's `content`. Then, use NIP-01 to sign the event using schnorr signature scheme over secp256k1.
-
-#### Decryption
-
-Before decryption, it's necessary to validate the message's pubkey and signature. The public key must be a valid non-zero secp256k1 curve point, and signature must be valid secp256k1 schnorr signature. For exact validation rules, refer to BIP-340.
-
-1. Check if first payload's character is `#`
-   - `#` is an optional future-proof flag that means non-base64 encoding is used
-   - The `#` is not present in base64 alphabet, but, instead of throwing `base64 is invalid`,
-     an app must say the encryption version is not yet supported
-2. Decode base64
-   - Base64 is decoded into `version, nonce, ciphertext, mac`
-   - If the version is unknown, the app, an app must say the encryption version is not yet supported
-   - Validate length of base64 message to prevent DoS on base64 decoder: it can be in range from 132 to 87472 chars
-   - Validate length of decoded message to verify output of the decoder: it can be in range from 99 to 65603 bytes
-3. Calculate conversation key
-   - See step 1 of Encryption
-4. Calculate message keys
-   - See step 3 of Encryption
-5. Calculate MAC (message authentication code) with AAD and compare
-   - Stop and throw an error if MAC doesn't match the decoded one from step 2
-   - Use constant-time comparison algorithm
-6. Decrypt ciphertext
-   - Use ChaCha20 with key and nonce from step 3
-7. Remove padding
-   - Read the first two BE bytes of plaintext that correspond to plaintext length
-   - Verify that the length of sliced plaintext matches the value of the two BE bytes
-   - Verify that calculated padding from encryption's step 3 matches the actual padding
-
-## Audit
-
-The v2 of the standard has been subject to an audit by [Cure53](https://cure53.de) in December 2023.
+The v2 of the standard was audited by [Cure53](https://cure53.de) in December 2023.
 Check out [audit-2023.12.pdf](https://github.com/paulmillr/nip44/blob/ce63c2eaf345e9f7f93b48f829e6bdeb7e7d7964/audit-2023.12.pdf)
 and [auditor's website](https://cure53.de/audit-report_nip44-implementations.pdf).

-## Tests and code
+### Tests and code

 A collection of implementations in different languages is available at https://github.com/paulmillr/nip44.

@ -251,7 +266,7 @@ We publish extensive test vectors. Instead of having it in the document directly

    269ed0f69e4c192512cc779e78c555090cebc7c785b609e338a62afc3ce25040  nip44.vectors.json

-Example of test vector from the file:
+Example of a test vector from the file:

 ```json
 {