GH-47435: [Python][Parquet] Add direct key encryption/decryption API#49667
GH-47435: [Python][Parquet] Add direct key encryption/decryption API#49667smaheshwar-pltr wants to merge 5 commits intoapache:mainfrom
Conversation
|
Thanks for opening a pull request! If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title in the following format? or See also: |
|
cc @ggershinsky, this is a requirement for PyIceberg encryption support, see apache/iceberg-python#3221 |
|
sgtm, #47435 (comment) |
|
|
1 similar comment
|
|
| This bypasses the KMS-based :class:`CryptoFactory` API and directly | ||
| constructs decryption properties from a plaintext key. This is useful | ||
| when the caller manages key wrapping externally (e.g. via an | ||
| application-level envelope encryption scheme). | ||
|
|
||
| For most use cases, prefer the higher-level :class:`CryptoFactory` | ||
| with :class:`DecryptionConfiguration`, which handles envelope | ||
| encryption and key rotation automatically. |
There was a problem hiding this comment.
I don't think it's true that the CryptoFactory handles key rotation automatically, that part of the comment should probably be removed. It does allow key rotation if you use external key material though.
adamreeve
left a comment
There was a problem hiding this comment.
Thanks for implementing this @smaheshwar-pltr! I've left a few suggestions
| This bypasses the KMS-based :class:`CryptoFactory` API and directly | ||
| constructs decryption properties from a plaintext key. This is useful | ||
| when the caller manages key wrapping externally (e.g. via an | ||
| application-level envelope encryption scheme). | ||
|
|
||
| For most use cases, prefer the higher-level :class:`CryptoFactory` | ||
| with :class:`DecryptionConfiguration`, which handles envelope | ||
| encryption and key rotation automatically. |
There was a problem hiding this comment.
I don't think it's true that the CryptoFactory handles key rotation automatically, that part of the comment should probably be removed. It does allow key rotation if you use external key material though.
| The decryption key for the file footer (and all columns if | ||
| uniform encryption was used). Must be 16, 24, or 32 bytes |
There was a problem hiding this comment.
It looks like this only supports uniform encryption at the moment, so this comment could be a little confusing. Maybe there should be a note about this added here, eg. "Note that currently only uniform encryption is supported with this method"
|
|
||
| For most use cases, prefer the higher-level :class:`CryptoFactory` | ||
| with :class:`EncryptionConfiguration`, which handles envelope | ||
| encryption, key rotation, and unique-per-file data keys |
There was a problem hiding this comment.
| encryption, key rotation, and unique-per-file data keys | |
| encryption and unique-per-file data keys |
| footer_key : bytes | ||
| The encryption key for the file footer (and all columns unless | ||
| per-column keys are specified). Must be 16, 24, or 32 bytes | ||
| for AES-128, AES-192, or AES-256 respectively. |
There was a problem hiding this comment.
Similar to above, we should add a note somewhere about per-column keys not being supported yet.
|
@adamreeve, thank you for the great review here 🙌 . I've reworked the docs now - they are consistent with my (inexperienced) understanding. Would appreciate your eyes on this again, please let me know what you think! |
Rationale for this change
See #47435.
What changes are included in this PR?
Adds direct encryption / decryption Python API
Are these changes tested?
Yes, see PR.
Are there any user-facing changes?
Yes, new Python bindings.