Support metadata compaction #270
Comments
I am interested in taking this if no one has started working on it. |
Based on offline discussion with @Fokko, I will first focus on implementing the
The MergeAppend will become the default append method since BTW, it seems |
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible. |
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' |
Hey, if there are no blockers, could I try to implement rewrite manifests? |
sure thing @amitgilad3 |
Hi, I've recently been investigating support for rewrite manifests in iceberg-rust. The design of iceberg-rust basically follows iceberg-python, but rewrite manifests is not yet supported in iceberg-python, so I have to refer to the implementation in iceberg-java. In iceberg-java, rewrite manifests is based on SnapshotProducer, and I find that the design of SnapshotProducer differs a little between iceberg-java and iceberg-python. In iceberg-python, SnapshotProducer is a more "fine-grained" abstraction, e.g. it provides the summary implementation,
I'm interested in which design iceberg-python will choose, as a reference for iceberg-rust. |
Hi @ZENOTME thanks for bringing this up. In pyiceberg, The I think we can implement metadata compaction by overriding the behaviors of iceberg-python/pyiceberg/table/update/snapshot.py Lines 197 to 199 in b86d7d5
|
Looks like @amitgilad3 has already started a PR for Rewrite manifests in #1661 |
Thanks @kevinjqliu! It's a good reference. |
feel free to help review the PR :) i haven't gotten to it yet |
Feature Request / Improvement
Add support for compaction. This rewrites the existing manifests into a single one, reducing the number of calls to the object store. This should follow the Java configuration keys:
commit.manifest-merge.enabled: Controls whether to automatically merge manifests on writes.
commit.manifest.min-count-to-merge: Minimum number of manifests to accumulate before merging.
commit.manifest.target-size-bytes: Target size when merging manifest files.
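
To make the request concrete, here is a rough sketch of how the three keys could drive a merge decision. It is not PyIceberg code: `plan_manifest_merge` is a hypothetical helper, manifests are represented only by their sizes in bytes, the grouping is a simple greedy pass rather than whatever bin-packing the Java implementation uses, and the fallback defaults (`true`, `100`, 8 MiB) are my understanding of the Java defaults and should be checked against the Java table properties.

```python
from typing import Dict, List

# Property names taken from the issue description.
MERGE_ENABLED = "commit.manifest-merge.enabled"
MIN_COUNT_TO_MERGE = "commit.manifest.min-count-to-merge"
TARGET_SIZE_BYTES = "commit.manifest.target-size-bytes"


def plan_manifest_merge(
    manifest_sizes: List[int], properties: Dict[str, str]
) -> List[List[int]]:
    """Hypothetical helper: group manifest indexes into merge bins.

    Each inner list holds the indexes of manifests that would be rewritten
    into one output manifest. A bin with a single index means "leave as is".
    """
    # Fallback defaults are assumptions, not values confirmed by this thread.
    enabled = properties.get(MERGE_ENABLED, "true").lower() == "true"
    min_count = int(properties.get(MIN_COUNT_TO_MERGE, "100"))
    target_size = int(properties.get(TARGET_SIZE_BYTES, str(8 * 1024 * 1024)))

    # Merging disabled, or too few manifests accumulated: keep every manifest.
    if not enabled or len(manifest_sizes) < min_count:
        return [[index] for index in range(len(manifest_sizes))]

    groups: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for index, size in enumerate(manifest_sizes):
        # Close the current bin once adding this manifest would exceed the target.
        if current and current_size + size > target_size:
            groups.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        groups.append(current)
    return groups
```

For example, four manifests of 40 bytes each with `commit.manifest.min-count-to-merge` set to `3` and `commit.manifest.target-size-bytes` set to `100` would be planned as two output manifests, each combining two inputs; with merging disabled every manifest is left untouched.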