
Add keep_orig_idx_per_feature parameter to block_bucketize_sparse_features kernel #4027


Open
wants to merge 1 commit into main from export-D73606958

Conversation

emlin (Contributor) commented Apr 25, 2025

Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/1112

Context
Enhance the block_bucketize_sparse_features and block_bucketize_sparse_features_inference kernels to support mixed-format embedding tables.

Previously, the keep_orig_idx parameter was a boolean flag applied uniformly across all features, determining whether to retain the original index. With the introduction of [the Flexible Collision-Free Embedding Table](https://github.com/pytorch/torchrec/blob/main/rfc/RFC-0002-Flexible-Collision-Free-Embedding-Table.md), one embedding collection may include both collision-free and collision tables. This update adds feature-wise control over index retention so the kernel can handle mixed formats.

For collision-free tables, a large virtual table size of 2^50 is set, so parameters are maintained as id-value pairs and the original global id is preserved. This change makes mixed-style embedding tables practical to use.
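To make the two behaviors concrete, below is a rough, self-contained sketch of what bucketization does to a single index in the simple uniform-block case (the real kernels also accept explicit bucket boundaries via block_bucketize_pos); all values are made up for illustration:

```python
# Illustrative sketch of per-index block bucketization; mirrors the behavior
# described above, not the actual kernel code.
BLOCK_SIZE = 100  # ids [0, 100) -> bucket 0, [100, 200) -> bucket 1, ...
MY_SIZE = 4       # number of buckets (e.g., ranks)

def bucketize(idx: int, keep_orig_idx: bool) -> tuple[int, int]:
    bucket = min(idx // BLOCK_SIZE, MY_SIZE - 1)
    # Collision tables remap to a bucket-local index; collision-free tables
    # (virtual size ~2^50) keep the original global id.
    new_idx = idx if keep_orig_idx else idx - bucket * BLOCK_SIZE
    return bucket, new_idx

assert bucketize(250, keep_orig_idx=False) == (2, 50)   # remapped locally
assert bucketize(250, keep_orig_idx=True) == (2, 250)   # global id preserved
```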

Spec:

  • keep_orig_idx_per_feature is an optional parameter carrying a per-feature setting.
  • If keep_orig_idx_per_feature is not None, its per-feature values override the global keep_orig_idx flag, regardless of whether that flag is true or false.
  • If keep_orig_idx_per_feature is None, the kernel falls back to the global keep_orig_idx flag (see the call sketch after the note below).

Note:
Adding a separate keep_orig_idx_per_feature parameter, instead of changing keep_orig_idx directly, avoids backward-compatibility issues.
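For reference, a minimal usage sketch of the updated op. The positional arguments follow the existing block_bucketize_sparse_features schema; the placement and dtype of the new keep_orig_idx_per_feature argument are inferred from this summary rather than the final signature, so treat them as assumptions:

```python
import torch
import fbgemm_gpu  # noqa: F401  (registers the fbgemm ops)

# Two features, batch size 1: feature 0 has 1 id, feature 1 has 2 ids.
lengths = torch.tensor([1, 2], dtype=torch.int64)
indices = torch.tensor([250, 3, 150], dtype=torch.int64)
block_sizes = torch.tensor([100, 100], dtype=torch.int64)

# Feature 0 is collision-free (keep the global id);
# feature 1 is a collision table (remap to bucket-local ids).
keep_orig_idx_per_feature = torch.tensor([True, False])

outputs = torch.ops.fbgemm.block_bucketize_sparse_features(
    lengths,
    indices,
    False,         # bucketize_pos
    False,         # sequence
    block_sizes,
    2,             # my_size: number of buckets/ranks
    None,          # weights
    keep_orig_idx=False,  # global flag; overridden per feature by the tensor
    keep_orig_idx_per_feature=keep_orig_idx_per_feature,
)
bucketized_lengths, bucketized_indices = outputs[0], outputs[1]
```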

Differential Revision: D73606958

facebook-github-bot (Contributor) commented

This pull request was exported from Phabricator. Differential Revision: D73606958


netlify bot commented Apr 25, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

| Name | Link |
|------|------|
| 🔨 Latest commit | ef29826 |
| 🔍 Latest deploy log | https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/681d3ca6244ab90008999e82 |
| 😎 Deploy Preview | https://deploy-preview-4027--pytorch-fbgemm-docs.netlify.app |

emlin force-pushed the export-D73606958 branch from 2a3e87e to 5614a0f on April 26, 2025 02:11
emlin force-pushed the export-D73606958 branch from 5614a0f to b6e5cc7 on April 26, 2025 02:12
emlin force-pushed the export-D73606958 branch from b6e5cc7 to 45224b5 on April 26, 2025 02:15
emlin force-pushed the export-D73606958 branch 2 times, most recently from 6c9d912 to 710535a on April 30, 2025 23:06
emlin force-pushed the export-D73606958 branch from 710535a to 9a50ccd on April 30, 2025 23:46
emlin force-pushed the export-D73606958 branch from 9a50ccd to 5f49d48 on May 3, 2025 01:45
emlin force-pushed the export-D73606958 branch from 5f49d48 to 2c59672 on May 3, 2025 01:46
emlin force-pushed the export-D73606958 branch from 2c59672 to ec5a78b on May 3, 2025 01:48
emlin force-pushed the export-D73606958 branch from ec5a78b to 0d9251c on May 3, 2025 01:56
emlin force-pushed the export-D73606958 branch from 0d9251c to 7a9051d on May 7, 2025 05:07
emlin force-pushed the export-D73606958 branch from 7a9051d to ef29826 on May 8, 2025 23:22