【Hackathon 7th No.24】Add EmbeddingBag API to Paddle #970

NKNaN · 2024-09-29T11:06:51Z

No description provided.

paddle-bot · 2024-09-29T11:06:56Z

Your PR has been submitted. Thanks for your contribution!
Please check its format and content. For this, you can refer to Template and Demo.

zxcd · 2024-10-08T09:17:36Z

rfcs/APIs/20240929_api_design_for_embeddingbag.md

+- Test on different devices;
+- Test in both dynamic-graph and static-graph modes;
+- Test different parameter combinations;
+


Suggest adding the location where the unit tests will be stored.

zxcd · 2024-10-08T09:22:07Z

rfcs/APIs/20240929_api_design_for_embeddingbag.md

+    Args:
+        x(Tensor): A 1D or 2D tensor with type int32/int64, which contains the id information. If ``x`` is 1D tensor, it will be treated as the concatenation of multiple bags, and will be segmented by ``offsets`` into each bag. If ``x`` is 2D tensor, the shape should be [bag_number, sequence_length]. The value of the input id should satisfy :math: `0 <= id < params.shape[0]`.
+
+        weight(Tensor): A tensor with shape of [num_embedding, embedding_dim] in which num_embedding indicates the size of the dictionary of embeddings and embedding_dim indicates the size of each embedding vector.


Which dtypes does weight support?

Same as embedding: it supports int8, float16, bfloat16, complex64, complex128, float32, float64.
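To make the 1D and 2D input conventions in the quoted docstring concrete, here is a minimal pure-Python sketch. It is not the RFC's implementation: the name `embedding_bag_ref` is hypothetical, sum pooling is an assumption (the quoted docstring does not state the reduction), and plain lists stand in for tensors.

```python
from typing import List, Optional, Sequence


def embedding_bag_ref(
    x: Sequence,
    weight: Sequence[Sequence[float]],
    offsets: Optional[Sequence[int]] = None,
) -> List[List[float]]:
    """Hypothetical reference: pool rows of `weight` per bag (sum pooling assumed)."""
    if len(x) > 0 and isinstance(x[0], (list, tuple)):
        # 2D input: each row of shape [sequence_length] is one bag.
        bags = [list(row) for row in x]
    else:
        # 1D input: a concatenation of bags, segmented by `offsets`.
        if offsets is None:
            raise ValueError("offsets is required when x is 1D")
        bounds = list(offsets) + [len(x)]
        bags = [list(x[s:e]) for s, e in zip(bounds[:-1], bounds[1:])]

    dim = len(weight[0])
    out = []
    for bag in bags:
        pooled = [0.0] * dim
        for idx in bag:
            # Ids must satisfy 0 <= id < number of rows in weight.
            if not 0 <= idx < len(weight):
                raise IndexError(f"id {idx} out of range")
            for j in range(dim):
                pooled[j] += weight[idx][j]
        out.append(pooled)
    return out


weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# 1D ids split into bags [0, 1, 2] and [1] by offsets [0, 3].
print(embedding_bag_ref([0, 1, 2, 1], weight, offsets=[0, 3]))
# 2D ids: two bags of length 2.
print(embedding_bag_ref([[0, 1], [2, 2]], weight))
```

The output has one row per bag, each of length embedding_dim.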

zxcd · 2024-10-08T09:35:00Z

rfcs/APIs/20240929_api_design_for_embeddingbag.md

+
+        include_last_offset(bool, optional): If True, the size of ``offsets`` will be [B+1], where B is the number of bags, and the last element will specify the ending position of the last bag. Default: False.
+
+        weight_attr(ParamAttr|None, optional): To specify the weight parameter property. Default: None, which means the default weight parameter property is used. See usage for details in :ref:`api_paddle_ParamAttr` . In addition, user-defined or pre-trained word vectors can be loaded with the :attr:`param_attr` parameter. The local word vector needs to be transformed into numpy format, and the shape of local word vector should be consistent with :attr:`num_embeddings` . Then :ref:`api_paddle_nn_initializer_Assign` is used to load custom or pre-trained word vectors. See code example for details.


What is this parameter used for, and how does it differ from weight?

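This was put in the wrong place. It should be a parameter of the EmbeddingBag class, where it is used to initialize weight. F.embedding_bag does not have this parameter, so it has been removed.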
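For the `include_last_offset` argument quoted in the diff above, the following pure-Python sketch shows how the bag boundaries could be derived in each case. The helper name `bag_slices` and its `num_ids` parameter are hypothetical and only illustrate the docstring's description.

```python
from typing import List, Sequence, Tuple


def bag_slices(
    offsets: Sequence[int],
    num_ids: int,
    include_last_offset: bool = False,
) -> List[Tuple[int, int]]:
    """Hypothetical helper: return (start, end) index pairs, one per bag."""
    bounds = list(offsets)
    if include_last_offset:
        # offsets has size B + 1; its last element already ends the last bag.
        if bounds[-1] > num_ids:
            raise ValueError("last offset exceeds the number of ids")
    else:
        # offsets has size B; the last bag runs to the end of the id list.
        bounds.append(num_ids)
    return list(zip(bounds[:-1], bounds[1:]))


print(bag_slices([0, 3], num_ids=5))                               # two bags
print(bag_slices([0, 3, 4], num_ids=5, include_last_offset=True))  # also two bags
```

In the second call the explicit final offset ends the last bag at position 4, so the id at position 4 is not part of any bag.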

zxcd · 2024-10-11T06:32:57Z

rfcs/APIs/20240929_api_design_for_embeddingbag.md

    """
    Args:
        x(Tensor): A 1D or 2D tensor with type int32/int64, which contains the id information. If ``x`` is 1D tensor, it will be treated as the concatenation of multiple bags, and will be segmented by ``offsets`` into each bag. If ``x`` is 2D tensor, the shape should be [bag_number, sequence_length]. The value of the input id should satisfy :math: `0 <= id < params.shape[0]`.

-        weight(Tensor): A tensor with shape of [num_embedding, embedding_dim] in which num_embedding indicates the size of the dictionary of embeddings and embedding_dim indicates the size of each embedding vector.
+        weight(Tensor): A tensor with shape of [num_embedding, embedding_dim] in which num_embedding indicates the size of the dictionary of embeddings and embedding_dim indicates the size of each embedding vector. Supported dtypes are int8, float16, bfloat16, complex64, complex128, float32, float64.


If the data types are listed, it would be best to order them from low to high bit width.
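One way to read the ordering suggestion is sketched below, assuming "low to high" means total storage width in bits. The `BIT_WIDTH` table is written out by hand for illustration and is not a Paddle API.

```python
# Total storage width in bits for each dtype named in the docstring.
# complex64 is two float32 parts (64 bits); complex128 is two float64 parts.
BIT_WIDTH = {
    "int8": 8,
    "float16": 16,
    "bfloat16": 16,
    "float32": 32,
    "float64": 64,
    "complex64": 64,
    "complex128": 128,
}

listed = ["int8", "float16", "bfloat16", "complex64", "complex128", "float32", "float64"]

# sorted() is stable, so dtypes of equal width keep their original relative order.
ordered = sorted(listed, key=BIT_WIDTH.__getitem__)
print(", ".join(ordered))
# int8, float16, bfloat16, float32, complex64, float64, complex128
```

Whether complex types should instead be grouped after all real types is a style choice the review comment leaves open.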

add rfc

773e9f5

paddle-bot bot added the contributor label Sep 29, 2024

luotao1 assigned luotao1 and zxcd Oct 8, 2024

zxcd reviewed Oct 8, 2024

update

8c6fd70

luotao1 changed the title ~~【Hackathon 7 No.24】Add EmbeddingBag API to Paddle~~ 【Hackathon 7th No.24】Add EmbeddingBag API to Paddle Oct 11, 2024

luotao1 mentioned this pull request Oct 11, 2024

【Hackathon 7th】Open Source Contribution Individual Challenge PaddlePaddle/Paddle#68244

Open

zxcd reviewed Oct 11, 2024

update

7550945


NKNaN commented Sep 29, 2024

paddle-bot bot commented Sep 29, 2024

zxcd Oct 8, 2024

NKNaN Oct 10, 2024

zxcd Oct 8, 2024

NKNaN Oct 10, 2024

zxcd Oct 8, 2024

NKNaN Oct 10, 2024 (edited)

zxcd Oct 11, 2024

NKNaN Oct 23, 2024

