
What do position 2 and 3 mean in the decoder_output? #4

Open
lsy641 opened this issue Nov 4, 2022 · 3 comments

lsy641 commented Nov 4, 2022

```python
lm_logits_start[:, 2, 1465].view(-1, 1) - lm_logits_start[:, 2, 2841].view(-1, 1)
lm_logits_start[:, 3, self.discrete_value_ids[0]].view(-1, 1)
```
What do indices 2 and 3 correspond to?

Slash0BZ (Contributor) commented Nov 4, 2022

Those are token positions in the output sequence. Your two examples come from different models. If I remember correctly, the first line is used when the model is supposed to output "answer: positive/negative", so index 2 selects the vocabulary logits at output position 2, with token IDs 1465 and 2841 representing "positive" and "negative" respectively.
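
(For context, a minimal sketch of what that indexing does, assuming a T5-style seq2seq model, which the <extra_id_n> sentinel tokens discussed below suggest. The checkpoint name, example input, and tokenization claim are illustrative assumptions; the token IDs 1465 and 2841 are taken from the snippet above.)

```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")  # assumed checkpoint
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Hypothetical classification input; the real prompt depends on the task.
inputs = tokenizer("sst2 sentence: a gorgeous, witty film", return_tensors="pt")
# Decoder target following the "answer: positive" template.
targets = tokenizer("answer: positive", return_tensors="pt")

with torch.no_grad():
    outputs = model(input_ids=inputs.input_ids, labels=targets.input_ids)

lm_logits = outputs.logits  # shape: (batch, target_length, vocab_size)

# With the t5-base tokenizer, "answer: positive" splits into
# ["answer", ":", "positive", "</s>"], so position 2 is the slot where the
# class word is generated. Comparing the two class-token logits at that slot
# gives a positive-vs-negative score (IDs 1465/2841 come from the snippet above).
score = lm_logits[:, 2, 1465] - lm_logits[:, 2, 2841]
print(score)  # > 0 means the model prefers "positive" at position 2
```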

lsy641 (Author) commented Nov 4, 2022

> Those are token positions in the output sequence. Your two examples come from different models. If I remember correctly, the first line is used when the model is supposed to output "answer: positive/negative", so index 2 selects the vocabulary logits at output position 2, with token IDs 1465 and 2841 representing "positive" and "negative" respectively.

Thank you. I have one more question. Why is "answer: positive <extra_id_2>" always the target of input_start? I mean, why can't it be "answer: positive <extra_id_3>" or "answer: positive <extra_id_4>"? And in the pre-training data, does <extra_id_n> really serve as supervision?

Slash0BZ (Contributor) commented
Not sure what you mean here. Where do you see the target always being <extra_id_2>? These IDs are actually used during the pre-training stage, so there are semantics associated with the different extra IDs.
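
(For reference, a short sketch assuming the model is T5, whose tokenizer defines sentinel tokens <extra_id_0> through <extra_id_99> for span-corruption pre-training. It shows that each sentinel has its own vocabulary ID, and how the sentinels delimit masked spans in a pre-training target, which is why swapping <extra_id_2> for <extra_id_3> would change the target's meaning. The corruption example below is illustrative.)

```python
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")  # assumed checkpoint

# Each sentinel is an ordinary vocabulary entry with its own ID
# (in t5-base they occupy the top of the vocabulary, counting down).
for n in range(3):
    tok = f"<extra_id_{n}>"
    print(tok, "->", tokenizer.convert_tokens_to_ids(tok))

# Span corruption: sentinels in the input mark the masked spans, and the
# target repeats the same sentinels to delimit the spans to reconstruct,
# so the sentinel IDs themselves appear in the supervision signal.
corrupted = "The <extra_id_0> walks in <extra_id_1> park"
target = "<extra_id_0> cute dog <extra_id_1> the <extra_id_2>"
print(tokenizer(corrupted).input_ids)
print(tokenizer(target).input_ids)
```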
