question about case selection in observations #10

Cooperx521 · 2024-06-21T12:02:00Z

Hello : )

Thank you for the brilliant work!

I have a question regarding case selection. I noticed that Figure 2 uses a multi-document question answering task, but I am curious about the generalizability of this observation. For other cases, do we also observe different attention patterns corresponding to different depths, from shallow to deep layers?

Thank you very much for your insights.

Zefan-Cai · 2024-06-24T05:09:51Z

Hi, thank you so much for your support!

This observation generalizes to most cases where the input contains distinct components (e.g., system prompts, documents, ICL examples, instructions). In such cases, localized attention aggregation happens within each component.
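For anyone who wants to check this on their own inputs, below is a minimal sketch of one way to quantify "localized aggregation": for each component, it measures the average fraction of a query token's attention that lands on keys inside the same component. This is not the analysis code behind Figure 2. The function name `within_component_mass`, the span format, and the toy matrix are all hypothetical and chosen only for illustration.

```python
def within_component_mass(attn, spans):
    """Average share of attention that stays inside each component.

    attn  : square list of rows; attn[q][k] is the weight query token q
            places on key token k (one head, one layer). Hypothetical input.
    spans : list of (start, end) half-open token ranges, one per component.
    """
    result = []
    for start, end in spans:
        fractions = []
        for q in range(start, end):
            row = attn[q]
            total = sum(row)
            inside = sum(row[k] for k in range(start, end))
            # Guard against an all-zero row (e.g. a fully masked position).
            fractions.append(inside / total if total else 0.0)
        result.append(sum(fractions) / len(fractions))
    return result


# Toy causal attention matrix over 4 tokens, split into two components.
toy_attn = [
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.1, 0.1, 0.8, 0.0],
    [0.05, 0.05, 0.4, 0.5],
]
toy_spans = [(0, 2), (2, 4)]
scores = within_component_mass(toy_attn, toy_spans)
print(scores)
```

Under causal masking the first component always scores 1.0 because it has nothing earlier to attend to, so the later components are the informative ones. Comparing these scores across layers for a given input is one way to see whether the pattern holds for it.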
