Question about dataset #2

Open
wangyu1997 opened this issue Dec 15, 2021 · 2 comments

Comments


wangyu1997 commented Dec 15, 2021

Hello, could you please share your notebookCDG dataset? You say that the dataset is public and can be used for further study, but I can only see the processed data, in which all the code has been converted into IDs.


Tamal-Mondal commented Dec 15, 2021

In addition, it's mentioned that the preprocessing follows LeClair and McMillan (2019). Could you please add the preprocessing code to the repository? That would greatly help in reproducing the actual results and in understanding the processed data.

Owner

xuyeliu commented Dec 31, 2021

You can find our dataset in the Readme.md (https://www.dropbox.com/s/vpsst1el7f0jqo6/data_notebookcdg.pkl?dl=0). If you want to see the detailed dataset, you can check our dataset link on Hugging Face. I hope this answers your questions.

Thanks
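
For anyone who wants to check what the linked file actually contains, here is a minimal sketch for inspecting it. It assumes the file is an ordinary Python pickle that has been downloaded manually and saved as `data_notebookcdg.pkl` in the working directory; the helper names `load_pickle` and `summarize` are made up for illustration and are not part of the repository. The internal structure of the file is not described in this thread, so the sketch only reports the top-level type and a few entries.

```python
import pickle
from pathlib import Path


def load_pickle(path):
    """Load a pickle file.

    Only use this on files from a source you trust, because
    unpickling can execute arbitrary code.
    """
    with Path(path).open("rb") as f:
        return pickle.load(f)


def summarize(obj, max_items=5):
    """Return a short description of the top-level structure of obj."""
    lines = [f"type: {type(obj).__name__}"]
    if isinstance(obj, dict):
        lines.append(f"keys: {len(obj)}")
        # Show the type stored under the first few keys.
        for key in list(obj)[:max_items]:
            lines.append(f"  {key!r}: {type(obj[key]).__name__}")
    elif isinstance(obj, (list, tuple)):
        lines.append(f"length: {len(obj)}")
        # Show the type of the first few elements.
        for item in obj[:max_items]:
            lines.append(f"  {type(item).__name__}")
    return lines


if __name__ == "__main__":
    # Hypothetical local path: assumes the file was downloaded manually.
    path = Path("data_notebookcdg.pkl")
    if path.exists():
        print("\n".join(summarize(load_pickle(path))))
```

If the file stores objects from third-party libraries, those libraries need to be installed for unpickling to succeed.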
