Comparing sets of files, one per student #1613

Closed
bestchai opened this issue Oct 17, 2024 · 2 comments
Labels
question Further information is requested

Comments

@bestchai

Which component(s) is your question about?
Dolos CLI

What is your question?
Each student submits a set of files for their solution. Is it possible to compare sets of files with the CLI?

For example, student A submits a1.java and a2.java, and student B submits b1.java and b2.java.

I don't want to compare a1.java against a2.java or b1.java against b2.java. But I do want to compare every file from student A against every file from student B.

The docs mention that info.csv can include a 'label' field that could be used to group files. But does Dolos actually skip computing comparisons for files in the same group? My concern is scalability.

@bestchai bestchai added the question Further information is requested label Oct 17, 2024
@rien
Member

rien commented Oct 21, 2024

To respond to your question: labels are only used for visualization and filtering. Submissions with the same label will still be compared with each other, but the labels help to visually distinguish submissions with different labels.

Analyzing multiple files per submission is indeed a feature that we want to support with Dolos. We have an issue describing an approach to tackle it: #1121. However, we do not have anyone who will be working on this in the near future.

As a workaround, you could concatenate all the files of each student into one big file per student. That is how we handle projects where students submit multiple files.
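A minimal sketch of that concatenation step in Python, in case it is useful. It is not part of Dolos. The function name `concatenate_submissions`, the one-directory-per-student layout, and the `// ----` separator comments are assumptions made for illustration, and the output directory is assumed to live outside the submissions directory.

```python
from pathlib import Path


def concatenate_submissions(src_root: Path, out_dir: Path, pattern: str = "*.java") -> list[Path]:
    """Merge every file matching `pattern` under each student directory into one file.

    Assumed layout (hypothetical): src_root/<student>/<files...>.
    Writes out_dir/<student><suffix> and returns the written paths.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for student_dir in sorted(p for p in src_root.iterdir() if p.is_dir()):
        # Sort so the combined file is deterministic across runs.
        files = sorted(student_dir.rglob(pattern))
        if not files:
            continue  # nothing to merge for this student
        target = out_dir / f"{student_dir.name}{files[0].suffix}"
        with target.open("w", encoding="utf-8") as out:
            for source in files:
                # Separator comment; assumes a language with // line comments.
                out.write(f"// ---- {source.relative_to(student_dir)} ----\n")
                out.write(source.read_text(encoding="utf-8", errors="replace"))
                out.write("\n")
        written.append(target)
    return written
```

Calling `concatenate_submissions(Path("submissions"), Path("combined"))` would produce one combined file per student, and those files can then be given to the CLI like any other set of single-file submissions. Files from the same student are then never compared against each other, because they are part of the same submission.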

If scalability is still a problem, you can make the analysis less fine-grained by tweaking the k and w parameters (see https://dolos.ugent.be/docs/running.html#modifying-plagiarism-detection-parameters).

Closing because this is a duplicate, but feel free to continue the discussion here.

@rien closed this as not planned Oct 21, 2024
@bestchai
Author

Thank you for the detailed reply, I appreciate it. Indeed #1121 is the right issue.
