Reports: adding multiple proteins to comma separated file #440

Open
tivdnbos opened this issue Jan 31, 2021 · 3 comments
@tivdnbos
Contributor

When multiple proteins are added to a single field in a report, they are separated by a comma. This happens even when the user chooses the comma separated (csv) format instead of tab, which makes the resulting file ambiguous to parse. It is not a major problem, since the user can use the default tab separated format instead, but I thought you should be aware of this.
This problem might also occur for other features, but I haven't tested this.
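
To illustrate the symptom, here is a minimal Python sketch. It is not PeptideShaker code, and the column values and accessions are made up. It joins one report row naively with the column separator and then splits it again:

```python
# Hypothetical report row: the protein group column holds two accessions
# joined with ", ", as in the report described above.
row = ["PEPTIDEK", "P12345, Q67890", "2"]


def join_row(fields, sep):
    """Naively join fields with the column separator, without any quoting."""
    return sep.join(fields)


tsv_line = join_row(row, "\t")
csv_line = join_row(row, ",")

# Splitting on the separator recovers the original columns for tab,
# but produces one extra column for comma.
tsv_columns = tsv_line.split("\t")
csv_columns = csv_line.split(",")
print(len(tsv_columns), len(csv_columns))
```

With tab as the separator the row comes back as three columns, while with comma the protein group is split in two.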

@hbarsnes hbarsnes transferred this issue from compomics/peptide-shaker-2.0-issue-tracker Jan 31, 2021
@hbarsnes hbarsnes changed the title from "[Bug] Reports: adding multiple proteins to comma separated file" to "Reports: adding multiple proteins to comma separated file" Jan 31, 2021
@hbarsnes hbarsnes self-assigned this Jan 31, 2021
@hbarsnes
Member

Good point! Any suggestions for what we ought to use instead of a comma? We still want the text to be easy to read.

@tivdnbos
Contributor Author

tivdnbos commented Feb 15, 2021

Sorry for my late reply, Harald.

I think the most convenient way is to not provide csv files at all, only the default tsv files, so that no parsing errors can occur. If we want to keep the csv format, perhaps a semicolon could be used instead? I don't think semicolons occur often in protein names.

Best,
Tim

@hbarsnes
Member

Hi Tim,

Now that you mention it, aren't all our default text exports tab based, i.e. tsv files and not csv files? That would mean this issue only occurs if a user chooses comma separated columns and also selects export content, such as protein groups, where a comma is used to separate the values within a column. So the easy fix is simply to not create those types of reports, I guess? ;)

I'm leaning against removing csv as an output format though, as there might be simpler user-defined exports that do not include comma-separated column content.

And from a general point of view, there is really no symbol that can be considered "safe" when it comes to protein names and accession numbers. I do not think we can guarantee that no protein name or accession number will ever contain a comma or a semicolon, for example.
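
One possible way around the lack of a safe symbol is to quote any field that contains the column separator, so that the separator inside the field can stay readable. The sketch below uses Python's standard `csv` module purely as an illustration; it is not what PeptideShaker currently does, and the row values are hypothetical.

```python
import csv
import io

# Hypothetical report row with a comma inside the protein group column.
row = ["PEPTIDEK", "P12345, Q67890", "2"]

# csv.writer defaults to minimal quoting: only fields that contain the
# delimiter, the quote character, or a line break are wrapped in quotes.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
line = buffer.getvalue()

# A quote-aware reader recovers the original three columns.
parsed = next(csv.reader(io.StringIO(line)))
print(line.strip())
print(parsed == row)
```

This only helps if whatever reads the file also understands quoted fields, so it does not remove the case for keeping tab as the default.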

Best regards,
Harald
