Select lines tool which allows selecting lines from a table given their indices. #6476

Open
hechth opened this issue Oct 21, 2024 · 6 comments

Comments

@hechth
Contributor

hechth commented Oct 21, 2024

It would be nice to have a tool which can just select lines defined by the user from a table.

The question is whether to be smart about the header: should there be a boolean for whether the table has a header, and maybe another for whether to copy the header to the output? Indexing would then start either at the first line (counting the header) or at the second line (skipping the header).

Also, what would be the best way to specify the lines? Just a comma-separated list of values? That might be simple but efficient. Then we only need simple validation logic that checks that every value is a valid line number and does not exceed the number of lines in the file.
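A minimal sketch of what such a tool could do, not an existing Galaxy tool. The names `parse_indices` and `select_lines` and the flags `has_header` and `keep_header` are hypothetical, and the sketch assumes 1-based indices that count data lines only when a header is present (the open question above could equally be settled the other way).

```python
from typing import List


def parse_indices(spec: str, n_lines: int) -> List[int]:
    """Parse a comma-separated list of 1-based line numbers and validate it."""
    indices = []
    for token in spec.split(","):
        token = token.strip()
        try:
            value = int(token)
        except ValueError:
            raise ValueError(f"not an integer: {token!r}") from None
        # Every index must point at an existing line.
        if value < 1 or value > n_lines:
            raise ValueError(f"line {value} is outside 1..{n_lines}")
        indices.append(value)
    return indices


def select_lines(lines: List[str], spec: str,
                 has_header: bool = False,
                 keep_header: bool = True) -> List[str]:
    """Return the requested lines, optionally preceded by the header."""
    # Assumption: with a header, index 1 refers to the first data line.
    header = lines[:1] if has_header else []
    body = lines[1:] if has_header else lines
    picked = [body[i - 1] for i in parse_indices(spec, len(body))]
    return (header if keep_header else []) + picked


if __name__ == "__main__":
    table = ["id\tmz", "a\t100.1", "b\t200.2", "c\t300.3"]
    for row in select_lines(table, "1, 3", has_header=True):
        print(row)
```

Keeping parsing and validation in one function means a bad value fails before any output is written, which matches the "simple validation logic" idea.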

@wm75
Contributor

wm75 commented Oct 21, 2024

I'm curious what your use case would be. In which scenario would you know exactly which line numbers you want to keep without knowing enough about the contents to use a tool like Filter1 or Grep1?
Also you can always add a line count to your input with toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/nl/9.3+galaxy1, then use that with the tools above and remove the column afterwards.
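The workaround described here (number the lines, filter on the number, drop the number column again) can be sketched in plain Python. This only mimics the three steps; it does not call the Galaxy tools, and the function names are hypothetical.

```python
from typing import Iterable, List, Set


def number_lines(lines: Iterable[str]) -> List[str]:
    """Step 1: prepend a 1-based line number as a new first column."""
    return [f"{i}\t{line}" for i, line in enumerate(lines, start=1)]


def filter_by_number(numbered: Iterable[str], wanted: Set[int]) -> List[str]:
    """Step 2: keep only rows whose first column is a wanted line number."""
    return [row for row in numbered if int(row.split("\t", 1)[0]) in wanted]


def drop_first_column(rows: Iterable[str]) -> List[str]:
    """Step 3: remove the helper column again."""
    return [row.split("\t", 1)[1] for row in rows]


if __name__ == "__main__":
    data = ["a\t1", "b\t2", "c\t3", "d\t4"]
    kept = drop_first_column(filter_by_number(number_lines(data), {1, 4}))
    for row in kept:
        print(row)
```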

@hechth
Contributor Author

hechth commented Oct 21, 2024

@wm75 It is possible but super cumbersome.

Imagine I have a file with 50 lines and I want lines 1, 2, 5, 6, 10, 13, 17, 18, 25 and 43. You can do this with the tools above, but no "normal" user will manage it.

@wm75
Contributor

wm75 commented Oct 21, 2024

Yes, I can clearly see how that would be very inconvenient atm, but my question was about your use case. When do you actually want to extract just those lines from a dataset?

@hechth
Contributor Author

hechth commented Oct 22, 2024

We have a tool that creates a table of homologue series which can be used to recalibrate mass spectrometry data. The table can have a few hundred entries; after filtering we can get down to 100 candidates. Another tool then chooses 10 of them automatically, but people might also want to choose those manually. The table with the 10 chosen entries is then passed to the next tool.

People might want to look at the table and then choose the entries they want manually. I'd be happy if this could be done reproducibly, without downloading, editing and re-uploading the file, and without using the built-in editor.

I also think that this is quite a generic use case overall ... Choosing just some entries of a table.

@wm75
Contributor

wm75 commented Oct 22, 2024

Table compute is a tool that can do that for you (among many other things):

[image]

But wouldn't users have some objective criteria for their selection, which could then (and should) be expressed as a filter?

@hechth
Contributor Author

hechth commented Oct 25, 2024

Okay, it seems that this tool can do exactly what I was looking for. I'll check it out.

Sometimes the criteria are not objective. In this specific use case, there is a function which gives the user a suggestion, but people might still want to use their own judgement. The automatic suggestion is based on scoring of various parameters with empirical weighting, which is quite subjective.

2 participants