reshape2: cast/melt require en_US.UTF-8, other assorted issues. #6467

Open
hexylena opened this issue Oct 18, 2024 · 1 comment

@hexylena
Member

Locale

en_US isn't my locale, isn't installed at a system level, and isn't installed as part of the conda env.
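One possible way to soften the hard dependency, sketched here for the Sys.setlocale call in the tool's script (quoted further down in this issue), is to fall back to the C locale when en_US.UTF-8 is unavailable. This assumes the locale is only set to keep R's messages untranslated; that intent is a guess and has not been checked against the tool.

```r
# Try the locale the script currently hard-codes. Sys.setlocale() returns ""
# (and emits a warning) when the OS cannot honour the request.
loc <- suppressWarnings(Sys.setlocale("LC_MESSAGES", "en_US.UTF-8"))

if (identical(loc, "")) {
  # Assumption: the "C" locale is an acceptable substitute, because it also
  # yields untranslated (English) messages and needs no extra locale package.
  loc <- Sys.setlocale("LC_MESSAGES", "C")
}
```

Setting "C" unconditionally would be simpler still, if nothing else in the tool relies on en_US.UTF-8.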

cc @bgruening looks like you were the last to touch these.

Variable not found

## Setup R error handling to go to stderr
options(show.error.messages=F, error=function(){cat(geterrmessage(),file=stderr());q("no",1,F)})
loc <- Sys.setlocale("LC_MESSAGES", "en_US.UTF-8")

## Import library
library("reshape2")

input <- read.csv('$input', sep='\t', header=TRUE)
cinput <- dcast(input, ... ~ variable)
write.table(cinput, "output.tabular", sep="\t", quote=FALSE, row.names=FALSE)

I'm not sure where variable should have come from. I suppose it is expected to be a column in my dataset, but then that assumption needs to be made explicit and communicated to the user.
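To make that assumption explicit, the script could check for the column before calling dcast and fail with a readable message. The sketch below is only an illustration: check_long_format is a hypothetical helper that does not exist in the tool, and the use of a column named value as value.var is likewise an assumption.

```r
library(reshape2)

# Hypothetical helper (not part of the tool): stop early with a readable
# message when a column that the dcast() formula refers to is absent.
check_long_format <- function(df, required = "variable") {
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Input is missing column(s): ",
         paste(missing_cols, collapse = ", "),
         ". Columns found: ", paste(names(df), collapse = ", "),
         call. = FALSE)
  }
  invisible(df)
}

# Small long-format sample with the columns the formula expects.
long <- data.frame(
  station  = c(1, 1, 2, 2),
  variable = c("ozone", "wind", "ozone", "wind"),
  value    = c(23.61538, 11.622581, 29.44444, 10.266667)
)

check_long_format(long)

# Same formula as the tool; value.var is spelled out (an assumption here)
# so that dcast() does not have to guess which column holds the values.
wide <- dcast(long, ... ~ variable, value.var = "value")
```

A table without a variable column, for example one that is already wide, would then stop with a message naming the missing column.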

Dependency Issues

I also had to manually update R in that conda env to even get the tools to work; I don't understand how that happened, though. The original resolution of r-reshape2 at version 1.4.2 included an r-base that depends on libgfortran.so.3, which isn't installed alongside that version of R. Deeply unhelpful. Only after a manual conda install 'r-base>3.6' does the tool work at all, because libgfortran3 was missing (only 5 was installed). If we can pin r-base>3.6, it might help folks who want to install this.

Help Section

The help section of this tool would also greatly benefit from two quick tables showing the before/after of the operation. Here are some sample tables for whoever wants to update this tool.

station  ozone     wind       temp
1        23.61538  11.622581  65.54839
2        29.44444  10.266667  79.10000

and

station  variable  value
1        ozone     23.61538
1        wind      11.622581
1        temp      65.54839
2        ozone     29.44444
2        wind      10.266667
2        temp      79.10000
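For whoever writes the help text, the two sample tables can be reproduced with reshape2 roughly as follows. This is a sketch using the sample values above. The row order that melt returns may differ from the hand-written long table.

```r
library(reshape2)

# Wide sample table: one row per station.
wide <- data.frame(
  station = c(1, 2),
  ozone   = c(23.61538, 29.44444),
  wind    = c(11.622581, 10.266667),
  temp    = c(65.54839, 79.10000)
)

# melt: wide -> long, one row per (station, variable) pair.
long <- melt(wide, id.vars = "station")

# dcast: long -> wide again, one column per distinct value of `variable`.
back <- dcast(long, station ~ variable, value.var = "value")
```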
@hexylena
Member Author

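Additionally, I have a dataset with missing values. dcast has a terrible worst case here: when it finds more than one value for a cell, it falls back to length as the aggregation function, so the output contains counts instead of values, which is rather unhelpful.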

  amr    variable value
  <chr>  <chr>    <dbl>
1 tet(M) WF1      100  
2 tet(M) WF3       96  
3 tet(M) WF3       97.4

will produce

> d %>% filter(amr=="tet(M)") %>% dcast(... ~ variable)
Aggregation function missing: defaulting to length
     amr WF1 WF3
1 tet(M)   1   2

It would be helpful to be able to set fun.aggregate to things like first, mean, or sum.
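For reference, dcast already accepts a fun.aggregate argument, so the tool would only need to expose it. A minimal sketch with the three rows above follows. How the option would be wired into the tool is not shown, and first_value is a hypothetical helper, since base R has no aggregation function called first.

```r
library(reshape2)

# The three rows from the example above.
d <- data.frame(
  amr      = c("tet(M)", "tet(M)", "tet(M)"),
  variable = c("WF1", "WF3", "WF3"),
  value    = c(100, 96, 97.4)
)

# Average duplicate cells instead of counting them.
wide_mean <- dcast(d, amr ~ variable, value.var = "value",
                   fun.aggregate = mean)

# Sum duplicate cells.
wide_sum <- dcast(d, amr ~ variable, value.var = "value",
                  fun.aggregate = sum)

# Hypothetical helper: keep a single value per cell.
first_value <- function(x) x[1]
wide_first <- dcast(d, amr ~ variable, value.var = "value",
                    fun.aggregate = first_value)
```

With mean, the WF3 cell becomes the average of 96 and 97.4 rather than the count 2.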
