"NA" groups are omitted from plotly plots #888
Labels: bug, Low Priority
https://nemoanalytics.org/dataset_curator.html?dataset_id=6b3f05b3-0885-4765-8d6b-c449753823b6
Observed while fixing #875
If you add a data series with groups like "NA" or other designated Pandas missing values, Pandas will convert them into a "nan" missing value after the AnnData object is read in. I believe this behavior may have changed from Pandas 1.x to 2.x, but I cannot confirm. This has led to various issues, which I have resolved, but one in particular is that the NA groups are omitted from the plot. I believe this is actually related to the Plotly package, as I have not seen this in tSNE static plots. This has been observed using "dev_state" or "cell_type" as the x-axis with no color.
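A quick way to check how a "nan" group behaves on the Pandas side is sketched below. It only shows that Pandas `groupby` drops missing keys by default (`dropna=True`); whether this is what causes Plotly to omit the group is an assumption and has not been confirmed. The `obs` frame and its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for adata.obs after "NA" labels have become NaN.
obs = pd.DataFrame(
    {
        "dev_state": ["E16", np.nan, "P0", np.nan],
        "expr": [1.0, 2.0, 3.0, 4.0],
    }
)

# Default behaviour: rows whose group key is NaN are silently dropped.
default_totals = obs.groupby("dev_state")["expr"].sum()

# dropna=False keeps NaN as its own group.
kept_totals = obs.groupby("dev_state", dropna=False)["expr"].sum()

print(len(default_totals), len(kept_totals))
```

If the group count differs between the two calls, the series contains missing values that any grouping done with the defaults will skip.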
In addition, passing "nan" to one of the Plotly Express functions raises an error:
KeyError: (nan, '', '', '', '')
where the KeyError originates from the plotting args. In this particular example (link above), I used "dev_state" as both the x-axis and the color param for a scatter plot, and "dev_state" (or "cell_type") has a designated "NA" group. When I switch to a series with no "NA", such as "stage_ord" for the x-axis, and leave color as "dev_state", the issue still persists, which tells me this is related to the color mapping and name. This is not an issue with violin plots, for which I had to write a custom function a few years back.
I think the solution is to find any "nan" values in the adata.obs object and replace them with the string "NA". It's not a foolproof solution, since the missing-value marker originally used in the dataset may be something else, but it at least makes the value a string, which is less prone to weirdness with downstream things.
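A minimal sketch of the proposed fix, assuming it is applied to the `obs` DataFrame right after the AnnData object is read. The helper name `fill_missing_obs` and the example columns are hypothetical, and numeric columns are deliberately left alone so they stay numeric.

```python
import numpy as np
import pandas as pd


def fill_missing_obs(obs: pd.DataFrame, fill_value: str = "NA") -> pd.DataFrame:
    """Return a copy of obs with missing labels replaced by fill_value.

    Only categorical, object and string columns are touched.
    """
    obs = obs.copy()
    for col in obs.columns:
        series = obs[col]
        if isinstance(series.dtype, pd.CategoricalDtype):
            if series.isna().any():
                # A categorical can only hold declared categories,
                # so register the fill value before filling.
                if fill_value not in series.cat.categories:
                    series = series.cat.add_categories(fill_value)
                obs[col] = series.fillna(fill_value)
        elif pd.api.types.is_object_dtype(series) or isinstance(
            series.dtype, pd.StringDtype
        ):
            obs[col] = series.fillna(fill_value)
    return obs


# Made-up example data standing in for adata.obs.
example_obs = pd.DataFrame(
    {
        "dev_state": pd.Categorical(["E16", None, "P0"]),
        "cell_type": ["HC", np.nan, "SC"],
        "stage_ord": [1.0, np.nan, 3.0],
    }
)
fixed_obs = fill_missing_obs(example_obs)
print(fixed_obs)
```

With a loaded AnnData object this would be used as `adata.obs = fill_missing_obs(adata.obs)`. The helper returns a copy rather than modifying in place so the original frame can still be inspected.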