Making this work for different experimental set ups #63

Merged: 16 commits into main from cansavvy/design_matrix on Oct 24, 2024

@cansavvy cansavvy commented Oct 7, 2024

Description

Started by making fake data (added that script here)

Now running through the steps and adjusting how folks can specify the different treatment groups.

This needs more work. @kweav and I will pair program on this on Wednesday.

@cansavvy
Collaborator Author

cansavvy commented Oct 7, 2024

So far the normalization step is up to date, but I still need to walk through the CRISPR score calculation and genetic interaction testing steps to make sure the changes here carry all the way through.

@kweav
Collaborator

kweav commented Oct 21, 2024

The NAs cover only part of each affected column rather than whole columns. I also found that they are limited to a subset of the pgRNA target types. I'm looking into how crispr_df is built and into the left_join steps.

                          colOI percent_na
1                        pg_ids  0.0000000
2                   target_type  0.0000000
3                           rep  0.0000000
4           double_crispr_score  0.0000000
5         single_crispr_score_1  0.8750432 <-- the NAs for these are from "gene_gene" "ctrl_ctrl"
6         single_crispr_score_2  0.9638633 <-- the NAs for these are from "gene_gene" "ctrl_ctrl"
7           pgRNA_target_double  0.0000000
8   mean_single_target_crispr_1  0.8750432 <-- the NAs for these are from "gene_gene" "ctrl_ctrl"
9   mean_single_target_crispr_2  0.9638633 <-- the NAs for these are from "gene_gene" "ctrl_ctrl"
10                   pgRNA1_seq  0.0000000
11                   pgRNA2_seq  0.0000000
12 mean_double_control_crispr_1 99.1956840 <-- the NAs for these are from "gene_gene"
13 mean_double_control_crispr_2 99.1956840 <-- the NAs for these are from "gene_gene"
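
A minimal sketch of how a per-column NA summary like the one above could be produced, and how the NAs can be traced back to target types. The toy data frame and its values are made up for illustration; only the column names come from the table, and scaling by 100 is an assumption about the units.

```r
# Toy data: column names mirror the table above, values are invented.
crispr_df <- data.frame(
  pg_ids = paste0("pg", 1:6),
  target_type = c("gene_gene", "gene_ctrl", "ctrl_gene",
                  "ctrl_ctrl", "gene_gene", "gene_ctrl"),
  double_crispr_score = c(0.2, -0.1, 0.4, 0.0, -0.3, 0.1),
  single_crispr_score_1 = c(NA, -0.1, 0.4, NA, NA, 0.1),
  stringsAsFactors = FALSE
)

# Percent of NA values in each column (is.na() on a data frame
# returns a logical matrix, so colMeans() gives the NA fraction).
percent_na <- data.frame(
  colOI = names(crispr_df),
  percent_na = 100 * colMeans(is.na(crispr_df)),
  row.names = NULL
)

# Cross-tabulate NA status against target type for one column of interest.
na_by_type <- table(
  target_type = crispr_df$target_type,
  is_na = is.na(crispr_df$single_crispr_score_1)
)

# Target types that contribute at least one NA to that column.
na_target_types <- sort(unique(
  crispr_df$target_type[is.na(crispr_df$single_crispr_score_1)]
))

print(percent_na)
print(na_by_type)
```

In this toy example the NAs in single_crispr_score_1 come only from the "gene_gene" and "ctrl_ctrl" rows, which is the pattern described in the annotations above.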

@@ -76,14 +57,7 @@ calc_crispr <- function(.data = NULL,
# fact that single-targeting pgRNAs generate only two double-strand breaks
# (1 per allele), whereas the double-targeting pgRNAs generate four DSBs.
# To do this, we set the median (adjusted) LFC for unexpressed genes of each group to zero.
crispr_score = lfc_adj - median,
# TODO: I think this n_genes_expressed variable is never used so can we eliminate?
Collaborator Author

We don't end up using this, so it's deleted.
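
For context, the centering step described in the code comments above can be sketched as follows. This is a base-R illustration only, not the package's implementation: the `unexpressed` flag, the grouping by `rep`, and the values are assumptions.

```r
# Toy data: `unexpressed` is a hypothetical flag marking rows that
# belong to unexpressed genes; `rep` stands in for the grouping.
lfc_df <- data.frame(
  rep = rep(c("rep1", "rep2"), each = 4),
  unexpressed = rep(c(TRUE, TRUE, FALSE, FALSE), times = 2),
  lfc_adj = c(0.1, 0.3, -1.0, -0.5, -0.2, 0.0, -1.2, -0.4),
  stringsAsFactors = FALSE
)

# Median adjusted LFC of the unexpressed rows within each group.
is_unexp <- lfc_df$unexpressed
group_median <- tapply(lfc_df$lfc_adj[is_unexp], lfc_df$rep[is_unexp], median)

# Subtract each row's group median so that the unexpressed genes
# of every group are centered on zero.
lfc_df$median <- as.numeric(group_median[lfc_df$rep])
lfc_df$crispr_score <- lfc_df$lfc_adj - lfc_df$median

# Check: the per-group median of unexpressed rows is now (numerically) zero.
centered <- tapply(lfc_df$crispr_score[is_unexp], lfc_df$rep[is_unexp], median)
print(centered)
```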

@@ -107,32 +81,37 @@ calc_crispr <- function(.data = NULL,
target_type == "gene_ctrl" ~ gRNA1_seq,
target_type == "ctrl_gene" ~ gRNA2_seq
),
control_gRNA_seq = case_when(
Collaborator Author

Some of these steps had to be rearranged; @kweav and I worked through this together. The double-control CRISPR scores needed to be aligned with the single-gene targets they relate to.
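
A rough sketch of the kind of alignment described here, in base R rather than the case_when() call used in the package. The targeting-sequence mapping follows the diff above; the control-sequence mapping, the `ctrl_means` lookup table, and all values are assumptions made for illustration.

```r
# Toy single-targeting pgRNAs; sequences are invented.
pg_df <- data.frame(
  target_type = c("gene_ctrl", "ctrl_gene", "gene_ctrl"),
  gRNA1_seq = c("AAAA", "CCCC", "GGGG"),
  gRNA2_seq = c("TTTT", "ACGT", "TGCA"),
  stringsAsFactors = FALSE
)

gene_first <- pg_df$target_type == "gene_ctrl"
ctrl_first <- pg_df$target_type == "ctrl_gene"

# Targeting gRNA: position 1 for gene_ctrl, position 2 for ctrl_gene.
pg_df$targeting_gRNA_seq <- NA_character_
pg_df$targeting_gRNA_seq[gene_first] <- pg_df$gRNA1_seq[gene_first]
pg_df$targeting_gRNA_seq[ctrl_first] <- pg_df$gRNA2_seq[ctrl_first]

# Control gRNA: assumed to be the opposite position.
pg_df$control_gRNA_seq <- NA_character_
pg_df$control_gRNA_seq[gene_first] <- pg_df$gRNA2_seq[gene_first]
pg_df$control_gRNA_seq[ctrl_first] <- pg_df$gRNA1_seq[ctrl_first]

# Hypothetical lookup of mean double-control scores keyed by control sequence.
ctrl_means <- data.frame(
  control_gRNA_seq = c("TTTT", "CCCC", "TGCA"),
  mean_double_control_crispr = c(0.05, -0.02, 0.01),
  stringsAsFactors = FALSE
)

# Align each single-targeting row with the double-control score
# that shares its control gRNA sequence.
idx <- match(pg_df$control_gRNA_seq, ctrl_means$control_gRNA_seq)
pg_df$mean_double_control_crispr <- ctrl_means$mean_double_control_crispr[idx]
print(pg_df)
```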

mean_observed_crispr = mean(double_crispr_score, na.rm = TRUE)
) %>%
group_modify(~ broom::tidy(lm(mean_observed_crispr ~ mean_expected_crispr, data = .x)))
mean_expected_double_crispr = mean(expected_crispr_double, na.rm = TRUE)
Collaborator Author

@cansavvy cansavvy Oct 23, 2024

Some of the stats needed adjusting because the single and double CRISPR scores are each calculated in their own linear model. @kweav and I pair programmed on this, referencing the original code.
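
To illustrate the per-group linear model idea, here is a base-R sketch that fits one model per replicate and collects the coefficients. It is a stand-in for the group_modify() and broom::tidy() pattern in the diff, not a copy of it; the data and the choice of `rep` as the grouping variable are assumptions.

```r
# Toy mean expected vs. observed scores for two replicates.
score_df <- data.frame(
  rep = rep(c("rep1", "rep2"), each = 4),
  mean_expected_crispr = c(-1, 0, 1, 2, -1, 0, 1, 2),
  mean_observed_crispr = c(-1.9, 0.1, 2.1, 3.9, -0.4, 0.1, 0.4, 1.1),
  stringsAsFactors = FALSE
)

# Fit a separate linear model within each group.
fits <- lapply(split(score_df, score_df$rep), function(d) {
  lm(mean_observed_crispr ~ mean_expected_crispr, data = d)
})

# Collect intercept and slope per group into one long data frame.
coef_df <- do.call(rbind, lapply(names(fits), function(g) {
  est <- coef(fits[[g]])
  data.frame(rep = g, term = names(est), estimate = unname(est),
             stringsAsFactors = FALSE)
}))
print(coef_df)
```

The same pattern would apply separately to the single and double scores, each with its own expected and observed columns.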


if (rm_ids_wo_annot){
  lfc_df <- lfc_df %>%
    filter(!pg_ids %in% missing_ids$missing_ids)
Collaborator Author

The missing IDs weren't being fully dropped because missing_ids was a vector stored inside a data.frame, so that is fixed here.
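
A small base-R sketch of the pitfall, assuming the earlier code matched against the data frame itself rather than its column (the exact previous code is not shown in this hunk). The toy IDs and values are invented.

```r
# Toy log-fold-change table and a data frame holding the IDs to drop.
lfc_df <- data.frame(
  pg_ids = c("pg1", "pg2", "pg3", "pg4"),
  lfc = c(0.5, -1.2, 0.3, 0.0),
  stringsAsFactors = FALSE
)
missing_ids <- data.frame(missing_ids = c("pg2", "pg4"),
                          stringsAsFactors = FALSE)
rm_ids_wo_annot <- TRUE

# Pitfall: %in% against the data frame compares each ID with whole
# columns (list elements), not with the individual IDs inside them,
# so IDs that should be flagged can be missed.
n_flagged_df <- sum(lfc_df$pg_ids %in% missing_ids)

# Fix: pull the vector out of the data frame before matching.
n_flagged_vec <- sum(lfc_df$pg_ids %in% missing_ids$missing_ids)

if (rm_ids_wo_annot) {
  keep <- !lfc_df$pg_ids %in% missing_ids$missing_ids
  lfc_df <- lfc_df[keep, , drop = FALSE]
}
print(lfc_df)
```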

@cansavvy
Collaborator Author

Okay. Vignettes are updated and tests are passing. Going to merge this and we can continue to polish in a new PR.

@cansavvy cansavvy merged commit 4fcd828 into main Oct 24, 2024
4 of 7 checks passed
@cansavvy cansavvy deleted the cansavvy/design_matrix branch October 24, 2024 14:15
@cansavvy cansavvy restored the cansavvy/design_matrix branch October 29, 2024 18:15
2 participants