2. Note that we provide extensive code samples throughout this document. In these samples the tilde character (~) is used as a shortcut for the user’s home directory on Unix-like systems. On Microsoft Windows, the forward slashes (/) that separate the components of a file path will need to be replaced with backslashes (\).
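For readers who call the file paths from their own Python code rather than from a shell, the tilde convention described in this note can be resolved portably. The following is a minimal sketch using only the Python standard library; the function name and the example path are hypothetical and do not come from the scripts accompanying this chapter.

```python
from pathlib import Path


def resolve_user_path(path_text: str) -> Path:
    """Expand a leading "~" to the current user's home directory.

    pathlib also takes care of the platform-specific path separator, so the
    same forward-slash string can be used on Unix-like systems and Windows.
    """
    return Path(path_text).expanduser()


# Hypothetical example path; substitute the location of your own files.
print(resolve_user_path("~/screens/plate_01.txt"))
```

Paths that do not start with a tilde are returned unchanged.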
6. BioGRID is a database of experimentally determined molecular interactions [20]. The web interface to BioGRID allows users to download the entire database in Tab 2.0 format, as well as the interactions associated with a specific gene. An alternative source is PathwayCommons [23], which integrates protein-protein, gene-regulatory, kinase-substrate, and other molecular relationships. More specialized data sources include PhosphoSitePlus [24] (kinase-substrate relationships) and HINT [25] (high-confidence protein-protein interactions).
7. Detailed instructions on how to use cellHTS2 can be found in an R vignette titled “End-to-end analysis of cell-based screens.” Once the cellHTS2 package is installed, the command ‘browseVignettes("cellHTS2")’ can be entered into the R console to reveal links to this and other relevant vignettes.
8. In our experience, luminescence values from multiwell siRNA screens tend to be positively skewed and show a log-normal distribution. It is thus preferable to log transform values prior to normalization. Setting the “log” argument of the “normalizePlates” function to “TRUE” and the “scale” argument to “multiplicative” instructs cellHTS2 to first log transform the luminescence values and then subtract the plate median values from each value on a plate.
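The arithmetic behind the transformation described in Note 8 can be sketched in a few lines of Python. This is an illustration only and not the cellHTS2 implementation: the function name, the dictionary layout, and the choice of base 2 for the logarithm are assumptions made for the example.

```python
import math
from statistics import median


def log_median_center(plates):
    """Log-transform raw values and subtract each plate's median.

    plates: dict mapping a plate identifier to a list of positive raw
    luminescence values (hypothetical layout chosen for this sketch).
    Returns a dict with the same keys and the centered log2 values.
    """
    centered = {}
    for plate_id, raw_values in plates.items():
        logged = [math.log2(value) for value in raw_values]  # base 2 assumed
        plate_median = median(logged)
        centered[plate_id] = [value - plate_median for value in logged]
    return centered


# Toy data: each well doubles the previous one, so the centered
# log2 values are one unit apart around a plate median of zero.
print(log_median_center({"plate_1": [100, 200, 400]}))
```

Subtracting the median on the log scale is equivalent to dividing the raw values by the plate median, which is why the “multiplicative” scale is paired with the log transform.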
9. Z-normalization in the classical sense refers to adjusting a set of normally distributed values such that they have a mean of zero and a standard deviation of one. For idealized normally distributed Z-scores, approximately 95% of the values are expected to fall between Z = −2 and Z = +2 and approximately 99.7% between Z = −3 and Z = +3. Log-transformed and plate-centered luminescence values from siRNA screens often have negatively skewed distributions that are not well described by statistics such as the mean and standard deviation. As an alternative to standard Z-score normalization we use robust Z-normalization, in which the median value is subtracted from all log-transformed plate-centered values and the results are then divided by the median absolute deviation (MAD) of the distribution. This results in approximately 95% of the values falling between Z = −2 and Z = +2. Thus, siRNAs that produce a Z-score of <−2 (or more stringently, <−3) are interpreted as causing a decrease in viability.
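The robust Z-normalization described in Note 9 can be sketched as follows. This is a minimal illustration, not the code used in our pipeline. The function name is hypothetical, and the default scaling constant of 1.4826 is an assumption: it is the default used by the “mad” function in R and makes the MAD comparable to the standard deviation for normally distributed data, which is consistent with roughly 95% of values falling between −2 and +2.

```python
from statistics import median

# Assumption: scale the MAD so that it estimates the standard deviation
# under normality (the default constant of R's mad function).
MAD_SCALE = 1.4826


def robust_z_scores(values, scale=MAD_SCALE):
    """Return (value - median) / (scale * MAD) for each value."""
    center = median(values)
    mad = median([abs(value - center) for value in values])
    if mad == 0:
        raise ValueError("MAD is zero; robust Z-scores are undefined")
    return [(value - center) / (scale * mad) for value in values]


# Toy input of log-transformed, plate-centered values; the last value is
# much lower than the rest and receives a strongly negative score.
scores = robust_z_scores([0.1, -0.2, 0.0, 0.3, -0.1, -2.5])
print([round(score, 2) for score in scores])
```

Because both the center and the spread are medians, a handful of strongly reduced viability values has little influence on the scores of the remaining siRNAs.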
11. Defining functional mutations in cancer driver genes can be difficult. In some cases (e.g., amplification of a gene such as ERBB2) the functional relevance of an alteration is well established. In many cases, however, especially those involving missense mutations, the functional relevance of an alteration is uncertain. We previously developed a simple pipeline to classify mutations and copy number changes as either of likely functional relevance or of uncertain relevance [5]. For tumor suppressor genes we classify homozygous deletions, mutations predicted to cause a truncation (frameshift, nonsense, or splice site alteration), and missense mutations found to occur recurrently in tumors as functionally relevant. For oncogenes, we classify amplification events and recurrent missense mutations as functionally relevant. Mutations other than these are classified as of uncertain relevance, and cell lines harboring these mutations are excluded from our association tests.
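The classification rules in Note 11 can be written down as a small decision function. The sketch below is an illustration of those rules only and is not the pipeline from [5]: the function name, the string labels for gene roles and alteration types, and the “recurrent” flag are all hypothetical, and how recurrence is determined is left outside the example.

```python
# Hypothetical labels for alterations predicted to truncate the protein.
TRUNCATING = {"frameshift", "nonsense", "splice_site"}


def is_functionally_relevant(gene_role, alteration, recurrent=False):
    """Apply the Note 11 rules to a single alteration.

    gene_role:  "tumor_suppressor" or "oncogene" (hypothetical labels)
    alteration: e.g. "homozygous_deletion", "amplification", "missense",
                or one of the TRUNCATING labels
    recurrent:  True if a missense mutation occurs recurrently in tumors
    Returns True for likely functional relevance and False for uncertain
    relevance.
    """
    recurrent_missense = alteration == "missense" and recurrent
    if gene_role == "tumor_suppressor":
        return (
            alteration == "homozygous_deletion"
            or alteration in TRUNCATING
            or recurrent_missense
        )
    if gene_role == "oncogene":
        return alteration == "amplification" or recurrent_missense
    raise ValueError(f"unknown gene role: {gene_role!r}")


print(is_functionally_relevant("oncogene", "amplification"))
print(is_functionally_relevant("tumor_suppressor", "missense"))
```

Cell lines whose only alterations in a driver gene return False would be excluded from the association test for that gene, as described above.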
12. By default the “annotate_dependencies.py” script assumes that the interactions provided in the input file are undirected (i.e., the interaction (a, b) is the same as the interaction (b, a)). Using the argument “-d” changes this default behavior such that the interactions are treated as directed. This may be more appropriate when the input interactions are directional; e.g., for RB1-associated dependencies it may make sense to highlight associations between RB1 and genes that it regulates, but not associations involving genes that regulate RB1.
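The difference between the undirected default and the directed mode described in Note 12 can be illustrated with a short sketch. This is not the code of “annotate_dependencies.py”; the function names are hypothetical, and TARGET_A and REGULATOR_B are placeholder gene names rather than real interaction partners of RB1.

```python
def build_interaction_lookup(pairs, directed=False):
    """Store interactions as a set of (source, target) tuples.

    For an undirected network each pair is stored in both orientations,
    so (a, b) and (b, a) are treated as the same interaction.
    """
    lookup = set()
    for source, target in pairs:
        lookup.add((source, target))
        if not directed:
            lookup.add((target, source))
    return lookup


def is_known_interaction(lookup, driver, dependency):
    """Check whether the driver gene has an interaction with the dependency."""
    return (driver, dependency) in lookup


# Placeholder interactions: RB1 acts on TARGET_A, REGULATOR_B acts on RB1.
pairs = [("RB1", "TARGET_A"), ("REGULATOR_B", "RB1")]

undirected = build_interaction_lookup(pairs)
directed = build_interaction_lookup(pairs, directed=True)

print(is_known_interaction(undirected, "RB1", "REGULATOR_B"))  # both orientations stored
print(is_known_interaction(directed, "RB1", "REGULATOR_B"))    # only REGULATOR_B to RB1 stored
```

In the directed lookup, a dependency on REGULATOR_B would not be highlighted for RB1, whereas a dependency on TARGET_A would be.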
Davoli T et al (2013) Cumulative haploinsufficiency and triplosensitivity drive aneuploidy patterns and shape the cancer genome. Cell 155(4):948–962
Lawrence MS et al (2014) Discovery and saturation analysis of cancer genes across 21 tumour types. Nature 505(7484):495–501
Yaffe MB (2013) The scientific drunk and the lamppost: massive sequencing efforts in cancer discovery and treatment. Sci Signal 6(269):pe13
Brough R et al (2011) Functional viability profiles of breast cancer. Cancer Discov 1(3):260–273
Campbell J et al (2016) Large-scale profiling of kinase dependencies in cancer cell lines. Cell Rep 14(10):2490–2501
Cheung HW et al (2011) Systematic investigation of genetic vulnerabilities across cancer cell lines reveals lineage-specific dependencies in ovarian cancer. Proc Natl Acad Sci U S A 108(30):12372–12377
Cowley GS et al (2014) Parallel genome-scale loss of function screens in 216 cancer cell lines for the identification of context-specific genetic dependencies. Sci Data 1:140035
Hart T et al (2015) High-resolution CRISPR screens reveal fitness genes and genotype-specific cancer liabilities. Cell 163(6):1515–1526
Kim HS et al (2013) Systematic identification of molecular subtype-selective vulnerabilities in non-small-cell lung cancer. Cell 155(3):552–566
Marcotte R et al (2016) Functional genomic landscape of human breast cancer drivers, vulnerabilities, and resistance. Cell 164(1–2):293–309
Moser R et al (2014) Functional kinomics identifies candidate therapeutic targets in head and neck cancer. Clin Cancer Res 20(16):4274–4288
Luo J, Solimini NL, Elledge SJ (2009) Principles of cancer therapy: oncogene and non-oncogene addiction. Cell 136(5):823–837
Lord CJ, Tutt AN, Ashworth A (2015) Synthetic lethality and cancer therapy: lessons learned from the development of PARP inhibitors. Annu Rev Med 66:455–470
Helming KC et al (2014) ARID1B is a specific vulnerability in ARID1A-mutant cancers. Nat Med 20(3):251–254
Hsu TY et al (2015) The spliceosome is a therapeutic vulnerability in MYC-driven cancer. Nature 525(7569):384–388
Kelley R, Ideker T (2005) Systematic interpretation of genetic interactions using protein networks. Nat Biotechnol 23(5):561–566
Lord CJ et al (2008) A high-throughput RNA interference screen for DNA repair determinants of PARP inhibitor sensitivity. DNA Repair (Amst) 7(12):2010–2019
Boutros M, Bras LP, Huber W (2006) Analysis of cell-based RNAi screens. Genome Biol 7(7):R66
Zhang JH, Chung TD, Oldenburg KR (1999) A simple statistical parameter for use in evaluation and validation of high throughput screening assays. J Biomol Screen 4(2):67–73
Chatr-Aryamontri A et al (2015) The BioGRID interaction database: 2015 update. Nucleic Acids Res 43(Database issue):D470–D478
Jackson AL, Linsley PS (2010) Recognizing and avoiding siRNA off-target effects for target identification and therapeutic application. Nat Rev Drug Discov 9(1):57–67
Forbes SA et al (2015) COSMIC: exploring the world’s knowledge of somatic mutations in human cancer. Nucleic Acids Res 43(Database issue):D805–D811
Cerami EG et al (2011) Pathway Commons, a web resource for biological pathway data. Nucleic Acids Res 39(Database issue):D685–D690
Hornbeck PV et al (2015) PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res 43(Database issue):D512–D520
Das J, Yu H (2012) HINT: high-quality protein interactomes and their applications in understanding human disease. BMC Syst Biol 6:92