Chris Miller

Bioinformatics Grad student at Baylor College of Medicine. My online home is at
RT @GraveleyLab: Please don't put the impact factor of the journal your papers are in on your CV!
RT @kerencarss: This is an important article. WGS has better coverage uniformity than WES, and less bias in variant calling.
You’re not allowed bioinformatics anymore
Answer: A: Puzzle on size of Bam file -
Like Istvan said - don't worry about the size, count the number of lines in your bam file to see the effect. - Chris Miller
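One way to do that count, as a minimal sketch: it assumes the BAM has first been decoded to SAM text (for example by piping the output of `samtools view` into the script) and simply counts alignment records. The function name `count_alignments` is made up for illustration.

```python
def count_alignments(sam_lines):
    """Count alignment records in SAM text.

    Header lines start with '@' and blank lines carry no record,
    so both are skipped.
    """
    return sum(
        1 for line in sam_lines
        if line.strip() and not line.startswith("@")
    )


# Hypothetical usage from a shell pipeline:
#   samtools view in.bam | python -c "import sys; ..."
# or, inside Python, pass any iterable of SAM text lines:
example = [
    "@HD\tVN:1.6\n",
    "r1\t0\tchr1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t16\tchr1\t5\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
]
print(count_alignments(example))
```

Comparing this count between two files says more about how many reads differ than the file sizes do, since compression makes BAM size a poor proxy for read count.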
Answer: A: sciClone package -> mosaic copy number correction of VAFs? -
With this version of sciClone, we made the explicit decision not to correct VAFs for copy number, and instead to exclude anything that falls in a non-CN-neutral region. Correction is trickier than it sounds once you take heterogeneity into account. Consider the following example: I have a founding clone population at a VAF of 50%, and two subclones at 36% and 18%. Now, I find a mutation with a VAF of 12% in a region of copy number 3. Is that two copies of the mutation, present in the 18% subclone, or a single copy of the mutation, present in the 36% subclone? There are some tricks you can do with phasing the variants to help disambiguate some of these cases, and we're working on them for the next version, but in general, it's a difficult problem. That said, if you are highly confident that your copy number calls are accurate, you can use each segment of CN alteration to generate pseudo-VAFs. So in your example, with a highly confident CN region of 1.5, you could add it... - Chris Miller
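The ambiguity can be made concrete with a small sketch. It uses a deliberately simplified model that is an assumption here, not sciClone's method: a pure tumor sample in which every cell has the same total copy number at the locus, so that expected VAF = (fraction of cells with the mutation) × (mutated copies) / (total copies). The absolute numbers depend on those modelling choices (purity, which cells carry the extra copy), so the point is the collision between the two explanations rather than reproducing the 12% figure above. `expected_vaf` is a hypothetical helper.

```python
def expected_vaf(cell_fraction, mutated_copies, total_copies):
    """Expected VAF when `cell_fraction` of cells carry `mutated_copies`
    of the variant and every cell has `total_copies` at the locus.
    Assumes a pure tumor sample (no normal contamination)."""
    return cell_fraction * mutated_copies / total_copies


# In a diploid region a heterozygous mutation has VAF = cell_fraction / 2,
# so subclone VAFs of 36% and 18% correspond to 72% and 36% of cells.
frac_36 = 0.36 * 2
frac_18 = 0.18 * 2

# Explanation A: one mutated copy, in the 36% subclone, at copy number 3.
one_copy_in_36 = expected_vaf(frac_36, mutated_copies=1, total_copies=3)
# Explanation B: two mutated copies, in the 18% subclone, at copy number 3.
two_copies_in_18 = expected_vaf(frac_18, mutated_copies=2, total_copies=3)

# Both explanations predict the same VAF under this model,
# so the VAF alone cannot tell them apart.
print(one_copy_in_36, two_copies_in_18)
```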
Answer: A: sciClone error with analysis -
Make sure the results/ folder exists before running the script. I'll check the code in the morning and push a change if need be. - Chris Miller
Comment: C: Identifying the tumor clones and subclones with VAF from VarScan2 -
Sorry for the slow response, I've been out of town. Short answer: That input is optional and may not be relevant to all cancers, so you may be fine without it. Long answer: I believe we take the frequencies of germline SNPs in the tumor genome and look for large regions where there are no het sites, caused by CN-neutral LOH (aka UPD). You can segment them into discrete regions using CBS, as implemented in the DNAcopy package for R. Plotting on a per-chromosome basis should also make them clearly evident, if they exist. - Chris Miller
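As a rough illustration of the idea of looking for stretches with no het sites, here is a minimal sketch. It is a crude threshold-and-run scan, not CBS and not what DNAcopy does, and it assumes the input has already been restricted to SNPs that are heterozygous in the matched normal. The function name `loh_runs` and the cutoffs (`low`, `high`, `min_sites`) are illustrative assumptions, not recommended values.

```python
def loh_runs(snps, low=0.35, high=0.65, min_sites=5):
    """Find runs of consecutive SNPs whose tumor VAF is outside [low, high].

    `snps` is an iterable of (chrom, position, tumor_vaf) tuples, sorted by
    chromosome and position. Returns a list of
    (chrom, start_position, end_position, n_sites) tuples.
    """
    runs = []
    current = []  # non-het SNPs in the run currently being built

    def flush():
        # Keep the run only if it contains enough sites, then reset.
        if len(current) >= min_sites:
            runs.append(
                (current[0][0], current[0][1], current[-1][1], len(current))
            )
        current.clear()

    for chrom, pos, vaf in snps:
        is_het = low <= vaf <= high
        # A het site or a new chromosome ends the current run.
        if current and (is_het or chrom != current[0][0]):
            flush()
        if not is_het:
            current.append((chrom, pos, vaf))
    flush()
    return runs
```

A proper segmentation method will handle noisy sites far better than a hard cutoff; the sketch is only meant to show what "a large region with no het sites" looks like in the data.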
“Remember kids, the only difference between screwing around and science is writing it down.” --Adam Savage
RT @PlantEvolution: Several #MPMI2014 speakers talking of “our bioinformatician” w/o mentioning names - bioinformaticians aren’t pets! #bioinformatics
New Challenges of Next-Gen Sequencing
As of 6:09 am today, I am over 1 billion seconds old. Here's to another two billion at least!
Oligotyping analysis of the human oral microbiome and solid commentary by Carl Zimmer
Just saw "In the late 1900s" used to describe when DNAseq was starting. Seems awfully historical for something that's just really taking off
Whole genome and exome sequencing of monozygotic twins discordant for Crohn's disease
"It is a good morning exercise for a research scientist to discard a pet hypothesis every day before breakfast" - Konrad Lorenz
RT @random_walker: Google Scholar should rename the "Recommended based on My Citations" page to "Papers that should cite my work but don't."
Want to live in a walkable area in STL? These are your options (and some are a stretch). Via
bc it makes too much sense: "Why not share the cost of paying for roads in proportion to the usage made of roadways?"
Answer: A: Identifying the tumor clones and subclones with VAF from VarScan2 -
We built the sciClone package for exactly this purpose: It takes inputs of somatic mutations, with readcounts and VAFs, and uses that information to infer subclonal populations in heterogeneous tumors. It also gives you some nice visualization options. - Chris Miller
Comment: C: Will Picard MarkDuplicates also un-mark duplicates? -
Well played, sir.  In retrospect, doing the experiment myself would have been faster than that 10 minutes of searching and typing this post.  Thanks! - Chris Miller
Will Picard MarkDuplicates also un-mark duplicates? -
If I take a bam that's already aligned and has dups marked, remove a bunch of reads, then re-run Picard's mark-duplicates, will it correctly change the flags of reads that are no longer duplicates (but may have been before ditching the reads)? This question is proving surprisingly hard to google for. - Chris Miller
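The experiment itself is easy to set up: count how many reads carry the duplicate flag before and after the second MarkDuplicates run. A minimal sketch, assuming each BAM is decoded to SAM text first (for example with `samtools view`); `count_duplicates` is a hypothetical helper, and the only fact it relies on is that bit 0x400 of the SAM FLAG field marks a PCR or optical duplicate.

```python
DUPLICATE_FLAG = 0x400  # SAM FLAG bit: read is a PCR or optical duplicate


def count_duplicates(sam_lines):
    """Return (total_reads, duplicate_reads) for an iterable of SAM text lines."""
    total = 0
    dups = 0
    for line in sam_lines:
        # Skip blank lines and '@' header lines.
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        total += 1
        if flag & DUPLICATE_FLAG:
            dups += 1
    return total, dups
```

Running this on the filtered BAM before and after re-marking, and comparing the two duplicate counts, shows directly whether any flags were cleared.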
