Version 0.9.1

Highlights: Useful enhancements and changes to plotting and segmentation, and a new script for single-exon CNV testing. Plus, bug fixes and usability improvements to avoid unexpected errors. (#250, #255, #262, etc.)

Dependencies
------------

- Compatible with the most recent pandas version 0.21.0 (#273, #274; thanks @chapmanb)
- R dependencies were reduced to simplify installation

Scripts
-------

- Renamed "cnn_*.py" to "cnv_*.py"
- New script "cnv_ztest.py" to detect single-bin (e.g. single exon) deep deletions and high-level amplifications.
- In "cnv_updater.py", rename "Background" (i.e. off-target) bins to "Antitarget", in addition to adding a "depth" column if it's missing.

Commands
--------

`autobin`:

- Raise the maximum target/antitarget bin sizes to 50kb/1Mb.

`fix`:

- Allow specifying sample_id via ``--sample-id``/``-id``, in case the input coverage filenames do not have the expected form "sample_id.targetcoverage.cnn" and "sample_id.antitargetcoverage.cnn". (#269; thanks @chapmanb)

`segment`:

- Process each chromosome arm separately (with 'cbs' and 'haar', but not 'flasso'). Centromere locations are guessed from the largest gap between sequencing-accessible regions, and are not necessarily the true locations, although they do match fairly well on the human genome.
- Logging of dropped bins is streamlined somewhat.
- New method `-m none` to only calculate arm-level segment means (for testing and experimentation).

`scatter`:

- Highlight non-neutral segments from .call.cns. If segments have the columns 'cn' and potentially also 'cn1' and 'cn2' (as added by the `call` command), use those fields to display copy number alterations, LOH and allelic imbalance with colorized segments (orange by default), and use gray for neutral segments. If a VCF is also given, the same is done for SNVs in the lower panel. Otherwise, all segments are colorized as before.
  (#18, #157)
- New option `--by-bins` to display x-axis positions by sequential bin number on each chromosome, rather than genomic coordinates. This makes the plots much more useful with targeted amplicon sequencing data, or very small gene panels. (#63)
- Trend line (`--trend`) now accounts for bin weights, which generally results in a better fit.
- Improved interaction of -c and -g options:
    - Only apply the window margin (-w) if -g is used alone, or -c specifies a small chromosomal region with no genes.
    - Allow an empty gene list (-g '' or -g ',') to prevent highlighting and labeling of any genes / small non-genic "Selection" in the -c region.
    - If any gene in -g is not fully within the region specified by -c, name that gene and its coordinates in the error message.
    - If the -c region has size <=0, show a specific error message.
- Handle NaN log2 values when calculating y-axis limits.

`heatmap`:

- Incorporate the `--by-bins` argument to match `scatter`. (#63)
- Warn if selected region contains no data for a sample. This helps troubleshoot if a chromosome name was mis-specified on the command line. (#268)

`export seg`:

- Change column headers to match DNAcopy output. The column headers generally don't matter in the SEG format, but the DNAcopy dataframe is considered the canonical form.

Python API
----------

- cnvlib.do_segment -- new keyword argument min_weight to drop bins with 'weight' below the specified value. If not used, then only bins with weight 0 will be dropped. This feature is not recommended for normal usage and is not available on the command line.
- cnvlib.do_scatter -- Remove deprecated keyword argument 'background_marker' in favor of 'antitarget_marker', corresponding to `scatter` options deprecated in v0.9.0.
- cnvlib.cnary.CopyNumArray: Add method 'smoothed', which calculates the trendline displayed by the `scatter` command.
- skgenome.tabio: Add read support for samtools 'dict' format, which resembles the plain-text SAM header and can contain chromosome names and sizes.
- skgenome.gary.GenomicArray: Add magic methods __bool__ (Py3) and __nonzero__ (Py2) to ensure an empty GenomicArray, i.e. 0 rows, is treated as false-ish on both Python 2.7 and 3.x.
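
The `__bool__`/`__nonzero__` pairing in the last item can be illustrated with a minimal sketch. The `RowTable` class and its `rows` attribute are hypothetical stand-ins invented for this example; this is not the actual skgenome code, only the general pattern for making an empty container falsy on both Python 2.7 and 3.x.

```python
class RowTable(object):
    """Hypothetical stand-in for a table-backed array (not skgenome's class)."""

    def __init__(self, rows):
        # Store the rows as a plain list; the real class wraps a table instead
        self.rows = list(rows)

    def __bool__(self):
        # Python 3 truth test: true only when at least one row is present
        return len(self.rows) > 0

    # Python 2.7 looks up __nonzero__ instead of __bool__, so alias it
    __nonzero__ = __bool__


empty = RowTable([])
filled = RowTable([("chr1", 0, 100)])

for name, table in [("empty", empty), ("filled", filled)]:
    # The truth test goes through __bool__ (Py3) or __nonzero__ (Py2)
    print(name, "is truthy" if table else "is falsy")
```

Without either method (and without `__len__`), instances of a plain class are always truthy, so a check like `if not table:` would never catch the empty case.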