Version 0.8.4

This minor release focuses on improving usability and fixing some bugs. Documentation is updated (thanks @kyleabeauchamp for #186).

Dependencies
------------

- Raise minimum pandas version from 0.18.1 to 0.19.0
- Raise minimum matplotlib version to 1.3.1

Commands
--------

`fix`, `metrics`:

- Set PRNG seed to ensure reproducible results. The pipeline is now fully repeatable with identical results if run in serial, i.e. without `-p`.

`fix`, `reference`:

- Ensure bias smoothing window size is at least 5. This reduces the occurrence of 0-log2, 0-spread bins on a 32-bin dataset, but doesn't eliminate it. (#181)

`fix`:

- Don't complain about mismatched sample IDs if antitargets are blank. This allows reusing a blank "MT" file in a shell loop for WGS and amplicon data.

`reference`:

- Make antitargets (antitarget.bed or \*.antitargetcoverage.cnn) an optional argument. Previously this argument was required, so processing WGS or amplicon data, which has no off-target regions or reads, required the user to create and provide a blank BED file or appropriately named, empty .cnn files. (#183)

`segment`:

- Don't log "Dropped 0 low-coverage bins". Only log when it actually drops bins.

`diagram`, `heatmap`:

- Add option `--no-shift-xy`. Shifting X and Y according to reference and sample sex was done in diagram, but not heatmap. Now it's optional in both.

`heatmap`:

- Add a legend of log2 ratio colors to the plot. (#36)
- Add options `-x`/`--sample-sex` and `-y`/`--male-reference`. (#172)

`gender`/`sex`:

- Rename 'gender' command to 'sex', with shim for backward compatibility. (#182)
- In other commands, the `-g`/`--gender` argument is renamed to `-x`/`--sample-sex`, also with a compatibility shim. Argument values `x` and `y` are accepted in addition to `f`/`female` and `m`/`male`, respectively.

`import-picard`:

- Deprecate searching a directory tree for files. It was a vestige of early lab work, and makes a shaky assumption about Picard CalculateHsMetrics ``--PER_TARGET_COVERAGE`` output filenames.

API
---

- The ``do_*`` function implementations moved to their named modules. The ``do_*`` functions can still be called or imported from the `cnvlib` and `cnvlib.commands` modules.
- All parsing and serialization of "chr:start-end" genomic region labels is consolidated under a new module, `cnvlib.genome.rangelabel`. These functions are used in tabio.textcoord, GenomicArray.labels(), and elsewhere to ensure consistent behavior.

Internal
--------

- `cnvlib.genome`: Handle nested bins correctly in the `merge`, `flatten`, and `intersect` modules, functions, and GenomicArray methods. Verified with thorough unit tests.
- VCF: If the paired normal sample's genotypes are all 0/0 or missing, fall back to `--zygosity-freq` (inference from b-allele frequency) rather than marking all variants as somatic. Then infer and drop additional somatic SNVs based on genotype after parsing, and only if that wouldn't drop all records. This allows CNVkit to safely distinguish somatic vs. germline in VCFs from Mutect2, though Mutect2 is still not recommended. (#184)
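
For readers unfamiliar with the "chr:start-end" region labels mentioned under API, the following is a minimal, standalone sketch of round-tripping such a label. It is not the code in `cnvlib.genome.rangelabel`: the names `Region`, `parse_label`, and `format_label` are hypothetical, and the sketch keeps coordinates exactly as written in the label, with no assumption about how CNVkit converts between coordinate conventions.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern (not taken from cnvlib): "chrom:start-end", or a bare "chrom".
_LABEL_RE = re.compile(r"^(?P<chrom>[^:]+)(?::(?P<start>\d+)-(?P<end>\d+))?$")


class Region(NamedTuple):
    """A parsed region label; start/end are None for a whole-chromosome label."""
    chrom: str
    start: Optional[int]
    end: Optional[int]


def parse_label(label: str) -> Region:
    """Parse 'chr1:100-200' or 'chr1' into a Region, keeping numbers as written."""
    match = _LABEL_RE.match(label.strip())
    if match is None:
        raise ValueError(f"Invalid region label: {label!r}")
    chrom = match.group("chrom")
    if match.group("start") is None:
        return Region(chrom, None, None)
    start = int(match.group("start"))
    end = int(match.group("end"))
    if start > end:
        raise ValueError(f"Region start exceeds end: {label!r}")
    return Region(chrom, start, end)


def format_label(region: Region) -> str:
    """Serialize a Region back to the same 'chrom:start-end' text form."""
    if region.start is None or region.end is None:
        return region.chrom
    return f"{region.chrom}:{region.start}-{region.end}"
```

Keeping parsing and serialization side by side, as in this sketch, makes it easy to check that a label survives a round trip unchanged, which is the kind of consistency a single shared module is meant to provide.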