Version 0.7.11

New dependency on pyfaidx, a Python library for handling samtools-style
FASTA indexes (.fai).

export vcf:

- Add the CNVkit version and the current date (i.e. the local calendar date on
  which the "cnvkit.py export vcf" command was run) to the VCF header.

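As a rough illustration of the header change described for export vcf, the
sketch below builds VCF meta-information lines carrying a tool version and a
run date. The `##fileDate` and `##source` keys come from the VCF specification;
the function name `vcf_header_lines` and the exact wording of the source line
are assumptions for illustration, not necessarily what CNVkit writes.

```python
import datetime


def vcf_header_lines(version, run_date=None):
    """Build VCF meta-information lines with a tool version and run date.

    Hypothetical helper; the "CNVkit v<version>" wording is an assumption.
    """
    if run_date is None:
        # Local calendar date at the time the command is run
        run_date = datetime.date.today()
    return [
        "##fileformat=VCFv4.2",
        "##fileDate=" + run_date.strftime("%Y%m%d"),
        "##source=CNVkit v" + version,
    ]
```
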
export theta:

- Given a VCF of SNVs called jointly in paired tumor and normal samples,
  extract SNP allele counts to THetA2's custom input format
  ("snp_formatted.txt"). The two additional files CNVkit generates this way can
  be used with THetA2's "--TUMOR_SNP" and "--NORMAL_SNP" options to improve
  estimates of tumor purity and clonality.
- Use CNVkit's segment weights and probe counts to estimate normal-sample read
  counts for each segment if no copy number reference profile (.cnn) or paired
  normal sample (.cnr) is given.
  The command's second argument is now optional and deprecated in favor of the
  "-r"/"--reference" option, which does the same thing.

import-theta:

- Save integer copy number in the "cn" column of the output file(s) (CNVkit's
  .cns format).

call, export nexus-ogt:

- When reading structural variants from a VCF file, interpret the END tag as the
  variant end position, not the length, per the VCF 4.2 specification.
  Previously, END was treated as a length, which could cause the b-allele
  frequencies calculated in `call` and `export nexus-ogt` to be erroneously
  repeated across many consecutive bins.
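
The END fix can be illustrated with a minimal sketch that parses a VCF INFO
string and treats END as a coordinate. The function name `sv_interval` is
hypothetical and this is not CNVkit's actual parsing code; it only shows the
interpretation that the VCF 4.2 specification calls for.

```python
def sv_interval(pos, info):
    """Return (start, end) for a VCF record, given POS and the INFO string.

    END is read as the variant's end coordinate. Reading it as a length
    (i.e. end = pos + END) would stretch the variant far past its true end.
    """
    # Keep only key=value entries; flags such as IMPRECISE have no "="
    fields = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    if "END" in fields:
        end = int(fields["END"])
    else:
        # No END tag: treat the record as a single-position variant
        end = pos
    return pos, end
```

For example, `sv_interval(1000, "SVTYPE=DEL;END=1500")` gives the interval
1000 to 1500, whereas the length interpretation would have ended at 2500.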

scatter:

- When loading CNVkit files (in any command), identify and drop rows with "NaN"
  log2 values. (CNVkit never emits these, but they could occur if a user
  generates .cnr files from Illumina CGH array data files using a custom
  script.) The other columns (spread, gc, rmask) can be NaN without a problem,
  but plotting with `scatter` would crash when adjusting the y-axis based on
  NaN log2 values. (#95)
- Detect and warn if an input .cnr, .cns, or .vcf file is not sorted by genomic
  coordinates. This could happen with a VCF or a manually constructed .cnr/.cns
  file (not generated by CNVkit). Previously the resulting error was cryptic,
  because some bins/segments/SNVs would be selected successfully but plotting
  would then crash when laying out the x-axis coordinates.
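
The two input checks above can be sketched in plain Python. This is an
illustrative approximation, not CNVkit's implementation: rows are assumed to be
dicts with "chromosome", "start", and "log2" keys, and both function names are
hypothetical. "Sorted" here means that each chromosome's rows are contiguous
and ordered by start position; no particular chromosome order is enforced.

```python
import math


def drop_nan_log2(rows):
    """Keep only rows whose log2 value is a real number."""
    return [row for row in rows if not math.isnan(row["log2"])]


def is_coordinate_sorted(rows):
    """Check that rows are grouped by chromosome and ordered by start."""
    seen = set()
    prev_chrom, prev_start = None, None
    for row in rows:
        chrom, start = row["chromosome"], row["start"]
        if chrom != prev_chrom:
            if chrom in seen:
                # This chromosome already appeared earlier, then was interrupted
                return False
            seen.add(chrom)
        elif start < prev_start:
            # Same chromosome, but the start position went backwards
            return False
        prev_chrom, prev_start = chrom, start
    return True
```

A caller could drop NaN rows first and then emit a warning, rather than fail
later during plotting, whenever `is_coordinate_sorted` returns False.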

Internals & packaging:

- Use the pyfaidx library to extract sequences from a genome FASTA file (used in
  the `reference` command), replacing some custom code in cnvlib. (#73; thanks
  @mdshw5)
- Documentation updates.
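
For readers unfamiliar with pyfaidx, the sketch below shows the general shape
of indexed sequence extraction. `Fasta` is pyfaidx's documented entry point and
supports `genome[name][start:end]` slicing; the helper names `open_genome` and
`fetch_sequence` are hypothetical, and this is not the code used inside cnvlib.

```python
def open_genome(fasta_path):
    """Open an indexed FASTA file; pyfaidx builds the .fai index if missing."""
    from pyfaidx import Fasta  # third-party dependency, imported lazily
    return Fasta(fasta_path)


def fetch_sequence(genome, chrom, start, end):
    """Return the bases in the 0-based, half-open interval [start, end).

    `genome` may be a pyfaidx Fasta object or any mapping from sequence
    name to a sliceable sequence, such as a dict of strings.
    """
    return str(genome[chrom][start:end])
```

Because `fetch_sequence` only relies on indexing and slicing, it can be
exercised with a plain dict such as `{"chr1": "ACGTACGT"}` when no FASTA file
is at hand.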