Version 0.7.11

New dependency on pyfaidx, a Python library for handling samtools-style FASTA indexes (.fai).

export vcf:
- Add the CNVkit version and the current date (i.e. the local calendar date on which the "cnvkit.py export vcf" command was run) to the VCF header.

export theta:
- Given a VCF of SNVs called jointly in paired tumor and normal samples, extract SNP allele counts to THetA2's custom input format ("snp_formatted.txt"). The two additional files CNVkit generates this way can be used with THetA2's "--TUMOR_SNP" and "--NORMAL_SNP" options to improve estimates of tumor purity and clonality.
- Use CNVkit's segment weights and probe counts to estimate normal-sample read counts for each segment if no copy number reference profile (.cnn) or paired normal sample (.cnr) is given. The command's second argument is now optional and deprecated in favor of the "-r"/"--reference" option, which does the same thing.

import-theta:
- Save integer copy number in the "cn" column of the output file(s) (CNVkit's .cns format).

call, export nexus-ogt:
- When reading structural variants from a VCF file, interpret the END tag as the variant end position, not the length, per the VCF 4.2 specification. This bug could cause the b-allele frequencies calculated in `call` and `export nexus-ogt` to be erroneously repeated across many consecutive bins.

scatter:
- When loading CNVkit files (in any command), identify and drop rows with "NaN" log2 values. (CNVkit never emits these, but they could occur if a user generates .cnr files from Illumina CGH array data files using a custom script.) The other columns (spread, gc, rmask) can be NaN without a problem, but plotting with `scatter` would crash when adjusting the y-axis based on NaN log2 values. (#95)
- Detect & warn if an input .cnr/.cns/.vcf file is not sorted by genomic coordinates. This could happen with an unsorted input VCF or a manually constructed .cnr/.cns file (not generated by CNVkit). Previously the error message was cryptic, because some bins/segments/SNVs were selected successfully but plotting would then crash when laying out the x-axis coordinates.

Internals & packaging:
- Use the pyfaidx library to extract sequences from a genome FASTA file (used in the `reference` command), replacing some custom code in cnvlib. (#73; thanks @mdshw5)
- Documentation updates.
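To illustrate the END-tag fix noted under "call, export nexus-ogt", here is a minimal sketch of reading END as a position rather than a length. The function name `sv_end_position` and the fallback behavior are hypothetical and written only for illustration; this is not CNVkit's actual parsing code.

```python
def sv_end_position(pos, info):
    """Return the end coordinate of a VCF record.

    `pos` is the POS column; `info` is the raw INFO column string,
    e.g. "SVTYPE=DEL;END=1500". END is taken as the end position of the
    variant, so it is returned as-is rather than added to POS.
    """
    for field in info.split(";"):
        key, _, value = field.partition("=")
        if key == "END" and value:
            return int(value)
    # No END tag: fall back to POS (a simplifying assumption for this sketch)
    return pos


pos, info = 1000, "SVTYPE=DEL;END=1500;SVLEN=-500"
end = sv_end_position(pos, info)  # END read as a position
wrong_end = pos + 1500            # END misread as a length
print(end, wrong_end)
```

Reading END as a length stretches the variant far past its true end, which is consistent with the symptom described above: one variant's b-allele frequency being repeated across many consecutive bins.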
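The two input checks described under "scatter" can be sketched in plain Python as follows. The helper names are hypothetical, rows are modeled as dictionaries keyed by the .cnr column names "chromosome", "start" and "log2", and the sortedness rule used here (each chromosome forms one contiguous block, with non-decreasing start positions inside it) is an assumption rather than CNVkit's exact logic.

```python
import math


def drop_nan_log2(rows):
    """Keep only the rows whose log2 value is a real number."""
    return [r for r in rows if not math.isnan(r["log2"])]


def is_sorted_by_position(rows):
    """Check that rows are grouped by chromosome and ordered by start."""
    seen = set()
    prev_chrom, prev_start = None, None
    for r in rows:
        chrom, start = r["chromosome"], r["start"]
        if chrom != prev_chrom:
            if chrom in seen:
                return False  # chromosome reappears after another one
            seen.add(chrom)
        elif start < prev_start:
            return False  # position goes backwards within a chromosome
        prev_chrom, prev_start = chrom, start
    return True


bins = [
    {"chromosome": "chr1", "start": 100, "log2": 0.1},
    {"chromosome": "chr1", "start": 300, "log2": 0.2},
    {"chromosome": "chr1", "start": 200, "log2": -0.4},
    {"chromosome": "chr2", "start": 50, "log2": float("nan")},
]
clean = drop_nan_log2(bins)
if not is_sorted_by_position(clean):
    print("warning: input is not sorted by genomic coordinates")
```

Running the sortedness check before plotting gives the user a clear warning up front instead of a crash during x-axis layout.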
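For the "export vcf" change, the sketch below shows one way to build VCF meta-information lines carrying a tool version and the local date. The `##fileDate` and `##source` keys follow the examples in the VCF specification; the function name and the exact strings are assumptions and may not match what CNVkit writes.

```python
import time


def vcf_meta_lines(version, when=None):
    """Build VCF header meta lines with the tool version and a date.

    `when` is a time.struct_time; it defaults to the current local time.
    """
    when = time.localtime() if when is None else when
    return [
        "##fileformat=VCFv4.2",
        "##fileDate=" + time.strftime("%Y%m%d", when),
        "##source=CNVkit v" + version,
    ]


# An arbitrary fixed date, used only to make the example reproducible
example_date = time.strptime("2016-01-15", "%Y-%m-%d")
for line in vcf_meta_lines("0.7.11", example_date):
    print(line)
```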