htslib release 1.12:

Features and Updates
--------------------

* Added experimental CRAM 3.1 and 4.0 support. (#929)

  These should not be used for long-term data storage, as the
  specifications still need to be ratified by GA4GH and may be
  subject to changes in format (this is highly likely for 4.0).
  However, they may be tested using:

  test/test_view -t ref.fa -C -o version=3.1 in.bam -p out31.cram

  For smaller but slower files, try varying the compression profile
  with an additional "-o small".  Profile choices are fast, normal,
  small and archive, and can be applied to all CRAM versions.

* Added a general filtering syntax for alignment records in
  SAM/BAM/CRAM readers. (#1181, #1203)

  An example to find chromosome-spanning read pairs with high mapping
  quality:  'mqual >= 30 && mrname != rname'

  To find deletions of significant size: 'cigar =~ "[0-9]{2}D"' or
  'rlen - qlen > 10'.

  To report duplicates that aren't part of a "proper pair":
    'flag.dup && !flag.proper_pair'

  More details are in the samtools.1 man page under "FILTER
  EXPRESSIONS".
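
  As an illustration of what the CIGAR pattern above selects, the
  following minimal C sketch applies the same regular expression with
  the POSIX `regcomp` and `regexec` functions.  It is not HTSlib's
  filter implementation, and the helper name
  `cigar_has_long_deletion` is hypothetical.  Because the pattern
  requires two consecutive digits immediately before `D`, it matches
  deletions of 10 bases or more.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of HTSlib): returns 1 if the CIGAR string
 * contains a deletion whose length has at least two digits (>= 10 bases),
 * 0 if it does not, and -1 if the pattern fails to compile. */
int cigar_has_long_deletion(const char *cigar)
{
    regex_t re;
    int rc;

    /* Same pattern as the filter example, as a POSIX extended regex. */
    if (regcomp(&re, "[0-9]{2}D", REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, cigar, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

  With this helper, a CIGAR such as 50M12D38M is selected while
  50M5D45M is not.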

* The knet networking code has been removed.  It only supported the
  http and ftp protocols, and a better and safer alternative using
  libcurl has been available since release 1.3.  If you need access
  to ftp:// and http:// URLs, HTSlib should be built with libcurl
  support. (#1200)

* The old htslib/knetfile.h interfaces have been marked as
  deprecated.  Any code still using them should be updated to use
  hFILE instead. (#1200)

* Added an introspection API for checking some of the capabilities
  provided by HTSlib. (#1170) Thanks also to John Marshall for
  contributions. (#1222)

    - `hfile_list_schemes`: returns the number of schemes found

    - `hfile_list_plugins`: returns the number of plugins found

    - `hfile_has_plugin`: checks if a specific plugin is available

    - `hts_features`: returns a bit mask with all available features

    - `hts_test_feature`: test if a feature is available

    - `hts_feature_string`: return a string summary of enabled
      features

* Made performance improvements to the `probaln_glocal` method, which
  speeds up mpileup BAQ calculations. (#1188)

    - Caching of reused loop variables and removal of loop invariants.

    - Code reordering to remove instruction latency.

    - Other refactoring and tidyups.

* Added a public method for constructing a BAM record from the
  component pieces. Thanks to Anders Kaplan. (#1159, #1164)

* Added two public methods, `sam_parse_cigar` and `bam_parse_cigar`,
  as part of a small CIGAR API (#1169, #1182). Thanks to
  Daniel Cameron for input. (#1147)
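
  For readers unfamiliar with the BAM CIGAR layout, the sketch below
  converts a CIGAR string into packed 32-bit values (length in the
  upper 28 bits, operation code in the lower 4 bits, using the
  `MIDNSHP=X` ordering from the SAM specification).  It is a
  stand-alone illustration only: `pack_cigar` is a hypothetical name,
  and the actual signatures and error handling of `sam_parse_cigar`
  and `bam_parse_cigar` should be taken from htslib/sam.h.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the HTSlib API.  Parses a CIGAR string such as
 * "10M2D5M" into BAM-style packed values: (length << 4) | op_code.
 * Returns the number of operations written, or -1 on malformed input or
 * if more than max_ops operations are present.  The "*" placeholder for
 * a missing CIGAR is not handled and is reported as malformed. */
int pack_cigar(const char *cigar, uint32_t *out, size_t max_ops)
{
    static const char ops[] = "MIDNSHP=X";   /* op codes 0..8 */
    size_t n = 0;

    while (*cigar) {
        uint32_t len = 0;
        const char *p;

        if (!isdigit((unsigned char)*cigar))
            return -1;                        /* every op needs a length */

        while (isdigit((unsigned char)*cigar)) {
            uint32_t digit = (uint32_t)(*cigar - '0');
            if (len > (0x0FFFFFFFu - digit) / 10)
                return -1;                    /* length exceeds 28 bits */
            len = len * 10 + digit;
            cigar++;
        }

        /* Check for end of string first: strchr would match the NUL. */
        if (*cigar == '\0' || (p = strchr(ops, *cigar)) == NULL)
            return -1;                        /* missing or unknown op */
        if (n >= max_ops)
            return -1;                        /* output array too small */

        out[n++] = (len << 4) | (uint32_t)(p - ops);
        cigar++;
    }
    return (int)n;
}
```

  For example, "10M2D5M" yields three values whose low 4 bits are 0,
  2 and 0 (M, D, M) and whose lengths are 10, 2 and 5.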

* HTSlib, and the included htsfile program, will now recognise
  the old RAZF compressed file format.  Note that while the
  format is detected, HTSlib is unable to read it.  It is
  recommended that RAZF files are uncompressed with `gunzip`
  before using them with HTSlib.  Thanks to John Marshall
  (#1244); and Matthew J. Oldach who reported problems with
  uncompressing some RAZF files (samtools/samtools#1387).

* The S3 plugin now has options to force the address style.  It
  will recognise the addressing_style and host_bucket entries in
  the respective aws .credentials and s3cmd .s3cfg files.  There is
  also a new HTS_S3_ADDRESS_STYLE environment variable.  Details
  are in the htslib-s3-plugin.7 man page. (#1249)

Build changes
-------------

These are compiler, configuration and makefile based changes.

* Added new Makefile targets for the applications that embed HTSlib
  and want to run its test suite or clean its generated artefacts.
  (#1230, #1238)

* The CRAM codecs are now obtained via the htscodecs submodule, hence
  when cloning it is now best to use "git clone --recursive".  In an
  existing clone, you may use "git submodule update --init" to obtain
  the htscodecs submodule checkout.

* Updated CI test configuration to recurse HTSlib submodules. (#1359)

* Added Cirrus-CI integration as a replacement for Travis, which was
  phased out.  (#1175; #1212)

* Updated the Windows image used by Appveyor to 'Visual Studio 2019'.
  (#1172; fixed #1166)

* Fixed a buglet in configure.ac, exposed by release 2.70 of
  autoconf. Thanks to John Marshall. (#1198)

* Fixed plugin linking on macOS, to prevent symbol conflict when
  linking with a static HTSlib. Thanks to John Marshall. (#1184)

* Fixed a clang++9 error in `cram_io.h`. Thanks to Pjotr Prins.
  (#1190)

* Introduced $(ALL_CPPFLAGS) to allow for more flexibility in setting
  the compiler flags. Thanks to John Marshall. (#1187)

* Added 'fall through' comments to prevent warnings issued by Clang
  on intentional fall through case statements, when building with
  the `-Wextra` flag. Thanks to John Marshall. (#1163)

* Non-configure builds now define _XOPEN_SOURCE=600 to allow them
  to work when the `gcc -std=c99` option is used.  Thanks to
  John Marshall. (#1246)

Bug fixes
---------

* Fixed VCF `#CHROM` header parsing to only separate columns at tab
  characters. Thanks to Sam Morris for reporting the issue. (#1237;
  fixed samtools/bcftools#1408)
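
  To illustrate why the separator matters, here is a minimal
  stand-alone sketch that splits a `#CHROM` line at tab characters
  only, so that a sample name containing a space stays in one
  column.  It is not HTSlib's parser; `struct column` and
  `split_at_tabs` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical types and helper, not the HTSlib API. */
struct column {
    const char *start;   /* first character of the column */
    size_t len;          /* length in bytes, excluding the separator */
};

/* Splits a header line at tab characters only; spaces are ordinary
 * characters.  Parsing stops at a newline or at the end of the string.
 * Up to max_cols columns are recorded in cols; the return value is the
 * total number of columns found, even if it exceeds max_cols. */
size_t split_at_tabs(const char *line, struct column *cols, size_t max_cols)
{
    size_t n = 0;
    const char *start = line;
    const char *p;

    for (p = line; ; p++) {
        if (*p == '\t' || *p == '\n' || *p == '\0') {
            if (n < max_cols) {
                cols[n].start = start;
                cols[n].len = (size_t)(p - start);
            }
            n++;
            if (*p != '\t')
                break;           /* end of line reached */
            start = p + 1;       /* next column begins after the tab */
        }
    }
    return n;
}
```

  A header whose sample columns are "sample one" and "sample2" is
  therefore reported as having two sample columns, not three.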

* Fixed a crash reported in `bcf_sr_sort_set`, which expects REF to
  be present. (#1204; fixed samtools/bcftools#1361)

* Fixed a bcf synced reader bug when filtering with a region
  list, and the first record for a chromosome had the same
  position as the last record for the previous chromosome.
  (#1254; fixed samtools/bcftools#1441)

* Fixed a bug in the overlapping logic of mpileup, dealing with
  iterating over CIGAR segments. Thanks to `@wulj2` for the
  analysis. (#1202; fixed #1196)

* Fixed a tabix bug that prevented setting the correct number of
  lines to be skipped in a region file. Thanks to Jim Robinson for
  reporting it. (#1189;  fixed #1186)

* Made `bam_itr_next` an alias for `sam_itr_next`, to prevent it from
  crashing when working with htsFile pointers. Thanks to
  Torbjörn Klatt for reporting it. (#1180; fixed #1179)

* Fixed the `bgzf_idx_flush` assertion, which is checked once per
  outgoing multi-threaded block, to accommodate situations where a
  single record spans multiple blocks. Thanks to `@lacek`. (#1168;
  fixed samtools/samtools#1328)

* Removed the assumption that pthread_t is a non-structure type;
  POSIX permits it to be a structure. Thanks also to John Marshall
  and Anders Kaplan. (#1167, #1153)
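
  The portable idiom for code that must not assume an arithmetic
  `pthread_t` is to compare thread IDs with `pthread_equal` and not
  with `==`.  The sketch below shows that idiom in isolation; it is
  not taken from the HTSlib change, and `is_current_thread` is a
  hypothetical name.

```c
#include <pthread.h>

/* Hypothetical helper: returns 1 if t identifies the calling thread.
 * pthread_equal() works whether pthread_t is an integer, a pointer or
 * a structure, whereas "t == pthread_self()" need not compile. */
int is_current_thread(pthread_t t)
{
    return pthread_equal(t, pthread_self()) != 0;
}
```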

* Fixed the minimum offset of a BAI index bin, to account for
  unmapped reads. Thanks to John Marshall for spotting the issue.
  (#1158; fixed #1142)

* Fixed the CRLF handling in the `sam_parse_worker` method. Thanks to
  Anders Kaplan. (#1149; fixed #1148)
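
  A common way to tolerate Windows line endings in line-based text is
  to drop a trailing `\r` once the `\n` has been removed.  The sketch
  below shows that idea in isolation; it is not the code used in
  `sam_parse_worker`, and `trim_line_ending` is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes a trailing '\n' if present, then a
 * trailing '\r' if present, from a NUL-terminated line in place.
 * Returns the new length of the line. */
size_t trim_line_ending(char *line)
{
    size_t len = strlen(line);

    if (len > 0 && line[len - 1] == '\n')
        len--;
    if (len > 0 && line[len - 1] == '\r')
        len--;
    line[len] = '\0';
    return len;
}
```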

* Included unistd.h and errno.h directly in HTSlib files, as opposed
  to including them indirectly, via third party code. Thanks to
  Andrew Patterson (#1143) and John Marshall (#1145).