What's Changed

Release

Hotfix

  • [hotfix] add soft link to support required files (#5661) by Tong Li
  • [hotfix] Fixed fused layernorm bug without apex (#5609) by Edenzzzz
  • [hotfix] Fix examples no pad token & auto parallel codegen bug (#5606) by Edenzzzz
  • [hotfix] fix typo s/get_defualt_parser /get_default_parser (#5548) by digger yu
  • [hotfix] quick fixes to make legacy tutorials runnable (#5559) by Edenzzzz
  • [hotfix] set return_outputs=False in examples and polish code (#5404) by Wenhao Chen
  • [hotfix] fix typo s/keywrods/keywords etc. (#5429) by digger yu

News

Lazyinit

Shardformer

  • [shardformer] refactor pipeline grad ckpt config (#5646) by Hongxin Liu
  • [shardformer] fix chatglm implementation (#5644) by Hongxin Liu
  • [shardformer] remove useless code (#5645) by flybird11111
  • [shardformer] update transformers (#5583) by Wang Binluo
  • [shardformer] fix pipeline grad ckpt (#5620) by Hongxin Liu
  • [shardformer] refactor embedding resize (#5603) by flybird11111
  • [shardformer] Sequence Parallelism Optimization (#5533) by Zhongkai Zhao
  • [shardformer] fix pipeline forward error if custom layer distribution is used (#5189) by Insu Jang
  • [shardformer] update colo attention to support custom mask (#5510) by Hongxin Liu
  • [shardformer] Fix lm parallel. (#5480) by flybird11111
  • [shardformer] fix gathering output when using tensor parallelism (#5431) by flybird11111

Fix

  • [Fix]: implement thread-safety singleton to avoid deadlock for very large-scale training scenarios (#5625) by Season
  • [fix] fix typo s/muiti-node /multi-node etc. (#5448) by digger yu
  • [Fix] Grok-1 use tokenizer from the same pretrained path (#5532) by Yuanheng Zhao
  • [fix] fix grok-1 example typo (#5506) by Yuanheng Zhao

Coloattention

  • [coloattention] modify coloattention (#5627) by flybird11111

Example

Feature

  • [Feature] Support LLaMA-3 CPT and ST (#5619) by Tong Li

Zero

  • [zero] support multiple (partial) backward passes (#5596) by Hongxin Liu

Doc

Devops

Shardformer, pipeline

  • [shardformer, pipeline] add gradient_checkpointing_ratio and heterogenous shard policy for llama (#5508) by Wenhao Chen

Colossalchat

  • [ColossalChat] Update RLHF V2 (#5286) by YeAnbang

Format

  • [format] applied code formatting on changed files in pull request 5510 (#5517) by github-actions[bot]

Full Changelog: https://github.com/hpcaitech/ColossalAI/compare/v0.3.6...v0.3.7