Commit Graph

81 Commits

Author SHA1 Message Date
duzx16 f831824845 Add test for modeling_chatglm 2023-04-19 18:05:28 +08:00
duzx16 35ca52301f Fix input embeds 2023-04-18 20:46:39 +08:00
duzx16 0829959f96 Update slack link 2023-04-17 16:05:26 +08:00
duzx16 4de8efebc8 Change mask positions to batch 2023-04-14 15:54:43 +08:00
duzx16 3a99d7951d Always add gmask in token ids 2023-04-14 15:54:32 +08:00
duzx16 53f019758b Fix bug 2023-04-14 15:13:54 +08:00
duzx16 eb55ff050e Add empty_init option 2023-04-13 20:33:07 +08:00
duzx16 969290547e Update README 2023-04-13 15:43:34 +08:00
duzx16 aa51e62ddc Fix eos token in tokenizer 2023-04-11 13:24:10 +08:00
duzx16 cde457b39f Fix attention score on mps 2023-04-09 16:10:23 +08:00
duzx16 acd41f7731 Update dependency 2023-04-08 21:45:46 +08:00
duzx16 6650ae3a53 Merge branch 'main' of https://huggingface.co/THUDM/chatglm-6b 2023-04-08 12:04:14 +08:00
duzx16 7e69b85627 Fix tokenizer config saving 2023-04-08 12:04:08 +08:00
Zhengxiao Du 61eee50c9f Fix LogitsProcessor using slim checkpoint (#29)
- Fix LogitsProcessor using slim checkpoint (7f8f01fee41efeaac4f926bdb96aea42f1c6076b)

Co-authored-by: bcol <bcol@users.noreply.huggingface.co>
2023-04-08 02:54:27 +00:00
duzx16 9324de70a9 Use gmask in first place 2023-04-06 23:25:10 +08:00
Zhengxiao Du d467effe91 Update slim checkpoint (#28)
- Update slim checkpoint (674b8f6771ae11bec06f050cd38755ae5abe5eaa)
2023-04-06 14:47:57 +00:00
duzx16 06a22a39fa Merge branch 'slim' of https://huggingface.co/THUDM/chatglm-6b into slim 2023-04-06 22:45:49 +08:00
duzx16 36b7f2d0ad Add gmask token id 2023-04-06 22:43:35 +08:00
Zhengxiao Du 6461061e82 Update slim checkpoint 2023-04-06 14:25:30 +00:00
duzx16 63ce1bac4a Update code for slim 2023-04-06 22:19:30 +08:00
duzx16 72985e820c Drop icetk dependency 2023-04-06 19:18:52 +08:00
Zhengxiao Du 551a50efec fix typo in use_gmask (#21)
- fix typo in use_gmask (d6504255afdd555d12137fc3af04646f099b5785)

Co-authored-by: Fan Zhang <fzhang@users.noreply.huggingface.co>
2023-04-05 11:11:33 +00:00
duzx16 23ad39b571 Fix decode method for torch tensor 2023-04-05 18:26:09 +08:00
Zhengxiao Du fdb7a601d8 Support single integer or empty list as input to decode (#7)
- Update tokenization_chatglm.py (dffe870a7ef1558ebbc6f3dfdf46491cdb2b3e31)

Co-authored-by: Yichao 'Peak' Ji <peakji@users.noreply.huggingface.co>
2023-04-04 09:48:07 +00:00
duzx16 f82b180d8d Fix position ids expand 2023-04-03 14:14:20 +08:00
duzx16 fb23542cfe Fix generate 2023-04-02 14:51:17 +08:00
duzx16 08bc85104d Fix attention mask for prefix prompt 2023-04-02 02:25:03 +08:00
duzx16 4b7ffbf070 No padding for chat function 2023-04-02 02:03:05 +08:00
duzx16 373fd6b9d4 Fix attention_mask and position_ids 2023-04-02 01:58:45 +08:00
duzx16 e22cddf212 Fix encode method 2023-04-02 01:04:40 +08:00
duzx16 e1494f222d Fix batch input 2023-04-01 22:38:53 +08:00
duzx16 cc96a2271a Implement batch generation 2023-04-01 19:41:28 +08:00
duzx16 11c270c26c Fix position id for training 2023-03-31 20:18:10 +08:00
Zhengxiao Du 9c7416d834 fix GLM6BBlock name typo (#20)
- fix GLM6BBlock name typo (2a180534c5f2b4860598a9caef798b55a77cfb72)

Co-authored-by: rich brain <richbrain@users.noreply.huggingface.co>
2023-03-31 07:34:06 +00:00
duzx16 2e1be30ac4 Add support for loading quantized model 2023-03-31 10:48:38 +08:00
duzx16 c949d03152 Use dynamic dtype for prompts 2023-03-31 01:13:32 +08:00
duzx16 0cfae21ef8 Fix backward for quantization 2023-03-30 21:49:06 +08:00
duzx16 aea6cefcf5 Implement gradient checkpointing 2023-03-30 19:42:01 +08:00
duzx16 0564795e6e Fix bugs 2023-03-30 17:35:58 +08:00
duzx16 2200e2bc52 Add pad_token_id in config.json
Fix position_ids in ChatGLMModel
Add batch position_ids
2023-03-29 21:52:46 +08:00
duzx16 db2249979c Change padding side 2023-03-29 21:25:40 +08:00
duzx16 5c64357295 Set ignore_index for CrossEntropyLoss 2023-03-29 21:19:38 +08:00
duzx16 8127ab6abf Support batch training 2023-03-29 21:15:30 +08:00
duzx16 fbda1206cb Merge branch 'main' into dev_pt
# Conflicts:
#	modeling_chatglm.py
2023-03-29 20:37:39 +08:00
duzx16 812f43f9ff Add p-tuning v2 2023-03-29 20:22:57 +08:00
duzx16 096f3de6b4 Fix context length in get_position_ids 2023-03-28 17:37:46 +08:00
duzx16 4a9b711e61 Close CPU fusion on Mac 2023-03-23 22:43:06 +08:00
duzx16 d2bbc82a2c Fix Chinese punctuation 2023-03-22 14:37:21 +08:00
duzx16 2449bdc9d8 Add English 2023-03-21 23:27:46 +08:00
songxxzp 1b54948bb2 Fix typo in tokenization_chatglm.py 2023-03-19 22:52:12 +08:00