Commit Graph

19 Commits

Author SHA1 Message Date
duzx16 3a99d7951d Always add gmask in token ids 2023-04-14 15:54:32 +08:00
duzx16 53f019758b Fix bug 2023-04-14 15:13:54 +08:00
duzx16 aa51e62ddc Fix eos token in tokenizer 2023-04-11 13:24:10 +08:00
duzx16 7e69b85627 Fix tokenizer config saving 2023-04-08 12:04:08 +08:00
duzx16 36b7f2d0ad Add gmask token id 2023-04-06 22:43:35 +08:00
duzx16 63ce1bac4a Update code for slim 2023-04-06 22:19:30 +08:00
duzx16 72985e820c Drop icetk dependency 2023-04-06 19:18:52 +08:00
duzx16 23ad39b571 Fix decode method for torch tensor 2023-04-05 18:26:09 +08:00
Zhengxiao Du fdb7a601d8 Support single integer or empty list as input to decode (#7) 2023-04-04 09:48:07 +00:00
    - Update tokenization_chatglm.py (dffe870a7ef1558ebbc6f3dfdf46491cdb2b3e31)
    Co-authored-by: Yichao 'Peak' Ji <peakji@users.noreply.huggingface.co>
duzx16 fb23542cfe Fix generate 2023-04-02 14:51:17 +08:00
duzx16 373fd6b9d4 Fix attention_mask and position_ids 2023-04-02 01:58:45 +08:00
duzx16 e22cddf212 Fix encode method 2023-04-02 01:04:40 +08:00
duzx16 e1494f222d Fix batch input 2023-04-01 22:38:53 +08:00
duzx16 cc96a2271a Implement batch generation 2023-04-01 19:41:28 +08:00
duzx16 db2249979c Change padding side 2023-03-29 21:25:40 +08:00
songxxzp 1b54948bb2 Fix typo in tokenization_chatglm.py 2023-03-19 22:52:12 +08:00
duzx16 8492687842 Remove image tokens when decoding 2023-03-16 00:24:42 +08:00
duzx16 c4575e73d0 Update tokenizer 2023-03-14 01:32:34 +08:00
Sengxian d11c6aaed8 Add chatglm-6b 2023-03-14 00:21:01 +08:00