duzx16 | eb55ff050e | Add empty_init option | 2023-04-13 20:33:07 +08:00
duzx16 | cde457b39f | Fix attention score on mps | 2023-04-09 16:10:23 +08:00
Zhengxiao Du | 61eee50c9f | Fix LogitsProcessor using slim checkpoint (#29); Fix LogitsProcessor using slim checkpoint (7f8f01fee41efeaac4f926bdb96aea42f1c6076b); Co-authored-by: bcol <bcol@users.noreply.huggingface.co> | 2023-04-08 02:54:27 +00:00
duzx16 | 9324de70a9 | Use gmask in first place | 2023-04-06 23:25:10 +08:00
duzx16 | 63ce1bac4a | Update code for slim | 2023-04-06 22:19:30 +08:00
Zhengxiao Du | 551a50efec | fix typo in use_gmask (#21); fix typo in use_gmask (d6504255afdd555d12137fc3af04646f099b5785); Co-authored-by: Fan Zhang <fzhang@users.noreply.huggingface.co> | 2023-04-05 11:11:33 +00:00
duzx16 | f82b180d8d | Fix position ids expand | 2023-04-03 14:14:20 +08:00
duzx16 | fb23542cfe | Fix generate | 2023-04-02 14:51:17 +08:00
duzx16 | 08bc85104d | Fix attention mask for prefix prompt | 2023-04-02 02:25:03 +08:00
duzx16 | 4b7ffbf070 | No padding for chat function | 2023-04-02 02:03:05 +08:00
duzx16 | cc96a2271a | Implement batch generation | 2023-04-01 19:41:28 +08:00
duzx16 | 11c270c26c | Fix position id for training | 2023-03-31 20:18:10 +08:00
Zhengxiao Du | 9c7416d834 | fix GLM6BBlock name typo (#20); fix GLM6BBlock name typo (2a180534c5f2b4860598a9caef798b55a77cfb72); Co-authored-by: rich brain <richbrain@users.noreply.huggingface.co> | 2023-03-31 07:34:06 +00:00
duzx16 | 2e1be30ac4 | Add support for loading quantized model | 2023-03-31 10:48:38 +08:00
duzx16 | c949d03152 | Use dynamic dtype for prompts | 2023-03-31 01:13:32 +08:00
duzx16 | 0cfae21ef8 | Fix backward for quantization | 2023-03-30 21:49:06 +08:00
duzx16 | aea6cefcf5 | Implement gradient checkpointing | 2023-03-30 19:42:01 +08:00
duzx16 | 0564795e6e | Fix bugs | 2023-03-30 17:35:58 +08:00
duzx16 | 2200e2bc52 | Add pad_token_id in config.json; Fix position_ids in ChatGLMModel; Add batch position_ids | 2023-03-29 21:52:46 +08:00
duzx16 | 5c64357295 | Set ignore_index for CrossEntropyLoss | 2023-03-29 21:19:38 +08:00
duzx16 | 8127ab6abf | Support batch training | 2023-03-29 21:15:30 +08:00
duzx16 | fbda1206cb | Merge branch 'main' into dev_pt (Conflicts: modeling_chatglm.py) | 2023-03-29 20:37:39 +08:00
duzx16 | 812f43f9ff | Add p-tuning v2 | 2023-03-29 20:22:57 +08:00
duzx16 | 096f3de6b4 | Fix context length in get_position_ids | 2023-03-28 17:37:46 +08:00
duzx16 | 4a9b711e61 | Close CPU fusion on Mac | 2023-03-23 22:43:06 +08:00
duzx16 | d2bbc82a2c | Fix Chinese punctuation | 2023-03-22 14:37:21 +08:00
duzx16 | 2460dc2430 | Remove hardcode bos_token_id | 2023-03-19 14:56:15 +08:00
duzx16 | 42095d42ff | Add support for streaming output | 2023-03-19 14:31:26 +08:00
duzx16 | 220f772e9a | Fix overflow in FP16 | 2023-03-16 09:26:05 +08:00
duzx16 | f9f74fda55 | Set is_parallelizable to False | 2023-03-16 00:30:43 +08:00
duzx16 | c3dece3f01 | Add logit processor for NaN or Inf scores | 2023-03-15 18:14:34 +08:00
duzx16 | 9d1509a1ad | Fix default history argument | 2023-03-14 18:38:49 +08:00
duzx16 | d4832e8142 | Add support for float32 | 2023-03-14 14:49:14 +08:00
duzx16 | cd8041ea53 | Fix past_key_values | 2023-03-14 02:08:43 +08:00
Sengxian | d11c6aaed8 | Add chatglm-6b | 2023-03-14 00:21:01 +08:00