Reduce threads to 5 to speed up generation and allow more parallel lamas. - annna - Annna the nice friendly bot.
git clone git://bitreich.org/annna/ git://enlrupgkhuxnvlhsf6lc3fziv5h2hhfrinws65d7roiv6bfj7d652fid.onion/annna/
---
commit ec75159ce63799ac70abd66ac590bc0d80c7dcab
parent 27908724ac2dfc9736de111f4a6a1ac89e4c949f
Author: Annna Robert-Houdin <annna@bitreich.org>
Date: Sat, 4 Jan 2025 19:49:16 +0100
Reduce threads to 5 to speed up generation and allow more parallel lamas.
Diffstat:
M gpt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
---
diff --git a/gpt b/gpt
@@ -19,7 +19,7 @@ fi
prompt="$1"
printf "%s\n" "${prompt}" \
- | $ggmlbin -m $ggmlmodel -n $ggmlntokens \
+ | $ggmlbin -m $ggmlmodel -n $ggmlntokens -t 5 \
--simple-io --no-display-prompt --grammar 'root ::= ([^\x00-\x1F])*' \
-p "${systemprompt}" -cnv 2>/dev/null \
| sed -E '/^$/d;s/^>[[:blank:]]+//;q'
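
The commit message ties the fixed thread count to running more generators side by side. A rough sketch of that arithmetic, assuming each process really uses the number of threads passed via -t and nothing else competes for the CPU; max_parallel is a hypothetical helper and is not part of annna:

```shell
#!/bin/sh
# Hypothetical helper (not in the annna repository): estimate how many
# generator processes fit on a machine when each one is limited to a
# fixed number of threads, as with the "-t 5" added in the diff above.
max_parallel() {
	cores="$1"    # CPU cores available (assumed known, e.g. from nproc)
	threads="$2"  # threads per process
	n=$((cores / threads))
	# Always allow at least one process, even when cores < threads.
	if [ "$n" -lt 1 ]; then
		n=1
	fi
	printf '%s\n' "$n"
}

max_parallel 16 5   # prints 3
max_parallel 4 5    # prints 1
```

On a 16-core machine this gives three concurrent runs at 5 threads each. Real throughput depends on memory bandwidth and model size, so treat the result as an upper bound, not a measurement.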