
feat: topk/topp sampling #105

Merged (4 commits) on Jul 31, 2024

Conversation

chenghuaWang (Contributor)

Adds greedy search, top-k sampling, and top-p sampling for language generation. See reference: https://huggingface.co/blog/how-to-generate

Note: the tensor passed to the top-p generator must sum to 1, i.e. a softmax should be applied first.

```cpp
// Construct a generator that uses top-k sampling (k = 50, temperature = 0.3, p = 0.92).
LlmTextGenerator gen(LlmTextGeneratorType::kTopkSampling, /*k*/ 50, /*temperature*/ 0.3, /*p*/ 0.92);
auto result = model(...);
auto out_token = gen.generate(result[0]);
auto out_string = tokenizer.detokenize({out_token});
```
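
For readers unfamiliar with the technique, here is a minimal standalone sketch of the softmax-then-top-p pipeline described in the note above. It illustrates the idea from the linked Hugging Face post, not this PR's actual implementation; the function names and the raw `logits` vector are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <numeric>
#include <random>
#include <vector>

// Numerically stable softmax with temperature: subtract the max logit
// before exponentiating so exp() cannot overflow.
std::vector<float> softmax(const std::vector<float>& logits, float temperature) {
    std::vector<float> probs(logits.size());
    float max_logit = *std::max_element(logits.begin(), logits.end());
    float sum = 0.f;
    for (size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp((logits[i] - max_logit) / temperature);
        sum += probs[i];
    }
    for (auto& p : probs) p /= sum;  // probabilities now sum to 1
    return probs;
}

// Top-p (nucleus) sampling: keep the smallest set of tokens whose
// cumulative probability reaches p, then sample from that set.
unsigned int sample_top_p(const std::vector<float>& probs, float p, std::mt19937& rng) {
    std::vector<unsigned int> idx(probs.size());
    std::iota(idx.begin(), idx.end(), 0u);
    std::sort(idx.begin(), idx.end(),
              [&](unsigned int a, unsigned int b) { return probs[a] > probs[b]; });

    float cum = 0.f;
    size_t cutoff = idx.size();
    for (size_t i = 0; i < idx.size(); ++i) {
        cum += probs[idx[i]];
        if (cum >= p) { cutoff = i + 1; break; }
    }

    // Renormalize over the nucleus and draw one token from it.
    std::uniform_real_distribution<float> dist(0.f, cum);
    float r = dist(rng), acc = 0.f;
    for (size_t i = 0; i < cutoff; ++i) {
        acc += probs[idx[i]];
        if (r <= acc) return idx[i];
    }
    return idx[cutoff - 1];
}
```

Top-k truncation works the same way, except the cutoff is a fixed count of tokens rather than a probability mass.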

@chenghuaWang (Contributor, Author)

To avoid copying the entire token vector, please use the callback function when you want to consume all tokens at once. Here are two examples.

Chat:

```cpp
for (int i = 0; i < in_strs.size(); ++i) {
    auto in_str = in_strs[i];
    auto input_tensor = tokenizer.tokenize(in_str, i);
    std::cout << "[Q] " << in_str << std::endl;
    std::cout << "[A] " << std::flush;

    LlmTextGeneratorOpts opt{
        .max_new_tokens = 100,
        .do_sample = true,
        .temperature = 0.3f,
        .top_k = 50,
        .top_p = 0.f,
    };
    // The callback is invoked once per generated token; returning false
    // stops generation early.
    model.generate(input_tensor, opt, [&](unsigned int out_token) -> bool {
        auto out_string = tokenizer.detokenize({out_token});
        auto [isOk, print_string] = processOutput(out_string);
        if (isOk) {
            std::cout << print_string << std::flush;
        } else {
            return false;
        }
        return true;
    });
    printf("\n");
}
```

Get all Tokens:

```cpp
for (int i = 0; i < in_strs.size(); ++i) {
    auto in_str = in_strs[i];
    auto input_tensor = tokenizer.tokenize(in_str, i);

    LlmTextGeneratorOpts opt{
        .max_new_tokens = 100,
        .do_sample = true,
        .temperature = 0.3f,
        .top_k = 50,
        .top_p = 0.f,
    };
    // Accumulate every generated token; returning true keeps generation going.
    std::vector<unsigned int> tokens;
    model.generate(input_tensor, opt, [&](unsigned int out_token) -> bool {
        tokens.emplace_back(out_token);
        return true;
    });
    // Detokenize the accumulated tokens in one call.
    auto out_string = tokenizer.detokenize(tokens);
}
```
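
Although both examples set `.do_sample = true`, the PR also adds greedy search; with sampling disabled the pick is presumably just the argmax of the distribution. A minimal standalone sketch of that idea (not this PR's code):

```cpp
#include <algorithm>
#include <vector>

// Greedy search: always pick the highest-probability token.
unsigned int sample_greedy(const std::vector<float>& probs) {
    return static_cast<unsigned int>(
        std::max_element(probs.begin(), probs.end()) - probs.begin());
}
```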

@yirongjie merged commit c5c33de into UbiquitousLearning:main on Jul 31, 2024. 1 check passed.