
Manual warmup per model instance / specify warmup config dynamically using c api #7884

Open
asaff1 opened this issue Dec 16, 2024 · 0 comments


asaff1 commented Dec 16, 2024

Is your feature request related to a problem? Please describe.
I'm having a problem with warming up the model.
Currently, config.pbtxt provides a warmup section, but I'd like to warm up the model for many different batch sizes, which would make the config.pbtxt very large. I use the C API, so I'd like to do it either manually in code, by sending InferRequests after model load, or by somehow providing the model config dynamically to the LoadModel function (currently not possible).
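For context, the static warmup section in config.pbtxt needs one model_warmup entry per batch size, roughly like the fragment below (input name, dtype, and dims here are placeholders, not from my actual model):

```
model_warmup [
  {
    name: "warmup_bs1"
    batch_size: 1
    inputs {
      key: "INPUT0"
      value {
        data_type: TYPE_FP32
        dims: [ 3, 224, 224 ]
        zero_data: true
      }
    }
  },
  {
    name: "warmup_bs2"
    batch_size: 2
    inputs {
      key: "INPUT0"
      value {
        data_type: TYPE_FP32
        dims: [ 3, 224, 224 ]
        zero_data: true
      }
    }
  }
]
```

With dozens of batch sizes this repetition is what makes the file so large.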

Doing it manually in code would look something like this:

        // Build one warmup request with all model inputs and send it
        // through the in-process server after the model is loaded.
        auto request = tds::InferRequest::Create(tds::InferOptions(model_name_));
        for (auto& inp : model_inputs_) {
            request->AddInput(inp, begin, end, tds::DataType::FP32, input_shape, tds::MemoryType::CPU, 0);
        }

        auto result = server_->Infer(*request);

The problem here is that I cannot specify the device (the system has multiple GPUs), or more precisely, the model instance to send the InferRequest to. Is that possible using the C++ API? If so, it would solve my issue.

The other way could be the ModelWarmup config, which runs per model instance, which is what I want. But I cannot specify it in any way other than the config.pbtxt. I'd like to specify it on LoadModel, or in some other dynamic way from code. Is that possible with the C++ API, without editing the config.pbtxt file?

Describe the solution you'd like
A way to warm up the model for all instances (which run on different devices), either through the warmup feature or by manually specifying the target model instance of an InferRequest.

@asaff1 asaff1 changed the title Manual warmup / specify warmup using c api Manual warmup per model instance / specify warmup config dynamically using c api Dec 16, 2024