Make padding-aware scheduling disableable (#771)
kzawora-intel authored Feb 3, 2025
2 parents e29b5f5 + 4deb8cf commit 85eb147
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion vllm/engine/arg_utils.py
@@ -465,7 +465,9 @@ def add_cli_args(parser: FlexibleArgumentParser) -> FlexibleArgumentParser:
         parser.add_argument(
             '--use-padding-aware-scheduling',
             default=EngineArgs.use_padding_aware_scheduling,
-            action='store_true',
+            action=StoreBoolean,
+            nargs="?",
+            const="True",
             help=('Use padding-aware scheduling. If True, the scheduler '
                   'will consider padded tokens in prefill. '
                   'By default this is set to False on non-HPU devices. '))
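
The change above matters when the default resolves to True: with `action='store_true'` the flag can only turn the option on, so there is no way to turn it off from the command line. The following is a minimal sketch of the new behavior, not the project's code. It assumes that `StoreBoolean` is an `argparse.Action` that maps the strings "true"/"false" to booleans; the class body, `build_parser`, and `parse_flag` are hypothetical stand-ins, and plain `argparse.ArgumentParser` is used in place of `FlexibleArgumentParser`.

```python
import argparse


class StoreBoolean(argparse.Action):
    """Hypothetical stand-in for the StoreBoolean action named in the diff."""

    def __call__(self, parser, namespace, values, option_string=None):
        # Assumption: the action accepts "true"/"false" in any letter case.
        text = str(values).lower()
        if text == "true":
            setattr(namespace, self.dest, True)
        elif text == "false":
            setattr(namespace, self.dest, False)
        else:
            parser.error(
                f"{option_string} expects 'true' or 'false', got {values!r}")


def build_parser(default=False):
    """Build a parser with the option declared as in the new version of the diff."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--use-padding-aware-scheduling",
        default=default,      # stands in for EngineArgs.use_padding_aware_scheduling
        action=StoreBoolean,
        nargs="?",            # the value after the flag is optional
        const="True",         # value handed to the action when the flag is bare
    )
    return parser


def parse_flag(argv, default=False):
    """Parse argv and return the resulting boolean for the option."""
    return build_parser(default).parse_args(argv).use_padding_aware_scheduling
```

Under these assumptions, a bare `--use-padding-aware-scheduling` still enables the option (argparse passes `const` to the action), an omitted flag keeps the default, and `--use-padding-aware-scheduling false` disables it even when the default is True.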