[core] Performance improvement for runtime env serialization #48749

dentiny · 2024-11-14T22:31:55Z

Addresses issue #48591

The problem is:

If anything is specified in runtime_env in the remote decorator, parsing and serialization happen on every function invocation:
- Parsing calls parse_runtime_env, which converts a dictionary into a class instance
- Serialization calls get_runtime_env_info, which serializes that class into JSON format

After discussing with @rynewang, the proposed solution is to cache the pre-calculated serialized runtime env info, so that parsing and serialization happen only once, at initialization.
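As a rough illustration of the caching idea (not the actual Ray code), here is a minimal sketch that parses and serializes once at construction time and reuses the result on every call. `ToyRemoteFunction`, `toy_parse_runtime_env`, and `toy_serialize_runtime_env` are hypothetical stand-ins; only the attribute name `_serialized_base_runtime_env_info` is taken from the diff in this PR.

```python
import json

# Counts how often the stand-in serialization runs.
serialize_calls = 0


def toy_parse_runtime_env(runtime_env):
    """Hypothetical stand-in for parsing: validate and copy the dict."""
    if not isinstance(runtime_env, dict):
        raise TypeError("runtime_env must be a dict")
    return dict(runtime_env)


def toy_serialize_runtime_env(parsed):
    """Hypothetical stand-in for serialization: dump to JSON."""
    global serialize_calls
    serialize_calls += 1
    return json.dumps(parsed, sort_keys=True)


class ToyRemoteFunction:
    """Toy model of a remote function wrapper (assumption, not Ray's class)."""

    def __init__(self, func, runtime_env=None):
        self._func = func
        # Parse and serialize once, when the wrapper is created.
        parsed = toy_parse_runtime_env(runtime_env or {})
        self._serialized_base_runtime_env_info = toy_serialize_runtime_env(parsed)

    def remote(self, *args):
        # Each invocation reuses the cached string instead of re-serializing.
        return self._func(*args), self._serialized_base_runtime_env_info


f = ToyRemoteFunction(lambda x: x + 1, runtime_env={"env_vars": {"FOO": "bar"}})
results = [f.remote(i) for i in range(1000)]
```

In this toy model, the serialization counter stays at one no matter how many times `remote` is called, which is the effect the PR aims for.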

I benchmarked with the test @jjyao mentioned on the ticket and confirmed that performance with env vars is now similar to performance without them.

Alternatives considered:

Use functools.cache for get_runtime_env_info, which is a stateless function
- The caveat is that we would need an acceptable way to decide whether the serialization options and runtime env info are the same; simply traversing all fields is not a good plan
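To make the caveat concrete, here is a small generic sketch (not Ray code) of why `functools.cache` does not apply directly: it keys on hashable arguments, so a plain dict argument is rejected, and any workaround has to build a comparable key on every call. `cached_info`, `info_from_key`, and `cached_info_via_key` are hypothetical names used only for illustration.

```python
import functools
import json


@functools.cache
def cached_info(runtime_env):
    """Hypothetical stand-in for a stateless serialization function."""
    return json.dumps(runtime_env, sort_keys=True)


# functools.cache hashes its arguments, and dicts are unhashable.
try:
    cached_info({"env_vars": {"A": "1"}})
    error = None
except TypeError as exc:
    error = str(exc)


@functools.cache
def info_from_key(key):
    """Cached on a string key; the body is a stand-in for the real work."""
    return key


def cached_info_via_key(runtime_env):
    # Building the key traverses every field on every call, which is the
    # per-call cost the PR description wants to avoid.
    key = json.dumps(runtime_env, sort_keys=True)
    return info_from_key(key)


first = cached_info_via_key({"env_vars": {"A": "1"}})
second = cached_info_via_key({"env_vars": {"A": "1"}})
```

The second lookup is a cache hit, but only after the full dictionary has been walked again to compute the key.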

Signed-off-by: dentiny <[email protected]>

rynewang · 2024-11-14T23:24:44Z

python/ray/remote_function.py

+        # runtime env will be merged and re-serialized.
+        #
+        # Caveat: for `func.option().remote()`, we have to recalculate serialized
+        # runtime env info upon every call. But it's acceptable since pre-calculation


to be more clear,

To support dynamic runtime envs in `func.options(runtime_env={...}).remote()`, we recalculate the serialized runtime env info in the `options` call. If there are multiple calls with the same options, one can save the calculation with `opt_f = func.options(runtime_env={...}); [opt_f.remote() for i in range(many)]`.

I'm not sure I follow the "If there are multiple calls with the same options" part, since we don't do any caching for options calls.

Adopted other comments.
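The reuse pattern discussed in this thread can be sketched with a toy model, assuming (as the thread describes) that the serialization cost is paid inside the `options` call and that the returned handle simply carries the result. `ToyFunc` and `ToyOptionsHandle` are hypothetical stand-ins, not Ray classes.

```python
import json


class ToyOptionsHandle:
    """Hypothetical stand-in for the object returned by `.options(...)`."""

    def __init__(self, serialized_info):
        self.serialized_info = serialized_info

    def remote(self):
        # No serialization here; the handle carries the precomputed string.
        return self.serialized_info


class ToyFunc:
    """Toy model: every `options` call re-serializes the runtime env."""

    def __init__(self):
        self.serialize_calls = 0

    def options(self, runtime_env):
        self.serialize_calls += 1
        return ToyOptionsHandle(json.dumps(runtime_env, sort_keys=True))


env = {"env_vars": {"FOO": "bar"}}

# Calling options() inside the loop pays the cost on every iteration.
slow = ToyFunc()
for _ in range(100):
    slow.options(runtime_env=env).remote()

# Holding on to the handle pays the cost once.
fast = ToyFunc()
opt_f = fast.options(runtime_env=env)
for _ in range(100):
    opt_f.remote()
```

No cache is involved in either variant; the saving comes only from the caller keeping the handle returned by `options`.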


rynewang · 2024-11-14T23:27:38Z

python/ray/remote_function.py

-        # only update runtime_env when ".options()" specifies new runtime_env
+        # Only update runtime_env and re-calculate serialized runtime env info when
+        # ".options()" specifies new runtime_env.
+        serialized_runtime_env_info = self._serialized_base_runtime_env_info
        if "runtime_env" in task_options:
            updated_options["runtime_env"] = parse_runtime_env(


I think we should no longer populate updated_options["runtime_env"] ?

We basically have two options:

- Keep updated_options["runtime_env"] updated, as we did in the past
- Remove runtime_env from updated_options

I chose the first option, because updated_options corresponds to task_options and default_options, so people expect runtime_env to be present in these options.

hmm, ok since we need to support bind as well, where this optimization is not used
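A minimal sketch of the option chosen in this thread, under stated assumptions: `toy_update_options` is a hypothetical helper, JSON stands in for the real serialization, and the new runtime env simply replaces the old one (how Ray merges runtime envs is not modeled here). It keeps `updated_options["runtime_env"]` populated while re-serializing only when `runtime_env` appears in the task options.

```python
import json


def toy_update_options(default_options, task_options, base_serialized_info):
    """Hypothetical helper: merge options and pick the serialized info."""
    updated_options = {**default_options, **task_options}
    # Start from the cached value computed at initialization.
    serialized_info = base_serialized_info
    if "runtime_env" in task_options:
        # Keep runtime_env in updated_options, and re-serialize it.
        updated_options["runtime_env"] = dict(task_options["runtime_env"])
        serialized_info = json.dumps(
            updated_options["runtime_env"], sort_keys=True
        )
    return updated_options, serialized_info


base_env = {"env_vars": {"A": "1"}}
base_info = json.dumps(base_env, sort_keys=True)
defaults = {"num_cpus": 1, "runtime_env": base_env}

# No runtime_env override: the cached serialized info is reused as-is.
same_opts, same_info = toy_update_options(defaults, {"num_cpus": 2}, base_info)

# runtime_env override: options are updated and the info is recomputed.
new_opts, new_info = toy_update_options(
    defaults, {"runtime_env": {"env_vars": {"A": "2"}}}, base_info
)
```

Either way, code that reads `runtime_env` from the merged options still finds it there.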


performance improvement for runtime env

3e312e0

Signed-off-by: dentiny <[email protected]>

dentiny requested review from jjyao and rynewang November 14, 2024 22:31

dentiny added the `go` label (add ONLY when ready to merge, run all tests) Nov 14, 2024

rynewang reviewed Nov 14, 2024


dentiny added 3 commits November 15, 2024 00:08

update comment

c5dd0e7

Signed-off-by: dentiny <[email protected]>

avoid default value

2a2ee84

Signed-off-by: dentiny <[email protected]>

fix is_job_runtime_env

a224041

Signed-off-by: dentiny <[email protected]>

dentiny requested a review from rynewang November 15, 2024 00:15

Merge branch 'master' into hjiang/improve-runtime-env-remote

1f2121c

jcotant1 added the `core` label (Issues that should be addressed in Ray Core) Nov 15, 2024
