[SPARK-50843][ML][PYTHON][CONNECT] Support return a new model from existing one #49709

Closed

zhengruifeng
Contributor

What changes were proposed in this pull request?

Support returning a new model from an existing one, e.g.
1, `RandomForestClassificationModel.trees` -> get the underlying decision tree models;
2, `DistributedLDAModel.toLocal` -> get a `LocalLDAModel`
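
As background for why this needs dedicated support: the review comment in this thread about server-side caching suggests that, on Spark Connect, the fitted model lives on the server and the Python client holds only a reference to it. A member that returns another model therefore has to hand back a new reference. The following is a toy, pure-Python sketch of that idea under those assumptions, not the PR's implementation; `FakeServer`, `RemoteModel`, and `RemoteForest` are hypothetical names and no real PySpark API is used.

```python
from itertools import count
from typing import Any, Dict, List


class FakeServer:
    """Hypothetical stand-in for a server that caches model objects by id."""

    def __init__(self) -> None:
        self._cache: Dict[str, Any] = {}
        self._ids = count()

    def register(self, obj: Any) -> str:
        # Store the object server-side and hand back only a reference id.
        ref = f"obj-{next(self._ids)}"
        self._cache[ref] = obj
        return ref

    def get(self, ref: str) -> Any:
        return self._cache[ref]


class RemoteModel:
    """Hypothetical client-side handle: holds a reference id, not the model."""

    def __init__(self, server: FakeServer, ref: str) -> None:
        self.server = server
        self.ref = ref


class RemoteForest(RemoteModel):
    @property
    def trees(self) -> List[RemoteModel]:
        # Each submodel is registered on the server and wrapped in a new handle.
        forest = self.server.get(self.ref)
        return [RemoteModel(self.server, self.server.register(t)) for t in forest]


# A "forest" is modelled here as a plain list of tree placeholders.
server = FakeServer()
forest = RemoteForest(server, server.register(["tree-a", "tree-b"]))
tree_handles = forest.trees
```

In this sketch every access to `trees` registers the submodels again, which is the kind of repeated server-side work the `cached_property` change in this PR appears to avoid.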

Why are the changes needed?

Feature parity with classic (non-Connect) PySpark ML.

Does this PR introduce any user-facing change?

yes

How was this patch tested?

added tests

Was this patch authored or co-authored using generative AI tooling?

no

zhengruifeng commented Jan 28, 2025

@wbo4958 @grundprinzip

zhengruifeng changed the title from "[SPARK-50843][ML][PYTHON][CONNECT] Support return a new model from existing one" to "[WIP][SPARK-50843][ML][PYTHON][CONNECT] Support return a new model from existing one" on Jan 28, 2025
zhengruifeng marked this pull request as draft on January 28, 2025 10:13
zhengruifeng changed the title from "[WIP][SPARK-50843][ML][PYTHON][CONNECT] Support return a new model from existing one" to "[SPARK-50843][ML][PYTHON][CONNECT] Support return a new model from existing one" on Jan 28, 2025
@@ -2290,10 +2291,15 @@ def featureImportances(self) -> Vector:
         """
         return self._call_java("featureImportances")

-    @property
+    @cached_property

Contributor Author

Cache the result to avoid regenerating and re-caching the submodels (the trees) on the server side.
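
To make the effect of swapping `@property` for `@cached_property` concrete, here is a minimal standalone sketch using `functools.cached_property` from the standard library. `ForestHandle` and `_fetch_trees` are hypothetical stand-ins for the client-side model and its server round trip; the counter simply records how often the simulated fetch runs.

```python
from functools import cached_property
from typing import List


class ForestHandle:
    """Hypothetical client-side model; fetch_count tracks simulated server calls."""

    def __init__(self) -> None:
        self.fetch_count = 0

    def _fetch_trees(self) -> List[str]:
        # Stand-in for a round trip that makes the server build and cache submodels.
        self.fetch_count += 1
        return ["tree-0", "tree-1"]

    @property
    def trees_uncached(self) -> List[str]:
        # Recomputed on every attribute access.
        return self._fetch_trees()

    @cached_property
    def trees(self) -> List[str]:
        # Computed on first access, then stored on the instance.
        return self._fetch_trees()


model = ForestHandle()

model.trees_uncached
model.trees_uncached
uncached_calls = model.fetch_count  # one fetch per access

model.trees
model.trees
cached_calls = model.fetch_count - uncached_calls  # a single fetch in total
```

One consequence of this choice is that the cached value lives as long as the instance does, so it suits results that do not change after the model is fitted.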

Contributor

wbo4958 left a comment

LGTM.

zhengruifeng marked this pull request as ready for review on February 2, 2025 01:12
zhengruifeng added a commit that referenced this pull request Feb 2, 2025
…isting one

Closes #49709 from zhengruifeng/ml_connect_model.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
(cherry picked from commit ba18f50)
Signed-off-by: Ruifeng Zheng <[email protected]>
@zhengruifeng
Contributor Author

merged to master/4.0

zhengruifeng deleted the ml_connect_model branch on February 2, 2025 01:15