feat(backend): Enable json parsing with typing & conversion #8578
base: dev
Is this a functional or aesthetic change? In all of the instances where this target_type feature is used, I'd expect the type to be predictable because we validate it before storing it. Are there any edge cases this addresses? Can you give an example?
@Pwuts the trigger was that we accidentally stored the stats column of AgentGraphExecution as a list instead of a dict. This code makes sure we don't break in such a scenario. Example: https://github.com/Significant-Gravitas/AutoGPT/pull/8578/files#diff-fe699074de9055342cbf5c2aef8eba0ae08ef41429c2f012789c20eb9dec532cR82 Once an invalid format is stored, our code breaks and we have no way to handle it. This also helps with type hinting in the IDE.
As far as I can see, this code has no recovery mechanism for that situation, does it? It raises an error, which is still breaking.
Nope, it converts between types; that's the whole point of this PR.
What does that mean?
My point is: I've looked through the code of Trying to coerce a type when retrieving stuff from DB is risky business to me. A dict (that we want to store) accidentally being wrapped in a list is just one of many possible ways to screw up storing that information. I think checking the parsed value against a target type is all we should be doing when it comes to info pulled from the DB. |
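For reference, the check-only approach described in this comment could look roughly like the sketch below. The function name `loads_checked` is hypothetical and not part of the PR; it only verifies the top-level type of the parsed value and never coerces it.

```python
import json
from typing import Any


def loads_checked(raw: str, expected_type: type) -> Any:
    """Parse JSON and verify the top-level type; never coerce (hypothetical helper)."""
    value = json.loads(raw)
    if not isinstance(value, expected_type):
        # A mismatch is reported to the caller instead of being silently reshaped.
        raise TypeError(
            f"Value {value!r} is not of expected type {expected_type.__name__}"
        )
    return value


# A well-formed stats payload passes through unchanged.
stats = loads_checked('{"duration": 1.5}', dict)
```

With this approach, a dict accidentally wrapped in a list (for example `'[{"duration": 1.5}]'`) raises `TypeError`, so the caller has to decide how to recover.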
You can make the expected structure as deep and detailed as you want and it will shape the value accordingly; since the caller only expects dict[str, Any], that's what we get. The stats value in an invalid format doesn't matter; what matters is that we never break the whole process because of it.
Checking the type is an obvious requirement. The action that we need if it doesn't match is the question, so we could either:
I chose the second option. |
My point is that this generally won't help. If we are planning on using the value, e.g. So I think it really should just raise an error. Simple conversions can work fine, but I don't really see a case for converting to/from or between complex types like |
We definitely can't do this. The whole point of this PR is to avoid raising errors on malformed, loosely typed JSON. The JSON values we have are mostly not the most crucial information (stats, output data, properties). It doesn't make sense to error out when loading a graph, executing a graph, or even opening a monitor page just because the stats/outputData value in the DB was accidentally corrupted. If best-effort conversion is a no-go, I'm open to other alternatives, but they have to be non-terminating, unlike the current state.
As I stated before, using the value is not the main intention. We need the process to be non-breaking, that's all. To make it non-breaking, we need to either fall back to an empty value for each type or do a conversion like this change does.
We can't. We should not break the monitor page, or literally any GET request, because an execution duration stat or one output page is missing or malformed.
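The two non-terminating options mentioned above (fall back to an empty value, or attempt a conversion) can be sketched as follows. Both function names and the unwrap rule are assumptions made for illustration; they are not the PR's actual code.

```python
import json
from typing import Any


def load_dict_or_empty(raw: str) -> dict[str, Any]:
    """Fallback option: return an empty dict for malformed or mistyped input."""
    try:
        value = json.loads(raw)
    except (TypeError, ValueError):  # json.JSONDecodeError subclasses ValueError
        return {}
    return value if isinstance(value, dict) else {}


def load_dict_best_effort(raw: str) -> dict[str, Any]:
    """Conversion option: try a simple reshaping first, then fall back to empty."""
    try:
        value = json.loads(raw)
    except (TypeError, ValueError):
        return {}
    if isinstance(value, dict):
        return value
    # Hypothetical rule: unwrap a dict that was accidentally stored in a one-item list.
    if isinstance(value, list) and len(value) == 1 and isinstance(value[0], dict):
        return value[0]
    return {}
```

Neither function raises on bad data, so a page that only displays stats keeps loading. The trade-off raised in the review still applies: a caller that relies on the value cannot tell an empty result from a corrupted one.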
I'll hold merging this PR based on @Pwuts input here. |
Alright. I still don't think we should be trying to convert between types that don't have a trivial way to convert. E.g. A few other ideas:
@Pwuts Updated, please re-review |
@@ -133,6 +133,8 @@ def _convert(value: Any, target_type: Type):
            return {convert(v, args[0]) for v in value}
        else:
            return value
    elif raise_on_mismatch:
        raise ConversionError(f"Failed to convert {value} to {target_type}")
Suggested change:
- raise ConversionError(f"Failed to convert {value} to {target_type}")
+ raise TypeError(f"Value {value} is not of expected type {target_type}")
Other than the one comment I just posted I'm happy with this solution :) |
Background
We use loosely typed JSON in several places. Sometimes typing is required: the result has to be a specific type (tuple, list, dict, int, float, etc.), and the raw string format does not adhere to that.
Changes 🏗️
This PR introduces typing on json.loads.
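As a rough illustration of what typed loading can mean, here is a minimal sketch of a `loads` wrapper that takes a target type. It is not the PR's implementation: the conversion rules (wrapping a scalar in a list, falling back to an empty value) and the exact signature are assumptions, and only the `raise_on_mismatch` name is borrowed from the review diff.

```python
import json
from typing import Any, get_args, get_origin


def _mismatch(value: Any, target_type: Any, fallback: Any, strict: bool) -> Any:
    """Raise in strict mode, otherwise return the fallback value."""
    if strict:
        raise TypeError(f"Value {value!r} is not of expected type {target_type}")
    return fallback


def convert(value: Any, target_type: Any, raise_on_mismatch: bool = False) -> Any:
    """Best-effort conversion of a parsed JSON value (assumed rules, see lead-in)."""
    if target_type is Any:
        return value
    origin = get_origin(target_type) or target_type
    args = get_args(target_type)
    if origin is list:
        if not isinstance(value, list):
            # Assumed rule: a lone value becomes a one-item list.
            value = _mismatch(value, target_type, [value], raise_on_mismatch)
        if not args:
            return value
        return [convert(v, args[0], raise_on_mismatch) for v in value]
    if origin is dict:
        if not isinstance(value, dict):
            return _mismatch(value, target_type, {}, raise_on_mismatch)
        if len(args) != 2:
            return value
        return {
            convert(k, args[0], raise_on_mismatch): convert(v, args[1], raise_on_mismatch)
            for k, v in value.items()
        }
    if origin in (int, float, str):
        if isinstance(value, origin):
            return value
        try:
            return origin(value)  # trivial conversions such as "1" -> 1
        except (TypeError, ValueError):
            return _mismatch(value, target_type, origin(), raise_on_mismatch)
    return value  # unknown target types pass through untouched


def loads(raw: str, target_type: Any = Any, raise_on_mismatch: bool = False) -> Any:
    """json.loads followed by conversion to target_type."""
    return convert(json.loads(raw), target_type, raise_on_mismatch)
```

For example, `loads('{"a": "1"}', dict[str, int])` converts the string value to an integer, while `loads('[1]', dict[str, Any], raise_on_mismatch=True)` raises `TypeError` instead of returning an empty dict. The sketch assumes Python 3.9 or later for the `dict[str, int]` syntax.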
Testing 🔍
Note
Only for the new autogpt platform, currently in autogpt_platform/