feat(backend): Enable json parsing with typing & conversion #8578

majdyz · 2024-11-07T03:07:22Z

Background

We are using loosely typed JSON in several places, sometimes some sort of typing is required, and the resulting has to be either (tuple, list, dict, int, float, etc), and the raw string format is not adhering to that.

Changes 🏗️

This PR introduces typing on json.loads.

Testing 🔍

Note

Only for the new autogpt platform, currently in autogpt_platform/

Create from scratch and execute an agent with at least 3 blocks
Import an agent from file upload, and confirm it executes correctly
Upload agent to marketplace
Import an agent from marketplace and confirm it executes correctly
Edit an agent from monitor, and confirm it executes correctly

Configuration Changes 📝

Note

Only for the new autogpt platform, currently in autogpt_platform/

If you're making configuration or infrastructure changes, please remember to check you've updated the related infrastructure code in the autogpt_platform/infra folder.

Examples of such changes might include:

Changing ports
Adding new services that need to communicate with each other
Secrets or environment variable changes
New or infrastructure changes such as databases

codiumai-pr-agent-pro · 2024-11-07T03:07:40Z

PR Reviewer Guide 🔍

(Review updated until commit `c9df13e`)

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review Error Handling The new error handling in convert() function catches all exceptions generically and wraps them in ConversionError. This could mask important specific errors. Consider catching more specific exceptions. Default Value Risk Using empty string as default value for json.loads could mask real errors if stats is None. Consider handling None case explicitly. Type Safety The loads() function returns Any when no target_type is provided. This could lead to type safety issues in consuming code. Consider adding runtime type checking or warnings.

Pwuts

Is this a functional or aesthetic change? In all of the instances where this target_type feature is used, I'd expect the type to be predictable because we validate it before storing it. Are there any edge cases this addresses? Can you give an example?

majdyz · 2024-11-08T01:20:20Z

@Pwuts the trigger was we accidentally storing stats column of AgentGraphExecution as a list instead of dict. This code made sure to not break in such a scenario. Example: https://github.com/Significant-Gravitas/AutoGPT/pull/8578/files#diff-fe699074de9055342cbf5c2aef8eba0ae08ef41429c2f012789c20eb9dec532cR82

Once an invalid format being stored, our code will break and we have no way to handle it.

This also helps the type hinting on the IDE

Pwuts · 2024-11-08T14:37:51Z

accidentally storing stats column of AgentGraphExecution as a list instead of dict. This code made sure to not break in such a scenario.

As far as I can see, this code has no recovery mechanism for that situation, does it? It raises an error, which is still breaking.

majdyz · 2024-11-08T14:41:12Z

Nope it concerts between types, that's the whole point of this PR

Pwuts · 2024-11-08T21:37:05Z

it concerts between types

What does that mean?

Pwuts · 2024-11-08T21:47:19Z

My point is: I've looked through the code of convert() and afaics convert([{"some_stat": 123}], dict[str, Any]) will output {0: {"some_stat": 123}}. This doesn't fix the problem of stats being accidentally being stored wrapped in a tuple or list, which was the trigger for making this PR (right?). The resulting object may be a dict, but still doesn't have the right structure.

Trying to coerce a type when retrieving stuff from DB is risky business to me. A dict (that we want to store) accidentally being wrapped in a list is just one of many possible ways to screw up storing that information. I think checking the parsed value against a target type is all we should be doing when it comes to info pulled from the DB.

majdyz · 2024-11-08T22:25:04Z

You can make the expected structure as deep and detailed as you want and it will shape it, since the caller only expect dict[str, Any] then that's what we get.

The stats value on invalid format doesn't matter, what matter is that we never break the whole process because of it.
To fix the initial problem we only have two options:

Make the list to dict, then move on
Not changing anything, but still move on

I think checking the parsed value against a target type is all we should be doing when it comes to info pulled from the DB.

Checking the type is an obvious requirement. The action that we need if it doesn't match is the question, so we could either:

Raise an error (that we can't do)
Shape it to match the structure (and maybe log a warning) and move on.

I chose the second option.

codiumai-pr-agent-pro · 2024-11-08T22:29:30Z

Persistent review updated to latest commit c9df13e

Pwuts · 2024-11-09T03:21:24Z

Shape it to match the structure (and maybe log a warning) and move on.

My point is that this generally won't help. If we are planning on using the value, e.g. stats["duration"], and stats comes in as a list[dict] instead of a dict, then trying to convert it (using an arbitrary mechanism) isn't going to make it usable because they are fundamentally different data structures. It will let the system move on until the value is actually used, and then it will still fail, maybe in an unexpected way. It could even cause further pollution of the system by propagating the malformed data.

So I think it really should just raise an error.

Simple conversions can work fine, but I don't really see a case for converting to/from or between complex types like list and dict. Within graphs we have to be somewhat flexible to increase interoperability of blocks, but I wouldn't want to extend this to our backend code. That honestly seems dangerous.

majdyz · 2024-11-10T00:26:29Z

So I think it really should just raise an error.

We definitely can't do this. This is the whole point of the PR, avoid raising error on malformatted loosely typed json. Json value we are having are mostly not a the most crucial information (stats, output data, properties). It doesn't make sense to error out loading a graph, executing a graph , or even opening a monitor page just because the db was accidentally corrupted the stats/outputData value.

If a best effort conversion is a no go. I'm open with other alternatives, but it has to be non-terminating, not like the current state.

majdyz · 2024-11-11T05:14:47Z

As I stated before, using the value is not the main intention. We need the process to be non breaking that's all. And to make it non breaking we need to fallback to empty value on each type or do conversion like what this change is doing.

So I think it really should just raise an error.

We can't. We should not break monitor page or literally any get request because an execution duration stats or one output page missing or malformed.

netlify · 2024-11-11T11:11:34Z

✅ Deploy Preview for auto-gpt-docs canceled.

Name	Link
🔨 Latest commit	`8821b04`
🔍 Latest deploy log	https://app.netlify.com/sites/auto-gpt-docs/deploys/6736a946777a0400099db72f

majdyz · 2024-11-12T03:46:38Z

I'll hold merging this PR based on @Pwuts input here.

Pwuts · 2024-11-12T11:37:13Z

If a best effort conversion is a no go. I'm open with other alternatives, but it has to be non-terminating, not like the current state.

Alright. I still don't think we should be trying to convert between types that don't have a trivial way to convert. E.g. float <-> str is doable because there is a lossless way to do that conversion. list <-> dict is non-trivial, and I estimate the chance of it working with the rest of the code to be minimal.

A few other ideas:

Raise an error if deserialized value is not of target_type, then try/except where this isn't allowed to cascade into failure of the containing routine
Return None if deserialized value is not of target_type and can't be (trivially) converted
Add a default: Any = None parameter, return default if deserialized value is not of target_type

majdyz · 2024-11-13T08:17:07Z

@Pwuts Updated, please re-review

Pwuts · 2024-11-14T23:36:48Z

autogpt_platform/backend/backend/util/type.py

@@ -133,6 +133,8 @@ def _convert(value: Any, target_type: Type):
                return {convert(v, args[0]) for v in value}
            else:
                return value
+    elif raise_on_mismatch:
+        raise ConversionError(f"Failed to convert {value} to {target_type}")


Suggested change

raise ConversionError(f"Failed to convert {value} to {target_type}")

raise TypeError(f"Value {value} is not of expected type {target_type}")

Pwuts · 2024-11-14T23:39:38Z

Other than the one comment I just posted I'm happy with this solution :)

feat(backend): Enable json parsing with typing & conversion

3fca82a

majdyz requested a review from ntindle November 7, 2024 03:07

majdyz requested a review from a team as a code owner November 7, 2024 03:07

majdyz requested review from Pwuts and removed request for a team November 7, 2024 03:07

github-actions bot added platform/backend AutoGPT Platform - Back end size/m labels Nov 7, 2024

codiumai-pr-agent-pro bot added the Review effort [1-5]: 3 label Nov 7, 2024

majdyz mentioned this pull request Nov 7, 2024

refactor(backend): Introduced Graph Input & Output Schema, Merge GraphMeta & Graph, Remove subgraph functionality #8526

Merged

Merge branch 'dev' into zamilmajdy/better-json-parsing

941d9df

majdyz requested review from Pwuts and removed request for Pwuts November 7, 2024 04:37

Pwuts reviewed Nov 8, 2024

View reviewed changes

Merge branch 'dev' into zamilmajdy/better-json-parsing

d57e24e

majdyz requested a review from Pwuts November 8, 2024 07:52

Merge branch 'dev' into zamilmajdy/better-json-parsing

c9df13e

majdyz closed this Nov 8, 2024

majdyz reopened this Nov 8, 2024

Merge branch 'dev' into zamilmajdy/better-json-parsing

bf6dbd4

Merge branch 'dev' into zamilmajdy/better-json-parsing

d6edbe4

Merge branch 'dev' into zamilmajdy/better-json-parsing

3f74f4f

aarushik93 previously approved these changes Nov 12, 2024

View reviewed changes

Merge branch 'dev' into zamilmajdy/better-json-parsing

07981fc

Merge branch 'dev' into zamilmajdy/better-json-parsing

7720de4

majdyz added 2 commits November 13, 2024 15:02

Merge branch 'dev' into zamilmajdy/better-json-parsing

8375181

Address comments

4f7e7d0

majdyz dismissed aarushik93’s stale review via 4f7e7d0 November 13, 2024 08:16

Merge branch 'dev' into zamilmajdy/better-json-parsing

32ea591

majdyz requested a review from aarushik93 November 14, 2024 09:42

Merge branch 'dev' into zamilmajdy/better-json-parsing

408eb2f

Pwuts reviewed Nov 14, 2024

View reviewed changes

ntindle approved these changes Nov 15, 2024

View reviewed changes

Merge branch 'dev' into zamilmajdy/better-json-parsing

8821b04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(backend): Enable json parsing with typing & conversion #8578

feat(backend): Enable json parsing with typing & conversion #8578

majdyz commented Nov 7, 2024

codiumai-pr-agent-pro bot commented Nov 7, 2024 •

edited

Loading

Pwuts left a comment

majdyz commented Nov 8, 2024 •

edited

Loading

Pwuts commented Nov 8, 2024

majdyz commented Nov 8, 2024

Pwuts commented Nov 8, 2024

Pwuts commented Nov 8, 2024 •

edited

Loading

majdyz commented Nov 8, 2024 •

edited

Loading

codiumai-pr-agent-pro bot commented Nov 8, 2024

Pwuts commented Nov 9, 2024

majdyz commented Nov 10, 2024 •

edited

Loading

majdyz commented Nov 11, 2024 •

edited

Loading

netlify bot commented Nov 11, 2024 •

edited

Loading

majdyz commented Nov 12, 2024

Pwuts commented Nov 12, 2024

majdyz commented Nov 13, 2024

Pwuts Nov 14, 2024

Pwuts commented Nov 14, 2024

	raise ConversionError(f"Failed to convert {value} to {target_type}")
	raise TypeError(f"Value {value} is not of expected type {target_type}")

feat(backend): Enable json parsing with typing & conversion #8578

Are you sure you want to change the base?

feat(backend): Enable json parsing with typing & conversion #8578

Conversation

majdyz commented Nov 7, 2024

Background

Changes 🏗️

Testing 🔍

Configuration Changes 📝

codiumai-pr-agent-pro bot commented Nov 7, 2024 • edited Loading

PR Reviewer Guide 🔍

(Review updated until commit c9df13e)

Pwuts left a comment

Choose a reason for hiding this comment

majdyz commented Nov 8, 2024 • edited Loading

Pwuts commented Nov 8, 2024

majdyz commented Nov 8, 2024

Pwuts commented Nov 8, 2024

Pwuts commented Nov 8, 2024 • edited Loading

majdyz commented Nov 8, 2024 • edited Loading

codiumai-pr-agent-pro bot commented Nov 8, 2024

Pwuts commented Nov 9, 2024

majdyz commented Nov 10, 2024 • edited Loading

majdyz commented Nov 11, 2024 • edited Loading

netlify bot commented Nov 11, 2024 • edited Loading

✅ Deploy Preview for auto-gpt-docs canceled.

majdyz commented Nov 12, 2024

Pwuts commented Nov 12, 2024

majdyz commented Nov 13, 2024

Pwuts Nov 14, 2024

Choose a reason for hiding this comment

Pwuts commented Nov 14, 2024

codiumai-pr-agent-pro bot commented Nov 7, 2024 •

edited

Loading

(Review updated until commit `c9df13e`)

majdyz commented Nov 8, 2024 •

edited

Loading

Pwuts commented Nov 8, 2024 •

edited

Loading

majdyz commented Nov 8, 2024 •

edited

Loading

majdyz commented Nov 10, 2024 •

edited

Loading

majdyz commented Nov 11, 2024 •

edited

Loading

netlify bot commented Nov 11, 2024 •

edited

Loading