Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Children in spawned gltf scenes are incorrectly ordered #17720

Open
rparrett opened this issue Feb 7, 2025 · 3 comments
Open

Children in spawned gltf scenes are incorrectly ordered #17720

rparrett opened this issue Feb 7, 2025 · 3 comments
Labels
A-ECS Entities, components, systems, and events A-glTF Related to the glTF 3D scene/model format A-Scenes Serialized ECS data stored on the disk C-Bug An unexpected or incorrect behavior D-Complex Quite challenging from either a design or technical perspective. Ask for help! S-Needs-Design This issue requires design work to think about how it would best be accomplished
Milestone

Comments

@rparrett
Copy link
Contributor

rparrett commented Feb 7, 2025

Bevy version

main, bisected to #17687

Relevant system information

SystemInfo { os: "macOS 15.3 Sequoia", kernel: "24.3.0", cpu: "Apple M4 Max", core_count: "16", memory: "64.0 GiB" }
AdapterInfo { name: "Apple M4 Max", vendor: 0, device: 0, device_type: IntegratedGpu, driver: "", driver_info: "", backend: Metal }

Also failing on linux/mac in the example runner.

What you did

cargo run --example gltf_skinned_mesh

What went wrong

thread 'Compute Task Pool (1)' panicked at examples/animation/gltf_skinned_mesh.rs:61:68:
called `Result::unwrap()` on an `Err` value: QueryDoesNotMatch(12v1 with components Transform, GlobalTransform, Visibility, InheritedVisibility, ViewVisibility, VisibilityClass, ChildOf, PreviousGlobalTransform, Mesh3d, SkinnedMesh, Name, MeshMaterial3d<StandardMaterial>, Aabb)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Encountered a panic in system `gltf_skinned_mesh::joint_animation`!
Encountered a panic in system `bevy_app::main_schedule::Main::run_main`!
@rparrett rparrett added C-Bug An unexpected or incorrect behavior S-Needs-Triage This issue needs to be labelled labels Feb 7, 2025
@rparrett
Copy link
Contributor Author

rparrett commented Feb 9, 2025

It seems that the order of the children within the spawned hierarchy has changed.

// Bevy 0.15.2
8v1
 9v1
  GltfNode0
   Mesh
   GltfNode1
    GltfNode2

// Bevy main
8v1
 9v1
  GltfNode0
   GltfNode1
    GltfNode2
   Mesh

So the following from gltf_skinned_mesh is no longer true:

// First joint is the second child of the mesh node.

So the following patch fixes the example:

         // First joint is the second child of the mesh node.
-        let first_joint_entity = mesh_node_parent[1];
+        let first_joint_entity = mesh_node_parent[0];
         // Get `Children` in the first joint.
         let first_joint_children = parents.get(first_joint_entity).unwrap();

However, if I'm understanding the situation right, this may be pointing to some sort of bug in scene spawning or entity cloning that we want to fix.

@rparrett
Copy link
Contributor Author

rparrett commented Feb 9, 2025

cc @cart

@mockersf mockersf added this to the 0.16 milestone Feb 9, 2025
@cart
Copy link
Member

cart commented Feb 11, 2025

Nice catch!

This became visible in #17687 because that fixed the "duplicate entities on spawn" bug introduced in #17398. But the reordering was actually introduced in #17398.

The problem is that we're spawning a flat (arbitrarily ordered) list of entities in the SceneSpawner (ex: iter archetypes, spawn each entity in the archetype). We ignore the Children component, meaning we are spawning entities, some of which have ChildOf, and letting the arbitrary spawn order determine the final order of the Children. Definitely not shippable behavior.

The original RelationshipTarget is the only thing holding the order information. We cannot discard it.

Some options:

  1. Spawn hierarchically, which would ensure the order is preserved (ex: spawn starting from a list of roots, and then trigger spawns for all relationshiptarget components with LINKED_SPAWN). This would largely fix the problem. But for any non-LINKED_SPAWN relationships, their order would still be lost (and therefore this isn't a compelling option). And spawning in archetype order also has better memory access patterns in theory (although its hard to say if we're actually getting benefits from that cache friendliness here).
  2. We could do a final "fixup pass" where we replace all relationship target entities with a mapped version of their original collection. This has the downside of doing all of the "setup code" with the wrong order, then doing additional work to fix it up retroactively. This overhead puts it in the "last-resort-hack" category for me.
  3. We could spawn the mapped version of the collection, and then somehow suppress the Relationship (ex: ChildOf) insert hook logic to avoid inserting the out-of-order duplicates. This could be accomplished efficiently with a flag on World or something, which we set when spawning / cloning and reset when finished.
  4. We could "reverse" the process: spawn a mapped version of the RelationshipTarget collection, fully ignore the Relationship component (to avoid inserting duplicates), and then drive the inserts of the Relationship component via the RelationshipTarget collection. This has the downside of forcing archetype moves unnecessarily (spawning without ChildOf and then inserting later is an archetype move). The approach on main does not (theoretically) suffer from unnecessary archetype moves, as we can spawn the empty Children (preallocated with the space necessary for each child) and the children are each spawned directly with ChildOf. In practice main still has archetype moves because we're spawning each component one-by-one, but that will change once we switch to using the EntityCloner to spawn.

I was leaning toward (3), as it seemed like our "least bad" option. It preserves order across all relationship types, doesn't introduce overhead (just one more branch ... and in fact it is actually faster because we're skipping the insert code), and it allows us to spawn/clone in arbitrary orders (including cache-efficient orders like iterating archetypes or tables). We can trust that the contents of the RelationshipTarget collections are valid, as they were generated via the normal hooks flows.

Sadly it has a fatal flaw: cloning can trigger hooks/observers, which can trigger arbitrary spawns. In those cases, with a global flag the triggered spawns wouldn't properly maintain their relationships. This means we need this configuration to be properly scoped to the call (ex: add a new set of spawn / insert methods that pass the relevant information to the hook so it can skip only where relevant).

So thats what I'm leaning toward now. If anyone has other ideas let me know.

@alice-i-cecile alice-i-cecile added A-ECS Entities, components, systems, and events A-Scenes Serialized ECS data stored on the disk A-glTF Related to the glTF 3D scene/model format D-Complex Quite challenging from either a design or technical perspective. Ask for help! S-Needs-Design This issue requires design work to think about how it would best be accomplished and removed S-Needs-Triage This issue needs to be labelled labels Feb 11, 2025
@rparrett rparrett changed the title gltf_skinned_mesh example panics Children in spawned gltf scenes are incorrectly ordered Feb 11, 2025
github-merge-queue bot pushed a commit that referenced this issue Feb 11, 2025
# Objective

Fix gltf validation errors in `Fox.glb`.

Inspired by #8099, but that issue doesn't appear to describe a real bug
to fix, as far as I can tell.

## Solution

Use the latest version of the Fox from
[glTF-Sample-Assets](https://github.com/KhronosGroup/glTF-Sample-Assets/blob/main/Models/Fox/glTF-Binary/Fox.glb).

## Testing

Dropped both versions in https://github.khronos.org/glTF-Validator/

`cargo run --example animated_mesh` seems to still look fine.

Before:

```
The asset contains errors.
"numErrors": 126,
"numWarnings": 4184,
```

After:

```
The asset is valid.
"numErrors": 0,
"numWarnings": 0,
```

## Discussion

The 3d testbed was panicking with
```
thread 'main' panicked at examples/testbed/3d.rs:288:60:
called `Result::unwrap()` on an `Err` value: QueryDoesNotMatch(35v1 with components Transform, GlobalTransform, Visibility, InheritedVisibility, ViewVisibility, ChildOf, Children, Name)
```
Which is bizarre. I think this might be related to #17720, or maybe the
structure of the gltf changed.

I fixed it by using updating the testbed to use a more robust method of
finding the correct entity as is done in `animated_mesh`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events A-glTF Related to the glTF 3D scene/model format A-Scenes Serialized ECS data stored on the disk C-Bug An unexpected or incorrect behavior D-Complex Quite challenging from either a design or technical perspective. Ask for help! S-Needs-Design This issue requires design work to think about how it would best be accomplished
Projects
Status: No status
Development

No branches or pull requests

4 participants