Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepared statement optimisation #17497

Open
systay opened this issue Jan 9, 2025 · 0 comments
Open

Prepared statement optimisation #17497

systay opened this issue Jan 9, 2025 · 0 comments

Comments

@systay
Copy link
Collaborator

systay commented Jan 9, 2025

When running queries inside prepared statements, the planner doesn't see values that would be available to it during normal, not-prepared, query execution. Example:

select 1 from user, user_extra where user.id = 1 and user_extra.user_id = 1

produces the plan:

{
  "QueryType": "SELECT",
  "Original": "select 1 from user, user_extra where user.id = 1 and user_extra.user_id = 1",
  "Instructions": {
    "OperatorType": "Route",
    "Variant": "EqualUnique",
    "Keyspace": {
      "Name": "user",
      "Sharded": true
    },
    "FieldQuery": "select 1 from `user`, user_extra where 1 != 1",
    "Query": "select 1 from `user`, user_extra where `user`.id = 1 and user_extra.user_id = 1",
    "Table": "`user`, user_extra",
    "Values": [
      "1"
    ],
    "Vindex": "user_index"
  },
  "TablesUsed": [
    "user.user",
    "user.user_extra"
  ]
}

Unfortunately, when running in prepared statement, the planner sees ? instead of literal values, and won't know that the two sharding keys are pointing to the same value:

select 1 from user, user_extra where user.id = ? and user_extra.user_id = ?

produces:

{
  "QueryType": "SELECT",
  "Original": "select 1 from user, user_extra where user.id = ? and user_extra.user_id = ?",
  "Instructions": {
    "OperatorType": "Join",
    "Variant": "Join",
    "JoinColumnIndexes": "L:0",
    "TableName": "`user`_user_extra",
    "Inputs": [
      {
        "OperatorType": "Route",
        "Variant": "EqualUnique",
        "Keyspace": {
          "Name": "user",
          "Sharded": true
        },
        "FieldQuery": "select 1 from `user` where 1 != 1",
        "Query": "select 1 from `user` where `user`.id = :v1",
        "Table": "`user`",
        "Values": [
          ":v1"
        ],
        "Vindex": "user_index"
      },
      {
        "OperatorType": "Route",
        "Variant": "EqualUnique",
        "Keyspace": {
          "Name": "user",
          "Sharded": true
        },
        "FieldQuery": "select 1 from user_extra where 1 != 1",
        "Query": "select 1 from user_extra where user_extra.user_id = :v2",
        "Table": "user_extra",
        "Values": [
          ":v2"
        ],
        "Vindex": "user_index"
      }
    ]
  },
  "TablesUsed": [
    "user.user",
    "user.user_extra"
  ]
}

To address this issue, the planner could defer planning until the first execution of the prepared statement. During the initial execution, actual values for the bind variables (?) would be available, allowing the planner to determine if the sharding keys point to the same value. Using this information, the planner could create a more optimal plan.

Once the plan is generated, it would be cached. To ensure subsequent executions remain valid, the cache would require a mechanism to validate that the bind variables in future executions match the expected pattern. If the validation fails (e.g., bind variables do not follow the required relationship), the planner could either fall back to a generic plan or replan dynamically. This ensures correctness while optimizing performance for typical cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant