bug: Schema not found the second time a spark table is accessed #10749
Comments
Thanks for the issue! Can you show a fully reproducible example that I can copy-paste into a file or a Python REPL? The
bits seem critical to reproducing the problem for example. Toy data is completely fine.
Sure! Please let me know if this one is good enough:
import ibis
from pyspark.sql import SparkSession
session = SparkSession.builder.getOrCreate()
con = ibis.pyspark.connect(session)
# Could not use the following: 'force' is not supported in 'create_table' and does not seem to work in 'create_database'.
# con.create_database(
# "my_database",
# force=True,
# )
# con.create_table(
# "my_table",
# schema=ibis.schema([("id", "string")]),
# database="my_database",
# force=True,
# )
con.sql("CREATE DATABASE IF NOT EXISTS my_database")
con.sql("""
    CREATE TABLE IF NOT EXISTS my_database.my_table (
        id string
    )
""")
new_ids = ibis.memtable([{
'id': 'my_id',
}])
con.insert(
"my_table",
new_ids,
database="my_database",
)
new_ids_2 = ibis.memtable([{
'id': 'my_id_2',
}])
con.insert(
"my_table",
new_ids_2,
database="my_database",
)
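To pin down which access fails, the repeated calls can be wrapped in a small harness. The following is only a sketch: `probe_repeated_access` is a hypothetical helper, not part of ibis or PySpark, and it assumes the failing operation can be expressed as a zero-argument callable.

```python
from typing import Any, Callable, List, Tuple


def probe_repeated_access(
    access: Callable[[], Any], attempts: int = 2
) -> List[Tuple[int, str]]:
    """Call `access` several times and record the outcome of each attempt.

    Returns a list of (attempt number, "ok" or "<ExceptionType>: <message>").
    """
    results = []
    for attempt in range(1, attempts + 1):
        try:
            access()
            results.append((attempt, "ok"))
        except Exception as exc:  # deliberately broad: we only report the failure
            results.append((attempt, f"{type(exc).__name__}: {exc}"))
    return results


# Hypothetical usage against the repro above (needs ibis and a Spark session):
# print(probe_repeated_access(lambda: con.get_schema("my_table", database="my_database")))
# print(probe_repeated_access(lambda: con.table("my_table", database="my_database")))
```

If the first attempt reports "ok" and the second reports an error, that matches the behaviour described in this issue.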
Hey @christophediprima! The
There does seem to be a bug here, which is that
Found the issue, I'll put up a PR.
Thanks for your replies. I used .sql as a workaround so I could provide a full working example that reproduces the bug I initially reported. Did you find the solution for my initial bug report or
(Although that's with a local Spark session, not using Iceberg.)
What happened?
I am trying to insert into two different Iceberg tables using Spark Connect. But as soon as I access a table using get_schema, table or insert, I won't be able to do it a second time.
My code looks like this:
The first insert will work but not the second one. If the first insert is commented out, the second one will work. The same thing happens regardless of whether the get_schema, table or insert method is used.
What version of ibis are you using?
ibis-framework[pyspark]==10.0.0.dev490
What backend(s) are you using, if any?
PySpark
Relevant log output