Loading Empty Dataframe crashes Snowflake Pandas IO Manager #24172

Open
reinthal opened this issue Sep 3, 2024 · 3 comments · May be fixed by #24188
Labels
integration: snowflake (Related to dagster-snowflake), type: bug (Something isn't working)

Comments

reinthal commented Sep 3, 2024

Dagster version

1.8.0

What's the issue?

When SnowflakePandasIOManager is passed an empty dataframe with no schema (no columns), it crashes at line 117 of snowflake_pandas_type_handler.py, in the call to write_pandas:

snowflake.connector.errors.ProgrammingError: 001003 (42000): SQL compilation error:
syntax error line 1 at position 65 unexpected ')'.
  File "/app/.venv/lib/python3.10/site-packages/dagster/_core/execution/plan/utils.py", line 54, in op_execution_error_boundary
    yield
  File "/app/.venv/lib/python3.10/site-packages/dagster/_utils/__init__.py", line 474, in iterate_with_context
    next_output = next(iterator)
  File "/app/.venv/lib/python3.10/site-packages/dagster/_core/execution/plan/execute_step.py", line 751, in _gen_fn
    gen_output = output_manager.handle_output(output_context, output.value)
  File "/app/.venv/lib/python3.10/site-packages/dagster/_core/storage/db_io_manager.py", line 147, in handle_output
    handler_metadata = self._handlers_by_type[obj_type].handle_output(
  File "/app/.venv/lib/python3.10/site-packages/dagster_snowflake_pandas/snowflake_pandas_type_handler.py", line 117, in handle_output
    write_pandas(
  File "/app/.venv/lib/python3.10/site-packages/snowflake/connector/pandas_tools.py", line 402, in write_pandas
    cursor.execute(create_table_sql, _is_internal=True)
  File "/app/.venv/lib/python3.10/site-packages/snowflake/connector/cursor.py", line 938, in execute
    Error.errorhandler_wrapper(self.connection, self, error_class, errvalue)
  File "/app/.venv/lib/python3.10/site-packages/snowflake/connector/errors.py", line 290, in errorhandler_wrapper
    handed_over = Error.hand_to_other_handler(
  File "/app/.venv/lib/python3.10/site-packages/snowflake/connector/errors.py", line 345, in hand_to_other_handler
    cursor.errorhandler(connection, cursor, error_class, error_value)
  File "/app/.venv/lib/python3.10/site-packages/snowflake/connector/errors.py", line 221, in default_errorhandler
    raise error_class(

The error is raised when the handler tries to load the empty frame into Snowflake.
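
A likely cause, judging from the traceback, is that the CREATE TABLE statement is generated from the dataframe's columns, so a frame with no columns produces an empty column list and the parser rejects the closing parenthesis. The sketch below only illustrates that idea. `build_create_table_sql` is a hypothetical helper written for this example, not the connector's actual code, and the real statement emitted by write_pandas may look different.

```python
from typing import Mapping


def build_create_table_sql(table_name: str, columns: Mapping[str, str]) -> str:
    """Hypothetical helper: build a CREATE TABLE statement from column names and types."""
    column_defs = ", ".join(f'"{name}" {sql_type}' for name, sql_type in columns.items())
    return f"CREATE TABLE IF NOT EXISTS {table_name} ({column_defs})"


# With columns, the statement has a non-empty column list.
with_columns = build_create_table_sql("my_table", {"id": "NUMBER", "name": "TEXT"})

# With no columns, the column list is empty and the statement ends in "()",
# which a SQL parser would reject at the closing parenthesis.
no_columns = build_create_table_sql("my_table", {})

print(with_columns)
print(no_columns)
```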

What did you expect to happen?

When the dataframe has no data and no schema, snowflake_pandas_type_handler.handle_output should skip the write_pandas call.
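
The expected behaviour could be sketched roughly as follows. This is not the code from the linked PR. `has_no_schema` and `handle_output_sketch` are hypothetical names, and `write_fn` stands in for the real write_pandas call so the example runs without a Snowflake connection.

```python
from typing import Any, Callable, Optional

import pandas


def has_no_schema(df: pandas.DataFrame) -> bool:
    """Return True when the frame has no columns, so there is no table schema to create."""
    return len(df.columns) == 0


def handle_output_sketch(
    df: pandas.DataFrame, write_fn: Callable[[pandas.DataFrame], Any]
) -> Optional[Any]:
    """Hypothetical guard: skip the write entirely when the frame has no columns."""
    if has_no_schema(df):
        return None
    return write_fn(df)


written = []  # records every frame that would have been sent to Snowflake

handle_output_sketch(pandas.DataFrame([]), written.append)               # skipped
handle_output_sketch(pandas.DataFrame(columns=["a"]), written.append)    # written: empty but has a schema
handle_output_sketch(pandas.DataFrame({"a": [1, 2]}), written.append)    # written
```

The guard checks columns rather than rows, so a frame with zero rows but a defined schema is still written and can still create the table.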

How to reproduce?

The following integration test reproduces the error:

@pytest.mark.skipif(not IS_BUILDKITE, reason="Requires access to the BUILDKITE snowflake DB")
@pytest.mark.parametrize(
    "io_manager", [(old_snowflake_io_manager), (pythonic_snowflake_io_manager)]
)
@pytest.mark.integration
def test_io_manager_with_snowflake_pandas_empty_data(io_manager):
    with temporary_snowflake_table(
        schema_name=SCHEMA,
        db_name=DATABASE,
    ) as table_name:
        # Create a job with the temporary table name as an output, so that it will write to that table
        # and not interfere with other runs of this test

        @op(out={table_name: Out(io_manager_key="snowflake", metadata={"schema": SCHEMA})})
        def emit_pandas_df(_):
            return pandas.DataFrame([])

        @op
        def read_pandas_df(df: pandas.DataFrame):
            # {} is an empty dict, not an empty set, so compare against set()
            assert set(df.columns) == set()
            assert len(df.index) == 0

        @job(
            resource_defs={"snowflake": io_manager},
        )
        def io_manager_test_job():
            read_pandas_df(emit_pandas_df())

        res = io_manager_test_job.execute_in_process()
        assert res.success

Deployment type

None

Deployment details

No response

Additional information

No response


@reinthal reinthal added the type: bug Something isn't working label Sep 3, 2024
reinthal commented Sep 3, 2024

Possibly related issue #23571

reinthal commented Sep 3, 2024

See PR #24188

reinthal commented Sep 3, 2024

Thanks to @Laxi-luke and @AntonMaxen for working on this
