SNOW-828775: Get Scala session and Dataframe from Java #34

Open
sonalgoyal opened this issue May 27, 2023 · 6 comments

Comments

@sonalgoyal

Our code relies on Java (our own code) and Scala libraries (ML, graph), and it would be very helpful to be able to convert the Java Dataframe and Session to their Scala counterparts so that we can use both interoperably. I see that com.snowflake.snowpark_java.Dataframe already has getScalaDataframe(), but it is package scoped. So is com.snowflake.snowpark_java.getScalaSession.

Is it possible to expose these methods publicly?

@github-actions github-actions bot changed the title Get Scala session and Dataframe from Java SNOW-828775: Get Scala session and Dataframe from Java May 27, 2023
@sfc-gh-jfreeberg
Collaborator

Hi @sonalgoyal, could you include a code snippet or pseudo-code to help describe your scenario? I'm not sure I follow. Since Java is the common denominator in your project, you should be able to use the Java Snowpark library throughout, no?

@sonalgoyal
Author

sonalgoyal commented May 31, 2023

@sfc-gh-jfreeberg We use Java predominantly in our stack to transform the data, so we have a Java Dataframe. We also have a graph library, provided to us by the Snowflake team, which is written in Scala and takes Scala Dataframes as input. We do not have a way to invoke the Scala library from Java, because the Java DFs cannot be passed to it directly. Hence we write the Java DF to a temp table and then read it back in Scala to get a Scala Dataframe.

If we could get a handle to the underlying Scala Dataframe from Java, we could pass it to the graph library, convert the resulting Scala DF back to a Java DF, and use it in our flow.

@sfc-gh-mrui
Contributor

@sonalgoyal Thanks for explaining the use case. Since these APIs are package scoped, I assume you can work around this easily by adding a utility class in a package with the same name and creating a public function that returns the Scala DataFrame/Session for a Java DataFrame/Session.
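
A minimal sketch of the same-package pattern this suggestion relies on, using stand-in classes rather than the real Snowpark types. `ScalaFrame`, `JavaFrame`, `getScalaFrame` and `ScalaBridge` are all hypothetical names chosen for illustration. In a real project the utility would presumably be declared in the com.snowflake.snowpark_java package and call the package-scoped methods mentioned in this issue, whose exact names and signatures should be checked against the Snowpark source.

```java
// Stand-in for the Scala-side DataFrame (hypothetical, not the real Snowpark class).
final class ScalaFrame {
    final String plan;

    ScalaFrame(String plan) {
        this.plan = plan;
    }
}

// Stand-in for the Java wrapper, which keeps its accessor package scoped.
final class JavaFrame {
    private final ScalaFrame underlying;

    JavaFrame(ScalaFrame underlying) {
        this.underlying = underlying;
    }

    // No access modifier: callable only from classes in the same package.
    ScalaFrame getScalaFrame() {
        return underlying;
    }
}

// Utility placed in the same package as JavaFrame. It can call the
// package-scoped accessor and hand the result out through a public method.
// In its own source file this class would be declared public.
final class ScalaBridge {
    private ScalaBridge() {}

    public static ScalaFrame toScala(JavaFrame df) {
        return df.getScalaFrame();
    }
}
```

Callers in other packages would then go through `ScalaBridge.toScala(df)` and never touch the package-scoped accessor directly. The sketch only shows the access-control mechanics; it does not cover converting a Scala DataFrame back into a Java one.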

@sonalgoyal
Author

@sfc-gh-mrui Thanks for your suggestion. This workaround is not optimal: we do not want to write code in a namespace we do not own, and if the Snowpark code changes, our codebase is affected. Hence the request for a public API.

@sfc-gh-jfreeberg
Collaborator

@sonalgoyal Which graph library is this? Did you say it was provided by Snowflake?

@sonalgoyal
Author

sonalgoyal commented Jun 17, 2023

We got it from Stuart Ozer and Robert Fehrmann from the Snowflake team. @sfc-gh-jfreeberg
