January 2022 – DataLyseis

Snowpark is a nice addition to the suite of features now available in Snowflake. With Snowpark, we can execute programs in Snowflake without extracting the data to another environment (think Spark clusters, a local desktop, etc.); instead, the program runs in Snowflake and we get the results back. So the obvious question is: how does this work internally?

Snowpark internally runs on Docker containers that are ready to be accessed from virtual warehouses, so this complexity is hidden from us. We are essentially running our code through the Snowpark library, which makes this possible. Just like Spark, Snowpark takes advantage of lazy evaluation: the entire set of operations is executed only when an action is taken on the object. Also like Spark, a set of containers performs these operations in the cloud without moving the data to your local machine. We are essentially moving the code to the cloud rather than moving the data to the code. This is a powerful feature, especially when dealing with a lot of data: we don't want to be moving data in and out of the cloud for every transformation.
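To make the lazy evaluation idea concrete, here is a small toy sketch in plain Python. It is not Snowpark's API: the `LazyFrame` class, the `load_orders` function, and the sample rows are all made up for illustration. It only shows the general pattern, where transformations record a plan and nothing touches the data until an action (`collect` here) is called.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class LazyFrame:
    """Toy stand-in for a lazily evaluated DataFrame (hypothetical, not Snowpark)."""

    source: Callable[[], Iterable[dict]]  # how to fetch the rows, when needed
    plan: tuple = ()                      # recorded transformations, in order

    def filter(self, predicate):
        # Transformation: only records the step, the data is not read yet.
        return LazyFrame(self.source, self.plan + (("filter", predicate),))

    def select(self, *columns):
        # Transformation: again, just extends the recorded plan.
        return LazyFrame(self.source, self.plan + (("select", columns),))

    def collect(self):
        # Action: the data is read and the whole plan runs in one go.
        rows = list(self.source())
        for kind, arg in self.plan:
            if kind == "filter":
                rows = [r for r in rows if arg(r)]
            elif kind == "select":
                rows = [{c: r[c] for c in arg} for r in rows]
        return rows


reads = []  # records every time the (made-up) data source is actually read


def load_orders():
    reads.append("read")
    return [
        {"id": 1, "amount": 40, "region": "EU"},
        {"id": 2, "amount": 75, "region": "US"},
        {"id": 3, "amount": 120, "region": "EU"},
    ]


orders = LazyFrame(load_orders)
big_orders = orders.filter(lambda r: r["amount"] > 50).select("id", "amount")

reads_before_action = len(reads)  # the source has not been read at this point
result = big_orders.collect()     # the action triggers the read and the plan
print(reads_before_action, len(reads), result)
```

In this toy version the plan runs locally when `collect` is called. The point of Snowpark is that the equivalent work happens inside Snowflake, next to the data, and only the results come back.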

Subsequent posts will cover how to use Snowpark.


A quick primer on Snowpark
