Ultimate 2023 Guide to Snowflake Interview Mastery: 100+ Essential Questions & Answers for Both Novices and Experienced Professionals(Part:-3)
Welcome again to our series on Snowflake Interview Questions & Answers! For a comprehensive understanding, it’s advisable to go through Part 1 and Part 2 before diving into this segment. This will ensure you’re well-prepared for your upcoming Snowflake interviews.

1. What is the stream?
A stream is an object in a snowflake used for change data capturing.
Types of streams
Standard Streams: Capture all changes(INSERT, DELETE, UPDATE)
Append only Streams: Capture only insert related records
Insert only Streams: Can be created only on external tables
2. What are the additional fields/columns that will get added in streams?
Three additional fields will get added in streams namely
METADATA$ACTION
METADATA$ISUPDATE
METADATA$ROW_ID
3. How will the stream behave in case of an update on the source table?
The stream will put 2 entries in your stream
It will delete old records and insert a new record
4. What is the alternative for streams in Snowflake?
Enable change tracking on tables and use a changes clause as an alternative for streams
5. What are the limitations of Tasks ?
We can be able to call or run only one SQL statement in a Task. Earlier, we happened to specify which warehouse to use for that manually, where now recently Snowflake has introduced server-less tasks (Snowflake Managed tasks)
6. Does Snowflake enforce constraints?
We can create constraints but they won’t be enforced except NOT NULL constraint
7. What are the different caches in Snowflake?
There are three types of caches that exist in Snowflake namely
Result Cache
Metadata Cache
Warehouse Cache
8. If you rerun the same query next time where will the query get data from?
The result cache is persisted for 24 hours unless the base object is changed. If the same query is rerun again, the results will be retrieved from the result cache.
9. How many days the Result cache is valid?
If the query is rerun after 23 hours that will be extended by another 24 hours
If the underlying base table data changes, then the result cache gets invalidated
10. What is the maximum retention period?
90 days is the maximum retention period available in Enterprise and above editions.
11. How to load historical data/old data/existing data in the cloud to Snowflake using Snowpipe?
ALTER PIPE REFRESH command will load the last 7 days of historical data. Use copy command to load data older than 7 days (i.e say the last 6 months data)
12. Will Snowpipe load duplicate files (same filename) into the table?
No, it will maintain metadata information for 14 days
13. How snow pipe copy command is different from the regular copy command?
Regular COPY Command:
==>The default value for ON_ERROR is ABORT_STATEMENT
==>It doesn’t support FORCE=TRUE
==>It stores the metadata for 64 days
Snowpipe COPY Command:
==>The default value for ON_ERROR is SKIP_FILE
==>It does not support FORCE=TRUE
==>It stores the metadata for 14 days
14. What is cloning and what will happen when we clone an object?
Can we perform DML operations on cloned tables? How will it impact original tables?
Cloning is nothing but taking a snapshot of any object without actually copying its data. Both original and cloned tables behave as independent tables and we can perform DML on any of these tables
15. For what purpose did you use Cloning? To create Lower environments (DEV, STAGE/QA) from production, we can use cloning.
16. What will happen if we clone a database? Internal stages, snow pipes referring to internal stages won’t be cloned. All the database objects will get cloned.
17. What is Data Sharing? Secure data sharing is one of the features in Snowflake which enables sharing selected objects (tables, secure views, secure UDFs) in a database from one account to another snowflake account.
18. What are the different types of Data shares? Data shares in Snowflake are of two types INBOUND SHARE: This share will automatically appear in CONSUMER Account and we can create the shared database. OUTBOUND SHARE: This share has to be created on the Provider account to share the data with the consumer.
19. Can we perform DML operations on objects in the Data shared database? No, shared databases are read-only. We cannot perform DML operations on the shared databases.
20. What is clustering in Snowflake? Clustering is a method used to maximize performance to avoid the unnecessary scanning of micro-partitions.
21. How can we improve the performance of queries? Increase the warehouse size to improve the performance of the query. If the table is very huge create a clustering key on the field that is used in joins, filter conditions
22. How do we check the status of the snowpipe? The status of the snowpipe can be viewed by running the following command SELECT SYSTEM$PIPE_STATUS(‘
’)
23. How to set up auto-refresh for snowpipe? With Auto_ingest = TRUE in the pipe and SQS/SNS notification from AWS, we can refresh the pipe automatically.
24. What major challenges that you faced in creating stored procedures in Snowflake? Data type issues are the challenges faced while creating stored procedures since some Javascript data types are not compatible with SQL data types like timestampntz to handle that snowflake has a special data type => sfdate
25. How do we do error handling in stored procedures? Using the Try catch block, we can handle errors in stored procedures.
26. Can we return a table in the snowflake stored procedure? No, the stored procedure returns a single row and single column, If we have to return a table then we can use table-valued function
27. How data will be stored in Snowflake? Data stored in snowflake in compressed columnar format in micro-partitions each of size 50–500MB
28. How do we establish the connection between s3 to Snowflake? By creating storage integration and an external stage, we can integrate Snowflake with AWS S3.
29. How to set up auto-refresh for external tables? Set AUTO_ refresh =TRUE on the external tables and set up SQS notification on AWS S3 to set up auto-refresh for external tables.
30. How do we load semi-structured data into Snowflake? We have a special data type called variant which can hold semi-structured data and we can load nested JSON into relational tables using the lateral flatten function
31. What is the output of Select try_to_number(‘100’), try_to_number(‘15_A’)? 100, null
32. How the sequence in snowflake behaves in the following Select SEQ1.nextval, SEQ1.nextval; The output of the given statement is 1,2 Whereas in RDBMS it will be 1,1
33. Is TRUNCATE a DDL or DML command in Snowflake? INSERT, UPDATE, DELETE, MERGE and TRUNCATE are DML commands in Snowflake. Truncate is a DDL command in Snowflake
34. Write a query to get the complete name of an employee from the first name, middle name and last name?
Select coalesce(first_name,middle_name,last_name) from employee;
35. What is the output of the following query SELECT CONCAT(123,NULL); NULL
36. What is the default value for MAX_CONCURRENCY_LEVEL ? The maximum number of concurrent or parallel statements a warehouse can execute is called MAX_CONCURRENCY_LEVEL. The default value for MAX_CONCURRENCY_LEVEL is 8 37. What are the different types of connectors supported by snowflakes? There are different types of connectors that snowflakes support. • JDBC Driver • ODBC Driver • Node.js Driver • Python Connector • .NET Driver • Spark Connector (spark 3.1,3.2,3.3) • Kafka Connector
38. What is the alternative way of capturing DMLs on any object other than streams? Enabling change tracking can help in capturing DML’s on any object.
39. What are the copy options which are not supported in Snowpipe? The following options are not supported for snowpipe • On_error= abort_statement • Force=TRUE/FALSE • Purge=TRUE/FALSE • Validation_mode= return_errors/return_all_errors/return_n_errors • Size_limit=Num
40. How can you resume a staled pipe? If any snowpipe is paused for more than 14 days, it will become stale. A staled snowpipe can be resumed using the following command SELECT SYSTEM$PIPE_FORCE_RESUME(‘pipe_name’,’staleness_check_override’)
41. How can you delete the files explicitly from a stage? Files for internal/external stages can be explicitly deleted as follows remove @%table_stage –to remove from a table stage remove @~ –to remove from a user stage remove @int_stage/ext_stage –to remove from an internal stage/external stage
42. How to disable the result cache in Snowflake? The result cache can be disabled in Snowflake as follows ALTER SESSION SET USE_CACHED_RESULT=FALSE;
43. What is remote disk spillage? How do you resolve this? If the available memory of a virtual warehouse is not sufficient to hold the data, then it starts spilling data to remote storage. This is called remote disk spillage. We can avoid remote disk spillage by resizing the warehouse to a larger one.
44. What is Snowpark?
Snowpark is a new developer experience in Snowflake that allows developers, data engineers, and data scientists to write code in their preferred programming languages and execute that code securely and efficiently within Snowflake’s platform.
45. Which programming languages are supported by Snowpark?
Snowpark currently supports Java and Scala, with plans to support additional languages in the future.
Thank you for reading our guide on Snowflake interview questions. If you go through all three posts(Part-1, Part-2, Part-3) on Snowflake interviews, you’ll be well-prepared to ace any Snowflake interview. Please give it a clap and share it with friends who are getting ready for Snowflake developer or data engineer interviews.
This article was first published on sqlskool.com. Keep an eye out for more updates, and best of luck!
