Intro­duc­tion

Snow­flake offers an ele­gant data ware­house (DWH) solu­tion deve­lo­ped spe­ci­fi­cally for the cloud. Cus­to­mers do not need to install or main­tain soft­ware or hard­ware, because Snow­flake DWH ope­ra­tes as Soft­ware-as-a-Ser­vice (SaaS) on the main cloud plat­forms (Ama­zon Web Ser­vices, Google Cloud Sto­rage and Micro­soft Azure). Snowflake’s archi­tec­ture is marked by data sto­rage being inde­pen­dent from com­pute resour­ces. This con­cept enables effi­ci­ent usage and sca­ling of both in real-time. Thus, Snow­flake can meet dyna­mi­cally chan­ging requi­re­ments of modern busi­nesses. In this blog article, we will take first steps tog­e­ther towards migra­ting a DWH into Snowflake.

Step 1: Administration

The first task is to arrange the Snow­flake account access, users and roles accor­ding to your busi­ness struc­ture and use case. The access con­trol in Snow­flake uses two models:

  • Dis­cre­tio­nary Access Con­trol (DAC): each object (data­base, schema, table) has an owner; an owner grants access to objects.
  • Role-Base-Access-Con­trol (RBAC): access pri­vi­le­ges are assi­gned to roles; roles are assi­gned to users.

Snow­flake has the fol­lo­wing default sys­tem-defi­ned roles: ACCOUNTADMIN, SECURITYADMIN, SYSADMIN, USERADMIN, and PUBLIC. In addi­tion, Snow­flake allows cus­tom-defi­ned roles.

Easy Takeoff in the Cloud with Snowflake Bild1
Figure 1 Role Hier­ar­chy in Snowflake
Two ACCOUNT­S­AD­MINs are bet­ter than one

When a user crea­tes a Snow­flake account and login for the first time, the user is assi­gned by default the ACCOUNTADMIN role, the most powerful role in a Snow­flake account. Snow­flake recom­mends crea­ting ano­ther user who also has the ACCOUNTADMIN role. This is useful in various cases. For ins­tance, if one account admi­nis­tra­tor for­gets the pass­word, ano­ther account admi­nis­tra­tor can reset the pass­word imme­dia­tely.  Due to secu­rity issues, it may take Snow­flake sup­port up to two busi­ness days to reset the pass­word for a user with the ACCOUNTADMIN role. The ACCOUNTADMIN role is desi­gned to manage objects on the account level and should not be used to create low-level data­base objects (data­base, schema, table, etc.). Snow­flake recom­mends crea­ting a role hier­ar­chy accor­ding to your busi­ness struc­ture ins­tead of doing ever­y­thing with the ACCOUNTADMIN role. The ACCOUNTADMIN role owns the SECURITY and SYSADMIN roles.

SECURITYADMIN

The account admi­nis­tra­tor crea­tes one or seve­ral users with the SECURITYADMIN role. The SECURITYADMIN role has the pri­vi­le­ges to create and modify users/roles. To improve sepa­ra­tion of duties, Snow­flake recently intro­du­ced the USERADMIN role. Alt­hough the SYSADMIN role owns the USERADMIN role, now it is pos­si­ble to sepa­rate role/user manage­ment (SECURITYADMIN) and role/user crea­tion (USERADMIN). In the cur­rent blog, we will skip the USERADMIN role for sim­pli­city, but it can be easily imple­men­ted if requi­red by a use case.

SYSADMIN

The secu­rity admi­nis­tra­tor crea­tes one or seve­ral users with the SYSADMIN role and cus­tom roles. The SYSADMIN role has the pri­vi­le­ges to create vir­tual warehou­ses, data­ba­ses, and any data­base objects (sche­mas, tables, etc.). The cus­tom roles in turn are defi­ned accor­ding to your busi­ness struc­ture and use case. Snow­flake does not put any limi­ta­ti­ons on the maxi­mum num­ber of cus­tom roles.

Sam­ple user/role Use Case

In the illus­tra­tion below, we demons­trate a sam­ple user/role hier­ar­chy tog­e­ther with the SQL com­mands to be used in a Snow­flake work­s­heet to create the users and roles.

First, using the ACCOUNTADMIN role, we create ano­ther account admi­nis­tra­tor with the name COMPANY_CHEF. Snow­flake recom­mends that the account admi­nis­tra­tors use the multi-fac­tor authen­ti­ca­tion und a default role dif­fe­rent from the ACCOUNTADMIN. The default role is the role auto­ma­ti­cally assi­gned to a user after each new login into the Snow­flake account. Note that assig­ning a default role to a user does not auto­ma­ti­cally grant this role. Then, we create a user with the name SECURITY_OFFICER and the SECURITYADMIN role, that in turn creates

  • the DWH_AMINISTRATOR user with the SYSADMIN role and
  • the roles and users for dif­fe­rent busi­ness units (DB_TEAM, DATASCIENCE_TEAM, BI_TEAM).

The new cus­tom roles are also gran­ted to the SYSADMIN role, i.e. DWH_ADMINISTRATOR.

Easy Takeoff in the Cloud with Snowflake Bild2
Figure 2 Sam­ple Users
Con­trol Your Usage

In order to pre­vent usage expen­ses excee­ding the limit, Snow­flake offers resource moni­tors. A resource moni­tor helps con­trol cre­dit usage for run­ning vir­tual warehou­ses in your account. When a cer­tain cre­dit limit is excee­ded, a noti­fi­ca­tion or other action can be trig­ge­red. The moni­tor can be auto­ma­ti­cally reset on a peri­odic basis. In the exam­ple shown here, we define a resource moni­tor that starts imme­dia­tely after crea­tion with the limit of 200 cre­dits. The moni­tor is reset monthly. When 90 per­cent of cre­dits are spent, the account admi­nis­tra­tor will receive a noti­fi­ca­tion. Note that only the ACCOUNTADMIN role can create a resource moni­tor. We will use the moni­tor ESTIMATED_LIMIT_PER_MONTH in the next step when desig­ning a vir­tual warehouse.

Easy Takeoff in the Cloud with Snowflake Bild3
Figure 3 Resource Monitor

Step 2: Vir­tual Ware­house und Data­base Organization

Now, it is time to start arran­ging objects nee­ded to work with data.

Vir­tual Warehouse

Vir­tual ware­house (VWH) is a com­pute clus­ter to per­form SQL ope­ra­ti­ons. Snow­flake offers various sizes of vir­tual warehou­ses. Snow­flake recom­mends begin­ning with a small clus­ter and then exten­ding it while moni­to­ring query per­for­mance. In addi­tion, it is good prac­tice to assign sepa­rate VWH for dif­fe­rent busi­ness units. In this blog, we will stick to a sin­gle VWH exam­ple.  We create a VWH with the extra-small size (1 cre­dit per hour). If the workload for our clus­ter is too much (which is defi­ned by the sca­ling policy, in our case ECONOMY), Snow­flake auto­ma­ti­cally spins up one addi­tio­nal clus­ter (MAX_CLUSTER_COUNT=2). When the workload decrea­ses, Snow­flake auto­ma­ti­cally shuts down the second cluster.

Easy Takeoff in the Cloud with Snowflake Bild4
Figure 4 Vir­tual Ware­house Creation
Database/Schema

The last step is to create a data­base and schema for deve­lo­p­ment acti­vi­ties. By default, all data­base objects in Snow­flake have the PERMANENT type. For per­ma­nent objects, Snow­flake has a fail-safe region that saves backup infor­ma­tion for 7 days. If your data gets cor­rupted, you can cont­act Snow­flake sup­port to reco­ver your data from the fail-safe region for the last 7 days. We create a tran­si­ent data­base and schema which means that we do not have fail-safe costs rela­ted to these objects. This is a good option for deve­lo­p­ment, but not for pro­duc­tion. After we gran­ted pri­vi­le­ges to the users, they can start ope­ra­ting in their accounts.

Easy Takeoff in the Cloud with Snowflake Bild5
Figure 5 Data­base and Schema Creation

Next Steps

During the next steps, the users can load data and estab­lish con­nec­tions to other tools and ser­vices alre­ady used by your business:

  • AWS S3
  • Micro­soft Azure Blob
  • Google Cloud Storage
  • Infor­ma­tica
  • Kafka
  • Dat­ab­ricks
  • MicroStra­tegy
  • and many others.

We covered the first steps requi­red for migra­ting a DWH into Snow­flake. Snow­flake offers many other fea­tures that we did not men­tion here. For more infor­ma­tion, see our com­pa­n­ion blogs on Snowflake.