Introduction to the SnowEx Database

What we’re gonna attempt to cover

  • Introduction

  • Database Structure/ Contents

  • Forming Useful Queries

  • Examples

  • Exporting Data

  • Group Excerise

  • QGIS Setup

Why a database?

”Dude, I am into pits not bits. What gives?!”

  • Standardizing diverse datasets

  • Cross referencing data

  • Enables GIS functionality

  • Ready for use in your code

  • Provenance!

  • Ready for use in a GIS software like ArcGIS or QGIS!

TL;DR Do less wrangling, do more crunching.

What is it exactly?

  • PostgreSQL database

  • PostGIS extension

  • Supports vector and raster data

  • And a host of GIS operations

What’s in it?

  • Snow pits - Density, hardness profiles, grain types + sizes

  • Manual snow depths - TONS of depths, Can you say spirals?

  • Snow Micropenetrometer profiles - (Subsampled to every 100th)

  • Snow depth + SWE rasters from ASO inc

  • GPR

  • Pit site notes

  • Camera Derived snow depths

  • Snow off DEM from USGS 3DEP at GM

  • And almost all the associated metadata

All this and more is easily indexed, cross referencable, and put into GIS ready formats!

How do I get at this magical box of data?

# Import the connection function from the snowexsql library
from snowexsql.db import get_db

# This is what you will use for all of hackweek to access the db
db_name = 'snow:hackweek@db.snowexdata.org/snowex'

# Using the function get_db, we receive 2 ways to interact with the database
engine, session = get_db(db_name)

1. Using the Engine Object

The engine object returned from the get_db function is not used much in the snowexsql library. It does allow you to use typical SQL strings to interact with the database.

Note: Users who have used python + SQL before will likely be more familiar with this approach. Additionally those who don’t know python but know SQL will also be more comfortable here.

# Form a typical SQL query and use python to populate the table name
qry = "SELECT DISTINCT site_id FROM sites"

# Then we execute the sql command and collect the results
results = engine.execute(qry)

# Create a nice readable string to print the site names using python 
out = ', '.join((row['site_id'] for row in results))

# Print it with a line return for readability
print(out + '\n')
8C18, 1C1, Open 6, 2S4, 2S7, 6N36, Skyway Open, 6S22, 2S6, 9S40, 9N30, 1N1, 9N47, 1N3, 9N29, 8N45, 9N39, 8C35, 1C5, 6N31, 3S33, 5N24, 3N53, 1C7, 7N40, SNB 2, 8N55, County Line Open, 6N18, 5C20, 5N10, 8N52, 2N21, 1N5, 6S32, GML, 1S1, 2C3, 8S41, JPL 1, Forest 14, 6S53, C1, 8C11, Open, 2C13, Open Flat, Joe Wright, 8N54, 7S23, 6N16, 8C31, Open 2, 5S42, Mesa West Open, 8N34, Upper, 6C37, 5S49, 9S39, 1S17, 2S35, 1C8, BA Flux Tower, 8N9, FL2A, 5S31, 8N38, 6S26, Caples Lake, Open 4, 2S46, 8S28, 8C36, 5N15, 2C33, 6N46, 3S14, 6S15, 8N35, Skyway Tree, TLSFL2A, 2N13, 3N26, 1S8, Saddle, Banner Open, 3S47, 3S52, 4N2, 2S9, 9S51, Trench 13, 6C24, Panorama Dome, 5C27, Senator Beck, 2S25, Swamp Angel, FL1B, 9N42, 1N6, JPL 2, 2S11, 2N8, 9N59, 1N7, 8C25, 3S5, 8N58, 9C28, 2S10, 2S45, 5C21, 5S24, 7S50, 2N49, 8C22, Forest 13, 2N14, 9C17, 5N19, 2C9, 5N50, 2N4, Mores Creek Summit, LDP Tree, 1C14, 2C2, CUES, SNB 1, 8S18, Michigan River, 7C15, Irwin Barn, 2S20, 1S12, 6S44, 2S48, 9C19, 9N43, 9N56, 9N44, 8S30, 8C26, 7N57, 3S38, 9C16, 5N11, 6S34, 4N27, Forest 12, 5S21, 2C12, 2N12, 9C23, 1S2, 3N22, 2N48, 2S3, 5S29, 8N25, 2C4, Forest North, 2C6, Forest, 2S37, 2S16, HQ Met Station, 1S13, Alta Collins, 4C30, County Line Tree, Bogus Upper, 2S36, 6S19, 8N37, Forest Flat, Atwater, LDP Open, 9N28, 1N23, Gothic, 6N17, 8N51, 1N20, Forest South, 5N41, 8C32, 5S43, 8N33, 5N32, 6C10, Tower 4, Banner Snotel, 2S27, Mesa West Trees, 6C34, Aspen, 8C29

2. Using the Session Object

The session object allows a user to interact with the database in a pure python form. This approach is called Object Relational Mapping (ORM).

# Import the table classes from our data module which is where our ORM classes are defined 
from  snowexsql.data import SiteData

# Form the query to receive all the site_id from the sites table
qry = session.query(SiteData.site_id).distinct()

# Execute the query and collect the results
results = qry.all()

# Print it with a line return for readability
print(', '.join([row[0] for row in list(results)]))

# Close your session to avoid hanging transactions
session.close()
8C18, 1C1, Open 6, 2S4, 2S7, 6N36, Skyway Open, 6S22, 2S6, 9S40, 9N30, 1N1, 9N47, 1N3, 9N29, 8N45, 9N39, 8C35, 1C5, 6N31, 3S33, 5N24, 3N53, 1C7, 7N40, SNB 2, 8N55, County Line Open, 6N18, 5C20, 5N10, 8N52, 2N21, 1N5, 6S32, GML, 1S1, 2C3, 8S41, JPL 1, Forest 14, 6S53, C1, 8C11, Open, 2C13, Open Flat, Joe Wright, 8N54, 7S23, 6N16, 8C31, Open 2, 5S42, Mesa West Open, 8N34, Upper, 6C37, 5S49, 9S39, 1S17, 2S35, 1C8, BA Flux Tower, 8N9, FL2A, 5S31, 8N38, 6S26, Caples Lake, Open 4, 2S46, 8S28, 8C36, 5N15, 2C33, 6N46, 3S14, 6S15, 8N35, Skyway Tree, TLSFL2A, 2N13, 3N26, 1S8, Saddle, Banner Open, 3S47, 3S52, 4N2, 2S9, 9S51, Trench 13, 6C24, Panorama Dome, 5C27, Senator Beck, 2S25, Swamp Angel, FL1B, 9N42, 1N6, JPL 2, 2S11, 2N8, 9N59, 1N7, 8C25, 3S5, 8N58, 9C28, 2S10, 2S45, 5C21, 5S24, 7S50, 2N49, 8C22, Forest 13, 2N14, 9C17, 5N19, 2C9, 5N50, 2N4, Mores Creek Summit, LDP Tree, 1C14, 2C2, CUES, SNB 1, 8S18, Michigan River, 7C15, Irwin Barn, 2S20, 1S12, 6S44, 2S48, 9C19, 9N43, 9N56, 9N44, 8S30, 8C26, 7N57, 3S38, 9C16, 5N11, 6S34, 4N27, Forest 12, 5S21, 2C12, 2N12, 9C23, 1S2, 3N22, 2N48, 2S3, 5S29, 8N25, 2C4, Forest North, 2C6, Forest, 2S37, 2S16, HQ Met Station, 1S13, Alta Collins, 4C30, County Line Tree, Bogus Upper, 2S36, 6S19, 8N37, Forest Flat, Atwater, LDP Open, 9N28, 1N23, Gothic, 6N17, 8N51, 1N20, Forest South, 5N41, 8C32, 5S43, 8N33, 5N32, 6C10, Tower 4, Banner Snotel, 2S27, Mesa West Trees, 6C34, Aspen, 8C29

Crash Course in Object Relational Mapping (ORM)

Question: How is a database used in pure python?!…Are you down with the O.O.P? The answer is as a Class where each column is mapped to that class as an attribute e.g. obj.attribute AND… in the correct type for python!

Consider the following table:

id

site_id

ground_roughness

0

GML

rough

1

2S27

smooth

2

3S52

smooth

In our python repo we have a made up class SiteData defined to map to this table.

     
    from snowexsql.data import SiteData
    

id

site_id

ground_roughness

SiteData.id

SiteData.site_id

SiteData.ground_roughness

SiteData.id

SiteData.site_id

SiteData.ground_roughness

SiteData.id

SiteData.site_id

SiteData.ground_roughness

If we queried the whole table above using the session object we would get back 3 Sitedata objects in a list. 1 for each row.

[<snowexsql.data.SiteData object at 0x7fcf0f9bea90>, <snowexsql.data.SiteData object at 0x7fcf0f9bec70>, <snowexsql.data.SiteData object at 0x7fcf0f9bed00>]

This at first doesn’t seem useful until you start to use the objects.


print(my_queried_data[0].ground_roughness)
rough

Question

  • How would you access from our list the site_id of the 2nd row?

Recap

You just:

  • Accessed a geodatabase using python

  • Saw two methods for interacting with the db using the snowexsql library

  • Pulled all the unique pit site id numbers from the db

  • Had a high level intro to ORM