PyCDS

ORM for the CRMP database.

This ORM includes table mappings for all table-like objects: tables, views, and materialized views.

Note: Schema name:

SQLAlchemy does not make it easy to set the schema name of a declarative base to a value determined at run-time. It must be specified when the metadata for the declarative base is created, and once set it cannot be changed. This happens declaratively, not procedurally, so the value of the name must be determined externally to the code.

The most convenient way to specify the schema name externally is via an environment variable, which in this case is named PYCDS_SCHEMA_NAME. If PYCDS_SCHEMA_NAME is not specified, the default value of crmp is used, making this backward compatible with all existing code. The function get_schema_name() returns this value and is used throughout as the source of the schema name in all contexts (database function definitions, tables, views, materialized views).

Clients of this package must take care to specify PYCDS_SCHEMA_NAME correctly when performing any database operations with it. Otherwise the operations will fail with errors of the form “could not find object X in schema Y”.

pycds.variable_tags(Table)

Apply the database function variable_tags to Table. This is necessarily the table class Variable, but it may not be a direct ref to that class in code, so it is passed as a parameter. Usage is typically select(…, variable_tags(Variable), …).

class pycds.Base(**kwargs)

The base class of the class hierarchy.

When called, it accepts no arguments and returns a new featureless instance that has no instance attributes and cannot be given any.

class pycds.Network(**kwargs)

This class maps to the table which represents various networks of data for the Climate Related Monitoring Program. There is one network row for each data provider, typically a BC Ministry, crown corporation or private company.

class pycds.Contact(**kwargs)

This class maps to the table which represents contact people and representatives for the networks of the Climate Related Monitoring Program.

class pycds.Variable(**kwargs)

This class maps to the table which records the details of the physical quantities which are recorded by the weather stations.

class pycds.Station(**kwargs)

This class maps to the table which represents a single weather station. One weather station can potentially have multiple physical locations (though, few do in practice) and periods of operation

class pycds.History(**kwargs)

This class maps to the table which represents a history record for a weather station. Since a station can potentially (and do) move small distances (e.g. from one end of the airport runway to another) or change the frequency of its observations, this table records the details of those changes.

WARNING: The GeoAlchemy2 Geometry column (attribute the_geom) forces all reads on that column to be wrapped with Postgis function ST_AsEWKB. This may or may not be desirable for all use cases, specifically views. See the GeoAlchemy2 documentation for details.

class pycds.Obs(**kwargs)

This class maps to the table which records the details of weather observations. Each row is one single data point for one single quantity.

class pycds.MetaSensor(**kwargs)
class pycds.TimeBound(**kwargs)

This class maps to a table which records the start and end times for an observation on a variable that spans a changeable time period, rather than a variable that is at a point in time or which spans a fixed, known time period and is represented by a single standardized point in that time period). Variable time periods are typically for climatologies and cumulative precipitations.

class pycds.NativeFlag(**kwargs)

This class maps to the table which records all ‘flags’ for observations which have been flagged by the data provider (i.e. the network) for some reason. This table records the details of the flags. Actual flagging is recorded in the class/table ObsRawNativeFlags.

class pycds.PCICFlag(**kwargs)

This class maps to the table which records all ‘flags’ for observations which have been flagged by PCIC for some reason. This table records the details of the flags. Actual flagging is recorded in the class/table ObsRawPCICFlags.

class pycds.DerivedValue(**kwargs)
class pycds.CollapsedVariables(**kwargs)

This class defines a materialized view that supports the view CrmpNetworkGeoserver, to improve that view’s performance. It provides aggregated information about all variables associated to a given history – which is to say all the variables for which there are observations associated to that history. It exists to speed up critical queries in the backends.

Most columns are self-explanatory; see comments in column definition below. Further explanation for specific columns:

  • unique_variable_tags: The set (rendered as a SQLAlchemy native array) of all results of the variable_tags function applied to each variable associated to the history. It enables us to distinguish stations (histories) that report climatological variables from those which do not, which is critical for some apps. It is a flattened and unique version of the array of all variable tags arrays returned by the function.
  • vars: A string formed by concatenating an “identifier” derived from each value of cell_method to each variable associated to the history (separator: ‘, ‘). It is a legacy value, formerly used both to distinguish stations with climatological variables from those without, and for filtering stations based on the “identifier”. The first usage has been replaced by column unique_variable_tags; the second is regrettably still in use. The identifier is formed by a regex replacement (see CTE aggregated_vars below) that yields a string that is no longer a programming language identifier and is in any case idiosyncratic. For more information, see issue https://github.com/pacificclimate/pycds/issues/180.
class pycds.VarsPerHistory(**kwargs)

This materialized view speeds up the PDP and station data portal by linking variables to stations, rather than needing to query the very large obs_raw table to find what what variables are associated with a station over what timespan.

Compared to the previous version of this view, this version adds the earliest and latest timestamps data is recorded for a variable.

Definition of supporting view: SELECT DISTINCT obs_raw.history_id, obs_raw.vars_id,

min(obs_raw.time) AS start_time, max(obs_raw.time) AS end_time

FROM obs_raw GROUP BY history_id, vars_id;

class pycds.ClimoObsCount(**kwargs)

This class defines a materialized view required for web app performance. It is used to approximate the number of climatologies which will be returned by station selection criteria.

class pycds.StationObservationStats(**kwargs)

This class defines a materialized view required for web app performance. It precomputes some basic statistics about observations by station and history.

class pycds.ObsCountPerMonthHistory(**kwargs)

This class defines a materialized view that is required for web app performance. It is used for approximating the number of observations which will be returned by station selection criteria.

class pycds.CollapsedVariables(**kwargs)

This class defines a materialized view that supports the view CrmpNetworkGeoserver, to improve that view’s performance. It provides aggregated information about all variables associated to a given history – which is to say all the variables for which there are observations associated to that history. It exists to speed up critical queries in the backends.

Most columns are self-explanatory; see comments in column definition below. Further explanation for specific columns:

  • unique_variable_tags: The set (rendered as a SQLAlchemy native array) of all results of the variable_tags function applied to each variable associated to the history. It enables us to distinguish stations (histories) that report climatological variables from those which do not, which is critical for some apps. It is a flattened and unique version of the array of all variable tags arrays returned by the function.
  • vars: A string formed by concatenating an “identifier” derived from each value of cell_method to each variable associated to the history (separator: ‘, ‘). It is a legacy value, formerly used both to distinguish stations with climatological variables from those without, and for filtering stations based on the “identifier”. The first usage has been replaced by column unique_variable_tags; the second is regrettably still in use. The identifier is formed by a regex replacement (see CTE aggregated_vars below) that yields a string that is no longer a programming language identifier and is in any case idiosyncratic. For more information, see issue https://github.com/pacificclimate/pycds/issues/180.
class pycds.DailyMaxTemperature(**kwargs)
class pycds.DailyMinTemperature(**kwargs)
class pycds.MonthlyAverageOfDailyMaxTemperature(**kwargs)
class pycds.MonthlyAverageOfDailyMinTemperature(**kwargs)
class pycds.MonthlyTotalPrecipitation(**kwargs)
class pycds.CrmpNetworkGeoserver(**kwargs)

This view is used by the PDP Geoserver backend for generating station map layers.

Note: In this class (which maps a table to the view), a primary key must be declared (SQLAlchemy requirement). A view cannot have a PK, but this table declaration does not affect the view.

class pycds.HistoryStationNetwork(**kwargs)

This view, as its name suggests, is a convenience view that joins History, Station, and Network tables.

Note: In this class (which maps a table to the view), a primary key must be declared (SQLAlchemy requirement). A view cannot have a PK, but this table declaration does not affect the view.

class pycds.ObsCountPerDayHistory(**kwargs)

This view provides counts of observations grouped by day (date) and history_id

Note: In this class (which maps a table to the view), a primary key must be declared (SQLAlchemy requirement). A view cannot have a PK, but this table declaration does not affect the view.

class pycds.ObsWithFlags(**kwargs)

This view joins Obs with History and Variable.

Note: In this class (which maps a table to the view), a primary key must be declared (SQLAlchemy requirement). A view cannot have a PK, but this table declaration does not affect the view.

pycds.util.compact_join(*parts, separator=' ')

Compact (eliminate falsy values from) arguments and join them with a separator. Typically used to construct a SQL statement with optional parts.

pycds.util.compile_query(statement, bind=None)

print a query, with values filled in for debugging purposes only for security, you should always separate queries from their values please also note that this function is quite slow

pycds.util.ddl_escape(s)

Encode a string so that SQLAlchemy DDL will process it as intended. See https://docs.sqlalchemy.org/en/14/core/ddl.html#sqlalchemy.schema.DDL

pycds.util.snake_case(ident)

Return a snake-case version of a camel-case identifier, e.g., “MyBigDeal” -> “my_big_deal”. Courtesy of http://stackoverflow.com/a/12867228

pycds.util.variable_tags(Table)

Apply the database function variable_tags to Table. This is necessarily the table class Variable, but it may not be a direct ref to that class in code, so it is passed as a parameter. Usage is typically select(…, variable_tags(Variable), …).