Data Model View

The SWIM modeling storage system is implemented as a document-based NoSQL database managed with MongoDB. This section describes the semi-structured organization of data in our storage system.

Terms

Term Definition
Database A physical container for collections. Each database gets its own set of files on the file system. A single MongoDB server typically has multiple databases.
Collection A grouping of MongoDB documents. A collection is the equivalent of an RDBMS table. A collection exists within a single database.
Document A record in a MongoDB collection and the basic unit of data in MongoDB. Documents are analogous to JSON objects but exist in the database in a more type-rich format known as BSON.
JSON JavaScript Object Notation. A human-readable, plain text format for expressing structured data with support in many programming languages.
String A sequence of characters.
Array A data structure consisting of a collection of elements (values or variables).
Object A JSON data structure.
Any When a field is not constrained to a specific data type.


Sources:
https://docs.mongodb.com/manual/reference/glossary/
https://www.json.org/json-en.html

Collection Catalog

The SWIM modeling database is under constant expansion and refinement as the development process progresses. The platform is currently composed of the following collections:

  • acronym-catalog: Label dictionary to convert model specific acronyms into suitable labels for users.
  • canned-scenarios: Subsets of a model scenario, filtered to the inputs and outputs that answer a specific question.
  • model-catalog: Metadata of registered modeling services in SWIM.
  • transformation-catalog: Metadata of registered transformation services in SWIM.
  • output-catalog: List of output metadata linked to a specific model.
  • options: Adds model parameters to the options menu on the SWIM UI.
  • parameter-catalog: List of model parameter metadata and default values linked to a specific model.
  • private-scenarios: Model scenario executions by registered users that are not available for public view.
  • public-scenarios: Model scenario executions openly available to the public.
  • set-catalog: List of set attributes for models developed with the General Algebraic Modeling System (GAMS).
  • theme-catalog: List of generic scenario themes where one scenario theme can modify one or multiple model inputs.
  • summary-catalog: List of summary objects available per model. These are loaded in the results summary section of a modeling workflow on the SWIM UI.

acronym-catalog

Model source code often contains acronyms or abbreviated words. This naming scheme may be familiar to the model's developers, but other people can find it hard to infer the meaning of each abbreviation. This collection holds multiple acronym dictionaries linked to the different models registered in SWIM, so that the interface can replace acronyms with more meaningful labels and facilitate user understanding.

Primary Representation

Acronym Catalog Figure

Collection Fields

Field Type Description
_id string Dictionary unique identifier.
modelID string Identifier of the model linked to the dictionary.
lang string Target language (English or Spanish).
dictionary Key-Value Array Acronym and label pairs.

JSON Skeleton Representation

{
  "_id": "",
  "modelID": "",
  "lang": "",
  "dictionary": {
    "": "",
    "": ""
  }
}
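
As a minimal sketch of how an interface could use such a dictionary, the Python below looks up the label for an acronym and falls back to the acronym itself when no entry exists. The sample document, its values (including the "en" language code), and the helper name label_for are illustrative assumptions, not part of SWIM.

```python
# Hypothetical acronym-catalog document; every value is invented for illustration.
acronym_doc = {
    "_id": "dict-en-1",
    "modelID": "model-1",
    "lang": "en",
    "dictionary": {"GW": "Groundwater", "SW": "Surface water"},
}


def label_for(acronym, catalog_doc):
    """Return the label for an acronym, or the acronym itself when unmapped."""
    return catalog_doc["dictionary"].get(acronym, acronym)


print(label_for("GW", acronym_doc))   # Groundwater
print(label_for("XYZ", acronym_doc))  # XYZ (no entry, so left unchanged)
```

Falling back to the raw acronym keeps the interface usable when a dictionary is incomplete.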

model-catalog

Stores metadata of scientific models registered in SWIM. This information is used to extract model provenance and the information required to invoke execution through the underlying web services.

Primary Representation

Model Catalog Figure

Collection Fields

Field Type Description
_id string MongoDB auto-generated unique identifier.
info Object array
info.name string
info.description string
info.lang string
dateCreated datetime Date of the first model release.
dateModified datetime Date and time of last modification to the model.
softwareAgent string Software application capable of executing and implementing the model.
license string The license information for the exposed model.
version string Provides the version of the base model.
sponsor string Organization, person or institution that provides funding.
creators Object array Model authors/creators.
creators.name string Creator full name.
creators.department string A division of an organization such as a government, university, business, or shop, dealing with a specific subject.
creators.organization string Creator organization name.
creators.email string Electronic mail address for communication.
creators.city string Named geographical locality.
creators.state string Region associated with the address of an object.
creators.country string The country name associated with the address of an object.
hostServer Object container Server machine where the base model and corresponding software agent is deployed.
hostServer.serverName string DNS name of the server.
hostServer.serverAdress string External IP Address of the server.
hostServer.serverAdmin string Person in charge of administering the server.
hostServer.adminEmail string Administrator email.
hostServer.serverOwner string Organization or person that owns the physical server.
serviceInfo Object container Information regarding the exposed webservice of the model.
serviceInfo.serviceURL string Request URI.
serviceInfo.serviceMethod string HTTP method used to invoke a request (e.g., POST, GET).
serviceInfo.status string
serviceInfo.consumes string A list of MIME types the service can consume (e.g., application/json).
serviceInfo.produces string MIME types produced as a response by the service.
serviceInfo.isPublic boolean Whether the service is available on the web for public use (true or false).
serviceInfo.externalDocs Object array Additional external documentation.

JSON Skeleton Representation

{
  "_id": "",
  "info": [
    {
      "name": "",
      "description": "",
      "lang": ""
    }
  ],
  "dateCreated": "",
  "dateModified": "",
  "softwareAgent": "",
  "license": "",
  "version": "",
  "sponsor": "",
  "creators": [
    {
      "name": "",
      "department": "",
      "email": "",
      "organization": "",
      "city": "",
      "state": "",
      "country": ""
    }
  ],
  "hostServer": {
    "serverName": "",
    "serverAdress": "",
    "serverAdmin": "",
    "adminEmail": "",
    "serverOwner": ""
  },
  "serviceInfo": {
    "serviceURL": "",
    "serviceMethod": "",
    "status": "",
    "consumes": "",
    "produces": "",
    "isPublic": "",
    "externalDocs": [""]
  }
}
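
To illustrate how the serviceInfo fields could drive an invocation, the following Python sketch builds (but does not send) an HTTP request from a model-catalog document. The URL, the scenario payload, and the helper name build_request are placeholders chosen for this example; how SWIM actually calls its services is not specified here.

```python
import json
import urllib.request

# Hypothetical model-catalog excerpt; the URL and values are placeholders.
model_doc = {
    "serviceInfo": {
        "serviceURL": "https://example.org/swim/model/run",
        "serviceMethod": "POST",
        "consumes": "application/json",
        "produces": "application/json",
    }
}


def build_request(model, scenario):
    """Build, without sending, an HTTP request from a model's serviceInfo."""
    info = model["serviceInfo"]
    return urllib.request.Request(
        info["serviceURL"],
        data=json.dumps(scenario).encode("utf-8"),
        method=info["serviceMethod"],
        headers={"Content-Type": info["consumes"], "Accept": info["produces"]},
    )


request = build_request(model_doc, {"name": "demo scenario"})
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would perform the call; that step is omitted because the endpoint above is fictitious.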

transformation-catalog

Stores metadata of registered data transformation services in SWIM.

Collection Fields

Field Type Description
_id string MongoDB auto-generated unique identifier.
info Object array
info.name string
info.description string
info.lang string
dateCreated datetime Date of the first model release.
dateModified datetime Date and time of last modification to the model.
language string Programming language used for artifact.
license string The license information for the exposed model.
version string Provides the version of the base model.
creators Object array Model authors/creators.
creators.name string Creator full name.
creators.department string A division of an organization such as a government, university, business, or shop, dealing with a specific subject.
creators.organization string Creator organization name.
creators.email string Electronic mail address for communication.
creators.city string Named geographical locality.
creators.state string Region associated with the address of an object.
creators.country string The country name associated with the address of an object.
serviceInfo Object container Information regarding the exposed webservice of the model.
serviceInfo.serviceURL string Request URI.
serviceInfo.serviceMethod string HTTP method used to invoke a request (e.g., POST, GET).
serviceInfo.status string
serviceInfo.consumes string A list of MIME types the service can consume (e.g., application/json).
serviceInfo.produces string MIME types produced as a response by the service.
serviceInfo.isPublic boolean Whether the service is available on the web for public use (true or false).

canned-scenarios

A canned scenario is a subset of a model scenario specification. The listing of inputs and outputs is filtered in relation to a question of interest, so only the parameters and variables that help answer the targeted question are presented to the user. This approach aims to introduce the underlying models without overwhelming users at first glance.

Primary Representation

Canned Scenario Figure

Collection Fields

Field Type Description
_id string Unique identifier.
name string Question or subject.
description string Detailed statement about the canned scenario.
modelID string
themeCatalogFilter String Array Listing of theme categories to show on the canned scenario.
parameterFilter String Array Listing of model inputs to show on the canned scenario.
outputFilter String Array Listing of model outputs to show on the canned scenario.

JSON Skeleton Representation

{
  "_id": "",
  "name": "",
  "description": "",
  "modelID": "",
  "themeCatalogFilter": [
    ""
  ],
  "parameterFilter": [
    ""
  ],
  "outputFilter": [
    ""
  ]
}
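
The filtering idea can be sketched in Python as follows. This example assumes that parameterFilter holds paramName values and outputFilter holds varName values; the source does not state which identifiers the filters contain, and the sample data and the helper name apply_canned_filter are hypothetical.

```python
# Hypothetical canned scenario and catalog entries, invented for illustration.
canned = {
    "name": "How does drought affect supply?",
    "parameterFilter": ["rainfall"],
    "outputFilter": ["storage"],
}
parameters = [{"paramName": "rainfall"}, {"paramName": "population"}]
outputs = [{"varName": "storage"}, {"varName": "demand"}]


def apply_canned_filter(canned_doc, params, outs):
    """Keep only the parameters and outputs named in the canned scenario."""
    wanted_params = set(canned_doc["parameterFilter"])
    wanted_outputs = set(canned_doc["outputFilter"])
    kept_params = [p for p in params if p["paramName"] in wanted_params]
    kept_outputs = [o for o in outs if o["varName"] in wanted_outputs]
    return kept_params, kept_outputs


kept_params, kept_outputs = apply_canned_filter(canned, parameters, outputs)
print([p["paramName"] for p in kept_params])  # ['rainfall']
print([o["varName"] for o in kept_outputs])   # ['storage']
```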

User Scenarios (public-scenarios, private-scenarios)

Stores a complete model scenario specification. The specification is composed of scenario metadata, model settings, sets, inputs and outputs. This collection schema is used for the execution of canned scenarios, public scenarios and private scenarios.

Primary Representation

User Scenario Figure

Collection Fields

Field Type Description
_id string MongoDB auto-generated unique identifier.
className string Morphia mapping Java class on the web service.
name string User-defined name of a given projection input.
description string Short description of the projection scenario and its inputs.
startedAtTime datetime Date and time when projection was submitted.
endedAtTime datetime Date and time when model execution finished.
status string Execution state of submitted projection (queued, running, complete, error).
isPublic string Flag to determine if the projection is available for public view. (true or false)
modelSettings Object array General settings of models used for the current scenario.
modelSettings.modelID string Reference to model-catalog document.
modelSets Object array List of Set attributes for GAMS models. Follows the set catalog schema.
modelInputs Object array List of model parameters as model input. Follows the parameter catalog schema.
modelOutputs Object array List of model outputs to be extracted from model results. Follows the output catalog schema.

JSON Skeleton Representation

{
    "_id" : "",
    "className" : "",
    "name" : "",
    "description" : "",
    "userid" : "",
    "startedAtTime" : "",
    "endedAtTime" : "",
    "status" : "",
    "isPublic" : "",
    "modelSettings" : [
        {
            "modelID" : ""
        }
    ],
    "modelInputs" : [{}],
    "modelOutputs" : [{}]
}
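
One way a service could decide where to store a scenario is sketched below in Python. Routing on the isPublic flag is an assumption made for this example, as is the helper name target_collection; the status values come from the field table above.

```python
# Status values listed in the field table for user scenarios.
VALID_STATUSES = {"queued", "running", "complete", "error"}


def target_collection(scenario):
    """Pick a collection name from the scenario's isPublic flag (assumed rule)."""
    if scenario["status"] not in VALID_STATUSES:
        raise ValueError(f"unknown status: {scenario['status']!r}")
    # isPublic is documented as a string, so compare text rather than truthiness.
    is_public = str(scenario["isPublic"]).strip().lower() == "true"
    return "public-scenarios" if is_public else "private-scenarios"


print(target_collection({"status": "complete", "isPublic": "true"}))
print(target_collection({"status": "queued", "isPublic": "false"}))
```

Comparing the text explicitly matters because any non-empty string, including "false", is truthy in Python.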

output-catalog

The output catalog stores metadata for each variable that will be extracted from the results of a scenario execution. Once the scenario has executed, the values taken by the variable are appended to the model scenario document along with the metadata from this collection.

Primary Representation

Output Catalog Figure

Collection Fields

Field Type Description
_id string Unique identifier.
modelID string Model identifier referenced for this output.
varName string Unique variable name on the target model.
varBenchmarks Object Array Listing of benchmark metadata and values to provide additional context to the output.
varBenchmarks.benchmarkLabel string A descriptive name for the presented benchmark.
varBenchmarks.benchmarkLang string Target language of the benchmark (English or Spanish).
varBenchmarks.benchmarkDescription string Detailed statement about the benchmark.
varBenchmarks.benchmarkSource string Provenance of the benchmark value.
varBenchmarks.benchmarkValue number A quantity or number of the current benchmark.
varinfo Object Array Output metadata in different languages.
varinfo.varLabel string Descriptive term of the output variable.
varinfo.varCategory string Classification of the output variable.
varinfo.lang string Output metadata language (English or Spanish).
varinfo.varDescription string Detailed statement about the output variable.
varinfo.varUnit string Unit of measure used for the output values.

JSON Skeleton Representation

{
  "_id": "",
  "modelID": "",
  "varName": "",
  "varBenchMarks": [
    {
      "benchmarkLabel": "",
      "benchmarkLang": "",
      "benchmarkDescription": "",
      "benchmarkSource": "",
      "benchmarkValue": 1
    }
  ],
  "varinfo": [
    {
      "varLabel": "",
      "varCategory": "",
      "lang": "",
      "varDescription": "",
      "varUnit": ""
    }
  ]
}
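
Because varinfo holds one entry per language, a client has to select the entry that matches the user's language. The Python sketch below shows one possible selection with a fallback. The language codes "en" and "es", the sample values, and the helper name localized_info are assumptions for illustration.

```python
# Hypothetical output-catalog document; labels and units are invented.
output_doc = {
    "varName": "storage",
    "varinfo": [
        {"lang": "en", "varLabel": "Reservoir storage", "varUnit": "m3"},
        {"lang": "es", "varLabel": "Almacenamiento del embalse", "varUnit": "m3"},
    ],
}


def localized_info(doc, lang, fallback="en"):
    """Return the varinfo entry for lang, or the fallback language entry."""
    by_lang = {entry["lang"]: entry for entry in doc["varinfo"]}
    return by_lang.get(lang, by_lang.get(fallback))


print(localized_info(output_doc, "es")["varLabel"])
print(localized_info(output_doc, "fr")["varLabel"])  # no "fr" entry, uses "en"
```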

options

Interface options for a model, e.g., the time series range.

Collection Fields

Field Type Description
_id string Unique identifier.
name string Option name.
type string Type of option.
timestep string Type of timestep.
parameter string Link to model parameter unique name.

JSON Skeleton Representation

{
  "_id": "",
  "modelID": "",
  "name": "",
  "type": "",
  "timestep": "",
  "parameter": ""
}

parameter-catalog

Holds the list of scientific model inputs and their metadata.

Primary Representation

Input Catalog Figure

Collection Fields

Field Type Description
_id string Unique identifier.
modelID string Model identifier referenced for this input.
dataType string Attribute of the parameter value, e.g., integer, double, boolean or string.
paramName string Unique name used on the model source.
definitionType string Whether the parameter values are defined by users, by scenarios, or statically in the model.
maxValue decimal Numerical upper bound that the parameter value can reach.
minValue decimal Numerical lower bound that the parameter value can reach.
paramDefaultSource string Provenance of the default dataset values.
structType string Organized type of structure where the parameter values are set, e.g., scalar, table or matrix.
paramInfo Object Array Informative metadata about the parameter in different languages.
paramInfo.paramCategory string Classification of the parameter.
paramInfo.paramUnit string Unit of measurement used for the parameter values.
paramInfo.paramLabel string Descriptive name of the current parameter.
paramInfo.lang string Target language for the parameter information.
paramInfo.paramDescription string Detailed statement about the current parameter.
paramDefaultValue Any The values assigned to the parameter before user customization.

JSON Skeleton Representation

{
  "_id": "",
  "modelID": "",
  "dataType": "",
  "paramName": "",
  "definitionType": "",
  "maxValue": 2,
  "minValue": 1,
  "paramDefaultSource": "",
  "structType": "",
  "paraminfo": [
    {
      "paramCategory": "",
      "paramUnit": "",
      "paramLabel": "",
      "lang": "",
      "paramDescription": ""
    }
  ],
  "paramDefaultValue": 1
}
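
The minValue and maxValue fields lend themselves to a simple range check before a user-supplied value is accepted. The Python sketch below assumes a scalar parameter and inclusive bounds; neither detail is confirmed by the source, and the sample document and the helper name within_bounds are hypothetical.

```python
# Hypothetical parameter-catalog document; values are invented for illustration.
param_doc = {
    "paramName": "rainfall",
    "dataType": "double",
    "structType": "scalar",
    "minValue": 1,
    "maxValue": 2,
    "paramDefaultValue": 1,
}


def within_bounds(param, value):
    """Return True when a scalar value lies in [minValue, maxValue]."""
    return param["minValue"] <= value <= param["maxValue"]


print(within_bounds(param_doc, 1.5))  # True
print(within_bounds(param_doc, 3))    # False
```

Table or matrix parameters would need the same check applied to each cell.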

set-catalog

Sets are fundamental building blocks of GAMS models that consist of a group of elements or members. This collection stores only sets that need to be changed dynamically before execution of a GAMS model. The rest of the set declarations remain embedded in the model source code.

Primary Representation

Set Catalog Figure

Collection Fields

Field Type Description
_id string Set unique identifier.
modelID string Model identifier reference for this set.
setName string Unique set name used on the model code.
setValue string String value applied to the set.
setLabel string Descriptive name of the set.
setDescription string Detailed statement about the set.

JSON Skeleton Representation

{
  "_id": "",
  "modelID": "",
  "setName": "",
  "setValue": "",
  "setLabel": "",
  "setDescription": ""
}

theme-catalog

Stores a list of scenario themes. A theme is a pre-defined scenario option that modifies the values of one or more model inputs, such as parameter and/or set values.

Primary Representation

Theme Catalog Figure

Collection Fields

Field Type Description
_id string Unique identifier.
modelID string Reference to model identifier.
sourceLink string URL to the provenance of the theme values.
order integer Order of appearance in the user interface from left to right and top to bottom.
info Object array
info.lang string
info.title string Representative title of the theme.
info.description string Detailed statement about the theme.
info.category string Classification of the theme.
info.sourceLabel string Descriptive name of the provenance of the theme values.
info.appendix string Additional information about the theme.
info.imgSource string Path to representative image file.
info.imgCitation string Citation to provide attribution on used image.
parameters Object Array Listing of parameters modified by the theme.
parameters.paramName string Unique parameter name affected by the theme.
parameters.showUser boolean Whether the theme data is visible to the user.
parameters.paramValue Any Parameter values corresponding to the current theme.

JSON Skeleton Representation

{
  "_id": "",
  "modelID": "",
  "sourceLink": "",
  "order": 1,
  "info": [
    {
      "lang": "",
      "title": "",
      "description": "",
      "category": "",
      "sourceLabel": "",
      "appendix": "",
      "imgSource" : "",
      "imgCitation" : ""
    }
  ],
  "parameters": [
    {
      "paramName": "",
      "paramShow": false,
      "paramValue": {}
    }
  ]
}
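
Applying a theme amounts to overriding the values of the parameters it names. The Python sketch below shows one possible way to do that on a list of scenario inputs. It assumes each input carries a paramValue field, which the source does not confirm; the sample data and the helper name apply_theme are likewise hypothetical.

```python
import copy

# Hypothetical theme and scenario inputs, invented for illustration.
theme = {
    "parameters": [{"paramName": "rainfall", "paramValue": 0.8}],
}
model_inputs = [
    {"paramName": "rainfall", "paramValue": 1.0},
    {"paramName": "population", "paramValue": 500},
]


def apply_theme(theme_doc, inputs):
    """Return a copy of inputs with the theme's parameter values substituted."""
    overrides = {p["paramName"]: p["paramValue"] for p in theme_doc["parameters"]}
    updated = copy.deepcopy(inputs)  # leave the caller's list untouched
    for param in updated:
        if param["paramName"] in overrides:
            param["paramValue"] = overrides[param["paramName"]]
    return updated


themed = apply_theme(theme, model_inputs)
print(themed[0]["paramValue"])        # 0.8
print(model_inputs[0]["paramValue"])  # 1.0 (original unchanged)
```

Working on a deep copy lets a user switch between themes without losing the original inputs.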

summary-catalog

Primary Representation

Summary Catalog Figure

Collection Fields

Field Type Description
_id string Unique identifier.
modelID string Reference to model identifier.
info Object Array Human readable information.
info.caption string Sentence with the name or title of the summary object.
info.subcaption string Sentence with a subtitle or additional information about the summary object.
units string Unit of measure.
lang string Target human language.
data Object Array Attributes and pointers to the target data.
data.targets String Array List of unique dataset names (e.g., output variables).
data.operation string Function that will be applied to the target datasets.
data.upperLimit number Maximum value that the processed data can take.
data.lowerLimit number Minimum value that the processed data can take.
benchmarks Object Array Contains metadata that provides context to the summary object.
benchmarks.acronym string An abbreviation of the benchmark name.
benchmarks.label string Short descriptive term for the benchmark.
benchmarks.value number Value of the benchmark resource.
benchmarks.lang string Target language for the parameter information.
widget Object container Holds metadata for the visualization widget used for this summary object.
widget.type string Specifies the widget that will be used to represent the summary object.
widget.rangeLabel String Array Specifies labels for the value ranges of the summary object.
visible boolean Used as the default visibility value to show or hide the summary object.

JSON Skeleton Representation

{
  "_id" : "",
  "modelID" : "",
  "info" : [
    {
      "caption" : "",
      "subcaption" : "",
      "units" : "",
      "lang" : "",
    }
  ],
  "data": 
    {
      "targets": [""],
      "operation": "",
      "upperLimit": "",
      "lowerLimit": 0
    },
  "benchmarks": [
    {
      "acronym": "",
      "label": "",
      "value": 1,
      "lang" : ""
    }
  ],
  "widget" : {
    "type": "",
    "rangeLabel": [""]
  },
  "visible": true
}