Data Model View

The SWIM modeling storage system is implemented as a document-based NoSQL database managed with MongoDB. This section describes the semi-structured organization of data in our storage system.

Terms

Term	Definition
Database	A physical container for collections. Each database gets its own set of files on the file system. A single MongoDB server typically has multiple databases.
Collection	A grouping of MongoDB documents. A collection is the equivalent of an RDBMS table. A collection exists within a single database.
Document	A record in a MongoDB collection and the basic unit of data in MongoDB. Documents are analogous to JSON objects but exist in the database in a more type-rich format known as BSON.
JSON	JavaScript Object Notation. A human-readable, plain text format for expressing structured data with support in many programming languages.
String	A sequence of characters.
Array	A data structure consisting of a collection of elements (values or variables).
Object	A JSON data structure.
Any	Indicates that a field is not constrained to a specific data type.

Sources:
https://docs.mongodb.com/manual/reference/glossary/
https://www.json.org/json-en.html

Collection Catalog

The SWIM modeling database is under constant expansion and refinement as the development process progresses. The platform is currently composed of the following collections:

acronym-catalog: Label dictionary to convert model-specific acronyms into suitable labels for users.
canned-scenarios: Subsets of model outputs to answer specific questions that focus on targeted model outputs.
model-catalog: Metadata of registered modeling services in SWIM.
transformation-catalog: Metadata of registered transformation services in SWIM.
output-catalog: List of output metadata linked to a specific model.
options: Adds model parameters to the options menu on the SWIM UI.
parameter-catalog: List of model parameter metadata and default values linked to a specific model.
private-scenarios: Model scenario executions by registered users that are not available for public view.
public-scenarios: Model scenario executions openly available to the public.
set-catalog: List of set attributes for models developed with the General Algebraic Modeling System (GAMS).
theme-catalog: List of generic scenario themes where one scenario theme can modify one or multiple model inputs.
summary-catalog: List of summary objects available per model. These are loaded in the results summary section of a modeling workflow on the SWIM UI.

acronym-catalog

Model source code sometimes contains acronyms or abbreviated words. This naming scheme may be familiar to the model's developers, but other people can find it hard to infer the meaning behind each abbreviation. This collection holds multiple acronym dictionaries linked to the different models registered in SWIM, so that the interface can replace acronyms with more meaningful labels and facilitate user understanding.

Primary Representation

Acronym Catalog Figure

Collection Fields

Field	Type	Description
_id	string	Dictionary unique identifier.
modelID	string	Identifier of the model linked to the dictionary.
lang	string	Target language (English or Spanish).
dictionary	Key-Value Array	Acronym and label pairs.

JSON Skeleton Representation

{
  "_id": "",
  "modelID": "",
  "lang": "",
  "dictionary": {
    "": "",
    "": ""
  }
}
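
As a rough illustration of how an interface might use one of these dictionaries, the sketch below replaces whole-word acronyms in a string with their labels. The document contents, the `expand_acronyms` helper, and the `"en"` language code are assumptions made for the example; they are not taken from the SWIM code base.

```python
import re

# Hypothetical acronym-catalog document. The identifiers, acronyms and
# labels are invented for illustration; the "en" language code is assumed.
acronym_doc = {
    "_id": "example-dictionary-en",
    "modelID": "example-model",
    "lang": "en",
    "dictionary": {
        "GW": "Groundwater",
        "SW": "Surface water",
    },
}


def expand_acronyms(text, dictionary):
    """Replace whole-word acronyms in text with their labels."""
    if not dictionary:
        return text
    # Try longer acronyms first so they win over shorter ones.
    keys = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda match: dictionary[match.group(1)], text)


label = expand_acronyms("GW storage vs SW inflow", acronym_doc["dictionary"])
print(label)
```

Matching on word boundaries keeps an acronym such as `SW` from being replaced inside a longer word such as `SWIM`.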

model-catalog

Stores metadata of registered scientific models in SWIM. This information is used to extract model provenance and the information required to invoke execution through the underlying web services.

Primary Representation

Model Catalog Figure

Collection Fields

Field	Type	Description
_id	string	MongoDB auto-generated unique identifier.
info	Object array
info.name	string
info.description	string
info.lang	string
dateCreated	datetime	Date of the first model release.
dateModified	datetime	Date and time of last modification to the model.
softwareAgent	string	Software application capable of executing and implementing the model.
license	string	The license information for the exposed model.
version	string	Provides the version of the base model.
sponsor	string	Organization, person or institution that provides funding.
creators	Object array	Model authors/creators.
creators.name	string	Creator full name.
creators.department	string	A division of an organization such as a government, university, business, or shop, dealing with a specific subject.
creators.organization	string	Creator organization name.
creators.email	string	Electronic mail address for communication.
creators.city	string	Named geographical locality.
creators.state	string	Region associated with the address of an object.
creators.country	string	The country name associated with the address of an object.
hostServer	Object container	Server machine where the base model and corresponding software agent are deployed.
hostServer.serverName	string	DNS name of the server.
hostServer.serverAdress	string	External IP address of the server.
hostServer.serverAdmin	string	Person in charge of administering the server.
hostServer.adminEmail	string	Administrator email.
hostServer.serverOwner	string	Organization or person that owns the physical server.
serviceInfo	Object container	Information regarding the exposed web service of the model.
serviceInfo.serviceURL	string	Request URI.
serviceInfo.serviceMethod	string	HTTP method used to invoke a request (e.g. POST, GET).
serviceInfo.status	string
serviceInfo.consumes	string	MIME types the service can consume (e.g. application/json).
serviceInfo.produces	string	MIME types produced as a response by the service.
serviceInfo.isPublic	boolean	Whether the service is available on the web for public use (true or false).
serviceInfo.externalDocs	Object array	Additional external documentation.

JSON Skeleton Representation

{
  "_id": "",
  "info": [
    {
      "name": "",
      "description": "",
      "lang": ""
    }
  ],
  "dateCreated": "",
  "dateModified": "",
  "softwareAgent": "",
  "license": "",
  "version": "",
  "sponsor": "",
  "creators": [
    {
      "name": "",
      "department": "",
      "email": "",
      "organization": "",
      "city": "",
      "state": "",
      "country": ""
    }
  ],
  "hostServer": {
    "serverName": "",
    "serverAdress": "",
    "serverAdmin": "",
    "adminEmail": "",
    "serverOwner": ""
  },
  "serviceInfo": {
    "serviceURL": "",
    "serviceMethod": "",
    "status": "",
    "consumes": "",
    "produces": "",
    "isPublic": "",
    "externalDocs": [""]
  }
}
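
To show how the `serviceInfo` fields could drive an invocation, here is a minimal sketch that builds (but does not send) an HTTP request from a model-catalog document using Python's standard library. The URL, the payload, and the `build_request` helper are hypothetical; the real SWIM client may assemble requests differently.

```python
import json
from urllib.request import Request

# Hypothetical model-catalog document reduced to the fields used here.
# The URL is a placeholder, not a real SWIM endpoint.
model_doc = {
    "_id": "example-model",
    "serviceInfo": {
        "serviceURL": "http://localhost:8080/example/run",
        "serviceMethod": "POST",
        "consumes": "application/json",
        "produces": "application/json",
        "isPublic": True,
    },
}


def build_request(model, payload):
    """Build an HTTP request object from the serviceInfo metadata."""
    service = model["serviceInfo"]
    body = json.dumps(payload).encode("utf-8")
    return Request(
        service["serviceURL"],
        data=body,
        headers={
            "Content-Type": service["consumes"],
            "Accept": service["produces"],
        },
        method=service["serviceMethod"],
    )


# The request is only constructed; nothing is sent over the network.
request = build_request(model_doc, {"name": "example scenario"})
print(request.get_method(), request.full_url)
```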

transformation-catalog

Stores metadata of registered data transformation services in SWIM.

Collection Fields

Field	Type	Description
_id	string	MongoDB auto-generated unique identifier.
info	Object array
info.name	string
info.description	string
info.lang	string
dateCreated	datetime	Date of the first model release.
dateModified	datetime	Date and time of last modification to the model.
language	string	Programming language used for artifact.
license	string	The license information for the exposed model.
version	string	Provides the version of the base model.
creators	Object array	Model authors/creators.
creators.name	string	Creator full name.
creators.department	string	A division of an organization such as a government, university, business, or shop, dealing with a specific subject.
creators.organization	string	Creator organization name.
creators.email	string	Electronic mail address for communication.
creators.city	string	Named geographical locality.
creators.state	string	Region associated with the address of an object.
creators.country	string	The country name associated with the address of an object.
serviceInfo	Object container	Information regarding the exposed web service of the model.
serviceInfo.serviceURL	string	Request URI.
serviceInfo.serviceMethod	string	HTTP method used to invoke a request (e.g. POST, GET).
serviceInfo.status	string
serviceInfo.consumes	string	MIME types the service can consume (e.g. application/json).
serviceInfo.produces	string	MIME types produced as a response by the service.
serviceInfo.isPublic	boolean	Whether the service is available on the web for public use (true or false).

canned-scenarios

A canned scenario is a subset of a model scenario specification. The listing of inputs and outputs is filtered in relation to a question of interest, so only the parameters and variables that help answer the targeted question are presented to the user. This approach aims to introduce the underlying models without overwhelming users at first glance.

Primary Representation

Canned Scenario Figure

Collection Fields

Field	Type	Description
_id	string	Unique identifier.
name	string	Question or subject.
description	string	Detailed statement about the canned scenario.
modelID	string
themeCatalogFilter	String Array	Listing of theme categories to show on the canned scenario.
parameterFilter	String Array	Listing of model inputs to show on the canned scenario.
outputFilter	String Array	Listing of model outputs to show on the canned scenario.

JSON Skeleton Representation

{
  "_id": "",
  "name": "",
  "description": "",
  "modelID": "",
  "themeCatalogFilter": [
    ""
  ],
  "parameterFilter": [
    ""
  ],
  "outputFilter": [
    ""
  ]
}
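
The filtering described above can be sketched as follows. This example assumes that `parameterFilter` and `outputFilter` hold `paramName` and `varName` values respectively; the source does not state which identifier the filters reference, and all sample names are invented.

```python
# Hypothetical canned scenario; identifiers and names are invented.
canned = {
    "_id": "example-canned",
    "name": "Example question",
    "description": "",
    "modelID": "example-model",
    "themeCatalogFilter": [],
    "parameterFilter": ["paramA"],
    "outputFilter": ["varX"],
}

# Reduced parameter-catalog and output-catalog documents for the same model.
parameters = [{"paramName": "paramA"}, {"paramName": "paramB"}]
outputs = [{"varName": "varX"}, {"varName": "varY"}]


def apply_canned_filter(canned_doc, parameter_docs, output_docs):
    """Keep only the inputs and outputs listed in the canned scenario."""
    wanted_params = set(canned_doc["parameterFilter"])
    wanted_outputs = set(canned_doc["outputFilter"])
    shown_params = [p for p in parameter_docs if p["paramName"] in wanted_params]
    shown_outputs = [o for o in output_docs if o["varName"] in wanted_outputs]
    return shown_params, shown_outputs


shown_params, shown_outputs = apply_canned_filter(canned, parameters, outputs)
print(shown_params, shown_outputs)
```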

User Scenarios (public-scenarios, private-scenarios)

Stores a complete model scenario specification. The specification is composed of scenario metadata, model settings, sets, inputs and outputs. This collection schema is used for the execution of canned scenarios, public scenarios and private scenarios.

Primary Representation

User Scenario Figure

Collection Fields

Field	Type	Description
_id	string	MongoDB auto-generated unique identifier.
className	string	Morphia mapping Java class on the web service.
name	string	User-defined name of a given projection input.
description	string	Short description of the projection scenario and its inputs.
startedAtTime	datetime	Date and time when projection was submitted.
endedAtTime	datetime	Date and time when model execution finished.
status	string	Execution state of submitted projection (queued, running, complete, error).
isPublic	string	Flag that determines whether the projection is available for public view (true or false).
modelSettings	Object array	General settings of models used for the current scenario.
modelSettings.modelID	string	Reference to model-catalog document.
modelSets	Object array	List of Set attributes for GAMS models. Follows the set catalog schema.
modelInputs	Object array	List of model parameters as model input. Follows the parameter catalog schema.
modelOutputs	Object array	List of model outputs to be extracted from model results. Follows the output catalog schema.

JSON Skeleton Representation

{
    "_id" : "",
    "className" : "",
    "name" : "",
    "description" : "",
    "userid" : "",
    "startedAtTime" : "",
    "endedAtTime" : "",
    "status" : "",
    "isPublic" : "",
    "modelSettings" : [
        {
            "modelID" : ""
        }
    ],
    "modelInputs" : [{}],
    "modelOutputs" : [{}]
}
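
As an illustration of how the `isPublic` and `status` fields might be used, the sketch below picks the target collection for a scenario and checks its execution state. The helper names are hypothetical, and the code assumes `isPublic` may arrive either as a boolean or as the string "true"/"false", since the field table types it as a string.

```python
# Execution states listed in the field table above.
VALID_STATUSES = {"queued", "running", "complete", "error"}


def target_collection(scenario):
    """Return the collection name a scenario would be stored in."""
    flag = scenario.get("isPublic")
    # Accept a real boolean or a "true"/"false" string (assumption).
    is_public = flag is True or str(flag).strip().lower() == "true"
    return "public-scenarios" if is_public else "private-scenarios"


def is_finished(scenario):
    """Return True once the execution has completed or failed."""
    status = scenario["status"]
    if status not in VALID_STATUSES:
        raise ValueError("unknown status: " + repr(status))
    return status in {"complete", "error"}


scenario = {"name": "example", "status": "running", "isPublic": "false"}
print(target_collection(scenario), is_finished(scenario))
```

Treating a missing or unrecognized flag as private is a conservative default chosen for this sketch.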

output-catalog

The output catalog stores metadata for each variable that will be extracted from the results of a scenario execution. Once the scenario is executed, the values taken by the variable are appended to the model scenario document along with the metadata from this collection.

Primary Representation

Output Catalog Figure

Collection Fields

Field	Type	Description
_id	string	Unique identifier.
modelID	string	Model identifier referenced for this output.
varName	string	Unique variable name on the target model.
varBenchmarks	Object Array	Listing of benchmark metadata and values to provide additional context to the output.
varBenchmarks.benchmarkLabel	string	A descriptive name for the presented benchmark.
varBenchmarks.benchmarkLang	string	Target language of the benchmark (English or Spanish).
varBenchmarks.benchmarkDescription	string	Detailed statement about the benchmark.
varBenchmarks.benchmarkSource	string	Provenance of the benchmark value.
varBenchmarks.benchmarkValue	number	A quantity or number of the current benchmark.
varinfo	Object Array	Output metadata in different languages.
varinfo.varLabel	string	Descriptive term of the output variable.
varinfo.varCategory	string	Classification of the output variable.
varinfo.lang	string	Output metadata language (English or Spanish).
varinfo.varDescription	string	Detailed statement about the output variable.
varinfo.varUnit	string	Unit of measure used for the output values.

JSON Skeleton Representation

{
  "_id": "",
  "modelID": "",
  "varName": "",
  "varBenchMarks": [
    {
      "benchmarkLabel": "",
      "benchmarkLang": "",
      "benchmarkDescription": "",
      "benchmarkSource": "",
      "benchmarkValue": 1
    }
  ],
  "varinfo": [
    {
      "varLabel": "",
      "varCategory": "",
      "lang": "",
      "varDescription": "",
      "varUnit": ""
    }
  ]
}
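
Because `varinfo` carries one entry per language, a client has to pick the entry that matches the user's language. A minimal sketch of that lookup follows; the `pick_info` helper, the `"en"`/`"es"` language codes, and the sample labels are assumptions for illustration only.

```python
# Hypothetical output-catalog document with metadata in two languages.
output_doc = {
    "_id": "example-output",
    "modelID": "example-model",
    "varName": "varX",
    "varinfo": [
        {"lang": "en", "varLabel": "Example label", "varUnit": "m3"},
        {"lang": "es", "varLabel": "Etiqueta de ejemplo", "varUnit": "m3"},
    ],
}


def pick_info(entries, lang, default_lang="en"):
    """Return the metadata entry for lang, falling back to default_lang."""
    by_lang = {entry["lang"]: entry for entry in entries}
    # Returns None when neither language is present.
    return by_lang.get(lang) or by_lang.get(default_lang)


info_es = pick_info(output_doc["varinfo"], "es")
info_fallback = pick_info(output_doc["varinfo"], "fr")
print(info_es["varLabel"], info_fallback["varLabel"])
```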

options

Interface options for a model, e.g. the time series range.

Collection Fields

Field	Type	Description
_id	string	Unique identifier.
name	string	Option name.
type	string	Type of option.
timestep	string	Type of timestep.
parameter	string	Link to model parameter unique name.

JSON Skeleton Representation

{
  "_id": "",
  "modelID": "",
  "name": "",
  "type": "",
  "timestep": "",
  "parameter": ""
}

parameter-catalog

Holds the list of scientific model inputs and their metadata.

Primary Representation

Input Catalog Figure

Collection Fields

Field	Type	Description
_id	string	Unique identifier.
modelID	string	Model identifier referenced for this input.
dataType	string	Attribute of the parameter value, e.g. integer, double, boolean or string.
paramName	string	Unique name used on the model source.
definitionType	string	Whether the parameter values are defined by users, by scenarios, or statically in the model.
maxValue	decimal	Numerical upper bound that the parameter value can reach.
minValue	decimal	Numerical lower bound that the parameter value can reach.
paramDefaultSource	string	Provenance of the default dataset values.
structType	string	Type of structure in which the parameter values are organized, e.g. scalar, table or matrix.
paramInfo	Object Array	Informative metadata about the parameter in different languages.
paramInfo.paramCategory	string	Classification of the parameter.
paramInfo.paramUnit	string	Unit of measurement used for the parameter values.
paramInfo.paramLabel	string	Descriptive name of the current parameter.
paramInfo.lang	string	Target language for the parameter information.
paramInfo.paramDescription	string	Detailed statement about the current parameter.
paramDefaultValue	Any	The values assigned to the parameter before user customization.

JSON Skeleton Representation

{
  "_id": "",
  "modelID": "",
  "dataType": "",
  "paramName": "",
  "definitionType": "",
  "maxValue": 2,
  "minValue": 1,
  "paramDefaultSource": "",
  "structType": "",
  "paraminfo": [
    {
      "paramCategory": "",
      "paramUnit": "",
      "paramLabel": "",
      "lang": "",
      "paramDescription": ""
    }
  ],
  "paramDefaultValue": 1
}
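
The `minValue` and `maxValue` fields lend themselves to a simple range check before a user-supplied value is accepted. The sketch below covers only scalar parameters; the `within_bounds` helper and the sample document are hypothetical.

```python
# Hypothetical parameter-catalog document for a scalar input.
param_doc = {
    "_id": "example-param",
    "modelID": "example-model",
    "dataType": "double",
    "paramName": "paramA",
    "maxValue": 2,
    "minValue": 1,
    "structType": "scalar",
    "paramDefaultValue": 1,
}


def within_bounds(param, value):
    """Return True when a scalar value lies inside [minValue, maxValue]."""
    return param["minValue"] <= value <= param["maxValue"]


print(within_bounds(param_doc, 1.5))
print(within_bounds(param_doc, 3))
```

Table or matrix parameters would need the same check applied to each cell, which is left out here.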

set-catalog

Sets are fundamental building blocks of GAMS models that consist of a group of elements or members. This collection stores only sets that need to be changed dynamically before execution of a GAMS model. The rest of the set declarations remain embedded in the model source code.

Primary Representation

Set Catalog Figure

Collection Fields

Field	Type	Description
_id	string	Set unique identifier.
modelID	string	Model identifier reference for this set.
setName	string	Unique set name used on the model code.
setValue	string	String value applied to the set.
setLabel	string	Descriptive name of the set.
setDescription	string	Detailed statement about the set.

JSON Skeleton Representation

{
  "_id": "",
  "modelID": "",
  "setName": "",
  "setValue": "",
  "setLabel": "",
  "setDescription": ""
}

theme-catalog

Stores a list of scenario themes. A theme is a pre-defined scenario option that modifies the values of one or more model inputs, such as parameter and/or set values.

Primary Representation

Theme Catalog Figure

Collection Fields

Field	Type	Description
_id	string	Unique identifier.
modelID	string	Reference to model identifier.
sourceLink	string	URL to the provenance of the theme values.
order	integer	Order of appearance in the user interface from left to right and top to bottom.
info	Object array
info.lang	string
info.title	string	Representative title of the theme.
info.description	string	Detailed statement about the theme.
info.category	string	Classification of the theme.
info.sourceLabel	string	Descriptive name of the provenance of the theme values.
info.appendix	string	Additional information about the theme.
info.imgSource	string	Path to representative image file.
info.imgCitation	string	Citation to provide attribution on used image.
parameters	Object Array	Listing of parameters modified by the theme.
parameters.paramName	string	Unique parameter name affected by the theme.
parameters.showUser	boolean	Whether the theme data is visible to the user.
parameters.paramValue	Any	Parameter values corresponding to the current theme.

JSON Skeleton Representation

{
  "_id": "",
  "modelID": "",
  "sourceLink": "",
  "order": 1,
  "info": [
    {
      "lang": "",
      "title": "",
      "description": "",
      "category": "",
      "sourceLabel": "",
      "appendix": "",
      "imgSource" : "",
      "imgCitation" : ""
    }
  ],
  "parameters": [
    {
      "paramName": "",
      "paramShow": false,
      "paramValue": {}
    }
  ]
}
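
To make the effect of a theme concrete, the following sketch starts from the default values in the parameter catalog and overrides the ones a theme lists under `parameters`. The `resolve_values` helper and all sample names and values are assumptions; how SWIM actually merges theme values into a scenario is not described here.

```python
# Reduced, hypothetical parameter-catalog documents.
parameters = [
    {"paramName": "paramA", "paramDefaultValue": 1},
    {"paramName": "paramB", "paramDefaultValue": 10},
]

# Hypothetical theme that overrides a single parameter.
theme = {
    "_id": "example-theme",
    "modelID": "example-model",
    "parameters": [
        {"paramName": "paramB", "paramValue": 25},
    ],
}


def resolve_values(parameter_docs, theme_doc):
    """Return paramName -> value with theme overrides applied to defaults."""
    values = {p["paramName"]: p["paramDefaultValue"] for p in parameter_docs}
    for override in theme_doc["parameters"]:
        # Ignore overrides for parameters the model does not declare.
        if override["paramName"] in values:
            values[override["paramName"]] = override["paramValue"]
    return values


resolved = resolve_values(parameters, theme)
print(resolved)
```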

summary-catalog

Primary Representation

Summary Catalog Figure

Collection Fields

Field	Type	Description
_id	string	Unique identifier.
modelID	string	Reference to model identifier.
info	Object Array	Human readable information.
info.caption	string	Sentence with the name or title of the summary object.
info.subcaption	string	Sentence with a subtitle or additional information about the summary object.
units	string	Unit of measure.
lang	string	Target human language.
data	Object Array	Attributes and pointers to the target data.
data.targets	String Array	List of unique dataset names (e.g. output variables).
data.operation	string	Function that will be applied to the target datasets.
data.upperLimit	number	Maximum value that the processed data can take.
data.lowerLimit	number	Minimum value that the processed data can take.
benchmarks	Object Array	Contains metadata that provides context to the summary object.
benchmarks.acronym	string	An abbreviation of the benchmark name.
benchmarks.label	string	Short descriptive term for the benchmark.
benchmarks.value	number	Value of the benchmark resource.
benchmarks.lang	string	Target language for the benchmark information.
widget	Object container	Holds metadata for the visualization widget used for this summary object.
widget.type	string	Specifies the widget that will be used to represent the summary object.
widget.rangeLabel	String Array	Specifies labels for the value ranges of the summary object.
visible	boolean	Used as the default visibility value to show or hide the summary object.

JSON Skeleton Representation

{
  "_id" : "",
  "modelID" : "",
  "info" : [
    {
      "caption" : "",
      "subcaption" : "",
      "units" : "",
      "lang" : ""
    }
  ],
  "data": {
    "targets": [""],
    "operation": "",
    "upperLimit": "",
    "lowerLimit": 0
  },
  "benchmarks": [
    {
      "acronym": "",
      "label": "",
      "value": 1,
      "lang" : ""
    }
  ],
  "widget" : {
    "type": "",
    "rangeLabel": [""]
  },
  "visible": true
}