Data Model View
The SWIM modeling storage system is implemented as a document-based NoSQL database managed with MongoDB. This section describes the semi-structured organization of data in our storage system.
Terms
Term | Definition |
---|---|
Database | A physical container for collections. Each database gets its own set of files on the file system. A single MongoDB server typically has multiple databases. |
Collection | A grouping of MongoDB documents. A collection is the equivalent of an RDBMS table. A collection exists within a single database. |
Document | A record in a MongoDB collection and the basic unit of data in MongoDB. Documents are analogous to JSON objects but exist in the database in a more type-rich format known as BSON. |
JSON | JavaScript Object Notation. A human-readable, plain text format for expressing structured data with support in many programming languages. |
String | A sequence of characters. |
Array | A data structure consisting of a collection of elements (values or variables). |
Object | A JSON data structure. |
Any | Indicates that a field is not constrained to a specific data type. |
Sources:
https://docs.mongodb.com/manual/reference/glossary/
https://www.json.org/json-en.html
Collection Catalog
The SWIM modeling database is under constant expansion and refinement as the development process progresses. The platform is currently composed of the following collections:
- acronym-catalog: Label dictionary to convert model specific acronyms into suitable labels for users.
- canned-scenarios: Subsets of model outputs to answer specific questions that focus on targeted model outputs.
- model-catalog: Metadata of registered modeling services in SWIM.
- transformation-catalog: Metadata of registered transformation services in SWIM.
- output-catalog: List of output metadata linked to a specific model.
- options: Adds model parameters to the options menu on the SWIM UI.
- parameter-catalog: List of model parameter metadata and default values linked to a specific model.
- private-scenarios: Model scenario executions by registered users that are not available for public view.
- public-scenarios: Model scenario executions openly available to the public.
- set-catalog: List of set attributes for models developed with the General Algebraic Modeling System (GAMS).
- theme-catalog: List of generic scenario themes where one scenario theme can modify one or multiple model inputs.
- summary-catalog: List of summary objects available per model. These are loaded in the results summary section of a modeling workflow on the SWIM UI.
acronym-catalog
Model source code sometimes contains acronyms or abbreviated words. This naming scheme may be familiar to the model's developers, but other people can find it hard to infer the meaning of each abbreviation. This collection holds acronym dictionaries linked to the models registered in SWIM, so that the interface can replace acronyms with more meaningful labels and facilitate user understanding.
Primary Representation
Collection Fields
Field | Type | Description |
---|---|---|
_id | string | Dictionary unique identifier. |
modelID | string | Identifier of the model linked to the dictionary. |
lang | string | Target language (English or Spanish). |
dictionary | Object (key-value pairs) | Acronym and label pairs. |
JSON Skeleton Representation
{
"_id": "",
"modelID": "",
"lang": "",
"dictionary": {
"": "",
"": ""
}
}
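As a minimal sketch of how an interface could use one of these dictionaries, the Python snippet below replaces whole-word acronyms in a string with their labels. The document contents, the `expand_acronyms` helper and the `lang` value are hypothetical and only illustrate the schema above; they are not taken from the SWIM code base.

```python
import re

# Hypothetical acronym-catalog document; identifiers and entries are invented.
acronym_doc = {
    "_id": "example-dictionary",
    "modelID": "example-model",
    "lang": "english",
    "dictionary": {"GW": "Groundwater", "SW": "Surface water"},
}


def expand_acronyms(text, doc):
    """Replace whole-word acronyms in text with labels from doc["dictionary"]."""
    mapping = doc["dictionary"]
    if not mapping:
        return text
    # Try longer acronyms first so they win over shorter ones sharing a prefix.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda match: mapping[match.group(1)], text)


print(expand_acronyms("GW storage and SW inflow", acronym_doc))
```

Matching on word boundaries keeps identifiers that merely contain an acronym (for example `GWX`) untouched.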
model-catalog
Stores metadata of registered scientific models in SWIM. This information is used to extract model provenance and the information required to invoke execution through the underlying web services.
Primary Representation
Collection Fields
Field | Type | Description |
---|---|---|
_id | string | MongoDB auto-generated unique identifier. |
info | Object array | |
info.name | string | |
info.description | string | |
info.lang | string | |
dateCreated | datetime | Date of the first model release. |
dateModified | datetime | Date and time of last modification to the model. |
softwareAgent | string | Software application capable of executing and implementing the model. |
license | string | The license information for the exposed model. |
version | string | Provides the version of the base model. |
sponsor | string | Organization, person or institution that provides funding. |
creators | Object array | Model authors/creators. |
creators.name | string | Creator full name. |
creators.department | string | A division of an organization such as a government, university, business, or shop, dealing with a specific subject. |
creators.organization | string | Creator organization name. |
creators.email | string | Electronic mail address for communication. |
creators.city | string | Named geographical locality. |
creators.state | string | Region associated with the address of an object. |
creators.country | string | The country name associated with the address of an object. |
hostServer | Object container | Server machine where the base model and corresponding software agent is deployed. |
hostServer.serverName | string | DNS name of the server. |
hostServer.serverAdress | string | External IP Address of the server. |
hostServer.serverAdmin | string | Person in charge of administrating the server. |
hostServer.adminEmail | string | Administrator email. |
hostServer.serverOwner | string | Organization or person that owns the physical server. |
serviceInfo | Object container | Information regarding the exposed webservice of the model. |
serviceInfo.serviceURL | string | Request URI. |
serviceInfo.serviceMethod | string | HTTP method used to invoke a request (e.g., POST, GET). |
serviceInfo.status | string | |
serviceInfo.consumes | string | A list of MIME types the service can consume (e.g., application/json). |
serviceInfo.produces | string | MIME types produced as a response by the service. |
serviceInfo.isPublic | boolean | Whether the service is available on the web for public use (true or false). |
serviceInfo.externalDocs | Object array | Additional external documentation. |
JSON Skeleton Representation
{
"_id" : "",
"info": [
{
"name": "",
"description": "",
"lang": ""
}
],
"dateCreated" : "",
"dateModified" : "",
"softwareAgent" : "",
"license" : "",
"version" : "",
"sponsor" : "",
"creators" : [
{
"name" : "",
"department" : "",
"email" : "",
"organization" : "",
"city" : "",
"state" : "",
"country" : ""
}
],
"hostServer" : {
"serverName" : "",
"serverAdress" : "",
"serverAdmin" : "",
"adminEmail" : "",
"serverOwner" : ""
},
"serviceInfo" : {
"serviceURL" : "",
"serviceMethod" : "",
"status": "",
"consumes" : "",
"produces" : "",
"isPublic" : false,
"externalDocs" : [""]
}
}
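The `serviceInfo` block carries what a client needs to call the model's web service. The Python sketch below shows one way a client might turn it into a request description without sending anything over the network. The URL, the `build_request` helper and the mapping of `consumes`/`produces` to the `Content-Type`/`Accept` headers are assumptions made for illustration, not confirmed SWIM behavior.

```python
import json

# Hypothetical model-catalog excerpt; only serviceInfo is needed here.
model_doc = {
    "serviceInfo": {
        "serviceURL": "https://example.org/swim/models/run",
        "serviceMethod": "post",
        "status": "",
        "consumes": "application/json",
        "produces": "application/json",
        "isPublic": True,
    }
}


def build_request(doc, payload):
    """Describe the HTTP request implied by doc["serviceInfo"] (nothing is sent)."""
    service = doc["serviceInfo"]
    headers = {}
    if service.get("consumes"):
        headers["Content-Type"] = service["consumes"]  # assumed mapping
    if service.get("produces"):
        headers["Accept"] = service["produces"]  # assumed mapping
    return {
        "method": service["serviceMethod"].upper(),
        "url": service["serviceURL"],
        "headers": headers,
        "body": json.dumps(payload),
    }


request = build_request(model_doc, {"name": "demo"})
print(request["method"], request["url"])
```

The returned dictionary can be handed to any HTTP client; keeping the description separate from the call makes it easy to inspect before execution.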
transformation-catalog
Stores metadata of registered data transformation services in SWIM.
Collection Fields
Field | Type | Description |
---|---|---|
_id | string | MongoDB auto-generated unique identifier. |
info | Object array | |
info.name | string | |
info.description | string | |
info.lang | string | |
dateCreated | datetime | Date of the first model release. |
dateModified | datetime | Date and time of last modification to the model. |
language | string | Programming language used for artifact. |
license | string | The license information for the exposed model. |
version | string | Provides the version of the base model. |
creators | Object array | Model authors/creators. |
creators.name | string | Creator full name. |
creators.department | string | A division of an organization such as a government, university, business, or shop, dealing with a specific subject. |
creators.organization | string | Creator organization name. |
creators.email | string | Electronic mail address for communication. |
creators.city | string | Named geographical locality. |
creators.state | string | Region associated with the address of an object. |
creators.country | string | The country name associated with the address of an object. |
serviceInfo | Object container | Information regarding the exposed webservice of the model. |
serviceInfo.serviceURL | string | Request URI. |
serviceInfo.serviceMethod | string | HTTP method used to invoke a request (e.g., POST, GET). |
serviceInfo.status | string | |
serviceInfo.consumes | string | A list of MIME types the service can consume (e.g., application/json). |
serviceInfo.produces | string | MIME types produced as a response by the service. |
serviceInfo.isPublic | boolean | Whether the service is available on the web for public use (true or false). |
canned-scenarios
A canned scenario is a subset of a model scenario specification. The listing of inputs and outputs is filtered in relation to a question of interest; hence only the parameters and variables that help answer the targeted question are presented to the user. This approach aims to introduce the underlying models without overwhelming users at first glance.
Primary Representation
Collection Fields
Field | Type | Description |
---|---|---|
_id | string | Unique identifier. |
name | string | Question or subject. |
description | string | Detailed statement about the canned scenario. |
modelID | string | |
themeCatalogFilter | String Array | Listing of theme categories to show on the canned scenario. |
parameterFilter | String Array | Listing of model inputs to show on the canned scenario. |
outputFilter | String Array | Listing of model outputs to show on the canned scenario. |
JSON Skeleton Representation
{
"_id": "",
"name": "",
"description": "",
"modelID": "",
"themeCatalogFilter": [
""
],
"parameterFilter": [
""
],
"outputFilter": [
""
]
}
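The filtering described above can be sketched in Python as follows. The snippet assumes that `parameterFilter` holds `paramName` values and `outputFilter` holds `varName` values; the source does not state which identifier the filters contain, and all sample data and the `apply_canned_filter` helper are hypothetical.

```python
# Hypothetical canned scenario and catalog entries, reduced to the fields used here.
canned = {
    "name": "Example question",
    "modelID": "example-model",
    "themeCatalogFilter": [],
    "parameterFilter": ["rainfall"],
    "outputFilter": ["storage"],
}
parameters = [{"paramName": "rainfall"}, {"paramName": "population"}]
outputs = [{"varName": "storage"}, {"varName": "evaporation"}]


def apply_canned_filter(canned_doc, parameter_docs, output_docs):
    """Keep only the parameters and outputs named by the canned scenario filters."""
    wanted_params = set(canned_doc["parameterFilter"])
    wanted_outputs = set(canned_doc["outputFilter"])
    kept_params = [p for p in parameter_docs if p["paramName"] in wanted_params]
    kept_outputs = [o for o in output_docs if o["varName"] in wanted_outputs]
    return kept_params, kept_outputs


kept_params, kept_outputs = apply_canned_filter(canned, parameters, outputs)
print(kept_params, kept_outputs)
```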
User Scenarios (public-scenarios, private-scenarios)
Stores a complete model scenario specification. The specification is composed of scenario metadata, model settings, sets, inputs and outputs. This collection schema is used for the execution of canned scenarios, public scenarios and private scenarios.
Primary Representation
Collection Fields
Field | Type | Description |
---|---|---|
_id | string | MongoDB auto-generated unique identifier. |
className | string | Morphia mapping Java class on the web service. |
name | string | User defined name of a given projection input. |
description | string | Short description to describe projection scenario and inputs. |
startedAtTime | datetime | Date and time when projection was submitted. |
endedAtTime | datetime | Date and time when model execution finished. |
status | string | Execution state of submitted projection (queued, running, complete, error). |
isPublic | string | Flag to determine if the projection is available for public view. (true or false) |
modelSettings | Object array | General settings of models used for the current scenario. |
modelSettings.modelID | string | Reference to model-catalog document. |
modelSets | Object array | List of Set attributes for GAMS models. Follows the set catalog schema. |
modelInputs | Object array | List of model parameters as model input. Follows the parameter catalog schema. |
modelOutputs | Object array | List of model outputs to be extracted from model results. Follows the output catalog schema. |
JSON Skeleton Representation
{
"_id" : "",
"className" : "",
"name" : "",
"description" : "",
"userid" : "",
"startedAtTime" : "",
"endedAtTime" : "",
"status" : "",
"isPublic" : "",
"modelSettings" : [
{
"modelID" : ""
}
],
"modelInputs" : [{}],
"modelOutputs" : [{}]
}
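A minimal Python sketch of how a scenario document might be assembled and tracked is shown below. It uses only fields from the table above. Storing timestamps as ISO 8601 strings, storing `isPublic` as the strings "true"/"false", routing documents by `isPublic`, and the helper names are all assumptions for illustration.

```python
from datetime import datetime, timezone

# Status values listed in the field table above.
VALID_STATUSES = ("queued", "running", "complete", "error")


def new_scenario(name, description, model_id, is_public):
    """Build a scenario document in its initial, queued state."""
    return {
        "name": name,
        "description": description,
        "startedAtTime": datetime.now(timezone.utc).isoformat(),  # assumed format
        "endedAtTime": "",
        "status": "queued",
        "isPublic": "true" if is_public else "false",  # table types this as string
        "modelSettings": [{"modelID": model_id}],
        "modelInputs": [],
        "modelOutputs": [],
    }


def set_status(scenario, status):
    """Update the execution state, rejecting values outside the documented set."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    scenario["status"] = status
    if status in ("complete", "error"):
        scenario["endedAtTime"] = datetime.now(timezone.utc).isoformat()
    return scenario


def target_collection(scenario):
    """Pick the collection name from the visibility flag (assumed routing rule)."""
    return "public-scenarios" if scenario["isPublic"] == "true" else "private-scenarios"


scenario = new_scenario("Demo", "Example run", "example-model", is_public=False)
print(target_collection(scenario), scenario["status"])
```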
output-catalog
The output catalog stores metadata information of each variable that will be extracted from the results of a scenario execution. Once executed, the values taken by the variable are appended to the model scenarios document along with the metadata from this collection.
Primary Representation
Collection Fields
Field | Type | Description |
---|---|---|
_id | string | Unique identifier. |
modelID | string | Model identifier referenced for this output. |
varName | string | Unique variable name on the target model. |
varBenchmarks | Object Array | Listing of benchmark metadata and values to provide additional context to the output. |
varBenchmarks.benchmarkLabel | string | A descriptive name for the presented benchmark. |
varBenchmarks.benchmarkLang | string | Target language of the benchmark (English or Spanish). |
varBenchmarks.benchmarkDescription | string | Detailed statement about the benchmark. |
varBenchmarks.benchmarkSource | string | Provenance of the benchmark value. |
varBenchmarks.benchmarkValue | number | A quantity or number of the current benchmark. |
varinfo | Object Array | Output metadata in different languages. |
varinfo.varLabel | string | Descriptive term of the output variable. |
varinfo.varCategory | string | Classification of the output variable. |
varinfo.lang | string | Output metadata language (English or Spanish). |
varinfo.varDescription | string | Detailed statement about the output variable. |
varinfo.varUnit | string | Unit of measure used for the output values. |
JSON Skeleton Representation
{
"_id": "",
"modelID": "",
"varName": "",
"varBenchMarks": [
{
"benchmarkLabel": "",
"benchmarkLang": "",
"benchmarkDescription": "",
"benchmarkSource": "",
"benchmarkValue": 1
}
],
"varinfo": [
{
"varLabel": "",
"varCategory": "",
"lang": "",
"varDescription": "",
"varUnit": ""
}
]
}
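Because `varinfo` holds one entry per language, a client has to pick the entry that matches the user's language. The Python sketch below shows one possible lookup with a fallback. The sample document, the `lang` values and the `localized_info` helper are hypothetical.

```python
# Hypothetical output-catalog document with metadata in two languages.
output_doc = {
    "varName": "storage",
    "varinfo": [
        {"varLabel": "Storage", "lang": "english", "varUnit": "m3"},
        {"varLabel": "Almacenamiento", "lang": "spanish", "varUnit": "m3"},
    ],
}


def localized_info(doc, lang, fallback="english"):
    """Return the varinfo entry for lang, else for fallback, else None."""
    entries = doc.get("varinfo", [])
    for wanted in (lang, fallback):
        for entry in entries:
            if entry.get("lang", "").lower() == wanted.lower():
                return entry
    return None


print(localized_info(output_doc, "spanish")["varLabel"])
```

The same pattern would apply to the other per-language arrays in this data model, such as `paramInfo` in the parameter catalog.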
options
Interface options for a model, e.g., the time series range.
Collection Fields
Field | Type | Description |
---|---|---|
_id | string | Unique identifier. |
name | string | Option name. |
type | string | Type of option. |
timestep | string | Type of time step. |
parameter | string | Link to model parameter unique name. |
JSON Skeleton Representation
{
"_id": "",
"modelID": "",
"name": "",
"type": "",
"timestep": "",
"parameter": ""
}
parameter-catalog
Holds the list of scientific model inputs and their metadata.
Primary Representation
Collection Fields
Field | Type | Description |
---|---|---|
_id | string | Unique identifier. |
modelID | string | Model identifier referenced for this input. |
dataType | string | Attribute of the parameter value, e.g., integer, double, boolean or string. |
paramName | string | Unique name used on the model source. |
definitionType | string | Whether the parameter values are defined by users, scenarios or statically in the model. |
maxValue | decimal | Numerical upper bound that the parameter value can reach. |
minValue | decimal | Numerical lower bound that the parameter value can reach. |
paramDefaultSource | string | Provenance of the default dataset values. |
structType | string | Type of structure in which the parameter values are organized, e.g., scalar, table or matrix. |
paramInfo | Object Array | Informative metadata about the parameter in different languages. |
paramInfo.paramCategory | string | Classification of the parameter. |
paramInfo.paramUnit | string | Unit of measurement used for the parameter values. |
paramInfo.paramLabel | string | Descriptive name of the current parameter. |
paramInfo.lang | string | Target language for the parameter information. |
paramInfo.paramDescription | string | Detailed statement about the current parameter. |
paramDefaultValue | Any | The values assigned to the parameter before user customization. |
JSON Skeleton Representation
{
"_id": "",
"modelID": "",
"dataType": "",
"paramName": "",
"definitionType": "",
"maxValue": 2,
"minValue": 1,
"paramDefaultSource": "",
"structType": "",
"paraminfo": [
{
"paramCategory": "",
"paramUnit": "",
"paramLabel": "",
"lang": "",
"paramDescription": ""
}
],
"paramDefaultValue": 1
}
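The `minValue` and `maxValue` fields allow a client to check a user-supplied value before submitting a scenario. The Python sketch below does this for a scalar parameter only; the sample document and the `check_scalar` helper are hypothetical, and table or matrix parameters would need their own handling.

```python
# Hypothetical parameter-catalog document, reduced to the fields used here.
param_doc = {
    "paramName": "example_rate",
    "dataType": "double",
    "structType": "scalar",
    "minValue": 0,
    "maxValue": 1,
    "paramDefaultValue": 0.5,
}


def check_scalar(doc, value):
    """Return a list of problems; an empty list means the value is acceptable."""
    name = doc["paramName"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return [f"{name}: expected a number, got {type(value).__name__}"]
    problems = []
    low, high = doc.get("minValue"), doc.get("maxValue")
    if low is not None and value < low:
        problems.append(f"{name}: {value} is below minValue {low}")
    if high is not None and value > high:
        problems.append(f"{name}: {value} is above maxValue {high}")
    return problems


print(check_scalar(param_doc, 2))
```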
set-catalog
Sets are fundamental building blocks of GAMS models that consist of a group of elements or members. This collection stores only sets that need to be changed dynamically before execution of a GAMS model. The rest of the set declarations remain embedded in the model source code.
Primary Representation
Collection Fields
Field | Type | Description |
---|---|---|
_id | string | Set unique identifier. |
modelID | string | Model identifier reference for this set. |
setName | string | Unique set name used on the model code. |
setValue | string | String value applied to the set. |
setLabel | string | Descriptive name of the set. |
setDescription | string | Detailed statement about the set. |
JSON Skeleton Representation
{
"_id": "",
"modelID": "",
"setName": "",
"setValue": "",
"setLabel": "",
"setDescription": ""
}
theme-catalog
Stores a list of scenario themes. A theme is a pre-defined scenario option that modifies the values of one or more model inputs, such as parameter and/or set values.
Primary Representation
Collection Fields
Field | Type | Description |
---|---|---|
_id | string | Unique identifier. |
modelID | string | Reference to model identifier. |
sourceLink | string | URL to the provenance of the theme values. |
order | integer | Order of appearance in the user interface from left to right and top to bottom. |
info | Object array | |
info.lang | string | |
info.title | string | Representative title of the theme. |
info.description | string | Detailed statement about the theme. |
info.category | string | Classification of the theme. |
info.sourceLabel | string | Descriptive name of the provenance of the theme values. |
info.appendix | string | Additional information about the theme. |
info.imgSource | string | Path to representative image file. |
info.imgCitation | string | Citation to provide attribution on used image. |
parameters | Object Array | Listing of parameters modified by the theme. |
parameters.paramName | string | Unique parameter name affected by the theme. |
parameters.showUser | boolean | Whether the theme data is visible to the user. |
parameters.paramValue | Any | Parameter values corresponding to the current theme. |
JSON Skeleton Representation
{
"_id": "",
"modelID": "",
"sourceLink": "",
"order": 1,
"info": [
{
"lang": "",
"title": "",
"description": "",
"category": "",
"sourceLabel": "",
"appendix": "",
"imgSource" : "",
"imgCitation" : ""
}
],
"parameters": [
{
"paramName": "",
"paramShow": false,
"paramValue": {}
}
]
}
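One way to picture how a theme modifies model inputs is the Python sketch below. It starts from the `paramDefaultValue` of each parameter-catalog entry and overrides the values named in the theme's `parameters` array. The sample data and the `apply_theme` helper are hypothetical, and ignoring theme entries that name unknown parameters is an assumption.

```python
import copy

# Hypothetical parameter-catalog entries and theme, reduced to the fields used here.
parameter_docs = [
    {"paramName": "rainfall", "paramDefaultValue": 100},
    {"paramName": "population", "paramDefaultValue": {"2020": 1.0}},
]
theme_doc = {
    "modelID": "example-model",
    "parameters": [
        {"paramName": "rainfall", "paramValue": 80},
        {"paramName": "unknown_param", "paramValue": 1},
    ],
}


def apply_theme(theme, parameters):
    """Return {paramName: value}, starting from defaults and applying theme values."""
    values = {
        p["paramName"]: copy.deepcopy(p["paramDefaultValue"]) for p in parameters
    }
    for override in theme["parameters"]:
        name = override["paramName"]
        if name in values:  # assumed: entries for unknown parameters are skipped
            values[name] = copy.deepcopy(override["paramValue"])
    return values


print(apply_theme(theme_doc, parameter_docs))
```

Deep copies keep the catalog documents untouched, which matters when `paramValue` is a nested table or matrix rather than a scalar.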
summary-catalog
Primary Representation
Collection Fields
Field | Type | Description |
---|---|---|
_id | string | Unique identifier. |
modelID | string | Reference to model identifier. |
info | Object Array | Human readable information. |
info.caption | string | Sentence with the name or title of the summary object. |
info.subcaption | string | Sentence with a subtitle or additional information about the summary object. |
info.units | string | Unit of measure. |
info.lang | string | Target human language. |
data | Object Array | Attributes and pointers to the target data. |
data.targets | String Array | List of unique dataset names (e.g., output variables). |
data.operation | string | Function that will be applied to the target datasets. |
data.upperLimit | number | Maximum value that the processed data can take. |
data.lowerLimit | number | Minimum value that the processed data can take. |
benchmarks | Object Array | Contains metadata that provides context to the summary object. |
benchmarks.acronym | string | An abbreviation of the benchmark name. |
benchmarks.label | string | Short descriptive term for the benchmark. |
benchmarks.value | number | Value of the benchmark resource. |
benchmarks.lang | string | Target language for the benchmark information. |
widget | Object container | Holds metadata for the visualization widget used for this summary object. |
widget.type | string | Specifies the widget that will be used to represent the summary object. |
widget.rangeLabel | String Array | Specifies labels for the value ranges of the summary object. |
visible | boolean | Used as the default visibility value to show or hide the summary object. |
JSON Skeleton Representation
{
"_id" : "",
"modelID" : "",
"info" : [
{
"caption" : "",
"subcaption" : "",
"units" : "",
"lang" : ""
}
],
"data":
{
"targets": [""],
"operation": "",
"upperLimit": 1,
"lowerLimit": 0
},
"benchmarks": [
{
"acronym": "",
"label": "",
"value": 1,
"lang" : ""
}
],
"widget" : {
"type": "",
"rangeLabel": [""]
},
"visible": true
}
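The `data` block names the target datasets and the function applied to them. The Python sketch below illustrates one possible reading: the values of all targets are pooled, the operation is applied, and the result is compared with the limits. The operation names (`sum`, `mean`, `max`, `min`), the pooling of targets, the sample data and the `summarize` helper are assumptions; the source does not list the values allowed in `data.operation`.

```python
from statistics import mean

# Hypothetical operation names mapped to Python functions.
OPERATIONS = {"sum": sum, "mean": mean, "max": max, "min": min}

# Hypothetical summary-catalog excerpt and scenario outputs.
summary_doc = {
    "data": {
        "targets": ["storage_a", "storage_b"],
        "operation": "sum",
        "upperLimit": 10,
        "lowerLimit": 0,
    }
}
scenario_outputs = {"storage_a": [1, 2], "storage_b": [3]}


def summarize(doc, outputs):
    """Apply data.operation to the pooled target values; report if within limits."""
    data = doc["data"]
    pooled = [value for name in data["targets"] for value in outputs[name]]
    result = OPERATIONS[data["operation"]](pooled)
    within_limits = data["lowerLimit"] <= result <= data["upperLimit"]
    return result, within_limits


print(summarize(summary_doc, scenario_outputs))
```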