The DataExchange library (https://documentation.aimms.com/dataexchange/index.html) is intended to offer support in AIMMS for data formats, such as JSON, XML and CSV, that are commonly used in web services. The basic mechanism for reading and writing data with the DataExchange library is to provide a mapping that maps data in any of these formats onto identifiers in an AIMMS model. Although creating such a mapping is not very difficult, it can amount to a lot of work.
In this post I want to explore whether and how we can provide some kind of standard format, generated by the DataExchange library itself, that can be used directly in most use cases for web services. More specifically, I'm looking for feedback on what follows below.
Let's start with a list of possible web services that we could consider:
- Calling a web service from within an AIMMS model that runs a Python/R script on data from the model.
- Calling an (AIMMS-provided) web service that runs an AIMMS model on our cloud platform, with data provided by the calling program/service.
- Calling a web service to store data from a model, or to retrieve data needed for a model (such a web service could, for instance, replace the current database interface in AIMMS and prevent the kinds of problems, such as latency, that our current database interface regularly suffers from, as the web service could be placed close to the database server).
Let's focus on the last use case, and explore how most of our customers use relational databases (as we do ourselves in AIMMS SC Navigator). Most customers store the values of one or more identifiers with the same domain in a single table. To be able to store data for multiple scenarios, one additional column is typically added to the primary key of all tables to hold the scenario key. Models then read data from such tables for one or more scenarios, and either add the complete data for a new scenario or replace the complete data for an existing scenario. Changing individual rows, by contrast, has quite some commonality with CDM and is, in my opinion, less suitable for our regular database interface. A minimal sketch of this pattern follows below.
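To make this concrete, here is a minimal sketch (in Python/pandas, with hypothetical identifier and column names that match the JSON example further below) of such a scenario-keyed table and of replacing the data of an existing scenario:

```python
import pandas as pd

# Hypothetical table: identifiers pn and ps share the domain (i, j),
# and a scenario column is prepended to the primary key.
table1 = pd.DataFrame(
    [
        ("base",      "i1", 1, 10.0, "a value"),
        ("base",      "i1", 2, 20.1, "another value"),
        ("scenario2", "i1", 1, 11.0, "a revised value"),
    ],
    columns=["scenario", "i", "j", "pn", "ps"],
)

# Replacing an existing scenario amounts to deleting all of its rows
# and appending the new ones:
new_rows = pd.DataFrame(
    [("scenario2", "i1", 1, 12.0, "yet another value")],
    columns=table1.columns,
)
table1 = pd.concat(
    [table1[table1["scenario"] != "scenario2"], new_rows],
    ignore_index=True,
)
```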
In SC Navigator, we have an additional layer of abstraction, the dataset, which collects data from multiple tables holding functionally similar data into a single dataset instance. The model data is then filled by mixing and matching data from instances of multiple datasets, each of which could have been created by any of the SC Navigator apps, or, with a 'database' web service, even by external applications/services.
Through model annotations we can indicate in the model which identifiers' data should be stored in a single table, and to which dataset such a table would belong. From such a collection of annotations we could automatically generate a database schema (as tools such as Mendix do), and use DataExchange functionality to write data to these generated tables, either directly or via a web service. The big benefit is that the entire data exchange with a model is standardized, and we can make sure that the data exchange with the database takes place in a performant manner, by properly normalizing all tables and putting the appropriate indexes on the database tables. We could even let CDM operate on such generated tables; one would lose the history of changes that CDM currently provides, because only the latest values would be stored, but individual changes committed by one client could still be passed on to other clients, as CDM does now.
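As an illustration, here is a minimal sketch of what such schema generation could look like. The annotation structure, column types and SQL dialect below are assumptions made for the sake of the example, not the actual DataExchange API:

```python
# Hypothetical annotation data: for each table, the domain (primary key)
# columns and the identifier (value) columns, with assumed SQL types.
annotations = {
    "Table1": {
        "dataset": "DataSet1",
        "domain": {"i": "VARCHAR(64)", "j": "INTEGER"},
        "identifiers": {"pn": "DOUBLE PRECISION", "ps": "VARCHAR(255)"},
    },
}

def create_table_ddl(name: str, spec: dict) -> str:
    # Prepend the dataset-instance key to the declared domain columns,
    # so that multiple scenarios/instances can live in the same table.
    columns = {"instance": "VARCHAR(64)", **spec["domain"], **spec["identifiers"]}
    column_defs = ",\n  ".join(f"{col} {sql_type}" for col, sql_type in columns.items())
    primary_key = ", ".join(["instance", *spec["domain"]])
    return f"CREATE TABLE {name} (\n  {column_defs},\n  PRIMARY KEY ({primary_key})\n);"

for table, spec in annotations.items():
    print(create_table_ddl(table, spec))
```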
To transfer the data as JSON, we could derive a mapping from those annotations that produces JSON as in the following example:
```json
{
  "dataSets": {
    "DataSet1": {
      "instance": "data of 07-09-2020",
      "tables": {
        "Table1": {
          "rows": [
            { "i": "i1", "j": 1, "pn": 10.0, "ps": "a value" },
            { "i": "i1", "j": 2, "pn": 20.1, "ps": "another value" }
          ]
        },
        "Table2": {
          "rows": [
            { "i": "i1", "j": 1, "k": 3, "qn": 10.0, "qs": "a value" },
            { "i": "i1", "j": 2, "k": 4, "qn": 20.1, "qs": "another value" }
          ]
        }
      }
    },
    "DataSet2": {
      "instance": "data of 07-09-2020",
      "tables": {
        "Table3": {
          "rows": [
            { "i": "i1", "rn": 10.0, "rs": "a value" },
            { "i": "i2", "rn": 20.1, "rs": "another value" }
          ]
        },
        "Table4": {
          "rows": [
            { "i": "i1", "k": 3, "sn": 10.0, "ss": "a value" },
            { "i": "i1", "k": 4, "sn": 20.1, "ss": "another value" }
          ]
        }
      }
    }
  }
}
```
Such a "standard" JSON format to exchange data is quite suitable for exchanging data with an AIMMS model-based webservice and can be easily generated from other applications/services, but also with Python webservices, as the JSON content of the individual tables can be directly read into or generated from Python Pandas. And it matches directly the format of the generated database table I proposed above (with potentially some metadata added in additional fields).
So, to summarize the benefits of my proposal:
- it standardizes all data exchange between the model, the database and external web services. This fits our guidance principle, whereby AIMMS provides as much guidance as possible to help create good applications and follow best practices
- it dramatically reduces the work modelers have to do to create mappings for exchanging data with various services
- it relieves modelers of the task of manually creating a performant database schema and the associated mapping in the model
- because of the standardization, it allows AIMMS to streamline the support it has to give on data exchange with databases.
Please let me know your reactions.