Evaluation of Contemporary Graph Databases for Efficient Persistence of Large-Scale Models


Journal of Object Technology. Published by AITO — Association Internationale pour les Technologies Objets, © JOT 2014. Online at http://www.jot.fm. To cite this article: Konstantinos Barmpis, Dimitrios S. Kolovos. Evaluation of Contemporary Graph Databases for Efficient Persistence of Large-Scale Models. In Journal of Object Technology, vol. 13, no. 3, 2014, pages 3:1–26. doi:10.5381/jot.2014.13.3.a3

Konstantinos Barmpis, Dimitrios S. Kolovos
Department of Computer Science, University of York, Heslington, York, YO10 5DD, UK. http://www.cs.york.ac.uk/

Abstract. Scalability in Model-Driven Engineering (MDE) is often a bottleneck for industrial applications. Industrial-scale models need to be persisted in a way that allows for their seamless and efficient manipulation, often by multiple stakeholders simultaneously. This paper compares the conventional and commonly used persistence mechanisms in MDE with novel approaches such as the use of graph-based NoSQL databases: prototype integrations of Neo4J and OrientDB with EMF are used to compare against relational database, XMI and document-based NoSQL database persistence mechanisms. It also compares and benchmarks two approaches for querying models persisted in graph databases, to measure their relative performance in terms of memory usage and execution time.

Keywords: scalability, persistence, model-driven engineering

1 Introduction

The popularity and adoption of MDE in industry has increased substantially in the past decade, as it provides several benefits compared to traditional software engineering practices, such as improved productivity and reuse [MFM+09], which allow systems to be built faster and cheaper. However, certain limitations of supporting tools, such as poor scalability, prevent wider use of MDE in industry [KPP08, MDBS09] and will need to be overcome. Scalability issues arise when large models (of the order of millions of model elements) are used in MDE processes. Scalability issues in MDE can be split into the following categories:

1. Model persistence: storage of large models; the ability to access and update such models with a low memory footprint and fast execution time.
2. Model querying and transformation: the ability to perform intensive and complex queries and transformations on large models with fast execution time.
3. Collaborative work: multiple developers being able to query, modify and version-control large-scale shared models in a non-invasive manner.

Previous works have suggested using relational and document NoSQL databases to improve performance and memory efficiency when working with large-scale models. This paper contributes to the study of scalable techniques for large-scale model persistence and querying by reporting on the results obtained by exploring two graph-based NoSQL databases (OrientDB and Neo4J), and by providing a direct comparison with previously proposed persistence mechanisms. This paper is an extended version of [BK12], with further analysis of the databases presented in Section 5 and results in Section 7. This work is used as the foundation for [BK13], which integrates scalable persistence with reliable versioning.

The remainder of the paper is organized as follows. Section 2 introduces MDE and NoSQL databases. Section 3 discusses other projects aiming to provide scalable model persistence. Section 4 introduces the Grabats query used for evaluating the technologies. Section 5 presents the design and implementation of two further prototypes for scalable model persistence based on the OrientDB and Neo4J graph-based NoSQL databases. Section 6 discusses two approaches for navigation and querying of models stored in such databases. Section 7 compares the produced prototypes with existing solutions in terms of performance. Finally, Section 8 discusses the application of these results and identifies interesting directions for further work.

2 Background

This section discusses the core concepts related to models, Model-Driven Engineering and NoSQL databases that will be used in the remainder of the paper.

2.1 Model-Driven Engineering

Model-Driven Engineering is an approach to software development that elevates models to first-class artefacts of the software engineering process. In MDE, models are living entities used to describe a system and (partly) automate its implementation through automated transformation to lower-level products. In order for models to be amenable to automated processing, they must be defined in terms of rigorously specified modeling languages (metamodels). The Eclipse Modeling Framework (EMF, http://www.eclipse.org/emf) is one of the most widely used frameworks that facilitate the definition and instantiation of metamodels, and a pragmatic implementation of the OMG Essential Meta Object Facility (EMOF) standard. In EMF, metamodels are defined using the Ecore metamodeling language, a high-level overview of which is illustrated in Figure 1. In Ecore, domain concepts are represented using EClasses. EClasses are organized in EPackages, and each EClass can contain EReferences to other EClasses in the metamodel as well as EAttributes, which are used to define the primitive features of instances of the EClass. Ecore also provides mechanisms for defining primitive types, enumerations, inheritance between EClasses and operation signatures (but not implementations). EMF metamodels can be instantiated both reflectively and through code generated through a two-stage transformation. The code generation process involves a model-to-model transformation, where the Ecore metamodel is transformed into an intermediate platform-specific model (GenModel) that enables engineers to define low-level details of the metamodel implementation, and a model-to-text transformation from the intermediate GenModel to Java.
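To make these Ecore concepts concrete, the following minimal sketch (not taken from the paper) defines a tiny metamodel programmatically and instantiates it reflectively using the standard EMF API; the "library"/"Book" names are invented for illustration.

```java
import org.eclipse.emf.ecore.EAttribute;
import org.eclipse.emf.ecore.EClass;
import org.eclipse.emf.ecore.EObject;
import org.eclipse.emf.ecore.EPackage;
import org.eclipse.emf.ecore.EcoreFactory;
import org.eclipse.emf.ecore.EcorePackage;

public class EcoreSketch {
    public static void main(String[] args) {
        EcoreFactory factory = EcoreFactory.eINSTANCE;

        // Metamodel: an EPackage containing one EClass with one EAttribute
        EPackage pkg = factory.createEPackage();
        pkg.setName("library");                      // illustrative names
        pkg.setNsURI("http://example.org/library");
        pkg.setNsPrefix("lib");

        EClass book = factory.createEClass();
        book.setName("Book");

        EAttribute title = factory.createEAttribute();
        title.setName("title");
        title.setEType(EcorePackage.Literals.ESTRING);

        book.getEStructuralFeatures().add(title);
        pkg.getEClassifiers().add(book);

        // Model: an EObject instantiated reflectively from the metamodel
        EObject aBook = pkg.getEFactoryInstance().create(book);
        aBook.eSet(title, "Model-Driven Engineering");
        System.out.println(aBook.eGet(title));
    }
}
```

The generative approach produces typed Java classes for the same metamodel (e.g. a Book interface with getTitle()/setTitle()), so the reflective eSet/eGet calls are replaced by ordinary accessors.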
Under both the reflective and the generative approach, at runtime EMF models comprise one or more Resources containing nested model elements (EObjects) that conform to the EClasses of the Ecore metamodel.

Figure 1 – Simplified diagram of the Ecore metamodeling language

By default, models in EMF are stored in a standard XML-based representation called XML Metadata Interchange (XMI), an OMG-standardized format designed to enhance tool interoperability. As XMI is an XML-based format, models stored in single XMI files cannot be partially loaded; as such, loading an XMI-based model requires reading the entire document using a SAX parser and converting it into an in-memory object graph that conforms to the respective Ecore metamodel. Consequently, XMI scales poorly for large models, both in terms of the time needed for upfront parsing and the resources needed to maintain the entire object graph in memory (the performance issues of XMI are further illustrated in Section 7).
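As an illustration of this upfront loading cost, the following sketch shows the standard way of loading an XMI model with EMF, assuming the model's metamodel is already registered in the EPackage registry; the file path is a placeholder.

```java
import org.eclipse.emf.common.util.URI;
import org.eclipse.emf.ecore.resource.Resource;
import org.eclipse.emf.ecore.resource.ResourceSet;
import org.eclipse.emf.ecore.resource.impl.ResourceSetImpl;
import org.eclipse.emf.ecore.xmi.impl.XMIResourceFactoryImpl;

public class XmiLoadSketch {
    public static void main(String[] args) {
        ResourceSet resourceSet = new ResourceSetImpl();
        resourceSet.getResourceFactoryRegistry().getExtensionToFactoryMap()
                .put("xmi", new XMIResourceFactoryImpl());

        // Loading parses the whole file up front: every model element in the
        // XMI document becomes an EObject held in memory before any query runs.
        Resource resource = resourceSet.getResource(
                URI.createFileURI("/path/to/model.xmi"), true);   // placeholder path
        System.out.println(resource.getContents().size() + " root element(s) loaded");
    }
}
```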
To address these limitations of XMI, persisting models in relational databases has been proposed. Examples of such approaches include the Connected Data Objects (CDO, http://www.eclipse.org/CDO/) project and Teneo-Hibernate (http://wiki.eclipse.org/Teneo/Hibernate). In this class of approaches, an Ecore metamodel is used to derive a relational schema as well as an object-oriented API that hides the underlying database and enables developers to interact with models that conform to the Ecore metamodel at a high level of abstraction. Such approaches eliminate the initial overhead of loading the entire model in memory by providing support for partial and on-demand loading of subsets of model elements. However, due to the nature of relational databases, such approaches, while better than XMI, are still largely inefficient, as demonstrated in Section 7. Due to the highly interconnected nature of most models, complex queries require multiple expensive table joins to be executed and hence do not scale well for large models. Even though Teneo-Hibernate attempts to minimize the number of tables generated (all subclasses of an EClass are placed in the same table as the EClass itself, resulting in a fraction of the tables otherwise required if a separate table were created for each EClass), the fact that the database consists of sparsely populated data results in increased insertion and query times, as demonstrated in the sequel. To overcome the limitations of relational databases for scalable model persistence, recent work [PCM11] has proposed using a NoSQL database instead. In the following paragraphs we discuss NoSQL databases and their application to scalable model persistence.

2.2 NoSQL Databases

The NoSQL ("Not Only SQL") movement is a contemporary approach to data persistence using novel, typically non-relational, storage approaches. NoSQL databases provide flexibility and performance as they are not limited by the traditional relational approach to data storage [Sto10]. Each type of NoSQL database is tailored for storing a particular type of data; the technology does not force the data to fit the relational model but attempts to make the database (as far as is feasible) fit the data it is intended to store [Ore10]. The NoSQL movement has become popular due to large, widely known and successful companies creating non-relational database implementations for their services, for example Amazon (the Dynamo database [DHJ+07]), Google (the Bigtable database [CDG+08]) and Facebook (the Cassandra database [LM10]).

There are four widely accepted types of NoSQL databases, which use distinct approaches to data persistence; three are described by [PPS11] and a fourth, more contemporary one is of increasing popularity:

1. Key-value stores consist of keys and their corresponding values, which allows data to be stored in a schema-less way. This allows millions of values to be searched in a fraction of the time needed by relational databases. Inspired by databases such as Amazon's Dynamo, such stores are tailored for handling terabytes of distributed key-value data.
2. Tabular stores (or Bigtable stores, named after the Google database) consist of tables which can have a different schema for each row; each row can be seen as having one large extensible column containing the data. Such stores aim at extending the classical relational database idea by allowing sparsely populated tables to be handled elegantly, as opposed to needing a large number of null fields in a relational database, which scales very poorly when the number of columns becomes large. Widely used examples of such stores are Bigtable [CDG+08] and HBase [Hba12].
3. Document databases consist of a set of documents (possibly nested), each of which contains fields of data in a standard format such as XML or JSON. They allow data to be structured in a schema-less way as such collections. Popular examples are MongoDB [Mon12] and OrientDB [Ori12].
4. Graph databases consist of a set of graph nodes linked together by edges (hence providing index-free adjacency of nodes). Each node contains fields of data, and querying the store commonly uses efficient graph-traversal algorithms to achieve performance; as such, these databases are optimized for traversal of highly interconnected data. Examples of such stores are Neo4J [Neo12] and the graph layer (OGraphDatabase) of OrientDB [Ori12].
NoSQL databases have a loosely defined set of characteristics and properties [Cat11]:

• They scale horizontally, having the ability to dynamically adapt to the addition of new servers.
• Data replication and distribution over multiple servers is used to cope with failure and to achieve eventual consistency.
• Eventual consistency is a weaker form of concurrency than ACID (Atomicity, Consistency, Isolation, Durability) transactions: a piece of data is not locked when it is accessed for a write operation; instead, data replication over multiple servers is used to cope with conflicts. Each database implements this in a different way and allows the administrator to alter configurations, making it either closer to ACID or increasing the availability of the store.
• Simple interfaces are provided for searching the data and calling procedures.
• Distributed indexes are used to store key data values for efficient searching.
• New fields can be added to records dynamically in a lightweight fashion.

The CAP theorem frames this approach: it states that a (NoSQL) database can choose to strengthen only two of the three principles of consistency, availability and partition tolerance, and has to (necessarily) sacrifice the third. Popular NoSQL stores choose to sacrifice consistency; BASE (Basically Available, Soft-state, Eventually consistent) defines this approach. NoSQL stores are seen to have the following limitations [Lea10]:

1. The lack of a standard querying language (such as SQL) results in the database administrator or the database creator having to manually create a form of querying.
2. The lack of ACID transactions results in skepticism from industry, where sensitive data may be stored.
3. Being a novel technology causes a lack of trust from large businesses, which can fall back on reliable SQL databases that offer widely used support, management and other tools.

2.2.1 Graph Databases

As this paper presents an approach using graph-based NoSQL databases, we present them in more detail here. Figure 2 describes the basic terminology used for property graphs, like the ones used in graph databases. Graph stores describe their constructs in this way, with nodes containing properties and relationships between them. Below we go into more depth about two specific graph databases, Neo4J and (the graph layer of) OrientDB.

Figure 2 – Diagram of property graph terminology: a 'wheel' node with a PART_OF relationship to a 'car' node (adapted from http://www.infoq.com/articles/graph-nosql-neo4j)

Neo4J. Neo4J is a popular, commercial graph database released under the GNU Public License (GPL) and the Affero GNU Public License (AGPL). Neo4J is implemented in the Java programming language and provides a programmatic way to insert data into and query embedded graph databases. Its core constructs are Nodes (which contain an arbitrary number of properties that can be dynamically added and removed at will) and Relationships, whereby a Node represents a mathematical graph node and a Relationship an edge between two nodes.
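As a concrete illustration of these constructs, the following sketch (not from the paper) builds the small property graph of Figure 2 with the embedded Neo4J Java API; it assumes a Neo4J 2.x-era API and uses a placeholder database path.

```java
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.RelationshipType;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;

public class PropertyGraphSketch {
    enum RelTypes implements RelationshipType { PART_OF }

    public static void main(String[] args) {
        GraphDatabaseService db = new GraphDatabaseFactory()
                .newEmbeddedDatabase("target/example-db");   // placeholder path

        try (Transaction tx = db.beginTx()) {
            Node wheel = db.createNode();                     // nodes carry properties
            wheel.setProperty("name", "wheel");
            wheel.setProperty("number", 4);

            Node car = db.createNode();
            car.setProperty("name", "car");
            car.setProperty("color", "red");

            wheel.createRelationshipTo(car, RelTypes.PART_OF); // a typed relationship (edge)
            tx.success();
        }
        db.shutdown();
    }
}
```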
OrientDB. OrientDB is a document-store database released under the Apache 2 License. It is also implemented in the Java programming language and provides a programmatic way to insert data into and query a document database from Java. OrientDB also has a graph layer which allows documents to have edges between them (edges are themselves backed by documents), emulating index-free adjacency of documents and hence effectively acting as a graph database. Its core constructs are ODocuments (which can contain an arbitrary number of properties that can be dynamically added and removed).
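A minimal sketch of the OrientDB document API follows; the database URL and the class/field names are placeholders, and the graph layer would additionally create edge documents between such documents.

```java
import com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx;
import com.orientechnologies.orient.core.record.impl.ODocument;

public class OrientDocumentSketch {
    public static void main(String[] args) {
        // "plocal" stores the database on the local filesystem; the path is a placeholder
        ODatabaseDocumentTx db = new ODatabaseDocumentTx("plocal:target/orient-example").create();
        try {
            ODocument car = new ODocument("Car");   // schema-less document of class "Car"
            car.field("name", "car");
            car.field("color", "red");
            car.save();

            ODocument wheel = new ODocument("Wheel");
            wheel.field("name", "wheel");
            wheel.field("number", 4);
            wheel.field("partOf", car);             // a link field referencing another document
            wheel.save();
        } finally {
            db.close();
        }
    }
}
```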
3 Related Work

While persisting large models in NoSQL databases is a relatively novel idea, it has already been explored, mainly using document-based databases. Below we briefly present a popular model repository in which a NoSQL store (the MongoDB document store) can be used to persist models. We also present Morsa, the first published work using a NoSQL store (also MongoDB) to tackle scalable model persistence, and its prototype tool.

3.1 The Connected Data Objects Repository (CDO)

CDO allows users to store and access models in repositories supported by a range of back-end stores. CDO's API is an extension of EMF's and allows for the seamless use of a remote store for accessing and manipulating models. CDO supports multiple different back-ends, such as relational databases and non-relational stores like the MongoDB NoSQL store.

Object-Relational Mapping. CDO handles EObjects as CDOObjects, which extend the EObject class by adding CDO-specific metadata. To store an EMF model in CDO, there are three main paths to pursue (see http://wiki.eclipse.org/Preparing_EMF_Models_for_CDO): the first is to migrate a Resource (for example an XMIResource) to a CDOResource (by copying all its contents to a new CDOResource); the second is to use a GenModel to create CDOObjectImpl objects by migrating the .genmodel file using the CDO Model Migrator; the third is to use DynamicCDOObjectImpl objects that result from new dynamic model elements added to a CDO session's package registry. The model files used by CDO can be annotated (such as with JPA EAnnotations) in order to customize how the model is stored; such annotations are not directly needed by CDO and are only useful if the back-end store supports them (such as Teneo/Hibernate, for example). Furthermore, it is worth noting that CDO's architecture, which is directly based on such a mapping, limits how much it can benefit from using other technologies, as it fundamentally represents model and metamodel data by means of tables.

Figure 3 – CDO client high-level architecture, adapted from the CDO Wiki (http://wiki.eclipse.org/CDO)

Client. From the client side, the regular EMF API can be used directly once a connection (session) has been established, but using advanced CDO-specific functionality (such as CDOView, which allows queries directly against the CDO store, or CDOTransaction, which allows for savepoints and rollbacks) requires additional dependencies on CDO. Furthermore, a native CDO user interface (UI) is provided for accessing, manipulating and querying models stored in the repository. This client architecture is shown in Figure 3.

Server. On the server side, the repository allows any form of storage to be easily plugged in (such as a MongoDB NoSQL database). It uses a proprietary model-based version control system (Audit Views) and supports collaborative development. More information about model repositories, and a comparison with file-based repositories of models, can be found in [BK13].
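To illustrate how a client works with a CDO repository through the EMF API, here is a hedged sketch; obtaining the CDOSession (transport, container setup, repository name) is deliberately left to the caller since it depends on the deployment, and the resource path is a placeholder.

```java
import org.eclipse.emf.cdo.eresource.CDOResource;
import org.eclipse.emf.cdo.session.CDOSession;
import org.eclipse.emf.cdo.transaction.CDOTransaction;
import org.eclipse.emf.cdo.view.CDOView;
import org.eclipse.emf.ecore.EObject;

public class CdoClientSketch {

    /** Stores a model root in a CDO repository and reads it back through a view. */
    public static void storeAndRead(CDOSession session, EObject modelRoot) throws Exception {
        CDOTransaction transaction = session.openTransaction();
        CDOResource resource = transaction.getOrCreateResource("/models/example"); // placeholder path
        resource.getContents().add(modelRoot);   // from here on, the regular EMF API applies
        transaction.commit();
        transaction.close();

        CDOView view = session.openView();       // read-only access, e.g. for queries
        CDOResource loaded = view.getResource("/models/example");
        System.out.println(loaded.getContents().size() + " root element(s) in the repository");
        view.close();
    }
}
```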
3.2 Morsa: NoSQL Model Persistence Prototype

Morsa [PCM11] is a prototype that attempts to address the issue of scalable model persistence by using a document-store NoSQL database (MongoDB) to store EMF models as collections of documents. Morsa stores one model element per document, with its attributes stored as key-value pairs alongside its other metadata (such as a reference to its EClass). Metamodel elements are stored in a similar fashion to model elements and are also represented as entries in an index document that maps each model or metamodel URI (the unique identifier of a model or metamodel element in the store) to an array of references to the documents that represent its root objects. A high-level overview of the architecture of Morsa is displayed in Figure 4, taken from [PCM11].

Figure 4 – Persistence back-end structure excerpt for Morsa

Morsa uses a load-on-demand mechanism which relies on an object cache that holds loaded model objects. This cache is managed by a configurable cache replacement policy that chooses which objects must be unloaded from client memory (should the cache be deemed full by the active configuration). While this is an effective storage technique and succeeds in improving upon the current paradigms, the use of a document-store database means that EReferences (which are serialized as document references) are stored inefficiently, which hampers insertion and query speed, as models tend to be densely interconnected with numerous references between their elements. Nevertheless, the discussions on the various caching techniques and cache replacement policies, as well as the different loading strategies, are very effective in conveying the large number of configurations possible in a single back-end persistence example, and how optimizing the storage of models of different sizes and types can be extremely complex. Hence, any solution aiming to tackle this challenge needs to be aware of these issues and experiment to find the optimal way to handle them in its specific context.

4 The Grabats 2009 Case Study and Query

To obtain meaningful evaluation results, we have evaluated all solutions using large-scale models extracted by reverse-engineering existing Java code. For this purpose, we have used the updated version of the JDTAST metamodel used in the SharenGo Java Legacy Reverse-Engineering MoDisco use case (http://www.eclipse.org/gmt/MoDisco/useCases/JavaLegacyRE/), presented in the Grabats 2009 contest [Gra12] described below, as well as the five models also provided in the contest. A subset of the Java JDTAST metamodel is presented in Figure 5. In this figure, there are TypeDeclarations, which are used to define Java classes and interfaces, MethodDeclarations, which are used to define Java methods (in classes or interfaces, for example), and Modifiers, which are used to define Java modifiers (like static or synchronized) for Java classes or Java methods.

Figure 5 – Small subset of the Java JDTAST metamodel
The Grabats 2009 contest comprised several tasks, including the case study used in this paper for benchmarking different model querying and pattern detection technologies. More specifically, task 1 of this case study is performed, using all of the case study's models, set0 – set4 (which represent progressively larger models, from one with 70,447 model elements (set0) to one with 4,961,779 model elements (set4)), all of which conform to the JDTAST metamodel. These models are injected into the persistence technologies used in the benchmark (insertion benchmark) and then queried using the Grabats 2009 task 1 query (query benchmark) [SJ09]. This query requests all instances of TypeDeclaration elements which declare at least one MethodDeclaration that has static and public modifiers and whose return type is the declaring type itself (i.e. singleton candidates).
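In plain Java over an in-memory EMF resource, the query can be sketched as follows. This is an illustrative reconstruction, not the paper's benchmark code; the JDTAST feature names used ("bodyDeclarations", "modifiers", "returnType") and the boolean "public"/"static" attributes of Modifier are assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import org.eclipse.emf.common.util.EList;
import org.eclipse.emf.common.util.TreeIterator;
import org.eclipse.emf.ecore.EObject;
import org.eclipse.emf.ecore.EStructuralFeature;
import org.eclipse.emf.ecore.resource.Resource;

public class GrabatsQuerySketch {

    /** Returns the TypeDeclarations declaring at least one public static method
     *  whose return type is the declaring type itself (singleton candidates). */
    public static List<EObject> singletonCandidates(Resource resource) {
        List<EObject> results = new ArrayList<EObject>();
        for (TreeIterator<EObject> it = resource.getAllContents(); it.hasNext();) {
            EObject type = it.next();
            if (!"TypeDeclaration".equals(type.eClass().getName())) continue;

            for (EObject decl : children(type, "bodyDeclarations")) {
                if (!"MethodDeclaration".equals(decl.eClass().getName())) continue;
                // simplified: assumes a "returnType" reference pointing straight at the declaring type
                EStructuralFeature returnType = decl.eClass().getEStructuralFeature("returnType");
                if (hasModifier(decl, "public") && hasModifier(decl, "static")
                        && returnType != null && type.equals(decl.eGet(returnType))) {
                    results.add(type);
                    break;
                }
            }
        }
        return results;
    }

    @SuppressWarnings("unchecked")
    private static EList<EObject> children(EObject owner, String featureName) {
        // assumes the named multi-valued feature exists on the element's EClass
        EStructuralFeature feature = owner.eClass().getEStructuralFeature(featureName);
        return (EList<EObject>) owner.eGet(feature);
    }

    private static boolean hasModifier(EObject method, String flagName) {
        // assumes Modifier elements with boolean attributes such as "public" and "static"
        for (EObject modifier : children(method, "modifiers")) {
            EStructuralFeature flag = modifier.eClass().getEStructuralFeature(flagName);
            if (flag != null && Boolean.TRUE.equals(modifier.eGet(flag))) return true;
        }
        return false;
    }
}
```

The two querying approaches benchmarked later in the paper evaluate this same pattern directly against the graph databases rather than against an in-memory resource.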
In the following sections we use the JDTAST metamodel as a running example to demonstrate our approach for persisting large-scale models in the Neo4J and OrientDB graph databases.

5 Persisting and Querying Large-Scale Models using Graph Databases

As discussed above, NoSQL databases have been shown to be a promising alternative that overcomes some of the limitations of relational databases for the persistence of large-scale models, briefly summarized in Section 2.1. Extending the study of the suitability of NoSQL databases for persistence of large-scale models, in this work prototype model stores based on Neo4J [Neo12] and OrientDB [Ori12] have been created, and their efficiency has been compared against the default EMF XMI text store and a relational database (using CDO with its default H2 store (http://www.h2database.com/html/main.html) as well as a MySQL store (http://www.mysql.com/) to integrate with EMF). This section discusses the design and implementation of the two model stores; to our knowledge, this is the first time graph databases have been used to persist large models.

Due to the highly interconnected nature of typical models, key-value stores as well as tabular NoSQL data stores were not considered, as they target a different class of problems (as explained in Section 2.2). Hence the decision was made to experiment with a document-based (and hybrid graph) database (OrientDB) and a pure graph database (Neo4J). After initial trials, the document layer of OrientDB lagged behind the graph layer (as can be expected given the nature of the data being stored), so the focus shifted to comparing two graph databases. The rationale behind choosing these technologies was that Neo4J is a particularly popular, stable and widespread graph database, while OrientDB not only provides both a document layer and a graph layer, but also has a flexible license, which Neo4J does not, as detailed in Subsections 5.1 and 5.2 below.

The Neo4J and OrientDB stores attempt to solve the aforementioned scalability issues by using graph databases to store large models. As such stores have index-free adjacency of nodes, we anticipate that retrieving subgraphs or querying a model will scale well. The main differences between the two prototypes lie in the fact that OrientDB's core storage is in documents (it uses a graph layer to handle the data as a graph), while Neo4J's core storage is a graph. In the following sections we present our approaches for persisting and querying models that conform to Ecore metamodels using Neo4J and OrientDB databases, and we then evaluate the performance of our two prototypes against XMI and CDO.

5.1 Neo4J

In our prototype, a Neo4J-based model store consists of the following (a code sketch of this mapping is given after the list):

• Nodes representing the model elements of the stored model. These nodes contain as properties all of the attributes of the element (as defined by the EClass it is an instance of) that are set.
• Relationships from model element nodes to other model element nodes. These represent the EReferences of the model element to other model elements.
• Nodes representing the EClasses of the metamodel(s) the stored models are instances of. These nodes only have an id property denoting the unique identifier (URI) of the metamodel they belong to, followed by their name; for example, org.amma.dsl.jdt.core/IJavaElement is the id of the EClass IJavaElement in the org.amma.dsl.jdt.core Ecore metamodel. These lightweight nodes are used to speed up querying by providing references to model elements that are instances of this EClass (ofType reference) or of EClasses that inherit from it (ofKind reference), as such types of queries are very common in model management programs (e.g. model transformations, code generators etc.). This is the only metamodel information stored in the database, as explained below.
• An index containing the ids of the EClasses and their location in the database. This allows typical queries (such as the Grabats query described above) to use an indexed EClass as a starting point, in order to find all model elements of a specific type, and then navigate the graph to return the required results.
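The following sketch illustrates this mapping for a single model element. It is an illustrative reconstruction rather than the prototype's actual code: it assumes an already-open Neo4J transaction, and it uses the legacy node-index API with an "eclasses" index name chosen for the example.

```java
import org.eclipse.emf.ecore.EAttribute;
import org.eclipse.emf.ecore.EObject;
import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.index.Index;

public class Neo4jMappingSketch {

    /** Persists one model element: its set EAttributes become node properties, and the
     *  lightweight, indexed EClass node is linked to it via an "ofType" relationship.
     *  Assumes it is called inside an open transaction. */
    public static Node persistElement(GraphDatabaseService db, EObject element) {
        Index<Node> eClassIndex = db.index().forNodes("eclasses");   // illustrative index name

        String eClassId = element.eClass().getEPackage().getNsURI()
                + "/" + element.eClass().getName();
        Node eClassNode = eClassIndex.get("id", eClassId).getSingle();
        if (eClassNode == null) {                       // create the lightweight EClass node once
            eClassNode = db.createNode();
            eClassNode.setProperty("id", eClassId);
            eClassIndex.add(eClassNode, "id", eClassId);
        }

        Node elementNode = db.createNode();
        for (EAttribute attribute : element.eClass().getEAllAttributes()) {
            if (element.eIsSet(attribute)) {            // only attributes that are set are stored
                elementNode.setProperty(attribute.getName(), element.eGet(attribute).toString());
            }
        }
        eClassNode.createRelationshipTo(elementNode, DynamicRelationshipType.withName("ofType"));
        // "ofKind" relationships from supertypes, and relationships for each EReference,
        // would be created in the same way
        return elementNode;
    }
}
```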
The above data contains all of the information required to load a model and evaluate any EMF query, provided that the metamodel(s) of the model are also available (e.g. registered in the EMF metamodel registry, see http://www.eclipse.org/epsilon/doc/articles/epackage-registry-view/) during the evaluation of the query, as detailed metamodel information is not saved in the database (only the model information is stored in full). This is because metamodels are typically small and sufficiently fast to navigate using the default EMF API. Thus, any action that requires such metamodel data, like querying whether an element can have a certain property (as it may be unset, and therefore not stored in the database) or whether a reference is a containment reference (as the database only stores the reference's name), will need to access the metamodel (e.g. through the EMF registry), retrieve the EClass in question and extract this information from there. Note that the database supports querying a model, inserting a new model (from a file-based EMF model) as well as updating a model (adding or removing elements or properties).

Figure 6 shows how a model conforming to the Java metamodel described above is stored in Neo4J. EClasses only store their full name (including their EPackage) and have ofType and ofKind relationships to their instances (we note that in EMF such relationships are not references but results of applying the .eClass() operation on the model element). Model elements store all their properties and relationships to other model elements (as well as to their EClass and superclass(es)). Note that even though all Neo4J relationships are bi-directional, they have a start and an end node, so they can be treated in the same way as references internally. Opposite references are similarly created (but with the start and end node reversed), linked to one another internally, and treated as usual.

Figure 6 – Example high-level mapping from Ecore to Neo4J (a TypeDeclaration instance 'AptPlugin' with a bodyDeclarations relationship to a MethodDeclaration instance 'getPlugin', each linked to its EClass node by an ofType relationship)
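A query can then start from the indexed EClass node and follow its ofType relationships, as in the following illustrative sketch (same assumed index name and relationship direction as the insertion sketch above; to be run inside a transaction):

```java
import java.util.ArrayList;
import java.util.List;
import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;

public class Neo4jTypeQuerySketch {

    /** Finds all model-element nodes that are direct instances of the EClass with the
     *  given id (e.g. "org.amma.dsl.jdt.dom/TypeDeclaration"), using the EClass index
     *  as the starting point of the traversal. */
    public static List<Node> allInstancesOf(GraphDatabaseService db, String eClassId) {
        List<Node> instances = new ArrayList<Node>();
        Node eClassNode = db.index().forNodes("eclasses").get("id", eClassId).getSingle();
        if (eClassNode != null) {
            for (Relationship ofType : eClassNode.getRelationships(
                    DynamicRelationshipType.withName("ofType"), Direction.OUTGOING)) {
                instances.add(ofType.getEndNode());
            }
        }
        return instances;
    }
}
```

The remaining filtering of a query such as Grabats (modifiers, return type) would then proceed by navigating the relationships and properties of each returned node.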
5.1.1 Transactions and I/O

The default mechanism for querying the database uses ACID transactions in a similar manner to SQL databases, and any number of operations can be carried out per transaction (such as the creation of a node, the creation of an edge, or the creation of a property on a node). The database uses the operating system's memory-mapped I/O (MMIO) in order to increase performance. Hence, if a transaction contains too many operations (more than the allocated MMIO, or even the maximum Java heap, can handle) then the performance of the transaction will suffer. On the other hand, if transactions perform too few operations then many more transactions are needed to perform a task, and hence its overall run time will be longer. An equilibrium therefore needs to be found for the size of transactions, dependent on the total memory allocated to the process.
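One common way to strike this balance, shown in the hedged sketch below, is to commit insertions in fixed-size batches so that no single transaction outgrows the configured heap and MMIO; the batch size is an illustrative constant, and persistElement refers to the mapping sketch given earlier.

```java
import org.eclipse.emf.ecore.EObject;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Transaction;

public class BatchedInsertSketch {

    static final int BATCH_SIZE = 50_000;   // illustrative; tune against heap and MMIO settings

    static void insertInBatches(GraphDatabaseService db, Iterable<EObject> elements) {
        Transaction tx = db.beginTx();
        int inBatch = 0;
        try {
            for (EObject element : elements) {
                Neo4jMappingSketch.persistElement(db, element);   // from the sketch above
                if (++inBatch % BATCH_SIZE == 0) {
                    tx.success();
                    tx.close();          // commit this batch (tx.finish() on older Neo4J versions)
                    tx = db.beginTx();   // start a fresh transaction for the next batch
                }
            }
            tx.success();
        } finally {
            tx.close();
        }
    }
}
```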
A further balance needs to be found between the Java heap and the MMIO, which together add up to the total memory used by the process. Furthermore, the memory available for insertions is reduced by the XMI resource being loaded (which uses a considerable amount of memory), which needs to be taken into account for large models. Empirical tests have shown that using: