NoSQL databases thesis

Discovering NoSQL Schemas
  1. Comparison of relational, NoSQL and NewSQL databases
  2. Introduction
  3. Referential Integrity in Cloud NoSQL Databases
  4. NoSQL Databases: A Software Engineering Perspective

As a second contribution, we apply the first framework to seven open-source TSDBs and the second to two.

Comparison of relational, NoSQL and NewSQL databases

This also serves as a test scenario to evaluate whether the frameworks support meaningful comparisons. The remainder of this paper is organized as follows: Section 0 explains the essential background on distributed systems, NoSQL databases and time series databases. Section 0 is devoted to related work in the field of time series databases and identifies the gap in the literature. In Section 0 we present two frameworks for comparing time series databases: the feature-oriented and the quality-oriented comparison framework.

Section 0 summarizes the preceding results and gives an outlook on further work. In the following, we review the essentials of distributed systems, especially the CAP theorem and architectural styles.


Furthermore, we define time series data in detail and describe the characteristics and features of time series databases. A distributed system is a network of computers that communicate with each other in order to deliver results. Together they can form a complete information system that appears as a single system to agents outside it. For distributed systems, the choice of architectural style raises important design and evaluation questions and has various implications for the system's characteristics.

The two general architectural styles are master-slave systems and peer-to-peer systems. Furthermore, information systems consist of various layers. They can also be designed with a component-based system architecture. Furthermore, component-based architectures that run as application servers are often seen as a distributed computing environment [57]. NoSQL databases differ fundamentally from traditional relational databases in that they are not restricted to strict ACID properties and are optimized to handle large datasets in a distributed manner.

This leads to performance-related advantages such as lower latency and higher throughput, as well as considerable potential to scale and thus a means to keep systems stable. Popular classes of NoSQL databases are key-value stores, wide-column stores and document stores [51]. A time series has special characteristics and can be applied in various fields. A time series is a sequence of events recorded and stored in time order. Different fields use different specialized terms for it.

Common fields of application are physics, medicine, economics, IT infrastructure, finance and scientific experiments in general. Time series data is composed of metrics and tags. A metric is a sequence of numerical data in time order, consisting of a title and several time-value pairs. Time series data is usually enriched with tags, i.e. metadata carrying additional information, e.g. a CPU ID. The ultimate goal is to store the series in a table and to plot it as a graph for monitoring.
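As a minimal sketch, the composition of a metric (a title plus time-value pairs) and its tags could be represented as follows. The class name TimeSeries, its field names and the sample values are assumptions made for illustration; they are not taken from any particular TSDB.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TimeSeries:
    """One metric with its tags (hypothetical illustration)."""

    metric: str                                   # title of the metric
    tags: dict[str, str] = field(default_factory=dict)              # metadata
    points: list[tuple[int, float]] = field(default_factory=list)   # (time, value)

    def add(self, timestamp: int, value: float) -> None:
        """Append a time-value pair and keep the series in time order."""
        self.points.append((timestamp, value))
        self.points.sort(key=lambda point: point[0])


# Usage: a CPU load metric tagged with the CPU it was measured on.
cpu_load = TimeSeries("cpu.load", tags={"cpu_id": "0"})
cpu_load.add(20, 0.50)
cpu_load.add(10, 0.25)   # arrives late, is sorted into place
```

Keeping the points sorted on insert is only one possible choice; a real system would more likely sort or index at storage level.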

Referential Integrity in Cloud NoSQL Databases

Other uses lie in time series analysis, which defines approaches for investigating past data to gather meaningful statistics, and in time series forecasting, which predicts future developments based on meaningful analytics [7]. Below we show how time series data is constructed:

TSDBs have a few unique characteristics that must be considered when designing and implementing such systems. They follow from the properties of time series data [22]. The query functionality of TSDBs is very specific. Filters select particular time series by metric name, tag title-value pair or time range. Aggregations apply specific functions to aggregate data points.

Down-sampling changes the displayed time resolution and speeds up plotting because the data volume is reduced. Some TSDBs also offer automated queries. An important factor is the cardinality of metrics. Storing time series data with a number of tags from different categories allows meaningful selections and aggregations. For example, time series data on the CPU workloads of various servers can be stored together with tags for their cluster region, server rack, manufacturer, etc.
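The filter, aggregation and down-sampling operations described above can be sketched in a few lines. This is an illustration under stated assumptions, not the query interface of any specific TSDB: the function names select and downsample, the tuple layout of a series and the sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Assumed layout of one series: (metric name, tags, [(timestamp, value), ...])


def select(series, metric, tags=None, start=None, end=None):
    """Filter by metric name, tag title-value pairs and an inclusive time range."""
    tags = tags or {}
    result = []
    for name, series_tags, points in series:
        if name != metric:
            continue
        if any(series_tags.get(key) != value for key, value in tags.items()):
            continue
        kept = [
            (ts, v) for ts, v in points
            if (start is None or ts >= start) and (end is None or ts <= end)
        ]
        result.append((name, series_tags, kept))
    return result


def downsample(points, bucket_seconds, aggregate=mean):
    """Reduce the time resolution by aggregating all points in each bucket."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)
    return [(start, aggregate(values)) for start, values in sorted(buckets.items())]


data = [
    ("cpu.load", {"region": "eu"}, [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0)]),
    ("cpu.load", {"region": "us"}, [(0, 2.0)]),
    ("mem.used", {"region": "eu"}, [(0, 9.0)]),
]
eu_cpu = select(data, "cpu.load", tags={"region": "eu"}, start=0, end=90)
per_minute_avg = downsample(eu_cpu[0][2], bucket_seconds=60)
per_minute_max = downsample(eu_cpu[0][2], bucket_seconds=60, aggregate=max)
```

Passing the aggregation function as a parameter keeps the same bucketing logic usable for averages, maxima or sums.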

Other design questions include how to deal with automated data purging [8], [22]. Traditional time series databases are database management systems that are optimized for managing time series data and are implemented on top of relational databases. This is done by creating a data schema that includes ID, metric and dateTime columns. Depending on the data model, this approach can also perform well.
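A relational schema of this kind might look like the following sketch, which uses Python's built-in sqlite3 module with an in-memory database. The ID, metric and dateTime columns come from the description above; the table name datapoints, the value column, the index and the sample rows are assumptions added to make the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# ID, metric and dateTime follow the text; "value" is an assumed extra column.
conn.execute(
    """
    CREATE TABLE datapoints (
        id       INTEGER PRIMARY KEY,
        metric   TEXT NOT NULL,
        dateTime TEXT NOT NULL,   -- ISO 8601 timestamp
        value    REAL NOT NULL
    )
    """
)
# An index on (metric, dateTime) supports typical range queries per metric.
conn.execute("CREATE INDEX idx_metric_time ON datapoints (metric, dateTime)")

rows = [
    ("cpu.load", "2024-01-01T10:15:00", 0.25),
    ("cpu.load", "2024-01-01T10:45:00", 0.75),
    ("cpu.load", "2024-01-01T11:05:00", 1.0),
    ("mem.used", "2024-01-01T10:15:00", 512.0),
]
conn.executemany(
    "INSERT INTO datapoints (metric, dateTime, value) VALUES (?, ?, ?)", rows
)

# A time series-specific query: hourly average of a single metric.
hourly = conn.execute(
    """
    SELECT strftime('%Y-%m-%dT%H:00:00', dateTime) AS hour, AVG(value)
    FROM datapoints
    WHERE metric = ?
    GROUP BY hour
    ORDER BY hour
    """,
    ("cpu.load",),
).fetchall()
```

Grouping queries like the last one are the kind of statistics-related request that grows expensive as the table fills with many small data points.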

Because of their ACID characteristics, relational databases do not scale easily. Problems often arise when storage limits are reached: individual data points are small, but there are usually very many of them. Furthermore, time series-specific queries such as groupings of data and other statistics-related requests can become very resource-intensive and difficult for relational databases to handle [22].

NoSQL-based time series databases have been developed to replace traditional TSDBs because the characteristics of traditional systems no longer meet increased requirements [22], [51]. Scaling by adding further machines is called horizontal scalability, and NoSQL systems achieve it in a simple and inexpensive manner. Even with commodity hardware, high reliability can be guaranteed.

Earlier systems required expensive hardware to guarantee reliable operation. Furthermore, scalability supports replication mechanisms and thus increases reliability. Such systems can handle large and changing user populations. These features and abilities are addressed differently by the various TSDBs.

I like to say, there's always a schema, even in so-called schemaless databases. It's just that your application code has to enforce the schema when you use NoSQL. But the logical entities that your application uses must be described.

An entity-relationship diagram is a good way of showing that. It's less common in a NoSQL database to have lots of entities. Also, some NoSQL databases don't have any support for inter-entity relationships; they're all islands.
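To illustrate what it can mean for application code to enforce the schema, here is a minimal sketch. The field names in REQUIRED_FIELDS, the functions validate and insert, and the plain list standing in for a schemaless collection are all hypothetical; a real application would call its database driver instead.

```python
# The "schema" lives in application code: required field names and their types.
REQUIRED_FIELDS = {"metric": str, "timestamp": int, "value": float}

store = []  # stand-in for a schemaless collection (assumption for this sketch)


def validate(document: dict) -> dict:
    """Reject documents that do not match the schema the application expects."""
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in document:
            raise ValueError(f"missing field: {name}")
        if not isinstance(document[name], expected_type):
            raise TypeError(f"field {name!r} must be {expected_type.__name__}")
    return document


def insert(document: dict) -> None:
    """Store a document only after it has passed validation."""
    store.append(validate(document))


insert({"metric": "cpu.load", "timestamp": 10, "value": 0.25, "cpu_id": "0"})
```

Extra fields such as cpu_id are allowed through here, which is one possible design choice: the application fixes the core of the entity and leaves the rest flexible.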


NoSQL Databases: A Software Engineering Perspective
