Relational databases will persist because of ACID guarantees, ERDs, concurrency control, transaction management, and SQL capabilities.  It also helps that major software can easily integrate with these databases.  However, new alternatives keep appearing because of the impedance mismatch: the resource cost that computational systems pay when data is pushed and pulled between in-memory structures and the database.  This cost compounds quickly with large amounts of data.  Industry wants and needs to use parallel computing with clusters to store, retrieve, and manipulate large amounts of data.  Data can also be aggregated into units of similar items, and strict data consistency can be relaxed in real-life applications, since their transactions can actually be divided into multiple phases (MUSE, 2015a).

Think of a bank transaction: not all the transactions you make at the same time are processed at the same time.  They may show up on your mobile device (mobile database), but they may not be committed until a few hours or days later.  In my case, the bank withdraws my mortgage payment from my checking account on the first of every month but applies it to the loan on the second.  For those 24 hours, my payment is pending.
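The phased behavior in the bank example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any bank actually works: the `Ledger` class, the `withdraw_pending` and `settle` method names, and the amounts are all hypothetical. It shows that the checking account and the loan disagree between the two phases, and agree again once the pending payment is applied.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Toy two-phase payment: withdraw now, apply to the loan later."""
    checking: int                      # balance in cents
    loan: int                          # outstanding loan in cents
    pending: list = field(default_factory=list)

    def withdraw_pending(self, amount: int) -> None:
        # Phase 1 (the first of the month): money leaves checking,
        # but the loan has not been credited yet.
        self.checking -= amount
        self.pending.append(amount)

    def settle(self) -> None:
        # Phase 2 (the second of the month): pending payments are applied.
        while self.pending:
            self.loan -= self.pending.pop()


ledger = Ledger(checking=500_000, loan=20_000_000)
ledger.withdraw_pending(150_000)
# Between phases the two accounts are temporarily inconsistent.
print(ledger.checking, ledger.loan, ledger.pending)
ledger.settle()
# After settlement the system has converged to a consistent state.
print(ledger.checking, ledger.loan, ledger.pending)
```

The point of the sketch is that the application tolerates the in-between state, which is the kind of relaxed consistency many NoSQL systems trade for scale.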

These ideas created a movement to support “Not Only SQL” databases, best known as NoSQL, a name derived from the Twitter hashtag #NoSQL.  NoSQL includes aggregate-oriented databases such as key-value, document, and column-family stores, as well as aggregate-ignorant databases such as graph databases (Sadalage & Fowler, 2012).  These can be schemaless databases, where data can be stored without any predefined schema.  NoSQL is best for application-specific databases, not as a substitute for all relational databases (MUSE, 2015b).

Originally meant for open-source, distributed, nonrelational databases such as Voldemort, Dynomite, CouchDB, MongoDB, and Cassandra, the term expanded in its definition and in the applications/platforms it can take on.  CQL, Cassandra’s query language, was written to act like SQL in most cases but to act differently when needed (Sadalage & Fowler, 2012), hence the No in NoSQL.

Suitable Applications

According to Cassandra Planet (n.d.), NoSQL is best for large data sets (big data, complex data, and data mining):

  • Graph: where data relationships are graphical and interconnected like a web (ex: Neo4j & Titan)
  • Key-Value: data is stored and indexed by a key (ex: Cassandra, DynamoDB, Azure Table Storage, Riak, & BerkeleyDB)
  • Column Store: stores tables as columns rather than rows (ex: HBase, Bigtable, & Hypertable)
  • Document: can store more complex data, with each document having a key (ex: MongoDB & CouchDB).
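To make the four categories above more concrete, here is a rough sketch that uses plain Python structures as stand-ins for each data model. No real database client is involved, and the customer, order, and key names are made up for illustration only.

```python
# Hypothetical data: one customer and one order. Plain Python structures
# stand in for the four NoSQL data models; no real database is used.

# Key-value: an opaque value looked up by a single key.
key_value = {"customer:42": b'{"name": "Ada", "city": "Denver"}'}

# Document: a structured, nested value the store can look inside.
document = {
    "order:1001": {
        "customer": {"id": 42, "name": "Ada"},
        "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
    }
}

# Column store: values grouped by column, so one column can be read alone.
column_store = {
    "name": {42: "Ada", 43: "Grace"},
    "city": {42: "Denver"},            # a missing cell is simply absent
}

# Graph: nodes plus labelled edges describing how the nodes relate.
graph = {
    "Ada": [("PLACED", "order:1001")],
    "order:1001": [("CONTAINS", "A1"), ("CONTAINS", "B7")],
}


def neighbors(node, label):
    """Follow edges with the given label out of one node."""
    return [target for edge, target in graph.get(node, []) if edge == label]


print(document["order:1001"]["items"][0]["sku"])
print(sorted(column_store["name"].values()))
print(neighbors("order:1001", "CONTAINS"))
```

The first three shapes keep related data together as one unit under a key, while the graph shape spreads it across nodes and edges so that relationships can be traversed directly.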

System Platform

Relational databases carry a resource cost, so as industry looks to handle large amounts of data, we can gravitate towards NoSQL.  To process all that data, we may need to use parallel computing with clusters to store, retrieve, and manipulate it.
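One common way a cluster spreads data across machines is to hash each key and let the hash decide which node owns it. The sketch below is a minimal, in-process illustration of that idea using simple modulo sharding; the node names, the `node_for` helper, and the `Cluster` class are hypothetical and do not correspond to any particular product.

```python
import hashlib
from collections import defaultdict

NODES = ["node-a", "node-b", "node-c"]   # hypothetical cluster members


def node_for(key: str, nodes=NODES) -> str:
    """Pick a node from a stable hash of the key (simple modulo sharding)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


class Cluster:
    """Toy cluster: each node is just a dict held in this process."""

    def __init__(self, nodes=NODES):
        self.nodes = list(nodes)
        self.storage = defaultdict(dict)

    def put(self, key, value):
        self.storage[node_for(key, self.nodes)][key] = value

    def get(self, key):
        return self.storage[node_for(key, self.nodes)].get(key)


cluster = Cluster()
for i in range(1000):
    cluster.put(f"sensor:{i}", {"reading": i})

# How many keys each node ended up holding.
print({node: len(items) for node, items in cluster.storage.items()})
print(cluster.get("sensor:7"))
```

Modulo sharding is the simplest choice, but it moves most keys whenever the number of nodes changes, which is one reason distributed stores often use consistent hashing instead.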


Adv Topics: The Internet of Things and Web 4.0

The Internet of Things (IoT) is the explosion of device/sensor data, which is growing the amount of structured data exponentially with tremendous opportunities (Jaffe, 2014; Power, 2015). Both Atzori (2010) and Patel (2013) classified Web 4.0 as the symbiotic web, where data interactions occur between humans and smart devices, that is, the IoT. These smart devices can be wired to the internet or connected via wireless sensors through enhanced communication protocols (Atzori, 2010). Thus, these smart devices would have read and write concurrency with humans, and the largest potential of Web 4.0 lies in having these smart devices analyze data online and begin to migrate the online world into reality (Patel, 2013). Besides interacting with the internet and the real world, IoT smart devices would be able to interact with each other (Atzori, 2010). Sakr (2014) stated that this web ecosystem is built on four key items:

  • Data devices where data is gathered from multiple sources that generate the data
  • Data collectors are devices or people that collect data
  • Data aggregation from the IoT, people, Radio Frequency Identification (RFID) tags, etc.
  • Data users and data buyers are people that derive value out of the data

Potential application areas that could benefit from IoT include assisted living, e-health, enhanced learning, government, retail, financial, automation, industrial manufacturing, logistics, business/process management, and intelligent transport (Sakr, 2014; Atzori, 2010). Atzori (2010) suggests that there are three different definitions or visions of IoT use, based on the device’s orientation:

  • Things-oriented devices, which are designed for status and traceability of objects via RFID or similar technology
  • Internet-oriented devices, which are designed around a lightweight internet protocol so that the device is addressable and reachable via the internet
  • Semantic-oriented devices, which aid in reasoning over the data they generate by exploiting models

Some IoT devices can fall under one, two, or all three of these definitions or visions.

Performance Bottlenecks for IoT

As of 2016, IoT has two main issues if it is left on its own and not tied to anything else (Jaffe, 2014; Newman, 2016):

  • The devices cannot deal with the massive amounts of data generated and collected
  • The devices cannot learn from the data they generate and receive

Thus, artificial intelligence (AI) should be able to store and mine all the data gathered from a wide range of sensors to give it meaning and value (Canton, 2016; Jaffe, 2014). AI would bring out the potential of IoT by quickly and naturally collecting, analyzing, organizing, and feeding valuable data to key stakeholders, transforming the field from the standard IoT into the Internet of Learning-Things (IoLT) (Jaffe, 2014; Newman, 2016). However, this would mean changing the infrastructure of the web to handle IoLT or IoT. Atzori (2010) listed some of the potential performance bottlenecks for IoT at the network level:

  • The vast number of internet-oriented devices will take up the last few IPv4 addresses, so there is a need to move to IPv6 to support all the devices that will come online soon. This is just one version of the indexing problem.
  • Things-oriented and internet-oriented devices could spend time in sleep mode, which is not typical for current devices using the existing IP networks.
  • IoT devices connecting to the internet produce smaller packets of data at a higher frequency than current devices.
  • Each device would have to use the same common interface and standard protocols as other devices, which can quickly flood the network and increase the complexity of middleware software layer design.
  • IoT comprises a vast variety of objects, where each device has its own function and its own way of communicating. There is a need to create a level of abstraction to homogenize data transfer and data access through a standard process.
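The level of abstraction described in the last bullet can be pictured as a thin adapter layer that converts each device's native payload into one common shape. The following is a minimal sketch under assumed payload formats: the `Reading` type, the two device types, and their JSON and CSV payloads are all invented for illustration.

```python
from dataclasses import dataclass
import json


@dataclass(frozen=True)
class Reading:
    """Common shape every device payload is converted into (hypothetical)."""
    device_id: str
    metric: str
    value: float
    unit: str


def from_thermostat(raw: str) -> Reading:
    # Assumed payload: JSON such as {"id": "t-1", "temp_f": 68.0}
    data = json.loads(raw)
    celsius = (data["temp_f"] - 32) * 5 / 9
    return Reading(data["id"], "temperature", round(celsius, 2), "C")


def from_rfid_gate(raw: str) -> Reading:
    # Assumed payload: CSV such as "gate-9,3" (gate id, number of tags seen)
    gate_id, count = raw.split(",")
    return Reading(gate_id, "tags_seen", float(count), "count")


ADAPTERS = {"thermostat": from_thermostat, "rfid_gate": from_rfid_gate}


def normalize(device_type: str, raw: str) -> Reading:
    """Single entry point: callers never see device-specific formats."""
    try:
        adapter = ADAPTERS[device_type]
    except KeyError:
        raise ValueError(f"no adapter registered for {device_type!r}") from None
    return adapter(raw)


print(normalize("thermostat", '{"id": "t-1", "temp_f": 68.0}'))
print(normalize("rfid_gate", "gate-9,3"))
```

Supporting a new kind of device then means registering one more adapter, while everything downstream keeps working with `Reading` objects.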

One proposed solution is to use NoSQL (Not only Structured Query Language) databases to help with the collection, storage, and analysis of IoT data, which is heterogeneous, lacks a common interface with standard protocols, and comes in various sizes. This can solve one aspect of the indexing problem of IoT. NoSQL databases store data in non-relational forms, i.e., graph, document store, column-oriented, key-value, and object-oriented databases (Sadalage & Fowler, 2012; Services, 2015).

  • Document stores use key/value pairs whose values can hold data in JSON, BSON, or XML
  • Graph databases use network diagrams to show the relationships between items in a graphical format
  • Column-oriented databases are well suited to sparse datasets, where data is grouped together in columns rather than rows
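A small sketch can show why heterogeneous IoT readings suit these models. It assumes three made-up device readings, stores them as JSON documents under keys, and then regroups the same values by column to compare how many cells a fixed-schema table would need with how many values actually exist. The field names and values are hypothetical.

```python
import json

# Hypothetical readings from three device types; each document carries
# only the fields that its device actually produces.
documents = {
    "r1": {"device": "thermostat", "temp_c": 20.0},
    "r2": {"device": "rfid_gate", "tags_seen": 3},
    "r3": {"device": "camera", "dwell_s": 45, "zone": "aisle-4"},
}

# Document store view: each value is a self-describing JSON document.
serialized = {key: json.dumps(doc) for key, doc in documents.items()}

# Column-oriented view: group values by field, keeping only present cells.
columns = {}
for key, doc in documents.items():
    for field_name, value in doc.items():
        columns.setdefault(field_name, {})[key] = value

all_fields = sorted(columns)
dense_cells = len(documents) * len(all_fields)   # cells in a fixed-schema table
stored_cells = sum(len(col) for col in columns.values())

print(all_fields)
print(dense_cells, stored_cells)   # 15 cells in a dense table vs 7 stored
print(json.loads(serialized["r3"])["zone"])
```

With only three devices the gap is small, but it widens as more device types with their own fields are added.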

Retail currently uses things-oriented RFID for inventory tracking and, if installed on shopping carts, for tracking in-store foot traffic to understand customer wants (Mitchell, n.d.). Mitchell (n.d.) also suggested that video cameras and mobile device Wi-Fi traffic could help identify whether a customer wanted an item or a group of items by finding hotspots of dwell time, so that store managers can optimize store layouts to increase flow and revenue. However, these retailers must account for the added data sources and have the supporting infrastructure in place to avoid performance bottlenecks if they are to reap the rewards of using IoT to make data-driven decisions.
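Finding dwell-time hotspots can be sketched as a simple aggregation over timestamped sightings. The example below assumes a hypothetical feed of (device, zone, time) tuples; a real feed from cameras or Wi-Fi access points would look different and would need far more cleaning. The `dwell_by_zone` function and the zone names are illustrative only.

```python
from collections import defaultdict

# Hypothetical sightings: (device_id, zone, seconds since store opening).
sightings = [
    ("phone-1", "entrance", 0), ("phone-1", "aisle-4", 30),
    ("phone-1", "aisle-4", 210), ("phone-1", "checkout", 240),
    ("phone-2", "entrance", 10), ("phone-2", "aisle-4", 50),
    ("phone-2", "aisle-7", 170), ("phone-2", "checkout", 200),
]


def dwell_by_zone(events):
    """Sum, per zone, the time between a sighting and the device's next one."""
    by_device = defaultdict(list)
    for device, zone, t in events:
        by_device[device].append((t, zone))

    totals = defaultdict(int)
    for trail in by_device.values():
        trail.sort()
        # Time in a zone is credited until the device is next seen anywhere.
        for (t0, zone), (t1, _next_zone) in zip(trail, trail[1:]):
            totals[zone] += t1 - t0
    return dict(totals)


totals = dwell_by_zone(sightings)
hotspot = max(totals, key=totals.get)
print(totals)    # {'entrance': 70, 'aisle-4': 330, 'aisle-7': 30}
print(hotspot)   # aisle-4
```

The last sighting of each device has no following event, so it contributes no dwell time in this simplified model.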


  • Atzori, L., Iera, A., & Morabito, G. (2010). The Internet of things: A survey. Computer Networks, 54(15), 2787–2805.