Adv DB: CAP and ACID

Transactions

A transaction is a set of operations or transformations that carries a database (or relational dataset) from one state to another. Once completed and validated as successful, the end result is saved to the database (Panda et al., 2011). Both ACID and CAP (discussed in further detail below) are known as integrity properties for these transactions (Mapanga & Kadebu, 2013).

Mobile Databases

Mobile devices have become prevalent and vital for many transactions when the end-user cannot access a wired connection. In that case, the device retrieves and saves transaction information either over a wireless connection or in disconnected mode (Panda et al., 2011). A problem with a mobile user accessing and creating transactions against a database is that bandwidth in a wireless network is not constant: when there is enough bandwidth, data reaches the end-user rapidly, and when there is not, it does not. A few transaction models can be used efficiently for mobile database transactions: the Report and Co-transactional model; the Kangaroo transaction model; the Two-Tiered transaction model; the Multi-database transaction model; the Pro-motion transaction model; and the Toggle transaction model. This is by no means an exhaustive list of transaction models for mobile databases.

According to Panda et al. (2011), in the Report and Co-transactional model, transactions are completed from the bottom up in a nested format, such that a transaction is split between child and parent transactions. A child transaction, once successfully completed, feeds its result up the chain until it reaches the parent; however, nothing is committed until the parent transaction completes. Thus, a transaction can occur on the mobile device but not be fully implemented until it reaches the parent database. In the Kangaroo transaction model, a mobile transaction manager collects and accepts transactions from the end-user and forwards (hops) each transaction request to the database server. Transactions in this model are done by proxy on the mobile device, and when the device moves from one location to the next, a new transaction manager is assigned to produce a new proxy transaction. The Two-Tiered transaction model is inspired by data replication schemes: there is a master copy of the data along with multiple replicas. The replicas live on the mobile device but can push changes to the master copy if the connection to the wireless network is strong enough. If the connection is not strong enough, the changes are made to the replicas only; they will show as committed on those replicas and still be made visible to other transactions.

The Multi-database transaction model uses asynchronous schemes to allow a mobile user to unplug from the network and still coordinate the transaction. To use this scheme, five queues are set up: input, allocate, active, suspend, and output; nothing is committed until all five queues have been processed. Pro-motion transactions come from nested transaction models, where some transactions are completed on fixed hosts and others on mobile hosts. When a mobile user is not connected to the fixed host, a command is issued so that the transaction is completed on the mobile host instead, though carrying out this command is resource-intensive. Finally, the Toggle transaction model relies on software on a pre-determined network and can operate across several database systems; changes made to the master (global) database can be presented to different mobile systems, and thus concurrency is maintained for all transactions across all databases (Panda et al., 2011).
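The five-queue flow of the multi-database model might be sketched as follows. The queue names come from Panda et al. (2011), but the processing logic here is an assumption for illustration only, not the published algorithm:

```python
from collections import deque

class MultiDatabaseCoordinator:
    """Illustrative sketch of the multi-database model's five queues."""

    def __init__(self):
        # A transaction moves input -> allocate -> active -> output,
        # detouring to suspend while the device is disconnected.
        self.queues = {name: deque() for name in
                       ("input", "allocate", "active", "suspend", "output")}
        self.connected = True

    def submit(self, txn):
        self.queues["input"].append(txn)

    def step(self):
        """Move transactions one stage down the pipeline (later stages first)."""
        if self.connected and self.queues["suspend"]:
            self.queues["output"].append(self.queues["suspend"].popleft())
        if self.queues["active"]:
            txn = self.queues["active"].popleft()
            # Disconnected transactions wait in the suspend queue.
            (self.queues["output"] if self.connected
             else self.queues["suspend"]).append(txn)
        if self.queues["allocate"]:
            self.queues["active"].append(self.queues["allocate"].popleft())
        if self.queues["input"]:
            self.queues["allocate"].append(self.queues["input"].popleft())

    def can_commit(self):
        # Nothing commits until the first four queues have drained.
        return all(not self.queues[q] for q in
                   ("input", "allocate", "active", "suspend"))
```

The key property the sketch captures is that a disconnected device parks work in the suspend queue rather than committing it, matching the model's "nothing is committed until all queues have been processed" rule.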

At a cursory glance, these models seem similar, but they vary strongly in how they implement the ACID properties in their transactions (see Table 1 in the next section).

ACID Properties and their flaws

In the 1970s, Jim Gray introduced the idea of ACID transactions, which provide four guarantees: Atomicity (all-or-nothing transactions), Consistency (correct data transactions), Isolation (each transaction is independent of others), and Durability (transactions survive failures) (Mapanga & Kadebu, 2013; Khachana et al., 2011). ACID is used to assure reliability in a database system, since every transaction changes the state of the data in the database.
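A minimal sketch of the atomicity guarantee, using Python's built-in SQLite module (the accounts schema and the transfer function are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

def transfer(conn, src, dst, amount):
    """All-or-nothing transfer: both updates commit, or neither does."""
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            # Consistency check: no negative balances allowed.
            (low,) = conn.execute("SELECT MIN(balance) FROM accounts").fetchone()
            if low < 0:
                raise ValueError("insufficient funds")
    except ValueError:
        pass  # transaction was rolled back; database state is unchanged

transfer(conn, "alice", "bob", 60)    # succeeds
transfer(conn, "alice", "bob", 100)   # fails: rolled back, balances unchanged
```

The second transfer would leave a negative balance, so the whole transaction is rolled back; a reader never observes a state where money left one account without arriving in the other.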

This approach is well suited to small centralized or distributed relational databases, but with the demand for mobile transactions, big data, and NoSQL, ACID may be constricting. The web consists of independent services connected together relationally, which are hard to maintain (Khachana et al., 2011). An example of this is booking a flight for a CTU Doctoral Symposium. One purchases a flight, but may also need another service related to the flight, like ground transportation to and from the hotel. The flight database is completely separate from the ground transportation system, yet sites like Kayak.com connect these databases and provide a friendly user interface for their customers (Kayak.com has its own mobile app as well). Taking this example further, we can see how ACID, while perfect for centralized databases, may not be the best fit for web-based services. Another case to consider is mobile database transactions: due to connectivity issues and recovery plans, the models mentioned above cover only some of the ACID properties (Panda et al., 2011). Through the lens of ACID, this is the flaw of mobile databases.

Model                             Atomicity  Consistency  Isolation  Durability
Report & Co-transaction model     Yes        Yes          Yes        Yes
Kangaroo transaction model        Maybe      No           No         No
Two-tiered transaction model      No         No           No         No
Multi-database transaction model  No         No           No         No
Pro-motion model                  Yes        Yes          Yes        Yes
Toggle transaction model          Yes        Yes          Yes        Yes

Table 1: A subset of the information found in Panda et al. (2011) on mobile database transaction models and whether each satisfies the ACID properties.


CAP Properties and their trade-offs

CAP stands for Consistency (as in ACID: all data transactions are correct and all users see the same data), Availability (users always have access to the data), and Partition Tolerance (the database can be split over many servers so that no single point of failure exists). It was proposed in 2000 by Eric Brewer (Mapanga & Kadebu, 2013; Abadi, 2012). These three properties are desirable for distributed database management systems, and CAP is seen as a less strict alternative to Jim Gray's ACID properties. Unfortunately, a distributed database system can only fully guarantee two of the three, giving CA, CP, or AP systems. CP systems have a reputation of not being available all the time, which is contrary to fact: Availability in a CP system is given up (or out-prioritized) only when Partition Tolerance is needed. Likewise, Availability in a CA system can be lost if a partition of the data occurs (Mapanga & Kadebu, 2013). Though a system can only be best at two, that does not mean the third property cannot be present; the restriction applies only to priority. In a CA system, ACID can be guaranteed alongside Availability (Abadi, 2012).
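One concrete way Dynamo-style systems tune this consistency/availability trade-off (a standard illustration, not drawn from the cited sources) is quorum replication: with N replicas, a write must reach W nodes and a read must consult R nodes, and consistency is guaranteed whenever the read and write sets must overlap.

```python
# Quorum replication intuition. If R + W > N, every read set of R nodes
# must intersect every write set of W nodes, so at least one node in the
# read has seen the latest write; smaller R and W favor availability and
# latency instead, at the risk of stale reads.

def read_overlaps_latest_write(n, r, w):
    """True if any R-node read set must intersect any W-node write set."""
    return r + w > n

# With N = 3 replicas:
assert read_overlaps_latest_write(3, r=2, w=2)      # consistent reads
assert not read_overlaps_latest_write(3, r=1, w=1)  # fast but possibly stale
```

Systems like Riak expose N, R, and W as tunable parameters, which is exactly the "priority, not exclusion" point above: the third property is still there, just de-prioritized.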

Partitions can vary across distributed database management systems due to WAN conditions, hardware, configured network parameters, levels of redundancy, etc. (Abadi, 2012). Partitions are rare compared to other failure events, but they must be considered.

But the question remains for all database administrators: which of the three CAP properties should be prioritized above all others, particularly in a distributed database management system where partitions must be considered? Abadi (2012) answers this question: for mission-critical data and applications, availability during partitions should not be sacrificed, so consistency must fall for a while.

Amazon's Dynamo, Basho's Riak, Facebook's Cassandra, Yahoo's PNUTS, and LinkedIn's Voldemort are all examples of distributed database systems that can be accessed from a mobile device (Abadi, 2012). According to Abadi (2012), latency (closely related to Availability) is critical to all of these systems, so much so that a 100 ms delay can significantly reduce an end-user's retention and future repeat transactions. Thus, availability during partitions is key not only for mission-critical systems but for e-commerce as well.

Unfortunately, this tradeoff between Consistency and Availability arises from data replication and depends on how the replication is done. According to Abadi (2012), there are three ways to replicate data: data updates sent to all replicas at the same time (enforcing high consistency); data updates sent to an agreed-upon location first, through synchronous or asynchronous schemes (availability depends on the scheme); and data updates sent to an arbitrary location first, through synchronous or asynchronous schemes (availability again depends on the scheme).
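The difference between the first two options can be sketched as follows; the Replica class and update functions are illustrative assumptions, not any real system's API:

```python
class Replica:
    """Toy replica: holds a value, plus updates not yet applied."""
    def __init__(self):
        self.value = None
        self.pending = []          # updates received but not yet applied

    def apply_pending(self):
        if self.pending:
            self.value = self.pending[-1]
            self.pending.clear()

def update_all_sync(replicas, value):
    # Option 1: send to all replicas synchronously -> consistent, but the
    # writer waits on every replica.
    for r in replicas:
        r.value = value

def update_master_async(master, followers, value):
    # Option 2: send to an agreed-upon master first; followers receive the
    # update asynchronously and may serve stale reads until it arrives.
    master.value = value
    for f in followers:
        f.pending.append(value)    # applied later by f.apply_pending()

replicas = [Replica() for _ in range(3)]
update_all_sync(replicas, "v1")
assert {r.value for r in replicas} == {"v1"}                # all consistent

master, *followers = replicas
update_master_async(master, followers, "v2")
assert master.value == "v2" and followers[0].value == "v1"  # stale-read window
for f in followers:
    f.apply_pending()
assert {r.value for r in replicas} == {"v2"}                # eventually consistent
```

The window in which a follower still answers "v1" is exactly the inconsistency Abadi describes; the asynchronous scheme accepts it in exchange for lower write latency.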

According to Abadi (2012), PNUTS sends data updates to an agreed-upon location first through asynchronous schemes, which improves Availability at the cost of Consistency, whereas Dynamo, Cassandra, and Riak send data updates to an agreed-upon location first through a combination of synchronous and asynchronous schemes. These three systems propagate data synchronously to a small subset of servers and to the rest asynchronously, which can cause inconsistencies. All of this is done to reduce delays to the end-user.

Going back to the Kayak.com example from the previous section, consistency in the web environment should be relaxed (Khachana et al., 2011). Expanding on that example, if seven users wanted to access the services at the same time, each could be asked which of these properties should be relaxed. One user may order a flight, hotel, and car and insist that none is booked until all services are committed. Another may be content with whichever car is available for ground transportation, as long as they get the flight times and price they want. This can cause inconsistencies, lost information, or misleading information needed for proper decision analysis, but systems must be adaptable (Khachana et al., 2011). They must take into account the wireless signal, the mode of transferring and committing the data, and the load-balancing of incoming requests (who has priority on a contested plane seat when only one is left at that price).

At the end of the day, when it comes to CAP, Availability is king: it will attract business or drive it away, so C or P must give in order to cater to the customer. If I were designing this system, I would run an AP system but conduct the partitioning when the load on the database system is small (off-peak hours), to give the illusion of a CA system, because the Consistency degradation would be seen by fewer people. Off-peak hours don't exist for global companies, mobile web services, or websites, but there are times during the year when transaction volume is smaller than on normal days, so scheduling around those days is key. For a mobile transaction system, I would select the Pro-motion transaction model, which helps comply with the ACID properties: make updates locally on the mobile device when services are down, and set up an ordered queue of other transactions waiting to be committed once wireless service is restored or a stronger signal is found.
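The local-queue idea described above can be sketched as follows; the commit callback and connectivity flag are illustrative assumptions:

```python
from collections import deque

class OfflineTransactionQueue:
    """Buffer transactions on the device while disconnected; commit in order
    once connectivity returns."""

    def __init__(self, commit):
        self.commit = commit           # function that applies a txn upstream
        self.pending = deque()
        self.connected = False

    def submit(self, txn):
        if self.connected:
            self.commit(txn)
        else:
            self.pending.append(txn)   # local update, awaiting connectivity

    def on_reconnect(self):
        self.connected = True
        while self.pending:            # drain in FIFO order
            self.commit(self.pending.popleft())

committed = []
q = OfflineTransactionQueue(committed.append)
q.submit("update A")
q.submit("update B")                   # offline: both queued locally
q.on_reconnect()
q.submit("update C")                   # online: committed immediately
```

FIFO draining preserves the order in which the user made the updates, which is what keeps the eventual server state consistent with the user's intent.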

References

  • Abadi, D. J. (2012). Consistency tradeoffs in modern distributed database system design: CAP is only part of the story. Computer, 45(2), 37-42.
  • Khachana, R. T., James, A., & Iqbal, R. (2011). Relaxation of ACID properties in AuTrA: The adaptive user-defined transaction relaxing approach. Future Generation Computer Systems, 27(1), 58-66.
  • Mapanga, I., & Kadebu, P. (2013). Database Management Systems: A NoSQL Analysis. International Journal of Modern Communication Technologies & Research (IJMCTR), 1, 12-18.
  • Panda, P. K., Swain, S., & Pattnaik, P. K. (2011). Review of some transaction models used in mobile databases. International Journal of Instrumentation, Control & Automation (IJICA), 1(1), 99-104.

Decluttering & Recycling

Last year I mentioned that I am a minimalist, though I do not subscribe to the 100-item challenge. However, there is value in disposing of items that no longer provide any value in your life. Rather than trashing them, why not recycle them for cash? Here are a few places that accept gently used (and sometimes roughly used) items, in an effort to create a more sustainable economy and planet. For really old devices, they extract the precious metals to be used in new devices.

Note: Shop around all these sites and programs to get the most money for your product. One site or store may not take an item, but another might, so keep shopping around. Also, if you are getting store credit, make sure it's at a store you will actually use.

Note: This is not a comprehensive list.  Comment down below if you know of any other places or apps that have worked for you really well.  Some apps work best in the city versus the suburbs.

  1. Amazon.com Trade-In: They will give you an Amazon gift card for Kindle e-readers, tablets, streaming media players, Bluetooth speakers, Amazon Echos, textbooks, phones, and video games.
  2. Best Buy: Will buy your iPhones, iPads, gaming systems, laptops, Samsung mobile devices, Microsoft Surface devices, video games, and smartwatches for Best Buy gift cards.
  3. GameStop (one of my favorites): Will take your video games, gaming systems, most obscure phones, tablets, iPods, etc. and give you cash back.
  4. Staples: Smartphones, tablets, and laptops can be sold here for store credit.
  5. Target: Phones, tablets, gaming systems, smartwatches, and voice speakers for a Target gift card.
  6. Walmart: Phones, tablets, gaming systems, and voice speakers can be cashed in for Walmart gift cards.
  7. Letgo app: A great way to sell almost anything.  Just make sure you meet up in a public place to make the exchange, like a mall or in front of a police station. Your safety is more important than any piece you were willing to part with in the first place.
  8. Facebook.com Marketplace: Another great way to sell almost anything. The same warning is attached here as in Letgo.
  9. Decluttr.com: They pay you back via check, PayPal, or direct deposit.
  10. Gazelle: They will reward you with PayPal, check or Amazon gift cards.
  11. Raise: This is for those gift cards you know you won't use. You can sell them for up to 85% of their value, via PayPal, direct deposit, or check.
  12. SecondSpin: This is for those CDs, DVDs, and Blu-rays, and you can earn money via store credit, check, or PayPal.
  13. Patagonia: For outdoor gear and it is mostly for store credit.
  14. thredUp: This is for your clothes. Once they are sold via the app you can receive cash or credit.
  15. Plato’s Closet: Shoes, Clothes, and bags can be turned in for cash.  Though they take mostly current trendy items.
  16. Half Price Books: Books, textbooks, audiobooks, music, CDs, LPs, Movies, E-readers, phones, tablets, video games, and gaming systems for cash.
  17. Powells.com: For your books and you can get paid via PayPal or credit in your account.

My advice: try to sell first to a retailer, because they are always going to be there, it's their job, it's safer, you can do it on your own schedule, and you will get what they promise you. There's no hassle of no-shows, no fear of meeting a stranger, no being bargained down further when the buyer conveniently forgets to bring the full amount, and no one arriving way late.

Another piece of advice is to hold on to at least one old phone (usually the latest one), for two reasons: (1) if your current phone breaks, you can use this as an interim phone, (2) international travel, if the phone is unlocked.

Subsequent advice: make sure you turn off and clear out all your old data from electronic devices. The last thing you want is to have your data compromised while doing something positive for the earth.

Also, look for consignment shops and local bookstores, and ask around; you never know who you may be able to sell things to. At a consignment shop, you deposit your items, and if they sell, you get a part of the earnings. When all else fails, recycle what you cannot sell by donating it to Goodwill, Habitat for Humanity, etc.

Business Intelligence: Targets, Probabilities, & Modeling

  • Target Measures are used to improve marketing efforts by tracking measures like ROI, NPV, revenue, lead generation, lag generation, growth rates, etc. (Liu, Laguna, Wright, & He, 2014). The goal is that after a marketing effort is conducted, there should be a change in the Target Measures, and positive changes in these measures should be repeated. Hoptroff and Kufyba (2001) stated that these measures could also be defect rates, default rates, survey ranking results, response rates, churn rate, the value lost to the business, transaction amounts, products purchased, etc.
  • Probability Mining is data mining using logit regression, neural networks, linear regression, etc. It helps determine the probability of an event (in our case, meeting or failing to meet our Target Measures) based on information about past events (Hoptroff & Kufyba, 2001).
  • Econometric Modeling is a form of understanding the economy through a blend of economic theory and statistical analysis: essentially, a way of modeling how certain independent variables act on or influence the dependent variable, using tools from both economic and statistical theory to build the model. Econometric modeling looks into the market power a business holds, game theory models, information theory models, etc. The rationale is that neither economic theory nor statistical theory alone provides enough knowledge to solve or describe a certain variable or state, so the blend of both is assumed to do better (Reiss & Wolak, 2007).
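As a sketch of probability mining with logit regression, the toy example below fits a one-variable logistic model by gradient descent. All data and parameter values are invented for illustration; a real miner would use a statistics library:

```python
import math

# x = number of prior purchases; y = 1 if the customer responded to the offer.
data = [(0, 0), (1, 0), (2, 0), (3, 1), (4, 1), (5, 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b = 0.0, 0.0
for _ in range(5000):                      # batch gradient descent
    gw = gb = 0.0
    for x, y in data:
        err = sigmoid(w * x + b) - y       # prediction error on this example
        gw += err * x
        gb += err
    w -= 0.1 * gw / len(data)
    b -= 0.1 * gb / len(data)

# Estimated probability of response, given prior purchase count:
p_low = sigmoid(w * 0 + b)                 # customer with 0 prior purchases
p_high = sigmoid(w * 5 + b)                # customer with 5 prior purchases
assert p_low < 0.5 < p_high                # model separates the two groups
```

The fitted probabilities are exactly the "probability value of attracting new customers" described in the first application below: a score per customer, against which a campaign's Target Measures can be forecast.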

In the end, an econometric model can contain elements of probability mining, but a probability miner is not necessarily an econometric model. Each of these models and miners can track and report on target measures.

Econometric modeling is a way to understand price and the pricing model, which is central to generating profits, by applying both economic and statistical/probability principles to achieve a targeted measure. Companies should use big data and a probability miner or econometric model to help them understand the meaning behind the data and to extract actionable decisions that could meet or exceed a current target measure, compare and contrast against their current competition, and understand their current customers.

Two slightly different Applications

  1. Probability mining has been used to gauge a customer's affinity and responses towards a new product through profiling current and/or new customers (Hoptroff & Kufyba, 2001). Companies and marketing firms work on these models to assign a probability value of attracting new customers to a new or existing product or service. The results can give indications as to whether or not the company could meet its Target Measures.
  2. Suppose we have Marketing Strategy Plans A, B, and C, and we want to use econometric modeling to understand how cost-effective each plan would be with respect to the same product or product mix at different price points. This would be cause-and-effect modeling (Hoptroff, 1992). The model should help predict which strategy would produce the most revenue, which is one of our main target measures.

An example of using probability mining is Amazon's online shopping experience. As the consumer adds items to the shopping cart, Amazon in real time applies probabilistic mining to find out what other items this consumer might purchase (Pophal, 2014), based on what has happened before through the creation of profiles, and says "Others who purchased X also bought Y, Z, and A." This almost implies that these items form a set that will enhance your overall experience: buy some more. For instance, buyers of a $600 Orion telescope also bought a $45 hydrogen-alpha filter (used to point the telescope towards the sun, for example to watch planets transit in front of it).
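A simple co-occurrence count captures the intuition behind "others who purchased X also bought Y." The baskets below are invented toy data; production recommenders use far richer probabilistic models:

```python
from collections import Counter
from itertools import combinations

# Toy purchase baskets (invented for illustration).
baskets = [
    {"telescope", "ha_filter", "tripod"},
    {"telescope", "ha_filter"},
    {"telescope", "star_map"},
    {"ha_filter", "tripod"},
]

# Count how often each ordered pair of items appears in the same basket.
co_counts = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        co_counts[(a, b)] += 1
        co_counts[(b, a)] += 1

def also_bought(item, k=2):
    """Top-k items most often purchased alongside `item`."""
    pairs = {b: n for (a, b), n in co_counts.items() if a == item}
    return sorted(pairs, key=pairs.get, reverse=True)[:k]

assert also_bought("telescope")[0] == "ha_filter"  # bought together twice
```

Normalizing these counts into conditional probabilities P(Y | X in cart) is one small step from here, which is where the probability-mining framing above takes over.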

The Federal Reserve Bank and its board members have used econometric modeling over the past 30 years for forecasting economic conditions and for quantitative policy analysis (Brayton, Levin, Tryon, & Williams, 1997). The model began in 1966 with the help of the academic community and the Division of Research and Statistics, using the technology available at the time, and became operational in 1970. It had approximately 60 behavioral equations, including a long-run neoclassical growth model, factor demands, and a life-cycle model of consumption. Brayton et al. (1997) go on to say that this model was used primarily to analyze stabilization through monetary and fiscal policies, as well as the effects of other governmental policies on the economy.

Resources: