Big Data Analytics: Pizza Industry

Pizza, pizza! A competitive analysis was completed on Domino's, Pizza Hut, and Papa John's.  Competitive analysis means gathering external data that is freely available, e.g., social media such as Twitter tweets and Facebook posts.  That is what He, Zha, and Li (2013) studied: 307 total tweets (266 from Domino's, 24 from Papa John's, 17 from Pizza Hut) and 135 wall posts (63 from Domino's, 37 from Papa John's, 35 from Pizza Hut) for the month of October 2011 (He et al., 2013).  It should be noted that these are the big three pizza chains, controlling roughly 23% of the total market share (7.6% from Domino's, 4.23% from Papa John's, 11.65% from Pizza Hut) (He et al., 2013). Posts and tweets contain text data, videos, and pictures.  All of the data collected was text-based and gathered manually, and the SPSS Clementine tool was used to discover themes in the text (He et al., 2013).
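
He et al. (2013) used the SPSS Clementine tool for their theme discovery; as a rough illustration of the general idea only (not the authors' actual method), a keyword-based theme tagger might look like the following Python sketch, where the tweets and keyword lists are invented placeholders:

```python
from collections import Counter

# Hypothetical keyword lists standing in for the study's themes;
# the actual analysis used SPSS Clementine, not this script.
THEMES = {
    "ordering/delivery": ["order", "delivery", "delivered", "late"],
    "pizza quality":     ["taste", "fresh", "quality", "cheese"],
    "marketing":         ["deal", "coupon", "promo", "free"],
}

def tag_themes(tweets):
    """Count how many tweets mention each theme's keywords."""
    counts = Counter()
    for tweet in tweets:
        text = tweet.lower()
        for theme, keywords in THEMES.items():
            if any(word in text for word in keywords):
                counts[theme] += 1
    return counts

# Made-up example tweets, for illustration only.
sample = [
    "My order was delivered 20 minutes late :(",
    "Loving the fresh cheese on this pizza!",
    "Use coupon PIZZA50 for a free side.",
]
print(tag_themes(sample))
# Counter({'ordering/delivery': 1, 'pizza quality': 1, 'marketing': 1})
```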

He et al. (2013) found that Domino's Pizza used social media to engage its customers the most, replying to more tweets and posts than either competitor.  The types of posts across all three companies varied from promotions to marketing to polling (e.g., "What is your favorite topping?"), facts about pizza, Halloween-themed posts, baseball-themed posts, etc. (He et al., 2013).  Text mining across all three companies surfaced these themes: ordering and delivery (customers shared their experiences and feelings about them), pizza quality (taste and quality), feedback on customers' purchase decisions, casual socialization posts (e.g., "Happy Halloween," "Happy Friday"), and marketing tweets (posts on current deals, promotions, and advertisements) (He et al., 2013).  Besides text mining, there was also a content analysis of each company's site (367 pictures & 67 videos from Domino's, 196 pictures & 40 videos from Papa John's, and 106 pictures & 42 videos from Pizza Hut), which showed that the big three were all trying to drive customer engagement (He et al., 2013).

He et al. (2013) cite the theory that with higher positive customer engagement, customers can become brand advocates, which increases their brand loyalty and pushes referrals to their friends; approximately one in three people follow a friend's referral when it is made through social media.  Thus, by evaluating the structured and unstructured data available about its own products and those of its competitors, an organization can improve its customer service, drive improvements in its own products, and draw more customers to them (He et al., 2013).  The key lessons from this study, which would help any organization gain an advantage in the market, are to (1) constantly monitor your social media and that of your competitors, (2) establish a benchmark for the number of posts, likes, shares, etc. between you and your competitors (a rough sketch of this follows below), (3) mine the conversational data for content and context, and (4) analyze the impact of your social media footprint on your own business (e.g., what is the response when prices rise or fall?) (He et al., 2013).
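
As a loose sketch of lesson (2), an engagement benchmark against competitors could start as simply as the following, where the likes and shares are invented for illustration (only the post counts echo the study's tweet counts):

```python
# Hypothetical engagement counts; the real study collected its
# figures manually from each brand's Twitter and Facebook pages.
metrics = {
    "Domino's":    {"posts": 266, "likes": 1200, "shares": 310},
    "Papa John's": {"posts": 24,  "likes": 400,  "shares": 90},
    "Pizza Hut":   {"posts": 17,  "likes": 350,  "shares": 75},
}

def engagement_rate(m):
    """Likes + shares per post: a simple ratio for cross-brand comparison."""
    return (m["likes"] + m["shares"]) / m["posts"]

# Rank brands from most to least engagement per post.
for brand, m in sorted(metrics.items(), key=lambda kv: -engagement_rate(kv[1])):
    print(f"{brand:12s} {engagement_rate(m):6.1f} engagements/post")
```

Note that engagements per post can favor a brand that posts less often, so the choice of ratio is itself part of the benchmark design.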

Resources:

  • He, W., Zha, S., & Li, L. (2013). Social media competitive analysis and text mining: A case study in the pizza industry. International Journal of Information Management, 33(3), 464-472.


What is Big Data Analytics?


What makes big data different from conventional data that you use every day?
The differentiation lies in how big data and conventional data deal with data storage and data analysis. Big data is complex, challenging, and significant (Ward & Barker, 2013). Ward and Barker (2013) traced the Volume, Velocity, and Variety definition back to Gartner. They then compared it to Oracle's definition, in which big data means the value derived from merging relational databases with unstructured data that can vary in size, structure, format, etc. Finally, the authors note that Intel defines a big data company as one generating about 300 TB weekly, typically from transactions, documents, emails, sensor data, social media, etc. They use all of this to argue that the true definition should rest on the size of the data, the complexity of the data, and the technologies used to analyze it. This is how you can differentiate it from conventional data.

Davenport, Barth, and Bean (2012) stated that IT companies define big data as "more insightful data analysis," but that, used properly, it can give companies a competitive edge. Companies that use big data are aware of data flows (customer-facing data, continuous process data, and network relationships, all of which are dynamic and always changing in a continuous flow), rely on data scientists (with upgraded data management skills, programming, math, statistics, business acumen, and effective communication), and move big data away from IT functions (concerned with automation) into operations or product functions (since the goal is to present information to the business first). Data in a continuous flow needs business processes set up for obtaining/gathering/capturing, storing, extracting, filtering, manipulating, structuring, monitoring, analyzing, and interpreting it, to help facilitate data-driven decisions.
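
As a toy illustration of that capture-filter-analyze flow (my own sketch, not Davenport et al.'s), imagine filtering and summarizing an invented stream of customer-facing events:

```python
from statistics import mean

# Invented event stream standing in for continuous customer-facing data.
events = [
    {"type": "purchase",  "amount": 19.99},
    {"type": "page_view", "amount": 0.00},
    {"type": "purchase",  "amount": 7.50},
    {"type": "refund",    "amount": -7.50},
]

# Filter: keep only the events the business question cares about.
purchases = [e for e in events if e["type"] == "purchase"]

# Analyze/interpret: reduce the filtered flow to decision-ready numbers.
avg = mean(e["amount"] for e in purchases)
print(f"purchases: {len(purchases)}, average amount: {avg:.2f}")
```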

Finally, Lazer, Kennedy, King, and Vespignani (2014) discussed "big data hubris": the assumption that big data can do it all and is a great substitute for conventional data analysis. They state that errors in measurement, validity, reliability, and dependencies in the data cannot be ignored. Big data analysis can also overfit to a small number of cases. Greater value comes from marrying a big dataset with other near-real-time data from different sources, though continuous evaluation and improvement should always be incorporated. Sources of error in analysis can arise from measurement (is it stable and comparable across cases and over time? are there systematic errors?), algorithm dynamics, search algorithms, and changes in the data-generating process. The authors conclude that transparency and replicability of data analysis (especially with secondary or aggregated data, since those raise fewer privacy concerns) could help improve the results of big data analysis. Without transparency and replicability, how will other scientists learn and build on the knowledge (thus destroying the accumulation of knowledge)?
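
A tiny simulation (mine, not the authors') makes the overfitting trap concrete: with more predictors than cases, a linear model can fit pure noise perfectly in-sample and still be useless out-of-sample:

```python
import numpy as np

rng = np.random.default_rng(0)

# 10 cases but 20 predictors of pure noise: more parameters than data.
n_cases, n_features = 10, 20
X = rng.normal(size=(n_cases, n_features))
y = rng.normal(size=n_cases)  # the outcome is random noise

# Least squares: with n_features > n_cases it can interpolate the noise.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print("train error:  ", np.abs(X @ coef - y).max())  # ~0: a "perfect" fit

# Fresh holdout data from the same pure-noise process: the fit is useless.
X_new = rng.normal(size=(n_cases, n_features))
y_new = rng.normal(size=n_cases)
print("holdout error:", np.abs(X_new @ coef - y_new).max())  # large
```

This is the small-n, many-predictors trap in miniature: an impressive in-sample fit says little by itself.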

There is a difference between big data and conventional data. But no matter how big, fast, and different the data sets are, one cannot deny that big data has influenced conventional data gathering, analysis, and techniques. Improvements have been made that allow doctoral students to conduct surveys at a much faster rate and to gather more unstructured data through interviews, and the transcription software used for audio files in big data work can also be used on smaller conventional data sets. Though the two are vastly different and each comes with its own errors, as we improve one, we inadvertently improve the other.

Public Sites that provide free access to big data sets:

References:

  • Davenport, T. H., Barth, P., & Bean, R. (2012). How big data is different. MIT Sloan Management Review, 54(1), 43.
  • Lazer, D., Kennedy, R., King, G., & Vespignani, A. (2014). The parable of Google Flu: Traps in big data analysis. Science, 343(6176), 1203-1205.
  • Ward, J. S., & Barker, A. (2013). Undefined by data: a survey of big data definitions. arXiv preprint arXiv:1309.5821.

Zeno’s Paradox

Some infinities are bigger than others.

A paradox of motion:

Zeno described a paradox of motion, which helps illustrate one type among the many kinds of infinity. Zeno’s paradox is described below (Stanford Encyclopedia of Philosophy, 2010):

“Imagine Achilles chasing a tortoise, and suppose that Achilles is running at 1 m/s, that the tortoise is crawling at 0.1 m/s and that the tortoise starts out 0.9 m ahead of Achilles. On the face of it Achilles should catch the tortoise after 1s, at a distance of 1m from where he starts (and so 0.1m from where the Tortoise starts). We could break Achilles’ motion up … as follows: before Achilles can catch the tortoise he must reach the point where the tortoise started. But in the time he takes to do this the tortoise crawls a little further forward. So next Achilles must reach this new point. But in the time it takes Achilles to achieve this the tortoise crawls forward a tiny bit further. And so on to infinity: every time that Achilles reaches the place where the tortoise was, the tortoise has had enough time to get a little bit further, and so Achilles has another run to make, and so Achilles has an infinite number of finite catch-ups to do before he can catch the tortoise, and so, Zeno concludes, he never catches the tortoise.”
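
The standard resolution is that Achilles' infinitely many catch-up runs sum to a finite distance: the geometric series 0.9 + 0.09 + 0.009 + … = 0.9 / (1 - 0.1) = 1 m, covered in 1 s. A quick Python sketch of the partial sums:

```python
# Achilles' catch-up runs: 0.9 m, then 0.09 m, then 0.009 m, ...
# An infinite number of runs, but their total is finite: 0.9 / (1 - 0.1) = 1 m.
total, step = 0.0, 0.9
for n in range(1, 11):
    total += step            # run to where the tortoise just was
    step *= 0.1              # the tortoise crawls 1/10 as far in that time
    print(f"after catch-up {n:2d}: Achilles has run {total:.10f} m")
# The partial sums climb toward 1 m (reached at t = 1 s) but never pass it,
# which is why infinitely many catch-ups still fit inside a finite second.
```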

This paradox was used to illustrate that not all infinities are the same, and one infinity can indeed be bigger than another.  An interpretation of this idea was written poetically in a eulogy in the book The Fault in Our Stars (Green, 2012):

“There are infinite numbers between 0 and 1. There’s .1 and .12 and .112 and an infinite collection of others. Of course, there is a bigger infinite set of numbers between 0 and 2, or between 0 and a million. Some infinities are bigger than other infinities. … There are days, many days of them, when I resent the size of my unbounded set. I want more numbers than I’m likely to get, and God, I want more numbers for Augustus Waters than he got. But, Gus, my love, I cannot tell you how thankful I am for our little infinity. I wouldn’t trade it for the world. You gave me a forever within the numbered days, and I’m grateful.” (pp. 259-260)

So to my readers out there, I want to thank you in advance for the little infinity(ies) I will get to share with each of you through this blog, and for that I am grateful.

Resources:

  • Green, J. (2012). The fault in our stars. New York, NY: Penguin Group (USA) Inc.
  • Stanford Encyclopedia of Philosophy (2010). Zeno’s Paradoxes. Retrieved from http://plato.stanford.edu/entries/paradox-zeno/#AchTor