Qualitative Analysis: Coding Project Report of a Virtual Interview Question

The virtual interview question: Explain what being a doctoral student means to you. How has your life changed since starting your doctoral journey?

Description of your coding process

My first step in the coding process was to read the responses once, at least a week before this individual project assignment was due.  This allowed me to think about broad themes and codes at a high level throughout the week.  Once the week was over, I went to wordle.net to create a word cloud of the 50 most frequently used words in the virtual interview and obtained the results below.

[Figure 1 image: wordle.net word cloud]

Figure 1: Screenshot of the wordle.net results, which were used to help develop codes and sub-codes; words that appear larger occur more often in the virtual interview than words that appear smaller.

The most telling themes from Figure 1 are Time, Family, Life, Work, Student, Learning, First, Opportunity, Research, People, etc.  These helped create some of the codes and sub-codes, such as prioritization and for family.  Figure 1 also confirmed the ideas for codes I had been forming over the past week, so I felt ready to begin coding.  After deciding on the initial set of codes, I did some manual coding while asking the questions: “What is the person saying? How are they saying it? Could there be a double meaning in the sentence?”  The last question helped me identify whether a sentence in the virtual interview carried multiple codes.  I used QDA Miner Lite as my coding software; it is a free product, and there are plenty of end-user tutorials on YouTube, made by researchers from many fields, on how to use it effectively.

After the initial manual coding, I revisited the initial codebook.  Some of the sub-codes that fell under betterment were moved into the future code because they fit that theme better than pure betterment.  This reanalysis was carried out for all codes.  As I re-read the responses for the third time, some new sub-codes were added as well.  The reason for re-reading the virtual interview a third time was to make sure that few, if any, codes were still missing.
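For illustration only, the following is a minimal Python sketch of the kind of top-50 frequency count that sits behind a wordle.net word cloud; the file name interview.txt and the small stop-word list are assumptions I have introduced, not part of the original workflow (the actual word cloud was produced on wordle.net).

import re
from collections import Counter

# Read the pasted interview responses (hypothetical file name).
with open("interview.txt", encoding="utf-8") as f:
    text = f.read().lower()

# Tokenize on letters/apostrophes and drop a small, illustrative stop-word list.
stop_words = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "that", "i", "my"}
tokens = [t for t in re.findall(r"[a-z']+", text) if t not in stop_words]

# Print the 50 most frequent words, i.e., the data behind the word cloud.
for word, count in Counter(tokens).most_common(50):
    print(f"{word}: {count}")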

Topical Coding Scheme (Code Book)

The codebook that was derived is as follows:

  • Family
    • For Family
    • Started by Family
    • First in the Family
  • Perseverance
    • Exhausted
    • Pushing through
    • Life Challenges
    • Drive/Motivation
    • Goals
  • Betterment
    • Upgrade skills
    • Personal Growth
    • Maturity
    • Understanding
    • Priority Reanalysis
  • Future
    • More rewarding
    • Better Life
    • Foresight
  • Proving something
    • To others


Diagram of findings

Below are images developed through the analytical/automated features of QDA Miner Lite:

[Figure 2 image]

Figure 2: Distribution of codes in percentages throughout the virtual interview.

Figure 3: Distribution of codes in frequency throughout the virtual interview.

[Figure 4 image]

Figure 4: Distribution of codes by frequency throughout the virtual interview, shown as a word cloud in which more frequent codes appear larger than less frequent codes.

Brief narrative summary of findings referring to your graphic diagram

Given Figures 2-4, one could say that the biggest theme for entering the doctoral program is the prospect of a better life and the hope of changing the world, as these showed up most frequently in the interview.  One student states that the degree would open many doors: “Pursuing and obtaining this level of degree would help to open doors that I may not be able to walk through otherwise.” Another student says that, hopefully, their research will change the future lives of many: “The research that I am going to do will hopefully allow people to truly pursue after their dreams in this ever-changing age, and let the imagination of what is possible within the business world be the limit.” Other students are a bit more practical, stating things like “…move up in my organization and make contributions to the existing knowledge” and, finally, “More opportunities open for you as well as more responsibility for being credible and usefulness as a cog in the system.”

Another concept that kept repeating is that this is done for family, and that because of family, work, and school, the life of a doctoral student in this class has to be reprioritized (hence the code priority reanalysis).  This is seen across all forms of graphical output, which show that these are the two most significant drivers toward the degree.  One student went to an extreme: “Excluding family and school members, I am void of the three ‘Ps’ (NO – people, pets, or plants). I quit my full-time job and will be having the TV signal turned off after the Super Bowl to force additional focus.”  Another student said that time was the most important thing they had and that it has changed significantly: “The most tangible thing that has changed in my life since I became a doctoral student has been my schedule.  Since this term began I have put myself on a strict schedule designating specific time for studies, my wife, and time for myself.”  Finally, another student says balance is key for them: “Having to balance family time, work, school, and other social responsibilities, has been another adjusted change while on this educational journey. The support of my family has been very instrumental in helping me to succeed and the journey has been a great experience thus far.”  There are seven instances in which these two codes overlap or are nested within each other, which happens roughly 80% of the time.

Thus, from this virtual interview, I am able to conclude that family is mentioned alongside priority reanalysis in order to meet the goal of the doctoral degree, and that time management, a component of priority reanalysis, is key.  Some students take this reanalysis to the extreme, as noted above, but if they feel that is the only way they can complete the degree in a timely manner, then who am I to judge?  After all, it is the researcher’s job to remain unbiased when coding.  However, while family may drive people to complete the degree, the prospect of a better life and of changing the world for the better is what was mentioned most.

Appendix A

An output file from the qualitative software can be generated using QDA Miner Lite.


Data tools: Analysis of big data involving text mining

Definitions

Big data – any set of data characterized by high velocity, volume, and variety, also known as the 3Vs (Davenport & Dyche, 2013; Fox & Do, 2013; Podesta, Pritzker, Moniz, Holdren, & Zients, 2014).

Text mining – a process that involves discovering implicit knowledge from unstructured textual data (Gera & Goel, 2015; Hashimi & Hafez, 2015; Nassirtoussi, Aghabozorgi, Wah, & Ngo, 2015).

Case study: Basole, Seuss, and Rouse (2013). IT innovation adoption by enterprises: Knowledge discovery through text analytics.

The goal of this study was to apply text mining techniques to 472 quality peer-reviewed articles spanning roughly 30 years of knowledge (1977-2008).  The selection criteria required that articles focus on the adoption of IT innovation; focus on the enterprise, organization, or firm; use rigorous research methods; and be published in leading journals.  The reason for all of this analysis was to demonstrate the usefulness of text analytics for literature reviews.  In 2016, most literature reviews draw on recent literature from the last five years, and in certain fields it may not be sufficient to focus only on that window.  Extending the literature search beyond this five-year period requires a great deal of attention and manual labor, which makes an already time-consuming literature review even more demanding. So the authors’ question is whether text mining can be used to conduct a more thorough review of the body of knowledge, extending beyond the typical five years on any subject matter.  They argue that this tedious task could benefit from automation.  However, the result should be thought of as a first pass through the literature; treating it as a first pass allows for the generation of new research questions and ideas, which drives further analysis.

In the end, the study concluded that cost and complexity were two of the most frequent determinants of IT innovation adoption from the perspective of an IT department.  Other determinants for IT departments were the complexity, capability, and relative advantage of the innovation.  However, one level of abstraction higher, at the enterprise/organizational level, perceived benefits and usefulness were the main determinants of IT innovation, and ease of use of the technology mattered a great deal to the organization.  When comparing IT innovation with cost, there was a negative correlation between the two, while IT innovation correlated positively with organization size and top management support.

How was big data analytics learned, taught, and used in the case study?

The research approach for this study was: (1) document identification and extraction, (2) document classification and coding, (3) document analysis and knowledge discovery (key terms, co-occurrence), and (4) research gap identification.

Analysis of the data consisted of classifying the articles into four time periods (bins): 1977-1979, 1980-1989, 1990-1999, and 2000-2008, and applying a classification scheme based on existing taxonomies (case study, content analysis, field experiment, field study, frameworks and conceptual models, interview, laboratory experiment, literature analysis, mathematical model, qualitative research, secondary data, speculation/commentary, and survey).  Data were also classified by functional discipline (information systems and computer science, decision science, management and organization sciences, economics, and innovation) and finally by IT innovation type (software, hardware, networking infrastructure, and the tool’s IT term catalog). The study used a tool called Northern Light (http://georgiatech.northernlight.com/).
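As a rough sketch of the binning step (my own illustration, not code from the study), the assignment of articles to time periods could look like the following in Python; the function name time_bin is hypothetical.

def time_bin(year: int) -> str:
    """Map an article's publication year to one of the study's four periods."""
    if 1977 <= year <= 1979:
        return "1977-1979"
    if 1980 <= year <= 1989:
        return "1980-1989"
    if 1990 <= year <= 1999:
        return "1990-1999"
    if 2000 <= year <= 2008:
        return "2000-2008"
    raise ValueError(f"{year} falls outside the 1977-2008 corpus")

print(time_bin(1984))  # -> 1980-1989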

The hope of this study is to use the bag-of-words technique and the proximity of words to other words (or their equivalents) to help extract meaning from a large set of text-based documents.  The bag-of-words technique counts and identifies key terms and phrases, which helps uncover themes.  The simplest way of thinking about the bag-of-words technique is as word frequencies in a document.
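As a minimal sketch (not the study's actual implementation), a bag-of-words representation can be produced with scikit-learn's CountVectorizer; the two example documents below are invented.

from sklearn.feature_extraction.text import CountVectorizer

# Two invented abstracts standing in for the article corpus.
docs = [
    "Cost and complexity influence IT innovation adoption.",
    "Perceived benefits drive enterprise adoption of IT innovation.",
]

# Bag-of-words: each document becomes a vector of raw word counts; word order is discarded.
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)

# Total frequency of each term across the (tiny) corpus.
for term, freq in zip(vectorizer.get_feature_names_out(), counts.sum(axis=0).A1):
    print(term, int(freq))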

However, understanding the meaning behind the themes requires studying the context in which the words are located and relating them to other themes, also called co-occurrence of terms.  The best way to do this meaning extraction is to measure the strength/distance between the themes.  Finally, the researchers in this study can set minimum and maximum thresholds that enhance the meaning-extraction algorithm to garner insights into IT innovation while reducing the overall noise in the final results. The researchers set the following rules for co-occurrences between themes (a sketch of this windowed counting follows the list):

  • There are approximately 40 words per sentence
  • There are approximately 150 words per paragraph
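The study does not publish its code, so the following is only a sketch, under the assumption that two theme terms co-occur when they fall within the same 40-word window (the sentence rule above); the theme terms and the sample text are illustrative.

from collections import Counter
from itertools import combinations

THEMES = {"cost", "complexity", "adoption", "benefits"}  # illustrative theme terms
WINDOW = 40  # approximate words per sentence, per the first rule above

def co_occurrences(tokens, themes=THEMES, window=WINDOW):
    """Count pairs of theme terms that appear within the same word window."""
    pairs = Counter()
    for start in range(0, len(tokens), window):
        present = themes.intersection(tokens[start:start + window])
        for a, b in combinations(sorted(present), 2):
            pairs[(a, b)] += 1
    return pairs

tokens = "the cost and complexity of adoption rose while perceived benefits stayed flat".split()
print(co_occurrences(tokens))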

How could this implementation of big data have been improved upon?

Goldbloom (2016) stated that big data techniques (machine learning) work best on big data that requires classification, and that they break down when the task is too small and specialized, leaving it suited only to human analysis.  This study looked at only 472 articles: is this big enough for such analysis, or should the analysis reach back through more years than just the 30 covered (Basole et al., 2013)?  What is considered big data in 2013 (the time of this study) may not be big data in 2023 (Fox & Do, 2013).

Mei and Zhai (2005) observed how terms and term frequencies evolved over time and graphed them by year, rather than binning the data into four groups as in Basole et al. (2013).  This case study could have shown how cost and complexity in IT innovation changed over time.  Graphing the results in the manner of Mei and Zhai (2005) and Yoon and Song (2014) would also allow for an analysis of whether each IT innovation theme is in an Introduction, Growth, Maturity, or Decline phase.
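To make the suggestion concrete, here is a small sketch (my own, with invented records) of counting a term's frequency per publication year so that its trend can be graphed year by year rather than by bin.

from collections import Counter, defaultdict

# Hypothetical (year, abstract) records standing in for the article corpus.
records = [
    (1998, "cost and complexity remain barriers to adoption"),
    (2004, "perceived benefits outweigh cost for large enterprises"),
    (2004, "cost pressures shape IT innovation adoption"),
]

term = "cost"
per_year = defaultdict(int)
for year, text in records:
    per_year[year] += Counter(text.lower().split())[term]

# These (year, frequency) points could then be plotted to show the term's trend.
for year in sorted(per_year):
    print(year, per_year[year])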

References

  • Basole, R. C., Seuss, C. D., & Rouse, W. B. (2013). IT innovation adoption by enterprises: Knowledge discovery through text analytics. Decision Support Systems, 54, 1044-1054. Retrieved from http://www.sciencedirect.com.ctu.idm.oclc.org/science/article/pii/S0167923612002849
  • Davenport, T. H., & Dyche, J. (2013). Big Data in Big Companies. International Institute for Analytics, (May), 1–31.
  • Fox, S., & Do, T. (2013). Getting real about Big Data: applying critical realism to analyse Big Data hype. International Journal of Managing Projects in Business, 6(4), 739–760. http://doi.org/10.1108/IJMPB-08-2012-0049
  • Gera, M., & Goel, S. (2015). Data Mining-Techniques, Methods and Algorithms: A Review on Tools and their Validity. International Journal of Computer Applications, 113(18), 22–29.
  • Goldbloom, A. (2016). The jobs we’ll lose to machines –and the ones we won’t. TED. Retrieved from http://www.ted.com/talks/anthony_goldbloom_the_jobs_we_ll_lose_to_machines_and_the_ones_we_won_t
  • Hashimi, H., & Hafez, A. (2015). Selection criteria for text mining approaches. Computers in Human Behavior, 51, 729–733. http://doi.org/10.1016/j.chb.2014.10.062
  • Mei, Q., & Zhai, C. (2005). Discovering evolutionary theme patterns from text: an exploration of temporal text mining. Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, 198–207. http://doi.org/10.1145/1081870.1081895
  • Nassirtoussi, A. K., Aghabozorgi, S., Wah, T. Y., & Ngo, D. C. L. (2015). Text-mining of news-headlines for FOREX market prediction: a multi-layer dimension reduction algorithm with semantics and sentiment. Expert Systems with Applications, 42(1), 306–324.
  • Podesta, J., Pritzker, P., Moniz, E. J., Holdren, J., & Zients, J. (2014). Big Data: Seizing Opportunities. Executive Office of the President of USA, 1–79.
  • Yoon, B., & Song, B. (2014). A systematic approach of partner selection for open innovation. Industrial Management & Data Systems, 114(7), 1068.