Qualitative Analysis: Coding Project Report of a Virtual Interview Question

The virtual interview question: Explain what being a doctoral student means for you? How has your life changed since starting your doctoral journey?

Description of your coding process

The coding process began with reading the responses once, at least one week before this individual project assignment was due.  This allowed me to think about generic themes and codes at a very high level throughout the week.  After the week was over, I went to wordle.net to create a word cloud of the 50 most used words in this virtual interview, which produced the results below.

Figure 1: Screenshot of the wordle.net results, which were used to help develop codes and sub-codes. Words that appear bigger occur more often in the virtual interview than words that appear smaller.
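
For readers who want to reproduce the counting step behind a word cloud like Figure 1 without the website, here is a minimal sketch in Python. The stop-word list, the function name top_words, and the sample sentence are my own assumptions for illustration; wordle.net's actual filtering rules are not documented here and may differ.

```python
import re
from collections import Counter

# Hypothetical, very short stop-word list; a real tool would use a longer one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "my", "i", "for", "has", "that", "it"}


def top_words(text: str, n: int = 50) -> list[tuple[str, int]]:
    """Return the n most frequent non-stop words with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


# Made-up sample text, not taken from the interview responses.
sample = "Time with family matters. Family time and work time compete."
print(top_words(sample, 3))
```

The counts returned by top_words are what a word cloud would scale its font sizes by: larger counts, larger words.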

The most telling themes from Figure 1 are: Time, Family, Life, Work, Student, Learning, First, Opportunity, Research, People, etc.  These helped create some of the codes and sub-codes, such as prioritization, for family, etc.  Figure 1 also confirmed the ideas for codes that I had been considering over the past week, so I felt ready to begin coding.  After deciding on the initial set of codes, I did some manual coding while asking the questions: “What is the person saying? How are they saying it? And could there be a double meaning in the sentences?”  The last question helped me identify whether a sentence in this virtual interview had multiple codes within it.  I used QDA Miner Lite as my software of choice for coding; it is a free product, and there are plenty of end-user tutorials on YouTube, made by researchers from many fields, on how to use this software effectively.  After the initial manual coding, I revisited the initial codebook.  Some of the subcodes that fell under betterment were moved into the future code because they fit that theme better than pure betterment.  This reanalysis was carried out for all codes.  As I re-read the responses for the third time, some new subcodes were added as well.  The reason for re-reading this virtual interview a third time was to make sure that no codes were missing and that few, if any, additional codes could be created.

Topical Coding Scheme (Code Book)

The codebook that was derived is as follows:

  • Family
    • For Family
    • Started by Family
    • First in the Family
  • Perseverance
    • Exhausted
    • Pushing through
    • Life Challenges
    • Drive/Motivation
    • Goals
  • Betterment
    • Upgrade skills
    • Personal Growth
    • Maturity
    • Understanding
    • Priority Reanalysis
  • Future
    • More rewarding
    • Better Life
    • Foresight
  • Proving something
    • To others

 

Diagram of findings

Below are images developed through the analytical/automated part of QDA Miner Lite:

Figure 2: Distribution of codes in percentages throughout the virtual interview.

Figure 3: Distribution of codes in frequency throughout the virtual interview.

Figure 4: Distribution of codes in frequency throughout the virtual interview in terms of a word cloud where more frequent codes appear bigger than less frequent codes.
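
The frequency and percentage distributions in Figures 2-4 came from QDA Miner Lite, but the arithmetic behind them can be sketched in a few lines of Python. The coded segments below are made-up toy data, not the actual interview coding, and the function name code_distribution is my own.

```python
from collections import Counter

# Made-up coded segments: each entry is the set of codes applied to one passage.
coded_segments = [
    {"For Family", "Priority Reanalysis"},
    {"Better Life"},
    {"Priority Reanalysis"},
    {"For Family", "Priority Reanalysis"},
    {"Better Life", "Goals"},
]


def code_distribution(segments):
    """Return {code: (frequency, percent of all code applications)}."""
    counts = Counter(code for segment in segments for code in segment)
    total = sum(counts.values())
    return {code: (n, round(100 * n / total, 1)) for code, n in counts.items()}


for code, (freq, pct) in sorted(code_distribution(coded_segments).items()):
    print(f"{code}: {freq} ({pct}%)")
```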

Brief narrative summary of findings referring to your graphic diagram

Given Figures 2-4, one could say that the biggest themes for entering the doctoral program are the prospect of a better life and the hope of changing the world, as these showed up most frequently in the interview.  One student states that their degree would open many doors: “Pursuing and obtaining this level of degree would help to open doors that I may not be able to walk through otherwise.” Another student says that, hopefully, their research will change the future lives of many: “The research that I am going to do will hopefully allow people to truly pursue after their dreams in this ever-changing age, and let the imagination of what is possible within the business world be the limit.” Other students are a bit more practical with their responses, stating things like “…move up in my organization and make contributions to the existing knowledge” and, finally, “More opportunities open for you as well as more responsibility for being credible and usefulness as a cog in the system.”

Another concept that kept repeating is that the degree is pursued for family, and that because of family, work, and school, the life of a doctoral student in this class has to be reprioritized (hence the code priority reanalysis).  All forms of graphical output show that these are the two most significant things that drive students towards the degree.  One student went to one extreme: “Excluding family and school members, I am void of the three ‘Ps’ (NO – people, pets, or plants). I quit my full-time job and will be having the TV signal turned off after the Super Bowl to force additional focus.”  Another student said that time was the most important thing they had and that it has changed significantly: “The most tangible thing that has changed in my life since I became a doctoral student has been my schedule.  Since this term began I have put myself on a strict schedule designating specific time for studies, my wife, and time for myself.”  Finally, another student says balance is key for them: “Having to balance family time, work, school, and other social responsibilities, has been another adjusted change while on this educational journey. The support of my family has been very instrumental in helping me to succeed and the journey has been a great experience thus far.”  There are 7 instances in which these two codes overlap or are included within each other, which apparently happens 80% of the time.

Thus, from this virtual interview, I am able to conclude that family is mentioned together with priority reanalysis in order to meet the goal of the doctoral degree, and that time management, a component of priority reanalysis, is key.  There are students who take this reanalysis to the extreme, as mentioned above, but if they feel that is the only way they can accomplish this degree in a timely manner, then who am I to judge?  After all, it is the job of the researcher, when coding, to be unbiased.  Although family can drive people to complete the degree, the prospects of a better life and of changing the world for the better were mentioned most.

Appendix A

An output file from qualitative software can be generated by using QDA Miner Lite.

 

LSAT Conditionals and CS Conditionals

In the past few months, I have been studying for the LSAT exam. Yes, I am contemplating law school, but that will be a topic for another day.  However, I came across a few points that are extremely interesting and could spark discussion in the computer science field.  In computer science, our programming languages have control structures such as loops and conditional statements.  One of the most common conditional statements is the IF-THEN statement. However, the LSAT has made me realize that there is more to IF-THEN conditional statements than we usually consider, and here is why (Teti et al., 2013):
  1. If X then Y (Simple IF-THEN statement)
  2. If not Y then not X (This is the contrapositive of 1)
  3. X if and only if Y means if X then Y, and if Y then X
  4. X unless Y means if not X then Y
where X is the sufficient variable and Y is the necessary variable. The word “if” can be replaced by “all,” “any,” “every,” and “when” (Teti et al., 2013), whereas the word “then” can be replaced by “only” or “only if.” Remember that a conditional phrase like the ones above introduces a relationship between the variables, but it doesn’t establish anything concrete on its own. A sufficient variable (X) is enough to guarantee Y, but Y is not enough on its own to guarantee X.
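
The four conditional forms above can be checked mechanically with a truth table. The following is a minimal Python sketch under my own assumptions: the function names implies, iff, and unless are hypothetical helpers, "if and only if" is read as implication in both directions, and "X unless Y" is read as "if not Y then X."

```python
from itertools import product


def implies(x: bool, y: bool) -> bool:
    """'If X then Y' is false only when X is true and Y is false."""
    return (not x) or y


def iff(x: bool, y: bool) -> bool:
    """'X if and only if Y': the implication holds in both directions."""
    return implies(x, y) and implies(y, x)


def unless(x: bool, y: bool) -> bool:
    """'X unless Y', read here as 'if not Y then X'."""
    return implies(not y, x)


# Walk the full truth table and confirm the equivalences.
for x, y in product([True, False], repeat=2):
    assert implies(x, y) == implies(not y, not x)  # 1 is equivalent to its contrapositive, 2
    assert iff(x, y) == (x == y)                   # 3: X and Y always share a truth value
    assert unless(x, y) == implies(not x, y)       # 4: same as 'if not X then Y'

# A sufficient condition guarantees the necessary one, but not the reverse:
print(implies(True, True), implies(False, True))  # Y can be true while X is false
```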
Subsequently, with any conditional, we have to look at conjunctive “and” or disjunctive “or” statements.
  1. Both X and Y = X + Y
  2. Either X or Y = X or Y
  3. Not both X and Y = not X or not Y
  4. Neither X nor Y = not X + not Y
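
The last two items in this list are instances of De Morgan's laws, which can be verified over every combination of truth values. This is a small Python sketch; the function names are my own, and "either ... or" is treated as the inclusive or.

```python
from itertools import product


def both(x: bool, y: bool) -> bool:
    return x and y


def either(x: bool, y: bool) -> bool:
    return x or y  # inclusive: also true when both are true


def not_both(x: bool, y: bool) -> bool:
    return not (x and y)


def neither(x: bool, y: bool) -> bool:
    return not (x or y)


for x, y in product([True, False], repeat=2):
    assert not_both(x, y) == ((not x) or (not y))  # De Morgan, first law
    assert neither(x, y) == ((not x) and (not y))  # De Morgan, second law
```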

We should note that an “or” statement can also allow for the possibility of both (Teti et al., 2013). Additionally, the LSAT adds some nuance to the conditional phrase by adding an “EXCEPT” clause.  For instance (Teti et al., 2013):

  1. Must be true EXCEPT = Could be false
  2. Could be true EXCEPT = Must be false
  3. Could be false EXCEPT = Must be true
  4. Must be false EXCEPT = Could be true
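
One way to see why each EXCEPT phrase flips into its partner is to treat a statement as a list of truth values, one per scenario the rules allow. This is my own framing rather than anything from the cited book, and the function names are hypothetical. Under that assumption, "must" maps to all() and "could" maps to any():

```python
def must_be_true(outcomes: list[bool]) -> bool:
    """True in every allowed scenario."""
    return all(outcomes)


def could_be_true(outcomes: list[bool]) -> bool:
    """True in at least one allowed scenario."""
    return any(outcomes)


def must_be_false(outcomes: list[bool]) -> bool:
    """False in every allowed scenario."""
    return all(not value for value in outcomes)


def could_be_false(outcomes: list[bool]) -> bool:
    """False in at least one allowed scenario."""
    return any(not value for value in outcomes)


# 'EXCEPT' negates the question, so each phrase pairs with its logical opposite.
for outcomes in ([True, True], [True, False], [False, False]):
    assert (not must_be_true(outcomes)) == could_be_false(outcomes)
    assert (not could_be_true(outcomes)) == must_be_false(outcomes)
```
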
The LSAT treats these conditional, conjunctive, and disjunctive phrases with a bit more nuance than computer scientists do, and maybe we can incorporate some of this nuance into future coding to get more nuanced code and results.
Some people may say that this whole post is overkill and ask why we have to look into such nuance. However, each one of the above items is necessary and has value; each was created in the lexicon for a particular reason. We can easily decompose each of these and then map them out in simpler terms with a programming language. However, to fully capture the nuanced characteristics of these conditional phrases, we may end up creating really nasty pieces of convoluted code.
Resources:
  • Teti, T., Teti, J., and Riley, M. (2013). The Blueprint for LSAT Logic Games. Blueprint LSAT Preparation.

A/B Testing

Are you a HiPPO? HiPPO stands for the Highest-Paid Person’s Opinion, the person who designs websites or gives their opinion on how things ought to be (Christian, 2012). This may not be a good thing, because the HiPPO may not be the best person to get the maximum traffic to and through your website. Proponents of A/B testing state that its advantages outweigh the time it takes to conduct it in the first place (Christian, 2012; Patel, n.d.). Patel (n.d.) further claims that “A/B tests, done consistently, can improve your bottom line substantially.”  Therefore, A/B testing helps the data scientist to narrow down which element/variable makes an effective difference towards their goal, e.g., click-through rate within a website to generate more revenue (Christian, 2012; Patel, n.d.; Unbounce.com, n.d.).
First, you need to know which elements/variables you want to test. It is key to know what you want to test, the current baseline result, what you are testing for, and the goal you want to reach (Patel, n.d.). If you have a click funnel for your audience and they are dropping out at a certain level, you may want to focus on that area to improve fallout rates.  Once you know what to test, make a list of all the variables you would like to test (Christian, 2012; Patel, n.d.; Unbounce.com, n.d.):
  • location
  • color
  • button type
  • surrounding type
  • text font
  • font size
  • any graphic you use
  • product descriptions
  • sales copy
  • verbiage
  • different offers (50% off, 35% off, free sample, etc.)
  • a whole page
  • a whole landing page

Then you set your control element/variable, which is essentially what you have now, and call it your A. Meanwhile, the element/variable you want to test is your B, to be run simultaneously with the control (Christian, 2012; Patel, n.d.). The A and B variables are also known as variants; the challenger is the B variable, and the champion is the variable that outperforms the others (Unbounce.com, n.d.).  For instance, 100% of the audience will be split so that 50% see your site with variable A and the other 50% see your site with variable B. The split can vary from 50/50 to 60/40 to 70/30, etc., depending on how much weight you want to assign to the challenger variable (Unbounce.com, n.d.).
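
To make the traffic split concrete, here is a minimal Python sketch of one common way to assign visitors to A or B: hash a visitor identifier into a number between 0 and 1 and compare it with the challenger's share. The function name assign_variant, the user-id format, and the choice of SHA-256 are assumptions for illustration, not something prescribed by the sources above.

```python
import hashlib


def assign_variant(user_id: str, challenger_share: float = 0.5) -> str:
    """Deterministically map a visitor to 'A' (control) or 'B' (challenger)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # roughly uniform in [0, 1)
    return "B" if bucket < challenger_share else "A"


# Hypothetical visitor ids; a 70/30 split sends about 30% of traffic to B.
users = [f"user-{i}" for i in range(10_000)]
share_b = sum(assign_variant(u, 0.3) == "B" for u in users) / len(users)
print(f"share of traffic sent to B: {share_b:.2%}")
```

Hashing instead of drawing a fresh random number means a returning visitor always sees the same variant, which keeps their experience consistent for the duration of the test.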

Another thing to consider is what statistical test you want to apply to the A/B test:
  • If Gaussian is the assumed distribution (e.g., average revenue per paying user), you can use the unpaired t-test and/or Student’s t-test (Amazon, 2015; Box et al., 1987; Pereira, 2007).
  • If Binomial is the assumed distribution (e.g., click-through rate), you can use Fisher’s exact test and/or Barnard’s test (Amazon, 2015).
  • If Poisson is the assumed distribution (e.g., transactions per paying user), you can use the E-test and/or C-test (Krishnamoorthy & Thomson, 2004).
  • If Multinomial is the assumed distribution (e.g., the number of each product purchased), you can use the Chi-square test.
  • If the assumed distribution is unknown, you can use the Mann-Whitney U test and/or Gibbs Sampling.
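
As an illustration of the binomial case, here is a sketch of Fisher's exact test for click-through data using only the Python standard library. The function name and the click counts are made up for this example, and the two-sided p-value uses one common convention (summing the probabilities of all tables no more likely than the observed one); statistical packages may offer other conventions.

```python
from math import comb


def fisher_exact_two_sided(clicks_a: int, views_a: int, clicks_b: int, views_b: int) -> float:
    """Two-sided Fisher's exact p-value for a 2x2 click / no-click table."""
    total_clicks = clicks_a + clicks_b
    denom = comb(views_a + views_b, total_clicks)

    def prob(k: int) -> float:
        # Hypergeometric probability that variant A receives k of the clicks.
        return comb(views_a, k) * comb(views_b, total_clicks - k) / denom

    lowest = max(0, total_clicks - views_b)
    highest = min(views_a, total_clicks)
    observed = prob(clicks_a)
    # Add up every table that is at least as unlikely as the observed one.
    p_value = sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)  # small tolerance for floating-point ties
    )
    return min(1.0, p_value)


# Made-up counts: 30 clicks in 1,000 views for A versus 45 in 1,000 for B.
print(fisher_exact_two_sided(30, 1000, 45, 1000))
```

A small p-value suggests the difference in click-through rate between A and B is unlikely to be due to chance alone; the threshold you pick (for example 0.05) is a separate decision.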
Testing can go on for a few days to a few weeks depending on the amount of traffic you get (Patel, n.d.).  A site like Facebook can start an A/B test on Monday and have a reported result by Friday, whereas a test on my blog in its current state may take a month or two. However, running a test for too long, or on the wrong set of traffic, can introduce confounding variables, which will skew your results.
To test three variations, also known as a multivariate test, according to Patel (n.d.), you need to set up an A/B test, a B/C test, and a C/A test, and you want to give it a bit more time to gather enough data.  When doing a multivariate test, you want to give the variations equal weight to pick a champion variable quickly (Unbounce.com, n.d.).
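
The pairwise setup for three variations can be listed programmatically, which becomes handy if you ever test more than three. This is a small Python sketch; the variant labels are placeholders.

```python
from itertools import combinations

variants = ["A", "B", "C"]  # placeholder labels for the three variations

# Every pair of variants gets its own head-to-head comparison.
pairwise_tests = list(combinations(variants, 2))

# Equal traffic weight for each variant.
weights = {variant: 1 / len(variants) for variant in variants}

print(pairwise_tests)
print(weights)
```

Note that the C/A test shows up here as the pair ("A", "C"); the order within a pair does not matter for the comparison.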
Another great article to read is from Kolowich (n.d.), which provides a checklist for successfully conducting an A/B test.
Resources