Part 3: Survey Analysis

April 22, 2021

We’ve built a survey, distributed it to potential respondents, and are now ready to start analyzing the results.

Although the possibilities for data gathering through surveys are limited only by your imagination, most questionnaires are made up of just a few basic question types. Understanding what they are and what can be done with their respective output is crucial for getting the most value from the raw data.


1. Multiple choice

A multiple choice question that limits respondents to a single selection is used in two situations: when only one answer is possible (what month were you born?) or when you want to understand the most relevant selection from a set of given options (what is your favorite type of ice cream?).

How many new data hires do you envision making this coming year?

 Fewer than last year
 The same as last year
 More than last year
 I do not know

Since respondents are forced to select only one response, be sure to include all possible options. When that is not practical due to length, simply include other - please specify as the final choice. In this example, we added I do not know in case the respondent doesn’t have enough information about hiring from the previous or current year.

The most common way to summarize these select one questions is to share the proportion of responses from each unique option relative to the total number of respondents. The sum of these should always total 100 percent.

How many new data hires do you envision making this coming year?

| Response option | Respondents selecting | Percent of total |
|---|---|---|
| Fewer than last year | 103 | 52% |
| The same as last year | 60 | 30% |
| More than last year | 33 | 16% |
| I do not know | 4 | 2% |
| Total | 200 | 100% |

The majority of HR professionals responding to the survey (52%) anticipate making fewer data hires this year than they did the previous year. Only 16% expect to hire more.
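The percent-of-total summary described above can be sketched in a few lines of Python. This is a minimal illustration that uses the counts from the table; the variable names are our own and not part of any survey tool.

```python
# Counts taken from the summary table above.
counts = {
    "Fewer than last year": 103,
    "The same as last year": 60,
    "More than last year": 33,
    "I do not know": 4,
}

total = sum(counts.values())  # number of respondents

# Exact share of respondents choosing each option, in percent.
shares = {option: 100 * n / total for option, n in counts.items()}

for option, share in shares.items():
    print(f"{option}: {counts[option]} respondents, {share:.1f}%")

# Each respondent picks exactly one option, so the exact shares sum to 100.
print(f"Total: {total} respondents, {sum(shares.values()):.1f}%")
```

The exact shares for the first and third options are 51.5 and 16.5 percent. Whole-number rounding of values like these does not always add up to exactly 100, so it is worth checking the rounded column before publishing.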

Two common feedback variants

Likert scale

The Likert response scale attempts to gauge satisfaction or agreement from low to high. A survey may make a statement and provide response options such as:

Data skills are less important than soft skills in new hires. Do you agree or disagree?

 Strongly disagree
 Somewhat disagree
 Neither agree nor disagree
 Somewhat agree
 Strongly agree

The responses from 200 of our HR professionals in Europe are summarized in the table below.

Although counts and percentage summaries are most common, you may also see mean scores reported by assigning a numeric value to the ordered response sequence.

| Response label | Ordered rank value | Respondents |
|---|---|---|
| Strongly disagree | 1 | 10 |
| Somewhat disagree | 2 | 27 |
| Neither agree nor disagree | 3 | 57 |
| Somewhat agree | 4 | 52 |
| Strongly agree | 5 | 54 |

In our simple example, the mean score for the question would be calculated by either taking the weighted average of the rank ordered result table above or by converting each response from its response label to its appropriate ordered rank value.

In the latter approach we then simply take the average of the converted values to find a mean score for the question of 3.6, a one-number indication that the average respondent leans toward agreeing with the statement. A mean value of exactly 3 in our case would indicate the typical respondent neither agreed nor disagreed.
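Both routes to the mean score can be sketched as follows. This is an illustrative example built from the table counts; the names `LABEL_TO_VALUE` and `label_counts` are our own.

```python
# Ordered rank value assigned to each response label.
LABEL_TO_VALUE = {
    "Strongly disagree": 1,
    "Somewhat disagree": 2,
    "Neither agree nor disagree": 3,
    "Somewhat agree": 4,
    "Strongly agree": 5,
}

# Respondents per label, from the table above.
label_counts = {
    "Strongly disagree": 10,
    "Somewhat disagree": 27,
    "Neither agree nor disagree": 57,
    "Somewhat agree": 52,
    "Strongly agree": 54,
}

# Approach 1: weighted average of the rank values in the summary table.
n = sum(label_counts.values())
weighted_mean = sum(LABEL_TO_VALUE[label] * count for label, count in label_counts.items()) / n

# Approach 2: convert one label per respondent to its value, then average.
responses = [label for label, count in label_counts.items() for _ in range(count)]
converted = [LABEL_TO_VALUE[label] for label in responses]
simple_mean = sum(converted) / len(converted)

print(round(weighted_mean, 1), round(simple_mean, 1))
```

Both approaches give the same result because the summary table is just a compressed form of the respondent-level data.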

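Although this technique is mathematically possible because Likert scales can be converted to rank-ordered numbers (e.g., Strongly disagree = 1, Somewhat disagree = 2, etc.), many statisticians advise against using mean values in these situations. The responses are ordinal, so the gap between Strongly disagree and Somewhat disagree is not necessarily the same size as the gap between Somewhat agree and Strongly agree.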

In spite of such concerns, mean score summaries can help simplify general sentiment, especially when communicating a long list of questions with the same ordinal scale.

Finally, you might want to compare intensity at the edges of the response scale. This is referred to as Top Box when focusing on the top of the scale (e.g., Strongly agree). If you collapse the top two options (e.g., Strongly agree and Somewhat agree) it is called Top 2 Box. Conversely, if you look at or compare with the bottom of the scale, it would be Bottom Box or Bottom 2 Box.

In our example, Top 2 Box would show that 53 percent of respondents either somewhat or strongly agree with the statement, compared with just 18.5 percent who either somewhat or strongly disagree (Bottom 2 Box).
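The box measures can be computed directly from the response counts in the Likert table. A minimal sketch, assuming the labels are stored in scale order from low to high; `box_share` is a hypothetical helper name.

```python
# Response counts in scale order (low to high), from the Likert table.
likert_counts = {
    "Strongly disagree": 10,
    "Somewhat disagree": 27,
    "Neither agree nor disagree": 57,
    "Somewhat agree": 52,
    "Strongly agree": 54,
}

scale = list(likert_counts)  # dicts preserve insertion order
total = sum(likert_counts.values())


def box_share(labels):
    """Percent of all respondents who chose any of the given labels."""
    return 100 * sum(likert_counts[label] for label in labels) / total


top_box = box_share(scale[-1:])      # highest option only
top_2_box = box_share(scale[-2:])    # two highest options combined
bottom_box = box_share(scale[:1])    # lowest option only
bottom_2_box = box_share(scale[:2])  # two lowest options combined

print(f"Top Box: {top_box:.1f}%  Top 2 Box: {top_2_box:.1f}%")
print(f"Bottom Box: {bottom_box:.1f}%  Bottom 2 Box: {bottom_2_box:.1f}%")
```

Note that the denominator is always the full respondent base, including those who chose the neutral midpoint.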

Net Promoter Score

Net Promoter Score (NPS) is a popular and simple business metric created by Bain & Company. Respondents are asked how likely they would be on a scale from zero (not at all likely) to ten (extremely likely) to recommend a specific product, service, or company to other people.

How likely are you to recommend DataKwery training programs to others?

0: Not at all likely
1
2
3
4
5
6
7
8
9
10: Extremely likely

Let’s say the survey responses to the NPS question came back like this:

Respondent percentages are then grouped as follows.

  • Promoters: Responses of nine and ten = 46 percent
  • Neutral: Responses of seven and eight = 29 percent
  • Detractors: Responses of zero through six = 25 percent

The final NPS score is the percentage of respondents who are promoters (46) minus the percentage who are detractors (25). NPS can range from negative 100 percent to positive 100 percent and is often reported without the percent sign. In our example, we found an NPS score of 21 (46 minus 25). The higher the value, the better. Depending on the industry, any score above zero is considered pretty darn good.
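The grouping and subtraction can be sketched as a small function. The article only reports the grouped percentages, so the individual ratings below are hypothetical and were constructed solely to reproduce the 46 / 29 / 25 split.

```python
def net_promoter_score(scores):
    """Return NPS (promoter percent minus detractor percent) for 0-10 ratings."""
    if not scores:
        raise ValueError("at least one response is required")
    if any(not 0 <= s <= 10 for s in scores):
        raise ValueError("ratings must be between 0 and 10")
    promoters = sum(1 for s in scores if s >= 9)   # ratings of 9 and 10
    detractors = sum(1 for s in scores if s <= 6)  # ratings of 0 through 6
    # Ratings of 7 and 8 count toward the base but toward neither group.
    return 100 * (promoters - detractors) / len(scores)


# Hypothetical 100 responses matching the grouped percentages above:
# 46 promoters, 29 neutral, 25 detractors.
scores = [10] * 26 + [9] * 20 + [8] * 17 + [7] * 12 + [6] * 10 + [5] * 8 + [3] * 5 + [0] * 2

print(net_promoter_score(scores))
```

Because every detractor counts the same, a rating of six pulls the score down exactly as much as a rating of zero.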


2. Multiple select

Sometimes you don’t want to limit your respondents to only one choice. A multiple select question allows people to choose as many options as are applicable.

What data roles do you plan to hire this year?

 Business Analyst
 Data Analyst
 Data Scientist
 Data Engineer
 Data Architect
 Other (please specify)

Unlike select one multiple choice questions, summary percentages generally sum to more than 100 percent as respondents are able to pick all options that reflect their reality.

What data roles do you plan to hire this year?

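| Response option | Respondents selecting | Total respondents | Percent of total |
|---|---|---|---|
| Data Scientist | 154 | 200 | 77.0% |
| Data Engineer | 147 | 200 | 73.5% |
| Data Analyst | 140 | 200 | 70.0% |
| Business Analyst | 122 | 200 | 61.0% |
| Data Architect | 57 | 200 | 28.5% |
| Other | 13 | 200 | 6.5% |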

Here we find that it is more common for companies to anticipate Data Scientist hires (77.0%) when compared with Data Architect hires (28.5%).
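The key detail with multiple select questions is the denominator: each percentage is based on the number of respondents, not the number of selections. A minimal sketch follows; the four raw answers in `sample` are hypothetical, while `table_counts` comes from the table above.

```python
from collections import Counter


def multi_select_shares(responses):
    """Percent of respondents selecting each option.

    `responses` holds one collection of selected options per respondent.
    """
    n = len(responses)
    if n == 0:
        raise ValueError("at least one respondent is required")
    # set() guards against an option being counted twice for one respondent.
    tally = Counter(option for selected in responses for option in set(selected))
    return {option: 100 * count / n for option, count in tally.items()}


# Hypothetical raw answers from four respondents (illustrative only).
sample = [
    {"Data Scientist", "Data Engineer"},
    {"Data Scientist"},
    {"Data Analyst", "Data Scientist", "Business Analyst"},
    {"Data Engineer"},
]
sample_shares = multi_select_shares(sample)

# The same calculation applied to the published counts.
table_counts = {
    "Data Scientist": 154,
    "Data Engineer": 147,
    "Data Analyst": 140,
    "Business Analyst": 122,
    "Data Architect": 57,
    "Other": 13,
}
respondents = 200
table_shares = {option: 100 * count / respondents for option, count in table_counts.items()}
avg_selections = sum(table_counts.values()) / respondents

print(table_shares)
print(f"Shares sum to {sum(table_shares.values()):.1f}%")
print(f"Average selections per respondent: {avg_selections:.2f}")
```

The shares in the table sum to well over 100 percent, which reflects that the average respondent selected a little more than three roles.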


3. Rank order

Select one multiple choice questions limit the respondent to one response. Although select all questions enable a respondent to choose as many options as are relevant, they don’t reveal how much more important one selection is compared with another. That’s where rank order questions come in.

Please rank the importance of the following strategic actions for your organization from most important (1) to least important (4).

  • Enter New Markets
  • Improve Customer Engagement
  • Launch New Products
  • Reduce Costs

Rank order questions ask respondents to explicitly make value judgments against a series of options. There are a few ways to analyze such responses. Assuming everyone responds to each option, you should have a rank for each strategic option. In our example, this ranges from a rank of one to a rank of four. Here are the responses from the first ten respondents.

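Please rank the importance of the following strategic actions for your organization from most important (1) to least important (4).

| respondent_id | Enter New Markets | Improve Customer Engagement | Reduce Costs | Launch New Products |
|---|---|---|---|---|
| 1 | 3 | 4 | 2 | 1 |
| 2 | 4 | 3 | 2 | 1 |
| 3 | 3 | 2 | 1 | 4 |
| 4 | 2 | 3 | 4 | 1 |
| 5 | 3 | 2 | 4 | 1 |
| 6 | 4 | 2 | 3 | 1 |
| 7 | 2 | 3 | 4 | 1 |
| 8 | 3 | 2 | 1 | 4 |
| 9 | 2 | 3 | 4 | 1 |
| 10 | 2 | 3 | 4 | 1 |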

Mean rank

The mean rank calculates the average selected rank for each response. A lower average value indicates a higher rank and a higher average value indicates a lower rank.

| Response option | Mean rank |
|---|---|
| Launch New Products | 1.96 |
| Improve Customer Engagement | 2.61 |
| Reduce Costs | 2.71 |
| Enter New Markets | 2.72 |

Launching new products has a higher average rank (a mean of 1.96) than entering new markets (2.72), a glimpse into the strategic priorities for these companies.
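The mean rank calculation can be sketched using only the ten respondents shown earlier. Because this is a small slice of the data, the values will not match the full-sample table, but the mechanics are the same.

```python
OPTIONS = [
    "Enter New Markets",
    "Improve Customer Engagement",
    "Reduce Costs",
    "Launch New Products",
]

# Ranks from the first ten respondents, in the column order of OPTIONS.
rows = [
    (3, 4, 2, 1),
    (4, 3, 2, 1),
    (3, 2, 1, 4),
    (2, 3, 4, 1),
    (3, 2, 4, 1),
    (4, 2, 3, 1),
    (2, 3, 4, 1),
    (3, 2, 1, 4),
    (2, 3, 4, 1),
    (2, 3, 4, 1),
]

# Average the ranks in each column.
mean_rank = {
    option: sum(row[i] for row in rows) / len(rows)
    for i, option in enumerate(OPTIONS)
}

# A lower mean value means a higher priority, so sort ascending.
ordered = sorted(mean_rank.items(), key=lambda item: item[1])

for option, value in ordered:
    print(f"{option}: {value:.2f}")
```

As a quick sanity check, when every respondent ranks all four options the mean ranks always sum to 10 (1 + 2 + 3 + 4).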

Proportion of ranked values

You could also look at how many times a specific rank was assigned to a given option. For instance, 63 percent of respondents assigned launching new products as their top priority. Meanwhile, only 6.5 percent of respondents indicated that entering new markets was the top goal.

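| Response option | Rank 1 | Rank 2 | Rank 3 | Rank 4 | Total |
|---|---|---|---|---|---|
| Enter New Markets | 6.5% | 23.5% | 61.0% | 9.0% | 100% |
| Improve Customer Engagement | 10.0% | 39.5% | 30.0% | 20.5% | 100% |
| Launch New Products | 63.0% | 7.5% | 0.5% | 29.0% | 100% |
| Reduce Costs | 20.5% | 29.5% | 8.5% | 41.5% | 100% |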

Although both approaches show the same directional finding, the proportion of ranked values measure is likely more impactful in this case as it demonstrates wider numeric differentiation.
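The two views are closely linked: the mean rank is simply the rank distribution collapsed into one number (each rank weighted by the share of respondents who assigned it). The sketch below, using the percentages from the table above, is one way to see why the distribution shows wider differentiation; the variable names are illustrative.

```python
# Percent of respondents assigning rank 1, 2, 3, 4 to each option.
rank_shares = {
    "Enter New Markets": [6.5, 23.5, 61.0, 9.0],
    "Improve Customer Engagement": [10.0, 39.5, 30.0, 20.5],
    "Launch New Products": [63.0, 7.5, 0.5, 29.0],
    "Reduce Costs": [20.5, 29.5, 8.5, 41.5],
}

# Each row should be a complete distribution.
for option, shares in rank_shares.items():
    if abs(sum(shares) - 100) > 1e-9:
        raise ValueError(f"shares for {option} do not sum to 100")

# Mean rank = sum of (rank x share of respondents assigning that rank).
mean_rank = {
    option: sum(rank * share for rank, share in enumerate(shares, start=1)) / 100
    for option, shares in rank_shares.items()
}

# Share of respondents who put each option first.
top_choice = {option: shares[0] for option, shares in rank_shares.items()}

mean_spread = max(mean_rank.values()) - min(mean_rank.values())
top_spread = max(top_choice.values()) - min(top_choice.values())

print(mean_rank)
print(f"Spread in mean rank: {mean_spread:.2f} on a 1 to 4 scale")
print(f"Spread in rank 1 share: {top_spread:.1f} percentage points")
```

The mean ranks differ by less than one point on a four-point scale, while the share of first-place rankings differs by more than 50 percentage points between the most and least favored options.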


4. Text entry

Text entry questions work well when you don’t have a clear idea of which response options best frame a topic, or when you simply want to give respondents a chance to provide more flexible, open-ended input.

What are your biggest hiring challenges for the coming year?

Analysis of text data is not as straightforward as it is for the other question types. However, we can use basic text analytics and Natural Language Processing (NLP) to get a sense of what people are thinking.

Here are some customer reviews from an e-commerce platform to use as an example with responses shown for the first ten respondents who provided a review.

Please leave a review from your purchase

| Respondent ID | Review |
|---|---|
| 3 | Some major design flaws |
| 4 | My favorite buy! |
| 5 | Flattering shirt |
| 6 | Not for the very petite |
| 7 | Cagrcoal shimmer fun |
| 8 | Shimmer, surprisingly goes with lots |
| 9 | Flattering |
| 10 | Such a fun dress! |
| 11 | Dress looks like it’s made of cheap material |
| 13 | Perfect!!! |

It would be difficult to sift through the 19,676 records manually to tag certain words or emotions. Thankfully we can turn this into a data problem by putting each word in its own row and then analyzing the adjusted data series.

This makes it easy to count which words appear most often or conduct sentiment analysis to get a broader sense of customer satisfaction. For now we’ll look at the most frequently mentioned words, excluding common terms such as the or and.

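| Rank | Word | Number of mentions |
|---|---|---|
| 1 | love | 1,867 |
| 2 | dress | 1,654 |
| 3 | cute | 1,554 |
| 4 | beautiful | 1,408 |
| 5 | top | 1,177 |
| 6 | perfect | 816 |
| 7 | pretty | 672 |
| 8 | fit | 612 |
| 9 | nice | 528 |
| 10 | flattering | 506 |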

Overall, customers look pretty happy. A wordcloud is a common way to visualize which terms dominate the conversation, aligning word size and color with the number of mentions.
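The word-per-row counting described in this section can be sketched with the Python standard library, using only the ten sample reviews shown earlier. The stop-word list here is a small illustrative assumption; real projects usually rely on a fuller list from an NLP library.

```python
import re
from collections import Counter

# The ten sample reviews, keyed by respondent ID.
reviews = {
    3: "Some major design flaws",
    4: "My favorite buy!",
    5: "Flattering shirt",
    6: "Not for the very petite",
    7: "Cagrcoal shimmer fun",
    8: "Shimmer, surprisingly goes with lots",
    9: "Flattering",
    10: "Such a fun dress!",
    11: "Dress looks like it's made of cheap material",
    13: "Perfect!!!",
}

# Small illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "and", "for", "it's", "my", "of", "some", "such", "the", "very", "with"}


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


# One (respondent ID, word) row per word: the "each word in its own row" step.
word_rows = [(rid, word) for rid, text in reviews.items() for word in tokenize(text)]

# Count mentions, skipping common terms.
word_counts = Counter(word for _, word in word_rows if word not in STOP_WORDS)

for word, mentions in word_counts.most_common(4):
    print(word, mentions)
```

With only ten short reviews the counts are tiny, but the same steps scale to the full set of records. Sentiment analysis would typically build on the same word-per-row table by joining each word to a sentiment lexicon.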


Recap

We’ve now covered key aspects of survey planning and distribution, as well as analysis for core question types. Despite remaining one of the primary tools for organizations to understand the world, many people underestimate the effort required to run surveys and collect meaningful results.

We hope this practical guide has provided some ideas to help with each step of your next survey project.

This article is the third in a three-part series on practical survey considerations with guidance from Gregg Schoenfeld, survey guru and founder of MNoet, a boutique research consultancy.
