The old axiom “content is king” has proven untrue; it appears that crown now belongs to data. As its ubiquitous presence informs decisions affecting everything from healthcare and finance to entertainment marketing, the gathering and analysis of data have become invaluable methods for generating insight that sustains our businesses, improves the many devices we rely upon, and enhances the overall quality of our lives.
Enter McIntire’s Center for Business Analytics (CBA). Promoting educational and professional innovation efforts throughout the Commerce School, the CBA offers in-depth research opportunities for students interested in studying data’s many challenges, including how to best measure and interpret it. Through the Deloitte Foundation Data Analytics Fellowship sponsorship of the Deloitte Foundation Analytics Scholars Program, the Center has secured a two-year agreement to support this important student-conducted research.
Corporate partner Deloitte, one of the world’s leading professional services firms, invested in student projects concerned with industry-oriented predictive analytics. Under guidance from Professor David Dobolyi, four recent McIntire graduates from the Class of 2018—Michelle Chen, Cody Kelderhouse, Joshua Peters, and Reilly Sheehy—considered data issues throughout the 2017-2018 academic year as Deloitte Foundation Analytics Scholars.
Peters spent the fall semester analyzing an existing data set developed by the CBA and Ipsos Public Affairs about public opinions on automation, with the goal of drawing new insights through techniques including sentiment analysis conducted using R and Python. The three-student team of Chen, Kelderhouse, and Sheehy analyzed 3D printer reviews on Amazon from descriptive and machine-learning perspectives to determine underlying customer sentiment and other variables as predictive indicators of consumer ratings. The aim of the research, says Kelderhouse, was to deliver actionable, cost-effective insights about how to better understand consumer demand for the machines.
Learning Languages for Machine Learning
Using an extensive data set of 3D printers listed on Amazon, the trio filtered the information into standard definitions arranged by specific types of machines, choosing only those listings with a minimum of 30 posted reviews. The data ultimately yielded more than 6,000 reviews across 33 products, which were then cleaned to reveal the most meaningful content.
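As a rough illustration of the 30-review cutoff described above, the filter could be sketched in Python along the following lines. The record layout (a `product_id` field on each review) and the function name are assumptions made for this example; the article does not describe the team's actual code, which Dobolyi notes relied largely on R.

```python
from collections import Counter

MIN_REVIEWS = 30  # threshold stated in the article


def filter_by_review_count(reviews, min_reviews=MIN_REVIEWS):
    """Keep only reviews whose product has at least `min_reviews` reviews.

    `reviews` is a list of dicts with a hypothetical "product_id" key.
    """
    counts = Counter(r["product_id"] for r in reviews)
    return [r for r in reviews if counts[r["product_id"]] >= min_reviews]


# Hypothetical toy data: product "A" has 30 reviews, product "B" has only 2.
sample = [{"product_id": "A", "stars": 5} for _ in range(30)]
sample += [{"product_id": "B", "stars": 1} for _ in range(2)]

kept = filter_by_review_count(sample)
print(len(kept), {r["product_id"] for r in kept})
```

Counting reviews per product first and filtering second keeps the cutoff in one place, so the threshold can be changed without touching the rest of the pipeline.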
In generating research insights for the CBA, Dobolyi notes, the scholars gained “hands-on experience that furthered their analytical skills, particularly in terms of working with the R statistical programming language.”
Seeking an avenue to apply technical skills to real-world projects, Chen was particularly interested in learning the R language. The Deloitte Foundation-funded project gave her ample opportunities to use R to prepare and analyze the review data, which included using text analytics “to determine the overall polarity of reviews as positive or negative,” she explains.
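One simple way to assign polarity of the kind Chen describes is a word-list approach, sketched here in Python. This is only an illustration: the tiny word lists, the `polarity` function, and the scoring rule are all hypothetical, and the article does not say which text-analytics method or lexicon the team used in R.

```python
import re

# Hypothetical, deliberately tiny word lists; a real lexicon would be far larger.
POSITIVE = {"great", "easy", "reliable", "love", "excellent"}
NEGATIVE = {"broken", "jammed", "terrible", "poor", "failed"}


def polarity(text):
    """Label a review by counting positive versus negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(polarity("Easy setup and reliable prints, love it"))
print(polarity("Arrived broken and the nozzle jammed"))
```

A count-based score like this ignores negation and context ("not reliable" would score as positive), which is one reason lexicon results are usually treated as a first pass.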
Part of the group’s initial analysis found that although more people left five-star reviews, a higher number of one-star ratings were marked as helpful by other Amazon users. As expected, five-star ratings abounded with positive emotions, while one-star reviews reflected a negative sentiment.
The team then focused on extremely positive and negative reviews that were considered helpful by a high number of users. The group prepared its final data set through a variety of steps, including “downsampling” to evaluate an equal number of one- and five-star ratings before training machine-learning models. It then attempted to predict user product scores based on the semantic patterns in the text of the reviews, along with other relevant factors such as the number of comments reviews received and helpfulness ratings.
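Downsampling, as mentioned above, means trimming the larger class so that each rating is equally represented before a model is trained. A minimal Python sketch, assuming each review is a dict with a `stars` field (a hypothetical layout, not the team's actual data structure), might look like this:

```python
import random


def downsample(reviews, label_key="stars", seed=0):
    """Randomly trim every class down to the size of the smallest class."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_label = {}
    for r in reviews:
        by_label.setdefault(r[label_key], []).append(r)
    target = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: eight five-star reviews and three one-star reviews.
reviews = [{"stars": 5, "text": f"great {i}"} for i in range(8)]
reviews += [{"stars": 1, "text": f"broken {i}"} for i in range(3)]

balanced = downsample(reviews)
print(len(balanced))
```

Balancing the classes this way keeps a model from scoring well simply by always guessing the more common rating, at the cost of discarding some of the majority-class reviews.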
Evaluating machine-learning models such as gradient boosted models, decision trees, and logistic regression using several key performance indicators, the team successfully classified one-star versus five-star reviews, reaching an average predictive accuracy of 73 percent.
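To make a figure like "predictive accuracy" concrete, the sketch below computes accuracy alongside precision and recall for a one-star versus five-star classifier. The article does not list which key performance indicators the team used, so the choice of these three metrics, the function name, and the toy labels are assumptions for illustration only.

```python
def classification_metrics(actual, predicted, positive=5):
    """Compute accuracy, precision, and recall for a two-class problem.

    `positive` marks which label counts as the positive class (five stars here).
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal length")
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical labels: four five-star and four one-star reviews.
actual = [5, 5, 5, 5, 1, 1, 1, 1]
predicted = [5, 5, 5, 1, 1, 1, 5, 1]

metrics = classification_metrics(actual, predicted)
print(metrics)
```

On a balanced data set such as the downsampled one the team describes, a model that guesses at random would land near 50 percent accuracy, which gives a baseline for reading a reported accuracy figure.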
How might businesses use their research to their advantage?
The team believes manufacturers should release new products or highlight updates to reengage customers and increase satisfaction with their products. As reviews tend to worsen over time, existing listings are at a disadvantage when compared with new ones.
Another takeaway: As products’ comment count and reviews that are marked as helpful prove to be strong indicators of ratings, 3D printer manufacturers could also leverage a network of Amazon influencers to provide comprehensive reviews.
“There are many things you can do with these predictions,” says Chen. “Manufacturers could examine reviews to obtain a better understanding of customers’ feelings about their products on Amazon.”
Chen adds that manufacturers could also examine the spread of positive or negative polarity and sentiment across competitor products to see how consumers perceive other 3D printers and which aspects, like ease of use or reliability, consumers place the highest value on. “Companies could then make adjustments and improvements to their products, how they describe their product on Amazon, and even how customer service might respond to consumer feedback,” she says.
Calling the Deloitte Foundation Analytics Scholars Program “invaluable,” Chen, Kelderhouse, and Sheehy say that undertaking the research project taught them a great deal in a short amount of time about analyzing real-world data. And although they graduated from the McIntire School at the end of the spring semester, the three say they hope Deloitte will continue to partner with the CBA for years to come so that more Commerce students can benefit from the supportive relationship of collaborating with a top-tier professional services firm.
For the recent graduates’ part, the research effort has impacted their plans as well as their undergraduate learning. Chen will continue studying analytics in the fall, attending Columbia University to earn her master’s degree in data science. Kelderhouse is headed to Austin, Texas, to take a position as a Consultant with Informatica, yet he also intends to pursue a data science master’s degree sometime in the future.
Sheehy is a newly hired Deloitte employee, slated to be a Federal Analytics Consultant in the company’s Financial and Risk Advisory division. For him, the program had clear benefits: “I was really excited to get a head start in learning about the types of projects, tools, and data sets a typical analyst might encounter.”
Ultimately, the team’s final presentation to the Deloitte sponsors and members of the McIntire faculty, including Dobolyi, was well received. “A standout aspect of the presentations involved the consistent focus on business and research objectives,” Dobolyi says. “The fellows succeeded in funneling a broad array of complex statistical analyses into a thoughtful set of key takeaways, which were both easy to understand and compelling, thanks to both the excellent use of visuals and well-practiced presentation skills.”
Deloitte practitioner Max Melnick (Engineering ’09) says he was “extremely impressed” by the quality of the work the fellows delivered.
“The content in the presentation was indicative of the strong technical skills the fellows developed during the year,” Melnick notes. “The presentation was expertly crafted and delivered very professionally. The fellows did a great job communicating the impact of their work.”
Sheehy believes that presenting the team’s research was a motivating force and one that fit directly into his employment plans. “Presenting analytics on behalf of a client rather than for an internal project is a very different proposition. I wanted to better understand the project timelines, expectations, and deliverables typical of client-facing work,” he says. “There is currently a huge amount of growth in Deloitte’s analytics practice, so having the opportunity to research and expand my own analytics knowledge under this brand was incredibly important to me.”