The Science Behind Data Science
By: Kelsey O'Neill
I developed an interest in data analytics after using Google Analytics at my previous job. I knew that data science was something that I was curious to learn more about, so I decided to pursue more of a background in the field by obtaining a Master’s degree in Predictive Analytics from Northwestern University.
I was drawn to Glidewell Dental because I knew what the company wanted to achieve with technology. During my interview and the tour of the lab that followed, I saw all of the science and innovation in action, which left me fascinated and eager to take part in it.
Since I was hired, Glidewell has challenged me to use many of the techniques and skills that I learned in my graduate program.
For example, I help design beta tests and determine optimal sample sizes for experiments in the lab. I also write and analyze surveys with thousands of respondents. By tapping into dental industry data on the web and social media, I’m able to identify upcoming trends. My job also includes interacting with our executives in order to get a feel for the direction the company is heading.
All very exciting things to work on, if you’re a data geek like me!
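To make the sample-size step concrete, here is a minimal sketch of one textbook calculation: the number of units needed per group to detect a difference between two success rates. It uses the standard normal-approximation formula for comparing two proportions. The function name, the default significance level and power, and the example rates are all illustrative assumptions, not Glidewell's actual procedure.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate units needed per group to detect a change from p1 to p2.

    Uses the normal-approximation formula for a two-sided test of two
    proportions. alpha and power are conventional defaults, assumed here
    purely for illustration.
    """
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("p1 and p2 must be strictly between 0 and 1")
    if p1 == p2:
        raise ValueError("p1 and p2 must differ; there is no effect to detect")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # summed binomial variances
    n = (z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)                            # round up to whole units


# Hypothetical example: baseline rate of 10%, hoping to detect a rise to 15%.
n_per_group = sample_size_two_proportions(0.10, 0.15)
```

The smaller the difference you want to detect, the larger the sample has to be, which is why the expected effect size needs to be agreed on before the test begins.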
The work that a data scientist does really is science. Just like a scientist in the lab, I come up with theories and test them with data. In order to do this, data scientists typically have expertise in three areas: domain knowledge, programming, and statistics.
As a data scientist, I often ask and answer my own questions with data. A thorough understanding of the company I work for is key so that I am in tune with both opportunities and challenges. Domain knowledge in my industry is critical because my superiors don’t always have visibility into the underlying data, and therefore cannot tell me exactly which areas should be investigated further.
That’s my job: knowing the data and knowing the business.
Being exposed to several facets of the dental industry helps me identify the pieces of market data that have the most value for our company. This comes with time, exposure to decision makers, and sifting through a lot of information.
Programming is the “how” behind data science. A lot of my team’s work revolves around what is called “data engineering,” or working to capture and format data to suit your needs. Basically, I have to find out which pieces of information are available from a certain source, figure out how to retrieve it in an automated fashion, and then store it in a way that it can be used to answer many different questions in the future. I don’t always know ahead of time which questions I’ll want to answer with data, so flexibility is key.
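One way to keep stored data flexible is to save each record exactly as it was retrieved, along with where and when it came from, and to decide how to interpret it later. The sketch below does this with Python's built-in `sqlite3` and `json` modules. The table name, column names, and function names are hypothetical, and the retrieval step is assumed to have already produced a list of dictionaries.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small store for raw records.

    The table layout is an assumption made for this sketch.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_records ("
        " id INTEGER PRIMARY KEY,"
        " source TEXT NOT NULL,"
        " fetched_at TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    return conn


def save_records(conn, source, fetched_at, records):
    """Store each record as JSON text so no fields are thrown away."""
    rows = [(source, fetched_at, json.dumps(r, sort_keys=True)) for r in records]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO raw_records (source, fetched_at, payload) VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


def load_records(conn, source):
    """Return every stored record from one source, in insertion order."""
    cursor = conn.execute(
        "SELECT payload FROM raw_records WHERE source = ? ORDER BY id", (source,)
    )
    return [json.loads(payload) for (payload,) in cursor]
```

Because the payload is kept whole, a question that comes up months later can still be answered from fields nobody thought to extract at collection time.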
A more common use of programming is to design and develop predictive models, which are trained on records of what’s happened in the past. These models help to predict what will happen in the future. It’s all very exciting stuff, but a data scientist has to be careful not to make false assumptions.
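As a toy illustration of training on past records and guarding against false assumptions, the sketch below fits a straight line to the earlier part of a series and then checks its predictions against later records it never saw. The numbers are made up for the example, and a straight-line fit is only one simple choice of model, not a description of the models used at Glidewell.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new x value."""
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, xs, ys):
    """Average size of the prediction error on the given records."""
    return mean(abs(predict(model, x) - y) for x, y in zip(xs, ys))


# Made-up monthly figures, used only to show the workflow.
months = list(range(1, 13))
orders = [102, 108, 115, 119, 127, 130, 138, 141, 150, 153, 161, 166]

# Train on the first nine months, hold back the last three for checking.
train_x, test_x = months[:9], months[9:]
train_y, test_y = orders[:9], orders[9:]

model = fit_line(train_x, train_y)
holdout_error = mean_absolute_error(model, test_x, test_y)
```

Checking the model against held-back records is what keeps the prediction honest: a model that fits the past well but misses the holdout period is resting on an assumption that does not hold.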
Statistics is the true science of data science. When I’m trying to predict something, there are many statistical methods that I can employ to make an educated decision or recommendation. These are usually insights that can affect the way our business runs, so to do it right, I like to use scientific methods.
My work can be very subjective, meaning that sometimes there are no clear answers. Understanding and articulating all of the “gotchas” around the analysis or model I’ve built can be difficult, but necessary, so that the people relying on my recommendations can trust the insights I’ve discovered.
Right now, I’m leading a small team that has multidisciplinary skills in data science, analysis, and software engineering. We maintain a “startup vibe,” where everybody wears many hats and collaborates to make the best decisions possible for the project. Our team works to collect, explore, and provide access to industry data as a resource for Glidewell.
Glidewell’s vision is to provide affordable, high-quality dental solutions to all. Collecting data helps us make the decisions that allow us to keep doing that. It may also help us find a new material, technique, or technology. Getting that data and the information it contains to the right people gives us new ways to keep pushing toward the future of the dental industry.