What Is a Data Scientist?

Data science is a broad and rapidly expanding field with far-reaching implications in areas as diverse as climate research, health and medicine, financial services, and social networking. It’s no surprise, then, that the role of data scientist is equally broad and encompasses a wide range of skills and responsibilities.

Opensource.com explains it this way:

Data science is a branch of computer science dealing with capturing, processing, and analyzing data to gain new insights about the systems being studied. Data scientists deal with vast amounts of information from different sources and in different contexts, so the processing they must do is usually unique to each study, utilizing custom algorithms, artificial intelligence (AI), machine learning, and human interpretation.

To help you better understand this challenging and potentially lucrative career path, this article provides an overview of the role of data scientist, along with resources to learn more.

What Does a Data Scientist Do?

Data scientists help organizations solve vexing problems, says Northeastern University. Through a combination of computer science, modeling, statistics, analytics, and math skills, data scientists uncover patterns, visualize trends, and provide insight to help organizations make informed, objective, “data-driven” decisions.

To perform this role, for example, “a data scientist gathers data, parses and normalizes it, and then creates routines for a computer to run on the data in search of a pattern, trend, or just a helpful visualization,” says Opensource.com. So, “if you have ever created a pie chart or bar graph from the fields of a spreadsheet, then you've acted as a low-level data scientist by interpreting a dataset and visualizing the data to help others understand it,” they write.
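To make the gather, parse, normalize, and visualize steps concrete, here is a minimal sketch in Python using only the standard library. The spreadsheet export, the column names (region, sales), and the helper functions are all hypothetical, invented for illustration. A real project would more likely use a data analysis library and a plotting tool.

```python
import csv
import io
from collections import defaultdict

# Hypothetical spreadsheet export with inconsistent capitalization
# and stray whitespace in the "region" column.
RAW_CSV = """region,sales
 North ,120
south,80
NORTH,60
East,40
"""


def parse_rows(text):
    """Parse CSV text into a list of dicts, one per spreadsheet row."""
    return list(csv.DictReader(io.StringIO(text)))


def normalize(rows):
    """Trim and lowercase region names; convert sales to numbers."""
    return [
        {"region": row["region"].strip().lower(), "sales": float(row["sales"])}
        for row in rows
    ]


def share_by_region(rows):
    """Return each region's fraction of total sales (what a pie chart shows)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["sales"]
    grand_total = sum(totals.values())
    return {region: amount / grand_total for region, amount in totals.items()}


shares = share_by_region(normalize(parse_rows(RAW_CSV)))

# A crude text "bar graph": one '#' per 5 percent of total sales.
for region, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{region:<6} {share:6.1%} {'#' * round(share * 20)}")
```

Without the normalization step, " North " and "NORTH" would be counted as two separate regions, which is why cleaning usually comes before any charting.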

Skills and Tools

Career Karma outlines the necessary skills for a data scientist as follows: 

  • Ability to prepare data
  • Basic statistics
  • Data wrangling/munging
  • Data visualization
  • Programming
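As a rough illustration of how a few of these skills fit together, the sketch below cleans a small list of messy readings (data wrangling) and then computes summary figures (basic statistics) with Python's standard library. The sample values and the clean helper are assumptions made up for this example, not taken from any of the sources quoted here.

```python
import statistics

# Hypothetical raw readings: some entries are blank or not numeric.
raw_readings = ["12.5", "13.1", "", "n/a", "12.9", "14.0", "13.5"]


def clean(values):
    """Data wrangling: keep only the entries that parse as numbers."""
    cleaned = []
    for value in values:
        try:
            cleaned.append(float(value))
        except ValueError:
            continue  # skip blanks and placeholders such as "n/a"
    return cleaned


readings = clean(raw_readings)

# Basic statistics on the cleaned data.
summary = {
    "count": len(readings),
    "mean": statistics.mean(readings),
    "median": statistics.median(readings),
    "stdev": statistics.stdev(readings),
}

for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```

Dropping unusable entries is only one way to handle missing data. Depending on the study, filling in estimated values or flagging the gaps may be more appropriate.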

According to the Enterprisers Project, “data scientists must have a deep knowledge of statistics and at least one area of machine learning/artificial intelligence. They have to be able to build highly specialized mathematical models and have a thorough understanding of ML algorithms. Preferably they have basic programming skills in R and/or Python and a good understanding of distributed data/computing tools like Map/Reduce, Hadoop, Hive, Spark, Gurobi, MySQL, among others.”

Programming is definitely part of a data scientist’s job. The specific languages you may be expected to use will vary by organization, and the most common ones for data science include: 

In addition to these skills, you will need proficiency, or at least working familiarity, with various data science tools to help you manipulate, manage, and interpret data. As Claire D. Costa writes, “The core idea behind these tools is to unite data analysis, machine learning, statistics and related concepts to make the most out of data.”  

With that in mind, commonly used data science tools include:

Related Job Titles

Because the role is so broad, jobs within the field of data science go by many names, each with its own emphasis, seniority level, and skills. This list from Northeastern University describes duties associated with common job titles:

  • Data scientists: Design data modeling processes to create algorithms and predictive models and perform custom analysis.
  • Data analysts: Manipulate large datasets and use them to identify trends and reach meaningful conclusions to inform strategic business decisions.
  • Data engineers: Clean, aggregate, and organize data from disparate sources and transfer it to data warehouses.
  • Data architects: Design, create, and manage an organization’s data architecture.

Specifically, they differentiate the roles of data scientist and data analyst this way: “Data scientists develop processes for modeling data while data analysts examine data sets to identify trends and draw conclusions. Because of this distinction and the more technical nature of data science, the role of a data scientist is often considered to be more senior than that of a data analyst.” 

Other data science job titles include research engineer, machine learning engineer, and statistician.

Career Karma says that “the job outlook for data scientists is strong,” as companies will need to hire people to help them analyze their data. Additionally, they state, “according to the U.S. Bureau of Labor Statistics, job opportunities in data science are expected to increase by 16 percent by 2028, which is much faster than average.” 

The information presented here, along with the additional resources below, should help you learn more and get started on a career in data science.

Other Resources
