
5 Roles in Data in 2021


Data Scientists, Analysts, Data Engineers, and Machine Learning Engineers. What do they do?

Photo by Alex Kotliarskyi on Unsplash

According to the World Economic Forum and High Scalability, the amount of data generated each day is expected to reach 463 exabytes globally by 2025. A single exabyte is a billion gigabytes! Google, Facebook, Microsoft, and Amazon alone store at least 1,200 petabytes of information.

To make use of this explosion of data, data professions have become ubiquitous in companies and organizations worldwide.

In this article, I will introduce you to the common roles in a data team.

Data team

A data team typically consists of:

  • Business Analyst
  • Data Analyst
  • Data Scientist
  • Machine Learning Engineer
  • Data Engineer

For each of these roles, we’ll go through:

  • Overall description
  • Roles and responsibilities
  • Skills


Let’s dive in!

Business Analyst

Business analysts are commonly known as the intermediaries between management and the IT department in a company.

Their primary responsibility is to analyze the structure of a business, identify problems within it, and then improve the process, service, or product of a business through data analysis and software.

For example, a business analyst could conduct a market analysis and assess the overall profitability of the business.

Business analysts are critical when a data team lacks domain expertise, as they can bridge the gap and ensure the business is making data-driven decisions.

Roles and Responsibilities (source)

  • Creating a detailed business analysis, outlining problems, opportunities, and solutions for a business
  • Budgeting and Forecasting
  • Planning and monitoring
  • Financial modeling
  • Variance Analysis
  • Pricing
  • Reporting
  • Defining business requirements and reporting them back to stakeholders
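
To make one of the items above concrete, here is a minimal sketch of a variance analysis: comparing budgeted figures against actuals. The function name `variance_report` and all the numbers are hypothetical, made up purely for illustration. In practice this would more likely live in a spreadsheet or a BI tool.

```python
def variance_report(budget, actual):
    """Compare actual spend against budget for each category.

    Returns a dict mapping category -> absolute variance (actual - budget)
    and percentage variance relative to the budget.
    """
    report = {}
    for category, planned in budget.items():
        spent = actual.get(category, 0.0)
        diff = spent - planned
        # Percentage variance is undefined when nothing was budgeted.
        pct = (diff / planned * 100) if planned else None
        report[category] = {"variance": diff, "variance_pct": pct}
    return report


# Hypothetical figures, for illustration only.
budget = {"marketing": 10000.0, "payroll": 50000.0}
actual = {"marketing": 12000.0, "payroll": 48000.0}

report = variance_report(budget, actual)
for category, row in report.items():
    print(category, row)
```

A positive variance here means the category went over budget, and a negative one means it came in under.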

Skills

  • SQL
  • Business Intelligence
  • Advanced Excel
  • Data visualization tools such as QuickSight, Tableau, and Power BI
  • Technical writing and strong communication
  • Stakeholder analysis

Data Analyst

The primary responsibility of a data analyst is to discover how to use data to answer questions and solve problems.

They work with data engineers to access data sources and with stakeholders to create relevant and meaningful reports.

Once they discover the hidden patterns in data, they will utilize reporting tools and storytelling skills to turn numbers into tangible insights.

Data analysts are crucial as they allow businesses to maximize the value of their data assets and use analytics to inform strategic business decisions.

Roles and responsibilities

  • Interpret data and identify trends and patterns using statistical techniques
  • Prepare reports and presentations for management or clients
  • Communicate effectively with stakeholders to understand data and business requirements
  • Mine data from primary and secondary sources
  • Define KPIs and metrics
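
As a rough illustration of identifying a trend with a statistical technique, the sketch below fits a least-squares line to a monthly metric and reads off its slope. The `trend_slope` helper and the sign-up numbers are hypothetical. A real analysis would typically use SQL plus a library such as pandas, and would check whether the trend is statistically meaningful.

```python
from statistics import mean


def trend_slope(values):
    """Least-squares slope of `values` against their index (0, 1, 2, ...).

    A positive slope suggests an upward trend, a negative one a decline.
    """
    if len(values) < 2:
        raise ValueError("need at least two data points to fit a trend")
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical monthly sign-ups, for illustration only.
monthly_signups = [120, 135, 150, 160, 185, 210]

slope = trend_slope(monthly_signups)
print(f"Average change per month: {slope:.1f} sign-ups")
```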

Skills

  • In-depth knowledge of statistical methodologies and data analysis techniques
  • SQL
  • Programming languages such as Python or R
  • Spreadsheet tools such as Excel
  • Data visualization software such as Tableau, Looker, and Power BI
  • Cloud Technology
  • Strong verbal and written communication skills

Data Scientist

The primary responsibility of a data scientist is to extract value from data using statistical techniques and machine learning.

They are jacks-of-all-trades who combine statistics, programming, data modeling, and business acumen to discover solutions to business questions.

“Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician.”

Aside from cleaning and wrangling data, data scientists spend most of their time asking questions, running experiments to answer those questions, working with stakeholders, and communicating their findings with the help of data analysts.

Examples of a data scientist’s work include using machine learning to improve customer experience, running A/B tests on new features, and optimizing ad targeting.
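
To give a feel for what A/B testing on a new feature can look like, here is a minimal sketch of a two-proportion z-test, one common way (among several) to compare conversion rates between a control group and a variant. The function name and the conversion counts are hypothetical, and a real experiment would also need a sample size chosen in advance and careful randomization.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    conv_a / n_a: conversions and visitors in the control group (A)
    conv_b / n_b: conversions and visitors in the variant group (B)
    Returns (z statistic, two-sided p-value).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 10% vs 12.5% conversion on 2,000 visitors each.
z, p_value = two_proportion_z_test(200, 2000, 250, 2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between the two groups is unlikely to be due to chance alone.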

Roles and responsibilities

  • Work with stakeholders to identify opportunities for leveraging company data to drive business solutions
  • Mine and analyze data from the company database to improve business strategies
  • Develop time series forecasting, anomaly detection, and user behavior models
  • Define KPIs, build automated dashboards, reports, and models
  • Develop custom machine learning models
  • Implement A/B testing and QA
  • Coordinate with ML engineers to deploy and monitor ML models

Skills

  • SQL
  • Programming — R or Python
  • Statistical and data mining techniques
  • Distributed Computing tools — MapReduce, Hadoop, Hive, Spark
  • Time series analysis and forecasting
  • Causal inference
  • A/B testing
  • Machine Learning
  • Deep Learning


Machine Learning Engineer

ML engineers and data scientists do similar work. The difference is that ML engineers focus on the engineering side of machine learning services.

The primary goal of an ML engineer is to research, design, build, deploy, and test ML systems using tools and frameworks such as PyTorch or TensorFlow for modeling, and cloud technologies such as AWS and GCP.

However, they don’t do it all alone. ML engineers partner with data scientists and engineers to find the right data, verify data quality, research and implement ML algorithms, define evaluation metrics, run tests to improve models, etc.
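
As one small example of defining evaluation metrics, the sketch below computes precision, recall, and F1 for a binary classifier by hand. The labels are made up for illustration, and the helper name is hypothetical. In practice a team would usually rely on an existing metrics library rather than hand-rolled code, and the right metric depends on the business problem.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical ground truth and model predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```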

Roles and responsibilities

  • Work with data scientists to design AI workflows and end-to-end pipelines
  • Collaborate with data scientists to create scalable ML solutions for business problems
  • Design and develop machine learning and deep learning systems based on business needs
  • Research and implement ML algorithms and tools
  • Run machine learning experiments and tests

Skills

  • Programming skills
  • ML frameworks such as PyTorch and TensorFlow
  • Distributed computing tools
  • Software engineering and system design skills
  • Data modeling and data architecture

Data Engineer

Data analysts and data scientists won’t be able to do their jobs without any data to work with. This is why data engineers play the most crucial role in a data team.

Data Engineers are primarily responsible for providing data in a usable form for analytics and ML teams across the organization.

How do they do that, you ask?

They create a data pipeline, which is a set of technologies that form a specific environment where data is obtained, stored, processed, and queried. (source)

Using distributed computing, workflow orchestration tools, stream processing, and other tools, they provide a reliable and easy-to-use system for ingesting and processing data, helping the data team build data-intensive applications successfully.
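
To make the idea of a pipeline where data is obtained, stored, processed, and queried a little more tangible, here is a toy extract-transform-load sketch using only Python’s built-in `sqlite3` module. Everything in it is an assumption for illustration: the `RAW_EVENTS` records, the `purchases` table, and the function names are all hypothetical. Real pipelines would read from actual sources and run on tools like the ones listed in the skills section.

```python
import sqlite3

# Hypothetical raw records, standing in for an API response or a log file.
RAW_EVENTS = [
    {"user_id": "1", "amount": "19.99", "country": "us"},
    {"user_id": "2", "amount": "not-a-number", "country": "JP"},
    {"user_id": "3", "amount": "5.00", "country": "jp"},
]


def extract():
    """Obtain raw data (here, just an in-memory list)."""
    return list(RAW_EVENTS)


def transform(rows):
    """Validate and normalize rows, skipping ones with a bad amount."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # a real pipeline would log or quarantine bad records
        clean.append((int(row["user_id"]), amount, row["country"].upper()))
    return clean


def load(conn, rows):
    """Store the cleaned rows in a table that analysts can query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS purchases "
        "(user_id INTEGER, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))

# Downstream users can now query the cleaned data.
totals = conn.execute(
    "SELECT country, SUM(amount) FROM purchases "
    "GROUP BY country ORDER BY country"
).fetchall()
print(totals)
```

Keeping extract, transform, and load as separate functions makes each step easier to test and monitor on its own, which is part of what the responsibilities below describe.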


Roles and responsibilities

  • Design, build, and launch highly efficient and reliable data pipelines
  • Monitor and maintain those pipelines once they are running
  • Maintain the health of the data ecosystem by configuring monitors, defining alerts on common failure points, and giving feedback on data quality to data owners and business partners
  • Design data lake storage and access patterns to match customer requirements and conform to naming standards
  • Leverage data and business principles to solve large-scale web, mobile, and data infrastructure problems.
  • Partner with leadership, engineers, program managers, and data scientists to understand data needs.

Skills

  • SQL
  • Programming skills — Java, Scala, Python
  • Distributed Computing — Hadoop, Hive, Spark
  • Workflow and orchestration tools — Airflow, Luigi
  • Stream processing — Kafka
  • ETL and ELT tools
  • Databases — SQL and NoSQL
  • Data modeling
  • Cloud platforms — AWS, GCP
  • Data quality and validation
  • Designing and implementing pipelines

Summary

To summarize, here are the data roles and their responsibilities:

  1. Business Analyst: Analyze the structure of a business, identify problems within it, and then improve the business processes.
  2. Data Analyst: Analyze data to identify trends and patterns, define key metrics, and communicate them effectively with dashboards.
  3. Data Scientist: Apply statistical techniques and machine learning to data to answer business questions or build a product.
  4. ML Engineer: Train, monitor, and maintain machine learning services.
  5. Data Engineer: Gather, organize, and maintain data for the company with data pipelines.

That’s all for this article. Thank you for reading, and I hope you learned something new about what it takes to work in the data profession!


