The Types of Data Engineers

We take a look at the different skills you'll need to have to work on different types of data science projects.

by Ederson Corbari · May. 08, 19 · Analysis

Overview

Over the last few years, the data engineer role, together with data science, has been in high demand on the market.

However, the market still shows a certain inconsistency in what the technical profile of a data engineer is. I'm speaking specifically about the Latin American region; elsewhere in the world this may be more mature.

Companies, especially consulting companies, have difficulty hiring and mapping profiles properly, and often end up generalizing the vacancy, advertising it simply as 'Data Engineer.'

They can even end up placing people with a relational database background, who are used to solving problems with SQL, on projects that need programming (coding) skills, or sometimes more of a data analysis profile.

It happens! There are different data engineer profiles. I have identified three basic ones, which I detail below.

1. Analytical Skill

This engineer has a more analytical skill set and usually has training in computer science, mathematics, or physics.

This engineer is responsible for scaling machine learning models and making these models fit for production environments. They have relevant knowledge in the area of data science and know how to code very well.

This engineer knows scikit-learn, TensorFlow, and Keras. They adapt the models/architectures created by data scientists, define the design patterns, and make the model run in production for millions of users. I think this is currently the most difficult profile to find.

1.1 Skills:

  • Distributed ML Platforms: MLlib (Spark), Mahout (Hadoop), AWS SageMaker.
  • Algorithms and data structures: union, linked lists, trees, graphs.
  • Parallel Computing for Deep Learning (TensorFlow, GPU Programming).
  • Development in Containers (Docker, rkt).
  • Programming in Notebooks (Zeppelin, Jupyter).

2. Builder Skill

This engineer's profile focuses on resource monitoring, provisioning, pipelines, and data volumes. This professional knows how to bring up a cluster from scratch and has good knowledge of the available APIs and libraries.

Generally, this professional has training in information systems and project management, and does the hard work of storing terabytes of data and putting the collection services in place.

2.1 Skills:

  • Cloud Storage Services (AWS S3, Google FS, BigQuery, Redshift).
  • Streaming Platforms (Kafka, Kinesis, Storm).
  • Management of VMs (EC2, GCE).
  • Container Orchestration (Kubernetes, AWS Fargate, Mesos).
  • Provisioning and Monitoring Tools (Terraform, New Relic, ELK).
  • Maintenance of Distributed Data Stores (Elasticsearch, Mongo Clusters, MySQL, Oracle, PostgreSQL).

3. Developer (Code)

This engineer is a specialist in developing big data applications, whether batch or real-time.

This developer has extensive knowledge of software architecture, standards, and programming languages. Usually this professional has moved from software development into big data, and has a background in computer science or information systems.

3.1 Skills:

  • Programming in Java, C++, and/or Go and functional languages (Scala, Clojure, Elixir).
  • Paradigms of distributed programming (channels, actors).
  • SQL and NoSQL interfaces (KQL, Elasticsearch API).
  • Web services.
  • ETL APIs/platforms (Spark, Airflow, Luigi, NiFi).

Conclusion

These are the three main profiles of data engineers; of course, there are always exceptions to the rule. The data engineering field is relatively new, and this is the perception I have from the marketplace where I work. In consulting companies, it is common to see a professional with analytical skills performing the tasks of a builder profile, and vice versa.

Thanks!

Data science Big data Engineer

Opinions expressed by DZone contributors are their own.
