[Remote] Data Engineer

Remote role, full-time, open position

Note: This is a remote position open to candidates in the USA. Veeva Systems is a mission-driven organization and a pioneer in industry cloud, helping life sciences companies bring therapies to patients faster. The Data Engineer owns the end-to-end development lifecycle, collaborating with a high-performing engineering team to design, build, and deploy high-impact features for Veeva’s life sciences customers.

Responsibilities

Architect and build resilient, distributed data processing systems using Python and Spark on AWS
Design and implement end-to-end ETL/ELT workflows that ingest and unify data from diverse sources, ranging from modern table formats like Iceberg and Delta to legacy business files such as Excel and CSV, ensuring a scalable and consistent single source of truth for the organization
Lead the implementation of the Medallion Architecture, managing data maturity through Bronze, Silver, and Gold layers. You will define how data is structured, classified, and stored to maximize business value while ensuring scalability and high availability
Build reusable libraries and frameworks for data quality validation, metadata tracking, and pipeline monitoring
Build CI/CD processes that automate deployment and testing to maintain a high bar for engineering excellence
Enforce data governance standards, including security, privacy, and regulatory compliance
Proactively monitor system health, implement automated observability, and resolve complex bottlenecks in distributed systems to ensure peak resource efficiency and cost-effectiveness
Partner directly with Product Managers and Data Scientists to translate business requirements into innovative solutions
Own the full feature lifecycle—from initial whiteboarding to production deployment and long-term maintenance

Skills

4+ years of professional data engineering experience with a demonstrated ability to architect and deploy production-grade data platforms from scratch
Expert-level proficiency in Python and Apache Spark, with specific experience in JVM tuning, memory management, and optimizing execution plans for large-scale distributed workloads
Deep expertise in modern data architecture, software design patterns, and various data modeling techniques designed for scalability and performance
Proven track record of building on AWS (primary) or GCP, including hands-on experience with managed services like EMR or Databricks
Extensive experience designing and managing complex data lifecycles using orchestration tools such as Airflow, AWS Step Functions, or Prefect
Deep understanding of data cleansing, curation, and transformation strategies, coupled with experience implementing data governance, security, and lifecycle management policies
Strong background in building reusable libraries, frameworks, and internal tools that standardize data ingestion and automate ETL/ELT workflows
Exceptional debugging skills for distributed systems and resolving performance bottlenecks at scale
Proficiency with CI/CD tools and processes (e.g., Codefresh, Jenkins)
Excellent verbal and written communication skills in English, with the ability to translate complex technical architectures into actionable insights for stakeholders and cross-functional teams
Must be located in the Eastern or Central time zone (EST or CST)
Applicants must have the unrestricted right to work in the United States. Veeva will not provide sponsorship at this time

Nice to Have

Relevant certifications (e.g., AWS, Spark, or similar)
Familiarity with streaming and distributed technologies such as Spark Streaming, EKS, Kinesis, or Apache Kafka
Experience implementing or managing modern cloud data warehouses or lakehouse architectures
Prior experience working in the Life Sciences industry

Benefits

Medical, dental, vision, and basic life insurance
Flexible PTO and company paid holidays
Retirement programs
1% charitable giving program

Company Overview

Veeva delivers the industry cloud for life sciences with software, AI, data, and business consulting. It was founded in 2007, and is headquartered in Pleasanton, California, USA, with a workforce of 5001-10000 employees. Its website is http://www.veeva.com.
