
Data Engineer (Databricks)

Remote role · Full-time · Open position

Solvd Inc. is a rapidly growing AI-native consulting and technology services firm delivering enterprise transformation across cloud, data, software engineering, and artificial intelligence. We work with industry-leading organizations to design, build, and operationalize technology solutions that drive measurable business outcomes. Following the acquisition of Tooploox, a premier AI and product development company, Solvd now offers true end-to-end delivery, from strategic advisory and solution design to custom AI development and enterprise-scale implementation. Our capability centers combine deep technical expertise, proven delivery methodologies, and sector-specific knowledge to address complex business challenges quickly and effectively.

We are looking for a Data Engineer to develop an AI-powered data mapping recommendation platform to speed up the integration and validation of complex datasets. The system will automate data extraction, mapping, and validation processes.

What you'll do

  • Build and maintain scalable data pipelines with Databricks, Spark, and PySpark.
  • Manage data governance, security, and credentials using Unity Catalog and Secret Scopes.
  • Develop and deploy ML models with MLflow; work with LLMs and embedding-based vector search.
  • Apply ML/DL techniques (classification, regression, clustering, transformers) and evaluate using industry metrics.
  • Design data models and warehouses leveraging dbt, Delta Lake, and Medallion architecture.
  • Work with healthcare data standards and medical terminology mapping.

What you bring

Databricks expertise

Hands-on experience with the Databricks platform, including:

  • Unity Catalog: Managing data governance, access control, and auditing across workspaces.
  • Secret Scopes: Secure handling of credentials and sensitive configurations.
  • Apache Spark / PySpark: Writing performant, scalable distributed data pipelines.
  • MLflow: Managing ML lifecycle including experiment tracking, model registry, and deployment.
  • Vector Search: Working with vector databases or search APIs to build embedding-based retrieval systems.
  • LLMs (Large Language Models): Familiarity with using or fine-tuning LLMs in Databricks or similar environments.

Data Engineering skills

Experience designing and maintaining robust data pipelines:

  • Data Modeling & Warehousing: Dimensional modeling, star/snowflake schemas, SCD (Slowly Changing Dimensions).
  • Modern Data Stack: Familiarity with dbt, Delta Lake, and the Medallion architecture (Bronze, Silver, Gold layers).

Nice to have

Machine Learning knowledge

A strong foundation in machine learning is expected, including:

  • Traditional Machine Learning Techniques: Classification, regression, clustering, etc.
  • Model Evaluation & Metrics: Precision, recall, F1-score, ROC-AUC, etc.
  • Deep Learning (DL): Understanding of neural networks and relevant frameworks.
  • Transformers & Attention Mechanisms: Knowledge of modern NLP architectures and their applications.

Preferred domain knowledge

  • Experience with healthcare data standards and medical code systems such as eCQM, VSAC, RxNorm, LOINC, SNOMED, etc.
  • Understanding of medical terminology and how to map or normalize disparate coding systems.

Tech stack

  • Platforms & Tools: Databricks, Unity Catalog, Secret Scopes, MLflow
  • Languages & Frameworks: Python, PySpark, Apache Spark
  • Machine Learning & AI: Traditional ML techniques, Deep Learning, Transformers, Attention Mechanisms, LLMs
  • Search & Retrieval: Vector databases, embedding-based vector search
  • Data Engineering & Modeling: dbt, Delta Lake, Medallion architecture (Bronze/Silver/Gold), Dimensional modeling, Star/Snowflake schemas
  • Domain (Optional): Healthcare data standards (eCQM, VSAC, RxNorm, LOINC, SNOMED)

When you join Solvd, you'll…

  • Shape real-world AI-driven projects across key industries, working with clients from startup innovation to enterprise transformation.
  • Be part of a global team with equal opportunities for collaboration across continents and cultures.
  • Thrive in an inclusive environment that prioritizes continuous learning, innovation, and ethical AI standards.

Ready to make an impact?

If you're excited to build things that matter, champion responsible AI, and grow with some of the industry’s sharpest minds, apply today and let’s innovate together.
