Data Engineer

Must reside in the US – EASTERN TIME ZONE. Telecommuting permitted. Candidates must be eligible to work in the US without sponsorship. This is a full-time position with benefits.


Who we are:

Enterra provides solutions that leverage sophisticated machine learning, artificial intelligence (ontologies, inference engines and rules) and natural language processing to deliver highly actionable insights and recommendations to business users. Today, our solutions impact just about every aspect of the products you buy at your local store – from what is available to how it is priced and even where it is placed on the shelf. Our SaaS solutions are deployed within private clouds – principally on Azure. We help transform market-leading companies into true data-driven digital enterprises.


What you will do:

The successful candidate will join a diverse team to:

  • Build unique, high-impact business solutions utilizing advanced technologies for use by world-class clients.
  • Create and maintain the underlying data pipeline architecture for the solution offerings from raw client data to final solution output.
  • Create, populate, and maintain data structures for machine learning and other analytics.
  • Use quantitative and statistical methods to derive insights from data.
  • Guide the data technology stack used to build Enterra’s solution offerings.
  • Combine machine learning, artificial intelligence (ontologies, inference engines and rules) and natural language processing under a holistic vision to scale and transform businesses – across multiple functions and processes.


Responsibilities include:

  • Work with other Enterra personnel to develop and enhance commercial-grade solution offerings:
    • Create and maintain optimal data pipeline architecture, incorporating data wrangling and Extract-Transform-Load (ETL) flows.
    • Assemble large, complex data sets to meet analytical requirements – analytics tables, feature engineering, etc.
    • Build the infrastructure required for optimal, automated extraction, transformation, and loading of data from a wide variety of data sources using SQL and other ‘big data’ technologies such as Databricks.
    • Build automated analytics tools that utilize the data pipeline to derive actionable insights.
    • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
    • Design and develop data integrations and a data quality framework.
    • Develop appropriate testing strategies and reports for the solution as well as data from external sources.
    • Evaluate new technology for use within Enterra.
  • Work with other Enterra and client personnel to administer and operate client-specific instances of Enterra’s solution offerings:
    • Configure data pipelines to accommodate client-specific requirements to onboard new clients.
    • Perform regular operations tasks to ingest new and changing data – implement automation where possible.
    • Implement processes and tools to monitor data quality – investigate and remedy any data-related issues in daily solution operations.


Requirements:

  • Minimum of a bachelor’s degree in Computer Science or related field (STEM subjects preferred)
  • 3+ years of hands-on experience as a data engineer or in a similar position
  • 3+ years of commercial experience programming in Python or Scala
  • 3+ years of SQL and experience working with relational databases (Postgres preferred)
  • Knowledge of at least one of the following – Databricks, Spark, Hadoop or Kafka
  • Demonstrable knowledge and experience developing data pipelines to automate data processing workflows
  • Demonstrable experience in data modeling
  • Demonstrable knowledge of data warehousing, business intelligence, and application data integration solutions
  • Demonstrable experience in developing applications and services that run on a cloud infrastructure (Azure preferred)
  • Excellent problem-solving and communication skills


The following additional skills would be beneficial:

  • Knowledge of one or more of the following areas: Data Science, Machine Learning, Natural Language Processing, Business Intelligence, and Data Visualization
  • Knowledge of statistics and experience using statistical or BI packages for analyzing large datasets (Excel, R, Python, Power BI, Tableau, etc.)
  • Experience with container management and deployment, e.g., Docker and Kubernetes