Job ID: 1-310592
Department: Information Technology
Location: Corporate Headquarters – Springfield, MO or Remote
Status: Full-Time
Tentative Schedule: This is a full-time opportunity; Monday-Friday, 8 a.m.-5 p.m.
The Data Engineer II is responsible for designing, evaluating, and creating systems to support data science projects across the O’Reilly organization, as well as expanding and optimizing our data and data pipeline architecture. This includes data cleansing, preparation, and ETL. The ideal candidate will identify and work with the appropriate technology and software engineering solutions to facilitate machine learning and analytic pipeline deployment.

Essential Job Functions


Move, structure, encode, and condense data from disparate database systems and formats.        
Identify, design, and implement internal process improvements such as automating manual processes, optimizing data delivery, and redesigning infrastructure for greater scalability.
Create data tools that help analytics and data science team members build and optimize solutions that make the company an innovative industry leader.
Evaluate performance of machine learning systems and work with data scientists to improve quality.
Build processes supporting data transformation, data structures, metadata, dependency and workload management.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.           
Develop software solutions with a focus on maintainability and modularity. 

Other Job Functions


Experience performing root cause analysis on data and processes to answer specific business questions and identify opportunities for improvement.                                        
Advanced working knowledge of SQL, including query authoring, and experience with a variety of relational databases.
Use best practices to develop statistical and machine learning techniques that address business needs.
Experience supporting and working with cross-functional teams in a dynamic environment.
Analyze and influence technical, system, and/or user requirements. 

Skills and Qualifications



Bachelor’s degree.
2+ years of practical experience with ETL, data processing, database programming and data analytics.
Strong knowledge of Python, including pandas, NumPy, and scikit-learn, and experience with notebooks such as Jupyter.
Demonstrable knowledge of software design and engineering best practices.
Experience working with large-scale distributed data systems.
Excellent written and verbal communication skills.
Desire to work in a dynamic and collaborative environment. 



Experience with machine learning and modeling techniques such as regressions, clustering, classification, random forests, gradient boosting, etc., including optimization of models and statistical evaluation of models.
Experience with statistical analysis of data. 


All full-time team members are eligible for a benefits package that is designed to offer convenience and security to our team members and their families. Programs, resources, and benefit eligibility vary based on employment status, average hours worked, location, and length of service. For detailed benefits information, visit http://bit.ly/ORLYBenefits.