Data Engineer

In this role, the chosen candidate will work with a team of scientists tasked with identifying, developing, and deploying data-rich tools and methodologies that improve how process knowledge is gathered, analyzed, and applied. The technologies we develop are as diverse as the team developing them. In this Senior Scientist role, the chosen candidate will help establish an end-to-end, cloud-based machine learning pipeline that leverages data contextualization to dramatically streamline digital workflows across our Process Research and Development portfolio.

Our Data Rich Experimentation organization is responsible for inventing and applying new tools to support scientists across Process Research & Development. This Senior Scientist role is a scientific position tasked with solving complex process research and development challenges in an interdisciplinary, collaborative environment through the invention, development, and application of cutting-edge bioinformatics technologies. The Data Rich Experimentation group is actively driving digital transformation within the Process Research & Development community and aspires to embed data-intense tools into the fabric of the organization’s process development culture.

We are seeking Data Engineers who are passionate about connecting data streams to enable emerging digital technologies. As a Data Engineer within Process Research and Development, you’ll have the opportunity to be at the forefront of driving a major transformation. This role will allow the chosen candidate to collaborate within and across process development teams to design, develop, test, implement, and support technical solutions using full-stack development tools and technologies. They will use languages and frameworks such as Java, Scala, Python, Spark, and SQL/NoSQL to design and implement relational and non-relational databases, while also leveraging cloud-based data warehousing tools such as Redshift and Snowflake. Successful candidates will share their passion for staying on top of technology trends, experimenting with and learning new tools, participating in internal and external technology communities, and mentoring other members of our growing digital community. The candidate will be expected to deliver robust cloud-based solutions that drive and accelerate drug development with our partners in Process Research and Development.

In addition to a passion for data-rich technologies, the chosen candidate should have excellent interpersonal, communication, and collaboration skills. They should embrace and model our core values of diversity, equity, and inclusion, including fostering a supportive culture where all can thrive, and collaborate effectively in a dynamic, integrated, multidisciplinary team environment. They should demonstrate a clear ability to deliver impactful scientific innovation in a team-oriented manner that builds trusted partnerships across broad stakeholder networks, as well as a demonstrated ability to publish and present research, including an established track record of interaction with the broader academic community.

As such, the chosen candidate will join a diverse group of scientific problem solvers who are dedicated to creating the life-changing medicines and vaccines of tomorrow.

Education Minimum Requirement: 

A B.S. in Computer Science or a closely related field plus 4 years of experience.

An M.S. in Computer Science or a closely related field plus 2 years of experience.

A Ph.D. in Computer Science or a closely related field.

Required Experience and Skills: 

At least 2 years of experience in big data technologies

At least 1 year of experience with cloud computing (AWS, Microsoft Azure, Google Cloud)

Background and experience in data science, data engineering, and machine learning/artificial intelligence.

Highly motivated, technology-centric scientist who is passionate about modernizing process development practices across biologics, vaccines, and small molecule modalities.

Demonstrated scientific ability through publications and presentations at scientific conferences.

Excellent communication skills, demonstrated creativity, and effective interpersonal skills.

Ability to deliver complex solutions under compressed timelines in a dynamic environment.

Ability to work in a team environment with cross-functional interactions.

Preferred Experience and Skills:

Experience with a public cloud (AWS, Microsoft Azure, Google Cloud)

Experience with distributed data/computing tools (MapReduce, Hadoop, Hive, EMR, Kafka, Spark, Gurobi, or MySQL)

Background in leveraging a broad range of machine learning and data science tools, including development and application of deep learning methodologies for predictive capabilities.

Expertise in data analysis, including machine learning/artificial intelligence, chemometrics, statistical analysis, and multivariate data analysis.

Utilization of tools within computational biology for processing, analysis, and modeling of data generated by multi-omics approaches.

Evidence of cross-functional collaboration in an academic or industrial setting.

Experience in research efforts focused on vaccines, biological molecules, bioconjugates, or general large molecules.

Motivation to learn new skills, willingness to take on new challenges, and scientific curiosity.