Data Scientist
Position Summary
Caris Life Sciences is seeking a data scientist to expand, test, and validate a suite of molecular biomarkers aimed at improving the standard of care for patients undergoing treatment for cancer. This is a research role within the Caris signature development program, and responsibilities will center on statistical or machine-learning-derived predictions of phenotypic treatment response built from the genotypic data types available on Caris molecular sequencing platforms. A successful candidate will have the analytical, code-oriented mindset to create reproducible data science pipelines and the communication skills to discuss the implications of the scientific results with our medical professionals.
Job Responsibilities
- Work with disease experts to determine the cohort selection, model development, and validation steps that make up a project roadmap for a genetic signature.
- Iteratively develop statistical or machine-learning-derived feature sets built from Caris genetic sequencing data.
- Communicate the impact and interpretation of predicted clinical outcomes for the targeted disease type.
- Structure queries and organize codebases in a streamlined and reproducible manner.
- Compare novel signatures with baselines derived from the molecular health literature.
- Interface with data engineering and bioinformatics teams to understand the intricacies of underlying datasets.
Required Qualifications
- PhD in Data Science, Computational Biology, Bioinformatics, Engineering, or related scientific field.
- 1-5 years of experience in data science.
- Proficiency in Python.
- Proficiency in data visualization.
- Familiarity with the Linux ecosystem, Git, and querying SQL or related database families.
- Experience with common machine-learning Python libraries such as scikit-learn, PyTorch, TensorFlow, and Keras.
- Ability to communicate quantifiable results through tables, figures, and plots.
- Proficiency in Microsoft Office Suite, specifically Word, Excel, and Outlook, and general working knowledge of the Internet for business use.
Preferred Qualifications
- Experience interpreting clinical health records, including electronic health records, insurance claims data, or patient histories,
- OR experience with bioinformatics pipeline development and genetic file types such as VCF, BAM, and FASTQ.
- Good code documentation practices and experience with workflow management packages.
- Cloud programming experience, in particular within the AWS SageMaker ecosystem.
Physical Demands
- Will work at a computer most of the time, with some time spent collaborating with subject matter experts and business group leaders either in person or through remote conferencing.
- Visual acuity and analytical skill to distinguish fine detail.
- Must possess the ability to sit and/or stand for long periods of time.
Training
- All job-specific, safety, and compliance training is assigned based on the job functions associated with this position.
Other
- Job may require after-hours response to emergency issues.
- This position may require periodic travel and some evenings, weekends, and/or holidays.