AI/ML Job: Computational Scientist I, Genome Aggregation Database

Broad Institute of Harvard and MIT

🇺🇸 United States › Massachusetts › Cambridge
  (Posted Nov 5 2021)

About the company
As part of the methods development team in the Translational Genomics Group, you will have the opportunity to make substantial contributions to high-impact projects with direct implications for clinical practice, as well as to participate in the vibrant research environment at the Broad, with its close links to MIT, Harvard, and the Harvard-affiliated hospitals across Boston. You will have access to data sets of extraordinary scale and to colleagues with deep expertise in genetics, computational biology, software development, and machine learning. The responsibilities of this role align closely with the mission of the Broad to transform medicine and human health through cross-disciplinary collaboration and the development of pioneering technologies to analyze scientific data on an unprecedented scale.

We are an Agile team running production and development in a Scrum framework, and we care deeply about managing our work well, maintaining healthy work-life boundaries, and investing in the professional growth of our team members. You will have access to Broad’s thoughtful and well-resourced leadership development and management training programs, in addition to a generous vacation policy and benefits package. We operated on a hybrid remote/in-office work schedule even before the pandemic and expect to continue this model. We can accommodate candidates living within the New England region.

Job description
Since 2016, the Genome Aggregation Database (gnomAD) has been a pioneer in human genomic data aggregation through the regular public release of data for a rapidly growing collection of exomes and genomes sampled from diverse populations across the globe. gnomAD is the default resource used in virtually every clinical variant interpretation pipeline today, and our browser has generated over 39 million page views to date, with tens of thousands of regular monthly users.

We are seeking a creative, self-motivated candidate at the PhD level to play a critical role in designing and developing fast, automated, open-source computational pipelines to produce high-quality public data releases for forthcoming — and exponentially growing — datasets in gnomAD. The role will involve close collaboration with scientists across the Broad to develop novel approaches for quality control and analysis of our highly heterogeneous datasets at exceptional scale, as well as the eventual supervision of associate computational scientists in the group who will be assigned to work alongside the candidate. The candidate will also have the opportunity to interact closely with Hail developers at the Broad to play a role in the feature design of the field’s most cutting-edge toolkit for massively parallel, high-throughput computation of genetic data. As this role involves collaboration with a wide variety of staff across disciplines, including computational scientists, academic trainees, software engineers, biologists, and clinical geneticists, we are specifically looking for a candidate who works well in teams.

Growing a strong team with a diversity of life experiences and backgrounds, who foster a culture of continual learning and who support the growth and success of one another, is key to our success. We are therefore committed to seeking applications from women and from underrepresented groups.

Career development opportunities for this role:

Supervising and mentoring associate computational staff scientist(s) assigned to work on gnomAD releases, including weekly check-ins, quarterly performance reviews, and discussions on career development. Management training and hands-on mentorship in this area will be provided, depending on the candidate’s previous experience managing others
Setting concrete objectives and tasks, professional standards, and expectations for associate staff scientist(s); helping them to prioritize tasks, troubleshoot technical issues, locate resources (people and tools), and manage relationships with collaborators
Handling general supervisory/HR administrative tasks for associate staff, including approving vacations and expense reports, writing annual performance reviews, and making recommendations for promotions and salary increases

Characteristics and Qualifications:
The role will require an independent and highly motivated candidate with the ambition to maintain and develop a significant and sophisticated body of code that is used on a regular basis to produce large, public data releases with a highly active and invested user community.
You will have domain expertise in computational methods for analyzing next-generation sequencing data, as well as an interest in the technical aspects of deploying these methods at scale.

We are looking for someone who:
Is able to write clean, efficient, robust, and usable code, with demonstrated proficiency in one of the following: Unix/Linux, Python, Java, C++, Matlab, or R, with a strong preference for Python, Unix/Linux, and R
Has a Ph.D. in mathematics, computer science, engineering, physics, statistics, biology, or another related field; or equivalent professional experience
Has demonstrated experience in quantitative (statistical, mathematical, computational) research with large data sets; skill and experience with statistical analysis and/or computational biology are strongly preferred, with special consideration for individuals with prior experience using the Hail Python library
Has fluency with human genetics and next-generation sequencing data; ideally will have prior experience with the quality control of such datasets
Exhibits strong initiative, the ability to take ownership of complex projects, and an interest in the management and development of a team
Cares passionately about the quality of their work and demonstrates zealous attention to detail; is curious and tenacious about investigating anomalies in data
Is familiar with Git and modern team-based software development practices, including peer code review through pull requests
Listens, communicates, and collaborates well with team members, clinicians, software developers, and research scientists; is receptive to feedback and willing to provide constructive feedback to others; demonstrates kindness to others
Demonstrates excellent written and oral presentation skills
Manages time well and is able to respond to shifting priorities in a fast-paced and rapidly changing environment


Location: Cambridge, Massachusetts, United States

Skills wanted for this job:
git, java, linux, matlab, python, r, unix
