Data Engineer / Machine Learning Engineer at Threadloom
🇺🇸 United States › California › Palo Alto (Posted Jul 1 2019)
About the company
We curate community content and tell contextual stories. Our curation services are used by 1,000+ forums reaching millions of people. We started in 2015 and have 2 locations – Palo Alto and Bellevue. Our team comes from a mix of startup and big tech backgrounds, but we all share a desire to build a better Internet. We are Stanford StartX alumni (2016).
Threadloom is looking for an experienced data engineer with strong machine learning experience.
This is a foundational role. You will be Threadloom's first engineer solely responsible for building and extending our processing pipelines. Working closely with Product, Ops, and Eng, you will design and develop the data warehouses used by all of our services and products. This includes ownership of the processing of billions of documents that power Threadloom Search, Newsletter, and upcoming consumer products.
The ideal candidate is passionate about building large-scale, high-volume pipelines that manage and store mission-critical data. This person is conversant with current cloud platforms for parallel processing and storage, and can easily translate product and user requirements into requirements for storing and managing data. They should also have experience building machine learning models that classify and rank content and predict user preferences.
The ideal candidate also cares about our end users and is a careful steward of their data. They are comfortable with modern user privacy standards and have experience applying them in real-world situations.
Skills & requirements
3+ years of relevant work experience
Launching consumer products that people love, at scale
Designing and implementing data pipelines and warehouses
Optimizing servers and pipelines to manage operational costs at scale
Building systems to handle user authentication and PII (e.g., Firebase, OAuth, GDPR)
Deploying production cloud services (e.g., Google Cloud, AWS, Azure)
Languages and tools
Python required, Scala/Java desired
Fluency with the latest tools, libraries, and infrastructure for building and maintaining production-level data pipelines and storage, including:
distributed data processing frameworks (e.g., Hadoop, Spark, Flink, Apache Beam)
SQL and NoSQL databases (e.g., MySQL, Postgres, Cassandra, Redis)
stream processing frameworks (e.g., Kafka, Storm, Spark)
search engines (e.g., Elastic, Solr)
Building and launching ML models in a production environment
Scaling experimental models from proof-of-concept to live products that handle large-scale data
Comfortable building scalable backends, RESTful web services and APIs