San Jose, California, United States 🇺🇸   (Posted Jul 8 2018)
About the company
At Balbix we build a breach risk dashboard for cybersecurity. Think of us as preventative healthcare for enterprise networks whose vitals we continuously discover, monitor and quantify. Specifically, we model the risk carried by hundreds of attack vectors (e.g. the likelihood of a breach from phishing), contextualize them to physical assets in the enterprise (e.g. firewalls and laptops), and prioritize these assets by the relative impact of a breach (e.g. Active Directory servers are more important than display devices). The outcome is a clickable, searchable, and human-readable risk dashboard that is relevant, engaging and easy on the eyes.

Data Science at Balbix is embedded into engineering and product functions. We participate in an agile software development cycle with sensor, backend and frontend engineers on a daily basis, provide feedback to product and design teams, and also make time for longer-term strategic machine learning projects.


As a data scientist you will get to play with TB-scale data (and growing) from our proprietary sensors (network scanners, traffic monitors, connectors) as well as commercial and openly available threat feeds. With a razor-sharp product focus, you will get to drive a dozen different machine learning problems leveraging deep natural language processing, probabilistic graphical models, network science, search, recommendation systems, and computer vision. In each area, we draw boldly from the latest in machine learning research but are also unafraid to limit ourselves to expert systems or linear regression if the situation demands it.

We stretch ourselves to be generalists, caring as much about storytelling with data as about bleeding-edge algorithms or scalable model training and deployment. Going forward, we are keen to build a data science culture that places equal emphasis on knowing our raw data, grokking security first principles, caring about customer needs, explaining our model predictions, deploying them at scale, communicating our work across the company, and adapting the latest advances from arXiv and NIPS.

We look out for each other, enjoy each other's company, and keep an open channel of communication about all things data and non-data.

Design and develop an ensemble of classical and deep learning algorithms for modeling and understanding the complex interactions between people, software, infrastructure and policies in an enterprise environment

Design and implement algorithms for statistical modeling of enterprise security risk

Implement efficient machine learning algorithms to learn the parameters that drive the statistical models

Understand the architecture and usage of open-source software libraries for numerical computation such as TensorFlow, PyTorch, and scikit-learn

Good understanding of and hands-on experience with big data processing frameworks


Master's degree (Ph.D. preferred) in Computer Science or Electrical Engineering, with hands-on software engineering experience

Minimum 5 years of experience in the field of Machine Learning

Demonstrable proficiency in coding (Python; Spark is a plus) and programming concepts, combined with enthusiasm and passion for building a next-generation security and risk analysis platform

Knowledge of state-of-the-art algorithms together with background expertise in statistical analysis and modeling. Deep understanding of core concepts such as NLP, probabilistic graphical models, deep learning on graph structures, model explainability, etc.

A forward-thinking project leader with good collaboration skills

Solid understanding of probability, statistics and linear algebra

