Stephen Bach

I'm a postdoc at Stanford with Chris Ré. I did my Ph.D. in computer science at the University of Maryland with Lise Getoor.

My latest research is on weakly supervised machine learning, in which the goal is to train models without hand-labeled data. With the advent of data-hungry representation learning techniques like deep neural networks, curating labeled training data has replaced feature engineering as the most expensive and time-consuming task in machine learning. Weak supervision aims to overcome this bottleneck. I also work on statistical relational learning and information extraction.

Email     CV     Google Scholar     GitHub


I'm seeking an academic position to start in Fall 2018. Please find my materials here: CV, research statement, teaching statement, and diversity and inclusion statement. Thank you for your interest!


Projects
Snorkel is a framework for creating noisy training labels for machine learning. It uses statistical methods to combine weak supervision sources, such as heuristic rules and task-related data sets (i.e., distant supervision), which are far less expensive to use than hand-labeled data. With the resulting estimated labels, users can train many kinds of state-of-the-art models. Snorkel is used at numerous technology companies and research labs, as well as agencies like the FDA.
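As a rough illustration of what combining weak supervision sources can look like, the Python sketch below has three hand-written heuristic rules vote on a spam-versus-not-spam label for a message. Everything in it is hypothetical: the function names, the `ABSTAIN` convention, and the spam task are made up for this example and are not Snorkel's API. It also swaps in a plain majority vote where Snorkel uses statistical methods to estimate labels, so it conveys only the general idea.

```python
from collections import Counter

# Hypothetical label convention for this sketch (not Snorkel's API).
SPAM, NOT_SPAM, ABSTAIN = 1, 0, -1


def lf_contains_free(text):
    """Heuristic rule: messages mentioning 'free' are probably spam."""
    return SPAM if "free" in text.lower() else ABSTAIN


def lf_short_message(text):
    """Heuristic rule: very short messages are probably not spam."""
    return NOT_SPAM if len(text.split()) < 4 else ABSTAIN


def lf_many_exclamations(text):
    """Heuristic rule: two or more exclamation marks suggest spam."""
    return SPAM if text.count("!") >= 2 else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_free, lf_short_message, lf_many_exclamations]


def majority_label(text, lfs=LABELING_FUNCTIONS):
    """Combine the non-abstaining votes by simple majority.

    Returns ABSTAIN when no rule fires or when the vote is tied.
    This is a simple stand-in for a learned statistical model.
    """
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes)
    top_label, top_count = counts.most_common(1)[0]
    if list(counts.values()).count(top_count) > 1:
        return ABSTAIN  # tie between labels
    return top_label
```

The estimated labels produced this way are noisy, and a downstream model would then be trained on them in place of hand-labeled data.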
Probabilistic soft logic is a formalism for building statistical models over relational data like knowledge bases and social networks. PSL programs define hinge-loss MRFs, a type of probabilistic graphical model that admits fast, convex optimization for MAP inference, which makes them very scalable. Researchers around the world have used PSL for bioinformatics, computational social science, natural language processing, information extraction, and computer vision.
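To make the hinge-loss idea concrete, here is a small Python sketch of the potential for a single grounded rule of the form A & B -> C, where a, b, and c are truth values in [0, 1]. It assumes the Łukasiewicz relaxation of logic, under which the rule's distance to satisfaction is max(0, a + b - 1 - c). That choice, the function names, and the example weights are assumptions of this sketch, not PSL's actual interface.

```python
def distance_to_satisfaction(a, b, c):
    """Distance to satisfaction of the grounded rule A & B -> C.

    Assumes the Lukasiewicz relaxation: the rule is fully satisfied
    (distance 0) whenever c >= a + b - 1.
    """
    return max(0.0, a + b - 1.0 - c)


def hinge_loss_potential(a, b, c, weight=1.0, squared=False):
    """Weighted hinge-loss potential for one grounded rule.

    With squared=True the hinge is squared, which penalizes large
    violations more heavily than small ones.
    """
    d = distance_to_satisfaction(a, b, c)
    return weight * (d * d if squared else d)


def total_energy(groundings):
    """Sum of linear hinge-loss potentials over (a, b, c, weight) tuples."""
    return sum(hinge_loss_potential(a, b, c, w) for (a, b, c, w) in groundings)
```

Each potential is the maximum of zero and a linear function of the truth values, optionally squared, so a non-negatively weighted sum of them is convex. That is the property that lets MAP inference be posed as a convex optimization problem.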

2017
Hinge-Loss Markov Random Fields and Probabilistic Soft Logic
Stephen H. Bach, Matthias Broecheler, Bert Huang, and Lise Getoor
Journal of Machine Learning Research, 18(109):1-67, 2017
[bibtex] [code]
@article{bach:jmlr17,
  Author = {Bach, Stephen H. and Broecheler, Matthias and Huang, Bert and Getoor, Lise},
  Journal = {Journal of Machine Learning Research (JMLR)},
  Title = {Hinge-Loss {M}arkov Random Fields and Probabilistic Soft Logic},
  Volume = {18},
  Number = {109},
  Pages = {1--67},
  Year = {2017}}
Snorkel: Rapid Training Data Creation with Weak Supervision
Alexander Ratner, Stephen H. Bach, Henry Ehrenberg, Jason Fries, Sen Wu, and Christopher Ré
Proceedings of the VLDB Endowment, 11(3):269-282, 2017
[bibtex]
@article{ratner:vldb17,
  Author = {Ratner, Alexander J. and Bach, Stephen H. and Ehrenberg, Henry R. and Fries, Jason and Wu, Sen and R{\'e}, Christopher},
  Journal = {Proceedings of the VLDB Endowment},
  Title = {Snorkel: {R}apid Training Data Creation with Weak Supervision},
  Volume = {11},
  Number = {3},
  Pages = {269--282},
  Year = {2017}}
Soft Quantification in Statistical Relational Learning
Golnoosh Farnadi, Stephen H. Bach, Marie-Francine Moens, Lise Getoor, and Martine De Cock
Machine Learning 2017
[bibtex]
@article{farnadi:ml17,
  Author = {Farnadi, Golnoosh and Bach, Stephen H. and Moens, Marie-Francine and Getoor, Lise and De Cock, Martine},
  Journal = {Machine Learning},
  Title = {Soft Quantification in Statistical Relational Learning},
  Year = {2017}}
Learning the Structure of Generative Models without Labeled Data
Stephen H. Bach, Bryan He, Alexander Ratner, and Christopher Ré
International Conference on Machine Learning (ICML) 2017
[bibtex] [slides] [poster]
@inproceedings{bach:icml17,
  Author = {Bach, Stephen H. and He, Bryan and Ratner, Alexander and R{\'e}, Christopher},
  Booktitle = {International Conference on Machine Learning (ICML)},
  Title = {Learning the Structure of Generative Models without Labeled Data},
  Year = {2017}}
Snorkel: Fast Training Set Generation for Information Extraction
Alexander J. Ratner, Stephen H. Bach, Henry R. Ehrenberg, and Christopher Ré
SIGMOD 2017 Demo
[bibtex]
@inproceedings{ratner:sigmoddemo17,
  Author = {Ratner, Alexander J. and Bach, Stephen H. and Ehrenberg, Henry R. and R{\'e}, Christopher},
  Title = {Snorkel: {F}ast Training Set Generation for Information Extraction},
  Booktitle = {ACM SIGMOD Conference on Management of Data (SIGMOD) Demonstration},
  Year = {2017}}
2016
Interpretable Decision Sets: A Joint Framework for Description and Prediction
Himabindu Lakkaraju, Stephen H. Bach, and Jure Leskovec
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2016
[bibtex] [code]
@inproceedings{lakkaraju:kdd16,
  Author = {Lakkaraju, Himabindu and Bach, Stephen H. and Leskovec, Jure},
  Title = {Interpretable Decision Sets: {A} Joint Framework for Description and Prediction},
  Booktitle = {ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)},
  Year = {2016}}
2015
Hinge-Loss Markov Random Fields and Probabilistic Soft Logic: A Scalable Approach to Structured Prediction
Ph.D. Dissertation, University of Maryland
Larry S. Davis Doctoral Dissertation Award
[bibtex]
@phdthesis{bach:thesis15,
  title    = {Hinge-Loss {M}arkov Random Fields and Probabilistic Soft Logic: {A} Scalable Approach to Structured Prediction},
  school   = {University of Maryland, College Park},
  author   = {Bach, Stephen H.},
  year     = {2015}}
Paired-Dual Learning for Fast Training of Latent Variable Hinge-Loss MRFs
Stephen H. Bach*, Bert Huang*, Jordan Boyd-Graber, and Lise Getoor
International Conference on Machine Learning (ICML) 2015
[bibtex] [slides] [supplementary] [poster]
@inproceedings{bach:icml15,
  author       = "Bach, Stephen H. and Huang, Bert and Boyd-Graber, Jordan and Getoor, Lise",
  title        = "Paired-Dual Learning for Fast Training of Latent Variable Hinge-Loss MRFs",
  booktitle    = "International Conference on Machine Learning (ICML)",
  year         = "2015"}
Unifying Local Consistency and MAX SAT Relaxations for Scalable Inference with Rounding Guarantees
Stephen H. Bach, Bert Huang, and Lise Getoor
Artificial Intelligence and Statistics (AISTATS) 2015
Selected for oral presentation, 6% of submitted papers (27/442)
[bibtex] [slides] [supplementary] [code]
@inproceedings{bach:aistats15,
  author       = "Bach, Stephen H. and Huang, Bert and Getoor, Lise",
  title        = "Unifying Local Consistency and MAX SAT Relaxations for Scalable Inference with Rounding Guarantees",
  booktitle    = "Artificial Intelligence and Statistics (AISTATS)",
  year         = "2015"}
Statistical Relational Learning with Soft Quantifiers
Golnoosh Farnadi, Stephen H. Bach, Marjon Blondeel, Marie-Francine Moens, Lise Getoor, Martine De Cock
International Conference on Inductive Logic Programming (ILP) 2015
Best Student Paper Award
[bibtex]
@InProceedings{farnadi:ilp15,
  author       = "Farnadi, Golnoosh and Bach, Stephen H. and Blondeel, Marjon and Moens, Marie-Francine and Getoor, Lise and De Cock, Martine",
  title        = "Statistical Relational Learning with Soft Quantifiers",
  booktitle    = "International Conference on Inductive Logic Programming (ILP)",
  year         = "2015"}
2014
Rounding Guarantees for Message-Passing MAP Inference with Logical Dependencies
Stephen H. Bach, Bert Huang, and Lise Getoor
NIPS Workshop on Discrete and Combinatorial Problems in Machine Learning (DISCML) 2014
[bibtex] [poster]
@inproceedings{bach:discml14,
  author       = "Bach, Stephen H. and Huang, Bert and Getoor, Lise",
  title        = "Rounding Guarantees for Message-Passing MAP Inference with Logical Dependencies",
  booktitle    = "NIPS Workshop on Discrete and Combinatorial Problems in Machine Learning (DISCML)",
  year         = "2014"}
Probabilistic Soft Logic for Social Good
Stephen H. Bach, Bert Huang, and Lise Getoor
KDD Workshop on Data Science for Social Good 2014
[bibtex] [poster]
@InProceedings{bach:dssg14,
  author       = "Bach, Stephen H. and Huang, Bert and Getoor, Lise",
  title        = "Probabilistic Soft Logic for Social Good",
  booktitle    = "KDD Workshop on Data Science for Social Good",
  year         = "2014"}
Extending PSL with Fuzzy Quantifiers
Golnoosh Farnadi, Stephen H. Bach, Marie-Francine Moens, Lise Getoor, and Martine De Cock
International Workshop on Statistical Relational Artificial Intelligence (StaRAI) 2014
[bibtex]
@InProceedings{farnadi:starai14,
  author       = "Farnadi, Golnoosh and Bach, Stephen H. and Moens, Marie-Francine and Getoor, Lise and De Cock, Martine",
  title        = "Extending PSL with Fuzzy Quantifiers",
  booktitle    = "International Workshop on Statistical Relational Artificial Intelligence (StaRAI)",
  year         = "2014"}
2013
Hinge-loss Markov Random Fields: Convex Inference for Structured Prediction
Stephen H. Bach, Bert Huang, Ben London, and Lise Getoor
Uncertainty in Artificial Intelligence (UAI) 2013
[bibtex] [poster] [code]
@inproceedings{bach:uai13,
  author = "Bach, Stephen H. and Huang, Bert and London, Ben and Getoor, Lise",
  title = "Hinge-loss {M}arkov Random Fields: {C}onvex Inference for Structured Prediction",
  booktitle = "{Uncertainty in Artificial Intelligence (UAI)}",
  year = 2013}
Large-Margin Structured Learning for Link Ranking
Stephen H. Bach, Bert Huang, and Lise Getoor
NIPS Workshop on Frontiers of Network Analysis: Methods, Models, and Applications 2013
Best Student Paper Award
[bibtex] [poster] [code]
@inproceedings{bach:fna13,
  author       = "Bach, Stephen H. and Huang, Bert and Getoor, Lise",
  title        = "Large-Margin Structured Learning for Link Ranking",
  booktitle    = "NIPS Workshop on Frontiers of Network Analysis: Methods, Models, and Applications",
  year         = "2013"}
Learning Latent Groups with Hinge-loss Markov Random Fields
Stephen H. Bach, Bert Huang, and Lise Getoor
ICML Workshop on Inferning: Interactions between Inference and Learning 2013
[bibtex] [poster]
@InProceedings{bach:inferning13,
  author = "Bach, Stephen H. and Huang, Bert and Getoor, Lise",
  title = "Learning Latent Groups with Hinge-Loss {M}arkov Random Fields",
  booktitle = "{ICML Workshop on Inferning: Interactions between Inference and Learning}",
  year = 2013}
Collective Activity Detection using Hinge-Loss Markov Random Fields
Ben London, Sameh Khamis, Stephen H. Bach, Bert Huang, Lise Getoor, and Larry Davis
CVPR Workshop on Structured Prediction: Tractability, Learning and Inference 2013
[bibtex]
@InProceedings{london:sptli13,
  author = "London, Ben and Khamis, Sameh and Bach, Stephen H. and Huang, Bert and Getoor, Lise and Davis, Larry",
  title = "Collective Activity Detection using Hinge-loss {M}arkov Random Fields",
  booktitle = "{CVPR Workshop on Structured Prediction: Tractability, Learning and Inference}",
  year = 2013}
2012
Scaling MPE Inference for Constrained Continuous Markov Random Fields with Consensus Optimization
Stephen H. Bach, Matthias Broecheler, Lise Getoor, and Dianne P. O’Leary
Advances in Neural Information Processing Systems (NIPS) 2012
[bibtex] [supplementary] [poster] [code]
@inproceedings{bach:nips12,
  author = "Bach, Stephen H. and Broecheler, Matthias and Getoor, Lise and O'Leary, Dianne P.",
  title = "Scaling {MPE} Inference for Constrained Continuous {M}arkov Random Fields with Consensus Optimization",
  booktitle = "{Advances in Neural Information Processing Systems (NIPS)}",
  year = 2012}
A Short Introduction to Probabilistic Soft Logic
Angelika Kimmig, Stephen H. Bach, Matthias Broecheler, Bert Huang, and Lise Getoor
NIPS Workshop on Probabilistic Programming: Foundations and Applications 2012
[bibtex]
@inproceedings{kimmig:probprog12,
  author = "Kimmig, Angelika and Bach, Stephen H. and Broecheler, Matthias and Huang, Bert and Getoor, Lise",
  title = "A Short Introduction to Probabilistic Soft Logic",
  booktitle = "{NIPS Workshop on Probabilistic Programming: Foundations and Applications}",
  year = 2012}
Social Group Modeling with Probabilistic Soft Logic
Bert Huang, Stephen H. Bach, Eric Norris, Jay Pujara, and Lise Getoor
NIPS Workshop on Social Network and Social Media Analysis: Methods, Models, and Applications 2012
[bibtex]
@inproceedings{huang:snsma12,
  author = "Huang, Bert and Bach, Stephen H. and Norris, Eric and Pujara, Jay and Getoor, Lise",
  title = "Social Group Modeling with Probabilistic Soft Logic",
  booktitle = "{NIPS Workshop on Social Network and Social Media Analysis: Methods, Models, and Applications}",
  year = 2012}
Graph Summarization in Annotated Data Using Probabilistic Soft Logic
Alex Memory, Angelika Kimmig, Stephen H. Bach, Louiqa Raschid, and Lise Getoor
International Workshop on Uncertainty Reasoning for the Semantic Web (URSW) 2012
[bibtex]
@inproceedings{memory:ursw12,
  author = "Memory, Alex and Kimmig, Angelika and Bach, Stephen H. and Raschid, Louiqa and Getoor, Lise",
  title = "Graph Summarization in Annotated Data Using Probabilistic Soft Logic",
  booktitle = "{International Workshop on Uncertainty Reasoning for the Semantic Web (URSW)}",
  year = 2012}
Older
A Bayesian Approach to Concept Drift
Stephen H. Bach and Marcus A. Maloof
Advances in Neural Information Processing Systems (NIPS) 2010
[bibtex] [poster]
@inproceedings{bach:nips10,
  author = "Bach, Stephen H. and Maloof, Marcus A.",
  title = "A {B}ayesian Approach to Concept Drift",
  booktitle = "{Advances in Neural Information Processing Systems (NIPS)}",
  year = 2010}
Decision-Driven Models with Probabilistic Soft Logic
Stephen H. Bach, Matthias Broecheler, Stanley Kok, and Lise Getoor
NIPS Workshop on Predictive Models in Personalized Medicine 2010
[bibtex] [poster]
@InProceedings{bach:pmpm10,
  author = "Bach, Stephen H. and Broecheler, Matthias and Kok, Stanley and Getoor, Lise",
  title = "Decision-Driven Models with Probabilistic Soft Logic",
  booktitle = "{NIPS Workshop on Predictive Models in Personalized Medicine}",
  year = 2010}
Paired Learners for Concept Drift
Stephen H. Bach* and Marcus A. Maloof*
IEEE International Conference on Data Mining (ICDM) 2008
[bibtex]
@inproceedings{bach:icdm08,
  author = "Bach, Stephen H. and Maloof, Marcus A.",
  title = "Paired Learners for Concept Drift",
  booktitle = "{IEEE International Conference on Data Mining (ICDM)}",
  year = 2008}


(* Equal contributors)