PyJuice is a new system for training and inference of probabilistic circuits that outperforms prior baselines in speed, memory efficiency, and reproducibility. Benchmarking on PD and HCLT structures across datasets like ImageNet32, PyJuice consistently delivers stronger results—achieving 4.33 bpd vs. 4.82 bpd reported in past work. While not aiming for record-breaking scores, the framework establishes reproducible baselines and opens the door for scalable research on tractable deep generative models.

The Future of Tractable Deep Generative Models

2025/08/25 07:11

Abstract and 1. Introduction

2. Preliminaries and Related Work

3. Key Bottlenecks in PC Parallelization

4. Harnessing Block-Based PC Parallelization

  4.1. Fully Connected Sum Layers

  4.2. Generalizing To Practical Sum Layers

  4.3. Efficient Implementations by Compiling PC Layers

  4.4. Analysis: IO and Computation Overhead

5. Optimizing Backpropagation with PC Flows

6. Experiments

  6.1. Faster Models with PyJuice

  6.2. Better PCs At Scale

  6.3. Benchmarking Existing PCs

7. Conclusion, Acknowledgements, Impact Statement, and References

A. Algorithm Details

B. Additional Technical Details

C. Experimental Details

D. Additional Experiments


6.3. Benchmarking Existing PCs


We adopt two PD structures (i.e., PD-mid with 107M edges and PD-large with 405M edges) as well as two HCLT structures (i.e., HCLT-mid with 40M edges and HCLT-large with 174M edges). Details of the adopted models are described in Appendix C.4. We experiment with different optimization strategies and adopt full-batch EM as it yields consistently better performance across models and datasets. Specifically, the computed PC flows are accumulated across all samples in the training set before performing one EM step.
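The full-batch EM procedure can be sketched as follows. This is a minimal illustration, not PyJuice's actual API: `compute_flows` is a hypothetical stand-in for a forward/backward pass that returns the expected edge counts (PC flows) for one batch, shaped like the parameter tensor.

```python
# Minimal sketch of one full-batch EM step over a sum node's edge parameters.
# `compute_flows(params, batch)` is a hypothetical placeholder that returns
# expected edge counts (PC flows) for a single batch of training samples.
import numpy as np

def full_batch_em_step(params, batches, compute_flows, pseudocount=0.1):
    """Accumulate flows over the entire training set, then take one EM step."""
    acc = np.zeros_like(params)
    for batch in batches:
        acc += compute_flows(params, batch)  # expected counts per edge
    # M-step: add a pseudocount for smoothing, then renormalize the
    # accumulated flows into a new set of edge parameters.
    acc += pseudocount
    return acc / acc.sum(axis=-1, keepdims=True)
```

Accumulating over all batches before normalizing is what distinguishes full-batch EM from mini-batch EM, which would update the parameters after every batch.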

Results are shown in Table 3. Notably, we achieve better results compared to previous papers. For example, Liu et al. (2023a) report 4.82 bits-per-dimension (bpd) for HCLT on ImageNet32, while we achieve 4.33 bpd. The performance improvements stem from more training epochs and the ability to do more hyperparameter search thanks to the speedup. We highlight that the goal of this section is not to set new records for tractable deep generative models, but to establish a set of baselines that can be easily reproduced to track the progress of developments in PC modeling and learning. In Appendix C.4, we include additional benchmark results on the WikiText dataset (Merity et al., 2016).
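For reference, bits-per-dimension is simply the average negative log-likelihood rescaled from nats to bits and divided by the data dimensionality; a minimal helper (ImageNet32 images have 32 × 32 × 3 = 3072 dimensions):

```python
import math

def bits_per_dimension(avg_nll_nats: float, num_dims: int) -> float:
    """Convert the average negative log-likelihood per example (in nats)
    into bits-per-dimension: divide by ln(2) to get bits, then by the
    number of dimensions."""
    return avg_nll_nats / (num_dims * math.log(2))

# For ImageNet32, num_dims = 32 * 32 * 3 = 3072.
```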

7. Conclusion

We proposed PyJuice, a novel system that supports training and inference of probabilistic circuits. PyJuice is orders of magnitude faster and much more memory efficient than even very recent baselines. We hope PyJuice can boost future research on tractable deep generative models by allowing for efficient training of large-scale architectures.

Acknowledgements

This work was funded in part by the DARPA PTG Program under award HR00112220005, the DARPA ANSR program under award FA8750-23-2-0004, and the NSF grant #IIS1943641. We thank Honghua Zhang, Pasha Khosravi, and Poorva Garg for providing valuable feedback during the development of PyJuice.

Impact Statement

This paper presents work whose goal is to advance the field of Machine Learning. There are many potential societal consequences of our work, none of which we feel must be specifically highlighted here.

References

Ahmed, K., Teso, S., Chang, K.-W., Van den Broeck, G., and Vergari, A. Semantic probabilistic layers for neurosymbolic learning. In Advances in Neural Information Processing Systems 35 (NeurIPS), 2022a.

Ahmed, K., Wang, E., Chang, K.-W., and Van den Broeck, G. Neuro-symbolic entropy regularization. In Proceedings of the 38th Conference on Uncertainty in Artificial Intelligence (UAI), 2022b.

Ahmed, K., Chang, K.-W., and Van den Broeck, G. A pseudo-semantic loss for deep autoregressive models with logical constraints. In Advances in Neural Information Processing Systems 36 (NeurIPS), 2023a.

Ahmed, K., Zeng, Z., Niepert, M., and Van den Broeck, G. SIMPLE: A gradient estimator for k-subset sampling. In Proceedings of the International Conference on Learning Representations (ICLR), 2023b.

Choi, Y., Vergari, A., and Van den Broeck, G. Probabilistic circuits: A unifying framework for tractable probabilistic models. Technical report, 2020. URL http://starai.cs.ucla.edu/papers/ProbCirc20.pdf.

Choi, Y., Dang, M., and Van den Broeck, G. Group fairness by probabilistic modeling with latent fair decisions. In Proceedings of the 35th AAAI Conference on Artificial Intelligence, 2021.

Correia, A., Peharz, R., and de Campos, C. P. Joints in random forests. Advances in Neural Information Processing Systems, 33:11404–11415, 2020.

Correia, A. H., Gala, G., Quaeghebeur, E., de Campos, C., and Peharz, R. Continuous mixtures of tractable probabilistic models. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pp. 7244–7252, 2023.

Dadu, V., Weng, J., Liu, S., and Nowatzki, T. Towards general purpose acceleration by exploiting common data-dependence forms. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 924–939, 2019.

Dang, M., Vergari, A., and Van den Broeck, G. Strudel: Learning structured-decomposable probabilistic circuits. In International Conference on Probabilistic Graphical Models, pp. 137–148. PMLR, 2020.

Dang, M., Khosravi, P., Liang, Y., Vergari, A., and Van den Broeck, G. Juice: A Julia package for logic and probabilistic circuits. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pp. 16020–16023, 2021.

Dang, M., Liu, A., and Van den Broeck, G. Sparse probabilistic circuits via pruning and growing. Advances in Neural Information Processing Systems, 35:28374–28385, 2022.

Darwiche, A. A logical approach to factoring belief networks. KR, 2:409–420, 2002.

Darwiche, A. A differential approach to inference in Bayesian networks. Journal of the ACM (JACM), 50(3):280–305, 2003.

Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE, 2009.

Gala, G., de Campos, C., Peharz, R., Vergari, A., and Quaeghebeur, E. Probabilistic integral circuits. In International Conference on Artificial Intelligence and Statistics, pp. 2143–2151. PMLR, 2024.

Gens, R. and Domingos, P. Learning the structure of sum-product networks. In International Conference on Machine Learning, pp. 873–880. PMLR, 2013.

Lin, B. Y., Zhou, W., Shen, M., Zhou, P., Bhagavatula, C., Choi, Y., and Ren, X. CommonGen: A constrained text generation challenge for generative commonsense reasoning. In Findings of the Association for Computational Linguistics: EMNLP 2020, pp. 1823–1840, 2020.

Liu, A. and Van den Broeck, G. Tractable regularization of probabilistic circuits. Advances in Neural Information Processing Systems, 34:3558–3570, 2021.

Liu, A., Mandt, S., and Van den Broeck, G. Lossless compression with probabilistic circuits. In Proceedings of the International Conference on Learning Representations (ICLR), 2022.

Liu, A., Zhang, H., and Van den Broeck, G. Scaling up probabilistic circuits by latent variable distillation. In Proceedings of the International Conference on Learning Representations (ICLR), 2023a.

Liu, A., Niepert, M., and Van den Broeck, G. Image inpainting via tractable steering of diffusion models. 2024.

Liu, X., Liu, A., Van den Broeck, G., and Liang, Y. Expressive modeling is insufficient for offline RL: A tractable inference perspective. arXiv preprint arXiv:2311.00094, 2023b.

Liu, X., Liu, A., Van den Broeck, G., and Liang, Y. Understanding the distillation process from deep generative models to tractable probabilistic circuits. In International Conference on Machine Learning, pp. 21825–21838. PMLR, 2023c.

Liu, Z., Luo, P., Wang, X., and Tang, X. Deep learning face attributes in the wild. In Proceedings of the International Conference on Computer Vision (ICCV), 2015.

Loconte, L., Di Mauro, N., Peharz, R., and Vergari, A. How to turn your knowledge graph embeddings into generative models. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.

Loconte, L., Sladek, A. M., Mengel, S., Trapp, M., Solin, A., Gillis, N., and Vergari, A. Subtractive mixture models via squaring: Representation and learning. In Proceedings of the International Conference on Learning Representations (ICLR), 2024.

Lowd, D. and Rooshenas, A. The Libra toolkit for probabilistic models. Journal of Machine Learning Research, 16:2459–2463, 2015.

Manhaeve, R., Dumancic, S., Kimmig, A., Demeester, T., and De Raedt, L. DeepProbLog: Neural probabilistic logic programming. In Advances in Neural Information Processing Systems 31 (NeurIPS), 2018.

Mari, A., Vessio, G., and Vergari, A. Unifying and understanding overparameterized circuit representations via low-rank tensor decompositions. In The 6th Workshop on Tractable Probabilistic Modeling, 2023.

Mathur, S., Gogate, V., and Natarajan, S. Knowledge intensive learning of cutset networks. In Uncertainty in Artificial Intelligence, pp. 1380–1389. PMLR, 2023.

Merity, S., Xiong, C., Bradbury, J., and Socher, R. Pointer sentinel mixture models. arXiv preprint arXiv:1609.07843, 2016.

Molina, A., Vergari, A., Stelzner, K., Peharz, R., Subramani, P., Di Mauro, N., Poupart, P., and Kersting, K. SPFlow: An easy and extensible library for deep probabilistic learning using sum-product networks. arXiv preprint arXiv:1901.03704, 2019.

Murphy, K., Linderman, S., Chang, P. G., Li, X., Kara, A., Harper-Donnelly, G., and Duran-Martin, G. Dynamax, 2023. URL https://github.com/probml/dynamax.

Peharz, R., Lang, S., Vergari, A., Stelzner, K., Molina, A., Trapp, M., Van den Broeck, G., Kersting, K., and Ghahramani, Z. Einsum networks: Fast and scalable learning of tractable probabilistic circuits. In International Conference on Machine Learning, pp. 7563–7574. PMLR, 2020a.

Peharz, R., Vergari, A., Stelzner, K., Molina, A., Shao, X., Trapp, M., Kersting, K., and Ghahramani, Z. Random sum-product networks: A simple and effective approach to probabilistic deep learning. In Uncertainty in Artificial Intelligence, pp. 334–344. PMLR, 2020b.

Poon, H. and Domingos, P. Sum-product networks: A new deep architecture. In 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pp. 689–690. IEEE, 2011.

Pronobis, A., Ranganath, A., and Rao, R. P. LibSPN: A library for learning and inference with sum-product networks and TensorFlow. In Principled Approaches to Deep Learning Workshop, 2017.

Qian, C., Manolache, A., Ahmed, K., Zeng, Z., Van den Broeck, G., Niepert, M., and Morris, C. Probabilistic task-adaptive graph rewiring. In Proceedings of the International Conference on Learning Representations (ICLR), 2023.

Rabiner, L. and Juang, B. An introduction to hidden Markov models. IEEE ASSP Magazine, 3(1):4–16, 1986.

Rahman, T., Kothalkar, P., and Gogate, V. Cutset networks: A simple, tractable, and scalable approach for improving the accuracy of Chow-Liu trees. In Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2014, Nancy, France, September 15–19, 2014, Proceedings, Part II 14, pp. 630–645. Springer, 2014.

Shah, N., Olascoaga, L. I. G., Zhao, S., Meert, W., and Verhelst, M. DPU: DAG processing unit for irregular graphs with precision-scalable posit arithmetic in 28 nm. IEEE Journal of Solid-State Circuits, 57(8):2586–2596, 2021.

Vergari, A., Choi, Y., Peharz, R., and Van den Broeck, G. Probabilistic circuits: Representations, inference, learning and applications. AAAI Tutorial, 2020.

Vergari, A., Choi, Y., Liu, A., Teso, S., and Van den Broeck, G. A compositional atlas of tractable circuit operations for probabilistic inference. Advances in Neural Information Processing Systems, 34:13189–13201, 2021.

Wang, B. and Kwiatkowska, M. Compositional probabilistic and causal inference using tractable circuit models. In International Conference on Artificial Intelligence and Statistics, pp. 9488–9498. PMLR, 2023.

Xu, J., Zhang, Z., Friedman, T., Liang, Y., and Van den Broeck, G. A semantic loss function for deep learning with symbolic knowledge. In Proceedings of the 35th International Conference on Machine Learning, 2018.

Yang, Y., Gala, G., and Peharz, R. Bayesian structure scores for probabilistic circuits. In International Conference on Artificial Intelligence and Statistics, pp. 563–575. PMLR, 2023.

Yao, L., Trapp, M., Periasamy, K., Leslin, J., Singh, G., and Andraud, M. Logarithm-approximate floating-point multiplier for hardware-efficient inference in probabilistic circuits. In The 6th Workshop on Tractable Probabilistic Modeling, 2023.

Zhang, H., Dang, M., Peng, N., and Van den Broeck, G. Tractable control for autoregressive language generation. In International Conference on Machine Learning, pp. 40932–40945. PMLR, 2023.


:::info Authors:

(1) Anji Liu, Department of Computer Science, University of California, Los Angeles, USA ([email protected]);

(2) Kareem Ahmed, Department of Computer Science, University of California, Los Angeles, USA;

(3) Guy Van den Broeck, Department of Computer Science, University of California, Los Angeles, USA;

:::


:::info This paper is available on arXiv under CC BY 4.0 DEED license.

:::
