5 comments

  • niksmather 6 days ago
    Apologies if I didn't understand the paper, but why do you want to apply diffusion models to tabular datasets in the first place?
    Do we think they'll be better than decision trees? Is there some tabular problem that can be handled by diffusion but not trees?
    • henrydark 5 days ago
      First, they give a novel generation algorithm based on combining trees with diffusion, which trees alone just don't give you.
      Second, yes, they think some tabular data will be fit better by their combination of trees with diffusion than just with trees.
    • robotresearcher 6 days ago
      You might not want to make a sword out of iron if steel is available, but understanding the relationship between iron and steel is broadly valuable.
      • niksmather 6 days ago
        I can see the mathematical results are interesting; I was more wondering if there was a practical utility to this TreeFlow thing they built.
  • semessier 6 days ago
    this lacks the math for any bold claims
    • emil-lp 6 days ago
      Did you read the paper? Is there something specifically you're missing? A proof? A theorem statement?
      • semessier 6 days ago
        This is an empirical engineering paper with theoretical dressing; it would not need to be a theorem paper, of course.
  • henrydark 6 days ago
    Is the code available somewhere?
  • rsn243 6 days ago
    Decision trees and diffusion models are ostensibly disparate model classes, one discrete and hierarchical, the other continuous and dynamic. This work unifies the two by establishing a crisp mathematical correspondence between hierarchical decision trees and diffusion processes in appropriate limiting regimes. Our unification reveals a shared optimization principle: \emph{Global Trajectory Score Matching (GTSM)}, for which gradient boosting (in an idealized version) is asymptotically optimal. We underscore the conceptual value of our work through two key practical instantiations: \treeflow, which achieves competitive generation quality on tabular data with higher fidelity and a 2\times computational speedup, and \dsmtree, a novel distillation method that transfers hierarchical decision logic into neural networks, matching teacher performance within 2\% on many benchmarks.
    • Jaxan 6 days ago
      You could at least fix the latex commands when copy pasting the abstract. ;-)
  • gorold 6 days ago
    Figure 1 definitely cleared up any misunderstandings I had about the paper