Sven Balnojan
1 min read · Sep 19, 2019


Hi Anuradha, you’re right: MRR and Hit@n are reported per bucket during training and as an average over all buckets at evaluation time. This is because both the train and test sets are partitioned.
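To make that concrete, here’s a toy sketch of the two views of the same metrics (illustrative only, not PBG’s internal code; the rank lists and bucket keys are made up):

```python
# Toy illustration: ranks of the true entity for each scored edge,
# grouped by the (lhs, rhs) partition bucket the edge falls into.
ranks_per_bucket = {
    (0, 0): [1, 3, 2],
    (0, 1): [5, 1],
    (1, 1): [2, 2, 10, 1],
}

def mrr(ranks):
    """Mean reciprocal rank over a list of ranks (1 = best)."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at(ranks, n):
    """Fraction of ranks that are <= n."""
    return sum(r <= n for r in ranks) / len(ranks)

# During training: one number per bucket.
for bucket, ranks in ranks_per_bucket.items():
    print(bucket, "MRR:", round(mrr(ranks), 3), "Hit@3:", hits_at(ranks, 3))

# At evaluation time: a single figure across buckets, e.g. the average
# of the per-bucket MRRs (how PBG weights the buckets is not shown here).
overall = sum(mrr(r) for r in ranks_per_bucket.values()) / len(ranks_per_bucket)
print("overall MRR:", round(overall, 3))
```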

PBG doesn’t help with “un-partitioning” the evaluation; instead, it suggests you split off a test set beforehand and evaluate it with a second config file. That’s most likely because an unpartitioned test set would be huge compared with what you actually need for this kind of problem.

So you’d have:

  • unpartitioned_test.txt: a small set of edges held out for the overall offline evaluation (evaluate on this one with config_eval.py)
  • unpartitioned_train.txt: everything else, which you’d then split into train and test, and train and evaluate with your config.py

They explain that here: https://torchbiggraph.readthedocs.io/en/latest/evaluation.html#offline-evaluation
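For concreteness, here’s a minimal split sketch along those lines (all_edges.txt, the 0.5% fraction, and the one-edge-per-line format are assumptions; the two output file names match the bullets above):

```python
# Splits the full edge list into a small held-out set for offline evaluation
# and a training file that you then split further for train-time evaluation.
import random

random.seed(0)
held_out_fraction = 0.005  # keep the offline test set small

with open("all_edges.txt") as f:   # assumed: one "lhs rel rhs" edge per line
    edges = f.readlines()

random.shuffle(edges)
cut = int(len(edges) * held_out_fraction)

with open("unpartitioned_test.txt", "w") as f:
    f.writelines(edges[:cut])      # evaluate on this with config_eval.py
with open("unpartitioned_train.txt", "w") as f:
    f.writelines(edges[cut:])      # train (and train-time eval) with config.py
```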
