Hi Anuradha, you’re right: MRR and Hit@n are reported per bucket during training and averaged over all buckets at evaluation time. That’s because both the train and test sets are partitioned.
PBG doesn’t help with “un-partitioning” the data; instead, it proposes that you split off a test set beforehand and evaluate it with a second config file. That’s most likely because an unpartitioned test set would be huge compared to what you actually need for this kind of evaluation.
So you’d have:
- unpartitioned_test.txt, a small set of edges held out for the overall evaluation (evaluate on this one with config_eval.py; a sketch of such a config follows the docs link below)
- unpartitioned_train.txt, which you’d then split into train & test (see the split sketch right after this list) and train & evaluate on with your config.py
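For the split itself, here’s a minimal sketch. The input file name (`full_edges.txt`) and the 99/1 ratio are my assumptions, not anything PBG prescribes:

```python
# Minimal hold-out split sketch; the input file name and the 99/1
# ratio are assumptions -- adjust them to your data.
import random

random.seed(0)  # make the split reproducible

with open("full_edges.txt") as f:  # assumed name of your original edge list
    edges = f.readlines()

random.shuffle(edges)
cut = max(1, len(edges) // 100)  # hold out ~1% as the small overall test set

with open("unpartitioned_test.txt", "w") as f:
    f.writelines(edges[:cut])
with open("unpartitioned_train.txt", "w") as f:
    f.writelines(edges[cut:])
```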
They explain that here: https://torchbiggraph.readthedocs.io/en/latest/evaluation.html#offline-evaluation
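For reference, config_eval.py would look roughly like this. It’s only a sketch under assumptions: the paths, entity/relation schema, and dimension below are placeholders, so copy the real values from your training config.py and change only `edge_paths` to point at the held-out edges (after running them through `torchbiggraph_import_from_tsv`, since PBG evaluates on its own imported format, not raw text):

```python
# config_eval.py -- a sketch, not a drop-in file; every value below is an
# assumed placeholder except the config keys themselves. Mirror your
# training config.py and change only edge_paths.

def get_torchbiggraph_config():
    return dict(
        # Same schema and checkpoint as training, so the trained
        # embeddings get loaded for evaluation.
        entity_path="data/my_graph",
        checkpoint_path="model/my_graph",
        entities={"all": {"num_partitions": 1}},
        relations=[
            {"name": "all_edges", "lhs": "all", "rhs": "all", "operator": "none"},
        ],
        dimension=200,
        # The one real change: point at the imported held-out test edges.
        edge_paths=["data/my_graph/unpartitioned_test_partitioned"],
    )
```

You’d then run `torchbiggraph_eval config_eval.py`, and the reported MRR / Hit@n are computed over those held-out edges.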