
RecBole v1.1.0

Released by @zhengbw0324 on 04 Oct 05:35

RecBole v1.1.0 Release Notes

After more than half a year of hard work, we have completed the upgrade of RecBole and released a new version: RecBole v1.1.0!

In this release, we fully consider users' feedback and requests to improve the user-friendliness of RecBole. Specifically, we update several commonly used, mainstream data processing methods and reconstruct our data module to be compatible with a series of efficient data processing APIs. Meanwhile, we implement distributed training and parallel tuning modules to accelerate model training on large-scale data. Based on the issues and discussions, we also fix a number of bugs and update the documentation to make it more user-friendly.

In short, RecBole v1.1.0 is more efficient, convenient and flexible than previous versions. More details are introduced in the following sections:

  • Highlights
  • New Features
  • Bug Fixes
  • Code Refactor
  • Docs

Highlights

The RecBole v1.1.0 release includes a number of new features, bug fixes and code refactoring. A few of the highlights include:

  1. We add 5 new models to RecBole.
  2. More flexible data processing. We add data transformation for sequential models, discretization of continuous features for context-aware models and knowledge graph filtering for knowledge-aware models.
  3. More efficient training and tuning. We add three components to RecBole: multi-GPU training, mixed precision training and intelligent hyper-parameter tuning, which make it more efficient to deal with large-scale data in different recommendation scenarios; a minimal configuration sketch follows this list.
  4. More reproducible configurations. To further facilitate the search process of hyper-parameters, we provide the hyper-parameter selection ranges and recommended configurations for each model on three datasets, covering four types of recommendation tasks.
  5. More user-friendly documentation. We add detailed running examples and run-time configurations for all kinds of recommendation tasks.
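
As a quick illustration of the efficiency features above, the sketch below turns on mixed precision training through the standard quick-start entry point. The config key enable_amp is an assumption inferred from the feature description rather than a name taken verbatim from the documentation, so please check the v1.1.0 docs for the authoritative parameter names.

```python
# A minimal sketch (not taken from the release notes) of enabling mixed
# precision training via the quick-start API. The key enable_amp is an
# assumption; verify the exact name in the v1.1.0 documentation.
from recbole.quick_start import run_recbole

config_dict = {
    "enable_amp": True,   # assumed switch for mixed precision training
    "gpu_id": "0",        # single-GPU run; multi-GPU training is configured separately
    "epochs": 50,
}

run_recbole(model="BPR", dataset="ml-100k", config_dict=config_dict)
```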

New Features

  • Add 5 new models:
    • Context recommendation (1): DCNv2 (#1418)
    • General recommendation (1): SimpleX (#1282)
    • Sequential recommendation (1): CORE (#1274)
    • Knowledge-aware recommendation (2): KGIN (#1420), MCCLK (#1420)
  • Add ipynb tutorials on prediction in run_example (#1229).
  • Support mixed precision training (#1337).
  • Add the implementation of distributed recommendation (#1338).
  • Support data filtering of knowledge graph (#1342).
  • Support counting of FLOPs (#1345).
  • Add Python code formatting in GitHub Actions according to PEP 8 (#1349).
  • Add non-ergodic hyper-parameter search strategy (#1350).
  • Add float feature field discretization (#1352).
  • Support hyper-parameter search using Ray (#1360, #1411); a tuning sketch follows this list.
  • Add data transform (#1380).
  • Add benchmark into RecBole (#1416).
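
To make the tuning-related items above concrete, here is a hedged sketch of how a parameter-file-driven search might be launched with the existing HyperTuning helper. The constructor arguments and the parameter-file syntax shown in the comments follow the pre-1.1.0 interface and are assumptions; the Ray backend added in #1360/#1411 has its own entry point.

```python
# Hedged sketch of hyper-parameter search; the HyperTuning arguments and the
# params-file syntax in the comments are assumptions based on earlier
# releases, not verbatim from this version.
from recbole.quick_start import objective_function
from recbole.trainer import HyperTuning

# hyper.test lists one search dimension per line, for example:
#   learning_rate choice [0.005,0.01,0.05]
#   embedding_size choice [64,128]
hp = HyperTuning(
    objective_function,
    algo="exhaustive",                              # grid search over the choices above
    params_file="hyper.test",
    fixed_config_file_list=["fixed_config.yaml"],   # hypothetical fixed-config file
)
hp.run()
hp.export_result(output_file="hyper_result.txt")
print("best params:", hp.best_params)
```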

Bug Fixes

  • Model:
    • Fix a bug in abstract_recommender.py: update the embed_input_fields function (#1177).
    • Fix a bug in SGL: remove the device in embedding layer (#1180).
    • Fix a bug in NeuMF: update the copy method of model parameters (#1186).
    • Fix the code in SRGNN: optimize the implementation of SRGNN (#1217).
    • Fix UserWarning in LightGCN, NGCF, NCL, SGL and SimpleX: add np.array() in get_norm_adj_mat and csr2tensor (#1225, #1397).
    • Fix a bug in CORE: remove item_seq_len in forward (#1379).
    • Fix a bug in FwFMs: update float_embeddings and fwfm_layer in FwFMs (#1414).
  • Dataset:
    • Fix the bug in Interaction when the input tensor is a 0-d tensor (#1188).
    • Fix the bug that unused_col is not applied when using benchmark_file (#1301).
    • Fix dataloader random factors (#1340).
    • Delete transform log (#1385).
    • Fix serialize bug when save/load dataloaders (#1386).
    • Fix the function of history_item_matrix (#1405).
  • Trainer:
    • Fix RandomState in hyper_tuning.py (#1192).
    • Fix mixed precision training on CPU (#1344).
  • Util:
  • Config:
    • Fix the bug in neg_sampling config (#1215).
    • Fix the bug of normalize_all in ml-100k.yaml (#1294).
    • Fix the gpu_id format (#1402).
  • Typo:
    • Fix typo of ValueError in dataset._get_download_url (#1190).

Code Refactor

  • Refactor the negative sampling: use train_neg_sample_args (dict) instead of neg_sampling (dict) (#1343); see the configuration example after this list.
  • Refactor the log: (1) add a config hash and rename the log file (#1341); (2) add the model and dataset names to the log file (#1381).
  • Refactor the test process: add tests for hyper-tuning (#1361).
  • Refactor the configurator: add a warning for old parameters (#1367).
  • Refactor the popularity sampling: add an alpha parameter for the popularity sampling distribution (#1382).
  • Refactor FPMC: add CE and BPR losses to FPMC (#1383).
  • Refactor run_hyper: add a display parameter for run_hyper (#1385).
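
To illustrate the renamed negative-sampling option (#1343) together with the new alpha parameter for popularity sampling (#1382), the sketch below passes the new-style key through a config_dict. The nested field names are assumptions drawn from the refactor descriptions above; the updated configuration docs are the authoritative reference.

```python
# Hedged example of the renamed negative-sampling setting; the nested keys
# (distribution, sample_num, alpha, dynamic) are assumptions based on the
# refactor items above rather than verbatim documented names.
from recbole.quick_start import run_recbole

config_dict = {
    # replaces the old neg_sampling dict
    "train_neg_sample_args": {
        "distribution": "popularity",  # popularity-based negative sampling
        "sample_num": 1,               # number of negatives per positive interaction
        "alpha": 0.75,                 # smoothing exponent added in #1382
        "dynamic": False,              # no dynamic (hard) negative sampling
    },
}

run_recbole(model="BPR", dataset="ml-100k", config_dict=config_dict)
```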

Docs

  • Add a supplementary description of DMF (#1194).
  • Fix the authors for SR-GNN in docs (#1204).
  • Add sequential, context and knowledge quick starts (#1351).
  • Fix docstring warnings when making HTML files (#1353).
  • Fix some documentation typos (#1359).
  • Add docs for Distributed DataParallel (#1362).
  • Insert eval_collector.data_collect when evaluating from a checkpoint (#1364).
  • Modify neg_sampling into train_neg_sample_args in the sequential docs (#1365).
  • Update open source contributions and the model list, and add constraints for purpose in README (#1371, #1457).
  • Fix warnings in docs and modify configuration (#1373).
  • Rename neg_sampling to train_neg_sample_args (#1383).
  • Fix the description of mixed precision training and Ray (#1407).