Release v0.7.0 · alibaba/TorchEasyRec

Major Features and Improvements

Train/Eval/Export

Support train/eval/export on CPU #27
Support TRT export (Beta) #30 #32 #41 #43 #58 #59 #89
Support AOT export (WIP) #79

Model

Optimize TDM tree generation speed #33
TDM supports string id #72
Rank and Match models support sample weight #50 #57 #63 #65
Add zero collision hash embedding #60
Add intervention methods for multi-target learning #49
Add Autodis and MLP embedding for raw features #73 #75
Add task space for multi-target learning loss #82
Add dual augmented two-tower match model #83
Add HSTU (WIP) #55

Feature

pyfg supports CPUs without AVX-512 #20
ExprFeature supports l2_norm|dot|euclid_dist #35
Add fg bucketize only mode & refactor fg_encoded to fg_mode #62
Make default bucketize value configurable #94
Support multi-value sequence #96
Support vocab file #97
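
To make the three new ExprFeature functions concrete, here is a minimal pure-Python sketch of the math their names suggest. It is not the pyfg implementation: the function bodies, the argument types, and the reading of `l2_norm` as the Euclidean norm (rather than a normalization step) are assumptions for illustration only.

```python
import math
from typing import Sequence


def dot(x: Sequence[float], y: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


def l2_norm(x: Sequence[float]) -> float:
    """Euclidean (L2) norm, assumed here to mean sqrt(sum(x_i ** 2))."""
    return math.sqrt(dot(x, x))


def euclid_dist(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical user and item vectors, used only to exercise the functions.
user_emb = [3.0, 4.0]
item_emb = [0.0, 0.0]
print(dot(user_emb, user_emb))          # 25.0
print(l2_norm(user_emb))                # 5.0
print(euclid_dist(user_emb, item_emb))  # 5.0
```

For the exact expression syntax and semantics accepted by ExprFeature, refer to #35 and the pyfg documentation.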

Dataset

Enhance stability for credential of OdpsDataset #45
Add complex type and credential support for sampler when using OdpsDataset #52
Support CsvDataset with null columns #56
Negative sampler supports string id #70

Config

Support converting easyrec config to tzrec config #37 #39 #51 #90

Upgrade

Release official dlc image #26
Upgrade PyTorch to v2.6 and TorchRec to v1.1.0 #99

Note

For TorchEasyRec 0.7.x, you should use Docker image version 0.7.

For the GPU version (CUDA 12.4):
- mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/tzrec-devel:0.7-cu124
- PyTorch: v2.6, CUDA: v12.4, FBGEMM: v1.1.0, TorchRec: v1.1.0, Python: v3.11

For the CPU version:
- mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/tzrec-devel:0.7-cpu
- PyTorch: v2.6, FBGEMM: v1.1.0, TorchRec: v1.1.0, Python: v3.11
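
If you want to confirm that an environment matches the versions listed above, a small check along these lines may help. This is only a sketch: the distribution names used as keys (`torch`, `torchrec`, `fbgemm-gpu`) are assumptions and may differ inside the images, and `find_mismatches` is a hypothetical helper, not part of TorchEasyRec.

```python
from importlib import metadata
from typing import Callable, Dict

# Expected version prefixes taken from the note above. The distribution
# names (keys) are assumptions and may differ inside the image.
EXPECTED = {"torch": "2.6", "torchrec": "1.1.0", "fbgemm-gpu": "1.1.0"}


def find_mismatches(
    expected: Dict[str, str],
    get_version: Callable[[str], str] = metadata.version,
) -> Dict[str, str]:
    """Return {package: installed version or 'not installed'} for mismatches."""
    mismatches = {}
    for name, prefix in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            mismatches[name] = "not installed"
            continue
        # Prefix match so local build tags (e.g. a "+cu124" suffix) still pass.
        if not installed.startswith(prefix):
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    print(find_mismatches(EXPECTED) or "all versions match")
```

The `get_version` parameter exists only so the check can be exercised without the packages installed; in normal use the default lookup is enough.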

Bug Fixes and Other Changes

[bugfix] remove redundant sequence key in feature input names when fg_mode is DAG by @tiankongdeguiji in #21
fix quota_name for add feature info by @chengaofei in #22
update config delete drop feature config by @chengaofei in #23
[feat] make docker compat with gpu driver 470 by @tiankongdeguiji in #24
[bugfix] fix dlc tutorial doc by @tiankongdeguiji in #25
[bugfix] fix dbmtl model doc by @tiankongdeguiji in #28
[feat] add pai dlc and dsw dependency in docker by @tiankongdeguiji in #29
[feat] update easyrec dinggroup qrcode by @tiankongdeguiji in #31
[feat] update pyfg doc to 0.3.5 by @tiankongdeguiji in #34
[bugfix] fix fg arrow handler with sample mask by @tiankongdeguiji in #38
[feat] add unique test work dir by @tiankongdeguiji in #40
[bugfix] add id field of negative sampler to selected columns by @tiankongdeguiji in #42
[bugfix] prevent predict hang when subthread or subproc exception by @tiankongdeguiji in #44
[bugfix] input_tile=3: make dataparser to get user feats before creat… by @yjjinjie in #46
[bugfix] fix sequence feature doc by @tiankongdeguiji in #48
[feat] optimize is_user_feat of Feature when use dag by @tiankongdeguiji in #53
[bugfix] refine sample weight compatibility & refine label dtype check & relax predict pipeline check & fix num_rows < num_workers when use OdpsDataset by @tiankongdeguiji in #54
[feat] add doc for training with maxcompute tables on DLC by @yanzhen1233 in #47
create fg will use resource name by @chengaofei in #64
[bugfix] fix is_sparse of LookupFeature and MatchFeature when use vocab_dict by @tiankongdeguiji in #66
[bugfix] fix odps quota in hitrate.py & refine error info of CsvReader and ParquetReader by @tiankongdeguiji in #67
[bugfix] fix mtl model label in ut by @tiankongdeguiji in #68
[bugfix] fix calculate_shard_storages to handle optimizer correctly by @tiankongdeguiji in #69
[feat] add LOG_LEVEL environ variable by @tiankongdeguiji in #71
[bugfix] fix predict when num_workers = 0 by @tiankongdeguiji in #74
[bugfix] fix duplicate server launch error in odps sampler test by @tiankongdeguiji in #76
[feat] refactor batch_size to tile_size in Batch dataclass by @tiankongdeguiji in #77
[feat] add total_loss to the plogger and summary_writer by @eric-gecheng in #78
[bugfix] fix weighted feature when INPUT_TILE=2 by @tiankongdeguiji in #80
[bugfix] fix negative sample table with multiple partitions by @tiankongdeguiji in #81
[bugfix] readme typo by @eric-gecheng in #85
[doc] fix task space doc error by @chengaofei in #86
[bugfix] add div_no_nan and prevent divide by zero loss weight by @tiankongdeguiji in #88
[feat] remove sample weight and labels when export by @tiankongdeguiji in #91
[feat] configure the shell to be bash by default in docker environments by @tiankongdeguiji in #92
[doc] creat fg json doc add upload fg json to mc method by @chengaofei in #95
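
As background for the `div_no_nan` fix in #88, the sketch below shows the general idea of a division that returns zero instead of NaN or an error when the denominator is zero. It is not the code from that PR: only the name `div_no_nan` comes from the PR title, while the list-based signature and the `weighted_mean_loss` helper are hypothetical and written in plain Python for illustration.

```python
from typing import List, Sequence


def div_no_nan(
    numerators: Sequence[float], denominators: Sequence[float]
) -> List[float]:
    """Element-wise division that yields 0.0 wherever the denominator is 0."""
    if len(numerators) != len(denominators):
        raise ValueError("inputs must have the same length")
    return [n / d if d != 0 else 0.0 for n, d in zip(numerators, denominators)]


def weighted_mean_loss(losses: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of per-sample losses; all-zero weights give 0.0."""
    total = sum(loss * w for loss, w in zip(losses, weights))
    return div_no_nan([total], [sum(weights)])[0]


print(div_no_nan([1.0, 2.0], [2.0, 0.0]))          # [0.5, 0.0]
print(weighted_mean_loss([1.0, 3.0], [1.0, 1.0]))  # 2.0
print(weighted_mean_loss([1.0, 3.0], [0.0, 0.0]))  # 0.0
```

Returning zero for a batch whose weights sum to zero keeps that batch from contributing a NaN to the loss.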

New Contributors

@yjjinjie made their first contribution in #30
@eric-gecheng made their first contribution in #50
@yanzhen1233 made their first contribution in #47
@Dave-AdamsWANG made their first contribution in #49
@chengmengli06 made their first contribution in #79
@iWelkin-coder made their first contribution in #55

Full Changelog: v0.6.0...v0.7.0
