Fudan University releases LOMO, a low-memory optimization technique | It cuts the memory usage of large-model training to 10.8%, far ahead of DeepSpeed!

Origin juejin.im/post/7250491326260264997