
GPT-NeoX-20B on Flax (xmap)

Unclaimed Colabs

License: Apache 2.0
API usage: Self-contained
Type: Generative

The Flax implementation on TPUs currently has a slight performance regression relative to the PyTorch implementations. The comparison can be seen here.

If you want to evaluate GPT-NeoX-20B for research purposes, please use the original GPT-NeoX, Minimal PyTorch, or Hugging Face implementations.

This TPU implementation of GPT-NeoX-20B is also still a prototype with some hacks, so if you see any room for improvement, please drop by the repo!

(For instance, I'm resorting to fp32 for some operations to avoid NaNs, which leads to greater memory usage than is necessary.)
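
To illustrate the kind of fp32 fallback described above, here is a minimal JAX sketch. It is not taken from the repo: the function name `softmax_fp32` and the choice of softmax as the upcast operation are assumptions made for illustration only. The idea is to cast bfloat16 inputs up to float32 for the numerically sensitive step and cast the result back down afterwards.

```python
import jax.numpy as jnp


def softmax_fp32(logits):
    """Softmax computed in float32, returned in the input dtype.

    Hypothetical helper for illustration; not the repo's actual code.
    """
    orig_dtype = logits.dtype
    # Upcast so the exp/sum below run in float32.
    x = logits.astype(jnp.float32)
    # Subtract the row max for numerical stability.
    x = x - jnp.max(x, axis=-1, keepdims=True)
    exp_x = jnp.exp(x)
    probs = exp_x / jnp.sum(exp_x, axis=-1, keepdims=True)
    # Cast back down so downstream layers still see the original dtype.
    return probs.astype(orig_dtype)


# Example: bfloat16 logits, including a row with large magnitudes.
logits = jnp.array(
    [[1.0, 2.0, 3.0], [1000.0, 1000.0, -1000.0]], dtype=jnp.bfloat16
)
probs = softmax_fp32(logits)
```

The float32 intermediates take 4 bytes per element instead of 2 for bfloat16, which is where the extra memory usage comes from.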
 