OpenAI's Shap-E can generate 3D assets from text or images. Unlike OpenAI's earlier model Point-E, which produces point clouds, Shap-E directly generates the parameters of implicit functions that can be rendered as both textured meshes and neural radiance fields. It is also faster to run and open-source! Read the paper here.
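
To make "implicit function" a bit more concrete, here is a toy sketch in plain Python. It is not Shap-E's code or API: `make_implicit`, `surface_hit`, and `volume_render` are hypothetical names made up for illustration, and a hand-written sphere stands in for the neural network whose weights the real model would generate. The point is only that one function over 3D space can be queried two ways, as a surface (the mesh-style view) or as a density field composited along a ray (the radiance-field-style view).

```python
import math

def make_implicit(radius, rgb):
    """Toy implicit function: maps a 3D point to (signed distance, color).
    Assumption: 'radius' and 'rgb' stand in for the generated parameters."""
    def f(x, y, z):
        sdf = math.sqrt(x * x + y * y + z * z) - radius  # negative inside
        return sdf, rgb
    return f

def point_on_ray(origin, direction, t):
    return [o + t * d for o, d in zip(origin, direction)]

def surface_hit(f, origin, direction, t_max=4.0, steps=400):
    """Mesh-style query: distance along the ray to the zero level set."""
    dt = t_max / steps
    prev_t = 0.0
    prev_sdf, _ = f(*origin)
    for i in range(1, steps + 1):
        t = i * dt
        sdf, _ = f(*point_on_ray(origin, direction, t))
        if prev_sdf > 0 >= sdf:
            # Linear interpolation between the two samples around the crossing.
            return prev_t + dt * prev_sdf / (prev_sdf - sdf)
        prev_t, prev_sdf = t, sdf
    return None  # the ray never enters the shape

def volume_render(f, origin, direction, t_max=4.0, steps=400, density_inside=50.0):
    """Radiance-field-style query: alpha-composite color along the ray."""
    dt = t_max / steps
    transmittance = 1.0
    color = [0.0, 0.0, 0.0]
    for i in range(steps):
        t = (i + 0.5) * dt  # sample at the midpoint of each segment
        sdf, rgb = f(*point_on_ray(origin, direction, t))
        density = density_inside if sdf < 0 else 0.0
        alpha = 1.0 - math.exp(-density * dt)
        weight = transmittance * alpha
        color = [c + weight * ch for c, ch in zip(color, rgb)]
        transmittance *= 1.0 - alpha
    return color, 1.0 - transmittance  # accumulated color and opacity

shape = make_implicit(radius=1.0, rgb=(1.0, 0.5, 0.0))
camera, forward = (0.0, 0.0, -2.0), (0.0, 0.0, 1.0)
hit_t = surface_hit(shape, camera, forward)
pixel, opacity = volume_render(shape, camera, forward)
```

Both queries read the same underlying function, which is why a single generated set of parameters can be exported either way.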

As with video generation, the output quality of 3D generation still lags behind image generation. I expect this to change by the end of this year.