Comments on What computational resources are needed to train GPT-NeoX?

Post

What computational resources are needed to train GPT-NeoX?

+3
−0

GPT-NeoX is provided as open source software which you can train yourself: https://github.com/EleutherAI/gpt-neox

What hardware is needed to train this model within a reasonable timeframe? Ideally a few hours, but I think up to a month is reasonable.


1 comment thread

Could you be a little more specific about what a reasonable timeframe is? (2 comments)
Monica Cellio wrote over 1 year ago

Could you be a little more specific about what a reasonable timeframe is?

matthewsnyder wrote over 1 year ago

That's hard to say. I'm not actually very experienced with such models. The relationship between training time and computational resources can be complicated, because it depends on things like the specific hardware architecture and the amount of memory available.

I would say that anything up to a few days or even weeks is reasonable. I am trying to avoid trivial, uninteresting solutions like "well you can train it on a budget CPU from 10 years ago, if you don't mind waiting 50 years, heh heh".

I'll edit the question to give an arbitrary timeframe.
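
The time-versus-hardware trade-off discussed in this thread can be roughed out with a back-of-envelope calculation. The sketch below is not taken from the GPT-NeoX repository. It assumes two widely used rules of thumb: training a dense transformer costs roughly 6 × parameters × tokens floating-point operations, and mixed-precision Adam needs roughly 16 bytes of memory per parameter before activations. The function names, the default utilization, and every number in the example are hypothetical placeholders to be replaced with your own model size, dataset size, and accelerator specifications.

```python
SECONDS_PER_DAY = 86_400


def estimate_training_days(n_params, n_tokens, n_devices,
                           peak_flops_per_device, utilization=0.4):
    """Rough wall-clock training time in days.

    Assumes total compute ~= 6 * n_params * n_tokens FLOPs (rule of thumb)
    and that each device sustains `utilization` of its peak throughput.
    """
    if min(n_params, n_tokens, n_devices, peak_flops_per_device) <= 0:
        raise ValueError("all sizes and rates must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    total_flops = 6.0 * n_params * n_tokens
    sustained_flops_per_sec = n_devices * peak_flops_per_device * utilization
    return total_flops / sustained_flops_per_sec / SECONDS_PER_DAY


def estimate_model_state_gb(n_params, bytes_per_param=16):
    """Rough memory for weights, gradients, and optimizer state, in GB.

    Assumes ~16 bytes per parameter (mixed-precision Adam); activations
    and framework overhead are NOT included.
    """
    if n_params <= 0:
        raise ValueError("n_params must be positive")
    return n_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # Hypothetical inputs: a 20-billion-parameter model, 400 billion tokens,
    # 96 accelerators each with a 312 TFLOP/s peak.
    days = estimate_training_days(20e9, 400e9, 96, 312e12)
    memory_gb = estimate_model_state_gb(20e9)
    print(f"approx. {days:.0f} days, approx. {memory_gb:.0f} GB of model state")
```

Because the estimate is linear in device count, doubling the number of accelerators halves the projected time. In practice communication overhead and memory limits make the real curve less favorable, which is the complication the comment above describes.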