No, the model is useful without the dataset, but it's not functionally "open source": while you can fine-tune it if you have the training code, you can't replicate it or, more importantly, train it from scratch with a modified (but not completely new) dataset. (Also, understanding the existing training data helps you understand how to structure data for that particular model, whether you're training from scratch on a new or modified dataset, or fine-tuning.)
For various industry-specific or specialized task models (e.g. recognizing dangerous events in a self-driving scenario), having appropriate data is often the big secret sauce. For the specific case of LLMs, however, sufficiently large datasets are publicly available, and even the specific RLHF adaptations aren't a limiting secret sauce, because there are techniques to extract them from the available commercial models.
I thought the big secret sauce was the data sources used to train the models. Without those, the model itself is quite literally useless.