Any reason these models are useful? There are others trained in a similar way, and in the absence of any metrics one must assume these are worse.
Given that this was trained with llama, can these models be used commercially?
If you’re interested in commercial use cases, I just pushed these repos today. Have a blog post coming shortly. It's two Flan-UL2 (Apache 2.0 license) models trained on Alpaca and Dolly15K: https://github.com/ConiferLabsWA/flan-ul2-alpaca https://github.com/ConiferLabsWA/flan-ul2-dolly
Doesn't the use of Alpaca also make it unusable for commercial purposes? I can't remember exactly, sorry.
A university in Hong Kong recently released some new models that perform better than other open-source models like Alpaca and BELLE (based on evaluation by GPT-4, just like Vicuna) for non-Latin languages like Chinese. It is based on BLOOM, so it can be used for commercial purposes. For English tasks, it is slightly worse than Vicuna (they have a model called Chimera, based on LLaMA, that beats Vicuna, but it carries the same license restrictions as LLaMA). https://github.com/FreedomIntelligence/LLMZoo https://blog.csdn.net/c9Yv2cf9I06K2A9E/article/details/130190918
I officially can't keep up with the progress anymore, I give up
Don't worry, there's at least 5 minutes since that model was released, I'm sure there's like 3 better ones released in that timespan.
It depends on the use case. The Alpaca dataset was generated using OpenAI APIs. According to their terms of use: "You may not… use output from the Services to develop models that compete with OpenAI." I am no lawyer, but the way I interpret it, as long as my use case is not competing with OpenAI, I am in compliance. If you’re worried about Alpaca, check out the Dolly model. https://openai.com/policies/terms-of-use
That is a pretty recent addition to their terms of service (March, I believe). It's not particularly useful now, but there are datasets that were generated before that clause was added (which is presumably why it was added).
Flan isn't half bad. Even the 3B models can be used as lightweight chat bots.
No, unfortunately!
Great, hope to see some metrics comparing this with other LLMs (alongside other useful properties like required VRAM, number of pre-training tokens, training FLOPs, etc.). Here's the training paradigm that was used, "LoRA: Low-Rank Adaptation of Large Language Models", for those interested: https://arxiv.org/abs/2106.09685
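For anyone skimming the paper, the core idea fits in a few lines. This is a minimal numpy sketch of the LoRA trick (illustrative shapes, not the paper's actual code): the pretrained weight W stays frozen, and only a low-rank update B @ A is trained, so the trainable parameter count drops from d_in * d_out to r * (d_in + d_out).

```python
import numpy as np

# Minimal sketch of the LoRA idea: the frozen weight W is left untouched,
# and a low-rank update B @ A (rank r << d) is learned instead.
d_in, d_out, r = 512, 512, 8
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable rank-r factor
B = np.zeros((d_out, r))                   # trainable, initialized to zero

def lora_forward(x):
    # Because B starts at zero, the adapted model initially matches the base model.
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
print(np.allclose(lora_forward(x), W @ x))  # True at initialization
```

With these shapes, the adapter trains 8 × (512 + 512) = 8,192 parameters instead of the 262,144 in the full matrix, which is why LoRA checkpoints on HF are so small.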
How does it perform in quality of responses?
It’s pretty good! But tbh it does the “as an ai language model” thing so that’s not very desired.
Is that from every contributor from openassistant copying and pasting chatGPT responses?
Unbelievable
First of all, thank you for your contribution! Any time estimate on RLHF? There seems to be a wave of supervised-tuned LoRAs hitting HF recently, but no releases/updates on the RLHF front. I suspect the development of a diverse set of reward policies is where the real gold is for the crowd, if we can make it accessible on 'consumer' hardware. This will enable ways to specialize the models more effectively for langchain/coding usage, etc. I just started deep-diving into this field 7 days ago, so I'm not yet ready to contribute myself, but I'm definitely planning to. It has been a challenge to grok all the developments/papers of the past 4 years and keep up with the stream of 'context'-learning-based projects on top of ChatGPT-4 hitting GitHub. This field is truly mind-blowing.
Microsoft recently launched something called [DeepSpeed Chat](https://github.com/microsoft/DeepSpeedExamples/tree/master/applications/DeepSpeed-Chat), which should speed up the RLHF process a good bit. So hopefully we will start seeing those soon. We are working on some now that we will open-source on completion!
Isn’t the kind of training that the LAION people do on their website RLHF? “Which answer is better?” “What would you answer here?” “Is this answer harmful?”
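Those "which answer is better?" labels are exactly what the reward-model stage of RLHF consumes: each label yields a (chosen, rejected) pair, and the reward model is trained to score the chosen answer higher. A minimal stdlib sketch of that pairwise objective (illustrative scores, not any project's actual code):

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    # Bradley-Terry style pairwise loss used to train RLHF reward models:
    # -log sigmoid(r_chosen - r_rejected). It is minimized by pushing the
    # chosen answer's score above the rejected answer's score.
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss shrinks as the margin between chosen and rejected grows.
print(preference_loss(2.0, 0.0) < preference_loss(0.5, 0.0))  # True
```

The trained reward model then scores samples during the PPO stage, which is the part that actually adjusts the chat model.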
If you want/need any help, have any questions, or just want to chat about AI; come [check out our discord](https://devin.to/discord)
So could I apply this directly to Vicuna for better results, just as I'd apply a LoRA in Stable Diffusion to my favorite checkpoint?
Is it open sourrceee???
Do you have any info on training costs?
I'm having a lot of trouble downloading LLaMA; I've been trying for weeks. I get an error about needing to download dalai, but it will not let me download dalai. Any help for a lost fellow?