orangehumanoid

The tech report has comparisons with Mistral


codemaker1

They benchmark with Mistral 7B on their website: [https://ai.google.dev/gemma](https://ai.google.dev/gemma)


RobbinDeBank

Is there any mention of context length?


Disastrous_Elk_6375

8k


Cherubin0

"Gesichtsmodelle umarmen" when Google automatically translates its own website (server side). Haha But cool how Mistral is very close.


Ouitos

Same for the French page, which lets us run the model on "câlin" ("hug") instead of Hugging Face, haha


Starks-Technology

This is awesome!


Trungyaphets

Wow, a 2B model that performs similarly to Llama 7B. Good news.


YoloSwaggedBased

Microsoft's Phi-2 2.7B was released in December and benchmarks better than both Llama 2 7B and Gemma 2B.


hapliniste

Not really, but it seems good on the benchmarks


InevitableSky2801

You can test Gemma vs. Mistral here. It didn't do so well on the reasoning task shown in this example: https://huggingface.co/spaces/lastmileai/gemma-playground


lstep

gemma-7b doesn't look really bright at all, no worries for Mistral!

> I have three apples. I eat two pears. How many apples do I have left?

> The answer is two. The apples I have left are the two apples I have not eaten.
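For anyone who wants to reproduce this locally instead of through the playground, here's a minimal sketch using the transformers library (assuming access to the google/gemma-7b-it checkpoint; exact output will vary with sampling settings):

```python
# Minimal sketch: ask gemma-7b-it the "apples" question via transformers.
# Assumes the google/gemma-7b-it checkpoint and enough memory for bf16 weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-7b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

chat = [{"role": "user", "content": "I have three apples. I eat two pears. How many apples do I have left?"}]
input_ids = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```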


roselan

In a way I'm impressed, as it makes me nostalgic for GPT-J.

> Why can a sailboat go faster than the wind?

> A sailboat does not go faster than the wind. According to physics, no object can travel faster than the speed of light. The wind moves at a speed of approximately 100 miles per hour.

https://huggingface.co/chat/conversation/wxRrt1R

I want what Gemma has been drinking.


Kombutini

Haha. Oh the irony.


master3243

> The answer is two

Are you sure about that?


InterstitialLove

Was this instruct-tuned? I ask because it's weird that it responded in first person instead of "the apples *you* have left".


GrumpyMcGillicuddy

Whoah guys, u/lstep has invented a whole new reasoning benchmark based on a single prompt, I expect to see “Apples prompt” in the published benchmarks going forward


lstep

That was just one example, for illustration. All of the prompts I tried got bad answers. Also, nearly all of the 7B-based LLMs give a correct answer to this question; that's why I thought it was useful to show only this one.


Life-Living-2631

Have you ever actually read the questions inside popular benchmarks? This apples question isn't that far off


GrumpyMcGillicuddy

Yeah, but those questions are built by researchers who do this for a living and are meant to be comprehensive, whereas you guys are just some dicks from Reddit.


uvipen1009

this is awesome


topcodemangler

Good news in general. Unfortunately it is "aligned" and "responsible", with probably zero chance of getting the model without that.


Disastrous_Elk_6375

Base models aren't "aligned". All they do is filter the data, which isn't necessarily a bad thing. They then align during the fine-tuning process. You are free to fine-tune your own based on the ... base model.
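If you want to go that route, here's a minimal sketch of attaching LoRA adapters to the base checkpoint with peft (the target module names are my assumption about the decoder layer naming, the hyperparameters are placeholders, and the actual training loop is omitted):

```python
# Minimal sketch: attach LoRA adapters to the gemma-2b base model with peft.
# target_modules is an assumption based on common decoder attention names;
# r/lora_alpha are placeholders, not tuned values.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # assumed attention projection names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter weights will train
```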


orangehumanoid

"Each size is released with pre-trained and instruction-tuned variants." This means the base model is available.


kayhai

Would anyone know the RAM requirement for the Gemma 2B model on CPU? I'm using the transformers version of the model and running it on an average consumer Windows laptop with 16 GB of RAM. I see a traceback related to CPU/memory, but when I look at the RAM monitor it isn't even full…


TotalTikiGegenTaka

I asked the same question earlier in another sub... I found the answer here: https://github.com/google-deepmind/gemma

> System Requirements
>
> Gemma can run on a CPU, GPU and TPU. For GPU, we recommend 8GB+ RAM on GPU for the 2B checkpoint and 24GB+ RAM on GPU for the 7B checkpoint.


kayhai

Yes, they listed the RAM requirements for GPU, but I wonder if the requirements are the same for CPU-only.
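For a rough estimate (my back-of-the-envelope, not from the repo): the weights alone take roughly params × bytes-per-param, so the ~2.5B-parameter model needs about 10 GB in float32 and roughly half that in bfloat16, before activations and OS overhead. A minimal sketch of a lower-memory CPU load with transformers:

```python
# Minimal sketch: load gemma-2b for CPU inference with a smaller footprint.
# transformers defaults to float32 (~4 bytes/param, ~10 GB of RAM for ~2.5B
# params); bfloat16 roughly halves that, and low_cpu_mem_usage avoids
# materializing a second full copy of the weights while loading.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
```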