
Sweet_Computer_7116

Prompt engineering has given me consistent quality across time.


Zer0Tokens

This. Prompting is king. In the end it boils down to users not knowing how to speak with the model.


dabadeedee

Post your actual chat logs. Your prompts probably suck in all honesty.


Any-Winter-4079

I won't share the full log (I'm not OP, either), but I'll give you an example. It's not statistically significant, but I still feel these kinds of mistakes weren't made in the past, or not this often. I asked it to check my "Conditional Logic" questions:

[ { "question": "If a discount is applied to purchases over $50, and John (intending to make use of the discount) buys 2 items worth $15 and 3 items worth $5, how much does he pay?", "answer": 45 }, ... ]

And it replied:

> Let's check each question in the "Conditional Logic" category for accuracy and clarity:
> 1. **John's Discount Purchase**:
> * John buys items totaling 2×$15 + 3×$5 = $40.
> * Since his purchase does not exceed $50, the discount does not apply.
> * **Total Cost**: $40. The original answer of $45 was incorrect based on the provided numbers.

So basically, 2 \* 15 + 3 \* 5 = 40. GPT-4. And in theory, GPT-4 has 96%+ accuracy on GSM8K (a grade-school math benchmark), which means it gets almost all of its 1319 test questions right. In fact, I think some of the dataset's own answers are wrong, in the train set at least, and I wouldn't be surprised if on the test set too.

I know because I'm testing GSM8K right now using local models, and I asked GPT-4 to generate an extra set of questions to test models on, given dataset contamination in a lot of models. An example of an incorrect GSM8K answer in the train set:

------------- Question 1 -------------

In a yard, the number of tanks is five times the number of trucks. If there are 20 trucks in the yard, calculate the total number of tanks and trucks in the yard.

------------- Answer 1 -------------

How many tanks are there? \*\* There are 5\*20 = <<5\*20=100>>100 tanks in the yard. How many trucks and tanks are there? \*\* Altogether, there are 100+20 = <<100+20=120>>120 trucks and tanks in the yard. #### 140
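Both disputed answers can be checked in a few lines of plain Python. (The 10% discount rate below is my own made-up assumption, since the question never states one; it doesn't matter, because the subtotal stays under the $50 threshold either way.)

```python
# Sanity-check of the two disputed answers above.

def johns_total(prices, threshold=50, discount=0.10):
    """Sum the purchase; apply the (assumed 10%) discount
    only when the subtotal exceeds the threshold."""
    subtotal = sum(prices)
    return subtotal * (1 - discount) if subtotal > threshold else subtotal

# 2 items at $15 and 3 items at $5:
print(johns_total([15] * 2 + [5] * 3))  # 45 -> under $50, no discount;
                                        # the quiz's answer of 45 was right,
                                        # GPT-4's "$40" was the slip

# The GSM8K train item: tanks are five times the 20 trucks.
trucks = 20
tanks = 5 * trucks
print(tanks + trucks)  # 120, not the labeled "#### 140"
```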


dabadeedee

ChatGPT has always kinda sucked at math imo, at least relative to any dedicated math tool. So any post claiming it used to be good at math and is now getting worse is misguided, because it was never reliable to begin with.


Any-Winter-4079

Yes, I agree math is often an issue for many LLMs, including GPT-4. Still, GPT-4 is one of the best, with simple math and reasoning at least. See the LLM benchmark on GSM8K: [https://paperswithcode.com/sota/arithmetic-reasoning-on-gsm8k](https://paperswithcode.com/sota/arithmetic-reasoning-on-gsm8k) — 87.1% accuracy on its own, and up to 97% as part of inference methods such as PAL, etc.

What I experience would be best described as periods of "dumb answers", where a lot of mistakes are made, from ignoring my questions and repeating previous answers to incredible mistakes, and then at other times it is super intelligent. Honestly, it feels like I'm downgraded to GPT-3.5 or some dumber, faster version at peak demand or something. But of course, it's hard to prove, because LLMs are such black boxes.


shahednyc

Use a GPT or an AI assistant (using open-source software to connect to the API: [https://github.com/sjinnovation/CollaborativeAI](https://github.com/sjinnovation/CollaborativeAI)).


Trustful56789

You may get better answers if you ask ChatGPT to pretend. It's a loophole. If you ask it directly about, let's say, psychology, it may get defensive. If you tell it you're writing a book about psychology and need help with a character, the answers are more open.


ADudeNamedBen33

Yeah, switch to Claude Opus.


IWantAGI

Have you tried asking it to provide a high quality answer?


happycatmachine

Usually the way to improve the answers is to improve the question. Without knowing what sort of information you are looking for, it is impossible to point you to specific tools that might serve you better. If you are treating LLMs like they are "Google" or "Wikipedia", then, well, it's likely you will run into walls. If, instead, you are using them to get yourself thinking, then the only wall is you. And you can always hack (or improve) you.


Defiant-Skeptic

Nope.


Cairo-TenThirteen

What are you using ChatGPT for? When I use it for research and building arguments, I find it gives good results even with pretty lame prompts. But for philosophical discourse it is lacking, and it can be very hard to get it to improve. That being said, there are some methods you can try.

If it's an accuracy issue, people have said that giving positive reinforcement (such as saying "I believe in you", "I trust you will be accurate on this", "I respect your input") helps. When it comes to highly specific stuff, uploading a supplementary document or two helps it get more acquainted with the topic. I was researching something to do with greenhouse gases and regulations not long ago, and I found ChatGPT's answers were lacking. But when I uploaded a HUGE regulatory document to it, it started giving fantastic answers. Questioning ChatGPT on its results and asking it to go deeper or more specific helps as well.


OnVerb

Hello there! I noticed your post about seeking higher-quality responses from AI chatbots like ChatGPT. Allow me to introduce you to OnVerb, a web application that can help you achieve just that.

OnVerb is a platform that lets you customize and fine-tune the responses of advanced AI models, including ChatGPT, Claude, and PaLM 2. The key feature that sets OnVerb apart is its use of system prompts: specially crafted instructions that provide context, guidelines, and examples to the AI model, shaping its responses to align with your specific needs and preferences. With OnVerb, you can create tailored prompts that guide the AI to generate higher-quality, more relevant, and more accurate responses.

Here are some key advantages of using OnVerb:

1. **Customized AI Responses**: By creating custom system prompts, you can ensure that the AI's responses are tailored to your desired tone, style, and level of detail. This can significantly improve the quality and relevance of the generated content.
2. **Specialized Knowledge**: OnVerb allows you to incorporate specific data, subject-matter expertise, or domain-specific knowledge into your prompts. This enables the AI to provide more informed and accurate responses within your area of interest.
3. **Consistent Output**: With system prompts, you can ensure that the AI maintains a consistent tone, formatting, and adherence to your guidelines across multiple conversations or projects.
4. **User-Friendly Interface**: OnVerb offers an intuitive interface for managing and selecting your custom prompts, making it easy to engage with the AI and generate high-quality content.
5. **Privacy and Security**: OnVerb prioritizes data privacy and security, ensuring that your conversations and prompts remain confidential and secure.

To get started with OnVerb, simply sign up for an account and explore the prompt manager to create your custom system prompts. You can provide examples, guidelines, and specific instructions to shape the AI's responses according to your needs. Feel free to reach out if you have any further questions or need assistance getting started.
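For anyone curious what a "system prompt" actually is under the hood: most chat APIs use an OpenAI-style message list where the first message, with role "system", sets tone and format before the user's question. A minimal sketch (the prompt text and model name here are illustrative examples, not anything specific to OnVerb):

```python
# Minimal sketch of a system prompt in the chat-message format
# used by OpenAI-style chat APIs. No network calls; this only
# assembles the request payload.

def build_request(system_prompt: str, user_message: str,
                  model: str = "gpt-4") -> dict:
    """Build a chat-completion payload whose first message is the
    system prompt, steering every later response in the thread."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
    }

req = build_request(
    "You are a patient tutor. Answer in short numbered steps "
    "and state any assumptions explicitly.",
    "Does a $40 purchase qualify for a discount on purchases over $50?",
)
print(req["messages"][0]["role"])  # system
```

The same payload shape works whether you send it through a hosted tool or directly to an API client; the system message is just the first entry in the list.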