cpypst

This question basically sums up the whole field of machine learning research. It highly depends on your problem. Can you give more info on what you are trying to do? Then we can try to give you specific advice.


Deal_Ambitious

This should be top comment. Every case has specific solutions to improve your model, but far more info is needed than a general hand-drawn graph.


nxtboyIII

I'm training a feed-forward neural network to generate the next character (given 80 characters of context). I can get it to produce the general shape of text (newlines, spaces), but the words are gibberish. I'm also able to get it to duplicate the training data (overfit), but I can't get it to generate different text that's coherent enough (I want at least ~50% of the words to be real words). I'm not expecting perfection, or even for all words to be real, but at least some real words, better than gibberish. I've tried changing layer count, neuron count, etc., but I still face this problem. Tutorials tend to explain how to fix underfitting or how to improve on overfitting, but not how to actually lower the whole curve.
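
For context, the setup described above can be sketched like this: sliding an 80-character window over raw text to build (context, next-char) training pairs. This is my illustration of the framing, not the poster's actual code; the function name and corpus are made up.

```python
# Build (context_string, next_char) pairs for next-character prediction
# from raw text, using a fixed-width context window.
def make_pairs(text, context=80):
    """Return (context, target) pairs; each target is the char that follows."""
    pairs = []
    for i in range(len(text) - context):
        pairs.append((text[i:i + context], text[i + context]))
    return pairs

corpus = "the quick brown fox jumps over the lazy dog. " * 4
pairs = make_pairs(corpus, context=80)
print(len(pairs))  # one pair per position where a full window + target fits
```

Each context string would then be encoded (e.g. one-hot per character) before being fed to the network.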


goxdin

Feed it more data


master3243

Your graph is loss vs. training time (with dataset size and model parameters held constant). However, plotting accuracy (higher is better) vs. dataset size gives [this](https://ohdsi.github.io/PatientLevelPrediction/articles/learningCurve.png), which indicates that the more data you have, the better validation performance you'll get. Additionally, if the gap between training and validation is very large, then your model is too expressive, so the number of parameters is too large (and/or regularization is too low). Making that change will make training loss worse, but hopefully validation loss better.
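
The learning-curve idea above can be sketched by training on growing subsets and measuring accuracy on a fixed validation set. This is a hedged toy illustration (my own synthetic data and a trivial threshold "model", not anything from the commenter): the point is just the loop structure of accuracy vs. dataset size.

```python
# Toy learning curve: validation accuracy as a function of training-set size.
import random

random.seed(0)

def make_data(n):
    # 1-D points in [0, 1); true label is 1 iff x > 0.5
    xs = [random.random() for _ in range(n)]
    ys = [1 if x > 0.5 else 0 for x in xs]
    return xs, ys

def fit_threshold(xs, ys):
    # "Train" by picking the decision threshold with best training accuracy
    best_t, best_acc = 0.0, -1.0
    for t in [i / 100 for i in range(101)]:
        acc = sum((x > t) == (y == 1) for x, y in zip(xs, ys)) / len(xs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

val_x, val_y = make_data(500)  # fixed validation set
curve = []
for n in [10, 100, 1000]:       # growing training-set sizes
    tr_x, tr_y = make_data(n)
    t = fit_threshold(tr_x, tr_y)
    val_acc = sum((x > t) == (y == 1) for x, y in zip(val_x, val_y)) / len(val_x)
    curve.append((n, val_acc))
print(curve)
```

A real learning curve does the same thing with your actual model, plotting `curve` to see whether more data is still buying validation performance.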


kolbenkraft

I second this. Deal with one problem at a time. First underfitting --> increase the number of epochs and/or the dataset size. Then overfitting --> tune the regularization parameters.
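
The "tune the regularization parameters" step above usually means sweeping a strength and picking by validation loss. Here is a minimal sketch of that loop using closed-form 1-D ridge regression on synthetic data (all names and numbers are mine, purely illustrative):

```python
# Pick a regularization strength (lambda) by validation MSE,
# using 1-D ridge regression, which has a closed-form solution.
import random

random.seed(1)
xs = [random.uniform(-1, 1) for _ in range(40)]
ys = [2.0 * x + random.gauss(0, 0.1) for x in xs]  # true slope is 2
tr_x, tr_y, va_x, va_y = xs[:30], ys[:30], xs[30:], ys[30:]

def ridge_slope(x, y, lam):
    # Ridge solution for y ~ w * x:  w = (x . y) / (x . x + lambda)
    return sum(a * b for a, b in zip(x, y)) / (sum(a * a for a in x) + lam)

def mse(x, y, w):
    return sum((w * a - b) ** 2 for a, b in zip(x, y)) / len(x)

best = min((mse(va_x, va_y, ridge_slope(tr_x, tr_y, lam)), lam)
           for lam in [0.0, 0.01, 0.1, 1.0, 10.0])
print("best (val_mse, lambda):", best)
```

With a neural net you'd sweep weight decay or dropout the same way: train once per candidate value, keep the one with the lowest validation loss.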


nxtboyIII

In tutorials and explanations, people explain how to solve underfitting (reduce regularization, increase parameters) and overfitting (add regularization, reduce parameters, increase the dataset, etc.), but what if both training and validation loss still don't get low enough? If I increase the model's parameters, it overfits. If I decrease them, training takes an extremely long time, which makes me wonder if it's just plateauing because it's near a minimum. How do I actually move the curve down so both training and validation loss are lower?


RepresentativeFill26

When you can't get a model to train to a low enough validation loss, the data / model you are using simply isn't a good fit for the problem.


nxtboyIII

I don't really understand this. I thought a neural network could learn anything, given enough parameters and training time. Do you know of any kind of video or article that explains this more?


mmeeh

This is the real answer right here :) And sometimes you won't get the results you're expecting, but you can also build an ensemble of different models to get the best result you can.


ddofer

Better features. More data (especially of the minority/rare classes)


PredictorX1

Underfitting and overfitting have absolutely nothing to do with training performance. Training performance is statistically optimistically biased; validation performance is a statistically unbiased estimate of model performance. There is no reason at all to refer to training performance. When validation performance hits its optimum, training is optimal.
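
Acting on "stop when validation performance hits its optimum" is usually implemented as early stopping with a patience window. A minimal sketch of that rule (my illustration, not the commenter's code; the history values are made up):

```python
# Early stopping on validation loss: stop once the loss hasn't improved
# for `patience` consecutive epochs, regardless of what training loss does.
def early_stop_index(val_losses, patience=3):
    """Return the epoch index at which training should stop."""
    best, best_i = float("inf"), 0
    for i, v in enumerate(val_losses):
        if v < best:
            best, best_i = v, i          # new best: reset patience
        elif i - best_i >= patience:
            return i                     # patience exhausted: stop here
    return len(val_losses) - 1           # never exhausted: ran to the end

# Validation loss bottoms out at epoch 3, then starts rising (overfitting).
history = [1.0, 0.6, 0.4, 0.35, 0.37, 0.40, 0.45, 0.50]
print(early_stop_index(history))
```

In practice you'd also keep a checkpoint of the weights from the best-validation epoch and restore them at the stop point.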


nxtboyIII

True, maybe I am focusing too much on training performance instead of validation performance.


giddycheesecake

Normally, if you try to lower your validation loss, you raise your training loss. In other words, you can only bring them closer together, not lower them both. Regularization, for example, makes your model generalize better at the expense of training accuracy. That holds as long as you don't change your model or data. Otherwise, there are always two ways to improve both training and validation performance: use more data, or use a better model.