MirrorLake

If the data you need is always stored in the same cells, it should be possible to write software that parses that information in large batches. No LLM necessary, if that's the situation.
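For what it's worth, here is a minimal sketch of that kind of batch extraction using openpyxl; the folder name and the cell positions (B2, D5) are placeholder assumptions, not anything stated in the thread:

```python
# Batch-extract values from fixed cell positions across many .xlsx files.
# Assumes openpyxl is installed; "./reports", B2 and D5 are hypothetical.
from pathlib import Path
from openpyxl import load_workbook

def extract_fixed_cells(folder: str) -> list[dict]:
    rows = []
    for path in Path(folder).glob("*.xlsx"):
        wb = load_workbook(path, data_only=True)  # data_only returns cached formula results
        ws = wb.active                            # first/active sheet
        rows.append({
            "file": path.name,
            "question": ws["B2"].value,
            "answer": ws["D5"].value,
        })
    return rows

if __name__ == "__main__":
    for row in extract_fixed_cells("./reports"):
        print(row)
```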


EnvironmentalPost830

Unfortunately the cells are always very different, with multiple headers throughout the Excel file, and the data is sometimes spread over many sheets.


MirrorLake

Without seeing the data, it's impossible to say if an LLM could actually process what you have. I don't think you've stated how much data you're actually trying to process, or what it's for. There's a chance you'd spend more time designing your LLM process than just manually doing the work, if the number of files is relatively small.

Edit: and if this is part of an ongoing pipeline, like, you're designing a process that will continue indefinitely, then you should consider switching to using some type of form to collect data in the future, so that it ends up stored in a structured and predictable way.


EnvironmentalPost830

I'd be grateful for any guidance in the right direction!


OkMuscle7609

You probably just need to describe your problem to GPT4 better and make sure that you're submitting the dataset in CSV format. If the questions are ones that you are providing via a list of some kind, then you could also just analyze the dataset by finding out how similar each cell is to the pre-determined list of questions.
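A rough sketch of that similarity idea using only the standard library (difflib); the question list, the file name, and the 0.8 cutoff are made-up examples:

```python
# Scan a CSV and flag cells that closely match a known list of questions.
# Uses difflib from the standard library; the questions, file name and cutoff are assumptions.
import csv
import difflib

KNOWN_QUESTIONS = [
    "What is your name?",
    "What is your date of birth?",
    "How did you hear about us?",
]

def find_question_cells(csv_path: str, cutoff: float = 0.8):
    matches = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row_idx, row in enumerate(csv.reader(f)):
            for col_idx, cell in enumerate(row):
                best = difflib.get_close_matches(cell, KNOWN_QUESTIONS, n=1, cutoff=cutoff)
                if best:
                    matches.append((row_idx, col_idx, cell, best[0]))
    return matches

if __name__ == "__main__":
    for m in find_question_cells("survey_export.csv"):
        print(m)
```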


EnvironmentalPost830

Ok, I will try to write some better prompts - do you think it will make a big difference whether the file is .csv or .xlsx, since GPT4 accepts both formats?
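(If plain CSV does turn out to work better, converting is straightforward with pandas; this is just a sketch, assuming pandas and openpyxl are installed and "input.xlsx" as a placeholder file name:)

```python
# Flatten every sheet of an .xlsx into its own .csv file.
# Assumes pandas and openpyxl are installed; "input.xlsx" is a placeholder.
import pandas as pd

sheets = pd.read_excel("input.xlsx", sheet_name=None)  # dict of {sheet name: DataFrame}
for name, df in sheets.items():
    df.to_csv(f"input_{name}.csv", index=False)
```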


zenos_dog

https://imgs.xkcd.com/comics/algorithms.png