In his Art of Doing Science and Engineering, Hamming warns of the difficulty of obtaining reliable data: in his experience, despite all assurances given, there were always significant errors in collection (above and beyond ordinary noise) in the datasets he encountered.
One example he gave was an important accounting rules change, which (he says) affects the correct interpretation of balance sheet for a large portion of major American companies at the time.
He expected the financial papers to say something about the change affects analysts’ analysis of the companies, but instead nothing. No one seems to have checked where the financial data came from.
Perhaps this is why the Medallion fund won by jettisoning all human-understandable patterns, and rely on algorithms to find shifting patterns in the data. No human/machine truly understand the mountains of wobbly data generated in real time; better to exploit any regularities that arise from time to time.