Speaking to Proof News, Dave Farina, the host of “Professor Dave Explains,” put it this way: “If you’re profiting off of work that I’ve done [to build a product] that will put me out of work or people like me out of work, then there needs to be a conversation on the table about compensation or some kind of regulation.”
To some extent, the focus on YouTube data distracts from that critical argument, which is that the generative AI (genAI) tools coming into common use today are likely to have been trained on information created by humans and shared online. That’s the kind of information picked up by web crawlers, including Apple’s.
But data quality is a real issue here. AI developers are searching for the best data they can find, and the highest-quality sources are the highest-octane fuel for training AI.