Generative AI may seem like magic, but behind the development of these systems are employees at companies like Google and OpenAI, known as “prompt engineers” and analysts, who rate the accuracy of chatbots' outputs to improve their AI.
But new internal guidance that Google passed to contractors working on Gemini, seen by TechCrunch, has raised concerns that Gemini may be prone to misinforming ordinary people on highly sensitive topics, like health.
To improve Gemini, contractors working with GlobalLogic, an outsourcing firm owned by Hitachi, are routinely asked to evaluate AI-generated responses based on factors such as “accuracy.”
Until recently, these contractors could “skip” certain prompts, and thus opt out of evaluating the AI-written responses to those prompts, if a prompt was too far outside their domain expertise. For example, a contractor could skip a prompt asking a niche question about cardiology because the contractor had no scientific background.
But last week, GlobalLogic announced a change handed down from Google: contractors are no longer allowed to skip such prompts, regardless of their expertise.
Internal company memos seen by TechCrunch show that the instructions previously read: “If you do not have the critical expertise (e.g. coding, math) to rate this prompt, please skip this task.”
However, the instructions now state: “You should not skip prompts that require specific domain knowledge.” Instead, contractors are told to “rate the parts of the request you understand” and add a note that they lack domain knowledge.
The change has raised direct concerns about Gemini's accuracy on certain topics, as contractors are sometimes tasked with evaluating highly technical AI responses on subjects, such as rare diseases, in which they have no background.
“I thought the point of skipping was to increase accuracy by giving it to someone better?” one contractor stated in internal memos seen by TechCrunch.
Under the new guidelines, contractors can now skip prompts in only two cases: if they are “completely missing information,” such as the full prompt or response, or if they contain harmful content that requires special approval forms to evaluate.
Google did not respond to TechCrunch's requests for comment by press time.