OpenAI, the company behind the chatbot ChatGPT, has ramped up its hiring around the world, bringing on roughly 1,000 remote contractors over the past six months in regions like Latin America and Eastern Europe, according to people familiar with the matter.
About 60% of the contractors were hired to do what's called "data labeling" — creating massive sets of images, audio clips, and other information that can then be used to train artificial intelligence tools or autonomous vehicles.
The other 40% are computer programmers who are creating data for OpenAI's models to learn software engineering tasks. OpenAI's existing Codex product, launched in Aug. 2021, is designed to translate natural language into code.
[...] Previously, OpenAI trained its models on code scraped from GitHub, a repository site owned by its largest investor, Microsoft, which last week confirmed billions of dollars in new funding first reported by Semafor. But in this case, OpenAI appears to be building a dataset that includes not just lines of code, but also the human explanations behind them written in natural language.
[...] Sam Altman, OpenAI's CEO, recently put the company's headcount at 375 people, a tiny number compared to the thousands of staff at tech giants like Google and Facebook working on artificial intelligence. "I know I'm not supposed to brag about OpenAI," he tweeted, touting the company's "talent density."
This summer, the artificial intelligence company OpenAI released Codex, a new system that automatically writes software code using only simple prompts written in plain language. Codex is based on GPT-3, a revolutionary deep learning platform that OpenAI trained on nearly all publicly available written text on the Internet through 2019.
As an early Beta tester, I've had extensive opportunities to put both GPT-3 and Codex through their paces. The most frequent question I'm asked about Codex is "Will this replace human programmers?" With world powers like the United States investing billions into training new software developers, it's natural to worry that all the effort and money could be for naught.
If you're a software developer yourself—or your company has spent tons of money hiring them—you can breathe easy. Codex won't replace human developers any time soon, though it may make them far more powerful, efficient, and focused.
Why isn't Codex an existential threat to human developers? Years ago, I worked with a high-level (and highly compensated) data scientist and software developer from a major American consulting firm on a government database project. Our task was to understand how a state agency was using its database to assign grants to organizations, and then to advise the agency on how to improve the database.
On Monday, AI tech darling OpenAI announced that it received a "multi-year, multi-billion dollar investment" from Microsoft, following previous investments in 2019 and 2021. While the two companies have not officially announced a dollar amount on the deal, the news follows rumors of a $10 billion investment that emerged two weeks ago.
[...] "The past three years of our partnership have been great," said Sam Altman, CEO of OpenAI, in a Microsoft news release. "Microsoft shares our values and we are excited to continue our independent research and work toward creating advanced AI that benefits everyone."
In particular, the two companies say they will work on supercomputing at scale to accelerate OpenAI's research, integrating OpenAI's technology into more Microsoft products and "digital experiences" and keeping Microsoft as OpenAI's exclusive cloud provider with Azure. "OpenAI has used this infrastructure to train its breakthrough models, which are now deployed in Azure to power category-defining AI products like GitHub Copilot, DALL·E 2, and ChatGPT," wrote Microsoft.
(Score: 2) by PiMuNu on Thursday February 09, @01:26PM (5 children)
> OpenAI's existing Codex product, launched in Aug. 2021, is designed to translate natural language into code.
This is not possible. Natural language does not adequately describe the problem domain. Information is missing and no amount of AI can fix that.
(Score: 2) by JoeMerchant on Thursday February 09, @01:28PM (1 child)
They're not even to that limit yet.
I've asked ChatGPT basic API questions, and it's pretty good, but on about 1 in 4 queries it will give an example that doesn't even compile.
Combine ChatGPT with an AlphaGo kind of game play engine (successful compile and test wins) and you've got something.
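The "successful compile and test wins" idea can be sketched roughly as below. This is only an illustration under stated assumptions, not how OpenAI or anyone else actually does it: the candidate strings are hypothetical stand-ins for model output, and `first_passing_candidate` and `add_test` are made-up names. It filters candidates by whether they compile and then by whether they pass a test.

```python
from typing import Callable, Iterable, Optional


def first_passing_candidate(
    candidates: Iterable[str],
    test: Callable[[dict], bool],
) -> Optional[str]:
    """Return the first candidate source that compiles and passes the test."""
    for source in candidates:
        try:
            # Step 1: "successful compile". Syntax errors are rejected here.
            code = compile(source, "<candidate>", "exec")
        except SyntaxError:
            continue
        namespace: dict = {}
        try:
            # Step 2: run the candidate so its definitions land in namespace.
            # NOTE: exec on untrusted code is unsafe; a real system would sandbox it.
            exec(code, namespace)
            # Step 3: "test wins". Keep the candidate only if the test passes.
            if test(namespace):
                return source
        except Exception:
            # Crashed while running or while being tested: discard it.
            continue
    return None


# Hypothetical candidates standing in for what a model might return.
CANDIDATES = [
    "def add(a, b) return a + b",        # does not compile
    "def add(a, b):\n    return a - b",  # compiles, wrong answer
    "def add(a, b):\n    return a + b",  # compiles and passes
]


def add_test(ns: dict) -> bool:
    """Hypothetical test: the candidate must define a working add()."""
    return ns["add"](2, 3) == 5


winner = first_passing_candidate(CANDIDATES, add_test)
```

Here `winner` ends up being the third candidate. A game-play style engine would go further and use the pass/fail signal to steer the next round of generation, which this sketch does not attempt.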
(Score: 0) by Anonymous Coward on Thursday February 09, @01:41PM
> ... it's pretty good, but about 1/4 queries it will give an example that doesn't even compile.
Interesting. I wonder, does this lead to a split in the future where "easy" programs proliferate because they can be "AI" written for next to nothing? Meanwhile, fewer "hard" programming projects get funded because they are perceived to cost too much--using the cheap stuff as a benchmark for what things "should" cost.
This has certainly happened with other products over the years--the cheap, high volume stuff drives the nice-but-expensive stuff out of the market. Take shoes, for example: when I was a kid (c.1960) there were very fine size gradations available to find something that really fit, and the odd foot bump that I had could be accommodated by stretching the leather right in the shoe store. These days you are hard pressed to find anything outside the standard "medium" width--looking really hard you might find "wide" and "narrow".
(Score: 2) by acid andy on Thursday February 09, @01:30PM
You're right. If they've got any sense they'll be feeding it pseudocode.
(Score: 2) by aafcac on Thursday February 09, @01:31PM
That's the same reason why math notation isn't typically done in natural language either. Even proper notation can be ambiguous at times.
(Score: 2) by crafoo on Thursday February 09, @01:36PM
Don't feel too bad. It's not the first time people yell at the sky, "that's not possible!" as a group of motivated, smart people do it.
You act like human-written "production quality" code isn't trash. Like the bar is all that high for AI. I wouldn't be surprised if it already codes better than college graduates.