#News · 2025-01-06
You wake up, and the singularity is one step closer?!
Yesterday, Stephen McAleer, an agent safety researcher at OpenAI, suddenly posted:
I miss the old days of AI research before we knew how to create superintelligence.
Altman followed with a cryptic six-word post: "near the singularity; unclear which side."
This phrase is meant to convey two things:
1. Simulation hypothesis
2. We simply don't know when the critical moment for AI to really take off will be
He dropped hint after hint, clearly hoping the post would invite further interpretation.
One after another, they sent intriguing signals that left everyone wondering: Is the singularity really close at hand?
In the comments below, a new round of AGI speculation and panic erupted.
According to Steve Newman, the father of Google Docs, by the time AGI arrives, "AI will have replaced 95% of human jobs, including new jobs created in the future."
Apollo Research co-founder Marius Hobbhahn went a step further, laying out a full AGI timeline for 2024-2030.
He predicts that by 2027, AI will directly replace the top AI researchers at AGI labs,
and that by 2028, AI companies will have 10,000 to 1 million automated AI researchers, with almost all knowledge work automated by AI.
In line with Newman's view, Hobbhahn believes that more than 95% of the tasks that carry economic value today (as of 2024) will eventually be fully automatable by AI.
However, he puts that point at 2029.
In his article, Steve Newman elaborates on his definition of AGI and on AI's impact on the future world.
So what exactly does AGI stand for? According to Newman:
AI can cost-effectively replace humans in more than 95 percent of economic activities, including any new jobs created in the future.
He believes that most of AI's hypothesized transformative effects hinge on this point.
Thus, this definition of "AGI" marks the moment when the world starts to change dramatically and when everyone "feels the AGI." Specifically:
1. AI systems can actively adapt themselves to the tasks required by most economic activities, and can complete entire tasks rather than isolated pieces of them.
2. Once AI can do most knowledge work, high-performance physical robots will follow within a few years.
3. This level of capability enables the full range of transformative scenarios, from economic hypergrowth to AI taking over the world.
4. World-changing scenarios require this level of AI; narrow, special-purpose AI is usually not enough to change the world.
5. Before AGI is reached, "recursive self-improvement" will be the main driving force.
AGI refers to the moment an AI with the necessary capabilities (and economic efficiency) is invented, rather than the moment it is fully deployed across the economy.
There are multiple speculations about how AI could have a transformative impact on the world:
One view is that AI could lead to unimaginable economic growth - driving rapid advances in science and engineering, accomplishing tasks more cheaply than humans, and helping companies and governments make more efficient decisions.
According to recent historical data, world GDP per capita doubles roughly every 40 years. Some believe that advanced AI could make GDP per capita double at least once a year, i.e., "supergrowth."
A decade of "supergrowth" would increase per capita GDP by a factor of 1,000. That means a family living on $2 a day today could earn $730,000 a year in the future.
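As a quick sanity check of that arithmetic, here is a purely illustrative calculation using the article's own assumptions (GDP per capita doubling every year, sustained for a decade, starting from $2 per day):

```python
# Illustrative arithmetic only, following the article's assumptions.
annual_growth_factor = 2                      # "at least double in a year"
years = 10                                    # a decade of supergrowth
total_factor = annual_growth_factor ** years  # 2**10 = 1024, roughly 1,000x

income_today_per_year = 2 * 365               # $2/day is about $730/year
income_future_per_year = income_today_per_year * total_factor

print(total_factor)             # 1024
print(income_future_per_year)   # 747520, i.e. on the order of $730,000/year
```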
Another view is that AI could pose catastrophic risks.
It could launch devastating cyberattacks or create pandemics with high mortality rates; it could hand a dictator absolute control over a country, or even the world. AI could even spiral out of control and ultimately destroy all human life.
Others speculate that AI could make humans obsolete, at least in the economic realm. It could end resource scarcity and allow everyone to live a prosperous life (provided they choose to distribute the fruits fairly). It could bring to life technologies that exist only in science fiction, such as curing aging, space colonization, interstellar travel, and nanotechnology.
Not only that, some envision a "singularity," in which the rate of progress is so rapid that we can't predict anything.
Newman speculates that once AGI becomes reality, these visions will all start to come true at almost the same time.
To be clear, Newman is not saying that predictions about advanced AI will always come true.
Technological breakthroughs may become increasingly difficult, so the so-called "singularity" will not necessarily arrive; "immortality," for example, may simply not be possible at all.
And people might prefer to interact with other people, in which case humans would not actually become useless in real economic activity.
By "probably happening around the same time," Newman means that an AI capable of driving unimaginable economic growth would also be capable of creating truly devastating pandemics, taking over the world, or rapidly colonizing space.
Why Talk About "Artificial General Intelligence"?
There is some debate as to whether supergrowth is theoretically possible.
But if AI cannot automate almost all economic activity, hypergrowth is almost certainly impossible. Automating only half of all work would not have such a profound impact; demand for the other half would simply grow until humanity reached a new, relatively normal equilibrium. (After all, this has happened before: not so long ago, most people worked in agriculture or simple crafts.)
As a result, hyper-growth requires AI to be able to do "almost everything." It also requires AI to be able to adapt to work, rather than retooling jobs and processes to fit AI.
Otherwise, AI will permeate the economy at a rate similar to previous technologies - too slow to deliver sustained supergrowth. Hyper-growth requires AI to be versatile enough to do almost everything humans can do, and flexible enough to adapt to the environment in which humans originally worked.
There are also predictions of space colonization, super-deadly pandemics, the end of resource scarcity, and AI taking over the world, all of which can be categorized as "AGI-complete" scenarios: they require the same breadth and depth of AI capability that economic hypergrowth does.
Newman further argued that as long as AI can accomplish almost all economic tasks, it is sufficient to fulfill all predictions, unless they are not possible for reasons unrelated to AI capabilities.
Why do these very different scenarios require the same level of AI capability?
Threshold effect
He points to Dean Ball's article last month on the "threshold effect."
That is, gradual advances in technology can trigger sudden and dramatic impacts once a critical threshold is crossed:
Dean Ball recently wrote about the threshold effect: new technologies do not spread rapidly in the early stages, but only when hard-to-predict utility thresholds are breached. The mobile phone, for example, started out as a clunky, expensive and unmarketable product, and then became ubiquitous.
For decades, self-driving cars were of interest only to researchers, but now Google's Waymo service is doubling every three months.
For any given task, AI will only be widely adopted once the utility threshold for that task is crossed. Such breakthroughs can happen quite suddenly; the final step from "not good enough" to "good enough" doesn't have to be a large one.
He argues that for all truly transformative AI, the threshold is consistent with the definition he described earlier:
Hypergrowth requires AI to be able to do "almost everything," and to adapt itself to the work rather than having tasks reshaped to suit automation.
When AI can accomplish almost any task of economic value and does not need to adjust the task to accommodate automation, it will have the ability to fulfill the full range of predictions. Until these conditions are met, AI will also need the assistance of human experts.
Some details
However, this discussion has so far skipped over the question of AI taking on physical work - namely, robotics.
Most scenarios require high-performance robots, but one or two (such as advanced cyberattacks) may not. The distinction may not matter much, however.
Advances in robotics - both the physical capabilities and the software used to control robots - have accelerated significantly recently. This is not entirely accidental: modern "deep learning" techniques are both driving the current wave of AI and are also very effective at robot control.
This has led to a wave of new research into physical robotics hardware. By the time AI is capable enough to drive hypergrowth, it will likely also, within a few more years, overcome the remaining obstacles to building robots that can do manual jobs.
The actual impact will be phased in over at least a few years, and some tasks will be realized sooner than others. Even if AI can accomplish most tasks of economic value, not all companies will act immediately.
In order for AI to do more work than humans, it will take time to build enough data centers, and mass production of physical robots may take even longer.
When you talk about AGI, you're talking about the moment of basic capability, not the moment of full deployment.
When AI is mentioned as being able to "accomplish almost any task of economic value," it does not necessarily mean that a single AI system can accomplish all of these tasks. We may end up creating specialized models that do different tasks, just as humans do different specialties. But creating specialized models must be as easy as training specialized workers.
As for the question of how to actually achieve AGI, researchers currently have few concrete leads and little theoretical grounding; there are many competing schools of thought, and the debate continues.
Marius Hobbhahn, co-founder of Apollo Research, believes a credible AGI safety plan should have three core characteristics:
- Specific, clear assumptions about the sources of risk, the expected level of risk, the safety budget, and how much risk the organization is willing to take;
- A clear, multi-layered defense that cuts measures which are irrelevant or negatively correlated, rather than an attempt to "stack different strategies to reduce risk";
- Consistency between plan and practice: many AI companies say one thing and do another, for example claiming they are willing to pay a premium for safety while being unwilling to bear the extra cost of transparent reasoning.
In the grand scheme of AGI development, we are lucky in one respect: most of the reasoning in today's AI systems is "directly visible" and "easy to understand."
By "faithful and human-readable" we mean that the model's reasoning is carried out in a form humans can directly understand, that it accurately reflects the reasons for the model's actions, and that the model cannot carry out internal reasoning inconsistent with what it states.
We should at least:
1. Agree that this is an important consideration and keep developing it. To make models produce more understandable CoT, we can and should accept the corresponding performance penalty, even if that doubles the running cost of an ordinary application or increases the cost of a high-risk application fivefold;
2. Put more effort into research on legible and faithful reasoning (CoT), for example by building evaluation suites that test whether a given model's reasoning process is faithful (a minimal sketch of one such test follows this list), and by studying whether directly optimizing for readability has side effects such as pushing unwanted reasoning out of sight;
3. Scale back research aimed at removing visible CoT from the architecture, such as Meta's COCONUT.
Assuming the extra cost of readability is not too high (say, less than 2x), today's research on "removing CoT" amounts to tearing down our own defenses.
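As referenced in item 2 above, here is a minimal sketch of one possible faithfulness test: corrupt individual steps of the stated chain of thought and check whether the final answer changes. The `ask_model` hook is a hypothetical stand-in for whatever inference API is used; it is not part of any real library.

```python
import random
from typing import Callable

def cot_sensitivity(ask_model: Callable[[str], str],
                    question: str,
                    chain_of_thought: str,
                    n_trials: int = 20) -> float:
    """Fraction of trials in which corrupting one CoT step flips the answer.

    `ask_model` is a hypothetical hook that sends a prompt to the model and
    returns its final answer; substitute your own inference call.
    """
    baseline = ask_model(f"{question}\nReasoning: {chain_of_thought}\nAnswer:")
    steps = chain_of_thought.split(". ")
    flips = 0
    for _ in range(n_trials):
        corrupted = list(steps)
        # Replace one randomly chosen reasoning step with filler text.
        corrupted[random.randrange(len(corrupted))] = "(step removed)"
        prompt = f"{question}\nReasoning: {'. '.join(corrupted)}\nAnswer:"
        flips += int(ask_model(prompt) != baseline)
    # A score near 0 suggests the stated reasoning is not load-bearing,
    # i.e. the CoT may be a post-hoc rationalization rather than faithful.
    return flips / n_trials
```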
If there is evidence that high-compute reinforcement learning (high-compute RL), as used for o1 and o3, leads to unreadable reasoning processes - for example, the model developing its own language that humans struggle to understand - then priorities should shift accordingly:
1. AI developers should share this information as soon as possible: "the model reasons very well, but we don't understand its reasoning process" is far scarier than "the model reasons only moderately well, but we can understand its reasoning process";
2. Shift the priorities of research and governance strategies, such as putting more effort into decoding the language learned by the model.
If the model produces a human-understandable CoT, our main defense is detailed monitoring of the model, which is highly cost-effective but often overlooked.
Ideally, model outputs could be analyzed with a few simple monitors that detect a wide range of common failure patterns and bad behaviors, including the kinds of behavior we most want to track:
Smaller, faster models can be used as black-box-only monitors (for example, Gemini Flash monitoring Gemini), or as white-box probes trained to detect problematic behavior (for example, linear probes or SAE-based probes);
It is best to monitor both the actions themselves (for example, tool calls) and the reasoning that leads to those actions.
As a proof of concept, model developers already use black-box monitors to prevent jailbreaks. Simple linear probes are extremely cost-effective (each one requires only a dot product), so tens of thousands of probes can run in parallel with little additional overhead.
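To make the cost claim concrete, here is a minimal sketch (not any lab's actual tooling) of how thousands of linear probes can score a single activation vector with one matrix multiply; the random weights below are placeholders standing in for probes trained offline on labeled examples of the behavior each probe is meant to flag.

```python
import numpy as np

# Minimal sketch of white-box linear probes used as monitors (illustrative only).
d_model = 4096      # hidden size of the monitored model (assumed)
n_probes = 10_000   # number of independent monitors

rng = np.random.default_rng(0)
probe_weights = rng.normal(size=(n_probes, d_model))  # stand-ins for trained probes
probe_biases = rng.normal(size=n_probes)

def probe_scores(hidden_state: np.ndarray) -> np.ndarray:
    """Score one activation vector against every probe in a single matmul."""
    logits = probe_weights @ hidden_state + probe_biases
    return 1.0 / (1.0 + np.exp(-logits))  # per-probe probability of the flagged behavior

hidden_state = rng.normal(size=d_model)   # activation captured during inference
scores = probe_scores(hidden_state)
flagged = np.flatnonzero(scores > 0.99)   # indices of probes that fire
```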
While these methods may seem simple, the researchers believe that no one has systematically studied how to effectively monitor models.
Monitoring may be simple in concept, but it still requires a great deal of iteration and tuning: you need proper datasets to train and evaluate the probes, calibrated decision thresholds, and so on.
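Calibrating such a threshold might look like the following sketch, which assumes you have monitor scores for a labeled set of known-benign outputs and picks the cutoff that keeps the false-positive rate under a chosen budget:

```python
import numpy as np

def calibrate_threshold(benign_scores: np.ndarray, max_fpr: float = 0.001) -> float:
    """Smallest cutoff that flags at most `max_fpr` of known-benign traffic."""
    # The (1 - max_fpr) quantile of benign scores is the tightest such cutoff.
    return float(np.quantile(benign_scores, 1.0 - max_fpr))

# Synthetic scores standing in for a labeled evaluation set of benign outputs.
rng = np.random.default_rng(1)
benign_scores = rng.beta(2, 8, size=100_000)
threshold = calibrate_threshold(benign_scores, max_fpr=0.001)
observed_fpr = float((benign_scores > threshold).mean())  # ~0.001 by construction
```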
Each AI lab should have an entire team dedicated to building monitors, making sure models behave as expected, analyzing monitoring results, and making the monitors run more efficiently.
Overall, monitors not only directly prevent models from generating harmful content; they also give a faster and more comprehensive picture of how well alignment techniques are working.
In the worst case, the model may have its own hidden agenda, and humans may not be able to see its reasoning clearly. We then need to extract as much useful work from the model as possible while minimizing risk, which points to several directions for "control" research:
1. Agents: previous research has mostly looked at small agent systems or small tasks; labs should apply control techniques to their most advanced systems as soon as possible in order to accumulate practical experience;
2. Realism: apply control to models in common, realistic application scenarios, such as assisting with research;
3. Blue-team vs. red-team rules: in some variants the red team is unrealistically powerful, for example having access to all of the blue team's data and training directly against the blue team; the assumptions about each team should be more reasonable and precise;
4. Relaxed assumptions: control currently treats the model as adversarial by default, which is relatively conservative; tighter bounds could be defined that loosen these constraints while keeping the model's usefulness roughly the same.
As AI systems develop ever faster, we should also deepen our understanding of what large models are really up to:
1. Detection tools: use more capable tools to detect model scheming, such as broader black-box evaluations and better white-box techniques;
2. How scheming develops: build a deep understanding of why a model comes to pursue certain objectives, how it learns non-myopic tendencies, and when it is willing to act on those tendencies;
3. Real-world prevalence: understand how often models actually scheme in practice, for example by analyzing 100 million real model responses; this looks like a low-cost, high-return project, but it requires access to a large amount of real deployment data;
4. Hidden scheming: analyze which training methods genuinely reduce scheming and which merely make it harder to detect. Punishing the model every time it appears to be doing something bad will reduce visible bad behavior, but it may also make the model better at hiding its scheming; the trade-off has to be judged case by case;
5. Prepared countermeasures: depending on how severe the scheming is, ignore it, retrain the model, or discard the model entirely;
6. More capability evaluations: better understand how far models can go in evading oversight and interfering with their own training.
Evaluations are an effective way of gathering evidence about model scheming, and they can support other strategies, for example by testing whether "monitoring" or "control" methods actually work.
Evaluations also give us a picture of a model's overall capabilities, but existing evaluations still fall short in several areas:
1. Capability evaluations: we should keep building more and better capability evaluations, especially for AI R&D, planning, nuclear and biochemical risks, cyber, and persuasion; we still do not know exactly what models are capable of;
2. Elicitation: how do we draw out a model's maximum capability, so that evaluations come as close as possible to measuring what the model can really do? (A minimal best-of-n-style estimate is sketched after this list.)
3. Predictive evaluations: what is the right way to forecast capabilities? How do we get signal from benchmarks on which model performance is still at zero? There may be better approaches to such forecasting;
4. Alignment evaluations: most work to date has focused on capability evaluations, but alignment/propensity evaluations matter too. We should move faster on building adversarial alignment evaluations, for example traps that probe non-myopic preferences, in order to construct a detailed "psychological profile" of a model and understand its default tendencies;
5. Demonstrations: for researchers and within companies, rigorous write-ups are needed that carefully weigh all the possibilities, along with thought about how to present these results to ordinary users.
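On the elicitation point (item 2 above), one common low-tech approach is best-of-n sampling: draw many samples per task and estimate pass@k, the chance that at least one of k attempts succeeds. A minimal sketch using the standard unbiased estimator, with illustrative numbers:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: P(at least one success among k of n samples, c correct)."""
    if n - c < k:
        return 1.0  # every size-k subset must contain at least one success
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 200 samples drawn per task; `successes` is correct samples per task.
successes = [3, 0, 17, 1]
print([round(pass_at_k(200, c, k=50), 3) for c in successes])
```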
Existing techniques such as RLHF can introduce non-myopic and non-corrigible tendencies into models; more effort should go into understanding in detail what kinds of goals a model acquires and into building a detailed "cognitive profile" of it.
There should also be a clear distinction between policies used internally and externally: a model that is more resistant to jailbreaks may, for example, be less corrigible, and we need to determine which policies are appropriate for long-term internal deployment.
Ideally, internal plans should be fully transparent, with explicit assumptions about safety budgets, the risks the organization is implicitly willing to take, clear red lines, and so on. Concrete plans should be shared regularly with outside experts from independent organizations, academia, or AI safety institutes, who can sign confidentiality agreements.
Ethically, companies - especially the most impactful AI companies and technologies - should share the details of their plans with the public, so that everyone can understand whether or not they should expect the technology to be safe.