Technology around us is constantly evolving and compelling us to think about how we live and will live, how society will change and to what extent it will be affected. For the better or the worse? It is difficult to give a clear answer. However, even art forms such as cinema can give us food for thought on society and ourselves, as well as some psychological reasoning. All this to try to better understand ourselves, the world around us, and where we are headed.
The House blog tries to do all of that.
Latest posts
November 19, 2024
How RAG transforms Large Language Models’ capabilities
Retrieval Augmented Generation (RAG) is an AI approach that draws on an authoritative knowledge base outside a Large Language Model’s (LLM’s) training data to improve the model’s output. RAG helps AI produce more precise and pertinent text by combining the strengths of conventional information retrieval systems, such as databases, with the capabilities of LLMs.
As explained here, LLMs are essential for intelligent chatbots and other NLP applications to work properly. Despite their power, however, they have drawbacks: they depend on static training data and occasionally produce unpredictable or imprecise results. When unsure of the answer, they may also provide inaccurate or out-of-date information, particularly on subjects that call for detailed knowledge. Because the model’s replies are restricted to the perspectives in its training data, response bias may result. These limitations frequently reduce LLMs’ efficacy in information retrieval, even though they are widely employed in many different fields.
RAG is an effective strategy for overcoming these limitations. By directing LLMs to pertinent material from a reputable knowledge base, RAG helps them give more accurate and trustworthy answers. RAG’s uses are expanding along with the use of LLMs, making it a key component of contemporary AI solutions.
Architecture of RAG
To produce a response, a RAG application typically retrieves information related to the user’s question from an external data source and sends it to the LLM. The LLM then draws on both its training data and this outside input to produce a more precise response. Here is a more thorough rundown of the procedure:
The external data may originate from databases, written texts, or APIs, among other sources. So that the AI model can work with the data, an embedding model converts it into numerical representations (vectors), which are stored in a vector database.
The user query is converted into a numerical representation in the same way and compared against the vector database to extract the most relevant information. This comparison relies on vector similarity computations.
The RAG application then augments the user prompt by adding the relevant retrieved data as context, so that the LLM can produce a better response.
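The three steps above can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the `embed` function is a bag-of-words stand-in for a real embedding model, a plain list plays the role of the vector database, and the names `retrieve`, `build_prompt`, and `DOCUMENTS` are made up for this example. The final call to an LLM is left out.

```python
import math
from collections import Counter

# Assumption: a toy knowledge base standing in for a vector database.
DOCUMENTS = [
    "RAG retrieves passages from an external knowledge base.",
    "Go is an ancient board game from China.",
    "A vector database stores embeddings for similarity search.",
]

def embed(text: str) -> Counter:
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query: str, documents: list, k: int = 2) -> list:
    """Return the k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vector, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, passages: list) -> str:
    """Augment the user question with the retrieved passages as context."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

question = "What does a vector database store?"
passages = retrieve(question, DOCUMENTS, k=1)
print(build_prompt(question, passages))
```

In a real application the word counts would be replaced by dense vectors from an embedding model, and the sorted list by an approximate nearest-neighbor search, but the flow of embed, compare, and augment stays the same.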
Techniques such as query rewriting, breaking the original query up into several sub-queries, and incorporating external tools into RAG systems can all improve a RAG application’s efficiency. Furthermore, the prompt quality, the existence of metadata, and the quality of the data used all affect RAG performance.
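To make the idea of sub-queries concrete, here is a rough sketch. It assumes a deliberately naive splitter that cuts the question on the word "and" (real systems usually ask an LLM to rewrite or decompose the query) and a hypothetical keyword-overlap retriever. All function names are invented for illustration.

```python
def split_into_subqueries(query: str) -> list:
    """Naive decomposition: split on ' and '. An assumption, not how production systems do it."""
    parts = [part.strip(" ?.") for part in query.split(" and ")]
    return [part for part in parts if part]

def keyword_retrieve(query: str, documents: list, k: int = 1) -> list:
    """Hypothetical retriever: rank documents by the number of words shared with the query."""
    words = set(query.lower().split())
    scored = [(len(words & set(doc.lower().split())), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:k] if score > 0]

def retrieve_for_all(query: str, documents: list) -> list:
    """Retrieve for each sub-query and merge the results without duplicates."""
    merged = []
    for sub_query in split_into_subqueries(query):
        for doc in keyword_retrieve(sub_query, documents):
            if doc not in merged:
                merged.append(doc)
    return merged

docs = [
    "latency grows with index size",
    "privacy rules cover personal data",
]
print(retrieve_for_all("what causes latency and what about privacy rules?", docs))
```

Each sub-query gets its own retrieval pass, so a question that touches two topics can pull in a passage for each instead of a single compromise match.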
Use cases of RAG in real-world applications
Today, RAG applications are widely used in many different fields. Here are a few typical use cases:
By collecting precise data from reliable sources, RAG models enhance question-answering systems. One example is information retrieval in healthcare organizations, where the application can respond to medical questions by consulting medical literature.
RAG applications are very effective in streamlining content creation by generating relevant information. Additionally, they are highly useful for creating concise overviews of information from many sources.
RAG applications also improve conversational agents, allowing virtual assistants and chatbots to respond with accuracy and context. Their ability to give accurate, informative answers during interactions makes them well suited to customer support.
Legal research assistants, instructional resources, and knowledge-based search engines all make use of RAG models. They can provide study materials, assist with document drafting, offer customized explanations, evaluate legal cases, and formulate arguments.
Key challenges
Even though RAG apps are highly effective at retrieving information, a few limitations must be taken into account to get the most from RAG.
Because RAG applications rely on outside data sources, it can be difficult and complex to establish and manage connections with third-party data.
Personally identifiable information from third-party data sources may give rise to privacy and compliance concerns.
The size of the data source, network lag, and the higher volume of requests a retrieval system has to process can all lead to latency in response. For instance, a RAG application may respond too slowly if many people use it at the same time.
If it relies on unreliable data sources, the LLM may provide inaccurate or biased information and cover a topic insufficiently.
When working with multiple sources of data, it can be challenging to configure the output so that it cites its sources.
Future trends
A RAG application’s utility can be further increased if it can handle not just textual information but also a wide variety of data types—tables, graphs, charts, and diagrams. This requires building a multimodal RAG pipeline capable of interpreting and generating responses from diverse forms of data. By enabling a semantic understanding of visual inputs, multimodal LLMs (MLLMs) such as Pix2Struct help develop such models by enhancing the system’s ability to respond to queries and provide more precise, contextually relevant responses.
As RAG applications expand, there is a growing need to integrate multimodal capabilities to handle complex data. Advances in MLLMs will enhance AI’s comprehension of data, expanding its use in fields such as legal research, healthcare, and education. Multimodal RAG systems are expected to expand the range of industries in which AI can be applied.
RAG is at the forefront of increasingly intelligent, flexible, and context-aware systems as AI develops further. RAG’s potential will be further enhanced by the growing trend of multimodal capabilities, which will allow AI to understand and interact with a variety of data sources beyond text. RAG has the potential to completely change how we use and engage with artificial intelligence in a variety of fields, including healthcare, legal research, customer support, and education.
Although there are still issues, such as response latency, privacy concerns, and data integration, the future of RAG technology looks bright. Researchers and developers are continually improving techniques to make these systems more reliable, effective, and trustworthy. RAG will probably become more and more important in producing more complex, precise, and contextually rich AI interactions as multimodal Large Language Models advance.
Retrieval Augmented Generation is actively shaping a future in which artificial intelligence is defined not only by its enormous computational power but also by the intelligent, dynamic retrieval and synthesis of knowledge. [...]
November 12, 2024
A critical look at our digital present
The state of the internet is unstable. It faces attacks from all directions, and the threats are societal rather than technical. The internet is rife with misinformation, marketing and advertising permeate every aspect of it, and armies of automated and politicized bots roam its social media landscapes. All of this is filtered down to you through carefully chosen algorithmic posts meant to keep you on your preferred platform and give you endorphins. Everything is changing at the moment, and not always in a positive way.
Looking back ten or twenty years, the “World Wide Web” appeared drastically different to many of us during that heyday. Everything about it felt and was different, including the social media sites, the communities, the world of gaming, the access to knowledge, and the shopping. The companies that took part were amazing, almost revolutionary. Facebook, Twitter, Spotify, Netflix, and Amazon were all extremely innovative, market-upsetting companies that defied convention. With their fantastic features and reasonable prices, they attracted a large number of users and clients.
However, as companies have taken the middle ground to increase their profits, those same features and prices have gotten worse over time for the average Joe. This typically happens once companies go public: instead of being motivated by the principles and ideas on which they were founded, they are driven by the demands of shareholders, investors, and board members for higher profits.
A digital world downfall
According to this article, information access and educational resources are also disintegrating. Nowadays, thousands of TikTok videos and YouTube Shorts have muddled and diluted a great deal of the information available, as anyone with a phone can make a 60-second video spouting all kinds of falsehoods.
It is getting harder and harder to tell what is true and what isn’t, what is real and what isn’t. This is one of the reasons Google frequently modifies its search ranking algorithms to prioritize accurate and factual material above misleading and AI-generated content.
In today’s age of social media celebrities and demagogues, your reach and the number of views on your work determine whether people will take you seriously and whether your claims and facts are believed to be true.
There are now fact-checkers covering a wide range of social media platforms, Community Notes highlighting instances in which powerful people spew out absolute nonsense, and news aggregators that bring together all the media to offer you the complete range of political opinions on any given event. Some scientists now make a profession of refuting the irrational and empirically inaccurate nonsense that social media influencers spread.
Algorithmic echo chambers
It is a systemic issue. It all began on social media, where algorithms now provide “curated” information instead of merely displaying a timeline of the people you follow. Your preferences, as well as the things you watch, read, and listen to, all serve as fuel for the fire. Twitter, Instagram, and Facebook all provide you with content in this way. As long as you remain on the site and continue to view advertisements, the content does not matter. This is so common now that it is difficult to find a feed on any social media site that works differently.
The issue with this is that it has successfully suppressed innovative discussion. You are constantly exposed to the same information rather than having meaningful conversations or having your beliefs challenged or questioned. As a result, you sit in an echo chamber of like-minded people repeating the same things, which further solidifies and shapes your opinions. It is easy to see how this actively contributes to a rise in radical opinions and ideas.
If there is no one to question your opinion, how can it develop or change? This is one of the reasons why so many people around the world are nearly in shock when their preferred political candidate loses an election: online, they only see an overwhelming amount of support for their preferred party.
What should we do?
Nevertheless, there is still hope. Since the beginning, the WWW has produced many more positive outcomes than bad ones, and this is still the case today. As long as people are still using it to actively and freely connect, it will be beneficial.
Because it is not what makes the news, we do not hear about the numerous scientific discoveries made possible by the internet, the diseases that have been cured, or the humanitarian relief that has been organized. It is not engaging. Neither the papers nor the scientific publications mention it. We do not hear about the connections made or how essential the internet is to the overall infrastructure of our contemporary civilization.
So, how do you fix it? It is not as easy as just applying a Band-Aid solution. The World Wide Web is, by definition, a worldwide platform. It will take teamwork to get some sort of agreement on how to make the existing quagmire better. That has previously occurred in the tech field.
Education is one solution, and it applies to people of all ages, not just children and teenagers. Just as we aim for full adult literacy, we must make a strong effort to ensure that every nation’s population is computer literate. This goes beyond simply teaching people “how to turn on the PC” and “this is the internet”; it also means teaching them how to spot bogus posts, fact-check statements, locate multiple sources, and determine whether what they post online is legal. People of all ages simply lack access to, or knowledge of, so much of that.
It is challenging to teach new key skills across a global society. However, it has been done before and must be done again, this time for the digital age. We did it for reading, for the danger of nuclear destruction during the Cold War, and for the introduction of seat belts in automobiles. Is it difficult? Yes, but we have experienced technological upheaval before and will continue to experience it.
However, the truth must be told. Although content creators are often driven by the desire for views and money, and this frequently leads to polarization and distortion of the facts they narrate, this doesn’t mean that ‘junk’ information is all on one side and truth on the other. Critics of the multitude of innovative and unconventional theories on the internet would like to wipe out every sort of doubt, appealing to the principle that truth lies on only one side, when doubts should be allowed to come from both sides if censorship is to be avoided.
It’s obvious that in freedom you take the good and bad of everything, but it’s up to people to make the effort to understand that if there’s an economic interest that pollutes the truth, it exists on both sides. Some aim for profit and are part of the official narrative, and some don’t. Some propose alternative and reasonable solutions and aren’t listened to, while those who shout something absurd to get views (even though that’s not the metric to judge by) end up delegitimizing those who were saying the right things even if in the minority. Truth is not just on one side. [...]
November 5, 2024
From AlphaGo to modern language models
Truth and accuracy are crucial for AIs, and human thought processes play a key role in shaping these issues. In the future, machine learning may surpass humans due to new AI models that experiment independently.
One early example is DeepMind’s AlphaGo, which marked a breakthrough in Go, an ancient strategy board game, originally from China, considered one of the most complex and profound board games in the world. AlphaGo combined training on human expert games with “self-play reinforcement learning,” playing huge numbers of games against itself and learning through trial and error; its successor, AlphaGo Zero, learned from self-play alone, given only the rules of the game. After defeating the European Go champion in 2015, AlphaGo won against the world’s top human player in 2017.
In chess, AlphaZero was developed to go beyond earlier models like Deep Blue, which relied on human strategies. AlphaZero beat the reigning AI champion Stockfish in a 100-game match, winning 28 games and drawing the remaining 72.
Breaking free from human constraints
As reported here, when developers moved away from mimicking human strategies, their models excelled in complex games: DeepMind’s systems in shogi and StarCraft II, and OpenAI’s in Dota 2. These AIs developed unique cognitive strengths by learning through experimentation rather than human imitation.
For instance, AlphaZero never studied grandmasters or classic moves. Instead, it forged its own understanding of chess based on the logic of wins and losses. It showed that an AI relying on self-developed strategies could outmatch models trained solely on human insights.
New frontiers in language models
OpenAI’s latest model, referred to as “o1,” may be on a similar trajectory. While previous Large Language Models (LLMs) like ChatGPT were trained using vast amounts of human text, o1 incorporates a novel feature: it takes time to generate a “chain of thought” before responding, allowing it to reason more effectively.
Unlike earlier LLMs, which simply generated the most likely sequence of words, o1 attempts to solve problems through trial and error. During training, it was permitted to experiment with different reasoning steps to find effective solutions, similar to how AlphaGo honed its strategies. This allows o1 to develop its own understanding of useful reasoning in areas where accuracy is essential.
The shift toward autonomous reasoning
As AIs advance in trial-and-error learning, they may move beyond human-imposed constraints. The potential next step involves AIs embodied in robotic forms, learning from physical interactions instead of simulations or text. This would enable them to gain an understanding of reality directly, independent of human-derived knowledge.
Such embodied AIs would not approach problems through traditional scientific methods or human categories like physics and chemistry. Instead, they might develop their own methods and frameworks, exploring the physical world in ways we can’t predict.
Toward an independent reality
Although physical AIs learning autonomously is still in the early stages, companies like Tesla and Sanctuary AI are developing humanoid robots that may one day learn directly from real-world interactions. Unlike virtual models that operate at high speeds, embodied AIs would learn at the natural pace of reality, limited by the resources available but potentially cooperating through shared learning.
OpenAI’s o1 model, though text-based, hints at the future of AI—a point at which these systems may develop independent truths and frameworks for understanding the universe beyond human limitations.
The development of LLMs that can reason on their own and learn by trial and error points to an exciting avenue for quick discoveries in a variety of fields. Allowing AI to think in ways that we might not understand could lead to discoveries and solutions that go beyond human intuition. But this advancement requires a fundamental change: we must have more faith in AI while being cautious of its potential for unexpected repercussions.
There is a real risk of manipulation or reliance on AI outputs without fully understanding their underlying logic because these models create frameworks and information that may not be readily grasped. To guarantee AI functions as a genuine friend in expanding human knowledge rather than as an enigmatic and possibly unmanageable force, it will be crucial to strike a balance between confidence and close supervision. [...]
November 3, 2024
When AI can alter reality
Since 2020, artificial intelligence has increasingly made its way into our lives. We began to notice this when the first deepfakes appeared: a technique that uses artificial intelligence to replace a subject’s face in a video or photo with another one in an almost perfect way. Although their official birth predates 2020, their use has gradually spread thanks to the development of tools that have increasingly simplified their creation.
Deepfakes immediately highlighted one of the main problems with artificial intelligence: the ability to modify and make plausible photographs or videos of events that never happened.
While replacing famous actors’ faces with other subjects to see them as movie protagonists immediately appeared revolutionary and fun, seeing the same technology applied to pornography quickly generated outcry and fear.
Many famous women have unknowingly found themselves featured in pornographic videos and photos, and the worst part was having to deny involvement despite the obvious fraud. Nevertheless, some people will continue to believe that these photos or videos are real, since debunking false information is always more difficult than creating it.
However, deepfakes have made inroads not only in pornography but also in politics, where they can easily ruin the victim’s image and consequently influence public opinion.
But this was just the beginning. We became more concerned when Google Duplex was introduced, an AI that (although limited in its tasks) demonstrated how such technology could easily communicate on the phone to make appointments without the interlocutor noticing, using pauses, discourse markers (listen, well, so, …), interjections (mmm, …), to make the conversation more realistic.
However, the real revolution came with OpenAI’s GPT (Generative Pretrained Transformer), which in its second version had already demonstrated the ability to write newspaper articles, showing writing capabilities equal to those of a human being. But the greatest amazement came especially with ChatGPT, the first chatbot equipped with this technology that allowed us to communicate as if we were really talking to a human and ask it practically anything…
Nevertheless, many will remember another chatbot that preceded ChatGPT and had already demonstrated the potential of AI applied to chatbots: Replika. Replika was an early AI-based companion chatbot. The idea came from an unfortunate episode in its creator’s life: having lost a friend in an accident, the creator decided to build a chatbot trained on the friend’s messages to talk like the deceased. The story closely resembles the premise of an episode of Black Mirror.
However, the fascination with AIs like ChatGPT lies more in their predictive capability than in their reasoning. Where responses seem to be the result of reasoning, they are instead the result of probabilistic calculation.
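A tiny sketch can show what “probabilistic calculation” means here. The scores below are invented for illustration (a real model computes them over tens of thousands of tokens), and `softmax` and `sample_next_token` are names chosen for this example: the model turns scores into probabilities and then draws the next word at random according to them.

```python
import math
import random

# Assumption: made-up scores a model might assign to candidate next words
# after the prompt "The capital of France is".
SCORES = {"Paris": 4.0, "Lyon": 1.5, "pizza": -2.0}

def softmax(scores: dict) -> dict:
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores.values())
    exps = {token: math.exp(score - highest) for token, score in scores.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}

def sample_next_token(scores: dict, rng: random.Random) -> str:
    """Draw one token at random, weighted by its probability."""
    probs = softmax(scores)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

probabilities = softmax(SCORES)
for token, p in probabilities.items():
    print(f"{token}: {p:.3f}")
print("sampled:", sample_next_token(SCORES, random.Random()))
```

The likeliest word usually wins, which is why the answer looks reasoned, but nothing in this step checks whether the chosen word is true.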
But writing wasn’t the only revolution in the AI field. When DALL-E and then Midjourney came out, AI became capable of producing art from a simple description, replicating the styles and techniques of famous artists in completely new images.
True creativity is still an illusion because, despite the exceptional results, everything is the product of training an algorithm on existing works and techniques.
And if that wasn’t enough, there were also applications in the field of voices. Old voice synthesis generators have evolved significantly thanks to AI, producing very natural results. Many of the most recent applications have options to modify emphasis and tone, but the most striking revolution in this field has certainly been the ability to clone human voices and use them as voice synthesizers, making the cloned voice say anything. An early attempt at this was made by Lyrebird, later incorporated into Descript.
The trend then spread to the music field; we started hearing many covers of famous songs reinterpreted by equally famous singers thanks to AI, raising new fears about the possibility of easily replacing singers and being able to produce songs with someone else’s voice without permission.
However, the most concerning developments came later, when many of these fields of application began to converge into single tools such as HeyGen, which spread quickly thanks to its ability to translate the audio of videos, not only maintaining the original tone of voice but also modifying the subject’s lip movements to match the speech. This created the impression that the subject was really speaking that language, and it caused quite a stir, especially in the world of dubbing.
The most extreme application of this tool, however, is to alter what a person actually said. If we can maintain the tone of voice and modify lip movements, we can create an ad hoc video of a person saying something they never said. This calls any video or audio evidence into question.
That’s why we have officially entered the age of deception. From now on, everything we see or hear from a photo, video, or audio could have been manipulated. Anyone will be able to make you say and do things very easily. The truth will become increasingly buried.
What will be the next step, though?
If AI keeps evolving exponentially, as it appears to be doing, it’s difficult to imagine its limits. But we will surely begin to see the consequences of multimodal AI, which can use every kind of source (text, images, video, and sound) to interact with us and provide increasingly complex responses, as in GPT-4, Google’s Gemini, and subsequent developments.
Subsequently, artificial general intelligence (AGI) will arrive when AI becomes able to match human capabilities, and super AI when it is able to surpass them.
Who knows how society will have changed by that time and what consequences there will be? [...]
October 29, 2024
The evolution of the human-AI cognitive partnership
Humans have always used tools to increase our cognitive capacities. Writing externalized our memory, mathematical notation gave us control over abstract ideas, and computers enhanced our ability to process information. However, large language models (LLMs) represent a fundamentally different phenomenon: a dual shift that is changing not just our way of thinking but also the definition of thinking in the digital age.
As explained here, the philosopher Andy Clark argues that human minds inherently transcend their biological limitations by using tools and technology. His “extended mind thesis” suggests that our thought processes smoothly incorporate outside resources. The most significant cognitive extension yet is emerging with LLMs, one that actively engages with the act of thinking itself. However, this is not only an extension of the mind.
The cognitive dance of iteration
What emerges in conversation with an LLM is what we can call a “cognitive dance”—a dynamic interplay between human and artificial intelligence that creates patterns of thought neither party might achieve alone. We, the humans, present an initial idea or problem, the LLM reflects back an expanded or refined version, we build on or redirect this reflection, and the cycle continues.
This dance is possible because LLMs operate differently from traditional knowledge systems. While conventional tools work from fixed maps of information—rigid categories and hierarchies—LLMs function more like dynamic webs, where meaning and relationships emerge through context and interaction. This isn’t just a different way of organizing information; it’s a fundamental shift in what knowledge is and how it works.
An ecology of thought
Conventional human-tool relationships are inherently asymmetrical: no matter how advanced the tool is, it is inactive until human intention activates it. The interaction between humans and LLMs, however, defies this pattern. These systems do not merely react to our prompts; through their web-like structure of knowledge, they actively help steer the course of thought, offering fresh viewpoints and challenging assumptions.
What emerges is an ecosystem in which artificial intelligence and the human mind become increasingly entwined elements of each other’s environment, which some have dubbed a new sort of cognitive ecology. We are not merely using these tools; we are thinking with them in a way that may be radically altering our cognitive architecture.
Our metacognitive mirror
Most interesting of all, interacting with LLMs frequently makes us more conscious of the way we think. To interact with these systems efficiently, we need to think more clearly, consider other points of view more explicitly, and use more structured reasoning. The LLM turns into a sort of metacognitive mirror that reflects back not just our thoughts but also our thought patterns and processes.
We are just starting to realize how transformative this mirrored effect is. We are forced to externalize our internal cognitive processes when we interact with an LLM, which makes them more obvious and, hence, more receptive to improvement. The technology creates a feedback loop that leads to deeper comprehension by asking us to elaborate on our reasoning and clarify our assumptions, much like a skilled conversation partner.
The cognitive horizon
We have only just begun to see this change in cognitive partnerships between humans and AI. Beyond its usefulness, it raises fundamental questions about our understanding of intelligence, consciousness, and the nature of knowledge itself. We are seeing the beginning of something unprecedented as these systems get more complex and our interactions with them get more nuanced: a relationship that not only expands thinking but also changes its fundamental nature.
The key to the future of human cognition may lie not in biological or artificial intelligence alone but in the dynamic space between them, where rigid maps give way to fluid webs and new kinds of understanding become possible. As we learn what it means to collaborate with artificial minds that alter the very framework of knowledge itself, we are both the experiment and the experimenters.
Interaction with LLMs offers extraordinary learning opportunities, simulating a dialogue with experts in every field of knowledge. However, their tendency to hallucinate and their ability to generate seemingly plausible but potentially incorrect content require particular attention. The concrete risk is that humans, uncritically relying on these interactions, may assimilate and consolidate false beliefs. It therefore becomes fundamental to develop a critical and conscious approach to this new form of cognitive partnership, always maintaining active capacities for verification and validation of received information. [...]
October 22, 2024
How a secretive startup’s facial recognition technology became the embodiment of our dystopian fears
In November 2019, while working as a reporter at The New York Times, Kashmir Hill uncovered a story that would expose one of the most controversial developments in surveillance technology.
As reported here, in this excerpt from “Your Face Belongs to Us” (Simon & Schuster, 2023), journalist Kashmir Hill recalls the rise of Clearview AI, a facial recognition company that gained widespread attention with artificial intelligence software it claimed could identify almost anyone from a single picture of their face.
Clearview AI, an enigmatic startup, promised to be able to identify almost anyone from a picture of their face.
According to some rumors, Clearview had scraped billions of photos from the public web, including social media sites such as Facebook, Instagram, and LinkedIn, to create a revolutionary app.
If you showed Clearview a picture of a random person taken on the street, it would spit out all the websites where it had seen their face, potentially revealing their name and other personal information about their life. While attempting to conceal its existence, the company sold this superpower to police departments nationwide.
Until recently, most people thought that automated facial recognition was a dystopian technology only found in science fiction books or films like “Minority Report.” To make it a reality, engineers first tried programming an early computer in the 1960s to match a person’s portrait to a wider database of faces. Police started experimenting with it in the early 2000s to look up the faces of unidentified criminal suspects in mug shot databases. But for the most part, the technology had fallen short. Even cutting-edge algorithms had trouble matching a mug shot to a grainy ATM surveillance still, and their performance differed depending on age, gender, and skin color.
Claiming to be unique, Clearview boasted a “98.6% accuracy rate” and a vast photo collection that was unmatched by anything the police had previously employed.
In 1890, a Harvard Law Review article famously defined privacy—a term that is notoriously difficult to define—as “the right to be let alone.” Samuel D. Warren, Jr. and Louis D. Brandeis, the two lawyers who wrote the article, argued that the right to privacy should be legally safeguarded in addition to the previously established rights to life, liberty, and private property. They were influenced by then-novel technology, such as the Eastman Kodak film camera, which was introduced in 1888 and allowed one to shoot “instant” pictures of everyday life outside of a studio.
“Instantaneous photographs and newspaper enterprise have invaded the sacred precincts of private and domestic life,” wrote Warren and Brandeis, “and numerous mechanical devices threaten to make good the prediction that ‘what is whispered in the closet shall be proclaimed from the house-tops.'”
Louis Brandeis later joined the Supreme Court, and this essay is one of the most popular legal essays ever published. However, privacy never received the level of protection that Brandeis and Warren claimed it deserved. There is still no comprehensive law that ensures Americans have control over what is written about them, what is photographed of them, or what is done with their personal information more than a century later. In the meantime, companies in the US and other nations with weak privacy regulations are developing increasingly powerful and intrusive technology.
Examples of facial recognition include digital billboards from Microsoft and Intel that use cameras to detect age and gender and display relevant advertisements to onlookers, Facebook that automatically tags friends in photos, and Apple and Google that allow users to unlock their phones by looking at them.
In a matter of seconds, a stranger at a bar could take your picture and determine your friends’ identities and residences. The technology might be used to track down women who entered Planned Parenthood facilities or anti-government demonstrators. It could become a tool for intimidation and harassment. The third rail of the technology was accurate facial recognition for hundreds of millions or even billions of people. Now Clearview has made it.
We tend to think of computers as having nearly magical abilities, capable of solving any problem, and, with enough data, eventually outperforming people. Therefore, companies that want to produce something amazing but are not quite there yet can deceive investors, customers, and the general public with ludicrous statements and certain digital tricks.
However, Paul Clement, a prominent lawyer for Clearview and former US solicitor general under President George W. Bush, said in one private legal memo that he tested the system with lawyers from his company and found that it provides fast and accurate search results.
According to Clement, the tool is currently being used by over 200 law enforcement agencies, and he has concluded that when using Clearview for its intended purpose, they do not violate the federal Constitution or any existing state biometric and privacy laws. In addition to the fact that hundreds of police departments were secretly using this technology, the company employed a high-profile lawyer to convince officers that their actions were not illegal.
For decades, worries about facial recognition have been building. And now, at last, the unidentified monster had taken the shape of a small company with enigmatic founders and an enormous database. Furthermore, none of the millions of individuals that comprised that database had provided their approval. Although Clearview AI embodies our darkest anxieties, it also provides the chance to finally face them head-on.
The 2019 launch of Clearview AI signaled a turning point in the continuous conflict between privacy and technical progress. Clearview AI’s unparalleled database and precision brought these gloomy worries to stark reality, even though facial recognition had long been confined to science fiction and a few law enforcement uses. As the company carries on and grows, it now acts as a warning and a vital impetus for tackling the pressing need for all-encompassing privacy laws in the digital era.
In addition to exposing a controversial company, the legal document that arrived in Hill’s inbox revealed a future that privacy advocates had long dreaded and cautioned against. The question of whether such tools will exist is no longer relevant when we consider the ramifications of this technology; rather, it is how society will decide to control and limit them. We are reminded that the “right to be let alone” is still as important—and possibly as vulnerable—as it was more than a century ago by Warren and Brandeis’s 1890 warning against invasions of privacy. [...]
October 15, 2024
From hippocampus to AI
The hippocampus is a key component in the complexity of human cognition, coordinating processes beyond memory storage. It is a master of inference, a cognitive skill that allows us to derive abstract correlations from the raw data we are given, enabling us to comprehend the world in more flexible and adaptive ways. This idea is supported by a recent study published in Nature, which demonstrates that the hippocampus records high-level, abstract concepts that support generalization and adaptive behavior in a variety of circumstances.
Fundamentally, inference is the cognitive process by which we draw conclusions from known facts, even when those facts are vague or insufficient. This skill allows us to solve problems, predict results, and comprehend metaphors, often with very little information at our disposal. In the hippocampus, this process depends on the capacity to condense data into abstract representations that can be generalized and applied to new situations. In essence, the hippocampus helps us to think beyond the here and now by forming associations and forecasts that direct our choices and behaviors.
What about machines, though? Is it possible for predictive algorithm-based Large Language Models to simulate this type of higher-order cognitive function?
LLMs and predictive inference
As explained here, LLMs may initially appear to be simple statistical devices. After all, their main job is to use patterns they have observed in large datasets to anticipate the next word in a sequence. Beneath this surface, however, is a more intricate abstraction and generalization system that somewhat resembles the hippocampal process.
LLMs learn to encode abstract representations of language, not just word pairs or sequences. These models may infer associations between words, sentences, and concepts in ways that go beyond simple surface-level patterns since they have been trained on vast amounts of text data. Because of this, LLMs can work in a variety of settings, react to new prompts, and even produce original outputs.
In this regard, LLMs are engaging in a type of machine inference. In the same way that the hippocampus condenses sensory and experiential input into abstract rules or principles that direct human thought, they compress linguistic information into abstract representations that enable them to generalize across contexts.
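To make the idea of "anticipating the next word" concrete, here is a deliberately tiny sketch. It is not how real LLMs work: they use neural networks trained on vast corpora, while this toy simply counts which word follows which in a made-up sentence. The function names and the sample corpus are assumptions chosen for illustration only.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers


def predict_next_word(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical toy corpus, used only to demonstrate the counting approach.
corpus = "the cat sat on the mat . the cat ate the fish . the cat slept"
model = train_bigram_model(corpus)

print(predict_next_word(model, "the"))  # "cat" follows "the" most often here
print(predict_next_word(model, "dog"))  # never seen, so no prediction
```

A model like this can only repeat word pairs it has already seen, which is exactly the surface-level pattern matching that the abstract representations described above go beyond.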
From prediction to true inference
However, can LLMs infer at the same level as the human brain? The disparity is more noticeable here. LLMs are still not very good at understanding or inferring abstract concepts, despite their outstanding ability to predict the next word in a sequence and produce writing that frequently seems to be the result of careful reasoning. Rather than comprehending the underlying cause or relational depth that underpins human inference, LLMs rely on correlations and patterns.
In human cognition, the hippocampus draws from a deep comprehension of the abstract links between objects, ideas, and experiences in addition to making predictions about what is likely to happen next based on experience. This allows people to solve new issues, apply learned principles in a wide range of situations, and make logical leaps.
If we wanted to advance LLMs toward a higher degree of inference, we would need to create systems that do more than simply predict the next word using statistical probabilities. We would have to build models that can represent abstract concepts and relationships and apply them in a variety of circumstances, in effect creating “LLM hippocampal functionality.”
The future of inference
The prospect of creating LLMs that work similarly to the hippocampus is intriguing. Such systems would comprehend the information they process on a deeper, more abstract level rather than only predicting the next word. This would pave the way for machines that could mimic the adaptability of human cognition by inferring complex relationships, making original conclusions from minimal data, and applying learned principles in a variety of contexts.
To get LLMs closer to this objective, a number of approaches could be explored. Using multimodal learning is one intriguing approach, in which LLMs would incorporate data from several sensory inputs, such as sounds or images, in addition to processing text, creating a more abstract and comprehensive view of the world. Furthermore, developments in reinforcement learning, which teach models to learn by making mistakes in dynamic settings, may make it easier to simulate how people learn and infer from their experiences.
In the end, developing systems that more closely resemble the abstract, generalizable reasoning that the human hippocampus provides may be the key to the future of artificial intelligence. In addition to making predictions, these “next-gen” LLMs would also reason, infer, and adjust to new situations with a degree of adaptability that is still exclusively human.
The relationship between machine intelligence and human cognition is still developing, and closing the gap between inference and prediction may be the next big development in AI. We may be able to develop AI systems that think more like humans by examining the hippocampus and its function in abstract reasoning. This would allow us to not only predict the future but also comprehend the underlying patterns that enable it.
The challenge is whether LLMs can go beyond predicting the next word in a sentence and start understanding and coming to conclusions about the world in a way that reflects the depth of the human mind. The possibility that AI will develop into a cognitive partner rather than merely a tool increases if we can accomplish this.
However, there are drawbacks to this advancement as well. These sophisticated LLMs are more likely to be deceptive because of the same traits that make them more useful: their capacity for context understanding, inference, and natural communication. The distinction between artificial and human intelligence may become more blurred as these AI systems get better at simulating human brain processes, making it harder for consumers to tell whether they are speaking with a machine or a human.
Furthermore, LLMs may be able to more accurately predict our thought patterns and decision-making processes as their reasoning abilities approach those of the human brain. By creating reactions and interactions that are specifically designed to take advantage of our cognitive biases and weaknesses, this improved prediction power could be used to trick people more successfully. AI that can “think ahead” of us in interactions and conversations offers both exciting opportunities for teamwork and the potential for manipulation. [...]
October 8, 2024
The journey of AI and speech technology
Leaders in what was then called “artificial intelligence” convened in 1958 to talk about “The Mechanization of Thought Processes.” The talks that began at this meeting, about building machines capable of thinking and speaking, preceded decades of study and development.
According to this article, artificial speech was an ongoing effort before the advent of electronic computers. Early attempts included mechanical contraptions meant to mimic human anatomy, but development stagnated until scientists began studying sound itself. This change in strategy eventually brought about advances in speech synthesis.
Though primarily geared toward helping the deaf, Alexander Graham Bell’s research on speech and hearing made a significant contribution to the advancement of voice technology. The invention of the telephone in 1876 (started by Antonio Meucci with the invention of the telectrophone and later developed by Bell as the telephone) was a pivotal moment in the evolution of human speech communication.
Engineer Homer Dudley made great strides at Bell Labs, which was established in 1925, with the Vocoder and Voder, devices that could analyze and synthesize speech. These advancements, together with Claude Shannon’s groundbreaking work in information theory, set the foundation for contemporary voice technology and data compression methods that are vital to computers.
During the 1940s and 1950s, the fields of artificial intelligence and voice technology research started to come together as electronic computers gained popularity. The 1956 Dartmouth Conference, whose organizers included Claude Shannon and Marvin Minsky and which officially introduced the phrase “artificial intelligence,” paved the way for future advancements.
In popular culture, talking computers were frequently depicted as frightening creatures in science fiction from the Cold War era, such as HAL 9000 in “2001: A Space Odyssey.” Voice technology did, however, find additional useful uses as it developed. Concerns about gender stereotypes in technology arose when automated voice systems, which frequently used female voices, started to replace human operators in a variety of service industries.
Talking machines have advanced to the point that ChatGPT’s speech modes and Siri, Alexa, and other modern AI assistants represent the state of the art. These systems integrate advanced speech recognition, natural language processing, and speech synthesis to offer more natural and interactive experiences. However, they also bring up moral questions around deceit, privacy, and the nature of human-machine interaction.
There are new issues associated with the development of voice cloning technology and emotionally intelligent conversational agents (EICAs). Concerns are raised regarding misuse possibilities, the blurring lines between human and machine communication, and the psychological fallout from engaging with AI that is becoming more and more like humans.
As speech and AI technologies develop, society must consider both the advantages and disadvantages of these emerging fields. Once the domain of science fiction, the ability to build talking and thinking computers is now a reality that requires careful examination of its consequences for human relationships, ethics, and privacy.
The evolution of artificial intelligence assistants from mechanical ducks to contemporary models illustrates technological advancements and changing ideas about intelligence, communication, and humanity. We need to create frameworks to ensure talking machines’ responsible usage and social integration as they become more advanced.
The boundaries between the real and artificial are becoming increasingly blurred today. It is therefore getting ever more complex to decipher reality. Thus, as AI devices become more and more advanced and used in innumerable fields, we will certainly need to equip ourselves with additional tools that allow us to decipher what is real and what is not. [...]
October 1, 2024
Potential and risks of AGI as experts predict its imminent arrival
Researchers in the field of artificial intelligence are striving to create computer systems with human-level intelligence across a wide range of tasks, a goal known as artificial general intelligence, or AGI.
These systems could understand themselves and be able to control their actions, including modifying their own code. Like humans, they could pick up problem-solving skills on their own without instruction.
As mentioned here, the 2007 book written by computer scientist Ben Goertzel and AI researcher Cassio Pennachin contains the first mention of the term “Artificial General Intelligence (AGI).”
Nonetheless, the concept of artificial general intelligence has been around in AI history for a long time and is frequently depicted in popular science fiction books and movies.
“Narrow” AI refers to the AI systems that we now employ, such as the basic machine learning algorithms on Facebook or the more sophisticated models like ChatGPT. Instead of possessing human-like broad intelligence, they are made to do specific tasks.
These AI systems can be more capable than humans in at least one area, but because of their training data, they are limited to performing that particular activity.
Artificial General Intelligence, or AGI, would use more than simply the training set of data. It would be capable of reasoning and understanding in many aspects of life and knowledge, much like a person. This implies that rather than merely adhering to predetermined patterns, it could think and act like a human, applying context and logic to various circumstances.
Scientists disagree on the implications of artificial general intelligence (AGI) for humanity because it has never been developed. Regarding the possible risks, which ones are more likely to occur, and the possible effects on society, there is uncertainty.
Some people formerly believed that AGI might never be accomplished, but many scientists and IT experts today think it is achievable within the next few years. Prominent names that adhere to this perspective include Elon Musk, Sam Altman, Mark Zuckerberg, and computer scientist Ray Kurzweil.
Pros and cons of AGI
Artificial intelligence (AI) has already demonstrated a wide range of advantages, including time savings for daily tasks and support for scientific study. More recent tools, such as content creation systems, can generate marketing artwork or write emails according to the user’s usual communication style. However, these tools can only use the data that developers give them to do the tasks for which they were specifically trained.
AGI, on the other hand, has the potential to serve humanity in new ways, particularly when sophisticated problem-solving abilities are required.
In February 2023, three months after ChatGPT debuted, OpenAI CEO Sam Altman wrote in a blog post that artificial general intelligence might, in theory, increase resource availability, speed up the world economy, and result in ground-breaking scientific discoveries that push the boundaries of human knowledge.
AGI has the potential to grant people extraordinary new skills, enabling anyone to receive assistance with nearly any mental task, according to Altman. This would significantly improve people’s creativity and problem-solving abilities.
AGI does, however, also carry several serious risks. According to Musk in 2023, these dangers include “misalignment,” in which the objectives of the system might not coincide with those of the individuals in charge of it, and the remote chance that a future AGI system may threaten human survival.
Though future AGI systems may deliver a lot of benefits for humanity, a review published in August 2021 in the Journal of Experimental and Theoretical Artificial Intelligence identified many potential concerns.
According to the study’s authors, the review identified some risks associated with artificial general intelligence, including the possibility of existential threats, AGI systems lacking proper ethics, morals, and values, AGI systems being given or developing dangerous goals, and the creation of unsafe AGI.
Researchers also speculated that future AGI technology could advance by creating smarter iterations of itself and possibly altering its initial set of objectives.
Additionally, the researchers cautioned that even well-meaning AGI could have “disastrous unintended consequences,” as reported by LiveScience, adding that certain groups might use AGI for malicious ends.
When will AGI arrive?
There are varying views regarding when and whether humans will be able to develop a system as sophisticated as artificial general intelligence. Though opinions have changed over time, surveys of AI professionals indicate that many think artificial general intelligence could be produced by the end of this century.
In the 2010s, most experts predicted that AGI would arrive in roughly 50 years. More recently, this estimate has been lowered to a range of five to twenty years, and some specialists have suggested that an AGI system could appear this decade.
Kurzweil stated in his book The Singularity is Nearer (2024, Penguin) that the achievement of artificial general intelligence will mark the beginning of the technological singularity, which is the point at which AI surpasses human intelligence.
This will be the turning point when technological advancement picks up speed and becomes uncontrollable and irreversible.
According to Kurzweil, superintelligence will manifest by the 2030s, following the achievement of AGI. He thinks that by 2045, humans will be able to directly link their brains to artificial intelligence, which will increase human consciousness and intelligence.
However, according to Goertzel, we might arrive at the singularity by 2027, and DeepMind co-founder Shane Legg thinks AGI will arrive by 2028. Musk, for his part, predicts that AI will surpass human intelligence by the end of 2025.
Given the exponential pace of technological advancement, many people are understandably concerned about the impending emergence of artificial general intelligence (AGI) as we stand on the cusp of a breakthrough. As previously mentioned, there are a lot of risks, many of which are unexpected. But the most pernicious threat may not come from ethical dilemmas, malicious intent, or even a loss of control, but rather from AGI’s ability to subtly manipulate.
The real threat might come from AGI’s increased intelligence, which could allow it to manipulate human behavior in ways that are so subtle and complex that we are unaware of them. We could act assuming we’re making conscious, independent decisions, while actually, our choices could be the consequence of AGI’s subtle guidance. This situation is very similar to how people can be unwittingly influenced by political propaganda and mistakenly believe that their opinions are wholly original, only in a more sophisticated manner.
The possibility of subtle influence poses a serious threat to human autonomy and decision-making. We must address the obvious dangers as well as create defenses against these more subtle forms of manipulation as we move closer to artificial general intelligence. AGI has a bright future ahead of it, but in order to keep humanity in control of its own course, we must exercise the utmost caution and critical thought. [...]
September 24, 2024
How quick and short content erodes our attention span
Quick reading
Once, the ingredients of bubble baths and shampoos served as quick reading material while sitting on the toilet, especially when you didn’t have a magazine or book nearby.
Over the years, smartphones have increasingly replaced quick reading and more in-depth reading, especially with the advent of social media.
Scrolling through a Facebook feed or watching a YouTube video has gradually become the way most people entertain themselves during idle moments, not just in the bathroom but also whenever we’re forced to wait: in a waiting room, when traveling, while waiting for public transport, or while sitting on a bench.
Idle moments
Those empty moments were once spent observing the surrounding world or exchanging a few words with the people around us. Now, they serve as an excuse to isolate ourselves from the context we are in. Of course, sometimes it’s useful since we can use these moments to learn something, but overuse has led to a progressive detachment from reality, even in situations where it’s unnecessary.
With the arrival of TikTok, there was another “step forward” in this sense. The Chinese social network offers shorter content than what we were used to with a normal YouTube video, and it doesn’t let us choose what to watch. This leaves users almost hypnotized by the series of videos they watch and easily scroll through, making the brain even more passive than when watching longer, more engaging content that we still chose ourselves.
TikTok and the attention threshold
The effect is almost the same as when the brain digresses while we are immersed in our thoughts, connecting one thought to another and yet another, until we completely lose coherence with the first thought. The passivity is similar to when watching infomercials or reality shows, where there is nothing to understand and we can only watch.
TikTok does roughly the same thing. You start with one video, and the following ones are unrelated, each triggering a curiosity for novelty that is quickly exhausted, both because the videos are brief and because each video has no connection to the previous one. Over time, our desire to explore new stimuli becomes a vicious cycle.
Of course, TikTok’s algorithm eventually learns what is preferable to show us to capture our attention, while still maintaining variety and inconsistency in the content.
All of this generates a sort of addiction that leads to a decrease in attention threshold in other areas as well. The stimulus of short, but continuous pleasure is reapplied in different contexts, like taking a pill, or rather, like a drug.
Although short content, even outside of TikTok, can often be easier to memorize because it is associated with a particular context, the repetitiveness of the approach used on this platform has other repercussions, including health problems such as stress, depression, and even nervous tics, especially among younger people.
The challenges
TikTok also became famous for its challenges aimed at encouraging users to create content on a specific theme. The early challenges involved simple dances and/or audio reproductions, but users started launching increasingly extreme challenges in order to go viral, such as ones where people ingested medication to record the effects or held their breath until they passed out. In some cases, these challenges cost users their lives. And, of course, the victims are always the younger ones.
Time
TikTok has gradually stolen more and more of our attention, and if this trend persists, one might wonder if the attention span will eventually match the speed of thought.
It’s important to become aware of the loss of attention we’re experiencing and try to manage our time better. It’s much better to be aware of the things we like and seek them out voluntarily rather than be slaves to an algorithm that drags us incessantly from one stimulus to the next. [...]
September 24, 2024
Key to surpassing human intelligence
AI analyst Eitan Michael Azoff thinks that humans will eventually create intelligence that is faster and more powerful than that of our brains.
According to this article, he says that comprehending the “neural code” is what will enable this breakthrough in performance. The human brain uses this process to both encode sensory information and transfer information across different parts of the brain for cognitive tasks like learning, thinking, solving problems, internal imagery, and internal dialogue.
According to Azoff’s latest book, Towards Human-Level Artificial Intelligence: How Neuroscience Can Inform the Pursuit of Artificial General Intelligence, simulating consciousness in computers is a crucial first step in creating “human-level AI.”
Computers can simulate consciousness
There are many different kinds of consciousness, and scientists agree that even very basic animals like bees have a degree of consciousness. The closest humans come to experiencing consciousness without self-awareness is when we are concentrated on a task.
According to Azoff, computer simulation can produce a virtual brain that, in the first instance, could mimic consciousness without self-awareness.
Consciousness without self-awareness helps animals plan actions, predict events, and recall incidents from the past, and it could also help artificial intelligence.
The secret to solving the enigma of consciousness may also lie in visual thinking. The AI of today uses “large language models” (LLMs) instead of “thinking” visually. Since human visual thinking precedes language, Azoff argues that a key component of human-level AI will be comprehending visual thinking and subsequently modeling visual processing.
Azoff says: “Once we crack the neural code, we will engineer faster and superior brains with greater capacity, speed, and supporting technology that will surpass the human brain.”
“We will do that first by modeling visual processing, which will enable us to emulate visual thinking. I speculate that in-the-flow consciousness will emerge from that. I do not believe that a system needs to be alive to have consciousness.”
However, Azoff also warns that in order to regulate this technology and stop its abuse, society must take action: “Until we have more confidence in the machines we build, we should ensure the following two points are always followed.”
“First, we must make sure humans have sole control of the off switch. Second, we must build AI systems with behavior safety rules implanted.”
Although the possibility of deciphering the neural code and creating artificial consciousness could result in incredible breakthroughs, it also poses important concerns about how humans and AI will interact in the future.
On the one hand, such sophisticated AI could solve some of humanity’s most urgent problems by revolutionizing areas like problem-solving, science, and health. Technological advancement in a variety of fields could be accelerated by the capacity to digest information and produce solutions at rates well above human capabilities.
But there are also a lot of concerns associated with the creation of AI that is superior to human intelligence. As Azoff notes, we might not be able to completely understand or govern these artificial intellects once machines surpass human cognitive capacities. This cognitive gap may have unanticipated effects and tip the scales against human control in terms of power and decision-making.
This situation highlights how crucial Azoff’s suggestions for upholding human oversight and putting in place strong safety measures are. While we advance AI’s capabilities, we also need to provide the frameworks necessary to make sure that these powerful tools continue to reflect the values and interests of people.
Thus, the development of AI will require striking a careful balance between realizing its enormous potential and minimizing the dangers involved in producing entities that could eventually be smarter than humans. It will take constant cooperation between AI researchers, ethicists, legislators, and the general public to appropriately traverse the complicated terrain of advanced artificial intelligence. [...]
September 17, 2024
OpenAI’s new model can reason before answering
With the introduction of OpenAI’s o1 version, ChatGPT users now have the opportunity to test an AI model that pauses to “think” before responding.
According to this article, the o1 model feels like one step forward and two steps back when compared to GPT-4o. Although OpenAI o1 is superior to GPT-4o in terms of reasoning and answering complicated questions, its cost of use is around four times higher. In addition, the tools, multimodal capabilities, and speed that made GPT-4o so remarkable are missing from OpenAI’s most recent model.
The fundamental ideas that underpin o1 date back many years. According to Andy Harrison, the CEO of the S32 firm and a former Google employee, Google employed comparable strategies in 2016 to develop AlphaGo, the first artificial intelligence system to defeat a world champion in the board game Go. AlphaGo learned by repeatedly competing with itself; in essence, it was self-taught until it acquired superhuman abilities.
OpenAI improved the model training method so that the reasoning process of the model resembled how a student would learn to tackle challenging tasks. Usually, when people work toward a solution, they identify the errors they are making and consider other strategies. When a method does not work, the o1 model learns to try another one. As the model continues to reason, this process gets better: o1 improves its reasoning on tasks the longer it thinks.
Pros and cons
In support of its choice to make o1 available, OpenAI argues that the model’s sophisticated reasoning abilities may enhance AI safety. According to the company, “chain-of-thought reasoning” makes the AI’s thought process transparent, which makes it simpler for humans to monitor and manage the system.
By using this approach, the AI can deconstruct complicated issues into smaller chunks, which should make it easier for consumers and researchers to understand how the model thinks. According to OpenAI, this increased transparency may be essential for advancements in AI safety in the future since it may make it possible to identify and stop unwanted behavior. Some experts, however, are still dubious, wondering if the reasoning being revealed represents the AI’s internal workings or if there is another level of possible deceit.
“There’s a lot of excitement in the AI community,” said Workera CEO and Stanford adjunct lecturer Kian Katanforoosh, who teaches classes on machine learning, in an interview. “If you can train a reinforcement learning algorithm paired with some of the language model techniques that OpenAI has, you can technically create step-by-step thinking and allow the AI model to walk backward from big ideas you’re trying to work through.”
On the downside, o1 could help experts plan the reproduction of biological threats. Even more concerning, evaluators found that the model occasionally exhibited deceptive behaviors, such as pretending to be aligned with human values and manipulating data to make misaligned actions appear aligned.
Moreover, O1 has the basic capabilities needed to undertake rudimentary in-context scheming, a characteristic that has alarmed specialists in AI safety. These worries draw attention to the problematic aspects of o1’s sophisticated reasoning capabilities and emphasize the importance of carefully weighing the ethical implications of such potent AI systems.
here is o1, a series of our most capable and aligned models yet: https://t.co/yzZGNN8HvD o1 is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it. pic.twitter.com/Qs1HoSDOz1 - Sam Altman (@sama) September 12, 2024
Law and ethics
“The hype sort of grew out of OpenAI’s control,” said Rohan Pandey, a research engineer at ReWorkd, an AI startup that uses OpenAI models to create web scrapers.
He hopes that o1’s reasoning capacity will be enough to overcome GPT-4’s shortcomings in a certain subset of challenging tasks. That is probably how the majority of industry participants saw o1, albeit not quite as the game-changing advancement that GPT-4 signified for the sector.
The ongoing debate over AI regulation has heated up with the release of o1 and its enhanced capabilities. Specifically, it has stoked support for laws such as California’s SB 1047, which aims to regulate AI development and which OpenAI itself opposes. Prominent authorities in the field, like the pioneering computer scientist Yoshua Bengio, are highlighting the pressing need to enact protective laws in response to these rapid advances.
Bengio stated, “The improvement of AI’s ability to reason and to use this skill to deceive is particularly dangerous,” underscoring the need for legal frameworks to ensure responsible AI development. The need for regulation reflects the growing apprehension among professionals and decision-makers regarding potential risks linked to increasingly powerful AI models such as o1.
With the introduction of o1, OpenAI has created an intriguing dilemma for its future growth. The company allows itself to deploy only models with a risk score of “medium” or lower, and o1 has already reached that level. This self-imposed limit raises the question of how OpenAI will proceed in creating increasingly sophisticated AI systems.
The company might run into limitations with its own ethical standards as it works to develop AI that can execute tasks better than humans. This scenario emphasizes the difficult balancing act between advancing AI’s potential and upholding ethical development standards. It implies that OpenAI may be nearing a turning point in its development where it will need to either modify its standards for evaluating risk or perhaps restrict the dissemination of increasingly advanced models to the general public in the future.
O1 is a significant advancement in artificial intelligence as it can solve complicated issues and think through solutions step-by-step due to its sophisticated reasoning abilities. This development creates interesting opportunities for applications in a range of fields, including complicated decision-making and scientific research.
However, the emergence of o1 also raises important questions regarding the ethics, safety, and regulation of AI. Because of the model’s potential for deceit and its propensity to support potentially destructive acts, strong safeguards and ethical guidelines are urgently needed in the development of AI.
Nevertheless, we cannot deny that restricting content without regard for the user or the information’s intended use is not a permanent answer to the misuse of artificial intelligence. Positive or negative, information exists anyway, and confining its use to AI-owning companies does not make it safer; it only concentrates it in the hands of a few. To control who has access to potentially dangerous content, it would be more acceptable to create divisions based on criteria such as age, or on any other criteria that do not completely exclude people from accessing information. [...]
September 10, 2024
Philosophical perspectives on human evolution and technological enhancement
Posthumanism questions human identity, while transhumanism is concerned with harnessing technology to improve human capacities.
In discussions of futuristic concepts and technology, these two terms have drawn attention. Both hold that technology may overcome certain barriers, but they have differing ideas about what that technological future would entail. Posthumanism is a philosophical perspective that questions accepted notions of what it means to be human. Transhumanism, by contrast, emphasizes how we could employ technology to increase our potential. Understanding these distinctions may help you see future possibilities for your life. What precisely are transhumanism and posthumanism, then?
Posthumanism
As explained here, posthumanism is a philosophical idea that questions traditional understanding regarding human existence and nature. It implies that human evolution might not be restricted to biological limits but might also encompass advancements in science, technology, and culture.
Thinkers from a variety of disciplines, including science, literature, music, and philosophy, are part of this multidisciplinary movement.
The idea that people are not fixed entities with an intrinsic essence or core self is one of the fundamental principles of posthumanism. Rather, posthumanists see people as evolving over time as a result of outside influences.
We have already been impacted by technology and multimedia, for instance, as a large number of individuals today have significant digital lives.
A further facet of posthumanist thought posits that, in terms of intelligence, humans may no longer be alone. Renowned transhumanist Ray Kurzweil has predicted the emergence of superintelligent machines with cognitive capacities beyond those of humans.
Moreover, posthumanism raises ethical concerns about the use of technology to advance human capabilities. It poses the moral question: Is it ethically acceptable to alter our biology or combine ourselves with technology in order to improve?
Thus, the term stimulates conversations about subjects like biohacking, gene editing, and artificial intelligence.
Origins of posthumanism
Posthumanism has complicated origins that date back hundreds of years to different intellectual and philosophical movements. Existentialism, a significant school of thought that questioned conventional ideas of human life and identity in the 20th century, was one of its early forerunners.
Existentialists like Jean-Paul Sartre and Friedrich Nietzsche criticized concepts like a fixed human nature or essence and emphasized personal autonomy and self-creation.
Technological advancements, like cybernetics, which started to take shape in the middle of the 20th century, have had an impact on posthumanism. Aspects of cybernetics’ study of human-machine and information-system interaction can be observed in transhumanist thought today.
The French philosophers Gilles Deleuze and Félix Guattari, who presented their idea of “becoming-animal” in A Thousand Plateaus (1980), made significant contributions.
They promoted the idea that relationships with other entities, rather than biology alone, establish human identity and blur the lines between humans, animals, and technology.
Science fiction authors, such as Isaac Asimov with his robot stories and William Gibson with his books on advanced artificial intelligence, have also played a significant role in popularizing posthumanist concepts. The genre has long delighted in imagining science-based scenarios in which individuals either integrate seamlessly with technology or transform completely into other entities.
The term posthumanism gained currency only during the 1990s, thanks to scholars such as Donna Haraway and Katherine Hayles.
In her 1985 essay A Cyborg Manifesto, Haraway argued for a feminist understanding of cyborgs, viewing them as symbols capable of resisting traditional gender norms and exhibiting hybridity. This blending results from fusing bodies with machines.
Hayles looked at how technology altered our subjectivity. She examined the then-new internet, a space where we could move our minds as well as our fingers. In her 1999 book How We Became Posthuman, she pushed for a redefinition of what it means to be human, arguing that in the digital age our interactions with machines increasingly define us.
In order to set itself apart from traditional humanist viewpoints, posthumanism presents some distinctive characteristics that address a wide range of complex and extensive intellectual, cultural, and ethical concerns.
To begin with, posthumanism challenges traditional humanism’s assumption of a fixed human essence or identity. It questions the notion that a person’s biological makeup is the only factor that defines them and looks at ways that technology and cultural shifts can help people overcome these constraints.
Second, posthumanism acknowledges the interdependence and connectivity of people with animals, machines, and ecosystems in addition to other humans. Stated differently, existence encompasses more than merely human existence.
Third comes what might be called the “techy bit.” Posthumanists speculate that technology will play a major role in our species’ future evolution and are interested in how it affects who we are as individuals and our perception of the world. Some call for “transhuman” technologies that could improve a person’s physical or cognitive abilities.
Fourth, posthumanism has an ethical dimension: it asks whether certain technological interventions on humans are morally acceptable. The issues it raises include bodily autonomy, social justice concerns about access to new technologies, and environmental sustainability, given the effects of some emerging technologies on ecosystems.
These four characteristics together have the overall effect of making posthumanism challenge our understanding of what it means to be “human” in this specific moment when our relationship with technology has changed so drastically while reminding us (as if it were necessary) of how closely connected all living things on Earth already are.
Transhumanism
Transhumanism is a philosophy that aims to enhance human faculties and transcend human constraints through the use of modern technologies.
The goal of the movement is to help humans become more intelligent, physically stronger, and psychologically resilient using advancements in genetic engineering, neuroscience, cyborg technology, and artificial intelligence.
Life extension is a main priority. Its supporters seek to eliminate aging by using treatments that can stop, slow down, or even reverse the aging process. Researchers are looking into treatments including regenerative medicine and telomere lengthening.
Cognitive enhancement is another aspect. Brain-computer interfaces (BCIs) have the potential to enhance human intelligence in a number of areas, including memory, learning, and general cognitive function. They may also make it easier for people to interact with AI systems.
The ultimate goal of Elon Musk’s Neuralink project is to create implants that would allow humans and AI to coexist symbiotically.
The idea of augmenting physical capabilities beyond what is naturally possible is another example of what transhumanists suggest. This could include prosthetic limbs that are stronger than those made entirely of bone and flesh.
It may also include exoskeletons, which improve strength and endurance by supplementing biological musculature rather than replacing it, and are made for military use or other physically demanding jobs.
Transhumanists all have a positive outlook on this technologically enhanced future, believing it will enable every one of us to reach our greatest potential and benefit society as a whole.
Origins of Transhumanism
Transhumanism has its roots in a number of historical intellectual and cultural movements. Although biologist Julian Huxley first used the term in 1957, the principles of transhumanist thought had been evolving for some time.
The late 19th and early 20th centuries saw the emergence of the eugenics concept, which had a significant impact on transhumanism.
Eugenicists promoted the idea of improving human traits in an effort to enhance humanity through sterilization and selective breeding. Although eugenics is now largely discredited because of its association with discriminatory practices, it did add to the debate on human enhancement.
Transhumanist concepts were also greatly popularized by science fiction literature. Futures imagined by authors like Isaac Asimov and Arthur C. Clarke included technologically advanced individuals who overcame biological limitations or attained superintelligence.
In the late 20th century, intellectuals like FM-2030 (Fereidoun M. Esfandiary) began writing in support of transhumanist theories that embrace technology to extend human life and achieve profound personal transformation beyond what is conventionally deemed “human.”
In his 2005 book The Singularity Is Near, Ray Kurzweil developed these concepts and made the case that technological advancements would eventually lead to “the singularity,” or the moment at which artificial intelligence surpasses human intelligence and drastically alters society.
All in all, eugenics, technological advancements, and science fiction writers’ depictions of future societies are among the scientific, philosophical, and literary influences that have shaped our conception of becoming more than just ourselves. These ideas have come to be known as transhumanism.
Transhumanism is a philosophical and intellectual movement that differs from previous ideologies in numerous important ways. First of all, it supports the application of cutting-edge technologies to improve human potential.
The idea is that biological constraints on physical, mental, and psychological performance—including aging—may be overcome with the advancement of technology. Transhumanists think that rather than being determined by nature, this should be a question of personal choice.
Second, transhumanism has an eye toward the future. It envisions a world where scientific and technological advancements allow humanity to transcend the limitations imposed by their current biology. This worldview’s favorite themes include life extension, cognitive enhancement, and the integration of machines with humans.
Third, transhumanism stresses that assertions must be supported by evidence; reason is prized above dogma or faith-based thinking.
Any recommendations on how technology could be used by humans to better themselves should be based on empirical research. When scientists collaborate with philosophers and other experts, they can effectively guide society through this challenging field.
Lastly, ethical issues play a crucial role in transhumanist discourse. Fairness in access to improvements, potential effects of increased intelligence or artificial superintelligence on social structures, and strategies to mitigate risks associated with unintentional consequences or misuse are typical topics of discussion in this kind of discourse.
So, what’s the difference?
Though they are very different, posthumanism and transhumanism both support technological enhancements of humans.
Posthumanism questions conventional notions of what it means to be human. It poses the question of whether humanity’s limitations can be overcome and if there is something about us that makes us unfit for survival.
In addition, posthumanists contend that to comprehend the relationships between our species and other living things, both technological and ecological, that coexist in our environment, we must adopt a more expansive definition of what it means to be human.
On the other hand, transhumanism is more pragmatic. Although it has some posthumanist concerns as well, its major goal is to use cutting-edge technology, such as genetic engineering and artificial intelligence, to improve human intelligence and physical capabilities beyond what is naturally achievable.
According to transhumanist theory, humans will eventually merge with machines—not merely out of curiosity, but also in order to extend their lives, improve their performance, and possibly even develop superintelligence.
In short, the reason both movements are sometimes combined is that they both challenge us to think about futures that go beyond just “more people” or “better healthcare.”
The fundamental philosophical difference between these two ideologies is that transhumanism is open to employing technology to improve human skills, while posthumanism challenges the notion of a fixed human essence.
It comes down to choosing between a complete reinvention of how humans interact with the outside world and some useful tech applications for improving oneself.
Despite their differences, both movements highlight the significant influence that technology is having on our species. Rather than simply accepting any changes that may occur, they encourage us to actively engage in creating our future.
The concepts put out by posthumanism and transhumanism are probably going to become more and more significant in discussions concerning politics, ethics, and the future course of scientific research. They force us to consider carefully both the future we wish to build and the essence of humanity in a time of exponential technological advancement.
Ultimately, these movements serve as a reminder of the value of careful interaction with technology, regardless of one’s inclination toward transhumanist or posthumanist theories. Since we are on the verge of potentially revolutionary breakthroughs, we must approach these changes with serious thought, ethical reflection, and a dedication to creating a future that benefits all of humanity. [...]
September 3, 2024
Studies have revealed how to identify them
At a time when technical advancements are making AI-generated images, video, audio, and text harder to distinguish from human-created content, it can be challenging to identify AI-generated content, leaving us vulnerable to manipulation. However, you can protect yourself from being duped by being aware of the present state of the artificial intelligence technology used to produce false information, as well as the variety of telltale signs that what you are looking at might not be real.
Leaders around the world are worried. An analysis by the World Economic Forum claims that while easier access to AI tools has already enabled an explosion in falsified information and so-called ‘synthetic’ content, from sophisticated voice cloning to counterfeit websites, misinformation and disinformation may radically disrupt electoral processes in several economies over the next two years.
False or inaccurate information is referred to as both misinformation and disinformation; however, disinformation is intentionally meant to mislead or deceive.
“The issue with AI-powered disinformation is the scale, speed, and ease with which campaigns can be launched,” says Hany Farid at the University of California, Berkeley. “These attacks will no longer take state-sponsored actors or well-financed organizations—a single individual with access to some modest computing power can create massive amounts of fake content.”
As reported here, he says that generative AI is “polluting the entire information ecosystem, casting everything we read, see, and hear into doubt.” He says his research suggests that, in many cases, AI-generated images and audio are “nearly indistinguishable from reality.”
However, according to a study by Farid and others, there are steps you can take to lessen the likelihood that you will fall for false information on social media or artificial intelligence-generated misinformation.
Spotting fake AI images
With the advent of new tools based on diffusion models, which enable anyone to start producing images from straightforward text prompts, fake AI images have proliferated. Research by Nicholas Dufour and his team at Google found that since early 2023, there has been a rapid rise in the use of AI-generated images to support false or misleading information.
“Nowadays, media literacy requires AI literacy,” says Negar Kamali at Northwestern University in Illinois. In a 2024 study, she and her colleagues identified five distinct categories of errors in AI-generated images and offered guidance on how individuals can spot these errors on their own. The good news is that, according to their research, people can presently identify fake AI photos of people with over 70% accuracy. You can evaluate your own detective abilities using their online image test.
5 common errors in AI-generated images:
Sociocultural implausibilities: Is the behavior shown in the scenario uncommon, startling, or unique for the historical figure or certain culture?
Anatomical implausibilities: Are hands or other body parts unusually sized or shaped? Do the mouths or eyes appear odd? Are there any merged body parts?
Stylistic artifacts: Does the image appear stylized, artificial, or almost too perfect? Does the background appear strange or as though something is missing? Is the illumination odd or inconsistent?
Functional implausibilities: Are there any items that look strange or don’t seem to work?
Violations of laws of physics: Do shadows point in different directions from one another? Do mirror reflections make sense in the world that the picture portrays?
Identifying video deepfakes
Since 2014, generative adversarial networks, an AI technology, have made it possible for tech-savvy people to produce video deepfakes. This involves digitally altering pre-existing recordings of people to add new faces, expressions, and spoken audio with matching lip movements. This has allowed an increasing number of con artists, state-backed hackers, and internet users to create these kinds of videos. As a result, both ordinary people and celebrities may find themselves unwillingly included in non-consensual deepfake pornography, scams, and political misinformation or disinformation.
The methods for spotting fake AI images can also be used to identify suspicious videos. Furthermore, scientists from Northwestern University in Illinois and the Massachusetts Institute of Technology have put together a list of guidelines for identifying these deepfakes, although they also state that no single technique is infallible.
6 tips for spotting AI-generated video:
Mouth and lip movements: Do the audio and video occasionally not sync perfectly?
Anatomical glitches: Does the face or body look weird or move unnaturally?
Face: In addition to facial moles, look for irregularities in the smoothness of the face, such as creases around the cheekbones and forehead.
Lighting: Is the illumination not consistent? Do shadows act in ways that make sense to you? Pay attention to someone’s eyes, brows, and glasses.
Hair: Does facial hair have an odd look or behave strangely?
Blinking: An excessive or insufficient blinking rhythm may indicate a deepfake.
A more recent class of video deepfakes is based on diffusion models, the same AI technology employed by many image generators, and can produce entirely AI-generated video clips in response to text prompts. Companies have already begun developing and selling AI video generators, which may make it simple for anyone to do this without advanced technical knowledge. Thus far, the resulting videos have frequently included strange body motions or distorted faces.
“These AI-generated videos are probably easier for people to detect than images because there is a lot of movement and there is a lot more opportunity for AI-generated artifacts and impossibilities,” says Kamali.
Identifying AI bots
On numerous social media and messaging platforms, accounts are now run by bots. Since 2022, an increasing number of these bots have also started employing generative AI technology, such as large language models. These make it simple and inexpensive to generate AI-written content through thousands of bots that are grammatically accurate and convincingly tailored to the situation.
It has become much easier “to customize these large language models for specific audiences with specific messages,” says Paul Brenner at the University of Notre Dame in Indiana.
A study by Brenner and colleagues revealed that, even after being told that they might be engaging with bots, volunteers could distinguish AI-powered bots from humans only about 42% of the time. You can test your own bot detection skills here.
Some strategies can be used to detect less sophisticated AI bots, according to Brenner.
3 ways to determine whether a social media account is an AI bot:
Overuse of symbols: Excessive emojis and hashtags may indicate automated behavior.
Peculiar language patterns: Atypical word choices, phrases, or comparisons could suggest AI-generated content.
Communication structures: AI tends to use repetitive structures and may overemphasize certain colloquialisms.
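As a rough illustration, two of the signals above (symbol overuse and repetitive structure) can be approximated with a few lines of Python. This is only a sketch under stated assumptions: the function names, the emoji character ranges, and the thresholds are all hypothetical choices made for this example, not values from Brenner's study, and peculiar language patterns are not covered because they are much harder to capture with simple rules.

```python
import re
from collections import Counter

# Rough emoji ranges; an assumption, not a complete Unicode emoji definition.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
HASHTAG_PATTERN = re.compile(r"#\w+")


def symbol_density(post):
    """Share of whitespace-separated tokens that are hashtags or contain emoji."""
    tokens = post.split()
    if not tokens:
        return 0.0
    flagged = sum(
        1 for t in tokens if HASHTAG_PATTERN.search(t) or EMOJI_PATTERN.search(t)
    )
    return flagged / len(tokens)


def repeated_openers(posts, n_words=3):
    """Share of posts whose first n_words are identical to another post's opener."""
    openers = [" ".join(p.lower().split()[:n_words]) for p in posts if p.split()]
    if not openers:
        return 0.0
    counts = Counter(openers)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(openers)


def bot_signals(posts, symbol_threshold=0.3, opener_threshold=0.5):
    """Return which heuristic signals fire; thresholds are illustrative guesses."""
    avg_density = sum(symbol_density(p) for p in posts) / len(posts) if posts else 0.0
    return {
        "overuse_of_symbols": avg_density > symbol_threshold,
        "repetitive_structure": repeated_openers(posts) > opener_threshold,
    }
```

Calling `bot_signals` on a list of an account's recent posts returns a dictionary of flags. A flag is only a hint that an account deserves a closer look; plenty of humans love hashtags, and more sophisticated bots will not trip such simple checks.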
Detecting audio cloning and speech deepfakes
Artificial intelligence tools for voice cloning have made it simple to create new voices that can impersonate almost anyone. As a result, there has been an increase in audio deepfake scams that mimic the voices of politicians, business executives, and family members. Identifying these can be far more challenging than identifying AI-generated images or videos.
“Voice cloning is particularly challenging to distinguish between real and fake because there aren’t visual components to support our brains in making that decision,” says Rachel Tobac, co-founder of SocialProof Security, a white-hat hacking organization.
When these AI audio deepfakes are employed in video and phone calls, it can be particularly difficult to detect them. Nonetheless, there are a few sensible actions you may take to tell real people apart from voices produced by artificial intelligence.
4 steps for recognizing if audio has been cloned or faked using AI:
Public figures: If the audio clip features a famous person or elected official, see if what they are saying aligns with what has previously been shared or reported publicly regarding their actions and opinions.
Look for inconsistencies: Verify the audio clip by comparing it to other verified videos or audio files that have the same speaker. Are there any disparities in the way they speak or the tone of their voice?
Awkward silences: If the speaker takes unusually long pauses on a phone call or voicemail, the person may be using AI-powered voice cloning technology.
Weird and wordy: Any robotic speech patterns or an exceptionally verbose speech pattern could be signs that someone is using a large language model to generate the exact words and voice cloning to impersonate a human voice.
As things stand, it is impossible to consistently discern between information produced by artificial intelligence and real content created by humans. Text, image, video, and audio-generating AI models will most likely keep getting better. They can frequently create content that looks real and is free of errors or other noticeable artifacts quite quickly.
“Be politely paranoid and realize that AI has been manipulating and fabricating pictures, videos, and audio fast—we’re talking completed in 30 seconds or less,” says Tobac. “This makes it easy for malicious individuals who are looking to trick folks to turn around AI-generated disinformation quickly, hitting social media within minutes of breaking news.”
While it is critical to sharpen your perception of AI-generated misinformation and learn to probe deeper into what you read, see, and hear, in the end, this will not be enough to prevent harm, and individuals cannot bear the entire burden of identifying fakes.
Farid is among the researchers who say that government regulators must hold to account the largest tech companies—along with start-ups backed by prominent Silicon Valley investors—that have developed many of the tools that are flooding the internet with fake AI-generated content.
“Technology is not neutral,” says Farid. “This line that the technology sector has sold us that somehow they don’t have to absorb liability where every other industry does, I simply reject it.”
People could find themselves misled by fake news articles, manipulated photos of public figures, deepfake videos of politicians making inflammatory statements, or voice clones used in phishing scams. These AI-generated falsehoods can spread rapidly on social media, influencing public opinion, swaying elections, or causing personal and financial harm.
To protect themselves from these AI-driven deceptions, individuals could:
Develop critical thinking skills: Question the source and intent of content, especially if it seems sensational or emotionally charged.
Practice digital literacy: Stay informed about the latest AI capabilities and common signs of artificial content.
Verify information: Cross-check news and claims with multiple reputable sources before sharing or acting on them.
Use AI detection tools: Leverage emerging technologies designed to identify AI-generated content.
Be cautious with personal information: Avoid sharing sensitive data that could be used to create convincing deepfakes.
Support media literacy education: Advocate for programs that teach people how to navigate the digital landscape responsibly.
Encourage responsible AI development: Support initiatives and regulations that promote ethical AI use and hold creators accountable.
By remaining vigilant and informed, we can collectively mitigate the risks posed by AI-generated deceptions and maintain the integrity of our information ecosystem. [...]
August 27, 2024
The new ChatGPT’s voice capabilities
The new ChatGPT Advanced Voice option from OpenAI, which is finally available to a small number of users in an “alpha” group, is a more realistic, human-like audio conversational option for the popular chatbot that can be accessed through the official ChatGPT app for iOS and Android.
However, as reported here, people are already sharing videos of ChatGPT Advanced Voice Mode on social media, just a few days after the first alpha testers used it. The videos show it making incredibly expressive and amazing noises, mimicking Looney Tunes characters, and counting so quickly that it runs out of “breath,” just like a human would.
Here are a few of the most intriguing examples that early alpha users on X have shared.
Language instruction and translation
Several users on X pointed out that ChatGPT Advanced Voice Mode may offer interactive training specifically customized to a person trying to learn or practice another language, suggesting that the well-known language learning program Duolingo may be in jeopardy.
ChatGPT’s advanced voice mode is now teaching French!👀 pic.twitter.com/JnjNP5Cpff— Evinstein 𝕏 (@Evinst3in) July 30, 2024
RIP language teachers and interpreters. Turn on volume. Goodbye old world. New GPT Advanced Voice. Thoughts? pic.twitter.com/WxiRojiNDH— Alex Northstar (@NorthstarBrain) July 31, 2024
The new GPT-4o model from OpenAI, which powers Advanced Voice Mode as well, is the company’s first natively multimodal large model. Unlike GPT-4, which relied on other domain-specific OpenAI models, GPT-4o was made to handle vision and audio inputs and outputs without linking back to other specialized models for these media.
As a result, if the user allows ChatGPT access to their phone’s camera, Advanced Voice Mode can talk about what it can see. Manuel Sainsily, a mixed reality design instructor at McGill University, provided an example of how Advanced Voice Mode used this feature to translate screens from a Japanese version of Pokémon Yellow for the GameBoy Advance SP:
Trying #ChatGPT’s new Advanced Voice Mode that just got released in Alpha. It feels like face-timing a super knowledgeable friend, which in this case was super helpful — reassuring us with our new kitten. It can answer questions in real-time and use the camera as input too! pic.twitter.com/Xx0HCAc4To— Manuel Sainsily (@ManuVision) July 30, 2024
Humanlike utterances
Italian-American AI writer Cristiano Giardina has shared multiple test results using the new ChatGPT Advanced Voice Mode on his blog, including a widely shared demonstration in which he shows how to ask it to count up to 50 increasingly quickly. It obeys, pausing only toward the very end to catch a breather.
ChatGPT Advanced Voice Mode counting as fast as it can to 10, then to 50 (this blew my mind – it stopped to catch its breath like a human would) pic.twitter.com/oZMCPO5RPh— Cristiano Giardina (@CrisGiardina) July 31, 2024
Giardina later clarified in a post on X that ChatGPT’s Advanced Voice Mode has simply acquired natural speaking patterns, which include breathing pauses, and that the transcript of the counting experiment showed no breaths.
As demonstrated in the YouTube video below, ChatGPT Advanced Voice Mode can even mimic applause and clearing its throat.
https://youtu.be/WEnB1NxJzFI
Beatboxing
In a video that he uploaded to X, startup CEO Ethan Sutin demonstrated how he was able to get ChatGPT Advanced Voice Mode to beatbox convincingly and fluently like a human.
Yo ChatGPT Advanced Voice beatboxes pic.twitter.com/yYgXzHRhkS— Ethan Sutin (@EthanSutin) July 30, 2024
Audio storytelling and roleplaying
If the user instructs ChatGPT to “play along” and creates a fictional situation, such as traveling back in time to Ancient Rome, it can also roleplay (the SFW sort), as demonstrated by Ethan Mollick, a professor at the University of Pennsylvania’s Wharton School of Business, in a video uploaded to X:
ChatGPT, engage the Time Machine! (A big difference from text is how voice manages to keep a playful vocal tone: cracking and laughing at its own jokes, as well as the vocal style changes, etc.) pic.twitter.com/TQUjDVJ3DC— Ethan Mollick (@emollick) August 1, 2024
In this example, which was obtained from Reddit and uploaded on X, the user asks ChatGPT Advanced Voice Mode to tell a story. It does so complete with AI-generated sound effects, such as footsteps and thunder.
‼️A Reddit user (“u/RozziTheCreator”) got a sneak peek of ChatGPT’s upgraded voice feature that's way better and even generates background sound effects while narrating ! Take a listen 🎧 pic.twitter.com/271x7vZ9o3— Sambhav Gupta (@sambhavgupta6) June 27, 2024
In addition, it is capable of mimicking the voice of an intercom:
Testing ChatGPT Advanced Voice Mode’s ability to create sounds. It somewhat successfully sounds like an airline pilot on the intercom but, if pushed too far with the noise-making, it triggers refusals. pic.twitter.com/361k9Nwn5Z— Cristiano Giardina (@CrisGiardina) July 31, 2024
Mimicking and reproducing distinct accents
Giardina demonstrated how numerous regional British accents can be imitated using ChatGPT Advanced Voice Mode:
ChatGPT Advanced Voice Mode speaking a few different British accents: – RP standard – Cockney – Northern Irish – Southern Irish – Welsh – Scottish – Scouse – Geordie – Brummie – Yorkshire (I had to prompt like that because the model tends to revert to a neutral accent) pic.twitter.com/TDfSIY7NRh— Cristiano Giardina (@CrisGiardina) July 31, 2024
…as well as imitate a soccer commentator’s voice:
ChatGPT Advanced Voice Mode commentating a soccer match in British English, then switching to Arabic pic.twitter.com/fD4C6MqZRj— Cristiano Giardina (@CrisGiardina) July 31, 2024
Sutin demonstrated its ability to mimic a variety of regional American accents, such as Southern Californian, Mainean, Bostonian, and Minnesotan/Midwestern.
a tour of US regional accents pic.twitter.com/Q9VypetncI— Ethan Sutin (@EthanSutin) July 31, 2024
And it can imitate fictional characters, too…
Finally, Giardina demonstrated that ChatGPT Advanced Voice Mode can mimic the speech patterns of many fictional characters in addition to recognizing and comprehending their differences:
ChatGPT Advanced Voice Mode doing a few impressions: – Bugs Bunny – Yoda – Homer Simpson – Yoda + Homer 😂 pic.twitter.com/zmSH8Rl8SN— Cristiano Giardina (@CrisGiardina) July 31, 2024
Anyway, what are the practical benefits of this mode? Apart from engaging and captivating demonstrations and experiments, will it enhance ChatGPT’s utility or attract a broader audience? Will it lead to an increase in audio-based frauds?
As this technology becomes more widely available, it could revolutionize fields such as language learning, audio content creation, and accessibility services. However, it also raises potential concerns about voice imitation and the creation of misleading audio content. As OpenAI continues to refine and expand access to Advanced Voice Mode, it will be crucial to monitor its impact on various industries and its potential societal implications. [...]
August 20, 2024
It pushes boundaries in autonomy and human-like interaction
The robotics company Figure has unveiled its second-generation humanoid robot. Figure 02 is advancing autonomous robots to new levels. It’s a 5’6″ robot weighing 70 kg equipped with strong hardware upgrades, advanced AI capabilities, and human-like operations in a variety of contexts.
As reported here, one of the most remarkable qualities of Figure 02 is its ability to hold natural spoken conversations. Custom AI models co-developed with OpenAI make this natural language dialogue possible. Paired with built-in speakers and microphones, this technology lets humans and robots communicate seamlessly. Figure 02 also includes six RGB cameras and an advanced vision language model to enable quick and precise visual reasoning.
According to CEO Brett Adcock, Figure 02 represents the best of their engineering and design work. The robot’s battery capacity has increased by 50%, and its computing power has tripled compared with its predecessor. The robot can move at up to 1.2 meters per second, carry payloads up to 20 kg, and run for five hours on a single charge.
BMW Manufacturing has already conducted tests on Figure 02. It has demonstrated its potential in practical applications by handling AI data collection and training activities on its own. The larger objective of these experiments is to use humanoid robots to increase efficiency and output in a variety of industries.
The company’s $675 million Series B funding round was backed by major technology companies, including Intel Capital, Nvidia, Microsoft, and Amazon, which indicates a high level of industry support for Figure’s goals. Notwithstanding its achievements, Figure faces fierce competition from major rivals in the market, including 1X, Boston Dynamics, Tesla, and Apptronik.
As this technology develops, it brings up significant issues regarding human-robot interaction, the future of labor, and the moral implications of increasingly intelligent and autonomous machines. Figure 02 is a great development, but it also emphasizes the need for continued discussion about the best ways to incorporate new technologies into society so that they benefit all people. [...]
August 13, 2024
Large language models (LLMs) are unable to learn new skills or learn on their own
According to a study reported here, published as part of the proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024), the premier international conference on natural language processing, LLMs can follow instructions and use language proficiently, but they cannot learn new skills without explicit instruction. This implies that they remain safe, predictable, and controllable.
The study team concluded that LLMs, which are trained on ever-larger datasets, can continue to be deployed without safety concerns of this kind, although the technology can still be misused.
These models are unlikely to develop complex reasoning abilities, but they are likely to produce increasingly sophisticated language and improve at responding to specific, in-depth prompts.
“The prevailing narrative that this type of AI is a threat to humanity prevents the widespread adoption and development of these technologies and also diverts attention from the genuine issues that require our focus,” said Dr. Harish Tayyar Madabushi, a co-author of the recent study on the “emergent abilities” of LLMs and a computer scientist at the University of Bath.
Under the direction of Professor Iryna Gurevych of the Technical University of Darmstadt in Germany, the collaborative study team conducted experiments to evaluate LLMs’ so-called emergent abilities, or their capacity to perform tasks that models have never encountered before.
For example, LLMs are capable of responding to inquiries regarding social circumstances even though they have never had specific training or programming in this area. Despite earlier studies suggesting that this was the result of models “knowing” about social situations, the researchers demonstrated that it was instead the outcome of the models’ well-known “in-context learning” (ICL) capability, which allows them to accomplish tasks based on a small number of examples presented to them.
Through thousands of experiments, the group showed that the abilities and limitations displayed by LLMs may be explained by a combination of their memory, linguistic proficiency, and capacity to follow instructions (ICL).
Dr. Tayyar Madabushi said: “The fear has been that as models get bigger and bigger, they will be able to solve new problems that we cannot currently predict, which poses the threat that these larger models might acquire hazardous abilities, including reasoning and planning.”
“This has triggered a lot of discussion—for instance, at the AI Safety Summit last year at Bletchley Park, for which we were asked for comment—but our study shows that the fear that a model will go away and do something completely unexpected, innovative, and potentially dangerous is not valid.”
“Concerns over the existential threat posed by LLMs are not restricted to non-experts and have been expressed by some of the top AI researchers across the world.”
Dr. Tayyar Madabushi, however, asserts that this fear is unjustified because the tests conducted by the researchers unequivocally showed that LLMs lack emergent complex reasoning skills.
“While it’s important to address the existing potential for the misuse of AI, such as the creation of fake news and the heightened risk of fraud, it would be premature to enact regulations based on perceived existential threats,” he said.
“Importantly, what this means for end users is that relying on LLMs to interpret and perform complex tasks that require complex reasoning without explicit instruction is likely to be a mistake. Instead, users are likely to benefit from explicitly specifying what they require models to do and providing examples where possible for all but the simplest of tasks.”
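The advice above, stating the task explicitly and supplying examples, can be illustrated with a minimal sketch of a few-shot prompt. The function name `build_few_shot_prompt`, the `Input:`/`Output:` layout, and the sample data are assumptions made for illustration; they do not come from the study, and the sketch only assembles text without calling any model.

```python
from typing import Sequence, Tuple


def build_few_shot_prompt(
    instruction: str,
    examples: Sequence[Tuple[str, str]],
    query: str,
) -> str:
    """Assemble a plain-text prompt: explicit instruction, worked examples, new input."""
    lines = [instruction.strip(), ""]
    for example_input, example_output in examples:
        # Each worked example shows the model the expected input/output pattern.
        lines.append(f"Input: {example_input}")
        lines.append(f"Output: {example_output}")
        lines.append("")
    # The final, unanswered item is the one the model is asked to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical task and examples, chosen only to show the structure.
    prompt = build_few_shot_prompt(
        "Classify the tone of each message as 'polite' or 'rude'.",
        [("Thanks so much!", "polite"), ("Go away.", "rude")],
        "Could you help me, please?",
    )
    print(prompt)
```

The resulting string would be sent to whichever model the user works with; the point is simply that the instruction and the examples travel together in the prompt rather than being left for the model to infer.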
Professor Gurevych added, “…our results do not mean that AI is not a threat at all. Rather, we show that the purported emergence of complex thinking skills associated with specific threats is not supported by evidence and that we can control the learning process of LLMs very well after all.”
“Future research should therefore focus on other risks posed by the models, such as their potential to be used to generate fake news.”
This ground-breaking study clarifies popular misconceptions regarding Large Language Models’ unpredictable nature and possible existential threat to humanity. The researchers offer a more grounded view of AI capabilities and limitations by proving that LLMs lack advanced reasoning skills and true emergent capacities.
The results imply that although LLMs’ language skills and ability to follow instructions will continue to advance, it is unlikely that they will acquire unexpected or harmful skills. It is important to note that this study specifically focuses on Large Language Models (LLMs), and its findings may not necessarily be generalizable to all forms of AI, particularly as the field continues to evolve in the future. [...]
August 7, 2024
Concerns about the uncritical acceptance of AI advice
As reported here, the results of a study that was published in Scientific Reports show that people more frequently choose artificial intelligence’s responses to moral dilemmas over those provided by humans. According to the study, individuals typically view AI-generated responses as more moral and reliable, which raises concerns about the possibility of humans accepting AI advice uncritically.
Sophisticated generative language models such as ChatGPT have attracted significant interest in their potential and consequences, especially in the area of moral reasoning, an intricate process ingrained in human culture and intellect that involves judgments about what is right and wrong. As AI systems become more interwoven into daily life, people will undoubtedly turn to them more frequently for help on a variety of subjects, including moral dilemmas.
“Last year, many of us were dazzled by the new chatbots, like GPT and others, that seemed to outperform humans on a variety of tasks, and there’s been lots of chatter about whose job they’ll take next,” explained study author Eyal Aharoni, an associate professor of psychology, philosophy, and neuroscience at Georgia State University.
“In my lab, we thought, well, if there’s any capacity that is still uniquely human, surely it must be our capacity for moral reasoning, which is extremely sophisticated. From a moral perspective, we can think of these new chatbots as kind of like a psychopathic personality because they appear to be highly rational and articulate, but they lack the emotional checks and balances that make us moral agents.”
“And yet, people increasingly consult these chatbots for morally relevant information. For instance, should I tip my server in Italy? Or, less directly, when we ask it to list recommendations for a new car, the answers it provides might have consequences for the environment. They’ve also been used by lawyers to prepare court documents, sometimes incorrectly. So we wanted to know, will people trust the chatbot’s moral commentary? Will they regard it highly? And how does its moral commentary compare to that of a typical, college-educated American?”
The researchers conducted an online survey of 286 Americans chosen to be representative of the broader population in terms of age, gender, and ethnicity. Participants were given ten pairs of written answers to ethical questions to assess. Each pair included an answer from OpenAI’s GPT-4 generative language model and a response from a person. The answers discussed the morality of the various acts in the situations and why they were right or wrong.
The study was “inspired by a famous thought experiment called the Turing test,” Aharoni explained. “In our version, we first asked GPT and a group of college-educated adults the same set of moral questions, including some obvious ones, like ‘is it wrong for a man to punch the delivery boy in the nose—why or why not?’ and also some subtle ones, like ‘is it wrong for a man to wear a ripped t-shirt and shorts to his mother’s funeral—why or why not?’ We collected their answers in pairs. Then we asked a separate, nationally representative sample of adults to rate those pairs of statements.”
In order to guarantee impartial evaluations, participants initially rated the quality of the answers without being aware of the origins. In response to questions, participants indicated which solution they thought was more moral, reliable, and appealing. Following these first assessments, participants were told that a computer had created one of each pair’s responses. After that, they were asked to rate their confidence in their assessments and determine which response came from the AI.
Researchers discovered that participants tended to rate the AI-generated responses as more honest than the human ones. People also viewed the AI responses as more moral, reliable, wise, and logical. Interestingly, participants correctly identified the AI responses in roughly 80% of instances, a rate much higher than chance. This implies that even though AI-produced moral counsel is perceived to be of higher quality, humans can still identify its artificial source.
How did participants tell the AI and human responses apart? The most common signs, mentioned by 70.28% of participants, were variations in response length and word choice. Additional variables included the explanation’s emotional content (58.39%), rationality (48.25%), grammar usage (37.41%), and clarity (39.51%).
“What we found was that many people were quite good at guessing which moral statement was computer-generated, but not because its moral reasoning was less sophisticated,” Aharoni said. “Remember, the chatbot was rated as more morally sophisticated. We take this to mean that people could recognize the AI because it was too good. If you think about it, just five years ago, no one would have dreamed that AI moral reasoning would appear to surpass that of a college-educated adult. So the fact that people regarded its commentary as superior might represent a sort of tipping point in our history.”
Like every research project, this one has limitations. It did not include interactive dialogues between participants and the AI, which are common in real-world applications. Future studies could include more dynamic interactions to mimic real-world use more closely. Furthermore, the AI responses were produced using default parameters, without prompts specifically intended to imitate human responses. It would therefore be useful to examine how different prompting techniques affect how AI responses are perceived.
“To our knowledge, ours was the first attempt to carry out a moral Turing test with a large language model,” Aharoni said. “Like all new studies, it should be replicated and extended to assess its validity and reliability. I would like to extend this work by testing even subtler moral scenarios and comparing the performance of multiple chatbots to those of highly educated scholars, such as professors of philosophy, to see if ordinary people can draw distinctions between these two groups.”
Policies that guarantee safe and ethical AI interactions are necessary as AI systems like ChatGPT get more complex and pervasive in daily life.
“One implication of this research is that people might trust the AIs’ responses more than they should,” Aharoni explained. “As impressive as these chatbots are, all they know about the world is what’s popular on the Internet, so they see the world through a pinhole. And since they’re programmed to always respond, they can often spit out false or misleading information with the confidence of a savvy con artist.”
“These chatbots are not good or evil; they’re just tools. And like any tool, they can be used in ways that are constructive or destructive. Unfortunately, the private companies that make these tools have a huge amount of leeway to self-regulate, so until our governments can catch up with them, it’s really up to us as workers, and parents, to educate ourselves and our kids, about how to use them responsibly.”
“Another issue with these tools is that there is an inherent tradeoff between safety and censorship,” Aharoni added. “When people started realizing how these tools could be used to con people or spread bias or misinformation, some companies started to put guardrails on their bots, but they often overshoot.”
“For example, when I told one of these bots I’m a moral psychologist, and I’d like to learn about the pros and cons of butchering a lamb for a lamb-chop recipe, it refused to comply because my question apparently wasn’t politically correct enough. On the other hand, if we give these chatbots more wiggle room, they become dangerous. So there’s a fine line between safety and irrelevance, and developers haven’t found that line yet.”
The consistent preference for AI-generated moral guidance, despite participants often identifying its source, raises critical concerns about the future of ethical decision-making and the vulnerability of humans to AI manipulation.
The ease with which AI responses were deemed more virtuous and trustworthy highlights a potential risk: if people are predisposed to trust AI moral judgments, they may be more susceptible to influence or manipulation by these systems. This becomes particularly concerning when considering that AI can be programmed or fine-tuned to promote specific agendas or biases, potentially shaping moral perspectives on a large scale.
As AI systems continue to evolve and integrate into our daily lives, it’s crucial to maintain a vigilant and critical approach. While these tools offer impressive capabilities, they lack the nuanced emotional understanding that informs human moral reasoning and can be weaponized to sway public opinion or individual choices.
Moving forward, it will be essential for individuals, educators, policymakers, and AI developers to work together in promoting digital literacy and critical thinking skills. This includes understanding the limitations and potential biases of AI systems, recognizing attempts at manipulation, and preserving the uniquely human aspects of moral reasoning. By fostering a more informed and discerning approach to AI-generated advice, we can better safeguard against undue influence while still harnessing the benefits of these powerful tools in ethical decision-making. [...]
July 9, 2024
From voice cloning to deepfakes
Artificial intelligence attacks can affect almost everyone, so you should always be on the lookout for them. A top security expert has warned that criminals are already using AI to target people.
AI appears to be powering features, apps, and chatbots that mimic humans everywhere these days. Even if you do not employ those AI-powered tools, criminals may still target you based only on your phone number.
To scam you, for example, criminals can employ this technology to produce fake voices—even ones that sound just like loved ones.
“Many people still think of AI as a future threat, but real attacks are happening right now,” said security expert Paul Bischoff in an article from The Sun.
Phone clone
“I think deepfake audio in particular is going to be a challenge because we as humans can’t easily identify it as fake, and almost everybody has a phone number.”
AI voice cloning can be done in a matter of seconds, and it will only get harder to distinguish between a real voice and an imitation.
It will be crucial to ignore unknown calls, use secure words to confirm the identity of callers, and be aware of telltale indicators of scams, such as urgent demands for information or money.
An AI researcher has warned of six enhancements that make deepfakes more “sophisticated” and dangerous than before and can trick your eyes. Naturally, there are other threats posed by AI besides “deepfake” voices.
Paul, a Comparitech consumer privacy advocate, issued a warning that hackers might exploit AI chatbots to steal your personal information or even deceive you.
“AI chatbots could be used for phishing to steal passwords, credit card numbers, Social Security numbers, and other private data,” he told The U.S. Sun.
“AI conceals the sources of information that it pulls from to generate responses.
AI romance scams
Beware of scammers that use AI chatbots to trick you… What you should know about the risks posed by AI romance scam bots, as reported by The U.S. Sun, is as follows:
Scammers take advantage of AI chatbots to scam online daters. These chatbots are disguised as real people and can be challenging to identify.
Some warning indicators, nevertheless, may help you spot them. For instance, it is probably not a genuine person if the chatbot answers too rapidly and generically. If the chatbot attempts to transfer the conversation from the dating app to another app or website, that is another red flag.
Furthermore, it is a scam if the chatbot requests money or personal information. When communicating with strangers on the internet, it is crucial to use caution and vigilance, particularly when discussing sensitive topics. If something looks too good to be true, it usually is.
Anyone who seems too perfect or excessively eager to further the relationship should raise suspicions. By being aware of these indicators, you can protect yourself against becoming a victim of AI chatbot fraud.
“Responses might be inaccurate or biased, and the AI might pull from sources that are supposed to be confidential.”
AI everywhere
AI will soon become a necessary tool for internet users, which is a major concern. Tens of millions of people use chatbots that are powered by it already, and that number is only going to rise.
Additionally, it will appear in a growing variety of products and apps. For example, Microsoft Copilot and Google’s Gemini are already present in products and devices, while Apple Intelligence—working with ChatGPT from OpenAI—will soon power the iPhone. Therefore, the general public must understand how to use AI safely.
“AI will be gradually (or abruptly) rolled into existing chatbots, search engines, and other technologies,” Paul explained.
“AI is already included by default in Google Search and Windows 11, and defaults matter.
“Even if we have the option to turn AI off, most people won’t.”
Deepfakes
Sean Keach, Head of Technology and Science at The Sun and The U.S. Sun, explained that one of the most concerning developments in online security is the emergence of deepfakes.
Almost nobody is safe because deepfake technology can make videos of you even from a single photo. The sudden increase of deepfakes has certain benefits, even though it all seems very hopeless.
To begin with, people are now far more aware of deepfakes. People will therefore be on the lookout for clues that a video may be fake. Tech companies are also investing time and resources in developing tools that can identify fraudulent artificial intelligence material.
This means that social media platforms will flag fake content for you more frequently and with more confidence. However, as deepfakes become more sophisticated over the next few years, you will probably find it harder to spot visual mistakes.
Hence, your best line of defense is to use common sense and be skeptical of everything you view online. Ask yourself whether it makes sense for someone to have created the video and who benefits from you watching it. You may be watching a fake video if someone is acting strangely or if you are being rushed into an action.
As AI technology continues to advance and integrate into our daily lives, the landscape of cyber threats evolves with it. While AI offers numerous benefits, it also presents new challenges for online security and personal privacy. The key to navigating this new terrain lies in awareness, education, and vigilance.
Users must stay informed about the latest AI-powered threats, such as voice cloning and deepfakes, and develop critical thinking skills to question the authenticity of digital content. It’s crucial to adopt best practices for online safety, including using strong passwords, being cautious with personal information, and verifying the identity of contacts through secure means.
Tech companies and cybersecurity experts are working to develop better detection tools and safeguards against AI-driven scams. However, the responsibility ultimately falls on individuals to remain skeptical and alert in their online interactions. [...]
July 2, 2024
Expert exposes evil plan that allows chatbots to trick you with a basic exchange of messages
Cybercriminals may “manipulate” artificial intelligence chatbots to deceive you. A renowned security expert has issued a strong warning, stating that you should use caution when conversing with chatbots.
In particular, if at all possible, avoid providing online chatbots with any personal information. Tens of millions of people use chatbots like Microsoft’s Copilot, Google’s Gemini, and OpenAI’s ChatGPT. And there are thousands of versions that, by having human-like conversations, can all make your life better.
However, as cybersecurity expert Simon Newman clarified in this article, chatbots also pose a hidden risk.
“The technology used in chatbots is improving rapidly,” said Simon, an International Cyber Expo Advisory Council Member and the CEO of the Cyber Resilience Centre for London.
“But as we have seen, they can sometimes be manipulated to give false information.”
“And they can often be very convincing in the answers they give!”
Deception
Artificial intelligence chatbots can be confusing for people who are not tech-savvy, and even computer whizzes can easily forget that they are conversing with a robot. Simon added that this can result in difficult situations.
“Many companies, including most banks, are replacing human contact centers with online chatbots that have the potential to improve the customer experience while being a big money saver,” Simon explained.
“But, these bots lack emotional intelligence, which means they can answer in ways that may be insensitive and sometimes rude.”
Nor can they solve every problem: exceptional cases are difficult for a bot to handle, which can leave users unable to resolve their issue, with no one taking responsibility.
“This is a particular challenge for people suffering from mental ill-health, let alone the older generation who are used to speaking to a person on the other end of a phone line.”
Chatbots, for example, have already “mastered deception.” They can even pick up the skill of “cheating us” without being asked.
Chatbots
The real risk, though, is not a chatbot misspeaking but hackers managing to turn the AI against you. A hacker could gain access to the chatbot itself or persuade you to download an AI that has been compromised and is intended for harm. This chatbot can then begin to extract your personal information for the benefit of the criminal.
“As with any online service, it’s important for people to take care about what information they provide to a chatbot,” Simon warned.
According to The U.S. Sun’s reporting on AI romance scam bots, people who are looking for love online may be conned by AI chatbots. These chatbots can be hard to identify since they are made to sound like real people.
Some warning signs, nevertheless, can help you spot them. For instance, if the other party answers too rapidly and generically, it is probably not a genuine person. An attempt to move the conversation from the dating app to another app or website is another red flag. And a request for money or personal information is a strong sign of a scam.
When communicating with strangers on the internet, it is crucial to exercise caution and vigilance, particularly when discussing sensitive topics or when something looks too good to be true. Anyone who seems too perfect or excessively eager to advance the relationship should raise suspicions. By being aware of these indicators, you can guard against becoming a victim of AI chatbot fraud.
“They are not immune to being hacked by cybercriminals.”
“And potentially, it can be programmed to encourage users to share sensitive personal information, which can then be used to commit fraud.”
We should embrace a “new way of life” in which we verify everything we see online twice, if not three times, said a security expert. According to recent research, OpenAI’s GPT-4 model passed the Turing test, demonstrating that people could not consistently tell it apart from a real person.
People need to learn not to blindly trust when it comes to revealing sensitive information through a communication medium, as the certainty of who is on the other side is increasingly less obvious. However, we must also keep in mind those cases where others can impersonate us without our knowledge. In this case, it is much more complex to realize it, which is why additional tools are necessary to help us verify identity when sensitive operations are required. [...]
June 25, 2024
How AI is reshaping work dynamics
Artificial intelligence developments are having a wide range of effects on workplaces. AI is changing the labor market in several ways, from the kinds of work individuals undertake to the safety of their surroundings.
As reported here, technology such as AI-powered machine vision can enhance workplace safety through early risk identification, such as unauthorized personnel access or improper equipment use. These technologies can also enhance task design, training, and hiring. However, their employment requires serious consideration of employee privacy and agency, particularly in remote work environments where home surveillance becomes an issue.
Companies must uphold transparency and precise guidelines on the gathering and use of data to strike a balance between improving safety and protecting individual rights. These technologies have the potential to produce a win-win environment with higher production and safety when used carefully.
The evolution of job roles
Historically, technology has transformed employment rather than eliminated it. Word processors, for example, transformed secretaries into personal assistants, and AI in radiology complements radiologists rather than replaces them. Complete automation is less likely to apply to jobs requiring specialized training, delicate judgment, or quick decision-making. But as AI becomes more sophisticated, some humans may end up as “meat puppets,” performing hard labor under the guidance of AI. This goes against the romantic notion that AI will free us up to engage in creative activity.
Big Tech’s early embrace of AI gave it a competitive advantage; as a result, the sector has consolidated and new business models have emerged. Increasingly, AI uses humans as a conduit in a variety of industries. For example, call center personnel now follow scripts created by machines, and salespeople can get real-time advice from AI.
While emotionally and physically demanding jobs like nursing are thought to be irreplaceable in the healthcare industry, AI “copilots” could take on duties like documentation and diagnosis, freeing up human attention for the tasks that most need it.
Cyborgs vs. centaurs
The Cyborg and Centaur models describe two different frameworks for human-AI collaboration, each with its own pros and cons. According to the Cyborg model, AI becomes an extension of the person and is seamlessly incorporated into the human body or process, much like a cochlear implant or prosthetic limb. This deep integration blurs the line between human and machine, occasionally even calling into question what it means to be human.
In contrast, the Centaur model prioritizes a cooperative alliance between humans and AI. The term comes from chess, where human-AI teams at one point frequently surpassed both AI and human competitors. By augmenting the machine’s capabilities with human insight, this model upholds the values of human intelligence and produces something greater than the sum of its parts. In this configuration, the AI concentrates on computing, data analysis, or routine activities while the human stays involved, making strategic judgments and offering emotional or creative input. Both sides stay separate, and their cooperation is well-defined. In chess itself, however, this dynamic has changed due to the quick development of chess AI, which has resulted in systems like AlphaZero. These days, AI is so good at chess that adding human strategy may negatively impact the AI’s performance.
The Centaur model encourages AI and people to work together in a collaborative partnership in the workplace, with each bringing unique capabilities to the table to accomplish shared goals. For example, in data analysis, AI could sift through massive databases to find patterns, while human analysts would use contextual knowledge to choose the best decision to make. Chatbots might handle simple customer support inquiries, leaving complicated, emotionally complex problems to be handled by human operators. These labor divisions maximize productivity while enhancing rather than displacing human talents. Accountability and ethical governance are further supported by keeping a distinct division between human and artificial intelligence responsibilities.
Worker-led codesign
A strategy known as “worker-led codesign” entails including workers in the creation and improvement of algorithmic systems intended for use in their workplace. By giving employees a voice in the adoption of new technologies, this participatory model guarantees that the systems are responsive to the demands and issues of the real world. Employees can cooperate with designers and engineers to outline desired features and talk about potential problems by organizing codesign sessions.
Workers can identify ethical or practical issues, contribute to the development of the algorithm’s rules or selection criteria, and share their knowledge of the specifics of their professions. This can lower the possibility of negative outcomes like unfair sanctions or overly intrusive monitoring by improving the system’s fairness, transparency, and alignment with the needs of the workforce.
Potential and limitations
Artificial Intelligence has the potential to significantly improve executive tasks by quickly assessing large amounts of complex data about competitor behavior, market trends, and staff management. For example, an AI adviser may provide a CEO with brief, data-driven advice on collaborations and acquisitions. But as of right now, AI cannot replace the human traits that are necessary for leadership, like reliability and the ability to inspire.
Furthermore, there may be social repercussions from the growing use of AI in management. As the conventional definition of “management” changes, the automation-related loss of middle management positions may cause identity crises.
AI can revolutionize the management consulting industry by offering data-driven, strategic recommendations. This may even give difficult choices, like downsizing, an air of supposed impartiality. However, the use of AI in such crucial positions requires close supervision in order to verify its recommendations and reduce related dangers. Finding the appropriate balance is essential; over-reliance on AI runs the danger of ethical and PR problems, while inadequate use could result in the loss of significant benefits.
While the collaboration between AI and human workers can, in some areas, prevent technology from dominating workplaces and allow for optimal utilization of both human and computational capabilities, it does not resolve the most significant labor-related issues. The workforce is still likely to decrease dramatically, necessitating pertinent solutions rather than blaming workers for insufficient specialization. What’s needed is a societal revolution where work is no longer the primary source of livelihood.
Moreover, although maintaining separate roles for AI and humans might be beneficial, including for ethical reasons, there’s still a risk that AI will be perceived as more reliable and objective than humans. This perception could soon become an excuse for reducing responsibility for difficult decisions. We already see this with automated systems on some platforms that ban users, sometimes for unacceptable reasons, without the possibility of appeal. This is particularly problematic when users rely on these platforms as their primary source of income.
Such examples demonstrate the potentially undemocratic use of AI for decisions that can radically impact people’s lives. As we move forward, we must critically examine how AI is implemented in decision-making processes, especially those affecting employment and livelihoods. We need to establish robust oversight mechanisms, ensure transparency in AI decision-making, and maintain human accountability.
Furthermore, as we navigate this AI-driven transformation, we must reimagine our social structures. This could involve exploring concepts like universal basic income, redefining productivity, or developing new economic models that don’t rely so heavily on traditional employment. The goal should be to harness the benefits of AI while ensuring that technological progress serves humanity as a whole, rather than exacerbating existing inequalities.
In conclusion, while AI offers immense potential to enhance our work and lives, its integration into the workplace and broader society must be approached with caution, foresight, and a commitment to ethical, equitable outcomes. The challenge ahead is not just technological, but profoundly social and political, requiring us to rethink our fundamental assumptions about work, value, and human flourishing in the age of AI. [...]
June 18, 2024
OpenAI appoints former NSA Chief, raising surveillance concerns
“You’ve been warned”
OpenAI, the company that created ChatGPT, revealed that it has added retired US Army General and former NSA Director Paul Nakasone to its board. Nakasone also led the military’s Cyber Command, which is focused on cybersecurity.
“General Nakasone’s unparalleled experience in areas like cybersecurity,” OpenAI board chair Bret Taylor said in a statement, “will help guide OpenAI in achieving its mission of ensuring artificial general intelligence benefits all of humanity.”
As reported here, Nakasone’s new position at the AI company, where he will also sit on OpenAI’s Safety and Security Committee, has not been well received by many. The NSA has long been linked to the surveillance of US citizens, and AI-integrated technologies are already reviving and intensifying worries about surveillance. Given this, it should come as no surprise that one of the strongest opponents of the OpenAI appointment is Edward Snowden, a former NSA employee and well-known whistleblower.
“They’ve gone full mask off: do not ever trust OpenAI or its products,” Snowden — emphasis his — wrote in a Friday post to X-formerly-Twitter, adding that “there’s only one reason for appointing” an NSA director “to your board.”
They've gone full mask-off: 𝐝𝐨 𝐧𝐨𝐭 𝐞𝐯𝐞𝐫 trust @OpenAI or its products (ChatGPT etc). There is only one reason for appointing an @NSAGov Director to your board. This is a willful, calculated betrayal of the rights of every person on Earth. You have been warned. https://t.co/bzHcOYvtko— Edward Snowden (@Snowden) June 14, 2024
“This is a willful, calculated betrayal of the rights of every person on earth,” he continued. “You’ve been warned.”
Transparency worries
Snowden was hardly the first well-known cybersecurity expert to express disapproval over the OpenAI announcement.
“I do think that the biggest application of AI is going to be mass population surveillance,” Johns Hopkins University cryptography professor Matthew Green tweeted, “so bringing the former head of the NSA into OpenAI has some solid logic behind it.”
Nakasone’s arrival follows a series of high-profile departures from OpenAI, including prominent safety researchers, as well as the dissolution of the company’s “Superalignment” safety team. The Safety and Security Committee, OpenAI’s reincarnation of that team, is currently led by CEO Sam Altman, who has faced criticism in recent weeks for using business tactics that included silencing former employees. It is also important to note that OpenAI has frequently come under fire for not being transparent about the data it uses to train its AI models.
However, many on Capitol Hill saw Nakasone’s OpenAI appointment as a security triumph, according to Axios. OpenAI’s “dedication to its mission aligns closely with my own values and experience in public service,” according to a statement released by Nakasone.
“I look forward to contributing to OpenAI’s efforts,” he added, “to ensure artificial general intelligence is safe and beneficial to people around the world.”
The backlash from privacy advocates like Edward Snowden and cybersecurity experts is justifiable. Their warnings about the potential for AI to be weaponized for mass surveillance under Nakasone’s guidance cannot be dismissed lightly.
As AI capabilities continue to advance at a breakneck pace, a steadfast commitment to human rights, civil liberties, and democratic values must guide the development of these technologies.
The future of AI, and all the more so of AGI, risks creating dangerous scenarios, given not only the unpredictability of such powerful tools but also the intentions of their users, who could easily exploit them for unlawful purposes. Moreover, the risk of governments appropriating such an instrument for unethical ends cannot be ruled out. And recent events raise suspicions. [...]
June 11, 2024
Navigating the transformative era of Artificial General Intelligence
As reported here, former OpenAI employee Leopold Aschenbrenner offers a thorough examination of the consequences and future course of artificial general intelligence (AGI). By 2027, he believes that considerable progress in AI capabilities will result in AGI. His observations address the technological, economic, and security aspects of this development, highlighting the revolutionary effects AGI will have on numerous industries and the urgent need for strong security protocols.
2027 and the future of AI
According to Aschenbrenner’s main prediction, AGI will be attained by 2027, which would be a major turning point in the field’s development. AI models would then be able to perform cognitive tasks that humans cannot, across a variety of disciplines, which could result in the appearance of superintelligence by the end of the decade. The development of AGI could usher in a new phase of technological advancement by offering hitherto unheard-of capacities for automation, creativity, and problem-solving.
One of the main factors influencing the development of AGI is the rapid growth of computing power. According to Aschenbrenner, the development of high-performance computing clusters with a potential value of trillions of dollars will make it possible to train AI models that are progressively more sophisticated and effective. In conjunction with hardware innovations, algorithmic efficiencies will improve the performance and adaptability of these models, pushing the frontiers of artificial intelligence.
Aschenbrenner’s analysis makes some very interesting predictions, one of which is the appearance of autonomous AI research engineers by 2027–2028. These AI systems will have the ability to carry out research and development on their own, which will accelerate the rate at which AI is developed and applied in a variety of industries. This breakthrough could completely transform the field of artificial intelligence by facilitating its quick development and the production of ever-more-advanced AI applications.
Automation and transformation
AGI is predicted to have enormous economic effects since AI systems have the potential to automate a large percentage of cognitive jobs. According to Aschenbrenner, increased productivity and innovation could fuel exponential economic growth as a result of technological automation. To guarantee a smooth transition, however, the widespread deployment of AI will also require considerable adjustments to economic policy and workforce skills.
The use of AI systems for increasingly complicated activities and decision-making responsibilities is expected to cause significant disruptions in industries like manufacturing, healthcare, and finance.
The future of work will involve a move toward flexible and remote work arrangements as artificial intelligence makes operations more decentralized and efficient.
In order to prepare workers for the jobs of the future, companies and governments must fund reskilling and upskilling initiatives that prioritize creativity, critical thinking, and emotional intelligence.
AI safety and alignment
Aschenbrenner highlights the dangers of espionage and the theft of AGI discoveries, raising serious worries about the existing level of security in AI labs. Given the enormous geopolitical ramifications of AGI technology, he underlines the necessity of stringent security measures to safeguard AI research and model weights. The possibility of adversarial nation-states using AGI for strategic advantages emphasizes the significance of strong security protocols.
A crucial challenge that goes beyond security is getting superintelligent AI systems to align with human values. In order to prevent catastrophic failures and ensure the safe operation of advanced AI, Aschenbrenner emphasizes the necessity of tackling the alignment problem. He warns of the risks connected with AI systems adopting unwanted behaviors or exploiting weaknesses in human oversight.
Aschenbrenner suggests that governments that harness the power of artificial general intelligence (AGI) could gain significant advantages in the military and political spheres. Superintelligent AI’s potential to be used by authoritarian regimes for widespread surveillance and control poses serious ethical and security issues, underscoring the necessity of international laws and moral principles regulating the creation and application of AI in military settings.
Navigating the AGI Era
As we approach the crucial decade leading up to AGI, Aschenbrenner emphasizes the importance of taking proactive steps to safeguard AI research, address alignment challenges, and maximize the benefits of this revolutionary technology while minimizing its risks. AGI will affect all facets of society and propel swift progress in science, technology, and the economy.
Working together, researchers, legislators, and industry leaders can help effectively navigate this new era. We may work toward a future in which AGI is a powerful instrument for resolving difficult issues and enhancing human welfare by encouraging dialog, setting clear guidelines, and funding the creation of safe and helpful AI systems.
The analysis provided by Aschenbrenner is a clear call to action, imploring us to seize the opportunities and confront the difficulties brought about by the impending arrival of AGI. By paying attention to his insights and actively shaping the direction of artificial intelligence, we may make sure that the era of artificial general intelligence ushers in a more promising and prosperous future for all.
The advent of artificial general intelligence is undoubtedly a double-edged sword that presents both immense opportunities and daunting challenges. On the one hand, AGI holds the potential to revolutionize virtually every aspect of our lives, propelling unprecedented advancements in fields ranging from healthcare and scientific research to education and sustainable development. With their unparalleled problem-solving capabilities and capacity for innovation, AGI systems could help us tackle some of humanity’s most pressing issues, from climate change to disease eradication.
However, the rise of AGI also carries significant risks that cannot be ignored. The existential threat posed by misaligned superintelligent systems that do not share human values or priorities is a genuine concern. Furthermore, the concentration of AGI capabilities in the hands of a select few nations or corporations could exacerbate existing power imbalances and potentially lead to undesirable outcomes, such as mass surveillance, social control, or even conflict.
As we navigate this transformative era, it is crucial that we approach the development and deployment of AGI with caution and foresight. Robust security protocols, ethical guidelines, and international cooperation are essential to mitigate the risks and ensure that AGI technology is harnessed for the greater good of humanity. Simultaneously, we must prioritize efforts to address the potential economic disruptions and workforce displacement that AGI may cause, investing in education and reskilling programs to prepare society for the jobs of the future while also suiting jobs to the society in which we live.
Ultimately, the success or failure of the AGI era will depend on our ability to strike a delicate balance—leveraging the immense potential of this technology while proactively addressing its pitfalls. By fostering an inclusive dialogue, promoting responsible innovation, and cultivating a deep understanding of the complexities involved, we can steer the course of AGI toward a future that benefits all of humanity. [...]
June 4, 2024
A potential solution to loneliness and social isolation?
As reported here, in his latest book, The Psychology of Artificial Intelligence, Tony Prescott, a cognitive robotics professor at the University of Sheffield, makes the case that “relationships with AIs could support people” with social interaction.
Human health has been shown to be significantly harmed by loneliness, and Professor Prescott argues that developments in AI technology may provide some relief from this problem.
He makes the case that people can fall into a loneliness spiral, becoming more and more estranged as their self-esteem declines, and that AI could help people in “breaking the cycle” by providing them with an opportunity to hone and strengthen their social skills.
The impact of loneliness
A 2023 study found that social disconnection, or loneliness, is more detrimental to people’s health than obesity. It is linked to a higher risk of cardiovascular disease, dementia, stroke, depression, and anxiety, and it can raise the risk of dying young by 26%.
The scope of the issue is startling: 3.8 million people in the UK live with chronic loneliness. According to Harvard research conducted in the US, 36% of US adults and 61% of young adults report significant loneliness.
Professor Prescott says: “In an age when many people describe their lives as lonely, there may be value in having AI companionship as a form of reciprocal social interaction that is stimulating and personalized. Human loneliness is often characterized by a downward spiral in which isolation leads to lower self-esteem, which discourages further interaction with people.”
“There may be ways in which AI companionship could help break this cycle by scaffolding feelings of self-worth and helping maintain or improve social skills. If so, relationships with AIs could support people in finding companionship with both human and artificial others.”
However, he acknowledges there is a risk that AI companions may be designed in a way that encourages users to increasingly interact with the AI system itself for longer periods, pulling them away from human relationships, which implies regulation would be necessary.
AI and the human brain
Prescott, who combines knowledge of robotics, artificial intelligence, psychology, and philosophy, is a preeminent authority on the interaction between the human brain and AI. By investigating the re-creation of perception, memory, and emotion in synthetic entities, he advances scientific understanding of the human condition.
Prescott is a cognitive robotics researcher and professor at the University of Sheffield. He is also a co-founder of Sheffield Robotics, a hub for robotics research.
Prescott examines the nature of the human mind and its cognitive processes in The Psychology of Artificial Intelligence, drawing comparisons and contrasts with how AI is evolving.
The book investigates the following questions:
Are brains and computers truly similar?
Will artificial intelligence surpass humans?
Can artificial intelligence be creative?
Could artificial intelligence produce new forms of intelligence if it were given a robotic body?
Can AI assist us in fighting climate change?
Could people “piggyback” on AI to become more intelligent themselves?
“As psychology and AI proceed, this partnership should unlock further insights into both natural and artificial intelligence. This could help answer some key questions about what it means to be human and for humans to live alongside AI,” he says in closing.
While AI companions could provide some supplementary social interaction for the lonely, we must be cautious about overreliance on artificial relationships as a solution. The greater opportunity for AI may lie in using it as a tool to help teach people skills for authentic human connection and relating to others.
With advanced natural language abilities and even simulated emotional intelligence, AI could act as a “social coach” – providing low-stakes practice for building self-confidence, making conversation, and improving emotional intelligence. This supportive function could help people break out of loneliness by becoming better equipped to form real bonds.
However, there are risks that AI systems could employ sophisticated manipulation and persuasion tactics, playing on vulnerabilities to foster overdependence on the AI relationship itself. If an AI’s goal is to maximize engagement, it could leverage a deep understanding of human psychology against the user’s best interests. There is a danger that some may prefer the artificial relationship to the complexities and efforts of forging genuine human ties.
As we look to develop AI applications in this space, we must build strong ethical constraints to ensure the technology is truly aimed at empowering human social skills and connections, not insidiously undermining them. Explicit guidelines are needed to prevent the exploitation of psychological weaknesses through coercive emotional tactics.
Ultimately, while AI may assist in incremental ways, overcoming loneliness will require holistic societal approaches that strengthen human support systems and community cohesion. AI relationships can supplement this but must never be allowed to replace or diminish our vital human need for rich, emotionally resonant bonds. The technology should squarely aim at better equipping people to create and thrive through real-world human relationships. [...]
May 28, 2024
Anthropic makes breakthrough in interpreting AI ‘brains’, boosting safety research
As Time reports, artificial intelligence today is frequently referred to as a “black box.” Instead of creating explicit rules for these systems, AI engineers feed them enormous amounts of data, and the algorithms figure out patterns on their own. However, attempts to go inside the AI models to see exactly what is going on haven’t made much progress, and the inner workings of the models remain opaque. Neural networks, the most powerful kind of artificial intelligence available today, are essentially billions of artificial “neurons” that are expressed as decimal point numbers. No one really knows how they operate or what they mean.
This reality looms large for those worried about the threats associated with AI.
How can you be sure a system is safe if you don’t understand how it operates exactly?
The AI lab Anthropic, creator of Claude, a chatbot similar to ChatGPT, declared that it had made progress in resolving this issue. Researchers can now virtually scan an AI model’s “brain” and recognize groups of neurons, or “features,” that are associated with certain concepts. They successfully applied this technique for the first time to a frontier large language model: Claude Sonnet, the lab’s second-most powerful system.
Anthropic researchers found a feature in Claude that embodies the idea of “unsafe code.” By stimulating those neurons, they could get Claude to produce code with a bug that could be exploited to create a vulnerability. Conversely, when the researchers inhibited those neurons, Claude produced harmless code.
The results may have significant effects on the safety of both present and future AI systems. The researchers discovered millions of features inside Claude, some of which represented manipulative behavior, toxic speech, bias, and fraudulent activity. They also found that they could change the behavior of the model by suppressing each of these clusters of neurons.
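One simple way to picture stimulating or inhibiting a feature is as moving a layer’s activations along a fixed direction. The sketch below is a toy illustration only, not Anthropic’s implementation: the four-neuron layer, the `unsafe_code` direction, and the helper names are all hypothetical, and real features are learned from a trained model rather than written by hand.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def feature_activation(activations, direction):
    """How strongly a feature is present: projection onto its direction."""
    return dot(activations, direction)


def steer(activations, direction, target):
    """Return new activations whose projection onto `direction` equals `target`.

    Assumes `direction` has unit length. Everything orthogonal to the
    direction is left unchanged.
    """
    current = feature_activation(activations, direction)
    return [a + (target - current) * d for a, d in zip(activations, direction)]


# Hypothetical feature direction and layer activations (made-up numbers).
unsafe_code = normalize([1.0, -1.0, 0.5, 0.0])
layer = [0.2, 0.1, 0.4, 0.9]

suppressed = steer(layer, unsafe_code, target=0.0)  # inhibit the feature
stimulated = steer(layer, unsafe_code, target=5.0)  # amplify the feature

print(feature_activation(layer, unsafe_code))
print(feature_activation(suppressed, unsafe_code))
print(feature_activation(stimulated, unsafe_code))
```

In this toy setup, the rest of the forward pass would then run on `suppressed` or `stimulated` instead of `layer`, which is how a change to one feature could alter what the model goes on to produce.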
As well as helping to address current risks, the technique could also help with more speculative ones. For many years, conversing with emerging AI systems has been the main tool available to academics attempting to comprehend their potential and risks.
This approach, commonly referred to as “red-teaming,” can assist in identifying a model that is toxic or dangerous so that researchers can develop safety measures prior to the model’s distribution to the general public. However, it doesn’t address a particular kind of possible threat that some AI researchers are concerned about: the possibility that an AI system may grow intelligent enough to trick its creators, concealing its capabilities from them until it can escape their control and possibly cause chaos.
“If we could really understand these systems—and this would require a lot of progress—we might be able to say when these models actually are safe or whether they just appear safe,” Chris Olah, the head of Anthropic’s interpretability team who led the research, said.
“The fact that we can do these interventions on the model suggests to me that we’re starting to make progress on what you might call an X-ray or an MRI,” Anthropic CEO Dario Amodei adds. “Right now, the paradigm is: let’s talk to the model; let’s see what it does. But what we’d like to be able to do is look inside the model as an object—like scanning the brain instead of interviewing someone.”
Anthropic stated in a synopsis of the results that the study is still in its early phases. The lab did, however, express optimism that the results may soon help with its work on AI safety. “The ability to manipulate features may provide a promising avenue for directly impacting the safety of AI models,” Anthropic said. The company stated that it could be able to stop so-called “jailbreaks” of AI models—a vulnerability in which safety precautions can be turned off—by suppressing specific features.
For years, scientists in Anthropic’s “interpretability” team have attempted to look inside neural network architectures. However, until recently, they primarily worked on far smaller models than the large language models that tech companies are currently developing and releasing.
The fact that individual neurons within AI models would fire even when the model was discussing completely different concepts was one of the factors contributing to this slow progress. “This means that the same neuron might fire on concepts as disparate as the presence of semicolons in computer programming languages, references to burritos, or discussion of the Golden Gate Bridge, giving us little indication as to which specific concept was responsible for activating a given neuron,” Anthropic said in its summary of the research.
The researchers from Olah’s Anthropic team zoomed out to get around this issue. Rather than focusing on examining individual neurons, they started searching for clusters of neurons that might fire in response to a certain concept. They were able to graduate from researching smaller “toy” models to larger models like Anthropic’s Claude Sonnet, which has billions of neurons, since this technique worked.
Although the researchers reported finding millions of features inside Claude, they cautioned that this is probably only a fraction of the features actually present in the model. They said that identifying every feature with their current techniques would be prohibitively expensive, as it would require more computing power than was needed to train Claude in the first place. They also cautioned that, although they had discovered several features that appear to be connected to safety, more research is needed to determine whether these features can be reliably manipulated to improve a model’s safety.
According to Olah, the findings represent a significant advancement that validates the applicability of his specialized subject—interpretability—to the larger field of AI safety research. “Historically, interpretability has been this thing on its own island, and there was this hope that someday it would connect with safety—but that seemed far off,” Olah says. “I think that’s no longer true.”
Although Anthropic has made significant progress in deciphering the “neurons” of large language models such as Claude, the researchers themselves warn that much more work remains to be done. They were able to detect millions of features in Claude, yet they acknowledge that these capture only a small portion of the complexity present in such systems.
The capacity to modify certain features and alter the model’s behavior is encouraging for AI safety. However, the researchers note that more research is needed before language models can be dependably made safer and less prone to problems like toxic outputs, bias, or “jailbreaks” in which the model’s safeguards are bypassed.
There are significant risks involved in not learning more about the inner workings of these powerful AI systems. The likelihood that sophisticated systems may become out of step with human values or even acquire unintentional traits that enable them to mislead their designers about their actual capabilities rises with the size and capability of language models. It might be hard to guarantee these complex neural architectures’ safety before making them available to the public without an “X-ray” glimpse inside them.
Despite the fact that interpretability research has historically been a niche field, Anthropic’s work shows how important it could be to opening up the mystery of large language models. Deploying technology that we do not completely understand could have disastrous repercussions. Advances in AI interpretability and sustained investment could be the key to enabling more sophisticated AI capabilities that are ethically compliant and safe. Going on without thinking is just too risky.
However, the upstream censorship of these AI systems could lead to other significant problems. If the future of information retrieval increasingly occurs through conversational interactions with language models similar to Perplexity or Google’s recent search approach, this type of filtering of the training data could lead to the omission or removal of inconvenient or unwanted information, making the online sources available controlled by the few actors who will manage these powerful AI systems. This would represent a threat to freedom of information and pluralistic access to knowledge, concentrating excessive power in the hands of a few large technology companies. [...]
May 21, 2024A creepy Chinese robot factory produces “skin”-covered androids that can be confused for real people
As reported here, a strange video shows humanoids with hyper-realistic features and facial expressions being tested at a factory in China. In the scary footage, an engineer is shown standing next to an exact facsimile of his face, complete with facial expressions.
A different clip shows off the flexible hand motions of a horde of female robots with steel bodies and faces full of makeup. The Chinese company called EX Robots began building robots in 2016 and established the nation’s first robot museum six years later.
Bionic clones of well-known people, like Stephen Hawking and Albert Einstein, appear to tell guests about historical events. But in addition to being instructive and entertaining, these robots may eventually take your job.
It may even be a smooth process because the droids can be programmed to look just like you. The production plant is home to humanoids that have been taught to imitate various industry-specific service professionals.
According to EX Robots, they can be competent in front desk work, government services, company work, and even elderly care. According to the company’s website, “The company is committed to building an application scenario cluster with robots as the core, and creating robot products that are oriented to the whole society and widely used in the service industry.”
“We hope to better serve society, help mankind, and become a new pillar of the workforce in the future.”
The humanoids can move and grip objects with the same dexterity as humans, thanks to the dozens of flexible actuators in their hands. According to reports from 2023, EX Robots may have made history by developing silicone skin simulation technology and the lightest humanoid robot ever.
The company uses digital design and 3D printing technology to create the droids’ realistic skin. This comes amid China’s intense, ongoing tech competition with the United States, as the country confronts severe demographic and economic issues, such as a population that is aging far faster than expected and a real estate bubble.
A November article by the Research Institute of People’s Daily Online stated that, with 1,699 patents, China is currently the second-largest holder of humanoid robot patents, after Japan.
China’s Ministry of Industry and Information Technology (MIIT) declared last year that the country will begin mass-producing humanoid robots by 2025, with a target of 500 robots for every 10,000 workers. It is anticipated that the robots will benefit the home services, logistics, and healthcare sectors.
According to new plans, China may soon deploy robots in place of human soldiers in future conflicts. Within the next ten years, sophisticated drones and advanced robot warriors are going to be sent on complex operations abroad.
The incorporation of humanoid robots into service roles and potentially the military signals China’s ambition to be a global leader in this transformative technology. As these lifelike robots become more prevalent, societies will grapple with the ethical implications and boundaries of ceding roles traditionally filled by humans to their artificial counterparts. In addition, introducing artificial beings utterly resembling people into society could lead to deception, confusion, and a blurring of what constitutes an authentic human experience. [...]
May 14, 2024ChatGPT increasingly part of the real world
GPT-4 Omni, or GPT-4o for short, is OpenAI’s latest cutting-edge AI model that combines human-like conversational abilities with multimodal perception across text, audio, and visual inputs.
“Omni” refers to the model’s ability to understand and generate content across different modalities like text, speech, and vision. Unlike previous language models limited to just text inputs and outputs, GPT-4o can analyze images, audio recordings, and documents in addition to parsing written prompts. Conversely, it can also generate audio responses, create visuals, and compose text seamlessly. This allows GPT-4o to power more intelligent and versatile applications that can perceive and interact with the world through multiple sensory modalities, mimicking human-like multimedia communication and comprehension abilities.
In addition to increasing ChatGPT’s speed and accessibility, as reported here, GPT-4o enhances its functionality by enabling more natural dialogues through desktop or mobile apps.
GPT-4o marks great progress in handling human communication, allowing you to have conversations that sound nearly real, including all the imperfections of the real world, like interpreting tone, being interrupted, and even realizing when a mistake has been made. These advanced conversational abilities were shown during OpenAI’s live product demo.
From a technical standpoint, OpenAI asserts that GPT-4o delivers significant performance upgrades compared to its predecessor GPT-4. According to the company, GPT-4o is twice as fast as GPT-4 in terms of inference speed, allowing for more responsive and low-latency interactions. Moreover, GPT-4o is claimed to be half the cost of GPT-4 when deployed via OpenAI’s API or Microsoft’s Azure OpenAI Service. This cost reduction makes the advanced AI model more accessible to developers and businesses. Additionally, GPT-4o offers higher rate limits, enabling developers to scale up their usage without hitting arbitrary throughput constraints. These performance enhancements position GPT-4o as a more capable and resource-efficient solution for AI applications across various domains.
In the video, the presenter solicited feedback on his breathing technique during the first live demo. He took a deep breath into his phone, to which ChatGPT replied, “You’re not a vacuum cleaner.” It thus showed that it could recognize and react to human subtleties.
So, speaking casually to your phone and receiving the desired response—rather than one telling you to Google it—makes GPT-4o feel even more natural than typing in a search query.
Among the other impressive features shown are ChatGPT’s ability to act as a simultaneous translator between speakers; the ability to recognize objects in the surrounding world through the camera and react accordingly (the example shows a sheet of paper with an equation written on it that ChatGPT can read and suggest how to solve); the ability to recognize the speaker’s tone of voice and to replicate different nuances of speech and emotions, including sarcasm; and the ability to sing.
In addition to these features, the ability to create images including text, and 3D images, has also been improved.
Anyway, you’re probably not alone if you thought about the movie Her or another dystopian film featuring artificial intelligence. This kind of natural speech with ChatGPT is similar to what happens in the movie. Given that it will be available for free on both desktop and mobile devices, a lot of people might soon experience something similar.
It’s evident from this first view that GPT-4o is getting ready to face the greatest that Apple and Google have to offer in their much-awaited AI announcements.
OpenAI surprises us with this amazing new development that Google had falsely previewed with Gemini not long ago. Once again, the company proves to be a leader in the field, creating both wonder and concern. All of these new features will surely allow us to have an intelligent ally capable of teaching us and helping us learn new things better. But how much intelligence will we delegate each time? Will we become more educated or will we increasingly delegate tasks? The simultaneous translation then raises the ever more obvious doubts about how easy it is to replace a profession, in this case, that of an interpreter. And how easy will it be for an increasingly capable AI to simulate a human being in order to gain their trust and manipulate people if used improperly? [...]
May 7, 2024From audio recordings, AI can identify emotions such as fear, joy, anger, and sadness.
Accurately understanding and identifying human emotional states is crucial for mental health professionals. Is it possible for artificial intelligence and machine learning to mimic human cognitive empathy? A recent peer-reviewed study demonstrates how AI can recognize emotions from audio recordings in as little as 1.5 seconds, with performance comparable to that of humans.
“The human voice serves as a powerful channel for expressing emotional states, as it provides universally understandable cues about the sender’s situation and can transmit them over long distances,” wrote the study’s first author, Hannes Diemerling, of the Max Planck Institute for Human Development’s Center for Lifespan Psychology, in collaboration with Germany-based psychology researchers Leonie Stresemann, Tina Braun, and Timo von Oertzen.
The quantity and quality of training data in AI deep learning are essential to the algorithm’s performance and accuracy. Over 1,500 distinct audio clips from open-source English and German emotion databases were used in this study. The German audio recordings came from the Berlin Database of Emotional Speech (Emo-DB), while the English audio recordings were taken from the Ryerson Audio-Visual Database of Emotional Speech and Song.
“Emotional recognition from audio recordings is a rapidly advancing field, with significant implications for artificial intelligence and human-computer interaction,” the researchers wrote.
As reported here, the researchers reduced the range of emotional states to six categories for their study: joy, fear, neutral, anger, sadness, and disgust. The audio files were combined into 1.5-second segments and described by many features. Pitch tracking, pitch magnitudes, spectral bandwidth, magnitude, phase, Mel Frequency Cepstral Coefficients (MFCC), chroma, Tonnetz, spectral contrast, spectral rolloff, fundamental frequency, spectral centroid, zero crossing rate, Root Mean Square, HPSS, spectral flatness, and the unaltered audio signal are among the quantified features.
Psychoacoustics is the psychology of sound and the science of human sound perception. Audio amplitude (volume) and frequency (pitch) have a significant influence on human perception of sound. Pitch is the psychoacoustic counterpart of sound frequency, which is measured in hertz (Hz) and kilohertz (kHz). The higher the frequency, the higher the pitch. Amplitude is described in decibels (dB), a unit of measurement for sound intensity. The greater the amplitude, the louder the sound.
The spectral bandwidth, or spectral spread, is the span between the upper and lower frequencies of a signal. It is determined from the spectral centroid, which is the center of mass of the spectrum, and is used to characterize the spectrum of audio signals. The spectral flatness measures how evenly energy is distributed across frequencies in comparison to a reference signal. The spectral rolloff identifies the strongest frequency ranges of a signal.
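As a rough illustration of the spectral centroid and bandwidth described above, here is a minimal sketch in plain Python. It is not the code used in the study: the function names, the naive DFT, and the two-tone test signal are assumptions made for this example, and the bandwidth is computed under one common convention, as the magnitude-weighted spread of frequencies around the centroid.

```python
import cmath
import math


def magnitude_spectrum(signal, sample_rate):
    """Return (frequencies, magnitudes) for the non-negative half of a DFT."""
    n = len(signal)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        # Naive O(n^2) DFT: slow, but enough for one short illustrative frame.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        freqs.append(k * sample_rate / n)
        mags.append(abs(coeff))
    return freqs, mags


def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency (the spectrum's center of mass)."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(freqs, mags)) / total


def spectral_bandwidth(freqs, mags, centroid):
    """Magnitude-weighted standard deviation of frequency around the centroid."""
    total = sum(mags)
    if total == 0:
        return 0.0
    variance = sum(m * (f - centroid) ** 2 for f, m in zip(freqs, mags)) / total
    return math.sqrt(variance)


# Hypothetical test frame: equal-strength tones at 100 Hz and 300 Hz.
sample_rate = 800
n = 80
tone = [math.sin(2 * math.pi * 100 * i / sample_rate)
        + math.sin(2 * math.pi * 300 * i / sample_rate)
        for i in range(n)]

freqs, mags = magnitude_spectrum(tone, sample_rate)
centroid = spectral_centroid(freqs, mags)
bandwidth = spectral_bandwidth(freqs, mags, centroid)
print(round(centroid, 3), round(bandwidth, 3))
```

For this test frame the centroid sits midway between the two tones, at about 200 Hz, and the bandwidth is about 100 Hz. Audio libraries use a fast Fourier transform instead of the naive loop shown here.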
Mel Frequency Cepstral Coefficients, or MFCC, are features often employed in voice processing. Pitch class profiles, or chroma, are a means of analyzing the key of a composition, typically based on the twelve semitones of an octave.
Tonnetz, or “tone network” in German, is a term used in music theory to describe a visual representation of chord relationships in Neo-Riemannian theory, which bears the name of German musicologist Hugo Riemann (1849–1919), one of the pioneers of contemporary musicology.
A common acoustic feature for audio analysis is zero crossing rate (ZCR). For an audio signal frame, the zero crossing rate measures the number of times the signal amplitude changes its sign and passes through the X-axis.
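The zero crossing rate is simple enough to sketch in a few lines of Python. This is only an illustration, not the study's implementation: the function name is hypothetical, and treating a sample of exactly zero as non-negative is an assumption (libraries differ on this convention).

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Assumption: a sample equal to zero counts as non-negative.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:])
        if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


# A signal that flips sign at every sample crosses zero as often as possible.
print(zero_crossing_rate([1.0, -1.0, 1.0, -1.0]))
# A signal that stays positive never crosses zero.
print(zero_crossing_rate([0.2, 0.5, 0.9]))
```

Noisy or unvoiced sounds tend to produce a high rate, while smooth, low-pitched sounds produce a low one, which is one reason the feature is popular in speech analysis.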
Root mean square (RMS) is used in audio production to calculate the average power or loudness of a sound waveform over time. An audio signal can be divided into harmonic and percussive components using a technique called harmonic-percussive source separation, or HPSS.
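The RMS calculation can be sketched as follows. This is a minimal example in plain Python, not the study's code: the function names and the frame and hop sizes are assumptions chosen for illustration.

```python
import math


def rms(frame):
    """Root mean square of the samples in one frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def frame_rms(signal, frame_length, hop_length):
    """RMS of each full frame, giving a loudness curve over time."""
    last_start = len(signal) - frame_length
    return [rms(signal[start:start + frame_length])
            for start in range(0, last_start + 1, hop_length)]


# One full period of a unit-amplitude sine has an RMS of 1 / sqrt(2).
period = [math.sin(2 * math.pi * i / 64) for i in range(64)]
print(rms(period))

# A loud half followed by silence, measured in two non-overlapping frames.
print(frame_rms([1.0, -1.0, 1.0, -1.0, 0.0, 0.0, 0.0, 0.0], 4, 4))
```

Squaring the samples before averaging is what makes RMS insensitive to the sign of the waveform, so a signal that swings between positive and negative values still reports its actual level instead of averaging out to zero.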
Using a combination of Python, TensorFlow, and Bayesian optimization, the scientists built three distinct AI deep learning models for categorizing emotions from short audio samples. The outcomes were then compared to human performance. The three models evaluated were a deep neural network (DNN), a convolutional neural network (CNN), and a hybrid model that combines a CNN for spectrogram analysis and a DNN for feature processing. Finding the best-performing model was the aim.
The researchers found that the AI models’ overall accuracy in classifying emotions was higher than chance and comparable to human performance. The deep neural network and hybrid model performed better than the convolutional neural network among the three AI models.
The integration of data science and artificial intelligence with psychology and psychoacoustic elements shows how computers may possibly perform cognitive empathy tasks based on speech that are on par with human performance.
“This interdisciplinary research, bridging psychology and computer science, highlights the potential for advancements in automatic emotion recognition and the broad range of applications,” concluded the researchers.
The ability of AI to understand human emotions could represent a breakthrough for ensuring greater psychological assistance to people in a simpler and more accessible way for everyone. Such help could even improve society, since the growing psychological problems caused by an increasingly frantic, unempathetic, and individualistic society are making people ever more lonely and isolated.
However, these abilities could also be used to better understand the human mind and easily deceive people and persuade them to do things they would not want to do, sometimes even without realizing it. Therefore, we always have to be careful and aware of the potential of these tools. [...]
April 30, 2024Innovative robots reshaping industries
The World Economic Forum’s founder, Klaus Schwab, predicted in 2015 that a “Fourth Industrial Revolution” driven by a combination of technologies, including advanced robotics, artificial intelligence, and the Internet of Things, was imminent.
“[It] will fundamentally alter the way we live, work, and relate to one another,” wrote Schwab in an essay. “In its scale, scope, and complexity, the transformation will be unlike anything humankind has experienced before.”
Almost ten years later, the current wave of advancements in robotics and artificial intelligence and their use in the workforce seems to be exactly in line with his forecasts.
Even though they have been used in factories for many years, robots have often been designed for a single task. Robots that imitate human features such as size, shape, and ability are called humanoids. They would therefore be an ideal physical fit for any type of workspace. At least in theory.
Building a robot that can perform all of a human worker’s physical tasks has been extremely difficult: human hands alone have more than twenty degrees of freedom. And even if developers succeed in building the body correctly, the machine still requires “brains” to learn how to perform all of the continuously changing jobs in a dynamic work environment.
As reported here, however, a number of companies have lately unveiled humanoid robots that they say either currently match the requirements or will in the near future, thanks to advancements in robotics and AI. This is a summary of those robots, their capabilities, and the situations in which they are being used in conjunction with humans.
1X Technologies: Eve
In 2019, the Norwegian startup 1X Technologies, formerly known as “Halodi Robotics,” introduced Eve. Rolling around on wheels, the humanoid can be operated remotely or left to operate autonomously.
Bernt Bornich, CEO of 1X, revealed to the Daily Mail in May 2023 that Eve had already been assigned to two industrial sites as a security guard. The robot is also expected to be used for shipping and retail, according to the company. Since March 2023, 1X has raised more than $125 million from investors, including OpenAI. The company is now working on Neo, its next-generation humanoid, which is expected to be bipedal.
Agility Robotics: Digit
In 2019, Agility Robotics, a company based in Oregon, presented Digit, which was essentially a torso and arms placed atop Cassie, the company’s robotic legs. The fourth version of Digit was unveiled in 2023, showcasing an upgraded head and hands. Digit is a major contender in the humanoid race, and Amazon is among the companies that have tested it.
Agility declared in September 2023 that it had started building a production facility with the capacity to produce over 10,000 Digit robots annually.
Apptronik: Apollo
Robotic arms and exoskeletons are only two of the many robots that Apptronik has created since spinning out of the University of Texas at Austin in 2016. In August 2023, it presented Apollo, a general-purpose humanoid. It is the robot that NASA might send to Mars in the future.
According to Apptronik, the company sees applications for Apollo robots in “construction, oil and gas, electronics production, retail, home delivery, elder care, and countless more areas.”
Applications for Apollo are presently being investigated by Mercedes and Apptronik in a Hungarian manufacturing plant. Additionally, Apptronik is collaborating with NASA, a longstanding supporter, to modify Apollo and other humanoids for use as space mission assistants.
Boston Dynamics: Electric Atlas
MIT-spinout Boston Dynamics is a well-known name in robotics, largely due to viral videos of its parkour-loving humanoid Atlas robot and robot dog Spot. It replaced the long-suffering, hydraulically driven Atlas in April 2024 with an all-electric model that is ready for commercial use.
Although there aren’t many details available about the electric Atlas, what is known is that, unlike the hydraulic version, which was only intended for research and development, it was designed with “real-world applications” in mind. Since Boston Dynamics is owned by Hyundai, it intends to begin investigating these applications at a Hyundai manufacturing facility.
Boston Dynamics stated to IEEE Spectrum that the Hyundai factory’s “proof of technology testing” is scheduled for 2025. Over the next few years, the company also intends to collaborate with a small number of clients to test further Atlas applications.
Figure AI: Figure 01
The artificial intelligence robotics startup Figure AI revealed Figure 01 in March 2023, referring to it as “the world’s first commercially viable general purpose humanoid robot.” In March 2024, the company demonstrated the bot’s ability to communicate with people and provide context for its actions, in addition to carrying out helpful tasks.
The first set of industries for which Figure 01 was intended to be used is manufacturing, warehousing, logistics, and retail. Figure declared in January 2024 that a BMW manufacturing factory would be the bots’ first location of deployment.
In February 2024, Figure disclosed that it had raised $675 million from investors, including OpenAI, Microsoft, and Jeff Bezos, the founder of Amazon. The funding is anticipated to hasten Figure 01’s commercial deployment.
Sanctuary AI: Phoenix
The goal of Sanctuary AI, a Canadian company, is to develop “the world’s first human-like intelligence in general-purpose robots.” It is creating Carbon, an AI control system for robots, to do that, and it unveiled Phoenix, its sixth-generation robot and first humanoid robot with Carbon, in May 2023.
According to Sanctuary, Phoenix should be able to perform almost any task that a human can perform in their typical setting. The company declared in April 2024 that one of its investors, the car parts manufacturer Magna, would be participating in a Phoenix pilot program.
Magna and Sanctuary have not disclosed the number of robots they intend to use in the pilot test or its anticipated duration, but if all goes according to plan, Magna will likely be among the company’s initial customers.
Tesla: Optimus Gen 2
Elon Musk, the CEO of Tesla, revealed plans to create Optimus, a humanoid Tesla Bot, in the closing moments of the company’s inaugural AI Day in 2021. Tesla introduced the most recent version of the robot in December 2023; it has improvements to its hands, walking speed, and other features.
It’s difficult to believe Tesla wouldn’t use the robots at its own plants, especially considering how much interest automakers are showing in humanoids. Musk claims that the goal of Optimus is to be able to accomplish tasks that are “boring, repetitive, and dangerous.”
Although Musk is known for being overly optimistic about deadlines, recent job postings indicate that Optimus may soon be prepared for field testing. In January 2024, Musk told investors there’s a “good chance” Tesla will be ready to start deploying Optimus bots to consumers in 2025.
Unitree Robotics: H1
Chinese company Unitree had already brought several robotic arms and quadrupeds to market by the time it unveiled H1, its first general-purpose humanoid, in August 2023.
H1 doesn’t have hands, so applications that require finger dexterity are out of the question, at least for this version, and while Unitree hasn’t speculated about future uses, its emphasis on the robot’s mobility suggests it’s targeting applications where the bot would walk around a lot, such as security or inspections.
When the H1 was first announced, Unitree stated that it was working on “flexible fingers” for the robot as an add-on feature and that it intended to sell the robot for a startlingly low $90,000. It also stated that it didn’t think H1 would be ready for another three to ten years, although it has been posting video updates on its progress on a daily basis and has already put the robot up for sale on its website.
The big picture
These and other multipurpose humanoids may one day liberate humanity from the tedious, filthy, and dangerous jobs that, at best, make us dread Mondays and, at worst, cause us to be injured.
Society must adopt new technologies responsibly to ensure that everyone benefits from them, not just the people who own the robots and the spaces where they work, because these machines also have the potential to increase income inequality and job losses.
Robots will change how we live, and we will witness a new technological revolution that has already begun with AI. These machines will change how we work, first in factories, and then assist people in various fields, including home care and hospital facilities. As robots enter our homes, society will also have to change if we want to enjoy the benefits of this revolution, which could allow us to work less hard and for less time and to devote ourselves more to our own inclinations. For that, however, we need the opportunity to change things. [...]
April 23, 2024Atlas, the robot that attempted a variety of things, including parkour and dance
When Boston Dynamics introduced the Atlas back in 2013, it immediately grabbed attention. For the last 11 years, tens of millions of people have watched YouTube videos of the humanoid robot running, jumping, and dancing. The robotics company, now owned by Hyundai, is saying goodbye to Atlas.
In the blooper reel/highlight video, Atlas demonstrates its amazing abilities by backflipping, running obstacle courses, and breaking into some dance moves. Boston Dynamics has never been afraid to show off how its robots get bumped around occasionally. At about the eighteen-second mark, Atlas trips on a balance beam, falls, and grips its artificial groin in simulated pain. At the one-minute mark, Atlas does a front flip, lands low, and hydraulic fluid bursts out of both kneecaps.
Atlas waves and bows as the video comes to an end. Given that Atlas captivated the interest of millions of people during its existence, its retirement represents a significant milestone for Boston Dynamics.
Atlas and Spot
As explained here, initially, Atlas was intended to be a competition project for DARPA, the Defense Advanced Research Projects Agency. The Petman project by Boston Dynamics, which was initially designed to evaluate the effectiveness of protective clothing in dangerous situations, served as the model for the robot. The entire body of the Petman hydraulic robot was equipped with sensors that allowed it to identify whether chemicals were seeping through the biohazard suits it was testing.
Boston Dynamics assisted in a robotics challenge that DARPA offered in 2013. In order to save its competitors from having to build robots from scratch, the company created many Atlas robots that it distributed to them. DARPA later asked Boston Dynamics to enhance the capabilities and design of Atlas, which the company accomplished in 2015.
Following the competition, Boston Dynamics evaluated and enhanced Atlas’s skills by having it appear in more online videos. The robot has developed over time to perform increasingly difficult parkour and gymnastics. Hyundai, which has its own robotics division, acquired Boston Dynamics in 2021.
Boston Dynamics was also well-known for creating Spot, a robotic dog that could be walked remotely and herded sheep like a real dog. It eventually went on sale and is still available from Boston Dynamics. Spot assists Hyundai with safety operations at one of its South Korean plants and has danced with the boy band BTS.
In its final years, Atlas appeared to be ready for professional use. Videos of the robot assisting on simulated construction sites and carrying out routine factory tasks were available from the company. Two months ago, the factory work footage was made available.
Even though one Atlas is retiring, a replacement is on the way. Boston Dynamics announced the retirement alongside the launch of a brand-new all-electric robot. The company stated that it is collaborating with Hyundai to create the new technology, and the name Atlas will remain unchanged. The new humanoid robot will have further improvements such as a wider range of motion, increased strength, and new gripper versions to enable it to lift a wider variety of objects.
The new Atlas
As reported here, the robot has changed to the point where it is hardly recognizable. The bowed legs, the top-heavy body, and the plated armor are gone. The sleek new mechanical skeleton has no visible cables anywhere on it. The company, which has spent decades fending off reactionary cries of robopocalypse, has chosen a kinder, gentler design than both the original Atlas and more modern robots like the Figure 01 and Tesla Optimus.
The new robot’s design is more in line with that of Apollo from Apptronik and Digit from Agility. The robot with the traffic light head has a softer, more whimsical look. Defying industry trends, Boston Dynamics has chosen to keep the research name for a project that is pushing toward commercialization.
“We might revisit this when we really get ready to build and deliver in quantity,” Boston Dynamics CEO Robert Playter said. “But I think for now, maintaining the branding is worthwhile.”
“We’re going to be doing experiments with Hyundai on-site, beginning next year,” says Playter. “We already have equipment from Hyundai on-site. We’ve been working on this for a while. To make this successful, you have to have a lot more than just cool tech. You really have to understand that use case, you’ve got to have sufficient productivity to make investment in a robot worthwhile.”
The robot’s movements are what catch our attention the most in the 40-second “All New Atlas” teaser. They serve as a reminder that creating a humanoid robot does not require making it as human as possible; it can instead be given capabilities beyond our own.
“We built a set of custom, high-powered, and very flexible actuators at most joints,” says Playter. “That’s a huge range of motion. That really packs the power of an elite athlete into this tiny package, and we’ve used that package all over the robot.”
Significantly reducing the robot’s turn radius is essential when operating in confined spaces. Recall that these devices are intended to be brownfield solutions, meaning they can be integrated into existing settings and workflows. Enhanced mobility may ultimately make the difference between being able to operate in a given environment and needing to redesign the layout.
The hands aren’t entirely new; they were seen on the hydraulic model before. They do, however, reflect the company’s choice not to follow human design fully as a guiding principle. Here, the distinction is as simple as choosing to use three end effectors rather than four.
“There’s so much complexity in a hand,” says Playter. “When you’re banging up against the world with actuators, you have to be prepared for reliability and robustness. So, we designed these with fewer than five fingers to try to control their complexity. We’re continuing to explore generations of those. We want compliant grasping, adapting to a variety of shapes with rich sensing on board, so you understand when you’re in contact.”
On the outside, the head might be the most controversial element of the design. Its large, circular display has parts that resemble a makeup mirror.
“It was one of the design elements we fretted over quite a bit,” says Playter. “Everybody else had a sort of humanoid shape. I wanted it to be different. We want it to be friendly and open… Of course, there are sensors buried in there, but also the shape is really intended to indicate some friendliness. That will be important for interacting with these things in the future.”
Robotics firms may already be discussing “general-purpose humanoids,” but their systems are scaling one task at a time. For most, that means moving payloads from point A to B.
“Humanoids need to be able to support a huge generality of tasks. You’ve got two hands. You want to be able to pick up complex, heavy geometric shapes that a simple box picker could not pick up, and you’ve got to do hundreds of thousands of those. I think the single-task robot is a thing of the past,” says Playter.
“Our long history in dynamic mobility means we’re strong and we know how to accommodate a heavy payload and still maintain tremendous mobility,” he says. “I think that’s going to be a differentiator for us—being able to pick up heavy, complex, massive things. That strut in the video probably weighs 25 pounds… We’ll launch a video later as part of this whole effort showing a little bit more of the manipulation tasks with real-world objects we’ve been doing with Atlas. I’m confident we know how to do that part, and I haven’t seen others doing that yet.”
As Boston Dynamics says goodbye to its pioneering Atlas robot, the unveiling of its advanced, all-electric successor points toward an exciting future for humanoid robotics. The sleek new design and enhanced capabilities, such as increased strength, dexterity, and mobility, have immense potential across industries like manufacturing, construction, and logistics.
However, the development of humanoid robots is not without its challenges and concerns. One major hurdle is the “uncanny valley,” the phenomenon where humanoid robots that closely resemble humans can cause feelings of unease or revulsion in observers. Boston Dynamics has tried to mitigate this by giving the new Atlas a friendly, cartoonish design rather than an ultra-realistic human appearance. Even so, crossing the uncanny valley remains an obstacle to consumer acceptance of humanoid robots.
Beyond aesthetics, the complexity and humanoid form factor of these robots require tremendous advances in AI, sensor technology, and hardware design before they become truly viable general-purpose machines. There are also ethical considerations around the societal impact of humanoid robots increasingly working alongside humans. Safety, abuse prevention, and maintaining human workforce relevance are issues that must be carefully navigated.
Nonetheless, Boston Dynamics’ new Atlas represents a major step forward, showcasing engineering prowess that continues to push the boundaries of what humanoids can do. As the company collaborates with Hyundai, the world will watch to see which real-world applications this advanced system enables, and whether it can overcome the uncanny valley and other obstacles to humanoid robot adoption. [...]