What is next for AI in 2024?
We took a risk at this time last year. We tried our hand at forecasting the future in a field where nothing stays the same.
How did we do? Our four main predictions for 2023 were that multimodal chatbots would be the next big thing (check: OpenAI's GPT-4 and Google DeepMind's Gemini, the most powerful large language models available, can handle text, images, and audio); that lawmakers would draft strict new regulations (check: Biden's executive order was released in October, and the European Union's AI Act was finally agreed upon in December); that Big Tech would face pressure from open-source startups (half right: the open-source boom continues, but AI firms like OpenAI and Google DeepMind still stole the show); and that AI would transform big pharma for good (too soon to say: the AI revolution in drug discovery is only getting started, and the first drugs developed using AI will not reach the market for years).
This year, we have picked a few more specific trends. Here is what to watch for in 2024. (Come back next year to see how we did.)
- Personalized chatbots
You get a chatbot! And you get a chatbot! In 2024, tech companies that have invested heavily in generative AI will be under pressure to show that their products can make money. To do so, AI giants Google and OpenAI are betting big on going small: both are building user-friendly platforms that let people customize powerful language models and create mini chatbots tailored to their own needs, with no coding skills required. Both have released web-based tools that let anyone build generative AI applications.
In 2024, generative AI might actually become useful to the average non-technical person, and more people will be tinkering with a million little AI models. Cutting-edge models such as GPT-4 and Gemini are multimodal, which means they can process images and even videos in addition to text. This new capability could unlock a whole range of new apps. A real estate agent could, for instance, upload text from past listings, fine-tune a powerful model to generate similar text with a single click, upload photos and videos of new listings, and ask the customized AI to write a description of the property.
Naturally, though, the success of this strategy depends on how reliably these models perform. Language models frequently make things up, and generative models are riddled with biases. They are also easy to hack, particularly if they are allowed to browse the internet. Tech companies have not solved any of these problems. Once the novelty wears off, they will have to offer their customers ways to deal with them.
—Melissa Heikkilä
- Video will be the second wave of generative AI.
It's incredible how quickly the extraordinary becomes commonplace. The first generative models to produce photorealistic images burst onto the scene in 2022 and quickly became standard. Tools such as Adobe's Firefly, Stability AI's Stable Diffusion, and OpenAI's DALL-E flooded the internet with astounding images of everything from prize-winning artwork to the pope wearing Balenciaga. Not all of it is amusing, though: for every pug waving pompoms, there is another piece of plagiarized fantasy art or sexist sexual stereotyping.
Text-to-video is the new frontier. Expect it to magnify everything that was good, bad, or ugly about text-to-image.
A year ago, we got our first taste of what generative models could do when they were trained to stitch several still images together into brief clips. The results were choppy and warped. But the technology has advanced quickly.
Runway, a startup that builds generative video models (and the company that co-created Stable Diffusion), is releasing updated versions of its tools every few months. Its most recent model, Gen-2, still produces only short clips, but the quality is remarkable. The best ones are not far off what Pixar might release.
Every year, Runway hosts an AI film festival featuring experimental films made with a range of AI tools. This year's festival has a $60,000 prize fund, and the top ten films will be screened in New York and Los Angeles.
Not surprisingly, major studios are paying attention. Film industry heavyweights like Disney and Paramount are already exploring the use of generative AI across their entire production process. The technology is being used to lip-sync actors' performances to multiple foreign-language overdubs. It is also reinventing what is possible with special effects. In 2023, Indiana Jones and the Dial of Destiny featured a de-aged deepfake of Harrison Ford. This is only the beginning.
Off the big screen, deepfake technology is also taking off for marketing and training purposes. Synthesia, a UK-based company, for instance, makes tools that can turn an actor's one-off performance into an endless stream of deepfake avatars that recite whatever script you give them. According to the company, 44% of Fortune 100 companies already use its technology.
The ability to do so much with so little raises serious questions for actors. Worries about studios' use and misuse of AI were at the heart of last year's SAG-AFTRA strikes. But the full effect of the technology is only beginning to become clear. According to Souki Mehdaoui, an independent filmmaker and cofounder of Bell & Whistle, a firm that specializes in creative technology, "the craft of filmmaking is fundamentally changing."
—Will Douglas Heaven
- Election misinformation produced by AI will proliferate.
If previous elections are any indication, deepfakes and AI-generated election misinformation will be a major problem in 2024, when a record number of people are set to cast ballots. Politicians are already weaponizing these tools. In Argentina, two presidential candidates attacked their rivals with AI-generated images and videos. During Slovakia's elections, deepfakes of a liberal pro-European party leader threatening to raise beer prices and making jokes about child pornography went viral. And in the US, Donald Trump has cheered on a group that uses AI to create memes with racist and sexist tropes.
Producing a deepfake would have required sophisticated technical skills just a few years ago, but generative AI has made the process remarkably simple and accessible, and the results are becoming more and more lifelike. Even reputable sources can be fooled by AI-generated content. For example, stock image marketplaces such as Adobe's have been flooded with user-submitted AI-generated images that purport to depict the Israel-Gaza crisis.
The coming year will be crucial for those fighting the spread of such content. Techniques for tracking and mitigating it are still in the early stages of development. Watermarks, like SynthID from Google DeepMind, are still largely voluntary and not foolproof. And social media platforms are notoriously slow to take down misinformation. Prepare for a massive real-time experiment in debunking AI-generated fake news.
—Melissa Heikkilä
- Multitasking robots
Inspired by some of the fundamental methods driving the current boom in generative AI, roboticists are beginning to build more versatile robots capable of performing a wider variety of tasks.
In recent years, AI has shifted from using many small models, each trained to perform a particular task (recognizing images, drawing them, captioning them), toward single, monolithic models trained to do all of these things and more. By showing OpenAI's GPT-3 a few additional examples, a process called fine-tuning, researchers can teach it to solve coding problems, write movie scripts, pass high school biology tests, and perform other tasks. Multimodal models, such as GPT-4 and Google DeepMind's Gemini, can handle both language and visual tasks.
The same approach can work for robots. Instead of training one robot to flip pancakes and another to open doors, a single, universal model could give robots the ability to multitask. Several examples of work in this area emerged in 2023.
In June, DeepMind released Robocat, an update to the previous year's Gato. Robocat generates its own data through trial and error to learn how to control many different robot arms, rather than one specific arm, as is more common.
In October, working with 33 academic labs, the company released yet another general-purpose robot model, dubbed RT-X, along with a large new general-purpose training data set. Other leading research teams, including RAIL (Robotic Artificial Intelligence and Learning) at the University of California, Berkeley, are investigating similar technology.
The problem is a lack of data. Generative AI draws on text and image data sets the size of the internet. By comparison, robots have very few good sources of data from which to learn the many industrial or household jobs that people want them to do.
A team at New York University led by Lerrel Pinto is tackling that problem. He and his colleagues are developing methods that let robots learn by trial and error, generating their own training data along the way. In an even more low-key project, Pinto has enlisted volunteers to gather video data from around their homes using an iPhone camera attached to a trash picker. Major corporations have also begun releasing large data sets for robot training in recent years; one example is Meta's Ego4D.
Driverless cars are already demonstrating the potential of this approach. Startups like Wayve, Waabi, and Ghost are leading a new wave of self-driving AI that uses a single large model to control a vehicle instead of several smaller models for separate driving tasks. This has let smaller companies catch up with industry giants like Waymo and Cruise. Wayve is currently testing its autonomous vehicles on London's congested, narrow streets. Robots everywhere are about to get a similar boost.
—Will Douglas Heaven