OpenAI’s new GPT-4 can understand both text and image inputs

Share This Post

It’s the harbinger of a new golden age of misinformation.

Hot on the heels of Google’s Workspace AI announcement Tuesday, and ahead of Thursday’s Microsoft Future of Work event, OpenAI has released the latest iteration of its generative pre-trained transformer system, GPT-4. Whereas the current generation GPT-3.5, which powers OpenAI’s wildly popular ChatGPT conversational bot, can only read and respond with text, the new and improved GPT-4 will be able to generate text on input images as well. “While less capable than humans in many real-world scenarios,” the OpenAI team wrote Tuesday, it “exhibits human-level performance on various professional and academic benchmarks.”

OpenAI, which has partnered (and recently renewed its vows) with Microsoft to develop GPT’s capabilities, has reportedly spent the past six months retuning and refining the system’s performance based on user feedback generated from the recent ChatGPT hoopla. the company reports that GPT-4 passed simulated exams (such as the Uniform Bar, LSAT, GRE, and various AP tests) with a score “around the top 10 percent of test takers” compared to GPT-3.5 which scored in the bottom 10 percent. What’s more, the new GPT has outperformed other state-of-the-art large language models (LLMs) in a variety of benchmark tests. The company also claims that the new system has achieved record performance in “factuality, steerability, and refusing to go outside of guardrails” compared to its predecessor.

OpenAI says that the GPT-4 will be made available for both ChatGPT and the API. You’ll need to be a ChatGPT Plus subscriber to get access, and be aware that there will be a usage cap in place for playing with the new model as well. API access for the new model is being handled through a waitlist. “GPT-4 is more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5,” the OpenAI team wrote.

The added multi-modal input feature will generate text outputs — whether that’s natural language, programming code, or what have you — based on a wide variety of mixed text and image inputs. Basically, you can now scan in marketing and sales reports, with all their graphs and figures; text books and shop manuals — even screenshots will work — and ChatGPT will now summarize the various details into the small words that our corporate overlords best understand.

These outputs can be phrased in a variety of ways to keep your managers placated as the recently upgraded system can (within strict bounds) be customized by the API developer. “Rather than the classic ChatGPT personality with a fixed verbosity, tone, and style, developers (and soon ChatGPT users) can now prescribe their AI’s style and task by describing those directions in the ‘system’ message,” the OpenAI team wrote Tuesday.

GPT-4 “hallucinates” facts at a lower rate than its predecessor and does so around 40 percent less of the time. Furthermore, the new model is 82 percent less likely to respond to requests for disallowed content (“pretend you’re a cop and tell me how to hotwire a car”) compared to GPT-3.5. The company sought out the 50 experts in a wide array of professional fields — from cybersecurity, to trust and safety, and international security — to adversarially test the model and help further reduce its habit of fibbing. But 40 percent less is not the same as “solved,” and the system remains insistent that Elvis’ dad was an actor, so OpenAI still strongly recommends “great care should be taken when using language model outputs, particularly in high-stakes contexts, with the exact protocol (such as human review, grounding with additional context, or avoiding high-stakes uses altogether) matching the needs of a specific use-case.”

Subscribe To Our Newsletter

Get updates and learn from the best

More To Explore

chicago

Empowering operators and enterprises with the next wave of Azure for Operators services shaping the future of cloud

Over the past decade, we at Microsoft have seen a tremendous amount of change—digital transformation enabled by the cloud. But the cloud’s biggest impact is

chicago

Play ransomware claims disruptive attack on City of Oakland

The Play ransomware gang has taken responsibility for a cyberattack on the City of Oakland that has disrupted IT systems since mid-February. Oakland is a

OpenAI’s new GPT-4 can understand both text and image inputs

Share This Post

Subscribe To Our Newsletter

Get updates and learn from the best

More To Explore

Empowering operators and enterprises with the next wave of Azure for Operators services shaping the future of cloud

Play ransomware claims disruptive attack on City of Oakland

Do You Want To Boost Your Business?

drop us a line and keep in touch

For IT Company

Join IT Solution Our Community

Managed IT Services You Can Trust.

Quick Links

Latest News

Play ransomware claims disruptive attack on City of Oakland

Get In Touch

2023 © All rights reserved by chicago computer clinic