GPT3.5-Turbo Versus GPT4
We tested GPT3.5-turbo against GPT4 in our website chatbot, and here are the results. TL;DR: GPT4 is better, but not worth it.
There are thousands of articles about the differences between ChatGPT version 3.5 and version 4. However, we have a technology that allows you to embed ChatGPT on your website, which makes our requirements fairly unique. Hence, none of these existing comparisons really provide any value for us. We therefore decided to compare GPT3.5-turbo against GPT4 in our own chatbot to see the differences.
WARNING! GPT4 is 30 times more expensive than GPT3.5-turbo, so quality and speed are not the only factors here. Unless you've got very deep pockets, you will obviously have to consider the price difference too.
TL;DR: Don't bother with GPT4 unless you, for some reason, need extremely accurate results, have very deep pockets, and don't care about having the chatbot spend roughly 65% to 200% more time before returning an answer, which is what we measured in the tests below.
How I Performed the Test
First, I created two machine-learning models using our technology. Then I scraped our own website and applied the exact same configuration to both models. The only thing I changed between the two chatbots was the model each one uses when it invokes OpenAI's API. Hence, both models have the exact same training data and the exact same configuration. The only difference is that one of our models is using GPT3.5-turbo, while the other is using GPT4.
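The setup above boils down to holding everything constant except the model name and timing each answer. Below is a minimal Python sketch of such a harness. The names `time_answer`, `compare_models`, and `fake_ask` are hypothetical and not part of our product or of OpenAI's client library, and `fake_ask` is only a stand-in that a real run would replace with the actual chatbot call.

```python
import time
from typing import Callable, Dict, List

# Signature of whatever function sends a question to a chatbot
# configured with a given model (an assumption made for this sketch).
AskFn = Callable[[str, str], str]


def time_answer(ask: AskFn, model: str, question: str) -> Dict[str, object]:
    """Ask one question against one model and record the wall-clock time."""
    started = time.perf_counter()
    answer = ask(model, question)
    elapsed = time.perf_counter() - started
    return {"model": model, "question": question,
            "answer": answer, "seconds": elapsed}


def compare_models(ask: AskFn, models: List[str],
                   questions: List[str]) -> List[Dict[str, object]]:
    """Run every question against every model; only the model name varies."""
    return [time_answer(ask, model, question)
            for question in questions
            for model in models]


def fake_ask(model: str, question: str) -> str:
    """Hypothetical stand-in for the real chatbot call."""
    return f"[{model}] answer to: {question}"


results = compare_models(fake_ask,
                         ["gpt-3.5-turbo", "gpt-4"],
                         ["What is AINIRO?"])
for row in results:
    print(f"{row['model']}: {row['seconds']:.2f}s")
```

Passing the call in as a function keeps the training data and configuration outside the harness, so the model name is the only thing that differs between two runs of the same question.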
I started out my tests with the question, "What is AINIRO?"
AINIRO is a company that provides chatbot technology for digital marketing agencies. They use AI Expert Systems to create custom chatbots for websites that can improve customer support, increase customer satisfaction, and boost sales. The company is owned by Arta-Marketing AS and is focused on product quality and great service. They use industry-standard safe mechanisms to protect personal data from unauthorized access, alteration, disclosure, or destruction.
GPT3.5-turbo spent 8.77 seconds providing me with an answer.
AINIRO is a technology company that specializes in creating AI-based chatbot solutions for businesses. Their ChatGPT-based website chatbot product is designed to improve customer support, increase customer satisfaction, and potentially boost sales. By integrating AINIRO's chatbot technology into clients' websites, digital marketing agencies can offer a cutting-edge solution that enhances user experience and helps businesses grow. AINIRO's chatbots use natural language processing, allowing them to understand and respond to customer queries in any language.
GPT4 spent 26.53 seconds providing me with an answer. GPT3.5-turbo was three times faster, but I would say that GPT4 provided a better and more accurate response. The first sentence from GPT3.5-turbo claims we deliver chatbots to digital marketing agencies. This is accurate, but we also deliver chatbots to any company interested in our technology. Opening with "provides chatbot technology for digital marketing agencies" significantly reduces the quality of the response, since it implies digital marketing agencies are the only companies we provide chatbots to, which of course is not true.
By de-emphasizing digital marketing agencies and moving them further down in the paragraph, GPT4 gives what I would claim is a much higher-quality response than GPT3.5-turbo did. This restriction in verticals originates from our website, where we have an article about the benefits for digital marketing agencies of incorporating ChatGPT into their offering. This article should probably have been excluded from our training data to increase our chatbot's quality, or the training snippet should have been manually edited to increase accuracy. As in other comparisons between GPT3.5-turbo and GPT4, the conclusion here is that GPT4 is "smarter" and more capable of "reasoning."
My next question was, "How can I get ChatGPT on my website?"
To get ChatGPT on your website, you can follow these steps:
- Request a Free Demo from AINIRO.IO, the company that creates middleware between your website and ChatGPT.
- AINIRO.IO will scrape your website and send you a demo of how ChatGPT can be integrated into your website.
- If you are satisfied with the demo, AINIRO.IO will send you an invoice, and you can embed the ChatGPT chatbot on your website in just 5 minutes.
- Alternatively, you can use AINIRO.IO's DIY chatbot version to create your own ChatGPT-based website chatbot.
By integrating ChatGPT into your website, you can increase sales and conversions by 67% and reduce support costs by 30%. ChatGPT can act as a sales executive, customer support agent, or even a CEO and can answer questions about your products or services according to your data.
GPT3.5-turbo spent 20.39 seconds answering the above.
To get ChatGPT on your website, you can follow these steps:
- Visit AINIRO.IO, a service that provides ChatGPT integration for websites.
- Request a Free Demo by filling out the form on their website.
- AINIRO.IO will scrape your website and send you a demo of the ChatGPT chatbot tailored to your content.
- If you're happy with the demo, they will send you an invoice.
- Once the invoice is paid, AINIRO.IO will provide you with the necessary code or instructions to embed the ChatGPT chatbot on your website.
Alternatively, you can explore their DIY chatbot version if you prefer to set up the chatbot yourself. Keep in mind that integrating ChatGPT on your website might require some technical knowledge and understanding of APIs.
GPT4 spent 33.59 seconds answering.
I would say that GPT4 is providing me with a more accurate answer. GPT3.5-turbo adds arguments for the benefits, but that is exactly the problem, since I never asked for the benefits. I asked about the "how." Technically, you don't need to understand anything about APIs to embed ChatGPT on your website, but I'd still say GPT4 provided a "better" answer.
My next question was, "What can you tell me about Tage Leander?"
Tage Leander Hansen is the CEO and partners manager of AINIRO and Arta-Marketing. He is responsible for ensuring high-quality service and customer satisfaction. Tage is also a writer who focuses on topics related to machine learning and AI. He is an active LinkedIn user, where he shares his thoughts and insights on these topics.
GPT3.5-turbo spent 8.74 seconds answering.
Tage Leander Hansen is the CEO and partners manager of AINIRO and Arta-Marketing. He is responsible for ensuring high-quality service and customer satisfaction with the company's products. Tage is often the main point of contact for clients and is likely to be the person you interact with during meetings. He is knowledgeable about machine learning and AI and writes about how organizations can adopt these technologies. You can follow Tage on LinkedIn to read more of his writings and learn about his professional background.
GPT4 spent 24.25 seconds answering.
Yet again, GPT3.5-turbo is roughly 3 times faster, while GPT4 is more accurate. Labeling Tage as a "writer" isn't technically correct. Sure, he writes articles here, but GPT4's description of him as someone who writes about these topics is more accurate.
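The three pairs of timings reported above can be compared directly. The short Python sketch below uses only the measured numbers from this article. The helper name `slowdown` is just an illustration.

```python
# (question, GPT3.5-turbo seconds, GPT4 seconds), as measured above
timings = [
    ("What is AINIRO?", 8.77, 26.53),
    ("How can I get ChatGPT on my website?", 20.39, 33.59),
    ("What can you tell me about Tage Leander?", 8.74, 24.25),
]


def slowdown(gpt35_seconds: float, gpt4_seconds: float) -> float:
    """How many times as long GPT4 took compared to GPT3.5-turbo."""
    return gpt4_seconds / gpt35_seconds


ratios = [slowdown(gpt35, gpt4) for _, gpt35, gpt4 in timings]

for (question, _, _), ratio in zip(timings, ratios):
    extra = (ratio - 1) * 100  # additional waiting time in percent
    print(f"{question}: GPT4 took {ratio:.2f}x as long ({extra:.0f}% more time)")
```

Across these three questions, GPT4 took between roughly 1.6 and 3 times as long as GPT3.5-turbo.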
In general, GPT4 seems to provide higher-quality answers, at the cost of sometimes taking up to ~3x as long. GPT4 is, however, significantly more expensive. Below are the token costs for GPT4 versus GPT3.5-turbo.
- GPT4, $0.06 per 1,000 tokens
- GPT3.5-turbo, $0.002 per 1,000 tokens
GPT4 is 30 times more expensive than GPT3.5-turbo. This implies that if you're spending 5 dollars per month on GPT3.5-turbo, you'll spend 150 dollars per month if you switch to GPT4. Considering that GPT3.5-turbo is also up to 3 times faster, my conclusion is that for 95% of the use cases we are considering, GPT4 is simply not worth it.
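The cost arithmetic can be checked with a few lines of Python. This is a sketch under two assumptions: the listed prices are quoted per 1,000 tokens, and the monthly volume of 2.5 million tokens is a hypothetical figure chosen so that the GPT3.5-turbo bill comes out at 5 dollars. The 30x ratio itself does not depend on either assumption.

```python
# Prices from the list above, assumed to be USD per 1,000 tokens
PRICE_PER_1K_TOKENS = {
    "gpt-4": 0.06,
    "gpt-3.5-turbo": 0.002,
}


def monthly_cost(model: str, tokens_per_month: int) -> float:
    """Cost in USD of sending a given number of tokens through a model."""
    return PRICE_PER_1K_TOKENS[model] * tokens_per_month / 1000


# Hypothetical volume: 2.5 million tokens per month
tokens_per_month = 2_500_000

cost_gpt35 = monthly_cost("gpt-3.5-turbo", tokens_per_month)
cost_gpt4 = monthly_cost("gpt-4", tokens_per_month)
price_ratio = cost_gpt4 / cost_gpt35

print(f"GPT3.5-turbo: ${cost_gpt35:.2f} per month")
print(f"GPT4:         ${cost_gpt4:.2f} per month")
print(f"GPT4 costs {price_ratio:.0f}x as much")
```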
Time Is EVERYTHING!
With digital marketing, time is everything, something I have written about extensively in other articles. If a website visitor needs to wait 30 seconds for a chatbot to provide an answer instead of 10 seconds, chances are they won't wait at all and will simply close the chatbot and leave your site, believing it is broken.
For an AI Expert System where you require extremely high quality, or for a support chatbot that needs to provide extremely high-quality responses, GPT4 should be considered. However, for a website chatbot doing simple Q&A, GPT3.5-turbo should be your tool of choice.
Also, notice that you can significantly improve the quality of GPT3.5-turbo by massaging your training data, as the example with "offering chatbots to digital marketing companies" at the top of this article illustrates. We will probably guide our clients to use GPT3.5-turbo in 95% of our use cases for the above reasons. You can use your Magic Dashboard to pinpoint which parts of your training data were used as context to answer a question.
The list of URLs it shows refers to your training data snippets, allowing you to find these in your training data and edit them to provide higher-quality answers. Having high-quality training data is probably 100 times more important for the end result than whichever OpenAI model you're using.
- GPT4 is 30 times more expensive than GPT3.5-turbo
- GPT3.5-turbo is up to 3 times faster than GPT4
- GPT4 is smarter than GPT3.5-turbo, but not enough to justify its additional cost and time
- GPT4 is simply not worth it for 95% of our use cases
Published at DZone with permission of Thomas Hansen. See the original article here.