
DeepSeek's modest R1 training costs puncture big tech's lavish AI investments

Chinese AI company DeepSeek claims to have developed a top-tier model on a minimal budget


Chinese AI developer DeepSeek has made a significant impact in the tech world with its flagship model, DeepSeek R1. The reasoning model, designed to excel at complex tasks such as mathematics and coding, has been downloaded more than 10 million times from the AI community platform Hugging Face.

DeepSeek R1's training process has been compared to a child playing video games: the model learns through trial and error. It uses a carrot-and-stick approach to reinforcement learning, in which correct problem-solving is rewarded and incorrect answers are penalised. This approach enabled the model to develop its own strategies rather than copying human tactics.
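As a loose illustration only (not DeepSeek's actual training code), the carrot-and-stick idea can be sketched as a toy reward loop: a "policy" chooses between two hypothetical answering strategies, earns a reward when its answer is correct and a penalty when it is wrong, and its preference gradually shifts toward the strategy that succeeds.

```python
import random

# Toy sketch of reward-based learning, for illustration only.
# Two made-up strategies answer simple addition questions.

def correct_strategy(a, b):
    return a + b                              # always answers correctly

def sloppy_strategy(a, b):
    return a + b + random.choice([0, 1])      # sometimes off by one

def train(episodes=2000, lr=0.05, seed=0):
    random.seed(seed)
    strategies = [correct_strategy, sloppy_strategy]
    weights = [0.0, 0.0]                      # preference score per strategy
    for _ in range(episodes):
        # Explore 10% of the time; otherwise exploit the preferred strategy.
        if random.random() < 0.1:
            i = random.randrange(2)
        else:
            i = max(range(2), key=lambda k: weights[k])
        a, b = random.randrange(10), random.randrange(10)
        answer = strategies[i](a, b)
        reward = 1.0 if answer == a + b else -1.0   # carrot or stick
        weights[i] += lr * reward                   # reinforce or penalise
    return weights

w = train()
# With this seed, the reliably correct strategy ends up with the higher score.
```

Real reinforcement learning for language models operates on model parameters rather than a two-entry preference table, but the feedback loop is the same: behaviour that earns reward is strengthened, and behaviour that is penalised fades.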

DeepSeek R1 was trained using 512 Nvidia H800 chips, noted for their data and energy efficiency. DeepSeek's claim of a cost-efficient training process was vindicated by R1's peer-reviewed publication in the journal Nature.

Reports from Wired suggest that DeepSeek has confirmed the cost of training the R1 model at just over $294,000. That figure is dramatically lower than the projections of Anthropic CEO Dario Amodei, who in mid-2024 estimated that a new frontier model could cost upwards of $100 billion, and than OpenAI CEO Sam Altman's earlier remark at an MIT event that foundation model training costs upwards of $100 million.

DeepSeek's decision to release R1 as an open-weight model, freely available for anyone to download, has been welcomed by industry stakeholders and is expected to drive further innovation and collaboration across the AI community.

Tech industry giant Cisco is reportedly looking to capitalise on the 'DeepSeek effect' by drawing on the cost-efficient, innovative approach behind R1's training. Reasoning models like DeepSeek R1 are trained on real-world data to 'learn' how to solve specific problems, making them valuable assets across industries.

Researchers at Carnegie Mellon University have likened DeepSeek R1's training process to a child playing video games, learning through trial and error. The comparison underscores R1's potential to change how AI models are trained and to pave the way for more cost-effective and efficient AI systems.
