Behind DeepSeek's "server busy" responses lies more than the anxious waiting of ordinary users. When API failures broke through the critical threshold, a butterfly effect of aftershocks rippled through the world of DeepSeek's developers as well.
On January 30, Lin Sen, a Beijing-based AI developer whose product is built on DeepSeek, suddenly received an alert from his program's backend. Before he could finish celebrating DeepSeek's breakout success, his backend was paralyzed for three days because it could no longer call DeepSeek.
At first, Lin Sen assumed the problem was an insufficient balance in his DeepSeek account. It was not until he returned to work after the Spring Festival holiday, on February 3, that he received a notice from DeepSeek that API recharges had been suspended. By then, even with a sufficient balance in his account, he could no longer call DeepSeek.
On the third day after Lin Sen received the backend notice, DeepSeek officially announced, on February 6, that it was suspending API recharges. Nearly half a month later, as of February 19, the recharge service on the DeepSeek open platform had still not returned to normal.
Caption: Recharges on the DeepSeek developer platform have not yet been restored. Image source: Alphabet List screenshot
When Lin Sen realized that the backend paralysis was caused by DeepSeek's overloaded servers, and that as a long-time developer he had received neither advance notice nor any after-sales support, he felt "abandoned."
"It's like a small shop by your front door. You're a regular; you bought a membership card and always got along well with the owner. Then one day the shop is rated a Michelin restaurant, the owner tosses the old customers aside and no longer recognizes the card you bought," Lin Sen said.
As one of the first developers to build on DeepSeek, starting in July 2023, Lin Sen is excited about its breakout success, but to keep his service running he has had to switch back to ChatGPT. After all, "ChatGPT is more expensive, but at least it's stable."
As DeepSeek went from a word-of-mouth neighborhood shop to a Michelin restaurant everyone lines up to check in at, more developers who, like Lin Sen, could no longer get their calls through began fleeing it one after another.
In June 2024, the XiaoWindow AI Q&A machine integrated DeepSeek V2 in the early stage of the product. What surprised XiaoWindow partner Lou Chi was that DeepSeek was the only model at the time that could recite the full text of the "Yueyang Tower Inscription" without a mistake. The team therefore gave DeepSeek one of the product's most core functional roles.
But for developers, good as DeepSeek is, its stability has always been lacking.
Lou Chi told Alphabet List (ID: wujicaijing) that during the Spring Festival, it was not only C-end users who were met with "busy" responses; developers were also frequently unable to call DeepSeek, so the team decided to pick several large-model platforms that had already integrated DeepSeek and call them in parallel.
After all, "there are dozens of platforms running the full-strength version of DeepSeek R1." Calling R1 on those platforms, combined with Agents and Prompts, can also meet users' needs.
To compete for the developers spilling out of DeepSeek, some leading cloud vendors have begun holding frequent promotions for developers. "Join an event and you get free computing power. Unless you call in large volumes, small developers can use it almost for free," said Yang Huichao, technical director of Yibiao AI.
Even so, DeepSeek remains at peak popularity. As the first batch of developers flees, even more developers are flocking in, hoping to catch its traffic dividend.
Xi Jian's project, a role-playing AI companion app built on DeepSeek's API, gained about 3,000 active users in its first week after launching on February 2.
Although some users reported that DeepSeek's API calls returned errors, 60% of users hoped Xi Jian would launch an Android version as soon as possible. In Xi Jian's social media backend, at least dozens of users send private messages every day asking for a download link. "An AI companionship platform built on DeepSeek" has undoubtedly become a new label that helps an app go viral.
According to Alphabet List's statistics, the list of apps integrated with DeepSeek published on DeepSeek's official website had only 182 entries before 2025 and has now expanded to 488.
On one side, DeepSeek has become the "light of domestic AI," going viral with 100 million users pouring in within 7 days. On the other side, the first batch of developers who built on DeepSeek are hitting "server busy" errors caused by the traffic overload and switching to other large models.
For developers, a prolonged service anomaly is no longer a simple outage; it becomes a fissure between the world of code and business logic. Forced to run survival calculations against migration costs, whether they pour in or flee, developers must face the aftershocks of the DeepSeek explosion.
1
After the Spring Festival, his mini program's backend was paralyzed for three days. By the sixth day of the Lunar New Year, Lin Sen, who had built on DeepSeek for more than a year, left it and returned to ChatGPT.
Even though the API call price is nearly 10 times higher, ensuring service stability has become the higher priority.
It is worth noting that for a developer, leaving DeepSeek for another large model is not as easy as a user switching models inside an app. "Different large language models, and even different versions of the same model, respond to the same prompts with subtle differences." Even though Lin Sen had called ChatGPT before, migrating all the key nodes off DeepSeek while keeping the output stable and high-quality still took him more than half a day.
The switch itself may take only two seconds, but "for many developers, rewriting the prompts and re-running the tests takes a week," Lin Sen told Alphabet List.
In the eyes of small developers like Lin Sen, it is understandable that DeepSeek's servers are overstretched, but advance notice would have avoided many losses, whether in time or in app maintenance costs.
After all, "logging into the DeepSeek developer backend requires registering a mobile phone number; a single text message would have been enough to notify developers in advance." Now these losses are borne by the developers who supported DeepSeek back when it was unknown.
When developers are deeply coupled to a large-model platform, stability undoubtedly becomes an unwritten contract; a frequently fluctuating service interface is enough to make developers re-examine their loyalty to the platform.
Just last year, when Lin Sen was calling Mistral (the leading French large-model company), a billing-system error on Mistral's side caused him to pay twice. After he sent an email, Mistral corrected the problem in less than an hour and attached a 100-euro voucher as compensation. That response earned Lin Sen's trust, and he has now moved some of his services back to Mistral.
Yang Huichao, technical director of Yibiao AI, began planning his own escape after the release of DeepSeek V3.
Writing poems and rants with DeepSeek is one thing; what about using it to write bid documents? Yang Huichao, who is responsible for the company's AI bidding project, began looking for alternatives after DeepSeek released V3. For him, in a professional field like bid documents, "DeepSeek is becoming less and less stable."
The reasoning ability that made DeepSeek R1 go viral holds little appeal for Yang Huichao. After all, "as a developer, the software's core reasoning depends mainly on our own programs and algorithms, not so much on the base capabilities of the model. Even if the underlying model were the oldest GPT-3.5, algorithmic correction can still produce good results. As long as the model's answers are stable, that is enough."
In actual use, DeepSeek looked, in Yang Huichao's eyes, more like a smart but lazy "good student."
After upgrading to V3, Yang Huichao found that DeepSeek answered some complex questions with a higher success rate, but its instability also rose to an unacceptable level. "Ask 10 questions now and at least one of the outputs is unstable. Beyond the content it is asked to generate, DeepSeek often likes to freestyle and produce extra content unrelated to the question."
For example, typos are not allowed in a bid document. At the same time, the results returned by the large model are usually required to be output in a JSON structure (using instructions so that the model reliably returns fixed fields), so the data can be fed into subsequent function calls; if the output contains errors or deviates from the format, the downstream calls fail.
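The pattern Yang Huichao describes, instructing the model to return a fixed JSON structure and then feeding those fields into downstream calls, roughly looks like the sketch below. It is a minimal illustration assuming an OpenAI-compatible client; the field names, prompt, and endpoint are hypothetical, not Yibiao AI's actual code.

```python
import json
from openai import OpenAI

# Hypothetical client setup; the key, base_url, and model name are placeholders.
client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com")

SYSTEM = (
    "You are a bid-writing assistant. Reply with JSON only, using exactly the "
    'fields {"section_title": string, "body": string}. Do not add any text '
    "outside the JSON object."
)

def generate_section(requirement: str) -> dict:
    """Ask the model for one bid section and validate the returned JSON."""
    resp = client.chat.completions.create(
        model="deepseek-chat",
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": requirement},
        ],
        temperature=0,  # lower temperature to discourage off-topic "freestyling"
    )
    raw = resp.choices[0].message.content
    data = json.loads(raw)  # raises an error if the model strayed from JSON
    # Guard the fixed fields before handing the data to downstream function calls.
    if not {"section_title", "body"} <= set(data.keys()):
        raise ValueError(f"missing fields in model output: {raw!r}")
    return data
```

When the model improvises and breaks the format, it is exactly this kind of JSON parsing or field check that fails, which is how the instability Yang Huichao describes cascades into failed downstream calls.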
"DeepSeek R1, perhaps compared with previous V3 versions, has improved its inference ability much, but its stability cannot be achieved. The level of commercialization. "In the @电影 Mark account, Yang Huichao mentioned.
Caption: Garbled code generated by DeepSeek V3. Image source: the @电影 Mark account
Having used DeepSeek-Coder as early as 2024, Yang Huichao does not deny that DeepSeek is a good student. But now, to guarantee the quality and stability of the generated bid documents, he can only turn his attention to other domestic large models that are more oriented toward B-side users.
After all, DeepSeek, once called the Pinduoduo of the AI industry, quickly gathered a crowd of small and medium-sized AI developers with its price-performance label. But now, to call DeepSeek directly and stably, you have to deploy it on-premises. "Deploying a DeepSeek R1 costs 300,000 to 400,000 yuan. Going by the online API prices, I couldn't spend 300,000 yuan in my lifetime."
Developers like Yang Huichao, unable to get their calls through, are leaving DeepSeek in batches.
2
Once, Lin Sen and developers like him were among the first to firmly choose DeepSeek.
In June 2024, while developing his own AI product, the "Young People Listen to the World" mini program, Lin Sen compared dozens of large-model platforms at home and abroad. He needed a large model to process thousands of news items every day, filtering and ranking them to find technology and nature stories suitable for teenagers to listen to, and then processing the news text.
The model had to be not only smart, but also cheap.
Processing thousands of news items a day burns through tokens. For an independent developer like Lin Sen, the ChatGPT models are expensive and only suitable for the core links; quickly screening and analyzing large volumes of text has to rely on other low-priced large models.
At the same time, whether it was Mistral, Gemini, or ChatGPT, calling an overseas model was complicated: you needed a dedicated server abroad to act as a relay, and a foreign credit card to buy tokens.
Lin Sen could only top up his ChatGPT account with a British friend's credit card, and with the server overseas, API responses carried noticeable delays. That pushed him to look domestically for a ChatGPT replacement.
DeepSeek surprised him. "At the time, DeepSeek wasn't the most famous, but its feedback was the most stable." Take an API call issued every 10 seconds as an example: other domestic large models might fail to return any content across 100 calls, while DeepSeek returned a result every time, with reply quality no worse than ChatGPT or BAT's large-model platforms.
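The kind of stability probe Lin Sen describes is simple to reproduce: fire one request every 10 seconds and count how often the API actually returns content. The sketch below is a minimal illustration against a generic OpenAI-style endpoint; the URL, key, model name, and prompt are placeholders, not Lin Sen's actual test harness.

```python
import time
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
HEADERS = {"Authorization": "Bearer sk-..."}              # placeholder key

ok, total = 0, 100
for _ in range(total):
    try:
        r = requests.post(
            API_URL,
            headers=HEADERS,
            json={
                "model": "some-chat-model",
                "messages": [
                    {"role": "user", "content": "Summarize today's top science news in one sentence."}
                ],
            },
            timeout=30,
        )
        content = r.json()["choices"][0]["message"]["content"]
        ok += int(bool(content.strip()))  # count only replies that carry content
    except Exception:
        pass  # timeouts, network errors, and malformed replies count as failures
    time.sleep(10)  # one request every 10 seconds, as in Lin Sen's comparison

print(f"returned content on {ok}/{total} calls")
```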
Compared with the API prices of ChatGPT and BAT's large models, DeepSeek was simply too cheap.
Lin Sen handed much of the news reading and preliminary analysis over to DeepSeek and found that its calling cost was a tenth of ChatGPT's. After prompt optimization, the daily cost of calling DeepSeek dropped to as little as 2-3 yuan. "Compared with ChatGPT it may not be the best, but DeepSeek's price is extremely low. For my project, the price-performance ratio is very high."
Caption: Lin Sen uses a large model to collect and analyze news (left), which is finally presented in the "Young People Listen to the World" mini program (right). Image source: provided by Lin Sen
Price-performance ratio has become the primary reason developers choose DeepSeek. In 2023, Yang Huichao first switched the company's AI project from ChatGPT to Mistral, mainly to control costs. Then, in May 2024, DeepSeek released V2 and pushed its API price down to 2 yuan per million tokens, an undisguised dimensionality-reduction strike against other large-model vendors. That price cut is also what led Yang Huichao to switch the AI bid-writing tool to DeepSeek.
At the same time, after testing, Yang Huichao found that the domestic BAT players, which already rely on cloud services to win B-side market share, are "too heavy as platforms."
For a startup like Yibiao AI, choosing BAT means facing bundled cloud-service consumption. For Yang Huichao, who simply wanted to call a large-model service, DeepSeek's API was undoubtedly less hassle.
DeepSeek also won on migration cost.
Whether for Lin Sen or Yang Huichao, their apps were initially developed against the OpenAI interface format. Switching to BAT's large-model platforms would mean redeveloping the underlying layer, but DeepSeek is compatible with the OpenAI-style interface, so switching models only requires changing the platform address: "a painless one-minute switch."
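That compatibility is the whole trick: code written against the OpenAI SDK can be pointed at DeepSeek by changing the base URL and model name, leaving the rest of the call path untouched. A minimal sketch, assuming the official openai Python SDK; the keys are placeholders and the model names are only illustrative.

```python
from openai import OpenAI

# Before: the client pointed at OpenAI.
# client = OpenAI(api_key="sk-openai-...", base_url="https://api.openai.com/v1")
# model = "gpt-4o-mini"

# After: the same code pointed at DeepSeek's OpenAI-compatible endpoint.
client = OpenAI(api_key="sk-deepseek-...", base_url="https://api.deepseek.com")
model = "deepseek-chat"

resp = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```

Everything downstream of the client object stays the same, which is why developers describe the switch in minutes; what takes days or weeks, as Lin Sen notes, is re-tuning the prompts for the new model's behavior.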
The XiaoWindow AI Q&A machine was equipped with DeepSeek on the first day it officially went on sale, and among its five core roles, Chinese and essay guidance were handed over to DeepSeek.
As a partner, Lou Chi was also amazed by DeepSeek in June last year. "DeepSeek's Chinese comprehension is excellent; it was the only large model at the time that could recite the full text of the 'Yueyang Tower Inscription' without a mistake." Lou Chi told Alphabet List that compared with the formulaic, officialese output of other large models, using DeepSeek to teach children to write essays often wins on imagination.
Long before social media caught on to writing poetry and science fiction with DeepSeek, its flamboyant writing style had already made the XiaoWindow AI team's eyes light up.
For now, developers are still looking forward to the recovery of DeepSeek's API. Whether they move to the platforms where BAT has deployed the full-strength version of DeepSeek R1 or switch to other large-model vendors, it all looks like a stopgap.
3
Meanwhile, competitors are racing to match DeepSeek's standout strength in deep reasoning.
Domestically, Baidu and Tencent have recently added deep-thinking capabilities to their self-developed large models. Abroad, OpenAI rushed out its new "Deep Research" feature in February, which uses a reasoning model's thinking ability to search the web and is being opened to Pro, Plus, and Team users. Google DeepMind also released the Gemini 2.0 model series in February, with the 2.0 Flash Thinking experimental version as its reasoning-enhanced model.
It is worth noting that DeepSeek still mainly handles text, while both ChatGPT and Gemini 2.0, beyond supporting deep thinking, have carried their reasoning capabilities into multimodality, supporting video, voice, documents, pictures, and other input modes.
For DeepSeek, beyond catching up on multimodality, the bigger challenge comes from how competitors are moving on price.
On the cloud-deployment side, a number of leading cloud vendors have chosen to integrate DeepSeek, sharing its traffic while using cloud services to lock in customers. Calling the DeepSeek model has, to some extent, become a "gift" bundled with enterprise cloud services.
Baidu founder Robin Li recently argued that in large language models, "inference costs can be cut by more than 90% every 12 months."
With inference costs trending downward, it is inevitable that BAT's API prices will keep falling, and DeepSeek's price-performance advantage now faces the pressure of a new round of price wars among the major vendors.
And the large-model API price war is only the beginning; for developers, the large-model vendors are also competing on service.
Lin Sen has dealt with many large-model platforms, big and small, and quite a few of them proactively reach out to developers.
By contrast, although DeepSeek, as an open-source large-model platform, aims to provide developers with more inclusive AI support, its official website does not even have an entrance for developers to request invoices.
"Unlike other large-model platforms, where you can issue an invoice directly in the backend after every API recharge, with DeepSeek you have to go outside the official website and add the customer service's corporate WeChat to get one," Yang Huichao told Alphabet List. Whether in price or in service, DeepSeek's "price-performance" label is starting to wobble.
An AI product manager at a leading vendor told Alphabet List that some internet companies insist on replacing their original large model with DeepSeek regardless of how long it takes to swap the model and re-tune the prompts, even though the full-strength DeepSeek R1 still lacks many general capabilities such as Function Calling.
Compared with the BAT players, which use cloud services to run B-side service scenarios, DeepSeek still lags the major AI vendors in convenience.
But DeepSeek's traffic effect has not yet faded, and plenty of trend-chasers remain.
Some companies claimed to have integrated DeepSeek when they had merely started calling the API and topped up a few hundred yuan; others announced they had deployed the DeepSeek model when in fact they had only asked employees to follow a Bilibili tutorial and download a one-click installation package. In this DeepSeek boom, mud and sand flow together.
The tide will eventually recede, but DeepSeek clearly has more homework to do.