
Have Yinz Been Following The DeepSeek News?

Feb 5, 2025

Out of nowhere, a relatively unknown Chinese AI company called DeepSeek came out with a surprisingly capable open-source LLM, hit the top of the app store charts, and sparked some serious hand-wringing (and hot takes) across the tech world.

The moment that really kicked things off was the release of DeepSeek-R1, a reasoning-focused language model they dropped on Hugging Face under a very permissive MIT license. That meant anyone (individuals, startups, even big companies) could use it commercially without paying a dime. The model claimed strong performance on tricky benchmarks like MATH-500 and SWE-bench Verified, which (if true) would put it in the same ballpark as OpenAI’s o1. On top of that, they claimed to have done it at a stupidly cheap (relatively speaking) cost. DeepSeek reportedly trained its V3 model (the one used in their AI assistant app) for just $5.6 million, using older Nvidia GPUs. For comparison, OpenAI and Anthropic are reportedly spending hundreds of millions on their newer models.

DeepSeek’s AI assistant app very quickly hit the top of both the App Store and the Play Store in the US. I didn’t download it (sketchy Chinese AI spyware is not the kind of thing I trust on my phone), but millions of other people didn’t seem to mind.

Reactions across the tech world came fast. Marc Andreessen called it “one of the most amazing and impressive breakthroughs I’ve ever seen.” Yann LeCun from Meta used it to make the case for open-source dominance. Garry Tan had a more measured take, pointing out that cheaper models will drive demand for inference, which benefits companies building the infrastructure.

On Hugging Face, DeepSeek’s release turned into a mini gold rush. Developers created over 500 derivative models based on R1, and downloads surpassed 2.5 million in just a few days.

But of course there were a lot of red flags, starting with (surprising no one) censorship. Researchers found that R1 refused to respond to about 85% of prompts related to things like Tiananmen Square or Taiwan, often deflecting with extremely nationalistic replies. Apparently, though, it was pretty easy to jailbreak these filters if you wanted to.

On the geopolitical front, reports say hundreds of U.S.-based organizations, especially ones with government contracts, have blocked DeepSeek’s services outright, citing data security risks. And in a really weird series of events, according to The Information, Microsoft launched an internal investigation into whether DeepSeek improperly used OpenAI’s API output to train its models, which would be a violation of OpenAI’s terms. But Microsoft also made DeepSeek-R1 available on Azure AI Foundry, saying it passed “rigorous safety evaluations.” So on one hand, they’re investigating the company; on the other, they’re selling its models. Something something capitalism, amirite?

Toward the end of the month, DeepSeek released Janus-Pro, a family of image generation models they say outperform DALL·E 3. These were also released on Hugging Face, ranging from 1B to 7B parameters. So, we can all look forward to even more AI-generated spam in our Instagram feeds? … Yay?

All of this is still developing, so who knows if any of it will still apply in a month, but it’s, uh, interesting times, to say the least.

Would You Like To Know More?