Towards Data Science

Towards Data Science · 2026-06-02T01:03:23.047Z

Internet Publishing

San Francisco, California 646,420 followers

Publish insights on the world-leading AI, ML & data-science platform and reach data professionals worldwide.

About us

Towards Data Science is a community-powered publication that showcases work in data science, machine learning and artificial intelligence. Every day newcomers, seasoned researchers and industry practitioners publish tutorials, research notes and real-world case studies that help the field move forward. Contributors receive editorial guidance, best-in-class publishing tools and prominent placement on our site, newsletter and social feeds. Accepted articles are eligible for the TDS Author Payment Program, which compensates writers based on reader engagement. If you have an idea worth sharing, submit your draft, join the conversation and connect with a global audience of data professionals. Insight Partners is an investor in Towards Data Science.

Website: http://towardsdatascience.com
Industry: Internet Publishing
Company size: 11-50 employees
Headquarters: San Francisco, California
Type: Privately Held
Specialties: Data Science, Machine Learning, Artificial Intelligence, Data Visualization, Data, Data Engineering, AI Agents, Software Development, DevOps, Programming, Technology, and Digital Publishing

Locations

Primary

548 Market St

San Francisco, California 94104, US

Updates

Towards Data Science

646,420 followers
2h
Follow along with the latest installment of Angela Shi's enterprise RAG series to learn why stacking a reranker on top of weak retrieval doesn’t save it, and what cross-encoders actually fix.

Rerankers Aren’t Magic Either: When the Cross-Encoder Layer Is Worth the Cost | Towards Data Science https://towardsdatascience.com

Towards Data Science

646,420 followers
6h
If your pipeline fails silently, the issue is rarely just the model. Emmimal P Alexander shows how adding a control layer fixes validation and output consistency problems at scale.

Prompt Engineering Isn’t Enough — I Built a Control Layer That Works in Production | Towards Data Science https://towardsdatascience.com

Towards Data Science

646,420 followers
8h
"But what if we could exploit this structural predictability? What if we could predict the value of a section before we ever send it to the LLM, drastically cutting ingestion costs by strategically ignoring the noise?" Partha Sarkar introduces a novel, effective approach to named-entity resolution in RAG systems.

Proxy-Pointer RAG: Eliminating Wasteful Entity & Relations Extraction in Knowledge Graphs | Towards Data Science https://towardsdatascience.com

Towards Data Science

646,420 followers
11h
"This group doesn’t just think with AI—they actively think about how they’re thinking while using AI. And this skill may quietly become the defining human advantage in the AI era. That skill is: metacognitive regulation." Rashi Desai proposes a counterintuitive area for AI practitioners to focus on.

Meta-Cognitive Regulation Might Be the Most Important AI Skill Nobody Is Talking About | Towards Data Science https://towardsdatascience.com

Towards Data Science

646,420 followers
16h Edited
For a hands-on look at RAG failure points, the shortcomings of embeddings, and how practitioners can overcome them, don't miss Angèle LIM's recent deep dive, part of the Enterprise Document Intelligence series.

Embeddings Aren’t Magic: The Predictable Failure Modes of RAG Retrieval | Towards Data Science https://towardsdatascience.com

Towards Data Science

646,420 followers
18h
We're thrilled to share a new article by Minh Chien Vu: a thorough and accessible introduction to Qdrant TurboQuant, a recently released quantization method.

Qdrant TurboQuant Explained: Is TurboQuant the Silver Bullet? | Towards Data Science https://towardsdatascience.com

Towards Data Science

646,420 followers
20h
What do we lose when we outsource research — and other similar, cognitively demanding tasks — to AI agents? Jacopo Tagliabue offers a nuanced and frank reflection on an emerging conundrum.

It’s the Lessons We Learned Along the Way. Or, Is It? | Towards Data Science https://towardsdatascience.com

Towards Data Science reposted this
Salvatore Cagliari
3d
If you've ever wondered what "Lineage" means in DAX and how to manipulate it, read my latest piece on Towards Data Science, where I dive into the topic and show how using lineage can improve your DAX code.

Towards Data Science

646,420 followers
3d

Why does lineage matter in DAX? How can we use it in our day-to-day Power BI projects? Follow along with Salvatore Cagliari's new tutorial to find out.

Explaining Lineage in DAX | Towards Data Science https://towardsdatascience.com

Towards Data Science reposted this
Minh Chien Vu

Ph.D | AI lead
2d Edited
I spent weeks benchmarking Qdrant's new TurboQuant and my honest take is: it's not a silver bullet, but it's the most thoughtful quantization I've seen in production vector search.

Most engineers treat quantization as a simple tradeoff: compress more, lose recall. TurboQuant asks a different question: what if the compression itself was smarter?

The idea behind it (from a Google Research paper presented at ICLR 2026): rotate the vector before compressing it. That rotation spreads energy evenly across all dimensions, so no single dimension carries too much signal or too much noise. Then apply one codebook to everything equally.

Scalar quantization applies the same fixed grid to every dimension regardless of variance. Binary quantization throws away almost everything except the sign. TurboQuant changes the shape of the problem first, then spends bits on a better-prepared vector.

Here's what I actually measured across 10K / 50K / 100K vectors on the DBpedia OpenAI embeddings dataset (1536-dim, high variance ratio of 233x):

→ TQ 4-bit reached 0.965 recall@10 at 100K vectors, only 1.5 points below Scalar Quantization, at 8× compression vs Scalar's 4×
→ Binary Quantization fell from 0.916 to 0.78 recall as the dataset doubled. TQ variants held much more stable
→ Adding rescoring, TQ 4-bit hit 0.996, effectively matching Float32 recall at half the memory
→ Latency with TQ 4-bit + rescore: 6.4ms vs Float32's 7.6ms

The thing that surprised me most? TurboQuant's recall doesn't degrade as fast as the corpus grows. That's the rotation step doing its job.

My practical conclusion after all this:

→ TQ 4-bit is the most balanced starting point. Better compression than Scalar, similar recall.
→ TQ 1.5-bit + rescoring is the move when you're storage-constrained but can't sacrifice retrieval quality.
→ TQ 1-bit: skip it unless you've tested it hard on your own embeddings.
→ Still prefer Binary Quantization if throughput is the only goal. TurboQuant costs more per query.

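The rotate-then-quantize idea described in this post can be illustrated with a small NumPy sketch. This is not Qdrant's implementation: the random orthogonal matrix built via QR, the single uniform 4-bit grid, and the synthetic data with log-spaced per-dimension scales are all assumptions made for illustration, and the helper names `random_rotation`, `variance_ratio`, `quantize`, and `dequantize` are hypothetical. The sketch shows that a rotation leaves vector norms unchanged while evening out per-dimension variance, and that 4-bit codes are nominally 8× smaller than float32 (an 8-bit grid would give 4×).

```python
import numpy as np

def random_rotation(dim, seed=0):
    """Random orthogonal matrix via QR (an assumed stand-in for the rotation step)."""
    rng = np.random.default_rng(seed)
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    # Flip column signs using diag(r); the result is still orthogonal.
    return q * np.sign(np.diag(r))

def variance_ratio(x):
    """Largest per-dimension variance divided by the smallest."""
    v = x.var(axis=0)
    return float(v.max() / v.min())

def quantize(x, bits):
    """One shared uniform grid over [-m, m] for every dimension."""
    m = float(np.abs(x).max())
    levels = 2**bits - 1
    # Codes are kept one per byte here; real storage would pack them tighter.
    codes = np.round((x + m) / (2 * m) * levels).astype(np.uint8)
    return codes, m

def dequantize(codes, bits, m):
    levels = 2**bits - 1
    return codes.astype(np.float64) / levels * (2 * m) - m

# Synthetic vectors whose per-dimension scale spans two orders of magnitude.
dim, n, bits = 128, 2000, 4
rng = np.random.default_rng(1)
x = rng.standard_normal((n, dim)) * np.logspace(-1, 1, dim)

rot = random_rotation(dim)
x_rot = x @ rot.T                      # rotate every vector

ratio_before = variance_ratio(x)
ratio_after = variance_ratio(x_rot)

codes, m = quantize(x_rot, bits)       # compress the rotated vectors
x_hat = dequantize(codes, bits, m) @ rot   # undo the rotation after decoding

compression_vs_float32 = 32 / bits     # nominal bits per dimension: 32 -> 4

print(f"variance ratio before rotation: {ratio_before:.1f}")
print(f"variance ratio after rotation:  {ratio_after:.1f}")
print(f"nominal compression vs float32: {compression_vs_float32:.0f}x")
```

On this synthetic data the variance ratio should be far smaller after rotation; exact values depend on the seed, and nothing here reproduces the recall or latency figures reported in the post.
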
One important caveat worth calling out: TurboQuant launched May 11, 2026. Real production experience is still limited. The geometry preservation works great for L2/cosine/dot product. For Manhattan distance, it needs full vector reconstruction, so stick with Scalar Quantization there.

I wrote up the full technical breakdown of the pipeline, the benchmarks, and a decision flowchart for when to use each method, shared in Towards Data Science (link in comments).

If you're running Qdrant in production and have tried TurboQuant on real data, I'm genuinely curious whether your recall numbers held up at larger scales.

Link to the full article: https://lnkd.in/g6QUXJtZ

#VectorDatabase #Qdrant #MachineLearning #VectorSearch #NLP
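
The Manhattan-distance caveat is consistent with a general property of orthogonal rotations: they preserve Euclidean distances and dot products (and therefore cosine similarity), but not L1 distances. The check below assumes the rotation is an orthogonal matrix, here a hypothetical random one built via QR, and the two test vectors are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 64

# Assumed rotation: a random orthogonal matrix from a QR decomposition.
rot, _ = np.linalg.qr(rng.standard_normal((dim, dim)))

# Two sparse, made-up vectors so the L1 change is easy to see.
a = np.zeros(dim)
a[0], a[1] = 1.0, 0.5
b = np.zeros(dim)
b[1] = 1.0

ra, rb = rot @ a, rot @ b

l2_before = np.linalg.norm(a - b)
l2_after = np.linalg.norm(ra - rb)
dot_before = a @ b
dot_after = ra @ rb
l1_before = np.abs(a - b).sum()
l1_after = np.abs(ra - rb).sum()

print(f"L2:  {l2_before:.4f} -> {l2_after:.4f}")    # unchanged up to rounding
print(f"dot: {dot_before:.4f} -> {dot_after:.4f}")  # unchanged up to rounding
print(f"L1:  {l1_before:.4f} -> {l1_after:.4f}")    # changes under rotation
```

Because the L1 distance changes once vectors are rotated, it cannot be read directly off rotated codes, which fits the post's advice to reconstruct the full vector or use Scalar Quantization for Manhattan distance.
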

Employees at Towards Data Science

Chris Fotache

Frederic Marthoz Baro

Roger Noble

Courtney Perigo
