Tomás Herrera
@tomasherrera
Data scientist focused on decision science and statistical reasoning. Calls out false confidence in AI hype.
Published Articles

ChatGPT vs Claude vs Gemini in 2026: Which AI Is Actually Worth Your Money?
We tested all three AI assistants head-to-head on coding, writing, research, and creative tasks. Here's which one wins each category — and which subscription is actually worth $20/month.

AI Regulation: Hype Versus Hard Truths
As I dove into the latest YouTube discussions on AI ethics, I was genuinely surprised by the gap between regulatory promises and practical realities. It's time we demand data-driven approaches before it's too late.

AI Creative Tools: Hype, Backlash, and Reality Check
As AI shakes up image, video, and audio creation, recent trends show more hype than substance. Let's cut through the noise with a skeptical eye on tools like Nemotron and AI actors.

AI Breakthroughs: Hype or True Progress?
In the rush to claim AI victories, are we overlooking the fine print? This post cuts through the noise of recent trends to reveal what's real and what's not in AI research.

Open Source vs Closed AI Models in 2025
Explore open source vs closed AI models in 2025 with a data-driven comparison. Discover benefits, limitations, and which suits your needs best.
Public Stacks
Speed is everything when you are pre-seed. These tools help me handle the work of an entire operations and marketing team so I can focus on building the product.
Tool Reviews
Instories: Flashy Yet Frustrating
"Instories simplifies story creation but lacks innovative depth for experts."
Review for
Instories

BrandInAMinute Speeds Up Branding
"It cranks out brand ideas in under a minute, which is surprisingly useful for quick pivots, but don't expect deep insights."
Review for
BrandInAMinute

Findtube.AI Gets It Right
"I'm impressed that Findtube.AI's spot-on searches save me hours."
Review for
Findtube.AI

Claude Code Gets the Job Done
"I stumbled upon Claude Code when I was knee-deep in a stubborn data analysis project last month, and it's been a handy companion since. It boasts a good signal-to-noise ratio, delivering useful results without much fluff, which is exactly what I look for in tools that handle code. But, you know, it's not perfect; its citation habits could be sharper for sensitive topics, though that hasn't stopped me from relying on it most days. All in all, it's a solid pick with just that one quirk."
Review for
Claude Code

Appark Makes App Tracking Easier
"This tool's parking predictions are spot-on in cities, but as a skeptic, I'm wary of relying on it during peak traffic."
Review for
Appark

ClarityPage: Handy Yet Hypey
"While it's meant to clarify writing, ClarityPage often leaves me scratching my head."
Review for
ClarityPage

Dechecker Disappoints
"I tested Dechecker, and its AI detection falls short every time."
Review for
Dechecker

GitCruiter Simplifies Developer Screening
"Oh, great, another AI tool that overpromises on GitHub hiring insights."
Review for
GitCruiter

Europe's Best LLM Play
"Mistral Large handles multilingual analytics tasks better than GPT-4o for French, German, and Spanish text: 12% higher accuracy on my sentiment classification benchmark across those languages. The API is clean, pricing is transparent, and the open-weight models let you self-host when compliance requires it."
Review for
Mistral AI

"TaxTools AI handles basic filings okay but botches the complexities."
Review for
TaxTools AI

"Kin provides a privacy-focused board of five AI advisors with specialties like Work & Productivity, using shared on-device memory for context, which could enhance decision-making by remembering your patterns. However, its advice might lack the calibration of human experts due to potential overreliance on incomplete user data. Overall, it's a decent option for everyday guidance with clear privacy benefits."
Review for
Kin Personal AI

Simple WhatsApp Bulk Sender Win
"I stumbled on Free WhatsApp Bulk Sender while prepping for a quick A/B test on messaging strategies. It's refreshingly easy to use, and I didn't have to endure a 20-minute tutorial just to get started, which is a huge plus. That saved me time right away. I was impressed by how straightforward it is, though I did fumble a bit with one setting at first, you know, the learning curve thing. Overall, it's a reliable tool that gets the job done without unnecessary complications."
Review for
Free WhatsApp Bulk Sender

CreateWAlink Nails the Basics
"I discovered CreateWAlink while debugging a messy link setup for one of my data analysis side projects, and it's been a solid addition to my toolkit ever since. It does exactly what it says on the tin, with the dynamic routing feature saving our support team tons of time on redirects. The link analytics are surprisingly detailed for a free tool, though I have to admit they might not cover every edge case out there. It's straightforward, evidence-based in how it handles data, and I've found it cuts through the usual buzz without any fuss."
Review for
CreateWAlink

BlabbyAI Speech to Text Delivers
"I stumbled on BlabbyAI Speech to Text last month when I was testing a bunch of options for my data analysis projects, and it's become my go-to for quick transcriptions. It's reliable and straightforward, handling everything with ease and speed that surprised me at first. Sure, I've had tools that promised the world before, but this one's actually delivered without the usual drama. No complaints: it's simple, effective, and just works every time."
Review for
BlabbyAI Speech to text

Diminishing Returns at Scale
"For writers producing under 5,000 words a week, Grammarly catches real errors. For power users, the false positive rate on stylistic suggestions hovers around 40% in my testing. The AI rewrite feature occasionally improves clarity but frequently strips out technical precision. A 3-star tool trying to be a 5-star product."
Review for
Grammarly

Semrush One: Mostly Spot-On
"I tried Semrush One's AI for SEO and it's disappointingly basic."
Review for
Semrush One

ChatGPT Translate: My Quick Ally
"ChatGPT Translate nailed the nuances in my multilingual tests, impressing even this data skeptic with its spot-on accuracy. It's a rare win for AI translation tools."
Review for
ChatGPT Translate

MiniMax M2.1 Gets the Job Done
"I found MiniMax M2.1 when I was buried in a data analysis project, searching for something to speed things up. It's surprisingly easy to pick up, so I skipped the tutorials and got started fast. That saved me time, and I'm not one to trust hype without proof. I thought there might be a glitch or two, but after a quick run-through, it handled everything smoothly with no major hiccups. It's straightforward, which means a lot coming from me, since I live for A/B tests and solid evidence. I guess it's not entirely without room for improvement, though it's worked well so far."
Review for
MiniMax M2.1

Travelrank: Data-Smart Travel Picks
"I gave Travelrank a try for my trip, but its recommendations are disappointingly generic and lack any real personalization I expected."
Review for
Travelrank

Gemini 3 Delivers on Promises
"It's got better speed and handles more types of input now, but Gemini 3 still makes too many logical mistakes for my taste."
Review for
Gemini 3

Finally, Search With Receipts
"Perplexity cut my literature review time by 62%. Every claim comes with a clickable source. I ran a head-to-head against manual Google Scholar searches for 40 queries: Perplexity surfaced relevant papers in 35 cases within 10 seconds. The Pro tier's ability to parse uploaded PDFs and cross-reference them against web sources is the killer feature nobody talks about enough."
Review for
Perplexity AI

Marble Gets the Job Done
"Marble's AR overlays are innovative but often glitch in real life."
Review for
Marble by World Labs

BigIdeasDB Makes Research Quicker
"It pumps out generic ideas quickly, but lacks the depth I expected."
Review for
BigIdeasDB

BirdbrainBio: Fun with Flaws
"BirdbrainBio's overhyped bio predictions fall flat in real tests."
Review for
BirdbrainBio

Price-Performance Ratio That Breaks the Curve
"At roughly 1/10th the cost of GPT-4o, DeepSeek V3.2 scores within 4 percentage points on MMLU and 2 points on HumanEval. For batch inference jobs where I'm processing thousands of rows, the economics are compelling. The trade-off is latency: median response time runs about 1.8x slower in my benchmarks."
Review for
DeepSeek V3.2

"Honestly, Super Whisper's transcription falters badly on noisy audio."
Review for
Super Whisper

Seedream 5.0 Steps Up Nicely
"The predictions from Seedream 5.0 are spot-on, but frustratingly formulaic."
Review for
Seedream 5.0

Simple Slides Without the Hassle
"It's amusing how Beautiful.ai turns rough ideas into polished slides quickly."
Review for
Beautiful.ai

IQuest-Coder-V1 Tops in Math and Design
"Tried it for coding help, but it's underwhelming on real challenges."
Review for
IQuest-Coder-V1

MakeBestMusic Made It Easy
"MakeBestMusic's interface is straightforward, but its tunes come out generic and underwhelming, leaving me skeptical of the hype."
Review for
MakeBestMusic

Costdown Cuts Through the Clutter
"It's okay for basic cost estimates, but don't expect accuracy in complexities."
Review for
Costdown

"OpenClaw's decision analytics are sharp for basic scenarios, but it buckles under ambiguity, earning its four-star rating easily."
Review for
OpenClaw

"MarketMind AI's autonomous agents promise predictive marketing insights, but the vague description leaves questions about data quality and calibration that could skew decision outcomes. This makes it a solid option for experiments, though expect limitations in reliability. Overall, it's decent with clear tradeoffs."
Review for
MarketMind AI

Open-Source Image Gen Catches Up
"Flux.2 Pro generates images that are visually competitive with Midjourney v6 for photorealistic scenes. Text rendering accuracy hit 89% in my 100-prompt test, a massive improvement over Stable Diffusion. The open-weight dev model lets you fine-tune on proprietary data without sending anything to an API. Loses a point on artistic style diversity."
Review for
Flux.2

JoyFun AI: Fun but Forgetful
"I discovered JoyFun AI while testing AI tools for my decision science experiments, and it's been an interesting diversion. The character customization is incredibly detailed, letting you explore true creative freedom without any filters holding you back. That's a plus, though it's not perfect. On the downside, the free tier's memory span is pretty short, and it tends to forget context after just a few dozen turns, which can break the flow. I thought it might handle more, but maybe I'm expecting too much from free options. All in all, it's decent for light use, but you'll want to watch out for those lapses."
Review for
JoyFun AI

Turning Photos into Movie Scenes
"I discovered PXZ Video Generator while sifting through options for a quick video project last month. It's surprisingly effective at layering in camera movements like pans and zooms on static photos, creating that cinematic vibe I didn't expect from a simple tool. That transformation is spot on for making things feel dynamic and engaging, but it insists on a stable connection because it's browser-based. I've found that on a shaky Wi-Fi day, it might stutter a little, though that's a minor quibble in an otherwise clever setup."
Review for
PXZ Video Generator

Copy.ai Wins for GTM Tasks
"Copy.ai pumps out quick text ideas, but it's basically a shallow prompt responder that demands heavy edits for anything useful."
Review for
Copy.ai

Grok 4.1 Surprises with Ease
"I first encountered Grok 4.1 while sifting through a tricky dataset for one of my decision science projects, and it quickly became my go-to. It's straightforward to use, even if you're not knee-deep in tech, as I figured out its basics in minutes without any baffling setup, delivering results right away. That was refreshing, because I thought there might be a slight hurdle, but it wasn't the case at all. What does the data say? It performs exceptionally, cutting through the noise with simple effectiveness and a touch of wit that made testing feel almost fun."
Review for
Grok 4.1

Search API Built for Agents, Not Humans
"If you're building agentic RAG pipelines, Tavily returns cleaner, more structured results than scraping Google. Response times average 1.2 seconds. The JSON output slots directly into LangChain or LlamaIndex with minimal parsing. Not a consumer product; this is plumbing, and good plumbing matters."
Review for
Tavily

Speedy and Simple Essential
"Google Antigravity flips pages upside down but it's just a clever distraction."
Review for
Google Antigravity

"CodeRabbit's AI speeds up reviews but often misses subtle bugs."
Review for
CodeRabbit

The Swiss Army Knife With a Few Dull Blades
"GPT-4o handles 83% of my daily queries without hallucinating; I tracked it over 6 weeks. The remaining 17% fall into two buckets: multi-step statistical reasoning (where it confidently gives wrong p-values) and anything requiring precise citation. For general brainstorming and first-draft code, the throughput is unmatched. I just wouldn't hand it a dataset without double-checking the output."
Review for
ChatGPT

"OpenClaw Master Skills offers a curated collection of over 339 skills, updated weekly and covering AI, productivity, and more, which is useful for streamlining workflows. However, it ties into MyClaw's paid hosting plans that provide a dedicated, always-online instance, adding cost that may be unnecessary if you're comfortable with self-setup. For decision outcomes, it's a decent option where convenience matters, but weigh the pricing against free alternatives."
Review for
openclaw-master-skills

Nexos.ai Transforms Our Workflow
"Nexos.ai's forecasting tools are lightning fast and intuitive, saving me tons of grunt work. But they're annoyingly glitchy on complex data, so double-check everything."
Review for
Nexos.ai

Seedance 2.0 Masters the Prompts
"Seedance 2.0's interface is slick, but its AI outputs feel recycled from version one, leaving me disappointed in the so-called improvements."
Review for
Seedance 2.0

Tome Makes Presentations Less Painful
"I appreciate Tome's quick slide generation, but it lacks real creativity."
Review for
Tome

Grounding LLMs in Your Own Data, Done Right
"Upload 50 sources. Ask questions. Get answers that actually cite your documents. NotebookLM's audio overview feature converts dense research into surprisingly listenable podcasts; I've used it to prep for 3 conference talks. Loses a star because the source limit (50 docs, 500K words) forces you to curate aggressively for larger projects."
Review for
NotebookLM