Guide by Mahmoud Zalt
A founder's 30-minute test for any AI platform: what to try first, which red flags to spot, and how to confirm the homepage promises before paying.
Most AI platform demos are designed to delay your judgment, not earn it. A sales rep walks a curated path on prepared data, hides the steps that fail, and leaves before you touch the dashboard. The pitch sounds magical because every example was rehearsed. Sign in cold with your own messy inputs and the gap between the demo and the product becomes obvious, often inside five minutes. The shortcut: skip the demo, take the trial yourself, and judge 30 minutes of cold use against a written checklist.
The 30-minute evaluation has one rule: do real work, not tutorial work. Pick a task you would do anyway this week, hand it to the platform exactly as you would describe it to a teammate, and time how fast you reach a usable output. In the same session, push on memory, channels, and limits, because those three almost always decide whether the tool sticks past week one. The order below is the one I run on every new AI platform that crosses my desk.
Some signals are bad enough that no later feature can compensate. They appear inside the first 30 minutes if you go looking, and they almost always predict where the relationship ends three months from now. Each item below is a deal breaker on its own, not one point in a tally. If one shows up clearly during a cold evaluation, close the tab and move on, even when the homepage was beautiful and the review thread was glowing. Time spent arguing yourself out of a red flag is time stolen from finding the platform that did not raise one in the first place.
Any platform that hides the product behind credit card capture is optimising for refunds, not for honest evaluation.
If you cannot reach the product without a human gatekeeper, the founding team does not trust its own onboarding flow.
A modern AI platform should produce something usable inside the first session. Long onboarding wizards usually hide a thin product.
If it invents facts about your business or pretends to have completed work it never did, no integration depth will save the trust.
Stacked credit meters, unclear per-seat math, and quotes-only enterprise tiers all signal a product priced to confuse, not to fit.
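The "deal breaker on its own, not a tally" rule can be written down as a tiny check. This is only a sketch: the flag names are hypothetical labels for the five red flags above, and the verdict strings are placeholders. The point it illustrates is that there is no score to add up; one observed flag ends the evaluation.

```python
# Hypothetical labels for the five red flags described above.
RED_FLAGS = (
    "card_required_to_test",
    "human_gatekeeper_before_product",
    "no_usable_output_in_first_session",
    "invented_facts_or_faked_work",
    "confusing_pricing",
)


def verdict(observed: set) -> str:
    """Return the outcome of a cold evaluation given the flags you saw."""
    unknown = observed - set(RED_FLAGS)
    if unknown:
        raise ValueError(f"unknown red flags: {sorted(unknown)}")
    # Any single flag is a deal breaker; nothing is weighed or summed.
    return "close the tab" if observed else "keep evaluating"
```

Calling `verdict({"confusing_pricing"})` gives the same answer as passing all five flags, which is the design choice: a strong feature elsewhere never offsets a flag.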
Once you have walked the checklist and watched for the red flags, the next question is whether the platform can actually do the daily work or just talk about it. The cleanest way to find out is to give it a single recurring job you already own this week. Not a benchmark prompt, just one real task with all the messy context. If the output is shippable, you have your answer. If you keep editing it to the point you could have written it faster yourself, that is also your answer.
Putting a real assistant in front of a real task is how the abstract becomes concrete inside the same 30-minute window. You stop arguing about model names and feature lists and start asking the only question that matters: did this save me 20 minutes today or cost me 20 more. Once you have an answer, every homepage claim has evidence next to it. The next step is to verify the bigger promises one by one, on your own data, before any credit card moves.
Every AI platform homepage stacks claims that sound concrete and almost never are. Real-time. Autonomous. Works with your stack. Replaces five tools. The trick is to translate each claim into a test you can run inside the trial, with your own inputs, in under five minutes per promise. If the homepage says native Gmail, send a real email. If it says memory, ask it to remember your business and check tomorrow. If it says workforce, hire a second employee and see whether it inherits context from the first.
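One way to keep yourself honest is to write the translation down before the trial starts. The sketch below pairs each homepage claim with the test from the paragraph above and tracks which claims still lack evidence. The `ClaimTest` class and `unverified` helper are hypothetical names made up for this illustration, not part of any platform's API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClaimTest:
    claim: str                     # what the homepage says
    test: str                      # what you will actually do in the trial
    passed: Optional[bool] = None  # None until you have run the test


def unverified(tests: List[ClaimTest]) -> List[str]:
    """Claims that failed or were never tested; both count as unproven."""
    return [t.claim for t in tests if t.passed is not True]


checklist = [
    ClaimTest("native Gmail", "send a real email"),
    ClaimTest("memory", "ask it to remember your business, check tomorrow"),
    ClaimTest("workforce", "hire a second employee, check inherited context"),
]
```

Treating an untested claim the same as a failed one is deliberate: a promise you did not check has no evidence next to it yet.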
The smallest test that proves something is a single job from your own week, given to the platform with no prep and no cleanup. Draft a follow-up to a real lead. Summarize a real meeting recording. Write a real social post in your voice using your last three posts as context. Hand it the work you would otherwise do yourself and see whether the output is shippable, almost shippable, or a polite hallucination. That task, repeated against three or four candidates in a 30-minute window each, sorts the category faster than any feature matrix online.
Solo trial first, every time. A sales demo runs on prepared data and a curated path, which tells you almost nothing about how the product behaves on your real inputs. Take the cold trial and only book the call if you still have specific buying questions left after 30 minutes.
The 30-minute test replaces the go or no-go decision, not the pilot. Thirty minutes is enough to spot the deal breakers. If the platform clears that bar, you still want a one- or two-week pilot on a recurring task before annual billing.
Treat the lack of a cold trial as a red flag and look for an alternative that lets you in cold. Teams confident in their onboarding put the product first. If no cold trial exists in the category, ask for sandbox access on your own data, refuse the walkthrough, and time the same 30-minute checklist.
Use a fresh email, decline the onboarding call, mute the in-app chat, and pretend the company does not exist for the first 30 minutes. The product either earns the next step on its own merit or it does not. Reading sales nudges before the output corrupts your judgment of the output.
Hand the platform one real task from your week, in plain language, with the same context you would give a teammate. Time how fast it returns something shippable. If that loop takes more than 10 minutes or the result needs heavy editing, you have the answer.
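The single-task rule above reduces to two observations: how long the loop took and whether the result needed heavy editing. A minimal sketch, assuming you judge "heavy editing" yourself and reading "you have the answer" as a no-go; the function name and return strings are illustrative only.

```python
def go_no_go(minutes_to_output: float,
             needs_heavy_editing: bool,
             limit_minutes: float = 10.0) -> str:
    """Apply the one-real-task rule: slow or heavily edited output fails."""
    if minutes_to_output > limit_minutes or needs_heavy_editing:
        return "no-go"
    # Clearing the bar earns a longer pilot, not an immediate purchase.
    return "run a pilot"
```

The 10-minute default comes from the rule as stated; adjust `limit_minutes` if your task is unusually large.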
Once you have a 30-minute method you trust, the next problem is volume: ten new AI platforms launch every week, and not all deserve half an hour. The companion read below shows how to filter the category before you start a trial, so the 30-minute test only runs against candidates that already pass a coarser sniff test.
The 30-minute test is not a trick or a hack. It is the smallest amount of cold contact with a product that produces an honest answer, which is exactly what most buyers never get because they let the sales motion replace their own judgment. Build the habit and the category gets quieter quickly: most platforms eliminate themselves before lunch, and the ones that survive your half-hour earn the longer pilot they deserve. The pattern that worked for me is the same one I keep recommending: refuse the demo, sign up cold, run one real task, watch for the red flags, verify the homepage promises yourself. Whatever survives is worth the next week. Whatever does not was never going to make payroll.