How AI allergen scanners work
You point your phone at a packet of biscuits. A few seconds later, the app tells you it contains soy lecithin and you should avoid it. How does that actually work — and where are the limits?
This is a non-technical explainer of what’s happening inside an AI allergen scanner, why it’s better than reading the label yourself, and where it falls short.
Step 1: Reading the label (OCR)
The scanner starts by reading what’s on the packet using OCR (Optical Character Recognition).
OCR has been around since the 1960s for documents, but modern OCR is far more capable than those early systems. It can:
- Recognise text on curved or angled packaging
- Handle different fonts, sizes, and colours
- Cope with reflections, glare, and uneven lighting
- Read text in multiple languages (essential for travel)
What it produces is a string of raw text — the ingredient list, basically, transcribed into machine-readable form.
This step can fail. Tiny text in a foreign language behind a plastic wrapper covered in glare is hard. Modern phones and good OCR libraries handle most real-world labels in a single shot, but edge cases (handwritten labels, badly damaged packets, very stylised fonts) can return partial results.
Step 2: Parsing the ingredients
Reading the text is only half the job. The scanner then has to understand which words are ingredients and which are everything else (manufacturer details, weight, allergen advice boxes, marketing claims).
This is where structured parsing comes in. Most ingredient lists follow predictable patterns:
- A label like “Ingredients:” or “Ingredientes:” or “成分:”
- A comma-separated or bullet-separated list
- Allergens often emphasised in bold or CAPS (under UK/EU law)
- Percentages in parentheses
- “May contain” warnings at the end
A scanner that understands these patterns can pull out the ingredient list reliably, even from a noisy OCR result.
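To make those patterns concrete, here is a minimal sketch of a parser in Python. It is an illustration only, not AllergIQ's actual code: the names `parse_label` and `split_top_level`, the three header words, and the rule that the list ends at the first full stop are all assumptions made for the example.

```python
import re

# Assumed header words; a real scanner would cover many more languages.
HEADER = re.compile(r"(ingredients|ingredientes|成分)\s*[:：]\s*", re.IGNORECASE)
MAY_CONTAIN = re.compile(r"\bmay contain\b[:\s]*(.+?)(?:\.|$)", re.IGNORECASE | re.DOTALL)
PERCENT = re.compile(r"\s*\(\s*\d+(?:[.,]\d+)?\s*%\s*\)")


def split_top_level(text):
    """Split on commas that are not inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(0, depth - 1)
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return [p.strip() for p in parts if p.strip()]


def parse_label(raw):
    """Return the ingredient list and any 'may contain' items from OCR text."""
    text = " ".join(raw.split())  # collapse OCR line breaks into single spaces
    header = HEADER.search(text)
    if header is None:
        return {"ingredients": [], "may_contain": []}
    body = text[header.end():]

    may_contain = []
    warning = MAY_CONTAIN.search(body)
    if warning:
        items = re.split(r",|\band\b", warning.group(1))
        may_contain = [item.strip().lower() for item in items if item.strip()]
        body = body[:warning.start()]

    # Assumption: the list ends at the first full stop followed by a space
    # or the end of the text, so decimals such as "2.5%" do not end it.
    body = re.split(r"\.(?:\s|$)", body)[0]
    ingredients = [PERCENT.sub("", item).strip().lower()
                   for item in split_top_level(body)]
    return {"ingredients": ingredients, "may_contain": may_contain}


label = """Choco Biscuits 200g
Ingredients: Wheat flour (52%), sugar, palm oil,
emulsifier (SOY lecithin), salt. May contain nuts and milk.
Best before: see side of pack."""
print(parse_label(label))
```

The sketch lowercases everything for simplicity. A production parser would more likely keep the original casing, since CAPS emphasis is itself a hint that a word is an allergen.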
Step 3: Matching against an allergen database
Once the scanner has a clean ingredient list, it compares each ingredient against an allergen database.
This is the part that does the heavy lifting. A good allergen database doesn’t just match plain words — it understands:
- Synonyms. “Milk” and “lactose” and “casein” and “whey” are all milk-related. “Lecithin” often comes from soy, though it can also be made from sunflower or egg. See our post on hidden allergens in food labels for a longer list.
- Derived ingredients. Lupin flour is lupin. Marzipan is made from almonds. Worcestershire sauce traditionally contains fish (anchovies). The database knows these mappings.
- E-numbers and additives. E322 is lecithin (often from soy). E120 is carmine (insect-derived). E471 is mono- and diglycerides of fatty acids (which can be derived from animal or plant sources). We maintain a full E-number directory with allergen risk info.
- Translation. When scanning a foreign-language label, the database needs to know that “leche” means milk and “trigo” means wheat.
The database is the difference between “this scanner is useful” and “this scanner is gimmicky”. A weak database matches plain words and misses derivatives. A good one catches the technical names too.
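As a rough illustration of the lookup, the sketch below maps a handful of terms from this article to allergen groups and matches them as whole words. The table `ALLERGEN_TERMS` and the function `find_allergens` are hypothetical names for this example, and a real database would be far larger. Lecithin and E322 are mapped to soy here as a cautious default, even though they can come from other sources.

```python
import re

# Tiny illustrative table built from the examples in this article.
# Assumption: each term maps to exactly one allergen group.
ALLERGEN_TERMS = {
    "milk": ["milk", "lactose", "casein", "whey", "leche"],
    "soy": ["soy", "soya", "lecithin", "e322"],  # cautious default for lecithin
    "wheat": ["wheat", "trigo"],
    "lupin": ["lupin"],
    "almond": ["almond", "almonds", "marzipan"],
    "fish": ["anchovy", "anchovies", "worcestershire sauce"],
}

# Reverse index: term -> allergen group.
TERM_TO_ALLERGEN = {
    term: allergen
    for allergen, terms in ALLERGEN_TERMS.items()
    for term in terms
}

# One whole-word pattern over every term, longest first so multi-word
# terms are tried before shorter ones.
PATTERN = re.compile(
    r"(?<!\w)("
    + "|".join(re.escape(t) for t in sorted(TERM_TO_ALLERGEN, key=len, reverse=True))
    + r")(?!\w)",
    re.IGNORECASE,
)


def find_allergens(ingredients):
    """Map each allergen group to the ingredients that triggered it."""
    hits = {}
    for ingredient in ingredients:
        for match in PATTERN.finditer(ingredient):
            allergen = TERM_TO_ALLERGEN[match.group(1).lower()]
            hits.setdefault(allergen, set()).add(ingredient)
    return {allergen: sorted(found) for allergen, found in hits.items()}


print(find_allergens(["wheat flour", "emulsifier (E322)", "whey powder", "sugar"]))
```

Whole-word matching avoids false alarms on words that merely contain a term, but it also means a compound such as “buttermilk” is only caught if it is listed explicitly. That is one reason the size and quality of the table matters so much.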
Step 4: Filtering against your profile
The final step is personal. The scanner doesn’t just flag allergens in general — it flags allergens that matter to you based on your AllergIQ allergy profile.
If you’re allergic to peanuts, the scanner ignores the dairy and wheat and just calls out the peanut risk. If your allergy is severe enough that you react to cross-contamination, the scanner also surfaces “may contain peanuts” warnings.
That personalisation is what turns a generic ingredient parser into a useful safety tool.
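The filtering step can be sketched in a few lines. This is a simplified model under stated assumptions, not AllergIQ's implementation: `Profile`, `flag_traces`, and `filter_for_profile` are hypothetical names, and allergens are represented as plain strings.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    allergens: set            # allergen groups this person avoids
    flag_traces: bool = False  # also warn on "may contain" statements


def filter_for_profile(contains, may_contain, profile):
    """Keep only the warnings that matter for this profile."""
    warnings = [f"contains {a}" for a in sorted(contains & profile.allergens)]
    if profile.flag_traces:
        warnings += [f"may contain {a}"
                     for a in sorted(may_contain & profile.allergens)]
    return warnings


found = {"milk", "wheat"}   # allergens matched in the ingredient list
traces = {"peanut"}         # allergens named in a "may contain" warning

print(filter_for_profile(found, traces, Profile({"peanut"})))
print(filter_for_profile(found, traces, Profile({"peanut"}, flag_traces=True)))
```

The first call prints an empty list, because the product has no peanut ingredient and this profile does not ask for trace warnings. The second prints `['may contain peanut']`.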
Why it’s better than reading the label yourself
Three reasons:
- It doesn’t get tired. After scanning 30 products in a single grocery shop, your eyes start skimming. The scanner doesn’t.
- It knows the technical names. Even an experienced allergy parent doesn’t always know that casein is milk-derived or carmine is an insect product. The scanner does.
- It works across languages. Reading a Spanish label on holiday is exhausting. Scanning it is instant.
What it doesn’t replace: careful label reading for severe allergies. For anaphylaxis-grade reactions, always double-check the label yourself and ask the manufacturer if anything’s unclear. The scanner is a fast first line of defence, not a substitute for verification.
What an AI scanner can’t do (yet)
A few limits worth being clear about:
- Handwritten labels are unreliable. Restaurant menus written in chalk or handwritten allergen notices are hard.
- Highly stylised fonts (heavily decorated, very thin, foreign cursive scripts) may return partial results.
- Damaged or wet labels that the OCR can’t fully resolve will give incomplete output.
- “Natural flavourings” are a known weakness across the industry — they can legally hide allergens in some jurisdictions, and no scanner can tell you what’s inside a black-box ingredient.
- Restaurant food isn’t packaged with an ingredient label. That’s where a printed chef card does the job a scanner can’t.
The best safety setup for severe allergies is: scanner for packaged food, printed allergy card for restaurants, careful reading for anything unfamiliar, and an adrenaline auto-injector such as an EpiPen always within reach.
How AllergIQ implements this
AllergIQ runs OCR via on-device APIs where possible (faster, more private) and sends complex labels to a server-side analyser when needed. The allergen database covers the 14 major allergens recognised in UK and EU law, plus extensive coverage of additives, hidden ingredients, and allergen-derived E-numbers. Your allergy profile lives in your account, so every scan respects it without you re-entering anything.
If you’d rather see it in action than read about it, download AllergIQ on the App Store or Google Play. The free monthly scan allowance covers occasional use; credit packs are available for heavy users like weekly grocery shoppers.
The TL;DR
An AI allergen scanner reads the label, understands the words, looks up each ingredient in an allergen database, and tells you whether anything matches your personal allergy profile. The accuracy depends mostly on the database — good ones catch hidden derivatives, bad ones miss them. AllergIQ’s database is built specifically for allergen detection, not generic translation, which is what makes the scanner reliable enough to trust.
Still, for severe allergies, verify. The scanner is a first pass, not the final word.