Aldi UK’s price scrapes contain signals most retailers pay consultants millions to find. This use case shows how weekly product, pricing and availability data can cut overstock write-downs by flagging slow-movers before they become losses.
Aldi UK operates on a high-velocity, lean SKU model: roughly 1,900 core lines compared to Tesco’s 40,000+. That sounds like it should make inventory easy. But there’s a catch: Aldi’s legendary Specialbuys (WIGIG, for “When It’s Gone, It’s Gone”) aisle introduces 30–50 new non-food items every Thursday and Sunday. These have no sales history, no demand curve, and no reorder trigger.
When a Specialbuy misses its sales target (because the weather turned, a competitor ran a better deal, or the item just didn’t resonate), it doesn’t get repriced quietly. It sits at full price until Aldi decides to mark it down, by which point the opportunity cost of shelf space and working capital has already compounded.
The core tension: Aldi’s EDLP model means it rarely runs mid-cycle promotions. Unlike Tesco and Sainsbury’s, which can lean on Clubcard and Nectar promotions to clear slow stock, Aldi has fewer levers to pull. By the time a markdown decision is made, write-downs are often 40–60% of original cost.
Three gaps drive the losses, each with an estimated annual cost:

- **−£1.1M/yr:** Aldi doesn’t publish real-time stock levels. Buyers work from weekly sales reports with a 5–7 day lag, too slow for Specialbuys with a two-week shelf life.
- **−£1.8M/yr:** Seasonal and novelty items routinely miss forecast by ±35%. With no external price benchmarking, buyers can’t tell whether it’s a price issue or a demand issue.
- **−£1.3M/yr:** When Lidl runs a parallel Specialbuy at 15% less, Aldi’s version stalls, but nobody realises it until the inventory review. That’s weeks of missed repricing window.

Aldi UK’s website (aldi.co.uk) exposes product listings, category pages, and the Specialbuys calendar, all accessible without authentication. A well-structured scraper can extract a rich dataset at two cadences: weekly core grocery prices and bi-weekly Specialbuy launches.
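To make that concrete, here is a minimal collection-loop sketch in Python. The URL paths and CSS selectors are illustrative assumptions, not Aldi’s actual markup; a production version would need adapting to the live page structure and careful rate-limiting (see the legal notes at the end of this piece).

```python
# Minimal scrape sketch. URL paths and selectors are hypothetical --
# inspect aldi.co.uk's live markup before relying on any of them.
import time
import requests
from bs4 import BeautifulSoup

BASE_URL = "https://www.aldi.co.uk"
CATEGORY_PATHS = ["/groceries", "/specialbuys"]  # illustrative paths

def scrape_category(path: str) -> list[dict]:
    """Fetch one category page and parse product cards into records."""
    resp = requests.get(BASE_URL + path, timeout=30,
                        headers={"User-Agent": "overstock-monitor/0.1"})
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    records = []
    for card in soup.select(".product-tile"):  # hypothetical selector
        records.append({
            "sku": card.get("data-sku"),
            "name": card.select_one(".product-tile__name").get_text(strip=True),
            "price": card.select_one(".product-tile__price").get_text(strip=True),
            "availability": card.get("data-availability", "unknown"),
        })
    return records

if __name__ == "__main__":
    for path in CATEGORY_PATHS:
        print(path, len(scrape_category(path)), "products")
        time.sleep(5)  # be polite: pause between category fetches
```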
Here’s what a single Aldi UK scraped product record looks like in the database:
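A representative record, with hypothetical field names and illustrative values drawn from the signals discussed in this piece (not Aldi’s actual schema), might look like this:

```python
# Illustrative scraped record -- field names and values are assumptions.
record = {
    "sku": "709085",                    # product ID from the listing URL
    "name": "Gardenline Fire Pit",      # product title as scraped
    "category": "specialbuys/garden",   # category page the item was found on
    "price_gbp": 39.99,                 # listed price at scrape time
    "availability": "in_stock",         # in_stock / low_stock / sold_out
    "is_specialbuy": True,              # core line vs. WIGIG Specialbuy
    "launch_date": "2024-05-16",        # first snapshot the SKU appeared in
    "scraped_at": "2024-06-06T07:00Z",  # snapshot timestamp
}
```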
Not all scraped fields are equal. Some exist purely for identification, others are the actual signal that tells you whether something is turning into dead stock. Here’s the full annotated schema.
Below is a sample of 22 SKUs from a live scrape, a mix of core grocery lines and Specialbuys. The Risk Score column is the composite overstock signal; products scoring above 70 are flagged for immediate review.
| Product | Category | Price (£) | Weeks Live | Comp Gap (%) | Stock Level | Risk Score | Action |
|---|---|---|---|---|---|---|---|
// Risk Score = (weeks_live × 8) + (comp_gap × 1.2) + (stock_pct × 0.6) · capped 0–100 · scraped data illustrative
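Translated directly into code, with the weights and cap taken from the caption above (the assumption here is that `comp_gap` and `stock_pct` are percentages, with `stock_pct` being the estimated share of initial stock still on shelf):

```python
def risk_score(weeks_live: int, comp_gap_pct: float, stock_pct: float) -> float:
    """Composite overstock signal from the caption's formula.

    weeks_live:   weeks the SKU has been listed without selling through
    comp_gap_pct: how much cheaper the nearest competitor item is, in %
    stock_pct:    estimated share of initial stock still on shelf, in %
    """
    raw = (weeks_live * 8) + (comp_gap_pct * 1.2) + (stock_pct * 0.6)
    return max(0.0, min(100.0, raw))  # capped to 0-100

# e.g. 5 weeks live, competitor 15% cheaper, 80% of stock remaining:
# risk_score(5, 15, 80) -> 40 + 18 + 48 = 106, capped to 100 -> review
```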
Velocity is units sold per week, inferred from availability changes across scrape snapshots. A product that was “in_stock” week 1 and shows “low_stock” by week 3 has high velocity. One that stays “in_stock” for 5+ weeks without a price change is your overstock candidate.
How velocity is inferred from scraped data: Aldi doesn’t publish unit sales, but availability status (in_stock → low_stock → sold_out) gives a directional signal. Combined with store count estimates and category seasonality coefficients, you can model relative velocity within ±20%, good enough for a markdown trigger.
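One way to turn those weekly snapshots into a velocity estimate, as a rough sketch. The status-to-depletion weights and the seasonality coefficient are illustrative assumptions, not calibrated values:

```python
# Map availability statuses to a rough "fraction of stock gone" estimate.
# These weights are illustrative assumptions, not calibrated values.
STATUS_DEPLETION = {"in_stock": 0.0, "low_stock": 0.7, "sold_out": 1.0}

def relative_velocity(snapshots: list[str], seasonality: float = 1.0) -> float:
    """Estimate depletion per week from ordered weekly availability snapshots.

    snapshots:   e.g. ["in_stock", "in_stock", "low_stock"]
    seasonality: category coefficient (>1 for in-season items), assumed given
    """
    if len(snapshots) < 2:
        return 0.0
    depleted = STATUS_DEPLETION[snapshots[-1]] - STATUS_DEPLETION[snapshots[0]]
    weeks = len(snapshots) - 1
    return (depleted / weeks) / seasonality

# A SKU stuck at "in_stock" for 5 weeks scores 0.0 -> overstock candidate.
# One going in_stock -> low_stock over 2 weeks scores 0.35 -> healthy sell-through.
```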
Overstock losses aren’t just the unsold stock. They include the opportunity cost of tied-up working capital, the cost of markdown decisions made too late, and the logistics cost of disposal. The breakdown below shows estimated annual losses per category, and how much a data-driven approach could recover.
The recovery mechanism is straightforward: when the scraper flags a Specialbuy as “persistent stock” in week 3 with a competitor gap above 10%, an automated alert triggers a buyer review. Moving the markdown decision from week 6 to week 3 doubles the sell-through window at a reduced price, turning a 55% write-down into a 25% markdown.
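As code, the trigger is a one-line rule; the field names reuse the hypothetical schema from earlier:

```python
from dataclasses import dataclass

@dataclass
class SkuSnapshot:
    sku: str
    weeks_live: int
    availability: str    # latest scraped status
    comp_gap_pct: float  # competitor price gap in %

def needs_buyer_review(s: SkuSnapshot) -> bool:
    """Alert rule from the text: persistent stock by week 3 with a
    competitor gap above 10% triggers a buyer review."""
    persistent = s.weeks_live >= 3 and s.availability == "in_stock"
    return persistent and s.comp_gap_pct > 10.0

# needs_buyer_review(SkuSnapshot("709085", 3, "in_stock", 15.0)) -> True
```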
The difference isn’t in having better buyers: it’s in giving buyers the right data at the right time. Here’s what the Specialbuy management workflow looks like with and without scraped price intelligence.
The numbers above are calibrated to Aldi UK’s scale. Plug in your own figures to estimate what a scraping-based overstock signal could save your operation.
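The arithmetic behind that estimate is simple enough to sketch. The default rates below are the 55% write-down and 25% markdown figures from the recovery example above; the stock-at-risk value is a placeholder for your own number:

```python
def annual_recovery(stock_at_risk_gbp: float,
                    late_writedown_rate: float = 0.55,
                    early_markdown_rate: float = 0.25) -> float:
    """Estimated annual saving from moving markdowns earlier.

    stock_at_risk_gbp:   yearly cost value of stock flagged as slow-moving
    late_writedown_rate: loss rate when the decision comes at week 6
    early_markdown_rate: loss rate when the alert fires at week 3
    """
    return stock_at_risk_gbp * (late_writedown_rate - early_markdown_rate)

# e.g. £4M of at-risk stock: annual_recovery(4_000_000) -> £1.2M recovered
```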
Scraped data is genuinely powerful, but it isn’t magic. Anyone building a production overstock model on it needs to be honest about its gaps.
Availability ≠ units sold. Scraped availability flags (in_stock / low_stock) are a proxy signal, not actual sell-through data. A product can stay “in_stock” because Aldi replenished it from a regional depot; the signal would look like slow movement even if the item is selling fine. Always cross-reference with internal EPOS data when the signal fires.
Scraping is a signal layer, not a source of truth. The scraped dataset reflects Aldi’s online product listing, which may not perfectly match in-store availability. Regional store differences, online-only exclusives and distribution delays all create noise. Treat risk scores as prioritisation cues, not hard decisions.
Legal & ToS considerations: Scraping aldi.co.uk for internal business intelligence sits in a grey area. Publicly accessible pricing data is generally permissible in UK law, but bulk automated requests may violate Aldi’s ToS. Always consult legal counsel, rate-limit requests, and explore whether Aldi’s data partnerships or price comparison feeds offer the same data through a compliant channel.