Transaction Categorisation AI: Smart Spending Analysis

Whistl's AI doesn't just categorise transactions—it understands spending context, identifies risky patterns, and detects merchant manipulation. Using natural language processing and machine learning, Whistl transforms raw transaction data into actionable financial insights.

Why Transaction Categorisation Matters

Raw bank transaction data is cryptic and unhelpful:

  • "POS 3847291 MELBOURNE AU 14/02"
  • "SPRTBET*8372918 SYDNEY"
  • "AMZN Mktp AU*2K83H Sydney"

Without intelligent categorisation, you can't see patterns. Whistl's AI transforms this data into meaningful categories that reveal spending behaviour.

How Whistl's Categorisation AI Works

Whistl uses a multi-layer approach to transaction categorisation:

Layer 1: Merchant Identification

The AI first identifies the merchant from transaction descriptors:

# Merchant extraction examples
"SPRTBET*8372918 SYDNEY" → Sportsbet
"CROWN CASINO MELBOURNE" → Crown Casino
"AMZN Mktp AU*2K83H" → Amazon
"WOOLWORTHS 1234 SYDNEY" → Woolworths
"UBER   TRIP 14/02" → Uber

Layer 2: Category Classification

Once the merchant is identified, it's classified into spending categories:

Primary Categories

Category | Examples | Risk Level
Gambling | Sportsbet, Crown, TAB | Critical
Shopping | Amazon, eBay, retailers | High
Dining | Restaurants, cafes, delivery | Medium
Entertainment | Netflix, Spotify, events | Medium
Transport | Uber, fuel, public transport | Low
Groceries | Woolworths, Coles | Low
Utilities | Electricity, water, internet | Low
Healthcare | Pharmacy, doctors | Low

Layer 3: Risk Assessment

Each transaction is assessed for risk based on multiple factors:

  • Merchant type: Gambling = critical risk
  • Amount: Large amounts = higher risk
  • Time: Late night = elevated risk
  • Frequency: Multiple transactions = velocity risk
  • Category budget: Over budget = increased risk
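
The factors above could be combined into a single score. The following is a minimal sketch that assumes a simple additive scheme. The `Transaction` fields, the base scores, the $200 "large amount" cut-off, the 23:00 to 05:00 "late night" window and the three-per-day velocity trigger are all illustrative assumptions, not Whistl's actual values.

```python
from dataclasses import dataclass

# Hypothetical base scores, loosely following the risk levels in the category table:
# Critical = 4, High = 3, Medium = 2, anything else = 1 (Low).
CATEGORY_BASE = {"gambling": 4, "shopping": 3, "dining": 2, "entertainment": 2}


@dataclass
class Transaction:
    category: str
    amount: float
    hour: int             # 0-23, local time
    same_day_count: int   # transactions in this category today, including this one
    budget_ratio: float   # month-to-date spend divided by the monthly budget


def risk_score(tx: Transaction) -> int:
    """Add one point per triggered factor on top of the category base score."""
    score = CATEGORY_BASE.get(tx.category, 1)  # unlisted categories count as low risk
    if tx.amount >= 200:                        # assumed "large amount" cut-off
        score += 1
    if tx.hour >= 23 or tx.hour < 5:            # assumed "late night" window
        score += 1
    if tx.same_day_count >= 3:                  # assumed velocity trigger
        score += 1
    if tx.budget_ratio > 1.0:                   # over the category budget
        score += 1
    return score
```

Under these assumptions, a late-night $250 gambling transaction that is the third of the day and over budget triggers every factor, while a midday grocery purchase keeps only its base score.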

Natural Language Processing for Categorisation

Whistl uses NLP to understand transaction descriptions:

Text Preprocessing

import re

# Raw transaction description
raw = "SPRTBET*8372918 SYDNEY AU 14/02 19:34"

# Preprocessing steps
cleaned = raw.lower()                            # "sprtbet*8372918 sydney au 14/02 19:34"
cleaned = re.sub(r"[^a-z0-9\s]", " ", cleaned)   # "sprtbet 8372918 sydney au 14 02 19 34"
cleaned = re.sub(r"\d+", " ", cleaned)           # digits removed; extra spaces remain
tokens = cleaned.split()                         # ["sprtbet", "sydney", "au"]

Merchant Matching

from difflib import SequenceMatcher

# Merchant database lookup
merchant_db = {
    "sprtbet": {"name": "Sportsbet", "category": "gambling", "risk": "critical"},
    "crown": {"name": "Crown Casino", "category": "gambling", "risk": "critical"},
    "amazon": {"name": "Amazon", "category": "shopping", "risk": "high"},
    "woolworths": {"name": "Woolworths", "category": "groceries", "risk": "low"},
}

# String similarity in [0, 1]; difflib is a stand-in for whichever matcher is used in practice
def fuzzy_match(a, b):
    return SequenceMatcher(None, a, b).ratio()

# Fuzzy matching for variations: return the first merchant whose key is close to any token
def match_merchant(tokens):
    for token in tokens:
        for merchant_key in merchant_db:
            if fuzzy_match(token, merchant_key) > 0.8:
                return merchant_db[merchant_key]
    return None

Handling Ambiguous Transactions

Some transactions are ambiguous and require context:

  • "THE STAR SYDNEY" - Could be casino or entertainment venue
  • "CLUBS AUSTRALIA" - Could be RSL club or nightclub
  • "TAB" - Could be betting or newsagency

Whistl uses additional context to disambiguate:

  • Transaction amount (betting tends to be specific amounts)
  • Time of day (casinos more likely at night)
  • Location data (near known venues)
  • User correction history
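
One way these signals might be weighed is sketched below. It is an illustration only: the function name, the weights, the 21:00 to 04:00 "night" window and the reading of "specific amounts" as whole-dollar amounts are all assumptions, and user correction history is assumed to override the other signals when it exists.

```python
def gambling_likelihood(amount, hour, near_known_venue, past_corrections):
    """Return a rough 0-1 score that an ambiguous transaction is gambling.

    past_corrections is a list of categories the user previously assigned
    to this merchant (a hypothetical structure).
    """
    # If the user has corrected this merchant before, trust the share of
    # "gambling" labels over the heuristic signals.
    if past_corrections:
        gambling_votes = sum(1 for c in past_corrections if c == "gambling")
        return gambling_votes / len(past_corrections)

    score = 0.0
    if amount == round(amount):      # whole-dollar amount (assumed betting signal)
        score += 0.3
    if hour >= 21 or hour < 4:       # assumed "night" window
        score += 0.3
    if near_known_venue:             # location close to a known venue
        score += 0.4
    return score
```

A "THE STAR SYDNEY" charge for exactly $50 at 22:00 near the venue would score at the top of the range here, while a $12.50 lunchtime charge elsewhere would score zero.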

Merchant Embedding Risk Detection

Whistl detects when merchants are similar to known risky merchants:

Embedding Technology

Merchants are represented as vectors in a high-dimensional space:

# Merchant embeddings (simplified)
merchant_vectors = {
    "sportsbet": [0.92, 0.15, 0.88, ...],    # Close to gambling cluster
    "ladbrokes": [0.91, 0.14, 0.87, ...],    # Close to sportsbet
    "crown": [0.89, 0.12, 0.91, ...],        # Close to gambling cluster
    "woolworths": [0.12, 0.85, 0.09, ...],   # Far from gambling cluster
}

# Similarity calculation
def merchant_risk(new_merchant):
    vector = get_embedding(new_merchant)
    similarity_to_gambling = cosine_similarity(vector, gambling_centroid)
    return similarity_to_gambling
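
To make the similarity step concrete, here is a small runnable sketch. It uses only the first three components of the vectors shown above as toy 3-dimensional embeddings (the real vectors are truncated in this article), and it assumes the gambling centroid is the plain mean of the known gambling merchants. The `centroid` helper and `toy_vectors` name are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # undefined for a zero vector; treated here as no similarity
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Toy embeddings: only the three leading components shown in the article.
toy_vectors = {
    "sportsbet": [0.92, 0.15, 0.88],
    "ladbrokes": [0.91, 0.14, 0.87],
    "crown": [0.89, 0.12, 0.91],
    "woolworths": [0.12, 0.85, 0.09],
}

# Assumed definition: mean of the known gambling merchants.
gambling_centroid = centroid(
    [toy_vectors[name] for name in ("sportsbet", "ladbrokes", "crown")]
)


def merchant_risk(vector):
    """Similarity of a merchant vector to the gambling cluster centre."""
    return cosine_similarity(vector, gambling_centroid)
```

With these toy values, the gambling merchants sit very close to the centroid and Woolworths sits far from it, which is the separation the embedding approach relies on.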

New Merchant Detection

When you transact with a new merchant, Whistl assesses risk:

  • Name similarity: Does the name sound like a gambling site?
  • Category patterns: Does it fit gambling transaction patterns?
  • Amount patterns: Are amounts typical of betting?
  • Time patterns: Does timing match gambling behaviour?

Spending Velocity Detection

Whistl tracks spending velocity within categories:

Velocity Calculation

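from datetime import date, timedelta

# Spending velocity calculation
# (transactions and get_historical_average are assumed to be provided elsewhere)
def calculate_velocity(category, window_days=7):
    cutoff = date.today() - timedelta(days=window_days)
    recent_spending = sum(
        tx.amount for tx in transactions
        if tx.category == category and tx.date >= cutoff
    )

    historical_average = get_historical_average(category, window_days)

    # No baseline to compare against: treat any new spending as maximal velocity
    if historical_average == 0:
        return float("inf") if recent_spending > 0 else 0.0

    return recent_spending / historical_average

# Risk thresholds
def velocity_risk_level(velocity_ratio):
    if velocity_ratio > 2.0:
        return "HIGH"      # more than 2x normal spending
    elif velocity_ratio > 1.5:
        return "ELEVATED"  # more than 1.5x normal spending
    return "NORMAL"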

Velocity-Based Interventions

When velocity exceeds thresholds, Whistl intervenes:

  • 1.5x normal: AI check-in message
  • 2.0x normal: SpendingShield elevation
  • 3.0x normal: Category block consideration
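
The tiers above can be expressed as a small lookup. This is a sketch, not Whistl's implementation: the action identifiers are made-up labels for the three interventions, and the comparison is strict (`>`) on the assumption that a ratio must exceed a threshold to trigger it.

```python
# (threshold, intervention) pairs, checked from the highest threshold down.
# The string identifiers are hypothetical labels for the interventions listed above.
INTERVENTIONS = [
    (3.0, "category_block_consideration"),
    (2.0, "spendingshield_elevation"),
    (1.5, "ai_check_in"),
]


def intervention_for(velocity_ratio):
    """Return the strongest intervention whose threshold is exceeded, or None."""
    for threshold, action in INTERVENTIONS:
        if velocity_ratio > threshold:
            return action
    return None
```

Checking the highest threshold first means a ratio of 3.2 maps to the block consideration rather than stopping at the check-in message.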

Category Budget Tracking

Whistl tracks spending against category budgets:

Budget Ratio Calculation

import calendar
from datetime import date

# Budget ratio calculation
# (get_monthly_budget is assumed to return a positive amount)
def budget_ratio(category):
    budget = get_monthly_budget(category)
    spent = get_month_to_date_spending(category)

    today = date.today()
    day_of_month = today.day
    days_in_month = calendar.monthrange(today.year, today.month)[1]

    # Projected full-month spending at the current daily rate
    projected = spent / day_of_month * days_in_month

    ratio = spent / budget
    projected_ratio = projected / budget

    return {
        "current_ratio": ratio,
        "projected_ratio": projected_ratio,
        "remaining": budget - spent
    }

Budget-Based Risk

Budget Ratio | Risk Level | Action
<50% | Low | Normal monitoring
50-80% | Moderate | Budget reminder
80-100% | High | Spending warning
>100% | Critical | Category block consideration
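
The table above maps directly onto a threshold function. In this sketch the ratio is a fraction (0.6 means 60% of budget), and the handling of exact boundaries (50% counts as Moderate, 80% as High, 100% as High) is an assumption, since the table does not say which band a boundary value belongs to.

```python
def budget_risk(ratio):
    """Map a spent/budget ratio to (risk level, action) using the table's bands."""
    if ratio > 1.0:
        return ("Critical", "Category block consideration")
    if ratio >= 0.8:
        return ("High", "Spending warning")
    if ratio >= 0.5:
        return ("Moderate", "Budget reminder")
    return ("Low", "Normal monitoring")
```

For example, $300 spent against a $500 budget gives a ratio of 0.6, which falls in the Moderate band and would prompt a budget reminder.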

User Corrections and Learning

Whistl learns from user corrections to improve categorisation:

Correction Interface

  • Users can recategorise any transaction
  • Mark transactions as "not gambling" (false positives)
  • Mark transactions as "risky" (false negatives)
  • Add notes to transactions

Learning from Corrections

# Learning from user corrections
def learn_from_correction(transaction, new_category):
    # Update merchant category mapping (creating an entry if the merchant is new)
    merchant_db.setdefault(transaction.merchant, {})["category"] = new_category

    # Update model weights
    update_nlp_weights(transaction.description, new_category)

    # Suggest the same category for similar merchants
    for similar_merchant in find_similar_merchants(transaction.merchant):
        suggest_category(similar_merchant, new_category)
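
A self-contained version of the correction flow might look like the sketch below. It leaves out the model weight update, uses `difflib` name similarity as a stand-in for whatever `find_similar_merchants` does in practice, and queues suggestions in a list for the user to review. The merchant names `luckyspin` and `luckyspins`, the 0.8 threshold and the `pending_suggestions` list are all hypothetical.

```python
from difflib import SequenceMatcher

# Toy merchant store; "luckyspin" and "luckyspins" are made-up merchants.
merchant_db = {
    "luckyspin": {"category": "entertainment"},
    "luckyspins": {"category": "entertainment"},
    "woolworths": {"category": "groceries"},
}

# (merchant, suggested_category) pairs awaiting user review.
pending_suggestions = []


def find_similar_merchants(merchant, threshold=0.8):
    """Other merchants whose names are close to the given one."""
    return [
        other for other in merchant_db
        if other != merchant
        and SequenceMatcher(None, merchant, other).ratio() >= threshold
    ]


def learn_from_correction(merchant, new_category):
    """Apply a user correction and queue suggestions for similar merchants."""
    merchant_db.setdefault(merchant, {})["category"] = new_category
    for similar in find_similar_merchants(merchant):
        # Only suggest a change where the similar merchant currently differs.
        if merchant_db[similar].get("category") != new_category:
            pending_suggestions.append((similar, new_category))
```

Queuing suggestions instead of rewriting similar merchants outright keeps a single correction from silently recategorising transactions the user has not looked at.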

Privacy: On-Device Categorisation

All transaction categorisation happens on your device:

  • Transaction data: Never leaves your phone
  • Merchant database: Stored locally, updated securely
  • NLP models: Run on-device via Core ML
  • Category history: Stored encrypted locally

Effectiveness Data

From categorisation accuracy testing:

Metric | Result
Merchant Identification Accuracy | 94%
Category Classification Accuracy | 91%
Risk Assessment Accuracy | 87%
User Correction Rate | 6% of transactions
Learning Improvement (30 days) | +12% accuracy

User Testimonials

"The categorisation is scary accurate. It knew a transaction was gambling-related even though the descriptor was cryptic." — Marcus, 28

"I love seeing my spending by category. Finally understand where my money goes." — Emma, 26

"When I went over my shopping budget, Whistl caught it immediately. Not weeks later like my old budgeting app." — Sarah, 34

Conclusion

Whistl's transaction categorisation AI transforms cryptic bank data into meaningful insights. By understanding not just what you spent, but what it means for your financial health, Whistl enables proactive protection.

This isn't just categorisation—it's intelligent spending analysis that protects you from yourself.

Get Smart Spending Analysis

Whistl's AI categorises transactions and detects risky patterns. Download free and understand your spending.
