Guide: deploy French Spam Type Detection locally

On Kinoux, the French Spam Type Detection model doesn’t just say “spam / not spam”. It classifies each message as ham (legitimate) or into one of five spam families: promotion, phishing, scam, seo_boost, clickbait.

The benefit is straightforward: you don’t handle a phishing attempt the same way as a promo pitch. By recognizing the type, you can apply different actions that are safer and easier to control.

This guide follows a gradual approach: start cautiously, validate on a small batch of real data, then set thresholds. The goal is not to “automate everything”, but to automate only the cases where the model is very confident (high score).

What the model returns (in plain terms)

For each text (contact form, comments, shared inbox…), the model outputs a score for every category: ham, promotion, phishing, scam, seo_boost, clickbait. In practice, you typically look at the category with the highest score, then decide whether the model is confident enough to trigger an action.

The basic rule is simple: the model may suggest a type, but you allow an automatic action only if the score is very high. Everything else stays in a safety zone (review or no action).

Why “by type” is better than “spam / ham”

With a binary model, you are stuck with a blunt decision: block or allow. In real life, not all spam is equally risky.

When you have the type, you can adapt your handling. phishing or scam usually deserves immediate quarantine (direct risk). A promotion can go to a “marketing” folder or be held for review. seo_boost (often link spam) can be hidden or filtered. clickbait is often ambiguous, so it’s generally healthier to keep it “to review” at first.

Result: you automate more intelligently, with fewer risks and fewer side effects.

The safest method: 3 zones

Instead of trying to find “the perfect threshold” right away, use a three-zone setup. It’s the simplest way to benefit from the model without making users uncomfortable, because you always keep an exit door.

The logic fits in one sentence: when the model is very confident, you act; when it hesitates, you ask a human to review; when the model isn’t confident, you change nothing.
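That sentence translates almost directly into code. A minimal sketch, where the two threshold values are hypothetical placeholders (you will set real ones from your own data later in this guide):

```python
# Hypothetical starting thresholds; placeholders, not recommendations.
AUTO_THRESHOLD = 0.95    # at or above: the model is very confident -> act
REVIEW_THRESHOLD = 0.70  # between the two: ask a human to review

def zone(top_label: str, top_score: float) -> str:
    """Map a prediction to one of the three zones."""
    if top_label == "ham":
        return "normal"      # the model predicts a legitimate message
    if top_score >= AUTO_THRESHOLD:
        return "auto"        # automatic (but reversible) action
    if top_score >= REVIEW_THRESHOLD:
        return "review"      # human review queue
    return "normal"          # not confident enough: change nothing
```

The function only decides the zone; what the “auto” action actually is (quarantine, hold, hide) depends on the spam type, which is covered below.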

Example workflow

Imagine a contact form on a WordPress website. If the model predicts phishing with a very high score, you quarantine the message. If it hesitates between promotion and clickbait with a medium score, you put it in a “to review” queue. And if the score is low (or if the model predicts ham), you keep the normal flow.

In practice, this means organizing processing into three steps: an automatic but reversible action (quarantine / hold), a review queue (dashboard, dedicated inbox, moderation), then the normal flow.

Why this split reduces risk

This approach prevents “false blocks” because ambiguous cases are never automated. It also lets you start with very conservative thresholds and relax them gradually without breaking your product. Finally, it opens the door to a simple optimization: you can use different thresholds depending on the channel (comments ≠ inbox ≠ contact form).

Before thresholds: define what “critical” means

Before tuning thresholds, you need to clarify one thing: which mistake costs you the most.

With this model, two situations commonly occur. The first is a false block: a legitimate message (ham) is classified as a spam family. The second is a missed spam: a truly dangerous message (often phishing or scam) still gets through.

Depending on your use case, the priority isn’t the same. In a support inbox, blocking ham can be more damaging than letting a bit of spam through. On the other hand, in a public space (comments, user messages), letting phishing or scam through may be the bigger risk.

The simplest approach is to write one sentence that becomes the rule of the game for the rest, for example: “I’d rather let some spam through than block ham.” Or the opposite: “I’d rather quarantine highly suspicious cases, even if it creates extra review work.”

After that, thresholds stop being an abstract debate: they simply make that priority true in the automatic zone.

Prepare a small real batch to set your thresholds

You don’t need thousands of rows to decide. A batch of 200 to 500 messages from your database is often enough to set realistic thresholds.

What matters is that the messages look like your day-to-day: contact forms, comments/reviews, shared inbox, tickets. Keep “real life” noise (typos, short texts, signatures, copy/paste), otherwise your thresholds will be overly optimistic and you’ll be surprised in production.

To keep it simple, use a spreadsheet with a few columns: the text, the “true type” (what you judge correct), the model’s predicted type, the associated score, and optionally a “critical if wrong” marker. Important tip: fill the “true type” before running the model, otherwise you’ll bias yourself.
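If you prefer to bootstrap that file programmatically, here is a minimal sketch; the file name and column names are illustrative, so adapt them to your own convention:

```python
import csv

# Illustrative column layout for the labeling spreadsheet.
FIELDNAMES = ["text", "true_type", "predicted_type", "score", "critical_if_wrong"]

with open("batch.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
    writer.writeheader()
    # Fill "true_type" by hand BEFORE running the model, as noted above;
    # "predicted_type" and "score" stay empty until the API fills them in.
    writer.writerow({
        "text": "Bonjour, je n'arrive pas à me connecter à mon compte.",
        "true_type": "ham",
        "critical_if_wrong": "yes",
    })
```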

If you need a quick reminder for labeling, here’s the spirit of the categories: promotion (sales outreach), phishing (impersonation + link/urgency), scam (financial fraud), seo_boost (backlinks/guest posts/link spam), clickbait (sensational/vague hook).

Choose thresholds without overcomplicating things

After preparing a batch of messages in a spreadsheet, there’s one missing piece before moving on: getting the model’s scores for those messages.

The simplest option is to put the model behind a small local API. That way, you can fetch scores with a curl command and set thresholds based on something concrete.

Before going further: set up a local API (FastAPI)

The goal is deliberately minimal: a Python service loads the model once at startup, then you send texts over HTTP (POST /predict).

Recommended structure

kinoux-spam-type-api/
	model/
		... model files ...
	requirements.txt
	.env.example
	app.py
	serve.py

Installation

Create requirements.txt:

transformers>=4.40,<5.0
torch
safetensors
fastapi>=0.110
uvicorn[standard]>=0.23
python-dotenv>=1.0

Then install:

python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt

Configuration via .env

Create a .env file:

HOST=0.0.0.0
PORT=8000
MODEL_DIR=./model
MAX_LENGTH=256

These 4 parameters are enough for a first test:

  • HOST: the address the service listens on (e.g. 0.0.0.0 to accept requests from the network, or 127.0.0.1 to stay local).
  • PORT: the HTTP port the API responds on (e.g. 8000)
  • MODEL_DIR: the path to the folder containing the model files (e.g. ./model).
  • MAX_LENGTH: the maximum length processed by the tokenizer; beyond that, text is truncated (e.g. 256).

The code in app.py also reads an optional FORCE_CPU variable (default: true, i.e. CPU only). Set FORCE_CPU=false in .env to use a GPU when one is available.

FastAPI API (POST /predict)

Create app.py:



#!/usr/bin/env python3
"""FastAPI wrapper for a packaged Kinoux multi-class spam type model."""

from __future__ import annotations

import os
from contextlib import asynccontextmanager
from pathlib import Path
from typing import Any

import torch
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field
from transformers import AutoModelForSequenceClassification, AutoTokenizer

load_dotenv()

DEFAULT_MODEL_DIR = Path(__file__).resolve().parent / "model"


def str_to_bool(value: str | None, default: bool = False) -> bool:
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "y", "on"}


def choose_device(force_cpu: bool) -> torch.device:
    if force_cpu or not torch.cuda.is_available():
        return torch.device("cpu")
    return torch.device("cuda")


def load_model(model_dir: Path, device: torch.device):
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    model.to(device).eval()
    labels = [model.config.id2label[i] for i in range(model.config.num_labels)]
    return tokenizer, model, labels


def predict_scores(
    text: str,
    tokenizer,
    model,
    labels: list[str],
    device: torch.device,
    max_length: int,
) -> dict[str, Any]:
    batch = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        padding=True,
        max_length=max_length,
    ).to(device)

    with torch.inference_mode():
        logits = model(**batch).logits
        probs = torch.softmax(logits, dim=-1)[0].detach().cpu().tolist()

    scores = {label: float(score) for label, score in zip(labels, probs)}
    top_label = max(scores, key=scores.get)
    top_score = scores[top_label]

    return {
        "top_label": top_label,
        "top_score": round(top_score, 6),
        "scores_percent": {k: round(v * 100.0, 2) for k, v in scores.items()},
    }


class PredictRequest(BaseModel):
    text: str | None = Field(default=None, description="Single text")
    texts: list[str] | None = Field(default=None, description="Batch of texts")
    max_length: int | None = Field(default=None, description="Override MAX_LENGTH")


@asynccontextmanager
async def lifespan(app: FastAPI):
    model_dir = Path(os.getenv("MODEL_DIR", str(DEFAULT_MODEL_DIR))).resolve()
    if not model_dir.exists():
        raise RuntimeError(f"Model directory not found: {model_dir}")

    force_cpu = str_to_bool(os.getenv("FORCE_CPU"), default=True)
    device = choose_device(force_cpu)

    tokenizer, model, labels = load_model(model_dir, device)

    app.state.model_dir = model_dir
    app.state.device = device
    app.state.tokenizer = tokenizer
    app.state.model = model
    app.state.labels = labels
    app.state.max_length = int(os.getenv("MAX_LENGTH", "256"))

    yield


app = FastAPI(title="Kinoux Spam Type Detection API", version="1.0.0", lifespan=lifespan)


@app.get("/health")
def health() -> dict[str, Any]:
    return {
        "status": "ok",
        "device": str(app.state.device),
        "model_dir": str(app.state.model_dir),
        "labels": app.state.labels,
        "max_length": app.state.max_length,
    }


@app.post("/predict")
def predict(payload: PredictRequest) -> dict[str, Any]:
    if payload.text is not None:
        texts = [payload.text]
    elif payload.texts is not None:
        texts = payload.texts
    else:
        raise HTTPException(status_code=400, detail="Provide 'text' or 'texts'")

    cleaned = [t.strip() for t in texts if t and t.strip()]
    if not cleaned:
        raise HTTPException(status_code=400, detail="Empty text(s)")

    max_length = payload.max_length if payload.max_length is not None else app.state.max_length

    results: list[dict[str, Any]] = []
    for t in cleaned:
        results.append(
            {
                "text": t,
                **predict_scores(
                    text=t,
                    tokenizer=app.state.tokenizer,
                    model=app.state.model,
                    labels=app.state.labels,
                    device=app.state.device,
                    max_length=max_length,
                ),
            }
        )

    return {"results": results}

Then create serve.py:



#!/usr/bin/env python3

from __future__ import annotations

import os

import uvicorn
from dotenv import load_dotenv

load_dotenv()


def main() -> None:
    host = os.getenv("HOST", "0.0.0.0")
    port = int(os.getenv("PORT", "8000"))
    uvicorn.run("app:app", host=host, port=port, log_level="info")


if __name__ == "__main__":
    main()

Start the API:

python serve.py

Fetch model scores with curl

Before discussing thresholds, make sure the API responds:

curl http://127.0.0.1:8000/health

Then test a single text:

curl -X POST http://127.0.0.1:8000/predict \
	-H "Content-Type: application/json" \
	-d '{"text":"Offre SEO spéciale: boostez votre site avec 2000 backlinks garantis."}'

To go faster on a batch, you can send multiple texts at once:

curl -X POST http://127.0.0.1:8000/predict \
	-H "Content-Type: application/json" \
	-d '{"texts":["Cliquez ici pour récupérer vos gains de loterie internationale.","Promo flash : -70% sur toute la boutique, livraison dès demain."]}'

The response includes:

  • top_label: the most likely category (e.g. seo_boost)
  • top_score: the associated score (0 to 1)
  • scores_percent: scores for all categories (percent)

That’s exactly what you need to fill your spreadsheet and set thresholds.
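For reference, here is how to pull those two fields out of a response in Python. The payload below is a made-up illustration of the shape returned by app.py, not real model output:

```python
# An illustrative /predict response (values fabricated for demonstration only).
response = {
    "results": [
        {
            "text": "Offre SEO spéciale: boostez votre site avec 2000 backlinks garantis.",
            "top_label": "seo_boost",
            "top_score": 0.973512,
            "scores_percent": {"ham": 0.82, "promotion": 1.10, "phishing": 0.21,
                               "scam": 0.17, "seo_boost": 97.35, "clickbait": 0.35},
        }
    ]
}

# The two fields to paste into the spreadsheet:
rows = [(r["top_label"], r["top_score"]) for r in response["results"]]
```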

Automate spreadsheet enrichment (without overdoing it)

At this point, you can already do a “manual” test: copy a sentence from your spreadsheet, call the API with curl, then paste top_label and the score.

But once you have 200 to 500 messages, the most comfortable approach is to write a small, straightforward script that:

  • reads your batch exported as CSV (from Excel),
  • sends each text to the API (POST /predict),
  • writes a new enriched CSV with top_label + top_score (and optionally per-class scores).

Same idea if your source is an inbox: the script reads emails (IMAP/Exchange/internal tooling), sends the text to the API, then routes the message (quarantine, review, normal) according to your rules.

The key idea: you keep the model in a single place (the API), and everything else becomes simple integration around it.
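Here is a sketch of that enrichment script, assuming the API above runs on 127.0.0.1:8000 and your export has a text column (file and column names are illustrative). The enrichment logic is separated from the HTTP call so you can test it without a running server:

```python
import csv
import json
import urllib.request

API_URL = "http://127.0.0.1:8000/predict"  # assumes the local API from app.py


def call_api(text: str) -> dict:
    """POST one text to /predict and return the first result dict."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"][0]


def enrich_rows(rows: list[dict], predict_fn=call_api) -> list[dict]:
    """Add predicted_type and top_score columns to each row."""
    enriched = []
    for row in rows:
        result = predict_fn(row["text"])
        enriched.append({**row,
                         "predicted_type": result["top_label"],
                         "top_score": result["top_score"]})
    return enriched


def enrich_csv(src_path: str, dst_path: str, predict_fn=call_api) -> None:
    """Read the exported batch, add the model columns, write the enriched CSV."""
    with open(src_path, newline="", encoding="utf-8") as src:
        rows = list(csv.DictReader(src))
    enriched = enrich_rows(rows, predict_fn)
    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=list(enriched[0].keys()))
        writer.writeheader()
        writer.writerows(enriched)
```

With the API running, a single call like enrich_csv("batch.csv", "batch_enriched.csv") produces the enriched file.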

Set thresholds with a simple method

Now that you can retrieve scores, the goal is not “the best number”. The goal is that the automatic zone is very safe, especially on the mistakes you consider “critical”.

Start deliberately conservative. Typically, you only automate phishing and scam when the score is really high. For promotion and seo_boost, you may prefer holding or reviewing at first. And for clickbait, review is often the best option until you have enough feedback.

Then apply a method that works well and stays readable. In your spreadsheet, take one category (for example phishing), sort rows by descending score, then find the first critical mistake. Your “automatic” threshold sits just above that. You end up with a “safe” automatic zone built from your own data.

Once that zone is stable, add a second threshold for the “to review” zone. You now have a clear setup: above the auto threshold, action; between the two, review; below, no action (normal flow).
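The sort-and-find-the-first-critical-mistake step is easy to script once the spreadsheet is enriched. A sketch, assuming you extract (score, critical_mistake) pairs for one predicted category; the margin value is an illustrative choice:

```python
def auto_threshold(rows: list[tuple[float, bool]], margin: float = 0.001) -> float:
    """Return a threshold sitting just above the highest-scoring critical mistake.

    rows: (score, is_critical_mistake) pairs for one predicted category.
    """
    for score, critical in sorted(rows, reverse=True):  # descending score
        if critical:
            return min(1.0, score + margin)
    # No critical mistake in this batch: the lowest observed score would do,
    # but keep the threshold conservative anyway and revisit with more data.
    return min(score for score, _ in rows)
```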

Deploy without breaking your product

A common mistake is to plug in the model and apply an irreversible action on day one. A cautious rollout fits in three steps.

First, run the model in silent mode: it computes scores and you only log the type, the score, and the source. This helps confirm your real data matches your validation batch.

Then enable only reversible actions: quarantine for phishing/scam, hold for promotion/seo_boost, review queue for clickbait. At this stage, forbid automatic deletion.

Finally, ramp up gradually (for example 5% → 20% → 50% → 100%) and keep a “rollback” button.
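A common way to implement that ramp-up is a stable hash of the message id, so a given message always falls on the same side of the percentage. A sketch (the id is whatever uniquely identifies a message in your system):

```python
import hashlib


def in_rollout(message_id: str, percent: int) -> bool:
    """True if this message falls inside the current rollout percentage.

    Deterministic: the same id always yields the same answer, and raising
    `percent` only ever adds messages (never removes any), so going
    5 -> 20 -> 50 -> 100 is monotonic. Setting percent to 0 is the rollback.
    """
    digest = hashlib.sha256(message_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:2], "big") % 100  # stable bucket in 0-99
    return bucket < percent
```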

What to monitor

You don’t need a complicated system. Monitor a few signals that provide useful alerts.

If the “to review” volume increases, it usually means traffic changed or the model is less comfortable. If you see false blocks (ham classified as spam), that’s your main alarm. And if you observe missed phishing/scam cases in a small regular sample, you’ll need to tighten thresholds or adjust the logic (more review, extra rules, etc.).

Finally, look at results by channel: a threshold acceptable for a contact form may be too aggressive for public comments.

Conclusion

The French Spam Type Detection model is valuable because it provides reasons (the type), not just a verdict.

The safest method is also the simplest: you organize your flow into three zones, set thresholds on a small real batch, automate reversible actions first, then expose the model via a local API for clean integration. After that, you refine by channel and prioritize monitoring false blocks.