An end-to-end platform that automates the hiring pipeline: CV parsing, semantic candidate-to-job matching, and LLM-generated interview questions. The build spans everything from the NLP models to a full-stack RESTful app with an analytics dashboard.
3D Smart Factory's HR pipeline relied on keyword-based ATS filtering — fast but blunt. Strong candidates were getting rejected because their resumes phrased experience differently than the job description. Meanwhile recruiters spent ~80% of their first-pass time on manual triage.
The goal: a system that understands semantic intent, not just term frequency. One that ranks candidates by genuine fit, summarizes their profile, and prepares the interviewer with tailored questions before the first call.
A classifier would have needed labeled CV/job pairs — we had none. Embeddings let us bootstrap from a frozen Sentence-Transformers model and tune the similarity threshold empirically. Six weeks of dev time saved, no labeled-data bottleneck.
Pragmatism over perfection
HR data couldn't leave the company VPC for compliance reasons. LLaMA running locally on a GPU instance gave us full data sovereignty with acceptable latency (~4s per generation). GPT-3.5 was marginally better but a non-starter on legal review.
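To make the local-generation step concrete, here is a minimal sketch of how a question-generation call could be wrapped so the model runtime stays swappable. Everything in it is an assumption for illustration: the function names, the prompt wording, and the `generate` callable (a stand-in for whatever local LLaMA runtime is used) are hypothetical, not the production code.

```python
import re
from typing import Callable


def build_question_prompt(job_desc: str, cv_summary: str, n_questions: int = 5) -> str:
    # Hypothetical prompt wording; the real prompt is not part of this write-up.
    return (
        "You are helping a recruiter prepare for an interview.\n"
        f"Job description:\n{job_desc}\n\n"
        f"Candidate summary:\n{cv_summary}\n\n"
        f"Write {n_questions} tailored interview questions, one per line."
    )


def parse_questions(raw: str) -> list[str]:
    # Strip list markers such as "1.", "2)", "-" or "*" and drop blank lines.
    questions = []
    for line in raw.splitlines():
        text = re.sub(r"^\s*(?:[-*]|\d+[.)])\s+", "", line).strip()
        if text:
            questions.append(text)
    return questions


def generate_questions(
    job_desc: str,
    cv_summary: str,
    generate: Callable[[str], str],
    n_questions: int = 5,
) -> list[str]:
    # `generate` is any prompt -> text function, e.g. a thin wrapper around
    # a locally hosted model, so no candidate data has to leave the VPC.
    prompt = build_question_prompt(job_desc, cv_summary, n_questions)
    return parse_questions(generate(prompt))[:n_questions]
```

Passing the generator in as a plain callable keeps the compliance decision in one place: swapping the local model for another backend would not touch the prompt or parsing code.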
Compliance shaped the model choice
Recruiters don't read JSON. The dashboard turned the matching score into a ranked, filterable, exportable list with one-click "summarize" and "send to interviewer" actions. The UI was what made adoption happen; the model was just an input.
UX is part of the model
```python
# core/matching.py: the heart of the ranking pipeline
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")


def rank_candidates(job_desc: str, cvs: list[dict]) -> list[dict]:
    # nothing to rank: skip encoding an empty batch
    if not cvs:
        return []

    # 1. embed the job description once
    job_vec = model.encode(job_desc, convert_to_tensor=True)

    # 2. embed every parsed CV summary in one batch
    #    (caching of previously seen CVs is not shown in this excerpt)
    cv_vecs = model.encode(
        [cv["summary"] for cv in cvs],
        convert_to_tensor=True,
        show_progress_bar=False,
    )

    # 3. cosine similarity gives one fit score per candidate
    #    (range is [-1, 1]; higher means a closer match)
    scores = util.cos_sim(job_vec, cv_vecs)[0]

    # 4. zip + sort, return every candidate with its score, best first
    return sorted(
        [{**cv, "score": float(s)} for cv, s in zip(cvs, scores)],
        key=lambda c: c["score"],
        reverse=True,
    )
```
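The similarity threshold mentioned earlier was tuned empirically, with no labeled pairs to fit against. As a rough sketch of what that step can look like on top of the ranked output, the helpers below apply a cutoff and show how many candidates survive at different cutoffs. The names `shortlist` and `pass_rate_by_threshold`, the default of 0.5, and the toy scores are all hypothetical; the value actually used is not stated here.

```python
from typing import Optional


def shortlist(
    ranked: list[dict],
    threshold: float = 0.5,  # illustrative default, not the tuned value
    max_results: Optional[int] = None,
) -> list[dict]:
    # Keep candidates at or above the cutoff; input order (best first) is preserved.
    kept = [c for c in ranked if c["score"] >= threshold]
    return kept if max_results is None else kept[:max_results]


def pass_rate_by_threshold(ranked: list[dict], thresholds: list[float]) -> dict[float, float]:
    # Fraction of candidates that would pass each cutoff: a quick way to see
    # how aggressive a threshold is before anyone reviews the shortlist.
    total = len(ranked)
    if total == 0:
        return {t: 0.0 for t in thresholds}
    return {t: sum(c["score"] >= t for c in ranked) / total for t in thresholds}


# Toy scores standing in for the output of a ranking step.
ranked = [
    {"name": "a", "score": 0.82},
    {"name": "b", "score": 0.55},
    {"name": "c", "score": 0.31},
]
top = shortlist(ranked, threshold=0.5)
rates = pass_rate_by_threshold(ranked, [0.3, 0.5, 0.9])
```

Sweeping a few cutoffs and having recruiters sanity-check the resulting shortlists is one label-free way to settle on a value; it is a heuristic, not a calibrated decision boundary.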
Always open to new AI & data engineering challenges. If you have a project that needs this kind of attention, let's talk.