Automating Cyrillic Resume Parsing

Transforming HR Efficiency in the Pre-AI Expansion Era

Problem

At a time when NLP models were less accessible than they are today, a leading Eastern European HR company faced a pressing challenge: it received a high volume of resumes written in Cyrillic and, with limited resources, struggled to extract essential applicant information accurately. Manual extraction was error-prone and inefficient, so the company needed an automated solution.

Solution

In collaboration with our client, Data Masters developed a CV parser that extracts crucial information from resumes written in the Cyrillic alphabet. Our solution covered these key elements:

  • Applicant Details: Extracting the applicant’s first name, last name, date of birth, and address.
  • Educational Background: Capturing educational history.
  • Work Experience: Extracting work experience details.
  • Interests: Identifying applicant interests.

Key Components

  • Rule-Based Approach: We designed precise rules to extract structured information, targeting high accuracy on crucial details.
  • Machine Learning Model: Using machine learning, we fine-tuned our model to handle unstructured data and adapt to unique resume formats.
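As an illustration of the rule-based component, the sketch below pulls a date of birth out of resume text with a regular expression. The parser's actual rules are not described here, so the Russian label phrases, the date format, and the function name `extract_date_of_birth` are all assumptions made for the example.

```python
import re
from datetime import date
from typing import Optional

# Hypothetical rule: a Russian label ("дата рождения", "родился", "родилась")
# followed by a DD.MM.YYYY or DD/MM/YYYY date. The real rules are not
# given in the source; this only shows the general shape of such a rule.
DOB_PATTERN = re.compile(
    r"(?:дата\s+рождения|родил(?:ся|ась))\s*[:\-]?\s*"
    r"(\d{1,2})[./](\d{1,2})[./](\d{4})",
    re.IGNORECASE,
)


def extract_date_of_birth(text: str) -> Optional[date]:
    """Return the first date of birth found in the text, or None."""
    match = DOB_PATTERN.search(text)
    if match is None:
        return None
    day, month, year = (int(group) for group in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # The digits matched the pattern but do not form a real calendar date.
        return None


resume = "Иванов Иван\nДата рождения: 12.03.1990\nМосква"
print(extract_date_of_birth(resume))
```

A rule like this is easy to audit but only matches the formats it was written for, which is why a learned model is a natural complement for resumes that do not follow a predictable layout.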

Results

Our CV parser was thoroughly tested on 20,000 resumes written in the Cyrillic alphabet, with the following outcomes:

  • Accuracy: The parser achieved an average accuracy of 78% across the types of information extracted.
  • Resource Optimization: Our solution allowed the client to process resumes more efficiently, eliminating the need for extensive manual work.
  • Error Reduction: The percentage of errors resulting from manual extraction was substantially reduced, improving data quality.
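
One way to read the average accuracy figure is as the mean of per-field accuracies measured against manually labeled resumes. The sketch below shows that calculation on toy data. The exact-match criterion, the field names, and the helper functions are assumptions for illustration, not the evaluation procedure actually used.

```python
from typing import Dict, List

Record = Dict[str, str]


def per_field_accuracy(
    predicted: List[Record], expected: List[Record], fields: List[str]
) -> Dict[str, float]:
    """Share of resumes where each field exactly matches the labeled value."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("at least one labeled resume is required")
    scores: Dict[str, float] = {}
    for field in fields:
        correct = sum(
            1 for p, e in zip(predicted, expected) if p.get(field) == e.get(field)
        )
        scores[field] = correct / len(expected)
    return scores


def average_accuracy(scores: Dict[str, float]) -> float:
    """Unweighted mean over fields, so each field counts equally."""
    return sum(scores.values()) / len(scores)


# Toy data: two resumes, two hypothetical fields.
expected = [
    {"name": "Иван", "birth_date": "1990-03-12"},
    {"name": "Анна", "birth_date": "1985-07-01"},
]
predicted = [
    {"name": "Иван", "birth_date": "1990-03-12"},
    {"name": "Анна", "birth_date": "1985-01-01"},
]

scores = per_field_accuracy(predicted, expected, ["name", "birth_date"])
print(scores, average_accuracy(scores))
```

Reporting the per-field scores alongside the average makes it visible when one field, such as free-text interests, is much harder to extract than another.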