Universal CV parser

As part of developing a human resource management system, we had to build a CV parser module that can extract CV data from many types of documents: standard MS Office and other popular text editor formats, audio files, images and PDF files. Implementing this plugin will significantly reduce the time needed to create new candidate profiles in any system.

We decided to combine computer vision tools with text conversion tools such as unoconv. The first step is to define the areas of interest in the file, using OpenCV for images and spaCy.io for textual documents. If a person’s photo is identified, it can be used immediately as the profile avatar. Textual data is processed by a specially trained spaCy model, which extracts the necessary entities from the documents: people’s names, dates, organisations, skills, etc. The data is then imported into a draft of the candidate’s profile, so the HR specialist can approve or adjust it. This simplifies and speeds up profile creation significantly, and the approvals and adjustments are used to train the next version of the model.
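The draft-and-review step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `DraftProfile` class, the `build_draft` and `collect_corrections` helpers, the `SKILL` label and the sample data are all hypothetical. The input is assumed to be a list of (text, label) pairs, the kind of output one could collect from the entities a spaCy model returns.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from entity labels to profile fields.
# PERSON, ORG and DATE are common spaCy labels; SKILL is assumed
# to be a custom label added when the model was trained.
LABEL_TO_FIELD = {
    "PERSON": "names",
    "ORG": "organisations",
    "DATE": "dates",
    "SKILL": "skills",
}


@dataclass
class DraftProfile:
    """Hypothetical container for the data shown to the HR specialist."""
    names: list = field(default_factory=list)
    organisations: list = field(default_factory=list)
    dates: list = field(default_factory=list)
    skills: list = field(default_factory=list)


def build_draft(entities):
    """Group (text, label) pairs into a draft, skipping duplicates and unknown labels."""
    draft = DraftProfile()
    for text, label in entities:
        field_name = LABEL_TO_FIELD.get(label)
        if field_name is None:
            continue  # label not used in the profile
        values = getattr(draft, field_name)
        if text not in values:
            values.append(text)
    return draft


def collect_corrections(draft, approved):
    """Compare the parser's draft with the profile approved by the reviewer.

    Returns, per field, the values the reviewer added and removed. Differences
    like these are the kind of signal that could feed the next training run.
    """
    corrections = {}
    for field_name in LABEL_TO_FIELD.values():
        before = getattr(draft, field_name)
        after = getattr(approved, field_name)
        added = [v for v in after if v not in before]
        removed = [v for v in before if v not in after]
        if added or removed:
            corrections[field_name] = {"added": added, "removed": removed}
    return corrections


# Made-up sample entities, as a trained model might emit them.
entities = [
    ("Olena Kovalenko", "PERSON"),
    ("Acme Ltd", "ORG"),
    ("2016", "DATE"),
    ("Python", "SKILL"),
    ("Python", "SKILL"),
    ("Kharkiv", "GPE"),
]
draft = build_draft(entities)

# The reviewer keeps everything and adds one skill the parser missed.
approved = DraftProfile(
    names=["Olena Kovalenko"],
    organisations=["Acme Ltd"],
    dates=["2016"],
    skills=["Python", "SQL"],
)
corrections = collect_corrections(draft, approved)
```

Storing only the differences between the draft and the approved profile keeps the feedback small and makes it easy to see which entity types the model misses most often.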

Tags

Location: Ukraine

Industry: IT services / HR management

Partnership period: 2018-ongoing

Team size: 5 people

Team location: Kharkiv, Ukraine

Expertise delivered: DevOps skills, ML & NLP, OCR, data science skills, Python development

Technology stack: Python, Google Vision API, spaCy.io, OpenCV, unoconv tool

"This plugin provides recruiters and HR specialists with literally unlimited capabilities! Just imagine: the recruiter can take a photo of the candidate’s CV and import it into the system from her phone in a couple of clicks, and the candidate’s profile is almost done! Once the system is able to process call transcripts, it will cover all the recruiting needs! HireUkraine came up with an excellent idea and has all the skills required to bring it to life!"