Resume GPT
Association: N/A
Duration: 3 weeks
Tags: Machine Learning, OpenAI API, Python

This project is an AI-powered Resume Review Tool designed to help job seekers optimize their resumes for Applicant Tracking Systems (ATS) and improve their chances of landing interviews. Built with advanced Natural Language Processing (NLP) techniques, the tool analyzes resumes for key factors such as ATS compatibility, keyword matching, job title alignment, quantifiable achievements, action verb usage, and readability. By comparing the resume against a provided job description, the tool generates actionable feedback, including an ATS compatibility score, prioritized improvements, and suggestions for enhancing content and structure.
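As a rough illustration of how per-factor checks like these could roll up into a single ATS compatibility score, here is a minimal sketch. The factor names and weights below are illustrative assumptions, not the tool's actual scoring model:

```python
# Illustrative sketch: combine per-factor scores (each 0-100) into one
# weighted ATS compatibility score. Factor names and weights are
# hypothetical, chosen only to mirror the factors described above.

FACTOR_WEIGHTS = {
    "keyword_match": 0.35,
    "job_title_alignment": 0.20,
    "quantifiable_achievements": 0.15,
    "action_verbs": 0.15,
    "readability": 0.15,
}

def ats_score(factor_scores):
    """Weighted average of per-factor scores, each on a 0-100 scale."""
    total = sum(
        FACTOR_WEIGHTS[name] * factor_scores.get(name, 0)
        for name in FACTOR_WEIGHTS
    )
    return round(total, 1)

scores = {
    "keyword_match": 80,
    "job_title_alignment": 70,
    "quantifiable_achievements": 60,
    "action_verbs": 90,
    "readability": 85,
}
print(ats_score(scores))
```

Weighting keyword match most heavily reflects the emphasis ATS filters place on it, but any real weighting would need to be tuned against actual screening outcomes.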
The idea was born out of a desire to address the challenges job seekers face in today's competitive market. As someone who has watched countless talented candidates, myself included, struggle to navigate ATS filters and stand out in a sea of applications, I wanted to create a tool that simplifies the process. By leveraging AI and NLP, this tool aims to give candidates actionable insights, helping them craft resumes that not only pass ATS filters but also highlight their unique strengths and achievements. My goal is to level the playing field, so that every candidate has the tools they need to succeed in their job search.
import re
from collections import Counter

from nltk.corpus import stopwords  # requires nltk.download('stopwords')


def keyword_matching(self, resume_text, job_description):
    """
    Extract the most important keywords from the job description and
    check which of them appear in the resume, using pattern matching
    and frequency-based term weighting.
    """
    # Initialize stopwords
    stop_words = set(stopwords.words('english'))

    # Common job-posting filler words to ignore as well
    additional_stopwords = {
        'will', 'able', 'role', 'responsibility', 'responsibilities',
        'required', 'years', 'job', 'position', 'candidate', 'candidates',
        'include', 'including', 'experience', 'work', 'working', 'ability',
        'accommodation', 'must', 'should', 'can', 'strong', 'demonstrated',
        'demonstrable', 'excellent', 'outstanding', 'successful', 'success',
        'prefer', 'preferred', 'minimum', 'maximum', 'ideal', 'ideally',
        'plus', 'bonus', 'qualification', 'qualifications', 'skill', 'skills',
        'across', 'within', 'using', 'use', 'understanding', 'understand',
        'knowledge', 'related', 'degree', 'degrees', 'education',
        'educational', 'field', 'fields', 'need', 'service', 'services',
        'look', 'looking', 'help', 'join', 'team', 'teams', 'new',
        'industry', 'industries', 'sector', 'sectors', 'company', 'companies',
    }
    stop_words.update(additional_stopwords)

    # Technical terms that shouldn't be treated as stopwords
    technical_terms = {
        'ai', 'ml', 'ui', 'ux', 'api', 'sql', 'aws', 'gcp', 'azure', 'css',
        'js', 'ci', 'cd', 'bi', 'pm', 'vp', 'cto', 'ceo', 'cfo', 'coo', 'cpo',
    }
    stop_words.difference_update(technical_terms)

    def clean_text(text):
        """Lowercase and strip punctuation and extra whitespace."""
        text = text.lower()
        # Keep hyphens and slashes for technical terms like "e-commerce", "ci/cd"
        text = re.sub(r'[^\w\s\-/]', ' ', text)
        text = re.sub(r'\s+', ' ', text).strip()
        return text

    def extract_meaningful_phrases(text):
        """Extract meaningful phrases focusing on skills and qualifications."""
        text = clean_text(text)

        # Patterns for important phrases
        patterns = [
            # Technical skills
            r'\b(?:python|java|javascript|typescript|react|angular|vue|node\.?js|express|django|flask|sql|nosql|mongodb|postgresql|mysql|oracle|aws|azure|gcp|docker|kubernetes|terraform|jenkins|git|github|gitlab|agile|scrum|kanban|jira|confluence|tableau|power\s*bi|excel|vba|ml|ai|machine\s*learning|deep\s*learning|nlp|natural\s*language\s*processing|data\s*science|data\s*visualization|data\s*analysis|data\s*engineering|data\s*modeling|product\s*management|product\s*development|product\s*strategy|ux|ui|design\s*thinking|figma|sketch|devops|ci\/cd|continuous\s*integration|continuous\s*deployment)\b',
            # Business skills and domains
            r'\b(?:product\s*manager|product\s*owner|project\s*manager|program\s*manager|business\s*analyst|systems\s*analyst|solutions\s*architect|technical\s*architect|enterprise\s*architect|financial\s*analysis|financial\s*modeling|financial\s*planning|budget\s*management|strategic\s*planning|roadmap\s*development|stakeholder\s*management|team\s*leadership|team\s*management|cross\-functional|client\s*relationship|vendor\s*management|executive\s*presentation|data\-driven|decision\s*making|problem\s*solving|critical\s*thinking|communication\s*skills|fintech|insurtech|regtech|health\s*tech|e\-commerce|saas|artificial\s*intelligence|blockchain|cryptocurrency|payments|lending|wealth\s*management|retirement|asset\s*management|investment\s*management|risk\s*management|compliance|regulatory|security|user\s*experience|customer\s*experience|market\s*analysis|competitor\s*analysis|growth\s*strategy|marketing\s*strategy|digital\s*transformation|change\s*management|innovation|analytics|metrics|kpis|okrs)\b',
            # Education and certifications
            r'\b(?:mba|phd|master|bachelor|bs|ba|ms|cfa|cpa|pmp|csm|cspo|safe|prince2|itil|six\s*sigma|lean|comptia|cisco|microsoft\s*certified|aws\s*certified|google\s*certified|azure\s*certified)\b',
        ]

        # Collect all pattern matches
        all_matches = []
        for pattern in patterns:
            all_matches.extend(re.findall(pattern, text))

        # Also collect single words and bigrams not captured by the patterns
        words = text.split()
        filtered_words = [w for w in words if w not in stop_words and len(w) > 2]

        bigrams = []
        for i in range(len(words) - 1):
            # Keep a bigram only if at least one word is not a stopword
            if words[i] not in stop_words or words[i + 1] not in stop_words:
                # Skip bigrams made of very short, non-numeric words
                if (len(words[i]) > 2 or words[i].isdigit()) and \
                        (len(words[i + 1]) > 2 or words[i + 1].isdigit()):
                    bigrams.append(words[i] + ' ' + words[i + 1])

        return all_matches + filtered_words + bigrams

    # Extract terms from the job description and count their frequencies
    job_terms = extract_meaningful_phrases(job_description)
    term_counts = Counter(job_terms)

    # Apply domain-specific weights on top of raw frequency
    weighted_terms = {}
    for term, count in term_counts.items():
        weight = count
        if re.search(r'finance|financial|banking|investment|insurance', term):
            weight *= 1.5  # finance-domain terms
        if re.search(r'manager|director|lead|leadership|strategy', term):
            weight *= 1.3  # management/leadership terms
        if re.search(r'python|java|sql|data|algorithm|machine learning|ai', term):
            weight *= 1.4  # technical skills
        if re.search(r'product|roadmap|feature|requirement|backlog|agile', term):
            weight *= 1.5  # product-related terms
        weighted_terms[term] = weight

    # Sort terms by weight, highest first
    sorted_terms = sorted(weighted_terms.items(), key=lambda x: x[1], reverse=True)

    # Keep the top N terms, skipping near-duplicates
    top_n = 15
    seen = set()
    important_keywords = []
    for term, _ in sorted_terms:
        # Skip similar terms (e.g. don't keep both "python" and "python programming")
        should_skip = any(term in existing or existing in term
                          for existing in important_keywords)
        if not should_skip and term not in seen:
            important_keywords.append(term)
            seen.add(term)
        if len(important_keywords) >= top_n:
            break

    # Clean the resume for comparison
    clean_resume = clean_text(resume_text)

    # Check which keywords appear in the resume
    matched = []
    missing = []
    for keyword in important_keywords:
        if keyword.lower() in clean_resume:
            matched.append(keyword)
            continue

        keyword_parts = keyword.split()
        if len(keyword_parts) > 1:
            # For multi-word keywords, check hyphenated/slashed/joined variations
            variations = [
                '-'.join(keyword_parts),
                '/'.join(keyword_parts),
                ''.join(keyword_parts),
            ]
            found = False
            for var in variations:
                if var.lower() in clean_resume:
                    matched.append(keyword)
                    found = True
                    break
            # Also accept the keyword if all non-stopword parts appear somewhere
            if not found:
                all_parts_exist = all(
                    part in clean_resume
                    for part in keyword_parts
                    if part not in stop_words
                )
                if all_parts_exist:
                    matched.append(keyword)
                    found = True
            if not found:
                missing.append(keyword)
        else:
            # For single words, check similar morphological forms
            similar_forms = {
                'manage': ['manager', 'management', 'managing'],
                'develop': ['developer', 'development', 'developing'],
                'analyze': ['analyst', 'analysis', 'analytical'],
                'finance': ['financial', 'financing'],
                'strategy': ['strategic', 'strategist'],
                'lead': ['leader', 'leadership'],
            }
            found = False
            for base, forms in similar_forms.items():
                if keyword == base or keyword in forms:
                    for form in forms + [base]:
                        if form in clean_resume:
                            matched.append(keyword)
                            found = True
                            break
                if found:
                    break
            if not found:
                missing.append(keyword)

    # Calculate the match percentage
    match_percentage = (len(matched) / max(len(important_keywords), 1)) * 100
    return {
        "match_percentage": round(match_percentage, 2),
        "matched": matched,
        "missing": missing,
        "important_keywords": important_keywords,  # returned for debugging
    }
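The multi-word variation check in that method (hyphenated, slash-joined, and concatenated forms) can be exercised in isolation. Here is a simplified, standalone stdlib-only sketch of the same idea, independent of the class and of NLTK; the function name and behavior are a distilled illustration, not the method itself:

```python
import re

def find_keyword(keyword, resume_text):
    """Return True if the keyword, or a common joined variation of it,
    appears in the resume text (case-insensitive). A simplified,
    standalone version of the variation check in keyword_matching()."""
    text = re.sub(r'\s+', ' ', resume_text.lower())
    parts = keyword.lower().split()
    # Check the spaced form plus hyphenated, slashed, and concatenated joins,
    # so "machine learning" also matches "machine-learning" or "machinelearning".
    candidates = [
        ' '.join(parts),
        '-'.join(parts),
        '/'.join(parts),
        ''.join(parts),
    ]
    return any(candidate in text for candidate in candidates)

print(find_keyword("machine learning", "Built machine-learning pipelines"))
```

Substring matching like this is deliberately forgiving; it trades precision (e.g. "java" also matches "javascript") for recall, which is the same trade-off the full method makes.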
I also wrote about the ethical aspects of implementing AI in the hiring process, along with a suggested framework for organizations and businesses to adopt.

© 2025. All rights reserved.