How to Automate Developer OSINT with Python

By Bohdan Lukianets (bohdan AI)

Hey Dev.to community! 👋

Last month, I had a surprising realization. While applying for a contract position, the client mentioned they'd already checked out several of my old Stack Overflow answers and even a GitHub issue I'd commented on two years ago. It got me thinking—just how visible is my digital footprint as a developer?

In this post, I'll show you how I built a simple Python tool leveraging Open Source Intelligence (OSINT) techniques to gather developer-related information from public sources. We'll use my own nickname "bohdanai" as a practical example.

📌 What is OSINT?

Open Source Intelligence (OSINT) refers to collecting and analyzing publicly available data. For developers, OSINT can help you quickly gather useful information such as GitHub profiles, LinkedIn profiles, open-source contributions, and professional activities.

🎯 Our Goal

We'll create a Python script that:

  • Accepts a developer's name or nickname.
  • Searches online sources like Google for mentions.
  • Extracts relevant LinkedIn and GitHub profile links.
  • Optionally scrapes specific websites for mentions.
  • Returns structured results for further analysis.

🛠️ Tools and Libraries Used:

  • requests — for sending HTTP requests.
  • BeautifulSoup — for parsing HTML content.
  • googlesearch-python — to perform Google searches directly from Python.

Install these libraries with:

pip install requests beautifulsoup4 googlesearch-python

⚙️ Python Script Implementation

Here's the complete Python script:

import json

import requests
from bs4 import BeautifulSoup
from googlesearch import search


def search_developer(name, num_results=10, website_url=None):
    results = {
        "name": name,
        "google_results": [],
        "linkedin_url": None,
        "github_url": None,
        "website_mentions": [],
    }

    # Google Search
    # googlesearch-python takes num_results and sleep_interval
    # (num/stop/pause belong to a different package).
    print(f"🔍 Searching Google for '{name}'...")
    try:
        for url in search(name, num_results=num_results, sleep_interval=2):
            results["google_results"].append(url)
    except Exception as e:
        print(f"⚠️ Google search error: {e}")

    # LinkedIn URL extraction: first result that looks like a personal profile
    print("🔗 Extracting LinkedIn URL...")
    for url in results["google_results"]:
        if "linkedin.com/in/" in url:
            results["linkedin_url"] = url
            break

    # GitHub URL extraction: first GitHub result that is not an issue or pull request
    print("🐱 Extracting GitHub URL...")
    for url in results["google_results"]:
        if "github.com" in url and "/issues" not in url and "/pull" not in url:
            results["github_url"] = url
            break

    # Optional: Website scraping
    if website_url:
        print(f"🌐 Scraping mentions from {website_url}...")
        try:
            response = requests.get(website_url, timeout=5)
            soup = BeautifulSoup(response.text, "html.parser")
            # Keep every text fragment that contains the name (case-insensitive)
            for text in soup.stripped_strings:
                if name.lower() in text.lower():
                    results["website_mentions"].append(text)
        except requests.RequestException as e:
            print(f"⚠️ Website scraping error: {e}")

    return results


def save_to_json(data, filename="osint_results.json"):
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)
    print(f"📁 Results saved to {filename}")


if __name__ == "__main__":
    developer_name = input("Enter developer name or username to research: ") or "bohdanai"
    my_website = "https://bohdanlukianets.pro"
    data = search_developer(developer_name, website_url=my_website)

    print("\n📋 --- OSINT Results ---")
    print(f"👤 Name/Nickname: {data['name']}")
    # Both keys always exist, so fall back explicitly when the value is None
    print(f"🔗 LinkedIn: {data['linkedin_url'] or 'Not found'}")
    print(f"🐙 GitHub: {data['github_url'] or 'Not found'}")

    print("\n🌍 Google Results:")
    for idx, url in enumerate(data["google_results"], start=1):
        print(f"{idx}. {url}")

    if data["website_mentions"]:
        print(f"\n🔖 Mentions on {my_website}:")
        for mention in data["website_mentions"]:
            print(f"- {mention}")
    else:
        print(f"\n🔖 No mentions found on {my_website}.")

    save_to_json(data)
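
The substring checks in the script are deliberately simple, so a URL that merely contains "github.com" in a query string would also match. As a rough sketch of a stricter alternative (not part of the original script), the hypothetical helper `classify_profile_url` below parses each URL and inspects its host and path. It assumes LinkedIn personal profiles live under `/in/<handle>` and that a GitHub profile URL has exactly one path segment; pages such as `github.com/features` would still slip through.

```python
from urllib.parse import urlparse


def classify_profile_url(url):
    """Return 'linkedin', 'github', or None for one search-result URL.

    Hypothetical helper for illustration; the matching rules are assumptions.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # "/in/someone/" -> ["in", "someone"]
    segments = [part for part in parsed.path.split("/") if part]

    # LinkedIn personal profiles: any linkedin.com host, path /in/<handle>
    if host == "linkedin.com" or host.endswith(".linkedin.com"):
        if len(segments) >= 2 and segments[0] == "in":
            return "linkedin"
        return None

    # GitHub profile: exactly one path segment, e.g. /bohdanai
    if host in ("github.com", "www.github.com"):
        if len(segments) == 1:
            return "github"
        return None

    return None


for candidate in (
    "https://www.linkedin.com/in/bohdanai/",
    "https://github.com/bohdanai",
    "https://github.com/bohdanai/some-repo/issues/1",
):
    print(candidate, "->", classify_profile_url(candidate))
```

Parsing the URL first means repository, issue, and pull-request links are rejected by structure rather than by a growing list of excluded substrings.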

🚀 Running the Script

Simply save the code above in a file called osint_script.py and run it:

python osint_script.py

You'll be prompted to enter a username; press Enter without typing anything to use the default, "bohdanai".
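
If you'd rather skip the interactive prompt, the same inputs could be passed on the command line. This is only a sketch: `build_parser` and the `--website` and `--num-results` flags are names I'm assuming for illustration, not something the script above defines.

```python
import argparse


def build_parser():
    """Build a command-line parser mirroring the script's inputs (hypothetical flags)."""
    parser = argparse.ArgumentParser(
        description="Collect public mentions of a developer."
    )
    parser.add_argument(
        "name", nargs="?", default="bohdanai",
        help="name or nickname to research",
    )
    parser.add_argument(
        "--website", default=None,
        help="optional site to scan for mentions",
    )
    parser.add_argument(
        "--num-results", type=int, default=10,
        help="how many search results to request",
    )
    return parser


# Parse an explicit argument list here; a real script would call parse_args()
# with no arguments so that it reads sys.argv.
args = build_parser().parse_args(["bohdanai", "--website", "https://bohdanlukianets.pro"])
print(args.name, args.website, args.num_results)
```

The parsed values map directly onto `search_developer(args.name, num_results=args.num_results, website_url=args.website)`.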

✅ Strengths of This Script:

  • Easy to Understand: Clear structure and inline comments.
  • Beginner-Friendly: Great introduction to web scraping and OSINT.
  • Basic Error Handling: Handles exceptions in requests and search operations.
  • Popular Libraries: Uses well-known and maintained Python libraries.

⚠️ Weaknesses & Points for Improvement:

  • Reliability: Google scraping might lead to temporary IP blocks.
  • Limited Personalization: The website URL is hardcoded in the script; consider accepting it as input.
  • Ethical Concerns: Always ensure you have permission and follow ethical guidelines.
  • Output Format: Consider adding detailed CSV or Excel outputs for easier analysis.
  • Rate Limiting: Add delays (time.sleep()) for more extensive searches to avoid IP blocks.
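
For the output-format point, here is one way the results dictionary could be flattened into CSV with the standard library. Treat it as a sketch: the two-column layout (`category`, `value`) and the function names are my own assumptions, and the sample data is made up for the demo.

```python
import csv
import io


def results_to_csv_rows(data):
    """Flatten the results dict into (category, value) rows."""
    rows = [
        ("linkedin_url", data.get("linkedin_url") or ""),
        ("github_url", data.get("github_url") or ""),
    ]
    for url in data.get("google_results", []):
        rows.append(("google_result", url))
    for mention in data.get("website_mentions", []):
        rows.append(("website_mention", mention))
    return rows


def write_csv(data, fileobj):
    """Write a header plus one row per finding to an open text file object."""
    writer = csv.writer(fileobj)
    writer.writerow(["category", "value"])
    writer.writerows(results_to_csv_rows(data))


# Made-up sample in the same shape that search_developer() returns
sample = {
    "name": "bohdanai",
    "google_results": ["https://github.com/bohdanai"],
    "linkedin_url": None,
    "github_url": "https://github.com/bohdanai",
    "website_mentions": [],
}
buffer = io.StringIO()
write_csv(sample, buffer)
print(buffer.getvalue())
```

To write a real file, open it with `open("osint_results.csv", "w", newline="", encoding="utf-8")` and pass the file object to `write_csv`.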

🔐 Ethical Considerations:

Always respect privacy and legal frameworks when gathering personal data from the web. This script is intended purely for educational purposes and ethical OSINT investigations.
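
One small, mechanical part of scraping responsibly is checking a site's robots.txt before fetching a page. The sketch below uses Python's standard `urllib.robotparser`; the `is_allowed` helper, the `osint-demo` user agent, and the sample rules are assumptions for illustration. It says nothing about privacy or legal questions, which still need human judgment.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="osint-demo"):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for the demo; a real check would load the site's own file,
# for example with parser.set_url(".../robots.txt") followed by parser.read().
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(SAMPLE_ROBOTS, "https://example.com/blog"))
print(is_allowed(SAMPLE_ROBOTS, "https://example.com/private/page"))
```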

🧠 Lessons Learned & Practical Tips:

  • Google rate limits are real—use pauses between requests.
  • Always scrape responsibly to avoid getting IP-blocked.
  • Results vary widely based on online activity and SEO.
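
The pauses mentioned above could look something like this: retry a call a few times, waiting longer after each failure and adding a little random jitter. It's a minimal sketch with made-up defaults (2 seconds base, doubling each time), not a tuned recipe, and `call_with_pauses` is a hypothetical helper.

```python
import random
import time


def backoff_delays(base=2.0, factor=2.0, retries=4, jitter=0.5):
    """Yield `retries` growing delays in seconds, each with up to `jitter` extra."""
    delay = base
    for _ in range(retries):
        yield delay + random.uniform(0, jitter)
        delay *= factor


def call_with_pauses(func, retries=4, sleep=time.sleep):
    """Call func(); on an exception, pause and retry, up to `retries` attempts."""
    delays = list(backoff_delays(retries=retries))
    for attempt, delay in enumerate(delays, start=1):
        try:
            return func()
        except Exception:
            if attempt == len(delays):
                raise  # out of attempts: surface the last error
            sleep(delay)


# Demo with a function that fails twice before succeeding. The sleep is
# replaced by list.append so the example finishes instantly.
attempts = {"count": 0}


def flaky_search():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("temporarily blocked")
    return ["https://github.com/bohdanai"]


recorded_pauses = []
print(call_with_pauses(flaky_search, sleep=recorded_pauses.append))
print(len(recorded_pauses), "pauses taken")
```

In the script, the wrapped call might be `lambda: list(search(name, num_results=10))`.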

🔮 Where This Could Go Next:

  • Using official APIs instead of scraping.
  • Sentiment analysis on collected mentions.
  • Visualizing your digital presence.
  • Automated monitoring for new mentions.
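
As a taste of the first idea, GitHub's public REST API exposes user profiles at `https://api.github.com/users/<username>`, so the GitHub part of the lookup doesn't need Google at all. The sketch below uses only the standard library; the function names and the `osint-demo` User-Agent are assumptions, and unauthenticated requests are subject to GitHub's rate limits.

```python
import json
import urllib.request
from urllib.parse import quote

GITHUB_USER_ENDPOINT = "https://api.github.com/users/{username}"


def fetch_github_user(username, timeout=5):
    """Fetch a public profile from the GitHub REST API (unauthenticated)."""
    request = urllib.request.Request(
        GITHUB_USER_ENDPOINT.format(username=quote(username, safe="")),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "osint-demo",  # assumed identifier for this sketch
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def summarize_github_user(payload):
    """Keep only the few profile fields the OSINT report cares about."""
    return {
        "login": payload.get("login"),
        "profile_url": payload.get("html_url"),
        "public_repos": payload.get("public_repos"),
        "followers": payload.get("followers"),
    }


# Made-up payload with the same field names the API returns, so the example
# runs without network access.
sample_payload = {
    "login": "bohdanai",
    "html_url": "https://github.com/bohdanai",
    "public_repos": 0,
    "followers": 0,
    "site_admin": False,
}
print(summarize_github_user(sample_payload))
```

With network access, `summarize_github_user(fetch_github_user("bohdanai"))` would return the same shape for the real profile.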

💬 Your Turn!

Have you checked your own digital footprint? Were you surprised by what you found? What enhancements or ideas would you like to see next? Let me know your thoughts in the comments!

Happy coding! 🌟👩‍💻👨‍💻
