GitHub Contributor Analysis, code via LLM

A reporter got a tip about a small tech company working closely with a government department their team was investigating, and wanted to find out who worked there, beyond what was available on LinkedIn.

The company had an active GitHub account with some public repositories. That could be a good source of information on its engineers, at least, I thought.

I knew GitHub had an API through which I could pull a list of a repository's contributors and details about them. Writing a script to do this wasn't going to be particularly difficult, perhaps an hour or two's work, but Claude, or another LLM-based product, would be able to do it much faster.
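
For reference, here is a minimal sketch of the two GitHub REST API calls involved: `GET /repos/{owner}/{repo}/contributors` for the contributor list and `GET /users/{login}` for each profile. It is not the code Claude produced. The function names, and the `get_json` parameter that lets you swap in a different fetcher, are assumptions made for illustration.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def http_get_json(url, token=None):
    """Fetch a URL and decode its JSON body (performs a real network request)."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # A token is optional for public data but raises the rate limit.
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def list_contributors(owner, repo, get_json=http_get_json, per_page=100):
    """Collect every page of GET /repos/{owner}/{repo}/contributors."""
    contributors = []
    page = 1
    while True:
        url = (
            f"{API_ROOT}/repos/{owner}/{repo}/contributors"
            f"?per_page={per_page}&page={page}"
        )
        batch = get_json(url)
        contributors.extend(batch)
        if len(batch) < per_page:  # a short or empty page means we reached the end
            return contributors
        page += 1


def get_profile(login, get_json=http_get_json):
    """Fetch public profile fields (name, company, blog, ...) from GET /users/{login}."""
    return get_json(f"{API_ROOT}/users/{login}")
```

Calling `list_contributors("openai", "whisper")` would hit the live API. To send a token, pass something like `get_json=lambda url: http_get_json(url, token=my_token)`.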

So I instructed Claude with an incredibly simple prompt:

I want Python code to look at a GitHub repo (public) using the API, find all the contributors, get their names, links to social media, etc., and order them by most recent commits. I'd like this saved as a CSV.

Its output was fantastic. A couple of minutes of finessing gave me methods I could call to get a list of contributors to a code repository, as well as biographical information and links to their online profiles:

import os

REPO_URL = "https://github.com/openai/whisper"

class GitHubContributorAnalyzer:
    """
    A class to analyze contributors of a GitHub repository,
    including fetching their names, social media links, and sorting by most recent commits.
    """
   
...

REPO_OWNER, REPO_NAME = extract_repo_info(REPO_URL)

# Optional: Add your GitHub token here for higher rate limits
GITHUB_TOKEN = os.getenv("GITHUB_TOKEN")

# Create analyzer and run analysis
analyzer = GitHubContributorAnalyzer(GITHUB_TOKEN)
contributors = analyzer.analyze_repository(REPO_OWNER, REPO_NAME)
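
The excerpt above doesn't show the body of `extract_repo_info`. As a rough idea of what a helper like that might do, here is a sketch that splits a repository URL into its owner and name. The name `parse_repo_url` and its error handling are hypothetical, not taken from the original script.

```python
from urllib.parse import urlparse


def parse_repo_url(repo_url):
    """Split a GitHub repository URL into an (owner, name) tuple."""
    # "/openai/whisper/" -> ["openai", "whisper"]
    parts = urlparse(repo_url).path.strip("/").split("/")
    if len(parts) < 2 or not all(parts[:2]):
        raise ValueError(f"Not a repository URL: {repo_url!r}")
    owner, name = parts[0], parts[1]
    if name.endswith(".git"):
        # Clone URLs often carry a ".git" suffix that the API does not expect.
        name = name[: -len(".git")]
    return owner, name
```

With the URL used above, `parse_repo_url("https://github.com/openai/whisper")` returns `("openai", "whisper")`.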

I entered the organization's GitHub URL into REPO_URL and got my list as a spreadsheet. The example here uses OpenAI's Whisper repository (OpenAI is not the organization the reporter was looking into!):

| Username | Name | Email | Bio | Company | Location | Avatar URL | Profile URL | Contributions | Latest Commit Date | Social Website | Social GitHub | Social Twitter | Social LinkedIn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| jongwook | Jong Wook Kim | jongwook@nyu.edu | | OpenAI | San Francisco, CA | avatar | GitHub | 69 | 2025-01-04T20:56:16Z | Website | GitHub | | |
| cclauss | Christian Clauss | cclauss@me.com | Working hard to find and fix bugs in software... | Christian Clauss | Switzerland | avatar | GitHub | 3 | 2025-01-04T09:38:35Z | | GitHub | | |
| Purfview | | | Pro observer of the earthlings... Contact: purfview@protonmail.com | | Aldebaran | avatar | GitHub | 1 | 2024-12-01T05:47:01Z | | GitHub | | |
| lvaughn | Lowell Vaughn | lowell@vaughnresearch.com | | | | avatar | GitHub | 1 | 2024-11-26T17:37:01Z | | GitHub | | |
| YuZekai | YuZekai | yuzekai@hdu.edu.cn | | Hangzhou Dianzi University | Hangzhou, Zhejiang | avatar | GitHub | 1 | 2024-11-13T00:35:54Z | | GitHub | | |
| BotMaster3000 | BotMaster3000 | | | | | avatar | GitHub | 1 | 2024-11-04T07:00:30Z | | GitHub | | |
| kittsil | | | | | | avatar | GitHub | 1 | 2024-10-26T14:17:31Z | | GitHub | | |
| xingjianan | Jianan Xing | | | | | avatar | GitHub | 1 | 2024-09-10T16:53:08Z | | GitHub | | |
| ryanheise | | | | | Sydney, Australia | avatar | GitHub | 4 | 2023-12-18T20:11:16Z | | GitHub | | |
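
The ordering and CSV export the prompt asked for can be sketched with the standard library alone. This is an illustration, not the generated script: the function names are hypothetical, and it assumes each contributor is a dict keyed by the column headers above, with ISO 8601 timestamps like the ones GitHub returns.

```python
import csv
from datetime import datetime, timezone


def sort_by_latest_commit(rows):
    """Return rows newest commit first; rows without a date go last."""
    def commit_time(row):
        stamp = row.get("Latest Commit Date")
        if not stamp:
            return datetime.min.replace(tzinfo=timezone.utc)
        # GitHub uses a trailing "Z" for UTC; rewrite it as an explicit offset.
        return datetime.fromisoformat(stamp.replace("Z", "+00:00"))
    return sorted(rows, key=commit_time, reverse=True)


def write_csv(rows, fieldnames, out):
    """Write rows to an open text file, leaving missing fields blank."""
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(rows)
```

To save a file, open it with `open("contributors.csv", "w", newline="", encoding="utf-8")` and pass the handle as `out`. The filename is only an example.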

The whole thing took just a few minutes and gave the reporter's team a list of all contributors to the target company's top repository, as well as social and contact details.

Going further, an LLM could parse blog posts, social media content and similar material to find stories, or to draw knowledge graphs of the links between people.


To do this yourself, first clone my repository (try GitHub Desktop). Open main.py and set REPO_URL to the URL of the GitHub repository you want to learn about, add a GitHub API token to your .env, and then run uv run main.py in your Terminal.
