Building a network explorer to find small, active accounts on X
July 08, 2024
I recently decided to become more active on X and ran into a familiar algo-feed issue: everything in my feed was either super viral or from massive accounts.
I figured if I could find more small, active accounts to interact with, I'd get more unique content and more dialogue vs. passive consumption.
Getting data without scraping or paying for it
To start, I grabbed all the accounts followed by the accounts that I follow: my second-degree following. After Elon took over, X made its API pricing prohibitive (it would have cost $40k+ to get the data I needed). For this (non-commercial!) project, I can deal with rate limits and latency. I just need endpoint access, so I used a reverse-engineered version of the client-side API.
I was able to pull cookies from 7 different accounts, but I still had to wait ~5 seconds between requests to avoid rate limits. Originally, I intended to grab 500 following lists (~250k accounts) and 1 page of tweets for each account. This would have taken an entire month at 100% uptime. So, instead, I used the following lists to map out the follow density for each prospective account (i.e., how many of the accounts I follow also follow them), and then filtered out anyone with fewer than 3 mutual follows (irrelevant) or more than 3k followers (too big to care about me).
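The density-and-filter step above can be sketched in a few lines of Python. This is a minimal illustration, not the code I actually ran: the function names, the shape of `following_lists` (a dict from each account I follow to the ids it follows), and the `follower_counts` lookup are all assumptions made for the example.

```python
from collections import Counter


def follow_density(following_lists):
    """Count how many of the accounts I follow also follow each candidate.

    following_lists: hypothetical dict {account_i_follow: [ids they follow]}.
    """
    density = Counter()
    for their_following in following_lists.values():
        # set() so a duplicated id inside one list is only counted once
        density.update(set(their_following))
    return density


def filter_candidates(density, follower_counts, already_following=(),
                      min_mutuals=3, max_followers=3000):
    """Drop accounts with too few mutual follows or too many followers."""
    keep = []
    for account, mutuals in density.items():
        if account in already_following:
            continue  # not a *new* account
        if mutuals < min_mutuals:
            continue  # fewer than 3 mutual follows: irrelevant
        # Unknown follower counts are treated as "too big" (an assumption)
        if follower_counts.get(account, float("inf")) > max_followers:
            continue
        keep.append(account)
    # Highest density first, ties broken alphabetically
    return sorted(keep, key=lambda a: (-density[a], a))
```

The thresholds are keyword arguments so they are easy to loosen or tighten without touching the counting step.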
This still took ~12 days, but after filtering for active accounts (more than 5 self-authored tweets in the last 60 days), I was left with 8,703 new accounts to consider following.
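For reference, the activity check is simple enough to write down. A rough sketch, assuming each tweet is a dict with a timezone-aware `created_at` datetime and an `is_retweet` flag (both field names are made up for this example, not what the client-side API returns):

```python
from datetime import datetime, timedelta, timezone


def is_active(tweets, now, min_tweets=5, window_days=60):
    """True if the account has MORE than min_tweets self-authored tweets
    inside the last window_days days."""
    cutoff = now - timedelta(days=window_days)
    recent_originals = [
        t for t in tweets
        if not t["is_retweet"] and t["created_at"] >= cutoff
    ]
    return len(recent_originals) > min_tweets
```

Passing `now` in explicitly (e.g. `datetime.now(timezone.utc)`) keeps the check reproducible when re-running it over data collected across many days.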
Visualizing cultural clusters in a force-directed graph
8k accounts is too many to personally vet, so I built some visuals to help me focus on specific communities. I tried a few different libraries, but they had garbage graphics engines and could barely render 5% of my data. Eventually, I started searching for libraries specifically built on WebGL and found Cosmograph.
Unfortunately, Cosmograph doesn't support weighted links (links carry direction only, no magnitude), so my first graphs were very clumpy and round. I tried to fix this by adding multiple links between nodes (e.g., 6 repeat links instead of a single link with strength = 6), but this increased my data size ~6x to roughly 100 MB and made the graph way harder to read.
As a fix, I pre-processed my data with the Louvain community detection algorithm so that I could give preference to links within communities.
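One way to "give preference" to in-community links, sketched under assumptions: the community labels are already computed (for example with a Louvain implementation such as NetworkX's `louvain_communities`), and preference simply means separating intra-community links from cross-community ones so the former can be kept and the latter thinned out or dropped. The function name and data shapes here are illustrative, not the exact pre-processing I used.

```python
def split_links(links, community_of):
    """Split (source, target) links into intra- and cross-community lists.

    links: iterable of (source, target) pairs.
    community_of: hypothetical dict {node_id: community_label} produced
    beforehand by a Louvain run.
    """
    intra, cross = [], []
    for source, target in links:
        src_comm = community_of.get(source)
        dst_comm = community_of.get(target)
        # Nodes without a label never count as sharing a community
        if src_comm is not None and src_comm == dst_comm:
            intra.append((source, target))
        else:
            cross.append((source, target))
    return intra, cross
```

Feeding the layout only the `intra` list (plus, optionally, a small sample of `cross`) pulls each community together without repeating links.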
After that, I added some styling, node sizing, and tools to see the account represented by a given graph node. The only other notable feature was a follower slider so that I could jump down to smaller accounts. 2k followers is a lot in certain contexts, and I found some accounts in the 2k-3k range seemed to have already hit escape velocity.
Grading accounts in my CLI
Last thing. I needed a faster way to mark accounts so I could follow them later. Here, I used a tiny CLI program to spit out account data and accept a tiered ranking (i.e., "a, b, c"). This way I could take a community cluster, mark the accounts I liked, then run a script to follow them.
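A grading loop like this doesn't need much. Here's a minimal sketch of the idea, assuming each account is a dict with `handle`, `followers`, `mutuals`, and `bio` keys (hypothetical field names) and adding an `s` option to skip, which is my own addition for the example:

```python
def grade_accounts(accounts, ask=input, valid=("a", "b", "c", "s")):
    """Print each account, prompt for a tier, and return {handle: grade}.

    ask is injectable so the loop can be driven by a script or a test
    instead of the keyboard.
    """
    grades = {}
    for account in accounts:
        print(f"@{account['handle']}  followers={account['followers']}  "
              f"mutuals={account['mutuals']}")
        print(f"  {account['bio']}")
        answer = ask("grade [a/b/c, s=skip]: ").strip().lower()
        while answer not in valid:
            # Re-prompt until the input is one of the allowed tiers
            answer = ask("please enter a, b, c, or s: ").strip().lower()
        if answer != "s":
            grades[account["handle"]] = answer
    return grades
```

The returned dict can be dumped to JSON and handed to whatever script does the actual following, e.g. everything graded "a" first.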
Use the explorer
If you want to use the full explorer, you can find it here. Safari hasn't integrated the latest version of WebGL, so you'll need to use Chrome on desktop (macOS yes, iOS no) for it to run properly.