The Bluesky Profile Posts Scraper makes it easy to extract complete post data from any Bluesky profile, including text, media, and engagement metrics. It solves the challenge of manually collecting public Bluesky content for analytics, research, and automation workflows. This tool delivers clean, structured JSON that’s ready for dashboards, machine learning, or social trend analysis.
Created by Bitbash, built to showcase our approach to Scraping and Automation!
This project retrieves posts from Bluesky user profiles and outputs detailed, structured data. It helps analysts, creators, and developers gather insights from Bluesky activity without manual effort.
- Automatically collects posts, engagement metrics, and author details.
- Supports large-scale data gathering with reliable, consistent output.
- Ideal for social listening, content research, reporting, or archiving.
- Captures images, videos, and embedded media alongside text.
- Provides standardized JSON for easy integration into analytic workflows.
| Feature | Description |
|---|---|
| Comprehensive Post Capture | Extracts text, images, videos, engagement counts, timestamps, and URIs. |
| Fast Data Extraction | Efficiently retrieves large volumes of posts from multiple profiles. |
| Media Retrieval | Captures media such as images, thumbnails, and embedded video playlists. |
| Clean Structured Output | Delivers unified JSON for direct use in analysis or automation. |
| Author Metadata | Retrieves profile information including handle, display name, avatar, and DID. |
| Field Name | Field Description |
|---|---|
| likeCount | Number of likes received by the post. |
| replyCount | Total replies associated with the post. |
| repostCount | Number of reposts. |
| quoteCount | Number of times the post has been quoted by other posts. |
| indexedAt | ISO 8601 timestamp (UTC) of when the post was indexed. |
| uri | Unique Bluesky URI for the post. |
| author | Object containing author DID, handle, display name, avatar, and metadata. |
| text | Full text content of the post. |
| embed | Object containing embedded media URLs, such as a video `playlist` (`.m3u8`) and its `thumbnail`. |
[
  {
    "likeCount": 215,
    "quoteCount": 1,
    "replyCount": 19,
    "repostCount": 12,
    "indexedAt": "2025-02-15T07:18:17.051Z",
    "uri": "at://did:plc:cy4af3hlkdaht7wltvdmc35k/app.bsky.feed.post/3li76g6lq722r",
    "author": {
      "did": "did:plc:cy4af3hlkdaht7wltvdmc35k",
      "handle": "t3.gg",
      "displayName": "Theo",
      "avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:cy4af3hlkdaht7wltvdmc35k/bafkreiczc675kmmavcc4pyzhaixqkkucdahhxqw25xrlp2rt4cajdmojfm@jpeg",
      "associated": {
        "chat": {
          "allowIncoming": "following"
        }
      },
      "labels": [],
      "createdAt": "2023-04-12T03:02:52.540Z"
    },
    "text": "I made a new search engine. Kind of.\n\nIntroducing unduck.link, my DuckDuckGo replacement :)",
    "embed": {
      "playlist": "https://video.bsky.app/watch/did%3Aplc%3Acy4af3hlkdaht7wltvdmc35k/bafkreido5hly55f4t5fbjqtmkhcycc44lgy5bozvntvuzkqmltoedleygm/playlist.m3u8",
      "thumbnail": "https://video.bsky.app/watch/did%3Aplc%3Acy4af3hlkdaht7wltvdmc35k/bafkreido5hly55f4t5fbjqtmkhcycc44lgy5bozvntvuzkqmltoedleygm/thumbnail.jpg"
    }
  }
]
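A record like the one above can be post-processed with a few lines of JavaScript. The sketch below is illustrative only and is not part of the project's source: `postWebUrl` and `totalEngagement` are hypothetical helper names. It assumes the `uri` field follows the `at://<did>/app.bsky.feed.post/<rkey>` pattern shown in the sample and that the count fields are numbers.

```javascript
// Hypothetical helpers for working with one scraped post record.

// Turn the post's AT URI into a bsky.app web link.
// Prefers the author's handle and falls back to the DID from the URI.
function postWebUrl(post) {
  const match = /^at:\/\/([^/]+)\/app\.bsky\.feed\.post\/([^/]+)$/.exec(post.uri ?? "");
  if (!match) return null; // not a post URI in the expected shape
  const [, did, rkey] = match;
  const actor = post.author?.handle ?? did;
  return `https://bsky.app/profile/${actor}/post/${rkey}`;
}

// Sum the four engagement counters, treating missing fields as 0.
function totalEngagement(post) {
  return (
    (post.likeCount ?? 0) +
    (post.replyCount ?? 0) +
    (post.repostCount ?? 0) +
    (post.quoteCount ?? 0)
  );
}

// Trimmed copy of the sample record above.
const sample = {
  likeCount: 215,
  quoteCount: 1,
  replyCount: 19,
  repostCount: 12,
  uri: "at://did:plc:cy4af3hlkdaht7wltvdmc35k/app.bsky.feed.post/3li76g6lq722r",
  author: { handle: "t3.gg" },
};

console.log(postWebUrl(sample)); // https://bsky.app/profile/t3.gg/post/3li76g6lq722r
console.log(totalEngagement(sample)); // 247
```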
Bluesky Profile Posts Scraper/
├── src/
│   ├── runner.js
│   ├── extractors/
│   │   ├── bluesky_parser.js
│   │   └── utils_media.js
│   ├── outputs/
│   │   └── exporters.js
│   └── config/
│       └── settings.example.json
├── data/
│   ├── inputs.sample.txt
│   └── sample.json
├── package.json
├── requirements.txt
└── README.md
- Researchers extract Bluesky discussions to analyze social trends and public sentiment.
- Marketing teams collect engagement metrics to track influencer activity and campaign performance.
- Developers integrate post data into dashboards or automation tools to streamline reporting.
- Content creators monitor competitors’ posts to discover new ideas and benchmark engagement.
- Data analysts build datasets for training models or conducting behavioral analysis.
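As a rough illustration of the engagement-tracking use case, the sketch below rolls scraped posts up into one summary row per author. It assumes the output is an array of records with the fields listed in the field table; `summarizeByAuthor` is a hypothetical name, and the example records are made up for demonstration.

```javascript
// Hypothetical aggregation over scraper output: one summary row per author handle.
function summarizeByAuthor(posts) {
  const byHandle = new Map();
  for (const post of posts) {
    const handle = post.author?.handle ?? "unknown";
    const row = byHandle.get(handle) ?? { handle, posts: 0, likes: 0, replies: 0, reposts: 0, quotes: 0 };
    row.posts += 1;
    row.likes += post.likeCount ?? 0;
    row.replies += post.replyCount ?? 0;
    row.reposts += post.repostCount ?? 0;
    row.quotes += post.quoteCount ?? 0;
    byHandle.set(handle, row);
  }
  // Add a simple average so profiles with different post volumes can be compared.
  return [...byHandle.values()].map((row) => ({ ...row, avgLikes: row.likes / row.posts }));
}

// Made-up records, for illustration only.
const examplePosts = [
  { author: { handle: "alice.example" }, likeCount: 10, replyCount: 2, repostCount: 1, quoteCount: 0 },
  { author: { handle: "alice.example" }, likeCount: 30, replyCount: 0, repostCount: 4, quoteCount: 1 },
  { author: { handle: "bob.example" }, likeCount: 5, replyCount: 1, repostCount: 0, quoteCount: 0 },
];

console.table(summarizeByAuthor(examplePosts));
```

The resulting rows can be written to CSV or fed to a dashboard as they are; which exporter formats the project itself supports is not stated here.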
Q: Does the scraper retrieve media like images and videos? Yes — it captures embedded media such as playlist URLs, thumbnails, and image attachments.
Q: How many posts can it extract per profile? It supports large-scale extraction and can process extensive post histories within rate limits.
Q: What output format does it use? All results are exported in structured JSON for easy integration.
Q: Does it require API access? No API keys are required; it works directly with publicly available profile data.
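For context on how public posts can be read without credentials and paged within rate limits, here is a minimal sketch. It is an assumption about one possible approach, not a description of this project's internals: it calls Bluesky's public `app.bsky.feed.getAuthorFeed` endpoint and follows the `cursor` value until enough posts are collected. `fetchProfilePosts` and its options are hypothetical names.

```javascript
// Assumed approach, not the project's confirmed method: read a profile's
// public feed from the Bluesky AppView and paginate with the returned cursor.
const PUBLIC_FEED_URL = "https://public.api.bsky.app/xrpc/app.bsky.feed.getAuthorFeed";

async function fetchProfilePosts(actor, { maxPosts = 200, fetchImpl = fetch } = {}) {
  const posts = [];
  let cursor;
  while (posts.length < maxPosts) {
    const url = new URL(PUBLIC_FEED_URL);
    url.searchParams.set("actor", actor); // handle or DID
    // The endpoint accepts at most 100 items per request.
    url.searchParams.set("limit", String(Math.min(100, maxPosts - posts.length)));
    if (cursor) url.searchParams.set("cursor", cursor);

    const res = await fetchImpl(url);
    if (!res.ok) throw new Error(`getAuthorFeed failed with HTTP ${res.status}`);
    const page = await res.json();

    // Each feed item wraps a post view; raw views keep the text under `record.text`.
    for (const item of page.feed) posts.push(item.post);
    cursor = page.cursor;
    if (!cursor || page.feed.length === 0) break; // no more pages
  }
  return posts.slice(0, maxPosts);
}

// Usage (performs real network requests; requires Node 18+ for global fetch):
// fetchProfilePosts("t3.gg", { maxPosts: 50 }).then((posts) => console.log(posts.length));
```

Passing `fetchImpl` keeps the function testable offline and leaves room to wrap requests with retry or backoff logic when rate limits are hit.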
- Primary Metric: Capable of processing hundreds of posts per minute with efficient batching.
- Reliability Metric: Maintains a high success rate across diverse user profiles with stable output formatting.
- Efficiency Metric: Optimized to minimize redundant network calls and reduce resource use during large crawls.
- Quality Metric: Provides high data completeness with consistent extraction of metadata, media, and engagement fields.
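The batching and redundant-call points can be pictured with a small sketch. This is a generic pattern under stated assumptions, not the project's actual implementation: duplicate handles are collapsed before any request is made, and profiles are processed a few at a time. `processInBatches`, `worker`, and `batchSize` are hypothetical names.

```javascript
// Generic batching sketch: dedupe inputs, then run a bounded number of
// profile jobs concurrently so a large crawl does not fire every request at once.
async function processInBatches(handles, worker, batchSize = 5) {
  // Handles are domain names, so they can be compared case-insensitively.
  const unique = [...new Set(handles.map((h) => h.trim().toLowerCase()).filter(Boolean))];
  const results = new Map();
  for (let i = 0; i < unique.length; i += batchSize) {
    const batch = unique.slice(i, i + batchSize);
    // allSettled keeps one failing profile from discarding the rest of the batch.
    const settled = await Promise.allSettled(batch.map((handle) => worker(handle)));
    settled.forEach((outcome, j) => results.set(batch[j], outcome));
  }
  return results;
}

// Usage with a stand-in worker; a real worker would fetch and parse one profile.
// processInBatches(["t3.gg", "T3.gg ", "bsky.app"], async (h) => `scraped ${h}`, 2)
//   .then((results) => console.log([...results.keys()])); // [ 't3.gg', 'bsky.app' ]
```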
