#️⃣ Hashtag Scraping
Scrape all tweets using a specific hashtag from X/Twitter with full metadata, engagement metrics, and flexible export options.
📦 What You Get
- Complete hashtag search - All tweets containing your target hashtag
- Author information - Username, display name, verification status
- Tweet text content with hashtags preserved
- Timestamps (posted date/time)
- Engagement metrics (likes, retweets, replies, views)
- Media URLs (images, videos, GIFs)
- Additional hashtags and mentions in each tweet
- URLs shared in tweets
- Filter by Top or Latest
- Export to JSON or CSV
💡 Use Cases
- Track marketing campaigns - Monitor your branded hashtags and campaign performance
- Trend monitoring - Follow trending topics and emerging conversations
- Competitive analysis - Analyze competitor campaigns and audience engagement
- Research & analytics - Study hashtag usage patterns and sentiment
- Content curation - Find the best content for newsletters and reports
- Influencer discovery - Identify key voices in specific hashtag communities
- Event tracking - Monitor live events, conferences, and product launches
- Crisis monitoring - Track brand mentions during PR situations
🌐 Example 1: Browser Console (Quick)
Best for: Quickly scraping hashtag tweets from search results, up to ~200 tweets
Steps:
- Go to x.com/search
- Enter your hashtag (e.g., #AI or #crypto)
- Select a tab (Top or Latest)
- Open browser console (F12 → Console tab)
- Paste the code below and press Enter
// ============================================
// XActions - Hashtag Scraper (Browser Console)
// Go to: x.com/search?q=%23YOUR_HASHTAG
// Open console (F12), paste this
// Author: nich (@nichxbt)
// ============================================
(async () => {
const TARGET_COUNT = 200; // Adjust target tweet count
const SCROLL_DELAY = 2000; // ms between scrolls (2 sec recommended)
// Extract hashtag from current URL
const urlParams = new URLSearchParams(window.location.search);
const searchQuery = urlParams.get('q') || '';
const searchFilter = urlParams.get('f') || 'top'; // top, live (latest)
// Find hashtag in query
const hashtagMatch = searchQuery.match(/#\w+/);
const targetHashtag = hashtagMatch ? hashtagMatch[0].toLowerCase() : searchQuery;
console.log('#️⃣ Starting hashtag scrape...');
console.log(`🏷️ Hashtag: ${targetHashtag}`);
console.log(`📋 Filter: ${searchFilter === 'live' ? 'Latest' : 'Top'}`);
console.log(`📊 Target: ${TARGET_COUNT} tweets`);
const tweets = new Map();
let retries = 0;
const maxRetries = 15;
// Helper to parse count strings like "1.2K", "45M"
const parseCount = (str) => {
if (!str) return 0;
str = str.trim().replace(/,/g, '');
if (str.endsWith('K')) return Math.round(parseFloat(str) * 1000);
if (str.endsWith('M')) return Math.round(parseFloat(str) * 1000000);
if (str.endsWith('B')) return Math.round(parseFloat(str) * 1000000000);
return parseInt(str) || 0;
};
// Extract tweet data from article elements
const extractTweets = () => {
const articles = document.querySelectorAll('article[data-testid="tweet"]');
const extracted = [];
articles.forEach(article => {
try {
// Get tweet ID from the tweet link
const tweetLink = article.querySelector('a[href*="/status/"]');
const href = tweetLink?.getAttribute('href') || '';
const statusMatch = href.match(/\/status\/(\d+)/);
const tweetId = statusMatch ? statusMatch[1] : null;
if (!tweetId) return;
// Get author info
const userLinks = article.querySelectorAll('a[href^="/"][role="link"]');
let author = null;
for (const link of userLinks) {
const linkHref = link.getAttribute('href') || '';
if (linkHref.match(/^\/[a-zA-Z0-9_]+$/) && !linkHref.includes('/status/')) {
author = linkHref.slice(1);
break;
}
}
// Get display name
const nameEl = article.querySelector('[data-testid="User-Name"]');
const displayName = nameEl?.querySelector('span')?.textContent?.trim() || null;
// Check if verified
const verified = !!article.querySelector('[data-testid="icon-verified"]') ||
!!article.querySelector('svg[aria-label*="Verified"]');
// Get tweet text
const textEl = article.querySelector('[data-testid="tweetText"]');
const text = textEl?.textContent?.trim() || '';
// Get timestamp
const timeEl = article.querySelector('time');
const timestamp = timeEl?.getAttribute('datetime') || null;
const displayTime = timeEl?.textContent?.trim() || null;
// Get engagement metrics
const replyBtn = article.querySelector('[data-testid="reply"]');
const retweetBtn = article.querySelector('[data-testid="retweet"]');
const likeBtn = article.querySelector('[data-testid="like"]');
const viewsEl = article.querySelector('a[href*="/analytics"]');
const replies = parseCount(replyBtn?.textContent);
const retweets = parseCount(retweetBtn?.textContent);
const likes = parseCount(likeBtn?.textContent);
const views = parseCount(viewsEl?.textContent);
// Get media (images, videos, GIFs)
const mediaUrls = [];
// Images
const images = article.querySelectorAll('[data-testid="tweetPhoto"] img');
images.forEach(img => {
const src = img.getAttribute('src');
if (src && src.includes('pbs.twimg.com/media')) {
const highRes = src.replace(/&name=\w+/, '&name=large');
mediaUrls.push({
type: 'image',
url: highRes,
});
}
});
// Videos/GIFs
const videos = article.querySelectorAll('video');
videos.forEach(video => {
const poster = video.getAttribute('poster');
const src = video.querySelector('source')?.getAttribute('src');
mediaUrls.push({
type: video.closest('[data-testid="videoPlayer"]') ? 'video' : 'gif',
url: src || poster || null,
thumbnail: poster,
});
});
// Extract all hashtags from text
const hashtags = (text.match(/#\w+/g) || []).map(h => h.toLowerCase());
// Extract mentions from text
const mentions = (text.match(/@\w+/g) || []).map(m => m.toLowerCase());
// Extract URLs from tweet
const urlElements = article.querySelectorAll('[data-testid="tweetText"] a[href^="http"]');
const urls = Array.from(urlElements).map(a => a.getAttribute('href')).filter(Boolean);
// Check if it's a retweet
const socialContext = article.querySelector('[data-testid="socialContext"]');
const isRetweet = socialContext?.textContent?.toLowerCase().includes('reposted') || false;
extracted.push({
id: tweetId,
url: `https://x.com/${author}/status/${tweetId}`,
author,
displayName,
verified,
text,
timestamp,
displayTime,
replies,
retweets,
likes,
views,
media: mediaUrls,
hashtags,
mentions,
urls,
isRetweet,
targetHashtag: targetHashtag,
scrapedAt: new Date().toISOString(),
});
} catch (e) {
console.warn('Error extracting tweet:', e);
}
});
return extracted;
};
// Scroll and collect tweets
while (tweets.size < TARGET_COUNT && retries < maxRetries) {
const extracted = extractTweets();
const prevSize = tweets.size;
extracted.forEach(tweet => {
if (!tweets.has(tweet.id)) {
tweets.set(tweet.id, tweet);
}
});
console.log(`📊 Collected: ${tweets.size}/${TARGET_COUNT} tweets`);
if (tweets.size === prevSize) {
retries++;
console.log(`⏳ No new tweets found, retry ${retries}/${maxRetries}`);
} else {
retries = 0;
}
// Scroll to load more
window.scrollTo(0, document.body.scrollHeight);
await new Promise(r => setTimeout(r, SCROLL_DELAY));
}
// Convert to array and sort by engagement
const results = Array.from(tweets.values())
.sort((a, b) => (b.likes + b.retweets) - (a.likes + a.retweets));
console.log(`\n✅ Scraping complete!`);
console.log(`📊 Total tweets: ${results.length}`);
console.log(`🏷️ Hashtag: ${targetHashtag}`);
// Calculate stats
const totalLikes = results.reduce((sum, t) => sum + t.likes, 0);
const totalRetweets = results.reduce((sum, t) => sum + t.retweets, 0);
const totalReplies = results.reduce((sum, t) => sum + t.replies, 0);
const uniqueAuthors = new Set(results.map(t => t.author)).size;
const verifiedCount = results.filter(t => t.verified).length;
console.log(`\n📈 Stats for ${targetHashtag}:`);
console.log(` 👥 Unique authors: ${uniqueAuthors}`);
console.log(` ✅ Verified accounts: ${verifiedCount}`);
console.log(` ❤️ Total likes: ${totalLikes.toLocaleString()}`);
console.log(` 🔄 Total retweets: ${totalRetweets.toLocaleString()}`);
console.log(` 💬 Total replies: ${totalReplies.toLocaleString()}`);
// Find top tweet
if (results.length > 0) {
const topTweet = results[0];
console.log(`\n🏆 Top Tweet:`);
console.log(` @${topTweet.author}: "${topTweet.text.substring(0, 80)}..."`);
console.log(` ❤️ ${topTweet.likes.toLocaleString()} likes`);
}
// Export functions
window.hashtagData = results;
window.downloadJSON = () => {
const blob = new Blob([JSON.stringify(results, null, 2)], { type: 'application/json' });
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = `hashtag_${targetHashtag.replace('#', '')}_${Date.now()}.json`;
a.click();
URL.revokeObjectURL(url);
console.log('📁 JSON downloaded!');
};
window.downloadCSV = () => {
const headers = ['id', 'url', 'author', 'displayName', 'verified', 'text', 'timestamp',
'likes', 'retweets', 'replies', 'views', 'hashtags', 'mentions'];
const csvRows = [headers.join(',')];
results.forEach(tweet => {
const row = [
tweet.id,
tweet.url,
tweet.author,
`"${(tweet.displayName || '').replace(/"/g, '""')}"`,
tweet.verified,
`"${tweet.text.replace(/"/g, '""').replace(/\n/g, ' ')}"`,
tweet.timestamp,
tweet.likes,
tweet.retweets,
tweet.replies,
tweet.views,
`"${tweet.hashtags.join(', ')}"`,
`"${tweet.mentions.join(', ')}"`,
];
csvRows.push(row.join(','));
});
const blob = new Blob([csvRows.join('\n')], { type: 'text/csv' });
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = `hashtag_${targetHashtag.replace('#', '')}_${Date.now()}.csv`;
a.click();
URL.revokeObjectURL(url);
console.log('📁 CSV downloaded!');
};
window.copyToClipboard = () => {
navigator.clipboard.writeText(JSON.stringify(results, null, 2));
console.log('📋 Copied to clipboard!');
};
console.log(`\n💾 Export Options:`);
console.log(` downloadJSON() - Save as JSON file`);
console.log(` downloadCSV() - Save as CSV file`);
console.log(` copyToClipboard() - Copy to clipboard`);
console.log(` hashtagData - Access raw data array`);
return results;
})();
📊 Sample Output
{
"id": "1234567890123456789",
"url": "https://x.com/username/status/1234567890123456789",
"author": "username",
"displayName": "Display Name",
"verified": true,
"text": "This is an amazing post about #AI and the future of technology! 🚀",
"timestamp": "2025-12-15T14:30:00.000Z",
"displayTime": "Dec 15",
"replies": 42,
"retweets": 156,
"likes": 1234,
"views": 45000,
"media": [
{
"type": "image",
"url": "https://pbs.twimg.com/media/xxxxx?format=jpg&name=large"
}
],
"hashtags": ["#ai", "#technology", "#future"],
"mentions": ["@openai", "@anthropic"],
"urls": ["https://example.com/article"],
"isRetweet": false,
"targetHashtag": "#ai",
"scrapedAt": "2025-12-15T15:00:00.000Z"
}
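Once a record like the sample above is in hand, simple derived metrics are easy to compute. Below is a small sketch of an engagement-rate helper; `engagementRate` is a hypothetical name (not part of the scraper), and the record shape matches the sample output:

```javascript
// Hypothetical helper: engagement rate for one scraped tweet record.
// Record shape matches the sample output above.
function engagementRate(tweet) {
  const interactions = tweet.likes + tweet.retweets + tweet.replies;
  if (!tweet.views) return null; // views can be 0 when X hides the counter
  return interactions / tweet.views;
}

const sample = { likes: 1234, retweets: 156, replies: 42, views: 45000 };
console.log((engagementRate(sample) * 100).toFixed(2) + '%'); // prints "3.18%"
```

Returning `null` (rather than `Infinity` or `0`) for missing view counts keeps downstream averages honest.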
🖥️ Example 2: Node.js with Puppeteer (Production)
Best for: Automated hashtag monitoring, large-scale scraping, scheduled jobs, CI/CD pipelines
Prerequisites
npm install puppeteer
Full Script
Create a file called hashtag-scraper.js:
// ============================================
// XActions - Hashtag Scraper (Node.js + Puppeteer)
// Production-ready hashtag scraping script
// Author: nich (@nichxbt)
// ============================================
const puppeteer = require('puppeteer');
const fs = require('fs');
const path = require('path');
// Configuration
const CONFIG = {
hashtag: process.argv[2] || '#AI', // Pass hashtag as argument
filter: process.argv[3] || 'top', // 'top' or 'latest'
targetCount: parseInt(process.argv[4]) || 100, // Number of tweets to scrape
scrollDelay: 2000, // ms between scrolls
maxRetries: 15, // Max retries when no new tweets
headless: true, // Run in headless mode
outputDir: './output', // Output directory
cookiesPath: './cookies.json', // Optional: path to cookies file
};
// Ensure hashtag starts with #
if (!CONFIG.hashtag.startsWith('#')) {
CONFIG.hashtag = '#' + CONFIG.hashtag;
}
// Build search URL
function buildSearchUrl(hashtag, filter) {
const encodedHashtag = encodeURIComponent(hashtag);
let url = `https://x.com/search?q=${encodedHashtag}&src=typed_query`;
if (filter === 'latest') {
url += '&f=live';
} else if (filter === 'top') {
url += '&f=top';
}
return url;
}
// Main scraper function
async function scrapeHashtag() {
console.log('#️⃣ XActions Hashtag Scraper');
console.log('================================');
console.log(`🏷️ Hashtag: ${CONFIG.hashtag}`);
console.log(`📋 Filter: ${CONFIG.filter}`);
console.log(`📊 Target: ${CONFIG.targetCount} tweets`);
console.log(`🔇 Headless: ${CONFIG.headless}`);
console.log('');
// Launch browser
const browser = await puppeteer.launch({
headless: CONFIG.headless,
args: [
'--no-sandbox',
'--disable-setuid-sandbox',
'--disable-dev-shm-usage',
'--disable-accelerated-2d-canvas',
'--no-first-run',
'--no-zygote',
'--disable-gpu',
'--window-size=1920,1080',
],
defaultViewport: {
width: 1920,
height: 1080,
},
});
const page = await browser.newPage();
// Set user agent
await page.setUserAgent(
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
);
// Load cookies if available
if (fs.existsSync(CONFIG.cookiesPath)) {
try {
const cookies = JSON.parse(fs.readFileSync(CONFIG.cookiesPath, 'utf8'));
await page.setCookie(...cookies);
console.log('🍪 Loaded cookies from file');
} catch (e) {
console.warn('⚠️ Could not load cookies:', e.message);
}
}
try {
const searchUrl = buildSearchUrl(CONFIG.hashtag, CONFIG.filter);
console.log(`🌐 Navigating to: ${searchUrl}`);
await page.goto(searchUrl, {
waitUntil: 'networkidle2',
timeout: 60000,
});
// Wait for tweets to load
await page.waitForSelector('article[data-testid="tweet"]', { timeout: 30000 });
console.log('✅ Page loaded, starting scrape...\n');
const tweets = new Map();
let retries = 0;
// Scraping loop
while (tweets.size < CONFIG.targetCount && retries < CONFIG.maxRetries) {
// Extract tweets from page
const extracted = await page.evaluate(() => {
const parseCount = (str) => {
if (!str) return 0;
str = str.trim().replace(/,/g, '');
if (str.endsWith('K')) return Math.round(parseFloat(str) * 1000);
if (str.endsWith('M')) return Math.round(parseFloat(str) * 1000000);
if (str.endsWith('B')) return Math.round(parseFloat(str) * 1000000000);
return parseInt(str) || 0;
};
const articles = document.querySelectorAll('article[data-testid="tweet"]');
const results = [];
articles.forEach(article => {
try {
// Get tweet ID
const tweetLink = article.querySelector('a[href*="/status/"]');
const href = tweetLink?.getAttribute('href') || '';
const statusMatch = href.match(/\/status\/(\d+)/);
const tweetId = statusMatch ? statusMatch[1] : null;
if (!tweetId) return;
// Get author info
const userLinks = article.querySelectorAll('a[href^="/"][role="link"]');
let author = null;
for (const link of userLinks) {
const linkHref = link.getAttribute('href') || '';
if (linkHref.match(/^\/[a-zA-Z0-9_]+$/) && !linkHref.includes('/status/')) {
author = linkHref.slice(1);
break;
}
}
// Get display name
const nameEl = article.querySelector('[data-testid="User-Name"]');
const displayName = nameEl?.querySelector('span')?.textContent?.trim() || null;
// Check if verified
const verified = !!article.querySelector('[data-testid="icon-verified"]') ||
!!article.querySelector('svg[aria-label*="Verified"]');
// Get tweet text
const textEl = article.querySelector('[data-testid="tweetText"]');
const text = textEl?.textContent?.trim() || '';
// Get timestamp
const timeEl = article.querySelector('time');
const timestamp = timeEl?.getAttribute('datetime') || null;
const displayTime = timeEl?.textContent?.trim() || null;
// Get engagement metrics
const replyBtn = article.querySelector('[data-testid="reply"]');
const retweetBtn = article.querySelector('[data-testid="retweet"]');
const likeBtn = article.querySelector('[data-testid="like"]');
const viewsEl = article.querySelector('a[href*="/analytics"]');
const replies = parseCount(replyBtn?.textContent);
const retweets = parseCount(retweetBtn?.textContent);
const likes = parseCount(likeBtn?.textContent);
const views = parseCount(viewsEl?.textContent);
// Get media
const mediaUrls = [];
const images = article.querySelectorAll('[data-testid="tweetPhoto"] img');
images.forEach(img => {
const src = img.getAttribute('src');
if (src && src.includes('pbs.twimg.com/media')) {
const highRes = src.replace(/&name=\w+/, '&name=large');
mediaUrls.push({ type: 'image', url: highRes });
}
});
const videos = article.querySelectorAll('video');
videos.forEach(video => {
const poster = video.getAttribute('poster');
const src = video.querySelector('source')?.getAttribute('src');
mediaUrls.push({
type: video.closest('[data-testid="videoPlayer"]') ? 'video' : 'gif',
url: src || poster || null,
thumbnail: poster,
});
});
// Extract hashtags, mentions, URLs
const hashtags = (text.match(/#\w+/g) || []).map(h => h.toLowerCase());
const mentions = (text.match(/@\w+/g) || []).map(m => m.toLowerCase());
const urlEls = article.querySelectorAll('[data-testid="tweetText"] a[href^="http"]');
const urls = Array.from(urlEls).map(a => a.getAttribute('href')).filter(Boolean);
// Check if retweet
const socialContext = article.querySelector('[data-testid="socialContext"]');
const isRetweet = socialContext?.textContent?.toLowerCase().includes('reposted') || false;
results.push({
id: tweetId,
url: `https://x.com/${author}/status/${tweetId}`,
author,
displayName,
verified,
text,
timestamp,
displayTime,
replies,
retweets,
likes,
views,
media: mediaUrls,
hashtags,
mentions,
urls,
isRetweet,
scrapedAt: new Date().toISOString(),
});
} catch (e) {
// Skip problematic tweets
}
});
return results;
});
const prevSize = tweets.size;
extracted.forEach(tweet => {
if (!tweets.has(tweet.id)) {
tweet.targetHashtag = CONFIG.hashtag;
tweets.set(tweet.id, tweet);
}
});
const progress = Math.min(100, Math.round((tweets.size / CONFIG.targetCount) * 100));
process.stdout.write(`\r📊 Progress: ${tweets.size}/${CONFIG.targetCount} tweets (${progress}%)`);
if (tweets.size === prevSize) {
retries++;
} else {
retries = 0;
}
// Scroll to load more tweets
await page.evaluate(() => {
window.scrollTo(0, document.body.scrollHeight);
});
await new Promise(r => setTimeout(r, CONFIG.scrollDelay));
}
console.log('\n');
// Convert to array and sort
const results = Array.from(tweets.values())
.sort((a, b) => (b.likes + b.retweets) - (a.likes + a.retweets));
// Calculate and display stats
const stats = {
hashtag: CONFIG.hashtag,
filter: CONFIG.filter,
totalTweets: results.length,
uniqueAuthors: new Set(results.map(t => t.author)).size,
verifiedAuthors: results.filter(t => t.verified).length,
totalLikes: results.reduce((sum, t) => sum + t.likes, 0),
totalRetweets: results.reduce((sum, t) => sum + t.retweets, 0),
totalReplies: results.reduce((sum, t) => sum + t.replies, 0),
totalViews: results.reduce((sum, t) => sum + t.views, 0),
averageLikes: Math.round(results.reduce((sum, t) => sum + t.likes, 0) / results.length) || 0,
averageRetweets: Math.round(results.reduce((sum, t) => sum + t.retweets, 0) / results.length) || 0,
tweetsWithMedia: results.filter(t => t.media.length > 0).length,
originalTweets: results.filter(t => !t.isRetweet).length,
scrapedAt: new Date().toISOString(),
};
console.log('✅ Scraping complete!');
console.log('================================');
console.log(`🏷️ Hashtag: ${stats.hashtag}`);
console.log(`📊 Total tweets: ${stats.totalTweets}`);
console.log(`👥 Unique authors: ${stats.uniqueAuthors}`);
console.log(`✅ Verified accounts: ${stats.verifiedAuthors}`);
console.log(`❤️ Total likes: ${stats.totalLikes.toLocaleString()}`);
console.log(`🔄 Total retweets: ${stats.totalRetweets.toLocaleString()}`);
console.log(`💬 Total replies: ${stats.totalReplies.toLocaleString()}`);
console.log(`👀 Total views: ${stats.totalViews.toLocaleString()}`);
console.log(`📸 Tweets with media: ${stats.tweetsWithMedia}`);
console.log(`📝 Original tweets: ${stats.originalTweets}`);
// Find top performers
if (results.length > 0) {
console.log('\n🏆 Top 3 Tweets by Engagement:');
results.slice(0, 3).forEach((tweet, i) => {
console.log(` ${i + 1}. @${tweet.author} - ❤️ ${tweet.likes.toLocaleString()} | 🔄 ${tweet.retweets.toLocaleString()}`);
console.log(` "${tweet.text.substring(0, 60)}..."`);
});
}
// Ensure output directory exists
if (!fs.existsSync(CONFIG.outputDir)) {
fs.mkdirSync(CONFIG.outputDir, { recursive: true });
}
const timestamp = Date.now();
const safeHashtag = CONFIG.hashtag.replace('#', '').replace(/[^a-zA-Z0-9]/g, '_');
// Save JSON
const jsonPath = path.join(CONFIG.outputDir, `hashtag_${safeHashtag}_${timestamp}.json`);
fs.writeFileSync(jsonPath, JSON.stringify({ stats, tweets: results }, null, 2));
console.log(`\n📁 JSON saved: ${jsonPath}`);
// Save CSV
const csvPath = path.join(CONFIG.outputDir, `hashtag_${safeHashtag}_${timestamp}.csv`);
const headers = ['id', 'url', 'author', 'displayName', 'verified', 'text', 'timestamp',
'likes', 'retweets', 'replies', 'views', 'hashtags', 'mentions', 'hasMedia', 'isRetweet'];
const csvRows = [headers.join(',')];
results.forEach(tweet => {
const row = [
tweet.id,
tweet.url,
tweet.author,
`"${(tweet.displayName || '').replace(/"/g, '""')}"`,
tweet.verified,
`"${tweet.text.replace(/"/g, '""').replace(/\n/g, ' ')}"`,
tweet.timestamp,
tweet.likes,
tweet.retweets,
tweet.replies,
tweet.views,
`"${tweet.hashtags.join(', ')}"`,
`"${tweet.mentions.join(', ')}"`,
tweet.media.length > 0,
tweet.isRetweet,
];
csvRows.push(row.join(','));
});
fs.writeFileSync(csvPath, csvRows.join('\n'));
console.log(`📁 CSV saved: ${csvPath}`);
// Save stats summary
const statsPath = path.join(CONFIG.outputDir, `hashtag_${safeHashtag}_${timestamp}_stats.json`);
fs.writeFileSync(statsPath, JSON.stringify(stats, null, 2));
console.log(`📁 Stats saved: ${statsPath}`);
return { stats, tweets: results };
} catch (error) {
console.error('\n❌ Error during scraping:', error.message);
throw error;
} finally {
await browser.close();
console.log('\n🔒 Browser closed');
}
}
// Run the scraper
scrapeHashtag()
.then(({ stats }) => {
console.log(`\n✨ Successfully scraped ${stats.totalTweets} tweets for ${stats.hashtag}`);
process.exit(0);
})
.catch(error => {
console.error('Failed:', error.message);
process.exit(1);
});
🚀 Usage
# Basic usage (default: #AI, top tweets, 100 tweets)
node hashtag-scraper.js
# Scrape specific hashtag
node hashtag-scraper.js "#crypto"
# Scrape with filter (top or latest)
node hashtag-scraper.js "#bitcoin" latest
# Specify number of tweets
node hashtag-scraper.js "#ethereum" top 500
# Scrape latest tweets with custom count
node hashtag-scraper.js "#defi" latest 200
📂 Output Structure
output/
├── hashtag_AI_1735689600000.json # Full data with stats
├── hashtag_AI_1735689600000.csv # CSV export
└── hashtag_AI_1735689600000_stats.json # Summary statistics
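The naming scheme above can be reproduced with a small helper. `outputName` is a hypothetical function that mirrors the `safeHashtag` sanitization used in `hashtag-scraper.js`:

```javascript
// Sketch of the output file naming scheme, mirroring the safeHashtag
// logic in hashtag-scraper.js: strip '#', replace non-alphanumerics with '_'.
function outputName(hashtag, timestamp, ext, suffix = '') {
  const safe = hashtag.replace('#', '').replace(/[^a-zA-Z0-9]/g, '_');
  return `hashtag_${safe}_${timestamp}${suffix}.${ext}`;
}

console.log(outputName('#AI', 1735689600000, 'json'));           // hashtag_AI_1735689600000.json
console.log(outputName('#AI', 1735689600000, 'json', '_stats')); // hashtag_AI_1735689600000_stats.json
```

Sanitizing the hashtag this way keeps filenames safe on every filesystem, even for hashtags containing unicode or punctuation.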
🔐 Using Cookies for Authentication
For better results and higher rate limits, export your Twitter cookies:
- Login to X/Twitter in Chrome
- Open DevTools (F12) → Application → Cookies to view your session cookies
- Use a cookie export extension to save them as JSON (DevTools has no built-in JSON export)
- Save as cookies.json in the script directory
// Cookies file format (cookies.json)
[
{
"name": "auth_token",
"value": "your_auth_token_here",
"domain": ".x.com",
"path": "/",
"secure": true,
"httpOnly": true
},
{
"name": "ct0",
"value": "your_ct0_token_here",
"domain": ".x.com",
"path": "/",
"secure": true,
"httpOnly": true
}
]
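Before passing a cookies file to `page.setCookie()`, a quick shape check avoids confusing Puppeteer errors. This is an illustrative sketch with a hypothetical `validCookies` helper; Puppeteer will also reject malformed entries on its own:

```javascript
// Minimal sanity check on a parsed cookies.json before page.setCookie().
// Each entry needs at least string name, value, and domain fields.
function validCookies(cookies) {
  if (!Array.isArray(cookies)) return false;
  return cookies.every(c =>
    typeof c.name === 'string' &&
    typeof c.value === 'string' &&
    typeof c.domain === 'string'
  );
}

const cookies = [{ name: 'auth_token', value: 'x', domain: '.x.com' }];
console.log(validCookies(cookies)); // prints true
```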
🎯 Popular Hashtag Search Patterns
📈 Trend Monitoring
# Single trending hashtag
node hashtag-scraper.js "#trending" latest 200
# Tech trends
node hashtag-scraper.js "#AI" top 100
node hashtag-scraper.js "#crypto" latest 100
node hashtag-scraper.js "#Web3" top 100
🏢 Brand & Campaign Tracking
# Track branded hashtags
node hashtag-scraper.js "#YourBrandName" latest 500
# Product launch monitoring
node hashtag-scraper.js "#ProductLaunch2025" latest 200
# Event hashtags
node hashtag-scraper.js "#TechConference2025" latest 300
🔍 Competitive Analysis
# Monitor competitor campaigns
node hashtag-scraper.js "#CompetitorBrand" top 100
# Industry hashtags
node hashtag-scraper.js "#SaaS" latest 200
node hashtag-scraper.js "#StartupLife" top 100
📊 Advanced Analysis Examples
JavaScript: Analyze Scraped Data
// Load scraped data
const data = require('./output/hashtag_AI_1735689600000.json');
const tweets = data.tweets;
// Find most engaging tweets
const topByLikes = [...tweets].sort((a, b) => b.likes - a.likes).slice(0, 10);
console.log('Top 10 by likes:', topByLikes.map(t => ({ author: t.author, likes: t.likes })));
// Find most active authors
const authorCounts = tweets.reduce((acc, t) => {
acc[t.author] = (acc[t.author] || 0) + 1;
return acc;
}, {});
const topAuthors = Object.entries(authorCounts)
.sort((a, b) => b[1] - a[1])
.slice(0, 10);
console.log('Most active authors:', topAuthors);
// Find co-occurring hashtags
const coHashtags = tweets.reduce((acc, t) => {
t.hashtags.forEach(h => {
if (h !== data.stats.hashtag.toLowerCase()) {
acc[h] = (acc[h] || 0) + 1;
}
});
return acc;
}, {});
const topCoHashtags = Object.entries(coHashtags)
.sort((a, b) => b[1] - a[1])
.slice(0, 10);
console.log('Related hashtags:', topCoHashtags);
// Engagement by time of day
const byHour = tweets.reduce((acc, t) => {
const hour = new Date(t.timestamp).getHours();
acc[hour] = acc[hour] || { count: 0, likes: 0 };
acc[hour].count++;
acc[hour].likes += t.likes;
return acc;
}, {});
console.log('Engagement by hour:', byHour);
// Verified vs non-verified engagement
const verifiedLikes = tweets.filter(t => t.verified).reduce((s, t) => s + t.likes, 0);
const nonVerifiedLikes = tweets.filter(t => !t.verified).reduce((s, t) => s + t.likes, 0);
console.log('Verified account likes:', verifiedLikes);
console.log('Non-verified account likes:', nonVerifiedLikes);
Python: Data Analysis
import json
import pandas as pd
from collections import Counter
# Load data
with open('output/hashtag_AI_1735689600000.json', 'r') as f:
data = json.load(f)
tweets = data['tweets']
df = pd.DataFrame(tweets)
# Basic stats
print(f"Total tweets: {len(df)}")
print(f"Unique authors: {df['author'].nunique()}")
print(f"Average likes: {df['likes'].mean():.2f}")
print(f"Average retweets: {df['retweets'].mean():.2f}")
# Top authors by tweet count
top_authors = df['author'].value_counts().head(10)
print("\nTop authors by tweet count:")
print(top_authors)
# Top tweets by engagement
df['engagement'] = df['likes'] + df['retweets'] + df['replies']
top_tweets = df.nlargest(10, 'engagement')[['author', 'text', 'engagement']]
print("\nTop tweets by engagement:")
print(top_tweets)
# Hashtag co-occurrence
all_hashtags = [h for hashtags in df['hashtags'] for h in hashtags]
hashtag_counts = Counter(all_hashtags)
print("\nMost common co-occurring hashtags:")
print(hashtag_counts.most_common(10))
# Verified account analysis
verified_df = df[df['verified'] == True]
print(f"\nVerified accounts: {len(verified_df)}")
print(f"Verified avg engagement: {verified_df['engagement'].mean():.2f}")
💡 Tips & Best Practices
🚀 Performance Tips
- Start with "Top" filter - Gets most engaging content first
- Use Latest for real-time - Monitor live events and breaking news
- Set realistic targets - Start with 100-200 tweets, scale up gradually
- Add delays - 2-3 seconds between scrolls to avoid rate limiting
⚠️ Rate Limiting & Anti-Detection
- Don't scrape too aggressively (use delays)
- Use authenticated sessions for higher limits
- Rotate user agents if doing large-scale scraping
- Consider proxy rotation for extensive monitoring
- Break large scrapes into smaller batches
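One way to scrape less aggressively is to grow the scroll delay when the page stops yielding new tweets. The scripts above use a fixed 2-second delay; the sketch below shows an exponential-backoff-with-jitter alternative (`backoffDelay` is a hypothetical helper, not part of the scrapers):

```javascript
// Sketch: exponential backoff with jitter for the scroll delay.
// Delay doubles per retry, capped at maxMs, with up to +30% random jitter.
function backoffDelay(retries, baseMs = 2000, maxMs = 30000) {
  const exp = Math.min(maxMs, baseMs * 2 ** retries);
  const jitter = Math.random() * 0.3 * exp;
  return Math.min(maxMs, Math.round(exp + jitter));
}

// In the scraping loop, replace the fixed SCROLL_DELAY with:
// await new Promise(r => setTimeout(r, backoffDelay(retries)));
```

The jitter makes the request pattern look less mechanical, which tends to help with rate limiting.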
🔐 Authentication Benefits
Using cookies/authentication provides:
- Higher rate limits
- Access to more tweets
- Better data quality
- Less chance of blocking
📊 Data Quality Tips
// Filter out retweets for original content only
const originalTweets = tweets.filter(t => !t.isRetweet);
// Filter by minimum engagement
const qualityTweets = tweets.filter(t => t.likes >= 10);
// Get only verified accounts
const verifiedTweets = tweets.filter(t => t.verified);
// Get tweets with media
const mediaTweets = tweets.filter(t => t.media.length > 0);
// Get tweets from last 24 hours
const recentTweets = tweets.filter(t => {
const tweetDate = new Date(t.timestamp);
const dayAgo = new Date(Date.now() - 24 * 60 * 60 * 1000);
return tweetDate > dayAgo;
});
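When breaking a large scrape into smaller batches (as suggested above), the runs can be merged back into one deduplicated set keyed by tweet id, the same approach the scrapers use internally with their `Map`. `mergeBatches` is a hypothetical helper:

```javascript
// Sketch: merge several scrape runs into one deduplicated array,
// keeping the first occurrence of each tweet id.
function mergeBatches(...batches) {
  const byId = new Map();
  for (const batch of batches) {
    for (const tweet of batch) {
      if (!byId.has(tweet.id)) byId.set(tweet.id, tweet);
    }
  }
  return Array.from(byId.values());
}

const runA = [{ id: '1' }, { id: '2' }];
const runB = [{ id: '2' }, { id: '3' }];
console.log(mergeBatches(runA, runB).length); // prints 3
```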
🌐 Website Alternative
Don't want to run scripts? Use our web app:
xactions.app
- ✅ No coding required
- ✅ Visual hashtag search interface
- ✅ Real-time hashtag monitoring
- ✅ One-click export to CSV/JSON
- ✅ Save hashtag tracking templates
- ✅ Schedule recurring hashtag scrapes
- ✅ Advanced filtering and analytics
- ✅ Trend visualization dashboard
- ✅ Compare multiple hashtags
📚 Related Examples
- Search Tweets - Advanced search with operators
- Tweet Scraping - Scrape tweets from profiles
- Followers Scraping - Get follower lists
- Profile Scraping - Extract profile data
- Trend Analysis - Monitor account changes
🔗 Resources
Author: nich (@nichxbt)
License: MIT
⚡ Ready to try #️⃣ Hashtag Scraping?
XActions is 100% free and open-source. No API keys, no fees, no signup.
Browse All Scripts