This project is now deprecated and succeeded by Trawler, which is written in C# and aims to offer more and better features than this one.
A 'smol' program that crawls the following/follower/status counts from a Twitter account's profile page using Selenium and stores the crawled data in a MySQL database using PyMySQL.
The purpose of this program is to record the follower count daily and see how it changes from day to day. THIS IS PROBABLY NOT PRODUCTION-READY AND SHOULD BE USED FOR RESEARCH PURPOSES ONLY. Use it with caution!
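As a rough illustration of the kind of analysis the recorded data allows, the sketch below computes day-over-day follower changes from rows that have already been fetched. The function name `daily_changes` and the sample numbers are hypothetical and not part of this project.

```python
from datetime import date


def daily_changes(rows):
    """Return (date, delta) pairs for consecutive records.

    `rows` is an iterable of (date, follower_count) tuples; it is
    sorted by date first, so the input order does not matter.
    """
    ordered = sorted(rows, key=lambda row: row[0])
    changes = []
    for (_, prev_count), (day, count) in zip(ordered, ordered[1:]):
        changes.append((day, count - prev_count))
    return changes


# Made-up sample data, for illustration only.
sample = [
    (date(2023, 1, 1), 100),
    (date(2023, 1, 2), 104),
    (date(2023, 1, 3), 101),
]
for day, delta in daily_changes(sample):
    print(f"{day}: {delta:+d}")
```

The first record has no predecessor, so it produces no entry; three records yield two changes.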
Yes, I used the Twitter API at first, but one day Twitter suspended my API application, even though I didn't overuse or abuse it! Probably this is an Elon thing.
The source code of the original implementation, which uses the Twitter API via python-twitter, is stored in the old branch.
A Dockerfile is provided in both the current and the old (original) source trees.
To build:
$ cd <root-directory-of-source>
$ docker build -t twitter-account-data-crawler:latest .
After the build, run:
$ docker run -d \
--name twitter-account-data-crawler \
-v <path-of-config.yaml>:/app/config/config.yaml \
twitter-account-data-crawler
You have to prepare a configuration file (config.yaml). Please refer to the example config file and create your own.
If you're using Podman, just replace docker with podman in the commands above.
You can still run the program without Docker or another OCI-compliant runtime.
To get this to work:
$ cd <root-directory-of-source>
# Install requirements
$ pip install -r requirements.txt
# and run!
$ python index.py
The configuration file (config.yaml) must exist in the config folder.
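If you want a clearer error when the file is missing, a check along these lines could run before anything else starts. This is only a sketch: `find_config` is a hypothetical helper, not a function from this project, and it assumes the config folder sits directly under the directory you pass in.

```python
from pathlib import Path


def find_config(base_dir="."):
    """Return the path to config/config.yaml under base_dir.

    Raises FileNotFoundError with a readable message if it is missing.
    """
    path = Path(base_dir) / "config" / "config.yaml"
    if not path.is_file():
        raise FileNotFoundError(f"configuration file not found: {path}")
    return path


if __name__ == "__main__":
    try:
        print(f"using {find_config()}")
    except FileNotFoundError as err:
        print(err)
```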
Currently only MySQL (and probably MySQL-based DBMSs such as MariaDB) is supported.
Creating one table per target account is recommended.
Each table should have at least these columns:
- date: type date
- following_count: type int, unsigned
- follower_count: type int, unsigned
- tweet_count: type int, unsigned
An example SQL statement that creates a table with these columns:
CREATE TABLE `account_track_table` (
`date` date NOT NULL,
`following_count` int UNSIGNED NOT NULL,
`follower_count` int UNSIGNED NOT NULL,
`tweet_count` int UNSIGNED NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
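For reference, writing one daily record into such a table from Python might look like the sketch below. It is not the project's actual code: `build_insert` and `save_record` are hypothetical names, and `connection` is assumed to be an open PyMySQL connection. The values are passed as `%s` parameters, while the table name is validated separately because identifiers cannot be parameterized.

```python
import re
from datetime import date

# Column order matches the example table above.
COLUMNS = ("date", "following_count", "follower_count", "tweet_count")


def build_insert(table, day, following, followers, tweets):
    """Return (sql, params) for inserting one daily record.

    Table names cannot be passed as query parameters, so the name is
    checked against a conservative pattern before being interpolated.
    """
    if not re.fullmatch(r"[A-Za-z0-9_]+", table):
        raise ValueError(f"unsafe table name: {table!r}")
    column_list = ", ".join(f"`{name}`" for name in COLUMNS)
    placeholders = ", ".join(["%s"] * len(COLUMNS))
    sql = f"INSERT INTO `{table}` ({column_list}) VALUES ({placeholders})"
    return sql, (day, following, followers, tweets)


def save_record(connection, table, day, following, followers, tweets):
    """Execute the insert on an open PyMySQL connection and commit."""
    sql, params = build_insert(table, day, following, followers, tweets)
    with connection.cursor() as cursor:
        cursor.execute(sql, params)
    connection.commit()


# Made-up values, for illustration only.
sql, params = build_insert("account_track_table", date(2023, 1, 1), 10, 200, 3000)
print(sql)
print(params)
```

With PyMySQL installed, `connection` would come from `pymysql.connect(host=..., user=..., password=..., database=...)`, using the values from your own configuration.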