+name: Generate DB and Create Release
+
+on:
+  push:
+    branches:
+      - main
+  schedule:
+    - cron: '0 0 * * *' # Run daily at midnight UTC
+  workflow_dispatch: # Allows triggering the workflow manually from the GitHub UI
+
+jobs:
+  build_and_release:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4 # Checkout the repository code
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.x' # Use your desired Python version
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt
+
+      - name: Create data directory
+        run: mkdir -p data # Ensure the data directory exists
+
+      - name: Run database generation script
+        run: python scripts/download_issues.py # Execute your local script
+
+      - name: Get current date for release name
+        id: get_date
+        run: echo "RELEASE_DATE=$(date +'%Y-%m-%d-%H-%M')" >> $GITHUB_ENV # Set environment variable with current date and time
+
+      - name: Create Release and Upload Database File
+        uses: softprops/action-gh-release@v1
+        with:
+          # Use the current date for the release name
+          name: Database Release ${{ env.RELEASE_DATE }}
+          # A unique tag is required for releases.
+          # We'll use the date and the run ID to ensure uniqueness.
+          tag_name: db-${{ env.RELEASE_DATE }}-${{ github.run_id }}
+          # Attach the generated database file to the release
+          files: data/github.db
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # GitHub automatically provides this token
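
The release name and tag in the workflow above are built from a minute-resolution timestamp plus the run ID. A minimal Python sketch of the same composition, assuming the runner clock is in UTC; `release_date` and `release_tag` are hypothetical helper names used only for illustration, and the run ID shown is a made-up value:

```python
from datetime import datetime, timezone


def release_date(now):
    """Format a datetime the same way as `date +'%Y-%m-%d-%H-%M'`."""
    return now.strftime("%Y-%m-%d-%H-%M")


def release_tag(now, run_id):
    """Append the run ID so two runs started in the same minute get distinct tags."""
    return f"db-{release_date(now)}-{run_id}"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # 123456789 stands in for github.run_id; it is not a real run
    print(release_tag(now, run_id=123456789))
```

Because the timestamp alone only resolves to the minute, the run ID is what keeps a manual dispatch and a scheduled run from colliding if they start close together.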

@@ -172,3 +172,6 @@ cython_debug/
 
 # PyPI configuration file
 .pypirc
+
+# Where we download the data
+data

+# sqlite3 ships with the Python standard library and cannot be installed with pip
+pandas
+github-to-sqlite
+rich

+from contextlib import closing
+from pathlib import Path
+from subprocess import run
+import sqlite3
+
+import pandas as pd
+from rich.progress import track
+
+here = Path(__file__).parent
+
+
+def df_from_sql(query, db):
+    """Run a SQL query against the SQLite file `db` and return a DataFrame."""
+    # closing() closes the connection even if the query raises
+    with closing(sqlite3.connect(db)) as con:
+        return pd.read_sql(query, con)
+
+
+def download_issues_data(org, db):
+    # Load all repositories of the organization into a local DB
+    # Build the command as a list so a path containing spaces is not split
+    cmd = ["github-to-sqlite", "repos", str(db), org]
+    print(" ".join(cmd))
+    run(cmd)
+    repos = df_from_sql("SELECT * FROM repos;", db)
+
+    # For each repository, download its issues
+    repos = repos["full_name"].tolist()
+    print(f"Downloading issues from {len(repos)} repositories...")
+    for repo in track(repos):
+        cmd = ["github-to-sqlite", "issues", str(db), repo]
+        print(" ".join(cmd))
+        run(cmd)
+    print(f"Finished loading new issues to {db}")
+
+
+path_out = (here / ".." / "data" / "github.db").resolve()
+print(f"Downloading to {path_out}")
+download_issues_data("jupyter-book", path_out)
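
Before `data/github.db` is attached to a release, it may be worth confirming that the file actually holds data. A minimal sketch using only the standard library, assuming the tables are named `repos` (as queried in the script) and `issues` (an assumption about what `github-to-sqlite issues` writes); `table_counts` is a hypothetical helper, not part of this change:

```python
import sqlite3
from contextlib import closing


def table_counts(db_path, tables=("repos", "issues")):
    """Return {table: row count}, with None for tables missing from the file."""
    counts = {}
    with closing(sqlite3.connect(db_path)) as con:
        # Collect the names of all tables that exist in the database file
        existing = {
            row[0]
            for row in con.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
        }
        for table in tables:
            if table in existing:
                # Identifiers cannot be bound as parameters, so the name is quoted;
                # it has already been matched against the tables found above
                counts[table] = con.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
            else:
                counts[table] = None
    return counts
```

A caller could fail the job when any count is `None` or zero, so that an empty database is not published.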