| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +A Ruby gem that downloads postal/zipcode data from GeoNames.org, processes it via an ETL pipeline, and outputs an SQLite3 database and optional CSV files. Supports single-country or all-countries processing. |
| 8 | + |
| 9 | +## Commands |
| 10 | + |
| 11 | +```bash |
| 12 | +# Install dependencies (vendored to vendor/bundle, binstubs in stubs/) |
| 13 | +bundle install |
| 14 | + |
| 15 | +# Run all tests |
| 16 | +bundle exec rspec |
| 17 | + |
| 18 | +# Run a single test file |
| 19 | +bundle exec rspec spec/path/to/file_spec.rb |
| 20 | + |
| 21 | +# Run a specific test by line number |
| 22 | +bundle exec rspec spec/path/to/file_spec.rb:42 |
| 23 | + |
| 24 | +# Lint |
| 25 | +bundle exec rubocop |
| 26 | + |
| 27 | +# Lint with auto-correct |
| 28 | +bundle exec rubocop -a |
| 29 | + |
| 30 | +# Version bumping (do on develop branch, not master) |
| 31 | +bundle exec rake version:bump_patch |
| 32 | +bundle exec rake version:bump_minor |
| 33 | +bundle exec rake version:bump_major |
| 34 | + |
| 35 | +# Build and install gem |
| 36 | +bundle exec rake build |
| 37 | +bundle exec rake install |
| 38 | + |
| 39 | +# Release gem |
| 40 | +bundle exec rake release |
| 41 | +``` |
| 42 | + |
| 43 | +## Architecture |
| 44 | + |
| 45 | +The gem follows an ETL (Extract, Transform, Load) pattern using the Kiba gem: |
| 46 | + |
| 47 | +1. **Extract**: `DataSource` downloads zip files from GeoNames.org, extracts them, and prepares CSV files with headers |
| 48 | +2. **Source**: `CsvSource` (Kiba source) feeds rows from the prepared CSV into the pipeline |
| 49 | +3. **Load**: Four Kiba destination classes (`CountryTable`, `StateTable`, `CountyTable`, `ZipcodeTable`) write rows into an in-memory SQLite database |
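
The source and destination roles in the steps above can be sketched in plain Ruby. This is a minimal illustration of the contract Kiba documents (a source responds to `each` and yields rows; a destination responds to `write(row)` and optionally `close`), not the gem's actual code. `ArraySource`, `MemoryDestination`, `run_pipeline`, and the sample rows are all hypothetical names and data invented for the example.

```ruby
# Plain-Ruby sketch of a source -> destinations pipeline.
# ArraySource, MemoryDestination and run_pipeline are hypothetical;
# they are not classes from this gem or from Kiba.

# A source exposes #each and yields one row (a Hash) at a time.
class ArraySource
  def initialize(rows)
    @rows = rows
  end

  def each(&block)
    @rows.each(&block)
  end
end

# A destination exposes #write(row) and, optionally, #close.
class MemoryDestination
  attr_reader :rows

  def initialize
    @rows = []
    @closed = false
  end

  def write(row)
    @rows << row
  end

  def close
    @closed = true
  end

  def closed?
    @closed
  end
end

# Feed every row from the source to every destination, then close them.
def run_pipeline(source, destinations)
  source.each do |row|
    destinations.each { |dest| dest.write(row) }
  end
  destinations.each(&:close)
end

# Made-up sample rows, for illustration only.
source = ArraySource.new([
  { country: 'US', state: 'CA', zipcode: '94103' },
  { country: 'US', state: 'NY', zipcode: '10001' }
])
countries = MemoryDestination.new
zipcodes = MemoryDestination.new
run_pipeline(source, [countries, zipcodes])
```

Every destination sees every row, which is why each table class has to decide for itself which columns it cares about and how to handle rows it has already stored.
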
| 50 | + |
| 51 | +### Key Flow |
| 52 | + |
| 53 | +`bin/free_zipcode_data` → `Runner#start` → `DataSource#download` → `DataSource#datafile` (extract zip + add CSV headers) → `SqliteRam` (in-memory DB) → `ETL::FreeZipcodeDataJob` (Kiba pipeline) → `SqliteRam#save_to_disk` |
| 54 | + |
| 55 | +### Core Classes |
| 56 | + |
| 57 | +- **`FreeZipcodeData::Runner`** - CLI entry point; parses args via Optimist, orchestrates the full pipeline |
| 58 | +- **`FreeZipcodeData::DataSource`** - Downloads and extracts GeoNames zip files, prepares CSV with headers |
| 59 | +- **`SqliteRam`** - Wraps SQLite3; works entirely in-memory then saves to disk via `SQLite3::Backup` |
| 60 | +- **`FreeZipcodeData::DbTable`** - Base class for all table classes; provides progress bar, SQL helpers, and country lookup from `country_lookup_table.yml` |
| 61 | +- **`FreeZipcodeData::CountryTable`/`StateTable`/`CountyTable`/`ZipcodeTable`** - Kiba destinations; each implements `build` (creates the schema and indexes) and `write` (inserts a row and swallows the constraint violation raised when the row is a duplicate) |
| 62 | +- **`ETL::FreeZipcodeDataJob`** - Configures the Kiba pipeline with one source and four destinations |
| 63 | +- **`CsvSource`** - Kiba-compatible CSV reader |
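
The "swallow duplicates in `write`" behavior described for the table classes can be illustrated without a database. The sketch below is an assumption-laden stand-in: `UniqueTable` plays the role of a table with a unique index, `DuplicateRowError` stands in for whatever constraint error the real database layer raises, and `DedupingDestination` is a hypothetical destination. None of these names come from the gem.

```ruby
require 'set'

# Hypothetical stand-in for a database unique-constraint error.
class DuplicateRowError < StandardError; end

# Hypothetical in-memory table that enforces a unique key,
# similar in spirit to a UNIQUE index.
class UniqueTable
  attr_reader :rows

  def initialize(key)
    @key = key
    @seen = Set.new
    @rows = []
  end

  def insert(row)
    value = row.fetch(@key)
    # Set#add? returns nil when the value is already present.
    raise DuplicateRowError, "duplicate #{@key}: #{value}" unless @seen.add?(value)

    @rows << row
  end
end

# Hypothetical destination whose #write ignores duplicates so that
# repeated input rows do not abort the run.
class DedupingDestination
  def initialize(table)
    @table = table
  end

  def write(row)
    @table.insert(row)
  rescue DuplicateRowError
    # The row is already stored; skip it.
    nil
  end
end

table = UniqueTable.new(:abbr)
dest = DedupingDestination.new(table)
[{ abbr: 'CA' }, { abbr: 'NY' }, { abbr: 'CA' }].each { |row| dest.write(row) }
```

Rescuing only the duplicate error keeps other failures visible, which is the main design point: a denormalized input file repeats the same country or state on many rows, and only that repetition should be ignored.
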
| 64 | + |
| 65 | +### Singletons |
| 66 | + |
| 67 | +`Options` and `Logger` are singletons (via Ruby's `Singleton` module). `Runner` also exposes an `.instance` class method, but it is only a convenience wrapper: it returns a new `Runner` on every call and does not cache it. |
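
The difference between the two kinds of `.instance` is easy to miss, so here is a small sketch. `SketchOptions` and `SketchRunner` are hypothetical stand-ins written for this example; they only mirror the behavior described above and are not the gem's classes.

```ruby
require 'singleton'

# Hypothetical stand-in for a class that includes Ruby's Singleton module.
class SketchOptions
  include Singleton

  attr_accessor :values
end

# Hypothetical stand-in for a class whose .instance simply wraps .new.
class SketchRunner
  def self.instance
    new
  end
end

# A true singleton returns the same object on every call.
same = SketchOptions.instance.equal?(SketchOptions.instance)

# A convenience wrapper returns a different object on every call.
fresh = SketchRunner.instance.equal?(SketchRunner.instance)
```

In practice this means state set on `Options.instance` is visible everywhere, while state set on one `Runner.instance` result is not shared with the next.
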
| 68 | + |
| 69 | +## Configuration |
| 70 | + |
| 71 | +- `.ruby-version`: 3.4.8 |
| 72 | +- Bundle path: `vendor/bundle` (binstubs in `stubs/`) |
| 73 | +- Environment: `APP_ENV` controls environment (`test`, `development`) |
| 74 | +- Config file: `~/.free_zipcode_data.yml` (overridable via `FZD_CONFIG_FILE` env var; uses `spec/fixtures/` version in test) |
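
As a rough sketch of the config-file override described in the list above: the helper below prefers `FZD_CONFIG_FILE` when it is set and otherwise falls back to `~/.free_zipcode_data.yml`. `config_path` and `DEFAULT_CONFIG` are hypothetical names, treating an empty variable as unset is an assumption, and the test-environment fixture lookup is deliberately left out because its path is not given here.

```ruby
# Hypothetical helper; the gem's real lookup may differ.
DEFAULT_CONFIG = '~/.free_zipcode_data.yml'

# env defaults to the process environment but accepts any Hash-like
# object, which makes the helper easy to exercise in isolation.
def config_path(env = ENV)
  override = env['FZD_CONFIG_FILE']
  # Assumption: an empty value is treated the same as an unset one.
  path = override.nil? || override.empty? ? DEFAULT_CONFIG : override
  File.expand_path(path)
end

custom = config_path({ 'FZD_CONFIG_FILE' => '/tmp/custom.yml' })
```

Calling `config_path` with no arguments reads the real environment and expands `~` to the current user's home directory.
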
| 75 | + |
| 76 | +## Rubocop |
| 77 | + |
| 78 | +Key style settings (`.rubocop.yml`): |
| 79 | +- Target Ruby 3.4 |
| 80 | +- Max line length: 110 |
| 81 | +- Max method length: 30 lines |
| 82 | +- `Style/ClassVars`, `Style/Documentation`, `Metrics/AbcSize`, `Lint/SuppressedException` disabled |
| 83 | +- `vendor/` and `stubs/` excluded |
| 84 | + |
| 85 | +## Git Workflow |
| 86 | + |
| 87 | +- `master` is the release branch |
| 88 | +- `develop` is the development branch |
| 89 | +- Bump the version on `develop`, then merge into `master` before running `rake release` |