Instapaper
diff --git a/.gitignore b/.gitignore
Lines changed: 2 additions & 0 deletions
diff --git a/BUILD.md b/BUILD.md
Lines changed: 74 additions & 0 deletions
diff --git a/LICENSE b/LICENSE
Lines changed: 21 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 250 additions & 0 deletions
@@ -0,0 +1,2 @@
+__pycache__
+dist/
@@ -0,0 +1,74 @@
+# Build Instructions
+
+## Prerequisites
+
+- Python 3.10 or later
+- pip
+
+## Setup
+
+Install the package in editable mode together with its runtime dependencies:
+
+```sh
+pip install -e .
+```
+
+For development (linting, type-checking, testing), also install the dev extras:
+
+```sh
+pip install -e ".[dev]"
+```
+
+For building and publishing, also install the build extras:
+
+```sh
+pip install -e ".[build]"
+```
+
+## Running Tests
+
+```sh
+pytest
+```
+
+With coverage:
+
+```sh
+pytest --cov=instaparser
+```
+
+## Linting & Type Checking
+
+```sh
+ruff check instaparser/
+black --check instaparser/
+mypy instaparser/
+```
+
+## Building the Package
+
+Build both the sdist and wheel:
+
+```sh
+python -m build
+```
+
+The sdist and wheel are written to the `dist/` directory.
+
+## Publishing to PyPI
+
+Upload the built artifacts with twine:
+
+```sh
+twine upload dist/*
+```
+
+To test the upload against TestPyPI first:
+
+```sh
+twine upload --repository testpypi dist/*
+```
+
+## Versioning
+
+The package version is defined in `instaparser/__init__.py` and read automatically by Hatch at build time via the `[tool.hatch.version]` configuration in `pyproject.toml`.
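Reviewer note on the Versioning section of BUILD.md: as a rough illustration of the idea, the sketch below extracts a `__version__` string from module source text with a regular expression. It is an approximation only, not Hatch's actual implementation. The `read_version` helper, the pattern, and the sample source are hypothetical, and the assumption that `instaparser/__init__.py` contains a `__version__ = "..."` assignment is not confirmed by this diff.

```python
import re

# Hypothetical pattern: matches a line such as  __version__ = "1.2.3"
VERSION_PATTERN = re.compile(
    r"""^__version__\s*=\s*['"](?P<version>[^'"]+)['"]""",
    re.MULTILINE,
)


def read_version(source: str) -> str:
    """Return the string assigned to __version__ in the given module source."""
    match = VERSION_PATTERN.search(source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group("version")


# Stand-in for the contents of instaparser/__init__.py (assumed layout).
sample_source = '"""Instaparser client."""\n\n__version__ = "0.0.0"\n'
print(read_version(sample_source))
```

Keeping the version in a single place like this means a release only has to change one line before `python -m build` is run.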
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 Instant Paper, Inc.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
@@ -0,0 +1,250 @@
+# Instaparser Python Library
+
+A Python client library for the [Instaparser API](https://www.instaparser.com), providing a simple and intuitive interface for parsing articles, generating summaries, and processing PDFs.
+
+## Installation
+
+```bash
+pip install instaparser
+```
+
+## Quick Start
+
+```python
+from instaparser import InstaparserClient
+
+# Initialize the client with your API key
+client = InstaparserClient(api_key="your-api-key")
+
+# Parse an article from a URL
+article = client.Article(url="https://example.com/article")
+
+# Access article properties
+print(article.title)
+print(article.body)  # HTML or text content
+print(article.author)
+print(article.words)
+```
+
+## Features
+
+- **Article Parsing**: Extract clean HTML or text from web articles
+- **Summary Generation**: Generate AI-powered summaries with key sentences
+- **PDF Processing**: Parse PDFs from URLs or file uploads
+- **Error Handling**: Comprehensive exception handling for API errors
+- **Type Hints**: Full type annotations for better IDE support
+
+## Usage
+
+### Article Parsing
+
+Parse articles from URLs or HTML content:
+
+```python
+from instaparser import InstaparserClient
+
+client = InstaparserClient(api_key="your-api-key")
+
+# Parse from URL (HTML output)
+article = client.Article(url="https://example.com/article")
+print(article.html)  # HTML content
+print(article.body)  # Same as html when output='html'
+
+# Parse from URL (text output)
+article = client.Article(url="https://example.com/article", output="text")
+print(article.text)  # Plain text content
+print(article.body)  # Same as text when output='text'
+
+# Parse from HTML content
+html_content = "<html><body><h1>Title</h1><p>Content</p></body></html>"
+article = client.Article(url="https://example.com/article", content=html_content)
+
+# Disable cache
+article = client.Article(url="https://example.com/article", use_cache=False)
+```
+
+### Article Properties
+
+The `Article` object provides access to all parsed metadata:
+
+```python
+article = client.Article(url="https://example.com/article")
+
+# Basic properties
+article.url          # Canonical URL
+article.title        # Article title
+article.site_name    # Website name
+article.author       # Author name
+article.date         # Published date (UNIX timestamp)
+article.description  # Article description
+article.thumbnail    # Thumbnail image URL
+article.words        # Word count
+article.is_rtl       # Right-to-left language flag
+
+# Content
+article.body         # HTML or text (depending on output format)
+article.html         # HTML content (if output='html')
+article.text         # Plain text (if output='text')
+
+# Media
+article.images       # List of images
+article.videos       # List of embedded videos
+```
+
+### Summary Generation
+
+Generate AI-powered summaries:
+
+```python
+# Generate summary
+summary = client.Summary(url="https://example.com/article")
+
+print(summary.overview)        # Concise summary
+print(summary.key_sentences)   # List of key sentences
+
+# Stream summary with callback (for real-time updates)
+def on_stream_line(line):
+    print(f"Streaming: {line}")
+
+summary = client.Summary(
+    url="https://example.com/article",
+    stream_callback=on_stream_line
+)
+```
+
+### PDF Processing
+
+Parse PDFs from URLs or files. The `PDF` class inherits from `Article`, so it exposes all the same properties:
+
+```python
+# Parse PDF from URL
+pdf = client.PDF(url="https://example.com/document.pdf")
+
+# Parse PDF from file
+with open('document.pdf', 'rb') as f:
+    pdf = client.PDF(file=f)
+
+# Parse PDF with text output
+pdf = client.PDF(url="https://example.com/document.pdf", output="text")
+print(pdf.text)
+print(pdf.body)  # Same as text when output='text'
+
+# Access all Article properties
+print(pdf.title)
+print(pdf.words)
+print(pdf.images)
+```
+
+## Error Handling
+
+The SDK provides specific exception types for different error scenarios:
+
+```python
+from instaparser import (
+    InstaparserClient,
+    InstaparserAuthenticationError,
+    InstaparserRateLimitError,
+    InstaparserValidationError,
+    InstaparserAPIError,
+)
+
+client = InstaparserClient(api_key="your-api-key")
+
+try:
+    article = client.Article(url="https://example.com/article")
+except InstaparserAuthenticationError:
+    print("Invalid API key")
+except InstaparserRateLimitError:
+    print("Rate limit exceeded")
+except InstaparserValidationError:
+    print("Invalid request parameters")
+except InstaparserAPIError as e:
+    print(f"API error: {e} (status: {e.status_code})")
+```
+
+## API Reference
+
+### InstaparserClient
+
+Main client class for interacting with the Instaparser API.
+
+#### `__init__(api_key: str)`
+
+Initialize the client.
+
+- `api_key`: Your Instaparser API key
+
+#### `Article(url: str, content: Optional[str] = None, output: str = 'html', use_cache: bool = True) -> Article`
+
+Parse an article from a URL or HTML content.
+
+- `url`: URL of the article (required)
+- `content`: Optional HTML content to parse instead of fetching from URL
+- `output`: Output format - `'html'` (default) or `'text'`
+- `use_cache`: Whether to use cache (default: `True`)
+
+Returns: `Article` object
+
+#### `Summary(url: str, content: Optional[str] = None, use_cache: bool = True, stream_callback: Optional[Callable[[str], None]] = None) -> Summary`
+
+Generate a summary of an article.
+
+- `url`: URL of the article (required)
+- `content`: Optional HTML content to parse instead of fetching from URL
+- `use_cache`: Whether to use cache (default: `True`)
+- `stream_callback`: Optional callback invoked once for each line of the streaming response. Passing a callback enables streaming mode.
+
+Returns: `Summary` object with `key_sentences` and `overview` attributes
+
+#### `PDF(url: Optional[str] = None, file: Optional[Union[BinaryIO, bytes]] = None, output: str = 'html', use_cache: bool = True) -> PDF`
+
+Parse a PDF from a URL or file.
+
+- `url`: URL of the PDF; required when parsing by URL (sent as a GET request)
+- `file`: PDF file to upload; required when parsing an uploaded file (sent as a POST request)
+- `output`: Output format - `'html'` (default) or `'text'`
+- `use_cache`: Whether to use cache (default: `True`)
+
+Returns: `PDF` object (inherits from `Article`)
+
+### Article
+
+Represents a parsed article from Instaparser.
+
+#### Properties
+
+- `url`: Canonical URL
+- `title`: Article title
+- `site_name`: Website name
+- `author`: Author name
+- `date`: Published date (UNIX timestamp)
+- `description`: Article description
+- `thumbnail`: Thumbnail image URL
+- `words`: Word count
+- `is_rtl`: Right-to-left language flag
+- `images`: List of images
+- `videos`: List of embedded videos
+- `body`: Article body (HTML or text)
+- `html`: HTML content (if output was 'html')
+- `text`: Plain text content (if output was 'text')
+
+### PDF
+
+Represents a parsed PDF from Instaparser. Inherits from `Article` and has all the same properties. PDFs always have `is_rtl=False` and `videos=[]`.
+
+### Summary
+
+Represents a summary result from Instaparser.
+
+#### Properties
+
+- `key_sentences`: List of key sentences extracted from the article
+- `overview`: Concise summary of the article
+
+## License
+
+MIT
+
+## Support
+
+For support, email support@instaparser.com or visit [https://www.instaparser.com](https://www.instaparser.com).
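Reviewer note on the README's `date` property: it is documented as a UNIX timestamp, so callers will usually want a real datetime. Below is a minimal sketch of that conversion using only the standard library. It assumes the timestamp is expressed in seconds since the epoch, which the README does not state. The `published_at` helper and the example value are hypothetical and not part of the library.

```python
from datetime import datetime, timezone


def published_at(timestamp: float) -> datetime:
    """Convert a UNIX timestamp (assumed to be in seconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


# Hypothetical value standing in for article.date.
example_date = 1700000000
print(published_at(example_date).isoformat())
```

Returning a timezone-aware value avoids ambiguity when the result is later compared with, or formatted alongside, local times.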