CMMC Watch 🛡️

Automated daily news aggregator for CMMC/NIST compliance professionals.


🎯 What is CMMC Watch?

CMMC Watch is a free, open-source news aggregator that automatically collects, categorizes, and publishes daily news on CMMC and NIST compliance topics.

Live Site: cmmcwatch.com


🚀 Quick Start

Prerequisites

Installation

# Clone the repository
git clone https://github.com/fubak/cmmcwatch.git
cd cmmcwatch

# Install dependencies
pip install -r requirements.txt

# Configure environment variables
cp .env.example .env
# Edit .env and add your API keys (see Configuration section)

# Run the pipeline
cd scripts
python main.py

The generated website is written to public/index.html in the project root.
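
To look at the result in a browser before deploying, the output folder can be served locally. The sketch below uses only the Python standard library; the function name `make_preview_server`, the port, and the idea of previewing this way are assumptions for illustration and are not part of the project.

```python
import functools
import http.server


def make_preview_server(directory="public", port=8000):
    """Create (but do not start) a local HTTP server for a static site folder."""
    # SimpleHTTPRequestHandler accepts a `directory` keyword (Python 3.7+),
    # so partial() pins the folder that will be served.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Bind to localhost only, so the preview is not exposed on the network.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_preview_server().serve_forever()` from the project root would serve the site at http://127.0.0.1:8000/ until interrupted with Ctrl+C.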


🔑 Configuration

Required API Keys

Create a .env file in the project root with the following keys:

# AI Generation (at least one required)
GROQ_API_KEY=gsk_your_key_here           # Recommended - fastest, free tier
OPENROUTER_API_KEY=sk-or-your_key_here   # Backup
GOOGLE_AI_API_KEY=your_key_here          # Alternative

# Image Fetching (recommended)
PEXELS_API_KEY=your_key_here             # 200 req/hour free
UNSPLASH_ACCESS_KEY=your_key_here        # 50 req/hour free

# LinkedIn Scraping (optional)
APIFY_API_KEY=apify_api_your_key_here    # $5/month free tier
APIFY_ACTOR_ID=scraper-engine/linkedin-post-scraper
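
The comments in the example above imply three groups of keys: at least one AI key is required, image keys are recommended, and Apify keys are optional. A minimal sketch of a check along those lines follows. The grouping and the `check_env` helper are assumptions drawn from those comments, not the project's actual validation logic.

```python
import os

# Key names are taken from the .env example; the grouping is an assumption
# based on its comments ("at least one required", "recommended", "optional").
AI_KEYS = ("GROQ_API_KEY", "OPENROUTER_API_KEY", "GOOGLE_AI_API_KEY")
IMAGE_KEYS = ("PEXELS_API_KEY", "UNSPLASH_ACCESS_KEY")
LINKEDIN_KEYS = ("APIFY_API_KEY", "APIFY_ACTOR_ID")


def check_env(env):
    """Return (errors, warnings) describing missing keys in a mapping."""
    def is_set(name):
        return bool((env.get(name) or "").strip())

    errors, warnings = [], []
    if not any(is_set(key) for key in AI_KEYS):
        errors.append("Set at least one of: " + ", ".join(AI_KEYS))
    if not any(is_set(key) for key in IMAGE_KEYS):
        warnings.append("No image API key set (recommended)")
    if not all(is_set(key) for key in LINKEDIN_KEYS):
        warnings.append("Apify keys not fully set (optional, LinkedIn only)")
    return errors, warnings


# Check the current process environment and report what is missing.
errors, warnings = check_env(os.environ)
for message in errors + warnings:
    print(message)
```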

Get API Keys

| Service | Free Tier | Sign Up |
|---------|-----------|---------|
| Groq | Yes (30 req/min) | console.groq.com |
| OpenRouter | Yes ($1 credit) | openrouter.ai |
| Google AI | Yes (60 req/min) | makersuite.google.com |
| Pexels | Yes (200/hour) | pexels.com/api |
| Unsplash | Yes (50/hour) | unsplash.com/developers |
| Apify | Yes ($5/month) | console.apify.com |

Total Monthly Cost: $0 (all free tiers) 💰
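
Staying at $0 depends on staying under the free-tier rate limits. As a rough guide, the arithmetic below converts each quoted limit into the average spacing between requests that would keep a client under it. It assumes the limits are enforced over fixed windows of one minute or one hour; how each provider actually measures its limit may differ.

```python
# Free-tier limits as quoted in this README: (requests, window in seconds).
FREE_TIER_LIMITS = {
    "Groq": (30, 60),
    "Google AI": (60, 60),
    "Pexels": (200, 3600),
    "Unsplash": (50, 3600),
}


def min_interval_seconds(limit, window_seconds):
    """Average spacing between requests that stays within the limit."""
    return window_seconds / limit


for service, (limit, window) in FREE_TIER_LIMITS.items():
    interval = min_interval_seconds(limit, window)
    print(f"{service}: at most one request every {interval:.0f} s on average")
```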


📚 Data Sources

RSS Feeds (20 sources)

Reddit Communities (4 subreddits)

LinkedIn Influencers (4 profiles)

See SOURCES.md for complete list.


🏗️ Architecture

10-Step Pipeline

The pipeline runs automatically daily at 6 AM EST via GitHub Actions:

  1. Archive - Save previous website
  2. Collect Trends - Fetch from RSS, Reddit, LinkedIn
  3. Fetch Images - Download relevant images
  4. Generate Design - AI-powered design system
  5. Generate Editorial - Write daily summary article
  6. Build Website - Render HTML/CSS/JS
  7. Generate RSS - Create RSS feed
  8. PWA Assets - Service worker, manifest
  9. Generate Sitemap - SEO optimization
  10. Cleanup - Remove old archives (30-day retention)
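
The ten steps above run strictly in order. One way such an orchestrator could be structured is sketched below. This is not the project's `main.py`: the step bodies are placeholders that only record that they ran, and `make_step`, `run_pipeline`, and the shared `context` dict are hypothetical names.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[dict], None]]

# Step names follow the list above; the step bodies are placeholders.
STEP_NAMES = [
    "Archive", "Collect Trends", "Fetch Images", "Generate Design",
    "Generate Editorial", "Build Website", "Generate RSS", "PWA Assets",
    "Generate Sitemap", "Cleanup",
]


def make_step(name: str) -> Step:
    """Build a placeholder step that only records that it ran."""
    def run(context: dict) -> None:
        context.setdefault("completed", []).append(name)
    return name, run


def run_pipeline(steps: List[Step]) -> dict:
    """Run steps in order with one shared context; stop at the first failure."""
    context: dict = {}
    for number, (name, run) in enumerate(steps, start=1):
        print(f"[{number}/{len(steps)}] {name}")
        try:
            run(context)
        except Exception as exc:
            # Name the failing step so the build log points at the culprit.
            raise RuntimeError(f"Step {number} ({name}) failed") from exc
    return context


result = run_pipeline([make_step(name) for name in STEP_NAMES])
```

Passing a shared dict between steps keeps each step independent of the others' internals; a real pipeline might instead pass data through files in data/.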

Project Structure

cmmcwatch/
├── scripts/                     # Pipeline modules
│   ├── main.py                  # Pipeline orchestrator
│   ├── collect_trends.py        # Data collection
│   ├── fetch_images.py          # Image fetching
│   ├── generate_design.py       # AI design generation
│   ├── editorial_generator.py   # Article writing
│   ├── build_website.py         # HTML generation
│   └── config.py                # All settings
├── templates/                   # Jinja2 HTML templates
├── public/                      # Generated website
├── data/                        # Pipeline data (JSON)
└── .github/workflows/           # GitHub Actions

🧪 Testing

Run tests:

pytest tests/

Run with coverage:

pytest --cov=scripts tests/

🛠️ Development

Local Development

# Run pipeline without archiving
cd scripts
python main.py --no-archive

# Dry run (collect data only, don't build)
python main.py --dry-run

# Run specific modules
python collect_trends.py     # Test data collection
python fetch_images.py       # Test image fetching
python editorial_generator.py --test  # Test article generation
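
For readers curious how flags like `--no-archive` and `--dry-run` are typically wired up, here is a minimal `argparse` sketch. Only the two flag names come from the commands above. The stage names, the `planned_stages` helper, and the assumption that a dry run still archives unless told otherwise are illustrative and may not match `main.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Run the pipeline (sketch)")
    parser.add_argument("--no-archive", action="store_true",
                        help="skip archiving the previous website")
    parser.add_argument("--dry-run", action="store_true",
                        help="collect data only, do not build")
    return parser


def planned_stages(args: argparse.Namespace) -> list:
    """Translate parsed flags into the coarse stages that would run."""
    stages = []
    if not args.no_archive:
        stages.append("archive")
    stages.append("collect")
    if not args.dry_run:
        stages.append("build")
    return stages


# argparse exposes --no-archive as args.no_archive and --dry-run as args.dry_run.
args = build_parser().parse_args(["--no-archive", "--dry-run"])
print(planned_stages(args))
```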

Environment Variables

Check your .env configuration:

cd scripts
python -c "from config import *; import os; print('✓ Config loaded')"
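
The one-liner above only confirms that the config module imports. To see which keys a .env file defines, a small standard-library parser is enough. The sketch below is an assumption about the file format based on the example in the Configuration section (KEY=value lines with optional trailing `# comments`); it does not handle quoted values, and the project may load .env differently.

```python
import re

# Sample lines in the same shape as the .env example in this README.
SAMPLE = """\
# AI Generation (at least one required)
GROQ_API_KEY=gsk_your_key_here           # Recommended - fastest, free tier

APIFY_ACTOR_ID=scraper-engine/linkedin-post-scraper
"""


def parse_env(text):
    """Parse KEY=value lines, skipping blanks and full-line comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Whitespace followed by '#' starts an inline comment.
        value = re.split(r"\s+#", value, maxsplit=1)[0].strip()
        values[key.strip()] = value
    return values


for key, value in parse_env(SAMPLE).items():
    # Report only whether a value is present, never the secret itself.
    print(f"{key}: {'set' if value else 'empty'}")
```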

Code Style

We use ruff for linting:

ruff check scripts/
ruff format scripts/

🚒 Deployment

GitHub Pages (Automatic)

The site automatically deploys to GitHub Pages on every push to main:

  1. GitHub Actions runs the pipeline
  2. Generates fresh content daily at 6 AM EST
  3. Deploys to https://cmmcwatch.com
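
GitHub Actions evaluates `schedule` cron expressions in UTC, so a 6 AM EST run has to be written as a UTC time. The conversion below is a sketch of that arithmetic, not a copy of the project's workflow file; `daily_cron_utc` is a hypothetical helper, and the fixed UTC-5 offset means the job would fire at 7 AM local time while daylight saving time (EDT) is in effect.

```python
from datetime import datetime, timedelta, timezone

# EST is a fixed UTC-5 offset; it does not follow daylight saving time.
EST = timezone(timedelta(hours=-5), "EST")


def daily_cron_utc(hour_local: int, minute: int = 0, tz: timezone = EST) -> str:
    """Return a daily cron expression in UTC for a fixed-offset local time."""
    # The date is arbitrary; only the time of day matters for a daily job.
    local = datetime(2024, 1, 15, hour_local, minute, tzinfo=tz)
    utc = local.astimezone(timezone.utc)
    return f"{utc.minute} {utc.hour} * * *"


print(daily_cron_utc(6))  # 0 11 * * *
```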

Manual Deployment

# Run the pipeline
cd scripts
python main.py

# The output is in public/
# Deploy public/ to any static hosting service

📊 Monitoring

Build Status

Check GitHub Actions for build status.

If a build fails, an issue is automatically created with the error details.
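
How the workflow files that issue is not shown here. As an illustration of what "error details" could contain, the sketch below assembles a title, body, and label list, which are the fields the GitHub REST API accepts when creating an issue. The function name, the label, and the truncation limit are all hypothetical.

```python
def build_failure_issue(workflow, run_url, error_text, max_chars=2000):
    """Return a dict with the title, body, and labels for a build-failure issue."""
    excerpt = error_text.strip()
    if len(excerpt) > max_chars:
        # Keep the end of the log, where the failing step usually reports.
        excerpt = "...\n" + excerpt[-max_chars:]
    body = "\n".join([
        f"Workflow: {workflow}",
        f"Run: {run_url}",
        "",
        "Error details:",
        "",
        excerpt,
    ])
    return {
        "title": f"Build failed: {workflow}",
        "body": body,
        "labels": ["build-failure"],  # hypothetical label name
    }


issue = build_failure_issue(
    "Daily Build",
    "https://example.invalid/runs/1",  # placeholder URL
    "Traceback (most recent call last):\nKeyError: 'GROQ_API_KEY'",
)
print(issue["title"])
```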

API Usage

Monitor your API usage:

  - Groq: console.groq.com
  - Apify: console.apify.com/account/usage
  - Pexels: dashboard at pexels.com


🤝 Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Contribution Ideas


📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments


📞 Support


🔒 Security

If you discover a security vulnerability, please email [security contact here] instead of using the issue tracker.


📈 Roadmap

Current (v1.0)

Planned (v1.1)

Future (v2.0)


Built with ❤️ for the CMMC compliance community