add RAM mode and overhaul the build pipeline

- introduce RAM build mode with in-memory preload/index datasets and stage timing
- refactor build orchestration, indexing, content processing, and template handling
- improve cache/rebuild logic and parallel worker execution across generators
- enhance posts/pages/tags/authors/archives/feed generation and related-post flow
- update CLI/config/README for new build options and performance tuning
- harden timing logic to handle locale-specific EPOCHREALTIME decimal separators
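The EPOCHREALTIME hardening mentioned above can be pictured with a small sketch. Bash renders `EPOCHREALTIME`'s fractional part using the locale's decimal separator, so under a locale such as `it_IT.UTF-8` the value may contain a comma that breaks string/arithmetic handling unless normalized first. The helper name `epoch_ms` below is hypothetical, not code from this commit:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: millisecond timestamps that survive locale decimal commas.
epoch_ms() {
    # EPOCHREALTIME needs bash >= 5; fall back to whole seconds otherwise.
    local now="${EPOCHREALTIME:-$(date +%s).000}"
    now="${now/,/.}"                    # normalize a locale comma to a dot
    local secs="${now%%.*}" frac="${now#*.}"
    frac="${frac}000"                   # pad so at least three digits exist
    printf '%s%s\n' "$secs" "${frac:0:3}"
}
epoch_ms
```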
This commit is contained in:
Stefano Marinelli 2026-02-10 19:08:59 +01:00
parent e2822ad620
commit cbc08b06cc
19 changed files with 3757 additions and 751 deletions

README.md (204 lines changed)

@@ -28,19 +28,21 @@
- Generates HTML from Markdown using pandoc, commonmark, or markdown.pl (configurable)
- Supports post metadata (title, date, tags)
- Supports `lastmod` timestamp in frontmatter for tracking content updates (used in sitemap, RSS feed, and optionally displayed on posts).
- Supports `lastmod` timestamp in frontmatter for tracking content updates (used in sitemap, RSS feed, and optionally displayed on posts)
- Full date and time support with timezone awareness
- Post descriptions/summaries for previews, OpenGraph, and RSS
- Admin interface for managing posts and scheduling publications (planned for future release)
- Standalone post editor with modern Ghost-like interface for visual content creation
- Creates tag index pages
- Related Posts: Automatically suggests related posts based on shared tags at the end of each post
- Author index pages with conditional navigation menu
- Creates tag index pages with optional tag RSS feeds
- Related Posts: automatically suggests related posts based on shared tags at the end of each post
- Author index pages with conditional navigation menu and optional author RSS feeds
- Archives by year and month for chronological browsing
- Dynamic menu generation based on available pages
- Support for primary and secondary pages with automatic menu organization
- Generates sitemap.xml and RSS feed with timezone support
- Asset pre-compression: Can automatically create gzipped versions of text-based files (`.html`, `.css`, `.xml`, `.js`) during the build for servers that support serving pre-compressed content.
- Generates `sitemap.xml` and RSS feeds with timezone support
- Two build modes: `normal` (incremental, cache-backed) and `ram` (memory-first)
- RAM mode stage timing summary printed at the end of each RAM build
- Asset pre-compression with incremental and parallel gzip processing (`.html`, `.css`, `.xml`, `.js`)
- Clean design
- No JavaScript required (except for admin interface)
- Works well without images
@@ -52,11 +54,9 @@
- Supports static files (images, CSS, JS, etc.)
- Configurable clean output directory option
- Draft posts support
- Post scheduling system
- Backup and restore functionality
- Incremental builds with file caching for improved performance
- Smart metadata caching system
- Parallel processing support using GNU parallel (if available)
- Incremental builds with file and metadata caching for improved performance
- Parallel processing with GNU parallel (if available) plus shell-worker fallbacks
- File locking for safe concurrent operations
- Automatic handling of different operating systems (Linux/macOS/BSDs)
- Custom URL slugs with SEO-friendly permalinks
@@ -207,20 +207,25 @@ BSSG/
├── scripts/ # Supporting scripts
│ ├── build/ # Modular build scripts
│ │ ├── main.sh # Main build orchestrator
│ │ ├── utils.sh # Utility functions (colors, formatting, etc.)
│ │ ├── cli.sh # Command-line argument parsing
│ │ ├── config_loader.sh # Loads default and user configuration
│ │ ├── deps.sh # Dependency checking
│ │ ├── cache.sh # Cache management functions
│ │ ├── content_discovery.sh # Finds posts, pages, drafts
│ │ ├── markdown_processor.sh # Markdown conversion logic
│ │ ├── process_posts.sh # Processes individual posts
│ │ ├── process_pages.sh # Processes individual pages
│ │ ├── generate_indexes.sh # Creates index, tag, and archive pages
│ │ ├── generate_feeds.sh # Creates RSS feed and sitemap
│ │ ├── config_loader.sh # Loads defaults and local overrides
│ │ ├── deps.sh # Dependency checks
│ │ ├── cache.sh # Cache/config hash helpers
│ │ ├── content.sh # Metadata/excerpt/markdown helpers
│ │ ├── indexing.sh # File/tags/authors/archive index builders
│ │ ├── templates.sh # Template preload/menu generation
│ │ ├── generate_posts.sh # Post rendering
│ │ ├── generate_pages.sh # Static page rendering
│ │ ├── generate_index.sh # Homepage/pagination generation
│ │ ├── generate_tags.sh # Tag pages (+ optional tag RSS)
│ │ ├── generate_authors.sh # Author pages (+ optional author RSS)
│ │ ├── generate_archives.sh # Archive pages (year/month)
│ │ ├── generate_feeds.sh # Main RSS + sitemap
│ │ ├── generate_secondary_pages.sh # Creates pages.html index
│ │ ├── copy_static.sh # Copies static files and theme assets
│ │ └── theme_utils.sh # Theme-related utilities
│ │ ├── related_posts.sh # Related-post indexing/render helpers
│ │ ├── post_process.sh # URL rewrite + permissions fixes
│ │ ├── assets.sh # Static copy + CSS/theme handling
│ │ ├── ram_mode.sh # RAM-mode preload/in-memory datasets
│ │ └── utils.sh # Shared helpers (time, URLs, parallel)
│ ├── post.sh # Handles post creation
│ ├── page.sh # Handles page creation
│ ├── edit.sh # Handles post/page editing (updates lastmod)
@@ -228,6 +233,8 @@ BSSG/
│ ├── list.sh # Lists posts, pages, drafts, tags
│ ├── backup.sh # Backup functionality
│ ├── restore.sh # Restore functionality
│ ├── benchmark.sh # Build benchmarking helper
│ ├── server.sh # Local development server implementation
│ ├── theme.sh # Theme management and processing (legacy helper)
│ ├── template.sh # Template processing utilities (legacy helper)
│ └── css.sh # CSS generation utilities (legacy helper)
@@ -261,58 +268,33 @@ BSSG/
```bash
cd BSSG
./bssg.sh [command] [options]
./bssg.sh [--config <path>] [command] [options]
```
### Available Commands
```
Usage: ./bssg.sh command [options]
Usage: ./bssg.sh [--config <path>] command [options]
Commands:
post [-html] [draft_file] # Interactive: Create/edit post/draft, prompt for title, open editor.
# Rebuilds site afterwards if REBUILD_AFTER_POST=true in config.
# Use -html for HTML format.
post [-html] [draft_file]
Interactive: create/edit post or continue a draft.
post -t <title> [-T <tags>] [-s <slug>] [--html] [-d] {-c <content> | -f <file> | --stdin} [--build]
# Command-line: Create post non-interactively.
# -t: Title (required)
# -T: Tags (comma-sep)
# -s: Slug (optional)
# --html: HTML format (default: MD)
# -d: Save as draft
# -c: Content string
# -f: Content file
# --stdin: Content from stdin
# --build: Force rebuild (overrides REBUILD_AFTER_POST=false)
page [-html] [-s] [draft_file] Create a new page (in $PAGES_DIR or $DRAFTS_DIR/pages)
or continue editing a draft (in $DRAFTS_DIR/pages)
Use -html to edit in HTML instead of Markdown
Use -s to mark page as secondary (for menu)
edit [-n] <file> Edit an existing post/page/draft (updates lastmod)
File path should point to $SRC_DIR, $PAGES_DIR, $DRAFTS_DIR etc.
Use -n to rename based on title (posts/drafts only currently)
delete [-f] <file> Delete a post/page/draft
File path should point to $SRC_DIR, $PAGES_DIR, $DRAFTS_DIR etc.
Use -f to skip confirmation
list {posts|pages|drafts|tags [-n]}
List posts ($SRC_DIR), pages ($PAGES_DIR),
drafts ($DRAFTS_DIR and $DRAFTS_DIR/pages), or tags.
For tags, use -n to sort by count.
backup Create a backup of all posts, pages, drafts, and config
restore [backup_file|ID] Restore from a backup (all content by default)
Options: --no-content, --no-config
backups List all available backups
build [opts] Build the site using the modular build system in scripts/build/
Options: -c|--clean-output, -f|--force-rebuild,
--config FILE, --theme NAME,
--site-url URL, --output DIR
init <target_directory> Initialize a new, empty site structure in the specified directory.
This is useful for separating your site content from the BSSG core scripts.
The script will preserve the path format you provide (relative, absolute, or tilde-prefixed)
in the generated site 'config.sh.local' for portability.
Note: If using '~' for your home directory, quote the path (e.g., '~/mysite' or "~/mysite")
to ensure the tilde is preserved in the generated config.
help Show this help message
Command-line: create post non-interactively.
page [-html] [-s] [draft_file]
Create a page or continue a page draft.
edit [-n] <file> Edit an existing post/page/draft (updates lastmod).
delete [-f] <file> Delete a post/page/draft.
list List all posts.
tags [-n] List all tags. Use -n to sort by post count.
drafts List all draft posts.
backup Create a backup of posts, pages, drafts, and config.
restore [backup_file|ID] Restore from a backup (options: --no-content, --no-config).
backups List all available backups.
build [options] Build the site (run './bssg.sh build --help' for full options).
server [options] Build and run local server (run './bssg.sh server --help').
init <target_directory> Initialize a new site in the specified directory.
help Show help.
```
### Creating Posts and Pages
@@ -469,23 +451,39 @@ You can use these options with restore to selectively restore content:
Usage: ./bssg.sh build [options]
Options:
-c, --clean-output Empty the output directory before building
--src DIR Override source directory (from config: SRC_DIR)
--pages DIR Override pages directory (from config: PAGES_DIR)
--drafts DIR Override drafts directory (from config: DRAFTS_DIR)
--output DIR Override output directory (from config: OUTPUT_DIR)
--templates DIR Override templates directory (from config: TEMPLATES_DIR)
--themes-dir DIR Override themes directory (from config: THEMES_DIR)
--theme NAME Override theme for this build
--static DIR Override static directory (from config: STATIC_DIR)
--clean-output [bool] Clean output directory before build (default from config)
-f, --force-rebuild Ignore cache and rebuild all files
--config FILE Use a specific configuration file (e.g., my_config.sh)
instead of the default config.sh
--src DIR Override the SRC_DIR specified in the config file
--pages DIR Override the PAGES_DIR specified in the config file
--drafts DIR Override the DRAFTS_DIR specified in the config file
--output DIR Build the site to a specific output directory
--templates DIR Override the TEMPLATES_DIR specified in the config file
--themes-dir DIR Override the THEMES_DIR specified in the config file
--theme NAME Override the theme specified in the config file for this build
--static DIR Override the STATIC_DIR specified in the config file
--site-url URL Override the SITE_URL specified in the config file for this build
--build-mode MODE Build mode: normal or ram
--site-title TITLE Override site title
--site-url URL Override site URL
--site-description DESC Override site description
--author-name NAME Override author name
--author-email EMAIL Override author email
--posts-per-page NUM Override pagination size
--deploy Force deployment after successful build (overrides config)
--no-deploy Prevent deployment after build (overrides config)
--no-deploy Skip deployment after build (overrides config)
--help Show build help
```
`--config <path>` is a global option and can be passed with any command (including `build`) to load a specific configuration file.
Examples:
```bash
./bssg.sh --config /path/to/site/config.sh.local build --build-mode ram
./bssg.sh build --output ./public --clean-output true
```
The option list above reflects the current `build --help` output.
### Internationalization (i18n)
BSSG supports generating the site in different languages.
@@ -707,6 +705,16 @@ CLEAN_OUTPUT=false # If true, BSSG will always perform a full rebuild
REBUILD_AFTER_POST=true # Build site automatically after creating a new post (scripts/post.sh)
REBUILD_AFTER_EDIT=true # Build site automatically after editing a post (scripts/edit.sh)
PRECOMPRESS_ASSETS="false" # Options: "true", "false". If true, compress text assets (HTML, CSS, XML, JS) with gzip during build.
BUILD_MODE="normal" # Options: "normal", "ram". RAM mode preloads inputs and keeps build indexes/data in memory.
# Optional performance tunables (not required):
# RAM_MODE_MAX_JOBS=6 # Cap parallel workers in RAM mode (defaults to 6)
# RAM_MODE_VERBOSE=false # Extra RAM-mode debug/timing logs
# PRECOMPRESS_GZIP_LEVEL=9 # gzip level for precompression (1-9)
# PRECOMPRESS_MAX_JOBS=0 # 0=auto based on CPU/RAM mode cap
# PRECOMPRESS_VERBOSE=false # Verbose logs for precompression
# RAM_RSS_PREFILL_MIN_HITS=2 # RAM tag-RSS cache prefill threshold
# RAM_RSS_PREFILL_MAX_POSTS=24 # RAM tag-RSS prefill upper bound
# Customization
CUSTOM_CSS="" # Optional: Path to custom CSS file relative to output root (e.g., "/css/custom.css"). File should be placed in STATIC_DIR.
@@ -1130,13 +1138,25 @@ The system maintains a cache of extracted metadata from markdown files to reduce
- File index information is stored in `.bssg_cache/file_index.txt`
- Tags index information is stored in `.bssg_cache/tags_index.txt`
### RAM Build Mode
BSSG supports a RAM-first build mode for faster full rebuilds and lower disk churn:
- Set `BUILD_MODE="ram"` in `config.sh.local`, or run `./bssg.sh build --build-mode ram`
- Source/posts/pages/templates/locales are preloaded in memory
- Build indexes (file/tags/authors/archive, plus page lists) are kept in memory
- RAM mode intentionally skips cache persistence and always behaves like an in-memory full rebuild
- A stage timing summary is printed at the end of RAM-mode builds
- On low-end disk-bound hosts, RAM mode can significantly reduce build time by avoiding repeated disk reads
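The preload/lookup helpers RAM mode relies on (`ram_mode_has_file`, `ram_mode_get_content`) can be pictured with a simplified sketch. The real logic lives in `scripts/build/ram_mode.sh`; this minimal version only illustrates the idea of reading each source file from disk once and serving later lookups from memory:

```shell
#!/usr/bin/env bash
# Simplified illustration of a RAM-mode preload (bash 4+ associative array).
# This is a sketch of the concept, not BSSG's actual implementation.
declare -A RAM_FILES

ram_mode_preload() {            # read each file from disk exactly once
    local f
    for f in "$@"; do
        [ -f "$f" ] && RAM_FILES["$f"]=$(cat "$f")
    done
}
ram_mode_has_file() { [ -n "${RAM_FILES[$1]+set}" ]; }
ram_mode_get_content() { printf '%s\n' "${RAM_FILES[$1]}"; }
```

Subsequent stages (metadata extraction, excerpt generation, rendering) then call `ram_mode_get_content` instead of re-reading the file, which is where the savings on disk-bound hosts come from.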
### Parallel Processing
If GNU parallel is installed on your system, BSSG can process multiple files simultaneously:
BSSG uses multiple execution strategies to process files in parallel:
- Automatically detects GNU parallel and enables it for builds with many files
- Uses 80% of available CPU cores for optimal performance
- Falls back to sequential processing if parallel is not available
- Falls back to internal shell workers when GNU parallel is unavailable or unsuitable for a stage
- Auto-detects CPU core count for worker sizing
- In RAM mode, worker count is capped by `RAM_MODE_MAX_JOBS` (default: `6`) to reduce memory pressure
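The worker-sizing rules above can be sketched as follows. Only the documented `BUILD_MODE` and `RAM_MODE_MAX_JOBS` names come from the configuration; the function itself is an illustrative assumption, not BSSG's actual code:

```shell
#!/usr/bin/env bash
# Illustrative worker sizing: detect CPU cores, then apply the RAM-mode cap.
detect_workers() {
    local cores jobs
    # nproc on Linux, sysctl on macOS/BSDs, 1 as a last resort.
    cores=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 1)
    jobs=$cores
    if [ "${BUILD_MODE:-normal}" = "ram" ]; then
        local cap="${RAM_MODE_MAX_JOBS:-6}"
        if [ "$jobs" -gt "$cap" ]; then jobs=$cap; fi
    fi
    echo "$jobs"
}
```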
To take advantage of parallel processing, install GNU parallel:
@@ -1151,6 +1171,10 @@ brew install parallel
pkg install parallel
```
### Real-World Result
On a single-core OpenBSD server with spinning disks, the maintainer observed build time dropping to about one third of the previous release when building with `BUILD_MODE="ram"`.
## Site Configuration
Key configuration options:
@@ -1168,6 +1192,7 @@ DATE_FORMAT="%Y-%m-%d %H:%M:%S %z"
TIMEZONE="local" # Options: "local", "GMT", or a specific timezone
SHOW_TIMEZONE="false" # Options: "true", "false". Determines if the timezone offset (e.g., +0200) is shown in displayed dates.
POSTS_PER_PAGE=10
BUILD_MODE="normal" # "normal" (incremental cache-backed) or "ram" (memory-first)
ENABLE_ARCHIVES=true # Enable or disable archives by year/month
URL_SLUG_FORMAT="Year/Month/Day/slug" # Format for post URLs
RSS_ITEM_LIMIT=15 # Number of items to include in the RSS feed.
@@ -1175,6 +1200,18 @@ RSS_INCLUDE_FULL_CONTENT="false" # Options: "true", "false". If set to "true", t
INDEX_SHOW_FULL_CONTENT="false" # Options: "true", "false". If set to "true", the full post content will be displayed on the homepage and paginated index pages instead of just the description/excerpt.
ENABLE_TAG_RSS=true # Options: "true", "false". If set to "true" (default), an additional RSS feed will be generated for each tag at `output/tags/<tag-slug>/rss.xml`.
# Precompression options
PRECOMPRESS_ASSETS="false" # Generate .gz siblings for changed text assets
# PRECOMPRESS_GZIP_LEVEL=9
# PRECOMPRESS_MAX_JOBS=0
# PRECOMPRESS_VERBOSE=false
# RAM-mode tuning (optional)
# RAM_MODE_MAX_JOBS=6
# RAM_MODE_VERBOSE=false
# RAM_RSS_PREFILL_MIN_HITS=2
# RAM_RSS_PREFILL_MAX_POSTS=24
# Related Posts configuration
ENABLE_RELATED_POSTS=true # Options: "true", "false". If set to "true" (default), related posts based on shared tags will be shown at the end of each post.
RELATED_POSTS_COUNT=3 # Number of related posts to display (default: 3, recommended maximum: 5).
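The incremental side of precompression can be pictured with a short sketch. Only the `PRECOMPRESS_GZIP_LEVEL` tunable comes from the configuration above; the helper name and flow are assumptions for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of incremental precompression: a .gz sibling is only
# regenerated when it is missing or older than its source file.
precompress_file() {
    local f="$1" level="${PRECOMPRESS_GZIP_LEVEL:-9}"
    if [ -f "$f.gz" ] && [ "$f.gz" -nt "$f" ]; then
        return 0                       # up to date: skip (incremental)
    fi
    gzip -"$level" -c "$f" > "$f.gz"   # keep the original, write a sibling
}
```

A build would apply this to each changed `.html`, `.css`, `.xml`, and `.js` file, optionally fanning out across workers.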
@@ -1284,4 +1321,3 @@ This project is licensed under the BSD 3-Clause License - see the LICENSE file f
- **Themes**: Explore the available themes in the `themes` directory.
- **Backup & Restore**: Use `./bssg.sh backup` and `./bssg.sh restore` to manage content backups.
- **Development Blog**: Stay up-to-date with the latest release notes, development progress, and announcements on the official BSSG Dev Blog: [https://blog.bssg.dragas.net](https://blog.bssg.dragas.net)


@@ -28,6 +28,7 @@ CLEAN_OUTPUT=false # If true, BSSG will always perform a full rebuild
REBUILD_AFTER_POST=true # Build site automatically after creating a new post (scripts/post.sh)
REBUILD_AFTER_EDIT=true # Build site automatically after editing a post (scripts/edit.sh)
PRECOMPRESS_ASSETS="false" # Options: "true", "false". If true, compress text assets (HTML, CSS, XML, JS) with gzip during build.
BUILD_MODE="ram" # Options: "normal", "ram". "ram" preloads inputs and keeps build state in memory (writes only output artifacts).
# Customization
CUSTOM_CSS="" # Optional: Path to custom CSS file relative to output root (e.g., "/css/custom.css"). File should be placed in STATIC_DIR.


@@ -99,10 +99,17 @@ fi
# Terminal colors (still needed here if config_loader doesn't export them, though it should)
# These are now primarily set and exported by config_loader.sh based on config files.
# The ':-' syntax provides a fallback if they somehow aren't set, using tput.
RED="${RED:-$(tput setaf 1)}"
GREEN="${GREEN:-$(tput setaf 2)}"
YELLOW="${YELLOW:-$(tput setaf 3)}"
NC="${NC:-$(tput sgr0)}" # Reset color
if [[ -t 1 ]] && command -v tput > /dev/null 2>&1 && tput setaf 1 > /dev/null 2>&1; then
RED="${RED:-$(tput setaf 1)}"
GREEN="${GREEN:-$(tput setaf 2)}"
YELLOW="${YELLOW:-$(tput setaf 3)}"
NC="${NC:-$(tput sgr0)}" # Reset color
else
RED="${RED:-}"
GREEN="${GREEN:-}"
YELLOW="${YELLOW:-}"
NC="${NC:-}"
fi
# Make sure all scripts are executable
chmod +x scripts/*.sh 2>/dev/null || true
@@ -163,6 +170,7 @@ show_build_help() {
echo " --static DIR Override Static directory (from config: ${STATIC_DIR:-static})"
echo " --clean-output [bool] Clean output directory before building (default from config: ${CLEAN_OUTPUT:-false})"
echo " --force-rebuild, -f Force rebuild of all files regardless of modification time"
echo " --build-mode MODE Build mode: normal or ram (default from config: ${BUILD_MODE:-normal})"
echo " --site-title TITLE Override Site title"
echo " --site-url URL Override Site URL"
echo " --site-description DESC Override Site description"
@@ -301,6 +309,22 @@ main() {
export FORCE_REBUILD=true
shift 1
;;
--build-mode)
if [[ -z "$2" || "$2" == -* ]]; then
echo -e "${RED}Error: --build-mode requires a value (normal|ram).${NC}" >&2
exit 1
fi
case "$2" in
normal|ram)
export BUILD_MODE="$2"
;;
*)
echo -e "${RED}Error: Invalid --build-mode '$2'. Use 'normal' or 'ram'.${NC}" >&2
exit 1
;;
esac
shift 2
;;
--site-title)
export SITE_TITLE="$2"
shift 2


@@ -34,6 +34,7 @@ RSS_FILENAME="${RSS_FILENAME:-rss.xml}" # Default RSS filename
INDEX_SHOW_FULL_CONTENT="${INDEX_SHOW_FULL_CONTENT:-false}" # Default: show excerpt on homepage
CLEAN_OUTPUT="${CLEAN_OUTPUT:-false}"
FORCE_REBUILD="${FORCE_REBUILD:-false}"
BUILD_MODE="${BUILD_MODE:-normal}" # Build mode: normal or ram
SITE_LANG="${SITE_LANG:-en}"
LOCALE_DIR="${LOCALE_DIR:-locales}"
PAGES_DIR="${PAGES_DIR:-pages}"
@@ -62,11 +63,19 @@ BSSG_SERVER_HOST_DEFAULT="${BSSG_SERVER_HOST_DEFAULT:-localhost}"
CUSTOM_CSS="${CUSTOM_CSS:-}" # Default to empty string
# Define default colors here so utils.sh can use them if not overridden by config
RED="${RED:-$(tput setaf 1)}"
GREEN="${GREEN:-$(tput setaf 2)}"
YELLOW="${YELLOW:-$(tput setaf 3)}"
BLUE="${BLUE:-$(tput setaf 4)}" # Added Blue for print_info, using tput
NC="${NC:-$(tput sgr0)}" # No Color, using tput
if [[ -t 1 ]] && command -v tput > /dev/null 2>&1 && tput setaf 1 > /dev/null 2>&1; then
RED="${RED:-$(tput setaf 1)}"
GREEN="${GREEN:-$(tput setaf 2)}"
YELLOW="${YELLOW:-$(tput setaf 3)}"
BLUE="${BLUE:-$(tput setaf 4)}"
NC="${NC:-$(tput sgr0)}"
else
RED="${RED:-}"
GREEN="${GREEN:-}"
YELLOW="${YELLOW:-}"
BLUE="${BLUE:-}"
NC="${NC:-}"
fi
# --- Default Configuration Variables --- END ---
@@ -219,7 +228,7 @@ BSSG_CONFIG_VARS_ARRAY=(
SITE_TITLE SITE_DESCRIPTION SITE_URL AUTHOR_NAME AUTHOR_EMAIL
DATE_FORMAT TIMEZONE SHOW_TIMEZONE POSTS_PER_PAGE RSS_ITEM_LIMIT RSS_INCLUDE_FULL_CONTENT RSS_FILENAME
INDEX_SHOW_FULL_CONTENT
CLEAN_OUTPUT FORCE_REBUILD SITE_LANG LOCALE_DIR PAGES_DIR MARKDOWN_PROCESSOR
CLEAN_OUTPUT FORCE_REBUILD BUILD_MODE SITE_LANG LOCALE_DIR PAGES_DIR MARKDOWN_PROCESSOR
MARKDOWN_PL_PATH ENABLE_ARCHIVES URL_SLUG_FORMAT PAGE_URL_FORMAT
DRAFTS_DIR REBUILD_AFTER_POST REBUILD_AFTER_EDIT
CUSTOM_CSS
@@ -261,6 +270,7 @@ export RSS_FILENAME
export INDEX_SHOW_FULL_CONTENT
export CLEAN_OUTPUT
export FORCE_REBUILD
export BUILD_MODE
export SITE_LANG
export LOCALE_DIR
export PAGES_DIR
@@ -316,4 +326,4 @@ export MSG_MONTH_09 MSG_MONTH_10 MSG_MONTH_11 MSG_MONTH_12
# --- Final Path Adjustments (after all sourcing) --- START ---
# Ensure relevant directory paths are exported if not already absolute.
# ... existing code ...
# --- Final Path Adjustments (after all sourcing) --- END ---
# --- Final Path Adjustments (after all sourcing) --- END ---


@@ -14,10 +14,29 @@ source "$(dirname "$0")/utils.sh" || { echo >&2 "Error: Failed to source utils.s
parse_metadata() {
local file="$1"
local field="$2"
local value=""
# RAM mode: parse directly from preloaded content to avoid disk/cache I/O.
if [ "${BSSG_RAM_MODE:-false}" = true ] && declare -F ram_mode_has_file > /dev/null && ram_mode_has_file "$file"; then
local file_content frontmatter
file_content=$(ram_mode_get_content "$file")
frontmatter=$(printf '%s\n' "$file_content" | awk '
BEGIN { in_fm = 0; found_fm = 0; }
/^---$/ {
if (!in_fm && !found_fm) { in_fm = 1; found_fm = 1; next; }
if (in_fm) { exit; }
}
in_fm { print; }
')
if [ -n "$frontmatter" ]; then
value=$(printf '%s\n' "$frontmatter" | grep -m 1 "^$field:[[:space:]]*" | cut -d ':' -f 2- | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
fi
echo "$value"
return 0
fi
# IMPORTANT: Assumes CACHE_DIR is exported/available
local cache_file="${CACHE_DIR:-.bssg_cache}/meta/$(basename "$file")"
local value=""
# Get locks for cache access
# IMPORTANT: Assumes lock_file/unlock_file are sourced/available
@@ -70,9 +89,13 @@ extract_metadata() {
local file="$1"
local metadata_cache_file="${CACHE_DIR:-.bssg_cache}/meta/$(basename "$file")"
local frontmatter_changes_marker="${CACHE_DIR:-.bssg_cache}/frontmatter_changes_marker"
local ram_mode_active=false
if [ "${BSSG_RAM_MODE:-false}" = true ] && declare -F ram_mode_has_file > /dev/null && ram_mode_has_file "$file"; then
ram_mode_active=true
fi
# Check if file exists
if [ ! -f "$file" ]; then
if ! $ram_mode_active && [ ! -f "$file" ]; then
echo "ERROR_FILE_NOT_FOUND"
return 1
fi
@@ -81,7 +104,7 @@
local frontmatter_changed=false
# Check if cache exists and is newer than the source file
if [ "${FORCE_REBUILD:-false}" = false ] && [ -f "$metadata_cache_file" ] && [ "$metadata_cache_file" -nt "$file" ]; then
if ! $ram_mode_active && [ "${FORCE_REBUILD:-false}" = false ] && [ -f "$metadata_cache_file" ] && [ "$metadata_cache_file" -nt "$file" ]; then
# Read from cache file (optimized - read once)
echo "$(cat "$metadata_cache_file")"
return 0
@@ -98,25 +121,39 @@
# Parse <meta> tags for HTML files
# Use grep -m 1 for efficiency, handle missing tags gracefully
# Note: This is basic parsing, assumes simple meta tag structure.
title=$(grep -m 1 -o '<title>[^<]*</title>' "$file" 2>/dev/null | sed -e 's/<title>//' -e 's/<\/title>//')
date=$(grep -m 1 -o 'name="date" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
lastmod=$(grep -m 1 -o 'name="lastmod" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
tags=$(grep -m 1 -o 'name="tags" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
slug=$(grep -m 1 -o 'name="slug" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
image=$(grep -m 1 -o 'name="image" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
image_caption=$(grep -m 1 -o 'name="image_caption" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
description=$(grep -m 1 -o 'name="description" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
author_name=$(grep -m 1 -o 'name="author_name" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
author_email=$(grep -m 1 -o 'name="author_email" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
local html_source=""
if $ram_mode_active; then
html_source=$(ram_mode_get_content "$file")
title=$(printf '%s\n' "$html_source" | grep -m 1 -o '<title>[^<]*</title>' 2>/dev/null | sed -e 's/<title>//' -e 's/<\/title>//')
date=$(printf '%s\n' "$html_source" | grep -m 1 -o 'name="date" content="[^"]*"' 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
lastmod=$(printf '%s\n' "$html_source" | grep -m 1 -o 'name="lastmod" content="[^"]*"' 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
tags=$(printf '%s\n' "$html_source" | grep -m 1 -o 'name="tags" content="[^"]*"' 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
slug=$(printf '%s\n' "$html_source" | grep -m 1 -o 'name="slug" content="[^"]*"' 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
image=$(printf '%s\n' "$html_source" | grep -m 1 -o 'name="image" content="[^"]*"' 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
image_caption=$(printf '%s\n' "$html_source" | grep -m 1 -o 'name="image_caption" content="[^"]*"' 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
description=$(printf '%s\n' "$html_source" | grep -m 1 -o 'name="description" content="[^"]*"' 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
author_name=$(printf '%s\n' "$html_source" | grep -m 1 -o 'name="author_name" content="[^"]*"' 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
author_email=$(printf '%s\n' "$html_source" | grep -m 1 -o 'name="author_email" content="[^"]*"' 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
else
title=$(grep -m 1 -o '<title>[^<]*</title>' "$file" 2>/dev/null | sed -e 's/<title>//' -e 's/<\/title>//')
date=$(grep -m 1 -o 'name="date" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
lastmod=$(grep -m 1 -o 'name="lastmod" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
tags=$(grep -m 1 -o 'name="tags" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
slug=$(grep -m 1 -o 'name="slug" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
image=$(grep -m 1 -o 'name="image" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
image_caption=$(grep -m 1 -o 'name="image_caption" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
description=$(grep -m 1 -o 'name="description" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
author_name=$(grep -m 1 -o 'name="author_name" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
author_email=$(grep -m 1 -o 'name="author_email" content="[^"]*"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
fi
# Note: Excerpt generation (fallback for description) might not work well for HTML
elif [[ "$file" == *.md ]]; then
# Parse YAML frontmatter for Markdown files
# Use awk with a here document for reliable script passing
# Run awk and read results
# Use a shared awk parser for both disk and RAM paths.
local parsed_data
parsed_data=$(awk -f - "$file" <<'EOF'
local awk_frontmatter_parser
awk_frontmatter_parser=$(cat <<'EOF'
BEGIN {
in_fm = 0;
found_fm = 0;
@@ -162,6 +199,12 @@ extract_metadata() {
}
EOF
)
if $ram_mode_active; then
parsed_data=$(printf '%s\n' "$(ram_mode_get_content "$file")" | awk "$awk_frontmatter_parser")
else
parsed_data=$(awk "$awk_frontmatter_parser" "$file")
fi
IFS='|' read -r title date lastmod tags slug image image_caption description author_name author_email <<< "$parsed_data"
@@ -207,7 +250,7 @@ EOF
local new_metadata="$title|$date|$lastmod|$tags|$slug|$image|$image_caption|$description|$author_name|$author_email"
# Check if there was a previous metadata file and compare
if [ -f "$metadata_cache_file" ]; then
if ! $ram_mode_active && [ -f "$metadata_cache_file" ]; then
local old_metadata=$(cat "$metadata_cache_file")
if [ "$old_metadata" != "$new_metadata" ]; then
frontmatter_changed=true
@@ -215,13 +258,15 @@ EOF
fi
# Store all metadata in one write operation
lock_file "$metadata_cache_file"
mkdir -p "$(dirname "$metadata_cache_file")"
echo "$new_metadata" > "$metadata_cache_file"
unlock_file "$metadata_cache_file"
if ! $ram_mode_active; then
lock_file "$metadata_cache_file"
mkdir -p "$(dirname "$metadata_cache_file")"
echo "$new_metadata" > "$metadata_cache_file"
unlock_file "$metadata_cache_file"
fi
# If frontmatter has changed, update the marker file's timestamp
if $frontmatter_changed; then
if ! $ram_mode_active && $frontmatter_changed; then
touch "$frontmatter_changes_marker"
fi
@@ -234,17 +279,30 @@ generate_excerpt() {
local file="$1"
local max_length="${2:-160}" # Default to 160 characters
# Extract content after frontmatter
local start_line=$(grep -n "^---$" "$file" | head -1 | cut -d: -f1)
local end_line=$(grep -n "^---$" "$file" | head -n 2 | tail -1 | cut -d: -f1)
local raw_content_stream
if [[ -n "$start_line" && -n "$end_line" && $start_line -lt $end_line ]]; then
# Stream content after frontmatter
raw_content_stream=$(tail -n +$((end_line + 1)) "$file")
if [ "${BSSG_RAM_MODE:-false}" = true ] && declare -F ram_mode_has_file > /dev/null && ram_mode_has_file "$file"; then
# Remove frontmatter directly from preloaded content
raw_content_stream=$(printf '%s\n' "$(ram_mode_get_content "$file")" | awk '
BEGIN { in_fm = 0; found_fm = 0; }
/^---$/ {
if (!in_fm && !found_fm) { in_fm = 1; found_fm = 1; next; }
if (in_fm) { in_fm = 0; next; }
}
{ if (!in_fm) print; }
')
else
# No valid frontmatter, stream the whole file
raw_content_stream=$(cat "$file")
# Extract content after frontmatter
local start_line end_line
start_line=$(grep -n "^---$" "$file" | head -1 | cut -d: -f1)
end_line=$(grep -n "^---$" "$file" | head -n 2 | tail -1 | cut -d: -f1)
if [[ -n "$start_line" && -n "$end_line" && $start_line -lt $end_line ]]; then
# Stream content after frontmatter
raw_content_stream=$(tail -n +$((end_line + 1)) "$file")
else
# No valid frontmatter, stream the whole file
raw_content_stream=$(cat "$file")
fi
fi
# Sanitize and extract the first non-empty paragraph/line
@@ -324,26 +382,19 @@ convert_markdown_to_html() {
elif [ "$MARKDOWN_PROCESSOR" = "markdown.pl" ]; then
    # Preprocess content to handle fenced code blocks for markdown.pl
    local preprocessed_content="$content"
    # Handle fenced code blocks (``` and ~~~) -> indented
    # Requires awk
    if command -v awk &> /dev/null; then
        preprocessed_content=$(printf '%s' "$preprocessed_content" | awk '
            BEGIN { in_code = 0; }
            /^```[a-zA-Z0-9]*$/ || /^~~~[a-zA-Z0-9]*$/ { if (!in_code) { in_code = 1; print ""; next; } }
            /^```$/ || /^~~~$/ { if (in_code) { in_code = 0; print ""; next; } }
            { if (in_code) { print "    " $0; } else { print $0; } }
        ')
    else
        echo -e "${YELLOW}Warning: awk not found, markdown.pl fenced code block conversion skipped.${NC}" >&2
        # Content remains as original if awk is unavailable
        preprocessed_content="$content"
    fi
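The fenced-to-indented conversion can likewise be tried standalone (the `fences_to_indent` helper name is assumed for the sketch):

```shell
# Convert ```/~~~ fenced code blocks into blank-line-delimited, 4-space-indented
# blocks, the only code-block form markdown.pl understands.
fences_to_indent() {
    awk '
        BEGIN { in_code = 0; }
        /^```[a-zA-Z0-9]*$/ || /^~~~[a-zA-Z0-9]*$/ { if (!in_code) { in_code = 1; print ""; next; } }
        /^```$/ || /^~~~$/ { if (in_code) { in_code = 0; print ""; next; } }
        { if (in_code) { print "    " $0; } else { print $0; } }
    '
}

printf '%s\n' 'before' '```sh' 'echo hi' '```' 'after' | fences_to_indent
```

The opening-fence pattern also matches a bare ```` ``` ````, but its action only fires when not already inside a block, so the closing fence falls through to the second rule.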
# Ensure MARKDOWN_PL_PATH is set and executable
@@ -366,4 +417,4 @@ convert_markdown_to_html() {
return 0
}
# --- Content Functions --- END ---


@@ -14,6 +14,312 @@ source "$(dirname "$0")/cache.sh" || { echo >&2 "Error: Failed to source cache.s
# Helper Functions for Archive Generation
# ==============================================================================
_generate_ram_year_archive_page() {
local year="$1"
[ -z "$year" ] && return 0
local year_index_page="$OUTPUT_DIR/archives/$year/index.html"
mkdir -p "$(dirname "$year_index_page")"
local year_header="$HEADER_TEMPLATE"
local year_footer="$FOOTER_TEMPLATE"
local year_page_title="${MSG_ARCHIVES_FOR:-"Archives for"} $year"
local year_archive_rel_url="/archives/$year/"
year_header=${year_header//\{\{site_title\}\}/"$SITE_TITLE"}
year_header=${year_header//\{\{page_title\}\}/"$year_page_title"}
year_header=${year_header//\{\{site_description\}\}/"$SITE_DESCRIPTION"}
year_header=${year_header//\{\{og_description\}\}/"$SITE_DESCRIPTION"}
year_header=${year_header//\{\{twitter_description\}\}/"$SITE_DESCRIPTION"}
year_header=${year_header//\{\{og_type\}\}/"website"}
year_header=${year_header//\{\{page_url\}\}/"$year_archive_rel_url"}
year_header=${year_header//\{\{site_url\}\}/"$SITE_URL"}
year_header=${year_header//\{\{og_image\}\}/""}
year_header=${year_header//\{\{twitter_image\}\}/""}
local year_schema_json
year_schema_json='<script type="application/ld+json">{"@context": "https://schema.org","@type": "CollectionPage","name": "'"$year_page_title"'","description": "Archive of posts from '"$year"'","url": "'"$SITE_URL$year_archive_rel_url"'","isPartOf": {"@type": "WebSite","name": "'"$SITE_TITLE"'","url": "'"$SITE_URL"'"}}</script>'
year_header=${year_header//\{\{schema_json_ld\}\}/"$year_schema_json"}
year_footer=${year_footer//\{\{current_year\}\}/$(date +%Y)}
year_footer=${year_footer//\{\{author_name\}\}/"$AUTHOR_NAME"}
{
echo "$year_header"
echo "<h1>$year_page_title</h1>"
echo "<ul class=\"month-list\">"
local month_key
for month_key in $(printf '%s\n' "${!month_posts[@]}" | awk -F'|' -v y="$year" '$1 == y { print $0 }' | sort -t'|' -k2,2nr); do
local month_num="${month_key#*|}"
local month_name="${month_name_map[$month_key]}"
local month_post_count
month_post_count=$(printf '%s\n' "${month_posts[$month_key]}" | awk 'NF { c++ } END { print c+0 }')
local month_idx_formatted
month_idx_formatted=$(printf "%02d" "$((10#$month_num))")
local month_var_name="MSG_MONTH_${month_idx_formatted}"
local current_month_name="${!month_var_name:-$month_name}"
local month_url
month_url=$(fix_url "/archives/$year/$month_idx_formatted/")
echo "<li><a href=\"$month_url\">$current_month_name ($month_post_count)</a></li>"
done
echo "</ul>"
echo "$year_footer"
} > "$year_index_page"
}
_generate_ram_month_archive_page() {
local month_key="$1"
[ -z "$month_key" ] && return 0
local year="${month_key%|*}"
local month_num="${month_key#*|}"
local month_idx_formatted
month_idx_formatted=$(printf "%02d" "$((10#$month_num))")
local month_index_page="$OUTPUT_DIR/archives/$year/$month_idx_formatted/index.html"
mkdir -p "$(dirname "$month_index_page")"
local month_name_var="MSG_MONTH_${month_idx_formatted}"
local month_name="${!month_name_var:-${month_name_map[$month_key]}}"
[ -z "$month_name" ] && month_name="Month $month_idx_formatted"
local month_header="$HEADER_TEMPLATE"
local month_footer="$FOOTER_TEMPLATE"
local month_page_title="${MSG_ARCHIVES_FOR:-"Archives for"} $month_name $year"
local month_archive_rel_url="/archives/$year/$month_idx_formatted/"
month_header=${month_header//\{\{site_title\}\}/"$SITE_TITLE"}
month_header=${month_header//\{\{page_title\}\}/"$month_page_title"}
month_header=${month_header//\{\{site_description\}\}/"$SITE_DESCRIPTION"}
month_header=${month_header//\{\{og_description\}\}/"$SITE_DESCRIPTION"}
month_header=${month_header//\{\{twitter_description\}\}/"$SITE_DESCRIPTION"}
month_header=${month_header//\{\{og_type\}\}/"website"}
month_header=${month_header//\{\{page_url\}\}/"$month_archive_rel_url"}
month_header=${month_header//\{\{site_url\}\}/"$SITE_URL"}
month_header=${month_header//\{\{og_image\}\}/""}
month_header=${month_header//\{\{twitter_image\}\}/""}
local month_schema_json
month_schema_json='<script type="application/ld+json">{"@context": "https://schema.org","@type": "CollectionPage","name": "'"$month_page_title"'","description": "Archive of posts from '"$month_name $year"'","url": "'"$SITE_URL$month_archive_rel_url"'","isPartOf": {"@type": "WebSite","name": "'"$SITE_TITLE"'","url": "'"$SITE_URL"'"}}</script>'
month_header=${month_header//\{\{schema_json_ld\}\}/"$month_schema_json"}
month_footer=${month_footer//\{\{current_year\}\}/$(date +%Y)}
month_footer=${month_footer//\{\{author_name\}\}/"$AUTHOR_NAME"}
{
echo "$month_header"
echo "<h1>$month_page_title</h1>"
echo "<div class=\"posts-list\">"
while IFS='|' read -r _ _ _ title date lastmod filename slug image image_caption description author_name author_email; do
[ -z "$title" ] && continue
local post_year post_month post_day
if [[ "$date" =~ ^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2}) ]]; then
post_year="${BASH_REMATCH[1]}"
post_month=$(printf "%02d" "$((10#${BASH_REMATCH[2]}))")
post_day=$(printf "%02d" "$((10#${BASH_REMATCH[3]}))")
else
post_year=$(date +%Y); post_month=$(date +%m); post_day=$(date +%d)
fi
local url_path="${URL_SLUG_FORMAT:-Year/Month/Day/slug}"
url_path="${url_path//Year/$post_year}"
url_path="${url_path//Month/$post_month}"
url_path="${url_path//Day/$post_day}"
url_path="${url_path//slug/$slug}"
local post_url="/$(echo "$url_path" | sed 's|^/||; s|/*$|/|')"
post_url="${SITE_URL}${post_url}"
local display_date_format="$DATE_FORMAT"
if [ "${SHOW_TIMEZONE:-false}" = false ]; then
display_date_format=$(echo "$display_date_format" | sed -e 's/%[zZ]//g' -e 's/[[:space:]]*$//')
fi
local formatted_date
formatted_date=$(format_date "$date" "$display_date_format")
local display_author_name="${author_name:-${AUTHOR_NAME:-Anonymous}}"
cat << EOF
<article>
<h3><a href="${post_url}">$title</a></h3>
    <div class="meta">${MSG_PUBLISHED_ON:-"Published on"} $formatted_date ${MSG_BY:-"by"} <strong>$display_author_name</strong></div>
EOF
if [ -n "$image" ]; then
local image_url
image_url=$(fix_url "$image")
local alt_text="${image_caption:-$title}"
local figcaption_content="${image_caption:-$title}"
cat << EOF
<figure class="featured-image tag-image">
<a href="${post_url}">
<img src="$image_url" alt="$alt_text" />
</a>
<figcaption>$figcaption_content</figcaption>
</figure>
EOF
fi
if [ -n "$description" ]; then
cat << EOF
<div class="summary">
$description
</div>
EOF
fi
cat << EOF
</article>
EOF
done < <(printf '%s\n' "${month_posts[$month_key]}" | awk 'NF' | sort -t'|' -k5,5r)
echo "</div>"
echo "$month_footer"
} > "$month_index_page"
}
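Both archive generators derive post URLs by substituting `Year`/`Month`/`Day`/`slug` into `URL_SLUG_FORMAT`; the substitution can be sketched as a standalone helper (`build_post_path` is a hypothetical name, not a BSSG function):

```shell
# Expand a URL_SLUG_FORMAT-style pattern (default Year/Month/Day/slug) for a post.
# The 10# base prefix avoids octal parsing of day/month values like 08 or 09.
build_post_path() {
    local date="$1" slug="$2" fmt="${3:-Year/Month/Day/slug}"
    local year month day path
    if [[ "$date" =~ ^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2}) ]]; then
        year="${BASH_REMATCH[1]}"
        month=$(printf '%02d' "$((10#${BASH_REMATCH[2]}))")
        day=$(printf '%02d' "$((10#${BASH_REMATCH[3]}))")
    else
        return 1
    fi
    path="${fmt//Year/$year}"
    path="${path//Month/$month}"
    path="${path//Day/$day}"
    path="${path//slug/$slug}"
    path="${path#/}"
    printf '/%s/' "${path%/}"
}

build_post_path "2026-2-9" "ram-mode"
# → /2026/02/09/ram-mode/
```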
_generate_archive_pages_ram() {
echo -e "${YELLOW}Processing archive pages...${NC}"
local archive_index_data
archive_index_data=$(ram_mode_get_dataset "archive_index")
if [ -z "$archive_index_data" ]; then
echo -e "${YELLOW}Warning: No archive index data in RAM. Skipping archive generation.${NC}"
return 0
fi
declare -A month_posts=()
declare -A month_name_map=()
declare -A year_map=()
local line
while IFS= read -r line; do
[ -z "$line" ] && continue
local year month month_name
IFS='|' read -r year month month_name _ <<< "$line"
[ -z "$year" ] && continue
[ -z "$month" ] && continue
local month_key="${year}|${month}"
month_posts["$month_key"]+="$line"$'\n'
month_name_map["$month_key"]="$month_name"
year_map["$year"]=1
done <<< "$archive_index_data"
local header_content="$HEADER_TEMPLATE"
local footer_content="$FOOTER_TEMPLATE"
header_content=${header_content//\{\{site_title\}\}/"$SITE_TITLE"}
header_content=${header_content//\{\{page_title\}\}/"${MSG_ARCHIVES:-"Archives"}"}
header_content=${header_content//\{\{site_description\}\}/"$SITE_DESCRIPTION"}
header_content=${header_content//\{\{og_description\}\}/"$SITE_DESCRIPTION"}
header_content=${header_content//\{\{twitter_description\}\}/"$SITE_DESCRIPTION"}
header_content=${header_content//\{\{og_type\}\}/"website"}
header_content=${header_content//\{\{page_url\}\}/"archives/"}
header_content=${header_content//\{\{site_url\}\}/"$SITE_URL"}
header_content=${header_content//\{\{og_image\}\}/""}
header_content=${header_content//\{\{twitter_image\}\}/""}
local schema_json_ld
schema_json_ld='<script type="application/ld+json">{"@context": "https://schema.org","@type": "CollectionPage","name": "Archives","description": "'"$SITE_DESCRIPTION"'","url": "'"$SITE_URL"'/archives/","isPartOf": {"@type": "WebSite","name": "'"$SITE_TITLE"'","url": "'"$SITE_URL"'"}}</script>'
header_content=${header_content//\{\{schema_json_ld\}\}/"$schema_json_ld"}
footer_content=${footer_content//\{\{current_year\}\}/$(date +%Y)}
footer_content=${footer_content//\{\{author_name\}\}/"$AUTHOR_NAME"}
local archives_index_page="$OUTPUT_DIR/archives/index.html"
mkdir -p "$(dirname "$archives_index_page")"
{
echo "$header_content"
echo "<h1>${MSG_ARCHIVES:-"Archives"}</h1>"
echo "<div class=\"archives-list year-list\">"
local year
for year in $(printf '%s\n' "${!year_map[@]}" | sort -nr); do
[ -z "$year" ] && continue
local year_url
year_url=$(fix_url "/archives/$year/")
echo " <h2><a href=\"$year_url\">$year</a></h2>"
echo " <ul class=\"month-list-detailed\">"
local month_key
for month_key in $(printf '%s\n' "${!month_posts[@]}" | awk -F'|' -v y="$year" '$1 == y { print $0 }' | sort -t'|' -k2,2nr); do
local month_num="${month_key#*|}"
local month_name="${month_name_map[$month_key]}"
local month_idx_formatted
month_idx_formatted=$(printf "%02d" "$((10#$month_num))")
local month_var_name="MSG_MONTH_${month_idx_formatted}"
local current_month_name="${!month_var_name:-$month_name}"
local month_url
month_url=$(fix_url "/archives/$year/$month_idx_formatted/")
local month_post_count
month_post_count=$(printf '%s\n' "${month_posts[$month_key]}" | awk 'NF { c++ } END { print c+0 }')
echo " <li>"
echo " <a href=\"$month_url\">$current_month_name ($month_post_count)</a>"
if [ "${ARCHIVES_LIST_ALL_POSTS:-false}" = true ] && [ "$month_post_count" -gt 0 ]; then
echo " <ul class=\"post-list-condensed-inline\">"
while IFS='|' read -r _ _ _ title date _ filename slug _ _ _ author_name author_email; do
[ -z "$title" ] && continue
local post_year post_month post_day
if [[ "$date" =~ ^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2}) ]]; then
post_year="${BASH_REMATCH[1]}"
post_month=$(printf "%02d" "$((10#${BASH_REMATCH[2]}))")
post_day=$(printf "%02d" "$((10#${BASH_REMATCH[3]}))")
else
post_year=$(date +%Y); post_month=$(date +%m); post_day=$(date +%d)
fi
local url_path="${URL_SLUG_FORMAT:-Year/Month/Day/slug}"
url_path="${url_path//Year/$post_year}"
url_path="${url_path//Month/$post_month}"
url_path="${url_path//Day/$post_day}"
url_path="${url_path//slug/$slug}"
local post_url="/$(echo "$url_path" | sed 's|^/||; s|/*$|/|')"
post_url=$(fix_url "$post_url")
local display_date
display_date=$(echo "$date" | cut -d' ' -f1)
echo " <li><a href=\"$post_url\">[$display_date] $title</a></li>"
done < <(printf '%s\n' "${month_posts[$month_key]}" | awk 'NF' | sort -t'|' -k5,5r)
echo " </ul>"
fi
echo " </li>"
done
echo " </ul>"
done
echo "</div>"
echo "$footer_content"
} > "$archives_index_page"
local year_count=${#year_map[@]}
local month_count=${#month_posts[@]}
local year_jobs month_jobs max_workers
max_workers=$(get_parallel_jobs)
year_jobs="$max_workers"
month_jobs="$max_workers"
if [ "$year_jobs" -gt "$year_count" ]; then
year_jobs="$year_count"
fi
if [ "$month_jobs" -gt "$month_count" ]; then
month_jobs="$month_count"
fi
if [ "$year_jobs" -gt 1 ] && [ "$year_count" -gt 1 ]; then
echo -e "${GREEN}Using shell parallel workers for ${year_count} RAM-mode year archive pages${NC}"
run_parallel "$year_jobs" < <(
while IFS= read -r year; do
[ -z "$year" ] && continue
printf "_generate_ram_year_archive_page '%s'\n" "$year"
done < <(printf '%s\n' "${!year_map[@]}" | sort -nr)
) || return 1
else
local year
for year in $(printf '%s\n' "${!year_map[@]}" | sort -nr); do
_generate_ram_year_archive_page "$year"
done
fi
if [ "$month_jobs" -gt 1 ] && [ "$month_count" -gt 1 ]; then
echo -e "${GREEN}Using shell parallel workers for ${month_count} RAM-mode monthly archive pages${NC}"
run_parallel "$month_jobs" < <(
while IFS= read -r month_key; do
[ -z "$month_key" ] && continue
printf "_generate_ram_month_archive_page '%s'\n" "$month_key"
done < <(printf '%s\n' "${!month_posts[@]}" | sort -t'|' -k1,1nr -k2,2nr)
) || return 1
else
local month_key
for month_key in $(printf '%s\n' "${!month_posts[@]}" | sort -t'|' -k1,1nr -k2,2nr); do
_generate_ram_month_archive_page "$month_key"
done
fi
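The worker-count clamping applied to both year and month jobs above is simply min(workers, items); a minimal sketch (`clamp_jobs` is an illustrative name):

```shell
# Clamp the parallel worker count so it never exceeds the number of queued items.
clamp_jobs() {
    local max_workers="$1" item_count="$2"
    local jobs="$max_workers"
    if [ "$jobs" -gt "$item_count" ]; then
        jobs="$item_count"
    fi
    printf '%s\n' "$jobs"
}

clamp_jobs 8 3   # → 3
clamp_jobs 4 10  # → 4
```

Clamping keeps `run_parallel` from spawning idle workers when there are fewer archive pages than CPU slots.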
echo -e "${GREEN}Archive page processing complete.${NC}"
}
# Check if the main archive index page needs rebuilding
_check_archive_index_rebuild_needed() {
local archive_index_file="$CACHE_DIR/archive_index.txt"
@@ -307,7 +613,7 @@ process_single_month() {
# Generate header
local header_content="$HEADER_TEMPLATE"
local month_page_title="${MSG_ARCHIVES_FOR:-"Archives for"} $month_name $year"
local month_archive_rel_url="/archives/$year/$month_num/"
header_content=${header_content//\{\{site_title\}\}/"$SITE_TITLE"}
header_content=${header_content//\{\{page_title\}\}/"$month_page_title"}
@@ -432,6 +738,11 @@ _process_single_month_parallel_wrapper() {
# Main Archive Generation Orchestrator
# ==============================================================================
generate_archive_pages() {
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
_generate_archive_pages_ram
return $?
fi
echo -e "${YELLOW}Processing archive pages...${NC}"
local archive_index_file="$CACHE_DIR/archive_index.txt"
@@ -549,4 +860,4 @@ generate_archive_pages() {
}
# Make the function available for sourcing
export -f generate_archive_pages


@@ -13,8 +13,204 @@ source "$(dirname "$0")/cache.sh" || { echo >&2 "Error: Failed to source cache.s
# shellcheck source=generate_feeds.sh disable=SC1091
source "$(dirname "$0")/generate_feeds.sh" || { echo >&2 "Error: Failed to source generate_feeds.sh from generate_authors.sh"; exit 1; }
_generate_author_pages_ram() {
echo -e "${YELLOW}Processing author pages${ENABLE_AUTHOR_RSS:+ and RSS feeds}...${NC}"
local authors_index_data
authors_index_data=$(ram_mode_get_dataset "authors_index")
local main_authors_index_output="$OUTPUT_DIR/authors/index.html"
mkdir -p "$OUTPUT_DIR/authors"
if [ -z "$authors_index_data" ]; then
echo -e "${YELLOW}No authors found in RAM index. Skipping author page generation.${NC}"
return 0
fi
declare -A author_posts_by_slug=()
declare -A author_name_by_slug=()
declare -A author_email_by_slug=()
local line author author_slug author_email
while IFS= read -r line; do
[ -z "$line" ] && continue
IFS='|' read -r author author_slug author_email _ <<< "$line"
[ -z "$author" ] && continue
[ -z "$author_slug" ] && continue
if [[ -z "${author_name_by_slug[$author_slug]+_}" ]]; then
author_name_by_slug["$author_slug"]="$author"
author_email_by_slug["$author_slug"]="$author_email"
fi
author_posts_by_slug["$author_slug"]+="$line"$'\n'
done <<< "$authors_index_data"
local author_slug_key
for author_slug_key in $(printf '%s\n' "${!author_name_by_slug[@]}" | sort); do
author="${author_name_by_slug[$author_slug_key]}"
local author_data="${author_posts_by_slug[$author_slug_key]}"
local author_page_html_file="$OUTPUT_DIR/authors/$author_slug_key/index.html"
local author_rss_file="$OUTPUT_DIR/authors/$author_slug_key/${RSS_FILENAME:-rss.xml}"
local author_page_rel_url="authors/${author_slug_key}/"
local author_rss_rel_url="/authors/${author_slug_key}/${RSS_FILENAME:-rss.xml}"
local post_count
post_count=$(printf '%s\n' "$author_data" | awk 'NF { c++ } END { print c+0 }')
mkdir -p "$(dirname "$author_page_html_file")"
local author_page_content=""
author_page_content+="<h1>${MSG_POSTS_BY:-Posts by} $author</h1>"$'\n'
if [ "${ENABLE_AUTHOR_RSS:-false}" = true ]; then
author_page_content+="<p><a href=\"$author_rss_rel_url\">${MSG_RSS_FEED:-RSS Feed}</a></p>"$'\n'
fi
author_page_content+="<div class=\"posts-list\">"$'\n'
while IFS='|' read -r author_name_inner author_slug_inner author_email_inner post_title post_date post_lastmod post_filename post_slug post_image post_image_caption post_description; do
[ -z "$post_title" ] && continue
local post_url
if [ -n "$post_date" ] && [[ "$post_date" =~ ^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2}) ]]; then
local year month day url_path
year="${BASH_REMATCH[1]}"
month=$(printf "%02d" "$((10#${BASH_REMATCH[2]}))")
day=$(printf "%02d" "$((10#${BASH_REMATCH[3]}))")
url_path="${URL_SLUG_FORMAT:-Year/Month/Day/slug}"
url_path="${url_path//Year/$year}"
url_path="${url_path//Month/$month}"
url_path="${url_path//Day/$day}"
url_path="${url_path//slug/$post_slug}"
post_url="/$(echo "$url_path" | sed 's|^/||; s|/*$|/|')"
else
post_url="/$(echo "$post_slug" | sed 's|^/||; s|/*$|/|')"
fi
post_url="${BASE_URL}${post_url}"
local formatted_date
formatted_date=$(format_date "$post_date")
author_page_content+="<article>"$'\n'
author_page_content+=" <h2><a href=\"$post_url\">$post_title</a></h2>"$'\n'
author_page_content+=" <div class=\"meta\">"$'\n'
author_page_content+=" <time datetime=\"$post_date\">$formatted_date</time>"$'\n'
author_page_content+=" </div>"$'\n'
if [ -n "$post_description" ]; then
author_page_content+=" <p class=\"summary\">$post_description</p>"$'\n'
fi
if [ -n "$post_image" ]; then
author_page_content+=" <div class=\"author-image\">"$'\n'
author_page_content+=" <img src=\"$post_image\" alt=\"$post_image_caption\" loading=\"lazy\">"$'\n'
author_page_content+=" </div>"$'\n'
fi
author_page_content+="</article>"$'\n'
done < <(printf '%s\n' "$author_data" | awk 'NF' | sort -t'|' -k5,5r)
author_page_content+="</div>"$'\n'
local page_title="${MSG_POSTS_BY:-Posts by} $author"
local page_description="${MSG_POSTS_BY:-Posts by} $author - $post_count ${MSG_POSTS:-posts}"
local header_content="$HEADER_TEMPLATE"
local footer_content="$FOOTER_TEMPLATE"
header_content=${header_content//\{\{site_title\}\}/"$SITE_TITLE"}
header_content=${header_content//\{\{page_title\}\}/"$page_title"}
header_content=${header_content//\{\{site_description\}\}/"$SITE_DESCRIPTION"}
header_content=${header_content//\{\{og_description\}\}/"$page_description"}
header_content=${header_content//\{\{twitter_description\}\}/"$page_description"}
header_content=${header_content//\{\{og_type\}\}/"website"}
header_content=${header_content//\{\{page_url\}\}/"$author_page_rel_url"}
header_content=${header_content//\{\{site_url\}\}/"$SITE_URL"}
header_content=${header_content//\{\{og_image\}\}/}
header_content=${header_content//\{\{twitter_image\}\}/}
if [ "${ENABLE_AUTHOR_RSS:-false}" = true ]; then
local author_rss_link="<link rel=\"alternate\" type=\"application/rss+xml\" title=\"$author RSS Feed\" href=\"$SITE_URL$author_rss_rel_url\">"
header_content=${header_content//<!-- bssg:tag_rss_link -->/$author_rss_link}
else
# Strip the placeholder only when no author feed is generated; stripping it
# unconditionally first would leave nothing for the RSS link to replace.
header_content=${header_content//<!-- bssg:tag_rss_link -->/}
fi
local schema_json
schema_json="{\"@context\": \"https://schema.org\",\"@type\": \"CollectionPage\",\"name\": \"$page_title\",\"description\": \"$page_description\",\"url\": \"$SITE_URL$author_page_rel_url\",\"isPartOf\": {\"@type\": \"WebSite\",\"name\": \"$SITE_TITLE\",\"url\": \"$SITE_URL\"}}"
header_content=${header_content//\{\{schema_json_ld\}\}/"<script type=\"application/ld+json\">$schema_json</script>"}
local current_year
current_year=$(date +%Y)
footer_content=${footer_content//\{\{current_year\}\}/"$current_year"}
footer_content=${footer_content//\{\{author_name\}\}/"$AUTHOR_NAME"}
footer_content=${footer_content//\{\{all_rights_reserved\}\}/"${MSG_ALL_RIGHTS_RESERVED:-All rights reserved.}"}
{
echo "$header_content"
echo "$author_page_content"
echo "$footer_content"
} > "$author_page_html_file"
if [ "${ENABLE_AUTHOR_RSS:-false}" = true ]; then
local author_post_data
author_post_data=$(printf '%s\n' "$author_data" | awk 'NF' | sort -t'|' -k5,5r | awk -F'|' '{
author_name = $1
author_email = $3
title = $4
date = $5
lastmod = $6
filename = $7
post_slug = $8
image = $9
image_caption = $10
description = $11
printf "%s|%s|%s|%s|%s||%s|%s|%s|%s|%s|%s\n", filename, filename, title, date, lastmod, post_slug, image, image_caption, description, author_name, author_email
}')
_generate_rss_feed "$author_rss_file" "$SITE_TITLE - ${MSG_POSTS_BY:-Posts by} $author" "${MSG_POSTS_BY:-Posts by} $author" "$author_page_rel_url" "$author_rss_rel_url" "$author_post_data"
fi
done
local page_title="${MSG_ALL_AUTHORS:-All Authors}"
local page_description="${MSG_ALL_AUTHORS:-All Authors} - $SITE_DESCRIPTION"
local header_content="$HEADER_TEMPLATE"
local footer_content="$FOOTER_TEMPLATE"
local main_content=""
header_content=${header_content//\{\{site_title\}\}/"$SITE_TITLE"}
header_content=${header_content//\{\{page_title\}\}/"$page_title"}
header_content=${header_content//\{\{site_description\}\}/"$SITE_DESCRIPTION"}
header_content=${header_content//\{\{og_description\}\}/"$page_description"}
header_content=${header_content//\{\{twitter_description\}\}/"$page_description"}
header_content=${header_content//\{\{og_type\}\}/"website"}
header_content=${header_content//\{\{page_url\}\}/"authors/"}
header_content=${header_content//\{\{site_url\}\}/"$SITE_URL"}
header_content=${header_content//\{\{og_image\}\}/}
header_content=${header_content//\{\{twitter_image\}\}/}
header_content=${header_content//<!-- bssg:tag_rss_link -->/}
local schema_json
schema_json="{\"@context\": \"https://schema.org\",\"@type\": \"CollectionPage\",\"name\": \"$page_title\",\"description\": \"List of all authors on $SITE_TITLE\",\"url\": \"$SITE_URL/authors/\",\"isPartOf\": {\"@type\": \"WebSite\",\"name\": \"$SITE_TITLE\",\"url\": \"$SITE_URL\"}}"
header_content=${header_content//\{\{schema_json_ld\}\}/"<script type=\"application/ld+json\">$schema_json</script>"}
local current_year
current_year=$(date +%Y)
footer_content=${footer_content//\{\{current_year\}\}/"$current_year"}
footer_content=${footer_content//\{\{author_name\}\}/"$AUTHOR_NAME"}
footer_content=${footer_content//\{\{all_rights_reserved\}\}/"${MSG_ALL_RIGHTS_RESERVED:-All rights reserved.}"}
main_content+="<h1>${MSG_ALL_AUTHORS:-All Authors}</h1>"$'\n'
main_content+="<div class=\"tags-list\">"$'\n'
for author_slug_key in $(printf '%s\n' "${!author_name_by_slug[@]}" | sort); do
author="${author_name_by_slug[$author_slug_key]}"
local post_count
post_count=$(printf '%s\n' "${author_posts_by_slug[$author_slug_key]}" | awk 'NF { c++ } END { print c+0 }')
if [ "$post_count" -gt 0 ]; then
main_content+=" <a href=\"$BASE_URL/authors/$author_slug_key/\">$author <span class=\"tag-count\">($post_count)</span></a>"$'\n'
fi
done
main_content+="</div>"$'\n'
{
echo "$header_content"
echo "$main_content"
echo "$footer_content"
} > "$main_authors_index_output"
echo -e "${GREEN}Author pages processed!${NC}"
echo -e "${GREEN}Generated author list pages.${NC}"
}
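The post-count expression used throughout these generators, `awk 'NF { c++ } END { print c+0 }'`, counts non-empty lines and prints 0 for empty input; in isolation (wrapper name assumed):

```shell
# Count non-empty lines; c+0 forces a numeric 0 when no line ever matched.
count_nonempty() {
    awk 'NF { c++ } END { print c+0 }'
}

printf 'a\n\nb\n' | count_nonempty   # → 2
printf '' | count_nonempty           # → 0
```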
# Generate author pages
generate_author_pages() {
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
_generate_author_pages_ram
return $?
fi
echo -e "${YELLOW}Processing author pages${ENABLE_AUTHOR_RSS:+ and RSS feeds}...${NC}"
local authors_index_file="$CACHE_DIR/authors_index.txt"
@@ -473,4 +669,4 @@ generate_author_pages() {
echo -e "${GREEN}Author pages processed!${NC}"
echo -e "${GREEN}Generated author list pages.${NC}"
}


@@ -14,6 +14,180 @@ source "$(dirname "$0")/cache.sh" || { echo >&2 "Error: Failed to source cache.s
source "$(dirname "$0")/content.sh" || { echo >&2 "Error: Failed to source content.sh from generate_feeds.sh"; exit 1; }
# Note: Needs access to primary_pages and SECONDARY_PAGES which should be exported by templates.sh
declare -gA BSSG_RAM_RSS_FULL_CONTENT_CACHE=()
declare -g BSSG_RAM_RSS_FULL_CONTENT_CACHE_READY=false
declare -gA BSSG_RAM_RSS_PUBDATE_CACHE=()
declare -gA BSSG_RAM_RSS_UPDATED_ISO_CACHE=()
declare -gA BSSG_RAM_RSS_URL_CACHE=()
declare -gA BSSG_RAM_RSS_ITEM_XML_CACHE=()
declare -g BSSG_RAM_RSS_METADATA_CACHE_READY=false
_normalize_relative_url_path() {
local path="$1"
while [[ "$path" == */ ]]; do
path="${path%/}"
done
path="${path#/}"
if [ -z "$path" ]; then
printf '/'
else
printf '/%s/' "$path"
fi
}
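The normalizer above collapses trailing slashes and guarantees exactly one leading and one trailing slash; the function is copied below so the examples are self-contained:

```shell
# Copy of _normalize_relative_url_path for standalone demonstration.
_normalize_relative_url_path() {
    local path="$1"
    while [[ "$path" == */ ]]; do
        path="${path%/}"
    done
    path="${path#/}"
    if [ -z "$path" ]; then
        printf '/'
    else
        printf '/%s/' "$path"
    fi
}

_normalize_relative_url_path "2026/02/09/post//"   # → /2026/02/09/post/
_normalize_relative_url_path ""                    # → /
```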
_ram_strip_frontmatter_for_rss() {
awk '
BEGIN { in_fm = 0; found_fm = 0; }
/^---$/ {
if (!in_fm && !found_fm) { in_fm = 1; found_fm = 1; next; }
if (in_fm) { in_fm = 0; next; }
}
{ if (!in_fm) print; }
'
}
_ram_cache_full_content_for_file() {
local file="$1"
local resolved="$file"
if declare -F ram_mode_resolve_key > /dev/null; then
resolved=$(ram_mode_resolve_key "$file")
fi
if [[ -z "$resolved" ]]; then
return 1
fi
if [[ -n "${BSSG_RAM_RSS_FULL_CONTENT_CACHE[$resolved]+_}" ]]; then
return 0
fi
if ! declare -F ram_mode_has_file > /dev/null || ! ram_mode_has_file "$resolved"; then
return 1
fi
local raw_content
raw_content=$(ram_mode_get_content "$resolved")
local stripped_content
stripped_content=$(printf '%s\n' "$raw_content" | _ram_strip_frontmatter_for_rss)
local converted_html
converted_html=$(convert_markdown_to_html "$stripped_content" "$resolved")
local convert_status=$?
if [ $convert_status -ne 0 ] || [ -z "$converted_html" ]; then
return 1
fi
BSSG_RAM_RSS_FULL_CONTENT_CACHE["$resolved"]="$converted_html"
return 0
}
prepare_ram_rss_full_content_cache() {
if [ "${BSSG_RAM_MODE:-false}" != true ] || [ "${RSS_INCLUDE_FULL_CONTENT:-false}" != true ]; then
return 0
fi
if [ "$BSSG_RAM_RSS_FULL_CONTENT_CACHE_READY" = true ]; then
return 0
fi
local file_index_data
file_index_data=$(ram_mode_get_dataset "file_index")
if [ -z "$file_index_data" ]; then
BSSG_RAM_RSS_FULL_CONTENT_CACHE_READY=true
return 0
fi
local file filename title date lastmod tags slug image image_caption description author_name author_email
while IFS='|' read -r file filename title date lastmod tags slug image image_caption description author_name author_email; do
[ -z "$file" ] && continue
_ram_cache_full_content_for_file "$file" > /dev/null || true
done <<< "$file_index_data"
BSSG_RAM_RSS_FULL_CONTENT_CACHE_READY=true
}
_ram_prime_rss_metadata_entry() {
local date="$1"
local lastmod="$2"
local slug="$3"
local rss_date_fmt="$4"
local build_timestamp_iso="$5"
local source_file="$6"
if [ -n "$date" ] && [[ -z "${BSSG_RAM_RSS_PUBDATE_CACHE[$date]+_}" ]]; then
BSSG_RAM_RSS_PUBDATE_CACHE["$date"]=$(format_date "$date" "$rss_date_fmt")
fi
if [ -n "$lastmod" ] && [[ -z "${BSSG_RAM_RSS_UPDATED_ISO_CACHE[$lastmod]+_}" ]]; then
local updated_date_iso
updated_date_iso=$(format_date "$lastmod" "%Y-%m-%dT%H:%M:%S%z")
if [[ "$updated_date_iso" =~ ([+-][0-9]{2})([0-9]{2})$ ]]; then
updated_date_iso="${updated_date_iso::${#updated_date_iso}-2}:${BASH_REMATCH[2]}"
fi
[ -z "$updated_date_iso" ] && updated_date_iso="$build_timestamp_iso"
BSSG_RAM_RSS_UPDATED_ISO_CACHE["$lastmod"]="$updated_date_iso"
fi
if [ -n "$date" ] && [ -n "$slug" ]; then
local url_key="${date}|${slug}"
if [[ -z "${BSSG_RAM_RSS_URL_CACHE[$url_key]+_}" ]]; then
local year month day formatted_path item_url
if [[ "$date" =~ ^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2}) ]]; then
year="${BASH_REMATCH[1]}"
month=$(printf "%02d" "$((10#${BASH_REMATCH[2]}))")
day=$(printf "%02d" "$((10#${BASH_REMATCH[3]}))")
else
if [ "${RAM_MODE_VERBOSE:-false}" = true ]; then
echo "Warning: Invalid date format '$date' for file $source_file, cannot precompute RSS URL." >&2
fi
return 1
fi
formatted_path="${URL_SLUG_FORMAT//Year/$year}"
formatted_path="${formatted_path//Month/$month}"
formatted_path="${formatted_path//Day/$day}"
formatted_path="${formatted_path//slug/$slug}"
item_url=$(_normalize_relative_url_path "$formatted_path")
BSSG_RAM_RSS_URL_CACHE["$url_key"]=$(fix_url "$item_url")
fi
fi
return 0
}
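The offset rewrite used for ISO timestamps above, turning a strftime `%z` suffix (`+HHMM`) into the `+HH:MM` form RFC 3339 expects, can be isolated (`colonize_tz` is an illustrative name):

```shell
# Insert a colon into a trailing +HHMM/-HHMM timezone offset; other strings pass through.
colonize_tz() {
    local ts="$1"
    if [[ "$ts" =~ ([+-][0-9]{2})([0-9]{2})$ ]]; then
        # Drop the last two digits, then re-append them after a colon.
        ts="${ts::${#ts}-2}:${BASH_REMATCH[2]}"
    fi
    printf '%s\n' "$ts"
}

colonize_tz "2026-02-10T19:08:59+0100"   # → 2026-02-10T19:08:59+01:00
colonize_tz "2026-02-10T19:08:59Z"       # → 2026-02-10T19:08:59Z (unchanged)
```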
prepare_ram_rss_metadata_cache() {
if [ "${BSSG_RAM_MODE:-false}" != true ]; then
return 0
fi
if [ "$BSSG_RAM_RSS_METADATA_CACHE_READY" = true ]; then
return 0
fi
local file_index_data
file_index_data=$(ram_mode_get_dataset "file_index")
if [ -z "$file_index_data" ]; then
BSSG_RAM_RSS_METADATA_CACHE_READY=true
return 0
fi
local rss_date_fmt="%a, %d %b %Y %H:%M:%S %z"
local build_timestamp_iso
build_timestamp_iso=$(format_date "now" "%Y-%m-%dT%H:%M:%S%z")
if [[ "$build_timestamp_iso" =~ ([+-][0-9]{2})([0-9]{2})$ ]]; then
build_timestamp_iso="${build_timestamp_iso::${#build_timestamp_iso}-2}:${BASH_REMATCH[2]}"
fi
local file filename title date lastmod tags slug image image_caption description author_name author_email
while IFS='|' read -r file filename title date lastmod tags slug image image_caption description author_name author_email; do
[ -z "$file" ] && continue
_ram_prime_rss_metadata_entry "$date" "$lastmod" "$slug" "$rss_date_fmt" "$build_timestamp_iso" "$file" >/dev/null || true
done <<< "$file_index_data"
BSSG_RAM_RSS_METADATA_CACHE_READY=true
}
# Function to get the latest lastmod date from a file index, optionally filtered
# Usage: get_latest_mod_date <index_file> [field_index] [filter_pattern] [date_format]
# Example: get_latest_mod_date "$file_index" 5 "" "%Y-%m-%d" # Latest overall post
@@ -53,6 +227,212 @@ get_latest_mod_date() {
fi
}
# Fast path for RAM datasets: pick max YYYY-MM-DD from a given field without external sort/head.
_ram_latest_date_from_dataset() {
local dataset="$1"
local field_index="$2"
local date_format="${3:-%Y-%m-%d}"
local latest_date_str
latest_date_str=$(printf '%s\n' "$dataset" | awk -F'|' -v field_index="$field_index" '
NF {
value = substr($field_index, 1, 10)
if (value != "" && value > max_date) {
max_date = value
}
}
END {
if (max_date != "") {
print max_date
}
}
')
if [ -n "$latest_date_str" ]; then
printf '%s\n' "$latest_date_str"
else
format_date "now" "$date_format"
fi
}
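`_ram_latest_date_from_dataset` relies on `format_date` only for its empty-input fallback; the core max-date scan alone looks like this (wrapper name assumed):

```shell
# Print the greatest YYYY-MM-DD prefix found in field N of a pipe-delimited
# dataset; lexicographic comparison equals chronological order for this format.
latest_date_field() {
    local field_index="$1"
    awk -F'|' -v field_index="$field_index" '
        NF {
            value = substr($field_index, 1, 10)
            if (value != "" && value > max_date) { max_date = value }
        }
        END { if (max_date != "") print max_date }
    '
}

printf '%s\n' 'a|2024-05-01' 'b|2026-02-10 19:08' 'c|2025-12-31' | latest_date_field 2
# → 2026-02-10
```

`substr(..., 1, 10)` trims any time-of-day suffix, so full timestamps and bare dates compare uniformly.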
_generate_sitemap_with_awk_inputs() {
local sitemap="$1"
local file_index_input="$2"
local primary_pages_input="$3"
local secondary_pages_input="$4"
local tags_index_input="$5"
local authors_index_input="$6"
local latest_post_mod_date="$7"
local latest_tag_page_mod_date="$8"
local latest_author_page_mod_date="$9"
local sitemap_date_fmt="${10:-%Y-%m-%d}"
# Determine the best awk command locally to avoid potential scoping issues with AWK_CMD.
local effective_awk_cmd="awk"
if command -v gawk > /dev/null 2>&1; then
effective_awk_cmd="gawk"
fi
"$effective_awk_cmd" -v site_url="$SITE_URL" \
-v url_slug_format="$URL_SLUG_FORMAT" \
-v latest_post_mod_date="$latest_post_mod_date" \
-v latest_tag_page_mod_date="$latest_tag_page_mod_date" \
-v latest_author_page_mod_date="$latest_author_page_mod_date" \
-v enable_author_pages="${ENABLE_AUTHOR_PAGES:-true}" \
-v sitemap_date_fmt="$sitemap_date_fmt" \
-F'|' \
-f - \
"$file_index_input" "$primary_pages_input" "$secondary_pages_input" "$tags_index_input" "$authors_index_input" <<'AWK_EOF' > "$sitemap"
# AWK script for sitemap generation.
BEGIN {
OFS = ""
print "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
print "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">"
# Homepage
print " <url>"
print " <loc>" fix_url_awk("/", site_url) "</loc>"
print " <lastmod>" latest_post_mod_date "</lastmod>"
print " <changefreq>daily</changefreq>"
print " <priority>1.0</priority>"
print " </url>"
}
function fix_url_awk(path, base_url) {
if (substr(path, 1, 1) == "/") {
sub(/\/$/, "", base_url)
sub(/^\/+/, "/", path)
sub(/\/index\.html$/, "/", path)
if (substr(path, length(path), 1) != "/") {
path = path "/"
}
if (base_url == "" || base_url ~ /^http:\/\/localhost(:[0-9]+)?$/) {
return path
} else {
return base_url path
}
} else {
return path
}
}
# Process file_index (posts).
FILENAME == ARGV[1] {
file = $1
date = $4
lastmod = $5
slug = $7
if (length(file) == 0 || length(date) == 0 || length(lastmod) == 0 || length(slug) == 0) next
year = substr(date, 1, 4)
month = substr(date, 6, 2)
day = substr(date, 9, 2)
if (year ~ /^[0-9]{4}$/ && month ~ /^[0-9]{2}$/ && day ~ /^[0-9]{2}$/) {
formatted_path = url_slug_format
gsub(/Year/, year, formatted_path)
gsub(/Month/, month, formatted_path)
gsub(/Day/, day, formatted_path)
gsub(/slug/, slug, formatted_path)
item_url = "/" formatted_path
sub(/\/+$/, "/", item_url)
mod_time = substr(lastmod, 1, 10)
if (mod_time == "") next
print " <url>"
print " <loc>" fix_url_awk(item_url, site_url) "</loc>"
print " <lastmod>" mod_time "</lastmod>"
print " <changefreq>weekly</changefreq>"
print " <priority>0.8</priority>"
print " </url>"
}
}
# Process primary pages.
FILENAME == ARGV[2] {
url = $2
date = $3
if (length(url) == 0 || length(date) == 0) next
sitemap_url = url
sub(/index\.html$/, "", sitemap_url)
sub(/\/+$/, "/", sitemap_url)
mod_time = substr(date, 1, 10)
if (mod_time == "") next
print " <url>"
print " <loc>" fix_url_awk(sitemap_url, site_url) "</loc>"
print " <lastmod>" mod_time "</lastmod>"
print " <changefreq>monthly</changefreq>"
print " <priority>0.7</priority>"
print " </url>"
}
# Process secondary pages.
FILENAME == ARGV[3] {
url = $2
date = $3
if (length(url) == 0 || length(date) == 0) next
sitemap_url = url
sub(/index\.html$/, "", sitemap_url)
sub(/\/+$/, "/", sitemap_url)
mod_time = substr(date, 1, 10)
if (mod_time == "") next
print " <url>"
print " <loc>" fix_url_awk(sitemap_url, site_url) "</loc>"
print " <lastmod>" mod_time "</lastmod>"
print " <changefreq>monthly</changefreq>"
print " <priority>0.6</priority>"
print " </url>"
}
# Process tags index.
FILENAME == ARGV[4] {
tag_slug = $2
if (length(tag_slug) == 0) next
if (!(tag_slug in processed_tags)) {
processed_tags[tag_slug] = 1
item_url = "/tags/" tag_slug "/"
print " <url>"
print " <loc>" fix_url_awk(item_url, site_url) "</loc>"
print " <lastmod>" latest_tag_page_mod_date "</lastmod>"
print " <changefreq>weekly</changefreq>"
print " <priority>0.5</priority>"
print " </url>"
}
}
# Process authors index.
FILENAME == ARGV[5] && enable_author_pages == "true" {
author_slug = $2
if (length(author_slug) == 0) next
if (!(author_slug in processed_authors)) {
processed_authors[author_slug] = 1
if (!authors_index_added) {
authors_index_added = 1
print " <url>"
print " <loc>" fix_url_awk("/authors/", site_url) "</loc>"
print " <lastmod>" latest_author_page_mod_date "</lastmod>"
print " <changefreq>weekly</changefreq>"
print " <priority>0.6</priority>"
print " </url>"
}
item_url = "/authors/" author_slug "/"
print " <url>"
print " <loc>" fix_url_awk(item_url, site_url) "</loc>"
print " <lastmod>" latest_author_page_mod_date "</lastmod>"
print " <changefreq>weekly</changefreq>"
print " <priority>0.5</priority>"
print " </url>"
}
}
END {
print "</urlset>"
}
AWK_EOF
}
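The awk-side `fix_url_awk` mirrors the shell `fix_url` helper. A minimal standalone sketch of the same normalization rules, runnable outside the generator (the path and base URL here are illustrative values, not taken from the project):

```shell
# Reproduces the fix_url_awk rules: strip the base's trailing slash,
# collapse leading slashes, drop a trailing index.html, enforce a
# trailing slash, and skip prepending for empty/localhost bases.
result=$(awk 'BEGIN {
  path = "/posts/index.html"; base = "https://example.com/"
  sub(/\/$/, "", base)
  sub(/^\/+/, "/", path)
  sub(/\/index\.html$/, "/", path)
  if (substr(path, length(path), 1) != "/") path = path "/"
  if (base == "" || base ~ /^http:\/\/localhost(:[0-9]+)?$/) {
    print path
  } else {
    print base path
  }
}')
echo "$result"
```

With the values above this yields `https://example.com/posts/`; with a `http://localhost:8080` base it would emit the bare relative path instead.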
# Core RSS generation function
# Usage: _generate_rss_feed <output_file> <feed_title> <feed_description> <feed_link_rel> <feed_atom_link_rel> <post_data_input>
# <post_data_input> should be a string containing the filtered, sorted, and limited post data,
@@ -80,67 +460,95 @@ _generate_rss_feed() {
# Ensure output directory exists
mkdir -p "$(dirname "$output_file")"
# Create the RSS feed header
cat > "$output_file" << EOF
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
<title>$(html_escape "$feed_title")</title>
<link>$(fix_url "$feed_link_rel")</link>
<description>$(html_escape "$feed_description")</description>
<language>${SITE_LANG:-en}</language>
<lastBuildDate>$(format_date "now" "$rss_date_fmt")</lastBuildDate>
<atom:link href="$(fix_url "$feed_atom_link_rel")" rel="self" type="application/rss+xml" />
EOF
local escaped_feed_title escaped_feed_description feed_link feed_atom_link channel_last_build_date
escaped_feed_title=$(html_escape "$feed_title")
escaped_feed_description=$(html_escape "$feed_description")
feed_link=$(fix_url "$feed_link_rel")
feed_atom_link=$(fix_url "$feed_atom_link_rel")
channel_last_build_date=$(format_date "now" "$rss_date_fmt")
exec 4> "$output_file" || return 1
printf '%s\n' \
'<?xml version="1.0" encoding="UTF-8" ?>' \
'<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">' \
'<channel>' \
" <title>${escaped_feed_title}</title>" \
" <link>${feed_link}</link>" \
" <description>${escaped_feed_description}</description>" \
" <language>${SITE_LANG:-en}</language>" \
" <lastBuildDate>${channel_last_build_date}</lastBuildDate>" \
" <atom:link href=\"${feed_atom_link}\" rel=\"self\" type=\"application/rss+xml\" />" >&4
# Process the provided post data
echo "$post_data_input" | while IFS='|' read -r file filename title date lastmod tags slug image image_caption description author_name author_email; do
while IFS='|' read -r file filename title date lastmod tags slug image image_caption description author_name author_email; do
# Ignore blank trailing lines from callers.
if [ -z "$file" ] && [ -z "$filename" ] && [ -z "$title" ] && [ -z "$date" ] && [ -z "$lastmod" ] && [ -z "$tags" ] && [ -z "$slug" ] && [ -z "$image" ] && [ -z "$image_caption" ] && [ -z "$description" ] && [ -z "$author_name" ] && [ -z "$author_email" ]; then
continue
fi
# Skip if essential fields are missing (robustness)
if [ -z "$file" ] || [ -z "$title" ] || [ -z "$date" ] || [ -z "$lastmod" ] || [ -z "$slug" ]; then
echo "Warning: Skipping RSS item due to missing fields in input line: file=$file, title=$title, date=$date, lastmod=$lastmod, slug=$slug" >&2
continue
fi
# Format dates for RSS
local pub_date=$(format_date "$date" "$rss_date_fmt")
local updated_date_iso=$(format_date "$lastmod" "%Y-%m-%dT%H:%M:%S%z")
# Insert a colon into the trailing timezone offset if needed (+0100 -> +01:00)
if [[ "$updated_date_iso" =~ ([+-][0-9]{2})([0-9]{2})$ ]]; then
updated_date_iso="${updated_date_iso::${#updated_date_iso}-2}:${BASH_REMATCH[2]}"
local rss_item_cache_key=""
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
rss_item_cache_key="${RSS_INCLUDE_FULL_CONTENT:-false}|${file}|${date}|${lastmod}|${slug}|${title}"
if [[ -n "${BSSG_RAM_RSS_ITEM_XML_CACHE[$rss_item_cache_key]+_}" ]]; then
printf '%s' "${BSSG_RAM_RSS_ITEM_XML_CACHE[$rss_item_cache_key]}" >&4
continue
fi
fi
# Fallback for updated_date_iso
[ -z "$updated_date_iso" ] && updated_date_iso="$build_timestamp_iso"
# Construct post URL based on URL_SLUG_FORMAT
local year month day formatted_path item_url
if [[ "$date" =~ ^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2}) ]]; then
year="${BASH_REMATCH[1]}"
month=$(printf "%02d" "$((10#${BASH_REMATCH[2]}))")
day=$(printf "%02d" "$((10#${BASH_REMATCH[3]}))")
# Format dates and URL (RAM mode caches repeated values across many tag feeds).
local pub_date updated_date_iso full_url
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
_ram_prime_rss_metadata_entry "$date" "$lastmod" "$slug" "$rss_date_fmt" "$build_timestamp_iso" "$file" || {
echo "Warning: Invalid date format '$date' for file $file, cannot generate URL." >&2
continue
}
pub_date="${BSSG_RAM_RSS_PUBDATE_CACHE[$date]}"
updated_date_iso="${BSSG_RAM_RSS_UPDATED_ISO_CACHE[$lastmod]}"
full_url="${BSSG_RAM_RSS_URL_CACHE[${date}|${slug}]}"
else
echo "Warning: Invalid date format '$date' for file $file, cannot generate URL." >&2
continue # Skip item if URL cannot be generated
fi
formatted_path="${URL_SLUG_FORMAT//Year/$year}"
formatted_path="${formatted_path//Month/$month}"
formatted_path="${formatted_path//Day/$day}"
formatted_path="${formatted_path//slug/$slug}"
item_url="/$(echo "$formatted_path" | sed 's|/*$|/|')" # Ensure trailing slash
pub_date=$(format_date "$date" "$rss_date_fmt")
updated_date_iso=$(format_date "$lastmod" "%Y-%m-%dT%H:%M:%S%z")
if [[ "$updated_date_iso" =~ ([+-][0-9]{2})([0-9]{2})$ ]]; then
updated_date_iso="${updated_date_iso::${#updated_date_iso}-2}:${BASH_REMATCH[2]}"
fi
[ -z "$updated_date_iso" ] && updated_date_iso="$build_timestamp_iso"
local full_url=$(fix_url "$item_url") # Use fix_url to prepend SITE_URL
local year month day formatted_path item_url
if [[ "$date" =~ ^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2}) ]]; then
year="${BASH_REMATCH[1]}"
month=$(printf "%02d" "$((10#${BASH_REMATCH[2]}))")
day=$(printf "%02d" "$((10#${BASH_REMATCH[3]}))")
else
echo "Warning: Invalid date format '$date' for file $file, cannot generate URL." >&2
continue
fi
formatted_path="${URL_SLUG_FORMAT//Year/$year}"
formatted_path="${formatted_path//Month/$month}"
formatted_path="${formatted_path//Day/$day}"
formatted_path="${formatted_path//slug/$slug}"
item_url=$(_normalize_relative_url_path "$formatted_path")
full_url=$(fix_url "$item_url")
fi
# --- RSS Item Description Enhancement ---
local item_description_content=""
local figure_part=""
local caption_part=""
local content_part=""
local escaped_title
escaped_title=$(html_escape "$title")
# Build figure part
if [ -n "$image" ]; then
local img_src
[[ "$image" =~ ^https?:// ]] && img_src="$image" || img_src=$(fix_url "$image")
# Escape alt/title attributes safely using html_escape from utils.sh
local img_alt=$(html_escape "$title")
local img_alt="$escaped_title"
local img_title=$(html_escape "$image_caption")
[ -z "$img_title" ] && img_title="$img_alt" # Use alt if title is empty
@@ -155,8 +563,24 @@ EOF
# Build content part (excerpt or full)
if [ "${RSS_INCLUDE_FULL_CONTENT:-false}" = true ]; then
local raw_content_cache_file="${CACHE_DIR:-.bssg_cache}/content/$(basename "$file")"
if [ -f "$raw_content_cache_file" ]; then
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
local resolved_file="$file"
if declare -F ram_mode_resolve_key > /dev/null; then
resolved_file=$(ram_mode_resolve_key "$file")
fi
if _ram_cache_full_content_for_file "$resolved_file"; then
content_part="${BSSG_RAM_RSS_FULL_CONTENT_CACHE[$resolved_file]}"
else
# RAM mode is memory-only: never fall back to disk cache reads.
if [ "${RAM_MODE_VERBOSE:-false}" = true ]; then
echo "Warning: RAM content not available for RSS item ($file). Falling back to excerpt." >&2
fi
content_part="$description"
fi
else
local raw_content_cache_file="${CACHE_DIR:-.bssg_cache}/content/$(basename "$file")"
if [ -f "$raw_content_cache_file" ]; then
local raw_content=$(cat "$raw_content_cache_file")
local converted_html=$(convert_markdown_to_html "$raw_content" "$file")
local convert_status=$?
@@ -166,9 +590,10 @@ EOF
echo "Warning: Failed to convert markdown to HTML for RSS item ($file, status: $convert_status). Falling back to excerpt." >&2
content_part="$description"
fi
else
echo "Warning: Cached raw markdown content file '$raw_content_cache_file' not found for RSS item ($file). Falling back to excerpt." >&2
content_part="$description"
else
echo "Warning: Cached raw markdown content file '$raw_content_cache_file' not found for RSS item ($file). Falling back to excerpt." >&2
content_part="$description"
fi
fi
else
content_part="$description"
@@ -194,26 +619,35 @@ EOF
fi
fi
cat >> "$output_file" << EOF
<item>
<title>$(html_escape "$title")</title>
local rss_item_xml
rss_item_xml=" <item>
<title>${escaped_title}</title>
<link>${full_url}</link>
<guid isPermaLink="true">${full_url}</guid>
<guid isPermaLink=\"true\">${full_url}</guid>
<pubDate>${pub_date}</pubDate>
<atom:updated>${updated_date_iso}</atom:updated>
<description>${final_description}</description>
${author_element}
</item>
EOF
done
"
if [ -n "$author_element" ]; then
rss_item_xml+="${author_element}"$'\n'
fi
rss_item_xml+=" </item>
"
printf '%s' "$rss_item_xml" >&4
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
BSSG_RAM_RSS_ITEM_XML_CACHE["$rss_item_cache_key"]="$rss_item_xml"
fi
done <<< "$post_data_input"
# Close the RSS feed
cat >> "$output_file" << EOF
</channel>
</rss>
EOF
printf '%s\n' '</channel>' '</rss>' >&4
exec 4>&-
echo -e "${GREEN}RSS feed generated at $output_file${NC}"
if [ "${BSSG_RAM_MODE:-false}" != true ] || [ "${RAM_MODE_VERBOSE:-false}" = true ]; then
echo -e "${GREEN}RSS feed generated at $output_file${NC}"
fi
}
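Atom timestamps require a colon in the UTC offset, but `date`'s `%z` emits `+0100`; the item loop above patches this with a bash regex. A self-contained sketch of just that step (the timestamp is an example value):

```shell
# Insert a colon into a trailing numeric UTC offset: +0100 -> +01:00.
# Mirrors the BASH_REMATCH logic used for <atom:updated> above.
ts="2026-02-10T19:08:59+0100"
if [[ "$ts" =~ ([+-][0-9]{2})([0-9]{2})$ ]]; then
  # Keep everything but the last two digits, then re-append them after a colon.
  ts="${ts::${#ts}-2}:${BASH_REMATCH[2]}"
fi
echo "$ts"
```

Timestamps that already carry a colon (or end in `Z`) fail the regex and pass through unchanged.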
export -f _generate_rss_feed # Export for potential parallel use or sourcing
@@ -221,6 +655,28 @@ export -f _generate_rss_feed # Export for potential parallel use or sourcing
generate_rss() {
echo -e "${YELLOW}Generating main RSS feed...${NC}"
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
local file_index_data
file_index_data=$(ram_mode_get_dataset "file_index")
if [ -z "$file_index_data" ]; then
echo -e "${YELLOW}No file index data in RAM. Skipping RSS generation.${NC}"
return 0
fi
prepare_ram_rss_metadata_cache >/dev/null || true
local rss="$OUTPUT_DIR/${RSS_FILENAME:-rss.xml}"
local feed_title="${MSG_RSS_FEED_TITLE:-${SITE_TITLE} - RSS Feed}"
local feed_desc="${MSG_RSS_FEED_DESCRIPTION:-${SITE_DESCRIPTION}}"
local feed_link_rel="/"
local feed_atom_link_rel="/${RSS_FILENAME:-rss.xml}"
local rss_item_limit=${RSS_ITEM_LIMIT:-15}
local sorted_posts
sorted_posts=$(printf '%s\n' "$file_index_data" | awk 'NF' | sort -t'|' -k4,4r -k5,5r | head -n "$rss_item_limit")
_generate_rss_feed "$rss" "$feed_title" "$feed_desc" "$feed_link_rel" "$feed_atom_link_rel" "$sorted_posts"
return 0
fi
# Ensure needed functions/vars are available
if ! command -v convert_markdown_to_html &> /dev/null; then
echo -e "${RED}Error: convert_markdown_to_html function not found.${NC}" >&2; return 1; fi
@@ -296,6 +752,39 @@ export -f generate_rss
generate_sitemap() {
echo -e "${YELLOW}Generating sitemap.xml...${NC}"
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
local sitemap="$OUTPUT_DIR/sitemap.xml"
local file_index_data tags_index_data authors_index_data primary_pages_data secondary_pages_data
file_index_data=$(ram_mode_get_dataset "file_index")
tags_index_data=$(ram_mode_get_dataset "tags_index")
authors_index_data=$(ram_mode_get_dataset "authors_index")
primary_pages_data=$(ram_mode_get_dataset "primary_pages")
secondary_pages_data=$(ram_mode_get_dataset "secondary_pages")
local latest_post_mod_date latest_tag_page_mod_date latest_author_page_mod_date
latest_post_mod_date=$(_ram_latest_date_from_dataset "$file_index_data" 5 "%Y-%m-%d")
latest_tag_page_mod_date=$(_ram_latest_date_from_dataset "$tags_index_data" 5 "%Y-%m-%d")
latest_author_page_mod_date=$(_ram_latest_date_from_dataset "$authors_index_data" 6 "%Y-%m-%d")
[ -z "$latest_tag_page_mod_date" ] && latest_tag_page_mod_date="$latest_post_mod_date"
[ -z "$latest_author_page_mod_date" ] && latest_author_page_mod_date="$latest_post_mod_date"
_generate_sitemap_with_awk_inputs \
"$sitemap" \
<(printf '%s\n' "$file_index_data") \
<(printf '%s\n' "$primary_pages_data") \
<(printf '%s\n' "$secondary_pages_data") \
<(printf '%s\n' "$tags_index_data") \
<(printf '%s\n' "$authors_index_data") \
"$latest_post_mod_date" \
"$latest_tag_page_mod_date" \
"$latest_author_page_mod_date" \
"%Y-%m-%d"
echo -e "${GREEN}Sitemap generated!${NC}"
return 0
fi
local sitemap="$OUTPUT_DIR/sitemap.xml"
local file_index="$CACHE_DIR/file_index.txt"
local tags_index="$CACHE_DIR/tags_index.txt"
@@ -342,196 +831,23 @@ generate_sitemap() {
local latest_tag_page_mod_date=$(get_latest_mod_date "$tags_index" 5 "" "$sitemap_date_fmt") # Assumes lastmod is relevant field in tags_index
local latest_author_page_mod_date=$(get_latest_mod_date "$authors_index" 6 "" "$sitemap_date_fmt") # Field 6 is lastmod in authors_index
# --- Generate Sitemap using AWK --- START ---
echo "Generating sitemap content using awk..."
# Determine the best awk command locally to avoid potential scoping issues with AWK_CMD
local effective_awk_cmd="awk" # Default to standard awk
if command -v gawk > /dev/null 2>&1; then
effective_awk_cmd="gawk" # Prefer gawk if available
fi
# Use awk with a here-doc for the script for cleaner quoting
# Use the locally determined effective_awk_cmd
"$effective_awk_cmd" -v site_url="$SITE_URL" \
-v url_slug_format="$URL_SLUG_FORMAT" \
-v latest_post_mod_date="$latest_post_mod_date" \
-v latest_tag_page_mod_date="$latest_tag_page_mod_date" \
-v latest_author_page_mod_date="$latest_author_page_mod_date" \
-v enable_author_pages="${ENABLE_AUTHOR_PAGES:-true}" \
-v sitemap_date_fmt="$sitemap_date_fmt" \
-F'|' \
-f - \
"$file_index" "$primary_pages_cache" "$secondary_pages_cache" "$tags_index" "$authors_index" <<'AWK_EOF' > "$sitemap"
# AWK script for sitemap generation (fed via here-doc)
BEGIN {
OFS=""; # No output field separator needed for XML
print "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";
print "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">";
# Homepage
print " <url>";
print " <loc>" fix_url_awk("/", site_url) "</loc>";
print " <lastmod>" latest_post_mod_date "</lastmod>";
print " <changefreq>daily</changefreq>";
print " <priority>1.0</priority>";
print " </url>";
}
# Custom function to replicate fix_url shell function logic
function fix_url_awk(path, base_url) {
if (substr(path, 1, 1) == "/") {
# Remove trailing slash from base_url if present
sub(/\/$/, "", base_url);
# Ensure path doesn't start with //
sub(/^\/+/, "/", path);
# Remove index.html if present
sub(/\/index\.html$/, "/", path);
# Ensure trailing slash
if (substr(path, length(path), 1) != "/") {
path = path "/";
}
# Handle case where base_url is empty or just http://localhost* - skip prepending
if (base_url == "" || base_url ~ /^http:\/\/localhost(:[0-9]+)?$/) {
return path
} else {
return base_url path;
}
} else {
return path; # Non-absolute paths are returned unchanged (not expected for sitemap entries)
}
}
# Process file_index.txt (Posts)
FILENAME == ARGV[1] {
file=$1; filename=$2; title=$3; date=$4; lastmod=$5; tags=$6; slug=$7;
if (length(file) == 0 || length(date) == 0 || length(lastmod) == 0 || length(slug) == 0) next;
year=substr(date, 1, 4);
month=substr(date, 6, 2);
day=substr(date, 9, 2);
# Basic validation of the date components:
if (year ~ /^[0-9]{4}$/ && month ~ /^[0-9]{2}$/ && day ~ /^[0-9]{2}$/) {
formatted_path = url_slug_format;
gsub(/Year/, year, formatted_path);
gsub(/Month/, month, formatted_path);
gsub(/Day/, day, formatted_path);
gsub(/slug/, slug, formatted_path);
item_url = "/" formatted_path;
# Clean URL logic from shell script
sub(/\/+$/, "/", item_url);
mod_time = substr(lastmod, 1, 10); # Extract YYYY-MM-DD from lastmod ($5)
if (mod_time == "") next; # Skip if date is invalid/empty
print " <url>";
print " <loc>" fix_url_awk(item_url, site_url) "</loc>";
print " <lastmod>" mod_time "</lastmod>";
print " <changefreq>weekly</changefreq>";
print " <priority>0.8</priority>";
print " </url>";
}
}
# Process primary_pages.tmp
FILENAME == ARGV[2] {
url=$2; date=$3; # $1=_, $4=source_file
if (length(url) == 0 || length(date) == 0) next;
sitemap_url = url;
sub(/index\.html$/, "", sitemap_url); # Remove index.html
sub(/\/+$/, "/", sitemap_url); # Ensure trailing slash
mod_time = substr(date, 1, 10); # Extract YYYY-MM-DD from date ($3)
if (mod_time == "") next; # Skip if date is invalid/empty
print " <url>";
print " <loc>" fix_url_awk(sitemap_url, site_url) "</loc>";
print " <lastmod>" mod_time "</lastmod>";
print " <changefreq>monthly</changefreq>";
print " <priority>0.7</priority>";
print " </url>";
}
# Process secondary_pages.tmp
FILENAME == ARGV[3] {
url=$2; date=$3; # $1=_, $4=source_file
if (length(url) == 0 || length(date) == 0) next;
sitemap_url = url;
sub(/index\.html$/, "", sitemap_url);
sub(/\/+$/, "/", sitemap_url);
mod_time = substr(date, 1, 10); # Extract YYYY-MM-DD from date ($3)
if (mod_time == "") next; # Skip if date is invalid/empty
print " <url>";
print " <loc>" fix_url_awk(sitemap_url, site_url) "</loc>";
print " <lastmod>" mod_time "</lastmod>";
print " <changefreq>monthly</changefreq>";
print " <priority>0.6</priority>"; # Lower priority for secondary?
print " </url>";
}
# Process tags_index.txt (Tag Pages)
FILENAME == ARGV[4] {
tag=$1; tag_slug=$2; # $5 = lastmod for posts with this tag
if (length(tag_slug) == 0) next;
# Check if tag slug already processed
if ( !(tag_slug in processed_tags) ) {
processed_tags[tag_slug] = 1; # Mark as processed
item_url = "/tags/" tag_slug "/";
# All tag pages share the overall latest tag mod date
mod_time = latest_tag_page_mod_date;
print " <url>";
print " <loc>" fix_url_awk(item_url, site_url) "</loc>";
print " <lastmod>" mod_time "</lastmod>";
print " <changefreq>weekly</changefreq>";
print " <priority>0.5</priority>";
print " </url>";
}
}
# Process authors_index.txt (Author Pages) - only if author pages are enabled
FILENAME == ARGV[5] && enable_author_pages == "true" {
author_name=$1; author_slug=$2; # $6 = lastmod for posts with this author
if (length(author_slug) == 0) next;
# Check if author slug already processed
if ( !(author_slug in processed_authors) ) {
processed_authors[author_slug] = 1; # Mark as processed
# Add main authors index page (only once)
if (!authors_index_added) {
authors_index_added = 1;
print " <url>";
print " <loc>" fix_url_awk("/authors/", site_url) "</loc>";
print " <lastmod>" latest_author_page_mod_date "</lastmod>";
print " <changefreq>weekly</changefreq>";
print " <priority>0.6</priority>";
print " </url>";
}
# Add individual author page
item_url = "/authors/" author_slug "/";
mod_time = latest_author_page_mod_date;
print " <url>";
print " <loc>" fix_url_awk(item_url, site_url) "</loc>";
print " <lastmod>" mod_time "</lastmod>";
print " <changefreq>weekly</changefreq>";
print " <priority>0.5</priority>";
print " </url>";
}
}
END {
print "</urlset>";
}
AWK_EOF
# awk exit status check - optional
# local awk_status=$?
# if [ $awk_status -ne 0 ]; then
# echo -e "${RED}Error: awk script for sitemap generation failed with status $awk_status${NC}" >&2
# # Decide whether to return 1 or continue
# fi
# --- Generate Sitemap using AWK --- END ---
_generate_sitemap_with_awk_inputs \
"$sitemap" \
"$file_index" \
"$primary_pages_cache" \
"$secondary_pages_cache" \
"$tags_index" \
"$authors_index" \
"$latest_post_mod_date" \
"$latest_tag_page_mod_date" \
"$latest_author_page_mod_date" \
"$sitemap_date_fmt"
echo -e "${GREEN}Sitemap generated!${NC}"
}
# Export public functions
export -f generate_sitemap generate_rss
export -f _normalize_relative_url_path
export -f _ram_strip_frontmatter_for_rss _ram_cache_full_content_for_file prepare_ram_rss_full_content_cache
export -f generate_sitemap generate_rss


@@ -10,8 +10,297 @@ source "$(dirname "$0")/utils.sh" || { echo >&2 "Error: Failed to source utils.s
# shellcheck source=cache.sh disable=SC1091
source "$(dirname "$0")/cache.sh" || { echo >&2 "Error: Failed to source cache.sh from generate_index.sh"; exit 1; }
_generate_index_ram() {
echo -e "${YELLOW}Generating index pages...${NC}"
local file_index_data
file_index_data=$(ram_mode_get_dataset "file_index")
if [ -z "$file_index_data" ]; then
echo -e "${YELLOW}No posts found in RAM file index. Skipping index generation.${NC}"
return 0
fi
local total_posts_orig
total_posts_orig=$(printf '%s\n' "$file_index_data" | awk 'NF { c++ } END { print c+0 }')
local total_pages=$(( (total_posts_orig + POSTS_PER_PAGE - 1) / POSTS_PER_PAGE ))
[ "$total_pages" -eq 0 ] && total_pages=1
mapfile -t file_index_lines < <(printf '%s\n' "$file_index_data" | awk 'NF')
echo -e "Generating ${GREEN}$total_pages${NC} index pages for ${GREEN}$total_posts_orig${NC} posts"
local current_page
for (( current_page = 1; current_page <= total_pages; current_page++ )); do
local output_file
if [ "$current_page" -eq 1 ]; then
output_file="$OUTPUT_DIR/index.html"
else
output_file="$OUTPUT_DIR/page/$current_page/index.html"
mkdir -p "$(dirname "$output_file")"
fi
local page_header="$HEADER_TEMPLATE"
if [ "$current_page" -eq 1 ]; then
page_header=${page_header//\{\{site_title\}\}/"$SITE_TITLE"}
page_header=${page_header//\{\{page_title\}\}/"${MSG_HOME:-"Home"}"}
page_header=${page_header//\{\{og_type\}\}/"website"}
page_header=${page_header//\{\{page_url\}\}/""}
page_header=${page_header//\{\{site_url\}\}/"$SITE_URL"}
local home_schema
home_schema=$(cat <<EOF
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "WebSite",
"name": "$SITE_TITLE",
"description": "$SITE_DESCRIPTION",
"url": "$SITE_URL/",
"potentialAction": {
"@type": "SearchAction",
"target": "$SITE_URL/search?q={search_term_string}",
"query-input": "required name=search_term_string"
},
"publisher": {
"@type": "Organization",
"name": "$SITE_TITLE",
"url": "$SITE_URL"
}
}
</script>
EOF
)
page_header=${page_header//\{\{schema_json_ld\}\}/"$home_schema"}
else
local pag_title
pag_title=$(printf "${MSG_PAGINATION_TITLE:-"%s - Page %d"}" "$SITE_TITLE" "$current_page")
page_header=${page_header//\{\{site_title\}\}/"$SITE_TITLE"}
page_header=${page_header//\{\{page_title\}\}/"$pag_title"}
page_header=${page_header//\{\{og_type\}\}/"website"}
local paginated_rel_url="/page/$current_page/"
page_header=${page_header//\{\{page_url\}\}/"$paginated_rel_url"}
page_header=${page_header//\{\{site_url\}\}/"$SITE_URL"}
local collection_schema
collection_schema=$(cat <<EOF
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "CollectionPage",
"name": "$pag_title",
"description": "$SITE_DESCRIPTION",
"url": "$SITE_URL${paginated_rel_url}",
"isPartOf": {
"@type": "WebSite",
"name": "$SITE_TITLE",
"url": "$SITE_URL"
}
}
</script>
EOF
)
page_header=${page_header//\{\{schema_json_ld\}\}/"$collection_schema"}
fi
page_header=${page_header//\{\{site_description\}\}/"$SITE_DESCRIPTION"}
page_header=${page_header//\{\{og_description\}\}/"$SITE_DESCRIPTION"}
page_header=${page_header//\{\{twitter_description\}\}/"$SITE_DESCRIPTION"}
page_header=${page_header//\{\{og_image\}\}/""}
page_header=${page_header//\{\{twitter_image\}\}/""}
local page_footer="$FOOTER_TEMPLATE"
page_footer=${page_footer//\{\{current_year\}\}/$(date +%Y)}
page_footer=${page_footer//\{\{author_name\}\}/"$AUTHOR_NAME"}
cat > "$output_file" <<EOF
$page_header
EOF
local index_file="${PAGES_DIR}/index.md"
local has_custom_index=false
if [ "${BSSG_RAM_MODE:-false}" = true ] && declare -F ram_mode_has_file > /dev/null && ram_mode_has_file "$index_file"; then
has_custom_index=true
elif [ -f "$index_file" ]; then
has_custom_index=true
fi
if [ "$current_page" -eq 1 ] && [ "$has_custom_index" = true ]; then
local content="" html_content="" in_frontmatter=false found_frontmatter=false source_stream=""
if [ "${BSSG_RAM_MODE:-false}" = true ] && ram_mode_has_file "$index_file"; then
source_stream=$(ram_mode_get_content "$index_file")
else
source_stream=$(cat "$index_file")
fi
while IFS= read -r line; do
if [[ "$line" == "---" ]]; then
if ! $in_frontmatter && ! $found_frontmatter; then
in_frontmatter=true
found_frontmatter=true
continue
elif $in_frontmatter; then
in_frontmatter=false
continue
fi
fi
if ! $in_frontmatter && $found_frontmatter; then
content+="$line"$'\n'
fi
done <<< "$source_stream"
if ! $found_frontmatter; then
content="$source_stream"
fi
html_content=$(convert_markdown_to_html "$content")
echo "$html_content" >> "$output_file"
cat >> "$output_file" <<EOF
$page_footer
EOF
continue
fi
if [ "$total_posts_orig" -gt 0 ]; then
cat >> "$output_file" <<EOF
<h1>${MSG_LATEST_POSTS:-"Latest Posts"}</h1>
<div class="posts-list">
EOF
local start_index=$(( (current_page - 1) * POSTS_PER_PAGE ))
local end_index=$(( start_index + POSTS_PER_PAGE - 1 ))
local i
for (( i = start_index; i <= end_index && i < total_posts_orig; i++ )); do
local file filename title date lastmod tags slug image image_caption description author_name author_email
IFS='|' read -r file filename title date lastmod tags slug image image_caption description author_name author_email <<< "${file_index_lines[$i]}"
[ -z "$file" ] && continue
[ -z "$title" ] && continue
[ -z "$date" ] && continue
local post_year post_month post_day
if [[ "$date" =~ ^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2}) ]]; then
post_year="${BASH_REMATCH[1]}"
post_month=$(printf "%02d" "$((10#${BASH_REMATCH[2]}))")
post_day=$(printf "%02d" "$((10#${BASH_REMATCH[3]}))")
else
post_year=$(date +%Y); post_month=$(date +%m); post_day=$(date +%d)
fi
local formatted_path="${URL_SLUG_FORMAT//Year/$post_year}"
formatted_path="${formatted_path//Month/$post_month}"
formatted_path="${formatted_path//Day/$post_day}"
formatted_path="${formatted_path//slug/$slug}"
local post_link="/$formatted_path/"
local display_date_format="$DATE_FORMAT"
if [ "${SHOW_TIMEZONE:-false}" = false ]; then
display_date_format=$(echo "$display_date_format" | sed -e 's/%[zZ]//g' -e 's/[[:space:]]*$//')
fi
local formatted_date
formatted_date=$(format_date "$date" "$display_date_format")
cat >> "$output_file" <<EOF
<article>
<h3><a href="$(fix_url "$post_link")">$title</a></h3>
<div class="meta">${MSG_PUBLISHED_ON:-"Published on"} $formatted_date${author_name:+" ${MSG_BY:-"by"} ${author_name:-$AUTHOR_NAME}"}</div>
EOF
if [ -n "$image" ]; then
local image_url="$image"
if [[ "$image" == /* ]]; then
image_url="${SITE_URL}${image}"
fi
cat >> "$output_file" <<EOF
<div class="featured-image index-image">
<a href="$(fix_url "$post_link")">
<img src="$image_url" alt="${image_caption:-$title}" title="${image_caption:-$title}" />
</a>
</div>
EOF
fi
if [ "${INDEX_SHOW_FULL_CONTENT:-false}" = "true" ]; then
local post_content="" html_content=""
if [ "${BSSG_RAM_MODE:-false}" = true ] && ram_mode_has_file "$file"; then
local source_stream
source_stream=$(ram_mode_get_content "$file")
post_content=$(printf '%s\n' "$source_stream" | awk '
BEGIN { in_fm = 0; found_fm = 0; }
/^---$/ {
if (!in_fm && !found_fm) { in_fm = 1; found_fm = 1; next; }
if (in_fm) { in_fm = 0; next; }
}
{ if (!in_fm) print; }
')
fi
if [ -n "$post_content" ]; then
if [[ "$file" == *.md ]]; then
html_content=$(convert_markdown_to_html "$post_content")
else
html_content="$post_content"
fi
fi
if [ -n "$html_content" ]; then
cat >> "$output_file" <<EOF
<div class="post-content">
$html_content
</div>
EOF
fi
elif [ -n "$description" ]; then
cat >> "$output_file" <<EOF
<div class="summary">
$description
</div>
EOF
fi
cat >> "$output_file" <<EOF
</article>
EOF
done
cat >> "$output_file" <<EOF
</div> <!-- .posts-list -->
EOF
if [ "$total_pages" -gt 1 ]; then
cat >> "$output_file" <<EOF
<!-- Pagination -->
<div class="pagination">
EOF
if [ "$current_page" -gt 1 ]; then
local prev_page=$((current_page - 1))
local prev_url="/"
if [ $prev_page -ne 1 ]; then
prev_url="/page/$prev_page/"
fi
cat >> "$output_file" <<PAG_EOF
<a href="$(fix_url "$prev_url")" class="prev">&laquo; ${MSG_NEWER_POSTS:-Newer}</a>
PAG_EOF
fi
cat >> "$output_file" <<PAG_EOF
<span class="page-info">$(printf "${MSG_PAGE_INFO_TEMPLATE:-Page %d of %d}" "$current_page" "$total_pages")</span>
PAG_EOF
if [ "$current_page" -lt "$total_pages" ]; then
local next_page=$((current_page + 1))
cat >> "$output_file" <<PAG_EOF
<a href="$(fix_url "/page/$next_page/")" class="next">${MSG_OLDER_POSTS:-Older} &raquo;</a>
PAG_EOF
fi
cat >> "$output_file" <<EOF
</div>
EOF
fi
fi
cat >> "$output_file" <<EOF
$page_footer
EOF
done
echo -e "${GREEN}Index pages processed!${NC}"
}
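Post permalinks in the listing are built by token substitution on `URL_SLUG_FORMAT`, with month and day zero-padded via `printf`. A minimal sketch of that expansion, assuming the `Year/Month/Day/slug` format and hypothetical post values:

```shell
# Expand URL_SLUG_FORMAT tokens into a post permalink (illustrative values).
URL_SLUG_FORMAT="Year/Month/Day/slug"
date="2026-2-9"; slug="hello-world"
if [[ "$date" =~ ^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2}) ]]; then
  year="${BASH_REMATCH[1]}"
  # Force base-10 so leading zeros in the source date are not read as octal.
  month=$(printf "%02d" "$((10#${BASH_REMATCH[2]}))")
  day=$(printf "%02d" "$((10#${BASH_REMATCH[3]}))")
fi
path="${URL_SLUG_FORMAT//Year/$year}"
path="${path//Month/$month}"
path="${path//Day/$day}"
path="${path//slug/$slug}"
post_link="/$path/"
echo "$post_link"
```

The `10#` prefix matters for dates like `2026-08-09`, where `08`/`09` would otherwise be invalid octal literals in bash arithmetic.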
# Generate main index page (homepage) and paginated pages
generate_index() {
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
_generate_index_ram
return $?
fi
echo -e "${YELLOW}Generating index pages...${NC}"
# Check if rebuild is needed (using function from cache.sh)
@@ -235,8 +524,20 @@ EOF
local post_content=""
local content_cache_file="${CACHE_DIR:-.bssg_cache}/content/$(basename "$file")"
# Try RAM preload first
if [ "${BSSG_RAM_MODE:-false}" = true ] && declare -F ram_mode_has_file > /dev/null && ram_mode_has_file "$file"; then
local source_stream
source_stream=$(ram_mode_get_content "$file")
post_content=$(printf '%s\n' "$source_stream" | awk '
BEGIN { in_fm = 0; found_fm = 0; }
/^---$/ {
if (!in_fm && !found_fm) { in_fm = 1; found_fm = 1; next; }
if (in_fm) { in_fm = 0; next; }
}
{ if (!in_fm) print; }
')
# Try to get content from cache first
if [ -f "$content_cache_file" ]; then
elif [ -f "$content_cache_file" ]; then
post_content=$(cat "$content_cache_file")
else
# Extract content from source file if cache doesn't exist
@@ -354,9 +655,8 @@ EOF
# Use GNU parallel if available and beneficial
if [ "${HAS_PARALLEL:-false}" = true ] && [ "$total_pages" -gt 2 ] ; then
echo -e "${GREEN}Using GNU parallel to process index pages${NC}"
local cores=1
if command -v nproc > /dev/null 2>&1; then cores=$(nproc);
elif command -v sysctl > /dev/null 2>&1; then cores=$(sysctl -n hw.ncpu 2>/dev/null || echo 1); fi
local cores
cores=$(get_parallel_jobs)
# Use all detected cores
local jobs=$cores
@@ -393,4 +693,4 @@ EOF
}
# Make the function available for sourcing
export -f generate_index
export -f generate_index


@@ -24,9 +24,13 @@ convert_page() {
# IMPORTANT: Assumes CACHE_DIR, FORCE_REBUILD, PAGES_DIR, SITE_TITLE, SITE_DESCRIPTION, SITE_URL, AUTHOR_NAME are exported/available
local output_html_file="$output_base_path/index.html"
local ram_mode_active=false
if [ "${BSSG_RAM_MODE:-false}" = true ] && declare -F ram_mode_has_file > /dev/null && ram_mode_has_file "$input_file"; then
ram_mode_active=true
fi
# Check if the source file exists
if [ ! -f "$input_file" ]; then
if ! $ram_mode_active && [ ! -f "$input_file" ]; then
echo -e "${RED}Error: Source page '$input_file' not found${NC}" >&2
return 1
fi
@@ -45,21 +49,31 @@ convert_page() {
if [[ "$input_file" == *.html ]]; then
# For HTML files, extract content between <body> tags (simple approach)
html_content=$(sed -n '/<body>/,/<\/body>/p' "$input_file" | sed '1d;$d')
local html_source=""
if $ram_mode_active; then
html_source=$(ram_mode_get_content "$input_file")
else
html_source=$(cat "$input_file")
fi
html_content=$(printf '%s\n' "$html_source" | sed -n '/<body>/,/<\/body>/p' | sed '1d;$d')
# We might not have raw content for reading time easily here
content=$(echo "$html_content" | sed 's/<[^>]*>//g') # Basic text extraction for reading time
else
# For markdown files, extract content after frontmatter
local start_line=$(grep -n "^---$" "$input_file" | head -1 | cut -d: -f1)
local end_line=$(grep -n "^---$" "$input_file" | head -2 | tail -1 | cut -d: -f1)
if [[ -z "$start_line" || -z "$end_line" || ! $start_line -lt $end_line ]]; then
# No valid frontmatter found, use the whole file
content=$(cat "$input_file")
local source_stream=""
if $ram_mode_active; then
source_stream=$(ram_mode_get_content "$input_file")
else
# Extract content after the second --- line
content=$(tail -n +$((end_line + 1)) "$input_file")
source_stream=$(cat "$input_file")
fi
content=$(printf '%s\n' "$source_stream" | awk '
BEGIN { in_fm = 0; found_fm = 0; }
/^---$/ {
if (!in_fm && !found_fm) { in_fm = 1; found_fm = 1; next; }
if (in_fm) { in_fm = 0; next; }
}
{ if (!in_fm) print; }
')
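The single-pass awk frontmatter stripper introduced above can be exercised in isolation. `strip_frontmatter` is a hypothetical wrapper name for illustration; the awk body mirrors the state machine in the diff (first `---` opens the frontmatter block, the next one closes it, everything else passes through):

```shell
# Strip a leading YAML frontmatter block delimited by --- lines,
# printing only the body. Reads stdin, writes stdout.
strip_frontmatter() {
    awk '
        BEGIN { in_fm = 0; found_fm = 0 }
        /^---$/ {
            # First --- opens frontmatter; the matching --- closes it.
            if (!in_fm && !found_fm) { in_fm = 1; found_fm = 1; next }
            if (in_fm)               { in_fm = 0; next }
        }
        { if (!in_fm) print }
    '
}
```

One awk process replaces the old `grep -n`/`tail` pipeline (and, in the posts path, a bash `while read` loop), which is the source of the speedup the commit claims on large files.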
# --- MODIFIED PART --- START ---
# Convert markdown content to HTML using the function from content.sh
@@ -178,10 +192,13 @@ process_all_pages() {
return 0
fi
echo -e "Checking ${GREEN}${#page_files[@]}${NC} pages for changes"
# Use mapfile -t to read sorted files into array (newline-separated, trailing newline stripped)
mapfile -t page_files < <(find "${PAGES_DIR:-pages}" -type f \( -name "*.md" -o -name "*.html" \) -not -path "*/.*" | sort)
local page_files=()
if [ "${BSSG_RAM_MODE:-false}" = true ] && declare -F ram_mode_list_page_files > /dev/null; then
mapfile -t page_files < <(ram_mode_list_page_files)
else
mapfile -t page_files < <(find "${PAGES_DIR:-pages}" -type f \( -name "*.md" -o -name "*.html" \) -not -path "*/.*" | sort)
fi
local num_pages=${#page_files[@]}
if [ "$num_pages" -eq 0 ]; then
@@ -190,14 +207,37 @@ process_all_pages() {
fi
echo -e "Found ${GREEN}$num_pages${NC} potential pages."
local ram_mode_active=false
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
ram_mode_active=true
fi
# RAM mode keeps source content only in-process (bash arrays).
# GNU parallel spawns fresh shells that cannot access those arrays.
if $ram_mode_active; then
if [ "$num_pages" -gt 1 ]; then
echo -e "${YELLOW}Using shell parallel workers for $num_pages RAM-mode pages${NC}"
local cores
cores=$(get_parallel_jobs)
{
local file quoted_file
for file in "${page_files[@]}"; do
printf -v quoted_file '%q' "$file"
echo "process_single_page_file $quoted_file"
done
} | run_parallel "$cores"
else
echo -e "${YELLOW}Using sequential processing for RAM-mode pages${NC}"
process_single_page_file "${page_files[0]}"
fi
# Use GNU parallel if available, otherwise fallback
# IMPORTANT: Assumes HAS_PARALLEL is exported/available
if [ "${HAS_PARALLEL:-false}" = true ]; then
elif [ "${HAS_PARALLEL:-false}" = true ]; then
echo -e "${GREEN}Using GNU parallel to generate pages${NC}"
# Determine number of cores
local cores=1
if command -v nproc > /dev/null 2>&1; then cores=$(nproc);
elif command -v sysctl > /dev/null 2>&1; then cores=$(sysctl -n hw.ncpu 2>/dev/null || echo 1); fi
local cores
cores=$(get_parallel_jobs)
# Export functions needed by the parallel process and its children
export -f convert_page process_single_page_file
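The RAM-mode branch above serializes one command per page for `run_parallel`, quoting each filename with `printf %q` so paths containing spaces or shell metacharacters survive the round trip through a command stream. A standalone sketch of that serialization step (`emit_commands` is a hypothetical name for illustration):

```shell
# Emit one safely quoted "process_single_page_file <file>" command per
# argument, suitable for piping into a worker that eval's each line.
emit_commands() {
    local file quoted
    for file in "$@"; do
        printf -v quoted '%q' "$file"
        printf 'process_single_page_file %s\n' "$quoted"
    done
}
```

Without the `%q` quoting, a page named `about us.md` would be split into two words when the worker re-parses the command line.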
@ -223,4 +263,4 @@ process_all_pages() {
echo -e "${GREEN}Static page processing complete!${NC}"
}
# --- Page Generation Functions --- END ---
# --- Page Generation Functions --- END ---


@@ -16,6 +16,63 @@ source "$(dirname "$0")/related_posts.sh" || { echo >&2 "Error: Failed to source
# --- Post Generation Functions --- START ---
declare -gA BSSG_POST_ISO8601_CACHE=()
format_iso8601_post_date() {
local input_dt="$1"
local iso_dt=""
if [ -z "$input_dt" ]; then
echo ""
return
fi
local cache_key="${TIMEZONE:-local}|${input_dt}"
if [[ "$(declare -p BSSG_POST_ISO8601_CACHE 2>/dev/null || true)" != "declare -A"* ]]; then
unset BSSG_POST_ISO8601_CACHE 2>/dev/null || true
declare -gA BSSG_POST_ISO8601_CACHE=()
fi
if [[ -n "${BSSG_POST_ISO8601_CACHE[$cache_key]+_}" ]]; then
echo "${BSSG_POST_ISO8601_CACHE[$cache_key]}"
return
fi
# Handle "now" separately
if [ "$input_dt" = "now" ]; then
iso_dt=$(LC_ALL=C date +"%Y-%m-%dT%H:%M:%S%z" 2>/dev/null)
else
# Try parsing different formats based on OS
if [[ "$OSTYPE" == "darwin"* ]] || [[ "$OSTYPE" == *"bsd"* ]]; then
# Format 1: YYYY-MM-DD HH:MM:SS ZZZZ (e.g., +0200)
iso_dt=$(LC_ALL=C date -j -f "%Y-%m-%d %H:%M:%S %z" "$input_dt" +"%Y-%m-%dT%H:%M:%S%z" 2>/dev/null)
# Format 2: YYYY-MM-DD HH:MM:SS
[ -z "$iso_dt" ] && iso_dt=$(LC_ALL=C date -j -f "%Y-%m-%d %H:%M:%S" "$input_dt" +"%Y-%m-%dT%H:%M:%S%z" 2>/dev/null)
# Format 3: YYYY-MM-DD (assume T00:00:00)
[ -z "$iso_dt" ] && iso_dt=$(LC_ALL=C date -j -f "%Y-%m-%d" "$input_dt" +"%Y-%m-%dT00:00:00%z" 2>/dev/null)
# Format 4: RFC 2822 subset (e.g., 07 Sep 2023 08:10:00 +0200)
[ -z "$iso_dt" ] && iso_dt=$(LC_ALL=C date -j -f "%d %b %Y %H:%M:%S %z" "$input_dt" +"%Y-%m-%dT%H:%M:%S%z" 2>/dev/null)
else
# GNU date -d handles many formats.
iso_dt=$(LC_ALL=C date -d "$input_dt" +"%Y-%m-%dT%H:%M:%S%z" 2>/dev/null)
fi
fi
# Normalize timezone from +0000 to Z and +hhmm to +hh:mm.
if [ -n "$iso_dt" ] && [[ "$iso_dt" =~ ([+-][0-9]{2})([0-9]{2})$ ]]; then
local tz_offset="${BASH_REMATCH[0]}"
local tz_hh="${BASH_REMATCH[1]}"
local tz_mm="${BASH_REMATCH[2]}"
if [ "$tz_hh" = "+00" ] && [ "$tz_mm" = "00" ]; then
iso_dt="${iso_dt%$tz_offset}Z"
else
iso_dt="${iso_dt%$tz_offset}${tz_hh}:${tz_mm}"
fi
fi
BSSG_POST_ISO8601_CACHE["$cache_key"]="$iso_dt"
echo "$iso_dt"
}
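The timezone normalization step at the end of `format_iso8601_post_date` can be isolated for testing. `normalize_iso_tz` is a hypothetical helper name; the regex and suffix rewriting are taken directly from the function above (`+0000` becomes `Z`, any other `±hhmm` offset gains a colon to become `±hh:mm`, as RFC 3339 expects):

```shell
# Normalize the trailing UTC offset of an ISO 8601 timestamp:
#   2024-05-01T10:00:00+0000 -> 2024-05-01T10:00:00Z
#   2024-05-01T10:00:00+0200 -> 2024-05-01T10:00:00+02:00
normalize_iso_tz() {
    local iso_dt="$1"
    if [[ "$iso_dt" =~ ([+-][0-9]{2})([0-9]{2})$ ]]; then
        local tz_offset="${BASH_REMATCH[0]}"
        local tz_hh="${BASH_REMATCH[1]}"
        local tz_mm="${BASH_REMATCH[2]}"
        if [ "$tz_hh" = "+00" ] && [ "$tz_mm" = "00" ]; then
            iso_dt="${iso_dt%$tz_offset}Z"
        else
            iso_dt="${iso_dt%$tz_offset}${tz_hh}:${tz_mm}"
        fi
    fi
    printf '%s\n' "$iso_dt"
}
```

Hoisting the formatter out of `convert_markdown` and memoizing it in `BSSG_POST_ISO8601_CACHE` avoids re-spawning `date` for every post that shares a timestamp.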
# Convert markdown to HTML
convert_markdown() {
local input_file="$1"
@@ -30,61 +87,74 @@ convert_markdown() {
local description="${10}"
local author_name="${11}"
local author_email="${12}"
local skip_rebuild_check="${13:-false}"
local content_cache_file="${CACHE_DIR:-.bssg_cache}/content/$(basename "$input_file")"
local output_html_file="$output_base_path/index.html"
local ram_mode_active=false
if [ "${BSSG_RAM_MODE:-false}" = true ] && declare -F ram_mode_has_file > /dev/null && ram_mode_has_file "$input_file"; then
ram_mode_active=true
fi
# Check if the source file exists
if [ ! -f "$input_file" ]; then
if ! $ram_mode_active && [ ! -f "$input_file" ]; then
echo -e "${RED}Error: Source file '$input_file' not found${NC}" >&2
return 1
fi
# Skip if output file is newer than input file and no force rebuild
if ! file_needs_rebuild "$input_file" "$output_html_file"; then
echo -e "Skipping unchanged file: ${YELLOW}$(basename "$input_file")${NC}"
return 0
# Skip if output file is newer than input file and no force rebuild.
# When callers already prefiltered rebuild candidates, this check can be skipped.
if [ "$skip_rebuild_check" != true ]; then
if ! file_needs_rebuild "$input_file" "$output_html_file"; then
echo -e "Skipping unchanged file: ${YELLOW}$(basename "$input_file")${NC}"
return 0
fi
fi
echo -e "Processing post: ${GREEN}$(basename "$input_file")${NC}"
if [ "${BSSG_RAM_MODE:-false}" != true ] || [ "${RAM_MODE_VERBOSE:-false}" = true ]; then
echo -e "Processing post: ${GREEN}$(basename "$input_file")${NC}"
fi
# IMPORTANT: Assumes lock_file/unlock_file are sourced/available
lock_file "$content_cache_file"
# Try to get content from cache or file
# Extract body content (without frontmatter) in one awk pass.
# This is materially faster than line-by-line bash parsing on large markdown files.
local content=""
local in_frontmatter=false
local found_frontmatter=false
{
while IFS= read -r line; do
if [[ "$line" == "---" ]]; then
if ! $in_frontmatter && ! $found_frontmatter; then
in_frontmatter=true
found_frontmatter=true
continue
elif $in_frontmatter; then
in_frontmatter=false
continue # Skip the closing --- line itself
fi
fi
if ! $in_frontmatter && $found_frontmatter; then
content+="$line"$'\n'
fi
done
} < "$input_file"
# If no frontmatter was found, use the whole file as content
if ! $found_frontmatter; then
content=$(cat "$input_file")
local source_stream=""
if $ram_mode_active; then
source_stream=$(ram_mode_get_content "$input_file")
else
source_stream=$(cat "$input_file")
fi
content=$(printf '%s' "$source_stream" | awk '
NR == 1 {
if ($0 == "---") {
has_frontmatter = 1
in_frontmatter = 1
next
}
}
{
if (has_frontmatter) {
if (in_frontmatter) {
if ($0 == "---") {
in_frontmatter = 0
}
next
}
print
} else {
print
}
}
')
# Cache the markdown content *without frontmatter* for potential use in RSS full content
if [ -n "$CACHE_DIR" ] && [ -d "${CACHE_DIR}/content" ]; then
if ! $ram_mode_active && [ -n "$CACHE_DIR" ] && [ -d "${CACHE_DIR}/content" ]; then
# Write the $content variable (which has frontmatter removed) to the cache file
lock_file "$content_cache_file"
printf '%s' "$content" > "$content_cache_file"
unlock_file "$content_cache_file"
fi
unlock_file "$content_cache_file"
# Calculate reading time
local reading_time
@@ -122,7 +192,7 @@ convert_markdown() {
[[ -z "$tag" ]] && continue
local tag_slug=$(echo "$tag" | tr '[:upper:]' '[:lower:]' | sed -e 's/ /-/g' -e 's/[^a-z0-9-]//g')
if [[ -n "$tag_slug" ]]; then # Ensure tag slug is not empty
tags_html+=$(printf ' <a href="%s/tags/%s/" class="tag">%s</a>' "${SITE_URL:-}" "$tag_slug" "$tag")
tags_html+=" <a href=\"${SITE_URL:-}/tags/${tag_slug}/\" class=\"tag\">${tag}</a>"
fi
done
tags_html+="</div>"
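The tag-slug pipeline used just above (lowercase, spaces to hyphens, strip everything outside `[a-z0-9-]`) can be lifted into a helper for illustration; `tag_to_slug` is a hypothetical name, the pipeline itself is verbatim from the diff:

```shell
# Turn a display tag like "Web Dev!" into a URL-safe slug ("web-dev").
tag_to_slug() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]' \
        | sed -e 's/ /-/g' -e 's/[^a-z0-9-]//g'
}
```

The surrounding change only swaps a per-tag `printf` command substitution for direct string interpolation, which removes one subshell per tag link.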
@@ -178,62 +248,16 @@ convert_markdown() {
if [ -n "$date" ]; then
local iso_date iso_lastmod_date
# Function to format date to ISO 8601 with corrected timezone
format_iso8601() {
local input_dt="$1"
local iso_dt=""
if [ -z "$input_dt" ]; then echo ""; return; fi
# Handle "now" separately
if [ "$input_dt" = "now" ]; then
iso_dt=$(LC_ALL=C date +"%Y-%m-%dT%H:%M:%S%z" 2>/dev/null)
else
# Try parsing different formats based on OS
# Add LC_ALL=C for consistent parsing
if [[ "$OSTYPE" == "darwin"* ]] || [[ "$OSTYPE" == *"bsd"* ]]; then
# macOS/BSD: Try formats one by one with date -j -f
# Format 1: YYYY-MM-DD HH:MM:SS ZZZZ (e.g., +0200)
iso_dt=$(LC_ALL=C date -j -f "%Y-%m-%d %H:%M:%S %z" "$input_dt" +"%Y-%m-%dT%H:%M:%S%z" 2>/dev/null)
# Format 2: YYYY-MM-DD HH:MM:SS
[ -z "$iso_dt" ] && iso_dt=$(LC_ALL=C date -j -f "%Y-%m-%d %H:%M:%S" "$input_dt" +"%Y-%m-%dT%H:%M:%S%z" 2>/dev/null)
# Format 3: YYYY-MM-DD (assume T00:00:00)
[ -z "$iso_dt" ] && iso_dt=$(LC_ALL=C date -j -f "%Y-%m-%d" "$input_dt" +"%Y-%m-%dT00:00:00%z" 2>/dev/null)
# Format 4: RFC 2822 subset (e.g., 07 Sep 2023 08:10:00 +0200)
[ -z "$iso_dt" ] && iso_dt=$(LC_ALL=C date -j -f "%d %b %Y %H:%M:%S %z" "$input_dt" +"%Y-%m-%dT%H:%M:%S%z" 2>/dev/null)
else # Linux
# GNU date -d is more flexible and handles many formats automatically
iso_dt=$(LC_ALL=C date -d "$input_dt" +"%Y-%m-%dT%H:%M:%S%z" 2>/dev/null)
fi
fi
# If parsing succeeded, fix timezone format
if [ -n "$iso_dt" ]; then
# Fix timezone format from +0000 to +00:00 or Z
if [[ "$iso_dt" =~ ([+-][0-9]{2})([0-9]{2})$ ]]; then
local tz_offset="${BASH_REMATCH[0]}"
local tz_hh="${BASH_REMATCH[1]}"
local tz_mm="${BASH_REMATCH[2]}"
if [ "$tz_hh" == "+00" ] && [ "$tz_mm" == "00" ]; then
iso_dt="${iso_dt%$tz_offset}Z"
else
iso_dt="${iso_dt%$tz_offset}${tz_hh}:${tz_mm}"
fi
fi
echo "$iso_dt"
else
echo "" # Return empty if formatting failed
fi
}
iso_date=$(format_iso8601 "$date")
iso_date=$(format_iso8601_post_date "$date")
# Use date as fallback for lastmod, then format
iso_lastmod_date=$(format_iso8601 "${lastmod:-$date}")
iso_lastmod_date=$(format_iso8601_post_date "${lastmod:-$date}")
# If lastmod still empty, use iso_date as fallback
[ -z "$iso_lastmod_date" ] && iso_lastmod_date="$iso_date"
# Fallback to build time if both are empty (should be rare)
if [ -z "$iso_date" ]; then
local now_iso=$(format_iso8601 "now")
local now_iso
now_iso=$(format_iso8601_post_date "now")
iso_date="$now_iso"
iso_lastmod_date="$now_iso"
fi
@@ -319,10 +343,24 @@ convert_markdown() {
# Generate related posts if enabled and tags exist
local related_posts_html=""
if [ "${ENABLE_RELATED_POSTS:-true}" = true ] && [ -n "$tags" ]; then
echo -e "${BLUE}DEBUG: Generating related posts for $slug with tags: $tags${NC}"
related_posts_html=$(generate_related_posts "$slug" "$tags" "$date" "${RELATED_POSTS_COUNT:-3}")
# RAM fast path: direct map lookup avoids per-post command-substitution/function overhead.
if [ "${BSSG_RAM_MODE:-false}" = true ] && \
[ "${BSSG_RAM_RELATED_POSTS_READY:-false}" = true ] && \
[ "${BSSG_RAM_RELATED_POSTS_LIMIT:-}" = "${RELATED_POSTS_COUNT:-3}" ]; then
related_posts_html="${BSSG_RAM_RELATED_POSTS_HTML[$slug]-}"
if [ "${RAM_MODE_VERBOSE:-false}" = true ]; then
echo -e "${BLUE}DEBUG: Generating related posts for $slug with tags: $tags${NC}"
fi
else
if [ "${BSSG_RAM_MODE:-false}" != true ] || [ "${RAM_MODE_VERBOSE:-false}" = true ]; then
echo -e "${BLUE}DEBUG: Generating related posts for $slug with tags: $tags${NC}"
fi
related_posts_html=$(generate_related_posts "$slug" "$tags" "$date" "${RELATED_POSTS_COUNT:-3}")
fi
else
echo -e "${BLUE}DEBUG: Skipping related posts for $slug - ENABLE_RELATED_POSTS=${ENABLE_RELATED_POSTS:-true}, tags=$tags${NC}"
if [ "${BSSG_RAM_MODE:-false}" != true ] || [ "${RAM_MODE_VERBOSE:-false}" = true ]; then
echo -e "${BLUE}DEBUG: Skipping related posts for $slug - ENABLE_RELATED_POSTS=${ENABLE_RELATED_POSTS:-true}, tags=$tags${NC}"
fi
fi
# Construct article body
@@ -368,13 +406,24 @@ process_all_markdown_files() {
local modified_tags_list="${CACHE_DIR:-.bssg_cache}/modified_tags.list" # Define path for modified tags
local modified_authors_list="${CACHE_DIR:-.bssg_cache}/modified_authors.list" # Define path for modified authors
local file_index_prev="${CACHE_DIR:-.bssg_cache}/file_index_prev.txt" # Path to previous index
local ram_mode_active=false
local file_index_data=""
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
ram_mode_active=true
file_index_data=$(ram_mode_get_dataset "file_index")
fi
if [ ! -f "$file_index" ]; then
if ! $ram_mode_active && [ ! -f "$file_index" ]; then
echo -e "${RED}Error: File index not found at '$file_index'. Run indexing first.${NC}" >&2
return 1
fi
local total_file_count=$(wc -l < "$file_index")
local total_file_count=0
if $ram_mode_active; then
total_file_count=$(printf '%s\n' "$file_index_data" | awk 'NF { c++ } END { print c+0 }')
else
total_file_count=$(wc -l < "$file_index")
fi
if [ "$total_file_count" -eq 0 ]; then
echo -e "${YELLOW}No posts found in file index. Skipping post generation.${NC}"
return 0
@@ -386,7 +435,7 @@ process_all_markdown_files() {
local posts_needing_rebuild=0
# Only do expensive Pass 1 if related posts are enabled AND posts might need rebuilding
if [ "${ENABLE_RELATED_POSTS:-true}" = true ]; then
if [ "${ENABLE_RELATED_POSTS:-true}" = true ] && ! $ram_mode_active; then
echo -e "${BLUE}DEBUG: Related posts enabled, starting quick scan...${NC}"
# Quick scan to see if ANY posts need rebuilding before doing expensive Pass 1
echo -e "${YELLOW}Quick scan: Checking if any posts need rebuilding...${NC}"
@@ -435,13 +484,19 @@ process_all_markdown_files() {
# Early exit optimization: if we find posts needing rebuild, we need Pass 1
break
fi
done < "$file_index"
done < <(
if $ram_mode_active; then
printf '%s\n' "$file_index_data" | awk 'NF'
else
cat "$file_index"
fi
)
echo -e "Quick scan result: ${GREEN}$posts_needing_rebuild${NC} posts need rebuilding"
fi
# --- PASS 1: Only run if needed (posts need rebuilding AND related posts enabled) ---
if [ "$needs_pass1" = true ] && [ "${ENABLE_RELATED_POSTS:-true}" = true ]; then
if [ "$needs_pass1" = true ] && [ "${ENABLE_RELATED_POSTS:-true}" = true ] && ! $ram_mode_active; then
echo -e "${BLUE}DEBUG: Both needs_pass1=true and ENABLE_RELATED_POSTS=true, running Pass 1...${NC}"
echo -e "${YELLOW}Pass 1: Identifying modified tags for related posts cache invalidation...${NC}"
@@ -554,6 +609,8 @@ process_all_markdown_files() {
# Export the list for use in pass 2
export RELATED_POSTS_INVALIDATED_LIST
fi
elif $ram_mode_active; then
echo -e "${BLUE}DEBUG: RAM mode active, skipping Pass 1 related-posts invalidation (in-memory computation).${NC}"
else
echo -e "${BLUE}DEBUG: Pass 1 skipped - needs_pass1=$needs_pass1, ENABLE_RELATED_POSTS=${ENABLE_RELATED_POSTS:-true}${NC}"
fi
@@ -566,63 +623,81 @@ process_all_markdown_files() {
local files_to_process_count=0
local skipped_count=0
while IFS= read -r line; do
local file filename title date lastmod tags slug image image_caption description author_name author_email
IFS='|' read -r file filename title date lastmod tags slug image image_caption description author_name author_email <<< "$line"
# Basic check if it looks like a post
if [ -z "$date" ] || [[ "$file" != "$SRC_DIR"* ]]; then
# echo -e "Skipping non-post file listed in index (pre-check): ${YELLOW}$file${NC}" >&2 # Too verbose
continue
fi
# Calculate expected output path (logic copied from process_single_file)
local output_path
local year month day
if [[ "$date" =~ ^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2}) ]]; then
year="${BASH_REMATCH[1]}"
month=$(printf "%02d" "$((10#${BASH_REMATCH[2]}))")
day=$(printf "%02d" "$((10#${BASH_REMATCH[3]}))")
else
year=$(date +%Y); month=$(date +%m); day=$(date +%d)
fi
local url_path="${URL_SLUG_FORMAT:-Year/Month/Day/slug}"
url_path="${url_path//Year/$year}"; url_path="${url_path//Month/$month}";
url_path="${url_path//Day/$day}"; url_path="${url_path//slug/$slug}"
local output_html_file="${OUTPUT_DIR:-output}/$url_path/index.html"
# Perform the rebuild check here
common_rebuild_check "$output_html_file"
local common_result=$?
local needs_rebuild=false
if [ $common_result -eq 0 ]; then
needs_rebuild=true # Common checks failed (config changed, template newer, output missing)
else # common_result is 2 (output exists and newer than templates/locale)
local input_time=$(get_file_mtime "$file")
local output_time=$(get_file_mtime "$output_html_file")
if (( input_time > output_time )); then
needs_rebuild=true # Input file is newer
if $ram_mode_active && [ "${FORCE_REBUILD:-false}" = true ]; then
echo -e "RAM mode force rebuild: skipping per-post rebuild checks."
while IFS= read -r line; do
local file filename title date
IFS='|' read -r file filename _ date _ <<< "$line"
if [ -n "$date" ] && [[ "$file" == "$SRC_DIR"* ]]; then
files_to_process_list+=("$line")
files_to_process_count=$((files_to_process_count + 1))
fi
fi
done < <(printf '%s\n' "$file_index_data" | awk 'NF')
else
while IFS= read -r line; do
local file filename title date lastmod tags slug image image_caption description author_name author_email
IFS='|' read -r file filename title date lastmod tags slug image image_caption description author_name author_email <<< "$line"
# Check if this post needs rebuilding due to related posts cache invalidation
if [ "$needs_rebuild" = false ] && [ -n "${RELATED_POSTS_INVALIDATED_LIST:-}" ] && [ -f "$RELATED_POSTS_INVALIDATED_LIST" ]; then
if grep -Fxq "$slug" "$RELATED_POSTS_INVALIDATED_LIST" 2>/dev/null; then
needs_rebuild=true # Related posts cache was invalidated
echo -e "Rebuilding ${GREEN}$(basename "$file")${NC} due to related posts cache invalidation"
# Basic check if it looks like a post
if [ -z "$date" ] || [[ "$file" != "$SRC_DIR"* ]]; then
# echo -e "Skipping non-post file listed in index (pre-check): ${YELLOW}$file${NC}" >&2 # Too verbose
continue
fi
fi
if $needs_rebuild; then
files_to_process_list+=("$line")
files_to_process_count=$((files_to_process_count + 1))
else
# Only print skip message if not rebuilding
echo -e "Skipping unchanged file: ${YELLOW}$(basename "$file")${NC}"
skipped_count=$((skipped_count + 1))
fi
done < "$file_index"
# Calculate expected output path (logic copied from process_single_file)
local output_path
local year month day
if [[ "$date" =~ ^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2}) ]]; then
year="${BASH_REMATCH[1]}"
month=$(printf "%02d" "$((10#${BASH_REMATCH[2]}))")
day=$(printf "%02d" "$((10#${BASH_REMATCH[3]}))")
else
year=$(date +%Y); month=$(date +%m); day=$(date +%d)
fi
local url_path="${URL_SLUG_FORMAT:-Year/Month/Day/slug}"
url_path="${url_path//Year/$year}"; url_path="${url_path//Month/$month}";
url_path="${url_path//Day/$day}"; url_path="${url_path//slug/$slug}"
local output_html_file="${OUTPUT_DIR:-output}/$url_path/index.html"
# Perform the rebuild check here
common_rebuild_check "$output_html_file"
local common_result=$?
local needs_rebuild=false
if [ $common_result -eq 0 ]; then
needs_rebuild=true # Common checks failed (config changed, template newer, output missing)
else # common_result is 2 (output exists and newer than templates/locale)
local input_time=$(get_file_mtime "$file")
local output_time=$(get_file_mtime "$output_html_file")
if (( input_time > output_time )); then
needs_rebuild=true # Input file is newer
fi
fi
# Check if this post needs rebuilding due to related posts cache invalidation
if ! $ram_mode_active && [ "$needs_rebuild" = false ] && [ -n "${RELATED_POSTS_INVALIDATED_LIST:-}" ] && [ -f "$RELATED_POSTS_INVALIDATED_LIST" ]; then
if grep -Fxq "$slug" "$RELATED_POSTS_INVALIDATED_LIST" 2>/dev/null; then
needs_rebuild=true # Related posts cache was invalidated
echo -e "Rebuilding ${GREEN}$(basename "$file")${NC} due to related posts cache invalidation"
fi
fi
if $needs_rebuild; then
files_to_process_list+=("$line")
files_to_process_count=$((files_to_process_count + 1))
else
# Only print skip message if not rebuilding
echo -e "Skipping unchanged file: ${YELLOW}$(basename "$file")${NC}"
skipped_count=$((skipped_count + 1))
fi
done < <(
if $ram_mode_active; then
printf '%s\n' "$file_index_data" | awk 'NF'
else
cat "$file_index"
fi
)
fi
# Check if any files need processing
if [ $files_to_process_count -eq 0 ]; then
@@ -633,6 +708,10 @@ process_all_markdown_files() {
echo -e "Found ${GREEN}$files_to_process_count${NC} posts needing processing out of $total_file_count (Skipped: $skipped_count)."
if $ram_mode_active && [ "${ENABLE_RELATED_POSTS:-true}" = true ]; then
prepare_related_posts_ram_cache "${RELATED_POSTS_COUNT:-3}"
fi
# Define a function for processing a single file line from the *filtered* list
process_single_file_for_rebuild() {
local line="$1"
@@ -658,21 +737,59 @@ process_all_markdown_files() {
url_path="${url_path//Day/$day}"; url_path="${url_path//slug/$slug}"
output_path="${OUTPUT_DIR:-output}/$url_path"
# Call the main conversion function
# We no longer rely on its internal file_needs_rebuild check
# TODO: Consider modifying convert_markdown to accept a force flag or skip its check
if ! convert_markdown "$file" "$output_path" "$title" "$date" "$lastmod" "$tags" "$slug" "$image" "$image_caption" "$description" "$author_name" "$author_email"; then
# Call the conversion function, skipping internal rebuild checks because this
# function only receives files pre-selected for rebuild.
if ! convert_markdown "$file" "$output_path" "$title" "$date" "$lastmod" "$tags" "$slug" "$image" "$image_caption" "$description" "$author_name" "$author_email" true; then
local exit_code=$?
echo -e "${RED}ERROR:${NC} convert_markdown failed for '$file' with exit code $exit_code. Output HTML may be missing or incomplete." >&2
fi
}
# Use GNU parallel if available
if [ "${HAS_PARALLEL:-false}" = true ]; then
if $ram_mode_active; then
local cores
cores=$(get_parallel_jobs)
if [ "$cores" -gt "$files_to_process_count" ]; then
cores="$files_to_process_count"
fi
if [ "$files_to_process_count" -gt 1 ] && [ "$cores" -gt 1 ]; then
echo -e "${YELLOW}Using shell parallel workers for $files_to_process_count RAM-mode posts${NC}"
local worker_pids=()
local worker_idx
for ((worker_idx = 0; worker_idx < cores; worker_idx++)); do
(
local idx
for ((idx = worker_idx; idx < files_to_process_count; idx += cores)); do
process_single_file_for_rebuild "${files_to_process_list[$idx]}"
done
) &
worker_pids+=("$!")
done
local pid
local worker_failed=false
for pid in "${worker_pids[@]}"; do
if ! wait "$pid"; then
worker_failed=true
fi
done
if $worker_failed; then
echo -e "${RED}Parallel RAM-mode post processing failed.${NC}"
exit 1
fi
else
echo -e "${YELLOW}Using sequential processing for $files_to_process_count RAM-mode posts${NC}"
local line
for line in "${files_to_process_list[@]}"; do
process_single_file_for_rebuild "$line"
done
fi
elif [ "${HAS_PARALLEL:-false}" = true ]; then
echo -e "${GREEN}Using GNU parallel to process $files_to_process_count posts${NC}"
local cores=1
if command -v nproc > /dev/null 2>&1; then cores=$(nproc);
elif command -v sysctl > /dev/null 2>&1; then cores=$(sysctl -n hw.ncpu 2>/dev/null || echo 1); fi
local cores
cores=$(get_parallel_jobs)
# Export functions and variables needed by parallel tasks
# Note: We export the new process function
@@ -680,6 +797,7 @@ process_all_markdown_files() {
# Export dependencies of convert_markdown and its helpers
export -f file_needs_rebuild get_file_mtime common_rebuild_check config_has_changed # Still needed by convert_markdown *internally* for now
export -f calculate_reading_time generate_slug format_date fix_url parse_metadata extract_metadata convert_markdown_to_html
export -f format_iso8601_post_date
export -f portable_md5sum # Used by cache funcs
export CACHE_DIR FORCE_REBUILD OUTPUT_DIR SITE_URL URL_SLUG_FORMAT HEADER_TEMPLATE FOOTER_TEMPLATE
export SITE_TITLE SITE_DESCRIPTION AUTHOR_NAME MARKDOWN_PROCESSOR MARKDOWN_PL_PATH DATE_FORMAT TIMEZONE SHOW_TIMEZONE
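The RAM-mode branch above distributes posts to workers by stride: worker `i` handles items `i, i+cores, i+2*cores, …`, so the list is partitioned without a shared queue or any inter-process coordination. A self-contained sketch of that pattern (`run_striped` and the `printf` body are illustrative stand-ins for `process_single_file_for_rebuild`):

```shell
# Fan N items out to $cores background subshells using index striding,
# then wait on each worker and report whether any of them failed.
run_striped() {
    local cores="$1"; shift
    local items=("$@")
    local count=${#items[@]}
    local pids=() w
    for ((w = 0; w < cores; w++)); do
        (
            local i
            for ((i = w; i < count; i += cores)); do
                printf 'processed %s\n' "${items[$i]}"
            done
        ) &
        pids+=("$!")
    done
    local pid failed=false
    for pid in "${pids[@]}"; do
        wait "$pid" || failed=true
    done
    $failed && return 1
    return 0
}
```

This is why RAM mode cannot use GNU parallel here: the subshells forked with `( … ) &` inherit the in-memory bash arrays, while the fresh shells `parallel` spawns would not.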


@@ -14,6 +14,10 @@ generate_pages_index() {
# --- Define Target File ---
local pages_index="$OUTPUT_DIR/pages.html"
local secondary_pages_list_file="${CACHE_DIR:-.bssg_cache}/secondary_pages.list"
local ram_mode_active=false
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
ram_mode_active=true
fi
# --- Cache Check --- START ---
# Rebuild if force flag is set OR if list file exists and output is older than list file
@@ -22,13 +26,13 @@ generate_pages_index() {
if [[ "${FORCE_REBUILD:-false}" == true ]]; then
should_rebuild=true
echo -e "${YELLOW}Forcing pages index rebuild (--force-rebuild).${NC}"
elif [ ! -f "$secondary_pages_list_file" ]; then
elif ! $ram_mode_active && [ ! -f "$secondary_pages_list_file" ]; then
# If list file doesn't exist, we need to generate pages.html (or handle absence)
# This case might mean 0 secondary pages after a clean build.
# Let the existing logic handle the case of 0 pages later.
should_rebuild=true
echo -e "${YELLOW}Secondary pages list file not found, rebuilding pages index.${NC}"
elif [ ! -f "$pages_index" ] || [ "$pages_index" -ot "$secondary_pages_list_file" ]; then
elif ! $ram_mode_active && { [ ! -f "$pages_index" ] || [ "$pages_index" -ot "$secondary_pages_list_file" ]; }; then
should_rebuild=true
echo -e "${YELLOW}Pages index is older than secondary pages list, rebuilding.${NC}"
# Add checks for template file changes? More complex, rely on overall rebuild for now.
@@ -47,7 +51,9 @@ generate_pages_index() {
# --- Read secondary pages from cache file --- START ---
local temp_secondary_pages=()
if [ -f "$secondary_pages_list_file" ]; then
if $ram_mode_active; then
mapfile -t temp_secondary_pages < <(printf '%s\n' "$(ram_mode_get_dataset "secondary_pages")" | awk 'NF')
elif [ -f "$secondary_pages_list_file" ]; then
# Use mapfile (readarray) to read lines into the array
mapfile -t temp_secondary_pages < "$secondary_pages_list_file"
# Optional: Trim whitespace from each element if necessary (mapfile usually handles newlines)
@@ -86,10 +92,8 @@ generate_pages_index() {
# Generate CollectionPage schema
local schema_json_ld=""
local tmp_schema=$(mktemp)
# Create CollectionPage schema
cat > "$tmp_schema" << EOF
schema_json_ld=$(cat << EOF
<script type="application/ld+json">
{
"@context": "https://schema.org",
@@ -105,12 +109,7 @@ generate_pages_index() {
}
</script>
EOF
# Read the schema from the temporary file
schema_json_ld=$(cat "$tmp_schema")
# Remove the temporary file
rm "$tmp_schema"
)
# Add schema markup to header
header_content=${header_content//\{\{schema_json_ld\}\}/"$schema_json_ld"}
@@ -150,4 +149,4 @@ EOF
}
# Make function available for sourcing
export -f generate_pages_index
export -f generate_pages_index


@@ -13,8 +13,602 @@ source "$(dirname "$0")/cache.sh" || { echo >&2 "Error: Failed to source cache.s
# shellcheck source=generate_feeds.sh disable=SC1091
source "$(dirname "$0")/generate_feeds.sh" || { echo >&2 "Error: Failed to source generate_feeds.sh from generate_tags.sh"; exit 1; }
declare -gA BSSG_RAM_TAG_POST_SLUGS_BY_SLUG=()
declare -gA BSSG_RAM_TAG_POST_COUNT_BY_SLUG=()
declare -gA BSSG_RAM_TAG_ARTICLE_HTML_BY_SLUG=()
declare -gA BSSG_RAM_RSS_TEMPLATE_BY_SLUG=()
declare -g BSSG_RAM_TAG_DISPLAY_DATE_FORMAT=""
declare -g BSSG_RAM_TAG_HEADER_BASE=""
declare -g BSSG_RAM_TAG_FOOTER_CONTENT=""
_bssg_tags_now_ms() {
if declare -F _bssg_ram_timing_now_ms > /dev/null; then
_bssg_ram_timing_now_ms
return
fi
if [ -n "${EPOCHREALTIME:-}" ]; then
local epoch_norm sec frac ms_part
# Some locales expose EPOCHREALTIME with ',' instead of '.' as decimal separator.
epoch_norm="${EPOCHREALTIME/,/.}"
if [[ "$epoch_norm" =~ ^([0-9]+)([.][0-9]+)?$ ]]; then
sec="${BASH_REMATCH[1]}"
frac="${BASH_REMATCH[2]#.}"
frac="${frac}000"
ms_part="${frac:0:3}"
printf '%s\n' $(( 10#$sec * 1000 + 10#$ms_part ))
return
fi
fi
if command -v perl >/dev/null 2>&1; then
perl -MTime::HiRes=time -e 'printf("%.0f\n", time()*1000)'
else
printf '%s\n' $(( $(date +%s) * 1000 ))
fi
}
_bssg_tags_format_ms() {
local ms="${1:-0}"
printf '%d.%03ds' $((ms / 1000)) $((ms % 1000))
}
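The locale hardening in `_bssg_tags_now_ms` can be demonstrated standalone. In some locales bash prints `EPOCHREALTIME` with a `,` decimal separator, which would break naive string slicing; the function below (`now_ms` is an illustrative name) mirrors the normalization and millisecond arithmetic from the diff, minus the `_bssg_ram_timing_now_ms` and perl fallbacks:

```shell
# Return the current wall-clock time in milliseconds, tolerating
# EPOCHREALTIME values like "1733.456789" or "1733,456789".
now_ms() {
    if [ -n "${EPOCHREALTIME:-}" ]; then
        local epoch_norm sec frac ms_part
        epoch_norm="${EPOCHREALTIME/,/.}"
        if [[ "$epoch_norm" =~ ^([0-9]+)([.][0-9]+)?$ ]]; then
            sec="${BASH_REMATCH[1]}"
            frac="${BASH_REMATCH[2]#.}"
            frac="${frac}000"            # pad so 3 digits always exist
            ms_part="${frac:0:3}"
            printf '%s\n' $(( 10#$sec * 1000 + 10#$ms_part ))
            return
        fi
    fi
    # Whole-second fallback when EPOCHREALTIME is unavailable.
    printf '%s\n' $(( $(date +%s) * 1000 ))
}
```

The `10#` prefixes force base-10 arithmetic so a fraction like `089` is not misread as an invalid octal literal.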
_write_tag_rss_from_cached_items_ram() {
local output_file="$1"
local feed_link_rel="$2"
local feed_atom_link_rel="$3"
local tag="$4"
local rss_items_xml="$5"
local feed_title="${SITE_TITLE} - ${MSG_TAG_PAGE_TITLE:-"Posts tagged with"}: $tag"
local feed_description="${MSG_POSTS_TAGGED_WITH:-"Posts tagged with"}: $tag"
local rss_date_fmt="%a, %d %b %Y %H:%M:%S %z"
local escaped_feed_title escaped_feed_description feed_link feed_atom_link channel_last_build_date
escaped_feed_title=$(html_escape "$feed_title")
escaped_feed_description=$(html_escape "$feed_description")
feed_link=$(fix_url "$feed_link_rel")
feed_atom_link=$(fix_url "$feed_atom_link_rel")
channel_last_build_date=$(format_date "now" "$rss_date_fmt")
exec 4> "$output_file" || return 1
printf '%s\n' \
'<?xml version="1.0" encoding="UTF-8" ?>' \
'<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">' \
'<channel>' \
" <title>${escaped_feed_title}</title>" \
" <link>${feed_link}</link>" \
" <description>${escaped_feed_description}</description>" \
" <language>${SITE_LANG:-en}</language>" \
" <lastBuildDate>${channel_last_build_date}</lastBuildDate>" \
" <atom:link href=\"${feed_atom_link}\" rel=\"self\" type=\"application/rss+xml\" />" >&4
if [ -n "$rss_items_xml" ]; then
printf '%s' "$rss_items_xml" >&4
fi
printf '%s\n' '</channel>' '</rss>' >&4
exec 4>&-
if [ "${BSSG_RAM_MODE:-false}" != true ] || [ "${RAM_MODE_VERBOSE:-false}" = true ]; then
echo -e "${GREEN}RSS feed generated at $output_file${NC}"
fi
}
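The feed writer above opens file descriptor 4 once with `exec 4>`, streams every channel line through it, and closes it, instead of paying a `>>` open/close per line. A minimal sketch of that pattern under an illustrative name (`write_lines_fd`):

```shell
# Write all argument lines to one file through a single open fd.
write_lines_fd() {
    local out="$1"; shift
    exec 4> "$out" || return 1   # open once
    printf '%s\n' "$@" >&4       # one write call for all lines
    exec 4>&-                    # close the descriptor
}
```

For a feed with dozens of `<item>` blocks this trims a measurable amount of syscall and shell overhead, which matters when tag RSS generation runs once per tag.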
_process_single_tag_page_ram() {
local tag_url="$1"
local tag="$2"
local tag_page_html_file="$OUTPUT_DIR/tags/$tag_url/index.html"
local tag_rss_file="$OUTPUT_DIR/tags/$tag_url/${RSS_FILENAME:-rss.xml}"
local tag_page_rel_url="/tags/${tag_url}/"
local tag_rss_rel_url="/tags/${tag_url}/${RSS_FILENAME:-rss.xml}"
mkdir -p "$(dirname "$tag_page_html_file")"
local header_content="$BSSG_RAM_TAG_HEADER_BASE"
header_content=${header_content//\{\{page_title\}\}/"${MSG_TAG_PAGE_TITLE:-"Posts tagged with"}: $tag"}
header_content=${header_content//\{\{page_url\}\}/"$tag_page_rel_url"}
if [ "${ENABLE_TAG_RSS:-false}" = true ]; then
header_content=${header_content//<!-- bssg:tag_rss_link -->/<link rel="alternate" type="application/rss+xml" title="${SITE_TITLE} - Posts tagged with ${tag}" href="${SITE_URL}${tag_rss_rel_url}">}
else
header_content=${header_content//<!-- bssg:tag_rss_link -->/}
fi
local schema_json_ld
schema_json_ld=$(cat <<EOF
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "CollectionPage",
"name": "Posts tagged with: $tag",
"description": "Posts with tag: $tag",
"url": "$SITE_URL${tag_page_rel_url}",
"isPartOf": {
"@type": "WebSite",
"name": "$SITE_TITLE",
"url": "$SITE_URL"
}
}
</script>
EOF
)
header_content=${header_content//\{\{schema_json_ld\}\}/"$schema_json_ld"}
local footer_content="$BSSG_RAM_TAG_FOOTER_CONTENT"
exec 3> "$tag_page_html_file"
printf '%s\n' "$header_content" >&3
printf '<h1>%s: %s</h1>\n' "${MSG_TAG_PAGE_TITLE:-Posts tagged with}" "$tag" >&3
printf '<div class="posts-list">\n' >&3
local rss_item_limit=${RSS_ITEM_LIMIT:-15}
local rss_count=0
local cached_rss_items=""
local rss_all_items_cached=true
local -a selected_rss_templates=()
local tag_post_slugs=""
if [[ -n "${BSSG_RAM_TAG_POST_SLUGS_BY_SLUG[$tag_url]+_}" ]]; then
tag_post_slugs="${BSSG_RAM_TAG_POST_SLUGS_BY_SLUG[$tag_url]}"
fi
local slug cached_article_html rss_template
while IFS= read -r slug; do
[ -z "$slug" ] && continue
cached_article_html="${BSSG_RAM_TAG_ARTICLE_HTML_BY_SLUG[$slug]}"
if [ -n "$cached_article_html" ]; then
printf '%s' "$cached_article_html" >&3
fi
if [ "${ENABLE_TAG_RSS:-false}" = true ] && [ "$rss_count" -lt "$rss_item_limit" ]; then
rss_template="${BSSG_RAM_RSS_TEMPLATE_BY_SLUG[$slug]}"
if [ -n "$rss_template" ]; then
selected_rss_templates+=("$rss_template")
if $rss_all_items_cached; then
local rss_file rss_filename rss_title rss_date rss_lastmod rss_tags rss_slug rss_image rss_image_caption rss_description rss_author_name rss_author_email
IFS='|' read -r rss_file rss_filename rss_title rss_date rss_lastmod rss_tags rss_slug rss_image rss_image_caption rss_description rss_author_name rss_author_email <<< "$rss_template"
local rss_item_cache_key="${RSS_INCLUDE_FULL_CONTENT:-false}|${rss_file}|${rss_date}|${rss_lastmod}|${rss_slug}|${rss_title}"
local rss_item_xml="${BSSG_RAM_RSS_ITEM_XML_CACHE[$rss_item_cache_key]-}"
if [ -n "$rss_item_xml" ]; then
cached_rss_items+="$rss_item_xml"
else
rss_all_items_cached=false
fi
fi
rss_count=$((rss_count + 1))
fi
fi
done <<< "$tag_post_slugs"
printf '</div>\n' >&3
printf '<p><a href="%s/tags/">%s</a></p>\n' "$SITE_URL" "${MSG_ALL_TAGS:-All Tags}" >&3
printf '%s\n' "$footer_content" >&3
exec 3>&-
if [ "${ENABLE_TAG_RSS:-false}" = true ] && [ "${#selected_rss_templates[@]}" -gt 0 ]; then
if $rss_all_items_cached; then
_write_tag_rss_from_cached_items_ram "$tag_rss_file" "$tag_page_rel_url" "$tag_rss_rel_url" "$tag" "$cached_rss_items"
else
local tag_post_data=""
local rss_template_entry
for rss_template_entry in "${selected_rss_templates[@]}"; do
tag_post_data+="${rss_template_entry//%TAG%/$tag}"$'\n'
done
_generate_rss_feed "$tag_rss_file" "${SITE_TITLE} - ${MSG_TAG_PAGE_TITLE:-"Posts tagged with"}: $tag" "${MSG_POSTS_TAGGED_WITH:-"Posts tagged with"}: $tag" "$tag_page_rel_url" "$tag_rss_rel_url" "$tag_post_data"
fi
fi
}
_generate_tag_pages_ram() {
echo -e "${YELLOW}Processing tag pages${ENABLE_TAG_RSS:+ and RSS feeds}...${NC}"
local ram_tags_timing_enabled=false
if [ "${RAM_MODE_VERBOSE:-false}" = true ]; then
ram_tags_timing_enabled=true
fi
local tags_total_start_ms=0
local tags_phase_start_ms=0
local tags_prep_ms=0
local tags_render_ms=0
local tags_index_ms=0
local tags_total_ms=0
if [ "$ram_tags_timing_enabled" = true ]; then
tags_total_start_ms="$(_bssg_tags_now_ms)"
tags_phase_start_ms="$tags_total_start_ms"
fi
local tags_index_data
tags_index_data=$(ram_mode_get_dataset "tags_index")
local main_tags_index_output="$OUTPUT_DIR/tags/index.html"
mkdir -p "$OUTPUT_DIR/tags"
if [ -z "$tags_index_data" ]; then
echo -e "${YELLOW}No tags found in RAM index. Skipping tag page generation.${NC}"
return 0
fi
BSSG_RAM_TAG_POST_SLUGS_BY_SLUG=()
BSSG_RAM_TAG_POST_COUNT_BY_SLUG=()
BSSG_RAM_TAG_ARTICLE_HTML_BY_SLUG=()
BSSG_RAM_RSS_TEMPLATE_BY_SLUG=()
declare -A tag_name_by_slug=()
local sorted_tag_urls=()
declare -A rss_prefill_slug_set=()
declare -A rss_prefill_slug_hits=()
local rss_prefill_slugs=()
local rss_prefill_occurrences=0
local rss_item_limit="${RSS_ITEM_LIMIT:-15}"
local rss_prefill_min_hits="${RAM_RSS_PREFILL_MIN_HITS:-2}"
local rss_prefill_max_posts="${RAM_RSS_PREFILL_MAX_POSTS:-24}"
if ! [[ "$rss_prefill_min_hits" =~ ^[0-9]+$ ]] || [ "$rss_prefill_min_hits" -lt 1 ]; then
rss_prefill_min_hits=1
fi
if ! [[ "$rss_prefill_max_posts" =~ ^[0-9]+$ ]]; then
rss_prefill_max_posts=24
fi
declare -A seen_post_slugs=()
local display_date_format="$DATE_FORMAT"
if [ "${SHOW_TIMEZONE:-false}" = false ]; then
display_date_format=$(echo "$display_date_format" | sed -e 's/%[zZ]//g' -e 's/[[:space:]]*$//')
fi
BSSG_RAM_TAG_DISPLAY_DATE_FORMAT="$display_date_format"
# Prime per-post caches once from file_index (one row per post), then build
# lightweight tag->post mappings from tags_index (many rows per post).
local file_index_data
file_index_data=$(ram_mode_get_dataset "file_index")
local can_prime_rss_metadata=false
local rss_date_fmt="%a, %d %b %Y %H:%M:%S %z"
local build_timestamp_iso=""
if [ "${ENABLE_TAG_RSS:-false}" = true ] && declare -F _ram_prime_rss_metadata_entry > /dev/null; then
can_prime_rss_metadata=true
build_timestamp_iso=$(format_date "now" "%Y-%m-%dT%H:%M:%S%z")
if [[ "$build_timestamp_iso" =~ ([+-][0-9]{2})([0-9]{2})$ ]]; then
build_timestamp_iso="${build_timestamp_iso::${#build_timestamp_iso}-2}:${BASH_REMATCH[2]}"
fi
fi
local file filename title date lastmod tags slug image image_caption description author_name author_email
while IFS='|' read -r file filename title date lastmod tags slug image image_caption description author_name author_email; do
[ -z "$file" ] && continue
[ -z "$slug" ] && continue
[[ -n "${seen_post_slugs[$slug]+_}" ]] && continue
seen_post_slugs["$slug"]=1
local post_year post_month post_day
if [[ "$date" =~ ^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2}) ]]; then
post_year="${BASH_REMATCH[1]}"
post_month=$(printf "%02d" "$((10#${BASH_REMATCH[2]}))")
post_day=$(printf "%02d" "$((10#${BASH_REMATCH[3]}))")
else
post_year=$(date +%Y); post_month=$(date +%m); post_day=$(date +%d)
fi
local formatted_path="${URL_SLUG_FORMAT//Year/$post_year}"
formatted_path="${formatted_path//Month/$post_month}"
formatted_path="${formatted_path//Day/$post_day}"
formatted_path="${formatted_path//slug/$slug}"
local post_link="/${formatted_path}/"
local formatted_date
formatted_date=$(format_date "$date" "$display_date_format")
local display_author_name="${author_name:-${AUTHOR_NAME:-Anonymous}}"
local article_html=""
article_html+=' <article>'$'\n'
article_html+=" <h3><a href=\"${SITE_URL}${post_link}\">${title}</a></h3>"$'\n'
article_html+=" <div class=\"meta\">${MSG_PUBLISHED_ON:-Published on} ${formatted_date} ${MSG_BY:-by} <strong>${display_author_name}</strong></div>"$'\n'
if [ -n "$image" ]; then
local image_url alt_text figcaption_content
image_url=$(fix_url "$image")
alt_text="${image_caption:-$title}"
figcaption_content="${image_caption:-$title}"
article_html+=' <figure class="featured-image tag-image">'$'\n'
article_html+=" <a href=\"${SITE_URL}${post_link}\">"$'\n'
article_html+=" <img src=\"${image_url}\" alt=\"${alt_text}\" />"$'\n'
article_html+=' </a>'$'\n'
article_html+=" <figcaption>${figcaption_content}</figcaption>"$'\n'
article_html+=' </figure>'$'\n'
fi
if [ -n "$description" ]; then
article_html+=' <div class="summary">'$'\n'
article_html+=" ${description}"$'\n'
article_html+=' </div>'$'\n'
fi
article_html+=' </article>'$'\n'
BSSG_RAM_TAG_ARTICLE_HTML_BY_SLUG["$slug"]="$article_html"
BSSG_RAM_RSS_TEMPLATE_BY_SLUG["$slug"]="${file}|${filename}|${title}|${date}|${lastmod}|%TAG%|${slug}|${image}|${image_caption}|${description}|${author_name}|${author_email}"
if $can_prime_rss_metadata; then
_ram_prime_rss_metadata_entry "$date" "$lastmod" "$slug" "$rss_date_fmt" "$build_timestamp_iso" "$file" >/dev/null || true
fi
done <<< "$file_index_data"
if $can_prime_rss_metadata; then
BSSG_RAM_RSS_METADATA_CACHE_READY=true
fi
# Sort once globally by tag slug, then by publish date/lastmod descending.
# Aggregate per-tag rows in awk to reduce per-line bash map churn.
local aggregated_tags_data
aggregated_tags_data=$(printf '%s\n' "$tags_index_data" | awk 'NF' | LC_ALL=C sort -t'|' -k2,2 -k4,4r -k5,5r | awk -F'|' -v OFS='|' '
{
tag = $1
tag_slug = $2
post_slug = $7
if (tag == "" || tag_slug == "") next
if (current_tag_slug != "" && tag_slug != current_tag_slug) {
print current_tag_slug, current_tag_name, current_count, current_post_slugs
current_count = 0
current_post_slugs = ""
}
if (tag_slug != current_tag_slug) {
current_tag_slug = tag_slug
current_tag_name = tag
}
if (post_slug != "") {
if (current_post_slugs == "") {
current_post_slugs = post_slug
} else {
current_post_slugs = current_post_slugs "," post_slug
}
}
current_count++
}
END {
if (current_tag_slug != "") {
print current_tag_slug, current_tag_name, current_count, current_post_slugs
}
}')
local tag_slug tag_name tag_count_value tag_post_slugs_csv
while IFS='|' read -r tag_slug tag_name tag_count_value tag_post_slugs_csv; do
[ -z "$tag_slug" ] && continue
tag_name_by_slug["$tag_slug"]="$tag_name"
BSSG_RAM_TAG_POST_COUNT_BY_SLUG["$tag_slug"]="$tag_count_value"
local tag_post_slugs_newline=""
if [ -n "$tag_post_slugs_csv" ]; then
tag_post_slugs_newline="${tag_post_slugs_csv//,/$'\n'}"
fi
BSSG_RAM_TAG_POST_SLUGS_BY_SLUG["$tag_slug"]="$tag_post_slugs_newline"
sorted_tag_urls+=("$tag_slug")
if [ "${ENABLE_TAG_RSS:-false}" = true ] && [ -n "$tag_post_slugs_newline" ]; then
local rss_prefill_count=0
local rss_prefill_slug=""
while IFS= read -r rss_prefill_slug; do
[ -z "$rss_prefill_slug" ] && continue
rss_prefill_occurrences=$((rss_prefill_occurrences + 1))
rss_prefill_slug_hits["$rss_prefill_slug"]=$(( ${rss_prefill_slug_hits[$rss_prefill_slug]:-0} + 1 ))
if [[ -z "${rss_prefill_slug_set[$rss_prefill_slug]+_}" ]]; then
rss_prefill_slug_set["$rss_prefill_slug"]=1
rss_prefill_slugs+=("$rss_prefill_slug")
fi
rss_prefill_count=$((rss_prefill_count + 1))
if [ "$rss_prefill_count" -ge "$rss_item_limit" ]; then
break
fi
done <<< "$tag_post_slugs_newline"
fi
done <<< "$aggregated_tags_data"
if [ "${ENABLE_TAG_RSS:-false}" = true ] && [ "$rss_prefill_min_hits" -gt 1 ] && [ "${#rss_prefill_slugs[@]}" -gt 0 ]; then
local -a rss_prefill_filtered_slugs=()
local rss_prefill_slug
for rss_prefill_slug in "${rss_prefill_slugs[@]}"; do
if [ "${rss_prefill_slug_hits[$rss_prefill_slug]:-0}" -ge "$rss_prefill_min_hits" ]; then
rss_prefill_filtered_slugs+=("$rss_prefill_slug")
fi
done
if [ "${#rss_prefill_filtered_slugs[@]}" -gt 0 ]; then
rss_prefill_slugs=("${rss_prefill_filtered_slugs[@]}")
fi
fi
local rss_prefill_pool_count="${#rss_prefill_slugs[@]}"
if [ "${ENABLE_TAG_RSS:-false}" = true ] && [ "$rss_prefill_max_posts" -gt 0 ] && [ "${#rss_prefill_slugs[@]}" -gt "$rss_prefill_max_posts" ]; then
local -a rss_prefill_ranked_lines=()
local rss_prefill_slug
for rss_prefill_slug in "${rss_prefill_slugs[@]}"; do
rss_prefill_ranked_lines+=("${rss_prefill_slug_hits[$rss_prefill_slug]:-0}|$rss_prefill_slug")
done
local -a rss_prefill_capped_slugs=()
local rss_prefill_rank_line
while IFS= read -r rss_prefill_rank_line; do
[ -z "$rss_prefill_rank_line" ] && continue
rss_prefill_capped_slugs+=("${rss_prefill_rank_line#*|}")
done < <(
printf '%s\n' "${rss_prefill_ranked_lines[@]}" \
| LC_ALL=C sort -t'|' -k1,1nr -k2,2 \
| head -n "$rss_prefill_max_posts"
)
if [ "${#rss_prefill_capped_slugs[@]}" -gt 0 ]; then
rss_prefill_slugs=("${rss_prefill_capped_slugs[@]}")
fi
fi
local footer_base="$FOOTER_TEMPLATE"
footer_base=${footer_base//\{\{current_year\}\}/$(date +%Y)}
footer_base=${footer_base//\{\{author_name\}\}/"$AUTHOR_NAME"}
BSSG_RAM_TAG_FOOTER_CONTENT="$footer_base"
local header_base="$HEADER_TEMPLATE"
header_base=${header_base//\{\{site_title\}\}/"$SITE_TITLE"}
header_base=${header_base//\{\{site_description\}\}/"$SITE_DESCRIPTION"}
header_base=${header_base//\{\{og_description\}\}/"$SITE_DESCRIPTION"}
header_base=${header_base//\{\{twitter_description\}\}/"$SITE_DESCRIPTION"}
header_base=${header_base//\{\{og_type\}\}/"website"}
header_base=${header_base//\{\{site_url\}\}/"$SITE_URL"}
header_base=${header_base//\{\{og_image\}\}/""}
header_base=${header_base//\{\{twitter_image\}\}/""}
BSSG_RAM_TAG_HEADER_BASE="$header_base"
local tag_count="${#sorted_tag_urls[@]}"
echo -e "Generating ${GREEN}$tag_count${NC} tag pages from RAM index."
if [ "${ENABLE_TAG_RSS:-false}" = true ]; then
if declare -F prepare_ram_rss_metadata_cache > /dev/null; then
prepare_ram_rss_metadata_cache
fi
if [ "${RSS_INCLUDE_FULL_CONTENT:-false}" = true ] && declare -F prepare_ram_rss_full_content_cache > /dev/null; then
prepare_ram_rss_full_content_cache
fi
# Pre-warm RAM RSS item XML cache once in parent process so worker
# subshells inherit it read-only and avoid rebuilding duplicate items.
if declare -F _generate_rss_feed > /dev/null; then
local rss_prefill_post_data=""
local rss_prefill_slug rss_template_entry
for rss_prefill_slug in "${rss_prefill_slugs[@]}"; do
rss_template_entry="${BSSG_RAM_RSS_TEMPLATE_BY_SLUG[$rss_prefill_slug]}"
[ -z "$rss_template_entry" ] && continue
rss_prefill_post_data+="${rss_template_entry//%TAG%/__prefill__}"$'\n'
done
if [ -n "$rss_prefill_post_data" ]; then
if [ "${RAM_MODE_VERBOSE:-false}" = true ]; then
local max_posts_label="unlimited"
if [ "$rss_prefill_max_posts" -gt 0 ]; then
max_posts_label="$rss_prefill_max_posts"
fi
echo -e "DEBUG: Pre-warming RAM RSS item cache for ${#rss_prefill_slugs[@]} posts (${rss_prefill_occurrences} tag-RSS slots, min hits: ${rss_prefill_min_hits}, max posts: ${max_posts_label}, pool: ${rss_prefill_pool_count})."
fi
_generate_rss_feed "/dev/null" "__prefill__" "__prefill__" "/" "/rss.xml" "$rss_prefill_post_data" >/dev/null || true
fi
fi
fi
if [ "$ram_tags_timing_enabled" = true ]; then
local now_ms
now_ms="$(_bssg_tags_now_ms)"
tags_prep_ms=$((now_ms - tags_phase_start_ms))
tags_phase_start_ms="$now_ms"
fi
local tag_url
local cores
cores=$(get_parallel_jobs)
if [ "$cores" -gt "$tag_count" ]; then
cores="$tag_count"
fi
if [ "$tag_count" -gt 1 ] && [ "$cores" -gt 1 ]; then
local worker_pids=()
local worker_idx
for ((worker_idx = 0; worker_idx < cores; worker_idx++)); do
(
local idx local_tag_url local_tag
for ((idx = worker_idx; idx < tag_count; idx += cores)); do
local_tag_url="${sorted_tag_urls[$idx]}"
local_tag="${tag_name_by_slug[$local_tag_url]}"
_process_single_tag_page_ram "$local_tag_url" "$local_tag"
done
) &
worker_pids+=("$!")
done
local pid
local worker_failed=false
for pid in "${worker_pids[@]}"; do
if ! wait "$pid"; then
worker_failed=true
fi
done
if $worker_failed; then
echo -e "${RED}Parallel RAM-mode tag processing failed.${NC}"
exit 1
fi
else
for tag_url in "${sorted_tag_urls[@]}"; do
tag="${tag_name_by_slug[$tag_url]}"
_process_single_tag_page_ram "$tag_url" "$tag"
done
fi
if [ "$ram_tags_timing_enabled" = true ]; then
local now_ms
now_ms="$(_bssg_tags_now_ms)"
tags_render_ms=$((now_ms - tags_phase_start_ms))
tags_phase_start_ms="$now_ms"
fi
local header_content="$HEADER_TEMPLATE"
local footer_content="$FOOTER_TEMPLATE"
header_content=${header_content//\{\{site_title\}\}/"$SITE_TITLE"}
header_content=${header_content//\{\{page_title\}\}/"${MSG_ALL_TAGS:-"All Tags"}"}
header_content=${header_content//\{\{site_description\}\}/"$SITE_DESCRIPTION"}
header_content=${header_content//\{\{og_description\}\}/"$SITE_DESCRIPTION"}
header_content=${header_content//\{\{twitter_description\}\}/"$SITE_DESCRIPTION"}
header_content=${header_content//\{\{og_type\}\}/"website"}
header_content=${header_content//\{\{page_url\}\}/"/tags/"}
header_content=${header_content//\{\{site_url\}\}/"$SITE_URL"}
header_content=${header_content//<!-- bssg:tag_rss_link -->/}
local tags_schema_json
tags_schema_json=$(cat <<EOF
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "CollectionPage",
"name": "${MSG_ALL_TAGS:-"All Tags"}",
"description": "List of all tags on $SITE_TITLE",
"url": "$SITE_URL/tags/",
"isPartOf": {
"@type": "WebSite",
"name": "$SITE_TITLE",
"url": "$SITE_URL"
}
}
</script>
EOF
)
header_content=${header_content//\{\{schema_json_ld\}\}/"$tags_schema_json"}
header_content=${header_content//\{\{og_image\}\}/""}
header_content=${header_content//\{\{twitter_image\}\}/""}
footer_content=${footer_content//\{\{current_year\}\}/$(date +%Y)}
footer_content=${footer_content//\{\{author_name\}\}/"$AUTHOR_NAME"}
exec 5> "$main_tags_index_output"
printf '%s\n' "$header_content" >&5
printf '<h1>%s</h1>\n' "${MSG_ALL_TAGS:-All Tags}" >&5
printf '<div class="tags-list">\n' >&5
for tag_url in "${sorted_tag_urls[@]}"; do
tag="${tag_name_by_slug[$tag_url]}"
local post_count="${BSSG_RAM_TAG_POST_COUNT_BY_SLUG[$tag_url]:-0}"
printf ' <a href="%s/tags/%s/">%s <span class="tag-count">(%s)</span></a>\n' "$SITE_URL" "$tag_url" "$tag" "$post_count" >&5
done
printf '</div>\n' >&5
printf '%s\n' "$footer_content" >&5
exec 5>&-
if [ "$ram_tags_timing_enabled" = true ]; then
local now_ms
now_ms="$(_bssg_tags_now_ms)"
tags_index_ms=$((now_ms - tags_phase_start_ms))
tags_total_ms=$((now_ms - tags_total_start_ms))
echo -e "${BLUE}RAM tags sub-timing:${NC}"
echo -e " Prepare maps/cache: $(_bssg_tags_format_ms "$tags_prep_ms")"
echo -e " Tag pages+RSS: $(_bssg_tags_format_ms "$tags_render_ms")"
echo -e " tags/index.html: $(_bssg_tags_format_ms "$tags_index_ms")"
echo -e " Total tags stage: $(_bssg_tags_format_ms "$tags_total_ms")"
fi
BSSG_RAM_TAG_POST_SLUGS_BY_SLUG=()
BSSG_RAM_TAG_POST_COUNT_BY_SLUG=()
BSSG_RAM_TAG_ARTICLE_HTML_BY_SLUG=()
BSSG_RAM_RSS_TEMPLATE_BY_SLUG=()
BSSG_RAM_TAG_HEADER_BASE=""
BSSG_RAM_TAG_FOOTER_CONTENT=""
BSSG_RAM_TAG_DISPLAY_DATE_FORMAT=""
echo -e "${GREEN}Tag pages processed!${NC}"
}
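The sort-then-aggregate pipeline inside `_generate_tag_pages_ram` can be exercised in isolation. A minimal sketch with two sample `tags_index` rows (field layout assumed from the code above: tag, tag_slug, title, date, lastmod, filename, post_slug); the newer post sorts first because of `-k4,4r`:

```shell
# Streaming group-by: rows are pre-sorted by tag slug, then the awk script
# flushes one aggregated line per tag as the slug changes.
tag_summary="$(
  printf '%s\n' \
    'linux|linux|Newer post|2024-02-01||f1|newer-post' \
    'linux|linux|Older post|2024-01-01||f2|older-post' \
    | LC_ALL=C sort -t'|' -k2,2 -k4,4r -k5,5r \
    | awk -F'|' -v OFS='|' '
      {
        tag = $1; tag_slug = $2; post_slug = $7
        if (tag == "" || tag_slug == "") next
        if (current_tag_slug != "" && tag_slug != current_tag_slug) {
          print current_tag_slug, current_tag_name, current_count, current_post_slugs
          current_count = 0; current_post_slugs = ""
        }
        if (tag_slug != current_tag_slug) {
          current_tag_slug = tag_slug; current_tag_name = tag
        }
        if (post_slug != "") {
          current_post_slugs = (current_post_slugs == "") ? post_slug : current_post_slugs "," post_slug
        }
        current_count++
      }
      END {
        if (current_tag_slug != "") {
          print current_tag_slug, current_tag_name, current_count, current_post_slugs
        }
      }'
)"
echo "$tag_summary"    # prints linux|linux|2|newer-post,older-post
```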
# Generate tag pages
generate_tag_pages() {
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
_generate_tag_pages_ram
return $?
fi
echo -e "${YELLOW}Processing tag pages${ENABLE_TAG_RSS:+ and RSS feeds}...${NC}"
local tags_index_file="$CACHE_DIR/tags_index.txt"
@@ -493,9 +1087,8 @@ EOF
# Use parallel
if [ "${HAS_PARALLEL:-false}" = true ] ; then
echo -e "${GREEN}Using GNU parallel to process tag pages${NC}${ENABLE_TAG_RSS:+/feeds}"
local cores=1
if command -v nproc > /dev/null 2>&1; then cores=$(nproc);
elif command -v sysctl > /dev/null 2>&1; then cores=$(sysctl -n hw.ncpu 2>/dev/null || echo 1); fi
local cores
cores=$(get_parallel_jobs)
local jobs=$cores # Use all cores for tags by default if parallel
# Export necessary functions and variables


@@ -175,12 +175,33 @@ _process_raw_file_index() {
}
# Optimized file index building - orchestrates raw build and processing
_build_file_index_from_ram() {
while IFS= read -r file; do
[[ -z "$file" ]] && continue
local metadata
metadata=$(extract_metadata "$file") || continue
local filename
filename=$(basename "$file")
echo "$file|$filename|$metadata"
done < <(ram_mode_list_src_files) | sort -t '|' -k 4,4r -k 1,1
}
optimized_build_file_index() {
echo -e "${YELLOW}Building file index...${NC}"
local file_index="${CACHE_DIR:-.bssg_cache}/file_index.txt"
local index_marker="${CACHE_DIR:-.bssg_cache}/index_marker"
local frontmatter_changes_marker="${CACHE_DIR:-.bssg_cache}/frontmatter_changes_marker"
if [ "${BSSG_RAM_MODE:-false}" = true ] && declare -F ram_mode_list_src_files > /dev/null; then
local file_index_data
file_index_data=$(_build_file_index_from_ram)
ram_mode_set_dataset "file_index" "$file_index_data"
ram_mode_clear_dataset "file_index_prev"
ram_mode_set_dataset "frontmatter_changes_marker" "1"
echo -e "${GREEN}File index built from RAM preload with $(ram_mode_dataset_line_count "file_index") complete entries!${NC}"
return 0
fi
# Check if rebuild is needed
if [ "${FORCE_REBUILD:-false}" = false ] && [ -f "$file_index" ] && [ -f "$index_marker" ]; then
@@ -293,6 +314,44 @@ build_tags_index() {
local tags_index_file="${CACHE_DIR:-.bssg_cache}/tags_index.txt"
local frontmatter_changes_marker="${CACHE_DIR:-.bssg_cache}/frontmatter_changes_marker"
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
local file_index_data tags_index_data
file_index_data=$(ram_mode_get_dataset "file_index")
if [ -z "$file_index_data" ]; then
ram_mode_set_dataset "tags_index" ""
ram_mode_clear_dataset "has_tags"
echo -e "${GREEN}Tags index built!${NC}"
return 0
fi
tags_index_data=$(printf '%s\n' "$file_index_data" | awk -F'|' -v OFS='|' '
{
if (length($6) > 0) {
split($6, tags_array, ",");
for (i in tags_array) {
tag = tags_array[i];
gsub(/^[[:space:]]+|[[:space:]]+$/, "", tag);
if (length(tag) == 0) continue;
tag_slug = tolower(tag);
gsub(/[^a-z0-9]+/, "-", tag_slug);
gsub(/^-+|-+$/, "", tag_slug);
if (length(tag_slug) == 0) tag_slug = "-";
print tag, tag_slug, $3, $4, $5, $2, $7, $8, $9, $10, $11, $12;
}
}
}')
ram_mode_set_dataset "tags_index" "$tags_index_data"
if [ -n "$tags_index_data" ]; then
ram_mode_set_dataset "has_tags" "1"
else
ram_mode_clear_dataset "has_tags"
fi
echo -e "${GREEN}Tags index built!${NC}"
return 0
fi
# --- Optimized Rebuild Check --- START ---
local rebuild_needed=false
local reason=""
@@ -376,6 +435,39 @@ build_authors_index() {
local file_index="${CACHE_DIR:-.bssg_cache}/file_index.txt"
local authors_index_file="${CACHE_DIR:-.bssg_cache}/authors_index.txt"
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
local file_index_data authors_index_data
file_index_data=$(ram_mode_get_dataset "file_index")
if [ -z "$file_index_data" ]; then
ram_mode_set_dataset "authors_index" ""
ram_mode_clear_dataset "has_authors"
echo -e "${GREEN}Authors index built!${NC}"
return 0
fi
authors_index_data=$(printf '%s\n' "$file_index_data" | awk -F'|' -v OFS='|' '
{
author_name = $11;
author_email = $12;
if (length(author_name) == 0) next;
author_slug = tolower(author_name);
gsub(/[^a-z0-9]+/, "-", author_slug);
gsub(/^-+|-+$/, "", author_slug);
if (length(author_slug) == 0) author_slug = "anonymous";
print author_name, author_slug, author_email, $3, $4, $5, $2, $7, $8, $9, $10;
}')
ram_mode_set_dataset "authors_index" "$authors_index_data"
if [ -n "$authors_index_data" ]; then
ram_mode_set_dataset "has_authors" "1"
else
ram_mode_clear_dataset "has_authors"
fi
echo -e "${GREEN}Authors index built!${NC}"
return 0
fi
# Check if rebuild is needed: missing cache or input/dependencies changed
local rebuild_needed=false
if [ ! -f "$authors_index_file" ]; then
@@ -443,6 +535,18 @@ identify_affected_authors() {
export AFFECTED_AUTHORS=""
export AUTHORS_INDEX_NEEDS_REBUILD="false"
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
local authors_index_data
authors_index_data=$(ram_mode_get_dataset "authors_index")
if [ -n "$authors_index_data" ]; then
AFFECTED_AUTHORS=$(printf '%s\n' "$authors_index_data" | awk -F'|' 'NF { print $1 }' | sort -u | tr '\n' ' ')
AUTHORS_INDEX_NEEDS_REBUILD="true"
fi
export AFFECTED_AUTHORS
export AUTHORS_INDEX_NEEDS_REBUILD
return 0
fi
# If previous index doesn't exist, all authors in the current index are affected,
# and the main index needs rebuilding.
if [ ! -f "$authors_index_prev_file" ]; then
@@ -519,6 +623,43 @@ build_archive_index() {
local file_index="${CACHE_DIR:-.bssg_cache}/file_index.txt"
local archive_index_file="${CACHE_DIR:-.bssg_cache}/archive_index.txt"
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
local file_index_data archive_index_data=""
file_index_data=$(ram_mode_get_dataset "file_index")
if [ -z "$file_index_data" ]; then
ram_mode_set_dataset "archive_index" ""
echo -e "${GREEN}Archive index built!${NC}"
return 0
fi
local line file filename title date lastmod tags slug image image_caption description author_name author_email
while IFS= read -r line; do
[ -z "$line" ] && continue
IFS='|' read -r file filename title date lastmod tags slug image image_caption description author_name author_email <<< "$line"
[ -z "$date" ] && continue
local year month month_name
if [[ "$date" =~ ^([0-9]{4})[-/]([0-9]{1,2})[-/]([0-9]{1,2}) ]]; then
year="${BASH_REMATCH[1]}"
month=$(printf "%02d" "$((10#${BASH_REMATCH[2]}))")
else
continue
fi
local month_name_var="MSG_MONTH_${month}"
month_name="${!month_name_var}"
if [[ -z "$month_name" ]]; then
month_name="$month"
fi
archive_index_data+="$year|$month|$month_name|$title|$date|$lastmod|$filename.html|$slug|$image|$image_caption|$description|$author_name|$author_email"$'\n'
done <<< "$file_index_data"
ram_mode_set_dataset "archive_index" "$archive_index_data"
echo -e "${GREEN}Archive index built!${NC}"
return 0
fi
# Check if rebuild is needed: missing cache or input/dependencies changed
local rebuild_needed=false
if [ ! -f "$archive_index_file" ]; then
@@ -604,6 +745,18 @@ identify_affected_archive_months() {
export AFFECTED_ARCHIVE_MONTHS=""
export ARCHIVE_INDEX_NEEDS_REBUILD="false"
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
local archive_index_data
archive_index_data=$(ram_mode_get_dataset "archive_index")
if [ -n "$archive_index_data" ]; then
AFFECTED_ARCHIVE_MONTHS=$(printf '%s\n' "$archive_index_data" | awk -F'|' 'NF { print $1 "|" $2 }' | sort -u | tr '\n' ' ')
ARCHIVE_INDEX_NEEDS_REBUILD="true"
fi
export AFFECTED_ARCHIVE_MONTHS
export ARCHIVE_INDEX_NEEDS_REBUILD
return 0
fi
# If previous index doesn't exist, all months in the current index are affected,
# and the main index needs rebuilding.
if [ ! -f "$archive_index_prev_file" ]; then
@@ -673,4 +826,4 @@ identify_affected_archive_months() {
trap - RETURN # Remove trap upon successful completion
}
# --- Indexing Functions --- END ---


@@ -13,6 +13,7 @@ BUILD_START_TIME=$(date +%s)
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
# Determine the project root (one level up from the SCRIPT_DIR's parent)
PROJECT_ROOT="$( dirname "$( dirname "$SCRIPT_DIR" )" )"
export BSSG_PROJECT_ROOT="$PROJECT_ROOT"
# Check if PROJECT_ROOT is already the current directory to avoid unnecessary cd
if [ "$PWD" != "$PROJECT_ROOT" ]; then
echo "Changing directory to project root: $PROJECT_ROOT"
@@ -81,25 +82,180 @@ fi
# shellcheck source=utils.sh
source "${SCRIPT_DIR}/utils.sh" || { echo -e "\033[0;31mError: Failed to source utils.sh\033[0m"; exit 1; }
# Build mode validation and setup
BUILD_MODE="${BUILD_MODE:-normal}"
case "$BUILD_MODE" in
normal|ram) ;;
*)
echo -e "${RED}Error: Invalid BUILD_MODE '$BUILD_MODE'. Use 'normal' or 'ram'.${NC}" >&2
exit 1
;;
esac
export BUILD_MODE
export BSSG_RAM_MODE=false
# Print the theme being used for this build (final value after potential random selection)
echo -e "${GREEN}Using theme: ${THEME}${NC}"
echo "Loaded utilities."
# --- RAM Mode Stage Timing --- START ---
BSSG_RAM_TIMING_ENABLED=false
if [ "$BUILD_MODE" = "ram" ]; then
BSSG_RAM_TIMING_ENABLED=true
fi
declare -ga BSSG_RAM_TIMING_STAGE_KEYS=()
declare -ga BSSG_RAM_TIMING_STAGE_LABELS=()
declare -ga BSSG_RAM_TIMING_STAGE_MS=()
BSSG_RAM_TIMING_STAGE_ACTIVE=false
BSSG_RAM_TIMING_CURRENT_STAGE_KEY=""
BSSG_RAM_TIMING_CURRENT_STAGE_LABEL=""
BSSG_RAM_TIMING_CURRENT_STAGE_START_MS=0
_bssg_ram_timing_now_ms() {
if [ -n "${EPOCHREALTIME:-}" ]; then
local epoch_norm sec frac ms_part
# Some locales expose EPOCHREALTIME with ',' instead of '.' as decimal separator.
epoch_norm="${EPOCHREALTIME/,/.}"
if [[ "$epoch_norm" =~ ^([0-9]+)([.][0-9]+)?$ ]]; then
sec="${BASH_REMATCH[1]}"
frac="${BASH_REMATCH[2]#.}"
frac="${frac}000"
ms_part="${frac:0:3}"
printf '%s\n' $(( 10#$sec * 1000 + 10#$ms_part ))
return
fi
fi
if command -v perl >/dev/null 2>&1; then
perl -MTime::HiRes=time -e 'printf("%.0f\n", time()*1000)'
else
printf '%s\n' $(( $(date +%s) * 1000 ))
fi
}
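The normalization above guards against locales where `EPOCHREALTIME` uses a comma as the decimal separator. A minimal sketch applying the same steps to a fixed sample value (chosen arbitrarily for illustration):

```shell
# Sample value as a comma-decimal locale would render EPOCHREALTIME.
sample='1760000000,123456'
# Normalize the separator, then split into seconds and a 3-digit ms part.
epoch_norm="${sample/,/.}"
if [[ "$epoch_norm" =~ ^([0-9]+)([.][0-9]+)?$ ]]; then
  sec="${BASH_REMATCH[1]}"
  frac="${BASH_REMATCH[2]#.}"
  frac="${frac}000"              # pad so short fractions still yield 3 digits
  ms_part="${frac:0:3}"
  now_ms=$(( 10#$sec * 1000 + 10#$ms_part ))
fi
echo "$now_ms"    # prints 1760000000123
```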
_bssg_ram_timing_format_ms() {
local ms="$1"
printf '%d.%03ds' $((ms / 1000)) $((ms % 1000))
}
bssg_ram_timing_start() {
if [ "$BSSG_RAM_TIMING_ENABLED" != true ]; then
return
fi
if [ "$BSSG_RAM_TIMING_STAGE_ACTIVE" = true ]; then
bssg_ram_timing_end
fi
BSSG_RAM_TIMING_CURRENT_STAGE_KEY="$1"
BSSG_RAM_TIMING_CURRENT_STAGE_LABEL="$2"
BSSG_RAM_TIMING_CURRENT_STAGE_START_MS="$(_bssg_ram_timing_now_ms)"
BSSG_RAM_TIMING_STAGE_ACTIVE=true
}
bssg_ram_timing_end() {
if [ "$BSSG_RAM_TIMING_ENABLED" != true ] || [ "$BSSG_RAM_TIMING_STAGE_ACTIVE" != true ]; then
return
fi
local end_ms elapsed_ms
end_ms="$(_bssg_ram_timing_now_ms)"
elapsed_ms=$((end_ms - BSSG_RAM_TIMING_CURRENT_STAGE_START_MS))
if [ "$elapsed_ms" -lt 0 ]; then
elapsed_ms=0
fi
BSSG_RAM_TIMING_STAGE_KEYS+=("$BSSG_RAM_TIMING_CURRENT_STAGE_KEY")
BSSG_RAM_TIMING_STAGE_LABELS+=("$BSSG_RAM_TIMING_CURRENT_STAGE_LABEL")
BSSG_RAM_TIMING_STAGE_MS+=("$elapsed_ms")
BSSG_RAM_TIMING_STAGE_ACTIVE=false
BSSG_RAM_TIMING_CURRENT_STAGE_KEY=""
BSSG_RAM_TIMING_CURRENT_STAGE_LABEL=""
BSSG_RAM_TIMING_CURRENT_STAGE_START_MS=0
}
bssg_ram_timing_print_summary() {
if [ "$BSSG_RAM_TIMING_ENABLED" != true ]; then
return
fi
# Close any open stage (defensive; build flow should end stages explicitly).
if [ "$BSSG_RAM_TIMING_STAGE_ACTIVE" = true ]; then
bssg_ram_timing_end
fi
local count="${#BSSG_RAM_TIMING_STAGE_MS[@]}"
if [ "$count" -eq 0 ]; then
return
fi
local total_ms=0
local max_ms=0
local max_label=""
local i
for ((i = 0; i < count; i++)); do
local stage_ms="${BSSG_RAM_TIMING_STAGE_MS[$i]}"
total_ms=$((total_ms + stage_ms))
if [ "$stage_ms" -gt "$max_ms" ]; then
max_ms="$stage_ms"
max_label="${BSSG_RAM_TIMING_STAGE_LABELS[$i]}"
fi
done
echo "------------------------------------------------------"
echo -e "${GREEN}RAM mode timing summary:${NC}"
printf " %-26s %12s %10s\n" "Stage" "Duration" "Share"
for ((i = 0; i < count; i++)); do
local stage_label="${BSSG_RAM_TIMING_STAGE_LABELS[$i]}"
local stage_ms="${BSSG_RAM_TIMING_STAGE_MS[$i]}"
local pct_tenths=0
if [ "$total_ms" -gt 0 ]; then
pct_tenths=$(( (stage_ms * 1000 + total_ms / 2) / total_ms ))
fi
local formatted_ms
formatted_ms="$(_bssg_ram_timing_format_ms "$stage_ms")"
printf " %-26s %12s %6d.%d%%\n" "$stage_label" "$formatted_ms" $((pct_tenths / 10)) $((pct_tenths % 10))
done
echo -e " ${GREEN}Total (timed stages): $(_bssg_ram_timing_format_ms "$total_ms")${NC}"
if [ -n "$max_label" ]; then
echo -e " ${YELLOW}Slowest stage:${NC} ${max_label} ($(_bssg_ram_timing_format_ms "$max_ms"))"
fi
}
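The summary above rounds each stage's share to tenths of a percent using pure integer arithmetic; adding `total_ms / 2` before dividing rounds to nearest rather than truncating. A standalone sketch:

```shell
# Integer fixed-point percentage: 333 ms out of 1000 ms total.
stage_ms=333
total_ms=1000
# (333 * 1000 + 500) / 1000 = 333 tenths of a percent.
pct_tenths=$(( (stage_ms * 1000 + total_ms / 2) / total_ms ))
printf '%d.%d%%\n' $((pct_tenths / 10)) $((pct_tenths % 10))    # prints 33.3%
```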
# --- RAM Mode Stage Timing --- END ---
# Check Dependencies
# shellcheck source=deps.sh
bssg_ram_timing_start "dependencies" "Dependencies"
source "${SCRIPT_DIR}/deps.sh" || { echo -e "${RED}Error: Failed to source deps.sh${NC}"; exit 1; }
check_dependencies # Call the function to perform checks and export HAS_PARALLEL
bssg_ram_timing_end
if [ "$BUILD_MODE" = "ram" ]; then
export BSSG_RAM_MODE=true
export FORCE_REBUILD=true
# shellcheck source=ram_mode.sh
source "${SCRIPT_DIR}/ram_mode.sh" || { echo -e "${RED}Error: Failed to source ram_mode.sh${NC}"; exit 1; }
print_info "RAM mode enabled: source/template files and build indexes are held in memory."
print_info "RAM mode parallel worker cap: ${RAM_MODE_MAX_JOBS:-6} (set RAM_MODE_MAX_JOBS to tune)."
fi
echo "Checked dependencies. Parallel available: ${HAS_PARALLEL:-false}"
# Source Cache Manager (defines cache functions)
# shellcheck source=cache.sh
bssg_ram_timing_start "cache_setup" "Cache Setup/Clean"
source "${SCRIPT_DIR}/cache.sh" || { echo -e "${RED}Error: Failed to source cache.sh${NC}"; exit 1; }
echo "Loaded cache manager."
# Check if config changed BEFORE updating the hash file, store status for later use
BSSG_CONFIG_CHANGED_STATUS=1 # Default to 1 (not changed)
if config_has_changed; then
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
# RAM mode is intentionally ephemeral; always rebuild from preloaded inputs.
BSSG_CONFIG_CHANGED_STATUS=0
elif config_has_changed; then
BSSG_CONFIG_CHANGED_STATUS=0 # Set to 0 (changed)
fi
export BSSG_CONFIG_CHANGED_STATUS
@@ -118,17 +274,21 @@ fi
# --- Add check for CLEAN_OUTPUT influencing FORCE_REBUILD --- END ---
# Handle --force-rebuild first
if [ "${FORCE_REBUILD:-false}" = true ]; then
if [ "${BSSG_RAM_MODE:-false}" != true ] && [ "${FORCE_REBUILD:-false}" = true ]; then
echo -e "${YELLOW}Force rebuild enabled, deleting entire cache directory (${CACHE_DIR:-.bssg_cache})...${NC}"
rm -rf "${CACHE_DIR:-.bssg_cache}"
echo -e "${GREEN}Cache deleted!${NC}"
fi
echo "Ensuring cache directory structure exists... (${CACHE_DIR:-.bssg_cache})"
mkdir -p "${CACHE_DIR:-.bssg_cache}/meta" "${CACHE_DIR:-.bssg_cache}/content"
if [ "${BSSG_RAM_MODE:-false}" != true ]; then
echo "Ensuring cache directory structure exists... (${CACHE_DIR:-.bssg_cache})"
mkdir -p "${CACHE_DIR:-.bssg_cache}/meta" "${CACHE_DIR:-.bssg_cache}/content"
# Create initial config hash *after* ensuring cache dir exists
create_config_hash
# Create initial config hash *after* ensuring cache dir exists
create_config_hash
else
echo "RAM mode: skipping cache directory creation and config hash persistence."
fi
# --- Initial Cache Setup & Cleaning --- END
# Handle --clean-output flag (using logic moved from original main/clean_output_directory)
@@ -148,10 +308,12 @@ if [ "${CLEAN_OUTPUT:-false}" = true ]; then
echo -e "${YELLOW}Output directory (${OUTPUT_DIR:-output}) does not exist, no need to clean.${NC}"
fi
fi
bssg_ram_timing_end
# Source Content Processor (defines functions like extract_metadata, convert_markdown_to_html)
# Moved up before indexing as indexing uses some content functions (e.g., generate_slug)
# shellcheck source=content.sh
bssg_ram_timing_start "index_build" "Index/Data Build"
source "${SCRIPT_DIR}/content.sh" || { echo -e "${RED}Error: Failed to source content.sh${NC}"; exit 1; }
echo "Loaded content processing functions."
@@ -161,17 +323,23 @@ echo "Loaded content processing functions."
source "${SCRIPT_DIR}/indexing.sh" || { echo -e "${RED}Error: Failed to source indexing.sh${NC}"; exit 1; }
echo "Loaded indexing functions."
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
ram_mode_preload_inputs || { echo -e "${RED}Error: RAM preload failed.${NC}"; exit 1; }
fi
# --- Build Intermediate Indexes ---
# Moved up before preload_templates
# --- Start Change: Snapshot previous file index ---
file_index_file="${CACHE_DIR:-.bssg_cache}/file_index.txt"
file_index_prev_file="${CACHE_DIR:-.bssg_cache}/file_index_prev.txt"
if [ -f "$file_index_file" ]; then
echo "Snapshotting previous file index to $file_index_prev_file" >&2 # Debug
cp "$file_index_file" "$file_index_prev_file"
else
# Ensure previous file doesn't exist if current doesn't
rm -f "$file_index_prev_file"
if [ "${BSSG_RAM_MODE:-false}" != true ]; then
if [ -f "$file_index_file" ]; then
echo "Snapshotting previous file index to $file_index_prev_file" >&2 # Debug
cp "$file_index_file" "$file_index_prev_file"
else
# Ensure previous file doesn't exist if current doesn't
rm -f "$file_index_prev_file"
fi
fi
# --- End Change ---
optimized_build_file_index || { echo -e "${RED}Error: Failed to build file index.${NC}"; exit 1; }
@@ -179,12 +347,14 @@ optimized_build_file_index || { echo -e "${RED}Error: Failed to build file index
# --- Start Change: Snapshot previous tags index ---
tags_index_file="${CACHE_DIR:-.bssg_cache}/tags_index.txt"
tags_index_prev_file="${CACHE_DIR:-.bssg_cache}/tags_index_prev.txt"
if [ -f "$tags_index_file" ]; then
echo "Snapshotting previous tags index to $tags_index_prev_file" >&2 # Debug
cp "$tags_index_file" "$tags_index_prev_file"
else
# Ensure previous file doesn't exist if current doesn't
rm -f "$tags_index_prev_file"
if [ "${BSSG_RAM_MODE:-false}" != true ]; then
if [ -f "$tags_index_file" ]; then
echo "Snapshotting previous tags index to $tags_index_prev_file" >&2 # Debug
cp "$tags_index_file" "$tags_index_prev_file"
else
# Ensure previous file doesn't exist if current doesn't
rm -f "$tags_index_prev_file"
fi
fi
# --- End Change ---
@@ -199,12 +369,14 @@ build_tags_index || { echo -e "${RED}Error: Failed to build tags index.${NC}"; e
# --- Start Change: Snapshot previous authors index ---
authors_index_file="${CACHE_DIR:-.bssg_cache}/authors_index.txt"
authors_index_prev_file="${CACHE_DIR:-.bssg_cache}/authors_index_prev.txt"
if [ -f "$authors_index_file" ]; then
echo "Snapshotting previous authors index to $authors_index_prev_file" >&2 # Debug
cp "$authors_index_file" "$authors_index_prev_file"
else
# Ensure previous file doesn't exist if current doesn't
rm -f "$authors_index_prev_file"
if [ "${BSSG_RAM_MODE:-false}" != true ]; then
if [ -f "$authors_index_file" ]; then
echo "Snapshotting previous authors index to $authors_index_prev_file" >&2 # Debug
cp "$authors_index_file" "$authors_index_prev_file"
else
# Ensure previous file doesn't exist if current doesn't
rm -f "$authors_index_prev_file"
fi
fi
# --- End Change ---
@@ -218,12 +390,14 @@ if [ "${ENABLE_ARCHIVES:-false}" = true ]; then
# --- Start Change: Snapshot previous archive index ---
archive_index_file="${CACHE_DIR:-.bssg_cache}/archive_index.txt"
archive_index_prev_file="${CACHE_DIR:-.bssg_cache}/archive_index_prev.txt"
if [ -f "$archive_index_file" ]; then
echo "Snapshotting previous archive index to $archive_index_prev_file" >&2 # Debug
cp "$archive_index_file" "$archive_index_prev_file"
else
# Ensure previous file doesn't exist if current doesn't
rm -f "$archive_index_prev_file"
if [ "${BSSG_RAM_MODE:-false}" != true ]; then
if [ -f "$archive_index_file" ]; then
echo "Snapshotting previous archive index to $archive_index_prev_file" >&2 # Debug
cp "$archive_index_file" "$archive_index_prev_file"
else
# Ensure previous file doesn't exist if current doesn't
rm -f "$archive_index_prev_file"
fi
fi
# --- End Change ---
build_archive_index || { echo -e "${RED}Error: Failed to build archive index.${NC}"; exit 1; }
@@ -232,10 +406,12 @@ if [ "${ENABLE_ARCHIVES:-false}" = true ]; then
# --- End Change ---
fi
echo "Built intermediate cache indexes."
bssg_ram_timing_end
# Load Templates (and generate dynamic menus, exports vars like HEADER_TEMPLATE)
# Moved down after indexing
# shellcheck source=templates.sh
bssg_ram_timing_start "templates" "Template Prep"
source "${SCRIPT_DIR}/templates.sh" || { echo -e "${RED}Error: Failed to source templates.sh${NC}"; exit 1; }
preload_templates # Call the function
echo "Loaded and processed templates."
@@ -279,6 +455,7 @@ fi
export BSSG_MAX_TEMPLATE_LOCALE_TIME=$latest_template_locale_time
echo "Latest template/locale time: $BSSG_MAX_TEMPLATE_LOCALE_TIME (Header: $header_time, Footer: $footer_time, Locale: $locale_time)"
# --- Pre-calculate Max Template/Locale Time --- END ---
bssg_ram_timing_end
# --- Prepare for Parallel Processing ---
if [ "${HAS_PARALLEL:-false}" = true ]; then
@@ -311,32 +488,39 @@ fi
# --- Generate Content HTML ---
# Source and run Post Generator
# shellcheck source=generate_posts.sh
bssg_ram_timing_start "posts" "Posts"
source "${SCRIPT_DIR}/generate_posts.sh" || { echo -e "${RED}Error: Failed to source generate_posts.sh${NC}"; exit 1; }
process_all_markdown_files || { echo -e "${RED}Error: Post processing failed.${NC}"; exit 1; }
echo "Generated post HTML files."
bssg_ram_timing_end
# --- Post Generation --- END ---
# --- Page Generation --- START --
# Source the page generation script
# shellcheck source=generate_pages.sh disable=SC1091
bssg_ram_timing_start "pages" "Static Pages"
source "$SCRIPT_DIR/generate_pages.sh" || { echo -e "${RED}Error: Failed to source generate_pages.sh${NC}"; exit 1; }
# Call the main page processing function
process_all_pages || { echo -e "${RED}Error: Page processing failed.${NC}"; exit 1; }
bssg_ram_timing_end
# --- Page Generation --- END ---
# --- Tag Page Generation --- START ---
# Source and run Tag Page Generator
# shellcheck source=generate_tags.sh disable=SC1091
bssg_ram_timing_start "tags" "Tags"
source "$SCRIPT_DIR/generate_tags.sh" || { echo -e "${RED}Error: Failed to source generate_tags.sh${NC}"; exit 1; }
# Call the main function from the sourced script
generate_tag_pages || { echo -e "${RED}Error: Tag page generation failed.${NC}"; exit 1; }
echo "Generated tag list pages."
bssg_ram_timing_end
# --- Tag Page Generation --- END ---
# --- Author Page Generation --- START ---
# Source and run Author Page Generator (if enabled)
if [ "${ENABLE_AUTHOR_PAGES:-true}" = true ]; then
bssg_ram_timing_start "authors" "Authors"
# shellcheck source=generate_authors.sh disable=SC1091
source "$SCRIPT_DIR/generate_authors.sh" || { echo -e "${RED}Error: Failed to source generate_authors.sh${NC}"; exit 1; }
@@ -344,12 +528,14 @@ if [ "${ENABLE_AUTHOR_PAGES:-true}" = true ]; then
# It will internally use AFFECTED_AUTHORS and AUTHORS_INDEX_NEEDS_REBUILD
generate_author_pages || { echo -e "${RED}Error: Author page generation failed.${NC}"; exit 1; }
echo "Generated author pages."
bssg_ram_timing_end
fi
# --- Author Page Generation --- END ---
# --- Archive Page Generation --- START ---
# Source and run Archive Page Generator (if enabled)
if [ "${ENABLE_ARCHIVES:-false}" = true ]; then
bssg_ram_timing_start "archives" "Archives"
# Source the script (loads functions)
# shellcheck source=generate_archives.sh disable=SC1091
source "$SCRIPT_DIR/generate_archives.sh" || { echo -e "${RED}Error: Failed to source generate_archives.sh${NC}"; exit 1; }
@@ -358,28 +544,68 @@ if [ "${ENABLE_ARCHIVES:-false}" = true ]; then
# It will internally use AFFECTED_ARCHIVE_MONTHS and ARCHIVE_INDEX_NEEDS_REBUILD
generate_archive_pages || { echo -e "${RED}Error: Archive page generation failed.${NC}"; exit 1; }
echo "Generated archive pages."
bssg_ram_timing_end
fi
# --- Archive Page Generation --- END ---
# --- Main Index Page Generation --- START ---
# Source and run Main Index Page Generator
# shellcheck source=generate_index.sh disable=SC1091
bssg_ram_timing_start "main_index" "Main Index"
source "$SCRIPT_DIR/generate_index.sh" || { echo -e "${RED}Error: Failed to source generate_index.sh${NC}"; exit 1; }
# Call the main function from the sourced script
generate_index || { echo -e "${RED}Error: Index page generation failed.${NC}"; exit 1; }
echo "Generated main index/pagination pages."
bssg_ram_timing_end
# --- Main Index Page Generation --- END ---
# --- Feed Generation --- START ---
# Source and run Feed Generator
# shellcheck source=generate_feeds.sh disable=SC1091
bssg_ram_timing_start "feeds" "Sitemap/RSS"
source "$SCRIPT_DIR/generate_feeds.sh" || { echo -e "${RED}Error: Failed to source generate_feeds.sh${NC}"; exit 1; }
# Call the functions from the sourced script
echo "Timing sitemap generation..."
generate_sitemap || echo -e "${YELLOW}Sitemap generation failed, continuing build...${NC}" # Allow failure
echo "Timing RSS feed generation..."
generate_rss || echo -e "${YELLOW}RSS feed generation failed, continuing build...${NC}" # Allow failure
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
echo "Timing RSS feed generation..."
feed_jobs=0
feed_jobs=$(get_parallel_jobs)
if [ "$feed_jobs" -gt 1 ]; then
echo "RAM mode: generating sitemap and RSS in parallel..."
sitemap_failed=false
rss_failed=false
generate_sitemap &
sitemap_pid=$!
generate_rss &
rss_pid=$!
if ! wait "$sitemap_pid"; then
sitemap_failed=true
fi
if ! wait "$rss_pid"; then
rss_failed=true
fi
if $sitemap_failed; then
echo -e "${YELLOW}Sitemap generation failed, continuing build...${NC}"
fi
if $rss_failed; then
echo -e "${YELLOW}RSS feed generation failed, continuing build...${NC}"
fi
else
generate_sitemap || echo -e "${YELLOW}Sitemap generation failed, continuing build...${NC}" # Allow failure
generate_rss || echo -e "${YELLOW}RSS feed generation failed, continuing build...${NC}" # Allow failure
fi
else
generate_sitemap || echo -e "${YELLOW}Sitemap generation failed, continuing build...${NC}" # Allow failure
echo "Timing RSS feed generation..."
generate_rss || echo -e "${YELLOW}RSS feed generation failed, continuing build...${NC}" # Allow failure
fi
echo "Generated RSS feed and sitemap."
bssg_ram_timing_end
# --- Feed Generation --- END ---
# --- Secondary Pages Index Generation --- START ---
@@ -389,10 +615,12 @@ echo "Generated RSS feed and sitemap."
# We attempt to reconstruct the array from the exported string.
# shellcheck disable=SC2154 # SECONDARY_PAGES is exported by templates.sh
if [ -n "$SECONDARY_PAGES" ] && [ "$SECONDARY_PAGES" != "()" ]; then
bssg_ram_timing_start "secondary_index" "Secondary Index"
# shellcheck source=generate_secondary_pages.sh disable=SC1091
source "$SCRIPT_DIR/generate_secondary_pages.sh" || { echo -e "${RED}Error: Failed to source generate_secondary_pages.sh${NC}"; exit 1; }
generate_pages_index || echo -e "${YELLOW}Secondary pages index generation failed, continuing build...${NC}" # Allow failure
echo "Generated secondary pages index."
bssg_ram_timing_end
else
echo "No secondary pages defined, skipping secondary index generation."
fi
@@ -401,6 +629,7 @@ fi
# --- Asset Handling --- START ---
# Source the asset handling script
# shellcheck source=assets.sh disable=SC1091
bssg_ram_timing_start "assets" "Assets/CSS"
source "$SCRIPT_DIR/assets.sh" || { echo -e "${RED}Error: Failed to source assets.sh${NC}"; exit 1; }
# Copy static assets
echo "Timing static files copy..."
@@ -409,41 +638,60 @@ copy_static_files || { echo -e "${RED}Error: Failed to copy static assets.${NC}"
echo "Timing CSS/Theme processing..."
create_css "$OUTPUT_DIR" "$THEME" || { echo -e "${RED}Error: Failed to process CSS.${NC}"; exit 1; } # Pass OUTPUT_DIR and THEME
echo "Handled static assets and CSS."
bssg_ram_timing_end
# --- Asset Handling --- END ---
# --- Post Processing --- START ---
# Source and run Post Processor
# shellcheck source=post_process.sh disable=SC1091
bssg_ram_timing_start "post_process" "Post Processing"
source "$SCRIPT_DIR/post_process.sh" || { echo -e "${RED}Error: Failed to source post_process.sh${NC}"; exit 1; }
echo "Timing URL post-processing..."
post_process_urls || echo -e "${YELLOW}URL post-processing failed, continuing...${NC}" # Allow failure
echo "Timing output permissions fix..."
fix_output_permissions || echo -e "${YELLOW}Fixing output permissions failed, continuing...${NC}" # Allow failure
echo "Completed post-processing."
bssg_ram_timing_end
# --- Post Processing --- END ---
# --- Final Cache Update --- START ---
create_config_hash
if [ "${BSSG_RAM_MODE:-false}" != true ]; then
create_config_hash
fi
# --- Final Cache Update --- END ---
# --- Final Cleanup --- START ---
echo "Cleaning up previous index files..."
rm -f "${CACHE_DIR:-.bssg_cache}/file_index_prev.txt"
rm -f "${CACHE_DIR:-.bssg_cache}/tags_index_prev.txt"
rm -f "${CACHE_DIR:-.bssg_cache}/authors_index_prev.txt"
rm -f "${CACHE_DIR:-.bssg_cache}/archive_index_prev.txt"
if [ "${BSSG_RAM_MODE:-false}" != true ]; then
echo "Cleaning up previous index files..."
rm -f "${CACHE_DIR:-.bssg_cache}/file_index_prev.txt"
rm -f "${CACHE_DIR:-.bssg_cache}/tags_index_prev.txt"
rm -f "${CACHE_DIR:-.bssg_cache}/authors_index_prev.txt"
rm -f "${CACHE_DIR:-.bssg_cache}/archive_index_prev.txt"
# Remove the frontmatter changes marker if it exists
rm -f "${CACHE_DIR:-.bssg_cache}/frontmatter_changes_marker"
# Remove the frontmatter changes marker if it exists
rm -f "${CACHE_DIR:-.bssg_cache}/frontmatter_changes_marker"
# Clean up related posts temporary files to prevent unnecessary cache invalidation on next build
rm -f "${CACHE_DIR:-.bssg_cache}/modified_tags.list"
rm -f "${CACHE_DIR:-.bssg_cache}/modified_authors.list"
rm -f "${CACHE_DIR:-.bssg_cache}/related_posts_invalidated.list"
# Clean up related posts temporary files to prevent unnecessary cache invalidation on next build
rm -f "${CACHE_DIR:-.bssg_cache}/modified_tags.list"
rm -f "${CACHE_DIR:-.bssg_cache}/modified_authors.list"
rm -f "${CACHE_DIR:-.bssg_cache}/related_posts_invalidated.list"
fi
# --- Final Cleanup --- END ---
# --- Pre-compress Assets --- START ---
_precompress_single_file() {
local file="$1"
local gzfile="$2"
local compression_level="$3"
local verbose_logs="$4"
if [ "$verbose_logs" = "true" ]; then
echo "Compressing: $file"
fi
gzip -c "-${compression_level}" -- "$file" > "$gzfile"
}
precompress_assets() {
# Check if pre-compression is enabled in the config.
if [ ! "${PRECOMPRESS_ASSETS:-false}" = "true" ]; then
@@ -451,6 +699,11 @@ fi
fi
echo "Starting pre-compression of assets..."
local compression_level="${PRECOMPRESS_GZIP_LEVEL:-9}"
if ! [[ "$compression_level" =~ ^[1-9]$ ]]; then
compression_level=9
fi
local verbose_logs="${PRECOMPRESS_VERBOSE:-${RAM_MODE_VERBOSE:-false}}"
# 1. Cleanup: Remove any .gz file that does not have a corresponding original file.
# This handles cases where original files were deleted.
@@ -465,25 +718,63 @@ precompress_assets() {
# 2. Compression: Compress text files if they are new or have been updated.
# We target .html, .css, .xml and .js files.
find "${OUTPUT_DIR}" -type f \( -name "*.html" -o -name "*.css" -o -name "*.xml" -o -name "*.js" \) -print0 | while IFS= read -r -d '' file; do
gzfile="${file}.gz"
local changed_files=()
while IFS= read -r -d '' file; do
local gzfile="${file}.gz"
# Compress if the .gz file doesn't exist, or if the original file is newer.
if [ ! -f "$gzfile" ] || [ "$file" -nt "$gzfile" ]; then
echo "Compressing: $file"
# Use gzip with best compression (-9) and write to stdout, then redirect.
# This is a robust way to handle output and overwriting.
gzip -c -9 -- "$file" > "$gzfile"
changed_files+=("$file")
fi
done
done < <(find "${OUTPUT_DIR}" -type f \( -name "*.html" -o -name "*.css" -o -name "*.xml" -o -name "*.js" \) -print0)
if [ "${#changed_files[@]}" -eq 0 ]; then
echo "No changed assets to pre-compress."
echo "Asset pre-compression finished."
return
fi
local compress_jobs
compress_jobs=$(get_parallel_jobs "${PRECOMPRESS_MAX_JOBS:-0}")
if [ "$compress_jobs" -gt "${#changed_files[@]}" ]; then
compress_jobs="${#changed_files[@]}"
fi
if [ "$compress_jobs" -gt 1 ]; then
local file gzfile q_file q_gzfile q_level q_verbose
q_level=$(printf '%q' "$compression_level")
q_verbose=$(printf '%q' "$verbose_logs")
run_parallel "$compress_jobs" < <(
for file in "${changed_files[@]}"; do
gzfile="${file}.gz"
q_file=$(printf '%q' "$file")
q_gzfile=$(printf '%q' "$gzfile")
printf "_precompress_single_file %s %s %s %s\n" "$q_file" "$q_gzfile" "$q_level" "$q_verbose"
done
) || { echo -e "${RED}Asset pre-compression failed.${NC}"; return 1; }
else
local file gzfile
for file in "${changed_files[@]}"; do
gzfile="${file}.gz"
_precompress_single_file "$file" "$gzfile" "$compression_level" "$verbose_logs" || {
echo -e "${RED}Asset pre-compression failed for ${file}.${NC}"
return 1
}
done
fi
echo "Pre-compressed ${#changed_files[@]} assets using ${compress_jobs} worker(s) (gzip -${compression_level})."
echo "Asset pre-compression finished."
}
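The parallel branch above serializes each compression job as a full command line, which is why every argument is escaped with `printf '%q'` before being queued for `run_parallel`. A standalone sketch of why that escaping matters for paths with spaces or quotes (`emit_name` is a hypothetical worker, and `bash -c` stands in for the shell that `run_parallel` uses to re-parse each queued line):

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a worker dispatched via run_parallel;
# the real worker in the build is _precompress_single_file.
emit_name() { printf 'got:%s\n' "$1"; }
export -f emit_name

file='index page.html'          # embedded space would break naive word splitting
q_file=$(printf '%q' "$file")   # shell-quotes to: index\ page.html

# run_parallel re-parses each queued line with a shell; bash -c stands in.
bash -c "emit_name $q_file"     # prints: got:index page.html
```

Without the `%q` escaping, the re-parsed line would split `index page.html` into two arguments and the worker would operate on the wrong path.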
# Execute the asset compression.
bssg_ram_timing_start "precompress" "Pre-compress"
precompress_assets
bssg_ram_timing_end
# --- Pre-compress Assets --- END ---
# --- Deployment --- START ---
bssg_ram_timing_start "deployment" "Deployment Decision/Run"
deploy_now="false"
if [[ "${CMD_DEPLOY_OVERRIDE:-unset}" == "true" ]]; then # Use default value for safety
deploy_now="true"
@@ -544,12 +835,15 @@ if [[ "$deploy_now" == "true" ]]; then
echo -e "${YELLOW}Warning: Deployment was requested, but DEPLOY_SCRIPT is not set in configuration.${NC}"
fi
fi
bssg_ram_timing_end
# --- Deployment --- END ---
# --- End of execution ---
BUILD_END_TIME=$(date +%s)
BUILD_DURATION=$((BUILD_END_TIME - BUILD_START_TIME))
bssg_ram_timing_print_summary
echo "------------------------------------------------------"
echo -e "${GREEN}Build process completed in ${BUILD_DURATION} seconds.${NC}"
exit 0

204
scripts/build/ram_mode.sh Normal file
View file

@@ -0,0 +1,204 @@
#!/usr/bin/env bash
#
# BSSG - RAM Build Helpers
# Preloads input content in memory and provides lookup helpers.
#
# Guard against duplicate sourcing
if [[ -n "${BSSG_RAM_MODE_SCRIPT_LOADED:-}" ]]; then
return 0
fi
export BSSG_RAM_MODE_SCRIPT_LOADED=1
# In-memory stores
declare -gA BSSG_RAM_FILE_CONTENT=()
declare -gA BSSG_RAM_FILE_MTIME=()
declare -gA BSSG_RAM_DATASET=()
declare -gA BSSG_RAM_BASENAME_KEY=()
declare -ga BSSG_RAM_SRC_FILES=()
declare -ga BSSG_RAM_PAGE_FILES=()
declare -ga BSSG_RAM_TEMPLATE_FILES=()
ram_mode_enabled() {
[[ "${BSSG_RAM_MODE:-false}" == "true" ]]
}
_ram_mode_disk_mtime() {
local file="$1"
local kernel_name
kernel_name=$(uname -s)
if [[ "$kernel_name" == "Darwin" ]] || [[ "$kernel_name" == *"BSD" ]]; then
stat -f "%m" "$file" 2>/dev/null || echo "0"
else
stat -c "%Y" "$file" 2>/dev/null || echo "0"
fi
}
ram_mode_resolve_key() {
local file="$1"
if [[ -n "${BSSG_RAM_FILE_CONTENT[$file]+_}" || -n "${BSSG_RAM_FILE_MTIME[$file]+_}" ]]; then
echo "$file"
return 0
fi
if [[ "$file" == /* && -n "${BSSG_PROJECT_ROOT:-}" ]]; then
local prefix="${BSSG_PROJECT_ROOT%/}/"
if [[ "$file" == "$prefix"* ]]; then
local rel="${file#"$prefix"}"
if [[ -n "${BSSG_RAM_FILE_CONTENT[$rel]+_}" || -n "${BSSG_RAM_FILE_MTIME[$rel]+_}" ]]; then
echo "$rel"
return 0
fi
fi
fi
if [[ "$file" != */* && -n "${BSSG_RAM_BASENAME_KEY[$file]+_}" ]]; then
local mapped="${BSSG_RAM_BASENAME_KEY[$file]}"
if [[ "$mapped" != "__AMBIGUOUS__" ]]; then
echo "$mapped"
return 0
fi
fi
echo "$file"
return 0
}
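The resolver above tries three lookups in order: the exact key, the key relative to the project root, and finally an unambiguous basename. A condensed, self-contained sketch of that same resolution order (the store contents, `PROJECT_ROOT` value, and function name are invented for illustration):

```shell
#!/usr/bin/env bash
# Hypothetical in-memory store mirroring BSSG_RAM_FILE_CONTENT / BSSG_RAM_BASENAME_KEY.
declare -A STORE=( ["src/posts/hello.md"]="# Hello" )
declare -A BASENAME_KEY=( ["hello.md"]="src/posts/hello.md" )
PROJECT_ROOT="/srv/site"

resolve_key() {
    local file="$1"
    # 1. Exact key wins.
    [[ -n "${STORE[$file]+_}" ]] && { echo "$file"; return 0; }
    # 2. Absolute paths retry as project-root-relative keys.
    if [[ "$file" == /* && "$file" == "${PROJECT_ROOT%/}/"* ]]; then
        local rel="${file#"${PROJECT_ROOT%/}/"}"
        [[ -n "${STORE[$rel]+_}" ]] && { echo "$rel"; return 0; }
    fi
    # 3. A bare basename resolves only when it maps to exactly one file.
    if [[ "$file" != */* && "${BASENAME_KEY[$file]:-__AMBIGUOUS__}" != "__AMBIGUOUS__" ]]; then
        echo "${BASENAME_KEY[$file]}"
        return 0
    fi
    # Fall through: return the input unchanged, as the real helper does.
    echo "$file"
}
```

The `__AMBIGUOUS__` sentinel matches the behavior in `_ram_mode_store_file`: once two files share a basename, bare-name lookups stop resolving rather than silently picking one.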
ram_mode_has_file() {
local key
key=$(ram_mode_resolve_key "$1")
[[ -n "${BSSG_RAM_FILE_CONTENT[$key]+_}" || -n "${BSSG_RAM_FILE_MTIME[$key]+_}" ]]
}
ram_mode_get_content() {
local key
key=$(ram_mode_resolve_key "$1")
if [[ -n "${BSSG_RAM_FILE_CONTENT[$key]+_}" ]]; then
printf '%s' "${BSSG_RAM_FILE_CONTENT[$key]}"
fi
}
ram_mode_get_mtime() {
local key
key=$(ram_mode_resolve_key "$1")
if [[ -n "${BSSG_RAM_FILE_MTIME[$key]+_}" ]]; then
printf '%s\n' "${BSSG_RAM_FILE_MTIME[$key]}"
else
printf '0\n'
fi
}
ram_mode_list_src_files() {
printf '%s\n' "${BSSG_RAM_SRC_FILES[@]}"
}
ram_mode_list_page_files() {
printf '%s\n' "${BSSG_RAM_PAGE_FILES[@]}"
}
ram_mode_set_dataset() {
local key="$1"
local value="$2"
BSSG_RAM_DATASET["$key"]="$value"
}
ram_mode_get_dataset() {
local key="$1"
if [[ -n "${BSSG_RAM_DATASET[$key]+_}" ]]; then
printf '%s' "${BSSG_RAM_DATASET[$key]}"
fi
}
ram_mode_clear_dataset() {
local key="$1"
unset 'BSSG_RAM_DATASET[$key]'
}
ram_mode_dataset_line_count() {
local key="$1"
local data
data=$(ram_mode_get_dataset "$key")
if [[ -z "$data" ]]; then
echo "0"
return 0
fi
printf '%s\n' "$data" | awk 'NF { c++ } END { print c+0 }'
}
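The line counter above deliberately counts only non-blank rows via awk's `NF` test, so stray blank lines in a dataset never inflate the count. A quick standalone check of that idiom (the sample data is invented):

```shell
#!/usr/bin/env bash
# Sketch of the idiom used by ram_mode_dataset_line_count: awk's NF
# (number of fields) is 0 for blank/whitespace-only lines, so 'NF { c++ }'
# counts only meaningful rows; 'c+0' forces a numeric 0 for empty input.
count_nonempty_lines() {
    printf '%s\n' "$1" | awk 'NF { c++ } END { print c+0 }'
}

sample=$'a|1\n\nb|2\n   \nc|3'   # two blank-ish lines interleaved
count_nonempty_lines "$sample"   # prints: 3
```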
_ram_mode_store_file() {
local file="$1"
[[ -f "$file" ]] || return 0
local file_content
file_content=$(cat "$file")
BSSG_RAM_FILE_CONTENT["$file"]="$file_content"
BSSG_RAM_FILE_MTIME["$file"]="$(_ram_mode_disk_mtime "$file")"
local base
base=$(basename "$file")
if [[ -z "${BSSG_RAM_BASENAME_KEY[$base]+_}" ]]; then
BSSG_RAM_BASENAME_KEY["$base"]="$file"
elif [[ "${BSSG_RAM_BASENAME_KEY[$base]}" != "$file" ]]; then
BSSG_RAM_BASENAME_KEY["$base"]="__AMBIGUOUS__"
fi
}
_ram_mode_collect_content_files() {
local dir="$1"
[[ -d "$dir" ]] || return 0
find "$dir" -type f \( -name "*.md" -o -name "*.html" \) -not -path "*/.*" | sort
}
_ram_mode_collect_template_files() {
local dir="$1"
[[ -d "$dir" ]] || return 0
find "$dir" -type f -name "*.html" -not -path "*/.*" | sort
}
ram_mode_preload_inputs() {
if ! ram_mode_enabled; then
return 0
fi
BSSG_RAM_FILE_CONTENT=()
BSSG_RAM_FILE_MTIME=()
BSSG_RAM_DATASET=()
BSSG_RAM_BASENAME_KEY=()
BSSG_RAM_SRC_FILES=()
BSSG_RAM_PAGE_FILES=()
BSSG_RAM_TEMPLATE_FILES=()
local file
while IFS= read -r file; do
[[ -z "$file" ]] && continue
BSSG_RAM_SRC_FILES+=("$file")
_ram_mode_store_file "$file"
done < <(_ram_mode_collect_content_files "${SRC_DIR:-src}")
while IFS= read -r file; do
[[ -z "$file" ]] && continue
BSSG_RAM_PAGE_FILES+=("$file")
_ram_mode_store_file "$file"
done < <(_ram_mode_collect_content_files "${PAGES_DIR:-pages}")
while IFS= read -r file; do
[[ -z "$file" ]] && continue
BSSG_RAM_TEMPLATE_FILES+=("$file")
_ram_mode_store_file "$file"
done < <(_ram_mode_collect_template_files "${TEMPLATES_DIR:-templates}")
# Preload active locale (and fallback locale) so date/menu rendering avoids disk reads.
if [[ -f "${LOCALE_DIR:-locales}/${SITE_LANG:-en}.sh" ]]; then
_ram_mode_store_file "${LOCALE_DIR:-locales}/${SITE_LANG:-en}.sh"
fi
if [[ -f "${LOCALE_DIR:-locales}/en.sh" ]]; then
_ram_mode_store_file "${LOCALE_DIR:-locales}/en.sh"
fi
print_info "RAM mode preloaded ${#BSSG_RAM_FILE_CONTENT[@]} text files (${#BSSG_RAM_SRC_FILES[@]} posts, ${#BSSG_RAM_PAGE_FILES[@]} pages)."
}
export -f ram_mode_enabled ram_mode_resolve_key ram_mode_has_file ram_mode_get_content ram_mode_get_mtime
export -f ram_mode_list_src_files ram_mode_list_page_files ram_mode_preload_inputs
export -f ram_mode_set_dataset ram_mode_get_dataset ram_mode_clear_dataset
export -f ram_mode_dataset_line_count

View file

@@ -12,6 +12,169 @@ source "$(dirname "$0")/cache.sh" || { echo >&2 "Error: Failed to source cache.s
# --- Related Posts Functions --- START ---
declare -gA BSSG_RAM_RELATED_POSTS_HTML=()
declare -g BSSG_RAM_RELATED_POSTS_READY=false
declare -g BSSG_RAM_RELATED_POSTS_LIMIT=""
_build_post_url_from_date_slug() {
local post_date="$1"
local post_slug="$2"
local post_year post_month post_day
if [[ "$post_date" =~ ^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2}) ]]; then
post_year="${BASH_REMATCH[1]}"
post_month=$(printf "%02d" "$((10#${BASH_REMATCH[2]}))")
post_day=$(printf "%02d" "$((10#${BASH_REMATCH[3]}))")
else
post_year=$(date +%Y)
post_month=$(date +%m)
post_day=$(date +%d)
fi
local url_path="${URL_SLUG_FORMAT:-Year/Month/Day/slug}"
url_path="${url_path//Year/$post_year}"
url_path="${url_path//Month/$post_month}"
url_path="${url_path//Day/$post_day}"
url_path="${url_path//slug/$post_slug}"
printf '/%s/\n' "$url_path"
}
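Given the default `URL_SLUG_FORMAT` of `Year/Month/Day/slug`, the helper above turns a frontmatter date plus slug into a site-relative permalink. A trimmed-down, runnable version for illustration (the fallback-to-today's-date branch is simplified away here):

```shell
#!/usr/bin/env bash
build_post_url() {
    local post_date="$1" post_slug="$2"
    local y m d
    if [[ "$post_date" =~ ^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2}) ]]; then
        y="${BASH_REMATCH[1]}"
        # Zero-pad month/day; 10# guards against octal parsing of "08"/"09".
        m=$(printf "%02d" "$((10#${BASH_REMATCH[2]}))")
        d=$(printf "%02d" "$((10#${BASH_REMATCH[3]}))")
    else
        return 1  # simplified: the real code falls back to today's date
    fi
    local path="${URL_SLUG_FORMAT:-Year/Month/Day/slug}"
    path="${path//Year/$y}"; path="${path//Month/$m}"
    path="${path//Day/$d}";  path="${path//slug/$post_slug}"
    printf '/%s/\n' "$path"
}

build_post_url "2026-2-9" "ram-mode"   # prints: /2026/02/09/ram-mode/
```

Because the placeholders are plain string substitutions, a site can shorten its permalinks simply by setting, say, `URL_SLUG_FORMAT="Year/slug"`.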
_build_ram_related_posts_cache() {
local max_results="${1:-3}"
local file_index_data
file_index_data=$(ram_mode_get_dataset "file_index")
BSSG_RAM_RELATED_POSTS_HTML=()
BSSG_RAM_RELATED_POSTS_READY=true
BSSG_RAM_RELATED_POSTS_LIMIT="$max_results"
[ -z "$file_index_data" ] && return 0
local scored_results=""
scored_results=$(printf '%s\n' "$file_index_data" | awk -F'|' '
function trim(s) {
gsub(/^[[:space:]]+|[[:space:]]+$/, "", s)
return s
}
{
n++
title[n] = $3
date[n] = $4
tags_raw[n] = $6
slug[n] = $7
desc[n] = $10
split(tags_raw[n], tag_arr, ",")
for (k in tag_arr) {
t = trim(tag_arr[k])
if (t != "") {
tags[n SUBSEP t] = 1
}
}
}
END {
for (i = 1; i <= n; i++) {
if (slug[i] == "" || tags_raw[i] == "") {
continue
}
split(tags_raw[i], i_tags, ",")
for (j = 1; j <= n; j++) {
if (i == j || slug[j] == "" || date[j] == "" || tags_raw[j] == "") {
continue
}
score = 0
delete seen
for (k in i_tags) {
t = trim(i_tags[k])
if (t == "" || seen[t]) {
continue
}
seen[t] = 1
if (tags[j SUBSEP t]) {
score++
}
}
if (score > 0) {
printf "%s|%d|%s|%s|%s|%s\n", slug[i], score, date[j], title[j], slug[j], desc[j]
}
}
}
}
' | sort -t'|' -k1,1 -k2,2nr -k3,3r)
[ -z "$scored_results" ] && return 0
local current_slug="" current_count=0
local html_output=""
local slug score date title related_slug description
while IFS='|' read -r slug score date title related_slug description; do
[ -z "$slug" ] && continue
if [ "$slug" != "$current_slug" ]; then
if [ -n "$current_slug" ] && [ "$current_count" -gt 0 ]; then
html_output+='</div>'$'\n'
html_output+='</section>'$'\n'
BSSG_RAM_RELATED_POSTS_HTML["$current_slug"]="$html_output"
fi
current_slug="$slug"
current_count=0
html_output=""
fi
if [ "$current_count" -ge "$max_results" ]; then
continue
fi
local post_url
post_url=$(_build_post_url_from_date_slug "$date" "$related_slug")
local short_desc="$description"
if [[ ${#short_desc} -gt 120 ]]; then
short_desc="${short_desc:0:117}..."
fi
if [ "$current_count" -eq 0 ]; then
html_output+='<section class="related-posts">'$'\n'
html_output+='<h3>'"${MSG_RELATED_POSTS:-Related Posts}"'</h3>'$'\n'
html_output+='<div class="related-posts-list">'$'\n'
fi
html_output+='<article class="related-post">'$'\n'
html_output+='<h4><a href="'"${SITE_URL:-}${post_url}"'">'"$title"'</a></h4>'$'\n'
if [ -n "$short_desc" ]; then
html_output+='<p>'"$short_desc"'</p>'$'\n'
fi
html_output+='</article>'$'\n'
current_count=$((current_count + 1))
done <<< "$scored_results"
if [ -n "$current_slug" ] && [ "$current_count" -gt 0 ]; then
html_output+='</div>'$'\n'
html_output+='</section>'$'\n'
BSSG_RAM_RELATED_POSTS_HTML["$current_slug"]="$html_output"
fi
}
prepare_related_posts_ram_cache() {
local max_results="${1:-3}"
if [ "${BSSG_RAM_MODE:-false}" != true ]; then
return 0
fi
if [ "$BSSG_RAM_RELATED_POSTS_READY" = true ] && [ "$BSSG_RAM_RELATED_POSTS_LIMIT" = "$max_results" ]; then
return 0
fi
_build_ram_related_posts_cache "$max_results"
}
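The scoring pass above emits one `source_slug|score|date|title|related_slug|desc` row per ordered pair of posts sharing at least one tag, deduplicating and trimming tags before comparing. The tag-overlap core of that awk program can be sketched in isolation like this (a simplified scorer over two comma-separated tag lists; the sample tags are invented):

```shell
#!/usr/bin/env bash
# Simplified scorer: count distinct shared tags between two comma-separated
# tag lists, mirroring the trim/dedup logic of the awk pass above.
shared_tag_count() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, A, ",")
        m = split(b, B, ",")
        for (i = 1; i <= m; i++) { gsub(/^ +| +$/, "", B[i]); if (B[i] != "") seen[B[i]] = 1 }
        score = 0
        for (i = 1; i <= n; i++) {
            gsub(/^ +| +$/, "", A[i])
            # dup prevents a repeated tag on the source post counting twice
            if (A[i] != "" && !(A[i] in dup) && (A[i] in seen)) { score++; dup[A[i]] = 1 }
        }
        print score
    }'
}

shared_tag_count "bsd, shell, ram" "shell, ram, zfs"   # prints: 2
```

In the real pass, these pairwise scores are then sorted per source slug by score and date, and only the top `max_results` rows are rendered into the cached HTML snippet.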
# Generate related posts for a given post based on shared tags
# Args: $1=current_post_slug $2=current_post_tags $3=current_post_date $4=max_results (optional, default=3)
# Returns: HTML snippet with related posts
@@ -26,6 +189,17 @@ generate_related_posts() {
return 0 # No related posts if missing essential data
fi
# RAM mode uses a precomputed in-memory map to avoid repeated O(n^2) scans.
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
if [ "$BSSG_RAM_RELATED_POSTS_READY" != true ] || [ "$BSSG_RAM_RELATED_POSTS_LIMIT" != "$max_results" ]; then
_build_ram_related_posts_cache "$max_results"
fi
if [[ -n "${BSSG_RAM_RELATED_POSTS_HTML[$current_slug]+_}" ]]; then
printf '%s' "${BSSG_RAM_RELATED_POSTS_HTML[$current_slug]}"
fi
return 0
fi
# Check cache first
local cache_file="${CACHE_DIR:-.bssg_cache}/related_posts/${current_slug}.html"
local file_index="${CACHE_DIR:-.bssg_cache}/file_index.txt"
@@ -60,8 +234,18 @@ compute_related_posts() {
local max_results="$4"
local file_index="${CACHE_DIR:-.bssg_cache}/file_index.txt"
local file_index_data=""
local ram_mode_active=false
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
ram_mode_active=true
file_index_data=$(ram_mode_get_dataset "file_index")
fi
if [[ ! -f "$file_index" ]]; then
if $ram_mode_active; then
if [[ -z "$file_index_data" ]]; then
return 0
fi
elif [[ ! -f "$file_index" ]]; then
return 0 # No posts to compare against
fi
@@ -81,7 +265,7 @@ fi
fi
# Process all posts and calculate similarity scores
local temp_results=$(mktemp)
local temp_results=""
while IFS='|' read -r file filename title date lastmod tags slug image image_caption description author_name author_email; do
# Skip current post
@@ -113,17 +297,22 @@ compute_related_posts() {
# Only consider posts with at least one shared tag
if [[ $score -gt 0 ]]; then
# Store: score|date|title|slug|description
echo "${score}|${date}|${title}|${slug}|${description}" >> "$temp_results"
temp_results+="${score}|${date}|${title}|${slug}|${description}"$'\n'
fi
done < "$file_index"
done < <(
if $ram_mode_active; then
printf '%s\n' "$file_index_data" | awk 'NF'
else
cat "$file_index"
fi
)
# Sort by score (descending), then by date (descending), limit results
local sorted_results=""
if [[ -s "$temp_results" ]]; then
sorted_results=$(sort -t'|' -k1,1nr -k2,2r "$temp_results" | head -n "$max_results")
if [[ -n "$temp_results" ]]; then
sorted_results=$(printf '%s\n' "$temp_results" | awk 'NF' | sort -t'|' -k1,1nr -k2,2r | head -n "$max_results")
fi
rm -f "$temp_results"
# Generate HTML output
if [[ -z "$sorted_results" ]]; then
@@ -248,4 +437,5 @@ invalidate_related_posts_cache_for_tags() {
# --- Related Posts Functions --- END ---
# Export functions for use by other scripts
export -f generate_related_posts compute_related_posts clean_related_posts_cache invalidate_related_posts_cache_for_tags
export -f generate_related_posts compute_related_posts clean_related_posts_cache invalidate_related_posts_cache_for_tags
export -f prepare_related_posts_ram_cache

View file

@@ -63,7 +63,9 @@ load_template() {
# Function to pre-load all templates and process menus/placeholders
preload_templates() {
# Create template cache directory if it doesn't exist
mkdir -p "$TEMPLATE_CACHE_DIR"
if [ "${BSSG_RAM_MODE:-false}" != true ]; then
mkdir -p "$TEMPLATE_CACHE_DIR"
fi
local template_dir
local templates_to_load=("header.html" "footer.html" "post.html" "page.html" "index.html" "tag.html" "archive.html")
@@ -131,8 +133,12 @@ preload_templates() {
# Scan pages directory for markdown and HTML files
if [ -d "${PAGES_DIR:-pages}" ]; then
local page_files
page_files=($(find "${PAGES_DIR:-pages}" -type f \( -name "*.md" -o -name "*.html" \) | sort))
local page_files=()
if [ "${BSSG_RAM_MODE:-false}" = true ] && declare -F ram_mode_list_page_files > /dev/null; then
mapfile -t page_files < <(ram_mode_list_page_files)
else
page_files=($(find "${PAGES_DIR:-pages}" -type f \( -name "*.md" -o -name "*.html" \) | sort))
fi
for file in "${page_files[@]}"; do
# Skip if file is hidden
@@ -144,10 +150,19 @@ preload_templates() {
local title slug date secondary
if [[ "$file" == *.html ]]; then
# Crude HTML parsing - assumes specific meta tags exist
title=$(grep -m 1 '<title>' "$file" 2>/dev/null | sed 's/<[^>]*>//g')
slug=$(grep -m 1 'meta name="slug"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
date=$(grep -m 1 'meta name="date"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/') # Extract date from meta
secondary=$(grep -m 1 'meta name="secondary"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
local html_source=""
if [ "${BSSG_RAM_MODE:-false}" = true ] && declare -F ram_mode_has_file > /dev/null && ram_mode_has_file "$file"; then
html_source=$(ram_mode_get_content "$file")
title=$(printf '%s\n' "$html_source" | grep -m 1 '<title>' 2>/dev/null | sed 's/<[^>]*>//g')
slug=$(printf '%s\n' "$html_source" | grep -m 1 'meta name="slug"' 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
date=$(printf '%s\n' "$html_source" | grep -m 1 'meta name="date"' 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
secondary=$(printf '%s\n' "$html_source" | grep -m 1 'meta name="secondary"' 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
else
title=$(grep -m 1 '<title>' "$file" 2>/dev/null | sed 's/<[^>]*>//g')
slug=$(grep -m 1 'meta name="slug"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
date=$(grep -m 1 'meta name="date"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/') # Extract date from meta
secondary=$(grep -m 1 'meta name="secondary"' "$file" 2>/dev/null | sed 's/.*content="\([^"]*\)".*/\1/')
fi
else
# Assumes parse_metadata is available
title=$(parse_metadata "$file" "title")
@@ -206,18 +221,33 @@ preload_templates() {
# Add standard menu items
local tags_flag_file="${CACHE_DIR:-.bssg_cache}/has_tags.flag"
# Add tags link only if the flag file exists (meaning tags were found in the last indexing run)
if [ -f "$tags_flag_file" ]; then
local has_tags=false
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
[ -n "$(ram_mode_get_dataset "has_tags")" ] && has_tags=true
elif [ -f "$tags_flag_file" ]; then
has_tags=true
fi
# Add tags link only if tags are present.
if [ "$has_tags" = true ]; then
menu_items+=" <a href=\"${SITE_URL}/tags/\">${MSG_TAGS:-"Tags"}</a>"
fi
# Add Authors link if enabled and multiple authors exist
local authors_flag_file="${CACHE_DIR:-.bssg_cache}/has_authors.flag"
if [ "${ENABLE_AUTHOR_PAGES:-true}" = true ] && [ -f "$authors_flag_file" ]; then
if [ "${ENABLE_AUTHOR_PAGES:-true}" = true ]; then
# Check if we have multiple authors (more than the threshold)
local authors_index_file="${CACHE_DIR:-.bssg_cache}/authors_index.txt"
if [ -f "$authors_index_file" ]; then
local unique_author_count=$(awk -F'|' '{print $1}' "$authors_index_file" | sort -u | wc -l)
local unique_author_count=0
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
local authors_index_data
authors_index_data=$(ram_mode_get_dataset "authors_index")
if [ -n "$authors_index_data" ]; then
unique_author_count=$(printf '%s\n' "$authors_index_data" | awk -F'|' 'NF { print $1 }' | sort -u | wc -l | tr -d ' ')
fi
elif [ -f "$authors_index_file" ] && [ -f "$authors_flag_file" ]; then
unique_author_count=$(awk -F'|' '{print $1}' "$authors_index_file" | sort -u | wc -l)
fi
if [ "$unique_author_count" -gt 0 ]; then
local threshold="${SHOW_AUTHORS_MENU_THRESHOLD:-2}"
if [ "$unique_author_count" -ge "$threshold" ]; then
menu_items+=" <a href=\"${SITE_URL}/authors/\">${MSG_AUTHORS:-"Authors"}</a>"
@@ -233,14 +263,23 @@ preload_templates() {
menu_items+=" <a href=\"${SITE_URL}/${RSS_FILENAME:-rss.xml}\">${MSG_RSS:-"RSS"}</a>"
# Add tags link to footer only if the flag file exists
if [ -f "$tags_flag_file" ]; then
if [ "$has_tags" = true ]; then
footer_items+=" <a href=\"${SITE_URL}/tags/\">${MSG_TAGS:-"Tags"}</a> &middot;"
fi
# Add Authors link to footer if enabled and multiple authors exist
if [ "${ENABLE_AUTHOR_PAGES:-true}" = true ] && [ -f "$authors_flag_file" ]; then
if [ -f "$authors_index_file" ]; then
local unique_author_count_footer=$(awk -F'|' '{print $1}' "$authors_index_file" | sort -u | wc -l)
if [ "${ENABLE_AUTHOR_PAGES:-true}" = true ]; then
local unique_author_count_footer=0
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
local authors_index_data_footer
authors_index_data_footer=$(ram_mode_get_dataset "authors_index")
if [ -n "$authors_index_data_footer" ]; then
unique_author_count_footer=$(printf '%s\n' "$authors_index_data_footer" | awk -F'|' 'NF { print $1 }' | sort -u | wc -l | tr -d ' ')
fi
elif [ -f "$authors_index_file" ] && [ -f "$authors_flag_file" ]; then
unique_author_count_footer=$(awk -F'|' '{print $1}' "$authors_index_file" | sort -u | wc -l)
fi
if [ "$unique_author_count_footer" -gt 0 ]; then
local threshold_footer="${SHOW_AUTHORS_MENU_THRESHOLD:-2}"
if [ "$unique_author_count_footer" -ge "$threshold_footer" ]; then
footer_items+=" <a href=\"${SITE_URL}/authors/\">${MSG_AUTHORS:-"Authors"}</a> &middot;"
@@ -299,50 +338,55 @@ preload_templates() {
HEADER_TEMPLATE=$(echo "$HEADER_TEMPLATE" | sed "s|{{[[:space:]]*custom_css_link[[:space:]]*}}|${custom_css_tag}|")
# --- Handle Custom CSS --- END ---
# Write primary and secondary page lists to cache files only if changed
local primary_pages_cache="$CACHE_DIR/primary_pages.tmp"
local secondary_pages_cache="$CACHE_DIR/secondary_pages.tmp"
local secondary_pages_list_file="$CACHE_DIR/secondary_pages.list" # <-- Define list file path
# Prepare content in temporary files
local primary_tmp=$(mktemp)
local secondary_tmp=$(mktemp)
local secondary_list_tmp=$(mktemp) # <-- Temp file for the list
# Write current content to temporary files
# Use printf for safer writing
for page in "${primary_pages[@]}"; do
printf "%s\n" "$page" >> "$primary_tmp"
done
for page in "${SECONDARY_PAGES[@]}"; do
# Write to the temp file for comparison
printf "%s\n" "$page" >> "$secondary_tmp"
# Also write to the list temp file, one per line
printf "%s\n" "$page" >> "$secondary_list_tmp"
done
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
ram_mode_set_dataset "primary_pages" "$(printf '%s\n' "${primary_pages[@]}")"
ram_mode_set_dataset "secondary_pages" "$(printf '%s\n' "${SECONDARY_PAGES[@]}")"
else
# Write primary and secondary page lists to cache files only if changed
local primary_pages_cache="$CACHE_DIR/primary_pages.tmp"
local secondary_pages_cache="$CACHE_DIR/secondary_pages.tmp"
local secondary_pages_list_file="$CACHE_DIR/secondary_pages.list" # <-- Define list file path
# Prepare content in temporary files
local primary_tmp=$(mktemp)
local secondary_tmp=$(mktemp)
local secondary_list_tmp=$(mktemp) # <-- Temp file for the list
# Write current content to temporary files
# Use printf for safer writing
for page in "${primary_pages[@]}"; do
printf "%s\n" "$page" >> "$primary_tmp"
done
for page in "${SECONDARY_PAGES[@]}"; do
# Write to the temp file for comparison
printf "%s\n" "$page" >> "$secondary_tmp"
# Also write to the list temp file, one per line
printf "%s\n" "$page" >> "$secondary_list_tmp"
done
# Function to compare and update cache file
update_cache_if_changed() {
local temp_file="$1"
local cache_file="$2"
local file_desc="$3"
# Function to compare and update cache file
update_cache_if_changed() {
local temp_file="$1"
local cache_file="$2"
local file_desc="$3"
if [ ! -f "$cache_file" ] || ! cmp -s "$temp_file" "$cache_file"; then
mv "$temp_file" "$cache_file"
# echo "DEBUG: Updated $file_desc cache file." # Optional debug
else
rm "$temp_file"
# echo "DEBUG: $file_desc cache file unchanged." # Optional debug
fi
}
if [ ! -f "$cache_file" ] || ! cmp -s "$temp_file" "$cache_file"; then
mv "$temp_file" "$cache_file"
# echo "DEBUG: Updated $file_desc cache file." # Optional debug
else
rm "$temp_file"
# echo "DEBUG: $file_desc cache file unchanged." # Optional debug
fi
}
# Compare and update cache files
update_cache_if_changed "$primary_tmp" "$primary_pages_cache"
update_cache_if_changed "$secondary_tmp" "$secondary_pages_cache"
update_cache_if_changed "$secondary_list_tmp" "$secondary_pages_list_file" # <-- Update the list file
# Compare and update cache files
update_cache_if_changed "$primary_tmp" "$primary_pages_cache"
update_cache_if_changed "$secondary_tmp" "$secondary_pages_cache"
update_cache_if_changed "$secondary_list_tmp" "$secondary_pages_list_file" # <-- Update the list file
# Clean up temporary files
rm -f "$primary_tmp" "$secondary_tmp" "$secondary_list_tmp" # <-- Cleanup list temp file
# Clean up temporary files
rm -f "$primary_tmp" "$secondary_tmp" "$secondary_list_tmp" # <-- Cleanup list temp file
fi
echo -e "${GREEN}Templates pre-processed (menus, locale placeholders).${NC}"
}
@@ -364,4 +408,4 @@ export FOOTER_TEMPLATE
# Export functions - Do not export the SECONDARY_PAGES array itself anymore
export -f preload_templates
# export SECONDARY_PAGES # <-- Remove this export
# export SECONDARY_PAGES # <-- Remove this export
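The unique-author count that drives the menu threshold can be checked in isolation. A sketch with an invented `authors_index` dataset; only the first `|`-separated field (the author name) matters, matching the `awk -F'|'` split above:

```shell
# Invented authors_index records: author|title|...
authors_index_data='alice|Post one
bob|Post two
alice|Post three'

# Same pipeline as the RAM-mode branch above
unique_author_count=$(printf '%s\n' "$authors_index_data" \
    | awk -F'|' 'NF { print $1 }' | sort -u | wc -l | tr -d ' ')
echo "$unique_author_count"
```

The trailing `tr -d ' '` strips the leading padding that BSD `wc -l` emits, so the later numeric comparison against `SHOW_AUTHORS_MENU_THRESHOLD` also behaves on macOS/BSD.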


@@ -19,6 +19,31 @@ else
NC=""
fi
# Cache kernel name once to avoid repeated `uname` calls in hot paths.
if [ -z "${BSSG_KERNEL_NAME:-}" ]; then
BSSG_KERNEL_NAME="$(uname -s 2>/dev/null || echo "")"
fi
# Cache repeated date formatting work across stages in the same process.
declare -gA BSSG_FORMAT_DATE_CACHE=()
declare -gA BSSG_FORMAT_DATE_TS_CACHE=()
# GNU parallel workers import functions, but array declarations may not carry over.
# Keep date caches associative in every process to avoid bad-subscript errors.
_bssg_ensure_assoc_cache() {
local var_name="$1"
local var_decl
var_decl=$(declare -p "$var_name" 2>/dev/null || true)
if [[ "$var_decl" == declare\ -A* ]]; then
return 0
fi
unset "$var_name" 2>/dev/null || true
declare -gA "$var_name"
eval "$var_name=()"
}
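The repair this helper performs can be demonstrated standalone. The sketch below carries its own copy of the function (diff context is not executable on its own) and simulates a worker that imported the cache as a flattened scalar:

```shell
# Standalone copy of the helper, reproduced from the patch above
_bssg_ensure_assoc_cache() {
    local var_name="$1"
    local var_decl
    var_decl=$(declare -p "$var_name" 2>/dev/null || true)
    if [[ "$var_decl" == declare\ -A* ]]; then
        return 0
    fi
    unset "$var_name" 2>/dev/null || true
    declare -gA "$var_name"
    eval "$var_name=()"
}

# Simulate a GNU parallel worker where the import flattened the array
BSSG_FORMAT_DATE_CACHE="not an array"
_bssg_ensure_assoc_cache "BSSG_FORMAT_DATE_CACHE"
declare -p BSSG_FORMAT_DATE_CACHE
```

After the call, `declare -p` reports an empty associative array, so subscript assignments like `BSSG_FORMAT_DATE_CACHE[$key]=...` no longer trigger bad-subscript errors (requires bash >= 4.2 for `declare -gA`).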
# --- Printing Functions --- START ---
print_error() {
# Print message in red to stderr
@@ -69,7 +94,10 @@ format_date() {
local format_override="$2" # Optional format string
local target_format=${format_override:-"$DATE_FORMAT"} # Use override or global DATE_FORMAT
local formatted_date
local kernel_name=$(uname -s) # Get kernel name (e.g., Linux, Darwin, FreeBSD)
local kernel_name="${BSSG_KERNEL_NAME:-}"
if [ -z "$kernel_name" ]; then
kernel_name="$(uname -s)"
fi
# Skip formatting if date is empty
if [ -z "$input_date" ]; then
@@ -91,27 +119,46 @@ format_date() {
return
fi
_bssg_ensure_assoc_cache "BSSG_FORMAT_DATE_CACHE"
# Use cached values for stable (non-"now") inputs.
local cache_tz="${TIMEZONE:-local}"
local cache_key="${cache_tz}|${target_format}|${input_date}"
if [[ -n "${BSSG_FORMAT_DATE_CACHE[$cache_key]+_}" ]]; then
echo "${BSSG_FORMAT_DATE_CACHE[$cache_key]}"
return
fi
# Try to format the date using the configured format
# IMPORTANT: DATE_FORMAT must be exported or sourced *before* calling this
if [[ "$kernel_name" == "Darwin" ]] || [[ "$kernel_name" == *"BSD" ]]; then
# macOS/BSD date formatting (uses date -j -f)
# IMPORTANT: Using ISO 8601 format (YYYY-MM-DD HH:MM:SS) in source
# files is strongly recommended for portability.
# Try parsing full ISO date-time first
formatted_date=$(eval "${tz_prefix}LC_ALL=C date -j -f \"%Y-%m-%d %H:%M:%S\" \"$input_date\" +\"$target_format\"" 2>/dev/null)
# If failed, try RFC2822 format
if [ -z "$formatted_date" ]; then
# Fast-path common stable inputs to avoid multiple failed parse attempts.
if [[ "$input_date" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
formatted_date=$(eval "${tz_prefix}LC_ALL=C date -j -f \"%Y-%m-%d\" \"$input_date\" +\"$target_format\"" 2>/dev/null)
elif [[ "$input_date" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}[[:space:]][0-9]{2}:[0-9]{2}:[0-9]{2}$ ]]; then
formatted_date=$(eval "${tz_prefix}LC_ALL=C date -j -f \"%Y-%m-%d %H:%M:%S\" \"$input_date\" +\"$target_format\"" 2>/dev/null)
elif [[ "$input_date" =~ ^[A-Za-z]{3},[[:space:]][0-9]{2}[[:space:]][A-Za-z]{3}[[:space:]][0-9]{4}[[:space:]][0-9]{2}:[0-9]{2}:[0-9]{2}[[:space:]][+-][0-9]{4}$ ]]; then
formatted_date=$(eval "${tz_prefix}LC_ALL=C date -j -f \"%a, %d %b %Y %H:%M:%S %z\" \"$input_date\" +\"$target_format\"" 2>/dev/null)
fi
# If still failed, try parsing date-only (YYYY-MM-DD) and assume midnight
# Fallback parser chain for uncommon/legacy input variants.
if [ -z "$formatted_date" ]; then
# Check if input looks like YYYY-MM-DD using shell pattern matching
if [[ "$input_date" == [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] ]]; then
# Try parsing by appending midnight time
formatted_date=$(eval "${tz_prefix}LC_ALL=C date -j -f \"%Y-%m-%d %H:%M:%S\" \"$input_date 00:00:00\" +\"$target_format\"" 2>/dev/null)
# Try parsing full ISO date-time first
formatted_date=$(eval "${tz_prefix}LC_ALL=C date -j -f \"%Y-%m-%d %H:%M:%S\" \"$input_date\" +\"$target_format\"" 2>/dev/null)
# If failed, try RFC2822 format
if [ -z "$formatted_date" ]; then
formatted_date=$(eval "${tz_prefix}LC_ALL=C date -j -f \"%a, %d %b %Y %H:%M:%S %z\" \"$input_date\" +\"$target_format\"" 2>/dev/null)
fi
# If still failed, try parsing date-only (YYYY-MM-DD) and assume midnight
if [ -z "$formatted_date" ]; then
# Check if input looks like YYYY-MM-DD using shell pattern matching
if [[ "$input_date" == [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] ]]; then
# Try parsing by appending midnight time
formatted_date=$(eval "${tz_prefix}LC_ALL=C date -j -f \"%Y-%m-%d %H:%M:%S\" \"$input_date 00:00:00\" +\"$target_format\"" 2>/dev/null)
fi
fi
fi
@@ -125,6 +172,7 @@ format_date() {
formatted_date=$(eval "${tz_prefix}LC_ALL=C date -d \"$input_date\" +\"$target_format\"" 2>/dev/null || echo "$input_date")
fi
BSSG_FORMAT_DATE_CACHE["$cache_key"]="$formatted_date"
echo "$formatted_date"
}
@@ -141,6 +189,16 @@ format_date_from_timestamp() {
return
fi
_bssg_ensure_assoc_cache "BSSG_FORMAT_DATE_TS_CACHE"
# Cache by timestamp/format/timezone.
local cache_tz="${TIMEZONE:-local}"
local cache_key="${cache_tz}|${target_format}|${timestamp}"
if [[ -n "${BSSG_FORMAT_DATE_TS_CACHE[$cache_key]+_}" ]]; then
echo "${BSSG_FORMAT_DATE_TS_CACHE[$cache_key]}"
return
fi
# Set TZ environment variable if TIMEZONE is set and not "local"
local tz_prefix=""
if [ -n "${TIMEZONE:-}" ] && [ "${TIMEZONE:-local}" != "local" ]; then
@@ -159,6 +217,7 @@ format_date_from_timestamp() {
formatted_date=$(eval "${tz_prefix}LC_ALL=C date -d \"@$timestamp\" +\"$target_format\"" 2>/dev/null || echo "")
fi
BSSG_FORMAT_DATE_TS_CACHE["$cache_key"]="$formatted_date"
echo "$formatted_date"
}
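The BSD/GNU split that `format_date_from_timestamp` caches around can be sketched with a deterministic input. This standalone version probes for GNU `date --version` instead of branching on `uname`, purely for brevity:

```shell
ts=0  # Unix epoch: deterministic output regardless of the host clock

if date --version > /dev/null 2>&1; then
    # GNU date: epoch input via -d "@<seconds>"
    formatted=$(TZ=UTC LC_ALL=C date -d "@$ts" "+%Y-%m-%d %H:%M:%S")
else
    # BSD/macOS date: epoch input via -r <seconds>
    formatted=$(TZ=UTC LC_ALL=C date -r "$ts" "+%Y-%m-%d %H:%M:%S")
fi
echo "$formatted"   # 1970-01-01 00:00:00
```

In the real function the per-call cost of this subprocess is what the `BSSG_FORMAT_DATE_TS_CACHE` lookup avoids on repeat inputs with the same timezone and format.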
@@ -226,7 +285,21 @@ unlock_file() {
# Get file modification time in a portable way
get_file_mtime() {
local file="$1"
local kernel_name=$(uname -s)
local kernel_name="${BSSG_KERNEL_NAME:-}"
# In RAM mode, prefer preloaded input timestamps.
if [ "${BSSG_RAM_MODE:-false}" = true ] && declare -F ram_mode_get_mtime > /dev/null; then
local ram_mtime
ram_mtime=$(ram_mode_get_mtime "$file")
if [ -n "$ram_mtime" ] && [ "$ram_mtime" != "0" ]; then
echo "$ram_mtime"
return 0
fi
fi
if [ -z "$kernel_name" ]; then
kernel_name="$(uname -s)"
fi
# Use specific stat flags based on kernel name
# %m for BSD/macOS (seconds since Epoch)
@@ -242,58 +315,108 @@ get_file_mtime() {
# Fallback parallel implementation using background processes
# Used when GNU parallel is not available
detect_cpu_cores() {
if command -v nproc > /dev/null 2>&1; then
nproc
elif command -v sysctl > /dev/null 2>&1; then
sysctl -n hw.ncpu 2>/dev/null || echo 1
else
echo 2
fi
}
# Determine worker count.
# In RAM mode we cap concurrency by default to reduce memory pressure from
# large inherited in-memory arrays in each worker process.
get_parallel_jobs() {
local requested_jobs="$1"
local jobs=0
if [[ "$requested_jobs" =~ ^[0-9]+$ ]] && [ "$requested_jobs" -gt 0 ]; then
jobs="$requested_jobs"
else
jobs=$(detect_cpu_cores)
fi
if [ "${BSSG_RAM_MODE:-false}" = true ]; then
local ram_cap="${RAM_MODE_MAX_JOBS:-6}"
if ! [[ "$ram_cap" =~ ^[0-9]+$ ]] || [ "$ram_cap" -lt 1 ]; then
ram_cap=6
fi
if [ "$jobs" -gt "$ram_cap" ]; then
jobs="$ram_cap"
fi
fi
if [ "$jobs" -lt 1 ]; then
jobs=1
fi
echo "$jobs"
}
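The RAM-mode clamp in `get_parallel_jobs` reduces to two comparisons; a standalone sketch of that path with invented values:

```shell
requested=16                        # e.g. a user-requested worker count
ram_cap="${RAM_MODE_MAX_JOBS:-6}"   # RAM-mode ceiling, default 6

jobs=$requested
if [ "$jobs" -gt "$ram_cap" ]; then
    jobs=$ram_cap   # cap concurrency to limit per-worker memory pressure
fi
if [ "$jobs" -lt 1 ]; then
    jobs=1          # never drop below one worker
fi
echo "$jobs"
```

With the default cap, a request for 16 workers is clamped to 6 in RAM mode, while outside RAM mode the function returns the detected or requested count unchanged.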
run_parallel() {
local max_jobs="$1"
shift
if [ -z "$max_jobs" ] || [ "$max_jobs" -lt 1 ]; then
# Determine number of CPU cores if not specified
if command -v nproc > /dev/null 2>&1; then
# Linux
max_jobs=$(nproc)
elif command -v sysctl > /dev/null 2>&1; then
# macOS, BSD
max_jobs=$(sysctl -n hw.ncpu 2>/dev/null || echo 1)
else
# Default to 2 jobs if we can't determine
max_jobs=2
fi
max_jobs=$(get_parallel_jobs "$max_jobs")
local had_error=0
local wait_n_supported=0
if [[ ${BASH_VERSINFO[0]:-0} -gt 4 ]] || { [[ ${BASH_VERSINFO[0]:-0} -eq 4 ]] && [[ ${BASH_VERSINFO[1]:-0} -ge 3 ]]; }; then
wait_n_supported=1
fi
local job_count=0
local pids=()
if [ "$wait_n_supported" -eq 1 ]; then
local running_jobs=0
# Read commands from stdin
while read -r cmd; do
# Skip empty lines
[ -z "$cmd" ] && continue
while read -r cmd; do
[ -z "$cmd" ] && continue
# If we've reached max jobs, wait for one to finish
if [ $job_count -ge $max_jobs ]; then
# Wait for any child process to finish
wait -n 2>/dev/null || true
# Cleanup finished jobs from pids array
local new_pids=()
for pid in "${pids[@]}"; do
if kill -0 $pid 2>/dev/null; then
new_pids+=($pid)
while [ "$running_jobs" -ge "$max_jobs" ]; do
if ! wait -n 2>/dev/null; then
had_error=1
fi
running_jobs=$((running_jobs - 1))
done
pids=("${new_pids[@]}")
# Update job count
job_count=${#pids[@]}
fi
(eval "$cmd") &
running_jobs=$((running_jobs + 1))
done
# Run the command in the background
(eval "$cmd") &
pids+=($!)
job_count=$((job_count + 1))
done
while [ "$running_jobs" -gt 0 ]; do
if ! wait -n 2>/dev/null; then
had_error=1
fi
running_jobs=$((running_jobs - 1))
done
else
# Portable fallback for older bash without wait -n.
local pids=()
while read -r cmd; do
[ -z "$cmd" ] && continue
# Wait for all remaining jobs to finish
wait
while [ "${#pids[@]}" -ge "$max_jobs" ]; do
local oldest_pid="${pids[0]}"
if ! wait "$oldest_pid"; then
had_error=1
fi
pids=("${pids[@]:1}")
done
(eval "$cmd") &
pids+=($!)
done
local pid
for pid in "${pids[@]}"; do
if ! wait "$pid"; then
had_error=1
fi
done
fi
return "$had_error"
}
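The `wait -n` throttle at the heart of `run_parallel` can be reproduced in miniature (bash >= 4.3; the commands and file names below are invented for illustration):

```shell
max_jobs=2
running=0
out=$(mktemp)

# One command per line on stdin, exactly as run_parallel consumes them
while read -r cmd; do
    [ -z "$cmd" ] && continue
    while [ "$running" -ge "$max_jobs" ]; do
        wait -n || true              # reap one finished job before launching more
        running=$((running - 1))
    done
    (eval "$cmd") &
    running=$((running + 1))
done <<EOF
echo alpha >> "$out"
echo beta >> "$out"
echo gamma >> "$out"
EOF
wait                                 # drain the remaining background jobs

result=$(sort "$out")
rm -f "$out"
printf '%s\n' "$result"
```

At no point do more than `max_jobs` subshells run concurrently, and the final `wait` guarantees all output lands in `$out` before it is read back.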
# Add a reading time calculation function
@@ -333,6 +456,8 @@ export -f generate_slug
export -f lock_file
export -f unlock_file
export -f get_file_mtime
export -f detect_cpu_cores
export -f get_parallel_jobs
export -f run_parallel
export -f calculate_reading_time
export -f html_escape
@@ -340,4 +465,5 @@ export -f html_escape
export -f print_error
export -f print_warning
export -f print_success
export -f print_info
export -f print_info
export -f _bssg_ensure_assoc_cache