QA Wolf Take-Home Assignment

Project Overview

This project involved developing a Hacker News Article Sort Validator using Node.js and Microsoft's Playwright framework. The assignment required building a web scraping tool to fetch the newest articles from Hacker News and validate their chronological ordering, demonstrating proficiency in test automation, data validation, and quality assurance principles.

Assignment Scope & Requirements

Fetch exactly the first 100 articles from the Hacker News newest page (/newest) using Playwright.
Validate that articles are sorted from newest to oldest based on their timestamps.
Report any ordering inconsistencies with detailed error information.
Handle edge cases including invalid timestamps, pagination navigation, and error pages.
Provide configurable output options for flexibility (number of articles, titles to display, verbose mode).
Implement robust error handling and logging throughout the application.
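
The ordering requirement above can be illustrated with a minimal sketch. This is not the project's actual validator.js; the function name findOrderingErrors and the article shape { title, timestamp } are assumptions made for illustration. It walks the list once, flags timestamps that cannot be parsed, and flags any article that is newer than the last valid article before it.

```javascript
// Hypothetical helper, not the project's real API.
// Each article is assumed to look like { title: string, timestamp: string }.
function findOrderingErrors(articles) {
  const errors = [];
  let previous = null; // last article seen with a valid timestamp

  articles.forEach((article, index) => {
    const time = Date.parse(article.timestamp);

    // Date.parse returns NaN for strings it cannot interpret.
    if (Number.isNaN(time)) {
      errors.push({ index, type: 'invalid-timestamp', value: article.timestamp });
      return; // invalid entries are reported but skipped for comparison
    }

    // Newest-to-oldest means each article must not be newer than the one before it.
    // Equal timestamps are treated as correctly ordered.
    if (previous !== null && time > previous.time) {
      errors.push({
        index,
        type: 'out-of-order',
        previousIndex: previous.index,
        value: article.timestamp,
      });
    }

    previous = { index, time };
  });

  return errors;
}
```

Returning a list of error objects rather than a single boolean keeps the article index and offending value available for detailed reporting.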

Key Objectives & Skills Demonstrated

Web Scraping with Playwright: Navigated multi-page content, extracted dynamic data, and handled browser interactions.
Test Automation & Validation: Implemented validation logic to detect and report sorting errors with precision.
Data Processing & Timestamp Handling: Parsed ISO 8601 timestamp strings and compared them for chronological order.
Modular Code Architecture: Organized code into separate, reusable modules (CLI parser, validator, logger, result handler).
Error Handling & Edge Cases: Gracefully handled pagination, invalid data, network issues, and error pages.
Command-Line Interface Design: Built a user-friendly CLI with optional flags, default values, and help documentation.
Testing & Quality Assurance: Created comprehensive unit tests using Playwright's test framework to validate all components.
Attention to Detail: Implemented detailed logging, verbose output modes, and test-error injection for validation.

Technology Stack

Node.js - Runtime environment
Playwright - Browser automation and web scraping
JavaScript (ES6+) - Implementation language
@playwright/test - Testing framework

Implementation Highlights

Modular Architecture

The solution is structured into focused modules:

index.js - Main orchestrator that coordinates scraping, validation, and output.
cli.js - Parses command-line arguments and provides help documentation.
validator.js - Core validation logic for timestamp parsing and error detection.
result.js - Result class that encapsulates article data and provides analysis methods.
logger.js - Singleton logger for collecting and reporting warnings.
debug.js - Test error injection utilities for validation testing.
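
As an illustration of the singleton logger idea listed above, here is one possible shape for such a module. The class name Logger and its methods (getInstance, warn, warnings, clear) are assumptions for this sketch and may differ from the project's logger.js.

```javascript
// Hypothetical singleton logger; names are illustrative only.
class Logger {
  static #instance = null; // the single shared instance
  #warnings = [];

  // Always returns the same Logger object, creating it on first use.
  static getInstance() {
    if (Logger.#instance === null) {
      Logger.#instance = new Logger();
    }
    return Logger.#instance;
  }

  // Record a warning for later reporting.
  warn(message) {
    this.#warnings.push(message);
  }

  // Return a copy so callers cannot mutate the internal list.
  get warnings() {
    return [...this.#warnings];
  }

  // Reset state, which is useful between test runs.
  clear() {
    this.#warnings = [];
  }
}
```

Because every module asks for Logger.getInstance(), warnings raised during scraping and validation end up in one place and can be reported together at the end of a run.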

Key Features

Pagination Handling: Automatically navigates through Hacker News pages until target article count is reached.
Timestamp Validation: Parses ISO 8601 timestamps and identifies invalid or malformed entries.
Error Reporting: Provides detailed error information with article indices and comparison data.
Flexible Output: Supports displaying article titles, verbose error details, and execution timing.
Test Mode: Includes intentional error injection for validating the sorting algorithm.
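
The pagination behavior can be sketched independently of the browser. In the sketch below, fetchPage is a hypothetical async function standing in for whatever Playwright code loads a page and extracts its articles; collectArticles is likewise an illustrative name, not the project's actual function. The loop keeps requesting pages until the target count is reached or a page comes back empty.

```javascript
// Hypothetical pagination loop.
// fetchPage(pageNumber) is assumed to resolve to an array of articles,
// or an empty array when there are no further pages.
async function collectArticles(fetchPage, targetCount) {
  const articles = [];
  let pageNumber = 1;

  while (articles.length < targetCount) {
    const batch = await fetchPage(pageNumber);
    if (batch.length === 0) {
      break; // ran out of pages before reaching the target
    }
    articles.push(...batch);
    pageNumber += 1;
  }

  // The last page may overshoot the target, so trim to the exact count.
  return articles.slice(0, targetCount);
}
```

Passing the page fetcher in as a parameter makes the loop testable with a stub, without launching a browser.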

Usage Examples

# Basic usage - fetch and validate 100 articles
node index.js

# Show first 5 article titles with validation
node index.js 100 5

# Verbose mode with detailed error reporting
node index.js 100 5 --verbose

# Test mode with intentional sorting errors
node index.js 100 0 --test-error

# Display help information
node index.js --help
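
The commands above suggest two optional positional arguments (article count, then number of titles to show) plus boolean flags. A minimal sketch of a parser for that shape follows. The function name parseArgs, the option names, and the default of 0 titles are assumptions; only the default of 100 articles and the flag names come from the examples.

```javascript
// Hypothetical argument parser; not the project's actual cli.js.
function parseArgs(argv) {
  const options = {
    count: 100,       // default taken from the basic usage example
    titles: 0,        // assumed default: show no titles
    verbose: false,
    testError: false,
    help: false,
  };
  const positional = [];

  for (const arg of argv) {
    if (arg === '--verbose') options.verbose = true;
    else if (arg === '--test-error') options.testError = true;
    else if (arg === '--help') options.help = true;
    else positional.push(arg);
  }

  // Keep the default when a positional value is missing or not a non-negative integer.
  const toCount = (value, fallback) => {
    const parsed = Number.parseInt(value, 10);
    return Number.isInteger(parsed) && parsed >= 0 ? parsed : fallback;
  };

  options.count = toCount(positional[0], options.count);
  options.titles = toCount(positional[1], options.titles);
  return options;
}
```

In a script this would typically be called as parseArgs(process.argv.slice(2)), which drops the node executable and script path from the argument list.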

Personal Contributions

As the sole developer of this assignment, I was responsible for:

Architecting the overall solution and module structure.
Implementing web scraping logic with Playwright, including pagination and dynamic content extraction.
Developing the timestamp parsing and validation algorithm to detect sorting errors.
Designing the CLI with argument parsing and flag handling.
Creating comprehensive logging and error reporting mechanisms.
Writing unit tests to validate all components and edge cases.
Implementing test-error injection to check that the validation logic itself catches sorting errors.

Testing & Quality Assurance

The project includes comprehensive test coverage:

CLI Parser Tests: Validates argument parsing, default values, and flag handling.
Validator Tests: Tests timestamp parsing and error detection with various edge cases.
Result Tests: Verifies the Result class methods and error reporting.
Integration Tests: End-to-end validation of the entire workflow.
Test-Error Tests: Ensures error injection works correctly for validation testing.

All tests are run using Playwright's testing framework with the command: npx playwright test
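
The test-error idea can be illustrated with a small sketch: deliberately break a correctly sorted list, then confirm that a sort check fails on it. Both function names below (injectSortingError and isSortedNewestFirst) are hypothetical and do not reflect the project's debug.js or validator.js.

```javascript
// Hypothetical error injection: returns a copy of the list with the
// timestamps of two adjacent articles swapped. The input is not mutated.
// Note: if the two timestamps are equal, the swap introduces no error.
function injectSortingError(articles, index = 0) {
  if (articles.length < 2) return articles.slice();

  // Clamp the index so that index + 1 is always a valid position.
  const i = Math.min(Math.max(index, 0), articles.length - 2);
  const copy = articles.map((article) => ({ ...article }));

  const swapped = copy[i].timestamp;
  copy[i].timestamp = copy[i + 1].timestamp;
  copy[i + 1].timestamp = swapped;
  return copy;
}

// Minimal check used here only to show that the injected error is detectable.
function isSortedNewestFirst(articles) {
  return articles.every((article, i) =>
    i === 0 || Date.parse(articles[i - 1].timestamp) >= Date.parse(article.timestamp));
}
```

A validator that still reports success on the injected list would indicate a bug in the validator rather than in the data.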

Video Walkthrough

Watch a complete demonstration of the project, including the motivation for joining QA Wolf and a live walkthrough of the code and successful execution.
