Academic research paper with direct arXiv link (a preprint, which does not by itself demonstrate peer review); deanonymization study is verifiable and citable.
Real-time strategy game environment for testing LLM coding capabilities, with a comparison of Claude Opus 4.5 and GPT 5.2. Includes a working website, API docs, and a playable demo.
Comment moderation tool using LLMs with specific, implementable features (fallacy detection, tone analysis). Demonstrates working interactive demo with tunable parameters.
Empirical observation of bias in the names Claude generates. Lacks detailed methodology and sources but describes a replicable phenomenon worth investigating.
Functional development tool (language server) with working features, GitHub repository, documentation, and screen recordings. Related to development tooling ecosystem.
Live AI agent product with detailed architecture explanation, agent orchestration strategy, vector embeddings, and constraint management. Includes working demo and specific technical decisions.
Functional multi-agent framework for AI-assisted development with working code, demo video, and open-source repository. Clear technical implementation details and actionable use cases.