gmays's submissions | Hacker News

1.		Why SWE-bench Verified no longer measures frontier coding capabilities (openai.com)
		2 points by gmays 17 hours ago
2.		Realtime Prompting Guide (developers.openai.com)
		1 point by gmays 19 hours ago
3.		Next-Token Predictor Is an AI's Job, Not Its Species (astralcodexten.com)
		3 points by gmays 21 hours ago
4.		DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference (arxiv.org)
		2 points by gmays 22 hours ago
5.		Data Engineering for Scaling LLM Terminal Capabilities (arxiv.org)
		2 points by gmays 22 hours ago
6.		The AI Is the Computer (twitter.com/aravsrinivas)
		1 point by gmays 1 day ago
7.		The File System Is the New Database: How I Built a Personal OS for AI Agents (twitter.com/koylanai)
		1 point by gmays 1 day ago
8.		Security Boundaries in Agentic Architectures (vercel.com)
		1 point by gmays 1 day ago
9.		Build dynamic agentic workflows in Opal (blog.google)
		1 point by gmays 1 day ago
10.		Mathematics in the Library of Babel (daniellitt.com)
		1 point by gmays 1 day ago
11.		Long Horizon Tasks with Codex (developers.openai.com)
		1 point by gmays 1 day ago
12.		Why Developers Keep Choosing Claude over Every Other AI (bhusalmanish.com.np)
		65 points by gmays 1 day ago | 79 comments
13.		Discovering Multiagent Learning Algorithms with Large Language Models (arxiv.org)
		2 points by gmays 2 days ago
14.		Which web frameworks are most token-efficient for AI agents? (martinalderson.com)
		2 points by gmays 2 days ago
15.		Reinforcement Learning for LLMs (mesuvash.github.io)
		2 points by gmays 2 days ago
16.		Apple's upcoming AI smart glasses are starting to sound more exciting (9to5mac.com)
		2 points by gmays 2 days ago | 1 comment
17.		Grail’s Cancer Detection Test Fails in Major Study (nytimes.com)
		4 points by gmays 2 days ago | 1 comment
18.		Dinosaur eggshells can reveal the age of other fossils (arstechnica.com)
		3 points by gmays 2 days ago
19.		Improving Deep Agents with Harness Engineering (twitter.com/vtrivedy10)
		1 point by gmays 4 days ago
20.		Our First Proof Submissions (openai.com)
		3 points by gmays 5 days ago
21.		The RL Architecture Behind Minimax M2.5 (twitter.com/neural_avb)
		3 points by gmays 7 days ago
22.		They Do Mean the Effect on Jobs (thezvi.wordpress.com)
		1 point by gmays 7 days ago
23.		Implementing a secure sandbox for local agents (cursor.com)
		3 points by gmays 7 days ago
24.		Optimize_anything: A Universal API for Optimizing Any Text Parameter (gepa-ai.github.io)
		1 point by gmays 7 days ago
25.		Multi-agent cooperation through in-context co-player inference (arxiv.org)
		2 points by gmays 7 days ago
26.		ARC-AGI-3 Update (twitter.com/scaling01)
		2 points by gmays 7 days ago
27.		Soft Contamination Means Benchmarks Test Shallow Generalization (arxiv.org)
		2 points by gmays 7 days ago
28.		17 AI companies that raised $100M or more so far in 2026 (techcrunch.com)
		2 points by gmays 8 days ago
29.		OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments (huggingface.co)
		1 point by gmays 8 days ago
30.		Meta expands Nvidia deal to use AI data center chips (cnbc.com)
		2 points by gmays 8 days ago | 1 comment