Hacker News
Understanding Encoder and Decoder LLMs (sebastianraschka.com)
1 point by jeffjeffbear 2 days ago
A Technical Tour of the DeepSeek Models from V3 to v3.2 (sebastianraschka.com)
23 points by ibobev 16 days ago | 1 comment
A Technical Tour of the DeepSeek Models from V3 to v3.2 (sebastianraschka.com)
5 points by mzl 17 days ago | 1 comment
Recommendations for Getting the Most Out of a Technical Book (sebastianraschka.com)
2 points by naves 18 days ago
A Technical Tour of the DeepSeek Models from V3 to v3.2 (sebastianraschka.com)
8 points by giuliomagnifico 18 days ago
Getting the Most Out of a Technical Book (sebastianraschka.com)
4 points by quietlearning 37 days ago
Beyond Standard LLMs (sebastianraschka.com)
1 point by vismit2000 42 days ago
Beyond Standard LLMs (sebastianraschka.com)
1 point by ibobev 46 days ago
A Researcher's Field Guide to Non-Standard LLM Architectures (sebastianraschka.com)
2 points by ModelForge 46 days ago
Understanding the 4 Main Approaches to LLM Evaluation (From Scratch) (sebastianraschka.com)
1 point by ibobev 66 days ago
Popular Attention Alternatives: GQA, MLA, SWA (sebastianraschka.com)
4 points by ModelForge 66 days ago
Multi-Head Latent Attention (sebastianraschka.com)
4 points by ModelForge 68 days ago
Understanding the 4 Main Approaches to LLM Evaluation (From Scratch) (sebastianraschka.com)
2 points by ibobev 71 days ago
LLM Evaluation from Scratch: Multiple Choice, Verifiers, Leaderboards, LLM Judge (sebastianraschka.com)
4 points by ModelForge 76 days ago
Understanding and Implementing Qwen3 from Scratch (sebastianraschka.com)
1 point by ibobev 3 months ago
GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2 (sebastianraschka.com)
490 points by ModelForge 4 months ago | 97 comments
From GPT-2 to GPT-OSS: Analyzing the Architectural Advances (sebastianraschka.com)
3 points by mdp2021 4 months ago
PyTorch in One Hour: From Tensors to Training Neural Networks on Multiple GPUs (sebastianraschka.com)
1 point by Anon84 4 months ago
PyTorch in One Hour: From Tensors to Training Neural Networks on Multiple GPUs (sebastianraschka.com)
4 points by mariuz 5 months ago
LLM architecture comparison (sebastianraschka.com)
418 points by mdp2021 5 months ago | 24 comments
The Big LLM Architecture Comparison (sebastianraschka.com)
3 points by Quizzical4230 5 months ago
Comprehensive ML/AI questions and answers for interview prep (sebastianraschka.com)
2 points by yaiml 5 months ago
PyTorch in One Hour: From Tensors to Training Neural Networks on Multiple GPUs (sebastianraschka.com)
4 points by sbbq 5 months ago
Intermediate ML and AI questions and answers for interview prep (sebastianraschka.com)
3 points by sbbq 5 months ago
Understanding and Coding the KV Cache in LLMs from Scratch (sebastianraschka.com)
6 points by sbbq 6 months ago
Understanding and Coding the KV Cache in LLMs from Scratch (sebastianraschka.com)
2 points by tosh 6 months ago
Coding LLMs from the Ground Up: A Complete Course (sebastianraschka.com)
4 points by sbbq 6 months ago
Coding LLMs from the Ground Up: A Complete Course (sebastianraschka.com)
2 points by mdp2021 7 months ago
The State of Reinforcement Learning for LLM Reasoning (sebastianraschka.com)
8 points by yaiml 8 months ago
The State of Reinforcement Learning for LLM Reasoning (sebastianraschka.com)
9 points by jonbaer 8 months ago