Since o1 launched, the biggest complaint has been that it's "too verbose."
I just wanted to fix a simple bug, and it gave me three background explanations, two solution approaches plus error handling, and then wished me good luck on top of that.
I was only looking for a spelling mistake on line 12, but ended up having to review Python naming conventions all over again.
The blame falls squarely on RLHF: human annotators tend to score longer responses higher, because more text looks more professional.
So the model desperately piles on "seemingly useful" filler, while the actual core information gets diluted.
Look at Claude next door—it's much more sensible about this, knowing what length matches what question.
The most painful part is the wallet: o1's output pricing is $60/1M tokens. For something that should take 100 tokens to explain, it deliberately pads it to 500, multiplying costs by five on the spot.
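The cost math above can be sketched in a few lines. This is a back-of-envelope calculation, not an official billing formula: the $60/1M-token output price and the 100- vs. 500-token answer lengths are the post's own figures.

```python
# Back-of-envelope cost comparison: concise vs. padded output.
# Price and token counts are the post's illustrative figures, not official data.

PRICE_PER_TOKEN = 60 / 1_000_000  # o1 output pricing: $60 per 1M tokens

def output_cost(tokens: int) -> float:
    """Dollar cost of generating `tokens` output tokens at this rate."""
    return tokens * PRICE_PER_TOKEN

concise = output_cost(100)  # the answer the question actually needs
verbose = output_cost(500)  # the padded answer the model produces

print(f"concise: ${concise:.4f}, verbose: ${verbose:.4f}, "
      f"ratio: {verbose / concise:.0f}x")
# → concise: $0.0060, verbose: $0.0300, ratio: 5x
```

Fractions of a cent per question sound trivial, but the 5x multiplier compounds quickly across thousands of API calls a day.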
Now when asking questions you have to specifically add "code only," and even that doesn't always work.
The model's current state is: genius-level IQ, but EQ completely offline—it simply doesn't know when to shut up.