Hacker News: mr_00ff00's comments

Feel like this debate might be way different for novel writing vs everyday writing.

I’m biased because I am not a very good writer, but I can see why in a book you might want to hint at how someone walked up to someone else to illustrate a point.

When writing articles to inform people, technical docs, or even just letters, don’t use big vocabulary to hint at ideas. Just spell it out literally.

Any other way of writing feels like you are trying to be fancy just for the sake of seeming smart.


>> Just spell it out literally.

Spelling it out literally is precisely what the GP is doing in each of the example sentences: literally saying what the subject is doing, with the precision of a single well-chosen word that conveys not only the mere fact of bipedal locomotion, but also the WAY the person walked, with what pace, attitude, and feeling.

This carries MORE information in the exact same number of words. It is the most literal way to spell it out.

A big part of good writing is how to convey more meaning without more words.

Bad writing would be to add more clauses or sentences to say that our subject was confidently striding, conspiratorially sidling, or angrily tromping, and adding much more of those sentences and phrases soon gets tiresome for the reader. Better writing carries the heavier load in the same size sentence by using better word choice, metaphor, etc. (and doing it without going too far the other way and making the writing unintelligibly dense).

Think of "spelling it out literally" like the thousand-line IF statements, whereas good writing uses a more concise function to produce the desired output.


Those examples were simple, so it's less of an issue, but if the words you use are so obscure that the reader has to read more slowly or stop to think about what you mean…then you aren't making things more concise, even if you are using fewer words.

For sure! Every author should know their audience and write for that audience.

An author's word choices can certainly fail to convey intended meaning, or convey it too slowly, because they are too obscure or are a mismatch for the intended audience — that is just falling off the other side of the good writing tightrope.

A technical paper is an example where the audience expects to see proper technical names and terms of art. Those terms will slow down a general reader, who will be annoyed by the "jargon", but it would annoy every academic or professional if the "jargon" were edited out in favor of less precise, more everyday words. And vice versa for the same topic published in a general-interest magazine.

So, an important question is whether you are part of the intended audience.


Agreed.

Brevity is the soul of good communication.


What is a pre-training run?


Pre-training is just training. It got the name because most models have a post-training stage, so to differentiate the two, people call the first stage pre-training.

Pre-training: you train on a vast amount of data, as varied and high-quality as possible. This determines the distribution the model can operate within, so LLMs are usually trained on a curated dataset of essentially the whole internet. The output of pre-training is usually called the base model.

Post-training: you narrow the model down to the specific behavior you want. You can do this in several ways:

- Supervised Finetuning (SFT): training on a strict, high-quality dataset of the task you want. For example, if you wanted a summarization model, you'd finetune the model on high-quality text -> summary pairs, and it would summarize much better than the base model.

- Reinforcement Learning (RL): you train a separate reward model that ranks outputs, use it to score the main model's generations, then use those scores to update the main model.

- Direct Preference Optimization (DPO): you have pairs of good/bad generations and use them to align the model toward/away from the kinds of responses you want.

Post-training is what makes the models easy to use. The most common form is instruction tuning, which teaches the model to talk in turns, but post-training can be used for anything. E.g. if you want a translation model that always translates a certain way, or a model that knows how to use tools, etc., you'd achieve all that through post-training. Post-training is where most of the secret sauce in current models is nowadays.
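To make the DPO bullet concrete, here is a toy sketch of its loss in plain Python. The real objective works on per-token log-probabilities from the policy and a frozen reference model; every number below is made up for illustration.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Toy DPO objective: push the policy to put more probability
    (relative to a frozen reference model) on the preferred output."""
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # -log(sigmoid(beta * margin)): small when the policy prefers the chosen answer
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# A policy with no preference yet vs. one that already prefers the chosen output
no_pref = dpo_loss(-5.0, -5.0, -5.0, -5.0)
prefers_chosen = dpo_loss(-2.0, -8.0, -5.0, -5.0)
print(prefers_chosen < no_pref)  # True: the loss drops as the preference margin grows
```

The point of the reference-model terms is to keep the policy from drifting too far from the base model while it learns the preference.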


Want to also add that the model doesn't know how to respond in a user -> assistant style conversation after its pre-training; it's a pure text predictor (look at the open-source base models).

There's also what is being called mid-training, where the model is trained on higher-quality traces; it acts as a bridge between pre- and post-training.


Just to go off of this, there is also the stochastic random overfit retraining process (SRORP). The idea behind SRORP is to avoid overfitting. SRORP takes data points from -any- aspect of the past process, with replacement, and creates usually 3-9 bootstrap models randomly. The median is then taken across all model weights to wipe out outliers. This SRORP polishing -if done carefully- is usually good for a 3-4% gain across all benchmarks.


If pre-training is just training, then how on earth can OpenAI not have "a successful pre-training run"? The word successful indicates that they tried, but failed.

It might be me misunderstanding how this works, but I assumed that the training phase was fairly reproducible. You might get different results on each run, due to changes in the input, but not massively so. If OpenAI can't continuously and reliably train new models, then they are even more overvalued than I previously assumed.


Because success for them doesn't mean it works; it means it works much better than what they currently have. If a 1% improvement comes at the cost of spending 10x more on training and 2x more on inference, then the run is a failure. (numbers out of ass)


That makes sense. It's not that the training didn't complete or returned a moronic model, but the capabilities have plateaued.


Maybe this has something to do with why they're declaring "code red".


- Reinforcement learning with verifiable rewards (RLVR): instead of using a grader model you use a domain that can be deterministically graded, such as math problems.
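The "deterministically graded" part can be as simple as an exact-match check; a minimal sketch (real graders parse and compare answers much more carefully):

```python
def verifiable_reward(model_answer: str, ground_truth: str) -> float:
    """RLVR-style reward: no learned grader model, just a deterministic check
    that the model's final answer matches the known-correct one."""
    # Real graders normalize more aggressively (parse numbers, strip LaTeX, etc.)
    return 1.0 if model_answer.strip() == ground_truth.strip() else 0.0

print(verifiable_reward(" 42 ", "42"))  # 1.0
print(verifiable_reward("41", "42"))    # 0.0
```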


If you've an hour to spare, this Karpathy video is good at explaining how it all works: https://youtu.be/7xTGNNLPyMI


The first step in building a large language model. That's when the model is initiated and trained on a huge dataset to learn patterns and whatnot. The "P" in "GPT" stands for "pre-trained."


That’s where they take their big pile of data and train the model to do next-token-prediction.
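At its simplest, that next-token-prediction objective can be illustrated with a toy bigram model over whole words. Real LLMs do the same job with a neural network over subword tokens; this is just the counting version.

```python
from collections import Counter, defaultdict

def train_bigram(corpus: str):
    """Toy 'pre-training': count which word follows which in the corpus."""
    counts = defaultdict(Counter)
    words = corpus.split()
    for cur, nxt in zip(words, words[1:]):
        counts[cur][nxt] += 1
    return counts

def predict_next(counts, word: str) -> str:
    # Greedy next-token prediction: the most frequent follower seen in training
    return counts[word].most_common(1)[0][0]

model = train_bigram("the cat sat on the mat and the cat slept")
print(predict_next(model, "the"))  # 'cat' (follows 'the' twice; 'mat' only once)
```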


Would current collapse make more than just Northern Europe colder? Or maybe they would be warmer?

They seem to suggest only certain northern countries would be affected because warm water stops flowing from the south.

So the southern waters would stay hotter right? Or what about across the Atlantic where the currents do the opposite (and make the winters so cold). Would Boston and New York get more temperate?


North of the Alps, temperatures would drop considerably. South of the Alps, probably fine, due to the thermal mass of the Mediterranean Sea. However, across the whole of Europe you would see a massive drop in rainfall, since basically all the humidity comes from the Atlantic's warm air, which carries a lot of it.

Additionally, the Caribbean, Mexico, and the southern US would also be fucked, since the energy wouldn't disperse and all the heat and humidity would stay there. Hurricanes would be much more violent, with way more rain, and likely more frequent.

The Labrador Current might become weaker, though that is not a given. Currently, the waters from the Gulf Stream cool down and sink to the bottom of the ocean, so they don't displace the Arctic waters and hence are likely not the cause of how cold the northeastern US is.


So what countries will be the beneficiaries of this process?


None? It is not certain any country will benefit. Countries built their infrastructure and population centers according to the weather of their locations. If the weather changes, probably every country will have to adjust.

If you are asking which area will benefit from climate change, I would say Siberia, as it will become increasingly important due to the northern corridor remaining ice-free and because a lot of people will be displaced by weather/sea-level rise. And that place is empty. Additionally, it has good farming soil, which right now goes unused since there are easier places to farm, but in a warming world this could change.


I don’t know if this is ridiculous, but I’m curious if access to LLMs will one day be priced like the Bloomberg Terminal or something. Where access for one user is like 20,000 dollars. Maybe less than that, but like 5k per person.

Seems crazy by most software standards, but Bloomberg eventually became a software-only product (they stopped selling physical terminals), and people were shocked that they paid almost nothing for Excel but so much for the second tool they needed as traders.

Yet it still is priced so high and people pay.


The difference is that Bloomberg Terminals were always expensive, and so people expected to pay. LLMs are basically free (subsidized) at this point, and people are very sensitive to large price increases.


Sure, and I'm sure there would be a huge shock, but simple economics dictates that if that's the true equilibrium price for LLMs to be economical, then prices would have to get there eventually.


There are good-quality LLMs you can run for free today, so no.

The trend is going the opposite way: intelligence too cheap to meter, according to @sama.


I think that's exactly what will happen but with robots.

A robot that cooks, cleans, and talks to you.

Many won't be able to afford it, so they'll maybe rent one by the day or for X number of hours.


Simple.

1. Is it worth 20k to anyone? Well, it depends on the advantage, but maybe yes. People are dropping $200-1,000 a month already as ordinary devs.

2. Is there competition? Yes, lots. To get to 20k, one model provider needs a real killer edge that no one else can keep up with. Alternatively, constraints push prices up (like memory!), but then that is not margin anymore.


I think there could be a few directions. Consumer level LLM usage will become free. Corporate-grade LLM use will cost a lot of money. Maybe the highest grade LLM use will be regulated and only done through government labs.


What's a high-grade LLM, though, in such a competitive environment? And if China releases more high-grade open-source models, that pricing model is f8cked.

One interesting thing I heard someone say about LLMs is that this could be a people's innovation: basically something so low-margin it actually provides more value to "the people" than to billionaires.


It just seems hard to imagine that simultaneously running 1,000 instances of claude code will be cheap in the next decade, but maybe running 1,000 instances of claude-like tools is what a corporate LLM subscription will give. And maybe running 1,000,000 or a billion such models is what the government will do once a contract gets awarded.

Just random speculation though.


It's my understanding that even the paid version of ChatGPT is heavily subsidized, so yeah, prices will have to be raised quite substantially to reach profitability.


that $18,000-$24,000 bloomberg terminal had a 300bps modem

it took longer to generate a page of content or get a complete answer than a free LLM takes on an i5 CPU.

(ex: llamaGPTJ for linux CLI)

it's missing live market data and news, but incorporating that with something akin to OpenRouter would be trivial.


Chinese open-source models disagree with you


They're not going to open-source forever. The industry will consolidate. The Chinese winners will stop open-sourcing their best models in the future.


https://www.nbcnews.com/news/amp/ncna95111

1 out of every 5 Ivy League students is prescribed stimulants.

I think it’s time we stop pretending like prescriptions magically mean the substance isn’t abused or is truly needed.


"1 out of every 5 Ivy League students is prescribed stimulants."

From the third paragraph of the article:

"Researchers analyzed responses from an online questionnaire of more than 600 Ivy League sophomores, juniors and seniors who were not diagnosed with ADHD — attention deficit/hyperactivity disorder — and therefore did not have a prescription for the medication. Writing essays and studying for an exam cause the most angst for students — of those who used stimulants, 69 percent said they used them to stay focused while writing and 66 percent said they used the drugs to help study."


Fair point, I was incorrect on the exact stat. But the prescription rate in the article is still very, very high compared to the general populace.


“non-addictive at prescribed doses”

Less likely to be addictive, definitely not non-addictive.

https://talbottcampus.com/resources/how-adderall-addiction-s...

This has the same energy as the common incorrect statement “marijuana isn’t addictive”. I assume made by frequent users who want to downplay negatives.


The winters will get worse, but it would also cause a new ice age?

Something seems off about saying these two things as if they mean the same thing.


An ice age is a period of expansion of glaciers and ice sheets, so winters could get worse without actually falling into an ice age.

Specifically, they're saying "this could cause an ice age, with winter temperatures plummeting to new extremes", which is a fair explanation of this potential future.


Automation with code = time-consuming, because you must build the instructions, but it will execute correctly basically 100% of the time once written

Automation with AI = less time-consuming, especially with elaborate rule sets, but it's a probability model, so by definition it will be wrong a percentage of the time
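That "wrong a percentage of the time" compounds quickly when probabilistic steps are chained; a quick back-of-the-envelope sketch (the per-step accuracy and the independence of errors are both simplifying assumptions):

```python
def pipeline_success(per_step_accuracy: float, n_steps: int) -> float:
    """Probability an n-step automation succeeds if every step must be right
    and errors are independent."""
    return per_step_accuracy ** n_steps

# Even a 99%-accurate step erodes fast across a long workflow
for n in (1, 10, 50):
    print(n, round(pipeline_success(0.99, n), 3))
```

A deterministic script, by contrast, sits at the n_steps-independent end of this curve: once correct, it stays correct.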


Why does it burn fuel so fast?


My guess is higher air density means more wind resistance, which acts as negative forward acceleration.


Not just that. Jet engines are efficient at higher speeds because the exhaust of the jet engine is fast.

If the plane is going fast as well, that exhaust is more or less stationary relative to the ground. The engine works to exchange the position of the plane with the position of the air in front of it.

If the plane is going slow, it's accelerating the air backwards. That's where the work is going, making the engine less efficient.

Think about it this way: if the jet airplane is tied to the ground, its engines are running at 0% efficiency, working hard to blow the air backwards. You wouldn't want to stand behind a jet engine when the plane is about to take off, when that's effectively the case.

The same applies to propeller-driven planes, of course. But those can vary the prop speed as well as the propeller pitch, giving more control over how fast the air is being pushed backwards. This allows the engine to be efficient at a wider range of speeds, particularly at the slower end.

But the propeller has a limit of how fast it can push the air back. When the prop blades start reaching the speed of sound, weird shit starts happening [1]. So propeller-driven aircraft have a limit on speeds at which they can go efficiently.

Jet engines (turbofans when it comes to airliners) trade off low efficiency at low speed / low altitude (where the airplane is spending a small percentage of flight time) for higher efficiency at high speed / high altitude.

Variable pitch turbine fans[2] aim to address this tradeoff, but the tech has yet to catch on.

[1] https://en.wikipedia.org/wiki/Republic_XF-84H_Thunderscreech

[2] https://en.wikipedia.org/wiki/Variable_pitch_fan
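The efficiency argument above is captured by the classic Froude propulsive-efficiency formula, eta = 2 / (1 + v_exhaust / v_flight). A rough sketch (the 600 m/s exhaust velocity is an illustrative assumption, not a real engine spec):

```python
def propulsive_efficiency(v_flight: float, v_exhaust: float) -> float:
    """Froude propulsive efficiency: the fraction of the kinetic energy
    given to the flow that actually propels the aircraft.
    Velocities are relative to the aircraft; v_exhaust > v_flight for thrust."""
    if v_flight <= 0:
        return 0.0  # tied to the ground: all the work goes into blowing air
    return 2.0 / (1.0 + v_exhaust / v_flight)

# The same engine (exhaust ~600 m/s) at different flight speeds
for v in (0.0, 100.0, 250.0, 500.0):
    print(v, round(propulsive_efficiency(v, 600.0), 2))
```

The numbers show why jets want to cruise fast: efficiency climbs steadily as the flight speed approaches the exhaust speed.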


That sounds like the Oberth effect in rocketry, where the faster you go, the more efficient your rocket becomes: https://en.wikipedia.org/wiki/Oberth_effect


They have nothing to do with each other.


I think about it like this:

Jet needs to suck air from front. If air is stopped, sucking is hard. If air is already being thrown at you, you don't even need to suck, just let it come in.


You are right that accelerating the air backwards more reduces efficiency, but I think it should be mentioned that the jet engine has to accelerate the air backwards to do any work pushing the plane forward. Picking the air up and setting it back down affects it with a net force of zero, and therefore the force pushing the plane forward is also zero.
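That zero-net-force point falls straight out of the momentum equation, thrust = mass flow x (v_out - v_in); a tiny sketch with made-up numbers:

```python
def jet_thrust(mass_flow_kg_s: float, v_in: float, v_out: float) -> float:
    """Momentum thrust: the engine must expel air faster than it came in,
    or the net force on the plane is zero."""
    return mass_flow_kg_s * (v_out - v_in)

print(jet_thrust(500.0, 250.0, 250.0))  # 0.0 N: air merely passed through
print(jet_thrust(500.0, 250.0, 600.0))  # 175000.0 N of thrust
```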


So perhaps the differential air speed between the intake and exhaust is a big factor in the efficiency equation? The bigger the difference, the more work is needed.


Variable pitch turbine fans sound very interesting! Perhaps in the future as tech improves and fuel efficiency incentives continue to increase.


So, Newton's first law?


How does this work for dynamic casting? Say like if an age was submitted from a form?

I assume it’s a runtime error or does the compiler force you to handle this?


If you're using SPARK, it'll catch at compile time any possibility of the value violating that constraint. Otherwise it'll raise an exception (Constraint_Error) at runtime for you to handle.

