Templarrr

Karma: 200

Software engineer and small time DS/ML practitioner.

Templarrr 27 Nov 2025 13:52 UTC
4 points
0
on: ChatGPT 5.1 Codex Max
looking closer to linear progress
There is no “linear” progress on the chart: the two reference lines are “exponential” and “superexponential”, and the Y axis is logarithmic.
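A minimal sketch of why a straight line on a logarithmic Y axis means exponential growth, not linear. The two series below are made-up illustrations, not data from the chart: on a log axis the exponential one has a constant slope, while the superexponential one keeps getting steeper.

```python
import math

# Hypothetical series for illustration only (not taken from the chart).
steps = range(6)
exponential = [2.0 ** t for t in steps]             # doubles every step
superexponential = [2.0 ** (t * t) for t in steps]  # doubling time shrinks


def log_slopes(values):
    """Step-to-step change in log10(value), i.e. the slope on a log Y axis."""
    logs = [math.log10(v) for v in values]
    return [b - a for a, b in zip(logs, logs[1:])]


exp_slopes = log_slopes(exponential)            # constant: a straight line
superexp_slopes = log_slopes(superexponential)  # increasing: curves upward

print(exp_slopes)
print(superexp_slopes)
```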

Templarrr 5 Sep 2025 13:44 UTC
1 point
−2
on: AI #132 Part 1: Improved AI Detection
least by my eyes even when they have relatively good taste they all reliably have terrible taste and even the samples people say are good are not good
We can gain a lot here if we remember that much of “good writing” comes down to not repeating yourself in various forms (words, phrases, structures, etc.), and current models are absolutely terrible at that. If we could add temporary negative weights to terms already used in the answer, weights that decay to zero over time, we could incentivise LLMs to use a wider variety of language.
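A rough sketch of the idea, under loud assumptions: the "model" here is just a fixed dictionary of toy logits, selection is greedy, and the names `generate`, `penalty` and `decay` are made up for this example. It only shows the mechanics of a penalty that is applied when a term is used and then fades step by step.

```python
def generate(logits, steps, penalty=2.0, decay=0.5):
    """Greedy pick from fixed toy logits with a usage penalty that fades."""
    penalties = {token: 0.0 for token in logits}
    output = []
    for _ in range(steps):
        # Score every token after subtracting its current penalty.
        adjusted = {t: logits[t] - penalties[t] for t in logits}
        choice = max(adjusted, key=adjusted.get)
        output.append(choice)
        # Fade all existing penalties, then penalise the token just used.
        penalties = {t: p * decay for t, p in penalties.items()}
        penalties[choice] += penalty
    return output


# Hypothetical logits: without a penalty the top token wins every time.
toy_logits = {"good": 3.0, "great": 2.5, "fine": 1.8}

plain = generate(toy_logits, steps=6, penalty=0.0)
varied = generate(toy_logits, steps=6)

print(plain)
print(varied)
```

Because the penalty decays instead of accumulating, a term becomes available again after a few steps rather than being banned for the rest of the answer.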

Templarrr 28 Mar 2025 11:45 UTC
1 point
0
on: AI #108: Straight Line on a Graph
engineer, honestly
First I thought this was hilarious, as in “we really just want an engineer FFS”, but then I checked.

Engineer, honesTY. As in “engineer to research and improve models’ honesty”.

Templarrr 27 Mar 2025 22:15 UTC
0 points
0
on: Fun With GPT-4o Image Generation
Fun safety hiccup: the image generator persistently refuses to draw a hand touching the blade of a sword, regardless of how safe the context is. The hand can hover over it, be close to it, or touch the guard, but not the blade. I barely got it to touch the blade by invoking the Mordhau and medieval fencing manuals, and even then it was just one hand on the blade, when it should have been both.

It had no trouble with a wooden toy sword, but that defeated the entire point of the picture.

Templarrr 18 Mar 2025 14:56 UTC
2 points
1
on: Monthly Roundup #28: March 2025
Would this even be legal in Germany? No wonder Europe is falling behind.
Case study “how to make your post much worse in a single sentence”.

There’s literally nothing in what they describe that requires active face recognition (the only part that could be a problem in Europe).
Most office spaces use personalized electronic key cards.
Office systems KNOW who just entered.

Solving a non-existent problem in a harder-than-necessary and illegal-in-some-places way can be fun, but it isn’t as much of a dunk on others as the author believes it to be. Without the last part it was a fun experiment by a fellow tech person; with it …

Templarrr 29 Jan 2025 16:34 UTC
1 point
0
on: AI #100: Meet the New Boss
Which means, in turn, that you must (for that to make any sense) be using the AI in its non-aligned state to align itself and solve all those other problems
Strongly disagree on this.
The text doesn’t imply this at all. “While doing it” doesn’t mean you will be using AI; it just means that during development your team uncovers a lot of corner cases, knowledge, and skills that weren’t available to them before they started, which is how most engineering projects go.
You may have a general plan, but you are expected to work out the details as your knowledge of the area grows.

Templarrr 21 Jan 2025 19:47 UTC
1 point
0
on: Monthly Roundup #26: January 2025
- Pointless busywork is bad.
100%. The problem usually lies in people confusing “I don’t (understand/agree with) the point of something” with “something is pointless”.

Templarrr 8 Dec 2024 17:42 UTC
1 point
0
on: AI #93: Happy Tuesday
what the median essay, story, or response to the assignment will look like so they can avoid and transcend it all
Obligatory joke about how terrible our education is: half of the scores are below the median!

Templarrr 11 Nov 2024 15:30 UTC
1 point
0
on: AI #89: Trump Card
they’re 99% sure are AI-generated, but the current rules mean they can’t penalise them.
The issue is proving it.
That is very much not the issue. The issue is that academia has spent the last few hundred years making sure papers are written in the most inhuman way possible. No human being talks the way whitepapers are written. “We can’t tell whether this was written by a machine or by a human who is really good at pretending to be one” can’t be a problem if that style was heavily encouraged for centuries. Also a fun reverse-Turing-test situation.

Templarrr 31 Oct 2024 9:49 UTC
10 points
−1
on: Occupational Licensing Roundup #1
Two things to note.

First, I feel that putting every occupation in the same pile and deciding whether you are for or against licensing isn’t helpful. I personally don’t need a licensed lawnmower, but I would very much prefer a licensed doctor. The cost of a mistake differs a lot between the two occupations and can serve as a threshold for which jobs should require a license.

Second, there should be a difference between doing a thing to yourself (an argument can even be made that we shouldn’t have any limits here), doing things for free for your friends/relatives with their full knowledge of your skill level and experience (most non-life-threatening things can probably be allowed here), and selling your craft for money.

Templarrr 29 Oct 2024 20:15 UTC
6 points
3
on: AI #87: Staying in Character
llms don’t work on unseen data
Unfortunately I hear this quite often, sometimes even from people who should know better.
A lot of them confuse this with the actual thing that exists: “supervised ML models (which an LLM is just a particular type of) tend to work much worse on data outside the training distribution”. If you train your model to determine the volume of apples, oranges, melons, and other round-y shapes, it will work quite well on any round-y shape, including all kinds of unseen ones. But it will suck at predicting the volume of a box.
You don’t need the model to see every single game of chess; you just need the new situations to be within the distribution built from the massive training data, and they most often are.
A real out-of-distribution example in this case would be to train it only on chess and then ask for the next best move in checkers (relatively easy OOD: same board, same type of game) or Minecraft.
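The round-shapes example can be sketched with a deliberately tiny stand-in model. Everything here is an assumption for illustration: the "model" is a single coefficient fitted by least squares to sphere volumes, and it only ever sees an object's width. It is not an LLM, just the smallest thing that shows "unseen but in-distribution" versus "out of distribution".

```python
import math


def sphere_volume(diameter):
    return math.pi / 6 * diameter ** 3


def cube_volume(side):
    return side ** 3


# Training data: only round shapes, at a handful of widths.
train_widths = [1.0, 2.0, 3.0, 4.0]
xs = [w ** 3 for w in train_widths]
ys = [sphere_volume(w) for w in train_widths]

# One-parameter least-squares fit of volume = k * width**3.
k = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def predict(width):
    return k * width ** 3


def relative_error(predicted, actual):
    return abs(predicted - actual) / actual


# Width 2.5 was never in training, but a sphere is still in-distribution.
unseen_sphere_error = relative_error(predict(2.5), sphere_volume(2.5))
# A box of the same width is out-of-distribution for this model.
box_error = relative_error(predict(2.5), cube_volume(2.5))

print(unseen_sphere_error, box_error)
```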

Templarrr 13 Oct 2024 18:49 UTC
16 points
13
on: AI #85: AI Wins the Nobel Prize
the *real* problem is the huge number of prompts clearly designed to create CSAM images
So people whose tastes are harmful and deviate from the social norm try to isolate themselves in digital fantasies instead of causing problems in the real world, and that is a problem... exactly how?
I mean, obviously it’s a coping mechanism, not an attempt to fix the problem, but our society also isn’t known for being very understanding toward people who come forward with this kind of deviation when they want to fix it.

Templarrr 3 Oct 2024 10:00 UTC
0 points
0
on: Monthly Roundup #22: September 2024
India getting remarkably better in at least one way, as the percentage of the bottom 20% who own a vehicle went from 6% to 40% in only ten years.
Is it better though? These stats show only “who owns a vehicle”, not “who is happy about the fact”. They don’t show how many people were forced to take out a mortgage because owning a vehicle was the only way to live. In an ideal world nobody should need a personal vehicle to survive, leaving it a luxury, not a lifeline.

Templarrr 11 Sep 2024 12:39 UTC
3 points
0
on: AI #80: Never Have I Ever
The inclusion of ‘natural disaster’ shows that this simply is not a thing people are thinking about at all.
The Chicxulub and Popigai impactors were both pretty natural. Actually, of the 5 things listed, “natural disasters” is the only category that has had actual extinction events in the past. So I’m a bit confused by this comment.

Templarrr 21 Aug 2024 11:35 UTC
1 point
0
on: Monthly Roundup #21: August 2024
Peter Thiel on his struggle to leave California
Honestly, at this point someone with some self-awareness would start to suspect that the problem may not be on the cities’ side. Nothing wrong with searching for a better place for yourself, everyone is entitled to that, but when literally nothing fits...

Templarrr 19 Aug 2024 9:08 UTC
2 points
3
on: Beware the science fiction bias in predictions of the future
If the answer is yes to all of the above
Point 2 needs rephrasing.

“Does it sound exciting or boring?” “Yes”

Templarrr 28 Jul 2024 10:26 UTC
1 point
0
on: Monthly Roundup #20: July 2024
Most Importantly Missing
Where’s my “Babylon 5”? Honestly, at the risk of angering Trekkies here, it’s “DS9 but better”.

Templarrr 28 Jul 2024 10:14 UTC
1 point
0
on: Monthly Roundup #20: July 2024
Does the Nobel Prize sabotage future work?
My first thought was “regression to the mean”, and judging from a lot of comments on the original post I’m not the only one. If you’re at the top of the world, the only way to go is down.
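A small simulation of what regression to the mean would look like here, under a purely hypothetical model: each observed score is a stable skill plus one-off luck, both drawn from a standard normal. No sabotage is built in, yet the top scorers from the first period still score lower on average in the second.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical model: observed score = stable skill + one-off luck.
skills = [rng.gauss(0, 1) for _ in range(10_000)]
first = [s + rng.gauss(0, 1) for s in skills]
second = [s + rng.gauss(0, 1) for s in skills]

# The top 1% by first-period score stand in for the "prize winners".
top = sorted(range(len(skills)), key=lambda i: first[i], reverse=True)[:100]

top_first = mean(first[i] for i in top)
top_second = mean(second[i] for i in top)
overall_second = mean(second)

print(top_first, top_second, overall_second)
```

In this toy setup the winners stay well above the overall average in the second period, but below their own first-period scores, because part of what put them on top was luck that does not repeat.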

Templarrr 28 Jul 2024 10:07 UTC
1 point
1
on: Monthly Roundup #20: July 2024
Your periodic reminder.
Except there should also be an understanding of what constitutes constructive “questioning the science”. There can be no debate between a quantum physicist and a cobbler about quantum physics. Questioning the science isn’t “I decided I know better” and isn’t “I don’t want to believe in your results” (by itself). You question the science by checking, double-checking, and finding weaknesses in the previous science. And by making new, better, more rigorous science.
People tend to forget this part even more often than the part about questioning being an integral part of science.

Templarrr 27 Jul 2024 18:33 UTC
1 point
0
on: AI #74: GPT-4o Mini Me and Llama 3
Compared to how much carbon a human coder would have used? Huge improvement.
JSON formatting? That’s literally a millisecond in a dedicated tool. And unlike an LLM, it will not make mistakes you need to control for. Someone using an LLM for this is just someone too lazy to turn on their brain.
That said, it’s not like people not using their brains is an infrequent occurrence, but still… not something to praise.
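For reference, a minimal sketch of the "dedicated tool" route using Python's standard `json` module. The `raw` string is a made-up example; the point is that the round trip is deterministic and the output parses back to exactly the same data.

```python
import json

# Made-up input for illustration.
raw = '{"name":"example","tags":["a","b"],"nested":{"ok":true}}'

# Parse, then re-serialise with indentation and stable key order.
formatted = json.dumps(json.loads(raw), indent=2, sort_keys=True)

print(formatted)
```

The same thing is available from the command line as `python -m json.tool`.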