TehPers

TehPers@beehaw.org · 41 minutes ago

Are you suggesting that Rust can perform compile time array bounds checking for all code that uses arrays?

I’ll answer this question: no.

But it does at least optimize away some bounds checks, such as when iterating or when a check already written in the code makes the runtime one redundant.

But yes it does runtime bounds checking where necessary.

TehPers@beehaw.org · 44 minutes ago

C, C++, and Rust come to mind as other languages with sizes as part of an array’s type. This is necessary for the compiler to know how much stack memory to reserve for the values, and other languages that only rely on dynamically sized arrays and lists allocate those on the heap (with a few exceptions, like C#'s mostly unknown stackalloc keyword).

TehPers@beehaw.org · edit-2 59 minutes ago

Do you mean memory safety here? Because yes, for memory safety, this is proven. E.g. there are reports from Google that wide usage of memory-safe languages for new code reduces the number of bugs.

Memory safety is such a broad term that I don’t even know where to begin with this. Memory safety is entirely orthogonal to typing though. But since you brought it up, Rust’s memory safety is only possible due to its type system encoding lifetimes into types. Other languages often use GCs and runtime checking of pointers to enforce it.

Then, first, why don’t the claims that statically compiled languages come with claims on measurable, objective benefits? If they are really significantly better it should be easy to come up with such measures?

Because nobody’s out there trying to prove one language is better than another. That would be pointless when the goal is to write functional software and deliver it to users.

I have seen no such report - in spite of that they now have 16 years of experience with it.

I have seen no report that states the opposite. Google switched to Go (and now partially to Rust). If they stuck with it, then that’s your report. They don’t really have a reason to go out and post their 16 year update on using Go because that’s not their business.

And just for fun, Python itself is memory safe and concurrency bugs in Python code can’t lead to undefined behaviour, like in C.

Python does have implementation-defined behavior though, and it comes up sometimes as “well technically it’s undocumented but CPython does this”.

Also, comparing concurrency bugs in Python to those in C is wildly misleading - Python’s GIL prevents two code snippets from executing in parallel while C needs to coordinate shared access with the CPU, sometimes even reordering instructions if needed. These are two completely different tasks. Despite that, Rust is a low level language that is also “memory safe”, except to an extent beyond Python - it also prevents data races, unlike Python (which still has multithreading despite running only one thread at a time).
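To illustrate the Python side of that, here is a minimal sketch of a read-modify-write on shared state. The `Counter` class and `run` helper are made up for this example. Even with the GIL, a thread switch can land between the read and the write, so the unlocked version may lose updates; the locked version should not.

```python
import threading


class Counter:
    """Hypothetical shared counter, only for illustration."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self) -> None:
        # Separate read and write steps: another thread may run in between,
        # so an update can be lost even though only one thread runs at a time.
        current = self.value
        self.value = current + 1

    def safe_increment(self) -> None:
        # The lock turns the read-modify-write into one critical section.
        with self._lock:
            self.value += 1


def run(increment, threads: int = 4, iterations: int = 10_000) -> None:
    """Call `increment` from several threads and wait for all of them."""

    def worker() -> None:
        for _ in range(iterations):
            increment()

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()


counter = Counter()
run(counter.safe_increment)
```

Swapping in `counter.unsafe_increment` may or may not show a wrong total on a given run, which is part of what makes this kind of bug confusing to track down.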

Go is neither memory safe…

?

…nor has it that level of concurrency safety

That’s, uh, Go’s selling point. It’s the whole reason people use it. It has built-in primitives for concurrent programming and a whole green threading model built around it.

If you concurrently modify a hash table in two different threads, this will cause a crash.

This is true in so many more languages than just Go. It’s not the case in Python though because you can’t concurrently modify a hash table there. The crash is a feature, not a bug. It’s the runtime telling you that you dun goof’d and need to use a different data structure for the job to avoid a confusing data race.

TehPers@beehaw.org · edit-2 3 hours ago

It’s not hard to find articles explaining the benefits of using TypeScript over JavaScript or type hints in Python over no type hints online. It’s so well known at this point that libraries now require type hints in Python (Pydantic, FastAPI, etc) or require TypeScript (Angular, etc) and people expect types in their libraries now. Even the docs for FastAPI explain the benefits of type hints, but it uses annotated types as well for things like dependencies.

But for a more written out article, Cloudflare’s discussion on writing their new proxy in Rust (which has one of the strictest type systems in commonly used software programming languages) and Discord’s article switching from Go to Rust come to mind. To quote Cloudflare:

In fact, Pingora crashes are so rare we usually find unrelated issues when we do encounter one. Recently we discovered a kernel bug soon after our service started crashing. We’ve also discovered hardware issues on a few machines, in the past ruling out rare memory bugs caused by our software even after significant debugging was nearly impossible.

TehPers@beehaw.org · 18 hours ago

Can also confirm - Pyright is a godsend.

TehPers@beehaw.org · 21 hours ago

There’s no scientific evidence that pissing in someone’s coffee is a bad idea, but it’s common sense not to do that.

You seem to be looking to apply the scientific method somewhere that it can’t be applied. You can’t scientifically prove that something is “better” than another thing because that’s not a measurable metric. You can scientifically prove that one code snippet has fewer bugs than another though, and there’s already mountains of evidence of static typing making code significantly less buggy on average.

If you want to use dynamic typing without any type hints or whatever, go for it. Just don’t ask me to contribute to unreadable gibberish - I do enough of that at work already dealing with broken Python codebases that don’t use type hints.

TehPers@beehaw.org · 1 day ago

TIL there’s such a thing as idiomatic C.

Jokes aside, microbenchmarks are not very useful, and even JS can compete in the right microbenchmark. In practice, C has the ability to give more performance in an application than Java or most other languages, but it requires way more work to do that, and it’s unrealistic for most devs to try to write the same applications in C that they would use Java to write.

But both are fast enough for most applications.

A more interesting comparison to me is Rust and C, where the compiler can make more guarantees at compile time and optimize around them than a C compiler can.

TehPers@beehaw.org · 1 day ago

Python 3.13 has really good support for typing now and while it doesn’t support more complex types, opting out with typing.Any or # type: ignore still works.

Type hints have been largely helpful from my experience, especially when the code itself came from someone else and is incomprehensible otherwise.
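A small sketch of what that looks like in practice. The function names are invented for the example, and the checker behavior described in the comments is what I would expect from a tool like Pyright rather than something this snippet verifies.

```python
from typing import Any


def total_price(prices: list[float], tax_rate: float) -> float:
    """Sum the prices and apply a tax rate."""
    return sum(prices) * (1 + tax_rate)


def describe(blob: Any) -> str:
    # Any opts this parameter out of checking: callers can pass anything,
    # and the checker will not complain about how `blob` is used.
    return str(blob)


subtotal = total_price([10.0, 5.0], 0.25)

# A static checker should flag a call like this one before it ever runs:
# total_price("10", 0.25)

# `# type: ignore` silences the checker for one line only. The value is still
# a str at runtime; the hint is simply no longer enforced here.
legacy_id: int = describe(42)  # type: ignore
```

The hints are never enforced at runtime, so the opt-outs are cheap, but every one of them is a spot where the checker can no longer help.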

TehPers@beehaw.org · 1 day ago

Thanks. I couldn’t find a date on it.

TehPers@beehaw.org · 1 day ago

Honestly I’m surprised this is even still a discussion. I plan to read this, but who out there is arguing against static typecheckers? And yes, I know it’s probably the verifiable NPCs on Twitter.

TehPers@beehaw.org · 8 days ago

You should also include the standardized name of the body the coordinates are relative to. Need to be able to differentiate between lat/long on Jupiter vs on Earth (where lat/long are much more “crunched” aka more precise with shorter floats).

This will be important if intelligent extraterrestrial life is found, or when Musk ships himself to Mars for the good of humanity.

TehPers@beehaw.org · 19 days ago

According to cppreference:

Unless otherwise specified, all standard library objects that have been moved from are placed in a “valid but unspecified state”, meaning the object’s class invariants hold (so functions without preconditions, such as the assignment operator, can be safely used on the object after it was moved from)

I would expect this to be true of all types. An easy way to do this is to null an internal pointer, set an internal fd to a sentinel, etc and check for that when needed, but this could be an easy source of errors if someone’s not paying attention.
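Python has no move semantics, so this is only a rough imitation of the C++ pattern. The `FileHandle` class, its `take` method, and the `-1` sentinel are all hypothetical. It shows the sentinel approach and the check that every method has to remember to make.

```python
class FileHandle:
    """Hypothetical wrapper around a file descriptor number (no real fd is opened)."""

    _MOVED = -1  # sentinel marking a moved-from handle

    def __init__(self, fd: int) -> None:
        self._fd = fd

    def take(self) -> "FileHandle":
        """Hand ownership to a new handle and leave this one in the sentinel state."""
        new_owner = FileHandle(self._fd)
        self._fd = self._MOVED
        return new_owner

    def is_valid(self) -> bool:
        return self._fd != self._MOVED

    def fileno(self) -> int:
        # Every method that touches the fd needs this check.
        # Forgetting it in one place is the easy source of errors.
        if not self.is_valid():
            raise ValueError("handle was moved from")
        return self._fd


original = FileHandle(3)
moved = original.take()
```

After `take()`, `original` is still a valid object that can be reassigned or dropped, but using it as a file handle is an error that only shows up at runtime.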

Ideally it would be statically checked if a value is used after being moved, but that’s just my Rust brain speaking.

TehPers@beehaw.org · 25 days ago

I would expect it to use CRLF (on Windows) for all new newlines unless I tell it otherwise. It shouldn’t try to be smart about it. It should just do exactly what I tell it to do and nothing more.

TehPers@beehaw.org · edit-2 26 days ago

Where do you draw the line on “smart” features? Tab should not add indent spaces? Encoding or newline mechanisms? Determining EOF newline?

For a very basic default editor, I would expect it to include only what I typed, no “smart” features, no IDE features, nothing else, and use CRLF (on Windows) for newlines with at most a setting to configure it in the editor for that session.

Basically, I wouldn’t expect anything more than what nano does. If I want a fancy CLI editor, I’ll install one. At its core though, it should exist only to edit the text content of a text file and do nothing else. It should be as stable as possible, and have as little scope as possible, in my opinion.

With that said, basic text editing features, like undo/redo and cut/copy/paste would be nice. Bonus points if it even works with the system clipboard.

Edit: to add to the question of whether an automatic newline should be added, Windows has no requirement for terminating text documents with newlines, so I would not expect one. What happens in POSIX environments by tools written for those environments seems irrelevant here - if a valid text document in POSIX must be terminated by a newline, then a text editor there would naturally be expected to add one, or at least support adding one, but that has nothing to do with Windows.

TehPers@beehaw.org · 26 days ago

The only part of this process I’d consider automating with an LLM is summarizing the changes, and even then I’d only be interested in looking at a suggested changelog, not something fully automated.

It’s amazing to me how far people will go to avoid writing a simple script. Thankfully determinism isn’t a requirement for a release pipeline. You wouldn’t want all of your releases to go smoothly. That would be no fun.

TehPers@beehaw.org · edit-2 30 days ago

Thank you for giving us a great example of how to appropriately use AI: turning a long comment with no line breaks into a blog post summarizing the comment.

Now I just need to pass your comment into ChatGPT to get a short summary.

Edit: I asked for a one-sentence summary of it and this is what I got:

The comment expresses deep frustration with modern technology, society, government, and politics, calling for radical change and greater individual skepticism.

TehPers@beehaw.org · 1 month ago

I haven’t used an LLM to help code in a while (yes I’ve tried), but I found them useful for repetitive configs, like asset files. Also sometimes it makes sense to just have 5 slightly different lines of code in a row instead of a new function.

In general though, reasonable use of DRY is a good idea. There will still be repetitive parts though where an LLM autocompleter lets you just hit tab 5 times.

TehPers@beehaw.org · 1 month ago

But how can we then ensure that I am not adding/processing products which are already in the “final” table, when I have no knowledge about ALL the products which are in this final table?

Without knowledge about your schema, I don’t know enough to answer this. However, the database doesn’t need to scan all rows in a table to check if a value exists if you can build an index on the relevant columns. If your products have some unique ID (or tuple of columns), then you can usually build an index on those values, which means the DB builds what is basically a lookup table for those indexed columns.

Without going into too much detail, you can think of an index as a way for a DB to make a “contains” (or “retrieve”) operation drop from O(n) (check all rows) to some much faster speed like O(log n) for example. The tradeoff is that you need more space for the index now.
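As a toy illustration of that tradeoff (not how a real database implements it; Postgres’s default index type is a B-tree), here is a linear scan next to a lookup against a sorted copy of the keys. The function names are made up for the example.

```python
from bisect import bisect_left


def contains_scan(rows: list[int], key: int) -> bool:
    # No index: check rows one by one, O(n) in the worst case.
    for row in rows:
        if row == key:
            return True
    return False


def build_index(rows: list[int]) -> list[int]:
    # The "index" is a sorted copy of the keys, which costs extra space.
    return sorted(rows)


def contains_indexed(index: list[int], key: int) -> bool:
    # Binary search over the sorted keys: O(log n) comparisons.
    pos = bisect_left(index, key)
    return pos < len(index) and index[pos] == key


product_ids = [42, 7, 19, 3, 88]
index = build_index(product_ids)
```

Both functions give the same answer; the indexed one just gets there with far fewer comparisons once the table is large.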

This comes with an added benefit that uniqueness constraints can be easily enforced on indexed columns if needed. And yes, your PK is indexed by default.
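Here is a sketch using Python’s built-in `sqlite3` as a stand-in for Postgres, since I don’t know your schema. The `final_products` table and `sku` column are assumptions. In Postgres the closest equivalent to `INSERT OR IGNORE` would be `INSERT ... ON CONFLICT DO NOTHING`.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a unique index on the product key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE final_products (sku TEXT, name TEXT)")
    # The unique index both speeds up lookups by sku and rejects duplicates.
    conn.execute(
        "CREATE UNIQUE INDEX idx_final_products_sku ON final_products (sku)"
    )
    return conn


def product_exists(conn: sqlite3.Connection, sku: str) -> bool:
    # With the index in place, this does not need to scan the whole table.
    row = conn.execute(
        "SELECT 1 FROM final_products WHERE sku = ? LIMIT 1", (sku,)
    ).fetchone()
    return row is not None


def add_product(conn: sqlite3.Connection, sku: str, name: str) -> bool:
    """Insert unless the sku is already present; return True if a row was added."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO final_products (sku, name) VALUES (?, ?)",
        (sku, name),
    )
    return cur.rowcount == 1


conn = make_db()
add_product(conn, "A-100", "Widget")
```

With this setup you don’t need to know everything that is in the final table up front. You attempt the insert and let the database tell you whether the product was already there.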

Read more about indexes in Postgres’s docs. It actually has pretty readable documentation from my experience. Or read a book on indexes, watch a video, etc. The concept is universal.

May you elaborate what you mean with read replicas? Storage in memory?

This highly depends on your needs. I’ll link PG’s docs on replication though.

If you’re migrating right now, I wouldn’t think about this too much. Replicas basically are duplicates of your database hosted on different servers (ideally in different warehouses, or even different regions if possible). Replicas work together to stay in sync, but depending on the kind of replica and the kind of query, any replica may be able to handle an incoming query (rather than a single central database).

If all you need are backups though, then replicas could be overkill. Either way, you usually don’t want all of your prod data stored on a single machine. I would talk to your management about backup requirements and potentially availability/uptime requirements.

TehPers@beehaw.org · 1 month ago

Pronouns are pointers. “Let us (let’s) move it over there.” Both “us” and “it” indirectly refer to something else by a new name. Like pointers, the pointees are defined by some context external to that sentence/statement (usually earlier sentences/statements or some other actions). The meaning of “us” and “it” can change as well in different contexts, and as such, those words are not bound to one value (and “rebinding” those words by changing contexts does not change the values they were previously bound to).
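The same idea in a few lines of Python, where names behave like those pronouns. The variables are invented for the example.

```python
box = ["books"]
chair = ["cushion"]

it = box            # in this context, "it" refers to the box
it.append("tape")   # acting on "it" acts on the box itself

it = chair          # new context: "it" now refers to the chair
# Rebinding "it" did not change the box it previously referred to.
```

After this runs, `box` still holds the books and the tape, and `chair` is untouched; only what `it` refers to has changed.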

TehPers@beehaw.org · 1 month ago

This seems like the same problem that lifetimes solve in Rust - tracking when values are no longer used and thus fall “out of scope”. Automated tooling should really be doing lifetime analysis of these values, and that seems to me like it would fall well out of scope of what GenAI can be trusted to do.

If this is such a huge problem, are you able to create finalizers that close the resources instead, or better abstractions for managing the LTs of these resources? I don’t write Java anymore, but this seems like a problem better solved by other tools.