What file formats are your existing datasets in? I also work on data processing in a scientific domain where HDF5 is a common format. Unfortunately DuckDB doesn't support HDF5 out of the box, and the existing HDF5 extension wasn't fast enough and didn't have the features I needed, so I made a new one based on the C++ extension template. I'd love to collaborate on it if anyone is interested.
That's really fascinating. Is your format open source? I don't know if I'd have overlapping needs for something like that (though I did investigate HDF5 early on, and it seemed very promising as a place to store our outputs), but I'd be curious to explore it and see what you're doing with it.
Right now we typically read from CSV or Excel, because that's what the scientists prefer to work with, for better or worse. There's a bit of Parquet kicking around. Our wrappers around import handling for DuckDB are very, very thin. DuckDB handles just about everything seamlessly.
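To give a rough idea of what "very thin" can look like, here is a minimal sketch, not our actual code. The `import_query` helper and the table and file names are made up for illustration; `read_csv_auto` and `read_parquet` are DuckDB's own table functions. Excel is left out on purpose, since how you read it depends on which DuckDB extension you have installed.

```python
from pathlib import Path

# Map file suffixes to DuckDB table functions. Only CSV and Parquet are
# covered here; Excel is deliberately omitted from this sketch.
_READERS = {".csv": "read_csv_auto", ".parquet": "read_parquet"}


def import_query(path, table):
    """Build the SQL that loads `path` into `table` (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix not in _READERS:
        raise ValueError(f"no reader configured for {suffix!r} files")
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    # Escape single quotes so the path is a valid SQL string literal.
    escaped = str(path).replace("'", "''")
    return (
        f"CREATE OR REPLACE TABLE {table} AS "
        f"SELECT * FROM {_READERS[suffix]}('{escaped}')"
    )


sql = import_query("runs/plate1.csv", "plate1")
```

You would then hand `sql` to a DuckDB connection, for example `con.execute(sql)`, and let DuckDB infer the column types.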
Why would 2x the transportation cost be intractable, but ruining the environment, killing life in the oceans, destroying the basis of our future food production, etc., be tractable?
In some ways I agree with you. For example, exploring the vast empty cold space of the universe (or Jupiter) doesn't appeal to me at all. I'm ready to bet our planet is already the most marvelous place in existence. Let's hope we don't destroy it too much.
If you have a central database, what benefits are you getting from edge compute? This is a serious question. As far as I understand, edge computing is good for reducing latency. If you have to communicate with a non-edge database anyway, is there any advantage to being on the edge?
Databases in Cloudflare are not edge. That is, they are tied to a central location. Where Workers help is with async, stateless tasks. There are a lot of these (authentication, email, notifications, etc.).
Well, you can cache stuff and also use read replicas. But yes, you are correct. For writes it doesn't help as much, to say the least. But some (most?) sites are 99.9% reads...
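To make the caching point concrete, here is a toy sketch of a read-through cache sitting in front of a central store. Everything in it is made up for illustration (the `ReadThroughCache` class, the dict standing in for the origin database, the 60-second TTL); it is not how any particular edge platform implements caching.

```python
import time


class ReadThroughCache:
    """Toy read-through cache with a TTL, standing in for an edge cache."""

    def __init__(self, fetch_from_origin, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}      # key -> (expires_at, value)
        self.origin_reads = 0   # how often we had to go to the central store

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]     # served locally, no round trip
        self.origin_reads += 1
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def write(self, key, value, write_to_origin):
        # Writes always go to the central store; the cache only invalidates.
        write_to_origin(key, value)
        self._entries.pop(key, None)


origin = {"page:home": "<h1>Hello</h1>"}   # stand-in for the central database
cache = ReadThroughCache(origin.__getitem__, ttl_seconds=60.0)

for _ in range(1000):
    cache.get("page:home")                 # only the first read hits the origin

cache.write("page:home", "<h1>Hi</h1>", origin.__setitem__)
latest = cache.get("page:home")            # re-fetched after invalidation
```

With a read-heavy workload, almost every `get` is answered locally, while every `write` still pays the full trip to the central location, which is the asymmetry described above.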
> Why would it plunge instead of re-focusing on things that are intrinsically important?
Because a lot of the economy is focused on creating and maintaining a surplus[1]: make people buy things that they don't really need, make them discard and replace things that they've been convinced are no longer worth it.
That's the current state, yes. But that doesn't mean it's the only possible state. If that wasteful consumption disappeared, would anyone be worse off? Hardly. But it would free up capacity to do more things that are actually useful and valuable. Sounds like a win to me.
What does that even mean? We're talking about an organization that serves billions of riders per year. Their passenger numbers have increased 20-30% since 2020, so even if their delays are bad, they're clearly not bad enough to make most people seek alternatives.
If I'm using your transit service and you take me on a several-hour detour without my consent, passing a dozen or more possible stops because "you're not cleared for that", you aren't serving your mandate and I'm never using your service again.
I might even pull the emergency brake before it gets that far and cause you more problems.