The inferencing logic needs to sample the file, so (1) the file path must be determined at compile time and (2) the file must be available to be read at compile time. If neither condition is true---like the filename is a runtime parameter, for example---then the user must supply the type in advance.
There is no magic here. No language can guess the type of anything without seeing what the thing is.
Yeah, i think that's what limits the utility of such systems. Polars does typechecking at query planning time. So before you really do computation. I don't expect that much can improve over this model due to the aforementioned limitations.
I think needing network access or file access at compile time is a semi-hard blocker for statically typed dataframes.