
> Databases are generally known for their good performance

If the data is on a filesystem, then sed, grep and cut pipelines will likely be your fastest option (Yahoo! processed petabytes of logfiles that way for decades).
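For anyone who hasn't seen one, here is a minimal sketch of that kind of pipeline. The log layout (timestamp, status, path, space-separated) and the sample lines are made up for illustration, not taken from any real system:

```shell
# Hypothetical sample log: timestamp, HTTP status, request path.
log=$(mktemp)
cat > "$log" <<'EOF'
2024-01-01T00:00:01 200 /index.html
2024-01-01T00:00:02 500 /api/items
2024-01-01T00:00:03 200 /api/items
2024-01-01T00:00:04 500 /api/items
2024-01-01T00:00:05 500 /login
EOF

# Keep only lines with status 500, cut out the path column,
# count occurrences per path, sort by count (highest first),
# and strip the leading padding that uniq -c adds.
result=$(grep ' 500 ' "$log" | cut -d' ' -f3 | sort | uniq -c | sort -rn | sed 's/^ *//')
echo "$result"

# Clean up the temporary file.
rm -f "$log"
```

Every stage streams line by line, so nothing has to be loaded or indexed first; that is roughly the whole appeal compared to an ETL step.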

If the data is already inside a database table and indexed well, querying it there could be fast enough. But generally speaking, the ETL step is often the bottleneck. And DBAs are $$$$ compared to "the UNIX way."

Source: DBA


