
Hi, I'm Evan, developer of Modern CSV. I started developing this out of frustration with how a certain spreadsheet program handles CSV files. I took inspiration from the text editors we've all come to love and this is the result. It's available for Windows, Mac, and Linux. Here are some of the features that make it stand out:

- Multiple cell/row/column editing
- Easy navigation between files
- Keyboard shortcut customization
- Command palette
- Read-only mode for super large files

With version 2, I'm adding some basic data analysis tools, some new themes, M1 compatibility for Mac users, and a whole bunch of editing commands, most of which are user-requested (well, technically all since I use my own product). The current beta version will work until June 25. It includes all Premium features without the need for a license. I'll be happy to hear any feedback you have about it!



I'm looking forward to trying this. I learned the hard way that Excel will make unwanted changes to CSV files without notice, like SSNs that start with zero having their leading zeros silently removed, which breaks things.


Yeah, Excel silently mangling CSV files has always infuriated me.

Just gave this program a try and it lets me save changes without removing any of the double quotes, yay!


I'm convinced that 90% of the people who rail against CSV for "not being a real format" are the ones that got burned by Excel. It has always shocked me how terrible Excel is at handling CSV given the sheer number of people who have this use case.

The crazy part is that it wouldn't be all that hard to handle it properly. Excel could examine each column and apply a uniform transform on each column instead of applying transforms on a cell by cell basis. They could even put in real effort and let the user choose the format for each column as part of the import process. You know, like being able to specify "text" for columns like SSNs or Credit cards that you aren't going to do math on anyway.
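
A minimal sketch of that per-column idea in Python, assuming the simplest possible rule: a column counts as numeric only if every value in it survives a round trip through int unchanged, and otherwise the whole column stays text. The function names and the sample data are made up for illustration; this is not how Excel or Modern CSV actually work.

```python
import csv
import io


def looks_like_safe_int(value: str) -> bool:
    """True only if converting to int and back reproduces the text exactly."""
    if not (value.isascii() and value.isdigit()):
        return False
    # Leading zeros fail the round trip; very long digit runs are treated as IDs.
    return str(int(value)) == value and len(value) <= 15


def infer_column_types(rows):
    """Decide one type per column instead of guessing cell by cell."""
    header, *data = rows
    types = {}
    for index, name in enumerate(header):
        column = [row[index] for row in data]
        is_int = bool(column) and all(looks_like_safe_int(v) for v in column)
        types[name] = "int" if is_int else "text"
    return types


# Hypothetical sample: one SSN-like column and one ordinary numeric column.
sample = "ssn,age\n012345678,41\n987654321,29\n"
rows = list(csv.reader(io.StringIO(sample)))
types = infer_column_types(rows)
print(types)
```

Because a single leading zero anywhere in the column forces the whole column to text, the second SSN is not converted either, which is the uniform behaviour described above.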


And dates! And ISSNs (I work with library data)! Ugh.


Optimist: the glass is 1/2 full

Pessimist: the glass is 1/2 empty

Excel: the glass is 1900-02-01

---

We've had bugs raised against our software because clients use meeting titles like 1-2-1 for supervisory reviews. When those are extracted for reporting and opened in Excel (they love to dump data into Excel no matter what report functions you include directly in the application), they get interpreted as a date, even when output as a quoted string in the CSV file. Try explaining to a client that we have formatted it correctly and Excel is reading it wrong…

(We could of course output Excel files directly, but some of them can't download office documents from web apps because of security policy at their end.)

Given how badly common tools mangle unambiguously correct CSV data, how many variations there are which make “unambiguously correct CSV data” a somewhat small proportion of what is out there, and how many tools not only expect but require mis-formatted data and/or output it, it is scary how much the format is relied upon in major industries.


"Given how badly common tools mangle unambiguously correct CSV data, how many variations there are which make “unambiguously correct CSV data” a somewhat small proportion of what is out there, and how many tools not only expect but require mis-formatted data and/or output it, it is scary how much the format is relied upon in major industries."

In a nutshell, CSV isn't a format. It's a family of formats, and it's not even a well-specified family of formats.

At least in semi-technical circles, I've had some success in using this to push back against CSV suggestions and get them to use better things. I'm sure that in non-technical circles I'd have zero success with this, though. It sure ain't a magic talisman you can use.

JSON isn't exactly a rigidly specified format, but it's got a lot less flex in it and I've not had as much trouble with it. My biggest problem is getting people who use dynamic scripting languages to consistently output either a string or a number, rather than "whatever the scripting language happened to decide based on which code paths happened to run". When the code casts a value back and forth without the author realizing it, what comes out is effectively random from my point of view.


> In a nutshell, CSV isn't a format. It's a family of formats, and it's not even a well-specified family of formats.

There is RFC 4180, though by 2005, when it came about, there were already so many variants in use that it became just one of a great many.

I try not to push back too hard about CSV, for fear of “well, there is this XML format that is supported”! (bad enough in itself, but sometimes the “XML format” is even more poorly specified than the client's CSV edge cases which we are expected to guess).

JSON is nice as long as, as you say, strings are real strings and numbers are real numbers.

Oh, and dates/times are in an RFC3339 (or ISO8601) numeric (no localised month names, etc.) format either in UTC or with the timezone always specified, as strings (though at a pinch I'll accept a posix time_t for datetime if based on UTC). Not specifying how to handle dates/times/both is the major problem with JSON in my experience.
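
A small sketch of what that convention could look like on the producing side, using only the Python standard library. The `encode_event` helper and the field names are hypothetical; the point is just that the title stays a string, and the timestamp goes out as a numeric ISO 8601 / RFC 3339 style string normalised to UTC with its offset always present.

```python
import json
from datetime import datetime, timezone


def encode_event(title: str, start: datetime) -> str:
    """Serialise an event with a string title and a UTC timestamp string."""
    if start.tzinfo is None:
        # A naive datetime has no timezone, so the reader would have to guess.
        raise ValueError("refusing to serialise a datetime without a timezone")
    return json.dumps(
        {"title": str(title), "start": start.astimezone(timezone.utc).isoformat()}
    )


payload = encode_event("1-2-1", datetime(2021, 6, 1, 9, 30, tzinfo=timezone.utc))
decoded = json.loads(payload)
start = datetime.fromisoformat(decoded["start"])
print(payload)
```

Rejecting naive datetimes at the boundary is a design choice in this sketch: it moves the "which timezone was that?" question to the producer, where it can still be answered.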


Bug-for-bug compatible is a direct consequence of Postel's law: https://en.wikipedia.org/wiki/Robustness_principle


And credit card numbers.

string -> interpreted as a number -> displayed in scientific notation -> saved to disk with the last 3 or 4 digits zeroed
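
That chain can be imitated in a few lines of Python. This is only a simulation under stated assumptions: it models the spreadsheet as parsing the text into a double and then writing back whatever a scientific-notation display with a fixed number of significant digits shows. Excel documents a 15-significant-digit limit; the narrower 12-digit case is a hypothetical display width, and `through_spreadsheet` is an invented name.

```python
def through_spreadsheet(text: str, significant_digits: int = 15) -> str:
    """Simulate: string -> number -> scientific notation -> saved back as digits."""
    number = float(text)  # the cell is interpreted as a number
    displayed = f"{number:.{significant_digits - 1}e}"  # shown in scientific notation
    return str(int(float(displayed)))  # what ends up on disk


card = "4111111111111111"  # a well-known 16-digit test card number
print(through_spreadsheet(card))      # 15 significant digits kept
print(through_spreadsheet(card, 12))  # a narrower, hypothetical display width
```

With 15 significant digits only the final digit of a 16-digit number is lost; losing the last three or four digits corresponds to saving a value that was displayed with fewer digits.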


Why did you choose an expiring demo? I don't think I've seen this model outside video games.


That used to be the dominant model for getting software exposed to as many users as possible, video games or not. It’s the classic shareware/trialware model. Although, with internet downloads, the “sharing” aspect isn’t really that important anymore.


Oh man, I just realized/remembered that shareware was meant to be shared from one person to the other, as in "I have this cool software, here it is on a floppy", and then you had to contact the developer for a license.

Such a magical time that was.


"How can you be sure the copy wasn't modified? Why isn't the EXE signed?"


I want as many eyeballs on it as possible without needing a license, but I also don't want it to quite be a free giveaway.

I suppose I'm a trailblazer, after the video game space and JetBrains, of course.


Isn't it basically a trial?


JetBrains uses this for their EAP software.


I tried this and it couldn't open a 6MB CSV file with 1920 columns and 1080 rows. Or maybe it could eventually, if I were willing to wait long enough. Notepad++ can open this file almost instantly.

Not impressed.


I figured it out. It's due to the Auto-Fit Column Width setting, which is enabled by default. Most files don't have a ton of columns, so that's fine, but with 1920 columns it can really slow things down. When I loaded the file with the setting enabled, it took about 40 seconds. When I tried again with the setting disabled, it took about 1 second.

For now, I may put a band-aid on it with a popup asking if you want to disable the feature when there are a lot of columns. I have some ideas on how to make it more efficient. I'll see what I can do before the full release.


Thanks. I changed the setting, but it didn't help much. It's smart to optimize for small files, since those are probably 99% of your use cases.

You might want to remove the part of your web site that says "View Large Files Quickly" because I was very excited when I read this and then very disappointed when it wasn't fast at all.

Notepad++ can open the file instantly and allows me to move around very quickly. You'd probably need something close to that level of performance before you can claim that your product is fast for large files.

Gotta eat your own dog food, yes?


I made a file of 9s and it took about a second. I made a video: https://moderncsv.com/videos-temp/load-image9.mp4

You have to change the setting in the Settings file (Edit Settings command) under the "User Value" column. Changing it under the "Default Value" column is a common mistake that's really my fault, so I intend to rectify it soon.

If it still doesn't perform like that for you, let me know.


It's handled much larger files than that, so if it's something you don't mind sharing, you can send the file at https://www.moderncsv.com/report-a-bug. It is a beta version, so I'm trying to fix all these issues now.

Edit: On second thought, it's probably grayscale intensity hex values of a 1920x1080 image. I can reproduce that myself. Feel free to send me your file if you want, but it might not be necessary.


Yes, it's just a .CSV file of a 1920x1080 image. Every value in the .CSV file is 9.

9,9 ... until 1920 columns.

...

until 1080 rows.
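
For anyone who wants to reproduce a file of this shape, here is a short Python sketch that writes a 1920-column by 1080-row grid where every cell is 9. The function name and defaults are assumptions for illustration, and the result may not match the original file byte for byte: with single-character cells and "\n" line endings it comes to about 4 MB, not 6MB.

```python
import csv
import io


def write_constant_grid(stream, columns=1920, rows=1080, value="9"):
    """Write `rows` lines, each holding `columns` copies of `value`."""
    writer = csv.writer(stream, lineterminator="\n")
    row = [value] * columns
    for _ in range(rows):
        writer.writerow(row)


# Build the grid in memory so the sketch has no side effects on disk.
buffer = io.StringIO()
write_constant_grid(buffer)
text = buffer.getvalue()
print(len(text), "characters")
```

To get an actual file, pass a handle opened with `open("grid9.csv", "w", newline="")` instead of the in-memory buffer; the file name is arbitrary.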


I've been looking for a lightweight CSV reader for some time now. Thanks for making this.


CSVFileView is another option, https://www.nirsoft.net/utils/csv_file_view.html , but it says it has problems with very large files.


I've been using ModernCSV for a while to handle read-only CSVs (mainly for sanity checking) and it's been fantastic.

Version 2 looks great. It's a hard balance to add features without becoming bloated, so it will be interesting to see what it's like going forward.

(I love the product, not affiliated, just been using it every day since November 2020 as a happily paid user and recommend it to anyone who has the "pleasure" of working with CSVs)


FYI, Symantec Endpoint Protection flags this as having "WS.Reputation.1" spyware. Which I think translates to them not being paid off sufficiently.


I'm quietly waiting for my IT department to swoop in and chastise me for downloading and running it. Worth it.


Python scripting that treats the file data as a pandas DataFrame would be very interesting...


I'm not sure I see a huge advantage from tight integration that you can't get by simply clicking "Save" and then Alt-Tabbing to your Python script/notebook/whatever and re-running it against the file you just saved. If you're feeling really feisty, see about integrating a file watch program to do it as soon as you save. It's not like the editor is going to be using a pandas layout internally that it can just pass to Python without marshaling.
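
A rough sketch of the file-watch approach using only the Python standard library. It polls the file's modification time and size instead of using OS-level notifications, which is a simplification; `FileWatcher` and `rerun_on_save` are hypothetical names, and the command to run is whatever script you already have.

```python
import os
import subprocess
import tempfile
import time


class FileWatcher:
    """Detect saves by comparing (mtime, size) between polls."""

    def __init__(self, path):
        self.path = path
        self.last = self._stamp()

    def _stamp(self):
        info = os.stat(self.path)
        return (info.st_mtime_ns, info.st_size)

    def poll(self) -> bool:
        """Return True if the file changed since the previous poll."""
        current = self._stamp()
        changed = current != self.last
        self.last = current
        return changed


def rerun_on_save(path, command, interval=0.5):
    """Run `command` (a list of arguments) every time `path` is saved."""
    watcher = FileWatcher(path)
    while True:
        time.sleep(interval)
        if watcher.poll():
            subprocess.run(command, check=False)


# Tiny demonstration against a temporary file, without starting the loop.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "data.csv")
    with open(path, "w") as handle:
        handle.write("a,b\n1,2\n")
    watcher = FileWatcher(path)
    before = watcher.poll()   # nothing has changed yet
    with open(path, "a") as handle:
        handle.write("3,4\n")  # simulate the editor saving
    after = watcher.poll()    # the save is noticed
    settled = watcher.poll()  # and reported only once
```

In practice something like `rerun_on_save("data.csv", ["python", "analysis.py"])` would sit in a terminal next to the editor; both file names there are placeholders.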


At some point (maybe version 3), I'd like to integrate Python to allow users to make their own commands. This would give them access to entire rows, columns, ranges, etc.


This app almost in its entirety has been a growing note in my idea vault for years now due to identical frustrations. Thank you! Very excited to try it.


Congrats. v1 is very good.


A nice addition would be the ability to plot data.



