As a User, I can stream reports in a .csv format #68

gabrielleberanger · 2020-12-16T14:44:13Z

WHY
Today, the only output stream format available is .njson (i.e. a file with n lines, each line being a dictionary).
This format has two downsides:

It makes preliminary analysis of the output data harder: .njson files cannot be forwarded directly to non-technical users, and cannot be loaded into a pandas DataFrame without preliminary transformations.
Some APIs natively return data in a .csv format: in these cases, we have to convert each line to a dictionary, which can cause parsing errors.

HOW
Create a .csv streamer.
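
As a rough illustration of what such a streamer could look like, here is a minimal sketch using only the Python standard library. The function name `stream_csv` and its signature are assumptions made for this example; they are not part of the existing codebase. It takes records as dictionaries and yields CSV text one line at a time, leaving quoting to the `csv` module.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List


def stream_csv(records: Iterable[Dict[str, object]], fieldnames: List[str]) -> Iterator[str]:
    """Yield CSV text line by line: the header first, then one row per record.

    Hypothetical helper, not the project's actual API.
    """
    buffer = io.StringIO()
    # Keys outside `fieldnames` are ignored; missing keys become empty cells.
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, extrasaction="ignore", lineterminator="\n"
    )

    def flush() -> str:
        # Return what was just written and reset the buffer for the next row.
        line = buffer.getvalue()
        buffer.seek(0)
        buffer.truncate(0)
        return line

    writer.writeheader()
    yield flush()
    for record in records:
        writer.writerow(record)
        yield flush()
```

Because the rows are produced lazily, a writer could forward them to a file or bucket without holding the whole report in memory.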


benoitgoujon · 2021-01-08T19:13:24Z

Hi there,

I've started working on this issue and I've noticed that we may encounter a problem with the current software architecture.

Currently, the format of the destination file is enforced: we get a .njson file by default. There is a Pickle option, but it is never used in the code. If we want to introduce a new format like CSV, we must let users decide which format they prefer. The intuitive place for this would be an option on the writer command, something like write_gcs --gcs-file-format csv.

BUT, to do so, we need to change the stream we use (CSVStream vs JSONStream), and this choice has to be made in the reader's read() function. That would force us to add the file format as a reader option, something like read_dv360 --dv360-file-format csv, which is less intuitive than a writer option because it mixes up reader and writer options.

Is it acceptable though?

What is your opinion regarding this issue?
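
To make the trade-off concrete, here is a minimal sketch of the reader-side option. Only the names CSVStream and JSONStream come from the discussion above; their interfaces, the `FakeReader` class, and the `file_format` parameter are assumptions invented for illustration, not the project's actual code.

```python
import csv
import io
import json
from typing import Dict, Iterator, List


class JSONStream:
    """Hypothetical stream: one JSON object per line."""
    extension = "njson"

    def __init__(self, name: str, records: List[Dict[str, object]]):
        self.name = name
        self.records = records

    def lines(self) -> Iterator[str]:
        for record in self.records:
            yield json.dumps(record) + "\n"


class CSVStream(JSONStream):
    """Hypothetical stream: header taken from the first record's keys."""
    extension = "csv"

    def lines(self) -> Iterator[str]:
        buffer = io.StringIO()
        writer = None
        for record in self.records:
            if writer is None:
                writer = csv.DictWriter(buffer, fieldnames=list(record), lineterminator="\n")
                writer.writeheader()  # the first chunk carries header + first row
            writer.writerow(record)
            yield buffer.getvalue()
            buffer.seek(0)
            buffer.truncate(0)


STREAMS = {"njson": JSONStream, "csv": CSVStream}


class FakeReader:
    """Stand-in for a reader; the format option lives here, not on the writer."""

    def __init__(self, records: List[Dict[str, object]], file_format: str = "njson"):
        if file_format not in STREAMS:
            raise ValueError(f"Unsupported file format: {file_format}")
        self.records = records
        self.file_format = file_format

    def read(self) -> Iterator[JSONStream]:
        # The stream class is picked inside read(), as described above.
        yield STREAMS[self.file_format]("report", self.records)
```

In this sketch the writer never needs to know the format: it could derive the file extension from `stream.extension`. The cost is the one raised above, namely that every reader has to expose a format option.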

gabrielleberanger added the new feature Creating a new feature label Dec 16, 2020

gabrielleberanger added the P1 1st priority label Dec 16, 2020

benoitgoujon self-assigned this Jan 8, 2021

gabrielleberanger added P2 2nd priority and removed P1 1st priority labels Jan 25, 2021
