-
Notifications
You must be signed in to change notification settings - Fork 8
Home
gluestick is still under heavy maintenance. Functions are subject to change behavior, and documentation will be updated accordingly.
This document serves as an initial barebones documentation - it will be improved in the near future.
Convenience method to read a set of CSV files in a folder, based on the read_csv(). This method assumes that the files are being pulled in a stream and follow a naming convention with the stream/ entity / table name is the first word in the file name for example; Account-20200811T121507.csv is for an entity called "Account".
read_csv_folder(path, converters={}, index_cols={})
:param path: the folder directory
:param converters: a dictionary with an array of converters that are passed to
read_csv, the key of the dictionary is the name of the entity.
:param index_cols: a dictionary with an array of
index_cols, the key of the dictionary is the name of the entity.
:return: a dict of pandas.DataFrames. the keys of which are the entity names
Examples
IN[31]: entity_data = read_csv_folder(CSV_FOLDER_PATH, index_cols={'Invoice': 'DocNumber'},
converters={'Invoice': {'Line': ast.literal_eval, 'CustomField': ast.literal_eval,
'Categories': ast.literal_eval}})
IN[32]: df = entity_data['Account']
Convert a column with a JSON tuple in it to two column (typically a name, value pair)
json_tuple_to_cols(df, column_name, col_config={'cols': {'key_prop': 'Name', 'value_prop': 'Value'},
'look_up': {'key_prop': 'name', 'value_prop': 'value'}}):
:param df: the data frame
:param column_name: column with the json tuple
:param col_config: conversion config
:return: a modified dataframe
Examples
IN[51]: qb_lookup_keys = {'key_prop': 'name', 'value_prop': 'value'}
IN[52]: invoices = json_tuple_to_cols(invoices, 'Line.DiscountLineDetail.DiscountAccountRef',
col_config={'cols': {'key_prop': 'Discount Details', 'value_prop': 'Discount %'},
'look_up': qb_lookup_keys})
Renames columns in DataFrame using a json format
rename(df, target_columns):
:param df : dataframe
:paramtarget_columns: the columns to rename as a dictionary
.. versionadded:: 1.0.0
df - a modified data frame
Examples
IN[52]: rename(df, )
Out[52]:
{'dict1.c': 1,
'dict1.d': 2,
'flat1': 1,
'nested.d': 2,
'nested.e.c': 1,
'nested.e.d': 2}
Explodes into multiple rows and expands into columns based on a column that has an array of JSON objects in it.
def explode_json_to_rows(df, column_name, drop=True, **kwargs):
:param df : the dataframe
:param column_name: the column that has the JSON in it.
:param drop: drop the source column in the result. Default is True
.. versionadded:: 1.0.0
df - a new data frame with the JSON line expanded into columns and rows
Examples
IN[52]: explode_json_to_rows(df, df['Line'] )
an example of the line would be:
[{"Id":"1","LineNum":"1","Amount":275,"DetailType":"SalesItemLineDetail","SalesItemLineDetail":{"ItemRef":{"value":"5","name":"Rock Fountain"},"ItemAccountRef":{"value":"79","name":"Sales of Product Income"},"TaxCodeRef":{"value":"TAX","name":null}},"SubTotalLineDetail":null,"DiscountLineDetail":null},{"Id":"2","LineNum":"2","Amount":12.75,"DetailType":"SalesItemLineDetail","SalesItemLineDetail":{"ItemRef":{"value":"11","name":"Pump"},"ItemAccountRef":{"value":"79","name":"Sales of Product Income"},"TaxCodeRef":{"value":"TAX","name":null}},"SubTotalLineDetail":null,"DiscountLineDetail":null},{"Id":"3","LineNum":"3","Amount":47.5,"DetailType":"SalesItemLineDetail","SalesItemLineDetail":{"ItemRef":{"value":"3","name":"Concrete"},"ItemAccountRef":{"value":"48","name":"Landscaping Services:Job Materials:Fountains and Garden Lighting"},"TaxCodeRef":{"value":"TAX","name":null}},"SubTotalLineDetail":null,"DiscountLineDetail":null},{"Id":null,"LineNum":null,"Amount":335.25,"DetailType":"SubTotalLineDetail","SalesItemLineDetail":null,"SubTotalLineDetail":{},"DiscountLineDetail":null}]
Out[52]:
Line.Id Line.LineNum Line.Amount Line.DetailType
Index
1037 1 1 275.00 SalesItemLineDetail
1037 2 2 12.75 SalesItemLineDetail
1037 3 3 47.50 SalesItemLineDetail
1037 None None 335.25 SubTotalLineDetail
1036 1 1 50.00 SalesItemLineDetail