Automatically fall back to Stata file usage if parquet file not found #21

Open
kylebarron opened this issue Apr 4, 2018 · 2 comments

Comments

@kylebarron
Owner

No description provided.

@kylebarron changed the title from "Automatically fall back to Stata usage if parquet file not found" to "Automatically fall back to Stata file usage if parquet file not found" on Apr 4, 2018
@kylebarron
Owner Author

Note that the Stata files don't have harmonized names! The fallback will need a resolver function to map the harmonized names onto the names used in each Stata file.
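
A rough sketch of what the fallback plus resolver could look like, for discussion only. `pick_data_file`, `resolve_stata_names`, and the `name_map` dict are hypothetical names that do not exist in the package, and the real per-year name mappings are not given in this issue.

```python
from pathlib import Path


def pick_data_file(parquet_path, stata_path):
    """Return (path, file_format), preferring parquet and falling back to Stata.

    Hypothetical helper: both paths are supplied by the caller.
    """
    parquet_path, stata_path = Path(parquet_path), Path(stata_path)
    if parquet_path.is_file():
        return parquet_path, 'parquet'
    if stata_path.is_file():
        return stata_path, 'stata'
    raise FileNotFoundError(
        f'Neither {parquet_path} nor {stata_path} exists')


def resolve_stata_names(harmonized_vars, name_map):
    """Translate harmonized variable names into the names used in a Stata file.

    name_map is a hypothetical dict {harmonized_name: stata_name}. Names
    without an entry are assumed to be the same in the Stata file.
    """
    return [name_map.get(var, var) for var in harmonized_vars]
```

When the format comes back as `'stata'`, the resolved names could be passed to something like `pandas.read_stata(path, columns=...)` and the resulting columns renamed back to the harmonized names, though that part is untested here.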

@kylebarron
Owner Author

To put in utils later:

    # def _get_variables_to_import(self, year, data_type, import_vars):
    #     """Get list of variable names to import from given file
    #
    #     NOTE Not currently used
    #
    #     Returns:
    #         List of strings of variable names to import from file
    #     """
    #
    #     if not isinstance(year, int):
    #         raise TypeError('year must be type int')
    #
    #     allowed_data_types = [
    #         'carc', 'carl', 'den', 'ipc', 'ipr', 'med', 'opc', 'opr', 'bsfab',
    #         'bsfcc', 'bsfcu', 'bsfd']
    #     if data_type not in allowed_data_types:
    #         msg = f'data_type must be one of: {allowed_data_types}'
    #         raise ValueError(msg)
    #
    #     import_vars = list(set(import_vars))
    #
    #     cols = fp.ParquetFile(self._fpath(self.percent, year, data_type)).columns
    #     tokeep_list = []
    #
    #     for var in import_vars[:]:
    #         # Keep columns that match text exactly
    #         if var in cols:
    #             tokeep_list.append(var)
    #             import_vars.remove(var)
    #
    #         # Then perform regex against other variables
    #         # else:
    #         #     re.search
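
One way the unfinished regex step above might be completed, as a standalone sketch rather than the intended implementation. `match_columns` is a hypothetical name, and treating every non-exact entry of `import_vars` as a regular expression passed to `re.search` is an assumption; the snippet above only hints at it.

```python
import re


def match_columns(import_vars, cols):
    """Return columns to keep: exact name matches first, then regex matches.

    Hypothetical helper. Any requested name that is not an exact column
    name is assumed to be a regular expression.
    """
    cols = list(cols)
    tokeep = []
    patterns = []

    # dict.fromkeys drops duplicate requests while keeping their order
    for var in dict.fromkeys(import_vars):
        if var in cols:
            # Keep columns that match the requested text exactly
            tokeep.append(var)
        else:
            patterns.append(var)

    # Then run each remaining request as a regex against the column names
    for pattern in patterns:
        regex = re.compile(pattern)
        for col in cols:
            if col not in tokeep and regex.search(col):
                tokeep.append(col)

    return tokeep
```

Because `re.search` matches anywhere in the column name, a caller who wants whole-name matches would need to anchor the pattern with `^` and `$`. The column list would come from `fp.ParquetFile(...).columns` as in the snippet above, or from the Stata file in the fallback case.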
