Certain files from the StatFi database fail to import #3

Open
ljleppan opened this issue Jul 19, 2017 · 0 comments

Certain files from the StatFi database, for example 046_syyttr_tau_111_fi.px, fail to import. The problem persists whether the file is downloaded using the Python script or via a browser.

The issue is that the dimensions of the generated row and column indices do not match the size of the DATA section in the input file. While the data is being read, some of the data points also appear to contain tens of thousands of null bytes in their string representations.
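The mismatch described above can be checked independently of any px library. The following is a minimal sketch, not the importer's actual logic: `expected_vs_actual` is a hypothetical helper, and it assumes a simplified PX layout in which each `VALUES("...")` keyword starts a line, no quoted value contains a semicolon or is split into several quoted pieces, and the DATA section is whitespace-separated.

```python
import re
from math import prod
from typing import Tuple


def expected_vs_actual(px_text: str) -> Tuple[int, int]:
    """Return (cells implied by the VALUES keywords, cells found in DATA)."""
    # Count the quoted values listed under each VALUES("...") keyword.
    # Assumption: no value contains ';', so a non-greedy match up to the
    # first ';' covers the whole keyword body.
    sizes = [
        len(re.findall(r'"[^"]*"', body))
        for body in re.findall(
            r'^VALUES\("[^"]*"\)=(.*?);', px_text, flags=re.M | re.S
        )
    ]
    # Take everything between DATA= and the next ';' and split on whitespace.
    match = re.search(r'^DATA=(.*?);', px_text, flags=re.M | re.S)
    cells = match.group(1).split() if match else []
    return prod(sizes), len(cells)


sample = (
    'VALUES("Year")="2015","2016";\n'
    'VALUES("Sex")="M","F","T";\n'
    'DATA=\n'
    '1 2 3\n'
    '4 5 6;\n'
)
expected, actual = expected_vs_actual(sample)
print(expected, actual)
```

If the two numbers differ for a real file, the DATA section does not hold one value per combination of dimension values, at least under the simplifying assumptions above.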

A similar issue is encountered when using the pxR package to try to process the file:

> my.px.object <- read.px( "database/StatFin/oik/syyttr/046_syyttr_tau_111_fi.px" )
Error in read.px("database/StatFin/oik/syyttr/046_syyttr_tau_111_fi.px") : 
  The input file is malformed: data and varnames length differ
In addition: Warning message:
In scan(filename, what = "character", sep = "\n", quiet = TRUE,  :
  embedded nul(s) found in input

As the R library suffers from the same issue, I'm inclined to believe that the problem lies with the input file rather than with the libraries themselves. A more in-depth investigation is needed to confirm this.
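One way to start that investigation is to look at the raw bytes of the file before any parser touches them. The sketch below is only an illustration: `find_nul_runs` is a hypothetical helper name, and the file path in the usage note is the one from the R session above.

```python
import re
from typing import List, Tuple


def find_nul_runs(data: bytes) -> List[Tuple[int, int]]:
    """Return (offset, length) for every run of consecutive NUL bytes."""
    return [
        (m.start(), m.end() - m.start())
        for m in re.finditer(rb"\x00+", data)
    ]


# Small in-memory example: one run of three NULs and one single NUL.
example = b"1 2\x00\x00\x00 3\x00"
print(find_nul_runs(example))
```

For the failing file, the same function could be applied to the result of `open("database/StatFin/oik/syyttr/046_syyttr_tau_111_fi.px", "rb").read()`. Long runs of NUL bytes in the raw file would suggest that the file itself is damaged, rather than the reading code.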

ljleppan changed the title from "BUG: Certain files from the StatFi database fail to import" to "Certain files from the StatFi database fail to import" on Jul 21, 2017