error in ffdfrbind.fill() #56

Open
JakubKomarek opened this issue Aug 21, 2019 · 7 comments

@JakubKomarek

I am trying to rbind two ffdf objects, following the example from the CRAN documentation. However, I always get this error:

Error in if (by < 1) stop("'by' must be > 0") :
missing value where TRUE/FALSE needed
In addition: Warning message:
In chunk.default(from = 1L, to = 150L, by = c(logical = 46116860184273880), :
NAs introduced by coercion to integer range
I have also tried using ffbase2, creating tbl.ffdf objects and then joining both data frames with dplyr, but the same error occurs.

Any advice will be appreciated.

library(ffbase)  # also attaches ff
x <- ffdfrbind.fill( as.ffdf(iris),
as.ffdf(iris[, c("Sepal.Length", "Sepal.Width"
, "Petal.Length")]))

@edwindj
Owner

edwindj commented Aug 22, 2019

Thanks for filing the issue:

x <- ffdfrbind.fill( as.ffdf(iris),
as.ffdf(iris[, c("Sepal.Length", "Sepal.Width"
, "Petal.Length")]))

is working on the machines I tested on (Linux and Windows).

What happens if you manually set the missing columns to NA and do an ffdfappend?

x1 <- as.ffdf(iris)
x2 <- as.ffdf(iris[, c("Sepal.Length", "Sepal.Width"
, "Petal.Length")])
x2$Petal.Width <- ff(NA, vmode = "logical", length = nrow(x2))
x2$Species <- ff(NA, vmode = "logical", length = nrow(x2))

x <- ffdfappend(x1, x2)

Still not working?

@JakubKomarek
Author

JakubKomarek commented Aug 22, 2019 via email

@JakubKomarek
Author

JakubKomarek commented Aug 26, 2019 via email

@edwindj
Owner

edwindj commented Aug 26, 2019 via email

@JakubKomarek
Author

JakubKomarek commented Aug 26, 2019 via email

@JakubKomarek
Author

JakubKomarek commented Aug 26, 2019 via email

@edwindj
Owner

edwindj commented Aug 28, 2019

I cannot reproduce the bug on Rhub (which runs on Windows 2008 SP2), but don't despair...

Technically it is in the realm of ff (and not ffbase), but I do have a hunch what the problem might be, based on the error message and a glance at the ff code (which is not mine).

ff uses chunking to process large vectors and data.frames. The size of a chunk is determined by the option "ffbatchbytes". It seems that on your Windows 10 machine(s) the value for the option isn't set correctly. Maybe that is because you are using 32-bit R (so one option is to switch to 64-bit).

ff sets this value automatically when library(ff) is called (see the following code):

copied from ff:::.onLoad()

    if (is.null(getOption("ffmaxbytes"))) {
        if (.Platform$OS.type == "windows") {
            if (getRversion() >= "2.6.0") 
                options(ffmaxbytes = 0.5 * memory.limit() * (1024^2))
            else options(ffmaxbytes = 0.5 * memory.limit())
        }
        else {
            options(ffmaxbytes = 0.5 * 1024^3)
        }
    }
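
As a rough illustration of how an oversized byte option could end in the error from the original report, the base R sketch below takes the `by` value printed in the warning and coerces it to integer. It does not call ff at all; the assumption (not verified against the ff sources) is that the chunk length passes through an integer coercion before the `by < 1` check quoted in the error message.

```r
# Value copied from the warning in the original report
by <- 46116860184273880

# A double beyond .Machine$integer.max becomes NA when coerced to integer
# (with the warning "NAs introduced by coercion to integer range")
by_int <- suppressWarnings(as.integer(by))
is.na(by_int)

# A comparison on NA yields NA, and if() cannot branch on NA
res <- tryCatch(
  {
    if (by_int < 1) stop("'by' must be > 0")
    "ok"
  },
  error = function(e) conditionMessage(e)
)
res
```

If that assumption holds, any option value that keeps the chunk length within integer range should avoid the error.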

I suggest you set the options(ffmaxbytes) manually and try to run the examples again.

# e.g. 500MB
options(ffmaxbytes =  500 * (1024^2))
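
A minimal sketch of checking and capping the options before loading ff follows. `cap_option_bytes` is a hypothetical helper, not part of ff or ffbase, and the cap values are illustrative assumptions rather than recommended settings. It covers both `ffmaxbytes` (from the snippet above) and `ffbatchbytes` (the option named earlier in this thread), since either could be affected.

```r
# Hypothetical helper: cap a byte-size option if it is unset, NA, or too large
cap_option_bytes <- function(name, cap) {
  val <- getOption(name)
  if (is.null(val) || is.na(val) || val > cap) {
    new_opt <- list(cap)
    names(new_opt) <- name
    options(new_opt)  # options() accepts a single named list
  }
  getOption(name)
}

# Illustrative caps (assumptions): 16 MB per batch, 500 MB maximum
cap_option_bytes("ffbatchbytes", 16 * 1024^2)
cap_option_bytes("ffmaxbytes", 500 * 1024^2)
```

Because the `.onLoad()` code above only sets `ffmaxbytes` when the option is still NULL, values set this way before `library(ff)` should be left alone; running it after loading ff simply overrides whatever was set automatically.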
