Memory profiling #33

Open
wlandau opened this issue Jan 8, 2025 · 4 comments

Comments

@wlandau
Member

wlandau commented Jan 8, 2025

I received a request to support memory profiling in proffer (r-prof/proffer#32) and I am trying to figure out why proffer::pprof(memory.profiling = TRUE) does not show memory profiling in the pprof dashboard. I am wondering if profile is already capturing the memory data. I have a small example:

path <- tempfile()
Rprof(filename = path, memory.profiling = TRUE)
n <- 1e3
x <- data.frame(x = rnorm(n), y = rnorm(n))
for (i in seq_len(n)) {
  x[i, ] <- x[i, ] + 1
}
Rprof(filename = NULL)

It looks like Rprof() is correctly recording memory data.

head(summaryRprof(filename = path, memory = "both")$by.total)
#>                      total.time total.pct mem.total self.time self.pct
#> "Ops.data.frame"           0.10     83.33      89.6      0.00     0.00
#> "as.data.frame.list"       0.08     66.67      58.4      0.02    16.67
#> "as.data.frame"            0.08     66.67      58.4      0.02    16.67
#> "data.frame"               0.08     66.67      58.4      0.00     0.00
#> "<Anonymous>"              0.06     50.00      58.4      0.00     0.00
#> "do.call"                  0.06     50.00      58.4      0.00     0.00

But I am having trouble locating memory data in the output of profile::read_rprof(), which proffer uses to convert to pprof format.

packageVersion("profile")
#> [1] ‘1.0.3.9019’

samples <- profile::read_rprof(path)
#> Warning message:
#> Removing unexpected incomplete sampling information.

str(samples)
#> List of 6
#>  $ meta        : tibble [1 × 2] (S3: tbl_df/tbl/data.frame)
#>   ..$ key  : chr "version"
#>   ..$ value: chr "1.0"
#>  $ sample_types: tibble [1 × 2] (S3: tbl_df/tbl/data.frame)
#>   ..$ type: chr "samples"
#>   ..$ unit: chr "count"
#>  $ samples     : tibble [6 × 2] (S3: tbl_df/tbl/data.frame)
#>   ..$ value    : int [1:6] 1 1 1 1 1 1
#>   ..$ locations:List of 6
#>   .. ..$ 1: tibble [1 × 1] (S3: tbl_df/tbl/data.frame)
#>   .. .. ..$ location_id: int 4
#>   .. ..$ 2: tibble [13 × 1] (S3: tbl_df/tbl/data.frame)
#>   .. .. ..$ location_id: int [1:13] 5 11 16 12 14 9 7 6 13 8 ...
#>   .. ..$ 3: tibble [2 × 1] (S3: tbl_df/tbl/data.frame)
#>   .. .. ..$ location_id: int [1:2] 2 1
#>   .. ..$ 4: tibble [3 × 1] (S3: tbl_df/tbl/data.frame)
#>   .. .. ..$ location_id: int [1:3] 3 2 1
#>   .. ..$ 5: tibble [4 × 1] (S3: tbl_df/tbl/data.frame)
#>   .. .. ..$ location_id: int [1:4] 18 17 10 15
#>   .. ..$ 6: tibble [8 × 1] (S3: tbl_df/tbl/data.frame)
#>   .. .. ..$ location_id: int [1:8] 9 7 6 13 8 7 10 15
#>  $ locations   : tibble [18 × 3] (S3: tbl_df/tbl/data.frame)
#>   ..$ location_id: int [1:18] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ function_id: int [1:18] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ line       : int [1:18] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ functions   : tibble [18 × 6] (S3: tbl_df/tbl/data.frame)
#>   ..$ function_id: int [1:18] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ name       : chr [1:18] "[" "[_data_frame" "[[" "[<-" ...
#>   ..$ system_name: chr [1:18] "[" "[.data.frame" "[[" "[<-" ...
#>   ..$ filename   : chr [1:18] "" "" "" "" ...
#>   ..$ start_line : int [1:18] 0 0 0 0 0 0 0 0 0 0 ...
#>   ..$ .file_id   : int [1:18] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ .rprof      :List of 3
#>   ..$ header: chr "memory profiling: sample.interval=20000"
#>   ..$ files : chr(0) 
#>   ..$ traces: chr [1:6] ":478394:774433:53618096:9090:\"[<-.data.frame\" \"[<-\" " ":391832:614469:36974056:19040:\"mode\" \"%in%\" \"deparse\" \"paste\" \"deparse1\" \"force\" \"as.data.frame.nu"| __truncated__ ":559003:984469:64613920:22356:\"length\" \"[.data.frame\" \"[\" " ":484278:820469:52243128:22499:\"[[.data.frame\" \"[[\" \"[.data.frame\" \"[\" " ...
#>  - attr(*, "class")= chr "profile_data"
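
For what it's worth, the raw lines kept in samples$.rprof$traces above still start with a colon-delimited prefix of four numbers before the call stack, so the memory readings appear to survive in the raw text even though they are not in the parsed tables. Below is a minimal sketch of pulling that prefix apart in base R. The helper name parse_memory_prefix() is hypothetical, and the column names are an assumption borrowed from the columns that summaryRprof(memory = "tseries") reports (small vectors, large vectors, nodes, duplications); the units are not confirmed here.

```r
# Trace lines copied from the str() output above. Only the entries that
# are printed in full are used; the truncated one is left out.
traces <- c(
  ":478394:774433:53618096:9090:\"[<-.data.frame\" \"[<-\" ",
  ":559003:984469:64613920:22356:\"length\" \"[.data.frame\" \"[\" ",
  ":484278:820469:52243128:22499:\"[[.data.frame\" \"[[\" \"[.data.frame\" \"[\" "
)

# Hypothetical helper (not part of profile or proffer): split the leading
# ":a:b:c:d:" memory prefix from the call stack that follows it.
parse_memory_prefix <- function(traces) {
  pattern <- "^:([0-9]+):([0-9]+):([0-9]+):([0-9]+):(.*)$"
  matched <- regmatches(traces, regexec(pattern, traces))
  # Elements 2 to 5 of each match are the four numeric fields.
  fields <- do.call(rbind, lapply(matched, function(m) as.numeric(m[2:5])))
  out <- as.data.frame(fields)
  # Assumed field order, mirroring summaryRprof(memory = "tseries").
  names(out) <- c("vsize_small", "vsize_large", "nodes", "duplications")
  # Element 6 is the remaining call stack, kept as one string.
  out$stack <- vapply(matched, function(m) trimws(m[6]), character(1))
  out
}

mem <- parse_memory_prefix(traces)
print(mem)
```

This only shows that the numbers can be recovered from the raw text; how they should be turned into per-sample allocation figures is a separate question.
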
@krlmlr
Member

krlmlr commented Jan 8, 2025

Thanks. I never actually looked into memory profiling, so it's unlikely that the data is captured. Would you like to take a stab at extending the data model we're using here?

@wlandau
Member Author

wlandau commented Jan 8, 2025

I'm not sure if I will have the capacity, but I made a note in case I have extra time.

@snowpong

snowpong commented Jan 9, 2025

So the task would be to extend the internal format of profile with memory information, and find out how that could then be mapped to pprof file format?

@krlmlr
Member

krlmlr commented Jan 10, 2025

I think so, yes.
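
To make the mapping question a bit more concrete: the pprof format allows several sample types per profile, with each sample carrying one value per sample type in the same order. A rough sketch of what an extended sample_types table could look like is below, using plain data frames rather than the profile package itself. The "memory" type, the "bytes" unit, and the memory figures are placeholders invented for illustration, not anything profile or proffer currently defines.

```r
# The existing profile data has a single sample type ("samples"/"count"),
# as shown by str(samples) earlier in this thread. The second row is a
# hypothetical addition; its name and unit are assumptions.
sample_types <- data.frame(
  type = c("samples", "memory"),
  unit = c("count", "bytes"),
  stringsAsFactors = FALSE
)

# One row per sample, one column per sample type, in the same order as
# sample_types. The memory numbers are made-up placeholders.
values <- cbind(
  samples = c(1, 1, 1),
  memory  = c(0, 4096, 1024)
)

# pprof expects every sample to have exactly one value per sample type.
stopifnot(
  ncol(values) == nrow(sample_types),
  identical(colnames(values), sample_types$type)
)
```

Whether the extra values belong in additional columns of the samples table or in some other structure is a design decision for the package; the sketch only illustrates the one-value-per-type constraint on the pprof side.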
